Thursday, June 20, 2013

How to Probably Get Pandoc Running on a Linux Server

(

Drink-by date: This post is based on work I did in June 2013. GHC, the Haskell Platform, Cabal and pandoc were all current versions at the time, installed on Ubuntu 10.04. If you're reading this documentation after June 2015, consider it out of date.

This blog post was the original inspiration to use Pandoc

)

A few months ago I started putting my resumé on GitHub. As a freelance software dev, it's quicker to send somebody to one spot to see both the code for projects I work on and a descriptive, bulleted list of accomplishments (without all of the monetization noise of LinkedIn). Recently I spent 5 hours on one of those endlessly recursive campaigns of compiling things from source, all to save myself a few minutes of maintaining my online resumé in different formats (PDF and markdown). Since the process ran into some bugs in standard Ubuntu / Debian packages, I'll post instructions here for how I fixed them.

My goal was to be able to update my resumé in markdown and then run a single deploy script that would commit the changes, generate a PDF from the markdown and push the updated markdown and PDF to GitHub. There's a command-line utility called "pandoc" that handles the conversion and a whole lot more. It's written in Haskell, a powerful functional programming language that is beloved by academics and high-frequency stock traders but otherwise hasn't broken into mainstream software development.

My first time around, I did a standard "sudo apt-get install pandoc" on my Ubuntu server and thought I was good to go. The syntax for a basic conversion is simple and, like ffmpeg, the input and output formats are inferred from the file extensions:

$ pandoc resume.md -o resume.pdf

Except that this yielded the following error:

pandoc: resume.md: hGetContents: invalid argument (Invalid or incomplete
multibyte or wide character)

A character-encoding bug. Removing the curly quotes from the markdown file confirmed this, since the conversion worked fine when the input document was all ASCII. But what's the point of a PDF if you have to limit yourself to ugly straight quotes and spelling "resumé" as "resume"? There was some chatter on Google Groups that this was an issue with setting the right LANG in your locale, but that the bug was fixed in later versions of pandoc. Rather than fiddle with global settings, I opted to upgrade pandoc.
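
One way to confirm the diagnosis without hand-editing the file is to list the lines that contain non-ASCII bytes, then retry the conversion with a UTF-8 locale set for that single command rather than globally. This is only a rough sketch: the function name find_non_ascii is made up for illustration, and en_US.UTF-8 is an assumption about which locales exist on your machine ("locale -a" lists them). The LANG override also assumes LC_ALL is not already set to something else.

```shell
# find_non_ascii FILE: print (with line numbers) every line in FILE that
# contains a byte which is neither printable ASCII nor whitespace. That
# covers every multibyte UTF-8 character, such as curly quotes or an
# accented e. Forcing the C locale makes grep work on raw bytes, and -a
# stops it from treating the file as binary.
find_non_ascii() {
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Usage (commented out so the snippet is safe to paste):
#   find_non_ascii resume.md
#   LANG=en_US.UTF-8 pandoc resume.md -o resume.pdf
```

If the second command succeeds where the plain one failed, the locale is the culprit and upgrading is optional rather than required.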

The versions of pandoc, Haskell and GHC (the Glasgow Haskell Compiler) packaged in Debian / Ubuntu are all years out of date. (The computer I was working on was running Lucid 10.04, which is supported through 2015.) So getting the latest pandoc wouldn't be as simple as an apt-get upgrade; it would require upgrading GHC, the Haskell Platform, the Haskell package manager (Cabal) and finally pandoc. All of these are source-code installs except for the final upgrade of pandoc, which is done through Cabal.

If you've already tried to install pandoc or Haskell via apt-get you'll need to remove the packages via the following command:

$ sudo apt-get autoremove ghc6

Now we're ready to start the installation process. First, install the dependencies via apt-get:

$ sudo apt-get install libgmp3c2 freeglut3 libedit2 libedit-dev freeglut3-dev libglu1-mesa-dev

pandoc uses LaTeX to produce PDFs, so you'll need to get pdflatex on your system:

$ sudo apt-get install texlive-full

Now you'll want to get the source code to build GHC and the Haskell Platform. First, GHC. This is a standard configure / make install build.

Cabal, the Haskell package manager, is included in the Haskell Platform:

$ wget http://lambda.haskell.org/platform/download/2013.2.0.0/haskell-platform-2013.2.0.0.tar.gz
$ gunzip haskell-platform-2013.2.0.0.tar.gz 
$ tar -xvvf haskell-platform-2013.2.0.0.tar 
$ cd haskell-platform-2013.2.0.0
$ ./configure
$ make
$ sudo make install
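
Before moving on, it may be worth checking that the freshly built tools are the ones your shell actually finds. The helper below is a small sketch (the name require_tools is hypothetical, not part of the Haskell Platform); it only relies on the shell's built-in "command -v".

```shell
# require_tools NAME...: return 0 only if every named command is on PATH.
# Prints the resolved location of each one it finds, and a message on
# stderr for each one that is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if location=$(command -v "$tool"); then
            printf '%s -> %s\n' "$tool" "$location"
        else
            printf '%s not found on PATH\n' "$tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage after the install finishes:
#   require_tools ghc cabal && ghc --version && cabal --version
```

If the reported versions are still the old distribution ones, the apt-get packages are probably still installed and shadowing the new build.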

You can also get the source for Cabal from GitHub.

Once you have Cabal up and running, first refresh the package list:

$ cabal update

Now you can install pandoc:

$ cabal install pandoc

Cabal will install a bunch of dependencies. This will take a few minutes. Cabal installs executables to ~/.cabal/bin, so pandoc ends up at ~/.cabal/bin/pandoc. You can symlink this into a directory that's already in your path:

$ sudo ln -s ~/.cabal/bin/pandoc /usr/local/bin/pandoc
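
If you'd rather not create a symlink as root, an alternative is to put Cabal's bin directory on your own PATH. The function below is a sketch of that approach (add_to_path is a made-up helper name); it avoids adding the same directory twice if your profile gets sourced more than once.

```shell
# add_to_path DIR: prepend DIR to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Usage, e.g. at the end of ~/.profile:
#   add_to_path "$HOME/.cabal/bin"
#   pandoc --version
```

The trade-off is that this only affects shells that read your profile, so a cron job or another user would still need the symlink or an absolute path.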

Now you should be ready to go. For some pointers on how I actually automated generating PDFs when committing changes to my resumé, check out this shell script: https://github.com/erstwhile/resume/blob/master/deploy
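
For readers who just want the general shape of such a script, here is a stripped-down sketch. It is not the script linked above: the function name deploy_resume is hypothetical, the file names are the ones used earlier in this post, and it assumes you run it from inside a git repository that already has a remote configured. It also simplifies the order of operations to one rebuild, one commit and one push.

```shell
# deploy_resume MESSAGE: rebuild the PDF from the markdown source, commit
# both files with MESSAGE, and push. Each step runs only if the previous
# one succeeded, so a failed conversion never gets committed.
deploy_resume() {
    pandoc resume.md -o resume.pdf &&
    git add resume.md resume.pdf &&
    git commit -m "$1" &&
    git push
}

# Usage:
#   deploy_resume "Update resume"
```

Note that "git commit" exits with an error when there is nothing new to commit, in which case the push is skipped as well.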
