Tales of PDF Generation in Rails

Last week’s “inflight project” (that is, some fun code I hack out while in the air) was to do some custom PDF generation in Rails. That is, a special view to render a PDF.

There appear to be two main approaches to this currently in Rails: PDFKit (a wrapper around wkhtmltopdf) and Prawn.

The approaches are fundamentally different. PDFKit is designed to take HTML files and make them PDFs. Prawn is more like a “builder” in that you build your PDF (and it abstracts away a lot of the PDF nastiness).

PDFKit is great for when you need to make PDFs of your existing HTML documents (there is a Railscast covering this). In fact, with a few lines of Ruby in your config file, you can add the ability to make PDFs of every single URL on your site, making for a fancy “save as PDF” feature. With a bit of print-targeted CSS, you can make this pretty amazing.
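For reference, those few lines look roughly like the fragment below. This is a sketch based on my reading of the PDFKit README, not a tested setup: the file locations and the option values (A4, print media type) are assumptions for illustration, and the right place for the middleware line depends on your Rails version.

```ruby
# config/application.rb (fragment, inside your Application class)
require "pdfkit"

# Rack middleware: a request for any URL with ".pdf" appended
# is rendered to HTML as usual, then converted to a PDF.
config.middleware.use PDFKit::Middleware

# config/initializers/pdfkit.rb (hypothetical location)
PDFKit.configure do |pdfkit_config|
  pdfkit_config.default_options = {
    page_size: "A4",        # assumed page size, pick your own
    print_media_type: true  # apply your print-targeted CSS
  }
end
```

With the middleware in place, visiting an existing page with .pdf on the end of the URL should hand back the PDF version.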

However, I’m building a custom PDF that won’t be viewed as HTML. So that benefit is moot for this task. A few people did however mention that the nice thing about PDFKit is the learning curve is not very steep as you leverage your existing HTML skills.

So down that route I plunged.

Designing for a PDF with exact sizes, I laid out my HTML file using mm units. I created a “page” div class with the height and width set to my PDF page size. Perfect. Now I can generate PDFs that are page-accurate.
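To make the layout idea concrete, here is a minimal sketch in plain Ruby that builds such a document. The helper names (page_css, paged_html), the A4 dimensions and the page-break-after rule are my assumptions for illustration, not something PDFKit requires.

```ruby
# Assumed page size: A4 in millimetres. Substitute your own PDF page size.
PAGE_WIDTH_MM  = 210
PAGE_HEIGHT_MM = 297

# Hypothetical helper: CSS for a fixed-size "page" div, measured in mm.
def page_css(width_mm = PAGE_WIDTH_MM, height_mm = PAGE_HEIGHT_MM)
  <<~CSS
    html, body { margin: 0; padding: 0; }
    .page {
      width: #{width_mm}mm;
      height: #{height_mm}mm;
      page-break-after: always; /* assumption: one div per PDF page */
    }
  CSS
end

# Hypothetical helper: wraps each HTML snippet in its own "page" div
# and returns a complete HTML document as a string.
def paged_html(pages)
  divs = pages.map { |html| %(<div class="page">#{html}</div>) }.join("\n")
  "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n" \
    "<style>\n#{page_css}</style>\n</head>\n<body>\n#{divs}\n</body>\n</html>\n"
end
```

The resulting string can be previewed in a browser or handed to whatever does the PDF conversion. The PDF page size and margins configured on the converter still need to match the div size, otherwise the pages will not line up.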

wkhtmltopdf is an amazing library. Don’t get me wrong. It is truly awesome how you can just give it a URL and get back a PDF. And if you set your CSS up nicely, your PDF can be exactly how you designed it in the browser (and that’s the other cool thing: you can preview in HTML…).

But I ran into a few near deal-breaking bugs. I have more-or-less worked around these issues for now, but they are sadly huge problems in this otherwise awesome API. I’m not all complaints here… I have contributed my own test cases and thoughts to these bugs, and in some cases offered some “coder motivation” in the form of a bounty. But a fact is a fact – some are very critical and have gone unsolved for years.

These bugs (as of 2011-04-18) are:

  1. Totally borked kerning on characters in Linux. This is a severity one, deal-breaking bug. What looks awesome in Mac OS X looks terrible in Linux. So be warned if (like me) you dev on OS X and deploy to Linux, as you’re in for a surprise. Workaround: try embedding the fonts and tweaking the sizes a lot to hopefully achieve passable results. (NB: there is no real solution to this, only ways to mitigate the effect.) Test on your deployment platform during development.
  2. You cannot create a PDF that draws to the very edges of the document. There is a small white border at the bottom and right. Workaround: Suck it up.
  3. @font-face relative links don’t seem to work. Workaround: use absolute ones.
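For the third workaround, one option is to rewrite the stylesheet before it reaches wkhtmltopdf. The following is a minimal sketch using only Ruby’s standard library; the helper name absolutize_css_urls and the example base URL are hypothetical, and the regular expression only handles simple url(...) references.

```ruby
require "uri"

# Matches url(...) with optional single or double quotes around the reference.
CSS_URL_PATTERN = /url\(\s*(['"]?)([^'")]+)\1\s*\)/

# Hypothetical helper: resolves every relative url(...) reference in a
# stylesheet against base_url, leaving data: URIs untouched.
def absolutize_css_urls(css, base_url)
  css.gsub(CSS_URL_PATTERN) do
    match = Regexp.last_match
    quote = match[1]
    ref   = match[2].strip
    next match[0] if ref.start_with?("data:") # embedded font data, nothing to resolve
    "url(#{quote}#{URI.join(base_url, ref)}#{quote})"
  end
end

css = "@font-face { font-family: 'MyFont'; src: url('fonts/MyFont.ttf'); }"
puts absolutize_css_urls(css, "http://example.com/stylesheets/")
```

References that are already absolute pass through unchanged, so it should be safe to run this over a stylesheet that mixes both kinds.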

NB: if you’re using CentOS 5 and try to embed a font using @font-face, be aware of this missing dependency issue in CentOS 5, which will crash wkhtmltopdf. Solution: upgrade your dependencies.

Aside: @font-face

Embedded fonts in HTML documents are surprisingly well supported now (even in IE4+, and no that’s not a typo), albeit not quite standardized. I highly recommend Font Squirrel’s @font-face generator for processing your embedded fonts and making them work across all browsers.

Conclusions

Given these pretty critical bugs, if I were starting over, I would probably consider Prawn for any custom PDF view generation, and use wkhtmltopdf more as middleware for PDFing standard HTML views. wkhtmltopdf can be a little fickle when you want perfect output (see the bug list above), and sadly some of these bugs have gone unsolved for a long time. This could be due to the inherent complexities of WebKit and how Linux handles fonts, but it doesn’t change the end result. Were these issues fixed, I would quite like the elegance of re-using my HTML knowledge (which is why I chose that route to begin with).

I guess the biggest thing to realise is that Prawn is cross-platform; wkhtmltopdf is not. Sure, wkhtmltopdf will run on OS X, Linux and even Windows, but the output is not identical, and I doubt it ever will be. wkhtmltopdf remains a totally kick-arse tool; one just needs to be aware of these inherent limitations.

