Tales of PDF Generation in Rails
Last week’s “inflight project” (that is, some fun code I hack out while in the air) was to do some custom PDF generation in Rails. That is, a special view to render a PDF.
There currently appear to be two main approaches to this in Rails:
- PDFKit (which uses the wkhtmltopdf binary, itself based on WebKit)
- Prawn (a native Ruby API)
The approaches are fundamentally different. PDFKit is designed to take HTML files and turn them into PDFs. Prawn is more like a “builder”: you construct the PDF call by call, and it abstracts away a lot of the PDF nastiness.
PDFKit is great when you need to make PDFs of your existing HTML documents (railscast here). In fact, with a few lines of Ruby in your config file, you can add the ability to make a PDF of every single URL on your site, making for a fancy “save as PDF” feature. With a bit of print-targeted CSS, you can make this pretty amazing.
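For reference, the config change looks roughly like this. It is a fragment rather than a standalone script, it assumes the pdfkit gem is in your Gemfile, and the application name MyApp is hypothetical.

```ruby
# config/application.rb (fragment)
require "pdfkit"

module MyApp # hypothetical application name
  class Application < Rails::Application
    # Any request whose URL ends in .pdf is answered with a PDF
    # rendered from the matching HTML response.
    # print_media_type asks wkhtmltopdf to apply print CSS rather than screen CSS.
    config.middleware.use PDFKit::Middleware, print_media_type: true
  end
end
```

With this in place, appending .pdf to an existing URL should return the PDF version of that page.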
However, I’m building a custom PDF that won’t be viewed as HTML. So that benefit is moot for this task. A few people did however mention that the nice thing about PDFKit is the learning curve is not very steep as you leverage your existing HTML skills.
So down that route I plunged.
Designing for a PDF with exact sizes, I laid out my HTML file using mm units. I created a “page” div class with the height and width set to my PDF page size. Perfect – now I can generate PDFs that are page-accurate.
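As a rough sketch of that setup, a small Ruby helper can emit the CSS for the “page” class from the page size in millimetres. The A4 defaults, the helper name page_css and the page-break rule are assumptions for illustration; substitute whatever your PDF page size actually is.

```ruby
# Builds the CSS rule for a fixed-size "page" div.
# Dimensions are in millimetres; the A4 defaults are an assumption.
def page_css(width_mm: 210, height_mm: 297, css_class: "page")
  <<~CSS
    .#{css_class} {
      width: #{width_mm}mm;
      height: #{height_mm}mm;
      page-break-after: always;
    }
  CSS
end

# Print the rule so it can be pasted into (or rendered in) a stylesheet.
puts page_css
```

Each div with that class then corresponds to one physical page, so anything positioned inside it is positioned relative to the page.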
wkhtmltopdf is an amazing tool. Don’t get me wrong. It is truly awesome how you can just give it a URL and get back a PDF. And if you set your CSS up nicely, your PDF can be exactly how you designed it in the browser (and that’s the other cool thing – you can preview in HTML…).
But I ran into a few near deal-breaking bugs. I have more-or-less worked around these issues for now, but they are sadly huge problems in this otherwise awesome API. I’m not all complaints here… I have contributed my own test cases and thoughts to these bugs, and in some cases offered some “coder motivation” in the form of a bounty. But a fact is a fact – some are very critical and have gone unsolved for years.
These bugs (as of 2011-04-18) are:
- Totally borked kerning on characters in Linux. This is a severity one, deal-breaking bug. What looks awesome in Mac OS X, looks terrible in Linux. So be warned if (like me) you dev on OS X and deploy to Linux, as you’re in for a surprise. Workaround: try embedding the fonts and tweak the sizes a lot to hopefully achieve passable results. (nb. there is no real solution to this, only ways to mitigate the effect). Test on your deployment platform during development.
- You cannot create a PDF that draws to the very edges of the document. There is a small white border at the bottom and right. Workaround: Suck it up.
- @font-face relative links don’t seem to work. Workaround: use absolute ones.
NB. if you’re using CentOS 5 and try to embed a font using @font-face, be aware of this missing dependency issue in CentOS 5 which will crash wkhtmltopdf. Solution: upgrade your dependencies.
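If you have a lot of relative @font-face links, the absolute-link workaround can be automated. Here is a minimal sketch, assuming “absolute” means fully qualified URLs and that you know the URL your stylesheet is served from. The helper name absolutize_css_urls and the example.com address are hypothetical.

```ruby
require "uri"

# Rewrites relative url(...) references in a stylesheet to absolute URLs.
# base_url is the (hypothetical) address the stylesheet itself is served from.
def absolutize_css_urls(css, base_url)
  css.gsub(/url\(\s*(['"]?)([^'")]+)\1\s*\)/) do |match|
    quote, ref = $1, $2
    # Leave inline data: URIs untouched.
    next match if ref.start_with?("data:")
    # URI.join resolves relative references and returns absolute ones unchanged.
    "url(#{quote}#{URI.join(base_url, ref.strip)}#{quote})"
  end
end

css = "@font-face { font-family: 'Demo'; src: url('fonts/demo.ttf'); }"
puts absolutize_css_urls(css, "http://example.com/stylesheets/site.css")
```

References that are already absolute pass through unchanged, so it is safe to run over a whole stylesheet.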
Aside: @font-face
Embedded fonts in HTML documents are surprisingly well supported now (even in IE4+, and no that’s not a typo), albeit not quite standardized. I highly recommend Font Squirrel’s @font-face generator for processing your embedded fonts and making them work across all browsers.
Conclusions
Given these pretty critical bugs, if I were starting over, I would probably consider Prawn for any custom PDF view generation, and use wkhtmltopdf more as middleware for PDFing standard HTML views. wkhtmltopdf can be a little fickle when you want pixel-perfect output (see the bug list above), and sadly some of these bugs have gone unsolved for a long time. This could be due to the inherent complexities of WebKit and how Linux handles fonts, but it doesn’t change the end result. Were these issues fixed, I would quite like the elegance of re-using my HTML knowledge (which is why I chose that route to begin with).
I guess the biggest thing to realise is that Prawn is cross-platform and wkhtmltopdf is not. Sure, wkhtmltopdf will run on OS X, Linux and even Windows, but the output is not identical across them, and I doubt it ever will be. wkhtmltopdf remains a totally kick-arse tool; one just needs to be aware of these inherent limitations.
