What’s the best way to programmatically edit a PDF in Ruby?

I've been doing a good deal of PDF generation in Rails, and had to go through the process of comparing all the available techniques and frameworks in order to find the right solution for my needs.

It's great that there are so many tools out there, but it can be a daunting task to figure out which is best, which will scale, which will continue to grow and improve, and to evaluate the true 'cost' of free vs. commercial.

With all this info finally digested and sorted out, I was surprised when I got a client request to add a banner to an existing PDF; as far as I can recall, none of the libraries I know about can do this. Right now I'm in the middle of googling the hell out of it, but I haven't found my silver-bullet answer yet. (Maybe I should ask Jeeves?)

I've done various searches and have come across a few categories of PDF tools:

  1. Wrappers to existing PDF generation tools like FPDF and iText
  2. PDF template tools, where you build a PDF skeleton file and bind values to it programmatically
  3. Pure Ruby PDF generation tools
  4. PDF readers and inspection tools

I did find a discussion on Google Groups about how this could work, between Greg Brown, creator of Prawn and Ruport, and James Healy, developer of PDF::Reader, but that discussion basically ended with, "Yeah, that would be cool!".

At this point I'm looking into the Origami library, which is designed for PDF 'security' and testing rather than for editing PDFs in this way, but at the moment it's the leading candidate on my list.

Have I missed something? Is there an obvious way to do this in Ruby/Rails that I'm completely overlooking? (I haven't looked very deeply at tools that shell out to the bigger libraries, but I wouldn't rule them out.)

The initial requirement was to add a banner/header to an existing PDF, but I can see the complexity of shifting the existing content down without screwing up all the formatting, so I think even being able to insert a cover page might be a sufficient implementation for now. (Maybe I should be searching for PDF 'merging' instead of editing.)

I'll update you with my final solution in an upcoming blog post, and I'll be covering everything I've learned about PDF generation tools for Ruby and Rails at this year's WindyCityRails Conference on September 12th. Drop by http://windycityrails.org to register. (Early registration ends Aug 1st.)

Comments: 10 so far

  1. Have you looked at http://www.princexml.com/ ? It’s been bullet-proof for me, but may be overkill for your needs.

    Comment by George Anderson, Tuesday, July 21, 2009 @ 8:54 pm

  2. Pragmatic Programmers uses iText to create the personalised stamps on the PDF versions of their books. (From the article in PragPub it looks as if this is done as an edit to an existing PDF master copy, too.)

    It’s a fairly short article (no technical details), but it may be worth contacting them for more info to see if it could suit your needs.

    Comment by MattR, Tuesday, July 21, 2009 @ 9:09 pm

  3. I’ve heard good things about PrinceXML and the Ruby wrappers for it, but I balked at the price, given the PDF requirements I was aware of when I made my selection. If the reports I’m generating need to go beyond the formatting I can get with Prawn now, I’ll start looking into PrinceXML and seeing what it can do. (Do you have any issue with the pricing when running in production, or is all your stuff on a single ‘server’ anyway? I guess I could say it’s on a single physical server, independent of how many VMs are running in the cluster/cloud.)

    Comment by John McCaffrey, Wednesday, July 22, 2009 @ 9:26 am

  4. I’ll have to check this out. Being deeply familiar with iText, and being able to create good docs from any programming stack, would be good for me anyway, as this comes up on every project, whether it’s Java, Rails, .NET or PHP. Thanks for the tip. (I’ll have to find that article and contact the author.)

    Comment by John McCaffrey, Wednesday, July 22, 2009 @ 9:28 am

  5. I’m all for Ruby (it’s my favorite language at the moment). But if you can’t find any good library in ruby for this, then perhaps in this case it would be a good idea to look elsewhere.

    Comment by ehsanul, Wednesday, July 22, 2009 @ 6:08 pm

  6. Also, have you tried asking on Stack Overflow? Ruby IRC chat? Could prove useful.

    Comment by ehsanul, Wednesday, July 22, 2009 @ 6:09 pm

  7. I’ve dealt with this, and maybe you’ll think this is stupid, but you’ll get the best fidelity out of a print to pdf solution (through cups). There are walkthroughs out there for setting up the cups/pdf print. Anyways, try putting on your mozrepl boots and march through loading up the page to print (as a url), and print. Obviously, this is single user account based, so go ahead and create buttloads of user accounts and a nice dynamic load balancer / queue for pdf gen requests. In the end, you’ll have a beautifully orchestrated telnet directed circus, replete with lions, tigers, and pdfs.

    Comment by alan, Wednesday, July 22, 2009 @ 9:11 pm

  8. I forgot to mention, you won’t beat html styling in other libraries, and certain packages (of royal lineage – for example) suck at converting html to pdf.

    Comment by alan, Wednesday, July 22, 2009 @ 9:15 pm

  9. I had this exact same problem to solve recently, and settled on passing a bunch of JSON out to a PHP service as I wasn’t happy with any of the tools for editing or creating PDFs with Ruby (and Rails).

    I needed to write on top of a PDF template and managed to do so with little issue using FPDF and FPDI on PHP. I know it’s not Ruby but it solved my issue in the short term, and might be useful for others.

    Comment by mlambie, Wednesday, July 22, 2009 @ 10:01 pm

  10. Check out pdftk for easy merging of PDF documents (it’s a CLI app)

    http://www.accesspdf.com/pdftk/

    Comment by Alex, Thursday, July 23, 2009 @ 1:46 am
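
The pdftk suggestion can be sketched from Ruby by shelling out. This is only a rough sketch, assuming pdftk is installed and on the PATH; the helper names and file names are hypothetical, not part of any library. The `cat` operation covers the cover-page case by concatenating PDFs. Newer pdftk releases also have a `stamp` operation that overlays one PDF on top of every page of another, which may be enough for a banner if the existing content doesn't have to shift down; whether your installed version supports it is worth checking first.

```ruby
# Hypothetical helpers that shell out to the pdftk command-line tool.
# Assumes pdftk is installed and on the PATH.

# Builds the argument list for concatenating PDFs in order,
# e.g. a cover page followed by the original document.
def pdftk_merge_command(inputs, output)
  raise ArgumentError, "need at least one input PDF" if inputs.empty?
  ["pdftk", *inputs, "cat", "output", output]
end

# Builds the argument list for overlaying stamp_pdf on each page of input.
# Assumes a pdftk version that supports the `stamp` operation.
def pdftk_stamp_command(input, stamp_pdf, output)
  ["pdftk", input, "stamp", stamp_pdf, "output", output]
end

# Runs the command without going through a shell, so file names
# containing spaces or quotes are passed to pdftk untouched.
def run_pdftk(argv)
  raise "pdftk failed: #{argv.join(' ')}" unless system(*argv)
end

# Example usage (file names are placeholders):
#   run_pdftk(pdftk_merge_command(["cover.pdf", "report.pdf"], "with_cover.pdf"))
#   run_pdftk(pdftk_stamp_command("report.pdf", "banner.pdf", "with_banner.pdf"))
```

Keeping the command construction separate from the `system` call makes it easy to check the arguments in a test without having pdftk on the machine.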
