Hack 82 Integrate pdftk with gVim for Seamless PDF Editing


Turn gVim into a PDF editor.

gVim is an excellent text editor that can also be handy for viewing and editing PDF code. It handles binary data nicely, it is mature, and it is free. Also, you can extend it with plug-ins, which is what we'll do. First, let's download and install gVim.

Visit http://www.vim.org. The download page offers links and instructions for numerous platforms. Windows users can download the installer from http://www.vim.org/download.php. As of this writing, it is called gvim63.exe. During installation, the default settings should suit most needs. Click through to the end and the installer will create a Programs menu entry from which you can launch gVim.

The first-time gVim user should run gVim in Easy mode, which makes it behave like most other text editors. On Windows, do this by running gVim Easy from the Programs menu. Or, you can activate Easy mode from inside a running gVim session by typing :source $VIMRUNTIME/evim.vim (the initial colon is essential).

gVim comes with an interactive tutorial and a good online help system. Learn about it by invoking :help.

If gVim frequently complains about "Illegal Back Reference" errors, check your HOME environment variable (Start → Settings → Control Panel → System → Advanced → Environment Variables). Some backslash character combinations in HOME, such as \1 or \2, will trigger these errors. Try replacing all the backslashes with forward slashes in HOME.
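If you would rather check HOME from a script than by eye, the following is a minimal sketch. The function names are made up for illustration, and it assumes that any backslash followed by a digit is a potential trigger; the text above names only \1 and \2 as examples.

```python
import os
import re


def risky_backrefs(home):
    """Return every backslash-digit pair (such as \\1) found in a HOME value.

    Assumption: any backslash followed by a digit is treated as suspicious.
    """
    return re.findall(r"\\[0-9]", home)


def with_forward_slashes(home):
    """Return the same path with each backslash replaced by a forward slash."""
    return home.replace("\\", "/")


if __name__ == "__main__":
    home = os.environ.get("HOME", "")
    hits = risky_backrefs(home)
    if hits:
        print("HOME contains", hits)
        print("Consider setting HOME to:", with_forward_slashes(home))
    else:
        print("HOME has no backslash-digit combinations")
```

The script only reports a suggested value. Changing HOME itself still happens in the Environment Variables dialog.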

6.10.1 Plug pdftk into gVim

The pdftk plug-in turns gVim into a PDF editor. When opening a PDF, it automatically uncompresses page streams so that you can read and modify them [Hack #80]. When closing a PDF, it automatically compresses page streams. If any changes were made to the PDF, it also fixes internal PDF byte offsets [Hack #81].
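To see roughly what happens behind the scenes, here is a sketch of the two pdftk invocations that correspond to the open and close steps. It uses pdftk's uncompress and compress output options; the exact commands the plug-in issues are not shown here, so treat this as an approximation. The helper names and filenames are hypothetical.

```python
import subprocess


def pdftk_command(src, dst, mode):
    """Build a pdftk command line that rewrites src to dst.

    mode is "uncompress" (make page streams readable) or
    "compress" (pack them again).
    """
    if mode not in ("uncompress", "compress"):
        raise ValueError("mode must be 'uncompress' or 'compress'")
    return ["pdftk", src, "output", dst, mode]


def run_pdftk(src, dst, mode):
    """Run pdftk; assumes the pdftk executable is on the PATH."""
    subprocess.run(pdftk_command(src, dst, mode), check=True)


if __name__ == "__main__":
    # Hypothetical filenames: uncompress for editing, then recompress.
    print(" ".join(pdftk_command("test1.pdf", "test1.plain.pdf", "uncompress")))
    print(" ".join(pdftk_command("test1.plain.pdf", "test2.pdf", "compress")))
```

Running the two printed commands by hand is a handy way to confirm that pdftk itself works before involving gVim.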

Visit http://www.AccessPDF.com/pdftk/ and download pdftk.vim.zip. Unzip and then move the resulting file, pdftk.vim, to the gVim plug-ins directory. This usually is located someplace such as C:\Vim\vim63\plugin\. To help find this directory, try searching for the file gzip.vim, which should be there already. The pdftk plug-in will be sourced the next time you run gVim, so restart gVim if necessary.
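If your file search tool is not cooperating, the hunt for gzip.vim can be sketched in a few lines of Python. The function name and the starting directory C:\Vim are assumptions for illustration; point it at wherever gVim was installed.

```python
import os


def find_plugin_dirs(root, marker="gzip.vim"):
    """Return every directory under root that contains the marker file."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if marker in filenames:
            found.append(dirpath)
    return found


if __name__ == "__main__":
    # Assumed install location; adjust to match your system.
    for directory in find_plugin_dirs(r"C:\Vim"):
        print("Copy pdftk.vim into:", directory)
```

Any directory it prints is a candidate for the plug-ins directory. If more than one gVim version is installed, expect more than one result.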

Use care when testing the plug-in for the first time. Copy a PDF to create a test file named test1.pdf. Launch gVim and use it to open test1.pdf. There will be a delay while pdftk uncompresses it. Data should then appear in the editor as readable text, as shown in Figure 6-8. Any graphic bitmaps still will appear as unreadable gibberish.

Figure 6-8. Clear text from compressed PDF streams, thanks to our gVim plug-in

Without making any changes to the file, save it as test2.pdf and close it. gVim will pause again while it compresses test2.pdf. Now, open test2.pdf in Acrobat or Reader. If everything is in place, it should open just fine. If Acrobat or Reader complains about the file being damaged, double-check the installation.

Acrobat and Reader display a warning as they repair a corrupted PDF file, but sometimes this warning flashes by too quickly to notice. After the PDF is repaired, they will display the PDF as if nothing happened. So, Acrobat and Reader aren't the most reliable tools for testing PDFs. Consider these alternatives.

The free, command-line pdfinfo program from the Xpdf project (http://www.foolabs.com/xpdf/) can tell you whether a PDF is damaged. The Multivalent Tools (http://multivalent.sourceforge.net/Tools/index.html) also provide a free PDF validator.
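A check with pdfinfo can be scripted along these lines. This is a sketch under two assumptions: that pdfinfo is on the PATH, and that a damaged file shows up either as a nonzero exit status or as a message containing "error" on standard error. The function names are hypothetical, and the heuristic is deliberately cautious rather than definitive.

```python
import subprocess


def looks_damaged(returncode, stderr_text):
    """Heuristic: flag a nonzero exit status or any error message."""
    return returncode != 0 or "error" in stderr_text.lower()


def pdf_seems_ok(path):
    """Run pdfinfo on path and report whether it raised any complaint."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True
    )
    return not looks_damaged(result.returncode, result.stderr)


if __name__ == "__main__":
    # test2.pdf is the file saved from gVim in the steps above.
    print("test2.pdf seems OK:", pdf_seems_ok("test2.pdf"))
```

Checking standard error as well as the exit status matters because a tool that silently repairs a file may still exit successfully, which is the same trap Acrobat and Reader set.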

6.10.2 Hacking the Hack

With our PDF extensions, gVim enables you to conveniently edit PDF code. You can bring power and beauty together by configuring Acrobat's TouchUp tool to use gVim for editing PDF objects.

In Acrobat, select Edit → Preferences → General → TouchUp. Click Choose Page/Object Editor... and a file selector will open. Select gvim.exe, which usually lives somewhere such as C:\Vim\vim63\. Click OK and you are done.

Test out your new configuration on a disposable document. Open the PDF in Acrobat and select the TouchUp Object tool (it might be hidden by the TouchUp Text tool), as shown in Figure 6-9.

Figure 6-9. The TouchUp Object tool button, which can be found hiding under the TouchUp Text tool button in Acrobat 6 (left) and Acrobat 5 (right)

Click a paragraph and a box appears, outlining the selection. Right-click inside this box and choose Edit Object.... gVim will open, displaying the PDF code used to describe this selection, as shown in Figure 6-10. It will be a full PDF document, with fonts and an XREF table.

Figure 6-10. The PDF code behind selected sections

Find some paragraph text and make some small changes [Hack #80]. When you save the file in gVim, Acrobat should promptly update the visible page to reflect your change. The updated page sometimes looks imperfect, but only temporarily. You can make many successive PDF edits this way.

You might notice occasional warnings from gVim about the data having been modified on the disk by another program. You can safely ignore these.