Hack 71 PDF Web Skins

Split a PDF into pages and frame them in HTML, where the fun begins.

In general, HTML files are called pages, while PDF files are called documents. By splitting a PDF document into individual PDF pages, we shift it into HTML's paradigm, where we can now program the document like a web site. Let's start with a basic document skin, shown in Figure 5-17, which gives us a cool look and handy document navigation.

Figure 5-17. The Classic skin, which includes navigation features

Our Classic skin has a number of nice built-in features:

  • Table of contents portal page based on PDF bookmarks

  • Navigation cluster for flipping through pages

  • Table of Contents navigation sidebar based on PDF bookmarks

  • A hyperlink to the full, unsplit PDF for download on each page

  • Convenient Email This Page link on each page

Test-drive our online version at http://www.pdfhacks.com/eno/. The HTML, JavaScript, and user interface icons are freely distributable under the GPL, so feel free to use them in your own templates.

5.22.1 Skinning PDF

First, install pdftk [Hack #79]. Next, visit http://www.pdfhacks.com/skins/ and download pdfskins-1.1.zip. Unzip it, and move pdfskins.exe to a convenient location, such as C:\Windows\system32\. On other platforms, compile pdfskins from the included source code: cd into pdfskins-1.1 and run make.

Download a skin template from http://www.pdfhacks.com/skins/. The template pdfskins_classic_js uses client-side JavaScript to create the dynamic pieces. pdfskins_classic_php uses server-side PHP instead. Pick one and unzip it into a new directory:

unzip pdfskins_classic_js-1.1.zip

Copy your PDF document into this new directory and burst it into pages with pdftk. This also creates doc_data.txt, which reports on the document's title, metadata, and bookmarks:

pdftk  full_doc.pdf  burst
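
If you want to inspect doc_data.txt from your own scripts, the following Python sketch reads its text into a dictionary of Info entries and a list of bookmarks. It assumes the InfoKey/InfoValue pairing shown later in this hack, and it also assumes that bookmarks appear as BookmarkTitle, BookmarkLevel, and BookmarkPageNumber lines; check your own doc_data.txt before relying on those names. The function name parse_doc_data is ours, not part of pdftk or pdfskins.

```python
def parse_doc_data(text):
    """Parse doc_data.txt text into (info, bookmarks).

    Assumed layout: "InfoKey: X" followed by "InfoValue: Y", and bookmark
    records that start with a "BookmarkTitle: ..." line. Lines with other
    field names are ignored.
    """
    info = {}
    bookmarks = []
    pending_key = None   # InfoKey waiting for its InfoValue
    current = None       # bookmark record being filled in
    for raw in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = raw.partition(":")
        if not sep:
            continue
        name = name.strip()
        value = value.strip()
        if name == "InfoKey":
            pending_key = value
        elif name == "InfoValue" and pending_key is not None:
            info[pending_key] = value
            pending_key = None
        elif name == "BookmarkTitle":
            current = {"title": value}
            bookmarks.append(current)
        elif name == "BookmarkLevel" and current is not None:
            current["level"] = int(value)
        elif name == "BookmarkPageNumber" and current is not None:
            current["page"] = int(value)
    return info, bookmarks
```

Call it with the file's contents, for example parse_doc_data(open("doc_data.txt").read()), and look up info.get("Title") to see the title pdfskins will pick up.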

Finally, in this same directory, spin skins using pdfskins. It reads doc_data.txt, created earlier, for the document title and other data. Pass the PDF filename as the first argument if you plan to make the full PDF document available for download. This first argument is used only for constructing the Download Full Document hyperlink, and it can be a full or relative URL. Omit this filename, and the hyperlink will not be displayed.

pdfskins  full_doc.pdf
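
If you skin documents regularly, you might wrap the two commands in a small script. The Python sketch below is one way to do it, assuming pdftk and pdfskins are both on your PATH and that the skin template is already unzipped into the working directory. The function names build_commands and run_workflow are hypothetical helpers, not part of either tool.

```python
import subprocess


def build_commands(full_pdf, offer_download=True):
    """Return the argument lists for the burst and skin steps.

    full_pdf is passed to pdfskins only when the Download Full Document
    link is wanted; omitting it hides the link, as described above.
    """
    burst = ["pdftk", full_pdf, "burst"]
    skins = ["pdfskins"]
    if offer_download:
        skins.append(full_pdf)
    return [burst, skins]


def run_workflow(full_pdf, workdir, offer_download=True):
    """Run both steps inside workdir, stopping at the first failure."""
    for argv in build_commands(full_pdf, offer_download):
        # check=True raises CalledProcessError on a nonzero exit status.
        subprocess.run(argv, cwd=workdir, check=True)
```

Calling run_workflow("full_doc.pdf", "my_skin_dir") performs the same two steps as the commands above, where my_skin_dir stands for whatever directory holds your template and PDF.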

Fire up your web browser and point it at index.html, located in the directory where you've been working. The portal should appear, showing the table of contents and graphic placeholders for your logo (logo.gif) and document cover thumbnail (thumb.gif). If you used the php or comments templates, the pages must be served to you by a PHP-enabled web server.

The PDF pages that make up our skinned PDF do not need to be linearized, nor does the web server require byte-serving configuration [Hack #67]. The only requirement is that the user has Adobe Reader configured to display PDF inside the browser, which is the default Reader configuration.


5.22.2 Changing Colors, Overriding the Title

You can add or change data in the doc_data.txt file, or you can pass additional, overriding data to pdfskins on the command line. This is most useful for changing the default colors used in the Classic skin. For example:

pdfskins full_doc.pdf -title "Great American Novel" -color1 "#336600" \
    -color2 white

In the Classic skin, color1 is the color of the header and color2 is the color around the upper-left logo. Alternatively, you can add or change these lines in doc_data.txt:

InfoKey: Color1
InfoValue: #336600
InfoKey: Color2
InfoValue: white
InfoKey: Title
InfoValue: Great American Novel
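
Editing doc_data.txt by hand gets tedious across many documents. Here is a hedged Python sketch that updates an existing InfoKey/InfoValue pair or appends a new one. It assumes each InfoValue line directly follows its InfoKey line, as in the listing above, and that key names are matched exactly as written; set_info is our own helper name.

```python
def set_info(text, key, value):
    """Return doc_data.txt text with the given Info key set to value.

    If "InfoKey: <key>" exists and is followed by an InfoValue line, that
    value is replaced; otherwise a new pair is appended at the end.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.strip() == f"InfoKey: {key}" and lines[i + 1].startswith("InfoValue:"):
            lines[i + 1] = f"InfoValue: {value}"
            break
    else:
        # No matching pair was found, so add one.
        lines += [f"InfoKey: {key}", f"InfoValue: {value}"]
    return "\n".join(lines) + "\n"
```

Read doc_data.txt, pass its text through set_info once per key (Color1, Color2, Title), write the result back, and then rerun pdfskins.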

5.22.3 PDF Skins as Copy Protection

By bursting your PDF into pages and then not making the full document available for download, you compel readers to return to your site when they desire your material. If this is your intent, you should also secure your pages against merging, so nobody can easily reassemble your pages into the original PDF document. Do this when bursting the document. For example:

pdftk full_doc.pdf burst encrypt_128bits owner_pw 23@#5dfa \
    allow DegradedPrinting

See [Hack #52] for more details on how to secure documents with pdftk.

Test our PHP-based hacks on your Windows machine by installing the Apache web server. See [Hack #74] for a discussion about installing Apache and PHP on Windows using IndigoPerl.


5.22.4 Hacking the Hack

Now, you control the document. You can take it in any direction you choose. See [Hack #21] for some ideas on how to add full-text document search. See [Hack #72] to learn how to add online page commenting.