Add live session data to your PDF on its way down the chute.
After you publish a PDF online, it can be hard to gauge its impact on readers. Get a clearer picture of reader response by modifying the PDF's hyperlinks so that they pass document information to your web server.
For example, if your July newsletter's PDF edition has hyperlinks to:
http://www.pdfhacks.com/index.html
you can append the newsletter's edition to the PDF hyperlinks using a question mark:
http://www.pdfhacks.com/index.html?edition=0407
When somebody reading your PDF newsletter follows this link into your site, your web logs record exactly which newsletter they were reading.
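If some of your links already carry a query string, the new value has to be joined with an ampersand rather than a question mark. The following is a minimal sketch of that rule; the helper name add_query_param is hypothetical and is not part of this hack's scripts.

```php
<?php
// Hypothetical helper: append key=value to a URL's query string.
function add_query_param(string $url, string $key, string $value): string
{
    // Keep any #fragment at the very end, after the query string.
    $fragment = '';
    $hash = strpos($url, '#');
    if ($hash !== false) {
        $fragment = substr($url, $hash);
        $url = substr($url, 0, $hash);
    }
    // Use '&' if the URL already has a query string, '?' otherwise.
    $sep = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $sep . rawurlencode($key) . '=' . rawurlencode($value) . $fragment;
}

echo add_query_param('http://www.pdfhacks.com/index.html', 'edition', '0407'), "\n";
```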
Take this reader response idea a step further by adding data to PDF hyperlinks that identifies the user who originally downloaded the PDF. With a little preparation, this is easy to do as the PDF is being served.
A PDF page can include hyperlinks to web content. You can create them using the Link tool, the Button tool (Acrobat 6), or the Form tool (Acrobat 5). Use the Link tool shown in Figure 6-11 if you want to add a hyperlink to existing text or graphics. Use the Button/Form tool if you want to add both a hyperlink and new text or graphics to the page, as shown in Figure 6-12. For example, you would use the Button/Form tool to create a web-style navigation bar [Hack #65].
To create a hyperlink button in Acrobat 6, select the Button tool (Tools → Advanced Editing → Forms → Button Tool). Click the PDF page and drag out a rectangle. Release the rectangle and a Field Properties dialog opens. Set the button's appearance using the General, Appearance, and Options tabs.
Open the Actions tab. Set the Trigger to Mouse Up, set the Action to Open a Web Link, and then click Add... . A dialog will open where you can enter the hyperlink URL.
To create a hyperlink button in Acrobat 5, select the Form tool. Click the PDF page and drag out a rectangle. Release the rectangle and a Field Properties dialog opens. Set the field type to Button and enter a unique Name. Set the button's appearance using the Appearance and Options tabs.
Open the Actions tab. Select Mouse Up and click Add... . Set the Action Type to World Wide Web Link. Click Edit URL... and enter the hyperlink URL.
When entering your link or button URL, use an identifying name, such as urlbeg_userhome, instead of the actual URL. Pad this placeholder with asterisks (*) so that it is at least as long as your longest possible URL, as shown in Figure 6-13. Use a constant prefix across all these names (e.g., urlbeg) so that they are easy to find later using grep.
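To avoid counting asterisks by hand, you could generate each placeholder and paste it into Acrobat. This is only a sketch: the function name make_placeholder is hypothetical, and the width is an assumption you should derive from your own longest URL.

```php
<?php
// Hypothetical helper: build a placeholder such as "urlbeg_userhome*****",
// padded with asterisks to a fixed width in bytes.
function make_placeholder(string $name, int $width): string
{
    $label = 'urlbeg_' . $name;
    if (strlen($label) > $width) {
        throw new InvalidArgumentException("width $width is too small for $label");
    }
    // str_pad() appends '*' on the right until the string is $width bytes long.
    return str_pad($label, $width, '*');
}

// The width should cover the longest URL you expect to serve.
$longest = 'http://www.pdfhacks.com/newsletter_home.php?user=84&edition=0307';
echo make_placeholder('newsletters', strlen($longest)), "\n";
```

Leave some headroom beyond the sample URL if values such as the user ID can grow, because a URL longer than its placeholder cannot fit.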
When your PDF is ready to distribute online, run it through pdftk [Hack #79] . This formats the PDF code to ensure that each URL is on its own line. Add the extension pdfsrc to the output filename instead of pdf:
pdftk mydocument.pdf output mydocument.pdfsrc
From this point on, you should not treat the file like a PDF, and this pdfsrc extension will remind you.
Find the byte offsets to your URL placeholders with grep (Windows users visit http://gnuwin32.sf.net/packages/grep.htm or install MSYS [Hack #97] to get grep). grep will tell you the byte offset and display the specific placeholder located on that line in the PDF. For example:
ssteward@armand:~$ grep -ab urlbeg mydocument.pdfsrc
9202:<</URI (urlbeg_userhome*******************)
11793:<</URI (urlbeg_userhome*******************)
17046:<</URI (urlbeg_newsletters*******************)
In your text editor [Hack #82], open your pdfsrc file and add one line at the very beginning of the file for each offset that grep reported. Each line should look like this:
#-urlname-urloffset
For example, this is how the previous grep output would appear at the start of mydocument.pdfsrc:
#-userhome-9202
#-userhome-11793
#-newsletters-17046
%PDF-1.3...
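If you would rather not copy offsets by hand, the header lines can be generated. The sketch below assumes placeholders use the urlbeg_ prefix followed by a name made of letters, digits, or underscores. The function name placeholder_header is hypothetical. Like grep -ab, it reports the byte offset of the start of each matching line, so run it on the pdftk output before any header lines have been added.

```php
<?php
// Hypothetical helper: return "#-name-offset" header lines for every
// line of $pdfsrc that contains a placeholder.
function placeholder_header(string $pdfsrc, string $prefix = 'urlbeg_'): string
{
    $header = '';
    $offset = 0; // byte offset of the current line's first byte
    $pattern = '/' . preg_quote($prefix, '/') . '([A-Za-z0-9_]+)/';
    foreach (explode("\n", $pdfsrc) as $line) {
        // Capture the name between the prefix and the asterisk padding.
        if (preg_match($pattern, $line, $m)) {
            $header .= "#-{$m[1]}-{$offset}\n";
        }
        $offset += strlen($line) + 1; // +1 for the "\n" removed by explode()
    }
    return $header;
}

// A tiny stand-in for real pdftk output.
$sample = "%PDF-1.3\n<</URI (urlbeg_userhome*****)\n>>\n<</URI (urlbeg_newsletters*****)\n";
echo placeholder_header($sample);
```

On a real file you would read the bytes with file_get_contents( ) and write the header followed by the unchanged bytes back out with file_put_contents( ).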
After adding these lines, do not modify the PDF with pdftk, gVim, or Acrobat. The pdfsrc extension should remind you to not treat this file like a PDF. Altering the PDF could break these byte offsets.
This example PHP script, serve_newsletter.php, opens a pdfsrc file, reads the offset data we added, then serves the PDF. As it serves the PDF, it replaces the placeholders with hyperlinks. It uses the input GET query string's edition and user values to tailor the PDF hyperlinks.
For example, when invoked like this:
http://www.pdfhacks.com/serve_newsletter.php?edition=0307&user=84
it opens the PDF file newsletter.0307.pdfsrc and serves it, replacing all userhome hyperlink placeholders with:
http://www.pdfhacks.com/user_home.php?user=84
and replacing all newsletters placeholders with:
http://www.pdfhacks.com/newsletter_home.php?user=84&edition=0307
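The replacement must not change the file's length, because a PDF's cross-reference table locates objects by byte offset. The script therefore writes the URL, closes the string, and fills the rest of the placeholder with spaces. The sketch below shows the same idea on a single line. The function name fill_placeholder is hypothetical, and it assumes the line holds one parenthesized string with no nested parentheses.

```php
<?php
// Hypothetical helper: overwrite the "(placeholder)" string on one line of
// a pdfsrc file with "(url)" plus trailing spaces, keeping the byte count.
function fill_placeholder(string $line, string $url): string
{
    $open = strpos($line, '(');
    $close = ($open === false) ? false : strpos($line, ')', $open);
    if ($open === false || $close === false) {
        return $line; // no string literal on this line; leave it alone
    }
    $room = $close - $open - 1;     // bytes available between ( and )
    $url = substr($url, 0, $room);  // an over-long URL is truncated
    $pad = str_repeat(' ', $room - strlen($url));
    return substr($line, 0, $open) . '(' . $url . ')' . $pad
        . substr($line, $close + 1);
}

$line = '<</URI (' . str_pad('urlbeg_userhome', 60, '*') . ')';
$out = fill_placeholder($line, 'http://www.pdfhacks.com/user_home.php?user=84');
echo $out, "\n";
echo strlen($line) === strlen($out) ? "same length\n" : "length changed\n";
```

The truncation branch is why the placeholder should be at least as long as your longest possible URL.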
Tailor serve_newsletter.php to your purpose:
<?php
// serve_newsletter.php, version 1.0
// http://www.pdfhacks.com/dynamic_links/

// accept only digits so the request cannot name an arbitrary file
$edition= preg_replace( '/[^0-9]/', '', $_GET['edition'] ?? '' );
$user= rawurlencode( $_GET['user'] ?? '' );

$fp= @fopen( "./newsletter.{$edition}.pdfsrc", 'rb' );
if( $fp ) {
  if( !empty($_GET['debug']) ) {
    header('Content-Type: text/plain'); // debug
  }
  else {
    header('Content-Type: application/pdf');
  }

  $pdf_offset= 0;
  $url_offsets= array( );

  // iterate over first lines of pdfsrc file to load $url_offsets
  while( ($cc= fgets($fp, 1024))!== false ) {
    if( $cc[0]== '#' ) { // one of our comments
      list($comment, $name, $offset)= explode( '-', $cc );
      if( $name== 'userhome' ) {
        $url_offsets[(int)$offset]=
          'http://www.pdfhacks.com/user_home.php?user=' . $user;
      }
      else if( $name== 'newsletters' ) {
        $url_offsets[(int)$offset]=
          'http://www.pdfhacks.com/newsletter_home.php?user=' . $user .
          '&edition=' . $edition;
      }
      else { // default
        $url_offsets[(int)$offset]= 'http://www.pdfhacks.com';
      }
    }
    else { // finished with our comments
      echo $cc;
      $pdf_offset= strlen($cc)+ 1;
      break;
    }
  }

  // sort by increasing offsets
  ksort( $url_offsets, SORT_NUMERIC );
  reset( $url_offsets );

  $output_url_line_b= false;
  $output_url_b= false;
  $closed_string_b= false;

  $offset= key( $url_offsets ); // null once the list is exhausted
  $url= (string)current( $url_offsets );
  $url_ii= 0;
  $url_len= strlen($url);

  // iterate over rest of file
  while( ($cc= fgetc($fp))!== false ) {
    if( $output_url_line_b && $cc== '(' ) {
      // we have reached the beginning of our URL
      $output_url_line_b= false;
      $output_url_b= true;
      echo '(';
    }
    else if( $output_url_b ) {
      if( $cc== ')' ) { // finished with this URL
        if( $closed_string_b ) {
          // string has already been capped; pad
          echo ' ';
        }
        else {
          echo ')';
        }

        // get next offset/URL pair
        next( $url_offsets );
        $offset= key( $url_offsets );
        $url= (string)current( $url_offsets );
        $url_ii= 0;
        $url_len= strlen($url);

        // reset
        $output_url_b= false;
        $closed_string_b= false;
      }
      else if( $url_ii< $url_len ) {
        // output one character of $url
        echo $url[$url_ii++];
      }
      else if( $url_ii== $url_len ) {
        // done with $url, so cap this string
        echo ')';
        $closed_string_b= true;
        $url_ii++;
      }
      else {
        echo ' '; // replace padding with space
      }
    }
    else {
      // output this character
      echo $cc;
      if( $offset=== $pdf_offset ) {
        // we have reached a line in pdfsrc where
        // our URL should be; begin a lookout for '('
        $output_url_line_b= true;
      }
    }
    ++$pdf_offset;
  }
  fclose( $fp );
}
else { // file open failure
  echo 'Error: failed to open: '."./newsletter.{$edition}.pdfsrc";
}
?>
Upload this script to your web server along with your pdfsrc file. Invoke the script with an information-packed URL that supplies the edition and user values it reads, such as this one:
http://www.pdfhacks.com/serve_newsletter.php?edition=0307&user=84572