Hack 89 Assemble Pages and Serve PDF

Collate an online document at serve-time.

Imagine that you have a travel web site. A user visits to learn what packages you offer. She enters her preferences and tastes into an online form, and your site returns several suggestions. Now, take this scenario to the next level. Create a custom PDF report based on these suggestions by assembling your literature into a single document. She can download and print this report, and the full impact of your literature is preserved. She can share it with her friends, read it in a comfortable chair, and leave it on her desk as a reminder to follow up: a personal touch with professional execution.

Assembling PDFs into a single document should be easy, and it is. In Java, use iText. Elsewhere, use our command-line pdftk [Hack #79].

6.17.1 Assemble Pages in Java with iText

If your web site runs Java, consider using the iText library (http://www.lowagie.com/iText/) to assemble PDF documents. The following code demonstrates how to use iText to combine PDF pages. Compile and run this Java program from the command line, or use its code in your Java application:

/*
  concat_pdf, version 1.0, adapted from the iText tools
  concatenate input PDF files and write the results into a new PDF
  http://www.pdfhacks.com/concat/

  This code is free software. It may only be copied or modified
  if you include the following copyright notice:

  This class by Mark Thompson. Copyright (c) 2002 Mark Thompson.

  This code is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 */

import java.io.*;

import com.lowagie.text.*;
import com.lowagie.text.pdf.*;

public class concat_pdf extends java.lang.Object {

  public static void main( String args[] ) {
    if( 2<= args.length ) {
      try {
        int input_pdf_ii= 0;
        String outFile= args[ args.length-1 ];
        Document document= null;
        PdfCopy writer= null;

        while( input_pdf_ii < args.length- 1 ) {
          // we create a reader for a certain document
          PdfReader reader= new PdfReader( args[input_pdf_ii] );
          reader.consolidateNamedDestinations( );

          // we retrieve the total number of pages
          int num_pages= reader.getNumberOfPages( );
          System.out.println( "There are "+ num_pages+ 
                              " pages in "+ args[input_pdf_ii] );

          if( input_pdf_ii== 0 ) {
            // step 1: creation of a document-object
            document= new Document( reader.getPageSizeWithRotation(1) );

            // step 2: we create a writer that listens to the document
            writer= new PdfCopy( document, new FileOutputStream(outFile) );

            // step 3: we open the document
            document.open( );
          }

          // step 4: we add content
          PdfImportedPage page;
          for( int ii= 0; ii< num_pages; ) {
            ++ii;
            page= writer.getImportedPage( reader, ii );
            writer.addPage( page );
            System.out.println( "Processed page "+ ii );
          }

          PRAcroForm form= reader.getAcroForm( );
          if( form!= null ) {
            writer.copyAcroForm( reader );
          }

          ++input_pdf_ii;
        }

        // step 5: we close the document
        document.close( );
      }
      catch( Exception ee ) {
        ee.printStackTrace( );
      }
    }
    else { // input error
      System.err.println("arguments: file1 [file2 ...] destfile");
    }
  }
}

To create a command-line Java program, copy the preceding code into a file named concat_pdf.java. Then, compile concat_pdf.java using javac, setting the classpath to the name and location of your iText jar:

javac -classpath ./itext-paulo.jar concat_pdf.java

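Finally, invoke concat_pdf to combine PDF documents. The classpath must also include the current directory (.) so that Java can find concat_pdf.class. Write the classpath as a single argument with no spaces; on Windows, separate its entries with a semicolon instead of a colon:

java -classpath ./itext-paulo.jar:. \
concat_pdf in1.pdf in2.pdf in3.pdf out123.pdf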
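
The classpath is one argument whose entries are joined by a platform-specific separator. As a minimal sketch, the following helper builds that argument using Java's File.pathSeparator, so the same launch line can be generated on any platform. The class name ClasspathHint and its classpath method are hypothetical names introduced for illustration; only the jar name and concat_pdf arguments come from the example above.

```java
import java.io.File;

// Hypothetical helper (not part of iText or concat_pdf): joins classpath
// entries with the separator of the platform this program runs on.
final class ClasspathHint {

    // Returns the entries joined by File.pathSeparator, with no spaces added.
    static String classpath(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // Entries: the iText jar plus the current directory holding concat_pdf.class
        String cp = classpath("./itext-paulo.jar", ".");
        System.out.println("java -classpath " + cp
                + " concat_pdf in1.pdf in2.pdf in3.pdf out123.pdf");
    }
}
```

File.pathSeparator is a colon on Unix-like systems and a semicolon on Windows, so the printed command differs only in that one character.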

6.17.2 Assemble Pages in PHP with pdftk

This example of using pdftk with PHP demonstrates how easily it assembles PDF on the server. Pass pdftk a hyphen instead of an output filename, and it writes the assembled PDF to stdout, which the script passes directly to the client.

<?php
// the input PDF filenames
$brochure_dir= '/var/www/brochures/';
$report_pieces= 
   array( 'our_cover.pdf', 'boston.pdf', 'yorktown.pdf', 'our_info.pdf' );

// the command and its arguments
$cmd= '/usr/local/bin/pdftk';
foreach( $report_pieces as $piece ) {
   $full_fn= $brochure_dir.$piece;
   if( is_readable( $full_fn ) ) {
      // quote each filename so spaces or shell metacharacters are safe
      $cmd.= ' '.escapeshellarg( $full_fn );
   }
}
$cmd.= ' cat output -'; // hyphen means output to stdout

// serve it up
header( 'Content-type: application/pdf' );
passthru( $cmd ); // command output gets passed to client
?>
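
The same pdftk technique can be driven from Java when iText is not an option. The following is a minimal sketch, not a definitive implementation: it builds the pdftk argument list (readable inputs followed by cat output -) and copies pdftk's stdout to any OutputStream. The class name PdftkStream and its methods buildCommand and assemble are hypothetical names introduced here; the pdftk path and brochure filenames are the assumptions used in the PHP script.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs pdftk and streams the assembled PDF to a caller.
final class PdftkStream {

    // Builds the argument list: pdftk, each readable input, then "cat output -".
    static List<String> buildCommand(String pdftkPath, File dir, List<String> pieces) {
        List<String> cmd = new ArrayList<>();
        cmd.add(pdftkPath);
        for (String piece : pieces) {
            File f = new File(dir, piece);
            if (f.isFile() && f.canRead()) { // skip missing or unreadable pieces
                cmd.add(f.getPath());
            }
        }
        cmd.add("cat");
        cmd.add("output");
        cmd.add("-"); // hyphen means output to stdout
        return cmd;
    }

    // Runs the command and copies its stdout to out; returns the exit code.
    static int assemble(List<String> cmd, OutputStream out)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Send pdftk's error messages to this process's stderr, not into the PDF.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = pb.start();
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return p.waitFor();
    }
}
```

Because ProcessBuilder takes an argument list rather than a shell command string, filenames need no quoting. A web application would typically set the response content type to application/pdf and pass the response's output stream as out. The sketch does not handle the case where no input file is readable, so a caller should check that the list holds at least one input before running it.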

6.17.3 See Also

Consider some of these other free tools for assembling PDF:

  • Multivalent Document Tools (http://multivalent.sourceforge.net/Tools/index.html) are Java tools for manipulating PDF documents.

  • PDFBox (http://www.pdfbox.org) is a Java library that can combine PDF documents.

  • PDF::Extract (http://search.cpan.org/~nsharrock/) is a Perl module for extracting pages from a PDF document.

  • PDF::Reuse (http://search.cpan.org/~larslund/) is a Perl module designed for mass-producing PDF documents from templates.