Using Java to Print PDF Documents

Printing has been possible since the beginnings of Java. It's not the most elegant aspect of the language, which reflects its evolution over the years: from a single Toolkit method in Java 1.0, to the PrinterJob of Java 1.2, to the current PrintService approach introduced in 1.4. Thankfully the API is now fairly stable, although fixes were made in Java 5 and 6; printing is one area where a newer version of Java does help.

The packages for printing are the java.awt.print and javax.print packages - the latter dates from Java 1.4 and supplements, rather than replaces, the original API. To print a PDF from Java the process broadly works as follows:

  1. Search for and select a PrintService to print to
  2. Create a PrinterJob and assign it to this service
  3. Assign the PDF and the attributes for this print job, then print

Of these steps, the only PDF-specific one is the last, so in practice printing a PDF in Java is like printing any other type of document, but we'll go through all the steps in detail. All the code below assumes you've imported java.awt.print, javax.print and its sub-packages.

Selecting a PrintService

In order to choose a printer you must first know what you're printing, which means choosing a DocFlavor. In the case of a PDF it's a PAGEABLE object, which means that the contents are a (potentially) multi-page object which will be formatted by the print API.

You also get to specify a set of PrintRequestAttributes to qualify your search for a printer - for instance, you may need a printer that can print double-sided or that can print to A3, and there are attributes for each of these. Here's what the code would look like if we were looking for a printer that can print double-sided:

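// A PDF is handed to the print API as a Pageable
DocFlavor flavor = DocFlavor.SERVICE_FORMATTED.PAGEABLE;
// Only consider printers that can print double-sided
PrintRequestAttributeSet patts = new HashPrintRequestAttributeSet();
patts.add(Sides.DUPLEX);
PrintService[] ps = PrintServiceLookup.lookupPrintServices(flavor, patts);
if (ps.length == 0) {
    throw new IllegalStateException("No Printer found");
}
// Use the first match; setPrintService throws PrinterException if the service can't be used
PrinterJob job = PrinterJob.getPrinterJob();
job.setPrintService(ps[0]);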
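
If no duplex printer is installed, the lookup above gives up straight away. A more forgiving variant, sketched here, asks for the ideal printer first, then for any printer that accepts the PAGEABLE flavor, and finally for the system default. The class and method names (PrinterFinder, duplexA3, firstOrNull, find) are our own for illustration; only the javax.print calls are standard.

```java
import javax.print.DocFlavor;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.attribute.HashPrintRequestAttributeSet;
import javax.print.attribute.PrintRequestAttributeSet;
import javax.print.attribute.standard.MediaSizeName;
import javax.print.attribute.standard.Sides;

final class PrinterFinder {
    /** Attributes describing a printer that can print double-sided on A3. */
    static PrintRequestAttributeSet duplexA3() {
        PrintRequestAttributeSet atts = new HashPrintRequestAttributeSet();
        atts.add(Sides.DUPLEX);
        atts.add(MediaSizeName.ISO_A3);
        return atts;
    }

    /** First entry of the array, or null if the array is null or empty. */
    static PrintService firstOrNull(PrintService[] services) {
        return (services == null || services.length == 0) ? null : services[0];
    }

    /** Best match for the wanted attributes, falling back to less specific choices. */
    static PrintService find(PrintRequestAttributeSet wanted) {
        DocFlavor flavor = DocFlavor.SERVICE_FORMATTED.PAGEABLE;
        PrintService ps = firstOrNull(PrintServiceLookup.lookupPrintServices(flavor, wanted));
        if (ps == null) {
            // No printer satisfies every attribute: accept any that takes a Pageable
            ps = firstOrNull(PrintServiceLookup.lookupPrintServices(flavor, null));
        }
        if (ps == null) {
            ps = PrintServiceLookup.lookupDefaultPrintService();
        }
        return ps; // still null if the machine has no printers at all
    }
}
```

Calling PrinterFinder.find(PrinterFinder.duplexA3()) returns a PrintService you can pass to job.setPrintService, or null if nothing is installed, in which case you should report the problem rather than print.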

Spooling the print job

Now we need to specify the document and the job attributes. Just as the attribute set above described the type of printer you're looking for, a second attribute set describes the print job itself: the name of the print job, whether it's double-sided, which pages to print and so on. When printing through a PrinterJob, these are passed to print() as a PrintRequestAttributeSet.

How you specify the document depends on which "flavor" you chose. We chose the PAGEABLE flavor, so we need to pass in a Pageable object: this is the PDFParser object, which implements Pageable. Here's how to print to our double-sided printer:

// PrinterJob.print() accepts a PrintRequestAttributeSet describing the job
PrintRequestAttributeSet jatts = new HashPrintRequestAttributeSet();
jatts.add(Sides.DUPLEX);
PDFParser parser = new PDFParser(pdf);
job.setPageable(parser);
job.print(jatts);
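
The other job details mentioned above (job name, page range, number of copies) are set the same way. Here is a minimal sketch that collects them in one place; JobAttributes and build are names we made up for this example, while the attribute classes come from javax.print.attribute.standard.

```java
import javax.print.attribute.HashPrintRequestAttributeSet;
import javax.print.attribute.PrintRequestAttributeSet;
import javax.print.attribute.standard.Copies;
import javax.print.attribute.standard.JobName;
import javax.print.attribute.standard.PageRanges;
import javax.print.attribute.standard.Sides;

final class JobAttributes {
    /** Builds the attributes for a named, double-sided job covering a range of pages. */
    static PrintRequestAttributeSet build(String name, int firstPage, int lastPage, int copies) {
        PrintRequestAttributeSet atts = new HashPrintRequestAttributeSet();
        atts.add(new JobName(name, null));              // null locale means the default locale
        atts.add(new PageRanges(firstPage, lastPage));  // page numbers are 1-based and inclusive
        atts.add(new Copies(copies));
        atts.add(Sides.DUPLEX);
        return atts;
    }
}
```

With this in place, job.print(JobAttributes.build("Invoice", 1, 3, 2)) would ask for two double-sided copies of pages 1 to 3.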

Nine times out of ten that's all you'll need to know, but if your print job doesn't run as expected you'll need to peek behind the curtain to know why.

The gory details

The first thing that happens is that our Pageable object is asked for a sequence of Printable objects, each of which represents a single page. Printable has a single print method, which is given a Graphics2D object implementing PrinterGraphics, and our PDF API simply "draws" the PDF to that Graphics object, as if we were rendering to the screen. This Graphics object will convert our graphics operations to a format suitable for printing, and in practice that means PostScript.
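
To make that contract concrete, here is a toy Pageable that draws a box around the printable area of each page. It is only a sketch of the interfaces involved, not how our PDF API is implemented, and BoxPages and renderOffscreen are hypothetical names. The helper paints to an off-screen image instead of a printer, so the Graphics2D it supplies does not implement PrinterGraphics.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.awt.print.PageFormat;
import java.awt.print.Pageable;
import java.awt.print.Printable;
import java.awt.print.PrinterException;

final class BoxPages implements Pageable, Printable {
    private final int pageCount;
    private final PageFormat format = new PageFormat(); // default portrait page

    BoxPages(int pageCount) { this.pageCount = pageCount; }

    private void checkIndex(int i) {
        if (i < 0 || i >= pageCount) throw new IndexOutOfBoundsException("No page " + i);
    }

    @Override public int getNumberOfPages() { return pageCount; }

    @Override public PageFormat getPageFormat(int pageIndex) {
        checkIndex(pageIndex);
        return format;
    }

    @Override public Printable getPrintable(int pageIndex) {
        checkIndex(pageIndex);
        return this; // one object paints every page in this toy example
    }

    // May be called more than once for the same page, so it must be repeatable
    @Override public int print(Graphics g, PageFormat pf, int pageIndex) throws PrinterException {
        if (pageIndex < 0 || pageIndex >= pageCount) return NO_SUCH_PAGE;
        Graphics2D g2 = (Graphics2D) g;
        g2.setColor(Color.BLACK);
        g2.draw(new Rectangle2D.Double(pf.getImageableX(), pf.getImageableY(),
                pf.getImageableWidth(), pf.getImageableHeight()));
        return PAGE_EXISTS;
    }

    /** Paints one page to an off-screen image and returns the Printable's result code. */
    static int renderOffscreen(Pageable doc, int pageIndex) {
        PageFormat pf = doc.getPageFormat(pageIndex);
        BufferedImage img = new BufferedImage((int) Math.ceil(pf.getWidth()),
                (int) Math.ceil(pf.getHeight()), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            return doc.getPrintable(pageIndex).print(g, pf, pageIndex);
        } catch (PrinterException e) {
            throw new IllegalStateException(e);
        } finally {
            g.dispose();
        }
    }
}
```

Passing new BoxPages(2) to job.setPageable would print two pages, each with an outline of its printable area, which is also a quick way to see the margins your printer driver imposes.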

PostScript

PostScript is the granddaddy of PDF, and if you're reading about printing you've already heard of it. It can represent text, vector graphics and images, so if your PDF contains a vector shape it's going to be reproduced almost exactly in PostScript. Other graphics operations (like transparency) can't be done in PostScript, and the Java print layer handles this by rasterizing the page to a bitmap and including that bitmap in the spool file.

The PrinterGraphics doesn't know in advance if a page has transparency, so printing in Java means potentially painting the page twice: a best effort first, then falling back to a bitmap if that's not possible (actually the page image is often printed in "stripes", so the print method may be called several times).

However the main consequence of rasterization is that a 600dpi image of a page is a lot bigger than the PDF page itself, even before you get into PostScript's tendency to drastically overinflate. It's not uncommon to see spool files of several MB per page, so avoiding this if possible is a good idea.
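
To put rough numbers on that, the uncompressed size of a rasterized page follows directly from the page size and the resolution. The helper below is a back-of-the-envelope sketch (RasterSize is our own name); the spool file applies its own encoding, so real files will differ, but the way the size grows with the square of the resolution holds.

```java
final class RasterSize {
    /** Uncompressed size in bytes of a page rasterized at the given resolution. */
    static long bytes(double widthPt, double heightPt, int dpi, int bytesPerPixel) {
        long w = (long) Math.ceil(widthPt * dpi / 72.0);   // 72 points per inch
        long h = (long) Math.ceil(heightPt * dpi / 72.0);
        return w * h * bytesPerPixel;
    }
}
```

For a US Letter page (612 x 792 points) in 24-bit RGB, RasterSize.bytes(612, 792, 600, 3) gives 100,980,000 bytes, while the same page at 300dpi is 25,245,000 bytes: halving the resolution cuts the raw data to a quarter.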

If you know in advance that your page is going to be rasterized you can avoid the page being painted twice by setting the "bfo.printasimage" option on the PDF before printing:

pdf.setOption("bfo.printasimage", "300");

This will rasterize the PDF as a 300-dpi bitmap before printing (you can also set the System property sun.java2d.print.pipeline to raster). This is also useful when the print layer gives incorrect results, or when you want to print the image at a lower DPI than your printer is capable of.
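
If you take the System property route, one way to do it from code is sketched below. The property is internal to the JDK's print implementation and undocumented, so treat this as an assumption-laden example: in particular we assume it is read when the print classes are first initialised, which is why the sketch sets it before any printing code runs. RasterPipeline is a name we made up.

```java
final class RasterPipeline {
    static final String PROPERTY = "sun.java2d.print.pipeline";

    /** Requests the raster pipeline; call this before any printing code runs. */
    static void force() {
        System.setProperty(PROPERTY, "raster");
    }
}
```

Passing -Dsun.java2d.print.pipeline=raster on the java command line has the same effect and avoids any question of timing.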

What to avoid for better PostScript

  • Compositing, or more specifically any sort of transparency (including image transparency) or any composites other than SRC_OVER. In PDF, composites are called "Blend Modes" and you'll see them used when blending one image over another, or even when printing a page containing a "highlight" annotation (which is done with a Multiply composite). Any of these will cause the page to be rasterized.
  • Patterns. Tiling, shading, anything other than a solid opaque java.awt.Color will cause the page to be rasterized.
  • Bitmap fonts - sadly, we still see these quite a bit in documents created by older versions of Ghostscript. Bitmap fonts have one image per glyph which acts as a mask. The mask is done with transparency, and this will cause the page to be rasterized.
  • Embedded Fonts - Java has some deep-seated issues with fonts, which are used in a very different way inside a PDF. Many fonts that are valid in PDF will throw exceptions if parsed by the AWT, so the PDF API has its own font parsers for TrueType, Type 1 and Type 2 (compact) fonts. However, due to other assumptions within the AWT we can't just pass these fonts in to the Graphics2D for rendering: we have to draw each glyph as a Shape. In the AWT this largely boils down to the same operation, but for PostScript it means that each letter in the font has to be drawn as a path. This will lead to a much larger file, although usually it's not as bad as rasterizing. We're edging towards another approach to embedded fonts, but at the time of writing this is still the case.
  • Colors outside the sRGB gamut. Not nearly as awkward as the above points, but even if your PDF contains CMYK colors and your PostScript is destined for a CMYK printer, the colors are still going to be converted to RGB by the Java print layer, then back again by the printer. Don't expect photo-realistic printing with Java.
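
If you're drawing with Java2D yourself, the first two rules in this list can be written as a simple pre-flight check. This is a hypothetical helper that restates the rules of thumb above, not the test the Java print layer itself performs, and RasterCheck and its methods are our own names.

```java
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Composite;
import java.awt.Paint;

final class RasterCheck {
    /** True only for a fully opaque SRC_OVER composite. */
    static boolean isPlainComposite(Composite c) {
        if (!(c instanceof AlphaComposite)) return false;
        AlphaComposite ac = (AlphaComposite) c;
        return ac.getRule() == AlphaComposite.SRC_OVER && ac.getAlpha() == 1f;
    }

    /** True only for a solid, fully opaque java.awt.Color. */
    static boolean isPlainPaint(Paint p) {
        return p instanceof Color && ((Color) p).getAlpha() == 255;
    }

    /** True if, by the rules of thumb above, this state should not force rasterization. */
    static boolean likelyVector(Composite c, Paint p) {
        return isPlainComposite(c) && isPlainPaint(p);
    }
}
```

For example, RasterCheck.likelyVector(g.getComposite(), g.getPaint()) can be logged from your own drawing code to find the operation that tips a page over into a bitmap.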

Scaling

Barring the issues above, printing in recent versions of Java is good enough for most documents. However, there are a few other tips we can offer. The most common printing questions we get are about scaling (and not just when printing from Java).

PDF is a page language and a page in PDF should represent a page on the printer. If they're the same size that's usually OK (provided your printer driver isn't trying to add margins, of course), but if the sizes differ there are several options.

  • If the output page is too small the image can be clipped or scaled down to fit.
  • If the output page is bigger than the PDF page, the image can be scaled up or just centered in the page.

You can control this with the print.scaling PDF option added in Acrobat 8: set this option to None to prevent all page scaling. If you're printing from Java with our API you can choose from a few other options as well:

  • Fit - scale the page up or down to fit the printable area, preserving the aspect ratio
  • FitUnlocked - as for "Fit" but don't preserve the aspect ratio
  • ShrinkToFit - like Fit, but will only ever scale the page down (our default if not specified)
  • ShrinkToFitUnlocked - like ShrinkToFit, but don't preserve the aspect ratio.

For example, if you want to fit a Letter page exactly to A4 and don't mind losing the aspect ratio of the text, do this before spooling the file with Java:

pdf.setOption("print.scaling", "FitUnlocked");
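
The arithmetic behind these modes is simple enough to sketch. The code below is our reading of the descriptions above, expressed as horizontal and vertical scale factors; it is an illustration, not the library's implementation, and the treatment of ShrinkToFitUnlocked (each axis shrunk independently) is an assumption. PrintScaling is a hypothetical name.

```java
final class PrintScaling {
    /** Returns {scaleX, scaleY} for drawing a page of the given size into the printable area. */
    static double[] scale(String mode, double pageW, double pageH, double areaW, double areaH) {
        double sx = areaW / pageW;
        double sy = areaH / pageH;
        switch (mode) {
            case "None":
                return new double[] {1, 1};
            case "Fit": {
                double s = Math.min(sx, sy);               // same factor on both axes
                return new double[] {s, s};
            }
            case "FitUnlocked":
                return new double[] {sx, sy};              // fill the area, distort if needed
            case "ShrinkToFit": {
                double s = Math.min(1, Math.min(sx, sy));  // never enlarge
                return new double[] {s, s};
            }
            case "ShrinkToFitUnlocked":
                return new double[] {Math.min(1, sx), Math.min(1, sy)};
            default:
                throw new IllegalArgumentException("Unknown scaling mode: " + mode);
        }
    }
}
```

Fitting Letter (612 x 792 points) to an A4 area of 595 x 842 points with "Fit" gives a factor of about 0.972 on both axes, leaving white space at the top or bottom, whereas "FitUnlocked" gives about 0.972 horizontally and 1.063 vertically, which fills the sheet but stretches the text slightly.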

Duplex and other print options

You can specify other options in the PDF - see our setOption method. However, other than scaling as described above, these will only set the defaults in the print dialog of our viewer. If you're printing from your own code and want to honour these options, you'll need to turn them into appropriate PrintRequestAttributes or DocAttributes yourself.
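
As a sketch of what that translation might look like, the helper below maps the three values the PDF specification defines for its Duplex viewer preference (Simplex, DuplexFlipShortEdge and DuplexFlipLongEdge), plus a copy count, onto the standard attributes. How you read those values from the PDF is not shown; PdfPrintOptions and its methods are hypothetical names, and the values are passed in as plain arguments.

```java
import javax.print.attribute.HashPrintRequestAttributeSet;
import javax.print.attribute.PrintRequestAttributeSet;
import javax.print.attribute.standard.Copies;
import javax.print.attribute.standard.Sides;

final class PdfPrintOptions {
    /** Maps a PDF Duplex preference value to a Sides attribute, or null if unrecognised. */
    static Sides sides(String duplex) {
        if (duplex == null) return null;
        switch (duplex) {
            case "Simplex":             return Sides.ONE_SIDED;
            case "DuplexFlipLongEdge":  return Sides.TWO_SIDED_LONG_EDGE;  // same as Sides.DUPLEX
            case "DuplexFlipShortEdge": return Sides.TWO_SIDED_SHORT_EDGE; // same as Sides.TUMBLE
            default:                    return null;
        }
    }

    /** Builds request attributes from the options; anything unset or unrecognised is left out. */
    static PrintRequestAttributeSet toAttributes(String duplex, int copies) {
        PrintRequestAttributeSet atts = new HashPrintRequestAttributeSet();
        Sides s = sides(duplex);
        if (s != null) atts.add(s);
        if (copies > 1) atts.add(new Copies(copies));
        return atts;
    }
}
```

The resulting set can be passed straight to job.print(), so a PDF that asks for long-edge duplex prints that way without the user touching a dialog.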