The PDF library can convert PDF to TIFF (or other bitmap formats) very easily, but as PDF is designed for printing, each page is normally a traditional print size, such as A4 or Letter, even though the content may take up only a fraction of that page. In situations like this it's sometimes useful to trim the whitespace from the image before saving it.
There are two ways to do this. First, you can set the CropBox of the page before you convert the PDF, by calling the PDFPage.setBox method, but this means you have to know the bounds in advance. If you want to crop the image to just remove whitespace around the page (sometimes called "trimming"), you need to take a different approach.
Finding the content boundaries
The easiest approach for this is to render the PDF to a BufferedImage as normal, then scan the image to look for the boundaries. This is very simple, as you can see from the following method:
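    import java.awt.image.BufferedImage;
    import java.awt.image.ColorModel;
    import java.awt.image.WritableRaster;

    static BufferedImage trim(BufferedImage image) {
        int x1 = Integer.MAX_VALUE, y1 = Integer.MAX_VALUE, x2 = -1, y2 = -1;
        for (int x = 0; x < image.getWidth(); x++) {
            for (int y = 0; y < image.getHeight(); y++) {
                int argb = image.getRGB(x, y);
                if (argb != -1) {   // anything other than opaque white is content
                    x1 = Math.min(x1, x);
                    y1 = Math.min(y1, y);
                    x2 = Math.max(x2, x);
                    y2 = Math.max(y2, y);
                }
            }
        }
        if (x2 < 0) {
            return image;   // blank page: nothing to trim
        }
        WritableRaster r = image.getRaster();
        ColorModel cm = image.getColorModel();
        // x2 and y2 are inclusive, so add 1 to get the width and height
        r = r.createWritableChild(x1, y1, x2 - x1 + 1, y2 - y1 + 1, 0, 0, null);
        return new BufferedImage(cm, r, cm.isAlphaPremultiplied(), null);
    }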
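This will return a BufferedImage which is trimmed of all the whitespace around the edges. The test against -1 works because getRGB returns 0xFFFFFFFF, which is -1 as a signed int, when a pixel is opaque white. Any other value, including transparent or slightly off-white pixels, is treated as content.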
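Because the comparison is exact, a single near-white pixel (from anti-aliasing, or a scanned source image) is enough to stop the trim early. The following is a minimal sketch of a more forgiving variant, not part of the PDF library: the names trimWithTolerance and isBackground and the tolerance parameter are hypothetical, and it uses the standard BufferedImage.getSubimage method instead of building a child raster by hand.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class TrimExample {

    // Hypothetical helper: a pixel counts as background if it is fully opaque
    // and every colour channel is within 'tolerance' of 255.
    static boolean isBackground(int argb, int tolerance) {
        int a = (argb >>> 24) & 0xFF;
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        int min = 255 - tolerance;
        return a == 255 && r >= min && g >= min && b >= min;
    }

    // Hypothetical variant of trim(): tolerance 0 behaves like the exact test.
    static BufferedImage trimWithTolerance(BufferedImage image, int tolerance) {
        int x1 = Integer.MAX_VALUE, y1 = Integer.MAX_VALUE, x2 = -1, y2 = -1;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (!isBackground(image.getRGB(x, y), tolerance)) {
                    x1 = Math.min(x1, x);
                    y1 = Math.min(y1, y);
                    x2 = Math.max(x2, x);
                    y2 = Math.max(y2, y);
                }
            }
        }
        if (x2 < 0) {
            return image; // no content found: return the page unchanged
        }
        // Bounds are inclusive, so width and height need the +1.
        return image.getSubimage(x1, y1, x2 - x1 + 1, y2 - y1 + 1);
    }

    public static void main(String[] args) {
        // Synthetic 200x100 "page": white background, a faint speck, a black box.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 200, 100);
        g.setColor(new Color(250, 250, 250));
        g.fillRect(5, 5, 2, 2);
        g.setColor(Color.BLACK);
        g.fillRect(50, 20, 30, 40);
        g.dispose();

        BufferedImage exact = trimWithTolerance(page, 0);
        BufferedImage loose = trimWithTolerance(page, 8);
        System.out.println(exact.getWidth() + "x" + exact.getHeight()); // 75x55
        System.out.println(loose.getWidth() + "x" + loose.getHeight()); // 30x40
    }
}
```

With a tolerance of 0 the faint speck counts as content and widens the result. With a tolerance of 8 it is ignored and only the black box remains. Note that getSubimage shares pixel data with the original image, so changes to one are visible in the other.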
Now that you have the cropped BufferedImage, saving it as a PNG or similar is very easy using the ImageIO class. If you want to save the image as a TIFF, the approach outlined in this article shows you how to do this with the JAI package.
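As a rough illustration of the PNG case, the sketch below writes an image with the standard javax.imageio.ImageIO class and reads it back. The class name SaveExample, the helper savePng and the temporary file are assumptions made for the example, and a blank image stands in for the trimmed page.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SaveExample {

    // Hypothetical helper: writes the image as a PNG file.
    // ImageIO.write returns false when no suitable writer is installed,
    // so that case is turned into an exception rather than silently ignored.
    static void savePng(BufferedImage image, File out) throws IOException {
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the trimmed page image.
        BufferedImage trimmed = new BufferedImage(30, 40, BufferedImage.TYPE_INT_RGB);
        File out = File.createTempFile("trimmed", ".png");
        savePng(trimmed, out);

        // Read it back to confirm the dimensions survived the round trip.
        BufferedImage back = ImageIO.read(out);
        System.out.println(back.getWidth() + "x" + back.getHeight()); // 30x40
    }
}
```

On Java 9 and later the standard ImageIO also includes a TIFF writer, so passing "tiff" as the format name may be enough there. Whether it offers the compression options you need is something to check against your own requirements.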