Trimming whitespace from a TIFF before saving

The PDF library can convert PDF to TIFF (or other bitmap formats) very easily, but as PDF is designed for printing, the size of each page is normally a traditional print size, such as A4 or Letter, even though the content may take up only a fraction of that page. In situations like this it's sometimes useful to trim the whitespace from the image before saving it.

There are two ways to do this. First, you can set the CropBox of the page before you convert the PDF, by calling the PDFPage.setBox method; but this means you have to know the bounds in advance. If you want to crop the image to just remove whitespace around the page (sometimes called "trimming"), you need to take a different approach.

Finding the content boundaries

The easiest approach for this is to render the PDF to a BufferedImage as normal, then scan the image to look for the boundaries. This is very simple, as you can see from the following method:

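import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.WritableRaster;

static BufferedImage trim(BufferedImage image) {
  // Find the bounding box of every pixel that is not opaque white
  int x1=Integer.MAX_VALUE, y1=Integer.MAX_VALUE, x2=-1, y2=-1;
  for (int x=0;x<image.getWidth();x++) {
    for (int y=0;y<image.getHeight();y++) {
      int argb = image.getRGB(x, y);
      if (argb != -1) {        // -1 is 0xFFFFFFFF, opaque white
        x1 = Math.min(x1, x);
        y1 = Math.min(y1, y);
        x2 = Math.max(x2, x);
        y2 = Math.max(y2, y);
      }
    }
  }
  if (x2 < 0) {
    return image;              // the page is entirely white, nothing to trim
  }
  // x2 and y2 are inclusive, so add 1 to get the width and height
  WritableRaster r = image.getRaster();
  ColorModel cm = image.getColorModel();
  r = r.createWritableChild(x1, y1, x2-x1+1, y2-y1+1, 0, 0, null);
  return new BufferedImage(cm, r, cm.isAlphaPremultiplied(), null);
}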

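This will return a BufferedImage which is trimmed of all the whitespace around the edges (argb will be 0xFFFFFFFF, which is -1 as a signed int, if the pixel is opaque white). The returned image shares its pixel data with the original rather than copying it.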
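
An exact comparison against opaque white can be too strict when the page contains near-white pixels, for example from an embedded scan. The following is a minimal sketch of a tolerance-based variant, not part of the PDF library: the names TrimDemo, isBackground, trimWithTolerance and samplePage are hypothetical, the threshold value is an assumption you would tune for your own documents, and it uses the standard BufferedImage.getSubimage method in place of createWritableChild. It runs against a synthetic test page so it can be tried without a PDF.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class TrimDemo {

    // Hypothetical helper: a pixel counts as background when it is opaque and
    // its red, green and blue values are all at least `threshold` (0-255).
    static boolean isBackground(int argb, int threshold) {
        int a = (argb >>> 24) & 0xFF;
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return a == 0xFF && r >= threshold && g >= threshold && b >= threshold;
    }

    // Returns the image cropped to its non-background pixels, or the
    // original image if every pixel is background.
    static BufferedImage trimWithTolerance(BufferedImage image, int threshold) {
        int x1 = Integer.MAX_VALUE, y1 = Integer.MAX_VALUE, x2 = -1, y2 = -1;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (!isBackground(image.getRGB(x, y), threshold)) {
                    x1 = Math.min(x1, x);
                    y1 = Math.min(y1, y);
                    x2 = Math.max(x2, x);
                    y2 = Math.max(y2, y);
                }
            }
        }
        if (x2 < 0) {
            return image;
        }
        // x2 and y2 are inclusive, hence the +1
        return image.getSubimage(x1, y1, x2 - x1 + 1, y2 - y1 + 1);
    }

    // Synthetic stand-in for a rendered page: white 100x80 canvas, a 30x40
    // black rectangle at (20, 10), and one near-white speck at (5, 5).
    static BufferedImage samplePage() {
        BufferedImage img = new BufferedImage(100, 80, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 100, 80);
        g.setColor(Color.BLACK);
        g.fillRect(20, 10, 30, 40);
        g.dispose();
        img.setRGB(5, 5, 0xFFFDFDFD);
        return img;
    }

    public static void main(String[] args) {
        BufferedImage loose = trimWithTolerance(samplePage(), 250);
        BufferedImage exact = trimWithTolerance(samplePage(), 255);
        System.out.println(loose.getWidth() + "x" + loose.getHeight()); // prints 30x40
        System.out.println(exact.getWidth() + "x" + exact.getHeight()); // prints 45x45
    }
}
```

With a threshold of 255 the test is equivalent to the exact comparison, so the speck at (5, 5) stretches the bounding box; with 250 it is treated as background and only the rectangle remains.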

Now that you have the cropped BufferedImage, saving it as a PNG or similar is very easy using the ImageIO class. If you want to save the image as a TIFF, the approach outlined in this article shows you how to do this with the JAI package.
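
As an illustration of the PNG case, here is a minimal sketch using only the standard javax.imageio API. The class and method names (SavePngDemo, toPng, sample) are hypothetical, and the sample image is a blank stand-in for a trimmed page rather than real output from the PDF library.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

final class SavePngDemo {

    // Encodes the image as PNG and returns the encoded bytes.
    static byte[] toPng(BufferedImage image) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // ImageIO.write returns false if no writer handles the format
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("No PNG writer available");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for a trimmed page: a 30x40 sub-image of a 100x80 canvas.
    static BufferedImage sample() {
        BufferedImage page = new BufferedImage(100, 80, BufferedImage.TYPE_INT_RGB);
        return page.getSubimage(20, 10, 30, 40);
    }

    public static void main(String[] args) throws IOException {
        byte[] png = toPng(sample());
        // Read the bytes back to confirm the cropped size survived encoding
        BufferedImage back = ImageIO.read(new ByteArrayInputStream(png));
        System.out.println(back.getWidth() + "x" + back.getHeight()); // prints 30x40
    }
}
```

To write straight to disk instead, ImageIO.write also accepts a java.io.File as its third argument.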