Print-preview rendering

Creating multi-channel bitmaps from a PDF

When converting a PDF to a bitmap image in Java we use the AWT, and the AWT works in RGB (although the API seems to allow for non-RGB colors, there are some very deep-seated assumptions that ensure everything is collapsed to RGB before rendering).

In most cases this is ideal - your screen is RGB, after all. But PDF is at least partly print-focused, and some documents simply cannot be rendered accurately this way: documents that make use of CMYK blending, and documents that use Overprinting. The latter in particular requires every "ink" used in the PDF to be tracked individually, so collapsing colors to RGB is a problem.
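
To see why overprinting needs per-ink tracking, here is a minimal sketch of a single pixel in a toy model. It is not code from our library: the class name, the `paint` helper and the ink list are made up for illustration, and it ignores details such as the PDF overprint mode and blending. It only shows the basic rule: with overprint on, inks the new object does not address are left alone; with overprint off, they are erased.

```java
import java.util.Arrays;

public class OverprintSketch {
    // Ink order for this toy model; "SpotVarnish" is just an example name.
    static final String[] INKS = {"Cyan", "Magenta", "Yellow", "Black", "SpotVarnish"};

    /**
     * Paint one pixel. "paint" holds the ink amounts (0..1) of the new object,
     * "addressed" flags which inks the object's colorspace actually specifies.
     */
    static float[] paint(float[] backdrop, float[] paint, boolean[] addressed, boolean overprint) {
        float[] out = new float[backdrop.length];
        for (int i = 0; i < out.length; i++) {
            if (addressed[i]) {
                out[i] = paint[i];      // the object sets this ink
            } else if (overprint) {
                out[i] = backdrop[i];   // overprint: unaddressed inks show through
            } else {
                out[i] = 0f;            // knockout: unaddressed inks are erased
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] backdrop = {0.2f, 0.8f, 0f, 0.1f, 0f};              // existing CMYK content
        float[] varnish = {0f, 0f, 0f, 0f, 1f};                     // 100% of the spot ink
        boolean[] onlySpot = {false, false, false, false, true};    // object addresses the spot ink only
        System.out.println(Arrays.toString(INKS));
        System.out.println("overprint: " + Arrays.toString(paint(backdrop, varnish, onlySpot, true)));
        System.out.println("knockout:  " + Arrays.toString(paint(backdrop, varnish, onlySpot, false)));
    }
}
```

Once the backdrop has been collapsed to RGB there is no "Magenta" value left to preserve, which is why the overprint case can't be reproduced in an RGB-only renderer.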

Release 2.28.3 of our PDF Library adds support for rendering each ink individually. It works by creating N images, one for each ink, and rendering to each one separately (this is currently N times slower than RGB rendering as a result, but we're working on that). The individual channels are then combined into a single N-channel image (based on a DeviceNColorSpace) from which individual channels can be extracted, or the whole image converted to RGB to give a visual representation of what the file would look like on a printer supporting spot colors: a print-preview mode identical to the "Output Preview" option in Adobe Acrobat.
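
The "render N images, then combine" idea can be sketched with nothing but the standard AWT raster classes. This is an illustration only, not the library's implementation: `CombineChannels`, `plate` and `combine` are hypothetical names, and it assumes each ink has already been rendered to a same-sized 8-bit grayscale image.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.util.Arrays;

public class CombineChannels {

    /** Hypothetical helper: a w x h grayscale "plate" filled with one ink level (0-255). */
    static BufferedImage plate(int w, int h, int level) {
        BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        int[] samples = new int[w * h];
        Arrays.fill(samples, level);
        img.getRaster().setSamples(0, 0, w, h, 0, samples);
        return img;
    }

    /** Stack same-sized 8-bit grayscale plates into one raster with one band per ink. */
    static WritableRaster combine(BufferedImage... plates) {
        int w = plates[0].getWidth();
        int h = plates[0].getHeight();
        WritableRaster out = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, w, h, plates.length, null);
        for (int band = 0; band < plates.length; band++) {
            // Copy the single band of each plate into the matching band of the output
            int[] samples = plates[band].getRaster().getSamples(0, 0, w, h, 0, (int[]) null);
            out.setSamples(0, 0, w, h, band, samples);
        }
        return out;
    }

    public static void main(String[] args) {
        WritableRaster inks = combine(plate(4, 4, 255), plate(4, 4, 0), plate(4, 4, 128));
        System.out.println(inks.getNumBands() + " bands, band 2 at (0,0) = " + inks.getSample(0, 0, 2));
    }
}
```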

If all this sounds complicated, the API is actually very simple: nothing more than a new PDFParser.SEPARATIONS ColorModel to pass into the various methods in the PagePainter class, one new helper method in the DeviceNColorSpace class, and a few more in DeviceNColorSpace.Builder.

import org.faceless.pdf2.*;
import java.awt.image.*;
import java.util.List;
import java.awt.color.ColorSpace;

PDF pdf = ...
PDFParser parser = new PDFParser(pdf);
PagePainter painter = parser.getPagePainter(pdf.getPage(0));
BufferedImage image = painter.getImage(200, PDFParser.SEPARATIONS);
// image is an "N-channel" image, containing at least Cyan, Magenta,
// Yellow and Black, plus any spot colors used in the image.

// To find out what colors are used in the image, try this.
// Will output e.g. "[Cyan, Magenta, Yellow, Black, Pantone Reflex Blue CV]"
DeviceNColorSpace cs = (DeviceNColorSpace)image.getColorModel().getColorSpace();
List<String> names = cs.getNames();

// To convert to RGB
ColorSpace sRGB = ColorSpace.getInstance(ColorSpace.CS_sRGB);
BufferedImageOp op = cs.getColorConvertOp(sRGB);
BufferedImage rgbimage = op.filter(image, null);
// "rgbimage" is, not surprisingly, an RGB image.

// To extract a single channel, say for an "Output Preview" type interface.
String ink = "Cyan";    // any name in the list of names.
SpotColorSpace spot = cs.getComponentColorSpace(names.indexOf(ink));
BufferedImageOp channelOp = cs.getColorConvertOp(spot);
BufferedImage channelImage = channelOp.filter(image, null);
// "channelImage" is a grayscale image: a pixel value of "0" means no ink,
// a value of "255" means 100% ink.
// For other possibilities see DeviceNColorSpace.getColorConvertOp() API docs

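// Or to remove a single channel...
DeviceNColorSpace.Builder builder = new DeviceNColorSpace.Builder(cs);
// (remove the "Cyan" component from the builder here; see the
// DeviceNColorSpace.Builder API docs for the method to use)
DeviceNColorSpace csWithoutCyan = builder.create();
BufferedImageOp removeOp = cs.getColorConvertOp(csWithoutCyan);
BufferedImage imageWithoutCyan = removeOp.filter(image, null);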

// And to convert the image back to a PDF, just turn it into a PDFImage
PDFImage pdfimage = new PDFImage(image);

There are all sorts of possibilities - for example, by passing in a ColorSpace based on the image colorspace but without a particular component, it's possible to generate a bitmap of the PDF without that spot color. This could be used to generate a "mask" bitmap from a PDF, excluding technical separations - separations that represent fold or cut lines, rather than ink.
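
As a rough sketch of the "mask" idea, the code below marks every pixel that carries any real ink while skipping technical separations. It uses only standard AWT classes and makes several assumptions: the per-ink samples are available as a Raster with one band per ink name and its origin at (0,0), and the ink name "CutLine" is a made-up example. `InkMask` and `mask` are hypothetical names, not part of our API.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.util.List;
import java.util.Set;

public class InkMask {

    /**
     * Build a grayscale mask: 255 where any non-technical ink is present, 0 elsewhere.
     * Assumes band i of "inks" holds the samples for names.get(i).
     */
    static BufferedImage mask(Raster inks, List<String> names, Set<String> technical) {
        int w = inks.getWidth();
        int h = inks.getHeight();
        BufferedImage mask = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster out = mask.getRaster();
        for (int band = 0; band < names.size(); band++) {
            if (technical.contains(names.get(band))) {
                continue;                           // skip fold lines, cut lines and so on
            }
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    if (inks.getSample(x, y, band) > 0) {
                        out.setSample(x, y, 0, 255);    // some real ink lands here
                    }
                }
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        // Two inks on a 2x1 page: Black at the left pixel, a cut line at the right pixel
        WritableRaster inks = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, 2, 1, 2, null);
        inks.setSample(0, 0, 0, 200);
        inks.setSample(1, 0, 1, 255);
        BufferedImage m = mask(inks, List.of("Black", "CutLine"), Set.of("CutLine"));
        System.out.println(m.getRaster().getSample(0, 0, 0) + " " + m.getRaster().getSample(1, 0, 0));
    }
}
```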

An example

How much does this matter? For the majority of PDFs, switching to the "per separation" rendering model described here will make no difference, other than making things slower. But for certain PDFs making use of Overprint, it will be critical. As an example we've been testing with sample files provided by the Ghent Workgroup which demonstrate this sort of workflow.

Here's a traditional RGB rendering on the left, next to a rendering in "per separation" mode.

The orange layer is in an ink called "SpotVarnish" - it uses overprinting to cover the content, but shouldn't replace it. Using the code above it's simple to extract an image showing just the varnish layer, below on the left. And for contrast, on the right is the image without the varnish layer (notice its absence in the color bar).

Hopefully this gives you an idea of the possibilities with this new functionality. The almost trivial interface presented here is the tip of a ten-thousand-line iceberg of code, and the focus in the first release (2.28.3) is correctness rather than speed. We expect speed to improve in future releases. And as always we'd welcome any feedback on what we can do to make this new functionality more useful.