Software developers delighted with "wonderful" PDF Library

Swiss software house hails BFO for having the most stable support and cleanest API in the Java PDF market.


PDF Library 2.18 and the OutputProfiler class

We detail the major new change in 2.18, the OutputProfiler class


Big Faceless releases PDF Library 2.17.1

BFO have just released a new update to their PDF Library, the first for a few months. Here we describe a few features that are in the new release in a bit more detail.


Software house phihochzwei UG use BFO to speed up development

German-based software developers use BFO's PDF Library, Viewer & Report Generator to satisfy customer requirements.


Text Highlighting with the PDF Viewer

The PDF Library ships with a PDF Viewer that offers the possibility to programmatically highlight words or sentences that match a filter.

You can highlight pieces of information in an embedded PDF online, or hide something from the Viewer (in that case, remember to also disable text selection and copy-and-paste in your document). We will quickly go through both cases with an example: this product catalog file.

Working with the viewer

First things first, let's see how to create a basic text highlighter and how to apply it to the PDF viewer. To load a document and display it in the Viewer, use the following snippet:

    final List<ViewerFeature> features = new ArrayList<>(ViewerFeature.getAllEnabledFeatures()); // 1
    // Here will go our TextHighlighter related code
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            PDFViewer viewer = PDFViewer.newPDFViewer(features); // 2
            viewer.loadPDF(new File(pathToPDF)); // 3
        }
    });

In this short piece of code, we first fetch all the Viewer features that are typically enabled (highlighting, annotation, etc.) (1) and create a new Viewer with them (2). We then feed our PDF document to the Viewer via .loadPDF (3).

Adding a first text highlighter

There are two ways of using the TextHighlighter: you can either supply a list of words to be highlighted, or supply a regular expression and let the pattern decide what matches. In either case the basis is the same:

    TextHighlighter wordHighlighter = new TextHighlighter();
    //configure the text highlighter here

You can repeat this pattern to have several highlighters at once. Now, let us take a look at how we can configure a TextHighlighter. The most basic way is via the method .addWord(String word).

For example, after calling addWord("US") and addWord("AU"), all the occurrences of US or AU in our list will be highlighted by a yellow rectangle. However, as you can see by running the example, words that contain US or AU (e.g. AU01) are also highlighted.

We can use a regular expression instead and replace the addWord("AU") call with .setPattern(Pattern pattern).

    TextHighlighter wordHighlighter = new TextHighlighter();
    String patternStr = "\\bAU\\b";
    Pattern pattern = Pattern.compile(patternStr);
    wordHighlighter.setPattern(pattern);

But keep in mind that setPattern cannot be used together with addWord: if you want to catch several words with setPattern, you need to handle that in your regular expression!
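To see how the pattern itself behaves without starting the Viewer, here is a minimal sketch that uses only java.util.regex. The PatternDemo class and the sample text are made up for illustration and are not part of the PDF Library. The last pattern shows one way of covering several words with a single regular expression, using alternation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PDF Library: it only shows
// which pieces of text each regular expression would select.
public class PatternDemo {

    // Return every match of the pattern found in the text, in order.
    static List<String> matches(Pattern pattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "AU01 ships to AU and US, AUS does not";

        // Plain substring: also hits the AU inside AU01 and AUS
        System.out.println(matches(Pattern.compile("AU"), text));               // [AU, AU, AU]

        // Word boundaries: only the standalone AU
        System.out.println(matches(Pattern.compile("\\bAU\\b"), text));         // [AU]

        // Alternation: several whole words in one pattern
        System.out.println(matches(Pattern.compile("\\b(?:AU|US)\\b"), text));  // [AU, US]
    }
}
```

A pattern built this way could then be handed to setPattern in place of several addWord calls.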

Highlighting customization

PDF Library also allows you to customize the text highlighting used by TextHighlighter. For instance:

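    // FR and SE: filled, semi-transparent blue block
    TextHighlighter frSeHighlighter = new TextHighlighter();
    frSeHighlighter.addWord("FR");
    frSeHighlighter.addWord("SE");
    frSeHighlighter.setHighlightType(TextTool.TYPE_BLOCK, new Color(0xaa0000ff, true), new BasicStroke(), 1);
    // AU: red outline only
    TextHighlighter auHighlighter = new TextHighlighter();
    auHighlighter.addWord("AU");
    auHighlighter.setHighlightType(TextTool.TYPE_OUTLINE, new Color(0xff0000), new BasicStroke(), 0.2f);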

With this setup, FR and SE occurrences will be highlighted by a transparent blue rectangle while AU will be highlighted by a rectangular red box.

To go further, we could use this mechanism to preserve the anonymity of our producers (note we're only preventing the text from appearing in the Viewer; for permanent removal it would be better to redact it). First we need to remove the TextTool feature from the original list (to make sure no one can select and copy text from our document):

    final ArrayList<ViewerFeature> features = new ArrayList<>(ViewerFeature.getAllEnabledFeatures());
    Iterator<ViewerFeature> iter = features.iterator();
    while (iter.hasNext()) {
        ViewerFeature viewerFeature = iter.next();
        if (viewerFeature.getName().equals("TextTool")) {
            iter.remove(); // drop the text selection tool
        }
    }

Then we can highlight names with a black block:

    TextHighlighter wordHighlighter = new TextHighlighter();
    wordHighlighter.addWord("Pierre Dubois");
    wordHighlighter.addWord("John Watson");
    wordHighlighter.setHighlightType(TextTool.TYPE_BLOCK, Color.BLACK, new BasicStroke(), 1);

And we are done!

New features in the PDF Library 2.16

Our major new release has been several months coming - here's why.


Text Extraction Using BFO's PDF Library

How to extract text from a PDF using BFO's PDF Library API. We show, with code examples, how it can be done.


New features in the PDF Library 2.15

It's been 4 months since our last PDF API release, so what does it have in store? Besides changes to the page list, there are two major new areas:

  • PDF/A-2 and PDF/A-3 support has been added
  • The Swing classes now support linearized loading

New PDF/A revisions

We're seeing more and more companies adopt ISO 19005, aka PDF/A, and we're pleased to have added support for revisions 2 and 3 of the specification. Of course there's no need to change if you're already targeting PDF/A-1 - if you're not familiar with the new revisions then this is a good summary. But for those that need the new features allowed in the later revisions of the specification, this new release is for you. We think the most significant are:

  • Embedded files are now allowed: those files must also be PDF/A for PDF/A-2, but this is relaxed in PDF/A-3
  • JPEG2000 compression is now allowed
  • Transparency is now allowed

Currently we only support creation of PDF/A-2b and PDF/A-3b documents, but support for the "U" variation (for Unicode) will be in an upcoming release.

Linearization support in the viewer

For some customers, this is the big one. Linearized documents are designed to be displayable before the entire document has been downloaded, but although we added support for this to the core API in the previous release 2.14, it took until now to get this added to the viewer. It's a complex change, because it invalidates some previous assumptions (namely that pdf.getPage() will return immediately).

The good news is it's in and working, and for a demonstration point your web-browser to our example applet and select the 12MB "Linearized Example" from the drop-down list above it. The first page should show within a few seconds, but if you check the title bar you'll see a percentage showing how much of the document is actually downloaded.

How to take advantage of Linearization

To make use of this new feature there's actually very little you need to do. The PDF viewer will do this automatically if the following conditions are met:
  1. The PDF you're loading has to be linearized - probably goes without saying, but we'll say it anyway. Our PDF Library has been able to create linearized PDFs for a long time, and of course most other tools can create them too - they're variously called "Web Ready" or "Optimized" PDF in Acrobat.
  2. The PDF must be loaded from an HTTP or HTTPS URL. Our viewer has an API method to do this: PDFViewer.loadPDF(URL) - and if you're using the viewer as an applet you can do this by specifying the URL (relative or absolute) with the pdf parameter to the applet. See the PDFViewerApplet applet for details.
  3. The web-server serving the PDF must support the Range HTTP header in requests, and it must advertise this by adding Accept-Ranges: bytes in the initial response. Most do, if the file is a static file and being served from the filesystem by the default method.

    If you've got your own servlet which is serving the files, as you might if they were loaded from a database for instance, then you need to make sure you've implemented this. Your servlet will see an initial request for the PDF, and if the PDF is linearized that will be cancelled and many other requests made for smaller byte ranges. So if retrieving the PDF is a slow operation, perhaps because it's being retrieved from a remote location or a slow database, or perhaps because it might be modified by another process, then it makes sense to hold a copy of the PDF locally which can be discarded if there are no requests for a set period of time (we'd suggest 30 seconds to be safe).
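As a rough illustration of the byte-range handling such a servlet would need, the sketch below parses the value of a Range request header. ByteRange is a hypothetical helper, not part of the PDF Library, and it makes simplifying assumptions: only a single range is handled (bytes=a-b, bytes=a- or bytes=-n), and anything else is rejected by returning null.

```java
// Hypothetical helper for a servlet that serves PDFs itself;
// it is not part of the PDF Library.
public class ByteRange {
    final long start;   // first byte of the range, inclusive
    final long end;     // last byte of the range, inclusive

    ByteRange(long start, long end) {
        this.start = start;
        this.end = end;
    }

    // Parse a single-range header value such as "bytes=0-1023",
    // "bytes=500-" or "bytes=-200" against a resource of the given length.
    // Returns null if the header is absent, malformed, multi-range
    // or cannot be satisfied.
    static ByteRange parse(String header, long length) {
        if (header == null || !header.startsWith("bytes=") || header.indexOf(',') >= 0) {
            return null;
        }
        String spec = header.substring(6).trim();
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            long start, end;
            if (first.isEmpty()) {
                // Suffix form: the final n bytes of the file
                if (last.isEmpty()) {
                    return null;
                }
                long n = Long.parseLong(last);
                if (n <= 0) {
                    return null;
                }
                start = Math.max(0, length - n);
                end = length - 1;
            } else {
                start = Long.parseLong(first);
                // An open or over-long end is clamped to the last byte
                end = last.isEmpty() ? length - 1 : Math.min(Long.parseLong(last), length - 1);
            }
            if (start < 0 || start > end || start >= length) {
                return null;
            }
            return new ByteRange(start, end);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Number of bytes to send for this range
    long size() {
        return end - start + 1;
    }

    // Value for the Content-Range header of a 206 response
    String contentRange(long length) {
        return "bytes " + start + "-" + end + "/" + length;
    }
}
```

With something like this in place, a servlet would answer a valid range with status 206 (Partial Content), a Content-Range header and only the requested bytes, and would include Accept-Ranges: bytes in its responses so the viewer knows ranges are supported.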

Linearization and custom viewer features

If you've modified the viewer to add your own custom features, then there are more things to consider. First, if you're not loading linearized documents then you shouldn't need to worry too much: your features will still work, almost certainly without any changes required.

If you want to load linearized PDFs and use your custom features, then some work might be required. The main thing to remember is that a call to pdf.getPage(), or indeed any other code that returns a data structure of some sort from the PDF (form fields, bookmarks, file attachments etc.) might not return immediately - it might trigger a load. If you're doing this on the Swing thread then this will lock the thread, which of course is a bad thing.

To avoid this we've added the LinearizedSupport class to the viewer package. This is an easy way of adding callbacks, so your task will be run when the page is loaded. Let's say, for example, that your feature is going to jump to a specific page in the file when activated. Previously your code might have looked like this:

    public void action(ViewerEvent event) {
        List<PDFPage> pages = pdf.getPages();
        PDFPage page = pages.get(pagenumber);
        // ... display the page in the viewer
    }

This will jump the viewer to the page when run, but if that page hasn't been loaded yet the Swing thread will lock until it has (on the pages.get() line), which will make the application unresponsive. A linearization-aware approach would be to replace this with the following:

    public void action(ViewerEvent event) {
        final DocumentPanel dp = getViewer().getActiveDocumentPanel();
        LinearizedSupport support = dp.getLinearizedSupport();
        support.invokeOnPageLoadWithDialog(pagenumber, new Runnable() {
            public void run() {
                // ... display the page in the viewer
            }
        });
    }

This will bring up a loading dialog while the requested page is loading and switch pages on completion - or, if the page is already loaded, will switch immediately. The LinearizedSupport class has several other methods which allow you to schedule tasks when the PDF has loaded the required section of the file.
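The general idea behind this kind of callback (run the task at once if the page is available, otherwise queue it until the loader reports the page) can be sketched in plain Java. The PageLoadCallbacks class below is a hypothetical illustration with made-up method names; it is not how LinearizedSupport is implemented, and it leaves out the loading dialog and any Swing threading.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the callback idea only;
// this is NOT the LinearizedSupport implementation.
public class PageLoadCallbacks {
    private final Set<Integer> loaded = new HashSet<>();
    private final Map<Integer, List<Runnable>> waiting = new HashMap<>();

    // Run the task now if the page is already loaded, otherwise queue it.
    public void invokeOnPageLoad(int pagenumber, Runnable task) {
        synchronized (this) {
            if (!loaded.contains(pagenumber)) {
                waiting.computeIfAbsent(pagenumber, k -> new ArrayList<>()).add(task);
                return;
            }
        }
        task.run();
    }

    // Called by the loader when a page becomes available:
    // marks it as loaded and runs any tasks queued for it.
    public void pageLoaded(int pagenumber) {
        List<Runnable> tasks;
        synchronized (this) {
            loaded.add(pagenumber);
            tasks = waiting.remove(pagenumber);
        }
        if (tasks != null) {
            for (Runnable task : tasks) {
                task.run();
            }
        }
    }
}
```

In a Swing application the queued tasks would normally be handed to SwingUtilities.invokeLater rather than run on the loader's thread, so that the user interface is only ever touched from the Swing thread.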

BFO PDF Library 2.15 - but what happened to 2.14.1?

You released 2.14.1 of your PDF API yesterday, and today there's a 2.15. What are you people playing at? Read on, we'll explain.


Valuation Företagsvärderingar - creating professional PDF reports & graphs with BFO Software

Swedish company valuations leader creates valuation reports for clients with the BFO Report Generator and BFO Graph Library.


Archiving PDF Documents with BFO for the Austrian Notaries Chamber

Long term PDF/A archiving for the Austrian Notaries Chamber, thanks to cyberDOC and BFO.


Converting PDFs to bitmap PDFs

When only a raster will do, how to do it efficiently

There are many situations where a PDF has to be "rasterized" - the contents of each page turned into a bitmap image - such as when a PDF is being converted to PDF/A and the page contents cannot be repaired. This article shows how to do it efficiently.


ObjectiveIT Integrates BFO's Report Generator into Insurance Tariff Comparison Software

ObjectiveIT develops an insurance tariff comparison solution for their insurance broker clients with the Report Generator.


BFO releases Java PDF Library 2.13

A bundle of small changes, and the permissions framework.

We've put out our first PDF library in 5 months, and although there are a lot of small changes there are very few headline grabbers. Perhaps the most interesting is the ability to restrict operations in the viewer with permissions - here we go into that framework in a little more detail.


The Firefox pdf.js Viewer

We've been getting a few emails asking about the new "pdf.js" viewer in Firefox, and why some of our documents don't render correctly in that viewer. Read on to find out why.


Odds and Ends - PDF Valentines Cards

Because it's Friday

This challenge was too good to resist. We've neglected to make our cards PDF/A compliant, which you are welcome to interpret as a commentary on the impermanence of romantic love, or perhaps it would have just taken longer to do.

Either way we hope you had a happy Hallmark day. The code is below, and if you want to generate your own cards for someone you love (or even someone you don't) you can do so with this form.

import org.faceless.pdf2.*;
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.awt.geom.*;
import java.awt.*;

public class ValentineServlet extends HttpServlet {

  public static PDF makeCard(String to, String from) {
    PDF pdf = new PDF();
    PDFPage page = pdf.newPage("A5");

    // Make a heart.
    GeneralPath p = new GeneralPath();
    p.moveTo(0, -10);
    p.curveTo(20, 30, 60, 30, 70, -10);
    p.curveTo(70, -30, 60, -40, 50, -50);
    p.lineTo(0, -90);
    p.lineTo(-50, -50);
    p.curveTo(-60, -40, -70, -30, -70, -10);
    p.curveTo(-60, 30, -20, 30, 0, -10);
    Rectangle2D r = p.getBounds();

    // Draw the heart once onto its own small canvas, shifted so
    // that the outline fits inside the canvas bounds
    float linewidth = 8;
    PDFCanvas heart = new PDFCanvas((float)r.getWidth()+linewidth, (float)r.getHeight()+linewidth);
    PDFStyle style = new PDFStyle();
    style.setFillColor(new Color(255, 0, 128));
    style.setLineColor(new Color(128, 0, 0));
    style.setLineWeighting(linewidth);
    heart.setStyle(style);
    heart.transform(AffineTransform.getTranslateInstance(-r.getMinX()+linewidth/2, -r.getMinY()+linewidth/2));
    heart.drawShape(p);

    // Draw loads of hearts randomly rotated and
    // positioned onto a canvas
    PDFCanvas canvas = new PDFCanvas(page.getWidth(), page.getHeight());
    for (int i=0;i<200;i++) {
      AffineTransform t = new AffineTransform();
      // Rotate left/right by <= 45°, scale between roughly 0.67x and 2x
      t.rotate((Math.random() - 0.5) * Math.PI / 2);
      t.translate(Math.random() * page.getWidth() - heart.getWidth()/2, Math.random() * page.getHeight() - heart.getHeight()/2);
      double scale = 1 / (Math.random() + 0.5);
      t.scale(scale, scale);
      // Apply the transform to this heart only
      canvas.save();
      canvas.transform(t);
      canvas.drawCanvas(heart, 0, 0, heart.getWidth(), heart.getHeight());
      canvas.restore();
    }
    page.drawCanvas(canvas, 0, 0, canvas.getWidth(), canvas.getHeight());

    // Add the text
    PDFStyle textstyle = new PDFStyle();
    textstyle.setFont(new StandardFont(StandardFont.HELVETICA), 40);

    PDFStyle smallstyle = new PDFStyle(textstyle);
    smallstyle.setFont(new StandardFont(StandardFont.HELVETICABOLDOBLIQUE), 24);

    LayoutBox box = new LayoutBox(page.getWidth());
    if (to != null) {
      box.addText("Dear "+to+"\n\n", smallstyle, null);
    }
    box.addText("Roses are red\nViolets are blue\nHere's a PDF\nJust for You\n\n", textstyle, null);
    textstyle.setFont(new StandardFont(StandardFont.HELVETICABOLDOBLIQUE), 24);
    box.addText("Nothing says \"I Love You\"\nlike ISO PDF 32000-1:2008.\n\n", smallstyle, null);
    box.addText("Happy Valentines Day\nfrom ", smallstyle, null);
    if (from != null) {
      box.addText(from+" and ", smallstyle, null);
    }
    box.addText("BFO", smallstyle, null);
    page.drawLayoutBox(box, 50, 500);

    return pdf;
  }

  public void doGet(HttpServletRequest req, HttpServletResponse res) throws IOException {
    String from = req.getParameter("from");
    String to = req.getParameter("to");
    // Treat blank parameters as missing
    if (from != null && from.trim().length() == 0) {
      from = null;
    }
    if (to != null && to.trim().length() == 0) {
      to = null;
    }
    PDF pdf = makeCard(to, from);
    // Render to memory first so the Content-Length is known
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    pdf.render(out);
    res.setContentType("application/pdf");
    res.setContentLength(out.size());
    out.writeTo(res.getOutputStream());
  }

  public static void main(String[] args) throws Exception {
    String to = args.length > 0 ? args[0] : null;
    String from = args.length > 1 ? args[1] : null;
    PDF pdf = makeCard(to, from);
    try (OutputStream out = new FileOutputStream("valentine.pdf")) {
      pdf.render(out);
    }
  }
}

XFA Forms


The "P" in PDF stands for "Portable", and PDF is now an ISO Specification. So you could be forgiven for being surprised when you learn about XFA. We're asked about it a lot so what follows is a bit of a FAQ.

What is XFA?

XFA stands for "XML Forms Architecture", and it's been part of Acrobat since Acrobat 6. It's an XML syntax which defines the document (the whole document, not just the form fields) and is embedded inside the PDF. While the specification itself is open and available, it's not part of the ISO PDF specification. It's also long (1500+ pages) and complex, having gone through 10 revisions since Acrobat 6.

Why does it exist?

Well, the original forms in PDF are arguably a bit of a flawed design and there are a lot of things that could have been done better, so there was room for improvement. XFA is a dialect of XML, which is a sensible container format, and it separates data from content in much the same way as the W3C XForms specification, which is undeniably a good thing.

So what are the problems with XFA?

Personally I have quite a list, but the main one is that XFA replaces, not augments, the PDF specification: the PDF file is now just a container, and the entire document is defined in the XFA layer. It undoubtedly warranted a new XFA file format; by trying to elbow it in via the existing standard of PDF, Adobe ensured a generation of confusion and annoyance from third party vendors and their customers.

Which tools support it?

For full support, you need Adobe's own products. Our API has limited support as described below, and we expect other third-party products to have support ranging from none to limited.

How do I create an XFA document?

You need an XFA-aware PDF producer, which is likely to be Adobe LiveCycle. When you save your document it will save it as an XFA PDF, and you'll have two options:
  • By default, the XFA-enabled PDF is just a basic shell around the XFA document. The entire document is defined in XFA, and an application that's not aware of XFA simply gets a single page PDF requesting you use a newer version of Acrobat.
  • You can also save your XFA PDF in "compatibility" mode, which will also create the pages, form fields and other content in the normal PDF way - the document is effectively stored twice, once as XFA, once as PDF. An XFA-aware application like Acrobat will read from the XFA layer (and ignore the PDF layer), and a non-XFA aware application will ignore the XFA layer and use the PDF layer. Obviously, subsequent edits should be made with a tool that can keep the two in sync.

What support do BFO tools have for XFA?

  • For PDFs saved without the "compatibility" layer, almost none. You can retrieve or update the XFA object as an XML document, or you can update just the "datasets" object, which is the data model. This effectively allows you to read and write the form values, although you can't see the fields themselves. You can also read/write the document metadata (author, title etc.) but anything else related to document content is unavailable: you can't access the document pages (the pagelist will always return a single dummy page) and you can't view or edit the form fields.
  • PDFs saved with a "compatibility" layer can be accessed for reading in a normal way - the PDF pages are valid so you can display them in our viewer or list the form fields and their content. You can also update the values of the form fields (we synchronize the XFA data to match) but any other changes to the PDF will not be synchronized and so will be ignored by Acrobat - so changes like this should be avoided.

    The final thing you can do with compatibility XFA documents is delete the XFA layer. Once removed, Acrobat will treat the PDF as a normal PDF and pages can be modified, form fields added or removed without problems.

How do I know what sort of PDF I have?

  • To identify an XFA document, you can check the XFAForm feature in the PDF OutputProfile:
    boolean xfa = pdf.getBasicOutputProfile().isSet(OutputProfile.Feature.XFAForm);
  • Identifying a non-compatibility layer PDF is trickier. Our API will only find a single page and no form fields, and most XFA documents would contain at least one field so this is probably a good test. The only way to know for sure is if you open the PDF with our viewer (or any non-Acrobat viewer) and you see "To view the full contents of this document, you need a later version of the PDF viewer", "If this message is not eventually replaced by the proper contents of the document, your PDF viewer may not be able to display this type of document." or other words to that effect.

How do I delete the XFA layer and what are the consequences?

How is very simple: with our API, just call
There are some XFA features that cannot be supported in PDF. For example, if your form allows you to choose date fields from a date picker then deleting the XFA will remove that functionality, and in general anything related to field validation will probably go as well (although with some effort it's probably possible to reimplement this with regular PDF JavaScript).

For documents that are going through a final stage of processing before being sent out, and where the customer isn't expected to modify the form, removing the XFA layer should be fine.

What is the best practice for using XFA?

  1. If there are any products in the PDF's life cycle that are not produced by Adobe - this includes general tools like ours, PDF viewers (perhaps on your customer's machine), any archival requirements like those imposed by PDF/A or print service suppliers - then the best practice is to avoid it. Support from third-party vendors is extremely limited and likely to stay that way.
  2. If you have to use XFA, then always save your PDF with a "compatibility" layer. This will allow basic modifications as described above, and will give you the option of deleting the XFA layer if necessary.

How to print with "Comments Summary"

This article shows how you can create a custom viewer feature that duplicates the functionality of Acrobat's "Print with Comments Summary" feature.


New features in PDF Library 2.12

What have we been up to?

Yesterday we released our first PDF Library for a few months, version 2.12, so it's a good time to give a bit of a summary of the changes.


Client Customizes PDF Viewer Using Source Code

Client adopts BFO's customizable Java PDF Viewer for their project.