The PDF Library ships with a PDF Viewer that can programmatically highlight words or sentences matching a filter.
You can use this to highlight pieces of information in a PDF embedded online, or to hide something from the Viewer (in that case, remember to also prevent text selection and copy-pasting from your document). We are going to quickly browse through those cases with an example: this product catalog file.
Working with the viewer
First things first, let's see how to create a basic text highlighter and how to apply it to the PDF viewer. To load a document and display it in the Viewer, use the following snippet:
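    // JDK imports; the PDF Library classes (PDFViewer, ViewerFeature) must be imported as well
    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;
    import javax.swing.SwingUtilities;

    final List<ViewerFeature> features = new ArrayList<>(ViewerFeature.getAllEnabledFeatures()); // 1

    // Here will go our TextHighlighter related code

    // Create the Viewer on the Swing event dispatch thread
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            PDFViewer viewer = PDFViewer.newPDFViewer(features); // 2
            viewer.loadPDF(new File(pathToPDF)); // 3 (pathToPDF is the path to your PDF file)
        }
    });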
In this short piece of code, we first fetch all the Viewer features (1) that are typically enabled (highlighting, annotation, etc.) and create a new Viewer with them (2). We then feed our PDF document to the Viewer via .loadPDF (3).
Adding a first text highlighter
There are two ways of using the TextHighlighter: you can either give it a list of words to highlight, or give it a regular expression pattern that decides what gets highlighted. In either case the basis is the same:
    TextHighlighter wordHighlighter = new TextHighlighter();
    // configure the text highlighter here
    ...
    ...
    features.add(wordHighlighter);
You can repeat this pattern to have several highlighters at once.
Now, let us take a look at how we can configure a TextHighlighter.
The most basic way is via the method .addWord(String word).
    wordHighlighter.addWord("US");
    wordHighlighter.addWord("AU");
With this code, all the occurrences of US or AU in our catalog will be highlighted by a yellow rectangle. However, as you can see by running the example, words that contain US or AU (e.g. AU01) are also highlighted.
To match AU only as a whole word, we can use a regular expression instead and replace addWord("AU") with .setPattern(Pattern pattern).
    TextHighlighter wordHighlighter = new TextHighlighter();
    // \b is a word boundary, so "AU" matches but "AU01" does not
    String patternStr = "\\bAU\\b";
    Pattern pattern = Pattern.compile(patternStr);
    wordHighlighter.setPattern(pattern);
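To see why the word boundaries matter, the expression can be tried on plain strings with java.util.regex alone, without the Viewer. This is only a small sketch: the class name WordBoundaryDemo, the helper containsMatch and the sample strings are made up for illustration, and the plain "AU" pattern merely imitates the substring behaviour described for addWord.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Same expression as the one given to setPattern in the article.
    static final Pattern WHOLE_WORD = Pattern.compile("\\bAU\\b");
    // Plain substring search, for comparison only.
    static final Pattern SUBSTRING = Pattern.compile("AU");

    // True if the pattern matches anywhere inside the text.
    static boolean containsMatch(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        // Illustrative sample strings, not taken from the catalog file.
        for (String sample : new String[] {"AU", "AU01", "Made in AU."}) {
            System.out.println(sample
                    + " -> whole word: " + containsMatch(WHOLE_WORD, sample)
                    + ", substring: " + containsMatch(SUBSTRING, sample));
        }
    }
}
```

Only the substring pattern matches AU01, because the digit following AU is a word character and so no boundary exists at that position.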
But keep in mind that setPattern cannot be used together with addWord: if you want to catch several words with setPattern, you need to handle them all in your regular expression!
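One way to cover several words with a single expression is an alternation wrapped in word boundaries. The helper below is a sketch under that assumption; MultiWordPattern and wholeWords are hypothetical names, not part of the PDF Library, and the code uses only java.util.regex.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MultiWordPattern {
    // Builds \b(?:word1|word2|...)\b, quoting each word so that any
    // regex metacharacters it contains are matched literally.
    static Pattern wholeWords(List<String> words) {
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    public static void main(String[] args) {
        Pattern pattern = wholeWords(Arrays.asList("US", "AU"));
        // Illustrative sample strings, not taken from the catalog file.
        for (String sample : new String[] {"US", "AU", "AU01", "FR"}) {
            System.out.println(sample + " -> " + pattern.matcher(sample).find());
        }
    }
}
```

The resulting Pattern could then be handed to setPattern in place of the single-word expression. Note that \b only behaves as expected when each word starts and ends with a letter, digit or underscore.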
Highlighting customization
The PDF Library also allows you to customize the highlighting drawn by TextHighlighter. For instance:
    // FR and SE: filled block in semi-transparent blue (ARGB value, alpha 0xaa)
    TextHighlighter frSeHighlighter = new TextHighlighter();
    frSeHighlighter.addWord("FR");
    frSeHighlighter.addWord("SE");
    frSeHighlighter.setHighlightType(TextTool.TYPE_BLOCK, new Color(0xaa0000ff, true), new BasicStroke(), 1);
    features.add(frSeHighlighter);

    // AU: red outline only
    TextHighlighter auHighlighter = new TextHighlighter();
    auHighlighter.addWord("AU");
    auHighlighter.setHighlightType(TextTool.TYPE_OUTLINE, new Color(0xff0000), new BasicStroke(), 0.2f);
    features.add(auHighlighter);
With this setup, FR and SE occurrences will be highlighted by a semi-transparent blue rectangle, while AU occurrences will be surrounded by a red rectangular outline.
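The two colors are written as hexadecimal literals, which can be hard to read at a glance. The short sketch below uses only java.awt.Color to print their components; the class name HighlightColors and the helper describe are made up for this illustration.

```java
import java.awt.Color;

class HighlightColors {
    // With hasAlpha = true, the top byte (0xaa = 170) is the alpha channel.
    static final Color TRANSLUCENT_BLUE = new Color(0xaa0000ff, true);
    // Without an alpha flag, the color is fully opaque (alpha = 255).
    static final Color OPAQUE_RED = new Color(0xff0000);

    // Formats the four channels of a color as decimal values.
    static String describe(Color c) {
        return String.format("r=%d g=%d b=%d alpha=%d",
                c.getRed(), c.getGreen(), c.getBlue(), c.getAlpha());
    }

    public static void main(String[] args) {
        System.out.println(describe(TRANSLUCENT_BLUE)); // r=0 g=0 b=255 alpha=170
        System.out.println(describe(OPAQUE_RED));       // r=255 g=0 b=0 alpha=255
    }
}
```

An alpha of 170 out of 255 is roughly two-thirds opaque, which is why the text under the blue block stays readable.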
To go further, we could use this mechanism to preserve the anonymity of our producers (note that we are only preventing the text from appearing in the Viewer; for permanent removal it would be better to redact it). We simply need to remove the TextTool feature from the original list, to make sure no one can select and copy text from our document:
    final ArrayList<ViewerFeature> features = new ArrayList<>(ViewerFeature.getAllEnabledFeatures());
    // Remove the TextTool feature so that text cannot be selected or copied
    Iterator<ViewerFeature> iter = features.iterator();
    while (iter.hasNext()) {
        ViewerFeature viewerFeature = iter.next();
        if (viewerFeature.getName().equals("TextTool")) {
            iter.remove();
            break;
        }
    }
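On Java 8 or later, the same filtering can be written more compactly with Collection.removeIf. The sketch below is self-contained, so it does not use the library's ViewerFeature: NamedFeature is a hypothetical stand-in that models only getName(), and the feature names other than TextTool are invented. Unlike the loop above, removeIf drops every matching entry rather than stopping at the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FeatureFilterSketch {
    // Hypothetical stand-in for the library's ViewerFeature; only getName() is modelled.
    interface NamedFeature {
        String getName();
    }

    // Convenience factory for the stand-in type.
    static NamedFeature named(String name) {
        return () -> name;
    }

    // Returns a copy of the list without any feature carrying the given name.
    static <T extends NamedFeature> List<T> without(List<T> features, String name) {
        List<T> copy = new ArrayList<>(features);
        copy.removeIf(feature -> name.equals(feature.getName()));
        return copy;
    }

    public static void main(String[] args) {
        // "Zoom" and "Print" are invented names used only for this demo.
        List<NamedFeature> all = Arrays.asList(named("Zoom"), named("TextTool"), named("Print"));
        for (NamedFeature feature : without(all, "TextTool")) {
            System.out.println(feature.getName());
        }
    }
}
```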
Then we can highlight names with a black block:
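    TextHighlighter wordHighlighter = new TextHighlighter();
    wordHighlighter.addWord("Pierre Dubois");
    wordHighlighter.addWord("John Watson");
    // Opaque black block that covers the names in the Viewer
    wordHighlighter.setHighlightType(TextTool.TYPE_BLOCK, Color.BLACK, new BasicStroke(), 1);
    // As with the other highlighters, register it before creating the Viewer
    features.add(wordHighlighter);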
And we are done!