Frequently Asked Questions
What is the difference between the normal and extended edition?
The extended editions of the PDF Library and the Report Generator each offer three additional features:
Can I convert HTML pages to PDF with the Report Generator?
Not directly. HTML comes in lots of different flavours, whereas the Report Generator uses its own XML
(similar to XHTML, but with a few extensions and exceptions specific to PDF). You cannot parse arbitrary HTML
with the Report Generator - it will require some transformation, e.g. the top level tag is <pdf> not
<html>. The "SampleTransformer.java" example in the download package shows one way to
convert these tags, and the userguide has a useful section
on "Migrating from HTML". The Report Generator conforms to the CSS2 specification, so if your HTML documents
use CSS2 style sheets to separate content from presentation, the transformation will be simpler. The
Tidy package, which converts HTML to XHTML, may help your conversion.
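As a rough illustration of the kind of transformation described above, the following sketch uses the XSLT support built into the JDK to rename a top-level <html> element to <pdf> and copy everything else unchanged. It is not the "SampleTransformer.java" example from the download package: the class name RootTagRenamer and the stylesheet are assumptions made for illustration, the input is assumed to be well-formed and free of XML namespaces, and real documents will usually need further changes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the Report Generator API.
class RootTagRenamer {
    // Identity stylesheet with one override: the root <html> element becomes <pdf>.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='/html'>"
      + "<pdf><xsl:apply-templates select='@*|node()'/></pdf>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to a well-formed, namespace-free XHTML string.
    static String toPdfXml(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toPdfXml("<html><body><p>Hello</p></body></html>"));
    }
}
```

HTML that is not already well-formed XML would have to be cleaned up first, for example with the Tidy package mentioned above.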
Can I convert MS Office documents (Word and Excel) to PDF?
No. Microsoft Office documents are saved in a proprietary format which we cannot parse.
Why do I get the message "Cannot connect to X11 server"?
This is only an issue with the PDF library viewer or the Graph library on UNIX when rendering to bitmap images, like PNG or GIF, and is a frustrating aspect of Java on UNIX - the java.awt.* classes need an X11 system to connect to. You have three options.
java -Djava.awt.headless=true PDFtoTIFF file.pdf
This is the best option, although it does require the X11 packages to be installed (even if X11 isn't running).
N.B. The X11 libraries are native libraries, not Java libraries, so you wouldn't include them in the classpath. They would typically be installed in /usr/lib, which needs root permission.
The Report Generator outputs to PDF rather than a bitmap, so you won't need any form of "windows" running at all to produce graphs, regardless of operating system.
Why is a blank screen shown in Internet Explorer when a PDF document is requested from a Servlet or JSP?
This may occur when IE has failed to open your PDF viewer, because this particular browser overrides the mime-type of the response with its own guess, based on the suffix of the URL. More information regarding this "feature" of Internet Explorer can be found on the MSDN Web Site.
We have found the easiest way around this is to append a harmless "?.pdf" or a "&.pdf" to the end of the request string, e.g. http://bfo.com/products/report/filteredexamples/date.jsp?.pdf
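If the links are built in Java code, the suffix can be added with a small helper. This is only a sketch: the class and method names are hypothetical, and it assumes the URL carries no "#fragment" part.

```java
// Hypothetical helper, not part of any library API.
class PdfSuffix {
    // Appends "?.pdf" when the URL has no query string yet, otherwise "&.pdf".
    static String withPdfSuffix(String url) {
        return url + (url.indexOf('?') >= 0 ? "&.pdf" : "?.pdf");
    }

    public static void main(String[] args) {
        System.out.println(withPdfSuffix("http://bfo.com/products/report/filteredexamples/date.jsp"));
    }
}
```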
How can I generate PDFs from ASP pages?
Java and ASP are not the most natural of bedfellows, but thanks to the platform independence of XML you can integrate the Report Generator with your Microsoft Web Application.
The basic elements of an application that uses ASP to generate PDF documents are as follows.
To set up a test harness, probably the best Java Application Server you can get for free is Tomcat. This will run on a Windows box and comes with good documentation to get you started. Setting up the Report Generator to run in Tomcat is a simple task, and there are complete instructions on how to do this at the start of our userguide. Once the Report Generator is installed you will need to set up a PDFProxyServlet - one example called "SampleServlet.java" can be found in your Report Generator download. If you are new to Servlets you may require some help with a few definitions, but your Server documentation should have information on how to get them up and running.
So how does it work? In your application all requests for PDF Documents should be made via the Proxy Servlet running on your Java Application Server. Your servlet will need to return the URL of the ASP page returning the XML, which the Proxy Servlet will convert to a PDF to return to the client.
For more information our userguide has a complete explanation of the PDFProxyServlet method.
Why does my table not continue onto the next page?
With the Report Generator there are specific rules for where a page break can occur. Page 17 of the userguide has some useful information regarding this, but the most basic rule to bear in mind is that only the following tags will split if they are spread across multiple pages.
Automatic pagination will not occur inside a <td> tag. A <table> nested inside a <td> will be cut off at the bottom if it spreads across multiple pages.
Does the PDF Library support "Web Ready" or "Linearised" PDFs?
Linearisation is Adobe's method of constructing a PDF so that it appears to load faster in a Web browser. This is achieved by showing the first few pages of the document whilst the rest of the document is loaded in the background. The PDF Library and Report Generator can both read and write Linearized PDFs.
Why do I receive a "(0xd): Skipping unknown character" warning message?
The message you receive is telling you that the Unicode character 0x0D (hex) cannot be found in the font you have selected to draw it with. In this case it is due to 0x0D being an invisible control character.
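One way to avoid the warning is to remove carriage returns from the text before it is drawn. The sketch below shows the idea with plain Java strings; the class and method names are hypothetical, and whether stripping the character is appropriate depends on where your text comes from (for example, Windows-style "\r\n" line endings).

```java
// Hypothetical helper, not part of the PDF Library API.
class ControlCharFilter {
    // Removes carriage returns (0x0D) and leaves line feeds (0x0A) in place.
    static String stripCarriageReturns(String text) {
        return text.replace("\r", "");
    }

    public static void main(String[] args) {
        String cleaned = stripCarriageReturns("first line\r\nsecond line");
        System.out.println(cleaned.indexOf('\r')); // -1, no carriage return left
    }
}
```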
Why is the document a different size when I print it?
The PDF Library and Report Generator will create pages and draw elements to the size you specify, but we can't control how the reader will print them. When printing a PDF, to ensure pages are printed at the correct size, check that the "Shrink oversized pages to paper size" and "Stretch undersized pages to paper size" checkboxes are unchecked in Acrobat's Print dialogue.
Can we extract text or images from an existing PDF?
Yes, since release 2.6.2 of the PDF library this is possible. See the ExtractText.java example in the PDF library package to see how.
Can we create PDF files with Chinese, Japanese and Korean characters?
Yes, the PDF Library and Report Generator support Chinese, Japanese and Korean characters. The best place to start is in the
examples that come with the download, e.g. "example/samples/HelloWorld-chinese.xml", and in the userguide where
there is a section devoted to Internationalization.
For more details on CJK font support please refer to the
StandardCJKFont
class in the API docs. When using these standard CJK fonts the required fonts
need to be installed on the client machine; the Asian font packs for Acrobat are available from Adobe.
Is the PDF Library "Thread Safe"?
The PDF Library is thread safe in as much as you can have two separate threads manipulating two separate documents and they will not interfere with each other. If you have two separate threads manipulating the same document you are likely to run into problems.
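The safe pattern, one document per thread, can be sketched with the standard java.util.concurrent classes. The Doc class below is a hypothetical stand-in for a document object, not the library's own class; the point is only that each task creates and fills its own instance, and nothing is shared until the task has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OneDocumentPerThread {
    // Hypothetical stand-in for a document; deliberately not thread safe.
    static class Doc {
        final List<String> pages = new ArrayList<>();
    }

    // Builds several documents in parallel, each one touched by a single task only.
    static List<Doc> buildAll(int documents, int pagesEach) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Doc>> futures = new ArrayList<>();
            for (int d = 0; d < documents; d++) {
                final int id = d;
                futures.add(pool.submit(() -> {
                    Doc doc = new Doc(); // created and used by this task only
                    for (int p = 0; p < pagesEach; p++) {
                        doc.pages.add("doc " + id + " page " + p);
                    }
                    return doc;
                }));
            }
            List<Doc> result = new ArrayList<>();
            for (Future<Doc> f : futures) {
                result.add(f.get()); // hand-over happens only after the task is done
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildAll(3, 5).size()); // 3
    }
}
```

If two threads really must work on the same document, access would need to be serialised by your own code, for example with a lock around every operation on it.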
Why doesn't my <jsp:include> work?
There may be a couple of reasons for this. If you are trying to include a .PDF file, this will not work because a
PDF is binary content, but we expect a JSP included in this way to produce text content (i.e. XML that the Report Generator
can understand). You can include another PDF file in your document, but this is done in the Report Generator by using the
"background-pdf" attribute in the body tag.
If you are using the PDFFilter method, the content type of an HTTP response must be text/xml for it to be parsed.
We have noted in some servers (e.g. early versions of Tomcat 4.x) that an inner JSP may erroneously override the Content-type
set in the outer JSP - causing it to be skipped by the PDFFilter. If in doubt, ensure both the outer and the included JSP set
the content type to text/xml.
Will the Report Generator work with the JSPs that use custom tags or tag-libs?
Yes, as long as those tags are transformed into tags the Report Generator can parse. In the case of tag libraries, these tags will be resolved in the JSP compilation stage, well before the Report Generator takes over.
Will the Report Generator work with the Jakarta Struts framework?
Yes - we have had no reports to the contrary. See the question above.
Why when I use &, <, > in my XML does the Report Generator throw an error?
These characters are used to mark elements in XML documents, and cannot be used unquoted as they are in HTML. If you need to use
these characters either wrap the text in a CDATA block, or use the
entities &amp; for "&", &lt; for "<" and &gt; for ">".
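When the XML is assembled in Java code, dynamic text can be escaped before it is written out. The helper below is a minimal sketch with hypothetical names; it replaces only the three characters discussed here with the standard XML entities &amp;amp;, &amp;lt; and &amp;gt;, and should be applied to text content only, never to the markup itself.

```java
// Hypothetical helper, not part of the Report Generator API.
class XmlEscaper {
    // Replaces &, < and > with their predefined XML entities.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Profit & Loss <2004>")); // Profit &amp; Loss &lt;2004&gt;
    }
}
```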
How do I set headers and footers for specific pages?
The Report Generator userguide has a section devoted to headers and footers. In version 1.1.x of the Report Generator the options for setting headers are:
examples/dynamic/bigtable.jsp" example in the download package).
At the moment we do not have a facility for explicitly assigning a footer or header to the last page in a document when the number of pages is unknown.
How do I force a page to begin on an odd or even page?
You can set this in the <pbr> tag, e.g. <pbr page-break-before="odd">, which means only do a page break
if the next page is odd. You can also do <pbr page-break-before="even">, which means only do a page break if
the next page is even.
I am having problems with ligatures in words like "Office", characters are being drawn on top of each other?
The font you are using may have a zero width for the ligature 'fi'. You have a few options:
Is it possible to put a line-break in the labels of a graph?
In the Report Generator using the will work for carriage returns in labels.
For the Graph Library you can put a '\n' in the label text if you're using JSP code to create the label dynamically:
<bfg:label distance="40"><%= value1 + "\n" + value2 %></bfg:label>
or just hit Enter and put the line break into the XML explicitly:
<bfg:label distance="20">On Multiple Lines</bfg:label>
Does it work over HTTPS?
Yes, PDFs and graphs can be requested and returned via HTTPS. When using the Report Generator to create PDFs, you may find a
CertificateException is thrown.
This occurs whenever a Java process tries to open an SSL connection as a client, and means that the certificate
presented by the webserver isn't trusted by the client (your Java process). This even happens when the client and server
processes are the same Application Server. Although it's not specific to the Report Generator,
you'll often see it when you reference images or stylesheets in your XML with relative URLs. There are three ways around it, in
decreasing order of preference.
<meta name="base" value="file:C:/webapp/report.jsp"/>.
This has the effect of making a relative URL like <img src="images/myimage.jpg">resolve to the URL
file:C:/webapp/images/myimage.jpg - ie. the image is loaded from the filesystem, not through the webserver, and no
certificate exception occurs. It's a bit faster too, but obviously will only work for resources that are files.
init method of this class will do the trick (note this code is completely unsupported - use at your own risk)
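import java.security.cert.*;
import javax.net.ssl.*;

// Trusts every certificate and every hostname, so it removes the protection SSL normally gives.
public class EasyConnect implements X509TrustManager, HostnameVerifier {
    // Installs this class as the default trust manager and hostname verifier for HTTPS connections.
    public final static void init() throws Exception {
        EasyConnect e = new EasyConnect();
        SSLContext sc = SSLContext.getInstance("SSL");
        sc.init(null, new TrustManager[] { e }, null);
        HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
        HttpsURLConnection.setDefaultHostnameVerifier(e);
    }
    // No checks are made, so any certificate chain is accepted.
    public void checkClientTrusted(X509Certificate[] chain, String auth) { }
    public void checkServerTrusted(X509Certificate[] chain, String auth) { }
    // The interface expects a non-null (possibly empty) array.
    public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
    // Any hostname is accepted.
    public boolean verify(String urlHostname, SSLSession session) { return true; }
}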
How can I optimize the Report Generator to make response time as fast as possible?
We've optimised the Report Generator as best we can and we are always exploring ideas to speed things up even more - so the best place to start is to make sure you have the latest and greatest version. Having said that, here are some ideas.
XML Tag density - More tags means more to parse and hence more time. If you can reduce the number of tags in your document with some concise and clever markup this will always help. Use a good style sheet rather than attributes or inline styles. It is best to avoid relying on nested tables and spacer GIFs for controlling layout, instead using "colspan", "rowspan", "margin" and "padding" attributes to achieve the same effect. Using smaller tables with "table-layout=fixed" will save the Report Generator from reading the whole table to determine the width of the column.
External Resources - A document that uses external resources such as fonts, images or other PDFs will always take longer to create than one without. External resources are cached within the document, but are not cached across multiple documents or requests.
Application design - Another good place to look is to ask some questions about your application use cases. Do all documents need to be created from scratch? Are 30 users viewing more or less the same document with slight changes? Could I use a cached template? Could I write the document once, store it on disk or database and serve it up?
Does the PDF Library support a newspaper or column type layout?
Not directly, but it can be done. Using the LayoutBox class you
could first create a LayoutBox the width of one column. Fill this with your text, pictures etc., and then call the LayoutBox.split()
method to split the LayoutBox into column-length chunks. Then just draw each chunk side by side on the page. A simple example illustrating
this is included with the download package as "ColumnLayout.java".
When I copy or add pages from an existing document to a new document, form fields are not copied.
In version 2.0, if you want to copy a form field annotation from one document to another you need to move the FormElement associated with it separately. Have a look at section 2 in the PDF userguide, which has all the information you need.
"WARNING (PG1): Annotation 1/152 on page 1 is part of another PDF's form - removing" - what does this mean?
See Appendix A in the PDF userguide for a list of warning messages. Also refer to the question above.
After setting the origin of a page, why when I add a form field does it not appear where I expect?
Annotations (such as Form Fields) use absolute co-ordinates for positioning on the page, starting at (0,0) in the bottom left corner.
They are not affected by calls to the setUnits method.
We want to create really big PDF documents - will it cope?
The biggest document we know of created with the library was 7000 pages or so, although we haven't reproduced this ourselves as we don't have enough memory here!
The PDF specification allows for documents of up to 10Gb, but as the design of the PDF Library (and consequently the Report Generator) is to hold most of the document in RAM, you're not going to get anywhere close to this. Using a Cache will help, particularly if your document consists of large streams (usually bitmap images). If you're trying to save memory, use low-res images and the built-in fonts rather than embedding, and, with the Report Generator, try to limit the number of tags by making use of the padding/margin attributes instead of nesting tables. Don't forget you can increase the heap size if necessary by passing arguments to the "java" command.
Also be sure to keep up to date with revisions - we release often, and making our products both smaller and faster is high on our priority list.
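When you change the heap size it is worth confirming that the setting actually reached the JVM your application runs in, since application servers often take their options from a startup script. The sketch below uses only the standard Runtime class; the class name HeapCheck is hypothetical.

```java
// Hypothetical helper; uses only java.lang.Runtime.
class HeapCheck {
    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Maximum heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run with, for example, "java -Xmx256M HeapCheck", the reported figure should be close to the requested value; the exact number varies between JVMs.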
How can I prevent a PDF from being saved to disk?
A PDF is simply a file, so there's really no way to do this. Even if there were some option that could be set in the document to signal Acrobat (and there isn't), the user could simply right click on the link and select "Save As", or extract the document from the cache. You can make it a little harder for them by having it returned from an HTTP POST perhaps, but there is no 100% effective way to do this (and this applies to any type of file returned from a webserver, not just PDFs).
I get the message "mmiVerifyTpAndGetWorkSize: stack_height=2 should be zero" in my logs.
This message is printed by some implementations of IBM's JIT compiler, such as that supplied with WebSphere 5. We have no idea what it means, but it doesn't seem to make any difference - the program still runs correctly, so as best as we can tell you can safely ignore it. Update - May 2004: IBM have listed this as a known bug on their website, and although at the time of writing the information is pretty sparse, it appears it will be fixed in an upcoming release of their JVM.
How can I create a PDF from a JSP using the PDF library (not the Report Generator)?
It is not recommended to create PDFs from JSPs. JSPs are intended to return
text only, not binary content like PDFs or images (for which you should use a Servlet).
In more detail, a JSP page only has access to the PrintWriter, not the
ServletOutputStream, so your response will be dependent on the encoding
of the page. In addition, any newlines or spaces in your JSP will be inserted
into your PDF, which as it's a binary file is not a good idea. We're not saying it
can't be done if you know what you're doing and you're careful, but whether it will work
will depend on the browser, your application server and the environment it's running in -
so we don't support it.
I've got an OutOfMemoryError when using the PDF library or Report Generator
This is one of the most common questions we get. First, this does not mean there is a memory leak - it simply means that there is not enough memory available. By default Java only has 64Mb of heap regardless of the amount of physical memory in the machine, which is not enough when you're manipulating large documents (how big "large" is depends on what you're doing with it and the composition of the document, but chances are it's smaller than you think). You have some simple options to fix this:
java -Xmx256M YourApp
PDFReader, use a File rather than a FileInputStream in the constructor.
InputStream, you need to close it. The API doesn't close any streams it didn't open itself.
If this still doesn't work, you need to look at your architecture to find ways to save memory. There are a number of tricks which we recommend.
LayoutBox to draw the text directly onto the page in the same location as the field. When helping a customer struggling with a large number of fields, using this approach in our tests gave smaller documents, reduced memory requirements by 40% and sped things up by a factor of 15! This is our #1 tip for improving performance.
PDF(PDF) constructor. Reading them in is the complicated bit, so you'll certainly see speed improvements.
My XML fails when parsing non-ASCII characters
This is an encoding issue. By default XML is encoded in UTF-8, so if you're creating your XML with a text editor, be sure to save the XML using the UTF-8 encoding. If you're returning XML from a JSP you need to set the encoding explicitly, as JSPs encode their content as ISO-8859-1 by default. See the internationalization section of the userguide for more info on this, but if you want a quick fix, ensure the first two lines of your JSP look something like this:
<?xml version="1.0"?>
<%@ page language="java" contentType="text/xml; charset=UTF-8"%>
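The effect of such a mismatch can be reproduced with plain Java strings. This sketch (hypothetical class and method names) encodes text with one charset and decodes the resulting bytes with another, which is roughly what happens when a page is written as ISO-8859-1 and then read as UTF-8. An XML parser would normally reject the malformed bytes rather than silently replace them as the String constructor does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class.
class EncodingMismatch {
    // Encodes with one charset and decodes with another, simulating a writer and reader that disagree.
    static String writeThenRead(String text, Charset written, Charset read) {
        return new String(text.getBytes(written), read);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the final letter
        // Same charset on both sides: the text survives.
        System.out.println(text.equals(writeThenRead(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));      // true
        // Written as ISO-8859-1 but read as UTF-8: the accented letter is lost.
        System.out.println(text.equals(writeThenRead(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8))); // false
    }
}
```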
How can I stop the letters in my table from being stretched out?
By default the text in tables is justified. In order to prevent this you need to set align="left". Remember that each <td> element has a <p> implicitly placed around the data, so the best way to achieve this is to use a style sheet and add:
td p { align:left }
which will cause all the table data elements to align to the left.
How do I use the keycode I've been issued?
To upgrade from the demo version, you need to have purchased one of the products and been issued with a licence key. Once you've been sent the keycode:
Graph.setLicenseKey("...");
<web-app> tag. The web server will need to be restarted after this change.
<context-param>
<param-name>org.faceless.graph2.License</param-name>
<param-value>...</param-value>
</context-param>
PDF.setLicenseKey("...");
ReportParser.setLicenseKey("...");
<init-param> block in the "web.xml" file of your web application. The web server will need to be restarted after this change.
<filter>
<filter-name>bforeport</filter-name>
<filter-class>org.faceless.report.PDFFilter</filter-class>
<init-param>
<param-name>license</param-name>
<param-value>...</param-value>
</init-param>
</filter>
If you're running the SampleServlet, it should look something like this:
<servlet>
<servlet-name>ReportServlet</servlet-name>
<servlet-class>SampleServlet</servlet-class>
<init-param>
<param-name>license</param-name>
<param-value>...</param-value>
</init-param>
</servlet>
My license key has suddenly stopped working and I'm seeing "DEMO" again.
The only way this could happen is if you're accidentally resetting it somewhere else in
your code, or if you're still using a temporary key by accident.
Search all of your code for any calls to setLicenseKey - ideally you want just
one, set in the manner described above. Also make sure that you're not
calling PDF.setLicenseKey or Graph.setLicenseKey after a call to
ReportParser.setLicenseKey, and that if you're running the Report Generator your
call is actually to ReportParser.setLicenseKey rather than PDF or
Graph (this applies even if you're using the Graph Library component of the
Report Generator as a standalone component).
You may also want to check your logs. This happens often enough that if a previously valid
license key is overridden with an invalid one, recent versions of our products will write a
warning to stderr, along with a stack trace of the second call to
setLicenseKey to help you find and remove it.
Can your product work with Adobe Reader Extensions?
No. Adobe have created the Reader Extensions so that they can only be used with their Document Server product, presumably in order to avoid losing sales of Acrobat to a combination of Acrobat Reader and third party products like our own. Only Adobe products (and perhaps some of their licensees) can work with PDFs containing Reader Extensions.
How do I find out the location of an element created in the Report Generator?
This is a common question by those wanting to add something to a PDF created by the
Report Generator that can't be defined in the XML, such as a custom annotation or
maybe a type of page numbering that can't be supported directly by the XML syntax.
The idea is that after the XML is converted to a PDF but before it's written to the
OutputStream, the PDF can be altered using the PDF library API.
The trick is finding out the location of the element you're using as a marker in the
document.
To do this you can add an annotation to the tag and then search for it once the PDF
is generated. First, add an href to the element you want to find:
<h1 href="pdf:dummy">Heading</h1>
Then edit the code that converts the XML to the PDF so that, after the PDF is created but before it's written out, it finds that annotation (and deletes it when you're done). The new code to insert is indicated below.
PDF pdf = parser.parse(inputsource);
// BEGIN INSERTED CODE
List pages = pdf.getPages();
for (int i=0;i<pages.size();i++) {
    PDFPage page = (PDFPage)pages.get(i);
    List annots = page.getAnnotations();
    for (int j=0;j<annots.size();j++) {
        PDFAnnotation annot = (PDFAnnotation)annots.get(j);
        if (annot.getAction().getType().equals("Named:dummy")) {
            annots.remove(j);
            float[] rect = annot.getRectangle();
            // Now you have "rect" and "page" set to the location of the tag
            // in your document - do whatever you need to with them.
            break;
        }
    }
}
// END INSERTED CODE
pdf.render(outputstream);
This technique of modifying the PDF after it's created but before it's written can be
used to do other things too - append other documents, reorder pages and so on. For
those using the PDFFilter, the source code is supplied in the
docs directory. Remember to put the modified version in a different package.
Extracting text from a PDF gives incorrect results
Extracting text from a PDF can fail for a number of reasons, mostly due to the way they're constructed internally. A PDF has no concept of a sentence or a word, only letters. Typically these are grouped together internally to form words which we can search for, but this isn't always the case - we've seen documents where all the capital letters in a line were printed in one go followed by the lower case letters, or documents where letters were arranged from right-to-left, the cursor moving backwards between each one. Some older documents use images to make up the letters, or fonts with no useful encoding so it's impossible to know which letter is which. The one thing all these variations have in common is it's almost impossible to extract useful text from them (and this will apply to any PDF tool, including Acrobat and our library).
99% of the time this won't be an issue, but it's for this reason when we're asked if it's possible to extract text from a PDF we say usually, rather than an outright yes.
(Also be aware that our trial version replaces all lowercase e's with lowercase a's. This is deliberate, not a fault, and isn't the case when you have a valid license key).
What sort of compression is used when converting PDF to TIFF?
This depends on the colormodel used. If you're using a 1-bit colormodel like
PDFParser.BLACKANDWHITE then CCITT Group 4 compression is used, otherwise LZW
compression is used. These are the two best options available in baseline TIFF - although
Flate compression is defined in an extension to the TIFF specification, its support is
limited and LZW is, despite its patent problems in the past, still a very effective
compression algorithm.
The raster format also depends on the ColorModel used - if it's an
IndexColorModel then the image will be stored as an indexed image with 8bpp.
Most color images are not indexed however, and will be stored at 24bpp for RGB or 32bpp for
CMYK images.
When embedding a v2 Graph into a PDF, the text is "fuzzy" when printed
Fuzzy text is a result of embedding a low resolution image into a PDF - it looks fine
on screen (which is typically 96dpi) but when printed on a 600dpi printer it looks blocky
and blurred. If using the tag library, the solution is to either set the dpi parameter
on the axesgraph or piegraph element, or (if you have the
extended edition of the Report Generator) set the format
parameter to "rg1pdf".
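The arithmetic behind the fuzziness is simple: a bitmap needs enough pixels to cover its printed size at the printer's resolution. The sketch below (hypothetical class and method names) works out the pixel width needed for a graph of a given width in PDF points, using the standard 72 points per inch. It only illustrates the calculation and does not call the Graph Library.

```java
// Hypothetical helper; plain arithmetic, no library calls.
class GraphResolution {
    // Pixels needed to cover a width given in points (72 points = 1 inch) at a given dpi.
    static int pixelsFor(double widthInPoints, int dpi) {
        return (int) Math.round(widthInPoints / 72.0 * dpi);
    }

    public static void main(String[] args) {
        // A graph 288 points (4 inches) wide:
        System.out.println(pixelsFor(288, 96));  // 384
        System.out.println(pixelsFor(288, 600)); // 2400
    }
}
```

An image rendered for the screen therefore has only a fraction of the pixels a 600dpi printer could use across the same width, which is why it looks blocky on paper.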
When verifying a digital signature I get "java.security.cert.CertificateException: toDerInputStream rejects tag type -96"
This is a problem with the Sun crypto package, which incorrectly fails on some X.509 certificates. The solution is to use another JCE provider - we recommend the Bouncy Castle Crypto API from http://www.bouncycastle.org. Download the appropriate JAR and install it, either by following the instructions supplied with their package or (for a quick fix) by adding the line
Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 1);
to your code during an initialization routine, before the signature is verified. Alternatively, try upgrading to Java 1.6.
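To confirm the order in which providers are installed, you can list them using only the standard java.security classes. In this sketch the class name ProviderList is hypothetical; after the insertProviderAt call above succeeds, the Bouncy Castle provider would be expected to appear first in the list.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical helper; uses only java.security.
class ProviderList {
    // Names of the installed security providers, in preference order.
    static String[] providerNames() {
        Provider[] providers = Security.getProviders();
        String[] names = new String[providers.length];
        for (int i = 0; i < providers.length; i++) {
            names[i] = providers[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        String[] names = providerNames();
        for (int i = 0; i < names.length; i++) {
            System.out.println((i + 1) + ": " + names[i]);
        }
    }
}
```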