Class PDF
- java.lang.Object
-
- org.faceless.pdf2.PDF
-
- All Implemented Interfaces:
java.lang.Cloneable
public class PDF extends java.lang.Object
A PDF describes a single document in Adobe's Portable Document Format. It is the highest-level object in the package.
The life-cycle of a PDF generally consists of being created, adding new pages, optionally adding information about the document structure (e.g. bookmarks), and finally rendering to an OutputStream.
This class only deals with the structure of the document. To actually create some content see the PDFPage class.
Here's the ubiquitous example:
    import java.io.*;
    import org.faceless.pdf2.*;

    // Create a new PDF
    PDF p = new PDF();

    // Create a new page
    PDFPage page = p.newPage(PDF.PAGESIZE_A4);

    // Create a new "style" to write in - Black 24pt Times Roman.
    PDFStyle mystyle = new PDFStyle();
    mystyle.setFont(new StandardFont(StandardFont.TIMES), 24);
    mystyle.setFillColor(java.awt.Color.black);

    // Put something on the page
    page.setStyle(mystyle);
    page.drawText("Hello, PDF-viewing World!", 100, 100);

    // Automatically go to this page when the document is opened.
    p.setAction(Event.OPEN, PDFAction.goTo(page));

    // Add some document info
    p.setInfo("Author", "Joe Bloggs");
    p.setInfo("Title", "My Document");

    // Add a bookmark
    java.util.List<PDFBookmark> bookmarks = p.getBookmarks();
    bookmarks.add(new PDFBookmark("Hello World page", PDFAction.goTo(page)));

    // Write the document to a file
    OutputStream out = new FileOutputStream("test.pdf");
    p.render(out);
    out.close();
-
-
Field Summary
Modifier and Type | Field | Description
static java.lang.String | PAGESIZE_A4 | A parameter to newPage(String) to create a new A4 page - 210x297mm
static java.lang.String | PAGESIZE_A4_LANDSCAPE | A parameter to newPage(String) to create a new landscape A4 page - 297x210mm
static java.lang.String | PAGESIZE_A5 | A parameter to newPage(String) to create a new A5 page - 148x210mm
static java.lang.String | PAGESIZE_A5_LANDSCAPE | A parameter to newPage(String) to create a new landscape A5 page - 210x148mm
static java.lang.String | PAGESIZE_LETTER | A parameter to newPage(String) to create a new US Letter page - 8.5x11in
static java.lang.String | PAGESIZE_LETTER_LANDSCAPE | A parameter to newPage(String) to create a new landscape US Letter page - 11x8.5in
static java.lang.String | VERSION | This variable contains the version number of the current build.
-
Constructor Summary
Constructor | Description
PDF() | Create a new, empty PDF document
PDF(OutputProfile targetprofile) | Create a new PDF and immediately apply the specified OutputProfile.
PDF(PDF pdf) | Create a PDF that's a clone of the specified PDF.
PDF(PDFReader reader) | Create a PDF from the specified PDFReader.
PDF(PDFReader reader, int revision) | Create a PDF from the specified PDFReader, using the specified revision of the document.
-
Method Summary
Modifier and Type | Method | Description
void | addPropertyChangeListener(java.beans.PropertyChangeListener listener) | Add a PropertyChangeListener to this object
protected java.lang.Object | clone() |
void | close() | Close any file resources the PDF may be holding on to.
PDFAction | getAction(Event event) | Return the action that's performed when the specified event occurs on the document, as set by setAction.
OutputProfile | getBasicOutputProfile() | Return a basic OutputProfile for this PDF.
java.util.List<PDFBookmark> | getBookmarks() | Return the List of bookmarks at the top level of the document.
java.lang.String | getDocumentID(boolean primary) | Returns a String representing this document's unique ID.
DocumentPart | getDocumentPart() | Return the root DocumentPart, which will never be null but which will be empty unless this file uses DocumentParts
java.util.Map<java.lang.String,EmbeddedFile> | getEmbeddedFiles() | Return a Map containing all the Embedded Files associated with this document.
EmbeddedFile | getEmbeddedFileSource() | When a PDF is loaded from EmbeddedFile.getPDF(), this method will return the EmbeddedFile that contains this object.
EncryptionHandler | getEncryptionHandler() | Return the EncryptionHandler used to encrypt the document, or null if no encryption handler is in use.
static java.util.concurrent.ExecutorService | getExecutor() | Returns the ExecutorService used by the PDF library to run tasks, as set by setExecutor(java.util.concurrent.ExecutorService).
Form | getForm() | Return the Interactive Form or "AcroForm" object which is part of each PDF document.
OutputProfile | getFullOutputProfile() | Deprecated. Since 2.18 the OutputProfiler class gives more control and should be used instead of PDF.getFullOutputProfile
java.util.Map<java.lang.String,java.lang.Object> | getInfo() | Return the PDF meta information, as set by setInfo().
java.lang.String | getInfo(java.lang.String key) | Return document meta data as set by setInfo() as a String.
java.lang.String | getJavaScript() | Return the document-wide JavaScript, as set by setJavaScript(java.lang.String), or null if no JavaScript is defined for this document.
PDFPage | getLastPage() | Return the last page of this PDF.
static java.lang.Object | getLicensedProperty(java.lang.String key) | Retrieve a property from the PDF License.
LoadState | getLoadState(int index) | For linearized documents that are being loaded from a URL via PDFReader.setSource(URL), this method relays the current load state of the specified page.
java.util.Locale | getLocale() | Return the PDF's Locale, as set by setLocale or (since 2.6.1) as loaded from the PDF's "Lang" tag.
java.io.Reader | getMetaData() | Return any XML metadata associated with the document.
java.util.Map<java.lang.String,PDFAction> | getNamedActions() | Return a Map containing all the named actions in the PDF.
int | getNumberOfPages() | Return the number of pages in this PDF.
int | getNumberOfRevisions() | Return the number of revisions made to the document.
java.lang.Object | getOption(java.lang.String key) | Returns the current value of an option, as set by setOption().
java.util.List<OptionalContentLayer> | getOptionalContentLayers() | Return the list of OptionalContentLayer objects defined in the PDF.
PDFPage | getPage(int pagenumber) | Return the specified page.
PDFPage | getPage(java.lang.String name) | Get a "Named Page" from the PDF.
java.lang.String | getPageLabel(int num) | Get the "Page Label" for the specified page number, or null if none is specified.
java.util.List<PDFPage> | getPages() | Returns a List of the document's pages which may be manipulated to reorder, delete or append pages to the document.
int | getPDFVersion() | Get the version of the PDF.
Portfolio | getPortfolio() | Return the PDF portfolio, creating it if necessary.
static PropertyManager | getPropertyManager() | Get the PropertyManager currently being used by the PDF library
float | getRenderProgress() | Get the progress of the render() method running in a different thread.
org.w3c.dom.Document | getStructureTree() | Returns the Structure Tree for the entire document as a W3C DOM.
java.lang.Object | getUserData(java.lang.String key) | Return a property previously set on the PDF with the putUserData() method
XMP | getXMP() | Return the XMP metadata as an XMP object.
void | importFDF(FDF fdf) | Import the contents of the specified FDF into the PDF document.
static boolean | isLicensed() | Return true if the PDF is licensed, false if it's running as a demo
void | makePortfolio(boolean portfolio) | Deprecated. Call getPortfolio() instead
PDFPage | newPage(int w, int h) | Create a new PDFPage object of the specified size and add it to this PDF.
PDFPage | newPage(java.lang.String pagesize) | Create a new page of the specified page size and add it to this PDF.
PDFPage | newPage(PDFPage page) | Create a new PDFPage object that is a clone of the specified page, and add it to this PDF.
void | putLiteral(java.lang.String key, java.lang.String tokens) | Put a literal token sequence.
void | putUserData(java.lang.String key, java.lang.Object value) | Set a custom property on the PDF.
void | rebuildStructureTree() | Rebuild the Structure Tree returned from getStructureTree().
void | removePropertyChangeListener(java.beans.PropertyChangeListener listener) | Remove a PropertyChangeListener from this object
void | render(java.io.OutputStream out) | This method renders the completed PDF to an OutputStream.
void | setAction(Event event, PDFAction action) | Specify an action to perform when the specified event occurs on the document.
static void | setCache(Cache cache) | Set the Cache to be used by the library.
void | setEncryptionHandler(EncryptionHandler encrypt) | Set the EncryptionHandler to encrypt this document with.
static void | setExecutor(java.util.concurrent.ExecutorService e) | Set the ExecutorService to be used by the PDF library to run any parallel operations.
void | setInfo(java.lang.String key, java.lang.Object val) | Set an item of PDF meta-information, such as author or title.
void | setJavaScript(java.lang.String javascript) | Set the document-wide JavaScript.
static void | setLicenseKey(java.lang.String key) | Set the license key for the library.
void | setLocale(java.util.Locale locale) | Set the default locale for this document.
void | setMetaData(java.lang.String xmldata) | Set the XMP Metadata associated with this document.
void | setOption(java.lang.String key, java.lang.Object value) | Set various options on the PDF, which largely (but not necessarily) follow the options available in the "Document Properties" dialog of Acrobat.
void | setOutputProfile(OutputProfile targetprofile) | Deprecated. Since 2.18 the OutputProfiler class or PDF(OutputProfile) constructor should be used instead of calling PDF.setOutputProfile
void | setPageLabel(int startpage, int displaystart, java.lang.String prefix, char type) | Set the "Page Label" for a range of pages in the PDF - the way the page number is presented.
static void | setPropertyManager(PropertyManager manager) | Set the PropertyManager to be used by the PDF library
java.lang.String | toString() |
static void | useAWTEventModel(boolean awtevent) | Set the PDF Library to work with the AWT event model.
-
-
-
Field Detail
-
VERSION
public static final java.lang.String VERSION
This variable contains the version number of the current build. A typical value would be "2.0". Please be sure to include this information with any bug reports.
-
PAGESIZE_A4
public static final java.lang.String PAGESIZE_A4
A parameter to newPage(String) to create a new A4 page - 210x297mm
- See Also:
- Constant Field Values
-
PAGESIZE_A4_LANDSCAPE
public static final java.lang.String PAGESIZE_A4_LANDSCAPE
A parameter to newPage(String) to create a new landscape A4 page - 297x210mm
- See Also:
- Constant Field Values
-
PAGESIZE_LETTER
public static final java.lang.String PAGESIZE_LETTER
A parameter to newPage(String) to create a new US Letter page - 8.5x11in
- See Also:
- Constant Field Values
-
PAGESIZE_LETTER_LANDSCAPE
public static final java.lang.String PAGESIZE_LETTER_LANDSCAPE
A parameter to newPage(String) to create a new landscape US Letter page - 11x8.5in
- See Also:
- Constant Field Values
-
PAGESIZE_A5
public static final java.lang.String PAGESIZE_A5
A parameter to newPage(String) to create a new A5 page - 148x210mm
- See Also:
- Constant Field Values
-
PAGESIZE_A5_LANDSCAPE
public static final java.lang.String PAGESIZE_A5_LANDSCAPE
A parameter to newPage(String) to create a new landscape A5 page - 210x148mm
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
PDF
public PDF()
Create a new, empty PDF document.
- Since:
- 1.0
-
PDF
public PDF(PDF pdf)
Create a PDF that's a clone of the specified PDF. When creating multiple copies of a single PDF, it's much faster to use this constructor than to re-read the PDF using a new PDFReader.
- Since:
- 2.0
-
PDF
public PDF(PDFReader reader)
Create a PDF from the specified PDFReader. The PDFReader class is available as part of the "Extended Edition" of the PDF library, and is included with this package. If the document contains multiple revisions, the latest revision is loaded.
- Since:
- 1.1.12
-
PDF
public PDF(PDFReader reader, int revision)
Create a PDF from the specified PDFReader, using the specified revision of the document. The PDFReader class is available as part of the "Extended Edition" of the PDF library, and is included with this package. The revision number must be between 1 and PDFReader.getNumberOfRevisions(), otherwise an IllegalArgumentException is thrown.
- Parameters:
reader - the PDFReader to use
revision - the revision number to use: PDFReader.getNumberOfRevisions() to load the latest revision, or 1 to load the original document.
- Throws:
java.lang.IllegalArgumentException - if the revision is outside the specified range
- Since:
- 1.2.1
-
PDF
public PDF(OutputProfile targetprofile)
Create a new PDF and immediately apply the specified OutputProfile. This constructor can be used to ensure that a brand new PDF will be created to comply with the requirements of the profile, and is the recommended way to enforce this for a new PDF.
-
-
Method Detail
-
isLicensed
public static boolean isLicensed()
Return true if the PDF is licensed, false if it's running as a demo.
- Since:
- 2.11.22
-
setPropertyManager
public static final void setPropertyManager(PropertyManager manager)
Set the PropertyManager to be used by the PDF library.
- Since:
- 2.8.5
-
getPropertyManager
public static final PropertyManager getPropertyManager()
Get the PropertyManager currently being used by the PDF library.
- Since:
- 2.8.5
-
setExecutor
public static final void setExecutor(java.util.concurrent.ExecutorService e)
Set the ExecutorService to be used by the PDF library to run any parallel operations. Parallel operations in the API include reading the PDF with a PDFReader, saving the PDF, and profiling. Prior to 2.18.1 parallel work was done in short-lived Threads, but the use of an Executor allows the use of a system-wide thread pool for better resource management.
The parameter to this method is the ExecutorService to use for these parallel operations, or null to use the PDF Library default. If the default is used, then a fixed size thread pool is created with the number of threads based on the Threads property if specified, or the number of processors if not. A special value of "1" for this property will ensure there is no parallel processing and everything is done in the calling thread.
- Since:
- 2.18.1
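As a rough illustration of the sizing rule described above, the following sketch builds a fixed-size pool using only java.util.concurrent. The class name ExecutorSketch and the helper newPool are hypothetical and not part of the library; the calls to PDF.setExecutor appear only as comments because they need the library on the classpath.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExecutorSketch {
    /**
     * Hypothetical helper: a fixed-size pool with the requested number of
     * threads, or one thread per processor when the request is not positive.
     */
    static ExecutorService newPool(int threads) {
        int n = threads > 0 ? threads : Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(n);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = newPool(0);
        // PDF.setExecutor(pool);   // share this pool with the PDF library
        // ... read, profile and render documents here ...
        // PDF.setExecutor(null);   // return to the library default
        System.out.println(pool.submit(() -> "pool is running").get());
        pool.shutdown();            // the caller owns the pool and must shut it down
    }
}
```

An application that already has a shared pool would pass that pool instead of creating a new one.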
-
getExecutor
public static final java.util.concurrent.ExecutorService getExecutor()
Returns the ExecutorService used by the PDF library to run tasks, as set by setExecutor(java.util.concurrent.ExecutorService).
- Since:
- 2.18.1
-
close
public void close()
Close any file resources the PDF may be holding on to. These will be automatically closed during garbage collection, but this method may be called earlier if necessary to speed disposal of those resources.
- Since:
- 2.11.2 - prior to that no resources were held and this method wasn't necessary
-
getPDFVersion
public int getPDFVersion()
Get the version of the PDF. The version provides an indication of which version of Acrobat the file can be loaded in, although it is quite normal for a 1.4 document to be loaded correctly by a 1.3 viewer (for example). Since Acrobat 9 and ISO 32000, version numbering has become more complicated, so to interpret the value from this method you will need the following table. Note the earliest version of PDF supported by this API is 1.3, so any documents from earlier revisions will be automatically upgraded.
Value | PDF version
3 | PDF 1.3 (as created by Acrobat 4.x)
4 | PDF 1.4 (as created by Acrobat 5.x)
5 | PDF 1.5 (as created by Acrobat 6.x)
6 | PDF 1.6 (as created by Acrobat 7.x)
7 | PDF 1.7 / ISO 32000-1:2008 (as created by Acrobat 8.x)
8 | PDF 1.7 / ISO 32000-1:2008 Extension Level 3 (as created by Acrobat 9.x)
9 | PDF 1.7 / ISO 32000-1:2008 Extension Level 5 (as created by Acrobat 9.1)
10 | PDF 1.7 / ISO 32000-1:2008 Extension Level 8 (as created by Acrobat X)
11 | PDF 1.7 / ISO 32000-1:2008 Extension Level 11 (as created by Acrobat XI)
12 | PDF 2.0 / ISO 32000-2:2017
- Returns:
- the version of the PDF document
- Since:
- 2.0
- See Also:
setOutputProfile(org.faceless.pdf2.OutputProfile)
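Because the return value is an index rather than a literal version number, application code may want a lookup along the lines of the sketch below. The class PdfVersionSketch and the method describe are hypothetical names; the labels are shortened from the table in the description above.

```java
class PdfVersionSketch {
    /** Hypothetical helper: map the int from getPDFVersion() to a short label. */
    static String describe(int version) {
        switch (version) {
            case 3:  return "PDF 1.3";
            case 4:  return "PDF 1.4";
            case 5:  return "PDF 1.5";
            case 6:  return "PDF 1.6";
            case 7:  return "PDF 1.7";
            case 8:  return "PDF 1.7 Extension Level 3";
            case 9:  return "PDF 1.7 Extension Level 5";
            case 10: return "PDF 1.7 Extension Level 8";
            case 11: return "PDF 1.7 Extension Level 11";
            case 12: return "PDF 2.0";
            default: return "unknown (" + version + ")";
        }
    }

    public static void main(String[] args) {
        // In real code the argument would come from pdf.getPDFVersion()
        for (int v = 3; v <= 12; v++) {
            System.out.println(v + " = " + describe(v));
        }
    }
}
```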
-
setOutputProfile
@Deprecated public void setOutputProfile(OutputProfile targetprofile)
Deprecated. Since 2.18 the OutputProfiler class or PDF(OutputProfile) constructor should be used instead of calling PDF.setOutputProfile.
Set the Output Profile to use when rendering this PDF document. Since 2.18 this method will work as before, but is deprecated in favour of the new OutputProfiler class. Code calling this method like this:
    OutputProfile oldprofile = pdf.getFullOutputProfile();
    pdf.setOutputProfile(newprofile);
should be updated to look like this:
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile oldprofile = profiler.getProfile();
    profiler.apply(newprofile);
and for code that applies an OutputProfile to a new PDF, call the PDF(OutputProfile) constructor:
    OutputProfile profile = new OutputProfile(OutputProfile.PDFA3a, "sRGB", null, "http://www.color.org", null, icc);
    PDF pdf = new PDF(profile);
or, finally, for simple cases where the features being applied are all part of the "basic" OutputProfile:
    new OutputProfiler(pdf).apply(newprofile);
- Throws:
java.lang.IllegalStateException - if the current profile doesn't match and can't be altered to match the specified profile, or if the current profile isn't known (because it's an existing PDF document that hasn't been scanned with getFullOutputProfile()).
- Since:
- 2.0
- See Also:
getPDFVersion(), getBasicOutputProfile(), getFullOutputProfile()
-
getBasicOutputProfile
public OutputProfile getBasicOutputProfile()
Return a basic OutputProfile for this PDF. The "Basic" profile consists of information which can be easily determined without having to traverse the PDF or parse the page streams. It takes virtually no time to run, and as it doesn't parse the page content it requires only the basic License to run. See the OutputProfile.Feature class to see which features are returned in the basic profile.
- Since:
- 2.6.1
- See Also:
OutputProfiler
-
getFullOutputProfile
@Deprecated public OutputProfile getFullOutputProfile()
Deprecated. Since 2.18 the OutputProfiler class gives more control and should be used instead of PDF.getFullOutputProfile.
Return a full OutputProfile for this PDF. This routine parses the entire document to determine its contents - this can be a very lengthy operation, so calling the getBasicOutputProfile() method is generally preferable unless the Feature you're querying is not tested by that method.
This method cycles through every object in the PDF structure in a process very similar to rendering the entire PDF to a bitmap, and updates the OutputProfile returned by getBasicOutputProfile() with the complete list of features used in this PDF - which may cause an IllegalStateException if it's set to something other than OutputProfile.Default.
This method requires an "Extended Edition plus Viewer" license to run.
- Since:
- 2.6.1
- See Also:
OutputProfiler
-
getNumberOfRevisions
public int getNumberOfRevisions()
Return the number of revisions made to the document. This will only be useful for documents read in using a PDFReader - all other PDFs will return zero. See the PDFReader class for more information on revisions.
Note that in 2.7 the return value of this method was modified slightly so that the original version of a PDF is revision 1, not revision 0. New documents created with this library will still have a revision 0 before they're saved.
- Since:
- 1.2.1
- See Also:
PDFReader, FormSignature.getNumberOfRevisionsCovered()
-
newPage
public PDFPage newPage(java.lang.String pagesize)
Create a new page of the specified page size and add it to this PDF. The page size is specified as a string of the form "WxHU", where W is the width of the page, H is the height of the page, and U is an optional units specifier - it may be "mm", "cm" or "in", and if it's not specified it's assumed to be points. The resulting page size is rounded to the nearest integer unless the units are specified as points (e.g. 595.5x842 - fractional sizes added in 2.2.3).
For convenience we've defined several standard sizes that you can pass in, like PAGESIZE_A4, PAGESIZE_A4_LANDSCAPE, PAGESIZE_LETTER, PAGESIZE_LETTER_LANDSCAPE and so on.
Since 2.2.3 you can also pass in a String containing the common name of the paper size, optionally with a "-landscape" suffix, e.g. "A4", "Letter", "A2-landscape", "DL" and so on. All ISO sizes and most US and JIS paper (and some envelope) sizes are recognised.
Example values include "210x297mm", "595x842" or "A4", which would all produce an A4 page, and "8.5x11in", "612x792" or "Letter", which would all produce a US Letter page.
This method is identical to calling:
    PDFPage page = new PDFPage(pagesize);
    pdf.getPages().add(page);
- Parameters:
pagesize - the size of the page to create
- Throws:
java.lang.IllegalArgumentException
- if the specified page size cannot be parsed
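To make the "WxHU" form concrete, here is a minimal sketch of how such a string could be converted to points. It is not the library's own parser: PageSizeSketch and parse are hypothetical names, the sketch assumes 72 points per inch, and named sizes such as "A4" are deliberately left out.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageSizeSketch {
    // W x H with an optional unit suffix; no suffix means points
    private static final Pattern SIZE =
        Pattern.compile("(\\d+(?:\\.\\d+)?)x(\\d+(?:\\.\\d+)?)(mm|cm|in)?");

    /** Hypothetical helper: returns {width, height} in points. */
    static float[] parse(String pagesize) {
        Matcher m = SIZE.matcher(pagesize.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot parse page size: " + pagesize);
        }
        double w = Double.parseDouble(m.group(1));
        double h = Double.parseDouble(m.group(2));
        String unit = m.group(3);
        if (unit == null) {
            // Sizes given in points keep their fractional part
            return new float[] { (float) w, (float) h };
        }
        double scale = unit.equals("in") ? 72.0
                     : unit.equals("cm") ? 72.0 / 2.54
                     : 72.0 / 25.4;            // mm
        // Sizes given in other units are rounded to the nearest point
        return new float[] { Math.round(w * scale), Math.round(h * scale) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("210x297mm")));
        System.out.println(Arrays.toString(parse("8.5x11in")));
        System.out.println(Arrays.toString(parse("595.5x842")));
    }
}
```

With these assumptions "210x297mm" works out to 595 by 842 points and "8.5x11in" to 612 by 792 points, matching the equivalent values quoted in the description.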
-
newPage
public PDFPage newPage(int w, int h)
Create a new PDFPage object of the specified size and add it to this PDF. The size is specified in points. This method is identical to calling:
    PDFPage page = new PDFPage(w, h);
    pdf.getPages().add(page);
The arguments are integers for API compatibility reasons only. If required you can create pages sized to a fraction of a point using the newPage(String) method.
- Parameters:
w - the width of the page, in points
h - the height of the page, in points
- Returns:
- a new PDFPage object
- Since:
- 1.0
- See Also:
getPages()
-
newPage
public PDFPage newPage(PDFPage page)
Create a new PDFPage object that is a clone of the specified page, and add it to this PDF. This method is identical to calling:
    PDFPage page = new PDFPage(originalpage);
    pdf.getPages().add(page);
- Parameters:
page - the PDFPage object to clone
- Returns:
- a new PDFPage object which is a clone of the specified page
- Since:
- 2.0
- See Also:
getPages()
-
getPages
public java.util.List<PDFPage> getPages()
Returns a List of the document's pages which may be manipulated to reorder, delete or append pages to the document. This is done using the standard List methods. For example, to reverse the pages in the document, you could do something like this:
    List<PDFPage> pages = pdf.getPages();
    List<PDFPage> temp = new ArrayList<PDFPage>(pages);
    pages.clear();
    for (int i = temp.size() - 1; i >= 0; i--) {
        pages.add(temp.get(i));
    }
or to move (not copy) all the pages from one PDF to another, try:
    pdf1.getPages().addAll(pdf2.getPages());
Note that each page can only be in this list once, and a page can't be in the page list of more than one PDF. Attempting to add a page that is already in this list (or in another PDF's page list) will remove it from its previous location automatically.
- Since:
- 1.1.12
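The reversal idiom from the description can be wrapped in a small generic helper, sketched below against a plain ArrayList of Strings standing in for pages. ReversePagesSketch and reverseByRebuild are hypothetical names. The list is emptied and refilled so that no element ever sits at two positions at once, in keeping with the one-place-only rule; the sketch makes no claim about how the library's list reacts to in-place swaps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReversePagesSketch {
    /** Hypothetical helper: reverse a list by clearing it and re-adding its items. */
    static <T> void reverseByRebuild(List<T> pages) {
        List<T> temp = new ArrayList<>(pages);   // snapshot of the current order
        pages.clear();
        for (int i = temp.size() - 1; i >= 0; i--) {
            pages.add(temp.get(i));
        }
    }

    public static void main(String[] args) {
        // Strings stand in for PDFPage objects; real code would pass pdf.getPages()
        List<String> pages = new ArrayList<>(Arrays.asList("page1", "page2", "page3"));
        reverseByRebuild(pages);
        System.out.println(pages);
    }
}
```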
-
getNumberOfPages
public int getNumberOfPages()
Return the number of pages in this PDF. Simply calls pdf.getPages().size()
- Returns:
- the number of pages in the document
- Since:
- 1.1
- See Also:
getPages()
-
getPage
public PDFPage getPage(int pagenumber)
Return the specified page. Identical to pdf.getPages().get(pagenumber)
- Parameters:
pagenumber - the page number, between 0 and getNumberOfPages()-1
- Returns:
- the specified page
- Throws:
java.lang.ArrayIndexOutOfBoundsException - if the page number is not in range
- Since:
- 1.1
- See Also:
getPages()
-
getPage
public PDFPage getPage(java.lang.String name)
Get a "Named Page" from the PDF. If a Template with the specified name is found in the PDF it will be returned, and may be cloned via the PDFPage(PDFPage) constructor.
- Since:
- 2.10.5
-
getLastPage
public PDFPage getLastPage()
Return the last page of this PDF. Identical to pdf.getPage(pdf.getNumberOfPages()-1)
- Since:
- 1.0
- See Also:
getPages()
-
setEncryptionHandler
public void setEncryptionHandler(EncryptionHandler encrypt)
Set the EncryptionHandler to encrypt this document with. This method allows you to limit access to the document, for example by requiring a password to open it or by preventing the document from being printed.
Changing encryption will destroy any digital signatures in the document, which is why Acrobat won't allow you to do it. Prior to version 2.4, this library didn't preserve previously applied signatures when writing a file, so this wasn't an issue - a warning was displayed and the signature was removed. Now, however, signatures can be preserved, and this method will throw an IllegalStateException if called on a previously signed document. This will also occur if the encryption settings (like password, permission flags etc.) are changed. If you want to re-encrypt a signed document, you have to delete any existing signatures first.
- Parameters:
encrypt - the EncryptionHandler to be used to encrypt and limit access to the document
- Since:
- 2.0
- See Also:
EncryptionHandler, StandardEncryptionHandler
-
getEncryptionHandler
public EncryptionHandler getEncryptionHandler()
Return the EncryptionHandler used to encrypt the document, or null if no encryption handler is in use.
- Since:
- 2.0
- See Also:
EncryptionHandler, StandardEncryptionHandler
-
setInfo
public void setInfo(java.lang.String key, java.lang.Object val)
Set an item of PDF meta-information, such as author or title. Prior to version 2.6.2 this method only updated the original "Info" dictionary, used by PDFs since the early days to store metadata. Since 2.6.2 this method can be used to update both the original Info dictionary and the XMP Metadata.
Most people won't need to worry about the details. To set the metadata in the PDF, just specify an appropriate key to set the Title, Author etc. of the document. The list of known keys is below, although any key can be used - if it's not on the list it will appear in the "Custom" pane in Acrobat's "Document Properties" window.
Key | Description
Title | The document's title. This value must be set for PDF/X documents
Author | The name of the person who created the document.
Subject | The subject of the document.
Keywords | Comma separated list of keywords associated with the document.
Creator | If the document was converted to PDF from another format, the name of the application that created the original document from which it was converted.
Trapped | The document's trapping status. Must be "True", "False" or "Unknown". This has to be set to "True" or "False" for PDF/X documents
Note that "CreationDate" and "ModDate" are set by the PDF Library internally and do not need to be set manually (although since 2.11.26 they can be overridden). "Producer" is set internally and cannot be changed.
Since 2.6.2, updating one of the fields listed above will also update the XMP metadata to match. It's also possible to update fields in the XMP metadata that aren't listed above. This can be done by specifying the key name as xmp:ns:attribute, where ns is the recommended namespace prefix (as specified in the XMP specification) and attribute is the attribute to set. For instance, to set the "rights" attribute in the Dublin Core schema, you could call setInfo("xmp:dc:rights", "Copyright (C) Whoever");
No validation is done on these fields or the data, although fields listed as bags, sequences or language alternates in the XMP specification will be automatically wrapped in the appropriate structure. If more complex fields need to be set then the setMetaData(java.lang.String) method can be used to pass in an entire RDF object.
The val parameter may be String, Date, Boolean or Float, or null to remove that item of meta-information.
- Parameters:
key - the meta-information field to set
val - the value to set it to - a String, Date, Boolean, Float or null
- Since:
- 1.0
-
getInfo
public java.lang.String getInfo(java.lang.String key)
Return document meta data as set by setInfo() as a String. If the key name begins with xmp: then the appropriate field will be extracted from the XMP metadata stream - see the setInfo method for more information.
For example, to get the author of the document from the PDF Info dictionary:
    String author = pdf.getInfo("Author");
and to extract the "rights" attribute of the Dublin Core Schema from the XMP metadata:
    String copyright = pdf.getInfo("xmp:dc:rights");
If a type of object requested from the XMP metadata cannot obviously be turned into a String, the value returned from this method is undefined.
- Parameters:
key - the field to get
- Returns:
- the value of the specified field, or null if the field is not set
- Since:
- 1.0
-
getInfo
public java.util.Map<java.lang.String,java.lang.Object> getInfo()
Return the PDF meta information, as set by setInfo(). This is in the form of an unmodifiable Map, where the keys are String objects, and values may be String, Date, Boolean, Calendar or Float objects. If no meta information is available, returns an empty Map.
Since version 2.1.2, any keys representing Dates (such as "ModDate" or "CreationDate") will also have an equivalent entry with a leading underscore, e.g. "_ModDate". These give the same information but as a Calendar rather than a Date. This is to allow extraction of TimeZone information, sadly lacking from the Date class.
Note this map doesn't include any of the XMP metadata - only data from the original Info dictionary.
- Returns:
- an unmodifiable Map containing any meta information specified in the document.
- Since:
- 1.1.12
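The reason for the extra Calendar entries can be seen with the JDK classes alone: a Date is only an instant, while a Calendar also carries the time zone it was expressed in. The sketch below uses hypothetical names (InfoDateSketch, utcOffsetMinutes) and builds its own Calendar; in real code the value would be read from the map, for example via the "_ModDate" key.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class InfoDateSketch {
    /** Hypothetical helper: the offset from UTC, in minutes, carried by a Calendar. */
    static int utcOffsetMinutes(Calendar when) {
        return when.getTimeZone().getOffset(when.getTimeInMillis()) / 60000;
    }

    public static void main(String[] args) {
        // Stand-in for a value such as (Calendar) pdf.getInfo().get("_ModDate")
        Calendar modified = new GregorianCalendar(TimeZone.getTimeZone("GMT+02:00"));
        modified.clear();
        modified.set(2020, Calendar.MARCH, 1, 12, 0, 0);

        Date instant = modified.getTime();   // same moment, but the zone is gone
        System.out.println("Instant in ms: " + instant.getTime());
        System.out.println("Offset in minutes: " + utcOffsetMinutes(modified));
    }
}
```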
-
setLocale
public void setLocale(java.util.Locale locale)
Set the default locale for this document. This is mainly useful in right-to-left locales like Arabic, as it sets the default text alignment. The locale may be set and reset as many times as required. The locale in use when the document is rendered is considered to be the locale of the document as a whole.
- Since:
- 1.1
-
getLocale
public java.util.Locale getLocale()
Return the PDF's Locale, as set by setLocale or (since 2.6.1) as loaded from the PDF's "Lang" tag. If no locale is specified this method returns null.
- Since:
- 1.1
-
setAction
public void setAction(Event event, PDFAction action)
Specify an action to perform when the specified event occurs on the document. Valid events are Event.OPEN and Event.CLOSE, which occur in every version of Acrobat, and Event.PRE_SAVE, Event.POST_SAVE, Event.PRE_PRINT and Event.POST_PRINT, which only occur in Acrobat 5.0 or newer viewers.
- Parameters:
event - the event on which to perform the action
action - the action to perform, or null to remove any current action
- Since:
- 2.0
-
getAction
public PDFAction getAction(Event event)
Return the action that's performed when the specified event occurs on the document, as set by setAction. If no action is specified for that event, return null.
- Since:
- 2.0
-
setJavaScript
public void setJavaScript(java.lang.String javascript)
Set the document-wide JavaScript. This JavaScript is executed when the document is first loaded - this is normally used to define functions and the like, in the same way as JavaScript defined in the <HEAD> of an HTML document.
- Parameters:
javascript - the JavaScript to use for the entire document
- Since:
- 1.1.23
- See Also:
getJavaScript(), PDFAction.formJavaScript(java.lang.String)
-
getJavaScript
public java.lang.String getJavaScript()
Return the document-wide JavaScript, as set by setJavaScript(java.lang.String), or null if no JavaScript is defined for this document.
- Since:
- 1.1.23
- See Also:
setJavaScript(java.lang.String), PDFAction.formJavaScript(java.lang.String)
-
getNamedActions
public java.util.Map<java.lang.String,PDFAction> getNamedActions()
Return a Map containing all the named actions in the PDF. Named actions (which must always be "GoTo" type actions) can be referenced from outside the PDF, which allows the document to be opened at a specific location. Here's how to do this:
In the code that creates the PDF, add the following:
    pdf.getNamedActions().put("Myaction", PDFAction.goTo(somepage));
Then in your HTML document, add the following code:
    <a href="http://www.mycompany.com/mypdf.pdf#Myaction">
The Map returned from this method can be manipulated using the normal Map methods to add or delete actions. The only restriction is that keys must always be String objects and values must always be PDFAction objects that jump to a location in the document, like those returned from one of the PDFAction.goTo... methods.
- Since:
- 1.1.12
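When such links are generated rather than hand-written, the fragment can be attached with java.net.URI, as in the sketch below. NamedActionLinkSketch and linkTo are hypothetical names, and the sketch assumes the viewer treats the URL fragment as the action name, as the HTML example above does.

```java
import java.net.URI;
import java.net.URISyntaxException;

class NamedActionLinkSketch {
    /** Hypothetical helper: append a named action to a PDF's URL as the fragment. */
    static URI linkTo(String pdfUrl, String actionName) {
        URI base = URI.create(pdfUrl);
        try {
            // The three-argument constructor quotes characters that are
            // not legal in a fragment, and drops any fragment on the base URL
            return new URI(base.getScheme(), base.getSchemeSpecificPart(), actionName);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Bad URL or action name", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(linkTo("http://www.mycompany.com/mypdf.pdf", "Myaction"));
    }
}
```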
-
getEmbeddedFiles
public java.util.Map<java.lang.String,EmbeddedFile> getEmbeddedFiles()
Return a Map containing all the Embedded Files associated with this document. Note this method does not return files embedded by way of an AnnotationFile - they must be accessed via that class in the usual way.
The Map returned from this method can be manipulated using the normal Map methods to add or delete files. The only restriction is that keys must always be String objects and values must always be EmbeddedFile objects. As with any map, the keys must be unique - we recommend adding files using their filenames as keys, like so:
    EmbeddedFile file = new EmbeddedFile(new File("Attachment.txt"));
    pdf.getEmbeddedFiles().put(file.getName(), file);
Since 2.26, an EmbeddedFile can be a Folder or a File - although Folders will only exist in a "Portfolio" PDF. This is a new data model in PDF 2.0, so it's a slightly awkward fit with the existing API.
If this PDF contains folders, the returned Map will contain only the Files, not the intermediate Folders. However, Folders can be added to this Map, and if they are, the Collection will be properly reconciled before the PDF is saved.
-
getPortfolio
public Portfolio getPortfolio()
Return the PDF portfolio, creating it if necessary.- Since:
- 2.26
-
setMetaData
public void setMetaData(java.lang.String xmldata)
Set the XMP Metadata associated with this document. Since 2.26 this method calls getXMP().read(new StringReader(xmldata == null ? "" : xmldata)). We strongly recommend using the getXMP() method and modifying the XMP directly rather than using this method.
- Parameters:
xmldata - the XML data to embed into the document, or null to remove it.
- Since:
- 1.1.12
- See Also:
getXMP()
-
getXMP
public XMP getXMP()
Return the XMP metadata as an XMP object. For properly-formatted XMP, this new (2020) approach is a considerable improvement over the getMetaData() method, which dates from 2001. If the PDF contains metadata which cannot be parsed as an XMP object (for example if it's not valid XML, or if the XML doesn't meet the basic requirements of XMP) then this method returns an XMP object for which XMP.isValid() == false (between 2.24.4 and 2.26 it returned null).
- Returns:
- the XMP, which may be empty or invalid but will never be null
- Since:
- 2.24.4
-
getMetaData
public java.io.Reader getMetaData() throws java.io.IOException
Return any XML metadata associated with the document. Since 2.26 this simply returns getXMP().isEmpty() ? null : new StringReader(getXMP().toString()). It is strongly recommended that any code migrates to using the getXMP() method.
Since 2.24.3, the returned type is guaranteed to have a toString() method that will return the Metadata as a String.
- Returns:
- a Reader containing the source of the XML, with toString() guaranteed to be the value of the metadata as a string, or null if the XMP is empty or missing
- Throws:
java.io.IOException
- Since:
- 1.1.12
- See Also:
getXMP()
-
makePortfolio
public void makePortfolio(boolean portfolio)
Deprecated. Call getPortfolio() instead.
Convert the PDF to (or from) a simple Portfolio PDF. The files to include should be added to the getEmbeddedFiles() Map, and content may optionally be written to the (single) page in this PDF which will be displayed by any PDF viewer other than Acrobat. Note that most of the fancy layout options available in Acrobat for Portfolios are implemented with Flash, and are not supported here by the PDF API.
In Acrobat X and later, files may be embedded into subfolders. We support this by the EmbeddedFile.setPortfolioFolder(java.lang.String) method, but as folders are implemented in a very awkward way in the PDF object this must be set before the file is added to the EmbeddedFiles map. Attempting to modify the folder after the file is added will result in an exception.
Here is an extremely simple example showing how to create a Portfolio with one file in a subfolder.
PDF pdf = new PDF();
PDFPage page = pdf.newPage("A4");
pdf.makePortfolio(true);
Map<String,EmbeddedFile> files = pdf.getEmbeddedFiles();
EmbeddedFile ef = new EmbeddedFile(new File("file1.pdf"));
ef.setPortfolioFolder("subfolder"); // must be set before the file is added to the map
files.put("File 1", ef);
try (OutputStream out = new FileOutputStream("portfolio.pdf")) {
    pdf.render(out); // render() leaves the stream open, so close it here
}
- Parameters:
portfolio - true to convert the PDF to a "portfolio" PDF, false to reverse this: the PDF will be a plain PDF with some attachments.
- Since:
- 2.14.1
- See Also:
getEmbeddedFiles()
,EmbeddedFile.setPortfolioFolder(java.lang.String)
-
setOption
public void setOption(java.lang.String key, java.lang.Object value)
Set various options on the PDF, which largely (but not necessarily) follow the options available in the "Document Properties" dialog of Acrobat. The key is a case-insensitive String and the value is an object: it may be a String, Boolean, Integer or some other type.
Passing in an unrecognised key or an invalid value as a parameter will not throw an exception, but will simply have no effect. The list of currently supported options is below.
view.fullscreen (boolean) - Open the document in full-screen mode
view.displayDocTitle (boolean) - The window's title bar should display the document title taken from the Title entry of the getInfo() map. If false the title bar should display the filename instead (only works in Acrobat 5 and later)
view.hideToolBar (boolean) - Hide the viewer application's tool bars when the document is active
view.hideMenuBar (boolean) - Hide the viewer application's menu bar when the document is active
view.hideWindowUI (boolean) - Hide user interface elements in the Document window (such as scroll bars and navigation controls), leaving only the document's contents displayed
view.fitWindow (boolean) - Resize the document's window to fit the size of the first displayed page. Note this resizes the window to fit the document, not the other way round
view.centerWindow (boolean) - Position the document's window in the center of the screen. Note this moves the whole viewer to the center of the screen, not the document to the center of the viewer
rtl (boolean) - Set the reading direction for this document: true will set it as "right to left", false to the default of left-to-right. Note that setting the Locale will automatically set this value to an appropriate value.
pagelayout (string) - How pages are displayed in the main Acrobat window pane. Values are typically SinglePage (the default), OneColumn ("Single Page Continuous" in Acrobat 8), TwoColumnLeft ("Two-Up Continuous (Facing)" in Acrobat 8), TwoColumnRight ("Two-Up Continuous (Cover Page)" in Acrobat 8), TwoPageLeft ("Two-Up (Facing)" in Acrobat 8) or TwoPageRight ("Two-Up (Cover Page)" in Acrobat 8). Other values are possible but won't be recognised by Acrobat
pagemode (string) - What to display in the left-most pane of the Acrobat window. Values are typically UseNone (the default), which prevents the left-pane from being displayed, or UseOutlines to display Bookmarks, UseThumbs to display the Page Thumbnails, UseOC to show the Layers tab or UseAttachments to show the Attachments tab. The value "UseSignatures" can also be used to set the initial panel to the Signature panel in the BFO PDF Viewer, although this value has no effect in Acrobat
view.area (string) - Which page box to display when viewing the document on screen. One of CropBox (the default), MediaBox, TrimBox, BleedBox or ArtBox. Typically this setting is best left unchanged
view.clip (string) - Which page box to clip the page contents to when viewing the document on screen. One of CropBox (the default), MediaBox, TrimBox, BleedBox or ArtBox. Typically this setting is best left unchanged
print.area (string) - Which page box to display when printing the document. One of CropBox (the default), MediaBox, TrimBox, BleedBox or ArtBox. Typically this setting is best left unchanged
print.clip (string) - Which page box to clip the page contents to when printing the document. One of CropBox (the default), MediaBox, TrimBox, BleedBox or ArtBox. Typically this setting is best left unchanged
print.scaling (string) - How to scale the document when printed. One of AppDefault (which uses the application defaults) or None (for no scaling). Some non-standard values are also recognized by our viewer, including Fit (scale the page up or down to fit the printable area, preserving the aspect ratio), FitUnlocked (as before but don't preserve the aspect ratio), ShrinkToFit and ShrinkToFitUnlocked (as for Fit and FitUnlocked, but only scale down to fit on the page, not up).
print.duplex (string) - What to set the print duplex settings to in the Acrobat Print Dialog. One of Simplex (the default), DuplexFlipLongEdge to duplex print on the long edge, or DuplexFlipShortEdge to duplex print on the short edge.
print.matchtraysize (string) - Whether to attempt to match the paper source to the page size.
print.numcopies (integer) - The number of copies to set in the print dialog, from 1 to 5.
print.pagerange (List) - Which pages to set as the default pages to print in the Print dialog. Specified as a java.util.List containing PDFPage objects.
bfo.printasimage (boolean) - Force the PDF to be printed as an image when printing with the BFO API only. This option may rarely be needed to print some documents correctly on some JVMs. It will be ignored by non-BFO applications
marked (boolean) - Identify the PDF as containing marked content (since 2.16)
- Parameters:
key - a case-insensitive key determining the option to set - may not be null
value - the value to set that key to. The type depends on the key, but in general a value of null means the default.
- Since:
- 2.7.6
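As an illustration of the key/value convention described above, here is a minimal sketch that applies a handful of the documented options through a `BiConsumer<String,Object>`. The class and method names (`ViewerOptionsSketch`, `applyPresentationOptions`) are hypothetical and not part of the library; the option keys and values are taken from the list above. The sketch records the calls in a plain `Map` so it runs without the library. With a real document one would presumably pass `pdf::setOption` instead, which fits because `setOption(String, Object)` returns void.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper: not part of the PDF library. */
final class ViewerOptionsSketch {

    /**
     * Apply a few of the documented options. The target is any
     * (String, Object) consumer, e.g. pdf::setOption on a real PDF (assumed usage).
     */
    static void applyPresentationOptions(BiConsumer<String, Object> setOption) {
        setOption.accept("view.fullscreen", Boolean.TRUE);      // boolean option
        setOption.accept("view.displayDocTitle", Boolean.TRUE); // boolean option
        setOption.accept("pagelayout", "SinglePage");           // string option
        setOption.accept("print.duplex", "DuplexFlipLongEdge"); // string option
        setOption.accept("print.numcopies", Integer.valueOf(2)); // integer option, documented range 1 to 5
    }

    public static void main(String[] args) {
        // Stand-in target so the sketch runs without the library on the classpath.
        Map<String, Object> recorded = new LinkedHashMap<>();
        applyPresentationOptions(recorded::put);
        recorded.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the option set in one method makes it easy to reuse across documents; since unrecognised keys are documented as having no effect, a misspelled key would fail silently, so it may be worth reading the value back with getOption when testing.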
-
getOption
public java.lang.Object getOption(java.lang.String key)
Returns the current value of an option, as set by setOption(). Boolean values will return "true" or null.
- Parameters:
key - a case-insensitive key determining the option to retrieve - may not be null
- Since:
- 2.7.6
-
getBookmarks
public java.util.List<PDFBookmark> getBookmarks()
Return the List of bookmarks at the top level of the document. The List contains zero or more PDFBookmark objects, and can be altered using any of the standard List methods to order the document's bookmarks in any way you see fit. New documents start with an empty list.
- Returns:
- the List of bookmarks at the top level of the document
- Since:
- 1.0
- See Also:
PDFBookmark
-
getDocumentID
public java.lang.String getDocumentID(boolean primary)
Returns a String representing this document's unique ID. The PDF specification recommends (but does not require) that every document is given a unique ID when it's created, which is stored in two parts. The primary ID stays constant throughout the life of the document; the secondary should be updated on every revision, although in the first revision of a document they should be the same. So when comparing the IDs of two documents, if the primary and secondary both match you've found the same document, and when only the primary ID matches you've found a different version of the same document.
This method returns either the primary or secondary ID, depending on whether the primary parameter is true or false. The ID is generally just random characters.
Calling this method before the document is created (i.e. when you've just created a new PDF but not called render()) will result in this method returning null. It may also return null for PDFs that do not have an ID specified, although they are fairly rare these days.
Although the IDs are stored internally as 16 bytes, we return a String of 32 hex-characters to make them easier to display and compare.
- Parameters:
primary - whether to return the primary or secondary ID
- Returns:
- a 32-character String representing the ID, or null if no ID is set
- Since:
- 2.1.2
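The comparison rule described above can be sketched as follows. This is an illustration only: `DocumentIdSketch`, `toHex` and `relate` are hypothetical names, the lower-case hex encoding is an assumption (the text only says "32 hex-characters"), and the comparison ignores case for that reason. In real use the four strings would presumably come from getDocumentID(true) and getDocumentID(false) on each document.

```java
/** Hypothetical helper: not part of the PDF library. */
final class DocumentIdSketch {

    /** Encode raw ID bytes as hex, two characters per byte (16 bytes give 32 characters). */
    static String toHex(byte[] id) {
        StringBuilder sb = new StringBuilder(id.length * 2);
        for (byte b : id) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /**
     * Classify two documents from their primary and secondary ID strings.
     * Any argument may be null, as the method above may return null when no ID is set.
     */
    static String relate(String primaryA, String secondaryA, String primaryB, String secondaryB) {
        if (primaryA == null || primaryB == null || !primaryA.equalsIgnoreCase(primaryB)) {
            return "unrelated-or-unknown";
        }
        boolean sameRevision = secondaryA != null && secondaryA.equalsIgnoreCase(secondaryB);
        return sameRevision ? "same-document" : "different-revision";
    }

    public static void main(String[] args) {
        byte[] raw = new byte[16];
        raw[0] = (byte) 0xAB;
        raw[15] = 0x01;
        String id = toHex(raw);
        System.out.println(id + " (" + id.length() + " characters)");
        System.out.println(relate(id, id, id, toHex(new byte[16])));
    }
}
```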
-
getForm
public Form getForm()
Return the Interactive Form or "AcroForm" object which is part of each PDF document. Note that using interactive forms requires the "Extended Edition" of the library: although the classes are supplied with the package, an "Extended Edition" license must be purchased to activate this functionality.
- Returns:
- the document's AcroForm
- Since:
- 1.1.13
-
importFDF
public void importFDF(FDF fdf)
Import the contents of the specified FDF into the PDF document. Any form values specified in the FDF file will be used to set the corresponding form fields in the PDF, and since 2.2.2 any annotations in the FDF will be imported as well. If a field doesn't exist, a warning is printed and the field is ignored.
Note that since 2.11.18 any JavaScript on the FDF will be imported as well, and this may involve executing JavaScript with the permissions of the PDF class. See the FDF.willExecuteJavaScript() method and the FDF.setJavaScript(java.lang.String, java.lang.String) method to disable this.
- Since:
- 1.2.1
-
render
public void render(java.io.OutputStream out) throws java.io.IOException
This method renders the completed PDF to an OutputStream. The stream is left open on completion. A document may be rendered more than once.
Rendering the document typically merges all the revisions of a document, so after rendering the getNumberOfRevisions() method will always return zero. The exception to this is documents containing an existing digital signature, or documents with an OutputProfile requiring the OutputProfile.Feature.MultipleRevisions feature; there is a very specific technical case where this may be necessary, see the API docs on that class for more information.
- Parameters:
out - the output stream to write the PDF to
- Throws:
java.io.IOException - if the process could not be completed
- Since:
- 1.0
-
getRenderProgress
public float getRenderProgress()
Get the progress of the render() method running in a different thread. The returned value will start at 0 and move towards 1 as the render progresses.
- Since:
- 2.8
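One possible way to use a progress value like this is to run the render on a background thread and poll from the calling thread. The sketch below is an assumption about usage, not the library's own pattern: `RenderProgressSketch` and `renderAndWatch` are hypothetical names, and a simulated task stands in for the real render so the code runs on its own. With a real document the two arguments would presumably be `() -> { pdf.render(out); return null; }` and `pdf::getRenderProgress`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: not part of the PDF library. */
final class RenderProgressSketch {

    /** Run the task on a background thread and sample the progress until it finishes. */
    static List<Float> renderAndWatch(Callable<Void> renderTask, Supplier<Float> progress, long pollMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Float> samples = new ArrayList<>();
        try {
            Future<Void> result = executor.submit(renderTask);
            while (!result.isDone()) {
                samples.add(progress.get());
                Thread.sleep(pollMillis);
            }
            result.get();                 // rethrows any failure from the task
            samples.add(progress.get());  // one final sample after completion
        } finally {
            executor.shutdownNow();
        }
        return samples;
    }

    public static void main(String[] args) throws Exception {
        // Simulated render: advances a counter from 0 to 100 in ten steps.
        AtomicInteger percent = new AtomicInteger();
        Callable<Void> fakeRender = () -> {
            for (int i = 1; i <= 10; i++) {
                Thread.sleep(5);
                percent.set(i * 10);
            }
            return null;
        };
        List<Float> samples = renderAndWatch(fakeRender, () -> percent.get() / 100f, 2);
        System.out.println("samples taken: " + samples.size());
        System.out.println("final progress: " + samples.get(samples.size() - 1));
    }
}
```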
-
addPropertyChangeListener
public void addPropertyChangeListener(java.beans.PropertyChangeListener listener)
Add a PropertyChangeListener to this object
- Since:
- 2.11.19
-
removePropertyChangeListener
public void removePropertyChangeListener(java.beans.PropertyChangeListener listener)
Remove a PropertyChangeListener from this object
- Since:
- 2.11.19
-
setCache
public static void setCache(Cache cache)
Set the Cache to be used by the library. Note this is a static method, which means a single cache is used for all PDFs. This also means you do not need to call this method more than once, and doing so is not only inefficient, it could theoretically cause problems in multi-threaded environments like servlet engines. To repeat: if you are going to call this method, do it once in an initialization routine before the first PDF is created.
- Since:
- 2.2.2
-
setPageLabel
public void setPageLabel(int startpage, int displaystart, java.lang.String prefix, char type)
Set the "Page Label" for a range of pages in the PDF - the way the page number is presented. Calling the method will set the format for all pages from the specified startpage to the end of the document, so if multiple formats are required they should be set in ascending order.
For example, to set the first 4 pages to i, ii, iii, iv and then number normally from 1, call:
pdf.setPageLabel(0, 1, null, 'r'); // Number all pages in lower-case roman starting from i
pdf.setPageLabel(4, 1, null, 'D'); // Number from the 5th page (index 4) in decimal starting from 1
To reset the page labels to the default, call setPageLabel(0, 1, null, 'D'). This will number all pages as decimal numbers starting from 1.
- Parameters:
startpage - the first page in the PDF to format with this label, starting from 0
displaystart - the number to give the page specified by startpage; subsequent pages will be numbered sequentially from this value. Minimum value is 1
prefix - the prefix to give to the page labels, or null for no prefix
type - one of 'D' for decimal, 'R' for upper-case roman, 'r' for lower-case roman, 'A' for upper-case letters, 'a' for lower-case letters or 'x' for no numbering, in which case just the prefix will be used.
- Since:
- 2.11.19
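To make the "from startpage to the end of the document" rule concrete, here is a small stand-alone model of it. This is a sketch of the documented behaviour, not the library's implementation: `PageLabelSketch` is a hypothetical class whose two methods merely mirror the signatures above, and the letter styles 'A' and 'a' are left out for brevity.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the page label rules: not part of the PDF library. */
final class PageLabelSketch {

    private static final class Rule {
        final int displayStart;
        final String prefix;
        final char type;

        Rule(int displayStart, String prefix, char type) {
            this.displayStart = displayStart;
            this.prefix = prefix;
            this.type = type;
        }
    }

    // Rules keyed by the 0-based page index at which they start to apply.
    private final TreeMap<Integer, Rule> rules = new TreeMap<>();

    /** A new rule applies from startPage to the end, so it replaces any rule at or after it. */
    void setPageLabel(int startPage, int displayStart, String prefix, char type) {
        rules.tailMap(startPage, true).clear();
        rules.put(startPage, new Rule(displayStart, prefix, type));
    }

    /** Returns the label for a 0-based page index, or null if no rule covers it. */
    String getPageLabel(int page) {
        Map.Entry<Integer, Rule> entry = rules.floorEntry(page);
        if (entry == null) {
            return null;
        }
        Rule rule = entry.getValue();
        int number = rule.displayStart + (page - entry.getKey());
        String prefix = rule.prefix == null ? "" : rule.prefix;
        switch (rule.type) {
            case 'D': return prefix + number;
            case 'R': return prefix + roman(number);
            case 'r': return prefix + roman(number).toLowerCase(Locale.ROOT);
            case 'x': return prefix;
            default: throw new IllegalArgumentException("Type not modelled in this sketch: " + rule.type);
        }
    }

    /** Standard upper-case roman numeral conversion for n >= 1. */
    static String roman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PageLabelSketch labels = new PageLabelSketch();
        labels.setPageLabel(0, 1, null, 'r'); // pages 0..3 become i, ii, iii, iv
        labels.setPageLabel(4, 1, null, 'D'); // page index 4 onwards restarts at 1
        for (int page = 0; page < 6; page++) {
            System.out.println("page index " + page + " -> " + labels.getPageLabel(page));
        }
    }
}
```

The model also shows why the rules should be set in ascending order: calling setPageLabel with a lower startpage afterwards discards every rule that starts at or beyond it.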
-
getPageLabel
public java.lang.String getPageLabel(int num)
Get the "Page Label" for the specified page number, or null if none is specified.
- Parameters:
num - the page number to get the label for, starting with 0
- Since:
- 2.11.19
- See Also:
setPageLabel(int, int, java.lang.String, char)
-
setLicenseKey
public static void setLicenseKey(java.lang.String key)
Set the license key for the library. When the library is purchased, BFO supplies a key which removes the "DEMO" stamp on each of the documents.
Please note this method is static - it should be called BEFORE the first PDF is created, like so:
PDF.setLicenseKey(.....); PDF pdf = new PDF();
- Parameters:
key
- the license key
-
getLicensedProperty
public static java.lang.Object getLicensedProperty(java.lang.String key)
Retrieve a property from the PDF License.- Parameters:
key
- the property- Since:
- 2.26.3
-
useAWTEventModel
public static void useAWTEventModel(boolean awtevent)
Set the PDF Library to work with the AWT event model. Without this flag set (the default) any PropertyChangeEvent objects fired by classes in this package will be fired immediately. If this flag is set to true, they will be batched up and fired from the AWT EventQueue at some point in the future. If the PDF Library is being used in an AWT application, especially one that may have background threads performing tasks, this value should be set to true.
- Since:
- 2.12
-
getLoadState
public LoadState getLoadState(int index)
For linearized documents that are being loaded from a URL via PDFReader.setSource(URL), this method relays the current load state of the specified page. If the page is fully loaded this method returns null, otherwise it returns a LoadState which can be used to monitor the progress of the load.
- Parameters:
index - the number of the page to query (0-indexed). A value of -1 will check all pages, and return null only if they are all loaded.
- Returns:
- a LoadState describing the progress of the load, or null if the page is fully loaded or the PDF is not linearized.
- Since:
- 2.14
-
rebuildStructureTree
public void rebuildStructureTree()
Rebuild the Structure Tree returned from getStructureTree(). As of 2.24, this is simply an alias for
Document document = pdf.getStructureTree();
document.normalizeDocument();
- Since:
- 2.19
- See Also:
getStructureTree()
,Document.normalizeDocument()
-
getStructureTree
public org.w3c.dom.Document getStructureTree()
Returns the Structure Tree for the entire document as a W3C DOM. This is a representation of the logical structure of the PDF, which is typically used to enable accessibility on the PDF.
The returned Document is live, and changes made to it will be reflected in the PDF. By default the tree will not contain any text content. Populating the tree with text content is a relatively time-consuming operation for large documents, so is not done by default. The tree will contain <bfo:content> elements marking where the text-content will go. Those nodes will be populated if the extract-text DOM config parameter is set to true; see below.
The special nodes in the bfo namespace have a fixed set of attributes which identify the current page, marked-content id and/or index into the page's annotation list of the item; the attributes are live and will update as pages are reordered or removed.
Changes made indirectly to this Document (either by moving pages in and out of the document, or by calls to beginTag on PDFPage, PDFCanvas or LayoutBox) may not be reflected in the tree until the Document.normalizeDocument() method is called.
The returned Document can be modified, although it is not possible to modify or create new text or <bfo:content> elements. Modification is useful when pages from multiple PDFs have been merged together, to rationalize the structure.
There are various parameters that can be set on the Document before the Document.normalizeDocument() method is called, to control how the tree is modified. With the exception of role-map, roles, lexicons, class-map, and trim-empty, all values are Boolean and are set and retrieved like so:
document.getDomConfig().setParameter("extract-text", true);
Object o = document.getDomConfig().getParameter("extract-text");
extract-text - This value can be set to a Boolean; when true, the next normalization of the Document will extract any text that has not yet been extracted, and populate the <bfo:content> elements in the tree with text and <bfo:blob> elements which are (currently) placeholders for images or other graphical operations. Note that if the Document is retrieved from PDFParser.getStructureTree(), you will get the same object, but with this parameter set to true by default.
fix-invalid-xml - The Document is a representation of an internal structure in the PDF, not an actual XML Document. As such it may contain content which is not valid in XML, such as element or attribute names with spaces or other illegal characters. This isn't a problem unless you are trying to import a copy of this Document into a regular XML document. If that's the case, setting this value to true will replace any invalid characters in the tree with underscores.
fix-structure - This setting defaults to true. If there are any restrictions in the OutputProfile that would cause rendering to fail, if this flag is true an attempt to repair the tree will be made. For example, in PDF/UA-1, weak headings (e.g. H1, H2, H3 elements) are required to descend consecutively (H3 must follow H2, not H1). If the Document fails to meet this requirement and fix-structure is set to true, the headings will be renumbered to meet this requirement. Note since 2.28.5 this value is a String: as well as true or false, it can be a space-separated list of things to repair. Specific values currently include table list ruby warichu inline caption heading math root unknown bubble alt list-numbering for different classes of repair, mostly to the hierarchy. For example, table means create table elements as required to fix the hierarchy, and bubble means move block/group elements up in the tree until they have a valid parent.
trim-empty - Documents that have seen pages removed will tend to accumulate empty elements, if the content within those elements was on the removed pages. Setting this property to the String "always" or Boolean true will delete elements with no content descendants that are considered "safe"; this is most elements except those that denote structure, like <td>. Setting this value to the String "move" (the default) will move empty elements along with their siblings if pages are moved to/from a PDF. Setting this value to "none" or Boolean false will leave empty elements unmodified (which was the default behaviour to 2.24.4).
role-map - In PDF, it is possible to "map" one type of element name to another. This allows custom elements to be created without breaking the validation rules; for example, if <Foo> is mapped to <Td> then the structure <Table><Tr><Foo>... is perfectly valid. The mappings are specified by a Map<String,String> which is retrieved from the role-map parameter; unlike the other parameters this cannot be set, although the returned map can be modified. For the previous example you would do rolemap.put("Foo", "Td"). From 2.24.1 it is possible to include namespaces in both the keys and values to this map, by setting the name to uri + "\n" + localname. Names with no prefix are considered to be in the default namespace used by PDF 1.x. For example, in PDF 2.x the above example should be rolemap.put(NS + "\nFoo", NS + "\nTd"), where NS=http://iso.org/pdf2/ssn.
roles - New in 2.28.2, the roles user parameter is a list of namespaces to prioritise. As described for role-map, in PDF it is possible to map one type of element to another in a different namespace. These maps are transitive (an element can be mapped to several namespaces at once) which can get confusing. The roles list can be used to determine which view of the tags you want to take. It is empty by default, but if namespaces are added the element name and namespace will be rolemapped to the first matching namespace in the list. For example:
List<String> roles = (List<String>)document.getDomConfig().getParameter("roles");
roles.add("http://iso.org/pdf2/ssn");
roles.add("http://iso.org/pdf/ssn");
will ensure that if any elements in the tree are role-mapped to the PDF2 or PDF1 namespace, the role-mapped element names are returned instead. All other element names/namespaces are returned as normal.
class-map - Each element in the Structure Tree may belong to one or more "classes". Belonging to a class means the element inherits the attributes defined on that class, although this feature seems to be rarely used. Since 2.24.1 this map of class attributes can be retrieved with the class-map parameter - the returned value is a Map<String,NamedNodeMap>.
lexicons - The Structure Tree may include one or more pronunciation dictionaries stored as PLS (Pronunciation Lexicon Specification 1.0) files. Since 2.24.4 a List<EmbeddedFile> can be retrieved with the lexicons parameter, and altered to add new lexicons if required.
Since 2.26, normal elements and <bfo:content> elements can have
XMP metadata and/or a set of EmbeddedFile objects associated with them, which may be set or retrieved by calling Element.getUserData("metadata") or Element.getUserData("files") respectively. The "metadata" value is set as an XMP, String or Reader and retrieved as an XMP. The "files" property is set as an EmbeddedFile or a collection of the same, and retrieved as a Collection<EmbeddedFile>. In both cases, the returned objects are live and changes to them will be reflected when the PDF is written out.
The presence of each of these structures is indicated by two special attributes, bfo:metadata and bfo:files. If these attributes exist on an element, it will have the corresponding structure present in the user data.
Since 2.28.4, every DOM node has a special read-only "placement" userdata which can be retrieved. This is a Map<PDFPage,Shape> which gives the physical position of this node on the page(s). This is always set for PDFs that have been read in, but not guaranteed to be set for trees that are in the process of being constructed.
Populating the Document with text content requires an Extended Edition plus Viewer license.
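The namespaced role-map key format described above (uri + "\n" + localname) is easy to get wrong by hand, so a pair of small helpers may be useful. The sketch below is illustrative only: `RoleMapKeySketch`, `qualified` and `split` are hypothetical names, the namespace URI is a placeholder, and a plain `LinkedHashMap` stands in for the map that would be retrieved from the role-map parameter via document.getDomConfig().getParameter("role-map").

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: not part of the PDF library. */
final class RoleMapKeySketch {

    /** Build a role-map name: "uri\nlocalname", or just the local name when there is no namespace. */
    static String qualified(String namespaceUri, String localName) {
        return namespaceUri == null ? localName : namespaceUri + "\n" + localName;
    }

    /** Split a role-map name back into { namespaceUri or null, localName }. */
    static String[] split(String name) {
        int i = name.indexOf('\n');
        return i < 0
            ? new String[] {null, name}
            : new String[] {name.substring(0, i), name.substring(i + 1)};
    }

    public static void main(String[] args) {
        // Placeholder namespace: use the URI your document actually declares.
        String ns = "http://example.com/ns";

        // Stand-in for the live map returned by the "role-map" parameter.
        Map<String, String> rolemap = new LinkedHashMap<>();
        rolemap.put("Foo", "Td");                               // un-namespaced form
        rolemap.put(qualified(ns, "Foo"), qualified(ns, "Td")); // namespaced form

        for (Map.Entry<String, String> e : rolemap.entrySet()) {
            String[] from = split(e.getKey());
            String[] to = split(e.getValue());
            System.out.println("{" + from[0] + "}" + from[1] + " -> {" + to[0] + "}" + to[1]);
        }
    }
}
```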
-
getOptionalContentLayers
public java.util.List<OptionalContentLayer> getOptionalContentLayers()
Return the list of OptionalContentLayer objects defined in the PDF. This list will be empty for a freshly created PDF, and any layers created by the user must be added in the order they're required. When an existing PDF has been loaded via a PDFReader, the first call to this method will populate the list with the current state from within the PDF. The list is live, and any changes made to it will be saved when the PDF is saved.
Items may be added to the list more than once but later occurrences will be ignored. Clearing the list will remove all optional content from the PDF.
- Returns:
- the Optional Content list
- Since:
- 2.23.5
-
putUserData
public void putUserData(java.lang.String key, java.lang.Object value)
Set a custom property on the PDF. The property will be saved with the file with the "BFOO_" prefix.
- Parameters:
value - a CharSequence, Number, Date, Calendar, Boolean, byte[], or a List/Map of those values, or null to remove the property
- Since:
- 2.24.2
-
getUserData
public java.lang.Object getUserData(java.lang.String key)
Return a property previously set on the PDF with the putUserData() method
- Returns:
- a String, Boolean, Number, Calendar, byte[] or a Map/List of those values if found, or null if no such property exists.
- Since:
- 2.24.2
-
getEmbeddedFileSource
public EmbeddedFile getEmbeddedFileSource()
When a PDF is loaded from EmbeddedFile.getPDF(), this method will return the EmbeddedFile that contains this object. Otherwise it will return null.
- Since:
- 2.26
-
getDocumentPart
public DocumentPart getDocumentPart()
Return the root DocumentPart, which will never be null but which will be empty unless this file uses DocumentParts
- Since:
- 2.28.3
-
toString
public java.lang.String toString()
-
putLiteral
public void putLiteral(java.lang.String key, java.lang.String tokens)
Put a literal token sequence. For debugging.
- Parameters:
key - the key
tokens - the token sequence, e.g. "true" or "/foo" or "[/Foo/Bar]". No refs, just direct objects.
-
clone
protected java.lang.Object clone()
- Overrides:
clone
in classjava.lang.Object
-
-