Class XMP
java.lang.Object
    org.faceless.pdf2.XMP
public class XMP extends java.lang.Object
The XMP class encapsulates the "Extensible Metadata Platform" format metadata which underpins all PDF metadata since PDF 1.4. While the PDF API has had support for XMP for a very long time, it has all been under the surface and so difficult to work with directly. This class encapsulates the XMP model as defined in ISO16684 (2019), incorporating all the legacy properties going back to 2004.
Most commonly the XMP class will be used to set the PDF metadata, but it can also be used to create a standalone XMP object, which can have content loaded (with read(java.io.Reader)) or written out (with write(java.lang.Appendable)). There is no need to do this when working with the XMP object returned from pdf.getXMP() - the PDF will be updated with the metadata automatically.
First, some simple examples. Here are several ways to set the "Subject" on the PDF.
    // Set the subject on the legacy "Info" dictionary. This will sync "dc:subject" in the XMP to match
    pdf.setInfo("Subject", "My Subject");
    // Set the dc:subject on XMP. This will sync to the "Subject" key in the legacy Info dictionary.
    // Before the XMP class was introduced, this was the only way to set content on the XMP directly
    pdf.setInfo("xmp:dc:subject", "My Subject");
    // With the XMP class, this is easier. This is what the above call to setInfo translates to.
    pdf.getXMP().set("dc:subject", "My Subject");
    // dc:subject is actually a list of Subjects. We can specify the value as a List
    List<String> subjects = new ArrayList<>();
    subjects.add("Subject 1");
    subjects.add("Subject 2");
    pdf.getXMP().set("dc:subject", subjects);
    // Finally, the "dc:subject" string means the "dc:subject" property. You can use the property directly
    XMP xmp = pdf.getXMP();
    XMP.Property property = xmp.getProperty("dc:subject");    // get the property
    property = xmp.getProperty("{http://purl.org/dc/elements/1.1/}subject");    // the same, but using the URI
    XMP.Type type = property.getType();    // The type of "dc:subject" is "Bag Text"
    XMP.Value value = type.create(xmp, subjects);
    xmp.set(property, value);
All the above will achieve the same result, setting the "dc:subject" property to a list of one or more Strings. Here's how to retrieve the value we just set.
    // The only way to get info out of the XMP without parsing it yourself, prior to this class.
    // Although a list, the API forced us to serialize it as a String
    String serialized = pdf.getInfo("xmp:dc:subject");
    // The recommended approach now
    List<String> subject = (List<String>)pdf.getXMP().get("dc:subject");
    // You can also retrieve the "Value" object
    XMP xmp = pdf.getXMP();
    XMP.Value value = xmp.get(xmp.getProperty("dc:subject"));
    if (value != null) {
        subject = (List<String>)value.getData();
    }
    // Another way to get "value"
    value = xmp.getValues().get(xmp.getProperty("dc:subject"));
Properties and Values
When getting or setting values, the main reason to use the XMP.Value class is if you want to either modify an existing value - perhaps by adding an entry to a list - or if you want to set a qualifier. Here's roughly what we do in addHistory() to add an entry to the xmpMM:History property, which is a common Adobe property for recording a history of operations on a file. It's defined as XMP.Type "Seq ResourceEvent", which means it's a List. We want to add a new "ResourceEvent" object to the end of the list.
    // First create a Map with the fields used in the ResourceEvent type
    Map<String,Object> map = new HashMap<>();
    map.put("action", "edited");
    map.put("when", new Date());
    map.put("softwareAgent", "My application name");
    // Retrieve the existing List of Values, creating it if not
    XMP.Property p = xmp.getProperty("xmpMM:History");
    XMP.Value list = xmp.get(p);
    if (list == null) {
        xmp.set(p, list = p.getType().create(xmp, Collections.emptyList()));
    }
    // Add a new entry to the list, by setting the entry at "list.size()"
    XMP.Value entry = p.getType().getComponentType().create(xmp, map);
    list.set(list.size(), entry);
Another reason to use the Value class is if you want to qualify a value. Each value may have a list of zero or more qualifiers, which are themselves properties with values. For instance, the property dc:creator is a list of the creators of the document. Each entry can be qualified to describe the role of the creator. For example:
    // Retrieve the existing List of Values, creating it if not
    XMP.Property p = xmp.getProperty("dc:creator");
    XMP.Property qp = xmp.getProperty("dcq:creatorType");
    XMP.Value list = xmp.get(p);
    if (list == null) {
        xmp.set(p, list = p.getType().create(xmp, Collections.emptyList()));
    }
    // Create a new entry to add to the list, and set a qualifier property on it
    XMP.Value entry = p.getType().getComponentType().create(xmp, "René Goscinny");
    entry.putQualifier(qp, qp.getType().create(xmp, "Author"));
    list.set(list.size(), entry);
    // Add a second entry to the list, reusing the "entry" variable
    entry = p.getType().getComponentType().create(xmp, "Alberto Uderzo");
    entry.putQualifier(qp, qp.getType().create(xmp, "Illustrator"));
    list.set(list.size(), entry);
Defining a custom Schema
The above code, if run exactly as shown, would fail by default as there is no "dcq:creatorType" property. In order to define one we need to add a custom Schema. This is very simple:
    XMP.Schema schema = new XMP.Schema("http://purl.org/dc/qualifiers/1.0/", "dcq", "The (superseded) Dublin Core Qualifiers namespace");
    schema.newProperty("creatorType", xmp.getType("Text"), "The creatorType qualifier", true);
    xmp.addSchema(schema);
A Schema can be shared across multiple XMP objects, across multiple threads - although it shouldn't be modified after it's been added to an XMP.
A Schema can declare new Types as well as new Properties. Types are declared on a Schema but are accessed by Name, so Type names should be globally unique. As an example, let's say we want to store a list of HTTP headers as part of our metadata. Each header has a "header" and a "value" component, and we want to store a List of these headers, because a header may be repeated. Here's how we might do this.
    XMP xmp = pdf.getXMP();
    XMP.Schema schema = new XMP.Schema("http://example.org/ns/http/", "http", "An example HTTP header schema");
    // First we declare an "HttpHeader" type with two fields.
    XMP.Type type = schema.newType("HttpHeader", "An HTTP header type");
    type.newField("header", xmp.getType("Text"), "The HTTP header name");
    type.newField("value", xmp.getType("Text"), "The value of a header");
    // Then we want to declare a new property, "httpHeaders", which is a "Seq HttpHeader" - a List of the
    // type we just declared.
    schema.newProperty("httpHeaders", XMP.Type.seqOf(type), "A list of HTTP headers", true);
    xmp.addSchema(schema);
    // Finally, let's populate our XMP with some data using this new Property
    String[] data = {
        "Content-Type", "text/html",
        "Date", "Wed, 11 Nov 2020 19:47:20 GMT",
        "Server", "Apache"
    };
    List<Map<String,String>> headers = new ArrayList<>();
    for (int i=0;i<data.length;) {
        Map<String,String> map = new HashMap<>();
        map.put("header", data[i++]);    // "header" and "value" are the names of our fields
        map.put("value", data[i++]);
        headers.add(map);
    }
    xmp.set("http:httpHeaders", headers);
Content created in accordance with a custom schema will be correctly serialised to the XMP object, with a PDF/A extension schema written if required. It will also be used as part of a RelaxNG schema generated by generateRelaxNGSchema.
Validation (ad-hoc)
Validation of XMP is a theoretical possibility, but in practice not as easy as it looks. XMP has not been designed for formal validation: types have changed over time, version numbering of specifications is patchy, and there is no firm concept of a "valid" complex type - no formal definition of whether a field is required or optional, or what combinations they can be used in. XMP validation was a goal of PDF/A that has been gradually relaxed as it was found to be unworkable in too many cases, and is now optional in PDF/A-4 (where it is done with a RelaxNG schema, rather than the ad-hoc PDF/A "extensions" model).
Having said that, this API provides the tools to allow validation of XMP content if desired, and validation for known properties is on by default. When creating new XMP content, if isValidating() is true (the default) it is impossible to create content of an incorrect type. The API does this by enforcing the following:
- When calling Type.create() to create new content to add to an XMP, the content must match the Type in question.
- When adding a Value to the XMP, the Value must match the Type of the Property.
The exception to the above is where a new property is created to have an undefined type by calling set(String,Object). Any value can be set in this case. For PDF/A-4, or when you just don't care about the schema, call setValidating(boolean) to disable these checks.
When reading in new content, the parser will consume any well-formed XMP regardless of type mismatches. There are a number of OutputProfile.Features that will be set to show when a Value is of the wrong type - for example, XMPMetaDataTypeMismatch, XMPMetaDataTypeMatches2005 and XMPMetaDataTypeUnknownField. The various PDF/A profiles are set to deny an appropriate subset of these. When traversing an XMP, it's easy enough to verify if a value matches its property by comparing their types:
    XMP xmp = pdf.getXMP();
    // getValues() returns a Map, so iterate over its entry set
    for (Map.Entry<XMP.Property,XMP.Value> e : xmp.getValues().entrySet()) {
        XMP.Property p = e.getKey();
        XMP.Value v = e.getValue();
        if (!p.getType().equals(v.getType())) {
            System.out.println("Property " + p + " has wrong type");
        }
    }
A type is considered invalid when reading if a simple type doesn't match - for instance, a Property or Field declared as a "Date" cannot be parsed as a Date. A "Seq Date" containing a single entry that is invalid, is itself invalid, and the same applies to "Bag" or "Alt".
A complex type is not considered invalid if it has missing or additional fields - for example, if we were reading a PDF with the "httpHeaders" metadata we defined above, and one of the HttpHeader values was missing a "value" field, or had an additional unknown field. If this sort of validation is required, consider setting a RelaxNG schema with setRelaxNGSchema(). Any undefined fields will have a type where XMP.Type.isUndefined() returns true.
Validation (with RelaxNG)
New in ISO16684-2:2019 (the 2019 version of the XMP specification) is the ability to validate an XMP object with a RelaxNG schema. Including a suitable Schema with the metadata is recommended (but not required) in PDF/A-4. The getRelaxNGSchema(), setRelaxNGSchema() and generateRelaxNGSchema() methods can be used to do this, and the validateRelaxNGSchema() method used to validate the XMP against a schema - although as no RelaxNG validator ships with the JVM at this time, this will probably require a third-party library (specifically, Jing). The API docs for those methods will give more detail.
Interaction with other tools
The XMP metadata is visible in the "Document Information" dialog in Acrobat. The correspondence is:
    Acrobat Field        Property Name                Notes
    Application          xmp:CreatorTool
    Document Title       dc:title                     The value with language "x-default"
    Author               dc:creator                   First item in the list
    Author Title         photoshop:AuthorsPosition
    Description          dc:description               The value with language "x-default"
    Description Writer   photoshop:CaptionWriter
    Keywords             dc:subject                   List items are joined with semi-colons in Acrobat
    Copyright Status     xmpRights:Marked             Boolean
    Copyright Notice     dc:rights
    Copyright Info       xmpRights:WebStatement       URL of a copyright notice
The metadata in the legacy "Info Dictionary" (retrievable from PDF.getInfo()) is also mapped to XMP fields. Updating one structure will update the other automatically, so there is usually no need to worry about this, but it may arise that an Info dictionary and XMP need to be kept in sync manually. The mapping of fields to XMP values is fixed, so if you wanted to migrate the content from an Info dictionary to XMP, we recommend the following code:
    // Migrate from Info to XMP
    String[] keys = new String[] {
        "xmp:CreateDate", "_CreationDate",
        "xmp:ModifyDate", "_ModDate",
        "xmp:CreatorTool", "Creator",
        "dc:description", "Subject",
        "dc:creator", "Author",
        "dc:title", "Title",
        "pdf:Keywords", "Keywords",
        "pdf:Producer", "Producer",
        "pdf:Trapped", "Trapped"
    };
    for (int i=0;i<keys.length;) {
        String xmpkey = keys[i++];
        String infokey = keys[i++];
        if (xmp.get(xmpkey) == null) {
            Object value = pdf.getInfo().get(infokey);
            // "Trapped" is only copied if it is "True" or "False"
            if (!infokey.equals("Trapped") || "True".equals(value) || "False".equals(value)) {
                xmp.set(xmpkey, value);
            }
        }
    }
The reverse direction can be easily derived if required; the exceptions to the obvious mappings are:
- for "dc:creator" (a list of values), "Author" is set to the first entry in the list
- for "dc:description" and "dc:title", "Subject" and "Title" respectively are set to the entry in the list that has a language of "x-default"
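The two exceptions listed above can be illustrated in plain Java. This is a minimal sketch only: ordinary collections stand in for the XMP values, it assumes a list-valued property is available as a List and a language-alternate property as a Map keyed by language, and the class and method names (InfoMappingSketch, authorFrom, defaultLanguageFrom) are hypothetical and not part of this API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the API: plain collections stand in for XMP values.
final class InfoMappingSketch {

    // "Author" takes the first entry of the "dc:creator" list, or null if there is none.
    static String authorFrom(List<String> creators) {
        return (creators == null || creators.isEmpty()) ? null : creators.get(0);
    }

    // A language-alternate value maps to the entry whose language is "x-default",
    // or null if no such entry exists.
    static String defaultLanguageFrom(Map<String, String> langAlt) {
        return langAlt == null ? null : langAlt.get("x-default");
    }

    public static void main(String[] args) {
        List<String> creators = Arrays.asList("René Goscinny", "Alberto Uderzo");
        Map<String, String> title = new LinkedHashMap<>();
        title.put("x-default", "My title");
        title.put("fr", "Mon titre");
        System.out.println("Author = " + authorFrom(creators));
        System.out.println("Title  = " + defaultLanguageFrom(title));
    }
}
```

Both helpers return null rather than an empty string when nothing suitable is found, so the caller can decide whether to leave the Info key unset.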
- Since:
- 2.24.4
-
-
Nested Class Summary
Modifier and Type / Class / Description
static class XMP.Property
    A Property is a "key" for any values set on the XMP.
static class XMP.Schema
    A Schema is a collection of properties and types, grouped together under a single XML namespace.
static class XMP.Type
    A Type determines the underlying Type of a Property.
static class XMP.Value
    A Value is a typed-value which is stored in the XMP against an XMP.Property.
-
Constructor Summary
Constructor / Description
XMP()
    Create a new XMP.
-
Method Summary
Modifier and Type / Method / Description
void addAll(XMP xmp)
    Add all the properties and extensions from the supplied XMP object into this XMP object.
XMP.Value addDeclaration(java.lang.String conformsTo, java.lang.String claimant, java.lang.String credentials, java.lang.String report, java.util.Calendar when)
    Add a PDF Declaration to the "pdfd:declarations" structure in the Metadata, creating it if necessary.
XMP.Value addHistory(java.lang.String action, java.lang.String parameters, java.lang.String softwareAgent, java.lang.String instanceID, java.util.Calendar when)
    Add an event to the "xmpMM:History" structure in the Metadata, creating it if necessary.
XMP.Schema addSchema(XMP.Schema schema)
    Add a new Schema to this XMP.
void clear()
    Remove any properties, schemas or types set on this XMP.
java.lang.String generateRelaxNGSchema(java.util.Collection<XMP.Property> properties)
    Generate a RelaxNG Schema which will fully describe the specified Properties.
java.lang.Object get(java.lang.String name)
    If a property with the specified name is present in the XMP, return the value it's set to, otherwise return null.
XMP.Value get(XMP.Property key)
    Return the value of the specified property as set on this XMP.
java.util.Collection<XMP.Schema> getAllSchemas()
    Return a read-only set of all Schemas available to this XMP object.
java.util.Collection<java.lang.Object> getOwners()
    Return the set of presumed "owners" of this XMP - the object(s) the XMP is associated with.
XMP.Property getProperty(java.lang.String name)
    Return the Property matching the specified name.
java.lang.String getRelaxNGSchema()
    Return any RelaxNG Schema associated with this Metadata.
XMP.Schema getSchema(java.lang.String uri)
    Return the Schema from the Collection returned by getAllSchemas() that matches the specified URI, or null if not found.
java.util.Collection<XMP.Schema> getSchemas()
    Return a read-only set of Schemas explicitly added to this XMP object.
XMP.Type getType(java.lang.String name)
    Return the specified Type, if it's known to the XMP, or null otherwise.
java.util.Map<XMP.Property,XMP.Value> getValues()
    Return a live, read-only view of all the values set on this XMP object.
OutputProfile getXMPOutputProfile()
    Return a partial OutputProfile that reflects only the features that apply to this XMP object.
boolean isEmpty()
    Return true if this XMP is empty and has no properties.
boolean isValid()
    Return true if the XMP content passed to read(java.io.Reader) was valid XMP format.
boolean isValidating()
    Return true if this XMP object is validating (the default is true).
boolean read(java.io.Reader reader)
    Read the XMP stream from the supplied reader, and return true if it contains a valid XMP stream.
boolean removeDeclaration(java.lang.String conformsTo)
    Remove a declaration such as one previously added by addDeclaration(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.util.Calendar).
void repair(PDF pdf, OutputProfile target, OutputProfiler.Strategy... strategy)
    Attempt to repair this XMP object to match the specified target, using the specified strategy.
XMP.Value set(java.lang.String name, java.lang.Object data)
    Set a value on this XMP object.
void set(XMP.Property p, XMP.Value value)
    Set the specified property to have the specified value, or if the value is null, delete the specified property from this XMP.
void setEntityResolver(org.xml.sax.EntityResolver resolver)
    If external entities need to be resolved, set the EntityResolver to use.
void setRelaxNGSchema(java.lang.String schema)
    Set the RelaxNG Schema associated with this Metadata.
void setValidating(boolean validating)
    Set whether this XMP is validating or not.
java.lang.String toString()
    Return the XMP as a String, by calling write(java.lang.Appendable) with a new StringBuilder.
boolean validateRelaxNGSchema(java.lang.String schema, org.xml.sax.ErrorHandler handler)
    Attempt to validate the current XMP object against the supplied RelaxNG schema.
void write(java.lang.Appendable w)
    Write this XMP object to the supplied Appendable.
-
-
-
Method Detail
-
isEmpty
public boolean isEmpty()
Return true if this XMP is empty and has no properties. An empty XMP is what's returned from PDF.getXMP() if no XMP content existed in the file. An empty XMP is never written to the PDF. Calling toString() on an empty, valid XMP will return the empty string.
- Since:
- 2.26
-
isValid
public boolean isValid()
Return true if the XMP content passed to read(java.io.Reader) was valid XMP format. If it's not valid the XMP will be empty (isEmpty() will return true), and the toString() and write(java.lang.Appendable) methods will reflect the invalid text. Setting any properties on an invalid XMP will reset it to valid.
- Since:
- 2.26
-
isValidating
public boolean isValidating()
Return true if this XMP object is validating (the default is true).
- Since:
- 2.26.3
- See Also:
setValidating(boolean)
-
setValidating
public void setValidating(boolean validating)
Set whether this XMP is validating or not. When validating, only known properties can be passed to the set(java.lang.String, java.lang.Object) methods, and the values must match the expected formats for those properties. In non-validating mode, any String can be passed as a property name; it will be parsed as normal (i.e. prefix:name) or as a URL. Values can be a Map, Collection, or a primitive type and will be recursively expanded.
- Since:
- 2.26.3
-
getOwners
public java.util.Collection<java.lang.Object> getOwners()
Return the set of presumed "owners" of this XMP - the object(s) the XMP is associated with. This is populated when a full OutputProfile of the PDF is generated with the OutputProfiler class, and will otherwise be empty.
The BFO API does not give full access to the PDF object model, and XMP Metadata can be associated with literally anything, including more than one object at a time. So in some cases the object returned will have no meaning, and may even be null if it cannot be determined. Where it is returned, in most cases it will be a PDF, PDFPage, PDFAnnotation, PDFImage, PDFFont or an Element accessible from the PDF.getStructureTree() method. All of these objects have methods for managing Metadata directly.
- Since:
- 2.26
- See Also:
EmbeddedFile.getOwners(), OutputProfile.getXMPs()
-
setEntityResolver
public void setEntityResolver(org.xml.sax.EntityResolver resolver)
If external entities need to be resolved, set the EntityResolver to use. This is not required for normal use, but is required when reading a JSON-LD schema that references an external @context.
- Parameters:
resolver - the EntityResolver that will be used to retrieve any external resources referenced
-
getProperty
public XMP.Property getProperty(java.lang.String name)
Return the Property matching the specified name. The name may be supplied with a standard prefix, eg xmpMM:InstanceID, or it may be supplied with a namespace, eg {http://ns.adobe.com/xap/1.0/}InstanceID. If it can be matched to any Schema known to this XMP, it will be returned, otherwise it will return null.
- Parameters:
name - the name of the property
- Returns:
- the Property, or null if not found
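The two name forms accepted here can be told apart with a few lines of plain Java. The sketch below is an illustration only and is not the library's parser: the class PropertyNameSketch and its split method are hypothetical, and how the real implementation treats malformed names is an open assumption.

```java
import java.util.Arrays;

// Hypothetical illustration, not the library's parser.
final class PropertyNameSketch {

    // Returns {kind, qualifier, localName}, where kind is "prefix" or "uri",
    // or null if the name matches neither "prefix:name" nor "{uri}name".
    static String[] split(String name) {
        if (name == null) {
            return null;
        }
        // The "{uri}name" form is tested first, because a URI itself contains colons.
        if (name.startsWith("{")) {
            int close = name.indexOf('}');
            if (close > 1 && close < name.length() - 1) {
                return new String[] { "uri", name.substring(1, close), name.substring(close + 1) };
            }
            return null;
        }
        int colon = name.indexOf(':');
        if (colon > 0 && colon < name.length() - 1) {
            return new String[] { "prefix", name.substring(0, colon), name.substring(colon + 1) };
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("xmpMM:InstanceID")));
        System.out.println(Arrays.toString(split("{http://ns.adobe.com/xap/1.0/}InstanceID")));
    }
}
```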
-
getType
public XMP.Type getType(java.lang.String name)
Return the specified Type, if it's known to the XMP, or null otherwise. The supplied name may be a simple type name (eg Integer, Text, Boolean, URL, Date), a complex type defined in one of the known Schema (eg "ResourceEvent"), or one of those values prefixed by "Alt", "Seq" or "Bag" - for example, "Bag Text" or "Seq ResourceEvent". The value "Lang Alt" may be used to retrieve a "Language Alternate" type.
- Parameters:
name - the name of the Type
- Returns:
- the Type, or null if not found.
-
set
public XMP.Value set(java.lang.String name, java.lang.Object data)
Set a value on this XMP object. If the specified property is known, this method is identical to the following code:
    XMP.Property property = xmp.getProperty(name);
    XMP.Value value = property.getType().create(xmp, data);
    xmp.set(property, value);
If the property is not known, it will be created with an undefined type. This allows values to be quickly set on the XMP, although the resulting XMP would not validate against PDF/A-1, 2 or 3, or against a PDF/A-4 Schema.
Some examples.
    // xmp:Rights is of type "Text" - it takes a String
    xmp.set("xmp:Rights", "The Keywords Value");
    // dc:date is of type "Seq Date" - it takes a list of dates
    List<Date> list = Collections.<Date>singletonList(new Date());
    xmp.set("dc:date", list);
    // As a convenience, where a List is required but a single item
    // supplied, we'll promote it to a single-item list. This is
    // equivalent to the above example
    xmp.set("dc:date", new Date());
    // As another convenience, where the type is a "Lang Alt", a Map<String,String>
    // can be supplied where the map-key is the language and the map-value is the Text value.
    xmp.set("dc:title", Collections.<String,String>singletonMap("x-default", "my title"));
    // For "Lang Alt" types, where the only language to be set is "x-default", the value can
    // be supplied as a simple type. This is identical to the above call.
    xmp.set("dc:title", "my title");
    xmp.set("dc:title", null);    // Delete the "dc:title" value if it exists.
    // Add a new custom property of undefined type, and set a value on it
    xmp.set("{http://mydomain.com/}myProperty", "custom");
Some special values can also be set:
- "id" - this is the ID of the resource. It will be used as the "rdf:about" value of the metadata, and is optional. If this value is set prior to a call to read(java.io.Reader), and the selected resource specifies an ID that is not blank and that does not match this one, the resource will be ignored.
- "indent" - set this to a number greater than zero to pretty-print any generated XML by indenting the specified amount. The default is 0.
- "use-xmpmeta" - set this to any value other than null or False to wrap the generated XMP in an optional xmpmeta element.
- Parameters:
name - the name of the property
data - the value of the object, or null to delete the property
- Returns:
- the Value created to contain the property
- Throws:
java.lang.IllegalArgumentException - if the supplied "data" value cannot be parsed to match the property type
ProfileComplianceException - if this XMP belongs to a PDF and the PDF OutputProfile does not allow this property
-
set
public void set(XMP.Property p, XMP.Value value)
Set the specified property to have the specified value, or if the value is null, delete the specified property from this XMP.
- Parameters:
p - the property
value - the value
- Throws:
ProfileComplianceException - if this XMP belongs to a PDF and the PDF OutputProfile does not allow this property
-
addHistory
public XMP.Value addHistory(java.lang.String action, java.lang.String parameters, java.lang.String softwareAgent, java.lang.String instanceID, java.util.Calendar when)
Add an event to the "xmpMM:History" structure in the Metadata, creating it if necessary.
- Parameters:
action - the action performed (eg "created", "signed"). Must not be null
parameters - the parameters to the action, free text which may be null
softwareAgent - the name of the software performing this action, free text which may be null
instanceID - the ID of the original document. If null this will be populated from the current "xmpMM:InstanceID" automatically.
when - when this action was performed - if null, will default to now.
- Returns:
- the newly created Value record
-
addDeclaration
public XMP.Value addDeclaration(java.lang.String conformsTo, java.lang.String claimant, java.lang.String credentials, java.lang.String report, java.util.Calendar when)
Add a PDF Declaration to the "pdfd:declarations" structure in the Metadata, creating it if necessary. The report field is the URL of a report detailing the conformance. If the report is attached to this PDF in the PDF.getEmbeddedFiles() map, the URL should be #ef=name, where name is the key the file is stored with in that map.
- Parameters:
conformsTo - the URL specifying the standard or profile referred to by the PDF Declaration (required)
claimant - the name of the organization and/or individual and/or software making the claim (optional)
credentials - the claimant's credentials (optional)
report - a URL to a report containing details of the specific conformance claim (optional)
when - a date identifying when the claim was made (optional)
- Returns:
- the newly created Value record
- Since:
- 2.27.2
-
removeDeclaration
public boolean removeDeclaration(java.lang.String conformsTo)
Remove a declaration such as one previously added by addDeclaration(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.util.Calendar). If more than one declaration exists (it shouldn't), remove the first one.
- Returns:
- true if the declaration existed and was removed
- Since:
- 2.28.4
-
addAll
public void addAll(XMP xmp)
Add all the properties and extensions from the supplied XMP object into this XMP object.
- Parameters:
xmp - the XMP object to merge over this XMP object
- Since:
- 2.26.3
-
get
public java.lang.Object get(java.lang.String name)
If a property with the specified name is present in the XMP, return the value it's set to, otherwise return null. This method is a convenient shortcut for
    XMP.Property p = xmp.getProperty(name);
    if (p != null) {
        XMP.Value v = xmp.get(p);
        if (v != null) {
            return v.getData();
        }
    }
    return null;
- Parameters:
name - the name of the property
- Returns:
- the property value if that property is set, or null otherwise
-
get
public XMP.Value get(XMP.Property key)
Return the value of the specified property as set on this XMP.
- Parameters:
key - the property
- Returns:
- the value if set, or null otherwise
-
getValues
public java.util.Map<XMP.Property,XMP.Value> getValues()
Return a live, read-only view of all the values set on this XMP object.
-
getXMPOutputProfile
public OutputProfile getXMPOutputProfile()
Return a partial OutputProfile that reflects only the features that apply to this XMP object.
- Since:
- 2.26.3
-
repair
public void repair(PDF pdf, OutputProfile target, OutputProfiler.Strategy... strategy) throws ProfileComplianceException
Attempt to repair this XMP object to match the specified target, using the specified strategy. This method is mostly useful when you're constructing a new PDF with existing content, such as images, which may have XMP that is invalid. Calling this method on the XMP associated with the images allows you to clean them up before saving.
- Parameters:
pdf - the PDF this XMP is going to be added to
target - the target OutputProfile, for example OutputProfile.PDFA1b_2005
strategy - the strategy to use for repair, which will commonly be OutputProfiler.Strategy.JustFixIt
- Throws:
ProfileComplianceException - if a particular aspect of the XMP cannot be repaired with the specified strategy.
- Since:
- 2.26.3
-
getRelaxNGSchema
public java.lang.String getRelaxNGSchema()
Return any RelaxNG Schema associated with this Metadata. The association of a RelaxNG schema with a Metadata object is new in PDF/A-4 and is optional. If no schema is set, this returns null.
- Returns:
- the RelaxNG schema stored with this XMP object, as a String
-
setRelaxNGSchema
public void setRelaxNGSchema(java.lang.String schema)
Set the RelaxNG Schema associated with this Metadata. The association of a RelaxNG schema with a Metadata object is new in PDF/A-4 and is optional. The supplied object should be a valid RelaxNG schema, possibly just the one generated by generateRelaxNGSchema(java.util.Collection<org.faceless.pdf2.XMP.Property>). No validation is performed on the input. A value of null will remove any existing Schema.
Note that although a schema is optional in PDF/A-4, if one is set it must meet the RelaxNG specification.
- Parameters:
schema - the RelaxNG Schema, or null to remove it.
-
generateRelaxNGSchema
public java.lang.String generateRelaxNGSchema(java.util.Collection<XMP.Property> properties)
Generate a RelaxNG Schema which will fully describe the specified Properties. Generating a Schema from the XML is admittedly backwards - the list of properties in use, and the qualifiers on each of them, should be known in advance. However, a RelaxNG schema is moderately complex and being able to generate one even as a template is still useful.
A typical use would be:
    XMP xmp = pdf.getXMP();
    xmp.set("indent", 1);    // Pretty print our output
    xmp.set("dc:title", "Document Title");
    String schema = xmp.generateRelaxNGSchema(xmp.getValues().keySet());
    xmp.setRelaxNGSchema(schema);
The list of properties can be customized to include properties that haven't yet been set on the metadata, which will allow new properties to be added to the Metadata in the future without invalidating the schema. The goal of this method is that the following assertion holds true:
    String schema = xmp.generateRelaxNGSchema(xmp.getValues().keySet());
    assert xmp.validateRelaxNGSchema(schema, null);
The one problem with this assertion is that some XMP properties are set very late, during the rendering process. Another option to generate a RelaxNG schema that exactly describes the state of the XMP object is to set the OutputProfile.Feature.XMPMetaDataRelaxNGSchema to be required on the PDF OutputProfile. Provided no schema is currently set, one will be generated during the render process that completely describes all the properties in the XMP.
The generated RelaxNG is more easily customized if the "indent" property on the XMP object has been set first, as shown in the above example.
- Parameters:
properties - the list of Properties to include in the schema. If null, the properties currently set on this XMP object are used.
- Returns:
- a RelaxNG schema which should validate the current XMP object.
-
validateRelaxNGSchema
public boolean validateRelaxNGSchema(java.lang.String schema, org.xml.sax.ErrorHandler handler) throws org.xml.sax.SAXException
Attempt to validate the current XMP object against the supplied RelaxNG schema. This method requires a RelaxNG schema validator available to Java - we have tested with Jing (note you will need the build from GitHub, not the build dated 2009 from code.google.com).
This method is intended as a quick, simple test. The schema is recompiled each time, so if efficiency is important to you then repeated calls to this method are not the best approach.
- Parameters:
schema - the RelaxNG schema to validate against
handler - an optional ErrorHandler to receive the warning/error events (if any) emitted by the validator.
- Returns:
- true if the Schema validates this XMP object, false otherwise.
- Throws:
org.xml.sax.SAXException
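Since the handler parameter is a standard org.xml.sax.ErrorHandler, one way to see why validation failed is to pass a handler that records each event. The following is a minimal sketch using only JDK classes; the class name CollectingErrorHandler is hypothetical, and which events a given validator actually emits is an assumption that depends on the validator in use.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper: records validator events instead of throwing.
final class CollectingErrorHandler implements ErrorHandler {
    private final List<String> messages = new ArrayList<>();

    @Override public void warning(SAXParseException e) { record("warning", e); }
    @Override public void error(SAXParseException e) { record("error", e); }
    @Override public void fatalError(SAXParseException e) { record("fatal", e); }

    // getLineNumber() returns -1 when the event carries no location.
    private void record(String level, SAXParseException e) {
        messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    List<String> getMessages() {
        return messages;
    }

    public static void main(String[] args) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        // Simulate an event a validator might emit; no Locator is supplied here.
        handler.error(new SAXParseException("element not allowed here", null));
        System.out.println(handler.getMessages());
    }
}
```

An instance would be passed as the second argument, for example xmp.validateRelaxNGSchema(schema, handler), and getMessages() inspected when the method returns false.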
-
getAllSchemas
public java.util.Collection<XMP.Schema> getAllSchemas()
Return a read-only set of all Schemas available to this XMP object. This will include any default Schema, as well as any loaded in by way of a PDF/A Extension, or any added by the user by calling addSchema(org.faceless.pdf2.XMP.Schema). If this XMP is dependent on a parent XMP, it will include any Schema from the parent in the list.
-
getSchemas
public java.util.Collection<XMP.Schema> getSchemas()
Return a read-only set of Schemas explicitly added to this XMP object. This will not include default or inherited Schema - for that, see getAllSchemas().
-
getSchema
public XMP.Schema getSchema(java.lang.String uri)
Return the Schema from the Collection returned by getAllSchemas() that matches the specified URI, or null if not found.
- Parameters:
uri - the URI to compare the XMP.Schema.getURI() against
- Returns:
- the matching Schema, or null if not found.
-
addSchema
public XMP.Schema addSchema(XMP.Schema schema)
Add a new Schema to this XMP. Any Properties or Types defined in the Schema can then be used in this XMP: if they are, a PDF/A schema extension will be written when the PDF is saved and the OutputProfile demands it - specifically, this means the PDF OutputProfile when the file is saved must disallow one of the XMPMetaDataMissing2004, XMPMetaDataMissing2005 or XMPMetaDataMissing2008 features. This is already done by the PDF/A-1, A-2 and A-3 OutputProfiles, which is why schema extensions are written for those types of PDF.
If the Schema has the same namespace as an existing Schema, the old Schema will be replaced.
Calling this method after properties have been added to this XMP will update the Properties and Types used to define those properties. However it will not update the list of OutputProfile.Feature features - if a document was read with undefined properties, adding a Schema afterwards that defines those properties will not suddenly make it PDF/A compliant.
- Parameters:
schema - the Schema
- Returns:
- the Schema passed in.
-
clear
public void clear()
Remove any properties, schemas or types set on this XMP. Will effectively reset it to the state it would be immediately after the constructor was called.
-
read
public boolean read(java.io.Reader reader)
Read the XMP stream from the supplied reader, and return true if it contains a valid XMP stream. For example, here is how to serialize XMP content to a PDF entirely independently of the PDF.getXMP() method:
    XMP xmp = new XMP();
    xmp.read(pdf.getMetaData());
    xmp.set("pdf:Trapped", "True");
    pdf.setMetaData(xmp.toString());
- Parameters:
reader - the Reader to read the XMP stream from
- Returns:
- true if the supplied Reader contained a valid XMP metadata stream, false otherwise.
-
write
public void write(java.lang.Appendable w) throws java.io.IOException
Write this XMP object to the supplied Appendable. This method is called by the toString() method and that may be easier to use. Note that XMP is defined against XML 1.1, at least since 2005. So the output may theoretically contain control chars between U+0001 and U+001F which are invalid in XML 1.0.
- Parameters:
w - the Appendable to write to
- Throws:
java.io.IOException - if the Appendable throws an IOException while writing.
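If the output has to be consumed by an XML 1.0 parser, one option is to check text values for the affected control characters before setting them on the XMP. The sketch below is an assumption-laden illustration rather than anything this API provides: the class Xml10CharSketch and its method are hypothetical, and it only covers the U+0001 to U+001F range mentioned above, treating tab, line feed and carriage return as allowed because XML 1.0 permits them.

```java
// Hypothetical helper, not part of the API.
final class Xml10CharSketch {

    // True if the text contains a control character in the range U+0001 to U+001F
    // that XML 1.0 forbids; tab, line feed and carriage return are allowed.
    static boolean hasXml10InvalidControl(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x01 && c <= 0x1F && c != '\t' && c != '\n' && c != '\r') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasXml10InvalidControl("plain\ttext\n"));          // prints false
        System.out.println(hasXml10InvalidControl("bell" + (char) 7 + "!"));  // prints true
    }
}
```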
-
toString
public java.lang.String toString()
Return the XMP as a String, by calling write(java.lang.Appendable) with a new StringBuilder. If the XMP is invalid, it returns the original value that was passed into read(java.io.Reader).
- Overrides:
toString in class java.lang.Object
-
-