Interface ArlingtonModelIssue
-
public interface ArlingtonModelIssue
This interface represents an "issue" reported by comparing a PDF against the Arlington Model, a formal description of the PDF file format described at https://github.com/pdf-association/arlington-pdf-model.
Arlington Model validation is a bit like "spell checking" a PDF. It compares each PDF object against the list of requirements in the specification and reports where the file deviates from the specification. In the majority of cases the deviations found will be fairly inconsequential, as errors important enough to cause a noticeable problem tend to get fixed.
By contrast, the OutputProfile class and the OutputProfiler.apply(org.faceless.pdf2.OutputProfile) method can also be used to repair damage to a PDF, but they tend to focus on the bigger problems - damaged fonts, damaged structures and so on; the kind of damage that tends to be harder to diagnose and repair, and to cause visible problems.
For this reason "Arlington Model" issues and "OutputProfile" issues are entirely independent, and both can be used to validate and repair a PDF. Most PDF products create files that somehow fail to match the Arlington Model, including this API prior to release 2.27.2. Repairing these issues where possible is a good idea, and is very lightweight compared to the OutputProfiler approach.
Note: the Arlington Model is still under active development, and there will be some false positives identified (although none of those will be repairable).
Usage
To run the Arlington Model against a PDF, code like the following would be fairly typical.
OutputProfiler profiler = new OutputProfiler(pdf);
List<ArlingtonModelIssue> list = profiler.getArlingtonModelIssues();
for (ArlingtonModelIssue issue : list) {
    if (issue.getRepairType() != null) {
        issue.repair(null);
    } else {
        System.out.println("Can't repair: " + issue);
    }
}
- Since:
- 2.27.2
- See Also:
OutputProfiler.getArlingtonModelIssues()
-
-
Method Summary
Modifier and Type   Method                      Description
java.lang.Object    getChildValue()             Return the "child" object of this issue.
int                 getIndex()                  Return the index into the parent object that caused the warning that was being processed, if the parent object is an array.
java.lang.String    getKey()                    Return the key within the parent object that caused the warning that was being processed, if the parent object is a dictionary or stream.
java.lang.String    getMessage()                Return the message associated with this Result.
java.lang.Object    getParentValue()            Return the "parent" object of this issue.
java.lang.String    getPath()                   Return the "PDF path" to the object that caused the warning.
PDF                 getPDF()                    Return the PDF this Result applies to.
java.lang.String    getRepairType()             Return a brief description of the Repair that will be made if this issue is repaired by calling repair(java.lang.Object), or null if no repair is possible.
java.lang.String    getRepairWarning()          If there is anything to consider before applying a Repair, this method will return a textual description of the implications.
java.lang.String    getTable()                  Return the name of the Table in the Model that caused the warning, or null if no Table applied.
java.lang.String    getVersionSuggestion()      If the error could potentially be fixed by increasing the version number of the PDF, return the minimum version that would be required.
boolean             isDeprecation()             Return true if this issue is because a property is used in the PDF that has been deprecated in the version specified in the PDF.
boolean             isError()                   Return true if this issue is an "Error" - a condition is described in the model, and this item fails to meet that condition.
boolean             isFromLaterVersion()        Return true if this issue is because a property is used in the PDF that is first described in a later version of the PDF specification.
boolean             repair(java.lang.Object o)  Attempt to repair the issue.
-
-
-
Method Detail
-
getPDF
PDF getPDF()
Return the PDF this Result applies to
-
getPath
java.lang.String getPath()
Return the "PDF path" to the object that caused the warning. This is always the "parent" object - for example, if a value in a dictionary is incorrect, it's the path to the dictionary, not the value. It may not be the shortest path, and it is quite possible for the same object to be reported in two different issues with two different paths and models.
- Returns:
- the PDF path to the object this issue was identified on
-
getTable
java.lang.String getTable()
Return the name of the Table in the Model that caused the warning, or null if no Table applied.
- Returns:
- the Arlington Model Table name
-
getKey
java.lang.String getKey()
Return the key within the parent object that caused the warning that was being processed, if the parent object is a dictionary or stream. Otherwise returns null.
- Returns:
- the key in the parent object, or null
-
getIndex
int getIndex()
Return the index into the parent object that caused the warning that was being processed, if the parent object is an array. Otherwise returns -1.
- Returns:
- the index in the parent object, or -1
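As an illustration of how getPath(), getKey() and getIndex() fit together, the following is a minimal sketch that builds a human-readable location for an issue. The nested Location interface is a hypothetical stand-in declaring only those three accessors so that the example compiles without the library, and the output format and the "examplePath" value are placeholders chosen for this sketch, not a real PDF path.

```java
/** Sketch: describe where an issue was found using the documented getPath/getKey/getIndex contract. */
final class IssueLocationSketch {

    /** Hypothetical stand-in declaring only the three accessors used here. */
    interface Location {
        String getPath();
        String getKey();   // null unless the parent is a dictionary or stream
        int getIndex();    // -1 unless the parent is an array
    }

    /** Appends the key or index to the parent path; the format is this sketch's own choice. */
    static String describe(Location issue) {
        String path = issue.getPath();
        if (issue.getKey() != null) {
            return path + " (key \"" + issue.getKey() + "\")";
        }
        if (issue.getIndex() >= 0) {
            return path + " (index " + issue.getIndex() + ")";
        }
        return path; // neither a key nor an index applies
    }

    /** Fixed-value implementation used only to exercise describe(). */
    static Location of(String path, String key, int index) {
        return new Location() {
            public String getPath() { return path; }
            public String getKey() { return key; }
            public int getIndex() { return index; }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(of("examplePath", "Type", -1)));
        System.out.println(describe(of("examplePath", null, 2)));
    }
}
```

With the real API, the same logic could presumably be applied directly to an ArlingtonModelIssue instead of the stand-in.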
-
getMessage
java.lang.String getMessage()
Return the message associated with this Result.
- Returns:
- the message
-
getVersionSuggestion
java.lang.String getVersionSuggestion()
If the error could potentially be fixed by increasing the version number of the PDF, return the minimum version that would be required. Returned values are usually of the form "1.4", "1.7", "2.0" but may also be of the form "1.7e3", "1.7e11". Other return values may be used in the future to indicate extensions to PDF.
- Returns:
- the version suggestion, or null if not applicable
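When several issues each suggest a version, it may be useful to find the highest one. The sketch below parses the two documented forms ("1.7" and "1.7e3") and compares them numerically. It assumes that an extension-level form such as "1.7e3" sorts after plain "1.7" and before "2.0", which the documentation above does not state, and it ignores any string it does not recognise, since other forms may be returned in the future. The class and method names are invented for this example.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: order the version strings that getVersionSuggestion() is documented to return. */
final class VersionSuggestionSketch {

    // Matches "1.4", "2.0" and the extension-level form "1.7e3"; other forms are not handled.
    private static final Pattern FORM = Pattern.compile("(\\d+)\\.(\\d+)(?:e(\\d+))?");

    /** Returns {major, minor, extensionLevel}, or null if the form is not recognised. */
    static int[] parse(String version) {
        if (version == null) {
            return null;
        }
        Matcher m = FORM.matcher(version);
        if (!m.matches()) {
            return null;
        }
        int ext = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), ext };
    }

    /** Returns the highest recognised version in the list, or null if there is none. */
    static String highest(List<String> suggestions) {
        String best = null;
        int[] bestParts = null;
        for (String s : suggestions) {
            int[] parts = parse(s);
            if (parts == null) {
                continue; // null (not applicable) or a form this sketch does not know
            }
            if (bestParts == null || compare(parts, bestParts) > 0) {
                best = s;
                bestParts = parts;
            }
        }
        return best;
    }

    // Assumed ordering: major, then minor, then extension level.
    private static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(highest(List.of("1.4", "1.7e3", "1.7")));
    }
}
```

Comparing the parsed numbers rather than the raw strings matters for values like "1.7e11" and "1.7e3", which would sort the wrong way round lexicographically.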
-
isError
boolean isError()
Return true if this issue is an "Error" - a condition is described in the model, and this item fails to meet that condition. If the error is that a property uses a value that is only allowed in a later version, the recommended version number will be returned from getVersionSuggestion() and repairing this issue will simply result in the version of the PDF being increased.
- Returns:
- true if this issue is an error
-
isFromLaterVersion
boolean isFromLaterVersion()
Return true if this issue is because a property is used in the PDF that is first described in a later version of the PDF specification. It's not an error condition, because undefined fields are always allowed. However for the field to be semantically correct, the PDF version number should be increased. Issues of this type can always be repaired; doing so will increase the version number of the PDF.
- Returns:
- true if this issue is due to a property from a later PDF version
-
isDeprecation
boolean isDeprecation()
Return true if this issue is because a property is used in the PDF that has been deprecated in the version specified in the PDF. Deprecated fields can always be safely removed; repairing an issue of this type will remove them. However there's no real need to do so.
- Returns:
- true if this issue is due to a deprecated property
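The three boolean accessors isError(), isFromLaterVersion() and isDeprecation() lend themselves to a quick tally of what kind of issues a PDF has. The following is a minimal sketch of that idea. The Flags interface is a hypothetical stand-in for the three documented methods, the Category enum is invented for this example, and the precedence (error first) plus the OTHER bucket are assumptions, since the documentation does not say whether the flags are mutually exclusive or exhaustive.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Sketch: tally issues by the documented isError/isFromLaterVersion/isDeprecation flags. */
final class IssueCategorySketch {

    enum Category { ERROR, LATER_VERSION, DEPRECATION, OTHER }

    /** Hypothetical stand-in declaring only the three documented boolean accessors. */
    interface Flags {
        boolean isError();
        boolean isFromLaterVersion();
        boolean isDeprecation();
    }

    // Precedence is this sketch's choice: an issue is counted once, errors first.
    static Category categorize(Flags issue) {
        if (issue.isError()) return Category.ERROR;
        if (issue.isFromLaterVersion()) return Category.LATER_VERSION;
        if (issue.isDeprecation()) return Category.DEPRECATION;
        return Category.OTHER;
    }

    /** Counts issues per category; categories with no issues are absent from the map. */
    static Map<Category, Integer> tally(List<? extends Flags> issues) {
        Map<Category, Integer> counts = new EnumMap<>(Category.class);
        for (Flags issue : issues) {
            counts.merge(categorize(issue), 1, Integer::sum);
        }
        return counts;
    }

    /** Fixed-value implementation used only to exercise tally(). */
    static Flags flags(boolean error, boolean later, boolean deprecated) {
        return new Flags() {
            public boolean isError() { return error; }
            public boolean isFromLaterVersion() { return later; }
            public boolean isDeprecation() { return deprecated; }
        };
    }

    public static void main(String[] args) {
        List<Flags> issues = List.of(
                flags(true, false, false),
                flags(false, false, true),
                flags(true, false, false));
        System.out.println(tally(issues));
    }
}
```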
-
getParentValue
java.lang.Object getParentValue()
Return the "parent" object of this issue. For example, if the issue was a value of incorrect type, the "parent" object is the dictionary containing that value, and the "child" object is the value itself. The type of the returned object is most likely one of the internal PDF model types; therefore this method is of limited public use.
- Returns:
- the object the issue was found in
-
getChildValue
java.lang.Object getChildValue()
Return the "child" object of this issue. For example, if the issue was a value of incorrect type, the "parent" object is the dictionary containing that value, and the "child" object is the value itself. If the issue is that the value is missing, this value will be null. The type of the returned object is most likely one of the internal PDF model types; therefore this method is of limited public use.
- Returns:
- the child of the parent value relating to this issue
-
getRepairType
java.lang.String getRepairType()
Return a brief description of the Repair that will be made if this issue is repaired by calling repair(java.lang.Object), or null if no repair is possible. The returned value can act as an identifier for this particular repair type.
- Returns:
- a brief description of the repair, or null if it's not repairable
-
getRepairWarning
java.lang.String getRepairWarning()
If there is anything to consider before applying a Repair, this method will return a textual description of the implications. If the Repair is 100% straightforward and with no side effects, this method returns null.
- Returns:
- a warning relating to the repair, or null if no warning is required
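Because getRepairType() and getRepairWarning() are both available before any repair is attempted, they can be used for a "dry run" that lists what would change. The sketch below shows one way this might look. The Repairable interface is a hypothetical stand-in for the two documented accessors, and the repair type and warning strings in main() are placeholders, not values the library is known to return.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: list the repairs that would be made, without modifying anything. */
final class RepairPlanSketch {

    /** Hypothetical stand-in declaring only the two documented accessors used here. */
    interface Repairable {
        String getRepairType();     // null if no repair is possible
        String getRepairWarning();  // null if there is nothing to consider first
    }

    /** Returns one line per repairable issue, flagging those that carry a warning. */
    static List<String> plan(List<? extends Repairable> issues) {
        List<String> lines = new ArrayList<>();
        for (Repairable issue : issues) {
            String type = issue.getRepairType();
            if (type == null) {
                continue; // not repairable, so nothing to plan
            }
            String warning = issue.getRepairWarning();
            lines.add(warning == null ? type : type + " [review first: " + warning + "]");
        }
        return lines;
    }

    /** Fixed-value implementation used only to exercise plan(). */
    static Repairable of(String type, String warning) {
        return new Repairable() {
            public String getRepairType() { return type; }
            public String getRepairWarning() { return warning; }
        };
    }

    public static void main(String[] args) {
        List<Repairable> issues = List.of(
                of("example repair A", null),
                of("example repair B", "example warning"),
                of(null, null));
        plan(issues).forEach(System.out::println);
    }
}
```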
-
repair
boolean repair(java.lang.Object o)
Attempt to repair the issue. If getRepairType() is null, or the repair fails for any reason, this method returns false; otherwise it returns true to indicate that the PDF has been repaired.
Some types of repair may allow a suggested value or operation - if so, suggestions will be made in the getRepairWarning() method describing the value to pass in to this method. But in most cases, the value should be null.
- Parameters:
- o - an optional object that may be used to influence the repair, or null to use the defaults.
- Returns:
- true if the repair succeeded
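Since repair(java.lang.Object) reports success through its return value, a caller may want to track the outcome of each attempt rather than ignore it. The following is a minimal sketch of such a loop. The Repairable interface is a hypothetical stand-in mirroring the three documented methods, and the policy of skipping issues that carry a warning unless acceptWarnings is set is this example's own choice, not library behaviour.

```java
import java.util.List;

/** Sketch: apply default repairs and count the outcomes. */
final class RepairLoopSketch {

    /** Hypothetical stand-in mirroring the documented repair-related methods. */
    interface Repairable {
        String getRepairType();
        String getRepairWarning();
        boolean repair(Object o);
    }

    /** Returns {repaired, failed, skipped}. */
    static int[] repairAll(List<? extends Repairable> issues, boolean acceptWarnings) {
        int repaired = 0, failed = 0, skipped = 0;
        for (Repairable issue : issues) {
            if (issue.getRepairType() == null) {
                skipped++;                    // no repair is possible
            } else if (issue.getRepairWarning() != null && !acceptWarnings) {
                skipped++;                    // has implications; left for manual review
            } else if (issue.repair(null)) {  // null requests the default repair
                repaired++;
            } else {
                failed++;                     // repair() returned false
            }
        }
        return new int[] { repaired, failed, skipped };
    }

    /** Fixed-value implementation used only to exercise repairAll(). */
    static Repairable of(String type, String warning, boolean succeeds) {
        return new Repairable() {
            public String getRepairType() { return type; }
            public String getRepairWarning() { return warning; }
            public boolean repair(Object o) { return type != null && succeeds; }
        };
    }

    public static void main(String[] args) {
        List<Repairable> issues = List.of(
                of("example repair", null, true),
                of("example repair", "example warning", true),
                of(null, null, false));
        int[] result = repairAll(issues, false);
        System.out.println("repaired=" + result[0] + " failed=" + result[1] + " skipped=" + result[2]);
    }
}
```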
-
-