Class OutputProfiler
- java.lang.Object
  - org.faceless.pdf2.OutputProfiler
- All Implemented Interfaces:
Runnable
public class OutputProfiler extends Object implements Runnable
An OutputProfiler is used to create an OutputProfile for a PDF, or to attempt to apply a new OutputProfile, modifying the PDF in the process. This can be a basic OutputProfile, which is very quick to create, or a full OutputProfile, which involves scanning the entire PDF and takes much longer.

This class now underlies the PDF.getBasicOutputProfile() and PDF.getFullOutputProfile() methods, and brings several advantages: you can re-run the profile when you know the PDF has changed, you can create the profile in one thread and monitor its progress in another, and you can make structural changes to the PDF (such as substituting fonts or colors) that aren't possible with the previous API.

To create a new profile with the same information as PDF.getBasicOutputProfile():

    OutputProfiler profiler = new OutputProfiler(pdf);
    OutputProfile profile = profiler.getProfile();

and to duplicate the functionality of the PDF.getFullOutputProfile() method:

    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();

To duplicate the functionality of the PDF.setOutputProfile(org.faceless.pdf2.OutputProfile) method, you would call apply(org.faceless.pdf2.OutputProfile). For example, to retrieve the full profile of the PDF, check if it's compatible with a "target" profile and attempt to convert the PDF to that profile if not:

    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        profiler.apply(target);
    }

This is an oversimplified example, as converting a PDF to a profile (known as "preflighting") typically requires more information. The OutputProfiler class allows you to specify various actions to perform on the PDF when converting. If specified, these will involve a rebuild of the entire document, which can be time-consuming.

As an example, assume a PDF has an unembedded font in it, which is not allowed in PDF/A. To try to convert the PDF to PDF/A-1b, you could run the following code:

    PDF pdf = new PDF(new PDFReader(new File("unembeddedfont.pdf")));
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
    OutputProfile target = new OutputProfile(OutputProfile.PDFA1b_2005);
    // "icc" is not defined in this snippet: it is the color space for the output intent (presumably "srgb" above)
    target.getOutputIntents().add(new OutputIntent("GTS_PDFA1", null, icc));
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        profiler.apply(target);   // This line will fail
    }

This will fail with an IllegalStateException("Denied Feature 'Unembedded TrueType Font' is set"). To fix this you need to set an action on the OutputProfiler before you apply the new profile. This will cause the PDF to be rebuilt internally. Here's how the above example could be modified to replace some or all unembedded fonts with an embedded font from the OS.

    PDF pdf = new PDF(new PDFReader(new File("unembeddedfont.pdf")));
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
    OutputProfile target = new OutputProfile(OutputProfile.PDFA1b_2005);
    // "icc" is not defined in this snippet (presumably "srgb" above)
    target.getOutputIntents().add(new OutputIntent("GTS_PDFA1", null, icc));
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        OutputProfiler.AutoEmbeddingFontAction fontaction = new OutputProfiler.AutoEmbeddingFontAction();
        fontaction.add(new OpenTypeFont(new FileInputStream("C:\\Windows\\Fonts\\arial.ttf"), 2));
        profiler.setFontAction(fontaction);   // Set the action before applying the profile
        profiler.apply(target);
    }

We recommend you check our Blog for more on this topic.
- Since:
- 2.18
- See Also:
OutputProfile
-
-
Nested Class Summary
Modifier and Type  Class
    Description
static class OutputProfiler.AutoEmbeddingFontAction
    The AutoEmbeddingFontAction class is an implementation of OutputProfiler.FontAction that will replace unembedded fonts with embedded ones via a "best fit" algorithm.
static interface OutputProfiler.ColorAction
    An action that can be set on an OutputProfiler to replace Colors.
static interface OutputProfiler.FontAction
    An action that can be set on an OutputProfiler to replace one font with another in the PDF.
static interface OutputProfiler.ImageAction
    An action that can be used to resample or recompress bitmap images.
static class OutputProfiler.ImageType
    ImageType constants are passed in to the setMaxImageDPI method
static class OutputProfiler.ProcessColorAction
    The ProcessColorAction class is an implementation of OutputProfiler.ColorAction which will convert any process colors (i.e.
static class OutputProfiler.RasterizingAction
    An action that will rasterize a page to a bitmap if required.
static class OutputProfiler.RenderingIntent
    RenderingIntent constants are passed in to the OutputProfiler.ProcessColorAction.setRenderingIntent(org.faceless.pdf2.OutputProfiler.RenderingIntent) method
static class OutputProfiler.SimpleImageAction
    An implementation of OutputProfiler.ImageAction that implements the functionality that was available via the setMaxImageDPI(org.faceless.pdf2.OutputProfiler.ImageType, float, float) method.
static class OutputProfiler.Strategy
    The Strategy enum determines how a PDF is repaired when an OutputProfile is applied to it - for example, are invalid fields in the metadata deleted?
-
Constructor Summary
Constructor
    Description
OutputProfiler()
    Create a new OutputProfiler
OutputProfiler(PDF pdf)
    Create a new OutputProfiler and call setPDF()
OutputProfiler(PDFParser parser)
    Create a new OutputProfiler and call setParser()
-
Method Summary
Modifier and Type  Method
    Description
void apply(OutputProfile targetprofile)
    Set the specified OutputProfile on the PDF.
void cancel()
    Cancel this OutputProfiler's operation - if it is being run in another thread, that thread should terminate safely shortly after this method is called.
List<ArlingtonModelIssue> getArlingtonModelIssues()
    Traverse the PDF and generate a list of issues based on the Arlington PDF validation model.
OutputProfiler.ColorAction getColorAction()
OutputProfiler.FontAction getFontAction()
    Return the FontAction set by setFontAction(org.faceless.pdf2.OutputProfiler.FontAction)
float getHairlineWidth()
    Return the hairline repair width, as set by setHairlineWidth(float).
OutputProfiler.ImageAction getImageAction()
OutputProfile getProfile()
    Return the OutputProfile calculated by the run() method.
float getProgress()
    Return the progress of the run() or apply(org.faceless.pdf2.OutputProfile) operation, or 0 if this is not being run, has completed or has been cancelled.
OutputProfiler.RasterizingAction getRasterizingAction()
ExecutorService getRasterizingActionExecutorService()
    Return the ExecutorService set by setRasterizingActionExecutorService(java.util.concurrent.ExecutorService)
List<OutputProfiler.Strategy> getStrategy()
    Return a copy of the list of all strategies currently being applied.
boolean isCancelled()
    Return true if the cancel() method has been called.
boolean isDone()
    Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method has completed or been cancelled, false if it's still running or has not yet been started.
boolean isRunning()
    Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method is running in another thread, and false if it has completed, been cancelled or not yet started.
boolean isStrategy(OutputProfiler.Strategy s)
    Return true if the specified Strategy will be considered by the apply(org.faceless.pdf2.OutputProfile) method when applying an OutputProfile.
void run()
    Analyze the PDF and generate its profile.
void setColorAction(OutputProfiler.ColorAction action)
    Set the OutputProfiler.ColorAction to run on the PDF.
void setFontAction(OutputProfiler.FontAction action)
    Set the OutputProfiler.FontAction to run on the PDF.
void setFull(boolean full)
    Sets whether the OutputProfiler will create a full OutputProfile when it is run.
void setHairlineWidth(float width)
    If Hairlines or zero-width lines are denied when a new profile is applied, they will be changed to be lines of at least this width.
void setImageAction(OutputProfiler.ImageAction action)
    Set the OutputProfiler.ImageAction to run on the PDF.
void setJustNoticeableDifference(float threshold, String methodHint)
    Set the threshold level at which two colors are considered "different", which is a criterion that is tested at various points throughout the apply(org.faceless.pdf2.OutputProfile) method.
void setMaxImageDPI(OutputProfiler.ImageType imagetype, float threshold, float target)
    Deprecated. Please call setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction) instead.
void setParser(PDFParser parser)
    Set the PDFParser to create the OutputProfile from.
void setPDF(PDF pdf)
    Set the PDF to create the OutputProfile from.
void setRasterizingAction(OutputProfiler.RasterizingAction action)
    Set the OutputProfiler.RasterizingAction to run on the PDF.
void setRasterizingActionExecutorService(ExecutorService service)
    Set the ExecutorService to be used for rasterizing pages with an OutputProfiler.RasterizingAction.
void setStrategy(Collection<OutputProfiler.Strategy> strategy)
    Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile).
void setStrategy(OutputProfiler.Strategy... strategy)
    Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile).
void setUseLegacyXMPModel(boolean legacy)
    This method changes how the profile returned from getProfile() is calculated, allowing the fix to the XMP validation model made in release 2.29 to be reversed.
OutputProfile waitForProfile()
    Wait for the profiling operation running in this (or another) thread to finish, and return the profile when done.
-
-
-
Constructor Detail
-
OutputProfiler
public OutputProfiler()
Create a new OutputProfiler
-
OutputProfiler
public OutputProfiler(PDF pdf)
Create a new OutputProfiler and call setPDF()
- Parameters:
pdf - the PDF
-
OutputProfiler
public OutputProfiler(PDFParser parser)
Create a new OutputProfiler and call setParser()
- Parameters:
parser - the PDFParser
-
-
Method Detail
-
setPDF
public void setPDF(PDF pdf)
Set the PDF to create the OutputProfile from. Setting just a PDF will allow only basic OutputProfile features to be extracted. Once set, it cannot be changed.
- Parameters:
pdf - the PDF to scan for features
- See Also:
setParser(org.faceless.pdf2.PDFParser), setFull(boolean)
-
setParser
public void setParser(PDFParser parser)
Set the PDFParser to create the OutputProfile from. Setting a PDFParser will allow both basic and full OutputProfile features to be extracted. Once set, it cannot be changed, but it can be reset by passing in null.
- Parameters:
parser - the PDFParser containing the PDF to scan for features
- See Also:
setPDF(org.faceless.pdf2.PDF), setFull(boolean)
-
setFull
public void setFull(boolean full)
Sets whether the OutputProfiler will create a full OutputProfile when it is run. This method simply creates a new PDFParser and calls setParser(org.faceless.pdf2.PDFParser).
- Parameters:
full - whether to extract a full profile from the PDF.
-
setJustNoticeableDifference
public void setJustNoticeableDifference(float threshold, String methodHint)
Set the threshold level at which two colors are considered "different", which is a criterion that is tested at various points throughout the apply(org.faceless.pdf2.OutputProfile) method. In particular, when two different Separations are found, they will be merged if the maximum ΔE (delta-E) value for the two separations is less than this value. If it is greater than this value, the page will probably have to be rasterized.

The methodHint can also be set to try to adjust the algorithm for determining delta-E. Supported values are currently "CIEDE2000" and "CIE94", or null for no change.

The default values if not set are equivalent to setJustNoticeableDifference(2.5f, "CIEDE2000"). Note that although the theoretically correct value for the JND threshold is 1, the alternative to merging is rasterization, so a little tolerance here is probably justified.
- Parameters:
threshold - the value to use for "just noticeable difference" - two colors with a difference above this value are considered to be different colors
methodHint - the method to use for delta-E calculation.
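As a rough illustration of how a just-noticeable-difference threshold is applied, the sketch below compares two colors in CIE L*a*b* space using the simple CIE76 formula (plain Euclidean distance). This is not the CIEDE2000 or CIE94 calculation named above, and the class and method names are hypothetical rather than part of the PDF Library; it only shows the comparison of a delta-E value against the threshold.

```java
/** Hypothetical helper for illustration only; not part of the PDF Library API. */
final class DeltaESketch {

    /** CIE76 color difference: Euclidean distance between two {L*, a*, b*} triples. */
    static double deltaE76(double[] lab1, double[] lab2) {
        double dl = lab1[0] - lab2[0];
        double da = lab1[1] - lab2[1];
        double db = lab1[2] - lab2[2];
        return Math.sqrt(dl * dl + da * da + db * db);
    }

    /** True if the two colors differ by less than the threshold, so could be treated as one color. */
    static boolean withinThreshold(double[] lab1, double[] lab2, double threshold) {
        return deltaE76(lab1, lab2) < threshold;
    }

    public static void main(String[] args) {
        double[] a = {50.0, 10.0, 10.0};
        double[] b = {51.0, 11.0, 10.0};   // delta-E from a is sqrt(2), about 1.41
        double[] c = {53.0, 14.0, 10.0};   // delta-E from a is 5.0
        System.out.println(withinThreshold(a, b, 2.5));   // true
        System.out.println(withinThreshold(a, c, 2.5));   // false
    }
}
```

With the default threshold of 2.5, the first pair would be close enough to merge, while the second would not.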
-
setUseLegacyXMPModel
public void setUseLegacyXMPModel(boolean legacy)
This method changes how the profile returned from getProfile() is calculated, allowing the fix to the XMP validation model made in release 2.29 to be reversed. There is no need to call this method for normal workflows, but it may be useful when comparing validation results against older versions of the PDF Library. The following workflow would be typical.

    PDFParser parser = new PDFParser(pdf);
    OutputProfiler profiler = new OutputProfiler(parser);
    OutputProfile profile = profiler.getProfile();
    if (profile.isSet(OutputProfile.XMPMetaDataTypeRequiresGlobalScope)) {
        // If this feature is set, it's an indication that the results MAY be different
        // under the XMP model prior to 2.29. Request a profile under the old model to
        // find out for certain.
        profiler = new OutputProfiler(parser);    // Create a new profiler
        profiler.setUseLegacyXMPModel(true);      // Set the flag before getting the profile
        OutputProfile old_profile = profiler.getProfile();
        // Now "profile" is the OutputProfile according to the current XMP validation model,
        // and "old_profile" is the OutputProfile according to the old XMP validation model.
    }

- Since:
- 2.29.2
-
run
public void run()
Analyze the PDF and generate its profile. Whether this method calculates a "basic" or "full" profile depends on whether a PDFParser was specified on this class, either in the constructor or by calling setParser(org.faceless.pdf2.PDFParser). If one is available, a full profile will be run, which can take some time. If not, a basic profile is generated, which is essentially instantaneous.

The process reads, but does not write to, the structures of the PDF, so it can safely be run in parallel with other operations that read the PDF, such as signature validation or rendering to bitmap.
- Specified by:
run in interface Runnable
- See Also:
isRunning(), getProfile(), apply(org.faceless.pdf2.OutputProfile)
-
cancel
public void cancel()
Cancel this OutputProfiler's operation - if it is being run in another thread, that thread should terminate safely shortly after this method is called. Once this object is cancelled, it cannot be restarted.
- See Also:
isCancelled()
-
isRunning
public boolean isRunning()
Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method is running in another thread, and false if it has completed, been cancelled or not yet started.
- See Also:
run()
-
isDone
public boolean isDone()
Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method has completed or been cancelled, false if it's still running or has not yet been started.
-
isCancelled
public boolean isCancelled()
Return true if the cancel() method has been called.
- See Also:
isRunning()
-
getProfile
public OutputProfile getProfile()
Return the OutputProfile calculated by the run() method. If run() has not been called already, it will be called by this method. If it has already completed, this method will return the result (or null if it failed). If it is currently running in another thread, this method will return null immediately.
- See Also:
isRunning()
-
waitForProfile
public OutputProfile waitForProfile()
Wait for the profiling operation running in this (or another) thread to finish, and return the profile when done. This method will also wait if the profiling has not yet started.
- See Also:
isRunning()
-
getProgress
public float getProgress()
Return the progress of the run() or apply(org.faceless.pdf2.OutputProfile) operation, or 0 if this is not being run, has completed or has been cancelled.
- Returns:
- the progress of the operation, from 0 to 1
- See Also:
isRunning()
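The run(), cancel(), isRunning(), isDone(), isCancelled() and getProgress() methods together support running the profile in one thread and monitoring it from another. The sketch below shows that general pattern using only the standard library. ProgressTaskSketch is a hypothetical stand-in with simulated work, not the PDF Library class, and its internals are assumptions made purely for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a long-running, cancellable task; not the PDF Library class. */
final class ProgressTaskSketch implements Runnable {
    private static final int STEPS = 20;
    private final AtomicInteger completed = new AtomicInteger();
    private final AtomicBoolean running = new AtomicBoolean();
    private final AtomicBoolean done = new AtomicBoolean();
    private final AtomicBoolean cancelled = new AtomicBoolean();

    @Override
    public void run() {
        running.set(true);
        try {
            for (int i = 0; i < STEPS && !cancelled.get(); i++) {
                Thread.sleep(5);                 // simulated unit of work
                completed.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
        } finally {
            running.set(false);
            done.set(true);
        }
    }

    void cancel() { cancelled.set(true); }
    boolean isRunning() { return running.get(); }
    boolean isDone() { return done.get(); }
    boolean isCancelled() { return cancelled.get(); }

    /** Progress from 0 to 1 while running, and 0 otherwise. */
    float getProgress() {
        return running.get() ? completed.get() / (float) STEPS : 0f;
    }

    public static void main(String[] args) throws InterruptedException {
        ProgressTaskSketch task = new ProgressTaskSketch();
        Thread worker = new Thread(task);
        worker.start();
        while (!task.isDone()) {                 // poll from the monitoring thread
            System.out.printf("progress: %.2f%n", task.getProgress());
            Thread.sleep(20);
        }
        worker.join();
        System.out.println("done=" + task.isDone() + " cancelled=" + task.isCancelled());
    }
}
```

A monitoring thread could equally call cancel() from inside the polling loop, for example after a timeout; the worker then stops at its next check of the flag.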
-
setHairlineWidth
public void setHairlineWidth(float width)
If Hairlines or zero-width lines are denied when a new profile is applied, they will be changed to be lines of at least this width. This will rebuild the PDF. If no hairlines are present in the PDF when this method is called, no rebuild will be performed.
- Parameters:
width - the width (in pts) to use to replace any hairlines. Must be > 0. The default is 0.2
-
setFontAction
public void setFontAction(OutputProfiler.FontAction action)
Set the OutputProfiler.FontAction to run on the PDF. This can be used to replace fonts in the PDF with new fonts. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the FontAction
-
getFontAction
public OutputProfiler.FontAction getFontAction()
Return the FontAction set by setFontAction(org.faceless.pdf2.OutputProfiler.FontAction)
- Since:
- 2.26
-
setColorAction
public void setColorAction(OutputProfiler.ColorAction action)
Set the OutputProfiler.ColorAction to run on the PDF. This can be used to replace colors in the PDF. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the ColorAction
-
getColorAction
public OutputProfiler.ColorAction getColorAction()
Return the OutputProfiler.ColorAction set by setColorAction(org.faceless.pdf2.OutputProfiler.ColorAction)
- Since:
- 2.26
-
setImageAction
public void setImageAction(OutputProfiler.ImageAction action)
Set the OutputProfiler.ImageAction to run on the PDF. This can be used to resample or recompress images in the PDF. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the ImageAction
- Since:
- 2.22.2
-
getImageAction
public OutputProfiler.ImageAction getImageAction()
Return the OutputProfiler.ImageAction set by setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction)
- Since:
- 2.26
-
setRasterizingAction
public void setRasterizingAction(OutputProfiler.RasterizingAction action)
Set the OutputProfiler.RasterizingAction to run on the PDF. This can be used to rasterize page content to images. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the RasterizingAction
- Since:
- 2.26
-
setRasterizingActionExecutorService
public void setRasterizingActionExecutorService(ExecutorService service)
Set the ExecutorService to be used for rasterizing pages with an OutputProfiler.RasterizingAction. A value of null means they are rasterized one at a time on the current thread (the default). Be aware that rasterizing is a memory-intensive task, so too many threads will cause memory pressure.
- Since:
- 2.26.1
-
getRasterizingActionExecutorService
public ExecutorService getRasterizingActionExecutorService()
Return the ExecutorService set by setRasterizingActionExecutorService(java.util.concurrent.ExecutorService)
- Since:
- 2.26.1
-
getRasterizingAction
public OutputProfiler.RasterizingAction getRasterizingAction()
Return the OutputProfiler.RasterizingAction set by setRasterizingAction(org.faceless.pdf2.OutputProfiler.RasterizingAction)
- Since:
- 2.26
-
getHairlineWidth
public float getHairlineWidth()
Return the hairline repair width, as set by setHairlineWidth(float).
- Since:
- 2.26.1
-
setMaxImageDPI
@Deprecated public void setMaxImageDPI(OutputProfiler.ImageType imagetype, float threshold, float target)
Deprecated. Please call setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction) instead.
Set the maximum image resolution to be used in the PDF. If the PDF contains an image of the specified type, and every copy of that image is embedded at the threshold resolution or higher, it will be resampled to the target resolution and replaced. Calling this method will cause the PDF to be rebuilt in apply().
- Parameters:
imagetype - the ImageType, specifying whether this applies to one-bit, gray or color images
threshold - the resolution to test the image against - all copies of the image embedded in the PDF must be this resolution or higher for it to be resampled.
target - the resolution to resample the image to.
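To make the threshold and target values concrete, the sketch below computes the effective resolution of an image from its pixel width and the width at which it is drawn, using the fact that a PDF point is 1/72 inch. The class and method names are hypothetical and the library's own calculation may differ in detail; this only illustrates the arithmetic behind a "threshold" test and a "target" resample.

```java
/** Hypothetical helper for illustration only; not part of the PDF Library API. */
final class ImageDpiSketch {

    /** Effective resolution of an image pixelWidth pixels wide, drawn widthInPoints points wide. */
    static double effectiveDpi(int pixelWidth, double widthInPoints) {
        return pixelWidth * 72.0 / widthInPoints;   // 72 points per inch
    }

    /** Pixel width needed to cover the same drawn width at targetDpi. */
    static int resampledWidth(double widthInPoints, double targetDpi) {
        return (int) Math.round(widthInPoints / 72.0 * targetDpi);
    }

    public static void main(String[] args) {
        // A 2400-pixel-wide image drawn 288pt (4 inches) wide is 600 dpi.
        double dpi = effectiveDpi(2400, 288.0);
        System.out.println(dpi);                              // 600.0
        if (dpi >= 400) {                                     // threshold of 400 dpi
            System.out.println(resampledWidth(288.0, 300));   // 1200 pixels at a 300 dpi target
        }
    }
}
```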
-
setStrategy
public void setStrategy(OutputProfiler.Strategy... strategy)
Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile). By default, the strategy is OutputProfiler.Strategy.Default, but multiple items can be passed into this method to define the set of strategies that will be tried when the apply() method is called.
- Parameters:
strategy - a list of strategies to apply
- Since:
- 2.26
-
setStrategy
public void setStrategy(Collection<OutputProfiler.Strategy> strategy)
Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile). Like setStrategy(OutputProfiler.Strategy...), but this method takes a Collection.
- Parameters:
strategy - a collection of strategies to apply
- Since:
- 2.28
-
getStrategy
public List<OutputProfiler.Strategy> getStrategy()
Return a copy of the list of all strategies currently being applied.
- Since:
- 2.26.3
-
isStrategy
public boolean isStrategy(OutputProfiler.Strategy s)
Return true if the specified Strategy will be considered by the apply(org.faceless.pdf2.OutputProfile) method when applying an OutputProfile.
- Since:
- 2.26
-
apply
public void apply(OutputProfile targetprofile)
Set the specified OutputProfile on the PDF. The supplied "target" profile will have a number of features denied and required, and this method will attempt to modify the PDF to match those requirements. If that's not possible then an IllegalStateException will be thrown.

If the supplied profile references any features that require a full scan and the PDF has been loaded in (rather than created from scratch), then a full profile of the existing PDF must be run (see run()) to determine which features are currently set. If this is already in progress in another thread, this method will wait for it to complete. If it hasn't yet been started, it will be started on this thread by calling getProfile(). If no PDFParser has been set (in the constructor or through the setParser method) then a full profile cannot be created, and an IllegalStateException will be thrown.

If an OutputProfiler.FontAction, OutputProfiler.ColorAction, OutputProfiler.ImageAction or OutputProfiler.RasterizingAction has been set on this class, an extra stage will be run which rebuilds the PDF content. It is also run if the full profile shows up any hairlines and the setHairlineWidth method was called with a non-zero value.

After this stage, or if no actions or hairline replacement are specified, the method will attempt to modify the PDF to add or remove required or denied features, as specified in the target profile. If that completes successfully, the OutputIntent on the target profile will be applied to the PDF and this method will complete.

While this method is running the isRunning() method will return true, and the progress value returned from getProgress() will be updated, although the returned value is approximate at best: the amount of work required to modify a PDF to meet a target profile cannot realistically be predicted in advance. The cancel() method can be used to request that the apply() method be interrupted. The PDF should be left in a consistent state if this happens, but that state will necessarily be somewhere between how the PDF was originally and how it was going to be after modification. There is no way to revert the PDF to its original state other than reloading. When this method finishes the isDone() method will return true, and the isCancelled() method will return false if the method completed successfully or threw an exception, and true if it was cancelled.

Note that this method modifies the PDF extensively, so (unlike the retrieval of the OutputProfile from the run() method) any threads that read from the PDF must be paused while this method is running. The functionality to manage the progress of this method was added in 2.26.1.
- Parameters:
targetprofile - the OutputProfile that this PDF should be converted to match.
-
getArlingtonModelIssues
public List<ArlingtonModelIssue> getArlingtonModelIssues()
Traverse the PDF and generate a list of issues based on the Arlington PDF validation model. The list is recreated each time this method is called.
- Since:
- 2.27.2
- See Also:
ArlingtonModelIssue
-
-