Class OutputProfiler
- java.lang.Object
-
- org.faceless.pdf2.OutputProfiler
-
- All Implemented Interfaces:
Runnable
public class OutputProfiler extends Object implements Runnable
An OutputProfiler is used to create an OutputProfile for a PDF or to attempt to apply a new OutputProfile, modifying the PDF in the process. This can be a basic OutputProfile, which is very quick to create, or a full OutputProfile, which involves scanning the entire PDF and takes much longer.
This class now underlies the PDF.getBasicOutputProfile() and PDF.getFullOutputProfile() methods, and brings several advantages: you can re-run the profile when you know the PDF has changed, you can create the profile in one thread and monitor its progress in another, and you can make structural changes to the PDF (such as substituting fonts or colors) that aren't possible with the previous API.
To create a new profile with the same information as PDF.getBasicOutputProfile():
    OutputProfiler profiler = new OutputProfiler(pdf);
    OutputProfile profile = profiler.getProfile();
and to duplicate the functionality of the PDF.getFullOutputProfile() method:
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
To duplicate the functionality of the PDF.setOutputProfile(org.faceless.pdf2.OutputProfile) method, you would call apply(org.faceless.pdf2.OutputProfile). For example, to retrieve the full profile of the PDF, check if it's compatible with a "target" profile and attempt to convert the PDF to that profile if not:
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        profiler.apply(target);
    }
This is an oversimplified example, as typically converting a PDF to a profile (known as "preflighting") requires more information. The OutputProfiler class allows you to specify various actions to perform on the PDF when converting. If specified, these will involve a rebuild of the entire document, which can be time-consuming.
As an example, assume a PDF has an unembedded font in it, which is not allowed in PDF/A. To try to convert the PDF to PDF/A-1b, you could run the following code:
    // Load the PDF and extract its full profile
    PDF pdf = new PDF(new PDFReader(new File("unembeddedfont.pdf")));
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    // Build a PDF/A-1b target profile with an sRGB output intent
    ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
    OutputProfile target = new OutputProfile(OutputProfile.PDFA1b_2005);
    target.getOutputIntents().add(new OutputIntent("GTS_PDFA1", null, srgb));
    // A non-null list means the PDF has features the target does not allow
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        profiler.apply(target); // This line will fail
    }
This will fail with an IllegalStateException ("Denied Feature 'Unembedded TrueType Font' is set"). To fix this you need to set an action on the OutputProfiler before you apply the new profile. This will cause the PDF to be rebuilt internally. Here's how the above example could be modified to replace some or all unembedded fonts with an embedded font from the OS:
    PDF pdf = new PDF(new PDFReader(new File("unembeddedfont.pdf")));
    OutputProfiler profiler = new OutputProfiler(new PDFParser(pdf));
    OutputProfile profile = profiler.getProfile();
    ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
    OutputProfile target = new OutputProfile(OutputProfile.PDFA1b_2005);
    target.getOutputIntents().add(new OutputIntent("GTS_PDFA1", null, srgb));
    OutputProfile.Feature[] list = profile.isCompatibleWith(target);
    if (list != null) {
        // Register a font action so unembedded fonts can be replaced during apply()
        OutputProfiler.AutoEmbeddingFontAction fontaction = new OutputProfiler.AutoEmbeddingFontAction();
        fontaction.add(new OpenTypeFont(new FileInputStream("C:\\Windows\\Fonts\\arial.ttf"), 2));
        profiler.setFontAction(fontaction);
        profiler.apply(target);
    }
We recommend you check our Blog for more on this topic.
- Since:
- 2.18
- See Also:
OutputProfile
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | OutputProfiler.AutoEmbeddingFontAction | The AutoEmbeddingFontAction class is an implementation of OutputProfiler.FontAction that will replace unembedded fonts with embedded ones via a "best fit" algorithm.
static interface | OutputProfiler.ColorAction | An action that can be set on an OutputProfiler to replace Colors.
static interface | OutputProfiler.FontAction | An action that can be set on an OutputProfiler to replace one font with another in the PDF.
static interface | OutputProfiler.ImageAction | An action that can be used to resample or recompress bitmap images.
static class | OutputProfiler.ImageType | ImageType constants are passed in to the setMaxImageDPI method
static class | OutputProfiler.ProcessColorAction | The ProcessColorAction class is an implementation of OutputProfiler.ColorAction which will convert any process colors (i.e.
static class | OutputProfiler.RasterizingAction | An action that will rasterize a page to a bitmap if required.
static class | OutputProfiler.RenderingIntent | RenderingIntent constants are passed in to the OutputProfiler.ProcessColorAction.setRenderingIntent(org.faceless.pdf2.OutputProfiler.RenderingIntent) method
static class | OutputProfiler.SimpleImageAction | An implementation of OutputProfiler.ImageAction that implements the functionality that was available via the setMaxImageDPI(org.faceless.pdf2.OutputProfiler.ImageType, float, float) method.
static class | OutputProfiler.Strategy | The Strategy enum determines how a PDF is repaired when an OutputProfile is applied to it - for example, are invalid fields in the metadata deleted?
-
Constructor Summary
Constructors
Constructor | Description
OutputProfiler() | Create a new OutputProfiler
OutputProfiler(PDF pdf) | Create a new OutputProfiler and call setPDF()
OutputProfiler(PDFParser parser) | Create a new OutputProfiler and call setParser()
-
Method Summary
Modifier and Type | Method | Description
void | apply(OutputProfile targetprofile) | Set the specified OutputProfile on the PDF.
void | cancel() | Cancel this OutputProfiler's operation - if it is being run in another thread, that thread should terminate safely shortly after this method is called.
List<ArlingtonModelIssue> | getArlingtonModelIssues() | Traverse the PDF and generate a list of issues based on the Arlington PDF validation model.
OutputProfiler.ColorAction | getColorAction() |
OutputProfiler.FontAction | getFontAction() | Return the FontAction set by setFontAction(org.faceless.pdf2.OutputProfiler.FontAction)
float | getHairlineWidth() | Return the hairline repair width, as set by setHairlineWidth(float).
OutputProfiler.ImageAction | getImageAction() |
OutputProfile | getProfile() | Return the OutputProfile calculated by the run() method.
float | getProgress() | Return the progress of the run() or apply(org.faceless.pdf2.OutputProfile) operation, or 0 if this is not being run, has completed or has been cancelled.
OutputProfiler.RasterizingAction | getRasterizingAction() |
ExecutorService | getRasterizingActionExecutorService() | Return the ExecutorService set by setRasterizingActionExecutorService(java.util.concurrent.ExecutorService)
List<OutputProfiler.Strategy> | getStrategy() | Return a copy of the list of all strategies currently being applied.
boolean | isCancelled() | Return true if the cancel() method has been called.
boolean | isDone() | Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method has completed or been cancelled, false if it's still running or has not yet been started.
boolean | isRunning() | Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method is running in another thread, and false if it has completed, been cancelled or not yet started.
boolean | isStrategy(OutputProfiler.Strategy s) | Return true if the specified Strategy will be considered by the apply(org.faceless.pdf2.OutputProfile) method when applying an OutputProfile.
void | run() | Analyze the PDF and generate its profile.
void | setColorAction(OutputProfiler.ColorAction action) | Set the OutputProfiler.ColorAction to run on the PDF.
void | setFontAction(OutputProfiler.FontAction action) | Set the OutputProfiler.FontAction to run on the PDF.
void | setFull(boolean full) | Sets whether the OutputProfiler will create a full OutputProfile when it is run.
void | setHairlineWidth(float width) | If Hairlines or zero-width lines are denied when a new profile is applied, they will be changed to be lines of at least this width.
void | setImageAction(OutputProfiler.ImageAction action) | Set the OutputProfiler.ImageAction to run on the PDF.
void | setJustNoticeableDifference(float threshold, String methodHint) | Set the threshold level at which two colors are considered "different", which is a criterion that is tested at various points throughout the apply(org.faceless.pdf2.OutputProfile) method.
void | setMaxImageDPI(OutputProfiler.ImageType imagetype, float threshold, float target) | Deprecated. Please call setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction) instead.
void | setParser(PDFParser parser) | Set the PDFParser to create the OutputProfile from.
void | setPDF(PDF pdf) | Set the PDF to create the OutputProfile from.
void | setRasterizingAction(OutputProfiler.RasterizingAction action) | Set the OutputProfiler.RasterizingAction to run on the PDF.
void | setRasterizingActionExecutorService(ExecutorService service) | Set the ExecutorService to be used for rasterizing pages with a OutputProfiler.RasterizingAction.
void | setStrategy(Collection<OutputProfiler.Strategy> strategy) | Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile).
void | setStrategy(OutputProfiler.Strategy... strategy) | Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile).
OutputProfile | waitForProfile() | Wait for the profiling operation running in this (or another) thread to finish, and return the profile when done.
-
-
-
Constructor Detail
-
OutputProfiler
public OutputProfiler()
Create a new OutputProfiler
-
OutputProfiler
public OutputProfiler(PDF pdf)
Create a new OutputProfiler and call setPDF()
- Parameters:
pdf - the PDF
-
OutputProfiler
public OutputProfiler(PDFParser parser)
Create a new OutputProfiler and call setParser()
- Parameters:
parser - the PDFParser
-
-
Method Detail
-
setPDF
public void setPDF(PDF pdf)
Set the PDF to create the OutputProfile from. Setting just a PDF will allow only basic OutputProfile features to be extracted. Once set, it cannot be changed.
- Parameters:
pdf - the PDF to scan for features
- See Also:
setParser(org.faceless.pdf2.PDFParser), setFull(boolean)
-
setParser
public void setParser(PDFParser parser)
Set the PDFParser to create the OutputProfile from. Setting a PDFParser will allow both basic and full OutputProfile features to be extracted. Once set, it cannot be changed, but it can be reset by passing in null.
- Parameters:
parser - the PDFParser containing the PDF to scan for features
- See Also:
setPDF(org.faceless.pdf2.PDF), setFull(boolean)
-
setFull
public void setFull(boolean full)
Sets whether the OutputProfiler will create a full OutputProfile when it is run. This method simply creates a new PDFParser and calls setParser(org.faceless.pdf2.PDFParser).
- Parameters:
full - whether to extract a full profile from the PDF.
-
setJustNoticeableDifference
public void setJustNoticeableDifference(float threshold, String methodHint)
Set the threshold level at which two colors are considered "different", which is a criterion that is tested at various points throughout the apply(org.faceless.pdf2.OutputProfile) method. In particular, when two different Separations are found, they will be merged if the maximum ΔE (delta-E) value for the two separations is less than this value. If it is greater than this value, the page will probably have to be rasterized.
The methodHint can also be set to try to adjust the algorithm for determining delta-E. Supported values are currently "CIEDE2000" and "CIE94", or null for no change.
The default values if not set are equivalent to setJustNoticeableDifference(2.5f, "CIEDE2000"). Note that although the theoretically correct value for the JND threshold is 1, the alternative is rasterization, so a little tolerance here is probably justified.
- Parameters:
threshold - the value to use for "just noticeable difference" - two colors with a difference above this value are considered to be different colors
methodHint - the method to use for deltaE calculation.
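To make the threshold concrete, here is a minimal sketch of a CIE94 delta-E calculation between two CIE L*a*b* colors, compared against a threshold of 2.5. It is an illustration of the published CIE94 formula (graphic-arts weights) only, not the library's implementation; the class and method names (DeltaE94Sketch, deltaE, withinThreshold) are hypothetical.

```java
/** Illustrative CIE94 delta-E (graphic-arts weights); not part of the OutputProfiler API. */
final class DeltaE94Sketch {
    // CIE94 graphic-arts weighting constants
    private static final double K1 = 0.045;
    private static final double K2 = 0.015;

    /** Each argument is an {L*, a*, b*} triple; the first color is treated as the reference. */
    static double deltaE(double[] lab1, double[] lab2) {
        double dL = lab1[0] - lab2[0];
        double c1 = Math.hypot(lab1[1], lab1[2]);   // chroma of the reference color
        double c2 = Math.hypot(lab2[1], lab2[2]);
        double dC = c1 - c2;
        double da = lab1[1] - lab2[1];
        double db = lab1[2] - lab2[2];
        // Squared hue difference; clamped to guard against tiny negative rounding errors
        double dH2 = Math.max(0, da * da + db * db - dC * dC);
        double sC = 1 + K1 * c1;
        double sH = 1 + K2 * c1;
        double termC = dC / sC;
        return Math.sqrt(dL * dL + termC * termC + dH2 / (sH * sH));
    }

    /** True if the two colors differ by less than the threshold. */
    static boolean withinThreshold(double[] lab1, double[] lab2, double threshold) {
        return deltaE(lab1, lab2) < threshold;
    }
}
```

With the default threshold of 2.5, two neutral grays whose lightness differs by 2 units would count as the same color, while a difference of 3 units would not.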
-
run
public void run()
Analyze the PDF and generate its profile. Whether this method calculates a "basic" or "full" profile depends on whether a PDFParser was specified on this class, either in the constructor or by calling setParser(org.faceless.pdf2.PDFParser). If available, a full profile will be run, which can take some time. If not, a basic profile is generated, which is essentially instantaneous.
The process reads, but does not write to, the structures of the PDF, so it can safely be run in parallel with other operations that read the PDF, such as signature validation or rendering to bitmap.
- Specified by:
run in interface Runnable
- See Also:
isRunning(), getProfile(), apply(org.faceless.pdf2.OutputProfile)
-
cancel
public void cancel()
Cancel this OutputProfiler's operation - if it is being run in another thread, that thread should terminate safely shortly after this method is called. Once this object is cancelled, it cannot be restarted.
- See Also:
isCancelled()
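One way to use cancellation is a simple time limit. The sketch below is an assumption about how a caller might wire this up, not something the library provides: CancelWatchdog and cancelAfter are hypothetical names, and the helper takes plain functional interfaces so that it can be exercised without a PDF.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: cancels a long-running operation if it exceeds a time limit. */
final class CancelWatchdog {
    /**
     * Polls isDone until it returns true or timeoutMillis elapses.
     * On timeout (or if the calling thread is interrupted) cancel is run.
     * Returns true if cancel was invoked, false if the operation finished first.
     */
    static boolean cancelAfter(long timeoutMillis, BooleanSupplier isDone, Runnable cancel) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!isDone.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                cancel.run();
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Design choice for this sketch: treat interruption as a request to cancel
                Thread.currentThread().interrupt();
                cancel.run();
                return true;
            }
        }
        return false;
    }
}
```

Assuming the profiler has already been started on another thread, a call such as CancelWatchdog.cancelAfter(30000, profiler::isDone, profiler::cancel) would request cancellation after roughly 30 seconds.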
-
isRunning
public boolean isRunning()
Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method is running in another thread, and false if it has completed, been cancelled or not yet started.
- See Also:
run()
-
isDone
public boolean isDone()
Return true if the run() or apply(org.faceless.pdf2.OutputProfile) method has completed or been cancelled, false if it's still running or has not yet been started.
-
isCancelled
public boolean isCancelled()
Return true if the cancel() method has been called.
- See Also:
isRunning()
-
getProfile
public OutputProfile getProfile()
Return the OutputProfile calculated by the run() method. If run() has not been called already, it will be called by this method. If it has already completed, it will return the result (or null if it failed). If it is currently running in another thread, this method will return null immediately.
- See Also:
isRunning()
-
waitForProfile
public OutputProfile waitForProfile()
Wait for the profiling operation running in this (or another) thread to finish, and return the profile when done. This method will also wait if the profiling has not yet started.
- See Also:
isRunning()
-
getProgress
public float getProgress()
Return the progress of the run() or apply(org.faceless.pdf2.OutputProfile) operation, or 0 if this is not being run, has completed or has been cancelled.
- Returns:
the progress of the operation, from 0 to 1
- See Also:
isRunning()
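The following is a minimal sketch of monitoring progress from a second thread. ProgressPoller and its poll method are hypothetical names, not part of the library; the helper takes functional interfaces so the polling loop itself can be tested in isolation.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: periodically reports the progress of an operation running elsewhere. */
final class ProgressPoller {
    /**
     * Calls report with the current progress every intervalMillis until isDone returns true.
     * Returns the number of reports made. Stops early if the calling thread is interrupted.
     */
    static int poll(BooleanSupplier isDone, Supplier<Float> progress,
                    Consumer<Float> report, long intervalMillis) {
        int reports = 0;
        while (!isDone.getAsBoolean()) {
            report.accept(progress.get());
            reports++;
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return reports;
    }
}
```

A plausible wiring, assuming the profiler is started with new Thread(profiler).start(), would be ProgressPoller.poll(profiler::isDone, profiler::getProgress, p -> System.out.println(p), 200). Because isDone() is also false before the operation has started, the poller should only be called once the worker thread has been launched.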
-
setHairlineWidth
public void setHairlineWidth(float width)
If Hairlines or zero-width lines are denied when a new profile is applied, they will be changed to be lines of at least this width. This will rebuild the PDF. If no hairlines are present in the PDF when this method is called, no rebuild will be performed.
- Parameters:
width - the width (in pts) to use to replace any hairlines. Must be > 0. The default is 0.2
-
setFontAction
public void setFontAction(OutputProfiler.FontAction action)
Set the OutputProfiler.FontAction to run on the PDF. This can be used to replace fonts in the PDF with new fonts. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the FontAction
-
getFontAction
public OutputProfiler.FontAction getFontAction()
Return the FontAction set by setFontAction(org.faceless.pdf2.OutputProfiler.FontAction)
- Since:
- 2.26
-
setColorAction
public void setColorAction(OutputProfiler.ColorAction action)
Set the OutputProfiler.ColorAction to run on the PDF. This can be used to replace colors in the PDF. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the ColorAction
-
getColorAction
public OutputProfiler.ColorAction getColorAction()
Return the OutputProfiler.ColorAction set by setColorAction(org.faceless.pdf2.OutputProfiler.ColorAction)
- Since:
- 2.26
-
setImageAction
public void setImageAction(OutputProfiler.ImageAction action)
Set the OutputProfiler.ImageAction to run on the PDF. This can be used to resample or recompress images in the PDF. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the ImageAction
- Since:
- 2.22.2
-
getImageAction
public OutputProfiler.ImageAction getImageAction()
Return the OutputProfiler.ImageAction set by setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction)
- Since:
- 2.26
-
setRasterizingAction
public void setRasterizingAction(OutputProfiler.RasterizingAction action)
Set the OutputProfiler.RasterizingAction to run on the PDF. This can be used to rasterize page content to images. If this value is not null, the PDF will be rebuilt in apply().
- Parameters:
action - the RasterizingAction
- Since:
- 2.26
-
setRasterizingActionExecutorService
public void setRasterizingActionExecutorService(ExecutorService service)
Set the ExecutorService to be used for rasterizing pages with a OutputProfiler.RasterizingAction. A value of null means they are rasterized one at a time on the current thread (the default). Be aware that rasterizing is a memory-intensive task, so too many threads will cause memory pressure.
- Since:
- 2.26.1
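Since too many threads can cause memory pressure, one reasonable approach is to cap the pool size. The sketch below uses only the standard java.util.concurrent classes; RasterPoolSketch, boundedSize and newPool are hypothetical names, and the cap at the number of available processors is an assumption rather than a recommendation from the library.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: builds a small fixed-size pool suitable for memory-heavy tasks. */
final class RasterPoolSketch {
    /** Clamp the requested thread count to the range 1..cores. */
    static int boundedSize(int requested, int cores) {
        return Math.max(1, Math.min(requested, cores));
    }

    /** Create a fixed-size pool with at most the requested number of threads. */
    static ThreadPoolExecutor newPool(int requested) {
        int size = boundedSize(requested, Runtime.getRuntime().availableProcessors());
        // Same core and maximum size: a fixed pool backed by an unbounded work queue
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<>());
    }
}
```

The resulting pool could then be passed to setRasterizingActionExecutorService(pool) before calling apply(), and shut down with pool.shutdown() once apply() has returned.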
-
getRasterizingActionExecutorService
public ExecutorService getRasterizingActionExecutorService()
Return the ExecutorService set by setRasterizingActionExecutorService(java.util.concurrent.ExecutorService)
- Since:
- 2.26.1
-
getRasterizingAction
public OutputProfiler.RasterizingAction getRasterizingAction()
Return the OutputProfiler.RasterizingAction set by setRasterizingAction(org.faceless.pdf2.OutputProfiler.RasterizingAction)
- Since:
- 2.26
-
getHairlineWidth
public float getHairlineWidth()
Return the hairline repair width, as set by setHairlineWidth(float).
- Since:
- 2.26.1
-
setMaxImageDPI
@Deprecated public void setMaxImageDPI(OutputProfiler.ImageType imagetype, float threshold, float target)
Deprecated. Please call setImageAction(org.faceless.pdf2.OutputProfiler.ImageAction) instead.
Set the maximum image resolution to be used in the PDF. If the PDF contains an image of the specified type that is never embedded at less than the specified threshold resolution, it will be resampled to the target resolution and replaced. Calling this method will cause the PDF to be rebuilt in apply().
- Parameters:
imagetype - the ImageType: whether this applies to one-bit, gray or color images
threshold - the resolution to test the image against - all copies of the image embedded in the PDF must be this resolution or higher for it to be resampled.
target - the resolution to resample the image to.
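As a worked illustration of the threshold and target values, the sketch below computes an image's effective resolution from its pixel size and its placed size in points (72 points to the inch), then the pixel size after resampling. This is ordinary DPI arithmetic under stated assumptions, not the library's code; ImageDpiSketch and its method names are hypothetical.

```java
/** Illustrative DPI arithmetic; not part of the OutputProfiler API. */
final class ImageDpiSketch {
    /** Effective resolution of an image dimension placed at the given size (72 points = 1 inch). */
    static float effectiveDpi(int pixels, float placedPoints) {
        return pixels * 72f / placedPoints;
    }

    /** True only if the image has at least one placement and every placement meets the threshold. */
    static boolean qualifies(float[] placementDpis, float threshold) {
        for (float dpi : placementDpis) {
            if (dpi < threshold) {
                return false;
            }
        }
        return placementDpis.length > 0;
    }

    /** Pixel count after resampling from currentDpi down to targetDpi. */
    static int resampledPixels(int pixels, float currentDpi, float targetDpi) {
        return Math.round(pixels * targetDpi / currentDpi);
    }
}
```

For example, an image 3000 pixels wide placed across 360 points (5 inches) has an effective resolution of 600 DPI. With a threshold of 300 and a target of 200 it would qualify, and resampling would reduce it to 1000 pixels wide.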
-
setStrategy
public void setStrategy(OutputProfiler.Strategy... strategy)
Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile). By default, the strategy is OutputProfiler.Strategy.Default, but multiple items can be passed into this method to define the set of strategies that will be tried when the apply() method is called.
- Parameters:
strategy - a list of strategies to apply
- Since:
- 2.26
-
setStrategy
public void setStrategy(Collection<OutputProfiler.Strategy> strategy)
Set the strategy that will be used to resolve problems encountered during apply(org.faceless.pdf2.OutputProfile). Like setStrategy(OutputProfiler.Strategy...), but this method takes a Collection.
- Parameters:
strategy - a collection of strategies to apply
- Since:
- 2.28
-
getStrategy
public List<OutputProfiler.Strategy> getStrategy()
Return a copy of the list of all strategies currently being applied.
- Since:
- 2.26.3
-
isStrategy
public boolean isStrategy(OutputProfiler.Strategy s)
Return true if the specified Strategy will be considered by the apply(org.faceless.pdf2.OutputProfile) method when applying an OutputProfile.
- Since:
- 2.26
-
apply
public void apply(OutputProfile targetprofile)
Set the specified OutputProfile on the PDF. The supplied "target" profile will have a number of features denied and required, and this method will attempt to modify the PDF to match those requirements. If it's not possible then an IllegalStateException will be thrown.
If the supplied profile references any features that require a full scan and the PDF has been loaded in (rather than created from scratch), then a full profile of the existing PDF must be run() to determine which features are currently set. If this is already in progress in another thread, this method will wait for it to complete. If it hasn't yet been started, it will be started on this thread by calling getProfile(). If no PDFParser has been set (in the constructor or through the setParser method) then a full profile cannot be created, and an IllegalStateException will be thrown.
If a OutputProfiler.FontAction, OutputProfiler.ColorAction, OutputProfiler.ImageAction or OutputProfiler.RasterizingAction has been set on this class, an extra stage will be run which rebuilds the PDF content. It is also run if the full profile shows up any hairlines and the setHairlineWidth method was called with a non-zero value.
After this stage, or if no actions or hairline-replacement are specified, then the method will attempt to modify the PDF to add or remove required or denied features, as specified in the target profile. If that completes successfully, the OutputIntent on the target profile will be applied to the PDF and this method will complete.
While this method is running the isRunning() method will return true, and the progress value returned from getProgress() will be updated, although the returned value is approximate at best: the amount of work required to modify a PDF to meet a target profile cannot realistically be predicted in advance. The cancel() method can be used to request that the apply() method be interrupted. The PDF should be left in a consistent state if this happens, but that state will necessarily be somewhere between how the PDF was originally and how it was going to be after modification. There is no way to revert the PDF to its original state other than reloading. When this method finishes the isDone() method will return true, and the isCancelled() method will be false if the method completed successfully or threw an exception, and true if it was cancelled.
Note that this method modifies the PDF extensively, so (unlike the retrieval of the OutputProfile from the run() method), any threads that read from the PDF must be paused while this method is running. The functionality to manage the progress of this method was added in 2.26.1.
- Parameters:
targetprofile - the OutputProfile that this PDF should be converted to match.
-
getArlingtonModelIssues
public List<ArlingtonModelIssue> getArlingtonModelIssues()
Traverse the PDF and generate a list of issues based on the Arlington PDF validation model. The list is recreated each time this method is called.
- Since:
- 2.27.2
- See Also:
ArlingtonModelIssue
-
-