Package com.pdftools.extraction
Class Extractor
java.lang.Object
  com.pdftools.internal.NativeBase
    com.pdftools.internal.NativeObject
      com.pdftools.extraction.Extractor
public class Extractor extends NativeObject
Allows for extracting page-wide content of a PDF.
Constructor Summary
Constructor     Description
Extractor()
Method Summary
Modifier and Type   Method and Description
void                extractText(Document inDoc, Stream outStream)
                    Extract text from a PDF document
void                extractText(Document inDoc, Stream outStream, TextOptions options)
                    Extract text from a PDF document
void                extractText(Document inDoc, Stream outStream, TextOptions options, java.lang.Integer firstPage)
                    Extract text from a PDF document
void                extractText(Document inDoc, Stream outStream, TextOptions options, java.lang.Integer firstPage, java.lang.Integer lastPage)
                    Extract text from a PDF document
Methods inherited from class com.pdftools.internal.NativeObject
equals, hashCode
Method Detail
extractText
public void extractText(Document inDoc, Stream outStream) throws java.io.IOException, GenericException, LicenseException, ProcessingException
Extract text from a PDF document
Parameters:
  inDoc - The input PDF document.
  outStream - The stream to which the extracted text is written.
Throws:
  LicenseException - The license check has failed.
  ProcessingException - The processing has failed.
  java.io.IOException - Writing to the output text file has failed.
  GenericException - A generic error occurred.
  java.lang.IllegalArgumentException - The firstPage or lastPage is not in the allowed range.
  java.lang.IllegalArgumentException - if inDoc is null
  java.lang.IllegalArgumentException - if outStream is null
extractText
public void extractText(Document inDoc, Stream outStream, TextOptions options) throws java.io.IOException, GenericException, LicenseException, ProcessingException
Extract text from a PDF document
Parameters:
  inDoc - The input PDF document.
  outStream - The stream to which the extracted text is written.
  options - The option object that controls the text extraction.
Throws:
  LicenseException - The license check has failed.
  ProcessingException - The processing has failed.
  java.io.IOException - Writing to the output text file has failed.
  GenericException - A generic error occurred.
  java.lang.IllegalArgumentException - The firstPage or lastPage is not in the allowed range.
  java.lang.IllegalArgumentException - if inDoc is null
  java.lang.IllegalArgumentException - if outStream is null
extractText
public void extractText(Document inDoc, Stream outStream, TextOptions options, java.lang.Integer firstPage) throws java.io.IOException, GenericException, LicenseException, ProcessingException
Extract text from a PDF document
Parameters:
  inDoc - The input PDF document.
  outStream - The stream to which the extracted text is written.
  options - The option object that controls the text extraction.
  firstPage - Optional parameter denoting the index of the first page to be extracted. This index is one-based. If set, the number must be in the range of 1 (first page) to pdftools.pdf.Document.getPageCount (last page). If not set, 1 is used.
Throws:
  LicenseException - The license check has failed.
  ProcessingException - The processing has failed.
  java.io.IOException - Writing to the output text file has failed.
  GenericException - A generic error occurred.
  java.lang.IllegalArgumentException - The firstPage or lastPage is not in the allowed range.
  java.lang.IllegalArgumentException - if inDoc is null
  java.lang.IllegalArgumentException - if outStream is null
extractText
public void extractText(Document inDoc, Stream outStream, TextOptions options, java.lang.Integer firstPage, java.lang.Integer lastPage) throws java.io.IOException, GenericException, LicenseException, ProcessingException
Extract text from a PDF document
Parameters:
  inDoc - The input PDF document.
  outStream - The stream to which the extracted text is written.
  options - The option object that controls the text extraction.
  firstPage - Optional parameter denoting the index of the first page to be extracted. This index is one-based. If set, the number must be in the range of 1 (first page) to pdftools.pdf.Document.getPageCount (last page). If not set, 1 is used.
  lastPage - Optional parameter denoting the index of the last page to be extracted. This index is one-based. If set, the number must be in the range of 1 (first page) to pdftools.pdf.Document.getPageCount (last page). If not set, pdftools.pdf.Document.getPageCount is used.
Throws:
  LicenseException - The license check has failed.
  ProcessingException - The processing has failed.
  java.io.IOException - Writing to the output text file has failed.
  GenericException - A generic error occurred.
  java.lang.IllegalArgumentException - The firstPage or lastPage is not in the allowed range.
  java.lang.IllegalArgumentException - if inDoc is null
  java.lang.IllegalArgumentException - if outStream is null
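Example (illustrative only):
The firstPage and lastPage rules documented above (one-based indices, null meaning "not set", defaults of 1 and the page count) can be sketched as a small stand-alone helper. PageRange and resolve are hypothetical names and are not part of com.pdftools; the pageCount argument stands in for the value of Document.getPageCount. The documentation does not say how firstPage greater than lastPage is handled, so rejecting it here is an assumption.

```java
/**
 * Hypothetical helper, NOT part of com.pdftools: mirrors the documented
 * firstPage/lastPage rules so they can be checked before calling extractText.
 */
final class PageRange {
    final int first; // resolved one-based index of the first page
    final int last;  // resolved one-based index of the last page

    private PageRange(int first, int last) {
        this.first = first;
        this.last = last;
    }

    /**
     * Resolves optional one-based page bounds against a page count.
     * A null firstPage resolves to 1; a null lastPage resolves to pageCount.
     */
    static PageRange resolve(Integer firstPage, Integer lastPage, int pageCount) {
        if (pageCount < 1) {
            throw new IllegalArgumentException("pageCount must be at least 1");
        }
        int first = (firstPage != null) ? firstPage : 1;
        int last = (lastPage != null) ? lastPage : pageCount;
        if (first < 1 || first > pageCount) {
            throw new IllegalArgumentException("firstPage out of range: " + first);
        }
        if (last < 1 || last > pageCount) {
            throw new IllegalArgumentException("lastPage out of range: " + last);
        }
        // Assumption: the documentation does not state whether firstPage > lastPage
        // is accepted; this sketch rejects it.
        if (first > last) {
            throw new IllegalArgumentException("firstPage must not exceed lastPage");
        }
        return new PageRange(first, last);
    }
}
```

A caller could run PageRange.resolve(firstPage, lastPage, pageCount) before invoking extractText to surface range errors early. The library performs its own check and reports failures as java.lang.IllegalArgumentException.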