pdftools_sdk.ocr.ocr_options
Classes
OcrOptions: The options for OCR processing
- class pdftools_sdk.ocr.ocr_options.OcrOptions
Bases: _NativeObject
The options for OCR processing
This class aggregates all OCR processing options including resolution settings, image processing, text processing and page processing.
- property dpi: float
The default resolution in DPI used for OCR
Each page’s optimal OCR resolution is determined automatically so that all images and text can be recognized. The default resolution is used if it lies within the page’s range of optimal resolutions.
The range should be within the resolutions supported by the OCR engine. Most OCR engines are optimized for resolutions around 300 DPI.
Default value: 300.0
- Returns:
float
- property min_dpi: float
The minimum resolution in DPI used for OCR
Default value: 200.0
- Returns:
float
- property max_dpi: float
The maximum resolution in DPI used for OCR
Default value: 400.0
- Returns:
float
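The interplay of `dpi`, `min_dpi`, and `max_dpi` can be illustrated with a small stand-alone sketch. It does not call the SDK and is not the SDK's actual algorithm. `pick_resolution` and its parameters are hypothetical names. The documentation above only states that the default is used when it lies within the page's optimal range. The fallback shown here (moving to the nearest allowed bound) is an assumption made purely for illustration.

```python
def pick_resolution(optimal_low, optimal_high,
                    dpi=300.0, min_dpi=200.0, max_dpi=400.0):
    """Hypothetical illustration of choosing a per-page OCR resolution.

    optimal_low / optimal_high describe the range of resolutions that
    would suit a given page; the remaining arguments mirror the
    documented defaults of dpi, min_dpi and max_dpi.
    """
    if optimal_low > optimal_high:
        raise ValueError("optimal_low must not exceed optimal_high")
    if min_dpi > max_dpi:
        raise ValueError("min_dpi must not exceed max_dpi")

    # Restrict the page's optimal range to the configured window.
    low = max(optimal_low, min_dpi)
    high = min(optimal_high, max_dpi)

    if low > high:
        # The two ranges do not overlap.
        # Assumption: use the configured bound nearest to the optimal range.
        return min_dpi if optimal_high < min_dpi else max_dpi

    # Documented behaviour: the default is used if it is within the range.
    if low <= dpi <= high:
        return dpi

    # Assumption: otherwise move to the nearest end of the allowed range.
    return low if dpi < low else high


if __name__ == "__main__":
    print(pick_resolution(250.0, 350.0))  # default lies inside the range
    print(pick_resolution(320.0, 600.0))  # default lies below the range
```

With the documented defaults, a page whose optimal range includes 300 DPI is processed at 300 DPI in this sketch; only pages whose optimal range excludes it fall through to the assumed fallback.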
- property process_embedded_files: bool
Whether to process embedded files recursively
If enabled, embedded PDF files are also processed with OCR. The default is to copy all embedded files as-is.
Default value: False
- Returns:
bool
- property image_options: ImageOptions
The options for image processing
Options controlling how images in the PDF are processed during OCR.
- Returns:
pdftools_sdk.ocr.image_options.ImageOptions
- property text_options: TextOptions
The options for text processing
Options controlling how existing text is processed during OCR.
- Returns:
pdftools_sdk.ocr.text_options.TextOptions
- property page_options: PageOptions
The options for page processing
Options controlling page-level OCR processing and tagging.
- Returns:
pdftools_sdk.ocr.page_options.PageOptions
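As a usage sketch, the helper below applies resolution settings to any object that exposes the `dpi`, `min_dpi`, `max_dpi`, and `process_embedded_files` attributes listed above. It assumes these properties are writable, which this page does not state, and it does not show how an `OcrOptions` instance is obtained, since that is not documented here. `apply_ocr_settings` is a hypothetical helper, the ordering check inside it is a sanity check rather than a rule taken from the SDK, and `SimpleNamespace` merely stands in for a real options object.

```python
from types import SimpleNamespace


def apply_ocr_settings(options, dpi=300.0, min_dpi=200.0, max_dpi=400.0,
                       process_embedded_files=False):
    """Hypothetical helper: copy OCR settings onto an options-like object."""
    # Sanity check (an assumption, not an SDK rule): keep the default
    # resolution between the minimum and the maximum.
    if not (min_dpi <= dpi <= max_dpi):
        raise ValueError("expected min_dpi <= dpi <= max_dpi")

    # Assumes the documented properties accept assignment.
    options.min_dpi = float(min_dpi)
    options.max_dpi = float(max_dpi)
    options.dpi = float(dpi)
    options.process_embedded_files = bool(process_embedded_files)
    return options


if __name__ == "__main__":
    # Stand-in object carrying the documented default values.
    stand_in = SimpleNamespace(dpi=300.0, min_dpi=200.0, max_dpi=400.0,
                               process_embedded_files=False)
    apply_ocr_settings(stand_in, dpi=350.0, process_embedded_files=True)
    print(stand_in.dpi, stand_in.process_embedded_files)
```

Keeping the helper duck-typed means it can be exercised without the SDK installed; whether a real `OcrOptions` object accepts these assignments should be confirmed against the SDK itself.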