Class TextOptions


  • public class TextOptions
    extends NativeObject

    Options for text extraction

    This class specifies the details of text extraction.
    • Constructor Detail

      • TextOptions

        public TextOptions()
    • Method Detail

      • setExtractionFormat

        public void setExtractionFormat(TextExtractionFormat value)

        Specifies the format of the extracted text. (Setter)

        Default value: TextExtractionFormat.DOCUMENT_ORDER

        Throws:
        java.lang.IllegalArgumentException - if value is null
      • getAdvanceWidth

        public Length getAdvanceWidth()

        The horizontal space in a PDF that corresponds to a character in monospaced text output.

        If null, the horizontal space is 7.2pt.

        Default value: null

      • setAdvanceWidth

        public void setAdvanceWidth(Length value)

        The horizontal space in a PDF that corresponds to a character in monospaced text output.

        If null, the horizontal space is 7.2pt.

        Default value: null
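
        As a rough illustration of the advance width, the sketch below maps a horizontal position on the page to a character column in monospaced output. It is not part of the SDK: the class AdvanceWidthSketch, the method columnFor, and the choice to round to the nearest column are all assumptions made for this example. Only the 7.2pt fallback comes from the description above.

```java
/** Illustrative only: a hypothetical helper, not part of the SDK. */
final class AdvanceWidthSketch {
    /** Fallback used when no advance width is set (7.2pt, as stated above). */
    static final double DEFAULT_ADVANCE_WIDTH_PT = 7.2;

    /**
     * Maps a horizontal position (in points) to a character column.
     * Rounding to the nearest column is an assumption of this sketch.
     */
    static long columnFor(double xPt, Double advanceWidthPt) {
        // A null advance width falls back to the 7.2pt default.
        double width = (advanceWidthPt != null) ? advanceWidthPt : DEFAULT_ADVANCE_WIDTH_PT;
        if (width <= 0) {
            throw new IllegalArgumentException("advance width must be positive");
        }
        return Math.round(xPt / width);
    }
}
```

        Under these assumptions, a smaller advance width spreads the same page content over more columns, and a larger one compresses it.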

      • getLineHeight

        public Length getLineHeight()

        The vertical space in a PDF that triggers a new line in monospaced text output.

        If null, no extra blank lines are added in the text output.

        Default value: null

      • setLineHeight

        public void setLineHeight(Length value)

        The vertical space in a PDF that triggers a new line in monospaced text output.

        If null, no extra blank lines are added in the text output.

        Default value: null
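
        The effect of the line height can be pictured with the sketch below, which turns the vertical gap between two consecutive text lines into a number of line breaks. It is not part of the SDK: the class LineHeightSketch, the method lineBreaksFor, and the rounding rule are assumptions made for this example. The only behavior taken from the description above is that a null line height adds no extra blank lines.

```java
/** Illustrative only: a hypothetical helper, not part of the SDK. */
final class LineHeightSketch {
    /**
     * Returns how many line breaks to emit between two consecutive text lines
     * separated by verticalGapPt points. A result of 1 means no blank lines.
     */
    static long lineBreaksFor(double verticalGapPt, Double lineHeightPt) {
        // With no line height set, no extra blank lines are added.
        if (lineHeightPt == null) {
            return 1L;
        }
        if (lineHeightPt <= 0) {
            throw new IllegalArgumentException("line height must be positive");
        }
        // Assumption of this sketch: round to the nearest whole line, at least one.
        return Math.max(1L, Math.round(verticalGapPt / lineHeightPt));
    }
}
```

        In this model, a 36pt gap with a 12pt line height yields three line breaks, that is, two blank lines between the two text lines.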

      • getWordSeparationFactor

        public double getWordSeparationFactor()

        A factor that is multiplied by the width of the space character to obtain the threshold for word boundaries. If the distance between two characters exceeds this threshold, it is recognized as a word separation.

        Default value: 0.3

      • setWordSeparationFactor

        public void setWordSeparationFactor(double value)

        A factor that is multiplied by the width of the space character to obtain the threshold for word boundaries. If the distance between two characters exceeds this threshold, it is recognized as a word separation.

        Default value: 0.3

        Throws:
        java.lang.IllegalArgumentException - if the word separation factor is invalid
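
        The word separation rule described above can be written out as a small calculation. The sketch below is not part of the SDK: the class WordSeparationSketch and its method names are hypothetical, and the space width is passed in as a plain number because how the library measures it is not stated here. Only the default factor of 0.3 and the "distance exceeds factor times space width" rule come from the description.

```java
/** Illustrative only: a hypothetical helper, not part of the SDK. */
final class WordSeparationSketch {
    /** Default factor, as documented above. */
    static final double DEFAULT_FACTOR = 0.3;

    /** Threshold distance: the factor multiplied by the width of the space character. */
    static double threshold(double spaceWidthPt, double factor) {
        return factor * spaceWidthPt;
    }

    /** True if the gap between two characters is wide enough to separate words. */
    static boolean isWordSeparation(double gapPt, double spaceWidthPt, double factor) {
        return gapPt > threshold(spaceWidthPt, factor);
    }
}
```

        For example, with a space width of 2.5pt and the default factor, a 1.0pt gap counts as a word separation and a 0.5pt gap does not. Raising the factor makes the extraction less eager to split words.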