Sectioning
class py_pdf_parser.sectioning.Section(document, name, unique_name, start_element, end_element)
A continuous group of elements within a document.
A section is a label for a group of elements. These elements must be continuous in the document.
Warning
You should not instantiate a Section class yourself, but should call create_section from the Sectioning class below.
Parameters:
- document (PDFDocument) – A reference to the document.
- name (str) – The name of the section.
- unique_name (str) – The unique name of the section. Multiple sections can share the same name, so the Sectioning class generates a unique name for each one.
- start_element (PDFElement) – The first element in the section.
- end_element (PDFElement) – The last element in the section.
elements
All the elements in the section.
Return type: ElementList
class py_pdf_parser.sectioning.Sectioning(document: PDFDocument)
A sectioning utilities class, made available on all PDFDocuments as .sectioning.
create_section(name: str, start_element: PDFElement, end_element: PDFElement, include_last_element: bool = True)
Creates a new section with the specified name, starting at start_element and ending at end_element (inclusive by default; see include_last_element). The unique name is set to name_<idx>, where <idx> is the number of existing sections with that name, so the first section with a given name gets index 0.
Parameters:
- name (str) – The name of the new section.
- start_element (PDFElement) – The first element in the section.
- end_element (PDFElement) – The last element in the section.
- include_last_element (bool) – Whether the end_element should be included in the section, or only the elements which are strictly before the end element. Default: True (i.e. include end_element).
Returns: The created section.
Return type: Section
Raises: InvalidSectionError – If the created section would be invalid. This is usually because the end_element comes before the start_element.
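The naming and boundary rules described for create_section can be sketched with a small stand-alone model. This is not the library's implementation: ToySectioning, ToySection and ToyInvalidSectionError are hypothetical names, and plain strings stand in for PDFElement objects. It only illustrates the name_<idx> scheme, the effect of include_last_element, and the ordering check.

```python
from dataclasses import dataclass
from typing import Dict, List


class ToyInvalidSectionError(Exception):
    """Hypothetical stand-in for the library's InvalidSectionError."""


@dataclass
class ToySection:
    name: str
    unique_name: str
    elements: List[str]


class ToySectioning:
    """Toy model over a list of strings; not py_pdf_parser's own code."""

    def __init__(self, elements: List[str]) -> None:
        self._elements = elements
        self._counts: Dict[str, int] = {}  # sections created so far, per name
        self.sections: Dict[str, ToySection] = {}

    def create_section(
        self, name: str, start: str, end: str, include_last_element: bool = True
    ) -> ToySection:
        start_idx = self._elements.index(start)
        end_idx = self._elements.index(end)
        if not include_last_element:
            end_idx -= 1  # stop strictly before the end element
        if end_idx < start_idx:
            raise ToyInvalidSectionError("end must not come before start")
        idx = self._counts.get(name, 0)  # number of existing sections with this name
        self._counts[name] = idx + 1
        section = ToySection(
            name, f"{name}_{idx}", self._elements[start_idx : end_idx + 1]
        )
        self.sections[section.unique_name] = section
        return section


toy = ToySectioning(["Title", "Intro", "Body A", "Body B", "Footer"])
first = toy.create_section("body", "Intro", "Body B")
second = toy.create_section("body", "Body A", "Footer", include_last_element=False)
print(first.unique_name, first.elements)
print(second.unique_name, second.elements)
```

In this model, passing include_last_element=False is convenient when the end element is really the heading of the next block, so it marks the boundary without belonging to the section.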
get_section(unique_name: str) → py_pdf_parser.sectioning.Section
Returns the section with the given unique name.
Raises: SectionNotFoundError – If there is no section with the given unique_name.
get_sections_with_name(name: str) → Generator[py_pdf_parser.sectioning.Section, None, None]
Returns a generator over all sections with the given name.
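The difference between the two lookups can be illustrated with a stand-alone sketch. It does not use the library: ToySection, ToySectionNotFoundError and the SECTIONS list are hypothetical stand-ins. It shows an exact lookup by unique name that raises when nothing matches, and a generator-based lookup by name that has to be materialised (for example with list()) before it can be measured or iterated twice.

```python
from dataclasses import dataclass
from typing import Iterator, List


class ToySectionNotFoundError(Exception):
    """Hypothetical stand-in for the library's SectionNotFoundError."""


@dataclass
class ToySection:
    name: str
    unique_name: str


# Hypothetical set of already-created sections.
SECTIONS: List[ToySection] = [
    ToySection("header", "header_0"),
    ToySection("row", "row_0"),
    ToySection("row", "row_1"),
]


def get_section(unique_name: str) -> ToySection:
    """Exact lookup by unique name; raises if no section matches."""
    for section in SECTIONS:
        if section.unique_name == unique_name:
            return section
    raise ToySectionNotFoundError(unique_name)


def get_sections_with_name(name: str) -> Iterator[ToySection]:
    """Lazily yields every section that shares the given name."""
    for section in SECTIONS:
        if section.name == name:
            yield section


rows = get_sections_with_name("row")
row_names = [section.unique_name for section in rows]  # consumes the generator
leftover = list(rows)  # a second pass yields nothing
print(row_names, leftover)
```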
sections
Returns the list of all created Sections.