Classification And Extraction
Overview
In the Classification and Extraction settings, you can:
Enable Document Splitting based on barcodes or QR codes
Configure amount formatting
Set up table extraction
Toggle processing of unsupported ZUGFeRD files
Define special classification rules
Monitor Custom-Trained AI Models used in the classification process
This page provides a detailed explanation of all available settings.
Accessing Classification and Extraction Settings
To access the Classification and Extraction settings, go to: Settings → Document Processing → Classification and Extraction

Document Splitting
In the Document Splitting section, you can configure whether an uploaded document should be split into multiple documents whenever a barcode appears on one of its pages.
To activate this feature:
Click Split Documents.
Select Split by Barcode/QR Code.
You will then have the option to:
Select one or more barcode types to be detected.
Specify a regex pattern that the barcode must match in order to trigger document splitting.
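As a rough illustration of how a regex pattern can gate splitting, the sketch below groups pages into documents and starts a new document whenever a page carries a barcode value that matches the pattern. This is not DocBits code: the function name split_on_barcodes, the input shape, the use of a full match, and the choice to keep the barcode page as the first page of the new document are all assumptions made for the example.

```python
import re

def split_on_barcodes(pages, pattern):
    """Group pages into documents (hypothetical helper, not a DocBits API).

    `pages` holds, for each page, the list of barcode values detected on it.
    A page with at least one value that fully matches `pattern` starts a
    new document. Returns a list of documents as lists of page indexes.
    """
    regex = re.compile(pattern)
    documents = []
    for index, barcodes in enumerate(pages):
        is_separator = any(regex.fullmatch(value) for value in barcodes)
        # Assumption: the matching page opens the new document
        # rather than being discarded as a separator sheet.
        if is_separator or not documents:
            documents.append([])
        documents[-1].append(index)
    return documents

# Five pages; only values shaped like INV-1234 trigger a split.
pages = [["INV-0001"], [], ["PROMO-77"], ["INV-0002"], []]
print(split_on_barcodes(pages, r"INV-\d{4}"))
```

A pattern that is too broad (for example, one that also matches the promotional code above) would split the upload in more places than intended, so it helps to test the pattern against real barcode values before saving it.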
Amount Formatting
In the Amount Formatting section, you have two options:
Allow Rounding During Amount Comparison: If enabled, a tolerance of ±0.5 is allowed during amount comparison. If disabled, a default tolerance of ±0.05 applies.
Require Exact Match for Amount Comparison: If enabled, amounts must match exactly with zero tolerance. If disabled, a tolerance of ±0.05 is allowed.
Note: Only one of these settings can be active at a time.
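The three resulting tolerance levels can be summarized in a short sketch. The tolerance values come from the descriptions above; the mode names, the function amounts_match, and the decision to treat a difference exactly equal to the tolerance as a match are assumptions for illustration and may not reflect how DocBits handles the boundary.

```python
from decimal import Decimal

# Tolerances as described in the settings; mode names are hypothetical.
TOLERANCES = {
    "rounding": Decimal("0.5"),   # Allow Rounding During Amount Comparison
    "default": Decimal("0.05"),   # neither setting enabled
    "exact": Decimal("0"),        # Require Exact Match for Amount Comparison
}

def amounts_match(expected, actual, mode="default"):
    """Return True if two amounts (given as strings) differ by no more
    than the tolerance of the selected mode (boundary assumed inclusive)."""
    difference = abs(Decimal(expected) - Decimal(actual))
    return difference <= TOLERANCES[mode]

print(amounts_match("100.00", "100.30", mode="rounding"))
print(amounts_match("100.00", "100.30", mode="default"))
print(amounts_match("100.00", "100.01", mode="exact"))
```

Decimal is used instead of float so that small differences such as 0.05 are compared without binary rounding error.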
Table Extraction
You can extract tables from documents by enabling either Table Extraction or AI Table Extraction. A trained table, whether AI-based or manual, is always linked to a specific supplier.
Table Extraction: Activates manual table extraction. Tables must be trained manually. Learn more about manual training here.
AI Table Extraction: Uses AI to automatically extract tables. If the results are not accurate enough, it's recommended to switch to manual Table Extraction for better control and training.
Table Extraction for Costing Elements: When enabled, DocBits can extract costing elements from tables at the line level and classify them accordingly. Detailed explanation available here.
Auto Extract Tax Code: When enabled, the system automatically fills the Tax Code field on the Validation Screen, provided that a tax code field is configured. More information on this setting here.
AI Model: Allows you to specify which AI model is used for table extraction. You’ll also see a table showing:
Which suppliers are using which AI model
Whether they use E-Text
Options to delete an entry and reset training data
This setting is explained in detail here.
Electronic Document
Process Unsupported ZUGFeRD PDFs: If enabled, unsupported ZUGFeRD versions will be processed as standard PDFs, and the embedded XML will be ignored.
The list of supported ZUGFeRD versions can be found here.
Classification Rules
In the Classification Rules section, you can define specific regex patterns and criteria to help the system automatically classify documents during processing.
To access this section, click the Classification Rules tab at the top of the page.

Add a New Classification Rule
To create a new rule:
Click Add in the top-right corner.
Fill in the following fields:
Pattern: The regex pattern the system should search for to trigger classification.
Type: Where the pattern should be searched (e.g., Barcode).
Sub-Organization (optional): Specify which sub-organization the rule applies to.
Document Type: Define the document type to assign when the pattern is matched.
Sub-Document Type (optional): Specify a sub-type for more detailed classification.
Click Save to save your classification rule.
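To make the role of each field concrete, the following sketch models a rule as a small data class and applies a list of rules to values extracted from a document. It is a simplified assumption of how such matching could work, not the DocBits implementation: the class ClassificationRule, the function classify, the first-match-wins order, and the use of re.search are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class ClassificationRule:
    pattern: str                            # Pattern
    source_type: str                        # Type, e.g. "Barcode"
    document_type: str                      # Document Type
    sub_document_type: Optional[str] = None  # Sub-Document Type (optional)
    sub_organization: Optional[str] = None   # Sub-Organization (optional)

def classify(rules, values_by_type, sub_organization=None):
    """Return (document_type, sub_document_type) of the first matching
    rule, or None when no rule matches (first-match order is assumed)."""
    for rule in rules:
        # A rule bound to a sub-organization only applies to that one.
        if rule.sub_organization and rule.sub_organization != sub_organization:
            continue
        for value in values_by_type.get(rule.source_type, []):
            if re.search(rule.pattern, value):
                return rule.document_type, rule.sub_document_type
    return None

rules = [
    ClassificationRule(r"^INV-\d+$", "Barcode", "INVOICE", "CREDIT_NOTE", "EU"),
    ClassificationRule(r"^PO-\d+$", "Barcode", "PURCHASE_ORDER"),
]
print(classify(rules, {"Barcode": ["PO-4711"]}))
```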
Edit a Classification Rule
To edit an existing rule:
Click the three dots in the Actions column.
Select Edit.
Make your desired changes.
Click Save to apply the updates.
Delete a Classification Rule
To delete a rule:
Click the three dots in the Actions column.
Select Delete.
AI Models
The AI Models section displays all custom-trained models that have been specifically fine-tuned for your needs.
Accessing the AI Models Section
To open this section, click the AI Models tab located at the top of the page.

Model Categories
Models are organized into categories. Below each category name, the number of models it contains is shown. Click on a category to view its details.

At the top of the selected category page, you’ll see key information about each model:
Type: The type of model.
First Page Only: Indicates whether the model processes only the first page of a document.
Version: The version number of the model.
Model Table
All models within a category are listed in a table, which includes the following information:
Name: The name of the model.
Next Model: The model that will further process the output of the current model.
Document Type: The primary document type assigned by the model during classification.
Document Sub-Types: The sub-types into which the document is further classified.
Priority: The priority level that determines the model’s position in the classification queue.
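The relationship between Next Model and Priority can be pictured with a small sketch. Everything in it is an assumption for illustration: the model names, the dictionary layout, and in particular the idea that a lower priority number means an earlier position in the queue, which the settings page does not state.

```python
def processing_chain(models, start):
    """Follow the Next Model links from `start` and return the model
    names in the order they would process a document."""
    chain = []
    seen = set()
    name = start
    while name is not None:
        if name in seen:
            # Guard against a model that points back into its own chain.
            raise ValueError(f"cycle in Next Model chain at {name!r}")
        seen.add(name)
        chain.append(name)
        name = models[name]["next_model"]
    return chain

def classification_queue(models):
    """Order model names by priority (assumed: lower number runs first)."""
    return sorted(models, key=lambda name: models[name]["priority"])

# Hypothetical models, not taken from a real DocBits account.
models = {
    "invoice_classifier": {"next_model": "invoice_subtype", "priority": 1},
    "invoice_subtype": {"next_model": None, "priority": 3},
    "order_classifier": {"next_model": None, "priority": 2},
}
print(processing_chain(models, "invoice_classifier"))
print(classification_queue(models))
```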

Editing a Model
To edit a model:
Click the pen icon in the Actions column next to the model you want to edit.
Update the available fields:
Next Model: Select the model that should process the output from the current model.
Document Type: Choose the document type the model should classify the input as.
Click Save to apply your changes.