Extraction Accuracy Issues
How to improve extraction accuracy and fix mistakes
If DigiParser is extracting data incorrectly, this guide helps you fix it.
Common accuracy issues
Wrong values extracted
Problem: A field has the wrong value (e.g. invoice number is "123" but should be "456").
Solution:
- Check the original document to confirm what the correct value should be
- Edit the value directly in the document view (click the cell, type the correct value, save)
- Re-process if the document itself was unclear or low quality
Missing values
Problem: A field that should have a value is empty.
Solution:
- Check the document to see if the value is actually there
- Edit and add the value manually if it's in the document
- Improve the field's AI Description in Fields & Tables to help the AI find the value
- Check confidence scores (if enabled) to see if the field had low confidence
Values in wrong fields
Problem: A value appears in the wrong field (e.g. vendor name in invoice number field).
Solution:
- Edit the values to move them to the correct fields
- Improve field names to be more specific (e.g. "Invoice Number" instead of "Number")
- Add AI Descriptions to clarify what each field should contain
Table data mixed up
Problem: Table columns have wrong values or rows are missing.
Solution:
- Edit table cells to correct values
- Check that column names match the column headers in the document
- Use clearer column names (e.g. "Unit Price" instead of "Price")
- See Table Extraction Tips for more help
How to improve accuracy
Use clear field names
Good: "Invoice Number", "Total Amount", "Vendor Name"
Avoid: "Field 1", "Data", "Info"
Clear names help the AI understand what to extract.
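If you keep your field list in a spreadsheet or script before entering it in DigiParser, a quick check can catch vague names early. The sketch below is only an illustration: the `VAGUE_NAMES` set and the `is_vague` helper are hypothetical and are not part of DigiParser.

```python
import re

# Hypothetical list of generic names; extend it to suit your own fields.
VAGUE_NAMES = {"data", "info", "number", "price", "value", "field"}

def is_vague(field_name: str) -> bool:
    """Return True for generic names or placeholders such as 'Field 1'."""
    name = field_name.strip().lower()
    # Placeholder pattern: "field", "field 1", "field2", ...
    if re.fullmatch(r"field\s*\d*", name):
        return True
    return name in VAGUE_NAMES

names = ["Invoice Number", "Field 1", "Data", "Unit Price", "Info"]
flagged = [n for n in names if is_vague(n)]
print(flagged)  # ['Field 1', 'Data', 'Info']
```

Any name that gets flagged is a candidate for a more specific replacement, such as "Invoice Number" instead of "Number".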
Add AI Descriptions
In Fields & Tables, add AI Description text that explains what the field is:
- "The invoice number, usually at the top right"
- "The total amount including tax, usually at the bottom"
- "The vendor or supplier name, usually in the header"
Use the right field types
- Amounts, quantities → Number
- Dates → Date
- Text, names, addresses → Plain text
Using the correct type helps extraction accuracy.
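It can help to draft names, types, and AI Descriptions together before entering them in Fields & Tables. The following is a minimal planning sketch, not DigiParser's own schema format: the `FieldDraft` class, the `problems` helper, and the lowercase type labels are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Field types named in this guide; the lowercase labels are illustrative only.
ALLOWED_TYPES = {"number", "date", "plain_text"}

@dataclass
class FieldDraft:
    name: str
    field_type: str
    ai_description: str = ""

def problems(fields):
    """List issues to fix before entering the fields in Fields & Tables."""
    issues = []
    for f in fields:
        if f.field_type not in ALLOWED_TYPES:
            issues.append(f"{f.name}: unknown type '{f.field_type}'")
        if not f.ai_description.strip():
            issues.append(f"{f.name}: missing AI Description")
    return issues

draft = [
    FieldDraft("Invoice Number", "plain_text", "The invoice number, usually at the top right"),
    FieldDraft("Total Amount", "number", "The total amount including tax, usually at the bottom"),
    FieldDraft("Invoice Date", "date"),  # no description yet
]

for issue in problems(draft):
    print(issue)  # Invoice Date: missing AI Description
```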
Enable confidence scores
In Parser Settings → Parsing Configuration, enable Calculate confidence scores. Low-confidence values are the ones most likely to be wrong, so review those first.
Note: Calculating confidence scores uses 1 credit per page.
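Once confidence scores are available, a simple triage rule is to review empty values and anything below a cut-off first. The sketch below assumes a hypothetical data shape (field name mapped to a value and a confidence between 0 and 1) and an arbitrary threshold of 0.80; neither is taken from DigiParser's documentation, so adjust both to match what you actually see.

```python
# Hypothetical shape: field name -> (extracted value, confidence from 0 to 1).
extracted = {
    "Invoice Number": ("INV-456", 0.97),
    "Total Amount": ("1,280.00", 0.62),
    "Vendor Name": ("", 0.30),
}

REVIEW_THRESHOLD = 0.80  # assumed cut-off; tune it for your documents

def needs_review(fields, threshold=REVIEW_THRESHOLD):
    """Return field names that are empty or below the threshold, least confident first."""
    flagged = [
        name
        for name, (value, confidence) in fields.items()
        if confidence < threshold or not value
    ]
    return sorted(flagged, key=lambda name: fields[name][1])

print(needs_review(extracted))  # ['Vendor Name', 'Total Amount']
```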
Improve document quality
- Use clear, readable scans or PDFs
- Avoid blurry or low-resolution images
- Use consistent document formats (same invoice layout, etc.)
Review and correct
- Review extracted data regularly
- Edit incorrect values
- Re-process documents after improving field names or AI Descriptions
Fixing mistakes
Edit values
- Open the document.
- Click the incorrect value.
- Type the correct value.
- Click Save.
Re-process after changes
If you change Fields & Tables (e.g. improve descriptions or add fields), re-process the document so the changes take effect:
- Open the document.
- Click More (⋮) → Re-process document.
- The document is processed again with the updated schema.
Note: Re-processing uses credits (1 credit per page).
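Before re-processing many documents, it is worth estimating the credit cost. The sketch below applies only the rate stated above (1 credit per page); the `reprocess_credits` helper is hypothetical, and any other charges on your plan are not modeled.

```python
CREDITS_PER_PAGE = 1  # re-processing rate stated in this guide

def reprocess_credits(page_counts):
    """Estimate credits needed to re-process documents with the given page counts."""
    if any(pages < 0 for pages in page_counts):
        raise ValueError("page counts must be non-negative")
    return CREDITS_PER_PAGE * sum(page_counts)

# Three documents with 3, 1, and 12 pages.
print(reprocess_credits([3, 1, 12]))  # 16
```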
When to re-process
- After changing field names or AI Descriptions
- After adding new fields
- When the original file was unclear or low quality
Tips
- Start simple: Begin with a few important fields, then add more
- Test on a few documents: Check extraction on 2-3 documents before processing a large batch
- Use confidence scores: Enable them to spot potential errors
- Review regularly: Check extracted data, especially for new document types
- Improve over time: Add better descriptions and field names as you learn what works
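To make "test on a few documents" measurable, you can compare extracted values against values you have checked by hand and see which fields fail most often. This is a minimal sketch under assumed inputs: each document is a plain dictionary of field name to text, and the `field_accuracy` helper is hypothetical rather than a DigiParser feature.

```python
def field_accuracy(extracted_docs, expected_docs):
    """Return {field: fraction of documents where the extracted value matches}."""
    totals, correct = {}, {}
    for extracted, expected in zip(extracted_docs, expected_docs):
        for field, want in expected.items():
            totals[field] = totals.get(field, 0) + 1
            got = extracted.get(field, "")  # a missing field counts as empty
            if got.strip() == want.strip():
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / totals[field] for field in totals}

# Hand-checked values for two test documents.
expected = [
    {"Invoice Number": "456", "Vendor Name": "Acme Ltd"},
    {"Invoice Number": "789", "Vendor Name": "Globex"},
]
# What the parser produced; the second document has a value in the wrong field.
extracted = [
    {"Invoice Number": "456", "Vendor Name": "Acme Ltd"},
    {"Invoice Number": "Globex", "Vendor Name": ""},
]

print(field_accuracy(extracted, expected))
# {'Invoice Number': 0.5, 'Vendor Name': 0.5}
```

Fields with the lowest scores are the first candidates for a clearer name or a better AI Description.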
Next steps
- How to Increase Extraction Accuracy – General tips
- Table Extraction Tips – Improve table extraction
- Editing Extracted Data – Fix mistakes manually