Table Extraction Tips
Tips for better table extraction from documents
DigiParser extracts tables (e.g. line items, transactions) from your documents using AI. These tips help you get better, more consistent results.
Use clear column names
Use column names that match what appears in your documents:
- Good: “Description”, “Quantity”, “Unit Price”, “Total”
- Avoid: “Col1”, “Field A”, “Item”
Clear names help the system find the right cells and reduce mix-ups between columns.
Match your document layout
- Headers: If your documents have table headers (e.g. “Description”, “Qty”, “Amount”), use similar names for your columns. The AI uses headers to align columns.
- Order: When possible, keep the column order in Fields & Tables similar to the order in the document. Matching the order can improve accuracy.
Use the right column types
- Amounts, quantities, prices → Number
- Dates → Date
- Descriptions, codes, memos → Plain text
Using the correct type helps the AI parse each cell correctly and keeps data consistent in exports.
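If you want to confirm that exported data stays consistent with the types you chose, a small check outside DigiParser can help. The sketch below is only an illustration: it assumes exported rows arrive as dictionaries of strings, that dates are in ISO format (such as 2024-03-31), and that the column names and the `COLUMN_TYPES` mapping are hypothetical stand-ins for your own setup.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical schema: column name -> column type, mirroring the three types above.
COLUMN_TYPES = {
    "Description": "text",
    "Quantity": "number",
    "Unit Price": "number",
    "Invoice Date": "date",
}


def check_cell(value: str, column_type: str) -> bool:
    """Return True if the value can be read as the given column type."""
    if column_type == "number":
        try:
            Decimal(value.replace(",", ""))  # tolerate thousands separators
            return True
        except InvalidOperation:
            return False
    if column_type == "date":
        try:
            date.fromisoformat(value)  # assumption: ISO dates only
            return True
        except ValueError:
            return False
    return True  # plain text accepts any value


def find_type_problems(rows):
    """List (row index, column, value) for non-empty cells that do not fit their type."""
    problems = []
    for i, row in enumerate(rows):
        for column, column_type in COLUMN_TYPES.items():
            value = row.get(column, "")
            if value and not check_cell(value, column_type):
                problems.append((i, column, value))
    return problems
```

Cells flagged this way are good candidates for a closer look at the column type or an AI Description.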
Prefer clear, readable documents
- Quality: Clear scans or PDFs with readable text work best. Blurry or low-resolution images can hurt table extraction.
- Layout: Straight, well-aligned tables extract better than crooked or heavily annotated ones.
- Consistency: Similar layouts across documents (e.g. same invoice format) usually give more consistent results.
Start simple, then add columns
- Begin with a few important columns (e.g. Description, Quantity, Total).
- Check extraction on a few documents.
- Add more columns as needed. Adding many columns at once can make it harder to spot issues.
Use AI Description for tricky columns
If a column is often wrong or ambiguous:
- Add a short AI Description (e.g. “The line total before tax, usually right-aligned”).
- This gives the AI extra context and can improve accuracy.
Handle merged cells and complex layouts
- Merged cells or very complex table layouts in the document can be harder to extract. If you see missing or misplaced data, use a simpler table structure (e.g. fewer columns or one main table) to improve results.
- Multi-page tables: Extraction works across pages. Ensure the table structure is consistent across pages (same headers, same columns).
Review and correct when it matters
- Spot-check extracted tables on a few documents, especially when you change columns or add new document types.
- Edit incorrect cells in the document view. Your corrections are saved.
- Use Re-process document after you change table columns so existing documents match your current setup.
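Spot-checks can be partly automated on exported data. As a minimal sketch, assuming invoice line items exported as dictionaries with hypothetical "Quantity", "Unit Price", and "Total" columns holding plain numeric strings, the function below flags rows where quantity times unit price does not match the extracted total. It runs outside DigiParser and is not part of the product.

```python
from decimal import Decimal


def find_total_mismatches(rows, tolerance=Decimal("0.01")):
    """Return indexes of rows where Quantity * Unit Price differs from Total."""
    mismatches = []
    for i, row in enumerate(rows):
        expected = Decimal(row["Quantity"]) * Decimal(row["Unit Price"])
        # Allow a small tolerance for rounding on the document itself.
        if abs(expected - Decimal(row["Total"])) > tolerance:
            mismatches.append(i)
    return mismatches
```

A flagged row does not always mean an extraction error (discounts or per-line taxes can also cause a difference), but it tells you which documents to open first.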
Use one parser per “table type”
If you process different kinds of tables (e.g. invoice line items vs. bank transactions), use separate parsers with different table setups. Each parser can have the right columns and types for that document type.
Bank statements and similar documents
- Transactions: Use a Transactions (or similar) table with columns like Date, Description, Amount, Balance.
- Consistent format: Bank statements often follow a fixed layout. Matching your column names and types to that layout helps.
- Debits/credits: If your documents use separate columns for debits and credits, add both as Number columns. You can combine or reformat them in Post Processing or when exporting if needed.
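If you combine debits and credits after exporting, the logic can be as simple as the sketch below. It assumes each exported row is a dictionary with hypothetical "Debit" and "Credit" columns containing numeric strings (or empty values), and it treats credits as positive and debits as negative. This is an illustration of the idea, not DigiParser's Post Processing feature.

```python
from decimal import Decimal


def signed_amount(row):
    """Combine separate Debit and Credit cells into one signed amount."""
    # Empty or missing cells count as zero.
    debit = Decimal(row.get("Debit") or "0")
    credit = Decimal(row.get("Credit") or "0")
    return credit - debit
```

Using `Decimal` rather than floating-point numbers avoids rounding surprises when amounts are summed or compared against the Balance column.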
Common issues
| Issue | What to try |
|---|---|
| Columns mixed up | Use clearer column names; add AI Description; check document layout |
| Missing rows | Check document quality; ensure table has clear structure; make sure your schema includes all expected columns and the table header is clearly labeled in the document |
| Wrong data types | Set correct type for each column (Number, Date, Plain text) |
| Inconsistent results | Use consistent document format; one parser per document/table type; improve scan quality |
Next steps
- Setting Up Tables – Add and configure tables.
- Add/Edit Fields – Add fields and tables.
- How to Increase Extraction Accuracy – General tips for better extraction.