Parsing Configuration
Configure which pages and file types to parse
In Parser Settings → Parsing Configuration, you can control which pages and which file types DigiParser processes from your documents, and turn on optional confidence scores and markdown parsing.
Where to find it
- Open your parser and go to Settings → Parser Settings.
- Scroll to the Parsing Configuration section.
Pages to be parsed
By default, DigiParser processes all pages in a document. You can limit which pages are extracted (e.g. only odd or even pages, or specific page ranges). This applies to PDFs only.
To set: In the Pages to be parsed section, turn the toggle on and choose Only odd pages, Only even pages, or Page ranges. If you choose Page ranges, enter a custom range (e.g. 1,2,8 or 4,7,12-16 or 2n-1). Then click Save Parser.
See Pages to be parsed for full details and examples.
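To make the range syntax concrete, here is a minimal Python sketch of how a page range string could be expanded into page numbers. It is an illustration only, not DigiParser's implementation: the function name select_pages is hypothetical, and the grammar is an assumption inferred from the examples above (single pages, inclusive ranges such as 12-16, and formulas such as 2n-1 evaluated for n = 1, 2, 3, ...).

```python
import re


def select_pages(spec: str, total_pages: int) -> list[int]:
    """Return the sorted 1-based page numbers matched by a range string.

    Assumed grammar (illustrative, not DigiParser's documented behavior):
      "8"      a single page
      "12-16"  an inclusive range
      "2n-1"   a formula evaluated for n = 1, 2, 3, ...
    Pages beyond total_pages are silently ignored.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        formula = re.fullmatch(r"(\d*)n([+-]\d+)?", part)
        span = re.fullmatch(r"(\d+)-(\d+)", part)
        if formula:
            step = int(formula.group(1) or 1)
            offset = int(formula.group(2) or 0)
            if step == 0:
                raise ValueError(f"Step must be at least 1: {part!r}")
            n = 1
            while step * n + offset <= total_pages:
                page = step * n + offset
                if page >= 1:
                    pages.add(page)
                n += 1
        elif span:
            start, end = int(span.group(1)), int(span.group(2))
            pages.update(range(max(start, 1), min(end, total_pages) + 1))
        elif part.isdigit():
            if 1 <= int(part) <= total_pages:
                pages.add(int(part))
        else:
            raise ValueError(f"Unrecognized page range: {part!r}")
    return sorted(pages)


print(select_pages("4,7,12-16", 20))
print(select_pages("2n-1", 7))
```

Under these assumptions, 2n-1 selects the same pages as Only odd pages. Running a sketch like this against the page count of a sample document is a quick way to check which pages a range would cover before saving the parser.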
Document Types
By default, DigiParser processes all supported file types. You can restrict it to specific types.
Supported types:
- PDF
- Images: PNG, JPEG/JPG
- Office: Word (.docx), Excel (.xlsx), PowerPoint (.pptx)
- Text: Markdown, plain text, CSV, HTML, JSON
When to use:
- PDFs only: If you only process PDFs and want to ignore images or Office files
- Specific formats: If you only want to process certain file types (e.g. only PDFs and images)
To set: Use the Document Types dropdown to select which file types to process, then click Save Parser.
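If you want to preview the effect of a Document Types restriction on a batch of files, the following Python sketch groups file names by the categories listed above and keeps only the allowed ones. It is a local illustration, not part of DigiParser: the function name filter_by_type and the mapping are hypothetical, and the .md and .txt extensions are assumptions for the Markdown and plain text types.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the categories in this guide.
# The ".md" and ".txt" extensions are assumptions for Markdown and plain text.
CATEGORY_BY_EXTENSION = {
    ".pdf": "PDF",
    ".png": "Images",
    ".jpg": "Images",
    ".jpeg": "Images",
    ".docx": "Office",
    ".xlsx": "Office",
    ".pptx": "Office",
    ".md": "Text",
    ".txt": "Text",
    ".csv": "Text",
    ".html": "Text",
    ".json": "Text",
}


def filter_by_type(filenames, allowed):
    """Keep only the file names whose category is in the allowed set."""
    kept = []
    for name in filenames:
        # Extensions are compared case-insensitively; unknown ones map to None.
        category = CATEGORY_BY_EXTENSION.get(Path(name).suffix.lower())
        if category in allowed:
            kept.append(name)
    return kept


batch = ["invoice.PDF", "scan.jpg", "notes.docx", "data.csv", "archive.zip"]
print(filter_by_type(batch, {"PDF", "Images"}))
```

This mirrors the "only PDFs and images" case described above: files in other categories, and files with unrecognized extensions, are left out.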
Calculate confidence scores
If enabled, DigiParser shows how confident it is in each extracted value (e.g. as a percentage or color). This helps you spot values that might be wrong.
Cost: Uses 1 credit per page (increases processing costs).
When to use:
- When you want to prioritize review of low-confidence documents
- When you need to spot potential errors before export
To enable: Turn on Calculate confidence scores for extracted fields and click Save Parser.
Enable markdown parsing
When enabled, documents are converted to markdown so you can view and download them from the Markdown tab in the document viewer.
Cost: Uses 1 credit per page (increases processing costs).
When to use:
- When you want to view documents as markdown text
- When you need to download markdown versions of documents
To enable: Turn on Enable markdown parsing and click Save Parser.
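To estimate what the two optional features add to a job, a small Python sketch of the arithmetic follows. It assumes that the 1 credit per page for confidence scores and the 1 credit per page for markdown parsing are each charged on top of the base parsing cost and simply add together; this guide does not spell out how the charges combine, so treat the result as a rough estimate. The function name extra_credits is hypothetical.

```python
def extra_credits(pages: int, confidence_scores: bool, markdown_parsing: bool) -> int:
    """Estimate the additional credits used by the optional features.

    Assumption: each enabled feature costs 1 credit per page and the
    charges are additive. The base parsing cost is not included.
    """
    per_page = int(confidence_scores) + int(markdown_parsing)
    return pages * per_page


# A 12-page document with both features enabled.
print(extra_credits(12, confidence_scores=True, markdown_parsing=True))
```

Under these assumptions, enabling both options on a 12-page document adds 24 credits, and enabling only one adds 12.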
Tips
- Start with defaults: Use default settings (all pages, all file types) unless you have a specific need
- Test page ranges: If limiting pages, test on a few documents first to make sure you're not missing data
- Confidence scores: Only enable if you need them, since they increase costs
- Markdown parsing: Only enable if you need markdown view/download, since it increases costs
Next steps
- General Settings – Parser name and description
- Pages to be parsed – Limit which PDF pages are extracted
- Email Processing – Email processing options
- Split Documents – PDF splitting options