Tomáš Horák

Invoice OCR pipeline for an accountant

Email-in → vision model extracts line items, vendor, totals, dates → writes to QuickBooks. Handles 12 European invoice formats.

Cuts month-end close from 5 days to 1.5.
0

Problem

Accounting practice spent 30+ hours/client at month-end re-typing invoices from PDFs into QuickBooks. Errors meant rework, deadlines slipped.

Flow

IMAP poll → attachment extracted → vision model → field mapping per format → confidence check → QuickBooks write or human queue.

Cadence

Per inbound invoice (instant), plus end-of-month batch close

What I built

  • Dedicated email inbox per client with IMAP polling
  • Vision model extracts vendor, line items, totals, dates, VAT
  • Per-format mapping for 12 EU invoice templates with fallbacks
  • QuickBooks Online write with auto-categorization rules per vendor

Limits

  • Confidence-gated — anything below 0.85 routes to a human queue
  • Currency conversion uses month-end rate; daily rate is a v2
Invoice OCR pipeline for an accountant

AI tools used

  • Replicate
  • Anthropic API
  • n8n