Production-ready outputs built on cross-referenced facts

The highest-accuracy document extraction for your AI

Drop a document. Ask anything.
Get accurate answers, with sources.

$ pip install proofpudding

from proofpudding import PDF
PDF.load("tesla_10k_2024.pdf").extract("List all subsidiaries")[0]

{
  "name": "Tesla Motors Netherlands BV",
  "jurisdiction": "Amsterdam, NL",
  "ownership": "100%",
  "_proof": {
    "page": 142,
    "text": "wholly-owned subsidiary...",
    "bbox": [0.12, 0.34, 0.88, 0.41]
  }
}
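Because each result carries its own `_proof`, a caller can check a value against the source page without a second lookup. The sketch below works only on the dictionary shown above and does not call the library. It assumes, without confirmation from the product docs, that `bbox` is `[x0, y0, x1, y1]` expressed as fractions of page width and height. The helper names `bbox_to_pixels` and `cite` are hypothetical.

```python
# Sketch only: operates on the result dict shown above, not on the library.
# ASSUMPTION: bbox is [x0, y0, x1, y1] as fractions of page width/height.

def bbox_to_pixels(bbox, page_width, page_height):
    """Scale a fractional bounding box to pixel coordinates (hypothetical helper)."""
    x0, y0, x1, y1 = bbox
    return (
        round(x0 * page_width),
        round(y0 * page_height),
        round(x1 * page_width),
        round(y1 * page_height),
    )


def cite(record):
    """Build a short human-readable citation from a record's _proof field."""
    proof = record["_proof"]
    return f'{record["name"]} (p. {proof["page"]}): "{proof["text"]}"'


record = {
    "name": "Tesla Motors Netherlands BV",
    "jurisdiction": "Amsterdam, NL",
    "ownership": "100%",
    "_proof": {
        "page": 142,
        "text": "wholly-owned subsidiary...",
        "bbox": [0.12, 0.34, 0.88, 0.41],
    },
}

print(cite(record))
print(bbox_to_pixels(record["_proof"]["bbox"], 1000, 1000))
```

If the real `bbox` uses a different convention (absolute points, or a bottom-left origin as in native PDF coordinates), only `bbox_to_pixels` would need to change.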

Put it to the test

Upload any document and ask a question

No signup required. Your document is processed and deleted.

Document extraction that proves itself

Every answer traced back to the source

Any document or language

PDFs, scans, tables, forms. 200+ pages. Japanese, Korean, Arabic, Thai, Chinese, and more.

Evidence for every answer

Every value links to the exact page and source text. Verifiable by default.

Predictable costs

Pay per extraction, not per token. Budget based on task complexity.

Best-in-class accuracy

Production-ready outputs built on cross-referenced facts. No guessing, no gaps.

Minimal code, maximum data

Load, extract and use in a few lines of code

from proofpudding import PDF
doc = PDF.load("contract.pdf")

Works with your stack

Native integrations for every major agent framework

Pay per extraction

No subscriptions, no minimums

$0.05 per tool call + LLM costs

Document               Example cost
2-page invoice         $0.06
50-page contract       $0.30
100-page SEC filing    $0.55
150-page report        $3.45
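The pricing rule is a flat fee per tool call plus pass-through LLM costs, so a budget estimate is simple arithmetic. The sketch below is an illustration, not a published calculator. The $0.05 fee comes from the pricing line above, while the number of tool calls and the LLM cost per job are hypothetical inputs, since the page does not break the example costs down.

```python
# Sketch only: a back-of-the-envelope estimator for the stated pricing rule.
TOOL_CALL_FEE = 0.05  # USD per tool call, from the pricing line above


def estimate_cost(tool_calls, llm_cost):
    """Return the estimated USD cost of one extraction job.

    tool_calls and llm_cost are caller-supplied guesses (ASSUMPTION):
    the page does not say how many calls a given document needs.
    """
    return round(tool_calls * TOOL_CALL_FEE + llm_cost, 2)


# One illustrative split, not a published breakdown:
# a single tool call plus one cent of LLM usage.
print(estimate_cost(tool_calls=1, llm_cost=0.01))
```

Under that illustrative split the estimate matches the 2-page invoice example, but other splits could produce the same total.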

No card required

Questions

PDF (native and scanned), DOCX, images (PNG, JPG). More coming soon.

Our agent processes documents up to 200 pages. For longer documents, contact us.

All languages Claude supports, with particular strength in English, Japanese, Korean, Chinese, Arabic, Thai, and European languages.

Documents are processed and deleted. We don't store your data.

LlamaParse converts documents to markdown for RAG. We extract structured data from specific queries. Parsing ≠ extraction.

Ready to extract?

$10 free credit.
No card required.

$ pip install proofpudding