Production-ready outputs
built on cross-referenced facts

The highest-accuracy document extraction for your AI

Drop a document. Ask anything.
Get accurate answers, with sources.

$ pip install proofpudding

from proofpudding import PDF
PDF.load("tesla_10k_2024.pdf").extract("List all subsidiaries")[0]

{
  "name": "Tesla Motors Netherlands BV",
  "jurisdiction": "Amsterdam, NL",
  "ownership": "100%",
  "_proof": {
    "page": 142,
    "text": "wholly-owned subsidiary...",
    "bbox": [0.12, 0.34, 0.88, 0.41]
  }
}
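The `_proof` object in the sample output can be turned into a human-readable citation. The sketch below is a minimal illustration that works on the record above as a plain dict, so it runs without calling proofpudding. The helper names `cite` and `bbox_to_pixels` are hypothetical, and the bbox is assumed to hold normalized [x0, y0, x1, y1] page fractions, which this page does not confirm.

```python
# Sample record copied from the output above.
record = {
    "name": "Tesla Motors Netherlands BV",
    "jurisdiction": "Amsterdam, NL",
    "ownership": "100%",
    "_proof": {
        "page": 142,
        "text": "wholly-owned subsidiary...",
        "bbox": [0.12, 0.34, 0.88, 0.41],
    },
}


def cite(rec):
    """Format a one-line citation from a record's _proof (hypothetical helper)."""
    proof = rec["_proof"]
    return f'{rec["name"]} (p. {proof["page"]}): "{proof["text"]}"'


def bbox_to_pixels(bbox, page_width, page_height):
    """Scale a bbox to pixels, ASSUMING normalized [x0, y0, x1, y1] fractions."""
    x0, y0, x1, y1 = bbox
    return (
        round(x0 * page_width),
        round(y0 * page_height),
        round(x1 * page_width),
        round(y1 * page_height),
    )


print(cite(record))
print(bbox_to_pixels(record["_proof"]["bbox"], 1000, 1000))
```

If the bbox turns out to use a different convention (for example absolute points or [x, y, width, height]), only `bbox_to_pixels` needs to change.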

Put it to the test

Upload any document and ask a question

No signup required. Your document is processed and deleted.

Document extraction
that proves itself

Every answer traced back to the source

Any document or language

PDFs, scans, tables, forms. Up to 200 pages. Japanese, Korean, Arabic, Thai, Chinese, and more.

Evidence for every answer

Every value links to the exact page and source text. Verifiable by default.
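One way to use this in a pipeline is to reject any extracted value that arrives without evidence. The sketch below assumes each record carries a `_proof` dict with `page` and `text` keys, as in the sample output earlier on this page, and that page numbers start at 1. `has_evidence` is a hypothetical helper, not part of proofpudding.

```python
def has_evidence(rec):
    """Return True if a record carries a usable _proof (page number + quoted text)."""
    proof = rec.get("_proof")
    if not isinstance(proof, dict):
        return False
    page = proof.get("page")
    text = proof.get("text")
    # Assumption: pages are 1-based integers and the quoted text is non-empty.
    return (
        isinstance(page, int)
        and page >= 1
        and isinstance(text, str)
        and bool(text.strip())
    )


records = [
    {
        "name": "Tesla Motors Netherlands BV",
        "_proof": {"page": 142, "text": "wholly-owned subsidiary..."},
    },
    {"name": "Unverified Co"},  # made-up record with no evidence, for illustration
]

# Keep only the values that can be traced back to a page and a quote.
verified = [r for r in records if has_evidence(r)]
print([r["name"] for r in verified])
```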

Predictable costs

Pay per extraction, not per token. Budget based on task complexity.

Best-in-class accuracy

Production-ready outputs built on cross-referenced facts. No guessing, no gaps.

Minimal code, maximum data

Load, extract and use in a few lines of code

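from proofpudding import PDF
doc = PDF.load("contract.pdf")
answers = doc.extract("List all parties")  # example query
print(answers[0])  # first result, with its _proof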

Works with your stack

Native integrations for every major agent framework

LangChain
CrewAI
OpenAI
Anthropic

Pay per extraction

No subscriptions, no minimums

$0.05 per tool call + LLM costs
Document               Example cost
2-page invoice         $0.06
50-page contract       $0.30
100-page SEC filing    $0.55
150-page report        $3.45
No card required
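To get a feel for these numbers, the sketch below works out how many documents of each type the $10 free credit mentioned on this page would cover at the example costs listed above. It treats the example costs as fixed, which is an assumption: actual cost depends on LLM usage. Amounts are kept in integer cents to avoid floating-point rounding.

```python
FREE_CREDIT_CENTS = 1000  # $10 free credit

# Example costs from the pricing section, in cents.
example_costs_cents = {
    "2-page invoice": 6,
    "50-page contract": 30,
    "100-page SEC filing": 55,
    "150-page report": 345,
}


def documents_covered(credit_cents, cost_cents):
    """Whole documents a credit covers at a fixed per-document cost."""
    return credit_cents // cost_cents


for doc_type, cost in example_costs_cents.items():
    count = documents_covered(FREE_CREDIT_CENTS, cost)
    print(f"{doc_type}: {count} documents")
```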

Questions

What file formats are supported?
PDF (native and scanned), DOCX, images (PNG, JPG). More coming soon.

How long can a document be?
Our agent processes documents up to 200 pages. For longer documents, contact us.

Which languages are supported?
All languages Claude supports, with particular strength in English, Japanese, Korean, Chinese, Arabic, Thai, and European languages.

What happens to my documents?
Documents are processed and deleted. We don't store your data.

How is this different from LlamaParse?
LlamaParse converts documents to markdown for RAG. We extract structured data from specific queries. Parsing ≠ extraction.

Ready to extract?

$10 free credit.
No card required.

$ pip install proofpudding