Research
Document Understanding
How long, semi-structured, and visual documents become usable context for decisions.
Important business context often lives in PDFs, statements, filings, contracts, scans, and long notes. Document understanding studies how to parse that material while keeping structure, layout, and provenance intact.
Core Questions
- How should a system represent pages, sections, tables, images, spans, and extracted facts together?
- Which layout and visual structure must be preserved before a model summarizes or reasons over a document?
- How do we make document-derived outputs reviewable by operators and advisors?
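One way to make these questions concrete is a small sketch in which every extracted fact carries the spans that support it, so a reviewer can jump back to the source text. This is an illustrative assumption, not a description of an existing system: the names `Span`, `Fact`, `Document`, `quote`, and `unsupported` are hypothetical, and page text is modeled as a plain string where a real pipeline would also carry layout, tables, and images.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A character range on one page; the unit of provenance (hypothetical)."""
    page: int   # 1-based page number
    start: int  # inclusive character offset within the page text
    end: int    # exclusive character offset within the page text


@dataclass
class Fact:
    """An extracted value plus the spans that support it (hypothetical)."""
    name: str
    value: str
    evidence: list[Span] = field(default_factory=list)


@dataclass
class Document:
    """Page text and extracted facts kept side by side (hypothetical)."""
    pages: dict[int, str]  # page number -> raw page text
    facts: list[Fact] = field(default_factory=list)

    def quote(self, span: Span) -> str:
        """Return the exact source text a span points at."""
        return self.pages[span.page][span.start:span.end]

    def unsupported(self) -> list[Fact]:
        """Return facts with no evidence, which a reviewer cannot verify."""
        return [f for f in self.facts if not f.evidence]


# Made-up sample data for illustration only.
doc = Document(pages={1: "Total due: 1,250.00 USD"})
doc.facts.append(Fact("total_due", "1250.00", [Span(page=1, start=11, end=19)]))
doc.facts.append(Fact("currency", "USD"))  # no evidence attached
```

Here `doc.quote(doc.facts[0].evidence[0])` recovers the source text behind the normalized value, and `doc.unsupported()` surfaces facts a reviewer should not accept without a source link. Keeping the normalized value separate from the quoted span is one possible design choice; it lets the extraction clean up formatting without losing what the page actually said.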
Where It Shows Up
- Evidence-Backed Records: Parse long-form and semi-structured documents into evidence-backed records.
- Source-Linked Review: Support source-linked review for compliance, finance, and operating decisions.
- Document-to-Graph Inputs: Feed extraction and graph workflows without losing page-level context.
Artifacts
- DocUnderstand experiments.
- Document parsing and provenance notes.
- Research reads on graph-aware document extraction.