| Mistral OCR 4 (mistral.ai) | |
| 473 points by meetpateltech 23 hours ago | 126 comments | |
tl;dr: Mistral released OCR 4, a document extraction model that returns bounding boxes, typed block classifications (tables, equations, signatures, etc.), and per-word confidence scores across 170 languages, deployable in a single self-hosted container. It claims top scores on OlmOCRBench (85.20) and 72% win rates in human preference tests against competitors, though Mistral notes benchmark scoring artifacts inflate apparent errors on math and multi-column docs. Pricing is $4/1k pages via API ($2 batch), $5/1k for Document AI, available through Mistral Studio, AWS SageMaker, and Microsoft Foundry. | |
HN Discussion: