Vision Guides
A curated collection of vision guides for building with AI coding agents (currently 1 guide).
Guide
Using Vision-Language Models for OCR, Documents, and Video Understanding
How to use vision-language models for OCR, documents, and video: how they differ from traditional OCR, their failure modes, and getting reliable output.
2m read · AgentsCamp