A text extraction library supporting PDFs, images, office documents, and more.
Kreuzberg is a Python library for text extraction from documents. It provides a unified async interface for extracting text from PDFs, images, office documents, and more.
Docling parses documents and exports them to the desired format with ease and speed. It reads popular document formats (PDF, DOCX, PPTX, images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON.
Spend less time translating and more time on the task at hand. No matter what or where you're translating, DeepL Pro ensures it's accurate, secure, and tailored to your needs.