Extract structured data from PDFs.
Stop wasting time extracting PDFs.
Transform your PDF documents into structured data with Documind. Simple, powerful and open-source.
Documind is an advanced document processing tool that leverages AI to extract structured data from PDFs. It is built to handle PDF conversions, extract relevant information, and format results as specified by customizable schemas.
MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format. MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. Compared to well-known commercial products, MinerU is still young. If you encounter any issues or if the results are not as expected, please submit an issue and attach the relevant PDF.
Docling parses documents and exports them to the desired format with ease and speed.
🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON.
I, Librarian is an online service that will organize your collection of PDF papers and office documents. It provides a lot of extra features for students and research groups both in industry and academia. It is a reference manager, PDF manager and organizer focused on private group collaboration.
The Data Processor for Agents.
Marly allows your agents to extract tables & text from your PDFs, PowerPoints, etc. in a structured format, making it easy for them to take subsequent actions (database call, API call, creating a chart, etc.).
Create Interactive Flipbooks on our Digital Publishing Platform.
Issuu turns PDFs and other file types into digital Flipbooks and shareable content types. Upload a document, watch it transform, and enhance it with interactive features like Videos and Links. Easily share the URL, Embed it onto your website, and sell content with Digital Sales. Promote your work across all channels with Social Posts, Articles, and GIFs.
Open Source Document Signing. Open source DocuSign alternative. Create, fill, and sign digital documents ✍️
DocuSeal is an open source platform that provides secure and efficient digital document signing and processing. Create PDF forms to have them filled and signed online on any device with an easy-to-use, mobile-optimized web tool.
PDF processor API & CLI.
pdfcpu is a PDF processing library written in Go that supports encryption and offers both an API and a command-line interface (CLI). It is compatible with all PDF versions, with basic support and ongoing improvement for PDF 2.0 (ISO-32000-2).
A post-processing tool for scanned sheets of paper.
unpaper is a post-processing tool for scanned sheets of paper, especially for book pages that have been scanned from previously created photocopies. Its main purpose is to make scanned book pages more readable on screen after conversion to PDF. Additionally, unpaper might be useful to enhance the quality of scanned pages before performing optical character recognition (OCR).
OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched.
PdfDing is a self-hosted PDF manager and viewer offering a seamless user experience on multiple devices. It's designed to be minimal, fast, and easy to set up using Docker.
PdfDing is a PDF manager and viewer that you can host yourself. It offers a seamless user experience on multiple devices. It's designed to be minimal, fast, and easy to set up using Docker. As all data stays on your server, you have full control over your data and privacy.
With its simple, intuitive and adjustable UI, PdfDing makes it easy for users to keep track of their PDFs and access them whenever they need to. With a dark mode and colored themes, users can style the app to their liking. As PdfDing offers SSO support via OIDC, it can be easily integrated into existing setups.
The Free & Open Source Alternative to DocuSign.
Seal the Deal, Openly. Your ultimate open source PDF E-Signature Solution.
Transform the Way You Sign, Store, and Secure Your Documents. All in One Place - All for Free.
Free web software for signing, organizing, editing metadata, or compressing PDFs.
The leading HTML5 client solution for generating PDFs.
Transform your PDF generation process for your event tickets, reports, certificates, and more.
Client-side JavaScript PDF generation for everyone.
Convert PDF to markdown quickly with high accuracy.
mPDF is a PHP library which generates PDF files from UTF-8 encoded HTML.
It is based on FPDF and HTML2FPDF with a number of enhancements.
dompdf is an HTML to PDF converter.
At its heart, dompdf is (mostly) a CSS 2.1 compliant HTML layout and rendering engine written in PHP. It is a style-driven renderer: it will download and read external stylesheets, inline style tags, and the style attributes of individual HTML elements. It also supports most presentational HTML attributes.