indexation
Internet search engine for text-oriented websites. Indexing the small, old and weird web.
The aim of the project is to develop new and alternative discovery methods for the Internet. It's an experimental workshop as much as it is a public service, the overarching goal is to elevate the more human, non-commercial sides of the Internet.
Related contents:
thread-based email index, search and tagging.
Notmuch is a system for indexing, searching, reading, and tagging large collections of email messages in maildir or mh format. It uses the Xapian library to provide fast, full-text search with a convenient search syntax.
Hexagonal hierarchical geospatial indexing system.
H3 is a geospatial indexing system using a hexagonal grid that can be (approximately) subdivided into finer and finer hexagonal grids, combining the benefits of a hexagonal grid with S2's hierarchical subdivisions.
The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.
Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database. It is a great tool for adding search functionality to your web site or building your custom search engine. Sphider is small, easy to set up and modify, and is used in thousands of websites across the world.