DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in Rust, using the Apache Arrow in-memory format.
DataFusion is well suited to building projects such as domain-specific query engines, new database platforms, data pipelines, query languages, and more. It lets you start quickly from a fully working engine and then customize the features specific to your use case.
Open Source Distributed POSIX File System for Cloud. JuiceFS is a distributed POSIX file system built on top of Redis and S3.
JuiceFS is a high-performance POSIX file system released under Apache License 2.0, particularly designed for the cloud-native environment. The data, stored via JuiceFS, will be persisted in Object Storage (e.g. Amazon S3), and the corresponding metadata can be persisted in various compatible database engines such as Redis, MySQL, and TiKV based on the scenarios and requirements.
With JuiceFS, massive cloud storage can be directly connected to big data, machine learning, artificial intelligence, and various application platforms in production environments. Without modifying code, the massive cloud storage can be used as efficiently as local storage.
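The split described above, file data in object storage and metadata in a separate database, can be illustrated with a toy model. This is only a sketch under stated assumptions, not JuiceFS code or its API: `ToyFileSystem`, `CHUNK_SIZE`, and the key layout are hypothetical, and plain Python dicts stand in for the object store (e.g. Amazon S3) and the metadata engine (e.g. Redis).

```python
CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example yields several chunks


class ToyFileSystem:
    """Toy model: file bytes go to an object store, bookkeeping to a metadata store."""

    def __init__(self, object_store, metadata_store):
        self.object_store = object_store      # stands in for an object storage bucket
        self.metadata_store = metadata_store  # stands in for a metadata database

    def write(self, path, data):
        keys = []
        for offset in range(0, len(data), CHUNK_SIZE):
            # Each fixed-size chunk becomes one object under a made-up key scheme.
            key = f"chunks/{path.strip('/')}/{offset // CHUNK_SIZE}"
            self.object_store[key] = data[offset:offset + CHUNK_SIZE]
            keys.append(key)
        # Only the file size and the ordered chunk keys live in the metadata store.
        self.metadata_store[path] = {"size": len(data), "chunks": keys}

    def read(self, path):
        meta = self.metadata_store[path]
        return b"".join(self.object_store[key] for key in meta["chunks"])


objects, metadata = {}, {}
fs = ToyFileSystem(objects, metadata)
fs.write("/a.txt", b"hello world")
restored = fs.read("/a.txt")
```

Because the two stores are independent, either one could in principle be swapped for a different backend without touching the other, which is the flexibility the description points to.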
Open source big data platform.
Trunk Data Platform is a free, open source Hadoop distribution.
XetHub brings speedy access and Git-based collaboration to large scale repositories of data, code, or any combination of files.
Our instant mount feature makes it possible to access GBs and TBs of data in seconds at the speed of localhost, while our de-duplication algorithm stores data and differences efficiently to save money and speed up development cycles.
XetHub is ideal for teams who already use Git to track their code changes, and want to leverage the power of infinite history, pull requests, and difference-based tracking for larger assets such as datasets or media files. Managing complete projects with familiar Git semantics makes change tracking and continuous integration a breeze, especially for workflows that use code to generate or augment assets.
Graph Database Management System.
Neo4j Graph Data Platform. Blazing-Fast Graph, Petabyte Scale.
With proven trillion+ entity performance, developers, data scientists, and enterprises rely on Neo4j as the top choice for high-performance, scalable analytics, intelligent app development, and advanced AI/ML pipelines.
Daily Earth Data to See Change and Make Better Decisions.
Planet provides daily satellite data that helps businesses, governments, researchers, and journalists understand the physical world and take action.
Climate TRACE was built to collect and share data on greenhouse gas emissions from anthropogenic (human) activities to facilitate climate action.
Robtex is used for various kinds of research on IP numbers, domain names, etc.
Robtex uses various sources to gather public information about IP numbers, domain names, host names, autonomous systems, routes, etc. It then indexes the data in a large database and provides free access to it.
We aim to make the fastest and most comprehensive free DNS lookup tool on the Internet.
Our database now contains billions of documents of internet data collected over more than a decade.
The scalable, open source big data analytics platform for networks and services.
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.
Use Apache HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
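The "versioned, non-relational" table model mentioned above can be pictured as a sparse map from row key to column to timestamped values. The following is a minimal in-memory sketch of that idea only; `ToyVersionedTable` and its methods are hypothetical names for illustration and are not the HBase client API.

```python
from collections import defaultdict


class ToyVersionedTable:
    """Toy model: row key -> column -> list of (timestamp, value), newest first."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # Sparse storage: only cells that were actually written take up space.
        self._rows = defaultdict(dict)

    def put(self, row, column, value, timestamp):
        versions = self._rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version first and drop anything beyond the limit.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._rows.get(row, {}).get(column, [])
        for ts, value in versions:
            if timestamp is None or ts <= timestamp:
                return value
        return None


table = ToyVersionedTable(max_versions=2)
table.put("user#42", "info:email", "old@example.com", timestamp=1)
table.put("user#42", "info:email", "new@example.com", timestamp=2)
latest = table.get("user#42", "info:email")
as_of_1 = table.get("user#42", "info:email", timestamp=1)
```

Reading without a timestamp returns the most recent cell value, while passing a timestamp reads the table as it looked at that point, which is what "versioned" refers to here.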
D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.