Bigdata®
Bigdata® is a horizontally scaled storage and computing fabric supporting optional transactions, very high concurrency, and very high aggregate IO rates. Bigdata® was designed from the ground up as a distributed database architecture running over clusters of hundreds to thousands of machines, but it can also run in a high-performance single-server mode.
- Petabyte scale
- Dynamic sharding
- Commodity Hardware
- Open Source / Java
- Temporal database
- High Performance
- High Concurrency (MVCC)
- High Availability
The bigdata® architecture provides a high-performance platform for data-intensive distributed computing, indexing, and high-level query on commodity clusters. While the semantic web database layer has received the most attention, the bigdata® architecture is well suited for a wide range of data models, workloads, and applications.
Bigdata® RDF Database
Bigdata® includes a high-performance RDF database supporting RDFS and limited OWL inference. The Bigdata® RDF database features fast load throughput and best-in-class query performance. It is the only RDF database capable of distributed operations on a cluster with dynamic key-range sharding of indices. This means that your deployed footprint (number of nodes) can grow incrementally with your data scale without reloading your data each time you add new machines.
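To illustrate why dynamic key-range sharding lets a deployment grow without a reload, here is a minimal Java sketch. It is not the bigdata® implementation: the class name, the split threshold, and the use of in-memory TreeMaps as stand-ins for index shards are all assumptions made for illustration. Each shard owns a contiguous range of keys, and when a shard grows too large it is split at its median key, so only that one shard is reorganized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy illustration of dynamic key-range sharding (hypothetical, not bigdata's code). */
class KeyRangeShardingSketch {
    // Assumed limit: a shard is split once it holds more than this many keys.
    static final int MAX_KEYS_PER_SHARD = 4;

    // Maps the inclusive lower bound of each key range to the shard holding that range.
    private final TreeMap<String, TreeMap<String, String>> shards = new TreeMap<>();

    KeyRangeShardingSketch() {
        // Start with a single shard covering the whole key space ("" sorts before any key).
        shards.put("", new TreeMap<>());
    }

    /** Find the shard whose key range contains the given key. */
    private Map.Entry<String, TreeMap<String, String>> locate(String key) {
        return shards.floorEntry(key);
    }

    void put(String key, String value) {
        TreeMap<String, String> shard = locate(key).getValue();
        shard.put(key, value);
        if (shard.size() > MAX_KEYS_PER_SHARD) {
            split(shard);
        }
    }

    String get(String key) {
        return locate(key).getValue().get(key);
    }

    /** Split one shard at its median key; no other shard is touched or reloaded. */
    private void split(TreeMap<String, String> shard) {
        List<String> keys = new ArrayList<>(shard.keySet());
        String splitKey = keys.get(keys.size() / 2);
        // Copy the upper half into a new shard, then remove it from the old one.
        TreeMap<String, String> upper = new TreeMap<>(shard.tailMap(splitKey, true));
        shard.tailMap(splitKey, true).clear();
        shards.put(splitKey, upper);
    }

    int shardCount() {
        return shards.size();
    }

    public static void main(String[] args) {
        KeyRangeShardingSketch index = new KeyRangeShardingSketch();
        for (int i = 0; i < 10; i++) {
            index.put(String.format("k%02d", i), "v" + i);
        }
        System.out.println("shards: " + index.shardCount());
        System.out.println("k07 -> " + index.get("k07"));
    }
}
```

In a real cluster the new shard could then be moved to a newly added machine, which is the property the paragraph above describes. The sketch leaves out everything that makes this hard in practice, such as persistence, concurrent writers, and shard location metadata.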
The Bigdata® RDF Database was designed specifically for very large-scale semantic alignment and federation of disparate data sets. With its flexible data model, RDF is a Semantic Web technology particularly well suited to near real-time data integration, and bigdata® allows you to tackle your data integration problems at scale.
The Bigdata® RDF Database provides the core features for a semantic web data tier, including:
- SPARQL
- RDFS+ inference
- Fast load and query
- Support for triples, triples plus statement-level provenance, or quads
- Custom rules for additional inferences, SNA, etc.
- Simple full-text indexer suitable for entity matching, plus integration hooks for full-text search and indexing using Lucene
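As a rough illustration of the difference between triples and quads, the Java sketch below models a statement as subject, predicate, and object plus an optional context (the fourth position of a quad). It is not the bigdata® API: the class and method names are hypothetical, and the linear scan stands in for the indexed lookups a real RDF database would use. A null argument acts as a wildcard, similar in spirit to a variable in a SPARQL triple pattern.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy statement model: a triple is a quad with no context (hypothetical, not bigdata's API). */
class StatementModelSketch {

    /** One RDF statement; the context c is null when the store is used in triples mode. */
    static final class Statement {
        final String s, p, o, c;

        Statement(String s, String p, String o, String c) {
            this.s = s;
            this.p = p;
            this.o = o;
            this.c = c;
        }
    }

    private final List<Statement> statements = new ArrayList<>();

    void add(String s, String p, String o, String c) {
        statements.add(new Statement(s, p, o, c));
    }

    /** Return all statements matching the pattern; a null argument matches anything. */
    List<Statement> match(String s, String p, String o, String c) {
        List<Statement> out = new ArrayList<>();
        for (Statement st : statements) {
            if (matches(s, st.s) && matches(p, st.p) && matches(o, st.o) && matches(c, st.c)) {
                out.add(st);
            }
        }
        return out;
    }

    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }

    public static void main(String[] args) {
        StatementModelSketch store = new StatementModelSketch();
        store.add(":alice", ":knows", ":bob", ":graph1");   // quad
        store.add(":alice", ":knows", ":carol", ":graph2"); // quad
        store.add(":bob", ":knows", ":carol", null);        // plain triple

        // Everything :alice knows, in any graph.
        for (Statement st : store.match(":alice", ":knows", null, null)) {
            System.out.println(st.s + " " + st.p + " " + st.o + " " + st.c);
        }
    }
}
```

The context position is what makes it possible to ask which source asserted a statement, which is useful when federating disparate data sets.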
Performance
Bigdata® is a high-performance platform. The standalone database scales to 50 billion triples or quads on a single Journal (one node). Performance on some standard benchmarks is reported on the project wiki and periodically on the bigdata blog.
Licensing
Bigdata® is freely available under an open-source license (GPL v2). It is also available under an evaluation / research license (pdf).
If you are interested in bundling bigdata® with your product as an OEM using commercial (non-GPL) licensing, please contact us directly.
SYSTAP, LLC does not offer commercial end-user licenses directly. However, if you would like a commercial license and support for a semantic web application server platform bundling bigdata®, we can redirect your inquiry to a value-added OEM reseller.
Open Source Support
Community-based support is available through an online forum. There is also a project wiki.
Please contact us directly for more information about open-source support subscriptions.
Community
We are always looking for people with a background in databases and the semantic web and a passion for open source. If you are interested in contributing to the bigdata® open source project and joining our growing community, please contact the project administrators. Also, please see the issue tracker for an overview of our roadmap and an idea of where you might be able to contribute.
Before you can join as a contributor, you must sign the bigdata® Open Source Contributor Agreement (CA). The bigdata® CA follows open source best practices and is designed to protect everyone by ensuring that all contributions are appropriately authorized. There is one version of the CA for the individual developer and one for an employer. Since most employee contracts include IP clauses, you must have your employer sign the Bigdata Corporate CA (pdf) before you can sign the Individual CA (pdf) yourself.
Interesting Links
- Blog
- Bigdata Architecture Whitepaper (draft)
- Bigdata High Availability Whitepaper (draft)
- SourceForge Project
- Help Forum
- Getting Started Guide
- Roadmap
- Javadoc
- bigdata presented at OSCON 2008
Bigdata® is a registered trademark of SYSTAP, LLC. SYSTAP takes great care in the development and protection of its trademarks and reserves all rights of ownership of its trademarks. No other company may use SYSTAP's trademarks unless it has the express written permission of SYSTAP, or is licensed by SYSTAP to do so.