Crux is an open source document database with bitemporal graph queries.
Crux has been available as a Public Alpha since 19th April 2019. The Public Alpha period will continue until Crux is released as a Generally Available open source software product by JUXT later in 2019.
Crux is a bitemporal database that stores transaction time and valid time histories. While a [uni]temporal database enables "time travel" querying through the transactional sequence of database states from the moment of database creation to its current state, Crux also provides "time travel" querying for a discrete valid time axis without unnecessary design complexity or performance impact. This means a Crux user can populate the database with past and future information regardless of the order in which the information arrives, and make corrections to past recordings to build an ever-improving temporal model of a given domain.
Bitemporal modelling is broadly useful for event-based architectures and is a critical requirement for systems in any industry with strong auditing regulations, where you need to be able to answer the question "what did you know and when did you know it?".
Crux supports a Datalog query interface for reading data and traversing relationships across all documents. Queries are executed so that the results are lazily streamed from the underlying indexes.
Crux is ultimately a store of versioned EDN documents. The fields within these documents are automatically indexed as Entity-Attribute-Value triples to support efficient graph queries.
Crux does not enforce any schema for the documents it stores. One reason for this is that data might come from many different places, and may not ultimately be owned by the service using Crux to query the data. This design enables schema-on-write and/or schema-on-read to be achieved outside of the core of Crux to meet the exact application requirements.
Crux—to use Martin Kleppmann’s phrase—is an unbundled database.
What do we have to gain from turning the database inside out? Simpler code, better scalability, better robustness, lower latency, and more flexibility for doing interesting things with data.
It is a database turned inside out, using:
Apache Kafka for the primary storage of transactions and documents as semi-immutable logs.
RocksDB or LMDB to host indexes for rich query support.
This decoupled design enables Crux to be maintained as a small and efficient core that can be scaled and evolved to support a large variety of use-cases.
Nodes can come and go, with local indexes stored in a Key/Value store such as RocksDB, whilst reading and writing master data to central log topics (only Kafka is currently supported). Queries are not distributed and there is no sharding of documents across nodes.
Crux can also run in a non-distributed "standalone" mode, where the transaction and document logs exist only inside of a local Key/Value store such as RocksDB. This is appropriate for non-critical usage where availability and durability requirements do not justify the need for Kafka.
Crux supports eviction of active and historical data to assist with technical compliance for information privacy regulations.
The main transaction log contains only hashes and is immutable. All document content is stored in a dedicated document topic that can be evicted by compaction.
Reading the documentation is a good way to learn about Crux, so you are in the right place! Let’s continue the journey.