Open Time Store™
Copyright © JUXT LTD 2018-2019

Philosophy

Crux is an unbundled database, where parts are pluggable and can be swapped out for alternative implementations. We are attempting to follow the Unix philosophy of each part doing one thing well.

This document walks through some of the high-level Crux components.

ObjectStore

Crux contains an ObjectStore for storing and retreiving documents.

Unresolved directive in design.adoc - include::../src/crux/db.clj[tags=ObjectStore]

The main implementation is KvObjectStore wrapped by CachedObjectstore.

Note
Objects are stored against used their hashed value.
Note
The KvObjectStore wraps the lower-level KvStore

KvStore

Unresolved directive in design.adoc - include::../src/crux/kv.clj[tags=KvStore]

The KvStore exposes the operations Crux needs to use a Key Value store for indexing purposes, and for storing Objects.

Implementations of the KvStore include:

Table 1. KvStores
Implementation Description

crux.kv.rocksdb/RocksKv

Uses RocksDB and the standard Java API that ships with RocksDB

crux.kv.rocksdb.jnr/RocksJNRKv

Uses RocksDB and a custom built JNR bridge

crux.kv.lmdb/LMDBKv

Uses LMBD via lwjql

crux.kv.lmdb.jnr.LMDBJNRKv

Uses LMDB via lmdbjava

crux.kv.memdb/MemKv

An in-memory KvStore

TxLog

Crux uses a replayable transaction log that is used to derive indicies.

Unresolved directive in design.adoc - include::../src/crux/db.clj[tags=TxLog]

The transaction log is used to write both transaction and documents to an appendable transaction log.

Table 2. TxLogs
Implementation Description

crux.kafka.KafkaTxLog

(default), using Kafka with separate Kafka topics for both documents and transactions

crux.tx.EventTxLog

Uses the KvStore for the transaction log in standalone mode, backed by moberg. This is useful when Kafka isn’t desired, and the topology of having distributed Crux nodes feeding of a centralised transaction log isn’t required.

Indexer

The indexer indexes documents and transactions for query purposes. This might be on the back of subscriptions to Kafka topics, or by direct calls made in a single node topological setup.

Unresolved directive in design.adoc - include::../src/crux/db.clj[tags=Indexer]

The single implememention is crux.tx.KvIndexer, which makes use of the KvStore to persist indices.

Query Design

Index

Crux has the fundamental notion of an index.

Unresolved directive in design.adoc - include::../src/crux/db.clj[tags=Index]

The two operations are seek and next.

Layered Index

The layered index exists to faciliate the idea of navigating up and down an index, in a tree like manner.

Unresolved directive in design.adoc - include::../src/crux/db.clj[tags=LayeredIndex]

For example the index attribute+value+entity+content-hash is the following tree:

diagram classes

open-level gives instructions to open and move down a level. In the above example it could be moving the index down to point at the values within a given attribute. That is to say that if we have an attribute :name, the index will iterate across all values for that attribute, until there are no more name values.

close-level moves the index back-up, so in the above example, we can iterate at the higher level of attribute.

Virtual Index

A Virtual Index comprises together multiple child indices. This is to join indices together, returning key/value pairs on where they match.

A join condition in a query could be reflected by a Virtual Index. A Virtual Index will maintain state as to where Index is currently positioned.

UnaryJoinVirtualIndex comprises of multiple child indices and implements both Index and LayeredIndex. Calling seek-values on it will advance all the child indices internally until they all contain the same key. This involves calling seek-values on each child index until the indices match at the same level. Calling next-values would move all the indices along until the next common key that all the indices share.

Table 3. Virtual Index Example

a

0

1

3

4

5

6

b

0

2

3

5

c

2

3

4

5

6

In the above example, where UnaryJoinVirtualIndex joins three RelationVirtualIndexs (a,b,c). Calling seek-values would return 3 as the first value found.

Calling next-values would jump ahead to the value 5.