Open Time Store™
Copyright © JUXT LTD 2018-2019

To start a Crux node, use the Java API or the Clojure crux.api.

In Clojure, call start-node from crux.api, passing it a map of options for the node. A Crux node has a number of configuration options, grouped into topologies.

Table 1. Crux Topologies
Name        Transaction Log       Topology
Standalone  Uses local event log  :crux.standalone/topology
Kafka       Uses Kafka            :crux.kafka/topology
JDBC        Uses JDBC event log   :crux.jdbc/topology

Use a Kafka node when horizontal scalability is required or when you want the guarantees that Kafka offers in terms of resiliency, availability, and retention of data.

Multiple Kafka nodes participate in a cluster with Kafka as the primary store and as the central means of coordination.

The JDBC node is useful when you don’t want the overhead of maintaining a Kafka cluster. Read more about the motivations of this setup here.

The Standalone node is a single Crux instance which has everything it needs locally. This is good for experimenting with Crux and for small to medium-sized deployments where running a single instance is acceptable.

Crux nodes implement the ICruxAPI interface and are the starting point for making use of Crux. Nodes also implement java.io.Closeable and can therefore be lifecycle managed.
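
Because a node is a java.io.Closeable, Clojure's with-open can manage its lifecycle. The sketch below assumes the in-memory standalone options used elsewhere on this page; the println body is only a placeholder for real work.

```clojure
(require '[crux.api :as crux])

;; with-open calls .close on the node when the body exits,
;; whether it returns normally or throws.
(with-open [node (crux/start-node
                   {:crux.node/topology :crux.standalone/topology
                    :crux.node/kv-store "crux.kv.memdb/kv"
                    :crux.kv/db-dir "data/db-dir-1"
                    :crux.standalone/event-log-dir "data/eventlog-1"
                    :crux.standalone/event-log-kv-store "crux.kv.memdb/kv"})]
  ;; Placeholder: use the node here. It must not be used after this body exits.
  (println "Node started:" (class node)))
```

This form suits scripts and tests. A long-running service would more likely keep the node in a var or a component system and call .close on shutdown.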

Properties

The following properties belong to crux.node, the base topology on which the other topologies build:

Table 2. crux.node configuration
Property                 Default Value
:crux.node/kv-store      'crux.kv.rocksdb/kv
:crux.node/object-store  'crux.object-store/kv-object-store

The following set of options are used by KV backend implementations, defined within crux.kv:

Table 3. crux.kv options
Property                                Description                                   Default Value
:crux.kv/db-dir                         Directory to store K/V files                  data
:crux.kv/sync?                          Sync the KV store to disk after every write?  false
:crux.kv/check-and-store-index-version  Check and store index version upon start?     true
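
As an illustration of how these options combine, the following sketch builds an options map that overrides the crux.kv defaults and merges it into a standalone topology. The directory names are examples only, and the var names kv-options and standalone-options are hypothetical.

```clojure
;; KV options from Tables 2 and 3, with sync? switched on.
;; The kv-store is written as a string, as in this page's other examples.
(def kv-options
  {:crux.node/kv-store "crux.kv.rocksdb/kv"
   :crux.kv/db-dir "data/db-dir-1"                 ; example path
   :crux.kv/sync? true                             ; default is false
   :crux.kv/check-and-store-index-version true})   ; default, shown for clarity

;; Merge the KV options into a topology map that can be passed to start-node.
(def standalone-options
  (merge {:crux.node/topology :crux.standalone/topology
          :crux.standalone/event-log-dir "data/eventlog-1"}
         kv-options))
```

Setting :crux.kv/sync? to true presumably trades some write throughput for stronger durability, so it is worth measuring before enabling it everywhere.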

Standalone Node

Using a Crux standalone node is the best way to get started. Once you’ve started a standalone Crux instance as described below, you can then follow the getting started example.

Local Standalone Mode
Table 4. Standalone configuration
Property                             Description                                                                                Default Value
:crux.standalone/event-log-kv-store  Key/Value store to use for standalone event-log persistence                                'crux.kv.rocksdb/kv
:crux.standalone/event-log-dir       Directory used to store the event-log and used for backup/restore, e.g. "data/eventlog-1"
:crux.standalone/event-log-sync?     Sync the event-log backend KV store to disk after every write?                             false

Project Dependency

  juxt/crux-core {:mvn/version "19.09-1.5.0-alpha"}

Getting started

The following code creates a node which runs completely within memory (with both the event-log store and db store using crux.kv.memdb/kv):

(require '[crux.api :as crux])
(import (crux.api ICruxAPI))

;; Start a standalone node whose index store and event log both use the in-memory KV store.
(def ^crux.api.ICruxAPI node
  (crux/start-node {:crux.node/topology :crux.standalone/topology
                    :crux.node/kv-store "crux.kv.memdb/kv"
                    :crux.kv/db-dir "data/db-dir-1"
                    :crux.standalone/event-log-dir "data/eventlog-1"
                    :crux.standalone/event-log-kv-store "crux.kv.memdb/kv"}))

You can later stop the node if you wish:

(.close node)

RocksDB

RocksDB is used by default as Crux’s primary store (in place of the in-memory KV store in the example above). To use RocksDB within Crux, however, you must first add RocksDB as a project dependency:

Project Dependency

  juxt/crux-rocksdb {:mvn/version "19.09-1.5.0-alpha"}
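
The {:mvn/version ...} coordinate form suggests a deps.edn project. As a sketch, assuming a deps.edn file at the project root (the file layout is an assumption, not something this page states), the core and RocksDB dependencies would sit together under :deps:

```clojure
;; deps.edn (hypothetical project file)
{:deps {juxt/crux-core    {:mvn/version "19.09-1.5.0-alpha"}
        juxt/crux-rocksdb {:mvn/version "19.09-1.5.0-alpha"}}}
```

Keeping both artifacts on the same version avoids mixing releases.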

Starting a node using RocksDB

(def ^crux.api.ICruxAPI node
  (crux/start-node {:crux.node/topology :crux.standalone/topology
                    :crux.node/kv-store "crux.kv.rocksdb/kv"
                    :crux.kv/db-dir "data/db-dir-1"
                    :crux.standalone/event-log-dir "data/eventlog-1"}))

Kafka Nodes

When using Crux at scale it is recommended to use multiple Crux nodes connected via a Kafka cluster.

Local Cluster Mode

Kafka nodes have the following properties:

Table 5. Kafka node configuration
Property                           Description                                                               Default Value
:crux.kafka/bootstrap-servers      URL for connecting to Kafka                                               localhost:9092
:crux.kafka/tx-topic               Name of Kafka transaction log topic                                       crux-transaction-log
:crux.kafka/doc-topic              Name of Kafka documents topic                                             crux-docs
:crux.kafka/create-topics          Option to automatically create Kafka topics if they do not already exist  true
:crux.kafka/doc-partitions         Number of partitions for the document topic                               1
:crux.kafka/replication-factor     Number of times to replicate data on Kafka                                1
:crux.kafka/group-id               Kafka client group.id                                                     (Either environment variable HOSTNAME, COMPUTERNAME, or a random UUID)
:crux.kafka/kafka-properties-file  File to supply Kafka connection properties to the underlying Kafka API
:crux.kafka/kafka-properties-map   Map to supply Kafka connection properties to the underlying Kafka API

Project Dependencies

  juxt/crux-core {:mvn/version "19.09-1.5.0-alpha"}
  juxt/crux-kafka {:mvn/version "19.09-1.5.0-alpha"}

Getting started

Use the API to start a Kafka node, configuring it with the bootstrap-servers property in order to connect to Kafka:

(def ^crux.api.ICruxAPI node
  (crux/start-node {:crux.node/topology :crux.kafka/topology
                    :crux.node/kv-store "crux.kv.memdb/kv"
                    :crux.kafka/bootstrap-servers "localhost:9092"}))

Note: If you don’t specify kv-store, the Kafka node uses RocksDB by default. You will need to add RocksDB to your list of project dependencies.

You can later stop the node if you wish:

(.close node)
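
For a multi-broker cluster, more of the Kafka node properties listed above would typically be set. The following is a sketch only: the broker host names, the partition and replication counts, and the contents of kafka-properties-map are illustrative assumptions, and the exact value types Crux accepts for each key should be checked against the release in use.

```clojure
(def kafka-options
  {:crux.node/topology :crux.kafka/topology
   :crux.node/kv-store "crux.kv.rocksdb/kv"        ; requires the crux-rocksdb dependency
   :crux.kv/db-dir "data/db-dir-1"
   ;; Hypothetical broker addresses.
   :crux.kafka/bootstrap-servers "kafka-1:9092,kafka-2:9092,kafka-3:9092"
   :crux.kafka/tx-topic "crux-transaction-log"     ; default name, shown explicitly
   :crux.kafka/doc-topic "crux-docs"               ; default name, shown explicitly
   :crux.kafka/doc-partitions 3                    ; illustrative value
   :crux.kafka/replication-factor 3                ; illustrative value
   ;; Assumed shape: Kafka client property names mapped to string values.
   :crux.kafka/kafka-properties-map {"security.protocol" "SSL"}})

;; Pass the map to crux.api/start-node as in the example above:
;; (crux/start-node kafka-options)
```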

Embedded Kafka

Crux can work with an embedded Kafka for when you don’t have an independently running Kafka to connect to (such as during development).

Project Dependencies

  juxt/crux-core {:mvn/version "19.09-1.5.0-alpha"}
  juxt/crux-kafka-embedded {:mvn/version "19.09-1.5.0-alpha"}

Getting started

(require '[crux.kafka.embedded :as ek])

(def storage-dir "dev-storage")
(def embedded-kafka-options
  {:crux.kafka.embedded/zookeeper-data-dir (str storage-dir "/zookeeper")
   :crux.kafka.embedded/kafka-log-dir (str storage-dir "/kafka-log")
   :crux.kafka.embedded/kafka-port 9092})

(def embedded-kafka (ek/start-embedded-kafka embedded-kafka-options))

You can later stop the Embedded Kafka if you wish:

(.close embedded-kafka)
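
To tie the two pieces together, the sketch below starts an embedded Kafka and a Crux Kafka node that connects to it, then closes both. It assumes that juxt/crux-kafka is on the classpath alongside the dependencies listed above, and that the embedded broker is reachable at localhost on the configured port.

```clojure
(require '[crux.api :as crux]
         '[crux.kafka.embedded :as ek])

(def storage-dir "dev-storage")

;; with-open closes its bindings in reverse order:
;; the Crux node first, then the embedded Kafka.
(with-open [embedded-kafka (ek/start-embedded-kafka
                             {:crux.kafka.embedded/zookeeper-data-dir (str storage-dir "/zookeeper")
                              :crux.kafka.embedded/kafka-log-dir (str storage-dir "/kafka-log")
                              :crux.kafka.embedded/kafka-port 9092})
            node (crux/start-node
                   {:crux.node/topology :crux.kafka/topology
                    :crux.node/kv-store "crux.kv.memdb/kv"
                    :crux.kafka/bootstrap-servers "localhost:9092"})]
  ;; Placeholder body: use the node here.
  (println "Crux node running against embedded Kafka:" (class node)))
```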

JDBC Nodes

JDBC Nodes use next.jdbc internally and pass through the relevant configuration options that you can find here.

Local Cluster Mode

Below is the minimal configuration you will need:

Table 6. Minimal JDBC Configuration
Property           Description
:crux.jdbc/dbtype  One of: postgresql, oracle, mysql, h2, sqlite
:crux.jdbc/dbname  Database Name

Depending on the type of JDBC database used, you may also need some of the following properties:

Table 7. Other JDBC Properties
Property             Description
:crux.kv/db-dir      For h2 and sqlite
:crux.jdbc/host      Database Host
:crux.jdbc/user      Database Username
:crux.jdbc/password  Database Password

Project Dependencies

  juxt/crux-core {:mvn/version "19.09-1.5.0-alpha"}
  juxt/crux-jdbc {:mvn/version "19.09-1.5.0-alpha"}

Getting started

Use the API to start a JDBC node, configuring it with the required parameters:

(def ^crux.api.ICruxAPI node
  (crux/start-node {:crux.node/topology :crux.jdbc/topology
                    :crux.jdbc/dbtype "postgresql"
                    :crux.jdbc/dbname "cruxdb"
                    :crux.jdbc/host "<host>"
                    :crux.jdbc/user "<user>"
                    :crux.jdbc/password "<password>"}))
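
For a file-based database such as h2, the tables above suggest that :crux.kv/db-dir takes the place of the host and credentials. The following is a sketch under that reading; the database name and directory are examples, and a JDBC driver for the chosen database is presumably also needed on the classpath.

```clojure
(def h2-options
  {:crux.node/topology :crux.jdbc/topology
   :crux.jdbc/dbtype "h2"
   :crux.jdbc/dbname "cruxdb"          ; example name
   :crux.kv/db-dir "data/db-dir-1"})   ; listed above as needed for h2 and sqlite

;; Pass the map to crux.api/start-node as in the example above:
;; (crux/start-node h2-options)
```

Since no :crux.node/kv-store is given, the node would fall back to the RocksDB default, so the crux-rocksdb dependency would be needed as well.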

HTTP

Crux can be used programmatically as a library, but it also ships with an embedded HTTP server that allows clients to use the API remotely via REST.

Remote Cluster Mode

Set the server-port configuration property on a Crux node to expose an HTTP port that will accept REST requests:

Table 8. HTTP Nodes Configuration
Component    Property     Description
http-server  server-port  Port for Crux HTTP Server, e.g. 8080

Visit the guide on using the REST API for examples of how to interact with Crux over HTTP.

Docker

If you want to experiment with Crux using a demo Docker container from Docker Hub (no JVM/JDK/Clojure install required, only Docker!) then please see the standalone web service example. You can also use this self-contained demonstration image to experiment with the REST API.

Backup and Restore

Crux provides utility APIs for local backup and restore when you are using the standalone mode. For an example of usage, see the standalone web service example.

An additional example of backup and restore, which applies only to a stopped standalone node, is provided here.

In a clustered deployment, only Kafka’s official backup and restore functionality should be relied on to provide safe durability. The standalone mode’s backup and restore operations can instead be used for creating operational snapshots of a node’s indexes for scaling purposes.