System Architecture

A deep dive into the internals of SoliDB — a high-performance, distributed, multi-model database built with Rust.

Overview

SoliDB is engineered for performance, scalability, and developer experience. It combines document storage, graph traversal, full-text search, and time-series capabilities in a single, cohesive system.

  • 40K+ lines of Rust
  • 14+ core modules
  • 5 index types
  • 100+ SDBQL functions

  • API Layer: Axum + Tokio
  • Cluster Layer: Sharding + Sync
  • Query Layer: SDBQL Engine
  • Storage Layer: RocksDB

Core Features

Document Store

Schema-free JSON documents with optional JSON Schema validation. Auto-generated keys (UUIDv7) and atomic updates.

Graph Database

Native graph traversal with SDBQL. Edge collections, vertex relationships, and path queries.

Full-Text Search

Built-in fulltext indexing with n-gram tokenization and fuzzy matching support.

Geo Spatial

Geo indexes with haversine distance, radius queries, and nearest-neighbor search.

TTL Indexes

Time-to-live indexes for automatic document expiration. Perfect for caching and sessions.

Live Queries

Real-time WebSocket subscriptions with SDBQL. Push updates on document changes.

Transactions

ACID transactions with WAL (Write-Ahead Log). Commit, rollback, and isolation levels.

Job Queues

Built-in priority queues with scheduling, retries, and dead-letter handling.

Time Series

Optimized storage for time-series data with time-range pruning and downsampling aggregations.

Lua Scripting

Server-side Lua scripts for custom logic, triggers, and stored procedures.

Columnar Storage

Column-oriented storage engine with LZ4 compression. Optimized for analytics, aggregations, and report generation.

Rust Internals

SoliDB is written in Rust for memory safety, zero-cost abstractions, and fearless concurrency. This section explains how the Rust codebase is organized and how data flows through the system.

Query Execution Flow

HTTP Request (handlers.rs) → SDBQL Parser (lexer + parser) → Executor (executor.rs, 297KB) → RocksDB (collection.rs)

Core Modules

sdbql/

Query language implementation: lexer, parser, AST, and executor (the core executor alone is 297KB).

lexer.rs · parser.rs · ast.rs · executor.rs

storage/

RocksDB-backed persistence with collections, indexes, and TTL management (125KB collection module).

engine.rs · collection.rs · indexes/

server/

Axum HTTP API and WebSocket handlers (241KB handlers module). Handles routing, authentication, and real-time subscriptions.

router.rs · handlers.rs · auth.rs

cluster/

Node coordination with Hybrid Logical Clocks (HLC) for distributed timestamp ordering.

manager.rs · hlc.rs · health.rs

sync/

Master-master replication worker. Handles change propagation, blob sync, and conflict resolution.

worker.rs · transport.rs · blob.rs

sharding/

Horizontal partitioning with automatic rebalancing. The shard coordinator is 151KB.

coordinator.rs · router.rs · migrate.rs

transaction/

ACID transactions with configurable isolation levels and Write-Ahead Log (WAL) support.

mod.rs · wal.rs · isolation.rs

scripting/

Sandboxed, embedded Lua 5.4 runtime for custom endpoints, triggers, and stored procedures.

mod.rs · sandbox.rs · api.rs

driver/

MessagePack binary protocol for high-performance clients. Also used for internal cluster communication.

mod.rs · protocol.rs · client.rs

Architectural Patterns

Multiplexed Port

SoliDB uses a single port for all protocols via protocol detection:

  • HTTP — REST API, SDBQL queries
  • WebSocket — Live queries, chat
  • Driver — MessagePack binary
  • Cluster — Inter-node sync
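
One common way to multiplex a port is to peek at the first bytes of each connection and dispatch on what they look like. The sketch below illustrates that idea only; the magic prefixes `SDRV` and `SCLU`, the `Protocol` enum, and `detect_protocol` are hypothetical names invented for this example, not SoliDB's actual handshake.

```rust
/// Protocols that share the single listening port.
#[derive(Debug, PartialEq, Eq)]
pub enum Protocol {
    Http,
    WebSocket,
    Driver,
    Cluster,
    Unknown,
}

// Hypothetical magic prefixes, chosen only for this sketch.
const DRIVER_MAGIC: &[u8] = b"SDRV";
const CLUSTER_MAGIC: &[u8] = b"SCLU";

const HTTP_METHODS: [&[u8]; 7] = [
    b"GET ", b"POST ", b"PUT ", b"DELETE ", b"PATCH ", b"HEAD ", b"OPTIONS ",
];

/// Classify a connection from the first bytes peeked off the socket.
pub fn detect_protocol(peeked: &[u8]) -> Protocol {
    if peeked.starts_with(DRIVER_MAGIC) {
        return Protocol::Driver;
    }
    if peeked.starts_with(CLUSTER_MAGIC) {
        return Protocol::Cluster;
    }
    if HTTP_METHODS.iter().any(|m| peeked.starts_with(m)) {
        // A WebSocket connection starts life as an HTTP GET with an Upgrade header.
        let head = String::from_utf8_lossy(peeked).to_ascii_lowercase();
        if head.contains("upgrade: websocket") {
            return Protocol::WebSocket;
        }
        return Protocol::Http;
    }
    Protocol::Unknown
}
```

In a real server the peeked bytes would come from a non-consuming read on the accepted socket, so the chosen handler still sees the full stream.
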
Async Runtime

Tokio async runtime with work-stealing scheduler:

  • async/.await — Non-blocking I/O
  • spawn_blocking — CPU-intensive ops
  • Arc<RwLock<T>> — Shared state
  • tokio::sync — Channels, semaphores
Serialization

Serde-based serialization throughout:

  • JSON — REST API, storage
  • MessagePack — Driver protocol
  • Bincode — Internal storage
  • Postcard — Embedded scenarios
Error Handling

Unified error handling pattern:

pub enum DbError { ... }
pub type DbResult<T> = Result<T, DbError>;
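
To show how such a pattern typically works in practice, here is a minimal sketch using only the standard library. The three variants and the `parse_limit` helper are illustrative assumptions; the real `DbError` has 20+ variants that are not listed here.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Illustrative variants only; not the actual SoliDB variant list.
#[derive(Debug)]
pub enum DbError {
    NotFound(String),
    InvalidInput(String),
    Io(std::io::Error),
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DbError::NotFound(what) => write!(f, "not found: {what}"),
            DbError::InvalidInput(msg) => write!(f, "invalid input: {msg}"),
            DbError::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

impl std::error::Error for DbError {}

// `From` implementations let the `?` operator convert foreign errors.
impl From<std::io::Error> for DbError {
    fn from(err: std::io::Error) -> Self {
        DbError::Io(err)
    }
}

impl From<ParseIntError> for DbError {
    fn from(err: ParseIntError) -> Self {
        DbError::InvalidInput(err.to_string())
    }
}

pub type DbResult<T> = Result<T, DbError>;

/// Hypothetical helper: parse a positive LIMIT value from user input.
pub fn parse_limit(raw: &str) -> DbResult<usize> {
    let limit: usize = raw.trim().parse()?; // ParseIntError becomes DbError
    if limit == 0 {
        return Err(DbError::InvalidInput("limit must be positive".to_string()));
    }
    Ok(limit)
}
```

The benefit of the alias is that every fallible function in the crate shares one error type, so callers can chain `?` across module boundaries without manual mapping.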

Entry Points

src/
  main.rs: 639 lines. Server startup, CLI argument parsing, daemon mode.
    ├── parse_args() → Config
    ├── init_tokio() → Runtime
    └── start_server(config) → Services
  lib.rs: Public API exports and re-exports.
    pub use sdbql::*;
    pub use storage::*;
  error.rs: DbError enum with From implementations.
    20+ error variants, From<RocksDbError>, From<JsonError>
src/bin/
  solidb-dump.rs: Database export utility; dumps collections to JSON/MessagePack.
  solidb-restore.rs: Database restore utility; imports from dumps.
  solidb-fuse.rs: FUSE filesystem mount (optional feature).

Code Quality Standards

All code passes cargo fmt --check and cargo clippy -- -D warnings before it is committed. The test suite contains 592 tests across 54 test files.

Technology Stack

Core Runtime

  • Rust: Memory safety, zero-cost abstractions, no garbage collection.
  • Tokio: Async runtime with work-stealing scheduler (multi-threaded).
  • Tower: Middleware stack for request/response pipelines.

Storage Engine

  • RocksDB: LSM-tree storage with column families, compression, compaction.
  • Column Families: Separate namespaces for collections, indexes, metadata.
  • Write Batches: Atomic multi-key operations for consistency.

Network Layer

  • Axum: Type-safe, ergonomic HTTP framework built on Hyper.
  • WebSocket: Tokio-tungstenite for live queries and changefeeds.
  • Reqwest: HTTP client for inter-node cluster communication.

Tooling & Serialization

  • Serde: Serialization framework with zero-copy deserialization support, used for JSON, MessagePack, Bincode, and Postcard.
  • mlua: Lua 5.4 interpreter for server-side scripting.
  • Tracing: Structured logging with span-based instrumentation.
  • jsonwebtoken: JWT-based authentication and authorization.

Project Structure

src/
  storage/       # Document storage, collections, indexes (RocksDB wrapper)
  sdbql/         # Query language: lexer, parser, AST, executor
  server/        # HTTP handlers, routes, auth, WebSocket
  cluster/       # Node discovery, health checks, HLC clocks
  sharding/      # Coordinator, distribution, migration, routing
  sync/          # Replication transport, worker, blob sync
  transaction/   # Transaction manager, WAL, isolation
  queue/         # Priority queues, scheduling, dead-letter
  scripting/     # Lua runtime, sandbox, script execution
  driver/        # Internal cluster client & protocol
  ttl/           # TTL index cleanup worker
  bin/           # CLI tools (dump, restore, bench, fuse)
  error.rs       # Error types (DbError, DbResult)
  lib.rs         # Public API exports
  main.rs        # Server entrypoint

Storage Layer

The storage layer provides an abstraction over RocksDB using column families for isolation. Each collection gets its own column family with documents, indexes, and metadata stored using prefixed keys.

Key Components

  • StorageEngine — Database instance manager
  • Collection — Document operations, indexing
  • Document — JSON wrapper with key/value
  • Index — Compound index metadata
  • GeoIndex — Spatial index with RTrees

Key Prefixes

doc: Document data
idx: Secondary index entries
idx_meta: Index metadata
geo: Geo index entries
ft: Fulltext n-grams
ttl: TTL index metadata
col: Columnar data & metadata
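
Prefixed keys work because the underlying key space is sorted, so all entries of one kind sit next to each other and can be read with a range scan. The sketch below emulates that with a `BTreeMap` standing in for a column family. The exact key layout after each prefix (for example `idx:<index>:<value>:<doc key>`) is an assumption made for illustration, as are the helper names.

```rust
use std::collections::BTreeMap;

/// Key for a document body, using the `doc:` prefix.
pub fn doc_key(key: &str) -> Vec<u8> {
    format!("doc:{key}").into_bytes()
}

/// Key for a secondary index entry. The field order is an assumed layout.
pub fn index_key(index: &str, value: &str, doc_key: &str) -> Vec<u8> {
    format!("idx:{index}:{value}:{doc_key}").into_bytes()
}

/// Return every key that starts with `prefix`, in sorted order.
/// A sorted map stands in for the column family here.
pub fn scan_prefix(store: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<Vec<u8>> {
    store
        .range(prefix.to_vec()..) // seek to the first key >= prefix
        .take_while(|(k, _)| k.starts_with(prefix)) // stop once the prefix ends
        .map(|(k, _)| k.clone())
        .collect()
}
```

Scanning `doc:` then lists only documents, while scanning `idx:by_email:` lists only the entries of one (hypothetical) index, without touching unrelated keys.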

Index Types

  • Persistent: Sorted B-tree
  • Hash: O(1) lookup
  • Fulltext: N-gram tokens
  • Geo: Haversine
  • TTL: Expiration
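
For reference, the haversine formula behind the geo index's distance calculation can be sketched as follows. This is the textbook formula on a spherical Earth, not SoliDB's code; the function names and the mean-radius constant are assumptions of this example.

```rust
/// Mean Earth radius in meters (spherical approximation).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two (lat, lon) points in degrees.
pub fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);
    // Clamp to guard against rounding pushing `a` slightly above 1.
    2.0 * EARTH_RADIUS_M * a.min(1.0).sqrt().asin()
}

/// Radius query predicate: is `point` within `radius_m` of `center`?
pub fn within_radius(center: (f64, f64), point: (f64, f64), radius_m: f64) -> bool {
    haversine_m(center.0, center.1, point.0, point.1) <= radius_m
}
```

A nearest-neighbor search can then rank candidate points by `haversine_m`, with the spatial index used to narrow the candidate set first.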

JSON Schema Validation

While SoliDB is schema-less by default, it supports optional JSON Schema validation at the collection level. This allows you to enforce data integrity rules while maintaining flexibility.

Validation Modes

  • off No validation (default). Any valid JSON is accepted.
  • strict Rejects any document that violates the schema. Returns formatted error details.
  • lenient Accepts invalid documents but logs warnings. Useful for schema migrations or testing.
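
The three modes amount to a small decision table over the validator's findings. The sketch below shows that logic in isolation; `ValidationMode`, `Outcome`, and `decide` are hypothetical names, and the list of violations is assumed to come from a schema validator that is not shown.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationMode {
    Off,
    Strict,
    Lenient,
}

impl FromStr for ValidationMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "off" => Ok(Self::Off),
            "strict" => Ok(Self::Strict),
            "lenient" => Ok(Self::Lenient),
            other => Err(format!("unknown validation mode: {other}")),
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Accepted,
    /// Document is stored, but the violations should be logged.
    AcceptedWithWarnings(Vec<String>),
    /// Document is refused and the violations are returned to the caller.
    Rejected(Vec<String>),
}

/// Decide what to do with a document given the validator's findings.
pub fn decide(mode: ValidationMode, violations: Vec<String>) -> Outcome {
    match mode {
        ValidationMode::Off => Outcome::Accepted,
        _ if violations.is_empty() => Outcome::Accepted,
        ValidationMode::Strict => Outcome::Rejected(violations),
        ValidationMode::Lenient => Outcome::AcceptedWithWarnings(violations),
    }
}
```

Keeping the decision separate from the validator makes it straightforward to switch a collection from lenient to strict once a migration has cleaned up existing documents.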

Configuration

Schemas are stored in the _schema system collection and can be updated at any time.

// Example Schema Object
{
  "name": "users_v1",
  "validation_mode": "strict",
  "schema": {
    "type": "object",
    "required": ["email"],
    "properties": {
      "email": { "type": "string" }
    }
  }
}

Standard Compliance

SoliDB uses the jsonschema crate, supporting Draft 7, Draft 2019-09, and Draft 2020-12 specifications. Complex validation rules like oneOf, pattern, and dependencies are fully supported.

SDBQL Engine

SDBQL (SoliDB Query Language) is an AQL-inspired query language with 100+ built-in functions. It supports document queries, graph traversal, aggregations, and data manipulation.

Query (string) → Lexer (tokens) → Parser (AST) → Optimizer (plan) → Executor (results)
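
To make the first stage of this pipeline concrete, here is a toy lexer that turns a query string into tokens. It is a sketch under stated assumptions: the `Token` type, the keyword subset, and the accepted symbols are invented for this example and cover only a fraction of SDBQL. The parser, optimizer, and executor stages are not shown.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Ident(String),
    Number(f64),
    Symbol(String),
}

// Small keyword subset, for illustration only.
const KEYWORDS: [&str; 6] = ["FOR", "IN", "FILTER", "LET", "COLLECT", "RETURN"];

pub fn lex(input: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Identifier or keyword.
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            let upper = word.to_ascii_uppercase();
            if KEYWORDS.contains(&upper.as_str()) {
                tokens.push(Token::Keyword(upper));
            } else {
                tokens.push(Token::Ident(word));
            }
        } else if c.is_ascii_digit() {
            // Number with at most one decimal point, so `1..3` lexes as 1 . . 3.
            let start = i;
            let mut seen_dot = false;
            while i < chars.len() {
                if chars[i].is_ascii_digit() {
                    i += 1;
                } else if chars[i] == '.'
                    && !seen_dot
                    && i + 1 < chars.len()
                    && chars[i + 1].is_ascii_digit()
                {
                    seen_dot = true;
                    i += 1;
                } else {
                    break;
                }
            }
            let text: String = chars[start..i].iter().collect();
            let n = text.parse::<f64>().map_err(|e| format!("bad number {text:?}: {e}"))?;
            tokens.push(Token::Number(n));
        } else if "<>=!".contains(c) {
            // One- or two-character operator such as >, >=, ==, !=, =~, !~.
            let mut op = c.to_string();
            if i + 1 < chars.len() && (chars[i + 1] == '=' || chars[i + 1] == '~') {
                op.push(chars[i + 1]);
                i += 1;
            }
            i += 1;
            tokens.push(Token::Symbol(op));
        } else if ".,(){}:".contains(c) {
            tokens.push(Token::Symbol(c.to_string()));
            i += 1;
        } else {
            return Err(format!("unexpected character {c:?} at offset {i}"));
        }
    }
    Ok(tokens)
}
```

Feeding it `FOR doc IN users FILTER doc.age > 18 RETURN doc` yields twelve tokens, starting with the keyword `FOR`; a parser would then fold that flat list into an AST.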

Query Capabilities

Iteration

Multi-source iteration over collections, ranges, and static arrays.

FOR doc IN users

Filtering

Advanced filtering with optimized index usage and logical operators.

FILTER d.age > 18

Bindings

Variable declarations, intermediate results, and subqueries.

LET avg = (FOR ...)

Grouping

Data reduction via COLLECT with COUNT and AGGREGATE.

COLLECT city = d.city

Graph

Native graph traversal with edge/vertex hops and path finding.

FOR v,e,p IN 1..3

Matching

Pattern matching using LIKE and regular expressions (=~, !~).

FILTER u.name =~ "J.*"

Mutations

Atomic multi-document mutations within the query flow.

UPDATE u WITH {v:1}

Returning

Flexible response construction with projection and mapping.

RETURN { name: u.v }

Built-in Functions Reference

String & Pattern
CONCAT · CONCAT_SEPARATOR · SPLIT · SUBSTRING · TRIM · LTRIM · RTRIM · UPPER · LOWER · CONTAINS · SUBSTITUTE · REGEX_REPLACE · LEVENSHTEIN · LEFT · RIGHT · CHAR_LENGTH · FIND_FIRST · FIND_LAST · REGEX_TEST · REGEXP_MATCH
Numeric & Math
ABS · ROUND · FLOOR · CEIL · SQRT · POW · RANDOM · LOG · LOG10 · LOG2 · EXP · SIN · COS · TAN · ASIN · ACOS · ATAN · ATAN2 · PI
Array Operations
LENGTH · PUSH · APPEND · SLICE · FLATTEN · FIRST · LAST · NTH · POSITION · CONTAINS_ARRAY · UNIQUE · SORTED · REVERSE · UNION · INTERSECTION · MINUS · RANGE · ZIP · REMOVE_VALUE
Aggregation (Arrays)
SUM · AVG · MIN · MAX · COUNT · COUNT_DISTINCT · MEDIAN · PERCENTILE · VARIANCE · STDDEV
Object & JSON
HAS · KEEP · UNSET · ATTRIBUTES · VALUES · MERGE · JSON_PARSE · JSON_STRINGIFY
Date & Time
DATE_NOW · DATE_ISO8601 · DATE_TIMESTAMP · DATE_FORMAT · DATE_ADD · DATE_SUBTRACT · DATE_DIFF · DATE_TRUNC · TIME_BUCKET · DATE_YEAR · DATE_MONTH · DATE_DAY · DATE_HOUR · DATE_MINUTE · DATE_SECOND · DATE_DAYOFWEEK · DATE_QUARTER
Geo & Search
DISTANCE · GEO_DISTANCE · FULLTEXT · BM25
Type Checking
IS_ARRAY · IS_BOOL · IS_NUMBER · IS_STRING · IS_OBJECT · IS_NULL · IS_DATE · TYPENAME · TO_STRING · TO_NUMBER
Utilities
UUIDV4 · UUIDV7 · MD5 · SHA256 · BASE64_ENCODE · BASE64_DECODE · SLEEP · ASSERT · IF · COLLECTION_COUNT · COALESCE · NOT_NULL · FIRST_NOT_NULL

Server & API

The HTTP server is built on Axum with async handlers powered by Tokio. It provides RESTful endpoints, WebSocket support, and JWT-based authentication.

HTTP Endpoints

  • /_api/query — SDBQL execution
  • /_api/document — CRUD operations
  • /_api/collection — Management
  • /_api/index — Index operations
  • /_api/blob — Binary storage

WebSocket

  • /ws/live — Live queries
  • /ws/changefeed — Real-time changes
  • /ws/presence — User presence
  • Multiplexed connections

Authentication

  • JWT Bearer tokens
  • Database-level auth
  • Admin vs user roles
  • Token refresh flow

Cluster & Sharding

SoliDB supports horizontal scaling through sharding. Collections can be distributed across multiple nodes with configurable replication factors.

Cluster Module

  • ClusterManager — Node registration, heartbeats
  • HealthChecker — Node liveness detection
  • HLC — Hybrid Logical Clocks for ordering
  • ClusterState — Membership and topology
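
A Hybrid Logical Clock pairs a physical timestamp with a logical counter so that events stay ordered even when wall clocks drift or step backwards. The sketch below follows the standard HLC update rules; the type and method names are assumptions for this example rather than the contents of hlc.rs, and the wall-clock reading is passed in as a parameter to keep the code deterministic.

```rust
/// Ordered by physical time first, then by logical counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct HlcTimestamp {
    pub physical_ms: u64,
    pub logical: u32,
}

pub struct Hlc {
    last: HlcTimestamp,
}

impl Hlc {
    pub fn new() -> Self {
        Hlc { last: HlcTimestamp { physical_ms: 0, logical: 0 } }
    }

    /// Timestamp for a local or send event, given the current wall clock.
    pub fn now(&mut self, wall_ms: u64) -> HlcTimestamp {
        if wall_ms > self.last.physical_ms {
            self.last = HlcTimestamp { physical_ms: wall_ms, logical: 0 };
        } else {
            // Wall clock stalled or went backwards: advance the counter instead.
            self.last.logical += 1;
        }
        self.last
    }

    /// Merge a timestamp received from another node.
    pub fn update(&mut self, remote: HlcTimestamp, wall_ms: u64) -> HlcTimestamp {
        let physical = wall_ms.max(self.last.physical_ms).max(remote.physical_ms);
        let logical = if physical == self.last.physical_ms && physical == remote.physical_ms {
            self.last.logical.max(remote.logical) + 1
        } else if physical == self.last.physical_ms {
            self.last.logical + 1
        } else if physical == remote.physical_ms {
            remote.logical + 1
        } else {
            0
        };
        self.last = HlcTimestamp { physical_ms: physical, logical };
        self.last
    }
}
```

Because the struct derives `Ord` with the physical component first, comparing two timestamps gives a total order that respects causality between a sender and a receiver.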

Sharding Module

  • ShardCoordinator — Shard assignment, healing
  • Distribution — Replica placement algorithms
  • Migration — Live shard rebalancing
  • Router — Request routing to primaries
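
As a rough illustration of routing and replica placement, the sketch below hashes a document key to a shard and picks replica nodes round-robin. Both choices (FNV-1a hashing and round-robin placement) are assumptions made for this example; the source does not specify which algorithms SoliDB's coordinator uses.

```rust
/// 64-bit FNV-1a hash. A fixed algorithm keeps routing stable across
/// processes, unlike a randomly seeded hasher.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in bytes {
        hash ^= *byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Map a document key to a shard number in `0..num_shards`.
pub fn shard_for(key: &str, num_shards: u32) -> u32 {
    let shards = num_shards.max(1) as u64; // avoid division by zero
    (fnv1a(key.as_bytes()) % shards) as u32
}

/// Choose the nodes holding a shard. The first entry acts as the primary;
/// the rest are replicas, assigned round-robin over the node list.
pub fn replicas_for(shard: u32, nodes: &[&str], replication_factor: usize) -> Vec<String> {
    let count = replication_factor.min(nodes.len());
    (0..count)
        .map(|i| nodes[(shard as usize + i) % nodes.len()].to_string())
        .collect()
}
```

With three nodes and a replication factor of 2, shard 1 lands on the second and third nodes and shard 2 wraps around to the third and first. Simple modulo placement reshuffles most shards when the node count changes, which is one reason live migration support matters.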

Replication Flow

Write (primary) → SyncLog (append entry) → Transport (HTTP/WS) → Replicas (apply changes)
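
The flow above can be sketched with an in-memory log and a replica that applies entries in sequence order. Everything here is a simplified assumption: the `Op`, `LogEntry`, `SyncLog`, and `Replica` types are invented for illustration, the transport step is reduced to passing a slice, and conflict resolution is left out.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Put { key: String, value: String },
    Delete { key: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub seq: u64,
    pub op: Op,
}

/// Append-only log kept by the primary. Sequence numbers start at 1.
#[derive(Default)]
pub struct SyncLog {
    entries: Vec<LogEntry>,
}

impl SyncLog {
    pub fn append(&mut self, op: Op) -> u64 {
        let seq = self.entries.len() as u64 + 1;
        self.entries.push(LogEntry { seq, op });
        seq
    }

    /// Entries with a sequence number greater than `seq`.
    pub fn entries_after(&self, seq: u64) -> &[LogEntry] {
        let start = (seq as usize).min(self.entries.len());
        &self.entries[start..]
    }
}

#[derive(Default)]
pub struct Replica {
    pub data: BTreeMap<String, String>,
    pub applied_seq: u64,
}

impl Replica {
    /// Apply a batch, skipping entries that were already applied so that
    /// a retransmitted batch is harmless.
    pub fn apply(&mut self, batch: &[LogEntry]) {
        for entry in batch {
            if entry.seq <= self.applied_seq {
                continue;
            }
            match &entry.op {
                Op::Put { key, value } => {
                    self.data.insert(key.clone(), value.clone());
                }
                Op::Delete { key } => {
                    self.data.remove(key);
                }
            }
            self.applied_seq = entry.seq;
        }
    }
}
```

A replica that reconnects only needs to report its `applied_seq`; the primary can then ship `entries_after(applied_seq)` to bring it up to date.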

Transactions

SoliDB provides ACID transactions with a Write-Ahead Log (WAL) for durability. Transactions support multiple operations across collections.

Transaction API

  • BEGIN — Start transaction
  • COMMIT — Apply changes
  • ROLLBACK — Discard changes
  • Collection locking

WAL (Write-Ahead Log)

  • Append-only log file
  • Crash recovery
  • Log truncation
  • Replay on startup
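
The core idea of an append-only log with replay can be shown with length-prefixed records in a byte buffer. This is a minimal sketch under stated assumptions: the record framing (4-byte little-endian length plus payload) and the function names are invented here, and a production WAL would also need checksums and explicit flushing to disk.

```rust
/// Append one record: 4-byte little-endian length, then the payload.
pub fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(payload);
}

/// Replay all complete records. A record cut short by a crash (a torn
/// write at the tail) is ignored, along with anything after it.
pub fn replay(log: &[u8]) -> Vec<Vec<u8>> {
    let mut records = Vec::new();
    let mut pos = 0;
    while pos + 4 <= log.len() {
        let len = u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]]) as usize;
        let start = pos + 4;
        let end = start + len;
        if end > log.len() {
            break; // incomplete record at the tail
        }
        records.push(log[start..end].to_vec());
        pos = end;
    }
    records
}
```

On startup the recovered records would be re-applied in order, and once their effects are durable in the main store the log can be truncated.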

Isolation Levels

  • Read Uncommitted
  • Read Committed
  • Snapshot Isolation

Lua Scripting

Server-side Lua 5.4 scripts enable custom business logic, API endpoints, and background jobs. Scripts run in a secure sandbox with access to SDBQL and HTTP utilities.

Script Types

  • GET/POST/PUT/DELETE: Custom API endpoints
  • WS: WebSocket handlers
  • Scheduled: Cron-like execution
  • Triggers: Document change handlers

Available APIs

  • sdbql(query) — Execute SDBQL
  • fetch(url, opts) — HTTP client
  • json.encode/decode — JSON handling
  • crypto.* — Hash, HMAC, JWT
  • time.* — Date/time utilities
  • log.* — Debug logging (debug, info, error)