Natural Language Queries

Write database queries in plain English using AI. SoliDB automatically translates your natural language into SDBQL and executes it.

Overview

SoliDB's Natural Language (NL) query feature lets you interact with your database using plain English. An AI model translates your requests into valid SDBQL queries, making database operations accessible to everyone regardless of their query language expertise.

Plain English

Write queries like "find all users over 25" or "count orders by status"

Schema-Aware

AI understands your database schema for accurate translations

Self-Correcting

Automatically retries with error feedback for improved accuracy

LLM Providers

SoliDB supports multiple AI providers for natural language processing. Configure your preferred provider in the _system/_env collection.

OpenAI

GPT-4o, GPT-4, etc.

API Key: OPENAI_API_KEY
Model: OPENAI_MODEL
Default: gpt-4o

Anthropic

Claude Sonnet, Opus, etc.

API Key: ANTHROPIC_API_KEY
Model: ANTHROPIC_MODEL
Default: claude-sonnet-4-20250514

Ollama

Local LLMs (Llama, etc.)

URL: OLLAMA_URL
Model: OLLAMA_MODEL
Default: llama3

Gemini

Google Gemini Pro

API Key: GEMINI_API_KEY
Model: GEMINI_MODEL
Default: gemini-1.5-pro

Default Provider

Set NL_DEFAULT_PROVIDER to specify which provider to use by default:

openai | anthropic | ollama | gemini

Configuration

Store your LLM credentials in the _system database's _env collection. Each credential is stored as a document with the key as _key and the value in a value field.

Setting Up Credentials

Via API

# Set Anthropic API key
curl -X POST http://localhost:7777/_api/database/_system/collection/_env \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"_key": "ANTHROPIC_API_KEY", "value": "sk-ant-..."}'

# Set default provider
curl -X POST http://localhost:7777/_api/database/_system/collection/_env \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"_key": "NL_DEFAULT_PROVIDER", "value": "anthropic"}'
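
The provider settings listed earlier can be collected into a small lookup so a setup script knows which _env documents a given provider needs. The Python sketch below is illustrative only: PROVIDER_ENV and env_documents are hypothetical names, the key names are taken from the provider list on this page, and nothing here contacts SoliDB. Each returned document would still need to be POSTed as in the cURL commands above.

```python
# Hypothetical helper: builds the _env documents for one provider.
# Key names come from the provider list on this page; no request is sent.
PROVIDER_ENV = {
    "openai":    {"credential": "OPENAI_API_KEY",    "model": "OPENAI_MODEL"},
    "anthropic": {"credential": "ANTHROPIC_API_KEY", "model": "ANTHROPIC_MODEL"},
    "ollama":    {"credential": "OLLAMA_URL",        "model": "OLLAMA_MODEL"},
    "gemini":    {"credential": "GEMINI_API_KEY",    "model": "GEMINI_MODEL"},
}


def env_documents(provider, credential, model=None, make_default=False):
    """Return the documents to store in _system/_env for one provider."""
    if provider not in PROVIDER_ENV:
        raise ValueError(f"Unknown LLM provider: {provider}")
    keys = PROVIDER_ENV[provider]
    # Each setting is one document: the key goes in _key, the value in value.
    docs = [{"_key": keys["credential"], "value": credential}]
    if model is not None:
        docs.append({"_key": keys["model"], "value": model})
    if make_default:
        docs.append({"_key": "NL_DEFAULT_PROVIDER", "value": provider})
    return docs


if __name__ == "__main__":
    for doc in env_documents("anthropic", "sk-ant-...", make_default=True):
        print(doc)
```

For Ollama the "credential" slot holds the server URL rather than an API key, which is why the table uses a neutral name for it.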

Via Dashboard

  1. Navigate to Dashboard > _system database
  2. Go to Environment Variables section
  3. Add your API keys and configuration

Security Note

API keys stored in _env are accessible to database administrators. Ensure proper access controls are in place for your SoliDB instance.

Example Queries

Basic Queries

Natural Language

"Find all users"

Generated SDBQL

FOR u IN users
  RETURN u

Natural Language

"Get users older than 25 sorted by age descending"

Generated SDBQL

FOR u IN users
  FILTER u.age > 25
  SORT u.age DESC
  RETURN u

Aggregations

Natural Language

"Count all orders"

Generated SDBQL

RETURN {
  count: LENGTH(FOR o IN orders RETURN 1)
}

Natural Language

"Group orders by status and count each"

Generated SDBQL

FOR o IN orders
  COLLECT status = o.status
  AGGREGATE count = COUNT(1)
  RETURN { status, count }

Complex Queries

Natural Language

"Find top 5 customers by total order amount"

Generated SDBQL

FOR o IN orders
  COLLECT customer = o.customer_id
  AGGREGATE total = SUM(o.amount)
  SORT total DESC
  LIMIT 5
  RETURN { customer, total }

Natural Language

"Find users in Paris who haven't placed any orders"

Generated SDBQL

LET order_users = (
  FOR o IN orders
  RETURN DISTINCT o.user_id
)
FOR u IN users
  FILTER u.city == "Paris"
  FILTER u._key NOT IN order_users
  RETURN u
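
To make the semantics of the last query concrete, here is the same logic written against in-memory Python lists. The sample data is invented for illustration, and the sketch assumes, as the generated query does, that orders.user_id stores the user's _key.

```python
# Illustrative in-memory data; field names follow the query above.
users = [
    {"_key": "alice", "city": "Paris"},
    {"_key": "bob", "city": "Paris"},
    {"_key": "carol", "city": "Lyon"},
]
orders = [{"user_id": "alice", "amount": 40}]

# LET order_users = (FOR o IN orders RETURN DISTINCT o.user_id)
order_users = {o["user_id"] for o in orders}

# FOR u IN users
#   FILTER u.city == "Paris"
#   FILTER u._key NOT IN order_users
#   RETURN u
paris_without_orders = [
    u for u in users
    if u["city"] == "Paris" and u["_key"] not in order_users
]

print(paris_without_orders)
```

With this data only bob remains: alice has an order and carol is not in Paris.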

API Endpoint

Execute Natural Language Query

POST /_api/database/{db}/nl

Request Body

{
  "query": "find all users over 25 sorted by age",
  "execute": true,
  "provider": "anthropic"
}

Parameter   Type      Description
query       string    Natural language query
execute     boolean   Execute the query (default: true)
provider    string?   LLM provider: openai, anthropic, ollama, gemini
model       string?   Model override (uses env default if not set)

Response (Success)

{
  "sdbql": "FOR u IN users\n  FILTER u.age > 25\n  SORT u.age DESC\n  RETURN u",
  "result": [
    { "_key": "alice", "name": "Alice", "age": 30 },
    { "_key": "bob", "name": "Bob", "age": 28 }
  ],
  "attempts": 1
}

Response (Error)

{
  "error": "Failed to generate valid SDBQL after 3 attempts",
  "last_attempt": "FOR u IN users...",
  "parse_error": "Unexpected token at line 2"
}
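
A client has to distinguish the two response shapes shown above. The following Python sketch does so by checking for the error field; summarize_nl_response is a hypothetical helper name, and it relies only on the fields that appear in the example responses on this page.

```python
def summarize_nl_response(data):
    """Turn a decoded NL response body into a short status string."""
    if "error" in data:
        # Error shape: error, last_attempt, parse_error.
        return (
            f"failed: {data['error']} "
            f"(parse error: {data.get('parse_error', 'n/a')})"
        )
    # Success shape: sdbql, result, attempts.
    rows = data.get("result") or []
    return f"ok: {len(rows)} row(s) after {data['attempts']} attempt(s)"


if __name__ == "__main__":
    success = {
        "sdbql": "FOR u IN users RETURN u",
        "result": [{"_key": "alice"}, {"_key": "bob"}],
        "attempts": 1,
    }
    failure = {
        "error": "Failed to generate valid SDBQL after 3 attempts",
        "last_attempt": "FOR u IN users...",
        "parse_error": "Unexpected token at line 2",
    }
    print(summarize_nl_response(success))
    print(summarize_nl_response(failure))
```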

Example: cURL

curl -X POST http://localhost:7777/_api/database/mydb/nl \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "find users in Paris with more than 5 orders",
    "execute": true
  }'

Dashboard NL Mode

The Query page in the SoliDB dashboard supports Natural Language mode alongside SDBQL and SQL. Use the NL toggle to write queries in plain English.

How to Use NL Mode

  1. Navigate to Dashboard > Query
  2. Click the NL button in the mode toggle
  3. Type your query in plain English
  4. Click Execute to translate and run the query
  5. View the generated SDBQL in the preview pane

AI-Powered

Your natural language is translated to SDBQL using the configured LLM provider.

Preview Generated Query

See the generated SDBQL before it executes, so you can learn the syntax.

Best Practices

Do

  • Be specific about collection names
  • Use field names that exist in your schema
  • Specify sort direction (ascending/descending)
  • Include a limit for large datasets

Don't

  • Use ambiguous terms without context
  • Assume field names if unsure
  • Write overly complex queries in one request
  • Rely on NL for production-critical queries

Tips for Better Results

  1. Use collection and field names from your actual schema
  2. Break complex queries into simpler parts
  3. Review the generated SDBQL to learn the syntax
  4. Use execute=false to preview without running
  5. For critical queries, convert to SDBQL and test thoroughly
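
The preview tip above maps directly onto the request body. The Python sketch below builds that body with the optional parameters left out unless they are set; build_nl_request is a hypothetical helper name. This page does not show the response for execute=false, so the sketch stops at constructing the request.

```python
import json


def build_nl_request(query, execute=True, provider=None, model=None):
    """Build the JSON body for POST /_api/database/{db}/nl."""
    body = {"query": query, "execute": execute}
    # provider and model are optional; omit them to use the configured defaults.
    if provider is not None:
        body["provider"] = provider
    if model is not None:
        body["model"] = model
    return body


if __name__ == "__main__":
    # Translate only: ask for the SDBQL without running it.
    preview = build_nl_request("count orders by status", execute=False)
    print(json.dumps(preview))
```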

Schema Discovery

SoliDB automatically analyzes your database schema and passes it to the AI model as context. This helps the model generate queries that match your actual collections, fields, and indexes.

What the AI Learns

Collection Names

All non-system collections in your database are discovered

Field Names & Types

Samples documents to infer field names and data types

Document Counts

Knows how many documents are in each collection

Available Indexes

Index definitions for optimized query suggestions

Example Schema Context Sent to AI

### Collection: `users` (1,542 documents)
Fields:
  - `_key`: string
  - `age`: number
  - `city`: string
  - `email`: string
  - `name`: string
Indexes:
  - email_idx(email)
  - city_age_idx(city, age)

### Collection: `orders` (8,391 documents)
Fields:
  - `_key`: string
  - `amount`: number
  - `status`: string
  - `user_id`: string
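
As a rough illustration of how such a context block could be produced from sampled documents, the Python sketch below infers a type name per field and renders the same layout as the example above. It is not SoliDB's implementation: infer_type and describe_collection are hypothetical names, and taking the first observed type for each field is a simplifying assumption.

```python
def infer_type(value):
    """Map a Python value to a coarse JSON-style type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null"


def describe_collection(name, count, samples, indexes=()):
    """Render one collection in the layout shown in the example above."""
    fields = {}
    for doc in samples:
        for key, value in doc.items():
            # Keep the first type seen for each field (simplifying assumption).
            fields.setdefault(key, infer_type(value))
    lines = [f"### Collection: `{name}` ({count:,} documents)", "Fields:"]
    lines += [f"  - `{key}`: {fields[key]}" for key in sorted(fields)]
    if indexes:
        lines.append("Indexes:")
        lines += [f"  - {index}" for index in indexes]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{"_key": "o1", "amount": 12.5, "status": "paid", "user_id": "alice"}]
    print(describe_collection("orders", 8391, sample))
```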

Self-Correcting Mechanism

When the AI generates invalid SDBQL, SoliDB feeds the parser error back to the model and requests a correction, for up to 3 attempts per request.

1. Natural Language
"Find users in Paris"
2. LLM Generates
FOR u IN users FILTER u.city = "Paris" RETURN u
3. Parser Error
"Expected '==' but found '='"
4. Retry with Error
Error sent back to LLM for correction
5. Valid Query
FOR u IN users FILTER u.city == "Paris" RETURN u

Max Retry Attempts: 3
Success Rate with Retries: ~95%
Returned in Response: attempts
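
The retry loop described above can be sketched in a few lines of Python. This is a simplified model, not SoliDB's code: translate_with_retries, fake_generate, and fake_validate are hypothetical names, the generator stands in for the LLM call, and the validator stands in for the SDBQL parser.

```python
def translate_with_retries(question, generate, validate, max_attempts=3):
    """generate(question, feedback) returns a query string;
    validate(query) returns an error message, or None if the query parses."""
    feedback = None
    query = None
    for attempt in range(1, max_attempts + 1):
        query = generate(question, feedback)
        error = validate(query)
        if error is None:
            return {"sdbql": query, "attempts": attempt}
        feedback = error  # the next attempt sees the parser error
    return {
        "error": f"Failed to generate valid SDBQL after {max_attempts} attempts",
        "last_attempt": query,
        "parse_error": feedback,
    }


def fake_generate(question, feedback):
    # Stand-in for the LLM: wrong operator first, fixed once it sees feedback.
    operator = "==" if feedback else "="
    return f'FOR u IN users FILTER u.city {operator} "Paris" RETURN u'


def fake_validate(query):
    # Stand-in for the parser: rejects a single "=" used as a comparison.
    if " = " in query:
        return "Expected '==' but found '='"
    return None


if __name__ == "__main__":
    print(translate_with_retries("Find users in Paris", fake_generate, fake_validate))
```

With these stubs the first attempt fails and the second succeeds, so the result reports attempts as 2, mirroring the five-step flow above.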

Troubleshooting

"OPENAI_API_KEY not found in _system/_env collection"

The API key for your chosen provider hasn't been configured.

Solution:

Add the API key to _system/_env collection via API or Dashboard

"No collections found in database"

The database has no user collections for the AI to query.

Solution:

Create at least one collection with data before using NL queries

"Failed to generate valid SDBQL after 3 attempts"

The AI couldn't produce a valid query after multiple tries.

Solutions:

  • Simplify your query - break it into smaller parts
  • Use exact collection and field names from your schema
  • Try a different model (e.g., GPT-4o instead of GPT-3.5)
  • Check the last_attempt in the error response

"Unknown LLM provider: xyz"

An unsupported provider was specified.

Solution:

Use one of: openai, anthropic, ollama, gemini

"Ollama API request failed: connection refused"

The Ollama server is not running or unreachable.

Solutions:

  • Start Ollama: ollama serve
  • Check OLLAMA_URL in _env (default: http://localhost:11434)
  • Ensure the model is pulled: ollama pull llama3
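
Before sending an NL query through Ollama, a script can check that the server is reachable at all. The Python sketch below uses only the standard library; ollama_reachable is a hypothetical helper name, and it assumes a running Ollama server answers a plain GET on its base URL with HTTP 200. It checks connectivity only, not whether a particular model has been pulled.

```python
import urllib.request


def ollama_reachable(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, DNS failures, timeouts and HTTP errors
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    if ollama_reachable():
        print("Ollama is reachable")
    else:
        print("Ollama is not reachable; try: ollama serve")
```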

Client SDK Examples

JavaScript / Node.js

const response = await fetch('http://localhost:7777/_api/database/mydb/nl', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'Find users in Paris older than 25',
    execute: true
  })
});

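// A failed translation returns an "error" field instead of "sdbql" and "result".
const data = await response.json();
if (data.error) {
  throw new Error(data.error);
}

const { sdbql, result, attempts } = data;
console.log(`Generated SDBQL (${attempts} attempts):`, sdbql);
console.log('Results:', result);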

Python

import requests

response = requests.post(
    'http://localhost:7777/_api/database/mydb/nl',
    headers={
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    },
    json={
        'query': 'Count orders grouped by status',
        'execute': True,
        'provider': 'anthropic'
    }
)

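data = response.json()
if 'error' in data:
    # Failed translations carry "error", "last_attempt" and "parse_error".
    raise RuntimeError(f"{data['error']}: {data.get('parse_error')}")

print(f"Generated SDBQL ({data['attempts']} attempts):")
print(data['sdbql'])
print("Results:", data['result'])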

Go

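// Requires imports: bytes, encoding/json, fmt, log, net/http.
// The token variable is assumed to hold a valid bearer token.
type NLRequest struct {
    Query   string `json:"query"`
    Execute bool   `json:"execute"`
}

type NLResponse struct {
    SDBQL    string        `json:"sdbql"`
    Result   []interface{} `json:"result,omitempty"`
    Attempts int           `json:"attempts"`
    Error    string        `json:"error,omitempty"` // set when translation fails
}

// Marshal cannot fail for this struct, so the error is ignored here.
body, _ := json.Marshal(NLRequest{
    Query:   "Find top 10 users by order count",
    Execute: true,
})

req, err := http.NewRequest("POST",
    "http://localhost:7777/_api/database/mydb/nl",
    bytes.NewBuffer(body))
if err != nil {
    log.Fatal(err)
}
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("Content-Type", "application/json")

resp, err := http.DefaultClient.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close() // always release the connection

var result NLResponse
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
    log.Fatal(err)
}
if result.Error != "" {
    log.Fatalf("NL query failed: %s", result.Error)
}

fmt.Printf("SDBQL: %s\n", result.SDBQL)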