Core Concepts

Embex reduces vector database interactions to three main concepts: Collections, Vectors, and Search.

1. Collections

A Collection is a container for your vectors. It’s similar to a “table” in a SQL database.

Name: A unique identifier (e.g., “users”, “products”).
Dimension: The length of each vector. This must match the output size of your embedding model (e.g., 384 for all-MiniLM-L6-v2, 1536 for OpenAI's text-embedding-3-small).

Python

# Create a collection for MiniLM embeddings (384 dimensions)
await client.create_collection("products", dimension=384)

# List all collections
collections = await client.list_collections()

# Delete a collection
await client.delete_collection("products")

Node.js

// Create a collection for MiniLM embeddings (384 dimensions)
await client.createCollection("products", 384);

// List all collections
const collections = await client.listCollections();

// Delete a collection
await client.deleteCollection("products");
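
Because a collection's dimension has to match the embedding model's output, it can help to check vector lengths before inserting anything. The sketch below is plain Python and does not call Embex; `check_dimension` is a hypothetical helper introduced here for illustration, not part of the Embex API.

```python
def check_dimension(vectors, expected_dim):
    """Raise ValueError if any vector's length differs from expected_dim.

    Hypothetical helper (not an Embex function): `vectors` is a list of
    lists of floats, `expected_dim` is the collection's dimension.
    """
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )
    return True
```

Calling `check_dimension(embeddings, 384)` before an insert into a 384-dimension collection would surface a model mismatch on the client side rather than as a database error.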

2. Vectors

A Vector is the core data unit. It represents an object (text, image, audio) as a list of numbers generated by an embedding model.

Structure

Field	Type	Description
`id`	`String`	Unique identifier for the record.
`vector`	`List[Float]`	The embedding array (from your model).
`metadata`	`Map`	Optional JSON key-value pairs (e.g., source text, tags).

Python

from embex import Vector

# Vector usually comes from a model:
# vector = model.encode("Super Widget").tolist()

vec = Vector(
    id="prod_123",
    vector=[-0.12, 0.05, 0.88, ...], # 384 floats
    metadata={
        "name": "Super Widget",
        "price": 99.99,
        "category": "electronics"
    }
)

await client.insert("products", [vec])

Node.js

// Vector usually comes from a model:
// const vector = await embed("Super Widget");

const vec = {
    id: "prod_123",
    vector: [-0.12, 0.05, 0.88, ...], // 384 floats
    metadata: {
        name: "Super Widget",
        price: 99.99,
        category: "electronics"
    }
};

await client.insert("products", [vec]);

3. Search & Filtering

Search finds the vectors most similar to a query vector. You embed the query text with the same model used for the stored vectors, and Embex returns the nearest neighbors.

Python

# query_vector = model.encode("smartphone").tolist()

results = await client.search(
    collection_name="products",
    vector=query_vector,
    limit=5
)

Node.js

// const queryVector = await embed("smartphone");

const results = await client.search({
    collection_name: "products",
    vector: queryVector,
    limit: 5
});
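
To make "nearest neighbors" concrete, here is a minimal brute-force sketch in plain Python. It is not how Embex or the underlying database performs search (real engines typically use approximate indexes), and cosine similarity is an assumption, since the metric in use depends on the backend and collection settings. The names `cosine_similarity` and `nearest_neighbors` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, records, limit=5):
    """Return up to `limit` (score, id) pairs, most similar first.

    `records` is a list of dicts with "id" and "vector" keys, mirroring
    the record structure described in the Vectors section.
    """
    scored = [(cosine_similarity(query, r["vector"]), r["id"]) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

Every stored vector is compared against the query, so the cost grows linearly with collection size; that is the work a vector database is designed to avoid.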

Metadata Filtering

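You can refine search results using metadata filters. Embex uses a structured filter syntax; the example here keeps records whose category is "electronics" and excludes those priced above 1000: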

Python

results = await client.search(
    collection_name="products",
    vector=query_vector,
    limit=5,
    filter={
        "must": [
            {"key": "category", "match": {"value": "electronics"}}
        ],
        "must_not": [
            {"key": "price", "range": {"gt": 1000.0}}
        ]
    }
)

Node.js

const results = await client.search({
    collection_name: "products",
    vector: queryVector,
    limit: 5,
    filter: {
        must: [
            { key: "category", match: { value: "electronics" } }
        ],
        must_not: [
            { key: "price", range: { gt: 1000.0 } }
        ]
    }
});
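
One way to read the filter structure is as a predicate over each record's metadata. The sketch below evaluates it in plain Python under assumed semantics: every `must` condition has to hold, no `must_not` condition may hold, and a condition on a missing key does not hold. These semantics, and the helper names `condition_holds` and `matches_filter`, are assumptions for illustration; Embex and the backing database apply filters server-side and may behave differently. Only the `match` and `range`/`gt` forms shown above are handled.

```python
def condition_holds(condition, metadata):
    """Evaluate one filter condition against a metadata dict (assumed semantics)."""
    key = condition["key"]
    if key not in metadata:
        # Assumption: a condition on a missing key is treated as not satisfied.
        return False
    value = metadata[key]
    if "match" in condition:
        return value == condition["match"]["value"]
    if "range" in condition:
        bounds = condition["range"]
        unknown = set(bounds) - {"gt"}
        if unknown:
            # Only "gt" appears in the example; other operators are not guessed at.
            raise ValueError(f"unsupported range operators: {sorted(unknown)}")
        return value > bounds["gt"]
    raise ValueError(f"unsupported condition: {condition!r}")


def matches_filter(metadata, filter_spec):
    """True if all "must" conditions hold and no "must_not" condition holds."""
    must = filter_spec.get("must", [])
    must_not = filter_spec.get("must_not", [])
    return all(condition_holds(c, metadata) for c in must) and not any(
        condition_holds(c, metadata) for c in must_not
    )
```

Under these assumptions, the "Super Widget" record from the Vectors section (category "electronics", price 99.99) passes the example filter, while an electronics item priced at 1500 does not.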