Core Concepts
Embex simplifies vector database interactions into three main concepts: Collections, Vectors, and Search.
1. Collections
A Collection is a container for your vectors. It’s similar to a “table” in a SQL database.
- Name: A unique identifier (e.g., “users”, “products”).
- Dimension: The size of the vectors. This must match the output size of your embedding model (e.g., 384 for all-MiniLM-L6-v2, 1536 for OpenAI’s text-embedding-3-small).
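A dimension mismatch is easiest to catch before any data is sent to the database. The helper below is a minimal sketch of such a check; `check_dimension` is a hypothetical name used for illustration and is not part of Embex.

```python
def check_dimension(vector: list[float], expected_dimension: int) -> None:
    """Raise ValueError if the vector length differs from the collection dimension."""
    if len(vector) != expected_dimension:
        raise ValueError(
            f"expected {expected_dimension} values, got {len(vector)}"
        )


# A 384-element vector fits a collection created with dimension=384.
check_dimension([0.0] * 384, expected_dimension=384)
```

Running a check like this right after the embedding step keeps the error close to its cause, rather than surfacing it later as a database-side failure.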
Python:

    # Create a collection for MiniLM embeddings (384 dimensions)
    await client.create_collection("products", dimension=384)

    # List all collections
    collections = await client.list_collections()

    # Delete a collection
    await client.delete_collection("products")

JavaScript/TypeScript:

    // Create a collection for MiniLM embeddings (384 dimensions)
    await client.createCollection("products", 384);

    // List all collections
    const collections = await client.listCollections();

    // Delete a collection
    await client.deleteCollection("products");

2. Vectors
A Vector is the core data unit. It represents an object (text, image, audio) as a list of numbers generated by an embedding model.
Structure
| Field | Type | Description |
|---|---|---|
| id | String | Unique identifier for the record. |
| vector | List[Float] | The embedding array (from your model). |
| metadata | Map | Optional JSON key-value pairs (e.g., source text, tags). |
Python:

    from embex import Vector

    # Vector usually comes from a model:
    # vector = model.encode("Super Widget").tolist()

    vec = Vector(
        id="prod_123",
        vector=[-0.12, 0.05, 0.88, ...],  # 384 floats
        metadata={
            "name": "Super Widget",
            "price": 99.99,
            "category": "electronics"
        }
    )

    await client.insert("products", [vec])

JavaScript/TypeScript:

    // Vector usually comes from a model:
    // const vector = await embed("Super Widget");

    const vec = {
      id: "prod_123",
      vector: [-0.12, 0.05, 0.88, ...], // 384 floats
      metadata: {
        name: "Super Widget",
        price: 99.99,
        category: "electronics"
      }
    };

    await client.insert("products", [vec]);

3. Search & Filtering
Search finds the vectors most similar to a query vector. You generate a vector for your query text and Embex finds the nearest neighbors.
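To make "most similar" concrete, here is a minimal brute-force sketch of nearest-neighbor search in plain Python. It assumes cosine similarity as the metric; the source does not say which metric or index Embex or the underlying database uses, and the names `cosine_similarity` and `nearest_neighbors` are illustrative only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    query: list[float], stored: dict[str, list[float]], limit: int
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the best `limit`."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
stored = {
    "phone": [0.9, 0.1, 0.0],
    "laptop": [0.7, 0.3, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
top = nearest_neighbors([1.0, 0.0, 0.0], stored, limit=2)
print([vec_id for vec_id, _ in top])  # ['phone', 'laptop']
```

A vector database typically replaces this linear scan with an approximate index so that search stays fast on large collections, but the ranking idea is the same.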
Basic Search
Python:

    # query_vector = model.encode("smartphone").tolist()

    results = await client.search(
        collection_name="products",
        vector=query_vector,
        limit=5
    )

JavaScript/TypeScript:

    // const queryVector = await embed("smartphone");

    const results = await client.search({
      collection_name: "products",
      vector: queryVector,
      limit: 5
    });

Metadata Filtering
You can refine search results using metadata filters. Embex uses a structured filter syntax:
Python:

    results = await client.search(
        collection_name="products",
        vector=query_vector,
        limit=5,
        filter={
            "must": [
                {"key": "category", "match": {"value": "electronics"}}
            ],
            "must_not": [
                {"key": "price", "range": {"gt": 1000.0}}
            ]
        }
    )

JavaScript/TypeScript:

    const results = await client.search({
      collection_name: "products",
      vector: queryVector,
      limit: 5,
      filter: {
        must: [
          { key: "category", match: { value: "electronics" } }
        ],
        must_not: [
          { key: "price", range: { gt: 1000.0 } }
        ]
      }
    });
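One way to read the filter above is "category equals electronics, and price is not greater than 1000". The sketch below evaluates such a filter against a single metadata record in plain Python. It rests on assumptions the source does not confirm: that every `must` condition has to hold, that no `must_not` condition may hold, and that a record missing the key fails a `range` check. Only the `match` and `range`/`gt` forms shown above are handled, and `condition_holds` and `passes_filter` are hypothetical names, not Embex functions.

```python
def condition_holds(condition: dict, metadata: dict) -> bool:
    """Evaluate one filter condition against a metadata record."""
    value = metadata.get(condition["key"])
    if "match" in condition:
        return value == condition["match"]["value"]
    if "range" in condition and "gt" in condition["range"]:
        # Assumption: a missing value never satisfies a range condition.
        return value is not None and value > condition["range"]["gt"]
    raise ValueError(f"unsupported condition: {condition}")


def passes_filter(metadata: dict, filter_spec: dict) -> bool:
    """Assumed semantics: all `must` conditions hold and no `must_not` condition holds."""
    must = filter_spec.get("must", [])
    must_not = filter_spec.get("must_not", [])
    return all(condition_holds(c, metadata) for c in must) and not any(
        condition_holds(c, metadata) for c in must_not
    )


filter_spec = {
    "must": [{"key": "category", "match": {"value": "electronics"}}],
    "must_not": [{"key": "price", "range": {"gt": 1000.0}}],
}

widget = {"name": "Super Widget", "price": 99.99, "category": "electronics"}
print(passes_filter(widget, filter_spec))  # True
```

In practice the database applies the filter itself during search; a local evaluator like this is mainly useful for reasoning about which records a given filter should admit.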