# Quickstart

This guide will walk you through installing Embex, generating real embeddings, and performing semantic search.

We recommend starting with LanceDB because it runs embedded locally—no Docker, no cloud setup, and no API keys required!

First, install the Embex client, LanceDB, and the sentence-transformers embedding library:

```bash
pip install embex lancedb sentence-transformers
```

Create a file named quickstart.py. We will use a small, efficient model (all-MiniLM-L6-v2) to generate embeddings locally.
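If this is your first time using sentence-transformers, you can sanity-check the model in a Python REPL before writing the script. encode() returns a NumPy array, and this model produces 384-dimensional vectors, which is why the collection below is created with dimension=384:

```python
from sentence_transformers import SentenceTransformer

# Load the model and embed a sample string
model = SentenceTransformer("all-MiniLM-L6-v2")
vec = model.encode("Apple iPhone 14")
print(vec.shape)  # (384,)
```

With that confirmed, here is the full script: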

```python
import asyncio

from embex import EmbexClient, Vector
from sentence_transformers import SentenceTransformer


async def main():
    # 1. Setup Embedding Model
    model = SentenceTransformer('all-MiniLM-L6-v2')

    # 2. Initialize Client (uses LanceDB by default)
    client = await EmbexClient.new_async(provider="lancedb", url="./data")

    # 3. Create Collection
    # 'all-MiniLM-L6-v2' produces 384-dimensional vectors
    await client.create_collection("products", dimension=384)

    # 4. Prepare Data
    documents = [
        {"id": "1", "text": "Apple iPhone 14", "category": "electronics"},
        {"id": "2", "text": "Samsung Galaxy S23", "category": "electronics"},
        {"id": "3", "text": "Organic Bananas", "category": "groceries"},
    ]

    # 5. Generate Embeddings & Insert
    vectors = []
    for doc in documents:
        embedding = model.encode(doc["text"]).tolist()
        vectors.append(Vector(
            id=doc["id"],
            vector=embedding,
            metadata={"text": doc["text"], "category": doc["category"]}
        ))
    await client.insert("products", vectors)

    # 6. Semantic Search
    query_text = "smartphone"
    query_vector = model.encode(query_text).tolist()
    results = await client.search(
        collection_name="products",
        vector=query_vector,
        limit=1
    )
    print(f"Query: '{query_text}'")
    print(f"Match: {results[0].metadata['text']}")


if __name__ == "__main__":
    asyncio.run(main())
```
Run the script:

```bash
python quickstart.py
```
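If everything is set up correctly, you should see output like:

```
Query: 'smartphone'
Match: Apple iPhone 14
```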

You just built a semantic search engine! Searching for “smartphone” found “Apple iPhone 14” even though the word “smartphone” never appears in the text. That’s the power of vector embeddings. 🚀
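From here, a natural next step is to inspect the full ranking rather than a single match. The sketch below reuses only the client calls from the script above and assumes each search result exposes the same .metadata attribute used earlier:

```python
# Inside main(), after the first search: fetch the top 3 matches
results = await client.search(
    collection_name="products",
    vector=query_vector,
    limit=3
)
# Print each hit in ranked order
for rank, hit in enumerate(results, start=1):
    print(f"{rank}. {hit.metadata['text']} ({hit.metadata['category']})")
```

With only three documents inserted, this returns every item, ordered by similarity to the query.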