Getting started with Weaviate
September 18, 2024
How to use Weaviate
Weaviate is a vector database commonly used for Retrieval-Augmented Generation (RAG). I like it because it supports hybrid queries, which combine dense and sparse vector search. My goal is to walk you through the basics of Weaviate.
Starting a Weaviate instance
Let’s get an instance of Weaviate running with Docker:
docker run -p 8080:8080 -p 50051:50051 cr.weaviate.io/semitechnologies/weaviate:1.26.3
The Python SDK
To interact with Weaviate, we’ll use the Python SDK. Install it with:
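pip install -U weaviate-client # For beta versions: `pip install --pre -U "weaviate-client==4.*"`
Create a Python file, for example main.py, and add the following (avoid naming it weaviate.py, since that would shadow the weaviate package on import):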
import weaviate
client = weaviate.connect_to_local()
Basic Operations
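Before running any operations, it can help to confirm that the container started above is actually accepting requests. The sketch below polls Weaviate's readiness endpoint using only the standard library. The helper name wait_until_ready, the timeout values, and the assumption that the REST API is reachable at http://localhost:8080 (the port mapped in the docker command) are mine and not part of the SDK.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str = "http://localhost:8080",
                     timeout_s: float = 30.0,
                     interval_s: float = 1.0) -> bool:
    """Poll the readiness endpoint until it answers 200 or the timeout expires.

    Hypothetical helper for illustration; not part of weaviate-client.
    """
    url = base_url.rstrip("/") + "/v1/.well-known/ready"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Not reachable yet; retry until the deadline.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling wait_until_ready() before weaviate.connect_to_local() avoids a connection error while the container is still booting. Once a client exists, client.is_ready() offers a similar check through the SDK.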
Creating a collection
To create a new collection in Weaviate:
client.collections.create("<collection_name>")
Adding data
Insert data into your collection:
collection = client.collections.get("<collection_name>")
collection.data.insert({"property": "text"}, vector=vector)
Here, vector should be a list[float].
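Since the vector has to be a plain list[float], embeddings that arrive as tuples, NumPy arrays, or other sequences need converting first. Here is a minimal sketch, assuming two hypothetical helpers (to_float_list and insert_with_vector) that wrap the collection object from the snippet above. Only the collection.data.insert(properties=..., vector=...) call comes from the SDK.

```python
import math
from typing import Any, Iterable, Mapping


def to_float_list(vector: Iterable[Any]) -> list[float]:
    """Convert any iterable of numbers into a plain list[float].

    Rejects empty vectors and non-finite values (NaN, inf) early, so the
    problem surfaces before the request is sent.
    """
    values = [float(x) for x in vector]
    if not values:
        raise ValueError("vector must not be empty")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("vector must contain only finite numbers")
    return values


def insert_with_vector(collection: Any, properties: Mapping[str, Any],
                       vector: Iterable[Any]):
    """Insert one object with its own vector; returns whatever insert() returns."""
    return collection.data.insert(
        properties=dict(properties),
        vector=to_float_list(vector),
    )


# Usage (requires a running instance and the client from earlier):
# collection = client.collections.get("<collection_name>")
# insert_with_vector(collection, {"property": "text"}, embedding)
```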
Search
Weaviate supports dense (vector), sparse (keyword) and hybrid search.
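To make the three modes concrete, here is a toy sketch in plain Python, independent of Weaviate. Cosine similarity stands in for dense search, a simple term-overlap count stands in for sparse search (Weaviate itself uses BM25), and a weighted sum of the two normalized scores stands in for hybrid search. The documents, the two-dimensional vectors, and the fusion formula are illustrative assumptions, not Weaviate's exact algorithm.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Dense score: cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def term_overlap(query: str, text: str) -> float:
    """Sparse score: number of query terms found in the text (BM25 stand-in)."""
    words = set(text.lower().split())
    return float(sum(1 for term in query.lower().split() if term in words))


def min_max(scores: list[float]) -> list[float]:
    """Scale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def hybrid_rank(query, query_vector, docs, alpha=0.5):
    """Rank docs by alpha * dense + (1 - alpha) * sparse, best first."""
    dense = min_max([cosine(query_vector, d["vector"]) for d in docs])
    sparse = min_max([term_overlap(query, d["text"]) for d in docs])
    fused = [alpha * dv + (1 - alpha) * sv for dv, sv in zip(dense, sparse)]
    order = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i]["text"], fused[i]) for i in order]


# Toy data: two-dimensional "embeddings" made up for this example.
docs = [
    {"text": "vector databases store embeddings", "vector": [0.9, 0.1]},
    {"text": "keyword search with bm25", "vector": [0.1, 0.9]},
    {"text": "hybrid search combines both", "vector": [0.6, 0.6]},
]

dense_only = hybrid_rank("keyword search", [1.0, 0.0], docs, alpha=1.0)
sparse_only = hybrid_rank("keyword search", [1.0, 0.0], docs, alpha=0.0)
```

With alpha=1.0 the ranking is driven purely by vector similarity, and with alpha=0.0 purely by keyword overlap; values in between blend the two. Weaviate's hybrid queries accept an alpha argument that works in the same direction.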
Vector
Perform a vector-based search:
collection = client.collections.get("<collection_name>")
response = collection.query.near_vector(vector, limit=10)
Full-text
Execute a text-based search using BM25:
collection = client.collections.get("<collection_name>")
response = collection.query.bm25("<query>", limit=10)
Hybrid
Combine vector and text-based search in a single query:
collection = client.collections.get("<collection_name>")
response = collection.query.hybrid(
    query="<query>",
    vector=vector,
    limit=10,
)
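Each of these queries returns a response object whose objects attribute holds the matches. Here is a small sketch of turning that into plain dictionaries. The helper name summarize is hypothetical, and it relies only on the objects, uuid, and properties attributes exposed by the v4 client's query results.

```python
from typing import Any


def summarize(response: Any) -> list[dict]:
    """Flatten a query response into a list of {"uuid", "properties"} dicts."""
    return [
        {"uuid": str(obj.uuid), "properties": dict(obj.properties)}
        for obj in response.objects
    ]


# Usage with a query response (requires a running instance):
# for row in summarize(response):
#     print(row["uuid"], row["properties"])
# client.close()  # release the connection when finished
```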