DashVector
DashVector is a fully managed vector database service that supports high-dimensional dense and sparse vectors, real-time insertion, and filtered search. It is built to scale automatically and can adapt to different application requirements. DashVector is based on Proxima, the efficient vector engine core independently developed by DAMO Academy, and provides a cloud-native, fully managed vector retrieval service with horizontal scaling. It exposes vector management, vector query, and other capabilities through a simple SDK/API that upper-layer AI applications can integrate quickly, providing efficient vector retrieval for scenarios such as the large-model ecosystem, multi-modal AI search, and molecular structure analysis.
In this notebook, we'll demo the SelfQueryRetriever with a DashVector vector store.
Create DashVector vectorstore
First we'll want to create a DashVector VectorStore and seed it with some data. We've created a small demo set of documents that contain summaries of movies.
To use DashVector, you must have the dashvector package installed, and you must have an API key and an Environment. Here are the installation instructions.
NOTE: The self-query retriever requires you to have the lark package installed.
%pip install --upgrade --quiet lark dashvector
import os
import dashvector
client = dashvector.Client(api_key=os.environ["DASHVECTOR_API_KEY"])
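The client construction above reads DASHVECTOR_API_KEY straight from os.environ, which raises a bare KeyError if the variable is unset. A minimal sketch of failing earlier with a clearer message follows; require_env is a hypothetical helper introduced here for illustration and is not part of the dashvector package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper: not part of dashvector or LangChain.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before creating the DashVector client."
        )
    return value


# Intended use in the notebook (left as a comment so this block runs
# without the dashvector package or a real key):
# client = dashvector.Client(api_key=require_env("DASHVECTOR_API_KEY"))
```

This only changes the error you see when the key is missing; the client call itself is the same as in the notebook.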
from langchain_community.embeddings import DashScopeEmbeddings
from langchain_community.vectorstores import DashVector
from langchain_core.documents import Document
embeddings = DashScopeEmbeddings()
# create DashVector collection
client.create("langchain-self-retriever-demo", dimension=1536)
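The collection is created with dimension=1536, so the vectors produced by the embedding model need to have exactly that length. The sketch below shows one way to sanity-check this before inserting documents. It assumes vectors are plain lists of floats; check_dimension and COLLECTION_DIMENSION are hypothetical names used only for this illustration, and the sample vector is a stand-in rather than a real embedding.

```python
from typing import Sequence

# Assumed to equal the dimension passed to client.create(...) in the notebook.
COLLECTION_DIMENSION = 1536


def check_dimension(
    vector: Sequence[float], expected: int = COLLECTION_DIMENSION
) -> None:
    """Raise ValueError if the vector length differs from the collection dimension."""
    if len(vector) != expected:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, "
            f"but the collection expects {expected}."
        )


# Stand-in vector. In the notebook you would probe the real model instead,
# for example: sample_vector = embeddings.embed_query("dimension probe")
sample_vector = [0.0] * 1536
check_dimension(sample_vector)  # passes silently when the lengths match
```

If you switch to an embedding model with a different output size, the collection's dimension would need to change to match.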
docs = [
    Document(
        page_content="A bunch of scientists bring back dinosaurs and mayhem breaks loose",
        metadata={"year": 1993, "rating": 7.7, "genre": "action"},
    ),
    Document(
        page_content="Leo DiCaprio gets lost in a dream within a dream within a dream within a ...",
        metadata={"year": 2010, "director": "Christopher Nolan", "rating": 8.2},
    ),
    Document(
        page_content="A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea",
        metadata={"year": 2006, "director": "Satoshi Kon", "rating": 8.6},
    ),
    Document(
        page_content="A bunch of normal-sized women are supremely wholesome and some men pine after them",
        metadata={"year": 2019, "director": "Greta Gerwig", "rating": 8.3},
    ),
    Document(
        page_content="Toys come alive and have a blast doing so",
        metadata={"year": 1995, "genre": "animated"},
    ),
    Document(
        page_content="Three men walk into the Zone, three men walk out of the Zone",
        metadata={
            "year": 1979,
            "director": "Andrei Tarkovsky",
            "genre": "science fiction",
            "rating": 9.9,
        },
    ),
]
vectorstore = DashVector.from_documents(
    docs, embeddings, collection_name="langchain-self-retriever-demo"
)
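The metadata attached to each document (year, rating, genre, director) is what a filtered search can later select on. As a rough illustration only, the plain-Python sketch below applies a "rating above a threshold" condition to copies of the demo metadata. It does not use DashVector's filter syntax or the retriever; filter_by_min_rating is a hypothetical helper. Note that not every document carries every field, so missing keys have to be handled.

```python
# Copies of the metadata dictionaries from the demo documents above.
movies = [
    {"year": 1993, "rating": 7.7, "genre": "action"},
    {"year": 2010, "director": "Christopher Nolan", "rating": 8.2},
    {"year": 2006, "director": "Satoshi Kon", "rating": 8.6},
    {"year": 2019, "director": "Greta Gerwig", "rating": 8.3},
    {"year": 1995, "genre": "animated"},  # no rating field
    {
        "year": 1979,
        "director": "Andrei Tarkovsky",
        "genre": "science fiction",
        "rating": 9.9,
    },
]


def filter_by_min_rating(items, threshold):
    """Keep entries whose rating exceeds the threshold; entries without a rating are dropped."""
    return [m for m in items if m.get("rating", float("-inf")) > threshold]


highly_rated = filter_by_min_rating(movies, 8.5)
print([m["year"] for m in highly_rated])  # prints [2006, 1979]
```

The 1995 entry is excluded because it has no rating at all, which is the kind of gap to keep in mind when writing filters over this demo set.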