LangChain Integration Guide

This guide shows how to load a VXDF file into LangChain so you can query it with LLM-powered chains such as RetrievalQA.

Prerequisites

pip install vxdf langchain openai faiss-cpu

The faiss-cpu package backs the FAISS vector store used in the example below.

You should also have an OpenAI API key available, either via the OPENAI_API_KEY environment variable or saved in ~/.vxdf/config.toml (the file written by the CLI prompt shown earlier).
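
To fail fast when the key is missing, you can add a small guard before building the chain (a minimal sketch; it only checks the environment variable, since the config.toml fallback is handled by vxdf itself):

import os

# Abort early if no API key is available in the environment.
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set OPENAI_API_KEY before running the examples below.")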

Example: build a Retrieval-QA chain

from langchain.embeddings import OpenAIEmbeddings, HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

from vxdf.reader import VXDFReader

# 1. Read the VXDF file and extract documents & embeddings
docs, vectors = [], []
with VXDFReader("my.vxdf") as r:
    for doc_id in r.header["docs"]:
        chunk = r.get_chunk(doc_id)
        docs.append(chunk["text"])
        vectors.append(chunk["vector"])

# 2. Turn the stored embeddings into a FAISS index. from_embeddings takes
#    (text, vector) pairs plus an embeddings object that LangChain uses to
#    embed incoming queries, so it must match the model that built the file.
embeddings = OpenAIEmbeddings()
index = FAISS.from_embeddings(list(zip(docs, vectors)), embeddings)

# 3. Build the retrieval chain
retriever = index.as_retriever(search_kwargs={"k": 5})
llm = ChatOpenAI(model_name="gpt-4o", temperature=0)
chain = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)

# 4. Ask a question
result = chain.run("What is the billing address on invoice FB8D355A-0009?")
print(result)
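
To see which chunks an answer was grounded in, the same chain can also return the retrieved documents (a sketch reusing llm and retriever from above; note that a chain with multiple outputs is called with a dict rather than .run()):

# Build the chain again, this time asking for the source chunks too.
chain = RetrievalQA.from_chain_type(
    llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)
result = chain({"query": "What is the billing address on invoice FB8D355A-0009?"})
print(result["result"])
for doc in result["source_documents"]:
    print(doc.page_content[:80])  # preview each supporting chunk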

Using a local embedding model

If your VXDF was created with the default local model (all-MiniLM-L6-v2), replace OpenAIEmbeddings with HuggingFaceEmbeddings so that queries are embedded by the same model that produced the stored vectors (HuggingFaceEmbeddings is backed by the sentence-transformers package, so pip install sentence-transformers first):

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
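
The rest of the pipeline is unchanged: pass this embedder to FAISS in place of OpenAIEmbeddings (docs and vectors are the lists built in step 1 above):

# Query vectors now come from the same local model as the stored vectors.
index = FAISS.from_embeddings(list(zip(docs, vectors)), embeddings)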

That’s it—LangChain sees VXDF as just another vector store.
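
As a final convenience, the FAISS wrapper can persist the index to disk so repeat runs skip re-reading the VXDF file (a sketch using LangChain's save_local/load_local; the folder name faiss_index is arbitrary, and newer LangChain versions may also require allow_dangerous_deserialization=True when loading):

# Save the index once...
index.save_local("faiss_index")

# ...then reload it in a later session with the matching embeddings object.
index = FAISS.load_local("faiss_index", embeddings)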

Further reading