Vector Processing Internals

Pipeline

The vector pipeline converts knowledge resources into embeddings for semantic search. Guide's SearchContent tool queries these embeddings at runtime.

data/resources/*.json  ──>  TEI (BAAI/bge-small-en-v1.5)  ──>  data/vectors/index.jsonl
                                                                       │
                                                            services/vector (gRPC)
                                                                       │
                                                              Guide SearchContent

Two separate concerns use TEI:

Concern          When                  Who calls TEI
Batch indexing   fit-process-vectors   CLI → TEI /v1/embeddings
Query embedding  Runtime search        vector service → TEI /v1/embeddings

Both read EMBEDDING_BASE_URL from .env (default http://localhost:8090). Authentication is not required for local TEI.
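Both callers send the same request shape to TEI's OpenAI-compatible /v1/embeddings endpoint. A minimal sketch of building that request — the function name is illustrative, not part of the project's API; the payload shape follows TEI's OpenAI-compatible interface:

```typescript
// Build a request for TEI's OpenAI-compatible /v1/embeddings endpoint.
// buildEmbeddingRequest is an illustrative helper, not a project export.
function buildEmbeddingRequest(baseUrl: string, texts: string[]) {
  return {
    url: `${baseUrl}/v1/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "BAAI/bge-small-en-v1.5",
        input: texts,
      }),
    },
  };
}

// At runtime you would `await fetch(url, init)`; each entry in the
// response's data array carries a 384-element embedding.
const { url, init } = buildEmbeddingRequest("http://localhost:8090", [
  "hello world",
]);
```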


TEI (Text Embeddings Inference)

TEI is a Rust binary from HuggingFace that serves embedding models over HTTP. It must be running for both batch processing and query-time search to work.

Installation

Install once via Cargo:

just tei-install

Or manually:

cargo install --git https://github.com/huggingface/text-embeddings-inference \
  --features candle text-embeddings-router

The first startup downloads the BAAI/bge-small-en-v1.5 model (~130 MB) to ~/.cache/huggingface/.

Running Natively

Run TEI in the foreground — useful for one-off batch processing or when you want direct control:

just tei-start

This runs text-embeddings-router on port 8090 and blocks the terminal. Stop with Ctrl-C. Equivalent to:

text-embeddings-router --model-id BAAI/bge-small-en-v1.5 --port 8090 --json-output

Running Under fit-rc

TEI is registered as an optional supervised service in config/config.json. When running the full service stack for development, fit-rc keeps TEI alive alongside the other services:

bunx fit-rc start              # Starts all services including TEI
bunx fit-rc start tei          # Start TEI only
bunx fit-rc status tei         # Check TEI status
bunx fit-rc stop tei           # Stop TEI

Because the service is marked optional, fit-rc skips it with a warning if text-embeddings-router is not installed. The external starter config does not include TEI — it is too heavy a dependency for a quick-start install.

Docker

docker-compose.yml defines a tei container using the HuggingFace CPU image. It listens on port 8080 inside the Docker network (tei.local:8080), accessible only to other containers. This is intended for containerized deployments, not local development.
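A sketch of what that service definition might look like — the actual docker-compose.yml is authoritative; the image tag and network alias here are assumptions:

```yaml
services:
  tei:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-latest
    command: --model-id BAAI/bge-small-en-v1.5 --port 8080
    networks:
      default:
        aliases:
          - tei.local   # other containers reach TEI at tei.local:8080
```

No host port is published, which is why this container is reachable only from inside the Docker network.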

Health Check

curl http://localhost:8090/health

Returns 200 when TEI is ready to serve requests.
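Scripts that depend on TEI can poll /health before starting work. A minimal readiness loop, assuming /health returns 200 once the model is loaded (the function and its parameters are illustrative; fetchFn is injectable so the loop can be exercised without a live server):

```typescript
// Poll TEI's /health endpoint until it returns 200 or attempts run out.
// waitForTei is an illustrative helper, not part of the repo.
async function waitForTei(
  baseUrl: string,
  fetchFn: (url: string) => Promise<{ status: number }>,
  maxAttempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < maxAttempts; i++) {
    try {
      const res = await fetchFn(`${baseUrl}/health`);
      if (res.status === 200) return true; // TEI is ready
    } catch {
      // connection refused: server not up yet, fall through to retry
    }
    if (i < maxAttempts - 1) await new Promise((r) => setTimeout(r, delayMs));
  }
  return false;
}
```

In real use you would pass the global fetch and the EMBEDDING_BASE_URL value as baseUrl.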


Batch Processing

Resources must exist before vectors can be generated. The full processing chain is:

just process                   # export-framework → process-resources → process-graphs → process-vectors

To process only vectors (when resources already exist):

just process-vectors
# or directly:
bunx fit-process-vectors

Without TEI, use just process-fast to skip the vector step.

Processing Steps

  1. Load all resource identifiers from data/resources/.
  2. Filter out conversations (common.Conversation.*) and tool functions (tool.ToolFunction.*).
  3. Skip resources already present in the vector index (incremental).
  4. Batch remaining resources and POST to TEI /v1/embeddings.
  5. Write each embedding to data/vectors/index.jsonl.
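Steps 2–3 amount to a pure filter over resource ids. A sketch of that selection logic, using the prefixes named above (the function name is illustrative; the real VectorProcessor applies the same rules inside its batch loop):

```typescript
// Decide which resource ids still need embedding: drop conversations and
// tool functions, then drop anything already present in the vector index.
// selectForEmbedding is an illustrative helper, not a project export.
function selectForEmbedding(allIds: string[], indexed: Set<string>): string[] {
  return allIds.filter(
    (id) =>
      !id.startsWith("common.Conversation.") &&
      !id.startsWith("tool.ToolFunction.") &&
      !indexed.has(id),
  );
}
```

The `indexed` set is what makes reruns incremental: only ids absent from data/vectors/index.jsonl are batched to TEI.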

Index Format

Each line in data/vectors/index.jsonl is a self-contained JSON record:

{
  "id": "common.Message.a4663ad1",
  "identifier": {
    "type": "common.Message",
    "name": "a4663ad1",
    "parent": "",
    "subjects": ["https://example.com/id/person/alice"],
    "tokens": 120
  },
  "vector": [-0.048, -0.039, ...]
}
  • Dimensions: 384 (bge-small-en-v1.5)
  • Normalization: Pre-normalized, so cosine similarity reduces to dot product
  • Search: VectorIndex.queryItems() computes dot products with SIMD-style loop unrolling, filters by prefix and token budget, returns ranked identifiers
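Because the stored vectors are pre-normalized, ranking by cosine similarity is just ranking by dot product. A sketch of an unrolled dot-product kernel in the spirit described above — the real VectorIndex.queryItems() additionally applies the prefix and token-budget filters:

```typescript
// Dot product with 4-way loop unrolling over Float32Array vectors.
// With unit-length inputs this value equals the cosine similarity.
function dot(a: Float32Array, b: Float32Array): number {
  let s0 = 0, s1 = 0, s2 = 0, s3 = 0;
  const n = a.length - (a.length % 4);
  for (let i = 0; i < n; i += 4) {
    s0 += a[i] * b[i];
    s1 += a[i + 1] * b[i + 1];
    s2 += a[i + 2] * b[i + 2];
    s3 += a[i + 3] * b[i + 3];
  }
  let s = s0 + s1 + s2 + s3;
  // Handle the tail when the dimension is not a multiple of 4
  // (bge-small-en-v1.5's 384 dimensions divide evenly, so this is a no-op there).
  for (let i = n; i < a.length; i++) s += a[i] * b[i];
  return s;
}
```

The four independent accumulators let the JIT keep multiple multiply-adds in flight per iteration, which is the "SIMD-style" part of the search loop.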

Components

Component            Location                             Role
VectorProcessor      libraries/libvector/src/processor/   Batch embedding pipeline (extends ProcessorBase)
VectorIndex          libraries/libvector/src/index/       JSONL-backed index with dot-product search
fit-process-vectors  libraries/libvector/bin/             CLI for batch processing
fit-search           libraries/libvector/bin/             CLI for ad-hoc similarity search
vector service       services/vector/                     gRPC service that loads the index and serves queries

Troubleshooting

TEI not reachable — Check curl http://localhost:8090/health. Verify EMBEDDING_BASE_URL in .env matches the port TEI is running on.

No resources to process — Run just process-resources first. The vector processor reads from data/resources/.

Empty vector index — Resources with empty or null content are skipped. Verify data/resources/ contains files with non-empty content fields.

Model download stalls — The first TEI startup downloads the model from HuggingFace. Check network access and ~/.cache/huggingface/ for partial downloads.