Vector Processing Internals

Pipeline

The vector pipeline converts knowledge resources into embeddings for semantic search. Guide's SearchContent tool queries these embeddings at runtime.

data/resources/*.json  ──>  TEI (BAAI/bge-small-en-v1.5)  ──>  data/vectors/index.jsonl
                                                                       │
                                                            services/vector (gRPC)
                                                                       │
                                                              Guide SearchContent

Two separate concerns use TEI:

Concern When Who calls TEI
Batch indexing fit-process-vectors CLI → embedding service (gRPC) → TEI
Query embedding Runtime search vector service → embedding service (gRPC) → TEI

Both go through the embedding service (SERVICE_EMBEDDING_URL), which proxies to TEI internally. Authentication is handled by the gRPC framework.


TEI (Text Embeddings Inference)

TEI is a Rust binary from HuggingFace that serves embedding models over HTTP. It must be running before batch processing or query-time search will work.

Installation

Install once via Cargo:

just tei-install

Or manually:

cargo install --git https://github.com/huggingface/text-embeddings-inference \
  --features candle text-embeddings-router

The first startup downloads the BAAI/bge-small-en-v1.5 model (~130 MB) to ~/.cache/huggingface/.

Running Under fit-rc

The embedding service wraps TEI — it starts a gRPC server and spawns text-embeddings-router as a managed child process. Add the service entry to config/config.json under init.services:

{
  "name": "embedding",
  "command": "node services/embedding/server.js",
  "optional": true
}

Then start it with fit-rc:

just tei-start                     # or: bunx fit-rc start embedding
bunx fit-rc status embedding       # Check status
bunx fit-rc stop embedding         # Stop

Because the service is marked optional, fit-rc skips it with a warning if text-embeddings-router is not installed.

The gRPC server listens on SERVICE_EMBEDDING_URL (default grpc://localhost:3015). TEI runs on an internal backend port (default 8090, configurable via SERVICE_EMBEDDING_BACKEND_PORT).

Docker

docker-compose.yml defines a tei container using the HuggingFace CPU image. It listens on port 8080 inside the Docker network (tei.local:8080), accessible only to other containers. This is intended for containerized deployments, not local development.

Health Check

curl http://localhost:8090/health

Returns 200 when TEI is ready to serve requests.


Batch Processing

Resources must exist before vectors can be generated. The full processing chain is:

just process                   # export-standard → process-resources → process-graphs → process-vectors

To process only vectors (when resources already exist):

just process-vectors
# or directly:
bunx fit-process-vectors

Without TEI, use just process-fast to skip the vector step.

Processing Steps

  1. Load all resource identifiers from data/resources/.
  2. Filter out conversations (common.Conversation.*) and tool functions (tool.ToolFunction.*).
  3. Skip resources already present in the vector index (incremental).
  4. Batch remaining resources and POST to TEI /v1/embeddings.
  5. Write each embedding to data/vectors/index.jsonl.

Index Format

Each line in data/vectors/index.jsonl is a self-contained JSON record:

{
  "id": "common.Message.a4663ad1",
  "identifier": {
    "type": "common.Message",
    "name": "a4663ad1",
    "parent": "",
    "subjects": ["https://example.com/id/person/alice"],
    "tokens": 120
  },
  "vector": [-0.048, -0.039, ...]
}
  • Dimensions: 384 (bge-small-en-v1.5)
  • Normalization: Pre-normalized, so cosine similarity reduces to dot product
  • Search: VectorIndex.queryItems() computes dot products with SIMD-style loop unrolling, filters by prefix and token budget, returns ranked identifiers

Components

Component Location Role
VectorProcessor libraries/libvector/src/processor/ Batch embedding pipeline (extends ProcessorBase)
VectorIndex libraries/libvector/src/index/ JSONL-backed index with dot-product search
fit-process-vectors libraries/libvector/bin/ CLI for batch processing
fit-search libraries/libvector/bin/ CLI for ad-hoc similarity search
vector service services/vector/ gRPC service that loads the index and serves queries

Troubleshooting

TEI not reachable — Check curl http://localhost:8090/health. Verify the embedding service is running (bunx fit-rc status embedding).

No resources to process — Run just process-resources first. The vector processor reads from data/resources/.

Empty vector index — Resources with empty or null content are skipped. Verify data/resources/ contains files with non-empty content fields.

Model download stalls — The first TEI startup downloads the model from HuggingFace. Check network access and ~/.cache/huggingface/ for partial downloads.

What's next