Vector Processing Internals
Pipeline
The vector pipeline converts knowledge resources into embeddings for
semantic search. Guide's SearchContent tool queries
these embeddings at runtime.
data/resources/*.json ──> TEI (BAAI/bge-small-en-v1.5) ──> data/vectors/index.jsonl
                                                                       │
                                                            services/vector (gRPC)
                                                                       │
                                                              Guide SearchContent
Two separate concerns use TEI:
| Concern | When | Who calls TEI |
|---|---|---|
| Batch indexing | fit-process-vectors | CLI → embedding service (gRPC) → TEI |
| Query embedding | Runtime search | vector service → embedding service (gRPC) → TEI |
Both go through the embedding service
(SERVICE_EMBEDDING_URL), which proxies to TEI
internally. Authentication is handled by the gRPC framework.
TEI (Text Embeddings Inference)
TEI is a Rust binary from HuggingFace that serves embedding models over HTTP. It must be running before batch processing or query-time search will work.
Installation
Install once via Cargo:
just tei-install
Or manually:
cargo install --git https://github.com/huggingface/text-embeddings-inference \
  --features candle text-embeddings-router
The first startup downloads the
BAAI/bge-small-en-v1.5 model (~130 MB) to
~/.cache/huggingface/.
Running Under fit-rc
The embedding service wraps TEI — it starts a gRPC server and spawns
text-embeddings-router as a managed child process. Add
the service entry to config/config.json under
init.services:
{
  "name": "embedding",
  "command": "node services/embedding/server.js",
  "optional": true
}
Then start it with fit-rc:
just tei-start # or: bunx fit-rc start embedding
bunx fit-rc status embedding # Check status
bunx fit-rc stop embedding # Stop
Because the service is marked optional, fit-rc skips it
with a warning if text-embeddings-router is not
installed.
The gRPC server listens on
SERVICE_EMBEDDING_URL (default
grpc://localhost:3015). TEI runs on an internal backend
port (default 8090, configurable via
SERVICE_EMBEDDING_BACKEND_PORT).
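The two settings above can be resolved in one place. The following is a minimal sketch, not the embedding service's actual code: `resolveEmbeddingConfig` is a hypothetical helper, and it assumes TEI is bound to localhost on the backend port, as the health check URL in this guide suggests.

```javascript
// Hypothetical helper: resolves the embedding endpoints from the
// environment, falling back to the defaults documented above.
function resolveEmbeddingConfig(env = process.env) {
  const serviceUrl = env.SERVICE_EMBEDDING_URL || "grpc://localhost:3015";
  const rawPort = env.SERVICE_EMBEDDING_BACKEND_PORT || "8090";
  const backendPort = Number(rawPort);

  // Reject values that cannot be a TCP port before anything tries to bind.
  if (!Number.isInteger(backendPort) || backendPort < 1 || backendPort > 65535) {
    throw new Error(`Invalid SERVICE_EMBEDDING_BACKEND_PORT: ${rawPort}`);
  }

  return {
    serviceUrl,
    backendPort,
    // Assumption: TEI listens on localhost at the backend port.
    healthUrl: `http://localhost:${backendPort}/health`,
  };
}
```

Passing the environment as an argument keeps the helper easy to exercise with different values, for example `resolveEmbeddingConfig({ SERVICE_EMBEDDING_BACKEND_PORT: "9000" })`.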
Docker
docker-compose.yml defines a tei container
using the HuggingFace CPU image. It listens on port 8080 inside the
Docker network (tei.local:8080), accessible only to
other containers. This is intended for containerized deployments,
not local development.
Health Check
curl http://localhost:8090/health
Returns 200 when TEI is ready to serve requests.
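Scripts that depend on TEI may want to wait for this endpoint rather than check it once, since the first startup can be slow while the model downloads. The sketch below polls the health URL until it returns 200. `waitForTei` and its options are hypothetical names, and the default `fetchFn` assumes a runtime with a global `fetch` (Node 18+ or Bun).

```javascript
// Polls a health URL until it returns HTTP 200 or the attempts run out.
// `fetchFn` is injectable so the helper can be exercised without a live TEI.
async function waitForTei(
  healthUrl,
  { attempts = 30, delayMs = 1000, fetchFn = fetch } = {},
) {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchFn(healthUrl);
      if (res.status === 200) return true;
    } catch {
      // Connection refused or similar: TEI is not listening yet, so retry.
    }
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

A caller could run `await waitForTei("http://localhost:8090/health")` before starting batch processing and abort with a clear message when it resolves to `false`.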
Batch Processing
Resources must exist before vectors can be generated. The full processing chain is:
just process # export-standard → process-resources → process-graphs → process-vectors
To process only vectors (when resources already exist):
just process-vectors
# or directly:
bunx fit-process-vectors
Without TEI, use just process-fast to skip the vector
step.
Processing Steps
- Load all resource identifiers from data/resources/.
- Filter out conversations (common.Conversation.*) and tool functions (tool.ToolFunction.*).
- Skip resources already present in the vector index (incremental).
- Batch remaining resources and POST to TEI /v1/embeddings.
- Write each embedding to data/vectors/index.jsonl.
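The filtering, incremental skip, and batching steps above can be sketched as two small pure functions. This is an illustration only, not the VectorProcessor implementation: `selectPending` and `toBatches` are hypothetical names, identifiers are assumed to be plain strings such as `common.Message.a4663ad1`, and the batch size is an arbitrary parameter.

```javascript
// Identifier prefixes that are excluded from embedding, per the steps above.
const EXCLUDED_PREFIXES = ["common.Conversation.", "tool.ToolFunction."];

// Drops excluded resource types and ids that are already in the index.
// `indexedIds` is a Set of ids read from the existing index file.
function selectPending(resourceIds, indexedIds) {
  return resourceIds.filter(
    (id) =>
      !EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix)) &&
      !indexedIds.has(id),
  );
}

// Splits the pending items into consecutive batches of at most `batchSize`.
function toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Because already-indexed ids are removed before batching, rerunning the pipeline after an interruption only sends the resources that are still missing.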
Index Format
Each line in data/vectors/index.jsonl is a
self-contained JSON record:
{
  "id": "common.Message.a4663ad1",
  "identifier": {
    "type": "common.Message",
    "name": "a4663ad1",
    "parent": "",
    "subjects": ["https://example.com/id/person/alice"],
    "tokens": 120
  },
  "vector": [-0.048, -0.039, ...]
}
- Dimensions: 384 (bge-small-en-v1.5)
- Normalization: Pre-normalized, so cosine similarity reduces to dot product
- Search: VectorIndex.queryItems() computes dot products with SIMD-style loop unrolling, filters by prefix and token budget, returns ranked identifiers
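To make the search behaviour concrete, here is a simplified sketch of dot-product ranking over parsed index records. It is not the `VectorIndex.queryItems()` implementation: `parseIndex`, `dot`, and `rankRecords` are hypothetical names, the loop is not unrolled, and the token budget policy (skip a record that would exceed the budget and keep scanning) is an assumption.

```javascript
// Parses JSONL text into records, ignoring blank lines.
function parseIndex(jsonl) {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Dot product; equals cosine similarity when both vectors are unit length.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Keeps records whose id starts with `prefix`, ranks them by score, and
// collects results while the running token total stays within `maxTokens`.
function rankRecords(
  records,
  queryVector,
  { prefix = "", maxTokens = Infinity, limit = 10 } = {},
) {
  const ranked = records
    .filter((r) => r.id.startsWith(prefix))
    .map((r) => ({
      id: r.id,
      identifier: r.identifier,
      score: dot(r.vector, queryVector),
    }))
    .sort((x, y) => y.score - x.score);

  const results = [];
  let usedTokens = 0;
  for (const item of ranked) {
    if (results.length >= limit) break;
    const tokens = item.identifier.tokens ?? 0;
    // Assumption: an over-budget record is skipped rather than ending the scan.
    if (usedTokens + tokens > maxTokens) continue;
    usedTokens += tokens;
    results.push(item);
  }
  return results;
}
```

The query vector must come from the same model and be normalized the same way as the indexed vectors, otherwise the dot product no longer equals cosine similarity.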
Components
| Component | Location | Role |
|---|---|---|
| VectorProcessor | libraries/libvector/src/processor/ | Batch embedding pipeline (extends ProcessorBase) |
| VectorIndex | libraries/libvector/src/index/ | JSONL-backed index with dot-product search |
| fit-process-vectors | libraries/libvector/bin/ | CLI for batch processing |
| fit-search | libraries/libvector/bin/ | CLI for ad-hoc similarity search |
| vector service | services/vector/ | gRPC service that loads the index and serves queries |
Troubleshooting
TEI not reachable — Check
curl http://localhost:8090/health. Verify the embedding
service is running (bunx fit-rc status embedding).
No resources to process — Run
just process-resources first. The vector processor
reads from data/resources/.
Empty vector index — Resources with empty or null
content are skipped. Verify data/resources/ contains
files with non-empty content fields.
Model download stalls — The first TEI startup
downloads the model from HuggingFace. Check network access and
~/.cache/huggingface/ for partial downloads.
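When the index exists but search results look wrong, a quick structural check of the JSONL can help narrow things down. The sketch below verifies the properties described under Index Format (384 dimensions, unit-length vectors). `validateIndex` is a hypothetical helper, and the norm tolerance is an assumed value, not one taken from the project.

```javascript
// Checks each non-blank JSONL line for the documented shape and returns
// the record count plus a list of human-readable problems.
function validateIndex(jsonl, { dimensions = 384, tolerance = 1e-3 } = {}) {
  const problems = [];
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");

  lines.forEach((line, i) => {
    const where = `record ${i + 1}`; // position among non-blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      problems.push(`${where}: invalid JSON`);
      return;
    }
    const v = record.vector;
    if (!Array.isArray(v) || v.length !== dimensions) {
      problems.push(`${where}: expected a vector of ${dimensions} dimensions`);
      return;
    }
    // Pre-normalized vectors should have a Euclidean norm of about 1.
    const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
    if (Math.abs(norm - 1) > tolerance) {
      problems.push(`${where}: vector norm ${norm.toFixed(4)} is not 1`);
    }
  });

  return { records: lines.length, problems };
}
```

A result with `records` equal to 0 points back to the empty-index case above, while dimension or norm problems suggest the index was built with a different model or without normalization.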