Embedding Service

The embedding service provides local vector embeddings for semantic similarity search across knowledge base entries. It uses the `sentence-transformers` library with the `all-MiniLM-L6-v2` model to generate 384-dimensional embeddings, stored in `sqlite-vec` for efficient KNN queries. The entire pipeline runs locally with no external API calls.

Key Files

| File | Purpose | |------|---------| | `pyrite/services/embedding_service.py` | Full implementation: model loading, embedding generation, batch indexing, KNN search |

API / Key Classes

`EmbeddingService`

Initialized with a `PyriteDB` instance and an optional model name (defaults to `all-MiniLM-L6-v2`).

| Method | Description | |--------|-------------| | `embed_text(text)` | Generate a float32 embedding vector for arbitrary text | | `embed_entry(entry_id, kb_name)` | Embed a single entry (title + summary + first 500 chars of body) and upsert into `vec_entry` | | `embed_all(kb_name, force, progress_callback)` | Batch embed all entries for a KB (or all KBs). Returns `{embedded, skipped, errors}` counts. Skips already-embedded entries unless `force=True` | | `search_similar(query, kb_name, limit, max_distance)` | KNN search: embeds query, runs `sqlite-vec MATCH` with cosine distance, filters by KB and distance cutoff (default 1.1), returns up to `limit` results with `distance` and `snippet` fields | | `has_embeddings()` | Check if any embeddings exist in the database | | `embedding_stats()` | Returns `{available, count, total_entries, coverage}` statistics |

Module-Level Functions

| Function | Description | |----------|-------------| | `is_available()` | Check if `sentence-transformers` is installed (optional dependency) | | `_entry_text(entry)` | Combine title + summary + body[:500] into embedding input text | | `_generate_snippet(entry, max_len)` | Generate a display snippet from summary or first paragraph of body | | `_embedding_to_blob(embedding)` / `_blob_to_embedding(blob)` | Serialize/deserialize float32 vectors to bytes for sqlite-vec storage |

Design Notes

Lazy model loading. The `SentenceTransformer` model is loaded on first use (`_get_model`) to avoid startup cost when semantic search is not needed.

Optional dependency. The service requires `pip install pyrite[semantic]`. The `is_available()` function lets callers gracefully degrade when sentence-transformers is not installed.

sqlite-vec integration. Embeddings are stored in a virtual table (`vec_entry`) keyed by the `entry` table's `rowid`. The KNN query uses `WHERE v.embedding MATCH ? AND k = ?` with `ORDER BY v.distance` for cosine similarity ranking.

Over-fetch strategy. `search_similar` fetches `limit 2` (or `limit 3` when filtering by KB) from sqlite-vec to account for KB filtering and distance cutoff before trimming to the requested limit.

Embedding text composition. Each entry is embedded as `title + summary + body[:500]`, giving weight to metadata while capping body length to keep embeddings focused.

Consumers

Search service — uses `EmbeddingService.search_similar()` for `--mode semantic` and `--mode hybrid` search

CLI `pyrite search` — exposes semantic search to the command line

REST API search endpoint — provides semantic search via the `/search` route

MCP server — exposes semantic search as an MCP tool

search-service — orchestrates keyword + semantic search modes

storage-layer — `PyriteDB` manages the `vec_entry` virtual table and `vec_available` flag

background-embedding-worker — async embed_queue processor

background-embedding-pipeline — backlog item that delivered the pipeline

Embedding Servicecomponent