The RAG pipeline we actually want

Most RAG stacks treat tacit and explicit knowledge as the same shape. They aren't. Here's the pipeline we're building.

Tacit and explicit knowledge aren't the same shape

Explicit knowledge is text. Chunk it, embed it, retrieve it, re-rank, generate. Standard fare.

Tacit knowledge is different. It's:

spoken, not written
relational ("we did this because Karen pushed back")
temporal ("this was true in March, less true now")
contextual ("this applies to enterprise, not self-serve")

A RAG pipeline that flattens those signals into a 1024-dim embedding loses what makes the knowledge worth retrieving.
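One way to keep those signals is to store each piece of tacit knowledge as a typed record next to its embedding, rather than as bare text. The sketch below is illustrative only: the `Claim` class, its field names, and the `is_applicable` helper are assumptions made up for this example, not tacitylab's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ClaimType(Enum):
    DECISION = "decision"
    PRINCIPLE = "principle"
    ANECDOTE = "anecdote"
    ANTI_PATTERN = "anti-pattern"


@dataclass(frozen=True)
class Claim:
    """Hypothetical record: the text plus the signals an embedding alone drops."""
    text: str
    claim_type: ClaimType
    speaker: str                          # who said it (spoken, not written)
    stated_on: date                       # temporal: when it was true
    people: tuple[str, ...] = ()          # relational: who was involved
    applies_to: tuple[str, ...] = ()      # contextual: empty means unscoped
    valid_until: Optional[date] = None    # temporal: None means still current


def is_applicable(claim: Claim, segment: str, as_of: date) -> bool:
    """Filter on context and time before any similarity score is considered."""
    in_scope = not claim.applies_to or segment in claim.applies_to
    already_stated = claim.stated_on <= as_of
    still_valid = claim.valid_until is None or as_of <= claim.valid_until
    return in_scope and already_stated and still_valid
```

With a record like this, "this applies to enterprise, not self-serve" becomes a filter on `applies_to`, and "this was true in March, less true now" becomes a check against `valid_until`, so neither has to be inferred from vector distance.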

What we're shipping

The tacitylab pipeline is built around four layers:

Capture layer. Async structured interviews and short video prompts. Each clip is transcribed, summarised, and decomposed into typed claims (decision, principle, anecdote, anti-pattern, etc.).
Ingestion layer. Read-only connectors fold in docs, threads, tickets, meeting notes. We never copy raw content out of your tenant.
Retrieval layer. Hybrid retrieval (lexical + semantic) plus a re-ranker we tune on captured interviews. Claim-type-aware: a "decision" query is routed differently than a "how-to" query.
Grounding layer. Every response carries source, speaker, date. If the evidence is weak, tacitylab refuses. We'd rather say "I don't know" than hallucinate.
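To make the retrieval and grounding layers concrete, here is a minimal sketch of two of their pieces: merging a lexical and a semantic ranking, and refusing when the evidence is weak. The text above does not say how tacitylab fuses results, so reciprocal rank fusion is swapped in here as one common option. The `Evidence` class, the `ground` helper, the 0-to-1 score scale, and the `min_score` threshold are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """Hypothetical retrieved item carrying the source, speaker, and date."""
    claim_id: str
    source: str
    speaker: str
    stated_on: date
    score: float  # assumed re-ranker confidence in [0, 1]


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each list adds 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, claim_id in enumerate(ranking, start=1):
            scores[claim_id] = scores.get(claim_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def ground(evidence: list[Evidence], min_score: float = 0.5) -> Optional[list[Evidence]]:
    """Return the evidence that clears the threshold, or None to signal a refusal."""
    strong = [e for e in evidence if e.score >= min_score]
    return strong or None


lexical = ["a", "b", "c"]    # e.g. keyword-search order
semantic = ["b", "c", "d"]   # e.g. embedding-search order
fused = reciprocal_rank_fusion([lexical, semantic])
```

A caller would treat a `None` from `ground` as the cue to answer "I don't know" instead of generating from thin evidence.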

What's hard

The hard part isn't the embedding model. It's the capture: getting a senior engineer to brain-dump fifteen years of context in a form that's still worth retrieving five years from now. That's the bet.

Make what your team knows, findable.
