Brain Memory Hybrid Search

Hybrid search for an agent's memory/knowledge corpus: BM25 (lexical) + pgvector (semantic), RRF score fusion…

A complete recipe for a hybrid memory-search endpoint that combines BM25 lexical search with pgvector semantic search and fuses them with Reciprocal Rank Fusion, returning a diverse top 5. It recalls thousands of accumulated notes through one RAG endpoint with a sub-200ms P95 target and injects the results into agent context. BM25 wins where an exact term match matters, vectors win where intent matters, and the fused ranking beats either alone.

$15 one-time

Prices include 20% VAT. · Forged on real agency work · one-time, no lock-in

Type: Skill
Category: AI & LLM
Delivery: Email · instant
License: One-time

Inside the run · no black box

See the actual work before you buy it.

Every recall query fires two searches at once, lexical and semantic, then fuses the results. What follows is the full pipeline from chunking and masking to the five diverse chunks that land in agent context under 200ms.

Chunks each memory file into 512-token windows with 64-token overlap, masks personal data and secrets before any embedding API call, and tags every chunk with its source file and cluster
Embeds chunks in batches of 64 and upserts them into Postgres with an embedding version label, alongside an auto-generated full-text index on the same rows
At query time, runs keyword search (BM25 over the text index) and vector search (HNSW cosine) in parallel, each returning its top 100 ranked candidates
Fuses the two lists with Reciprocal Rank Fusion (k=60), a rank-based merge that needs no score normalization, so a document strong in either modality rises
Applies a diversity filter before returning: maximum 2 chunks per source file plus a per-cluster quota, so one old note cannot dominate the injected context
Returns the top 5 chunks into the agent context, bumps their recall counters asynchronously, and logs latency against the 200ms budget; chunks recalled too often get flagged for staleness review
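
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the shipped code: the function names, the `meta` mapping, and the per-cluster quota of 3 are hypothetical (the page only says a per-cluster quota exists), and the token lists stand in for whatever tokenizer and search calls the real pipeline uses. The 512/64 windows, k=60, the cap of 2 chunks per source file, and the top 5 come from the steps listed above.

```python
from collections import defaultdict


def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the file
    return chunks


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each list adds 1 / (k + rank) to a chunk's score."""
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to the id so output is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def diversify(ranked_ids, meta, limit=5, per_file=2, per_cluster=3):
    """Walk the fused ranking and skip chunks whose file or cluster is full.

    `meta` maps chunk id -> (source_file, cluster). `per_cluster=3` is an
    assumed value chosen for illustration only.
    """
    file_counts = defaultdict(int)
    cluster_counts = defaultdict(int)
    picked = []
    for cid in ranked_ids:
        source_file, cluster = meta[cid]
        if file_counts[source_file] >= per_file:
            continue
        if cluster_counts[cluster] >= per_cluster:
            continue
        picked.append(cid)
        file_counts[source_file] += 1
        cluster_counts[cluster] += 1
        if len(picked) == limit:
            break
    return picked


# Toy candidate lists standing in for the keyword and vector top 100.
lexical = ["a", "b", "c"]
semantic = ["c", "a", "d"]
meta = {
    "a": ("notes/x.md", "ops"),
    "b": ("notes/x.md", "billing"),
    "c": ("notes/x.md", "ops"),
    "d": ("notes/y.md", "billing"),
}

fused = rrf_fuse([lexical, semantic])   # ['a', 'c', 'b', 'd']
context = diversify(fused, meta)        # ['a', 'c', 'd']
```

In the toy data, chunk `b` is dropped because `notes/x.md` has already contributed two chunks, which is the behavior the diversity filter is meant to enforce. Because the fusion works on ranks only, the BM25 and cosine scores never need to be put on a common scale.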

Use cases · what happens when you plug it in

One power source. 6 lines out.

Injecting relevant past notes into agent context at the start of a task
Standing up a hybrid RAG endpoint on Supabase with pgvector plus tsvector
Migrating loose JSON memory files into an indexed Postgres table
Tracking how often a note is recalled to detect self-reinforcing bias
Planning a re-embed when upgrading the embedding model and detecting drift
Adding paid-access course content search with row-level security

Benefits · what you walk away with

Yours to keep.

Forever. That's what owning means.

The rented stack

ai writing tool, analytics suite, design platform: all subscriptions. Once they expire, access is lost and nothing is left.

Your forge

Recall that beats single-mode search by combining exact-term and semantic matching
Fast retrieval with a sub-200ms P95 target via tuned HNSW and GIN indexes
Balanced context that avoids one-sided bias through source-file and cluster diversity caps
Near-zero embedding cost and lower latency through batching and a short-TTL query cache

License: perpetual. Subscriptions expire; deeds don't.
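
To make the short-TTL query cache concrete, here is a small sketch of one way it could work: repeated queries inside the TTL window reuse the stored embedding instead of calling the embedding API again. The class name, the 60-second default, and the `fake_embed` stand-in are assumptions for illustration; the page does not state the actual TTL or cache backend.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]           # fresh entry: skip the expensive call
        value = compute(key)
        self._store[key] = (now, value)
        return value


# Hypothetical stand-in for a real embedding API call.
embed_calls = []


def fake_embed(query):
    embed_calls.append(query)
    return [float(len(query))]


cache = TTLCache(ttl_seconds=60.0)
first = cache.get_or_compute("reset password error", fake_embed)
second = cache.get_or_compute("reset password error", fake_embed)
# The second lookup is served from the cache, so fake_embed ran only once.
```

A short TTL keeps the cache from serving embeddings produced by an older model version for long, at the cost of a few extra API calls.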

What's included · the full manifest

Everything in the box.

Part 01 of 06: Full schema with tsvector GIN and pgvector HNSW indexes plus row-level security

6 parts · one working system · ships instantly by email
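
For a sense of what a schema like the one in part 01 might look like, here is a hedged sketch held as a Python string so it can be run through any Postgres driver. It is not the shipped schema: the table and column names, the 1536-dimension vector, the `'english'` text-search configuration, and the owner-based policy are all assumptions, and `auth.uid()` is the Supabase helper, so plain Postgres would need a different ownership check.

```python
# Illustrative schema only; every identifier below is a hypothetical name.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS memory_chunks (
    id                bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    owner_id          uuid NOT NULL,
    source_file       text NOT NULL,
    cluster           text NOT NULL,
    content           text NOT NULL,
    embedding         vector(1536),
    embedding_version text NOT NULL,
    recall_count      integer NOT NULL DEFAULT 0,
    fts               tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);

CREATE INDEX IF NOT EXISTS memory_chunks_fts_gin
    ON memory_chunks USING gin (fts);

CREATE INDEX IF NOT EXISTS memory_chunks_embedding_hnsw
    ON memory_chunks USING hnsw (embedding vector_cosine_ops);

ALTER TABLE memory_chunks ENABLE ROW LEVEL SECURITY;

CREATE POLICY memory_chunks_owner_read ON memory_chunks
    FOR SELECT USING (owner_id = auth.uid());
"""


def statements(sql=SCHEMA_SQL):
    """Split the script into single statements for drivers that want one at a time."""
    return [part.strip() for part in sql.split(";") if part.strip()]
```

The generated `fts` column keeps the full-text vector on the same row as the embedding, which is what lets one table serve both the keyword and the vector query.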

Who it's for

This wasn't forged for everyone.

Not for you if you'd rather rent a tool than own one.
Not for you if you want someone else to run your stack.
Not for you if you're happy guessing.

Still here? Good.

AI engineers and teams building a RAG memory layer who need fast, bias-aware hybrid recall on Postgres for both internal agents and customer-facing knowledge bases.

then this was forged for you.

Works with

Universal by design: these run in any AI. Delivered in the open Agent Skills + MCP format (native in Claude); ChatGPT, Gemini, Cursor and Copilot adapt the same files their own way.

Claude: Native format
ChatGPT: Adapts via open standards
Gemini: Adapts via open standards
Cursor: Adapts via open standards
Copilot: Adapts via open standards

Questions · still in the air

Catch what's on your mind.

Do I need a dedicated vector database, or will Postgres carry this?

Postgres carries it: pgvector handles the semantic side and tsvector the lexical side, so on Supabase or plain Postgres there's no separate vector store to run. The whole hybrid endpoint lives in one database.

If semantic search already understands meaning, why bother with BM25 lexical on top?

Because semantic search fumbles exact tokens (IDs, names, error codes), while lexical search misses paraphrases. Reciprocal Rank Fusion blends both rankings so you lose neither precise matches nor conceptual ones; that combination is the whole point.

Does the sub-200ms target hold once my notes reach the millions?

That P95 target is framed around thousands of notes, not millions. At much larger scale you move into index-tuning territory; this is a solid starting architecture, not a promise that latency stays flat forever.

How is it delivered?

By email right after purchase: ready to run, downloaded instantly, no setup wait.

One-time or subscription?

A one-time purchase; no subscription or hidden fees. VAT (20%) is included.

Can I get a refund?

As a digital product, it can’t be refunded once downloaded. That’s why we show exactly what’s inside and who it’s for, right here.

Related products

Agent Eval Suite Langsmith

Brain Context Engineering

Claude Agent Template Library

Context Driven Development