
Brain Memory Hybrid Search

Hybrid BM25 (lexical) + pgvector (semantic) search over an agent's memory/knowledge corpus, with RRF score fusion…

A complete recipe for a hybrid memory-search endpoint that combines BM25 lexical search with pgvector semantic search, fuses the two rankings with Reciprocal Rank Fusion (RRF), and returns a diverse top 5. It recalls thousands of accumulated notes through one RAG endpoint with a sub-200ms P95 target and injects the results into agent context. BM25 wins where an exact term match matters, vectors win where intent matters, and fusing both rankings beats either alone.

$15 one-time

Prices include 20% VAT. · Forged on real agency work · one-time, no lock-in

  • Type Skill
  • Category AI & LLM
  • Delivery Email · instant
  • License One-time

Inside the run · no black box

See the actual work before you buy it.

Every recall query fires two searches at once, lexical and semantic, then fuses the results. What follows is the full pipeline from chunking and masking to the five diverse chunks that land in agent context under 200ms.

  1. Chunks each memory file into 512-token windows with 64-token overlap, masks personal data and secrets before any embedding API call, and tags every chunk with its source file and cluster
  2. Embeds chunks in batches of 64 and upserts them into Postgres with an embedding version label, alongside an auto-generated full-text index on the same rows
  3. At query time, runs keyword search (BM25 over the text index) and vector search (HNSW cosine) in parallel, each returning its top 100 ranked candidates
  4. Fuses the two lists with Reciprocal Rank Fusion (k=60), a rank-based merge that needs no score normalization, so a document strong in either modality rises
  5. Applies a diversity filter before returning: maximum 2 chunks per source file plus a per-cluster quota, so one old note cannot dominate the injected context
  6. Returns the top 5 chunks into the agent context, bumps their recall counters asynchronously, and logs latency against the 200ms budget; chunks recalled too often get flagged for staleness review
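
Steps 1, 4 and 5 above are deterministic enough to sketch in a few lines of Python. This is a minimal illustration under assumptions, not the shipped code: the function names, the `Chunk` fields and the per-cluster quota of 3 are hypothetical, and a plain list of tokens stands in for a real tokenizer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    id: str
    source_file: str
    cluster: str


def window(tokens, size=512, overlap=64):
    """Step 1: split a token list into windows that share `overlap` tokens."""
    step = size - overlap
    out = []
    for start in range(0, len(tokens), step):
        out.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the file
    return out


def rrf_fuse(ranked_lists, k=60):
    """Step 4: score(d) = sum over lists of 1 / (k + rank), ranks starting at 1."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def diversify(ordered_ids, chunks, top_n=5, per_file=2, per_cluster=3):
    """Step 5: walk the fused order and skip chunks that exceed a quota."""
    file_count = defaultdict(int)
    cluster_count = defaultdict(int)
    picked = []
    for cid in ordered_ids:
        c = chunks[cid]
        if file_count[c.source_file] >= per_file:
            continue
        if cluster_count[c.cluster] >= per_cluster:
            continue
        file_count[c.source_file] += 1
        cluster_count[c.cluster] += 1
        picked.append(cid)
        if len(picked) == top_n:
            break
    return picked
```

Because RRF only looks at rank positions, the keyword and vector lists can be fused directly even though their raw scores live on different scales.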
Use cases · what happens when you plug it in

One power source. 6 lines out.


  1. Injecting relevant past notes into agent context at the start of a task
  2. Standing up a hybrid RAG endpoint on Supabase with pgvector plus tsvector
  3. Migrating loose JSON memory files into an indexed Postgres table
  4. Tracking how often a note is recalled to detect self-reinforcing bias
  5. Planning a re-embed when upgrading the embedding model and detecting drift
  6. Adding paid-access course content search with row-level security
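
Use cases 4 and 5 come down to simple bookkeeping over the stored rows. A rough sketch, assuming a log with one chunk id per recall event and rows that carry an `embedding_version` label; the function names and the thresholds (`max_share`, `min_total`) are hypothetical and would need tuning against real traffic.

```python
from collections import Counter


def flag_overrecalled(recall_log, max_share=0.2, min_total=50):
    """Return chunk ids whose share of all recalls exceeds `max_share`.

    `recall_log` is an iterable of chunk ids, one entry per recall event.
    Below `min_total` events the sample is too small, so nothing is flagged.
    """
    counts = Counter(recall_log)
    total = sum(counts.values())
    if total < min_total:
        return []
    return sorted(cid for cid, n in counts.items() if n / total > max_share)


def needs_reembed(rows, current_version):
    """Return ids of rows whose embedding version label is not the current one."""
    return [row["id"] for row in rows if row["embedding_version"] != current_version]
```

A chunk that keeps winning recall is not necessarily wrong, so the output is a review queue rather than a delete list.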
Benefits · what you walk away with

Yours to keep.

Drag time forward. Watch what stays.

Forever

That's what owning means.

The rented stack

ai writing tool: subscription

expired · access lost

analytics suite: subscription

expired · access lost

design platform: subscription

expired · access lost

(nothing left)

Your forge

  1. Recall that beats single-mode search by combining exact-term and semantic matching

    license: perpetual
  2. Fast retrieval with a sub-200ms P95 target via tuned HNSW and GIN indexes

    license: perpetual
  3. Balanced context that avoids one-sided bias through source-file and cluster diversity caps

    license: perpetual
  4. Near-zero embedding cost and lower latency through batching and a short-TTL query cache

    license: perpetual

subscriptions expire · deeds don't

What's included · the full manifest

Everything in the box.

Pick a piece up. Watch it work.

Full schema with tsvector GIN and pgvector HNSW indexes plus row-level security

part 01 of 06 · in the box

6 parts · one working system · ships instantly by email

Who it's for

This wasn't forged for everyone.

  • Not for you if you'd rather rent a tool than own one.
  • Not for you if you want someone else to run your stack.
  • Not for you if you're happy guessing.
Still here? Good.

AI engineers and teams building a RAG memory layer who need fast, bias-aware hybrid recall on Postgres for both internal agents and customer-facing knowledge bases.

then this was forged for you.

Works with

Universal by design: these run in any AI. Delivered in the open Agent Skills + MCP format (native in Claude); ChatGPT, Gemini, Cursor and Copilot adapt the same files their own way.

  • Claude Native format
  • ChatGPT Adapts via open standards
  • Gemini Adapts via open standards
  • Cursor Adapts via open standards
  • Copilot Adapts via open standards
Questions · still in the air

Catch what's on your mind.


  1. Do I need a dedicated vector database, or will Postgres carry this?

    Postgres carries it: pgvector handles the semantic side and tsvector the lexical side, so on Supabase or plain Postgres there's no separate vector store to run. The whole hybrid endpoint lives in one database.

  2. If semantic search already understands meaning, why bother with BM25 lexical on top?

    Because semantic search fumbles exact tokens (IDs, names, error codes), while lexical search misses paraphrases. Reciprocal Rank Fusion blends both rankings so you lose neither the precise matches nor the conceptual ones; that combination is the whole point.

  3. Does the sub-200ms target hold once my notes reach the millions?

    That P95 target is framed around thousands of notes, not millions. At much larger scale you move into index-tuning territory: this is a solid starting architecture, not a promise that latency stays flat forever.

  4. How is it delivered?

    By email right after purchase: ready to run, downloaded instantly, no setup wait.

  5. One-time or subscription?

    A one-time purchase; no subscription or hidden fees. VAT (20%) is included.

  6. Can I get a refund?

    As a digital product, it can’t be refunded once downloaded. That’s why we show exactly what’s inside and who it’s for, right here.
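
On the first question above, "one database" could look roughly like the sketch below. It is an illustration under assumptions, not the schema that ships: the table and column names, the 1536-dimension vector and the `english` text-search configuration are all hypothetical. Note also that `ts_rank_cd` is Postgres's built-in ranking function and not BM25 proper; a true BM25 scorer would need an extension or application-side scoring.

```python
# Hypothetical schema: one table carries the text, its tsvector, and its embedding.
SCHEMA_DDL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS memory_chunks (
    id                bigserial PRIMARY KEY,
    source_file       text NOT NULL,
    cluster           text NOT NULL,
    content           text NOT NULL,
    embedding         vector(1536),
    embedding_version text NOT NULL,
    recall_count      integer NOT NULL DEFAULT 0,
    tsv               tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);

CREATE INDEX IF NOT EXISTS memory_chunks_tsv_idx
    ON memory_chunks USING gin (tsv);
CREATE INDEX IF NOT EXISTS memory_chunks_embedding_idx
    ON memory_chunks USING hnsw (embedding vector_cosine_ops);
"""

# Lexical side: top 100 by built-in full-text rank (stand-in for BM25).
LEXICAL_SQL = """
SELECT id FROM memory_chunks
WHERE tsv @@ websearch_to_tsquery('english', %(q)s)
ORDER BY ts_rank_cd(tsv, websearch_to_tsquery('english', %(q)s)) DESC
LIMIT 100
"""

# Semantic side: top 100 by cosine distance (pgvector's <=> operator).
VECTOR_SQL = """
SELECT id FROM memory_chunks
ORDER BY embedding <=> %(vec)s::vector
LIMIT 100
"""


def apply_schema(conn):
    """Run the DDL on a DB-API style connection (for example one from psycopg)."""
    cur = conn.cursor()
    cur.execute(SCHEMA_DDL)
    conn.commit()
```

Each query returns only ranked ids, which is all a rank-based fusion step needs from either side.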