@persistent-ai/fireflow-search
Smart node search for FireFlow — lexical, schema-aware retrieval over the node registry.
Overview
This package ranks FireFlow's registered node types against a free-text query, replacing the old String.prototype.includes filter with relevance ranking, fuzzy/acronym matching, category awareness, and port schema compatibility filtering. It is a pure, in-process library: no database, no network, no DBOS. The index is built once at boot from a node catalog snapshot and queried per keystroke.
It exposes two things:
- SearchEngine: the node-shaped search orchestrator. Build it once with the full node set; call search() per query (or per keystroke).
- rankList: a generic, item-type-agnostic ranker ({ id, title, description }[] → ranked hits with match spans) that reuses the same primitives. It carries no node/MCP/VFS knowledge, so tool pickers and agent capability search share one ranking implementation.
Key properties:
- Pluggable retriever pipeline fused with weighted Reciprocal Rank Fusion (RRF)
- Pre-RRF lexical tiers (exact title → exact acronym → title prefix → fused) that no reranker may violate
- Schema-aware ranking from a dragged/referenced port via canConnectVerbose (reuses the type system, no parallel type model)
- Nominal-aware ordering (RFC 054): nominal-exact matches sub-sort above structural exact
- MatchSpan[] output for <mark>-style highlighting on the client
- Deterministic tie-breaks (no reliance on registry order)
- Per-category ranking weights and deprecated-node demotion, applied only at the fused tier
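As an illustration of how a client might consume the MatchSpan[] output for <mark>-style highlighting, here is a minimal sketch. The span shape (`start` / `end` as a half-open character range) and the name `highlightSketch` are assumptions made for this example, not the package's actual types; check the exported MatchSpan type before relying on it.

```typescript
// Hypothetical span shape: a half-open [start, end) character range into the title.
type SpanSketch = { start: number; end: number };

// Wrap each matched range in <mark> tags, tolerating unsorted or overlapping spans.
function highlightSketch(text: string, spans: SpanSketch[]): string {
  const sorted = [...spans].sort((a, b) => a.start - b.start || a.end - b.end);
  let out = "";
  let cursor = 0;
  for (const span of sorted) {
    const start = Math.max(span.start, cursor); // skip any part already emitted
    const end = Math.min(span.end, text.length);
    if (end <= start) continue;
    out += text.slice(cursor, start) + "<mark>" + text.slice(start, end) + "</mark>";
    cursor = end;
  }
  return out + text.slice(cursor);
}

const highlighted = highlightSketch("LLM Call", [
  { start: 4, end: 6 },
  { start: 0, end: 3 },
]);
// highlighted === "<mark>LLM</mark> <mark>Ca</mark>ll"
```

The sketch does not HTML-escape the title; a real client would escape the text segments before inserting the tags.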
How It Works
parseQuery(raw)
│
▼
fan out across retrievers (parallel, each pre-capped at top-K)
├─ FuzzyTitle fzf over title + multi-token word-prefix DP aligner
├─ Acronym strict / mixed-case acronym match (gated to short queries)
├─ BM25TextFields Porter-stemmed BM25 over tags, aliases, description
├─ CategoryPath query token prefixes a category-path segment
├─ PortName top-level port titles / keys / descriptions
└─ SchemaCompat compatible ports for a dragged/referenced port (drag only)
│
▼
hard filters (category / is:pure; drag-strict gates to schema-allowed set)
│
▼
weighted RRF fusion → RRF(d) = Σ_r w_r · 1 / (k + rank_r(d)) (k = 60)
│
▼
tier classification + sort
├─ tier 1 exact title (case-insensitive)
├─ tier 2 exact mixed-case acronym
├─ tier 3 title prefix → ordered by title coverage
└─ tier 4 everything fused → RRF × coverage × categoryWeight × deprecated
│
▼
format → NodeHit[] (+ optional CategoryHit[] chips, facets, schema buckets)
Retriever pipeline
Each retriever implements Retriever (build() once at index time, query() per keystroke, no external mutation). The default set is the six retrievers above; pass defaultRetrievers: false (or your own retrievers: []) to override. Per-retriever weights live in score-constants.ts (SchemaCompat is upweighted; FuzzyTitle is the primary signal).
RRF fusion
weightedRRF accumulates each retriever's weighted reciprocal rank into a single score and concatenates the contributing match spans and port matches. Nodes outside a retriever's top-K contribute nothing (they are not penalized with rank ∞). Fusion alone does not decide order — it feeds the tiered sort.
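The fusion formula from the pipeline diagram can be sketched as follows. This is an illustrative reimplementation, not the package's weightedRRF: the `RankedList` shape and the name `weightedRrfSketch` are hypothetical, ranks are assumed to be 1-based, and span/port-match concatenation is left out.

```typescript
// Hypothetical input shape: one best-first id list per retriever, already cut to top-K.
type RankedList = { weight: number; ids: string[] };

const RRF_K_SKETCH = 60; // mirrors the k = 60 quoted in the diagram

// RRF(d) = sum over retrievers r of w_r / (k + rank_r(d)).
// Ids missing from a list simply add nothing for that retriever.
function weightedRrfSketch(lists: RankedList[], k: number = RRF_K_SKETCH): Map<string, number> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return scores;
}

const fused = weightedRrfSketch([
  { weight: 2, ids: ["http-request", "llm-call"] }, // e.g. a title retriever
  { weight: 1, ids: ["llm-call"] },                 // e.g. a text-field retriever
]);
// "llm-call" scores 2/62 + 1/61, "http-request" scores 2/61
```

In this example the node found by both retrievers ends up with the higher fused score even though it was ranked second by the heavier one, which is the usual motivation for rank fusion.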
Lexical tiers
classifyTier assigns a pre-RRF tier so exact and prefix matches always outrank fused-only matches. The tier is carried on each NodeHit so any future reranker preserves the lattice (it may reorder within a tier, never promote a tier-4 hit above a tier-3 hit). Tier 4 applies a title-coverage soft multiplier, the per-category categoryWeight, and a demotion factor for deprecated nodes. Tiers 1–3 are never re-weighted — an explicit title/acronym/prefix hit always surfaces.
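A rough sketch of the tier lattice and the sort it implies is shown below. Everything here is an assumption for illustration: the real classifyTier signature is not shown in this README, the acronym rule is simplified to "uppercased initials of the title words", and the single `score` field stands in for whichever within-tier key applies (title coverage in tier 3, the weighted fused score in tier 4).

```typescript
type Tier = 1 | 2 | 3 | 4;
type SketchHit = { id: string; tier: Tier; score: number };

// Simplified acronym: first letter of each whitespace-separated word, uppercased.
function acronymOf(title: string): string {
  return title.split(/\s+/).filter(w => w.length > 0).map(w => w.charAt(0).toUpperCase()).join("");
}

// Hypothetical classifier following the tier order described above.
function classifyTierSketch(query: string, title: string): Tier {
  const q = query.trim();
  if (q.length === 0) return 4;
  const lowerQ = q.toLowerCase();
  const lowerTitle = title.toLowerCase();
  if (lowerQ === lowerTitle) return 1;                 // exact title, case-insensitive
  if (q.toUpperCase() === acronymOf(title)) return 2;  // exact acronym (simplified rule)
  if (lowerTitle.startsWith(lowerQ)) return 3;         // title prefix
  return 4;                                            // fused-only
}

// Tier first, then score descending, then id as a deterministic tie-break.
function compareHitsSketch(a: SketchHit, b: SketchHit): number {
  return a.tier - b.tier || b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0);
}

const sortedHits = [
  { id: "b", tier: 4 as Tier, score: 0.9 },
  { id: "a", tier: 3 as Tier, score: 0.1 },
  { id: "c", tier: 4 as Tier, score: 0.9 },
].sort(compareHitsSketch);
// order: a (tier 3), then b and c (tier 4, tie broken by id)
```

Because the tier is compared before any score, a low-scoring prefix match still sorts above a high-scoring fused-only match.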
Schema / port-compat ranking
When the request carries a draggedPort (the editor's edge-into-empty-space gesture, or an agent's resolved port reference), the SchemaCompatRetriever:
- Inverts the port direction (output → input/passthrough, etc.).
- Enumerates candidate flat ports from the SchemaIndex, optionally gated by a path: pattern.
- Calls canConnectVerbose(source, candidate) in the correct orientation, memoized in a VerdictCache keyed by (sourceFingerprint, portId, direction).
- Scores each match by compat kind, depth penalty, mutable penalty, concreteness, and path specificity, then aggregates per node with a specificity factor (full-surface matches outrank a single match buried among many ports).
Matches fall into three visual buckets — exact, mutable, any-typed — which dominate ordering during a drag. In strict drag mode every other retriever is hard-filtered to the schema-compatible set; compatible / lenient relax this. Each NodeHit carries portMatches[] (best-first) so the UI can auto-connect to the deepest matching sub-port.
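The memoization step can be pictured with a small cache keyed by the same triple. This is a sketch under stated assumptions: `VerdictCacheSketch`, its `getOrCompute` method, and the `Verdict` shape are hypothetical, and the `compute` callback stands in for the real canConnectVerbose call.

```typescript
type Direction = "input" | "output" | "passthrough";
type Verdict = { compatible: boolean; kind: string }; // hypothetical verdict shape

class VerdictCacheSketch {
  private readonly verdicts = new Map<string, Verdict>();
  computeCount = 0; // exposed only so the example can show the cache working

  getOrCompute(
    sourceFingerprint: string,
    portId: string,
    direction: Direction,
    compute: () => Verdict, // stand-in for the compatibility check
  ): Verdict {
    // JSON.stringify of the tuple avoids collisions between separator characters.
    const key = JSON.stringify([sourceFingerprint, portId, direction]);
    const cached = this.verdicts.get(key);
    if (cached !== undefined) return cached;
    this.computeCount += 1;
    const verdict = compute();
    this.verdicts.set(key, verdict);
    return verdict;
  }
}

const cache = new VerdictCacheSketch();
const check = (): Verdict => ({ compatible: true, kind: "exact" });
const first = cache.getOrCompute("string#abc", "node-a.input.text", "input", check);
const second = cache.getOrCompute("string#abc", "node-a.input.text", "input", check);
// first === second, and the check ran once
```

Keying on a fingerprint of the source port rather than the port object itself would let repeated drags of structurally identical ports share verdicts.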
Reranking slot
The Reranker interface and the SearchContext fields (upstreamNodeTypes, nearbyNodeTypes, recentlyAdded, …) reserve a hook for context- or frecency-based reranking. No reranker implementation ships in this package; the engine runs rerankers only if a consumer supplies them, and a reranker may not break the lexical tier order.
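One way to honor the tier constraint is to let a reranker adjust scores but always compare tiers first. The sketch below assumes a hypothetical boost-style callback and hit shape; the package's actual Reranker interface is not shown in this README and may differ.

```typescript
type TieredHit = { id: string; tier: number; score: number };
type BoostSketch = (hit: TieredHit) => number; // hypothetical: multiplier applied to the score

// Reorders hits inside each tier by boosted score; never moves a hit across tiers.
function rerankWithinTiers(hits: TieredHit[], boost: BoostSketch): TieredHit[] {
  return hits
    .map(hit => ({ hit, adjusted: hit.score * boost(hit) }))
    .sort((a, b) =>
      a.hit.tier - b.hit.tier ||
      b.adjusted - a.adjusted ||
      (a.hit.id < b.hit.id ? -1 : a.hit.id > b.hit.id ? 1 : 0))
    .map(({ hit }) => hit);
}

const recentlyAddedSketch = new Set(["recent"]); // stand-in for a frecency signal
const reranked = rerankWithinTiers(
  [
    { id: "exact", tier: 1, score: 0.1 },
    { id: "old", tier: 4, score: 0.5 },
    { id: "recent", tier: 4, score: 0.4 },
  ],
  hit => (recentlyAddedSketch.has(hit.id) ? 2 : 1),
);
// order: exact, recent, old
```

The boosted tier-4 hit overtakes its tier-4 neighbor but cannot pass the tier-1 hit, however large the boost.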
Who Uses It
| Consumer | Surface | Entry point |
|---|---|---|
| fireflow-frontend | Node palette / context-menu search, drag-from-port suggestions, highlighted match spans | tRPC search.* (wraps SearchEngine) |
| Flow-builder agent | registry_search over the real engine, with optional schema-compat ranking from a live port reference (RFC 062) | tRPC search procedures |
| MCP tool picker | mcp.searchTools — ranked search over a .ffmcp package's derived tools (RFC 061) | rankList |
| Capability discovery | flow.searchCapabilities — one ranked list interleaving .ffmcp tools and .fflow actions (RFC 062) | rankList |
The tRPC layer lives in fireflow-trpc (server/procedures/search/, server/mcp/procedures/search-tools.ts, server/procedures/flow/search-capabilities.ts); it builds the engine from the live node catalog and serializes requests/responses.
Package Exports
Two export paths.
| Export | Contents |
|---|---|
| @persistent-ai/fireflow-search | Full public surface: engine, retrievers, fusion, list ranker, projection, index, score constants, types |
| @persistent-ai/fireflow-search/types | Type-only surface (IndexableNode, SearchRequest, SearchResponse, NodeHit, port/compat types, …) |
Main exported symbols (from .):
| Symbol | Kind | Purpose |
|---|---|---|
| SearchEngine | class | Top-level orchestrator; search(), browse(), rebuild(), getNode(), setRetrievers(), setRerankers() |
| SearchEngineDeps | type | Constructor deps (nodes, optional retrievers / rerankers / defaultRetrievers) |
| rankList | function | Generic { id, title, description }[] ranker → RankedListHit[] with match spans |
| RankListItem, RankedListHit | types | Input/output shapes for rankList |
| projectFromNodeCatalog, projectCatalogNode | functions | Project a NodeCatalog (or one entry) into IndexableNode[] (drops hidden nodes/categories) |
| projectIndexableNode | function | Build an IndexableNode from a loose RegisteredNodeSnapshot (tests, synthetic sets) |
| parseQuery, isLikelyAcronym, isMixedCaseAcronym | functions | Query parsing helpers |
| weightedRRF | function | Weighted Reciprocal Rank Fusion |
| classifyTier | function | Pre-RRF lexical tier classification |
| SchemaIndex | class | Flat-port lookup index for schema-compat |
| VerdictCache | class | Memoized canConnectVerbose verdicts |
| SchemaCompatRetriever, bucketForKind, VERDICT_QUALITY | class / helpers | Schema-aware retrieval and bucketing |
| parsePathPattern, matchPathPattern | functions | path: pattern parsing / matching |
| RETRIEVER_WEIGHTS, RRF_K, COMPAT_KIND_SCORE, … | constants | Tunable scoring constants |
(The individual retriever classes — FuzzyTitleRetriever, AcronymRetriever, BM25TextFieldsRetriever, CategoryPathRetriever, PortNameRetriever — are constructed internally by the engine's default set and are not part of the root export.)
Usage
Node search
import { projectFromNodeCatalog, SearchEngine } from '@persistent-ai/fireflow-search'
// Build once at boot from the live node catalog.
const engine = new SearchEngine({ nodes: projectFromNodeCatalog(nodeCatalog) })
// Query per keystroke.
const res = await engine.search({ query: 'LLM Ca wi to', limit: 20 }, userId)
res.hits // ranked NodeHit[] (+ CategoryHit[] chips), each with match spans
res.facets // { categories: Record<categoryId, count> }
// Drag-from-port: rank only schema-compatible nodes.
const dragRes = await engine.search({
query: '',
context: {
draggedPort: { direction: 'output', config: portConfig, strictness: 'strict' },
},
}, userId)
dragRes.buckets // { exact, mutable, anyTyped } counts (present only during a drag)
// Rebuild the index after the registry changes (e.g. HMR).
engine.rebuild(projectFromNodeCatalog(nodeCatalog))
Generic list ranking
import { rankList } from '@persistent-ai/fireflow-search'
const items = tools.map(t => ({ id: t.uri, title: t.title, description: t.description }))
const ranked = rankList(items, 'summarize') // RankedListHit[]; title outranks description
// An empty query returns every item in input order with score 0 (warms the picker).
Directory Structure
src/
├── index.ts # Public exports
├── types.ts # IndexableNode, SearchRequest/Response, Retriever, Reranker, …
├── engine.ts # SearchEngine — parse → retrieve → filter → fuse → tier → format
├── parse-query.ts # parseQuery + acronym detection
├── score-constants.ts # RRF weights, k, compat-kind scores, penalties, demotions
├── category-weights.ts # resolveCategoryWeight (tier-4 multiplier)
├── fusion/
│ ├── weighted-rrf.ts # weightedRRF
│ └── pre-rrf-tier.ts # classifyTier (tiers 1–4)
├── retrievers/
│ ├── fuzzy-title.ts # FuzzyTitleRetriever (fzf + word-prefix DP aligner)
│ ├── acronym.ts # AcronymRetriever
│ ├── bm25-text-fields.ts # BM25TextFieldsRetriever (minisearch + stemmer)
│ ├── category-path.ts # CategoryPathRetriever (+ category chips)
│ ├── port-name.ts # PortNameRetriever
│ └── schema-compat.ts # SchemaCompatRetriever, buckets, VERDICT_QUALITY
├── index/
│ ├── project-from-catalog.ts # projectFromNodeCatalog / projectCatalogNode
│ ├── indexable-node.ts # projectIndexableNode + flat-port projection
│ ├── flat-port-index.ts # SchemaIndex
│ ├── tokenize-title.ts # title tokenization + acronym forms
│ └── verdict-cache.ts # VerdictCache
├── list/
│ └── rank-list.ts # rankList (generic item ranker)
├── path-pattern/
│ ├── parse.ts # parsePathPattern
│ └── match.ts # matchPathPattern
└── __tests__/ # engine (lexical / schema / nominal), retrievers, rank-list, …
Development
pnpm --filter @persistent-ai/fireflow-search build # tsc -b
pnpm --filter @persistent-ai/fireflow-search typecheck # tsc -b
pnpm --filter @persistent-ai/fireflow-search test # vitest
pnpm --filter @persistent-ai/fireflow-search test:coverage
Dependencies
| Package | Purpose |
|---|---|
| @persistent-ai/fireflow-types | Port system, canConnectVerbose, flat-port projection, NodeCatalog types |
| fzf | Character-level fuzzy title matching (and rankList titles) |
| minisearch | BM25 index over tags / aliases / description |
| stemmer | Porter stemming for the BM25 retriever |
drizzle-orm and zod are optional peers (the package itself is storage-agnostic).
License
Business Source License 1.1 (BUSL-1.1) — see LICENSE.txt