@persistent-ai/fireflow-search

Smart node search for FireFlow — lexical, schema-aware retrieval over the node registry.

Overview

This package ranks FireFlow's registered node types against a free-text query, replacing the old String.prototype.includes filter with relevance ranking, fuzzy/acronym matching, category awareness, and port schema compatibility filtering. It is a pure, in-process library: no database, no network, no DBOS. The index is built once at boot from a node catalog snapshot and queried per keystroke.

It exposes two things:

SearchEngine — the node-shaped search orchestrator. Build it once with the full node set; call search() per query (or per keystroke).
rankList — a generic, item-type-agnostic ranker ({ id, title, description }[] → ranked hits with match spans) that reuses the same primitives. It carries no node/MCP/VFS knowledge, so tool pickers and agent capability search share one ranking implementation.

Key properties:

Pluggable retriever pipeline fused with weighted Reciprocal Rank Fusion (RRF)
Pre-RRF lexical tiers (exact title → exact acronym → title prefix → fused) that no reranker may violate
Schema-aware ranking from a dragged/referenced port via canConnectVerbose (reuses the type system — no parallel type model)
Nominal-aware ordering (RFC 054): nominal-exact matches sub-sort above structural exact
MatchSpan[] output for <mark>-style highlighting on the client
Deterministic tie-breaks (no reliance on registry order)
Per-category ranking weights and deprecated-node demotion, applied only at the fused tier

How It Works

parseQuery(raw)
  │
  ▼
fan out across retrievers (parallel, each pre-capped at top-K)
  ├─ FuzzyTitle      fzf over title + multi-token word-prefix DP aligner
  ├─ Acronym         strict / mixed-case acronym match (gated to short queries)
  ├─ BM25TextFields  Porter-stemmed BM25 over tags, aliases, description
  ├─ CategoryPath    query token prefixes a category-path segment
  ├─ PortName        top-level port titles / keys / descriptions
  └─ SchemaCompat    compatible ports for a dragged/referenced port (drag only)
  │
  ▼
hard filters (category / is:pure; drag-strict gates to schema-allowed set)
  │
  ▼
weighted RRF fusion  →  RRF(d) = Σ_r  w_r · 1 / (k + rank_r(d))   (k = 60)
  │
  ▼
tier classification + sort
  ├─ tier 1  exact title (case-insensitive)
  ├─ tier 2  exact mixed-case acronym
  ├─ tier 3  title prefix          → ordered by title coverage
  └─ tier 4  everything fused      → RRF × coverage × categoryWeight × deprecated
  │
  ▼
format → NodeHit[] (+ optional CategoryHit[] chips, facets, schema buckets)

Retriever pipeline

Each retriever implements Retriever (build() once at index time, query() per keystroke, no external mutation). The default set is the six retrievers above; pass defaultRetrievers: false (or your own retrievers: []) to override. Per-retriever weights live in score-constants.ts (SchemaCompat is upweighted; FuzzyTitle is the primary signal).
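The build-once / query-per-keystroke lifecycle can be illustrated with a toy retriever. This is a minimal sketch only: `RetrieverSketch`, `NodeSketch`, and `ExactTagRetriever` are hypothetical stand-ins, and the package's real `Retriever` and `IndexableNode` types may have different signatures.

```typescript
// Local stand-ins: the package's real Retriever / IndexableNode types may differ.
interface NodeSketch { id: string; tags: string[] }
interface RetrieverSketch {
  build(nodes: NodeSketch[]): void                 // once, at index time
  query(tokens: string[], topK: number): string[]  // per keystroke; node ids, best first
}

class ExactTagRetriever implements RetrieverSketch {
  private byTag = new Map<string, string[]>()

  build(nodes: NodeSketch[]): void {
    this.byTag = new Map()
    for (const node of nodes) {
      for (const tag of node.tags) {
        const key = tag.toLowerCase()
        const ids = this.byTag.get(key) ?? []
        ids.push(node.id)
        this.byTag.set(key, ids)
      }
    }
  }

  query(tokens: string[], topK: number): string[] {
    // Count how many query tokens hit each node's tags.
    const hitCounts = new Map<string, number>()
    for (const token of tokens) {
      for (const id of this.byTag.get(token.toLowerCase()) ?? []) {
        hitCounts.set(id, (hitCounts.get(id) ?? 0) + 1)
      }
    }
    return [...hitCounts.entries()]
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])) // deterministic ties
      .slice(0, topK)                                          // pre-cap at top-K
      .map(([id]) => id)
  }
}

const tagRetriever = new ExactTagRetriever()
tagRetriever.build([
  { id: 'llm-call', tags: ['ai', 'llm'] },
  { id: 'summarize', tags: ['ai', 'text'] },
  { id: 'http-request', tags: ['network'] },
])
const tagHits = tagRetriever.query(['ai', 'text'], 10)
```

The query only reads the structure that `build()` produced, which is what makes a retriever safe to call on every keystroke.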

RRF fusion

weightedRRF accumulates each retriever's weighted reciprocal rank into a single score and concatenates the contributing match spans and port matches. Nodes outside a retriever's top-K contribute nothing (they are not penalized with rank ∞). Fusion alone does not decide order — it feeds the tiered sort.
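The scoring rule can be sketched in a few lines. This is not the package's `weightedRRF`: the `RankedList` shape and the `fuseSketch` name are assumptions for illustration, ranks are assumed to be 1-based, and span and port-match concatenation is left out.

```typescript
// Hypothetical input shape; the real weightedRRF signature may differ.
interface RankedList {
  weight: number // w_r: per-retriever weight
  ids: string[]  // node ids, best first, already capped at top-K
}

const RRF_K_SKETCH = 60 // k from the formula in "How It Works"

function fuseSketch(lists: RankedList[], k: number = RRF_K_SKETCH): Map<string, number> {
  const scores = new Map<string, number>()
  for (const { weight, ids } of lists) {
    ids.forEach((id, i) => {
      const rank = i + 1 // assumed 1-based rank
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank))
    })
    // Ids absent from this list are skipped: they add nothing and are not penalized.
  }
  return scores
}

const fused = fuseSketch([
  { weight: 2, ids: ['http-request', 'llm-call'] },
  { weight: 1, ids: ['llm-call'] },
])
```

Here `llm-call` scores 2/62 + 1/61 and `http-request` scores 2/61, so a node that appears in two lists can overtake one ranked first in a single list.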

Lexical tiers

classifyTier assigns a pre-RRF tier so exact and prefix matches always outrank fused-only matches. The tier is carried on each NodeHit so any future reranker preserves the lattice (it may reorder within a tier, never promote a tier-4 hit above a tier-3 hit). Tier 4 applies a title-coverage soft multiplier, the per-category categoryWeight, and a demotion factor for deprecated nodes. Tiers 1–3 are never re-weighted — an explicit title/acronym/prefix hit always surfaces.
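A tiered sort of this kind might look like the sketch below. The `TieredHit` fields, the `DEPRECATED_FACTOR` value, and the use of the RRF score to order hits inside tiers 1 and 2 are all assumptions for illustration; only the tier lattice and the tier-4 multipliers come from the description above.

```typescript
// Hypothetical hit shape; field names are illustrative, not the package's NodeHit.
interface TieredHit {
  id: string
  tier: 1 | 2 | 3 | 4
  rrf: number            // fused RRF score
  coverage: number       // share of the title covered by the query, 0..1
  categoryWeight: number // per-category multiplier
  deprecated: boolean
}

const DEPRECATED_FACTOR = 0.5 // assumed value, for illustration only

// Only tier 4 is re-weighted; tier 3 is ordered by title coverage alone.
function sortKey(h: TieredHit): number {
  if (h.tier === 4) {
    return h.rrf * h.coverage * h.categoryWeight * (h.deprecated ? DEPRECATED_FACTOR : 1)
  }
  return h.tier === 3 ? h.coverage : h.rrf // tiers 1-2: assumed to use the raw RRF score
}

function sortTiered(hits: TieredHit[]): TieredHit[] {
  return [...hits].sort((a, b) =>
    a.tier - b.tier               // a lower tier number always wins
    || sortKey(b) - sortKey(a)    // then the higher score
    || a.id.localeCompare(b.id))  // deterministic tie-break
}

const sortedHits = sortTiered([
  { id: 'fused-strong', tier: 4, rrf: 0.9, coverage: 1, categoryWeight: 1, deprecated: false },
  { id: 'prefix-weak', tier: 3, rrf: 0.01, coverage: 0.3, categoryWeight: 1, deprecated: false },
  { id: 'exact', tier: 1, rrf: 0.02, coverage: 1, categoryWeight: 1, deprecated: true },
])
```

The deprecated exact-title hit still sorts first, and the weak prefix hit still beats the strong fused hit, because the tier is compared before any score.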

Schema / port-compat ranking

When the request carries a draggedPort (the editor's edge-into-empty-space gesture, or an agent's resolved port reference), the SchemaCompatRetriever:

Inverts the port direction (output → input/passthrough, etc.).
Enumerates candidate flat ports from the SchemaIndex, optionally gated by a path: pattern.
Calls canConnectVerbose(source, candidate) in the correct orientation, memoized in a VerdictCache keyed by (sourceFingerprint, portId, direction).
Scores each match by compat kind, depth penalty, mutable penalty, concreteness, and path specificity, then aggregates per node with a specificity factor (full-surface matches outrank a single match buried among many ports).

Matches fall into three visual buckets (exact, mutable, any-typed), and the bucket dominates ordering during a drag. In strict drag mode every other retriever is hard-filtered to the schema-compatible set; the compatible and lenient modes relax this filter. Each NodeHit carries portMatches[] (best-first) so the UI can auto-connect to the deepest matching sub-port.
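The memoization step can be pictured as a map keyed by the triple named above. This is a sketch under assumptions: `VerdictCacheSketch`, its `getOrCompute` method, and the one-field `Verdict` shape are hypothetical, and the real `VerdictCache` API and `canConnectVerbose` result are not shown here.

```typescript
type Direction = 'input' | 'output' | 'passthrough'

// Hypothetical verdict shape; the real canConnectVerbose result is richer.
interface Verdict { compatible: boolean }

class VerdictCacheSketch {
  private readonly verdicts = new Map<string, Verdict>()

  getOrCompute(
    sourceFingerprint: string,
    portId: string,
    direction: Direction,
    compute: () => Verdict,
  ): Verdict {
    // JSON-encode the triple so separators inside ids cannot collide.
    const key = JSON.stringify([sourceFingerprint, portId, direction])
    let verdict = this.verdicts.get(key)
    if (verdict === undefined) {
      verdict = compute() // the expensive compatibility check runs once per key
      this.verdicts.set(key, verdict)
    }
    return verdict
  }
}

const verdictCache = new VerdictCacheSketch()
const computedFor: string[] = []
const check = (label: string) => (): Verdict => {
  computedFor.push(label)
  return { compatible: true }
}

verdictCache.getOrCompute('fp:string', 'llm-call.prompt', 'input', check('first'))
verdictCache.getOrCompute('fp:string', 'llm-call.prompt', 'input', check('repeat'))   // cache hit
verdictCache.getOrCompute('fp:string', 'llm-call.prompt', 'output', check('flipped')) // new key
```

Only the first and the direction-flipped lookups run the check, so dragging the same port across many keystrokes does not repeat the type comparison.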

Reranking slot

The Reranker interface and the SearchContext fields (upstreamNodeTypes, nearbyNodeTypes, recentlyAdded, …) reserve a hook for context- or frecency-based reranking. No reranker implementation ships in this package; the engine runs rerankers only if a consumer supplies them, and a reranker may not break the lexical tier order.
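One way a consumer could keep a reranker inside that constraint is to compare tiers before any reranker signal. The sketch below is illustrative only: `RankedHit`, `rerankWithinTiers`, and the `boost` callback are hypothetical names and do not reflect the package's `Reranker` interface.

```typescript
// Hypothetical minimal hit shape for illustration.
interface RankedHit { id: string; tier: number }

// Reorders hits inside each tier by a reranker-provided boost, never across tiers.
function rerankWithinTiers(hits: RankedHit[], boost: (id: string) => number): RankedHit[] {
  return hits
    .map((hit, index) => ({ hit, index }))
    .sort((a, b) =>
      a.hit.tier - b.hit.tier              // the tier lattice is compared first
      || boost(b.hit.id) - boost(a.hit.id) // then the reranker signal
      || a.index - b.index)                // ties keep the incoming order
    .map(({ hit }) => hit)
}

// Example signal: favor node types the user added recently.
const recentlyAdded = new Set(['json-parse'])
const reranked = rerankWithinTiers(
  [
    { id: 'http-request', tier: 3 },
    { id: 'http-retry', tier: 4 },
    { id: 'json-parse', tier: 4 },
  ],
  id => (recentlyAdded.has(id) ? 1 : 0),
)
```

The recently added `json-parse` moves ahead of `http-retry` within tier 4, while the tier-3 `http-request` stays on top regardless of the boost.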

Who Uses It

| Consumer | Surface | Entry point |
| --- | --- | --- |
| `fireflow-frontend` | Node palette / context-menu search, drag-from-port suggestions, highlighted match spans | tRPC `search.*` (wraps `SearchEngine`) |
| Flow-builder agent | `registry_search` over the real engine, with optional schema-compat ranking from a live port reference (RFC 062) | tRPC `search` procedures |
| MCP tool picker | `mcp.searchTools`: ranked search over a `.ffmcp` package's derived tools (RFC 061) | `rankList` |
| Capability discovery | `flow.searchCapabilities`: one ranked list interleaving `.ffmcp` tools and `.fflow` actions (RFC 062) | `rankList` |

The tRPC layer lives in fireflow-trpc (server/procedures/search/, server/mcp/procedures/search-tools.ts, server/procedures/flow/search-capabilities.ts); it builds the engine from the live node catalog and serializes requests/responses.

Package Exports

The package has two export paths:

| Export | Contents |
| --- | --- |
| `@persistent-ai/fireflow-search` | Full public surface: engine, retrievers, fusion, list ranker, projection, index, score constants, types |
| `@persistent-ai/fireflow-search/types` | Type-only surface (`IndexableNode`, `SearchRequest`, `SearchResponse`, `NodeHit`, port/compat types, …) |

Main exported symbols (from .):

| Symbol | Kind | Purpose |
| --- | --- | --- |
| `SearchEngine` | class | Top-level orchestrator; `search()`, `browse()`, `rebuild()`, `getNode()`, `setRetrievers()`, `setRerankers()` |
| `SearchEngineDeps` | type | Constructor deps (`nodes`, optional `retrievers` / `rerankers` / `defaultRetrievers`) |
| `rankList` | function | Generic `{ id, title, description }[]` ranker → `RankedListHit[]` with match spans |
| `RankListItem`, `RankedListHit` | types | Input/output shapes for `rankList` |
| `projectFromNodeCatalog`, `projectCatalogNode` | functions | Project a `NodeCatalog` (or one entry) into `IndexableNode[]` (drops hidden nodes/categories) |
| `projectIndexableNode` | function | Build an `IndexableNode` from a loose `RegisteredNodeSnapshot` (tests, synthetic sets) |
| `parseQuery`, `isLikelyAcronym`, `isMixedCaseAcronym` | functions | Query parsing helpers |
| `weightedRRF` | function | Weighted Reciprocal Rank Fusion |
| `classifyTier` | function | Pre-RRF lexical tier classification |
| `SchemaIndex` | class | Flat-port lookup index for schema-compat |
| `VerdictCache` | class | Memoized `canConnectVerbose` verdicts |
| `SchemaCompatRetriever`, `bucketForKind`, `VERDICT_QUALITY` | class / helpers | Schema-aware retrieval and bucketing |
| `parsePathPattern`, `matchPathPattern` | functions | `path:` pattern parsing / matching |
| `RETRIEVER_WEIGHTS`, `RRF_K`, `COMPAT_KIND_SCORE`, … | constants | Tunable scoring constants |

(The individual retriever classes — FuzzyTitleRetriever, AcronymRetriever, BM25TextFieldsRetriever, CategoryPathRetriever, PortNameRetriever — are constructed internally by the engine's default set and are not part of the root export.)

Usage

Node search

typescript

import { projectFromNodeCatalog, SearchEngine } from '@persistent-ai/fireflow-search'

// nodeCatalog, userId, and portConfig are assumed to be in scope.
// Build once at boot from the live node catalog.
const engine = new SearchEngine({ nodes: projectFromNodeCatalog(nodeCatalog) })

// Query per keystroke.
const res = await engine.search({ query: 'LLM Ca wi to', limit: 20 }, userId)
res.hits         // ranked NodeHit[] (+ CategoryHit[] chips), each with match spans
res.facets       // { categories: Record<categoryId, count> }

// Drag-from-port: rank only schema-compatible nodes.
const dragRes = await engine.search({
  query: '',
  context: {
    draggedPort: { direction: 'output', config: portConfig, strictness: 'strict' },
  },
}, userId)
dragRes.buckets  // { exact, mutable, anyTyped } counts (present only during a drag)

// Rebuild the index after the registry changes (e.g. HMR).
engine.rebuild(projectFromNodeCatalog(nodeCatalog))

Generic list ranking

typescript

import { rankList } from '@persistent-ai/fireflow-search'

// `tools` is assumed to be an in-scope array of { uri, title, description }.
const items = tools.map(t => ({ id: t.uri, title: t.title, description: t.description }))
const ranked = rankList(items, 'summarize') // RankedListHit[]; title outranks description
// An empty query returns every item in input order with score 0 (warms the picker).
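On the client, match spans are typically turned into `<mark>` markup. The helper below is a minimal sketch that assumes each span is a half-open `{ start, end }` character range over the title; the package's actual `MatchSpan` shape may differ, and `SpanSketch`, `escapeHtml`, and `highlight` are hypothetical names.

```typescript
// Assumed span shape: half-open [start, end) offsets into the text.
interface SpanSketch { start: number; end: number }

const escapeHtml = (s: string): string =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')

function highlight(text: string, spans: SpanSketch[]): string {
  const sorted = [...spans].sort((a, b) => a.start - b.start)
  let out = ''
  let cursor = 0
  for (const { start, end } of sorted) {
    const from = Math.max(start, cursor) // skip any part already emitted
    const to = Math.min(end, text.length)
    if (from >= to) continue
    out += escapeHtml(text.slice(cursor, from))
    out += `<mark>${escapeHtml(text.slice(from, to))}</mark>`
    cursor = to
  }
  return out + escapeHtml(text.slice(cursor))
}

const marked = highlight('LLM Call', [{ start: 0, end: 3 }, { start: 4, end: 6 }])
// marked === '<mark>LLM</mark> <mark>Ca</mark>ll'
```

Escaping each slice before wrapping it keeps titles that contain `<` or `&` from breaking the markup.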

Directory Structure

src/
├── index.ts                       # Public exports
├── types.ts                       # IndexableNode, SearchRequest/Response, Retriever, Reranker, …
├── engine.ts                      # SearchEngine — parse → retrieve → filter → fuse → tier → format
├── parse-query.ts                 # parseQuery + acronym detection
├── score-constants.ts             # RRF weights, k, compat-kind scores, penalties, demotions
├── category-weights.ts            # resolveCategoryWeight (tier-4 multiplier)
├── fusion/
│   ├── weighted-rrf.ts            # weightedRRF
│   └── pre-rrf-tier.ts            # classifyTier (tiers 1–4)
├── retrievers/
│   ├── fuzzy-title.ts             # FuzzyTitleRetriever (fzf + word-prefix DP aligner)
│   ├── acronym.ts                 # AcronymRetriever
│   ├── bm25-text-fields.ts        # BM25TextFieldsRetriever (minisearch + stemmer)
│   ├── category-path.ts           # CategoryPathRetriever (+ category chips)
│   ├── port-name.ts               # PortNameRetriever
│   └── schema-compat.ts           # SchemaCompatRetriever, buckets, VERDICT_QUALITY
├── index/
│   ├── project-from-catalog.ts    # projectFromNodeCatalog / projectCatalogNode
│   ├── indexable-node.ts          # projectIndexableNode + flat-port projection
│   ├── flat-port-index.ts         # SchemaIndex
│   ├── tokenize-title.ts          # title tokenization + acronym forms
│   └── verdict-cache.ts           # VerdictCache
├── list/
│   └── rank-list.ts               # rankList (generic item ranker)
├── path-pattern/
│   ├── parse.ts                   # parsePathPattern
│   └── match.ts                   # matchPathPattern
└── __tests__/                     # engine (lexical / schema / nominal), retrievers, rank-list, …

Development

bash

pnpm --filter @persistent-ai/fireflow-search build       # tsc -b
pnpm --filter @persistent-ai/fireflow-search typecheck    # tsc -b
pnpm --filter @persistent-ai/fireflow-search test         # vitest
pnpm --filter @persistent-ai/fireflow-search test:coverage

Dependencies

| Package | Purpose |
| --- | --- |
| `@persistent-ai/fireflow-types` | Port system, `canConnectVerbose`, flat-port projection, `NodeCatalog` types |
| `fzf` | Character-level fuzzy title matching (and `rankList` titles) |
| `minisearch` | BM25 index over tags / aliases / description |
| `stemmer` | Porter stemming for the BM25 retriever |

drizzle-orm and zod are optional peer dependencies; the package itself is storage-agnostic.

License

Business Source License 1.1 (BUSL-1.1) — see LICENSE.txt
