PersistentAI: Agents that work in production

Most AI agents fail quietly — losing context, mishandling errors, drifting from their intended behavior. PersistentAI is the execution layer that prevents that: persistent memory, fault tolerance, and guardrails built into the runtime itself.

The problem

Building an AI agent is straightforward.

Building one you'd trust with a real customer, real money, or an irreversible action is not.

The difficulty isn't intelligence — it's infrastructure. Agents that run beyond a single interaction need durable execution, persistent memory, identity and permissions, audit trails, and operational visibility. Most teams build this from scratch, once per project, under pressure.

PersistentAI is that infrastructure. Already built.

The platform: three products

PersistentAI is one platform made of three products that ship together. You don't wire them up — they already are.

FireFlow Engine

Build & run agents

A production-grade engine for AI agents: build visually from typed blocks, run durably, scale horizontally.

  • Durable, exactly-once execution on DBOS — crash-recoverable, every step checkpointed
  • Deterministic parallelism (SP-Tree) — branches run in true parallel, same inputs → same outputs
  • 320+ typed building blocks; connections type-checked at build and at runtime
  • Agent-generated code runs in an isolated sandbox
  • MCP in and out — agents discover tools, your flows call them, governed by .ffmcp manifests
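
To make "connections type-checked at build" concrete, here is a minimal sketch of what such a check could look like. The `PortType`, `Block`, and `Wire` shapes and the `checkConnections` function are hypothetical names invented for illustration; they are not FireFlow's actual type system or API.

```typescript
// Hypothetical port and block shapes; FireFlow's real type system is not shown in this document.
type PortType = "string" | "number" | "boolean" | "json";

interface Block {
  id: string;
  inputs: Record<string, PortType>;
  outputs: Record<string, PortType>;
}

interface Wire {
  from: { block: string; port: string };
  to: { block: string; port: string };
}

// Build-time check: returns human-readable errors (empty when the graph is valid).
function checkConnections(blocks: Block[], wires: Wire[]): string[] {
  const byId = new Map<string, Block>(blocks.map((b): [string, Block] => [b.id, b]));
  const errors: string[] = [];
  for (const w of wires) {
    const source: PortType | undefined = byId.get(w.from.block)?.outputs[w.from.port];
    const target: PortType | undefined = byId.get(w.to.block)?.inputs[w.to.port];
    if (source === undefined || target === undefined) {
      errors.push(`unknown port: ${w.from.block}.${w.from.port} -> ${w.to.block}.${w.to.port}`);
    } else if (source !== target && target !== "json") {
      // Assumption of this sketch: a "json" input accepts any value; all others must match exactly.
      errors.push(`type mismatch: ${source} -> ${target}`);
    }
  }
  return errors;
}

const blocks: Block[] = [
  { id: "prompt", inputs: {}, outputs: { text: "string" } },
  { id: "llm", inputs: { prompt: "string", temperature: "number" }, outputs: { reply: "string" } },
];
const ok = checkConnections(blocks, [
  { from: { block: "prompt", port: "text" }, to: { block: "llm", port: "prompt" } },
]);
const bad = checkConnections(blocks, [
  { from: { block: "prompt", port: "text" }, to: { block: "llm", port: "temperature" } },
]);
```

Here `ok` is empty and `bad` holds one mismatch error. A runtime check could reuse the same rule against actual values instead of declared port types.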

Flame Chorus

The conversation layer

A chat engine that runs the room — managing participants, routing each turn, and streaming replies to everyone in real time.

  • Multi-agent rooms — several agents and humans in one conversation
  • An orchestrator routes each message to the right agent
  • One ordered, append-only event feed per room — replayable from any point
  • Conversations survive browser close and reconnect
  • OAuth 2.1 + role-based access control, with generative UI streamed inline
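
The "ordered, append-only event feed" and "survive browser close and reconnect" points can be pictured with a small in-memory sketch. `RoomFeed`, `RoomEvent`, and `readAfter` are hypothetical names, and the real feed is presumably persisted rather than held in an array.

```typescript
interface RoomEvent {
  seq: number; // position in the room's feed, starting at 1
  type: string;
  payload: unknown;
}

class RoomFeed {
  private readonly events: RoomEvent[] = [];

  // Append assigns the next sequence number; existing entries are never mutated.
  append(type: string, payload: unknown): RoomEvent {
    const entry: RoomEvent = { seq: this.events.length + 1, type, payload };
    this.events.push(entry);
    return entry;
  }

  // Replay everything after the last sequence number a client has seen.
  readAfter(lastSeenSeq: number): RoomEvent[] {
    return this.events.filter((e) => e.seq > lastSeenSeq);
  }
}

const feed = new RoomFeed();
feed.append("message", { from: "user", text: "What is my balance?" });
feed.append("message", { from: "agent", text: "Checking..." });
// The browser closes after seq 2 while the agent keeps working.
feed.append("tool_call", { tool: "getBalance" });
feed.append("message", { from: "agent", text: "Your balance is 42." });
// On reconnect the client asks for everything it missed.
const missed = feed.readAfter(2);
```

Because every client reads from the same ordered log, reconnecting reduces to remembering one number: the last sequence seen.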

Memory Tree

Memory, built in

Memory in layers that work together, so agents keep context, history, and state across every interaction. Nothing to wire up separately.

  • Full, replayable conversation history
  • Versioned files with change history (VFS)
  • Structured workspace data in one place
  • A complete log of every decision made
  • Knowledge graph + RAG — facts as triples, indexed for retrieval (shipping soon)

[Diagram: PersistentAI sits between user intent and output, drawing on public data, private data, and user-sourced data, and connecting out to LLM providers, MCP servers, and a sandbox.]

PersistentAI solves it

Every team building a production agent hits the same ten stages on the way from prototype to something you can trust. They're predictable — and each one is already solved in the runtime.

  1. Crash recovery: DBOS durable execution, with every step checkpointed in PostgreSQL.
  2. Exactly-once: atomic checkpoint + operation in the same PostgreSQL transaction.
  3. Observability: 25+ event types, full audit trail, real-time WebSocket delivery.
  4. Human-in-the-loop: durable pause via DBOS.recv() that survives restarts, for hours or weeks.
  5. Multi-agent: ExecuteFlow, SP-Tree parallelism, DBOS Stream Bridge.
  6. Vault + ACL + VFS: AES-256-GCM, dual identity, lakeFS Git semantics.
  7. Horizontal scaling: stateless workers, PostgreSQL queue, no external brokers.
  8. Streaming: two-phase execution, MultiChannel, cross-worker DBOS Stream Bridge.
  9. Testing: VFS branching, full event stream per run, non-deterministic eval support.
  10. Cost management: multi-provider routing, token aggregation, budget gates.
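
As one illustration of the cost-management stage, the sketch below aggregates token usage across providers and refuses work that would exceed a budget. The `BudgetGate` class, the provider names, and the prices are all assumptions made for the example, not the platform's implementation.

```typescript
interface Usage {
  provider: string;
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical prices per 1,000 tokens; real prices vary by provider and model.
const pricePer1k: Record<string, number> = { providerA: 0.5, providerB: 0.25 };

class BudgetGate {
  private spent = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Token aggregation: convert usage from any provider into one running cost.
  record(usage: Usage): void {
    const price: number | undefined = pricePer1k[usage.provider];
    if (price === undefined) throw new Error(`unknown provider: ${usage.provider}`);
    this.spent += ((usage.inputTokens + usage.outputTokens) / 1000) * price;
  }

  // Budget gate: a call may proceed only if its estimated cost fits under the limit.
  canSpend(estimatedCost: number): boolean {
    return this.spent + estimatedCost <= this.limit;
  }

  totalSpent(): number {
    return this.spent;
  }
}

const gate = new BudgetGate(2.5);
gate.record({ provider: "providerA", inputTokens: 1500, outputTokens: 500 });
gate.record({ provider: "providerB", inputTokens: 3000, outputTokens: 1000 });
const withinBudget = gate.canSpend(0.5);
const overBudget = gate.canSpend(1);
```

With these numbers the two calls cost 1 each, so a further 0.5 still fits under the 2.5 limit while a further 1 does not.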

How it fits together

[Diagram: A user message enters Flame Chorus (auth, chat-room feed, orchestrator), routes to the FireFlow agent engine, which calls LLMs, MCP tools and the sandbox, writes to memory and state, and streams results back to the room.]

A user message lands in a Flame Chorus chat room and becomes one ordered entry in that room's feed. The orchestrator routes the turn to the right agent; FireFlow runs the agent's logic, calls models and tools, and writes results back into the same feed as live AG-UI events — text, generative UI, and tool calls — that stream straight to the user. Memory Tree keeps the whole thing durable underneath: every message, file, record, and decision is persisted as it happens. Everything starts and ends in the room.

What the platform consists of

Each part of the platform earns its place. Here's what every piece buys you — and the failure mode you inherit without it.

Durable engine
  • What it enables: If something goes wrong mid-operation, the agent continues from the last saved point.
  • What happens without it: The user writes to the agent, it never responds, and it forgets the request until the user writes again.

Visual editor
  • What it enables: Fast iteration. Non-technical team members can change and adapt the agent; a product team can assemble one in a day. No CI/CD overhead, no deploy step.
  • What happens without it: Every small improvement goes through developers and the full request-to-deploy pipeline. Testing hypotheses in a single day becomes impossible.

Event sourcing & audit
  • What it enables: Every agent execution is stored and replayable in the editor, giving detailed debugging plus an audit log.
  • What happens without it: Users report that something is off (the agent answers differently than expected, or not at all) and there is no way to understand why.

Human-in-the-loop
  • What it enables: Approval of major operations. The agent’s full state is preserved and it waits until the user approves or rejects the action.
  • What happens without it: The agent sends money to the wrong place, or the amount is wrong by mistake. The user never sees it coming and cannot prevent it.

Request routing between agents
  • What it enables: Manager agents and invisible router agents orchestrate requests to the right agents and sub-agents, each owning its own integrations.
  • What happens without it: One agent wired to everything at once: huge per-response cost and a high chance of hallucination.

Vault, access control, versioning
  • What it enables: Secret protection, client isolation, and rollback to any previous version of the graph or agent.
  • What happens without it: Building agents inside an organization where several people work on the same agent becomes impossible.

Memory Tree (4 layers)
  • What it enables: Chat history, VFS files, structured data in variables, and execution history. Everything the agent does is preserved across four layers and reusable later.
  • What happens without it: The agent forgets context after a few messages and cannot handle medium- or high-complexity tasks.

Messenger with agents
  • What it enables: Several agents in one chat room, with generative UI as part of the interaction.
  • What happens without it: A bare chat interface that is sometimes less convenient than a classic UI.

Memory Tree

Most agents forget the moment a session ends. PersistentAI treats memory as foundational infrastructure, not an afterthought — Memory Tree is a set of layers that work together so agents retain context, history, and state across every interaction.

The conversation is the database

Every session is a single, ordered, append-only feed. Messages, state changes, tool calls, and the on-screen UI all live in that one stream — which also drives live sync, rendering, and the tool-call lifecycle. State moves as JSON-Patch deltas. Reload a week later and the session rebuilds exactly as it was, from the same source it always renders from. No separate history path. No drift.
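
A rough sketch of how a session could be rebuilt by folding JSON-Patch deltas over an empty state follows. It implements only a small subset of RFC 6902 ("add", "replace", and "remove" on object members); `applyOp` and the sample deltas are hypothetical and not taken from the product.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type State = { [key: string]: Json };
type PatchOp =
  | { op: "add" | "replace"; path: string; value: Json }
  | { op: "remove"; path: string };

// Applies one operation to a copy of the state. Array indices and the
// "move", "copy", and "test" operations are deliberately left out of this sketch.
function applyOp(state: State, op: PatchOp): State {
  const next = JSON.parse(JSON.stringify(state)) as State;
  // JSON Pointer unescaping: "~1" means "/" and "~0" means "~".
  const keys = op.path.split("/").slice(1).map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
  const last = keys.pop();
  if (last === undefined) throw new Error("empty path is not supported in this sketch");
  let target: State = next;
  for (const key of keys) {
    const child = target[key];
    if (typeof child !== "object" || child === null || Array.isArray(child)) {
      throw new Error(`path not found: ${op.path}`);
    }
    target = child;
  }
  if (op.op === "remove") delete target[last];
  else target[last] = op.value;
  return next;
}

// The stored deltas, in feed order. Replaying them rebuilds the session state.
const deltas: PatchOp[] = [
  { op: "add", path: "/cart", value: { items: 0 } },
  { op: "replace", path: "/cart/items", value: 2 },
  { op: "add", path: "/status", value: "awaiting_payment" },
  { op: "remove", path: "/status" },
];
const rebuilt = deltas.reduce(applyOp, {} as State);
```

Replaying the four deltas leaves `rebuilt` as `{ cart: { items: 2 } }`. Because the live view and the reload both fold the same list, there is no second history path that could drift.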

Versioned files

Anything an agent produces or modifies — documents, data, or the agent's own definition — is a versioned file with branches, diffs, and rollback to any point in time. Built on a virtual file system backed by lakeFS and S3. Every write is durable and exactly once.

Structured workspace data

Agents read and write structured records directly within their workspace — JSON documents, key-value state, CSV tables, queryable data — without standing up or managing a separate database. The data layer is built in.

A complete decision trail

Every step an agent takes is recorded as it happens: what ran, what it received, what it produced. The execution log lives in Postgres and supports deterministic replay — use it for debugging, hand it over as an audit record, or reconstruct exactly what happened and why.

Knowledge graph & retrieval (shipping soon)

A graph layer stores facts as triples and indexes them — including for RAG — so agents can retrieve and reason over their own accumulated knowledge, not just the last few messages. Rolling out now.
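
To illustrate "facts as triples", here is a toy in-memory store with pattern matching. `TripleStore` and its methods are hypothetical, and the exact-match lookup stands in for whatever indexing and RAG retrieval the shipped layer uses.

```typescript
type Triple = readonly [string, string, string]; // subject, predicate, object
type TriplePattern = readonly [string | null, string | null, string | null]; // null matches anything

class TripleStore {
  private readonly triples: Triple[] = [];

  add(subject: string, predicate: string, object: string): void {
    this.triples.push([subject, predicate, object]);
  }

  // Return every stored fact whose non-null positions equal the pattern.
  match(pattern: TriplePattern): Triple[] {
    return this.triples.filter((t) => pattern.every((p, i) => p === null || p === t[i]));
  }
}

const facts = new TripleStore();
facts.add("order-17", "placedBy", "alice");
facts.add("order-17", "hasStatus", "shipped");
facts.add("order-18", "placedBy", "alice");

// "Which orders did alice place?"
const aliceOrders = facts.match([null, "placedBy", "alice"]).map((t) => t[0]);
// "What do we know about order-17?"
const aboutOrder17 = facts.match(["order-17", null, null]);
```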

The agent's logic, its tools via MCP, and every memory layer live together in FireFlow. The agent's brain and its memory are never two separate systems.

Runtime

Crash recovery

A server can restart mid-task and the agent resumes exactly where it stopped — no repeated steps, no lost progress. Execution is durable and exactly-once, built on DBOS with automatic crash recovery.
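
The resume behavior can be sketched without any framework: record each step's output under a stable key, and on restart skip whatever is already recorded. This is a simplified illustration with an in-memory `Map` standing in for the durable store; it does not use the DBOS API, and `runWorkflow` and the step names are invented for the example.

```typescript
type Step = { label: string; run: () => string };

// Stand-in for a durable store; the document says real checkpoints live in PostgreSQL.
const checkpoints = new Map<string, string>();

function runWorkflow(workflowId: string, steps: Step[]): string[] {
  const results: string[] = [];
  for (const step of steps) {
    const key = `${workflowId}:${step.label}`;
    const saved = checkpoints.get(key);
    if (saved !== undefined) {
      results.push(saved); // finished before the crash: reuse the recorded output
      continue;
    }
    const output = step.run();
    checkpoints.set(key, output); // checkpoint before moving on to the next step
    results.push(output);
  }
  return results;
}

let charges = 0;
let crashOnce = true;
const steps: Step[] = [
  { label: "charge", run: () => { charges += 1; return "charged"; } },
  {
    label: "notify",
    run: () => {
      if (crashOnce) { crashOnce = false; throw new Error("simulated crash"); }
      return "notified";
    },
  },
];

let firstAttemptFailed = false;
try { runWorkflow("order-17", steps); } catch { firstAttemptFailed = true; }
// "Restart": the same workflow ID picks up after the last checkpoint.
const resumed = runWorkflow("order-17", steps);
```

After the restart `charges` is still 1: the charge step is not repeated. In this sketch a crash between `step.run()` and `checkpoints.set()` could still duplicate a side effect; writing the checkpoint and the operation in the same PostgreSQL transaction, as the stage list describes, is what closes that gap for database operations.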

Deterministic parallelism

Independent branches execute simultaneously, yet the same inputs always produce the same outputs. Flows compile to a series-parallel tree (SP-Tree engine), so determinism is a property of the architecture, not something you have to engineer separately.
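
A toy evaluator shows why a series-parallel shape gives deterministic results: parallel branches all read the same input, and their outputs are joined by declared position, never by completion order. This is an assumed model written for illustration (it runs branches one after another and only varies their order), not the SP-Tree engine itself.

```typescript
type SpNode =
  | { kind: "task"; run: (input: number) => number }
  | { kind: "series"; children: SpNode[] }
  | { kind: "parallel"; children: SpNode[]; join: (outputs: number[]) => number };

function evaluate(node: SpNode, input: number, order: "forward" | "reverse" = "forward"): number {
  switch (node.kind) {
    case "task":
      return node.run(input);
    case "series":
      // Each child consumes the previous child's output.
      return node.children.reduce<number>((value, child) => evaluate(child, value, order), input);
    case "parallel": {
      // Branches are independent and see the same input. The execution order is
      // varied to mimic scheduling differences; results are placed by branch index.
      const outputs = new Array<number>(node.children.length);
      const indexed = node.children.map((child, index) => ({ child, index }));
      if (order === "reverse") indexed.reverse();
      for (const { child, index } of indexed) outputs[index] = evaluate(child, input, order);
      return node.join(outputs);
    }
  }
}

const flow: SpNode = {
  kind: "series",
  children: [
    { kind: "task", run: (x) => x * 2 },
    {
      kind: "parallel",
      children: [
        { kind: "task", run: (x) => x + 1 },
        { kind: "task", run: (x) => x * 10 },
      ],
      // A non-commutative join, so any ordering bug would change the result.
      join: (outputs) => outputs[1] - outputs[0],
    },
  ],
};

const forward = evaluate(flow, 3, "forward");
const reverse = evaluate(flow, 3, "reverse");
```

Both runs return 53 (3 doubles to 6, the branches yield 7 and 60, and the join computes 60 - 7), regardless of which branch ran first.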

Permissions and guardrails

The agent knows who's asking, what they're permitted to do, and when to stop and wait for a human before taking an irreversible action. Untrusted code runs in a sealed sandbox. Authentication, role-based access control, human-in-the-loop checkpoints, and sandboxed execution are built into the runtime — with separate owner and caller identities so permission boundaries are unambiguous.
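
One way to picture separate owner and caller identities combined with a human-in-the-loop checkpoint is a small decision function. The roles, the policy table, and the rule that only the owner can approve are assumptions made for this sketch, not the platform's actual policy model.

```typescript
type Role = "viewer" | "operator" | "admin";
interface Principal { id: string; role: Role }
type Decision = "allow" | "deny" | "needs_approval";

interface ActionRequest {
  owner: Principal;  // who the agent belongs to
  caller: Principal; // who is asking right now
  action: string;
  irreversible: boolean;
}

// Hypothetical policy table: which roles may trigger which actions.
const allowedRoles: Record<string, Role[]> = {
  read_balance: ["viewer", "operator", "admin"],
  send_payment: ["operator", "admin"],
};

function decide(req: ActionRequest, approvedBy?: Principal): Decision {
  const roles = allowedRoles[req.action] ?? [];
  if (!roles.includes(req.caller.role)) return "deny";
  if (!req.irreversible) return "allow";
  // Assumption: irreversible actions pause until the owner (not the caller) signs off.
  return approvedBy !== undefined && approvedBy.id === req.owner.id ? "allow" : "needs_approval";
}

const alice: Principal = { id: "alice", role: "admin" };
const bob: Principal = { id: "bob", role: "operator" };
const carol: Principal = { id: "carol", role: "viewer" };

const payment = { owner: alice, action: "send_payment", irreversible: true };
const denied = decide({ ...payment, caller: carol });
const pending = decide({ ...payment, caller: bob });
const approved = decide({ ...payment, caller: bob }, alice);
const readOnly = decide({ owner: alice, caller: carol, action: "read_balance", irreversible: false });
```

The four results are "deny", "needs_approval", "allow", and "allow". Keeping `owner` and `caller` as distinct fields is what lets the approval rule refer to one without confusing it with the other.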

Generative UI

Users interact in plain language. The agent responds with live buttons, forms, and charts rendered directly inside the conversation — not in a separate interface. Built on the open AG-UI protocol, with support for multi-agent rooms.

Universal connectivity

Connect any tool, model, API, or data source. Expose your own agents the same way. Over 320 typed building blocks, with open tool protocols — MCP supported both inbound and outbound.

Use cases

Customer-facing copilots

A single conversation that quietly routes across dozens of specialized agents — each handling its domain, invisible to the user.

Back-office automation

Long-running workflows over your own data and tools, with a complete audit trail at every step.

Regulated industries

Finance, identity, compliance — environments where "the AI made a mistake" is not an acceptable answer. PersistentAI is built for exactly that accountability.

Internal tooling

Connect what you already have and ship a working agent in hours, not months.

FireFlow vs. other solutions

Plenty of tools cover a slice of this. FireFlow + Flame Chorus is the one that covers the whole agent lifecycle — visual graph, durable execution, event stream, sandbox, MCP, audit/replay, and a chat layer — in a single architecture. Green is a strength, yellow is an area we're still maturing, red is a genuine gap. We've kept ours honest.

Solution type / product class
  • FireFlow + Flame Chorus: A unified agent platform: visual graph, durable execution, event stream, sandbox, MCP primitives, audit/replay, and a chat layer within one architecture.
  • Langflow + Temporal: A combination of a visual builder and a durable workflow engine; requires separate integration between canvas, runtime, observability, access control, and deployment.
  • n8n: A low-code / workflow automation platform with a large number of integrations, AI nodes, and agent workflows.
  • Claude Code: A developer tool: helps programmers read code, modify files, run commands, and build individual agents / automation scripts.

Primary use case
  • FireFlow + Flame Chorus: A platform for creating, launching, managing, and operating multiple agents.
  • Langflow + Temporal: An architectural stack: Langflow as a visual canvas, Temporal as a durable runtime.
  • n8n: Automation of business processes and integrations; well-suited for workflow automation and AI automations.
  • Claude Code: A coding assistant / agentic developer tool; strong for writing, modifying, and debugging code.

Visual builder
  • FireFlow + Flame Chorus: Visual graph connected to execution semantics, typed ports, and runtime.
  • Langflow + Temporal: Langflow covers the visual canvas, but the runtime needs to be integrated separately.
  • n8n: Strong visual workflow builder with many ready-made nodes and integrations.
  • Claude Code: No full visual builder for business users; the main interface is terminal / IDE / dev environment.

Durable execution
  • FireFlow + Flame Chorus: Durable execution built into the platform runtime; execution state, replay, and recovery are part of the architecture.
  • Langflow + Temporal: Temporal provides a mature durable workflow engine, but it needs to be properly connected to Langflow graph semantics.
  • n8n: Has execution history, retries, queue mode, and scaling, but this is workflow-automation durability rather than full agent-grade durable execution.
  • Claude Code: Durable execution for enterprise agents needs to be designed separately; Claude Code is not a production runtime.

Builder + runtime integration
  • FireFlow + Flame Chorus: Builder and runtime are part of one system.
  • Langflow + Temporal: Requires a separate integration layer: mapping, checkpointing, replay, versioning, and permissions.
  • n8n: Builder and runtime are already integrated within the platform.
  • Claude Code: No builder + production runtime integration; the agent requires separate deployment and orchestration.

Multi-agent orchestration
  • FireFlow + Flame Chorus: Supports master-agent / sub-agent architectures and flow reuse.
  • Langflow + Temporal: Possible, but requires independent orchestration design.
  • n8n: Supports multi-step workflows and AI Agent Tool nodes.
  • Claude Code: Can be implemented programmatically, but orchestration and management remain entirely with the developers.

Chat / end-user layer
  • FireFlow + Flame Chorus: Flame Chorus provides multi-agent chat, ACL, HITL, and context management.
  • Langflow + Temporal: No full chat backend; requires separate implementation.
  • n8n: Has AI/chat capabilities, but the main focus is process automation.
  • Claude Code: No ready-made user-facing chat layer.

Generative UI
  • FireFlow + Flame Chorus: End-to-end support. The agent emits UI as an AG-UI tool call, and the frontend renders it via json-render. Interactive forms, two-way state, both ephemeral and persisted via the event log. 16 ready-made domain components (balance, transactions, invoice/payment request, cart, map, and more).
  • Langflow + Temporal: No native support. Langflow Playground / embedded chat are text-chat interfaces; Temporal UI is not an end-user interface. Generative UI would be a separate frontend layer.
  • n8n: No native support. Chat Trigger provides a text-chat interface; interactive UI needs a custom frontend on top of workflows.
  • Claude Code: No native support. Claude Code is a terminal / IDE agent with text or markdown output. It can generate UI code as an artifact, but does not render live interactive UI inside its own chat.

Eval / quality gate
  • FireFlow + Flame Chorus: Can be implemented through separate flows; enterprise packaging requires further development.
  • Langflow + Temporal: Requires a separate eval component.
  • n8n: Has built-in AI evaluations.
  • Claude Code: Custom tests and eval scripts are possible, but there is no centralized lifecycle.

Observability / tracing
  • FireFlow + Flame Chorus: Event log, telemetry, and execution events are built in; enterprise observability requires further development.
  • Langflow + Temporal: Langfuse layer provides traces, latency tracking, cost per trace, token usage, and anomaly monitoring.
  • n8n: Execution logs, monitoring, OpenTelemetry, and workflow history.
  • Claude Code: Has hooks and OpenTelemetry in the SDK, but no unified observability platform.

LLM routing / proxy
  • FireFlow + Flame Chorus: Possible through flow logic; a centralized proxy requires separate packaging.
  • Langflow + Temporal: LLM Proxy layer provides model abstraction, routing, fallback logic, and provider switching.
  • n8n: Routing can be implemented inside workflows.
  • Claude Code: Requires independent implementation.

MCP / tool governance
  • FireFlow + Flame Chorus: Full MCP client-side integration: MCP nodes, .ffmcp manifests, OAuth, and runtime mechanics. The governance layer (version catalog, approval flow, ACL, formal lifecycle) remains an area for further development.
  • Langflow + Temporal: MCP Hub provides a tool catalog, versioning, approval process, and deprecation lifecycle.
  • n8n: Strong ecosystem of integrations and credentials.
  • Claude Code: Tool governance needs to be implemented independently.

Sandbox / execution isolation
  • FireFlow + Flame Chorus: OpenSandbox, isolated environments, secrets, and access policies.
  • Langflow + Temporal: Requires separate sandbox implementation.
  • n8n: Has execution isolation, but no full agent sandbox.
  • Claude Code: Uses the developer’s local environment; production sandboxing must be built separately.

Access control / secrets
  • FireFlow + Flame Chorus: Built-in OAuth 2.1 server with PKCE, Dynamic Client Registration, and scoped tokens. Policy-based RBAC with roles, owner hierarchy, and isolated user scopes via Flame Chorus. Vault-based secrets with encryption at rest. Federation with external IdPs (SAML, OIDC, SCIM, LDAP) remains an area for further development.
  • Langflow + Temporal: Requires unifying the security models of different components.
  • n8n: Credentials management, external secrets, SSO/LDAP/SAML.
  • Claude Code: No centralized access management.

Audit / replay
  • FireFlow + Flame Chorus: Event stream used for audit, replay, and recovery.
  • Langflow + Temporal: Workflow history exists, but agent-level audit needs to be implemented separately.
  • n8n: Execution history is available; audit depth depends on implementation.
  • Claude Code: Change history is useful for developers, but does not replace enterprise audit.

Data sovereignty / self-host
  • FireFlow + Flame Chorus: Fully supports self-hosting inside the corporate perimeter.
  • Langflow + Temporal: Self-hosting is possible, but requires operating several systems.
  • n8n: Strong self-hosting support.
  • Claude Code: Deployment depends on the architecture of the solution created.

Deployment / operations
  • FireFlow + Flame Chorus: Ready Kubernetes manifests and a unified platform for publishing and managing agents.
  • Langflow + Temporal: Requires operating multiple components and their integration.
  • n8n: Ready-made automation platform requiring DevOps support.
  • Claude Code: Each agent needs to be deployed separately.

Reuse
  • FireFlow + Flame Chorus: Flows, actions, templates, and MCP tools can be reused inside the platform.
  • Langflow + Temporal: Requires custom conventions and a registry.
  • n8n: High level of reuse through workflows and integrations.
  • Claude Code: Depends on the quality of engineering practices.

Enterprise credibility / ecosystem
  • FireFlow + Flame Chorus: Still has fewer public case studies and less ecosystem maturity.
  • Langflow + Temporal: Strong ecosystem due to Temporal and Langflow.
  • n8n: Very mature automation ecosystem.
  • Claude Code: High developer trust due to Anthropic.

Main advantage
  • FireFlow + Flame Chorus: Unified execution layer and full agent lifecycle.
  • Langflow + Temporal: Mature durable engine and popular visual builder.
  • n8n: Large number of integrations and fast automation launch.
  • Claude Code: Maximum development speed.

Main limitation
  • FireFlow + Flame Chorus: Enterprise features and product packaging need to be strengthened.
  • Langflow + Temporal: Requires a custom glue layer between components.
  • n8n: Main focus is workflow automation, not a specialized agent platform.
  • Claude Code: Not a platform for operating enterprise agents.

Open by design

Open protocols in and out — MCP, AG-UI, A2A — so you build with the ecosystem instead of betting everything on one model or vendor. Source-available under BUSL-1.1.

Get started in a few minutes

bash
git clone https://github.com/Persistent-AI/fireflow.git
cd fireflow && pnpm install && pnpm build

Start the databases, run migrations, and launch the dev environment:

bash
docker compose -f docker-compose.yaml up -d && \
  docker compose -f docker-compose.vfs.yaml up -d
pnpm run migrate
pnpm run dev

→ Full setup: Developer Docs · Try the visual editor: User Guide

Resources

  • Documentation: docs.persistentai.org
  • GitHub: github.com/Persistent-AI/fireflow
  • Visual Editor Demo: wl.persistentai.org/agent-editor

Licensed under BUSL-1.1