Vision & Category · September 6, 2025 · 9 min read

Stop Writing Prompt Loops: Why AI Agents Need a Platform, Not a Framework

Frameworks give you a loop around model.generate(). Production agents need ten more layers. Why agents are a platform problem, not a framework one.

By Matrix Team

The first agent demo is intoxicating. You wire up a while loop, call model.generate(), parse a tool call out of the response, run the tool, feed the result back, loop again. In an afternoon you have something that books a meeting or answers a question and it feels like magic.

Then you try to ship it. And the loop — the thing the framework gave you — turns out to be about 5% of the work.

This post is the argument that a serious agent product is a platform problem, not a framework problem. A framework hands you the prompt loop and a string. Production hands you ten more concerns you'll re-implement every single time, for every agent you ever build. So the right altitude to solve agents at is the platform, not the loop.

The prompt-loop ceiling

Strip away the conveniences and most agent frameworks reduce to the same primitive:

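history, done = [], False  # prompt is a list of messages as well
while not done:
    # One model turn over the prompt plus everything accumulated so far.
    response = model.generate(prompt + history)
    history.append(response)
    if response.tool_calls:
        # Run the requested tools and keep their results for the next turn.
        history.extend(run_tools(response.tool_calls))
    else:
        done = True  # no tool calls, so this is the final answer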

That's genuinely useful. It's also where the framework's responsibility ends and yours begins. The moment a second customer signs up, a second agent gets created, or a real phone call comes in, you hit the ceiling. The loop doesn't know who the tenant is. It doesn't know which LLM key to use. It has no memory of the last conversation. It can't take a phone call. It has no idea what your private documents say. There's no admin screen. There's no access control.

None of that is a prompt problem. It's an infrastructure problem. And infrastructure is exactly what frameworks are designed not to opine about — that's what makes them frameworks.

The ten things you'll build anyway

Here's the uncomfortable part. The list of concerns between "demo" and "production agent" is remarkably stable across companies. You will build roughly these ten things, and you will build them whether you planned to or not:

Multi-tenant isolation — every row, every query scoped to an org, so customer A never sees customer B's data.
BYOK provider routing — let each tenant bring their own LLM keys, encrypted at rest, routed per org.
Persistent vector memory — agents that remember a contact across sessions, not just within one loop.
Real-time voice — because half your use cases are phone calls, and audio is a wire-protocol minefield.
RAG — agents grounded in your documents, not just the model's training data.
Tool and skill composition — a clean way to attach capabilities without re-wiring the runtime each time.
MCP — speaking the Model Context Protocol as both a client (calling external servers) and a server (exposing your own).
Access control — row, field, and type security so an agent can't surface data its caller shouldn't see.
Async / autonomous tasks — agents that run work in the background, not only in a request thread.
An admin UI — so non-engineers can create and operate agents without a deploy.
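Multi-tenant isolation, the first item on that list, is the easiest to sketch. What follows is a minimal illustration, not any particular platform's implementation: `current_org` and `TenantScopedStore` are made-up names, and an in-memory list stands in for a real database. The point is that the org filter lives inside the store, so no caller can forget it.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field

# Hypothetical request-scoped tenant marker; set once per incoming request.
# It has no default, so reading it outside a request raises LookupError.
current_org: ContextVar[str] = ContextVar("current_org")


@dataclass
class TenantScopedStore:
    """In-memory stand-in for a database where every row carries an org id."""

    _rows: list[dict] = field(default_factory=list)

    def insert(self, row: dict) -> None:
        # The org id comes from the request context, never from the caller.
        self._rows.append({**row, "org_id": current_org.get()})

    def query(self, **filters) -> list[dict]:
        # Every read is filtered by the current org before any other predicate.
        org_id = current_org.get()
        return [
            r for r in self._rows
            if r["org_id"] == org_id
            and all(r.get(k) == v for k, v in filters.items())
        ]


store = TenantScopedStore()

current_org.set("org-a")
store.insert({"contact": "alice"})

current_org.set("org-b")
store.insert({"contact": "bob"})
visible_to_b = store.query()  # only org-b's row, even with no filters passed
```

Multiply that discipline by every table, every cache, and every background job, and you can see why it belongs in the floor rather than in each agent.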

A framework helps with item zero, the loop itself. The ten above are yours. And here's the multiplier that breaks the build-it-yourself plan: you don't build these once. You build them per agent. Every new persona drags the same ten concerns behind it unless they live in a shared substrate underneath. That's the difference between a framework and a platform: the framework is a library you call once per agent; the platform is the floor every agent stands on.

What an AI agent platform actually is

An AI agent platform is the production substrate that solves those ten concerns once, as shared infrastructure, so that shipping the eleventh agent costs no more than the first. The README for Matrix puts the contrast bluntly. For each capability it lists, the choice is "roll your own" or "already there":

Capability	Roll your own	A platform
Multi-tenant isolation	build it	every row, every query
BYOK provider routing	build it	encrypted at rest
Persistent vector memory	build it	embed-on-write + HNSW recall
Real-time voice (phone + browser)	build it	barge-in, recording
RAG + GraphRAG	build it	drag-drop upload, auto-wired
MCP client and server	build it	same auth, no second model
New persona	ship code	fill out a form

Look at the last row. That's the whole thesis in one line.

Personas as data, not code

In a framework world, a new agent is a new code path: a new prompt file, a new tool registration, a new deploy. The agent is code, so shipping one means shipping software.

A platform inverts that. On Matrix there are no hardcoded personas. An agent is a configured record — a row in the generic entity model (everything is an EntityType / EntityNode in Neo4j) with a system prompt, a voice, a set of attached skills, knowledge corpora, and tools. You create one through the /admin/agents dashboard or a single POST:

POST /api/orgs/{slug}/agents
{
  "key": "admissions-counsellor",
  "name": "Admissions Counsellor",
  "systemPrompt": "You help prospective students...",
  "skills": ["toolkit-essentials"],
  "knowledge": ["course-catalog"]
}

No code ships. The agent persists across restarts. A non-engineer can do it from a form. If you can describe an agent, you can ship one — and that's only possible because the ten layers underneath are already solved generically. The persona is data riding on top of a platform; it is not the platform.
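To make "persona as data" concrete, here is a small sketch of what the receiving side of that POST could look like. It is an illustration under assumptions, not Matrix's code: `AgentRecord`, `agent_from_payload`, and the in-memory `registry` are hypothetical names, and a dict stands in for the entity store.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical shape of a persona stored as data rather than code."""

    key: str
    name: str
    system_prompt: str
    skills: tuple[str, ...] = ()
    knowledge: tuple[str, ...] = ()


def agent_from_payload(payload: str) -> AgentRecord:
    """Parse a creation payload into a record; no persona-specific branches."""
    body = json.loads(payload)
    return AgentRecord(
        key=body["key"],
        name=body["name"],
        system_prompt=body["systemPrompt"],
        skills=tuple(body.get("skills", [])),
        knowledge=tuple(body.get("knowledge", [])),
    )


# A generic registry: adding a persona is a write, not a deploy.
registry: dict[str, AgentRecord] = {}

payload = """{
  "key": "admissions-counsellor",
  "name": "Admissions Counsellor",
  "systemPrompt": "You help prospective students...",
  "skills": ["toolkit-essentials"],
  "knowledge": ["course-catalog"]
}"""

agent = agent_from_payload(payload)
registry[agent.key] = agent
```

Nothing in the parser knows what an admissions counsellor is. A recruiter or a support agent would pass through the same function with a different payload.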

This is also why the platform's core has to stay strictly domain-agnostic. Your domain — recruiter, tutor, retention rep, support agent — lives entirely in data: entity rows, prompts, custom fields. It never lives in a fork of the runtime. (We go deep on this in Personas as Data, Not Code.)

One runtime, three channels, byte-for-byte parity

Here's where the framework approach quietly falls apart, and it's subtle enough that most teams don't see it coming.

A contact can reach your agent three ways: text chat, a real-time phone call, and — increasingly — an autonomous background task working a goal with no human in the loop. In the framework world, each of those is a different integration. Chat is your SSE endpoint. Voice is a whole separate audio bridge with its own prompt assembly. Autonomous is a job runner with its own context-building. Three code paths means three places for the agent's behaviour to drift apart. The agent that's warm and helpful in chat becomes a different personality on the phone, because somewhere the prompts diverged.

A platform refuses to let that happen. On Matrix, a single composition path — AgentToolSurface.composeForCaller — assembles every turn. For each turn it unions the agent's direct tools, its MCP servers, every attached skill's tools, the built-in toolbox, the auto-attached search_knowledge tool when a corpus is present, the memory built-ins, and an ambient clock — into one tool surface and one prompt. Then the same composed prompt drives text chat, real-time voice, and autonomous tasks.

Byte-for-byte prompt parity across channels is an enforced invariant, not an aspiration. The agent behaves identically no matter how someone reaches it, because there is exactly one place its behaviour is assembled. The only thing that branches is the driver — chat streams over SSE, voice bridges audio, autonomy loops on a goal — but they all read the same composed brain. (The 10-Layer Agent Stack walks each layer of that composition path.)

You cannot get this property by gluing three frameworks together. It's a structural guarantee that only exists when one runtime owns all three channels.
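A rough sketch of the idea, with loud caveats: this is not `AgentToolSurface.composeForCaller`, just a toy Python function shaped like the description above. `compose_for_caller`, `prompt_digest`, the dict keys, and the built-in tool names `memory_recall` and `clock` are all invented for illustration; only `search_knowledge` comes from the text.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Composed:
    prompt: str
    tools: tuple[str, ...]


def compose_for_caller(agent: dict) -> Composed:
    """Hypothetical single composition path: union every tool source once."""
    sources = [
        agent.get("tools", []),        # the agent's direct tools
        agent.get("mcp_tools", []),    # tools exposed by MCP servers
        agent.get("skill_tools", []),  # tools from attached skills
        ["memory_recall", "clock"],    # stand-ins for the built-ins
    ]
    if agent.get("knowledge"):
        # Auto-attach knowledge search only when a corpus is present.
        sources.append(["search_knowledge"])
    # dict.fromkeys dedupes while keeping first-seen order, so output is stable.
    tools = tuple(dict.fromkeys(t for src in sources for t in src))
    prompt = agent["system_prompt"] + "\n\nTools: " + ", ".join(tools)
    return Composed(prompt=prompt, tools=tools)


def prompt_digest(channel: str, agent: dict) -> str:
    """Drivers differ per channel, but each one calls the same composer."""
    composed = compose_for_caller(agent)  # channel deliberately has no say
    return hashlib.sha256(composed.prompt.encode("utf-8")).hexdigest()


agent = {
    "system_prompt": "You help prospective students...",
    "tools": ["book_meeting"],
    "skill_tools": ["send_email", "book_meeting"],
    "knowledge": ["course-catalog"],
}
digests = {ch: prompt_digest(ch, agent) for ch in ("chat", "voice", "autonomous")}
```

Hashing the composed prompt per channel and asserting that all digests match is one cheap way to turn "parity" from a hope into a test that fails the build.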

How Matrix does it

Matrix is built as exactly this substrate. The pieces aren't a roadmap — they're the floor:

Everything is an entity. EntityType / EntityNode in Neo4j, schema-driven via PropertyDefinitions. No hand-rolled domain classes, so a new field is a config edit, not a migration-and-deploy.
Multi-tenancy from line one. Every request populates a request-scoped TenantContext; every read and write filters by orgId. It's the floor every query stands on, not a WHERE clause you remember to add.
BYOK, encrypted. Each org brings its own provider keys, encrypted at rest. Spring AI ships the OpenAI and Anthropic chat paths; Gemini powers voice and embeddings.
Memory included. Every memory is embedded on write (768 dimensions) and recalled through Neo4j's native HNSW vector index, with a substring fallback. One memory pool per contact, shared across chat and phone.
Voice that ships. Production-grade full-duplex voice over the Gemini Live API — a telephony bridge for phone calls and a browser-direct path with no server in the audio path.
RAG by drag-and-drop. Drop a PDF into the dashboard; it's chunked, embedded, and the attached agent automatically gets a corpus-scoped search_knowledge tool. Flip graphragEnabled and ingestion also builds an entity/relation graph.
MCP in both directions. A Streamable HTTP server at /mcp/** and a client for external servers, both gated by the same JWT as REST.
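The memory item is worth a sketch, because "embed on write, recall by similarity, fall back to substring" is a pattern you can reason about in a few lines. Everything here is a stand-in: the toy `embed` function hashes words into 16 buckets instead of calling a real embedding model, recall is a brute-force cosine scan rather than an HNSW index, and `ContactMemory` and `min_score` are hypothetical names.

```python
import math
import zlib
from collections import defaultdict

DIM = 16  # toy dimension; a real embedding model would be far wider


def embed(text: str) -> list[float]:
    """Toy embedding: hash each word into a bucket, then L2-normalise."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class ContactMemory:
    """One memory pool per contact, shared by whichever channel writes to it."""

    def __init__(self) -> None:
        self._pools = defaultdict(list)  # contact_id -> [(text, vector)]

    def remember(self, contact_id: str, text: str) -> None:
        # Embed on write so recall never has to re-embed stored text.
        self._pools[contact_id].append((text, embed(text)))

    def recall(self, contact_id: str, query: str, min_score: float = 0.3) -> list[str]:
        q = embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)
            for text, vec in self._pools[contact_id]
        ]
        hits = [text for score, text in sorted(scored, reverse=True) if score >= min_score]
        if hits:
            return hits
        # Fallback: plain case-insensitive substring match.
        needle = query.lower()
        return [text for text, _ in self._pools[contact_id] if needle in text.lower()]


memory = ContactMemory()
memory.remember("contact-1", "prefers morning calls")  # written during chat
from_phone = memory.recall("contact-1", "prefers morning calls")  # read on a call
```

Because the pool is keyed by contact rather than by channel or session, whatever chat wrote is what the phone call reads.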

And the features with sharper edges ship opt-in, flagged off by default — call recording, self-improving agents (double-gated behind a per-agent flag and a platform kill-switch), and strict-mode access control (per-org, default off). You turn them on deliberately; they don't surprise you in production.
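Double-gating is simple enough to show directly. This is a sketch of the pattern only; the class and field names below are hypothetical and are not Matrix's actual configuration keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformFlags:
    # Platform-wide kill-switch (hypothetical name). Off by default.
    self_improvement_enabled: bool = False


@dataclass(frozen=True)
class AgentFlags:
    # Per-agent opt-in (hypothetical name). Off by default.
    self_improvement: bool = False


def may_self_improve(platform: PlatformFlags, agent: AgentFlags) -> bool:
    """Both gates must be on; either one alone leaves the feature off."""
    return platform.self_improvement_enabled and agent.self_improvement


defaults_allow = may_self_improve(PlatformFlags(), AgentFlags())
```

With both defaults off, an operator has to make two deliberate changes before the feature runs, and flipping the platform switch back off disables it for every agent at once.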

The build-vs-buy framing

So should you build the platform or buy one? The honest answer depends on whether the ten layers are your product or your plumbing.

If you're a research lab whose deliverable is a novel agent runtime, build it — that's the job. But if your product is the agent — the recruiter that fills roles, the counsellor that books admissions, the rep that retains customers — then those ten layers are plumbing. They're a year of undifferentiated engineering that doesn't make your agent any better at its actual job. Every week spent on multi-tenant query scoping or barge-in audio drop windows is a week not spent on the thing customers pay for.

The framework tempts you because the demo is cheap. The platform earns its keep on agent number two through infinity, when "ship a new persona" costs a form instead of a sprint. (We put real numbers on this trade-off in Build vs. Buy Agent Infrastructure.)

Takeaway

A prompt loop is the easy 5%. The other 95% — tenancy, keys, memory, voice, RAG, tools, MCP, access control, autonomy, and an admin UI — is infrastructure you'll build per agent unless it lives in a shared substrate underneath. That substrate is a platform, and treating agents as a platform problem is what lets a new persona cost a form instead of a release. Frameworks optimize the demo; platforms optimize the second, tenth, and hundredth agent.

If you've already written one prompt loop, you know where this goes. Skip the year of plumbing.

Explore the platform at thematrix.in, browse the source and docs on the repo, or just create a workspace and ship your first agent from a form — no prompt loop required.
