Budget-aware autonomy · full auditability · runtime freedom

We handle the
boring stuff.
You change the world.

The open-source framework for agentic platforms with guardrails that hold at scale.
Skills teach. Tools do. Modules bundle. Budgets enforce. Audits explain every action.
Anyone can ship one. Install with a click.

60 seconds to your own agentic OS — paste this into Cursor, Claude Code, Codex, or Gemini CLI inside any empty folder
deploy boringos shell on my localhost
Building a fully custom agentic product? npx create-boringos my-app
Install any 3rd-party module built for BoringOS

Drag a .hebbsmodonto the shell's Apps screen. Signed bundle, schema migrated, tools registered, UI hot-mounted — your agents know the new domain on the next wake.

crm-0.3.0.hebbsmod
signed · 1.4 MB
# one zip — skills, tools, schema, UI, lifecycle, signature
crm-0.3.0.hebbsmod
├── module.json             # id, version, kind, ui entry, publisher
├── index.mjs               # bundled Module — tools + skills + lifecycle
├── skills/deals.md         # injected into the agent prompt on wake
├── migrations/0001_init.sql  # per-tenant Drizzle
├── ui/index.mjs            # React surface, hot-loaded at /modules/crm/ui
└── signature               # Ed25519 over manifest + code + ui
# drop it on the shell's Apps screen — or one curl:
$ curl -F file=@crm-0.3.0.hebbsmod \
     -H "X-API-Key: $KEY" \
     https://your-host/api/admin/modules/upload
# the host:
✓ Verified Ed25519 signature (publisher: hebbs)
✓ Migrated 3 tables · crm__deals · crm__contacts · crm__activities
✓ Registered 7 tools at /api/tools/crm.*
✓ Loaded 4 skills into the agent prompt
✓ Mounted UI at /modules/crm/ui/* — sidebar updated via SSE
Your agents now know CRM. Open the shell.
100%
actions auditable
6
runtime adapters
4
budget guardrails
1-click
compliance export
per-run
replay trail
< 5 min
incident triage
The three primitives

One shape. For everything.

Every connector, every business app, every internal capability — same manifest, same install pipeline, same HTTP surface. Read it once, write your own.

Skills

Behavior, in markdown.

Plain .md files that get concatenated into the agent's system prompt on every wake. Teach the agent when to use a tool, edge cases, your house style. No templating — just words.

Module / persona / agent-instructions / tenant-override sources, prioritised + deduped.

Tools

Capability, with types.

Zod-typed callables, dispatched at POST /api/tools/<module>.<name>. Same handler runs from agents, workflows, routines, or your own routes. Every call is audited to tool_calls.

Idempotency + cost hints + permission scopes built in.

Modules

Everything, bundled.

One manifest binds skills + tools + schema + workflows + agents + routines + webhooks + OAuth + UI. Built-ins, third-parties, your own — all the same shape. app.module(x) wires the rest.

Three kinds: connector, module, hybrid — UI grouping hint, identical dispatch.

Open ecosystem

Anyone can ship a Module.

A Module is a TypeScript file with a manifest. Bundle it into a .hebbsmod archive. Upload it. Hosts install it per-tenant. Same flow as a Chrome extension or WordPress plugin — just for agents.

01

Author

Plain TypeScript. Implement the Module interface. Skills as .md, tools as Zod-typed handlers.

02

Bundle

.hebbsmod = signed zip with manifest + ESM entry + skills + migrations + UI. ~100KB–2MB.

03

Upload

Drag it into the shell. Ed25519 signature verified. Bytes content-addressed. Listed for tenants to install.

04

Install

Tenants opt in. Schema applied. Lifecycle hooks fire. Tools live at /api/tools/<id>.<name>. Agent reads new Skills next wake.

9 Modules shipped in the box.

Everything you'd otherwise wire up yourself — already a Module, already installed. Same shape as the one you'll write next.

frameworkcore

Tasks, comments, agents, runs

memorycore

Pluggable cognitive memory

drivecore

File storage + ACL

workflowcore

DAG runner over the tool registry

inboxcore

Email/Slack/event triage queue

triagecapability

Capability — routes new items to agents

copilotmodule

Built-in chat surface for every app

googleconnector

Gmail + Calendar via OAuth

slackconnector

Channels, DMs, slash commands

Built-in isolation

Autonomy without the blast radius.

Agents run with --dangerously-skip-permissionsso they don't stop to ask. That's safe because every byte they read or write goes through Drive — our proxy over the filesystem. Tenants can't see each other. Users keep private files private. Agents can't write to each other's drafts.

drive namespace · tenant=<tenantId>
<tenantId>/                  # isolation root — you cannot escape it
├── shared/...               # tenant-wide   · agents read+write
├── users/<userId>/...       # PRIVATE       · agents denied (drive-acl.ts:239)
├── agents/<agentId>/...     # agent home    · own=rw · others=read-only
├── tasks/<taskId>/...       # deliverables  · tenant-shared
└── projects/<projectId>/... # long-running  · tenant-shared

Tenant isolation, structural

Every path is prefixed with the tenant id at the storage layer (manager.ts:56). There is no code path that reads or writes outside the tenant root. Path traversal — .., absolute paths, percent-encoded escapes — rejected before the storage call.

User space is genuinely private

users/<id>/ returns 'users/<id>/ is private — not accessible to agents' on every agent attempt. It is not a permission you forgot to set — it is the default, in the type system, with a literal error string you can grep.

Per-agent ACLs

Agents can read each other's working drafts (transparency by default) but can only write to their own agents/<id>/ folder. Cross-agent writes return 'agent may only write to its own agents/<id>/ folder' — enforced before storage, not after.

Every byte goes through Drive

Reads, writes, lists, deletes — all go through DriveManager. That means tenant scoping, ACL check, audit row, event fan-out, memory-sync index, every time. No side door.

The agent has full power inside its lane. The lane is the load-bearing part. Source: @boringos/core/src/modules/drive-acl.ts.

Your AI team, orchestrated.

Agents delegate, escalate, and hand off to humans. Any task can route to the best runtime by policy, with fallback chains and shared memory across runs.

Autonomous delegation

CEO sets the goal. CTO breaks it down. Engineers execute. QA validates. next_actor state machine routes work between agents and humans.

Shared memory

Every agent remembers. Every run builds context. Pluggable provider — Hebbs out of the box, or your own.

Budget-enforced

Cost tracked per run, including Anthropic cache tokens. Limits per agent, per task, per tenant. No runaway spend.

Workflows you can see.

DAG that dispatches every node through the tool registry — same handlers your agents call. Persisted runs, live SSE, replay, fork-from-here, budget gates, and pause-on-approval keep autonomy controlled without slowing teams down.

Tool registry as the backend

Each block resolves to a Tool. Same Zod validation, same audit log, same handler whether the call comes from an agent, a workflow, or a routine.

Live runs

Every block transition streams via SSE. Watch the DAG light up — no polling, no reconstruction.

Replay & fork

Re-execute past runs. Fork from any block. Compare two runs side-by-side.

Everything you need.
Nothing you don't.

The host (@boringos/core) ships budget governance, runtime routing, auditable execution, and operations controls alongside the admin API, auth, SSE bus, and module install pipeline.

Budget Control Plane

Define spend caps by tenant, workflow, agent, or task. Choose hard stops or soft alerts, inspect incidents, and keep autonomous execution inside guardrails.

Audit Ledger

Every tool call, mutation, approval, and run transition is logged with actor, timestamp, and payload context. Replay and forensic traceability come built in.

Runtime Router

Route tasks to Claude, Codex, Gemini, Ollama, command, or webhook runtimes. Set quality/cost/speed policy and automatic fallback chains per workload.

Policy + Human Checkpoints

Pause-on-approval and wait-for-human blocks create safe handoffs for high-risk actions, while still letting low-risk paths run autonomously.

Evals + Scorecards

Track success rate, retries, cost-per-outcome, and regression risk across agent changes. Compare runs and tighten quality before production rollout.

Compliance Exports

Generate evidence bundles for internal review, security audits, and customer due diligence. Export machine-readable logs plus human-readable summaries.

Signed Module Trust

Ship and install signed .hebbsmod packages with integrity verification, content-addressing, and predictable lifecycle hooks per tenant.

Operations Console

Monitor queue pressure, run health, failures, and incident timelines from one operational view. Diagnose quickly and recover without guesswork.

From goal to done

You set the goal. The AI team figures out the rest. Every action is a Tool call. Every Tool call is audited.

1

Install the Modules you need

Built-ins are already there. Click-install third-party .hebbsmod packages. Each one adds skills the agent reads and tools the agent can call.

2

Assign a goal

Create a task. Assign to the CEO, CTO, or any role. Defaults route through Chief of Staff.

3

Delegation

The agent reads its skills, picks tools, breaks the goal into subtasks, assigns each to the right teammate. Workflow runs orchestrate where needed.

4

Autonomous work

Each agent spawns a CLI (Claude / Codex / Gemini / Ollama). Skills shape behavior, tools execute capability. Budget enforced. Memory shared.

5

Handoff & completion

next_actor flips between agent and human cleanly. Approvals route through the dashboard. Done tasks auto-post results as comments. Memory persists.

~60 seconds · one prompt

Your own agentic OS,
running on your laptop.

You don't install BoringOS. You tell your coding agent to. Open Cursor / Claude Code / Codex / Gemini CLI inside any empty folder, paste the prompt — it clones, installs, builds, boots Postgres, and brings up the shell at localhost:3000.

Cursor / Claude Code / Codex / Gemini CLI
prompt
deploy boringos shell on my localhost
→ agent runs:
✓ Cloned github.com/BoringOS-dev/boringos
✓ Installed 947 packages
✓ Built 14 workspace packages
✓ Embedded Postgres started on :54321
✓ Registered 9 built-in Modules
✓ Shell live at http://localhost:3000
Open the shell. Sign up. Start talking to your agents.

Works with your agent

Cursor, Claude Code, Codex, Gemini CLI, Aider — any agentic coding tool. The prompt is the install script.

Embedded Postgres

No Docker. No external DB. No Redis. Boot once, you have an admin API, a tool registry, an SSE bus, a queue.

Everything pre-wired

Inbox, Drive, Workflows, Agents, Copilot, Modules screen — all the built-in Modules already registered, all UI live.

Want a blank slate instead — shipping a fully custom agentic product on top of BoringOS? Scaffold your own host: npx create-boringos my-app

14 packages on npm.

The framework underneath. Modules sit on top of these; you usually only depend on @boringos/core and @boringos/module-sdk.

@boringos/coreApplication host
@boringos/agentExecution engine + dispatcher
@boringos/module-sdkModule / Tool / Skill types
@boringos/runtime6 CLI adapters
@boringos/memoryCognitive memory
@boringos/driveFile storage
@boringos/dbPostgres + Drizzle
@boringos/pipelineJob queue
@boringos/uiTyped client + React hooks
@boringos/shellReference UI shell
@boringos/connector-slackSlack reference Module
@boringos/connector-googleGmail + Calendar Module
@boringos/sharedBase types
create-boringosCLI scaffold

Every app ships with an AI copilot

The copilot Module is built in. Chat surface, session naming, can call any registered tool, can edit your code. Zero config.

Copilot
Show me all blocked deals
AI
Calling crm.list_deals
Found 2 blocked: Acme (waiting on legal), Globex (procurement hold).
Add a priority chart to the dashboard
AI
Edited app/page.tsx — added priority distribution chart.
+14 lines. Refresh to see the change.
Calls any Tool any Module registered — yours or built-in
Edits your code, adds features, fixes bugs — from chat
Sessions auto-named, runs auto-resume, costs tracked
Trust and operations

Enterprise control without losing autonomy.

Budgeting, auditability, runtime flexibility, and policy guardrails are not add-ons. They are first-class primitives in the execution path.

Immutable timeline

Track who did what, when, and why across agents, workflows, and humans.

Cost anomaly alerts

Detect spend spikes early and route incidents to the right owner with context.

Policy inheritance

Set org defaults once, then allow scoped tenant exceptions where required.

Signed evidence bundles

Export decision history and run artifacts for enterprise audits and customer security reviews.

The boring parts are handled.
Stop reading. Ship it.

One prompt. One minute. Your own agentic OS on localhost.

deploy boringos shell on my localhost
Building a fully custom agentic product instead? npx create-boringos my-app