Codex
§1 · What Codex is
OpenAI’s own coding agent. A local `codex` binary runs interactive editing (a terminal UI plus a small Model Context Protocol (MCP) suite); a `cloud-tasks` binary runs as a remote worker. The whole system is a Rust mono-repo whose crates decouple through trait and protocol crates.
§2 · The 50+ crate layered architecture
| Dimension | Codex | Claude Code | OpenClaw | Hermes |
|---|---|---|---|---|
| Entry | `cli/` terminal entry (streaming stdin) | `tui/` full-screen ratatui UI (with onboarding / trust) | `app-server/` + `app-server-daemon/` speak JSON-RPC, called by IDE/plugins | `cloud-tasks/` standalone remote-worker binary |
| Core loop | `core/` single turn closure (assemble prompt → model call → tool dispatch → apply) | `core-api/` stable public interface | `core-plugins/` plugin attachment points | `protocol/` cross-crate shared types |
| State & memory | `state/` SQLite + lease/retry/backoff | `memories/` two-phase consolidation (stage1 + global) | `memory_citation` for traceback | `agent-graph-store/` thread / branch topology |
| Tools | `apply-patch/` V4A patch parser | `exec/` shell execution | `builtin-mcps/` bundled MCP bridges | `core-skills/` 8 sub-crate skill engine |
| Sandbox | macOS: `sandbox-macos-seatbelt/` emits .sbpl | Linux: `bwrap/` + `landlock/` + `seccomp/` | Windows: `windows-sandbox-rs/` | Unified: `sandbox/` + `execpolicy/` |
| Observability & cost | `analytics/` 20+ event types | `codex-otel/` OpenTelemetry bridge | `codex-rollout-trace/` full-session replay | `codex-cost/` token → USD |
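
The `core/` row describes a turn as assemble prompt → model call → tool dispatch → apply. A minimal sketch of that loop follows, assuming a scripted stand-in for the model. Every name here (`ModelReply`, `Model`, `dispatch_tool`, `run_turn`, `ScriptedModel`) is hypothetical and chosen for illustration; none is taken from the Codex source.

```rust
/// Hypothetical reply type: the model either asks for a tool or finishes.
#[derive(Debug, Clone, PartialEq)]
enum ModelReply {
    ToolCall { tool: String, arg: String },
    Final(String),
}

/// Stand-in for the model client; a real one would call a remote API.
trait Model {
    fn complete(&mut self, prompt: &str) -> ModelReply;
}

/// Stand-in for tool dispatch (shell execution, patch application, and so on).
fn dispatch_tool(tool: &str, arg: &str) -> String {
    match tool {
        "echo" => arg.to_string(),
        other => format!("unknown tool: {}", other),
    }
}

/// One turn: assemble prompt, call the model, dispatch a tool, apply the
/// result to the history. Repeats until the model returns a final message
/// or the step budget runs out (in which case `None` is returned).
fn run_turn(model: &mut dyn Model, user_input: &str, max_steps: usize) -> Option<String> {
    let mut history = vec![format!("user: {}", user_input)];
    for _ in 0..max_steps {
        let prompt = history.join("\n"); // assemble prompt
        match model.complete(&prompt) {
            // model call
            ModelReply::ToolCall { tool, arg } => {
                let output = dispatch_tool(&tool, &arg); // tool dispatch
                history.push(format!("tool[{}]: {}", tool, output)); // apply
            }
            ModelReply::Final(text) => return Some(text),
        }
    }
    None
}

/// Scripted model used only for the demo: calls `echo` once, then finishes.
struct ScriptedModel {
    calls: usize,
}

impl Model for ScriptedModel {
    fn complete(&mut self, prompt: &str) -> ModelReply {
        self.calls += 1;
        if prompt.contains("tool[echo]") {
            ModelReply::Final("done".to_string())
        } else {
            ModelReply::ToolCall { tool: "echo".to_string(), arg: "hi".to_string() }
        }
    }
}

fn main() {
    let mut model = ScriptedModel { calls: 0 };
    let answer = run_turn(&mut model, "say hi", 4);
    println!("{:?} after {} model calls", answer, model.calls);
}
```

The step budget is a design choice of this sketch, not something the table states: it keeps a misbehaving model from looping on tool calls forever.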
§3 · Engineering highlights
- Rust type system as backstop. `Result` and `enum` make state-machine errors impossible to silently swallow. `Stage1JobClaimOutcome`’s 5-way enum forces a `match` on every branch.
- Three native sandboxes. macOS uses seatbelt (works without disabling System Integrity Protection); Linux combines bubblewrap, landlock, and seccomp (no root required); Windows uses windows-sandbox-rs (no Windows Subsystem for Linux dependency). One implementation per platform.
- Memory consolidation in the background. `memory_consolidate_global` runs LLM-based rewrites of MEMORY.md and `skills/` as a background job. Zero cost on the main turn.
- Replayable rollout-trace. Every session persists to JSONL and can be replayed. The only reliable way to diagnose agent drift.
- `apply-patch` V4A format. A homegrown patch format more reliable than unified diff, more efficient than SEARCH/REPLACE, and more structured for LLMs.
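
The "type system as backstop" point can be illustrated with a small sketch. The enum below is a hypothetical five-way claim outcome: the names `ClaimOutcome`, `NextStep`, `next_step`, and all variant names are assumptions made for this example and are not the actual variants of `Stage1JobClaimOutcome`. What it shows is the mechanism: a `match` with no wildcard arm stops compiling the moment a new variant is added.

```rust
use std::time::Duration;

/// Hypothetical five-way outcome of trying to claim a background job.
/// Variant names are illustrative, not those used in Codex.
#[derive(Debug, Clone, PartialEq)]
enum ClaimOutcome {
    Claimed { lease_token: String },
    AlreadyRunning,
    UpToDate,
    BackingOff { retry_in: Duration },
    RetriesExhausted,
}

/// Hypothetical action the caller takes for each outcome.
#[derive(Debug, PartialEq)]
enum NextStep {
    RunJob(String),
    Skip,
    SleepFor(Duration),
    GiveUp,
}

fn next_step(outcome: ClaimOutcome) -> NextStep {
    // Deliberately no `_ =>` arm: adding a sixth variant to `ClaimOutcome`
    // turns this function into a compile error until it is handled.
    match outcome {
        ClaimOutcome::Claimed { lease_token } => NextStep::RunJob(lease_token),
        ClaimOutcome::AlreadyRunning | ClaimOutcome::UpToDate => NextStep::Skip,
        ClaimOutcome::BackingOff { retry_in } => NextStep::SleepFor(retry_in),
        ClaimOutcome::RetriesExhausted => NextStep::GiveUp,
    }
}

fn main() {
    let outcomes = vec![
        ClaimOutcome::Claimed { lease_token: "lease-1".to_string() },
        ClaimOutcome::AlreadyRunning,
        ClaimOutcome::UpToDate,
        ClaimOutcome::BackingOff { retry_in: Duration::from_secs(30) },
        ClaimOutcome::RetriesExhausted,
    ];
    for outcome in outcomes {
        let label = format!("{:?}", outcome);
        println!("{} -> {:?}", label, next_step(outcome));
    }
}
```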
§4 · Where it falls short
- Refactor cost is high. With 50+ crates, any cross-crate change touches `protocol/` and triggers long rebuilds.
- Rust skill bar. Async plus lifetimes plus tower-style trait composition. This is a Rust project first.
- No explicit user-level memory. MEMORY.md is cwd-scoped; cross-project preferences must live in personal-scope AGENTS.md.
- TUI-first. IDE integration runs through `app-server` JSON-RPC, but the ecosystem is far less mature than Claude Code’s IDE plugins.
§5 · Five things worth stealing
- Phase 2 consolidation prompt (`codex-rs/memories/write/templates/memories/consolidation.md`). 800 lines that nail down what counts as high-signal memory, plus a wording-preservation rule. Reuse verbatim.
- Three-OS sandbox abstraction (`sandbox/` and `execpolicy/`). One interface, three native backends. The best cross-OS agent reference.
- `Stage1JobClaimOutcome` 5-way state machine (`state/src/model/memories.rs`). All five outcomes of claiming a job are enum variants. No implicit branches.
- rollout-trace JSONL. Persist every session for replay. Every agent project should have this.
- app-server JSON-RPC protocol. Run the agent engine as a separate process behind JSON-RPC. IDE plugins don’t reimplement the turn loop.
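
"One interface, three native backends" could look roughly like the sketch below. This is not the API of Codex’s `sandbox/` crate: the `Policy` struct, the `Sandbox` trait, the backend types, and `native_backend` are all hypothetical. The `bwrap` flags (`--ro-bind`, `--bind`, `--unshare-net`) and `sandbox-exec -p` are real, but the seatbelt profile is heavily simplified and probably too restrictive to run real commands, and the Windows backend is a placeholder that models nothing.

```rust
/// Hypothetical policy: where a command may write and whether it may use the network.
#[derive(Debug, Clone)]
struct Policy {
    writable_roots: Vec<String>,
    allow_network: bool,
}

/// The single interface callers see. Each backend turns a policy plus a
/// command into the argv that would actually be spawned.
trait Sandbox {
    fn name(&self) -> &'static str;
    fn wrap(&self, policy: &Policy, command: &[&str]) -> Vec<String>;
}

struct Seatbelt;
struct Bubblewrap;
struct WindowsPlaceholder;

impl Sandbox for Seatbelt {
    fn name(&self) -> &'static str { "seatbelt" }
    fn wrap(&self, policy: &Policy, command: &[&str]) -> Vec<String> {
        // Simplified SBPL profile, for illustration only.
        let mut profile = String::from("(version 1)(deny default)(allow process*)(allow file-read*)");
        for root in &policy.writable_roots {
            profile.push_str(&format!("(allow file-write* (subpath \"{}\"))", root));
        }
        if policy.allow_network {
            profile.push_str("(allow network*)");
        }
        let mut argv = vec!["sandbox-exec".to_string(), "-p".to_string(), profile];
        argv.extend(command.iter().map(|s| s.to_string()));
        argv
    }
}

impl Sandbox for Bubblewrap {
    fn name(&self) -> &'static str { "bubblewrap" }
    fn wrap(&self, policy: &Policy, command: &[&str]) -> Vec<String> {
        // Read-only view of the whole filesystem, then writable binds on top.
        let mut argv: Vec<String> = ["bwrap", "--ro-bind", "/", "/"].iter().map(|s| s.to_string()).collect();
        for root in &policy.writable_roots {
            argv.push("--bind".to_string());
            argv.push(root.clone());
            argv.push(root.clone());
        }
        if !policy.allow_network {
            argv.push("--unshare-net".to_string());
        }
        argv.extend(command.iter().map(|s| s.to_string()));
        argv
    }
}

impl Sandbox for WindowsPlaceholder {
    fn name(&self) -> &'static str { "windows-placeholder" }
    fn wrap(&self, _policy: &Policy, command: &[&str]) -> Vec<String> {
        // Placeholder: this sketch does not model the Windows mechanism at all.
        command.iter().map(|s| s.to_string()).collect()
    }
}

/// Pick the backend for the platform this binary was compiled for.
fn native_backend() -> Box<dyn Sandbox> {
    if cfg!(target_os = "macos") {
        Box::new(Seatbelt)
    } else if cfg!(target_os = "linux") {
        Box::new(Bubblewrap)
    } else {
        Box::new(WindowsPlaceholder)
    }
}

fn main() {
    let policy = Policy { writable_roots: vec!["/work".to_string()], allow_network: false };
    let backend = native_backend();
    let argv = backend.wrap(&policy, &["cargo", "test"]);
    println!("{} -> {:?}", backend.name(), argv);
}
```

Callers only ever hold a `Box<dyn Sandbox>`, so the turn loop and tool dispatch stay free of platform-specific branches; that separation is the part worth borrowing, whatever the real trait looks like.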