14. Workers: The Ghouls
Context and Problem Statement
The LychD Vessel is designed to be a high-performance, non-blocking interface. However, many tasks required of an autonomous daemon—waiting for long generative responses, performing recursive file operations, or executing complex verification rituals—are inherently slow and blocking. Running these tasks inside the primary web process presents a critical stability risk: a system crash wipes the volatile state, a container restart kills the active thought, and heavy CPU-bound operations can block the event loop, causing the application to fail health checks.
Requirements
- Labor Offloading: Mandatory offloading of slow or blocking tasks to resilient, persistent background processes that operate independently of the web server.
- Persistence beyond Death: Pending tasks must be stored in the Phylactery (06) and resumed automatically if the process restarts.
- Transactional Integrity: The enqueuing of labor must be atomic with database state changes; a job should only become visible to a worker if the associated database transaction commits successfully.
- Anatomical Partitioning: The background task system must utilize the dedicated queue chamber (schema) of the unified database to ensure operational isolation.
- Orchestrated Discipline: The labor force must be subject to the commands of the Orchestrator (23), allowing for the pausing of specific queues during state transitions.
- Reflex Arc Support: The worker system must provide the infrastructure for the "Long Sleep"—the ability to rehydrate the state of a Graph (24) and resume reasoning after an interruption.
- Massive Concurrency: A single worker process must be capable of juggling thousands of concurrent IO-bound tasks utilizing an asynchronous event loop.
- Infrastructure Minimalism: To adhere to the single-node doctrine, the system must not require a heavy external broker (e.g., Redis).
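The Massive Concurrency requirement can be illustrated with a minimal sketch. It assumes nothing about the real worker: `io_bound_task` and `run_many` are hypothetical names, and `asyncio.sleep` stands in for awaiting a slow model or a remote peer.

```python
import asyncio
import time


async def io_bound_task(task_id: int, delay: float = 0.05) -> int:
    # asyncio.sleep is a stand-in for an IO wait (model response, remote peer).
    await asyncio.sleep(delay)
    return task_id


async def run_many(count: int = 5000) -> tuple[int, float]:
    # All tasks wait concurrently on one event loop thread.
    started = time.perf_counter()
    results = await asyncio.gather(*(io_bound_task(i) for i in range(count)))
    return len(results), time.perf_counter() - started
```

Calling `asyncio.run(run_many())` completes in far less time than running the waits one after another, because the waits overlap. The same does not hold for CPU-bound work, which would still block the loop.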
Considered Options
Option 1: In-Memory Async (asyncio.create_task)
Spawning background tasks directly within the web server process.
- Cons: Ephemeral. All pending work is lost on restart. No backpressure management. It introduces the risk of the entire Vessel failing if a background task causes a segmentation fault or Out-of-Memory error.
Option 2: Heavyweight Durable Execution (Temporal)
The industry standard for reliable, long-running workflows.
- Cons: Architectural Overkill. Requires maintaining a Java or Go cluster and additional database engines. The operational complexity contradicts the goal of a self-contained, lightweight daemon.
Option 3: Async Database Queue (SAQ)
Utilizing a lightweight, async-native queue backed by Postgres SKIP LOCKED and integrated into the backend framework.
- Pros:
- Minimalism: Reuses the existing database infrastructure; no new services to manage.
- Atomic Workflows: Allows a "Save and Enqueue" operation to occur within a single SQL transaction.
- Efficiency: The SKIP LOCKED mechanism lets concurrent workers claim jobs without blocking on rows another worker has already locked, avoiding the lock contention of legacy database queues.
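The "Save and Enqueue" property can be sketched as follows. This is an illustration under stated assumptions, not SAQ's API: the standard-library `sqlite3` module stands in for Postgres so the example runs anywhere, and the table names `documents` and `queue_jobs` and the function name `ingest_document` are hypothetical. It shows only the transactional half; the SKIP LOCKED claim path is not modeled.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory stand-in for the unified database (assumption: not Postgres).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE queue_jobs ("
        "id INTEGER PRIMARY KEY, function TEXT NOT NULL, kwargs TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_and_enqueue(conn: sqlite3.Connection, title: str, fail: bool = False) -> None:
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the domain row and the job row become visible together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO documents (title) VALUES (?)", (title,))
        conn.execute(
            "INSERT INTO queue_jobs (function, kwargs) VALUES (?, ?)",
            ("ingest_document", json.dumps({"document_id": cur.lastrowid})),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


def pending_jobs(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM queue_jobs").fetchone()[0]
```

A failed call leaves neither a document nor a job behind, so a worker can never claim a job whose state change was rolled back.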
Decision Outcome
SAQ is adopted as the engine for the background workers, referred to as Ghouls.
1. The Architecture of Labor
The Worker (Ghoul) is executed as a separate operating system process from the Web Server (Vessel), though they share the same codebase and dependencies and connect to the same database.
- The Engine: The worker utilizes the SAQPlugin provided by the Backend (11) to ensure identical configuration and dependency injection.
- The queue Chamber: Jobs are serialized into the dedicated queue schema within the Phylactery (06). This ensures that background labor is subject to the same Snapshot (07) and persistence laws as the rest of the system.
- Async Efficiency: Because the Ghouls run on an asynchronous event loop, a single process can manage thousands of concurrent tasks (e.g., awaiting a response from a remote A2A peer or a slow local model) without exhausting system threads.
- Worker Profile Binding (Topology Split): To enforce the Dual-Plane Trust Delta, queue definitions are maintained globally, but worker execution loops are conditionally bound. Environment variables such as LYCHD_WORKER_PROFILE decide which queues a process may claim at boot. The Vessel boots under the core profile for trusted orchestration tasks, while the Tomb boots under the tomb profile for untrusted code-execution tasks. This separation prevents a malicious payload from jumping execution queues by overwhelming a trusted worker.
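Worker Profile Binding can be sketched as a lookup that fails closed. Only the variable name LYCHD_WORKER_PROFILE and the profile names core and tomb come from this document; the queue names and the `claimable_queues` helper are hypothetical.

```python
import os
from collections.abc import Mapping

# Hypothetical queue names; the real global queue definitions are not shown here.
PROFILE_QUEUES: dict[str, frozenset[str]] = {
    "core": frozenset({"default", "memory"}),
    "tomb": frozenset({"tomb"}),
}


def claimable_queues(env: Mapping[str, str] = os.environ) -> frozenset[str]:
    # Resolve the queues this process may claim, decided once at boot.
    profile = env.get("LYCHD_WORKER_PROFILE")
    if profile not in PROFILE_QUEUES:
        # Fail closed: an unknown or missing profile claims nothing.
        raise RuntimeError(f"unknown or missing LYCHD_WORKER_PROFILE: {profile!r}")
    return PROFILE_QUEUES[profile]
```

Refusing to boot on an unrecognized profile is a design choice in this sketch: a misconfigured container then claims no queues rather than silently defaulting to the trusted set.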
2. The Doctrine: Brain in the Vessel, Hands in the Tomb
All cognitive labor—agent graph runners, LLM inference orchestration, Dispatcher resolution, memory curation—executes exclusively in the Vessel. The Tomb is a brainless executor. It receives serialized script payloads (Python code, CLI commands) via SAQ, runs them inside the nono sandbox, and returns stdout. It does not run agent logic, graph state machines, or make LLM provider calls.
The clever split is anatomical: agents live in the Vessel; when they need unsafe labor, only their hands enter the Tomb. A Tomb Ghoul is therefore an execution hand for a Vessel-side agent, not a second agent brain.
This doctrine exists because:
- State locality: Agent graph state lives in Vessel process memory. Keeping it there eliminates the need to serialize complex graph state across process boundaries.
- Security: The Tomb never needs LLM provider credentials, Dispatcher access, or graph runner dependencies. Its attack surface is minimal.
- Routing simplicity: The Vessel's Dispatcher and Orchestrator have instant visibility into all agent state because it never leaves Vessel memory. Tomb returns are just strings.
- Latency irrelevance: The SAQ queue hop (~50ms DB read) is negligible compared to multi-second LLM inference times.
Tomb Execution Flow
- A Vessel-side agent or Ghoul running a graph step needs code executed.
- It serializes the payload (script text, environment, dependency list) and enqueues it to the tomb SAQ queue.
- A Tomb executor loop claims the job using its narrow execution credential.
- The Tomb executor uses uv to fast-install any required dependencies into a job-scoped temporary workspace.
- The Tomb executor spawns nono with the enriched workspace. nono has zero network access and cannot read the container's environment variables.
- nono executes the script and captures stdout/stderr.
- The Tomb executor writes the result back to SAQ.
- The Vessel Ghoul receives the result string and continues the graph step.
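The execution step of this flow can be sketched in miniature. This is a stand-in, not the Tomb executor: the invocation of uv and nono is not specified in this document, so a plain Python subprocess with an empty environment replaces both, and it provides no sandboxing or network isolation. `run_payload` is a hypothetical name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_payload(script_text: str, timeout: float = 30.0) -> dict:
    # Job-scoped temporary workspace, removed when the block exits.
    with tempfile.TemporaryDirectory(prefix="tomb-job-") as workspace:
        script = Path(workspace) / "payload.py"
        script.write_text(script_text, encoding="utf-8")
        proc = subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated interpreter mode
            cwd=workspace,
            env={},  # the child sees none of the parent's environment variables
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    # The result is plain strings, matching "Tomb returns are just strings".
    return {"returncode": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
```

The empty `env` mirrors the rule that executed code cannot read the container's environment; everything else about real isolation is assumed to be nono's responsibility.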
Untrusted Returns
Tomb stdout is untrusted. If the executed code processed data fetched through the Tomb loop's approved prefetch/proxy path, the output may contain adversarial content including prompt injection attempts. Tool outputs returning from the Tomb must be treated as untrusted when injected into agent context.
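One way to keep that distinction visible in code is to carry Tomb output in its own type and render it as delimited data. This is a hypothetical sketch: `UntrustedText`, `wrap_tomb_output`, `render_for_context`, and the delimiter format are all assumptions, and delimiters alone do not neutralize prompt injection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UntrustedText:
    source: str
    text: str


def wrap_tomb_output(stdout: str, max_chars: int = 4000) -> UntrustedText:
    # Clip the output so a runaway script cannot flood the agent context.
    return UntrustedText(source="tomb", text=stdout[:max_chars])


def render_for_context(item: UntrustedText) -> str:
    # Remove delimiter look-alikes so the payload cannot close the block early.
    body = item.text.replace("<<<", "").replace(">>>", "")
    return f"<<<UNTRUSTED source={item.source}>>>\n{body}\n<<<END UNTRUSTED>>>"
```

Keeping the value in a distinct type until the last moment makes it harder to concatenate Tomb output into a prompt by accident.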
Per-Job Workspace Isolation
Multiple Tomb Ghouls may operate concurrently against the same Tomb workspace and artifact region. To prevent file collisions, every SAQ job must create a unique, isolated subdirectory under the Tomb job root (e.g., ~/.local/share/lychd/tomb/jobs/<job_id>/). The spawning Ghoul is responsible for cleanup after result collection.
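A minimal sketch of per-job isolation, assuming the job root is passed in by the caller. `job_workspace` is a hypothetical helper; it rejects job IDs that would resolve outside the job root and refuses to reuse an existing directory.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def job_workspace(job_root: Path, job_id: str):
    # Reject IDs that are empty, contain path separators, or point upward.
    if not job_id or job_id in {".", ".."} or Path(job_id).name != job_id:
        raise ValueError(f"unsafe job id: {job_id!r}")
    workspace = job_root / job_id
    workspace.mkdir(parents=True, exist_ok=False)  # never reuse a directory
    try:
        yield workspace
    finally:
        # Cleanup runs after the caller has finished collecting results.
        shutil.rmtree(workspace, ignore_errors=True)
```

Results should be collected inside the `with` block, since the directory is removed as soon as the block exits.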
3. Orchestrated Labor (The Command)
The Ghouls operate under the strict discipline of the Orchestrator (23).
- The Pause: When the Orchestrator initiates a Coven (08) swap, it issues a signal to the Ghoul process to pause the claiming of new jobs from the queue. This ensures that no tasks are dispatched to container services that are about to be banished.
- The Drain: Once a new Coven is manifested, the Orchestrator unpauses the Ghouls, allowing them to resume their labor with the newly available hardware capabilities.
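The pause and resume behavior can be sketched with an in-process gate. This is an illustration only: how the Orchestrator actually signals the Ghoul process is not specified here, an `asyncio.Queue` stands in for the database queue, and `ClaimGate`, `worker_loop`, and `demo` are hypothetical names.

```python
import asyncio


class ClaimGate:
    """Open by default; while paused, workers stop claiming new jobs."""

    def __init__(self) -> None:
        self._open = asyncio.Event()
        self._open.set()

    def pause(self) -> None:
        self._open.clear()

    def resume(self) -> None:
        self._open.set()

    async def wait_until_open(self) -> None:
        await self._open.wait()


async def worker_loop(gate: ClaimGate, queue: asyncio.Queue, done: list,
                      poll_interval: float = 0.01) -> None:
    while True:
        await gate.wait_until_open()
        # No await between the gate check and the claim, so a pause
        # cannot slip in between them on a single event loop.
        try:
            job = queue.get_nowait()
        except asyncio.QueueEmpty:
            await asyncio.sleep(poll_interval)
            continue
        if job is None:  # sentinel: shut the worker down
            return
        done.append(job)


async def demo() -> tuple[list, list]:
    gate, queue, done = ClaimGate(), asyncio.Queue(), []
    worker = asyncio.create_task(worker_loop(gate, queue, done))
    gate.pause()
    queue.put_nowait("job-1")
    await asyncio.sleep(0.05)
    during_pause = list(done)  # nothing is claimed while paused
    gate.resume()
    queue.put_nowait(None)
    await worker
    return during_pause, done
```

The gate only stops new claims; a job already in progress when `pause()` is called runs to completion, which matches letting active work reach a safe boundary.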
Queue and Recovery Boundary
Workers own durable queue state: claim, ack, retry, result recording, and crash pickup. The Orchestrator may pause and drain workers during physical transitions, but it does not decide replay semantics for every job. Ordinary hardware stasis stops new claims and lets active work reach a safe boundary; Long Sleep, reboot recovery, and failed job retry are queue and Phylactery concerns.
A worker may hold non-authoritative in-memory state while it is actively laboring. After process death, that breath is lost and reconstructed from durable inputs: queued jobs, graph checkpoints, completed step outputs, traces, Codex configuration, and live capability probes. The testable invariant is not that every intermediate thought survives; it is that every declared recovery boundary can be replayed or safely abandoned.
4. The Reflex Arc and Memory Rituals
The Ghouls are the primary drivers of the Daemon's long-term cognitive processes.
- The Reflex Arc: The Worker process is responsible for the rehydration of complex state machines. When a cognitive process pauses to await an external event, its state is persisted. The Ghoul is the entity that wakes the mind, rehydrates the Graph (24) state, and steps the logic forward.
- Ingestion Rituals: The Ghouls perform the heavy lifting of Memory (27). They execute the partitioning of documents and the communication with the Dispatcher (22) to generate embeddings, ensuring the primary interface remains responsive during ingestion.
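The Reflex Arc can be sketched as a checkpoint that is rehydrated and stepped forward. The three-node graph, the JSON checkpoint layout, and `step_until_blocked` are hypothetical; the real Graph (24) state format is not described in this document.

```python
import json
from typing import Callable, Optional

State = dict
Node = Callable[[State], Optional[str]]  # returns the next node name, or None when done


def draft(state: State) -> Optional[str]:
    state["draft"] = f"question about {state['topic']}"
    return "await_reply"


def await_reply(state: State) -> Optional[str]:
    # Stay on this node until an external event has supplied a reply.
    return "summarize" if "reply" in state else "await_reply"


def summarize(state: State) -> Optional[str]:
    state["summary"] = f"{state['draft']} -> {state['reply']}"
    return None


GRAPH: dict[str, Node] = {"draft": draft, "await_reply": await_reply, "summarize": summarize}


def step_until_blocked(checkpoint: str) -> str:
    """Rehydrate a checkpoint, advance until the graph waits or ends, return the new checkpoint."""
    data = json.loads(checkpoint)
    node, state = data["node"], data["state"]
    while node is not None:
        next_node = GRAPH[node](state)
        if next_node == node:  # the node is waiting on an external event
            break
        node = next_node
    return json.dumps({"node": node, "state": state})
```

Because each call takes and returns only a serialized checkpoint, the process that resumes the graph does not need to be the one that paused it.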
Metabolic Ghoul Profile
Memory augmentation runs as a dedicated Ghoul specialization:
- Performs Memori "Advanced Augmentation" (facts, entities, triples) asynchronously.
- Applies attribution on every write (entity_id, process_id) before committing to the Phylactery.
- Never blocks user-facing response paths; ingestion is eventual and durable.
- Defers heavy embedding/vectorization to available embedding covens under Orchestrator discipline.
Curator Ghoul Profile
Memory curation runs as a separate periodic Ghoul specialization:
- Scores candidate memories using recency, reinforcement, confidence, and contradiction checks.
- Applies lifecycle transitions: promote, keep, archive, prune.
- Preserves anchored identity facts regardless of decay score.
- Emits audit traces for every destructive prune action to support rollback and policy tuning.
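A sketch of how such scoring might map to lifecycle transitions. The weights, half-life, thresholds, and the `Memory` fields are illustrative assumptions, not the Curator's actual policy; only the four transition names and the anchoring rule come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    age_days: float
    reinforcements: int
    confidence: float  # expected range 0.0 to 1.0
    contradicted: bool = False
    anchored: bool = False


def score(m: Memory, half_life_days: float = 30.0) -> float:
    recency = 0.5 ** (m.age_days / half_life_days)          # exponential decay
    reinforcement = 1.0 - 1.0 / (1.0 + m.reinforcements)    # saturates toward 1
    s = 0.4 * recency + 0.3 * reinforcement + 0.3 * m.confidence
    return s * 0.5 if m.contradicted else s                 # contradiction penalty


def transition(m: Memory) -> str:
    if m.anchored:
        return "keep"  # anchored identity facts bypass the decay score
    s = score(m)
    if s >= 0.75:
        return "promote"
    if s >= 0.4:
        return "keep"
    if s >= 0.15:
        return "archive"
    return "prune"
```

For instance, a fresh, well-reinforced, high-confidence memory lands in promote, while a year-old, unreinforced, low-confidence one lands in prune unless it is anchored.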
5. Extension Rites
The architecture allows extensions to register their own background functions (Rites). This ensures that heavy logic added by extensions (e.g., document processing or code compilation) does not degrade the performance of the core Vessel.
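Rite registration could look like the following sketch. The `rite` decorator, the `RITES` registry, and the example function are hypothetical; the actual extension API is not defined in this section.

```python
import asyncio
from collections.abc import Awaitable, Callable

RiteFn = Callable[..., Awaitable[object]]

# Hypothetical registry mapping a rite name to its background function.
RITES: dict[str, RiteFn] = {}


def rite(name: str) -> Callable[[RiteFn], RiteFn]:
    def decorator(fn: RiteFn) -> RiteFn:
        if name in RITES:
            # Two extensions must not silently overwrite each other's rites.
            raise ValueError(f"rite already registered: {name}")
        RITES[name] = fn
        return fn
    return decorator


@rite("ext.compile_document")
async def compile_document(ctx: dict, *, path: str) -> str:
    # Placeholder for heavy work that should run on a Ghoul, not in the Vessel.
    return f"compiled:{path}"
```

A worker would look the function up by name when it claims a job, so the extension's heavy logic never runs inside the web process.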
6. Dual-Plane Trust Delta
Worker ownership spans both the Trusted and Semi-Trusted planes.
- Vessel workers remain fully trusted for control-plane tasks.
- Tomb workers are Semi-Trusted execution hands. The main Python loop in the Tomb container uses a narrow queue-only SAQ/Postgres execution credential to claim, ack, and retry execution-plane jobs.
- Untrusted Sub-steps: Real unsafe labor (executing AI code) is spawned inside the nono sandbox by the Tomb worker loop. The sandbox has zero network access.
- If an attacker escapes a nono sandbox, they are trapped in the Tomb container. They may steal the narrow SAQ/Postgres execution credential from the environment, but Layer 7 Auth prevents them from accessing Vessel's master tables, provider keys, signing keys, or control-plane secrets.
Policy Table
| Dimension | Vessel Workers (Trusted Control Plane) | Tomb Executors (Semi-Trusted Execution Plane) |
|---|---|---|
| Secrets | Accesses control-plane queue/database credentials and high-value API keys. | Narrow queue-only SAQ/Postgres execution credential. No provider keys, signing keys, Codex secrets, or control-plane credentials. |
| Mounts | Trusted mounts for queue processing and persistence orchestration. | Task workspace and temporary execution mounts; optional read-only/sanitized Codex projection only. |
| Network | Shared Pod network (Internet + Localhost). | Tomb loop may use shared Pod connectivity for queueing and approved prefetch/proxy work; sandboxed nono subprocesses have zero network. |
| Queue Ownership | Owns enqueue policy, durable scheduling, and retry lifecycle for core tasks. | Claims, acks, and retries untrusted execution jobs via the Semi-Trusted loop. |
| Authority Boundaries | Commits durable outcomes and controls retries. All agent/graph/LLM logic runs here. | Executes raw scripts/commands only. No agent logic, no graph runners, no LLM calls. Cannot mutate core infrastructure state. |
Consequences
Positive
- Operational Resiliency: The Daemon is crash-tolerant: after a failure, work resumes from the last successfully committed task in the Phylactery.
- Physical Synchronization: By linking job claiming to the Orchestrator, the system prevents "Task Blindness" where a worker attempts to use a dormant container.
- Unified Logic: Using the same framework and database for both web and background tasks eliminates the "Dual Schema" problem.
Negative
- Database Churn: High-volume queues generate significant dead tuples. The queue chamber requires aggressive Autovacuum tuning within the persistence layer.
- Polling Latency: While sub-second, a database-backed queue has slightly higher job-pickup latency compared to an in-memory or raw-socket broker.
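The autovacuum tuning mentioned above could be applied per table. The sketch below only builds the statement: the table name is hypothetical and the values are illustrative, not measured recommendations. `autovacuum_vacuum_scale_factor` and `autovacuum_vacuum_threshold` are standard PostgreSQL storage parameters.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def autovacuum_sql(schema: str, table: str,
                   scale_factor: float = 0.01, threshold: int = 1000) -> str:
    # Accept only plain lowercase identifiers, since they are interpolated into SQL.
    for name in (schema, table):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"ALTER TABLE {schema}.{table} SET ("
        f"autovacuum_vacuum_scale_factor = {scale_factor}, "
        f"autovacuum_vacuum_threshold = {threshold})"
    )
```

A lower scale factor makes autovacuum trigger after a smaller fraction of rows has changed, which suits a table where rows are inserted and removed constantly.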