Tag: openclaw

  • Why I picked NanoClaw over OpenClaw for a GTM pipeline

    Before getting into the comparison itself, one piece of context worth setting straight: neither OpenClaw nor NanoClaw is built for a non-technical audience. Both expect comfort with the command line, git, and at least one model provider’s API setup. Both reward fluency with the underlying stack. The marketing copy on both sites pitches “personal AI assistant for everyone,” which is aspirational. The reality today is that you need to know what pnpm install does and roughly what a Docker container is to get either one running smoothly.

    That said, the two frameworks make different trade-offs within that technical-user space, and which trade-off is right for you depends on what you’re actually building.

    OpenClaw is the more monolithic, more featureful option. It ships a guided CLI onboarding (openclaw onboard), supports multiple LLM providers natively (Anthropic, OpenAI, local), has a Companion App for macOS that gives you a menubar interface, and includes browser control, persistent memory, and dozens of community-built skills out of the box. The trade is operational complexity — ~434,000 lines of code, 70+ dependencies, single Node process with shared memory — and a security model that relies on application-level checks rather than OS isolation. Recent CVEs and security writeups in this space have mostly been OpenClaw-shaped.

    NanoClaw is the lighter, more opinionated alternative. ~3,900 lines of code, fewer than 10 dependencies, agents in isolated Linux containers with explicit mounts, single host process orchestrating per-session containers. Credential handling is delegated to OneCLI (NanoClaw’s companion credential proxy), which injects API keys at request time so agents never hold raw secrets — meaningful when an agent has bash access inside its container. The trade is that NanoClaw is built natively around the Claude Agent SDK — Claude Code is the primary harness, the slash commands that install channels and providers run inside Claude Code, and other providers (Codex, OpenCode, Ollama) are drop-in alternatives rather than peers. There’s no menubar app, no built-in dashboard, no UI beyond the chat channels you’ve installed. The codebase is small enough that “ask Claude Code to walk you through it” is a realistic onboarding strategy.

    For a personal-assistant use case, the OpenClaw trade-off probably wins. More features out of the box, more flexibility on providers, easier to bring up if you’re not already deep in the Claude Code ecosystem. For a pipeline-shaped workload — GTM, document processing, anything where the workflow exists independent of conversation — NanoClaw is structurally a better fit, and Claude Code being the assumed harness is actually an advantage rather than a constraint, because Claude Code is arguably the most capable AI authoring tool available right now and the framework is built around it.

    I went through both before settling. Here’s the rest of the comparison through the pipeline-shape lens.

    The shape of the workload matters

    A personal assistant is reactive. You send it something. It figures out what you meant, picks a tool, runs the tool, replies. The workflow is whatever the conversation is.

    A pipeline is the opposite. There’s a state machine. There are stages. Each prospect, ticket, document, or whatever the unit is moves through stages on its own clock. Some get stuck. Some get rerouted. Some need to be remembered six months later when a specific signal lights up. The workflow exists independent of any conversation.

    These two workloads want different things from a framework. The assistant wants flexibility, channels, plug-and-play tools, an LLM that figures out what to do. The pipeline wants determinism between stages, replayable routing, dry-run capability, an LLM that does bounded judgment work inside a stage.

    This is the lens that matters. Most framework comparisons are feature bake-offs. The actual question is which workload shape you’re building.

    Three things that didn’t survive OpenClaw for me

    Routing. OpenClaw’s agent picks what to do based on the inbound and its own reasoning. That’s the right model for “summarise my inbox” and the wrong model for “transition prospect ABC from awaiting-reply to unresponsive after 14 days.” The second decision has to be deterministic, replayable, dry-runnable, and outside the LLM. Tool-call routing is fine when the cost of a wrong decision is small. In a GTM pipeline a wrong routing decision is a duplicate touch, a wrong segment, a compliance breach.

    You can wire OpenClaw to do deterministic routing — through skill conditions, scheduled triggers, scripted control flow — but you’re working against the framework’s grain. Every hour spent there is an hour reinventing what a state machine engine gives you for free.
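    To make the “14 days to unresponsive” example concrete, here is a minimal sketch of what a deterministic transition rule looks like when it lives outside the LLM. All names here (Prospect, nextState, the state labels) are illustrative for this post, not a NanoClaw or OpenClaw API:

```typescript
// Illustrative transition rule, kept outside the LLM. Pure function:
// same inputs, same output — replayable and dry-runnable by construction.
type ProspectState = "awaiting-reply" | "unresponsive" | "replied" | "not-now";

interface Prospect {
  id: string;
  state: ProspectState;
  lastTouchAt: Date; // when the last outbound touch went out
}

const DAY_MS = 24 * 60 * 60 * 1000;

function nextState(p: Prospect, now: Date): ProspectState {
  // 14 days of silence after a touch → unresponsive, every time, no reasoning involved.
  if (p.state === "awaiting-reply" && now.getTime() - p.lastTouchAt.getTime() >= 14 * DAY_MS) {
    return "unresponsive";
  }
  return p.state; // no rule fired: stay put
}

// Dry run: compute the transition without committing it.
const abc: Prospect = { id: "ABC", state: "awaiting-reply", lastTouchAt: new Date("2026-01-01") };
console.log(nextState(abc, new Date("2026-01-20"))); // 19 days elapsed → "unresponsive"
```

    Because the rule is a pure function over explicit inputs, a dry run is just calling it without persisting the result — which is exactly what tool-call routing can’t give you.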

    Per-skill model preference. Pipelines benefit from heterogeneity. Small fast models for bulk discovery and augmentation. Larger models for content polish. Different providers for redundancy. OpenClaw supports multiple LLM backends as a first-class feature — you can configure Anthropic, OpenAI, or local models — but the routing decisions are made within the agent’s own reasoning rather than at the framework level. For a pipeline you want the framework to route deterministically based on skill family, not let the agent pick its own provider per call.

    NanoClaw’s approach is the opposite: provider is configured per agent group, one provider per group, multiple groups in parallel. That maps directly to “discovery and augmentation in one group on Nemotron, polish in another group on Claude.” Per-task provider hints would be cleaner, but group-level routing is what works today, and for most pipelines it’s enough because the natural skill boundaries align with provider preferences anyway.
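    A sketch of what group-level provider routing looks like in practice. The field names and shapes here are assumptions for illustration, not NanoClaw’s actual config schema:

```typescript
// Hypothetical group-level provider routing — one provider per group,
// multiple groups in parallel. Not NanoClaw's real config format.
interface AgentGroup {
  name: string;
  provider: "nemotron" | "claude"; // one provider per group
  skills: string[];                // skill families this group runs
}

const groups: AgentGroup[] = [
  { name: "bulk",   provider: "nemotron", skills: ["discovery", "augmentation"] },
  { name: "polish", provider: "claude",   skills: ["drafting"] },
];

// Deterministic lookup: the framework decides the provider from the skill
// family, rather than letting the agent pick per call.
function providerFor(skill: string): string {
  const group = groups.find((g) => g.skills.includes(skill));
  if (!group) throw new Error(`no group mounts skill: ${skill}`);
  return group.provider;
}

console.log(providerFor("augmentation")); // → "nemotron"
```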

    Operating cost. OpenClaw runs a websocket gateway with constant background activity. mDNS service discovery, periodic health probes, channel reconnect loops. On a 1GB droplet it spent most of its capacity on its own metabolism. Bumping the VPS works, but the symptom is telling.

    NanoClaw is much quieter at idle. The host process owns message queues, agent containers spin up per task, channels are explicit and minimal. A 2GB droplet has plenty of headroom for a working pipeline plus orchestrator plus operator UI.

    What NanoClaw doesn’t do, and why that’s useful

    NanoClaw has no built-in orchestrator. No state machine engine. No artifact store. No journal writer. No skill dispatcher. No dry-run harness. No business logic of any kind.

    For an assistant, this is missing functionality. For a pipeline, it’s the right scope.

    The orchestrator is the part that’s specific to your workflow. State transitions, when to retry, when to dead-letter, what counts as completion, what triggers the next stage. Building it as plain code (in any language; mine is TypeScript) means it stays readable, testable, and replaceable. NanoClaw runs the channel adapters and the agent containers. The orchestrator runs the workflow. They talk through structured task contracts.
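    The “structured task contracts” boundary can be sketched in a few lines. The field names below are assumptions for illustration, not a NanoClaw wire format:

```typescript
// Illustrative shape of the contract between a custom orchestrator and the
// agent runtime. The orchestrator owns the workflow; the runtime only
// executes tasks it is handed.
interface AgentTask {
  taskId: string;
  skill: string;                  // which skill the agent should run
  input: Record<string, unknown>; // stage-specific payload
}

interface AgentResult {
  taskId: string;
  output: Record<string, unknown>;
  confidence: number;             // the agent reports its own uncertainty
  evidence: string[];             // sources behind the output
}

const inboundQueue: AgentTask[] = [];

function dispatch(task: AgentTask): void {
  inboundQueue.push(task); // in practice: write to the agent group's inbound queue
}

dispatch({ taskId: "t-1", skill: "augment-profile", input: { prospectId: "ABC" } });
console.log(inboundQueue.length); // → 1
```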

    The trade is real: you write more code to start. The benefit is real: you understand and own every line of the pipeline that matters.

    What both share

    The skills system. Both frameworks treat skills as SKILL.md markdown files that the agent reads and executes. The same skills can technically run on either framework with minor adjustments, though the agent configuration files differ — OpenClaw uses SOUL.md for agent personality and config, NanoClaw uses CLAUDE.md for the same purpose. So you’re not locked into a framework by your skills library — you’re picking the framework that runs them at the right architectural layer.

    Both also lean on Claude Code as a useful authoring layer, though the relationship is different. NanoClaw is explicit about it — the slash commands that install channels and providers run inside Claude Code and copy source files into your fork from long-lived branches. OpenClaw is more flexible: you can author with Claude Code, edit config files by hand, or use whatever AI coding tool you prefer, including the built-in agents. Either way, having Claude Code in the loop is the best authoring experience available right now for both — it’s just that NanoClaw treats it as the assumption while OpenClaw treats it as one option among several.

    The forking model

    NanoClaw’s other design choice worth flagging: it’s opinionated about you forking the repo and treating the fork as your install. There’s no config-as-data layer that abstracts away your customizations. If you want different behaviour, you change the code. The codebase is small enough that this is safe.

    This is a discipline. It means every customization is a code change you can read and revert. It also means setup feels heavier than OpenClaw’s onboarding. For a pipeline you’ll be running for months, that’s the right trade. For a weekend assistant project, it’s overkill.

    The decision criteria, condensed

    Pick OpenClaw if:

    • You want a personal assistant that responds to messages on channels
    • The workflow is whatever the conversation is
    • You want maximum provider flexibility (Anthropic, OpenAI, local models all first-class)
    • You want a menubar app and guided onboarding out of the box
    • You’re fine with the larger codebase and application-level security model

    Pick NanoClaw if:

    • You’re building something with a state machine — pipeline-shaped, not chat-shaped
    • The workflow exists independent of any conversation
    • You need deterministic routing, dry-runs, replay
    • You want different providers for different stages, configured per agent group
    • You’re deep enough in Claude Code to leverage it as the authoring layer
    • You want OS-level container isolation as your security model
    • You’re willing to write the orchestrator yourself (and would rather, because you want to own the workflow logic)

    Worth knowing

    NanoClaw is younger and more spartan around setup edges — both because it does less by design and because the project is moving fast. If you hit a setup gotcha, the answer is usually in the docs and a quick edit by Claude Code resolves it. Filing an issue and waiting is the slower path. The flip side: the codebase is small enough that you can read all of it, and Claude Code can confidently make changes across it.

    OpenClaw has the larger community, more channel adapters in stable shape, and a richer ecosystem of community skills (ClawHub, the OpenClaw skills marketplace, has hundreds). If you’re operating in personal-assistant territory, those network effects matter. For pipelines, they don’t.

    Worth flagging for context: OpenClaw’s creator, Peter Steinberger, joined OpenAI in February 2026, with the project continuing as open source. The project’s velocity has been impressive but the security model has also been the subject of multiple writeups — anyone evaluating it for production should read the security analyses alongside the marketing copy.

    The full GTM system this comparison feeds into is in Building a GTM dark factory with Nemotron 3 and NanoClaw. For setup specifics — what it takes to actually run NanoClaw end to end — see the companion piece.

  • Building a GTM dark factory with Nemotron 3 and NanoClaw


    Outbound has a failure mode anyone running a B2B pipeline has hit. Go wide and the response rates collapse, the domain gets filtered, the brand looks like every other vendor blasting templates. Go narrow and the volume can’t sustain a business. The middle path — per-prospect research, context-aware first touches, disciplined follow-ups — used to need an army of SDRs.

    What the system below builds toward is functionally an AI-native CRM with marketing automation, segmentation, and funnels. It’s the same business object that SaaS stacks like HubSpot, Salesforce + Marketo, or Apollo + Outreach + Clay assemble from a dozen subscriptions and a small ops team. Traditionally that operation is human-fronted at every stage: defining segments, enriching records, writing sequences, reviewing replies, tuning the funnel. Tools speed each step but don’t change the shape. Humans are in every loop because the judgment work is theirs.

    The dark factory operating model changes that. GTM is unusually well-suited to it because it’s a closed-loop domain. Every action generates measurable feedback: opens, replies, meetings booked, deals closed, a journal of what worked and what didn’t. That feedback is what lets skills earn autonomy on evidence rather than wishful thinking, graduating from copilot mode (operator approves each output) to dark factory mode (autonomous, with sampling and exception escalation). Volume goes up because agents work on more prospects in parallel than any human can. Consistency goes up because the contract on the wire enforces it. The operator’s role compresses from reviewing every output to reviewing what the journal flags.

    The building blocks are NanoClaw as the agent and channel runtime, Nemotron 3 Super as the bulk runtime model alongside Claude for polish, and Claude Code and Codex as the authoring layer. None of them is a CRM. Composed together, with a state machine and journal sitting above them, they become one.

    What the engine does

    The engine takes a hypothesis (e.g. “healthcare companies publicly investing in compliance automation are good prospects”) and produces a queue of prospects with structured profiles, draft first-touches in a collab-partner voice, and context packs for the channels where execution stays manual (LinkedIn, anything high-touch). The operator reviews and approves drafts. Email goes out via Resend with proper deliverability hygiene. Replies route through an inbound webhook, get classified, and trigger state transitions. The journal records every decision with rationale, confidence, alternatives considered, and source evidence.

    Two things distinguish it from the standard funnel.

    The qualifying signal is behavioural rather than firmographic. “This company’s CEO talked publicly about scaling regulatory automation last quarter” beats “this company has 80 employees in three cities.” The second tells you a company exists. The first tells you something is happening there worth a conversation.

    Disqualification states are first class: not a fit, not now, unreachable, unresponsive, do not contact, conflict. None of these are fallbacks at the edge of the state machine. They’re destinations the orchestrator routes to deliberately. A prospect that hit “not now” with a specific signal six months ago is a different lead than one that’s been silent. The state machine has to remember the difference.
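    In code, first-class disqualification means the states carry data, not just labels. The state names follow the list above; the shapes are illustrative, not a prescribed schema:

```typescript
// Disqualification destinations as first-class states. A "not now" remembers
// why it was parked and when to look again — illustrative shapes only.
type Disqualified =
  | { kind: "not-a-fit" }
  | { kind: "not-now"; signal: string; revisitAfter: Date }
  | { kind: "unreachable" }
  | { kind: "unresponsive" }
  | { kind: "do-not-contact" }
  | { kind: "conflict" };

// A "not now" with a concrete signal is a different lead than one that went silent.
function isRevisitDue(d: Disqualified, now: Date): boolean {
  return d.kind === "not-now" && now >= d.revisitAfter;
}

const parked: Disqualified = {
  kind: "not-now",
  signal: "budget freeze until Q3",
  revisitAfter: new Date("2026-07-01"),
};
console.log(isRevisitDue(parked, new Date("2026-08-01"))); // → true
```

    The discriminated union is the point: the orchestrator can’t route to “not now” without saying why and when, which is exactly the memory the state machine needs six months later.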

    Operator in the loop, then less of it

    The two-mode model deserves a closer look because it’s where the architecture earns its keep. Copilot and dark factory aren’t synonyms for “manual” and “automated.” They’re different relationships between the operator and the agent group. Copilot is the operator approving every output and using the journal to spot patterns. Dark factory is the operator sampling outputs, reading exception escalations, and trusting the rubric for the rest. Some skills move between them in weeks. Some never graduate. Drafting outbound to a high-value prospect is a copilot job forever. Augmenting an early-funnel profile from public sources isn’t.

    Claude Code and Codex sit on the operator side of this loop, not the agent side. They edit the orchestrator, write skills, debug runs, apply patches. The agents inside NanoClaw containers run the domain skills, not the authoring code. The operator stitches the two layers together until each carries more on its own.

    Why this architecture for a GTM pipeline

    The framework choice matters because pipelines aren’t assistants. I started on OpenClaw. It’s the more featureful framework on paper, with channels, providers, scheduled tasks, and a guided onboarding flow all in one package. The pitch is right for a personal assistant. You point it at your stuff, it runs.

    For a GTM pipeline it’s the wrong shape. OpenClaw’s agent picks what to do based on the inbound and its own reasoning. That’s the right model for “summarise my inbox” and the wrong model for “transition prospect ABC from awaiting-reply to unresponsive after 14 days.” The second decision has to be deterministic, replayable, dry-runnable, and outside the LLM. Tool-call routing is fine when the cost of a wrong decision is small. In a GTM pipeline a wrong routing decision is a duplicate touch, a wrong segment, a compliance breach.

    NanoClaw makes the opposite design choice. It does less. It runs the channel adapters, one container per agent group, and a host process that owns the message queues. Skills are markdown files mounted into containers. There’s no built-in orchestrator, no business logic, no opinion on your workflow. For an assistant that would be missing functionality. For a pipeline it’s the right scope for the bottom layer.

    The full stack: NanoClaw is the channel and agent runtime. A separate orchestrator (custom code) sits above it and owns the pipeline state machine. Claude Code or Codex sits next to all of it as the authoring layer. The operator sits on top, reviewing outputs, approving drafts, gradually handing off more as each skill earns it. (I’ve written more on the framework comparison itself for those evaluating the two.)

    The orchestrator is plain code. State machine engine, artifact store, journal writer, skill dispatcher, dry-run harness. It dispatches structured tasks to the agent’s inbound queue. The agent runs the skill in its container and writes a result back. The result has to carry, at minimum, what was found, why, how confident the agent is, the alternatives considered and rejected, and the evidence with sources. The orchestrator validates against that contract on read. Validation failure means deterministic retry or dead-letter, never a re-prompt loop. The agent is allowed to be uncertain. It’s not allowed to be silent about it.
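    The “validate against the contract on read” step can be sketched directly. The field names mirror the minimum contract just described; the validator shape is an illustration, not the author’s actual code:

```typescript
// Validating a skill result against the contract on read. Failure means
// deterministic retry or dead-letter — never a re-prompt loop.
interface SkillResult {
  found: unknown;
  rationale: string;
  confidence: number;               // the agent may be uncertain, but must say so
  alternativesConsidered: string[];
  evidence: { claim: string; source: string }[];
}

type Verdict = { ok: true } | { ok: false; action: "retry" | "dead-letter"; reason: string };

const REQUIRED = ["found", "rationale", "confidence", "alternativesConsidered", "evidence"];

function validate(raw: Record<string, unknown>, attempt: number, maxRetries = 2): Verdict {
  const missing = REQUIRED.filter((k) => raw[k] === undefined);
  if (missing.length === 0) return { ok: true };
  const reason = `missing fields: ${missing.join(", ")}`;
  // Retry budget exhausted → dead-letter for operator attention, not another prompt.
  return attempt < maxRetries
    ? { ok: false, action: "retry", reason }
    : { ok: false, action: "dead-letter", reason };
}
```

    Note that a low confidence value passes validation; a missing one doesn’t. Uncertain is allowed, silent is not.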

    Operating mode lives at the agent group, not in the task. A copilot group’s outputs land in a review queue. A dark factory group’s outputs trigger state transitions automatically. Promoting a skill from copilot to dark factory is moving its mount point, not rewriting it.
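    A sketch of mode living at the group level, with promotion as a config move. Group and skill names are hypothetical:

```typescript
// Operating mode on the group, not the task. Copilot output → review queue;
// dark factory output → automatic state transition. Illustrative names.
type Mode = "copilot" | "dark-factory";

const fleet: Record<string, { mode: Mode; skills: string[] }> = {
  review:     { mode: "copilot",      skills: ["draft-outbound", "classify-reply"] },
  autonomous: { mode: "dark-factory", skills: ["augment-profile"] },
};

function routeResult(groupName: string): "review-queue" | "auto-transition" {
  return fleet[groupName].mode === "copilot" ? "review-queue" : "auto-transition";
}

// Graduating a skill: move it between groups — no rewrite of the skill itself.
function promote(skill: string, from: string, to: string): void {
  fleet[from].skills = fleet[from].skills.filter((s) => s !== skill);
  fleet[to].skills.push(skill);
}

promote("classify-reply", "review", "autonomous");
console.log(fleet.autonomous.skills); // → ["augment-profile", "classify-reply"]
```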

    For the model layer: Nemotron 3 Super handles the bulk runtime work. Strong instruction following, long context, throughput that holds up under volume. Augmentation skills that read four or five sources and synthesise a structured profile benefit from the long context: public LinkedIn snippets, recent posts, the company’s own site, a news mention or two. Drafting routes to Claude. The bulk-then-polish chain saves tokens on volume work and keeps the polish pass focused on prose that goes to a human. Nemotron’s free tier covers early-stage development; production volumes need API access. Multi-provider routing is less about feature redundancy and more about not having a single provider’s outage take out the whole pipeline. The orchestrator routes per skill family: bulk runtime to Nemotron, polish to Claude, redundancy keys for either in reserve.
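    The bulk-then-polish chain, reduced to its shape. The provider functions here are stubs standing in for real Nemotron and Claude calls:

```typescript
// Bulk-then-polish chain, stubbed: the cheap model synthesises over raw
// sources; the stronger model only touches prose headed for a human.
type Provider = (prompt: string) => string;

const nemotron: Provider = (p) => `[profile] ${p}`;  // stand-in for the bulk model
const claude: Provider = (p) => `[polished] ${p}`;   // stand-in for the polish model

function draftFirstTouch(sources: string[]): string {
  const profile = nemotron(sources.join(" | ")); // long-context synthesis over all sources
  return claude(profile);                        // polish pass sees only the distilled profile
}

console.log(draftFirstTouch(["linkedin post", "company site"]));
// → "[polished] [profile] linkedin post | company site"
```

    The token economics fall out of the structure: the expensive model never sees the raw sources, only the distilled profile.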

    For setup specifics — Claude Code as the authoring dependency, the no-UI consequence, deployment gotchas a small VPS surfaces — check out the companion piece on what it takes to actually run NanoClaw.

    DPDP Act compliance lives at the journal layer: every artifact change is logged with provenance, deletion requests tombstone the artifact while retaining audit evidence. Easier upfront than retrofitted.
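    What “tombstone the artifact while retaining audit evidence” means mechanically, as a sketch. Shapes and names are illustrative, not a prescribed DPDP implementation:

```typescript
// Tombstoning sketch: a deletion request wipes personal data but keeps the
// audit trail — the fact of deletion, with provenance, survives.
interface JournalEntry {
  artifactId: string;
  action: string;
  provenance: string; // which skill or operator produced the change
  at: Date;
}

interface Artifact {
  id: string;
  data: Record<string, unknown> | null; // null once tombstoned
  tombstoned: boolean;
}

function tombstone(a: Artifact, journal: JournalEntry[], reason: string, now: Date): void {
  a.data = null;        // personal data is gone
  a.tombstoned = true;  // existence and audit evidence retained
  journal.push({ artifactId: a.id, action: `deleted: ${reason}`, provenance: "operator", at: now });
}

const journal: JournalEntry[] = [];
const record: Artifact = { id: "ABC", data: { email: "x@example.com" }, tombstoned: false };
tombstone(record, journal, "erasure request", new Date());
console.log(record.data, journal.length); // → null 1
```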

    What this is, when it’s working

    A GTM dark factory is a specific shape: an AI-native CRM where the determinism lives between tasks and the LLM agency lives inside them. The agent does the bounded judgment work; the orchestrator decides what comes next; the journal holds both accountable. Volume goes up. Variance stays bounded. The operator’s role compresses to where it adds the most value — picking what gets built next, reviewing what the rubric can’t decide, deciding when a skill has earned graduation.

    Outbound that holds shape between wide and narrow doesn’t need an SDR army. It needs orchestration you can trust, a contract on the wire, and the discipline to let skills earn autonomy rather than be granted it. The framework choice is secondary. The split between framework, orchestrator, and authoring layer is what makes it work.