
    Building a GTM dark factory with Nemotron 3 and NanoClaw

    Outbound has a failure mode anyone running a B2B pipeline has hit. Go wide and the response rates collapse, the domain gets filtered, the brand looks like every other vendor blasting templates. Go narrow and the volume can’t sustain a business. The middle path — per-prospect research, context-aware first touches, disciplined follow-ups — used to need an army of SDRs.

    What the system below builds toward is functionally an AI-native CRM with marketing automation, segmentation, and funnels. It’s the same business capability that SaaS stacks like HubSpot, Salesforce + Marketo, or Apollo + Outreach + Clay assemble from a dozen subscriptions and a small ops team. Traditionally that operation is human-fronted at every stage: defining segments, enriching records, writing sequences, reviewing replies, tuning the funnel. Tools speed up each step but don’t change the shape. Humans are in every loop because the judgment work is theirs.

    The dark factory operating model changes that. GTM is unusually well-suited to it because it’s a closed-loop domain. Every action generates measurable feedback: opens, replies, meetings booked, deals closed, a journal of what worked and what didn’t. That feedback is what lets skills earn autonomy on evidence rather than wishful thinking, graduating from copilot mode (operator approves each output) to dark factory mode (autonomous, with sampling and exception escalation). Volume goes up because agents work on more prospects in parallel than any human can. Consistency goes up because the contract on the wire enforces it. The operator’s role compresses from reviewing every output to reviewing what the journal flags.

    The building blocks are NanoClaw as the agent and channel runtime, Nemotron 3 Super as the bulk runtime model alongside Claude for polish, and Claude Code and Codex as the authoring layer. None of them is a CRM. Composed together, with a state machine and journal sitting above them, they become one.

    What the engine does

    The engine takes a hypothesis (e.g. “healthcare companies publicly investing in compliance automation are good prospects”) and produces a queue of prospects with structured profiles, draft first-touches in a collab-partner voice, and context packs for the channels where execution stays manual (LinkedIn, anything high-touch). The operator reviews and approves drafts. Email goes out via Resend with proper deliverability hygiene. Replies route through an inbound webhook, get classified, and trigger state transitions. The journal records every decision with rationale, confidence, alternatives considered, and source evidence.
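    The webhook-to-transition step above can be sketched in a few lines of Python. Everything here is illustrative (the classification labels, `handle_inbound`, the journal shape are hypothetical names, not NanoClaw's or the orchestrator's actual API); the point is that a classified reply maps to a state transition and every transition is journaled:

```python
# Hypothetical mapping from reply classification to pipeline state.
REPLY_TRANSITIONS = {
    "interested": "engaged",
    "not_now": "not_now",
    "unsubscribe": "do_not_contact",
    "hard_bounce": "unreachable",
}

def handle_inbound(classification: str, prospect: dict, journal: list) -> dict:
    """Route a classified reply to a state transition and journal the decision."""
    new_state = REPLY_TRANSITIONS.get(classification, prospect["state"])
    journal.append({
        "prospect": prospect["id"],
        "from": prospect["state"],
        "to": new_state,
        "trigger": classification,
    })
    prospect["state"] = new_state
    return prospect
```

    An unrecognised classification leaves the prospect where it is, which keeps the fallback behaviour explicit rather than buried in the LLM's judgment.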

    Two things distinguish it from the standard funnel.

    The qualifying signal is behavioural rather than firmographic. “This company’s CEO talked publicly about scaling regulatory automation last quarter” beats “this company has 80 employees in three cities.” The second tells you a company exists. The first tells you something is happening there worth a conversation.

    Disqualification states are first class: not a fit, not now, unreachable, unresponsive, do not contact, conflict. None of these are fallbacks at the edge of the state machine. They’re destinations the orchestrator routes to deliberately. A prospect that hit “not now” with a specific signal six months ago is a different lead than one that’s been silent. The state machine has to remember the difference.
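    A minimal sketch of what first-class disqualification looks like in code, assuming a Python orchestrator (all names here are hypothetical): the state carries its signal and, where appropriate, a revisit date, so "not now with a reason" and "silent" stay distinguishable:

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class Disqualified(Enum):
    NOT_A_FIT = "not_a_fit"
    NOT_NOW = "not_now"
    UNREACHABLE = "unreachable"
    UNRESPONSIVE = "unresponsive"
    DO_NOT_CONTACT = "do_not_contact"
    CONFLICT = "conflict"

@dataclass
class DisqualRecord:
    state: Disqualified
    signal: str                        # e.g. "budget frozen until Q3" vs "no reply after 4 touches"
    recorded_at: datetime
    revisit_after: Optional[datetime]  # NOT_NOW gets a date; DO_NOT_CONTACT never does

def revivable(rec: DisqualRecord, now: datetime) -> bool:
    """A 'not now' whose revisit date has passed is a live lead again."""
    return rec.revisit_after is not None and now >= rec.revisit_after
```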

    Operator in the loop, then less of it

    The two-mode model deserves a closer look because it’s where the architecture earns its keep. Copilot and dark factory aren’t synonyms for “manual” and “automated.” They’re different relationships between the operator and the agent group. Copilot is the operator approving every output and using the journal to spot patterns. Dark factory is the operator sampling outputs, reading exception escalations, and trusting the rubric for the rest. Some skills move between them in weeks. Some never graduate. Drafting outbound to a high-value prospect is a copilot job forever. Augmenting an early-funnel profile from public sources isn’t.

    Claude Code and Codex sit on the operator side of this loop, not the agent side. They edit the orchestrator, write skills, debug runs, apply patches. The agents inside NanoClaw containers run the domain skills, not the authoring code. The operator stitches the two layers together until each carries more on its own.

    Why this architecture for a GTM pipeline

    The framework choice matters because pipelines aren’t assistants. I started on OpenClaw. It’s the more featureful framework on paper, with channels, providers, scheduled tasks, and a guided onboarding flow all in one package. The pitch is right for a personal assistant. You point it at your stuff, it runs.

    For a GTM pipeline it’s the wrong shape. OpenClaw’s agent picks what to do based on the inbound and its own reasoning. That’s the right model for “summarise my inbox” and the wrong model for “transition prospect ABC from awaiting-reply to unresponsive after 14 days.” The second decision has to be deterministic, replayable, dry-runnable, and outside the LLM. Tool-call routing is fine when the cost of a wrong decision is small. In a GTM pipeline a wrong routing decision is a duplicate touch, a wrong segment, a compliance breach.
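    "Deterministic, replayable, dry-runnable, and outside the LLM" means the routing decision is a pure function of its inputs. A sketch of the 14-day staleness rule (the function name and threshold constant are illustrative):

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=14)

def route(state: str, last_touch: datetime, now: datetime) -> str:
    """Pure function of its inputs: same inputs, same answer, no model call.

    Passing a historical `now` replays a past decision; passing a future
    one is a dry run."""
    if state == "awaiting_reply" and now - last_touch >= STALE_AFTER:
        return "unresponsive"
    return state
```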

    NanoClaw makes the opposite design choice. It does less. It runs the channel adapters, one container per agent group, and a host process that owns the message queues. Skills are markdown files mounted into containers. There’s no built-in orchestrator, no business logic, no opinion on your workflow. For an assistant that would be missing functionality. For a pipeline it’s the right scope for the bottom layer.

    The full stack: NanoClaw is the channel and agent runtime. A separate orchestrator (custom code) sits above it and owns the pipeline state machine. Claude Code or Codex sits next to all of it as the authoring layer. The operator sits on top, reviewing outputs, approving drafts, gradually handing off more as each skill earns it. (I’ve written more on the framework comparison itself for those evaluating the two.)

    The orchestrator is plain code. State machine engine, artifact store, journal writer, skill dispatcher, dry-run harness. It dispatches structured tasks to the agent’s inbound queue. The agent runs the skill in its container and writes a result back. The result has to carry, at minimum, what was found, why, how confident the agent is, the alternatives considered and rejected, and the evidence with sources. The orchestrator validates against that contract on read. Validation failure means deterministic retry or dead-letter, never a re-prompt loop. The agent is allowed to be uncertain. It’s not allowed to be silent about it.
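    The contract check on read might look like the following sketch (field names follow the minimum list above; `dispatch_result` and the retry budget are hypothetical, not a real NanoClaw interface). The key property is that a bad result goes to deterministic retry or dead-letter, never back into a prompt:

```python
REQUIRED = ("found", "rationale", "confidence", "alternatives", "evidence")

def validate(result: dict) -> list[str]:
    """Contract check on read: every required field present, every evidence
    entry sourced."""
    problems = [f"missing: {k}" for k in REQUIRED if k not in result]
    if "evidence" in result and not all("source" in e for e in result["evidence"]):
        problems.append("evidence entry without a source")
    return problems

def dispatch_result(result: dict, attempts: int, max_attempts: int = 2) -> str:
    """Validation failure means deterministic retry or dead-letter, never a
    re-prompt loop."""
    if not validate(result):
        return "accept"
    return "retry" if attempts < max_attempts else "dead_letter"
```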

    Operating mode lives at the agent group, not in the task. A copilot group’s outputs land in a review queue. A dark factory group’s outputs trigger state transitions automatically. Promoting a skill from copilot to dark factory is moving its mount point, not rewriting it.
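    In sketch form (hypothetical names, assuming a Python orchestrator), mode is a property of the group and the landing logic branches on it, so promotion touches configuration rather than skill code:

```python
from queue import Queue

class AgentGroup:
    def __init__(self, name: str, mode: str):
        assert mode in ("copilot", "dark_factory")
        self.name, self.mode = name, mode

def land(group: AgentGroup, result: dict, review_queue: Queue, apply_transition) -> str:
    """Mode lives on the group, not the task."""
    if group.mode == "copilot":
        review_queue.put(result)   # operator approves each output
        return "queued_for_review"
    apply_transition(result)       # dark factory: transition fires automatically
    return "applied"
```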

    For the model layer: Nemotron 3 Super handles the bulk runtime work. Strong instruction following, long context, throughput that holds up under volume. Augmentation skills that read four or five sources and synthesise a structured profile benefit from the long context: public LinkedIn snippets, recent posts, the company’s own site, a news mention or two. Drafting routes to Claude. The bulk-then-polish chain saves tokens on volume work and keeps the polish pass focused on prose that goes to a human. The free tier covers early-stage development; production volumes need API access. Multi-provider routing is less about feature redundancy and more about not having a single provider’s outage take out the whole pipeline. The orchestrator routes per skill family: bulk runtime to Nemotron, polish to Claude, redundancy keys for either in reserve.
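    Per-skill-family routing with a fallback can be as small as a table and a health check. This is a sketch (the route table, model identifiers as strings, and `pick_model` are illustrative, not a provider SDK):

```python
# Primary/fallback per skill family: bulk work to Nemotron, polish to Claude,
# the other provider held in reserve for outages.
ROUTES = {
    "augment": ("nemotron-3-super", "claude"),
    "draft":   ("claude", "nemotron-3-super"),
}

def pick_model(skill_family: str, healthy) -> str:
    """Route to the family's primary unless its provider is down."""
    primary, fallback = ROUTES[skill_family]
    return primary if healthy(primary) else fallback
```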

    For setup specifics — Claude Code as the authoring dependency, the no-UI consequence, deployment gotchas a small VPS surfaces — check out the companion piece on what it takes to actually run NanoClaw.

    DPDP Act compliance lives at the journal layer: every artifact change is logged with provenance, deletion requests tombstone the artifact while retaining audit evidence. Easier upfront than retrofitted.
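    The tombstone pattern in sketch form (hypothetical in-memory store and journal; a real implementation would back both with durable storage): the artifact's content is destroyed, while the journal retains evidence that the deletion happened and why:

```python
from datetime import datetime, timezone

def tombstone(store: dict, journal: list, artifact_id: str, request_ref: str) -> None:
    """Honour a deletion request: drop the content, keep audit evidence of the act."""
    store[artifact_id] = {"tombstoned": True}   # content gone, slot remembered
    journal.append({
        "event": "dpdp_deletion",
        "artifact": artifact_id,
        "request": request_ref,
        "at": datetime.now(timezone.utc).isoformat(),
    })
```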

    What this is, when it’s working

    A GTM dark factory is a specific shape: an AI-native CRM where the determinism lives between tasks and the LLM agency lives inside them. The agent does the bounded judgment work; the orchestrator decides what comes next; the journal holds both accountable. Volume goes up. Variance stays bounded. The operator’s role compresses to where it adds the most value — picking what gets built next, reviewing what the rubric can’t decide, deciding when a skill has earned graduation.

    Outbound that holds shape between wide and narrow doesn’t need an SDR army. It needs orchestration you can trust, a contract on the wire, and the discipline to let skills earn autonomy rather than be granted it. The framework choice is secondary. The split between framework, orchestrator, and authoring layer is what makes it work.