Back to projects
Dec 01, 2025
6 min read

Self-Improving Code & OpenClaw

Building the platform that makes agentic AI safe, observable, and auditable — three-layer architecture combining threat detection, autonomous development, and governance for regulated industries

The Dual-Sided Thesis

The explosion of agentic development means code gets created extraordinarily fast. Your shipping dock manager is now building applications to solve logistics bottlenecks. Your compliance team is automating audit responses. Non-technical people creating real solutions is a force multiplier for business — but the attack surface increases immensely.

The threat side: Prompt injection attacks hijack agent behavior. Mixture of Experts layer attacks poison the logic layer of running models. Novel attack vectors unique to autonomous systems emerge faster than traditional security can respond. Every agentic deployment needs defense infrastructure, guardrails, and cost management.

The opportunity side: When everyone in an organization can build with AI, extraordinary intellectual property gets created at the edges — good patterns, novel solutions, domain innovations that would never surface through traditional development cycles.

The synthesis: The same automated systems that identify and mitigate threats can also identify and surface innovations. Defense and innovation are two sides of the same automated oversight system. That’s what Self-Improving Code builds.

Three-Layer Architecture

MOLT — Model-Oriented Lifecycle Toolkit

The foundation layer provides the infrastructure for building, testing, and deploying agentic systems in regulated environments. MOLT handles the lifecycle that most AI practitioners skip: version control for model configurations, reproducible evaluation pipelines, deployment automation with rollback capability, and the observability instrumentation that makes everything else possible.

FrawdBot — Threat Detection

The detection layer. FrawdBot started as an insider threat detection engine for Google Workspace — 21,000 lines of Python running 12 behavioral rules against rolling statistical baselines. But the pattern recognition architecture generalizes: any environment where agents act autonomously needs automated oversight that can distinguish normal behavior from anomalous behavior, correlate across multiple signals, and detect multi-week campaigns that no human reviewer would catch.

Observability & Governance

The compliance layer. Built on experience surviving 38 annual audits at Oracle (FedRAMP, SOC, PCI, government frameworks) and building FDA compliance AI at Always Cool Brands. The key insight: auditors want evidence, not explanations. This layer provides LLM-as-judge infrastructure and evaluation pipelines that enable continuous auditing within governance methodologies specific to each industry — FDA for food safety, NRC for nuclear, FINRA for financial services.

Builders of Builders

Self-Improving Code isn’t a product you use — it’s a platform for building products that govern themselves. The development cycle demonstrates the thesis:

  1. Run detection against real data
  2. Review findings, identify false positives and missed patterns
  3. Write new rules or tune existing ones (often with AI assistance)
  4. Run comprehensive test suites to verify nothing broke
  5. Re-run forensic mode against historical data to validate
  6. Deploy and repeat

Each iteration makes the system smarter. The feedback loop between detection, evaluation, and improvement is the “self-improving” part — not autonomous AI writing its own code unchecked, but AI-assisted development with automated oversight at every step.

The Technology Stack

LayerComponents
Agent FrameworkClaude Code, LangGraph, Model Context Protocol (MCP)
ObservabilityLangSmith + LangFuse (dual — cloud and self-hosted)
AnalyticsClickHouse for high-volume event data
Local InferenceOllama for data sovereignty requirements
DistributionFine-tune distribution to air-gapped endpoints
DetectionPython, SQLite with FTS5 + vec0, pytest

The stack is deliberately hybrid. Cloud-hosted observability provides convenience; self-hosted provides sovereignty. The same detection rules run against cloud APIs and local inference. This flexibility matters in regulated industries where data residency is a compliance requirement, not a preference.

Regulated Industry Applications

The thread connecting everything: the lessons learned building AI-driven compliance for FDA-regulated food safety apply directly to nuclear energy and financial services. With FINRA’s December 2025 requirements, the oversight and auditability disciplines that were voluntary best practices are now mandatory in finance.

FDA: Label compliance, nutritional accuracy, supply chain integrity — proven at Always Cool Brands with 10 SKUs across Sprouts stores in 24 states NRC: Safety-critical process validation, regulatory submission review for nuclear energy consulting FINRA: The same oversight disciplines, now required by regulation — LLM-as-judge continuous auditing applied to financial services workflows

The pattern repeats: enter a regulated space, build the governance infrastructure first, then layer innovation on top. Compliance as a foundation, not an afterthought.

OpenClaw — The Pattern Taking the World by Force

OpenClaw is a popular open-source project that demonstrates a core pattern at the heart of the agentic AI revolution: agents that can build, test, and improve software autonomously while remaining observable and auditable. It’s not a component of Self-Improving Code — it’s a standalone project that proves the pattern works in the open, where anyone can see, use, and build on it.

The pattern OpenClaw demonstrates — autonomous agents operating within structured guardrails, with every action logged and every decision traceable — is the same pattern that enterprises need to adopt as agentic development becomes the default way software gets built. OpenClaw shows what that looks like at the community level. Self-Improving Code shows what it looks like at the enterprise level, in regulated industries where the stakes are higher.

The Career Thread

Self-Improving Code is the convergence of three career threads:

  1. Software-driven infrastructure (since 2000): From Perl scripts at Openwave automating network provisioning to LangGraph orchestrating autonomous agents — the discipline is the same, the tools evolved
  2. Security and governance (since 2000): From CISSP and incident response at Openwave through 38 audits/year at Oracle to LLM-as-judge continuous auditing — the security thread has run through every role
  3. The comb-shaped career: Seven-plus deep verticals — networking, storage, software, DevOps/SRE, business operations, regulatory compliance, AI/ML — all carried simultaneously and all contributing to a platform that requires expertise across every domain

This isn’t a pivot to AI. It’s the natural convergence of 25 years building at the boundary between research and commercial implementation, in environments where failure isn’t acceptable.

Let's Build AI That Works

Interested in building similar solutions?