DANYLO PRAVDA

MASTER PLAN — THE SYSTEM — 2026-06-16 · WORK IN PROGRESS

The complete agentic operating system I'm building on Windows — full plan and execution

A Windows-native build plan for a complete agentic operating system that turns my real automation work into content, funnels, and clients. Every tool is verified for Windows 11 as of mid-2026 and wired to what I already run — Claude Code, a knowledge graph, a live members funnel on Neon and Vercel, a 14,099-post corpus — with mermaid architecture, honest cost economics, and a phased build that earns autonomy before it takes it.

15 min read

This is the master plan, and it supersedes an earlier write-up that collapsed seventeen detailed builds into "just run n8n." That was wrong — but the correction needs its own honesty. When you actually verify the field's "agentic OS" tools, they split cleanly in two: a real, open-source core (Claude Code, Hermes, OpenClaw — all genuinely installable, all now running natively on Windows) and a paid-community branding layer on top of it (the "Pantheon" of personas, the overnight "dreaming" loop, "Mission Control" dashboards) that is sold through subscription communities and is not part of any official codebase. This plan adopts the real core, replicates the good ideas from the branding layer with tools I already own, and runs the whole thing on the machine I actually sit at: Windows 11. Every tool below was checked for Windows support and current status in June 2026. The faithful tool-by-tool reference behind every choice is attached: the complete agentic-OS reference.

CH.01

The loop and the money

One closed loop turns my real automation work into clients. Five stages, each feeding the next:

flowchart LR
  S["1 SENSE"] --> T["2 STRATEGIZE"]
  T --> A["3 ACT"]
  A --> O["4 OBSERVE"]
  O --> L["5 LEARN"]
  L --> S
  • SENSE — read competitors honestly (winners and failures) and my own world: the 14,099-post automation corpus I already mined, plus the cheap social-intelligence funnel I already built.
  • STRATEGIZE — turn one real project into an optimized post-series and a funnel that points at the service.
  • ACT — gated auto-posting and replies, inside a safe envelope, every outbound action approved from my phone.
  • OBSERVE — a "what happened in the last 4 hours?" digest, computed from logs, narrated by the model — never invented.
  • LEARN — the system compares its own results to its baselines and rewrites the memory that feeds STRATEGIZE.
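
The ACT gate is simple enough to sketch. This is a minimal illustration, not the shipped bot: `ask_human` is a hypothetical callback standing in for the Telegram approve/reject tap, and the cap of 5 is an assumed number (the plan only commits to a hard daily cap).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable

# Action kinds the plan says are never automated.
FORBIDDEN_KINDS = {"like", "follow", "dm"}


@dataclass
class ApprovalGate:
    """Lets an outbound action through only after an explicit human approval."""

    ask_human: Callable[[str], bool]  # hypothetical: shows the draft, True means "approve"
    daily_cap: int = 5                # assumed value, not taken from the plan
    _sent: dict = field(default_factory=dict, init=False)

    def try_send(self, kind: str, draft: str, today: date) -> str:
        if kind in FORBIDDEN_KINDS:
            return "refused: never automated"
        if self._sent.get(today, 0) >= self.daily_cap:
            return "refused: daily cap reached"
        if not self.ask_human(draft):
            return "rejected by human"
        # Count the action only once it has actually been approved.
        self._sent[today] = self._sent.get(today, 0) + 1
        return "sent"
```

The order of the checks is the design choice: the forbidden list and the cap are evaluated before the human is ever asked, so a tired thumb cannot approve past the envelope.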

Money hierarchy: clients first; subscribers are the asset; views are fuel; ad-revenue is a rounding error. Content is build-in-public of real work — one artifact is portfolio, proof, and distribution at once. The difference from the old plan: this loop is run by a complete agentic OS that remembers, schedules, controls, and improves itself — not a single workflow canvas.

CH.02

The machine: a Windows-native host

The field's builds assume a Mac — a MacBook or always-on Mac Mini, Homebrew, ~/.config, desktop cron. I run Windows 11, so every layer has to translate. The good news from the research: the core tools are now first-class on Windows. The honest news: the "always-on PC as a 24/7 server" story has real holes, and I should plan around them rather than pretend they don't exist.

flowchart TB
  subgraph PC["Always-on Windows 11 PC"]
    CC["Claude Code (native, PowerShell)"]
    TS["Task Scheduler (PC awake)"]
    subgraph WSL["WSL2 + Docker Engine + systemd"]
      SVC["n8n / Postgres / services"]
    end
  end
  CC --> SVC
  TS --> CC
  CLOUD["Claude Routines (cloud, PC asleep)"] -.-> GH["GitHub repo"]
  GH -.-> CC
  CC --> TG["Telegram (control from phone)"]
  TG --> CC

The translation, tool by tool — every row verified against primary docs in June 2026:

| Field default (Mac-centric) | Windows-native equivalent | Verified status |
|---|---|---|
| Claude Code on macOS/Linux | Claude Code native on Windows 11 (irm https://claude.ai/install.ps1 \| iex, or winget install Anthropic.ClaudeCode) | Native, no WSL2 required. claude -p headless works. Sandboxing is the one gap — it needs WSL2 |
| Homebrew | winget | winget install -e --id OpenJS.NodeJS.LTS · Docker.DockerDesktop · Obsidian.Obsidian · Python.Python.3.13 · Git.Git |
| Desktop cron | Windows Task Scheduler | Native; "run whether user is logged on or not" + missed-run catch-up. Env vars need a wrapper script; a named account (not SYSTEM) for user-installed CLIs |
| Always-on Mac Mini | Always-on Windows 11 PC | Works, with caveats (below) |
| n8n via Homebrew/pm2 | Docker Engine inside WSL2 with systemd | pm2 startup is broken on Windows; Docker Engine-in-WSL2 (systemctl enable --now docker) is the dependable always-on path |
| bash / zsh | PowerShell (+ Git Bash when installed) | Native PowerShell tool in Claude Code; Git Bash unlocks the Bash tool |
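
The Task Scheduler row says env vars need a wrapper script. One way that wrapper could look, as a sketch: the KEY=VALUE file format, the file location, and both function names are my assumptions; only `claude -p` comes from the verified notes above.

```python
import os
import shutil
import subprocess
from pathlib import Path


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def run_headless(prompt: str, env_file: Path) -> int:
    """Run `claude -p <prompt>` with the env file merged over the task's environment."""
    env = {**os.environ, **parse_env_file(env_file.read_text(encoding="utf-8"))}
    # Assumes the CLI is on PATH for the named account the task runs as.
    exe = shutil.which("claude", path=env.get("PATH"))
    if exe is None:
        raise FileNotFoundError("claude CLI not found on PATH for this account")
    return subprocess.run([exe, "-p", prompt], env=env, check=False).returncode
```

The scheduled task would then call this script with the named account, which keeps secrets in one file outside the prompt and out of the task definition.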

The honest gate on the host. Docker Desktop on Windows does not run headless — it needs an interactive login to start containers (Docker's own roadmap issue for this is still open). WSL2 is updated through the Microsoft Store, and that update kills running instances without warning. Windows forced restarts can still fire outside an 18-hour Active-Hours window. A home PC realistically lands around 95–98% uptime, and running it 24/7 adds real electricity cost. So the plan is layered: the PC-awake tier runs interactive and scheduled work (Task Scheduler + Docker Engine in WSL2), and the truly always-on, machine-asleep tier runs in the cloud (Claude Routines), with a cheap Linux VPS (Hetzner from ~€3.49/mo) as the honest fallback if I ever need a service that genuinely cannot blink. I am not going to pretend a desktop is a datacenter.
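
To make the uptime range concrete, here is the arithmetic as a small sketch. The 30-day month is my simplification; the 95% and 98% figures are the ones quoted above.

```python
HOURS_PER_MONTH = 30 * 24  # 720 hours, assuming a 30-day month


def downtime_hours(uptime_pct: float, hours: float = HOURS_PER_MONTH) -> float:
    """Hours of expected downtime in a period for a given uptime percentage."""
    return hours * (1 - uptime_pct / 100)


for pct in (95, 98):
    print(f"{pct}% uptime -> {downtime_hours(pct):.1f} h down per month")
# 95% uptime -> 36.0 h down per month
# 98% uptime -> 14.4 h down per month
```

Somewhere between half a day and a day and a half of darkness every month is fine for scheduled drafts and not fine for anything a client depends on, which is why the machine-asleep tier lives elsewhere.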

CH.03

What I already have vs. what I need

The reason this is buildable in weeks, not months: most of the OS already exists in this repo and the systems around it. The job is mostly wiring, not inventing.

| OS layer | My actual asset | Status |
|---|---|---|
| Core runtime | Claude Code + .claude/skills + pre-commit hooks + CLAUDE.md discipline gates | Have — this site runs on it |
| Code memory | graphify (298 nodes / 597 edges / 15 communities on this repo; the real payoff is on btc-bot, 728k lines) | Have — graphify-out/ is committed |
| Semantic memory | CLAUDE.md + docs/STATE.md spine; a self-improving wiki is the gap | Partial — add the LLM-wiki pattern |
| Control surface | Next.js 16 + Vercel + Neon + Auth.js v5 — the live /notes members funnel and /dashboard cabinet | Have — the dashboard foundation already ships |
| SENSE data | The 14,099-post / 7,526-account corpus + the Apify discovery→enrich→score funnel | Have — base rates already de-overfitted |
| Cheap inference | A pool of free NVIDIA NIM models: minimaxai/minimax-m3, google/diffusiongemma-26b-a4b-it, moonshotai/kimi-k2.6, z-ai/glm-5.1, mistralai/mistral-medium-3.5-128b (separate free key per model, one concurrent worker each) | Have — free-tier per key; pooled for throughput + multi-model coverage (see Tooling) |
| Channels | Telegram bot as the two-way approval surface | Build — small, well-understood |
| Cadence | /loop (have) + Task Scheduler (build) + Cloud Routines (build) | Partial |
| Self-improvement | Self-updating CLAUDE.md, per-skill learnings.md, an overnight digest | Build — the compounding layer |

Two honest notes on my own assets. First, graphify earns its keep on btc-bot, not on this site — the site's code is small enough that graphify itself prints "you may not need a graph." I'll point it at the 728k-line trading system, where querying summaries instead of re-reading source is the real token win. Second, the control dashboard should be a Next.js page on my existing Vercel + Neon + Auth.js stack — I already built password-gated, OAuth-backed surfaces for the members funnel — not the field's Obsidian-plugin dashboard, whose embedded-terminal plugins have open, unresolved Windows 11 bugs as of May 2026.

CH.04

The seven layers, on Windows

The target architecture, with each layer mapped to a Windows-native tool and, where possible, to something I already run.

flowchart TB
  subgraph CORE["Core — the brain"]
    CCW["Claude Code on Windows"]
  end
  subgraph MEM["Memory"]
    CMD["CLAUDE.md + skills + learnings.md"]
    WIKI["LLM wiki (Obsidian, Karpathy pattern)"]
    GRAPH["graphify code graph"]
  end
  subgraph CONN["Connections — the hands"]
    CLI["CLI-first + direct REST"]
    MCP["MCP: Context7 + Tool Search"]
    TGRAM["Telegram control"]
  end
  subgraph CAD["Cadence — the heartbeat"]
    LOOP["/loop (session)"]
    TASK["Task Scheduler (PC awake)"]
    ROUT["Cloud Routines (PC asleep)"]
  end
  subgraph CTRL["Control — the cockpit"]
    DASH["Next.js dashboard on Vercel + Neon"]
  end
  CCW --> CMD
  CCW --> GRAPH
  CCW --> CLI
  CCW --> MCP
  LOOP --> CCW
  TASK --> CCW
  ROUT --> CCW
  CCW --> DASH
  DASH --> TGRAM

1 — Core runtime (the brain). Claude Code, native on Windows, is the OS. It reads CLAUDE.md, runs skills, calls tools, schedules work, and runs headless behind buttons. The same plain-folder setup is portable — it would run unchanged in Codex or Cursor — but Claude Code is the home base. Per-task model economics is the lever the field calls "personas": I route the work by model tier.
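
Routing by model tier can be as small as a lookup. A sketch under assumptions: the task labels are hypothetical, and which jobs count as frontier work is my guess at the split; the two tiers themselves (the free NVIDIA pool for bulk, Opus for synthesis) are the ones this plan names.

```python
from enum import Enum


class Tier(Enum):
    FREE_POOL = "free NVIDIA NIM pool"
    FRONTIER = "Opus"


# Hypothetical task labels; everything not listed here is treated as bulk work.
FRONTIER_TASKS = {"synthesis", "strategy", "final_review"}


def route(task_kind: str) -> Tier:
    """Send only the named high-value tasks to the frontier model."""
    return Tier.FRONTIER if task_kind in FRONTIER_TASKS else Tier.FREE_POOL
```

Defaulting unknown task kinds to the free pool is deliberate in this sketch: a new skill has to argue its way onto the expensive tier rather than land there by accident.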

2 — Memory (so it stops forgetting). Three real stores. Semantic: CLAUDE.md plus a self-improving LLM wiki in Obsidian, built on Karpathy's published pattern: a raw/ folder of sources and a wiki/ folder the model owns, navigated by an index.md rather than vector search (no embeddings needed at hundreds-of-pages scale). Procedural: the Agent Skills standard, each skill self-improving via a learnings.md read before every run. Code: graphify, on btc-bot, where querying god nodes and summaries instead of re-reading source cuts tokens hard. One shared vault path, so Claude Code and the wiki read one universal memory.
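
The "learnings.md read before every run" habit is easy to picture in code. This is a sketch of the idea only: the function names and the prompt wording are hypothetical, and in practice the skill file itself may simply instruct the agent to read the file.

```python
from pathlib import Path


def build_skill_prompt(skill_dir: Path, task: str) -> str:
    """Prepend the skill's accumulated learnings to the task prompt."""
    learnings = skill_dir / "learnings.md"
    notes = learnings.read_text(encoding="utf-8").strip() if learnings.exists() else ""
    header = f"Lessons from earlier runs:\n{notes}\n\n" if notes else ""
    return f"{header}Task:\n{task}"


def record_learning(skill_dir: Path, lesson: str) -> None:
    """Append one lesson so the next run sees it."""
    skill_dir.mkdir(parents=True, exist_ok=True)
    with (skill_dir / "learnings.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {lesson}\n")
```

Because the file is append-only plain markdown, it diffs cleanly in Git and can be pruned by hand when a lesson stops being true.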

3 — Connections (the hands). Cheapest-correct wins: CLI-first and direct REST over MCP where a CLI exists, because piping a CLI through the bash tool is far leaner than loading a fat MCP schema. Where a standard connector wins, I use MCP: Context7 for version-correct library docs (npx ctx7 setup --claude), and Anthropic's Tool Search, now auto-enabled in Claude Code, which defers tool schemas and collapses the system-tool overhead from ~15k tokens to under 1k. Telegram is the two-way control and approval channel on my phone.

4 — Cadence (the heartbeat) — three tiers. /loop for session-scoped work (minutes to ~3 days); Windows Task Scheduler for persistent jobs while the PC is awake; Claude Cloud Routines for work that must run while the machine is off — the morning brief, the 4-hour digest. Routines clone the repo into a 4 vCPU / 16 GB cloud box, run a saved prompt, and push a branch back; the minimum interval is 1 hour, and they draw on the same subscription quota. Every autonomous post still passes a Telegram approval tap.
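
The three tiers reduce to a short decision rule. A sketch, assuming the limits quoted above (about 3 days for /loop, a 1-hour minimum interval for Routines); the function name and the tier labels are mine.

```python
from datetime import timedelta


def pick_tier(interval: timedelta, must_run_while_pc_off: bool, lifetime: timedelta) -> str:
    """Choose a cadence tier for a recurring job.

    interval: how often the job fires.
    lifetime: how long the job needs to keep recurring.
    """
    if must_run_while_pc_off:
        if interval < timedelta(hours=1):
            raise ValueError("Cloud Routines cannot run more often than hourly")
        return "cloud-routine"
    if lifetime <= timedelta(days=3):
        return "/loop"  # session-scoped work
    return "task-scheduler"  # persistent, but only while the PC is awake
```

For example, the 4-hour digest with `must_run_while_pc_off=True` lands on `"cloud-routine"`, while a ten-minute poll that only matters this afternoon stays in `/loop`.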

5 — Control surface (the cockpit). A Next.js dashboard on my existing Vercel + Neon + Auth.js stack — not a bolt-on. Mission Control (the active goal and the me-vs-agent split), live AI spend, the memory and schedule panels, and skill buttons that run Claude Code headless. I already ship OAuth-gated, server-rendered surfaces for the members funnel; the dashboard is the same machinery pointed inward.

6 — Self-improvement (why it compounds). An overnight digest: read the day's session history, find patterns and unused capabilities, emit a morning brief tied to my goals. Combined with a self-updating CLAUDE.md and per-skill learnings.md, the OS gets a little better while I sleep — this is the LEARN stage applied to the system itself, on top of LEARN for the content.

CH.05

What the field research validates (and what I'm adopting)

A 95-post teardown of the field's loudest agentic-OS seller — kept as private competitor research — confirms the spine of this plan and sharpens a few ideas worth adopting verbatim:

  • The memory / "Self" layer is the differentiator, not the agents. The single most-repeated, most-defensible point in the whole field: context is the biggest driver of output quality — without a persistent, business-specific memory, agents produce generic work; with it, output is in your voice and facts. That is exactly Layer 2 above — the lesson is to treat the Obsidian LLM-wiki as the load-bearing layer, not an afterthought, and to make every agent read it on every run. Worth adding alongside the semantic/procedural/code stores: an episodic memory — an append-only memory.md of decisions, Git-backed nightly, refactored quarterly.
  • "Own the harness, rent the model." Harnesses last years; models cycle every six months. Build on open, swappable agent shells (Claude Code, Hermes) and treat the model as a hot-swappable part — which is precisely why the model layer here is the free NVIDIA pool + Opus only at the top, re-rankable as ratings shift. The routing rule generalizes cleanly: a free "good-enough" tier carries the ~80% of volume work, a frontier model touches only the ~20% that needs it (here: the NVIDIA pool for bulk, Opus for synthesis — the same discipline the /mine-corpus pipeline encodes).
  • A 4-layer mental model — Intelligence (brain) / Execution (hands) / Research (long-running workhorse) / Self (memory) — maps onto the layers above and is a clean way to decide where a job belongs: Claude Code = Intelligence, Hermes = Research/workhorse, the CLI/MCP connectors = Execution, Obsidian = Self.
  • Reliability over features when picking agent tools. Demos are cherry-picked; real use is messier. A blunt, keepable heuristic: if a tool needs debugging on >20% of tasks, switch — favour the boring, reliable option for bulk work and reserve the flashy one for the narrow case it's genuinely best at.
  • Local-first is a positioning lever, not only a privacy choice. "Data never leaves the machine" is a sellable premium to regulated buyers (legal / medical / finance) — a pricing angle for the offer, not just an architecture note.
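
The >20% debugging heuristic from the list above is worth writing down as a check so it gets applied rather than remembered. A sketch: the minimum sample of 10 tasks is my assumption, added so that one bad afternoon does not condemn a tool.

```python
def should_switch(tasks_run: int, tasks_needing_debug: int,
                  threshold: float = 0.20, min_sample: int = 10) -> bool:
    """True when a tool needed debugging on more than `threshold` of its tasks."""
    if tasks_run < min_sample:
        return False  # too few runs to judge (assumed cutoff)
    return tasks_needing_debug / tasks_run > threshold
```

Fed from the same logs that drive the digest, this gives a per-tool verdict: 11 debugged tasks out of 50 trips the rule, 10 out of 50 does not.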

Everything around those ideas in the source — the revenue boasts, the listicle volume, the "free, no API keys" tool claims — is marketing scaffolding, not method. The architecture and the cost-engineering are the parts worth taking; the funnel theatre isn't.

CH.06

Tooling decisions (verified) — adopt, defer, skip

This is where the research changes the plan. Every tool below was checked for real Windows-11 support and current status in June 2026, and rated honestly.

| Tool | Windows-11 reality (verified) | Decision | Why |
|---|---|---|---|
| Claude Code | Native (Win 10 1809+/11); winget or PowerShell install; -p headless works; sandboxing needs WSL2 | Adopt — core | Already the OS this site runs on |
| Hermes (Nous Research) | Real, MIT, native Windows desktop app since v0.16.0 (June 2026); only the dashboard's embedded terminal needs WSL2 | Adopt — base | The open-source agent is a core layer of the stack (the research / long-running-workhorse tier), adopted now — not deferred. Only the paid-community "Pantheon / dreaming / Mission Control" branding ($59/mo Skool) is skipped; the codebase itself is base. |
| OpenClaw | Real, MIT, native Windows Hub app — but CVE-2026-32922 (CVSS 9.9) + ~135k internet-exposed instances, 63% unauthenticated | Skip | A control plane with that exposure record is not going on my machine; the capability isn't worth the blast radius |
| n8n | Runs stable on the 24/7 Windows PC three ways: Docker Engine in WSL2 (restart: unless-stopped, truly headless); Docker Desktop (fine on an always-on auto-login box); or native npm (npm i -g n8n) run as a service via nssm or Task-Scheduler-at-logon. Only pm2 startup is genuinely broken on Windows. | Adopt | It works on Windows — use it for deterministic canvas flows + the 400+ app connectors (webhooks, schedules, OAuth integrations) where a visual graph beats code; Claude Code skills still own the code-side logic. |
| NVIDIA NIM (free pool) | Hosted API works from Windows; a pool of 5 free models (minimax-m3 / diffusiongemma-26b / kimi-k2.6 / glm-5.1 / mistral-medium-3.5), ~40 RPM per model-key → run one concurrent worker per model for N× throughput; self-host containers are Linux-only | Adopt | Free bulk inference; per-model ~40 RPM is real, so pool the model-keys (one worker each) and keep gemma as the universal fallback. Proven in the mining pipeline (281 posts distilled across the pool). deepseek-v4-pro was dropped — its endpoint times out. |
| Context7 + Tool Search | Both GA and cross-platform; Tool Search auto-on in Claude Code | Adopt | Live docs + big context savings, near-zero setup |
| Obsidian | Obsidian + CustomJS/Dataview fine on Windows; the Terminal/Shell-Commands plugins have open Windows bugs | Adopt — base | Obsidian is the base memory / knowledge layer — local-first, plain-markdown, hugely adopted; the self-improving LLM wiki (Karpathy pattern) lives here and every agent reads it for context. Only the buggy embedded-terminal plugin dashboard is skipped — build that on my own stack. |
| Telegram + Railway | Telegram Bot API is pure HTTPS (cross-platform); Railway Hobby ~$5/mo for a tiny always-on bot | Adopt | Approve/reject via inline buttons is the firewall |
| Granola (meeting ingest) | Has a native Windows app + REST API since 2025 (not Mac-only) | Optional | Only if meeting capture becomes part of SENSE |
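
The "one worker per model-key" pooling in the NVIDIA NIM row can be sketched as a dispatch plan. This only computes who sends what and when; it makes no network calls, and `plan_dispatch` is a hypothetical helper, not part of the mining pipeline. The ~40 RPM figure and the model ids are the ones listed in this plan.

```python
from collections import defaultdict
from itertools import cycle

RPM_PER_KEY = 40  # from the plan: roughly 40 requests per minute per model-key

MODELS = [
    "minimaxai/minimax-m3",
    "google/diffusiongemma-26b-a4b-it",
    "moonshotai/kimi-k2.6",
    "z-ai/glm-5.1",
    "mistralai/mistral-medium-3.5-128b",
]


def plan_dispatch(jobs: list[str], models: list[str], rpm: int = RPM_PER_KEY):
    """Assign jobs round-robin, one sequential worker per model.

    Each model's requests are spaced 60/rpm seconds apart so no key
    exceeds its limit. Returns (start_second, model, job) rows by start time.
    """
    spacing = 60 / rpm
    next_slot: dict[str, float] = defaultdict(float)
    plan = []
    for job, model in zip(jobs, cycle(models)):
        plan.append((next_slot[model], model, job))
        next_slot[model] += spacing
    return sorted(plan, key=lambda row: row[0])
```

With five keys, ten jobs occupy two time slots (0.0 s and 1.5 s); on a single key the same ten jobs stretch to 13.5 s. That is the N× throughput claim in miniature.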

The headline corrections: Hermes is real and runs on Windows, and most of what the videos sell under its name is a configuration layer I can reproduce myself, so I adopt the open-source agent as a base layer and skip the paid branding. OpenClaw is a hard skip on security grounds. And the field's nicest demo, the Obsidian dashboard with an embedded terminal, is exactly the piece that's flaky on Windows, so I route around it onto the Next.js stack I already run.

CH.07

The phased build

Build green, validate before autonomy. The gate after Phase 1 is load-bearing: the machine is earned, not assumed.

flowchart LR
  P0["Phase 0 — OS skeleton"] --> P1["Phase 1 — SENSE + STRATEGIZE (by hand)"]
  P1 --> GATE{"Beats baseline?"}
  GATE -- no --> P1
  GATE -- yes --> P2["Phase 2 — memory that grows"]
  P2 --> P3["Phase 3 — dashboard + model routing"]
  P3 --> P4["Phase 4 — OBSERVE + overnight digest"]
  P4 --> P5["Phase 5 — ACT, gated"]
  P5 --> P6["Phase 6 — LEARN closes the loop"]
  • Phase 0 — the OS skeleton. Harden what I have: CLAUDE.md + identity files; the skills library; hooks; graphify on btc-bot; the Obsidian LLM-wiki scaffold. Stand it on the Windows PC with Claude Code native, Docker Engine in WSL2, and one Task Scheduler job. Done when: one end-to-end "hello" — a scheduled task pulls data → a skill processes it → a Telegram tap writes a result.
  • Phase 1 — SENSE + STRATEGIZE, by hand. The content-intelligence skill (honest, follower-normalized base rates, n≥30) + the project→campaign generator. Post manually for two weeks and measure against baseline. Gate: if it doesn't beat baseline, fix the strategy — build no autonomy.
  • Phase 2 — memory that grows. Wire the LLM wiki + per-skill learnings.md + the shared vault path. Done when: a second STRATEGIZE run visibly drops a losing pattern it learned about itself.
  • Phase 3 — dashboard + model routing. Stand up the Next.js Mission-Control dashboard on Vercel/Neon; implement the per-task model routing. Done when: the dashboard shows live state and a headless button runs a skill.
  • Phase 4 — OBSERVE + the overnight digest. The "what happened in 4h?" Telegram digest (SQL counts; the model only narrates) + the morning brief. Done when: a 4-hour digest reconciles exactly with the logs and a morning brief lands tied to my goals.
  • Phase 5 — ACT, gated. Autonomous reply-drafting to high-signal niche posts, every post through the Telegram approval tap, inside a safe envelope with a hard daily cap. Only after Phase 1 passed.
  • Phase 6 — LEARN closes the loop. Weekly, the system compares its results to benchmarks and rewrites the memory that feeds STRATEGIZE; the overnight digest rewrites the OS itself.
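
Phase 4's rule (SQL counts; the model only narrates) is the part most worth pinning down. A sketch using sqlite3 as a stand-in for the real store: the `events` table, its `ts` and `kind` columns, and the prompt wording are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def digest_counts(conn: sqlite3.Connection, now: datetime, hours: int = 4) -> dict[str, int]:
    """Count logged events per kind in the last `hours`, straight from the log table."""
    since = (now - timedelta(hours=hours)).isoformat()
    rows = conn.execute(
        "SELECT kind, COUNT(*) FROM events "
        "WHERE ts >= ? AND ts < ? GROUP BY kind ORDER BY kind",
        (since, now.isoformat()),
    ).fetchall()
    return dict(rows)


def narration_prompt(counts: dict[str, int], hours: int = 4) -> str:
    """Build the only input the model sees: the computed facts."""
    facts = "\n".join(f"- {kind}: {n}" for kind, n in counts.items())
    return (
        f"Summarize the last {hours} hours using only these counts. "
        "Do not add numbers that are not listed.\n" + facts
    )
```

The "reconciles exactly with the logs" test then becomes mechanical: every number in the narrated digest must appear in `digest_counts`, and anything else is a failed run.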

CH.08

Honest gates, real stakes, and cost

  • Validate the content strategy by hand before any autonomy (the Phase 1 gate). This is the one rule that, if skipped, wastes everything downstream.
  • Permission scoping is the real stakes. A credential on the ring means the action can fire regardless of instructions — the field's 150k-inbox cautionary tale, and OpenClaw's 63%-of-instances-unauthenticated record, are the same lesson twice. Gate every outbound post behind the Telegram tap; never automate likes, follows, or DMs.
  • Cost discipline. Verified Claude API pricing (June 2026): Opus 4.8 $5/$25, Sonnet 4.6 $3/$15, Haiku 4.5 $1/$5 per million tokens in/out; cache reads are ~0.1× and writes ~1.25×. The OS leans on prompt caching, the /compact cadence, and the model split.
  • Security. Verify every package before install (Claude Code itself had two patched CVEs in 2025–26; OpenClaw's record is worse); treat hostile post text as prompt injection; keep secrets out of prompts and in environment variables; the Telegram approval tap is the firewall.
  • Reliability honesty. The Windows PC is the convenient host, not a guaranteed one. The 24/7, machine-asleep tier belongs in Claude Routines or on a small VPS — and the plan says so up front rather than discovering it in production.
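
To show why prompt caching carries so much of the cost discipline, here is the arithmetic as a sketch. The prices and multipliers are the ones quoted in the list above; I am assuming the cache multipliers apply to the base input price, and the 100k-token prompt is an invented example.

```python
# USD per million tokens (input, output), as quoted in this plan.
PRICES = {
    "Opus 4.8": (5.00, 25.00),
    "Sonnet 4.6": (3.00, 15.00),
    "Haiku 4.5": (1.00, 5.00),
}
CACHE_READ, CACHE_WRITE = 0.10, 1.25  # multipliers on the input price (assumed base)


def cost_usd(model: str, fresh_in: int, out: int,
             cache_read: int = 0, cache_write: int = 0) -> float:
    """Cost of one call given token counts for each billing bucket."""
    p_in, p_out = PRICES[model]
    total = (
        fresh_in * p_in
        + cache_read * p_in * CACHE_READ
        + cache_write * p_in * CACHE_WRITE
        + out * p_out
    )
    return total / 1_000_000


# A 100k-token context with a 2k-token reply on Sonnet 4.6:
print(f"uncached: ${cost_usd('Sonnet 4.6', 100_000, 2_000):.2f}")               # uncached: $0.33
print(f"cached:   ${cost_usd('Sonnet 4.6', 0, 2_000, cache_read=100_000):.2f}")  # cached:   $0.06
```

On this example a cache hit cuts the call to under a fifth of its uncached price, which is why a stable CLAUDE.md prefix matters more than shaving output tokens.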

Experimental and in motion. The complete tool-by-tool reference behind every choice here — including everything the n8n-only write-up dropped — is the attached agentic-OS reference, the faithful synthesis of all 17 field builds. The Windows-specific status, costs, and security findings above were verified against primary sources in June 2026.

agentic-os · automation · ai-systems · windows · growth