What a Local Daemon Can Measure That Cloud Analysis Never Can

Part 1 of 5: AI Behavioral Intelligence

Cloud code analysis sees a snapshot. A local daemon sees the movie. The difference between these two observation modes is not a question of cost, latency, or sophistication. It is structural. There are entire classes of signal that a cloud analyzer cannot capture, no matter how much compute you throw at it, because the signal never reaches the cloud.

This is not a complaint about existing tools. SonarCloud is good at what it does. CodeQL is good at what it does. They examine artifacts. The thesis here is that AI-assisted development has produced a new class of signal that lives upstream of artifacts, in the editor itself, in the seconds and minutes before code becomes code. That signal is what behavioral intelligence is about. And it is only observable by something running on the developer’s machine.

The short version: there are five signals about AI-assisted development that a daemon on the developer’s machine can capture and a cloud analyzer fundamentally cannot. Edit-to-commit latency. Per-file AI suggestion acceptance rates. Intra-session reverts. Context-switch patterns within a session. Review time per diff line. Each is a property of work-in-progress, not of work-completed, and work-in-progress never reaches the cloud.

Five signals make the case.

The word “session” is doing heavy lifting

Before going further: a session is the bounded unit of developer attention between meaningful context switches. It is not a git branch. It is not a workday. It is not a calendar block. It is the cognitive thread a developer is holding while they work on a thing, until they put it down. A session might be 12 minutes of fixing one bug. It might be 4 hours of writing a feature.

Cloud analysis has no concept of a session. Cloud analysis has commits. By the time work reaches a commit, the session is already over and most of the interesting signal has been compressed away. Behavioral intelligence is the discipline of measuring within sessions, not between them.

Hold that word in mind. The five signals below all live inside sessions, which means they all live inside the local editor, which means they are all unreachable to anything that does not run there.
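To make "session" concrete enough to compute with, here is a minimal sketch that splits a stream of editor events wherever the developer goes idle for longer than a threshold. The idle-gap rule, the 15-minute default, and the `EditorEvent` record are all assumptions made for illustration. The definition above is about attention, and an idle gap is only a rough proxy for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditorEvent:
    """Hypothetical event record; the field names are illustrative."""
    timestamp: float  # seconds since some reference point
    path: str         # file the event refers to


def split_into_sessions(events, idle_gap_seconds=15 * 60):
    """Group events into sessions, starting a new one after each idle gap.

    The idle-gap rule is an assumed proxy for "the developer put the work down".
    """
    sessions = []
    for event in sorted(events, key=lambda e: e.timestamp):
        last = sessions[-1][-1] if sessions else None
        if last is not None and event.timestamp - last.timestamp <= idle_gap_seconds:
            sessions[-1].append(event)
        else:
            sessions.append([event])
    return sessions


events = [
    EditorEvent(0, "auth/login.py"),
    EditorEvent(300, "auth/login.py"),
    EditorEvent(720, "auth/session.py"),      # 12 minutes in: same session
    EditorEvent(7200, "billing/invoice.py"),  # after a long gap: new session
]
sessions = split_into_sessions(events)
```

A real daemon would likely combine several cues (focus changes, branch switches, idle time) rather than rely on a single threshold.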

Signal one: edit-to-commit latency

How long did this change sit on disk before it got committed?

A change that is written and committed in 90 seconds is a different artifact than a change that was written, sat for 40 minutes, was edited five more times, and then committed. The second change has been thought about. The first one might have been, but you cannot tell from the commit alone.

Cloud analysis cannot measure this signal. Not because it is hard. Because the cloud has no record of when the change was first written. The cloud sees the commit timestamp. The cloud does not see the disk-write that happened 40 minutes earlier, because the cloud was not watching the disk. The change reaches the cloud only when it is pushed, and by then the latency information is gone.

A daemon watching the filesystem captures the first-write timestamp the moment it happens. The latency between first-write and commit becomes a measurable property of the change. Across a codebase, you can compute baselines. You can flag anomalies. You can tell the difference between a 2,000-line change that took an hour to develop and a 2,000-line change that took 90 seconds.

The structural impossibility is simple: cloud analysis can only see what arrives at the cloud. The first-write event does not arrive at the cloud.
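The first-write-to-commit measurement can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular daemon. `FirstWriteTracker`, `on_write`, and `on_commit` are hypothetical names, and the sketch assumes something else (a filesystem watcher and a commit hook, for instance) calls them with timestamps.

```python
class FirstWriteTracker:
    """Remembers when each file was first written since its last commit."""

    def __init__(self):
        self._first_write = {}  # path -> timestamp of first uncommitted write

    def on_write(self, path, timestamp):
        # Keep only the earliest write; later edits do not reset the clock.
        self._first_write.setdefault(path, timestamp)

    def on_commit(self, paths, commit_timestamp):
        """Return {path: seconds from first write to commit} and reset those paths."""
        latencies = {}
        for path in paths:
            first = self._first_write.pop(path, None)
            if first is not None:
                latencies[path] = commit_timestamp - first
        return latencies


tracker = FirstWriteTracker()
tracker.on_write("api/handlers.py", 1_000.0)  # first write
tracker.on_write("api/handlers.py", 2_200.0)  # later edit to the same change
tracker.on_write("README.md", 3_310.0)
latencies = tracker.on_commit(["api/handlers.py", "README.md"], 3_400.0)
```

Here `api/handlers.py` sat for 40 minutes before the commit and `README.md` for 90 seconds, a distinction the commit timestamp alone cannot recover.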

Signal two: AI suggestion acceptance rate per file

Aggregate AI acceptance rates are essentially meaningless. A developer who accepts 78% of Cursor suggestions overall could have a 95% acceptance rate on boilerplate files and a 35% rate on the auth module. Those two rates tell different stories. The first is “AI is helping me move fast on the well-trodden parts.” The second is “AI keeps suggesting things I have to reject because it does not understand this part of my codebase.” Both are valuable. Neither is recoverable from the aggregate.

Per-file acceptance rate requires per-file observation. Each suggestion needs to be matched to the file it was offered against, the timestamp of the offer, and whether the suggestion was accepted, modified, or discarded.
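As a rough sketch of that bookkeeping: the `SuggestionEvent` record below is hypothetical, and so is the choice to count a modified suggestion as not accepted. The sample data is made up to mirror the shape of the example above, with a high rate on a boilerplate file and a low rate on an auth file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SuggestionEvent:
    """Hypothetical record of one AI suggestion; field names are illustrative."""
    path: str
    timestamp: float
    outcome: str  # "accepted", "modified", or "discarded"


def acceptance_rates(events):
    """Return (aggregate_rate, {path: rate}), counting only unmodified accepts."""
    offered = defaultdict(int)
    accepted = defaultdict(int)
    for event in events:
        offered[event.path] += 1
        if event.outcome == "accepted":
            accepted[event.path] += 1
    per_file = {path: accepted[path] / offered[path] for path in offered}
    total = sum(offered.values())
    aggregate = sum(accepted.values()) / total if total else 0.0
    return aggregate, per_file


# Illustrative data only: 100 suggestions on a boilerplate file, 40 on an auth file.
events = (
    [SuggestionEvent("gen/models.py", float(i), "accepted") for i in range(95)]
    + [SuggestionEvent("gen/models.py", float(i), "discarded") for i in range(5)]
    + [SuggestionEvent("auth/tokens.py", float(i), "accepted") for i in range(14)]
    + [SuggestionEvent("auth/tokens.py", float(i), "modified") for i in range(26)]
)
aggregate, per_file = acceptance_rates(events)
```

With this data the aggregate comes out near 78%, while the per-file map separates a 95% file from a 35% file.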

Cloud analysis does not see suggestions. Cursor’s telemetry sees suggestions, but Cursor does not publish per-file acceptance rates back to your codebase intelligence layer. The MCP servers that exist today don’t expose this either. The signal lives in the editor’s event stream, and it is gone the moment the editor closes.

A daemon co-located with the editor can subscribe to that event stream. It can build a per-file map. It can show you, for the first time, that your team’s AI tools are highly effective on 60% of the codebase and barely useful on the 40% that matters most. That insight is the difference between a vague “AI is helping” and a specific “AI is failing on the files where failure is most expensive.”

Signal three: intra-session reverts

Code that gets written, regretted, and rewritten within a single session never reaches git. The undo-redo cycles. The 30 seconds of typing followed by a select-all-delete. The AI suggestion that gets accepted, sits for 8 seconds, and gets rolled back. None of this is in your repository. None of this is in your commit history. None of this is observable to any cloud tool.

But intra-session reverts are some of the most valuable signal there is. They tell you where the developer was uncertain. They tell you which files are hard. They tell you which AI suggestions looked plausible long enough to be accepted but turned out to be wrong on inspection. Across a team, intra-session revert patterns identify the parts of a codebase where AI assistance is actively misleading developers.

This is also where context drift becomes measurable rather than anecdotal. The developer who accepted a suggestion, sat with it for eight seconds, and rolled it back was reacting to the gap between what the AI thought was true about the codebase and what the developer knew was true. The revert is the resolution of that gap, made visible.

The structural impossibility: this signal exists between disk writes. It is observable only by something subscribed to the editor’s text-document-change events in real time. Filesystem watchers see writes. Editor APIs see edits. Cloud analyzers see commits. Three different observation depths. Only the second one reveals reverts.
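One way to detect a revert from a stream of change events, sketched under assumptions: the daemon has a chronological list of `(timestamp, buffer_text)` snapshots for a file, and a revert means the buffer returned to a state it had left within some time window. The function name, the snapshot format, and the 60-second window are all illustrative.

```python
def find_reverts(snapshots, window_seconds=60.0):
    """Return (left_at, returned_at) spans where the buffer came back to a prior state.

    snapshots: chronological list of (timestamp, buffer_text), one per change event.
    Full text is used as the dictionary key for clarity; a real daemon would
    more likely store a hash of the content.
    """
    left_at = {}  # buffer text -> timestamp at which the buffer last moved away from it
    reverts = []
    current = None
    for timestamp, text in snapshots:
        if text == current:
            continue  # content did not change
        if current is not None:
            left_at[current] = timestamp
        departed = left_at.get(text)
        if departed is not None and timestamp - departed <= window_seconds:
            reverts.append((departed, timestamp))
        current = text
    return reverts


snapshots = [
    (0.0, "def total(xs):\n    return sum(xs)\n"),
    (5.0, "def total(xs):\n    return reduce(add, xs)\n"),  # suggestion accepted
    (13.0, "def total(xs):\n    return sum(xs)\n"),         # rolled back
]
reverts = find_reverts(snapshots)
```

The single span found here runs from the moment the suggestion replaced the original text to the moment the original came back, eight seconds later. None of the three states needs to have been saved to disk for this to work.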

Signal four: context-switch count within a session

A developer who edits 14 files in 6 minutes is in a different cognitive state than one who edits 2 files for 90 minutes. The first developer is investigating, navigating, hunting. The second is concentrating. AI assistance behaves differently in these two states. Output quality is different. Risk profile is different.

This is the shape of a session, and it is invisible to anything that doesn’t watch the editor’s focus events live. Sequence matters. Order matters. The 14-file pattern only registers as a context-switch signal if you can see the sequence: file A, then B, then back to A, then C, then D. Cloud analysis sees the eventual file changes, but the choreography is gone.
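A minimal sketch of why the sequence matters: given the ordered list of files that received focus, count the switches and the returns to an already visited file. The function and the returned keys are hypothetical, and which counts are worth tracking is an assumption.

```python
def summarize_focus_sequence(focus_paths):
    """Summarize an ordered list of focused file paths for one session.

    Returns the number of switches, the number of distinct files, and the
    number of returns to a file already visited earlier in the session.
    """
    switches = 0
    revisits = 0
    seen = set()
    previous = None
    for path in focus_paths:
        if path == previous:
            continue  # a repeated focus event on the same file is not a switch
        if previous is not None:
            switches += 1
            if path in seen:
                revisits += 1
        seen.add(path)
        previous = path
    return {"switches": switches, "distinct_files": len(seen), "revisits": revisits}


# The sequence from the text: file A, then B, then back to A, then C, then D.
summary = summarize_focus_sequence(["A", "B", "A", "C", "D"])
```

The set of changed files alone would report four files here. Only the ordered sequence shows the return to A.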

The daemon sees the choreography, because the daemon receives a focus-change event every time the developer switches files. Across thousands of sessions, you can build a baseline for what your sessions normally look like, and flag the ones that deviate.
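How a baseline gets built is not specified here, so the following is only one plausible sketch: treat per-session switch counts as the baseline and flag new sessions that sit more than a few standard deviations from the mean. The z-score rule, the threshold, and the sample numbers are assumptions.

```python
from statistics import mean, pstdev


def flag_unusual_sessions(baseline_counts, new_sessions, threshold=3.0):
    """Return ids of sessions whose switch count deviates from the baseline.

    baseline_counts: non-empty list of context-switch counts from past sessions.
    new_sessions: {session_id: context-switch count} to check against it.
    """
    center = mean(baseline_counts)
    spread = pstdev(baseline_counts)
    flagged = []
    for session_id, count in new_sessions.items():
        if spread == 0:
            unusual = count != center  # no variation in baseline: any change stands out
        else:
            unusual = abs(count - center) / spread > threshold
        if unusual:
            flagged.append(session_id)
    return flagged


baseline = [3, 4, 5, 4, 3, 5, 4, 4]  # illustrative history
flagged = flag_unusual_sessions(baseline, {"s1": 4, "s2": 14})
```

A mean-and-deviation rule is easy to explain but sensitive to outliers in the history. A median-based rule would be a reasonable alternative.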

Signal five: review time per diff line

Cloud tools see diffs. They cannot see the time spent on them, because time-on-diff is a property of the editor session, not of the artifact. A 45-second acceptance and a 90-minute acceptance produce identical pull requests. The PR review tool, the static analyzer, the test suite, and the human reviewer on the other end of the PR all see the exact same code. The only thing that has access to the difference between these two acceptances is something that watched the editor while the diff was being generated and reviewed.

This is the signal that becomes most acutely interesting in the era of agentic code generation. Agents produce large diffs at high speed. The cognitive load on the human reviewing them does not scale linearly with the diff size. A 2,000-line agent-generated diff cannot meaningfully be reviewed in 45 seconds. Yet 45-second acceptances of 2,000-line diffs happen, and they happen often, and the only place they are observable is in the editor where they happen. This is one of the cleanest mechanisms by which AI-induced technical debt accumulates: not through worse code on average, but through faster code than any review process can credibly absorb.
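The arithmetic behind that claim is simple enough to write down. In this sketch the function names and the 0.5 seconds-per-line cutoff are assumptions chosen for illustration. The 45-second, 90-minute, and 2,000-line figures come from the examples above.

```python
def review_seconds_per_line(review_seconds, diff_lines):
    """Average review time spent per changed line."""
    if diff_lines <= 0:
        raise ValueError("diff_lines must be positive")
    return review_seconds / diff_lines


def is_rubber_stamp(review_seconds, diff_lines, min_seconds_per_line=0.5):
    """Flag reviews faster than an assumed, illustrative per-line threshold."""
    return review_seconds_per_line(review_seconds, diff_lines) < min_seconds_per_line


fast = review_seconds_per_line(45, 2000)       # 45-second acceptance
slow = review_seconds_per_line(90 * 60, 2000)  # 90-minute acceptance
```

The two acceptances differ by a factor of 120 in time per line, yet they leave behind the same pull request.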

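Capability                          Cloud analysis    Local daemon
Commit metadata                     Yes               Yes
Static code structure               Yes               Yes
------------------------------------------------------------------
Session boundaries                  No                Yes
Edit-to-commit latency              No                Yes
Per-file AI suggestion acceptance   No                Yes
Intra-session reverts               No                Yes
Context-switch sequence             No                Yes
Review time per diff line           No                Yes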

The pattern is not subtle. Everything below the separator is a property of work-in-progress. Cloud analysis runs on artifacts. Artifacts arrive after the interesting things have happened.

Five signals. Five distinct structural impossibilities for cloud-based observation. Each of them is a property of work-in-progress, not of work-completed. AI-assisted development happens almost entirely in work-in-progress. The artifacts that reach the cloud are downstream of the moments where the interesting things happened.

The category that this defines is AI behavioral intelligence. Not AI observability in the sense of monitoring deployed AI systems. Not code intelligence in the sense of static analysis. Not behavioral code analysis in the CodeScene sense, which runs over git history retrospectively to find hotspots and ownership patterns. That work is real and useful, but it is observation of artifacts, and it is downstream of the layer this post is about. Behavioral intelligence sits between the editor and version control, watching what AI tools and developers are doing together, building a measured understanding of what is actually true about a codebase right now and what is changing about it minute by minute.

This is also where the relationship to AGENTS.md becomes obvious. AGENTS.md tells AI coding tools what they should do. Behavioral intelligence tells you what is actually happening when they do it. The two layers compose: prescription on one side, measurement on the other. workspace.json, the structured artifact this measurement work produces, is a machine-readable record of the codebase’s behavioral state. It is generated from session telemetry and committed to the repo so the whole team and every tool see the same thing.

Three terms come out of this:

Sessions are the unit of measurement. They are bounded, observable, and they map naturally to how AI-assisted thought actually unfolds.

Behavioral signals are the observations that live inside sessions. Edit-to-commit latency, per-file acceptance rates, intra-session reverts, context-switch counts, review time per diff line. There are more. These are five.

Behavioral intelligence is the layer of tooling that captures these signals, builds baselines from them, and produces insights that no static or cloud-based analysis can produce.

The reason this matters now is that AI-assisted development has shifted enough of the work into the editor that the editor is where the meaningful telemetry lives. Cloud analyzers are not going to catch up here. The architecture of where they run forecloses it.

If you build code intelligence tooling, this is where the next layer of insight is going to come from. If you use code intelligence tooling, this is the layer your existing stack is missing. And if you ship AI coding tools, this is the observation infrastructure that will eventually tell you how your tool is actually performing in the codebases that matter, not in the benchmarks where everyone scores well.