Tracking Work Alongside Spec-Driven Development

A few months ago, I wrote about using specs to tame code generation. The argument was that code generation can speed up implementation, but it does not remove the need to write down intent. If the only source of truth is a chat transcript, review requires reconstructing what I meant.

Specs helped solve that. They gave me a contract before code existed. Acceptance criteria made implementation and review more objective.

A related issue came up as I used more agents in the same project: even when the spec was clear, agents still had to rebuild execution context. They needed to read the repo, infer what had already happened, understand what was still open, and decide where the current branch fit.

The spec described the intended behavior. It did not describe the current state of the work.

Jonathon Vandamme shared useful context with me around tracking work chronology: keeping a record of how work moves through a project. I took that idea and developed it further for agent coordination.

So I added a separate layer: track execution context explicitly.

What Specs Do Not Cover

Spec-driven development answers a product question:

What behavior should exist when this is done?

That is the right question before implementation. It keeps the work anchored to user value, workflows, constraints, and acceptance criteria.

But agents also need operational context:

Which branch is carrying the work?
Which task is active right now?
Was there a backlog item?
Which acceptance criteria or conformance IDs are touched?
Did another session already claim part of the work?
What was intentionally deferred?
Which README or project docs need to stay aligned?

These questions sit between the spec and git history.

Git records what changed. Specs record what should be true. Neither gives an agent a concise view of what is currently happening on a branch.

Without that view, each new session spends time rebuilding context from commits, diffs, docs, and conversation history. That adds overhead, especially when more than one agent session touches the same area of the repo.

The `tracking/` Model

The tracking/ directory is a small worklog system:

tracking/
+-- backlog/
+-- progress/
+-- tech-debt/

Each part has one job.

tracking/backlog/ captures work before it becomes a branch. It is useful when an issue is known, but not yet implemented.

tracking/progress/ captures branch-level coordination. There is one progress file per branch or PR.

tracking/tech-debt/ captures deferred work. This prevents cleanup from being forgotten while also keeping the current branch small.

The progress file is the main coordination document. Its job is to tell an agent where the branch stands without requiring a full reconstruction from git history and chat context.

# feat-rpc-retry - exponential backoff for an RPC client

Branch: `feat/rpc-retry`
PR: #482
Spec: [2026-06-10-rpc-retries](../spec/features/2026-06-10-rpc-retries.md)
Backlog: [rpc-retry](../backlog/rpc-retry.md)

## Tasks

| Task                 | Status | Owner | Notes        |
| -------------------- | ------ | ----- | ------------ |
| Wire up retry policy | done   | otter | RPC-RETRY-01 |
| Add jitter           | wip    | maple | RPC-RETRY-02 |
| Backfill tests       | todo   | -     | RPC-RETRY-03 |

## Notes

- otter: editing retry policy code and related tests.
- maple: downstream caller depends on `RetryPolicy.makeDefault()` staying stable.
- maple: deferred jitter tuning -> [tech-debt](../tech-debt/rpc-jitter.md)

This file is intentionally brief. It is an index, not a narrative.

The branch points to the spec, PR, backlog item, and any deferred work. The task table shows the current state. Notes preserve only the handoff details that git does not capture.

Why One File Per Branch

The branch is the right unit for this because it matches how agent work is actually reviewed.

A branch can contain several tasks. It can implement a spec, close a backlog item, introduce follow-up debt, and update docs. A future agent needs those links in one place.

The file naming rule makes that stable:

tracking/progress/<yyyy-mm-dd>-<branch-slug>.md

The date comes from the first branch-specific commit when one exists. If the branch has no unique commits yet, the file uses today's date and keeps that name for the life of the branch.

That rule prevents agents from creating duplicate progress files as the branch evolves.

Example: A JavaScript Runtime

I used this process while building out a JavaScript runtime in a private repo. The work had several related streams: a command-line executable, module loading, fetch support, web-platform test coverage, cryptography APIs, structured clone, and byte transport between Swift and JavaScriptCore.

That work had specs for the user-visible and architectural changes, including native byte transport and fetch streaming. The specs described what the runtime needed to support. The progress file described how the branch was moving through the implementation.

The progress file for that branch linked the specs, then tracked the implementation as a task table:

Branch: `codex/native-byte-bridge`
PR: #257
Spec: [native-byte-bridge](../spec/features/native-byte-bridge.md),
[fetch streaming](../spec/features/fetch-streaming.md)

| Task                               | Status | Owner | Notes                             |
| ---------------------------------- | ------ | ----- | --------------------------------- |
| Add fetch test harness             | done   | codex | local server + resource shim      |
| Build executable and module loader | done   | codex | async import, import maps, REPL   |
| Add native byte bridge             | done   | codex | ArrayBuffer + typed-array support |
| Stream production fetch responses  | done   | codex | native stream -> ReadableStream   |
| Maintain specs                     | done   | codex | activated completed feature specs |

The useful part was not the table by itself. It was the combination of links and status.

An agent could open the progress file and see that fetch streaming was already implemented, redirect behavior had test coverage, browser-only CORS behavior was intentionally out of scope, and active tech debt was clean for the PR. That is the kind of context that is easy to lose if it only exists in chat history.

The same branch also showed why one progress file per branch matters. Earlier progress shards were folded into a single file, so the PR had one coordination surface. Git kept the detailed history. The progress file kept the operational summary.

What Changed in `AGENTS.md`

I moved this into AGENTS.md as a required workflow step. The key rule is:

Every work item updates tracking/.

That applies to spec'd work, unplanned fixes, documentation changes, and chores. A small change can have a small progress entry, but it still gets one.

The workflow now makes tracking explicit:

Read the target package README and relevant docs.
Create or switch to a branch.
Pick a backlog item if one exists.
Identify conformance IDs if the work has them.
Implement the smallest coherent slice.
Add or update tests.
Run the relevant checks.
Update tracking/.
Open or update a PR.

The tracking step is near the end because it records what happened, but it also affects work in progress. Agents claim rows, mark tasks wip, move them to done, and add notes when a decision matters for handoff.

For multi-session work, this makes tracking/progress/ the shared coordination surface.

How It Combines With Specs

Specs and tracking are separate because they represent different kinds of truth.

File type	Primary question	Update pattern
Spec	What behavior should exist?	Written before significant work
Progress file	What is happening on the branch?	Amended during implementation
Backlog item	What work is known but open?	Created or reconciled as work is scoped
Tech debt	What did I intentionally defer?	Added when a branch leaves follow-up

This separation keeps specs product-focused. A spec should not become a running work diary. It should describe behavior, workflows, constraints, and acceptance criteria.

The progress file handles the operational state. It can say who is working on a row, what checks ran, which backlog item was reconciled, and what follow-up moved to tech debt.

The link between them is the useful part:

Spec: [2026-06-10-rpc-retries](../spec/features/2026-06-10-rpc-retries.md)
Backlog: [rpc-retry](../backlog/rpc-retry.md)

An agent can start at the progress file, open the spec for product intent, inspect the PR for review context, and follow tech-debt links for deferred work.

That keeps the reconstruction step smaller.

What Belongs in Progress Notes

Progress files are not commit logs. Git already stores commit history better than Markdown ever will.

A progress note should capture information that is useful for coordination and not obvious from the diff:

a file another session is actively editing
a compatibility constraint that affected the implementation
a link to deferred work
a reason a task is still wip
a check that could not run and why

It should avoid restating every commit, every file change, or every implementation detail.

The goal is fast orientation. A future agent should understand branch state in under a minute.

Agent Context

Agents are good at reading code, but making every session rediscover context is wasteful.

The repeated pattern looks like this:

The spec exists.
The branch has partial implementation.
The chat history contains useful context.
A new session starts.
The agent spends time reconstructing branch state from scattered evidence.

tracking/ gives that state a durable place inside the repo.

This is especially useful when threads compact, sessions pause, or multiple agents touch nearby files. The progress file becomes the first document to read after the README and spec.

It does not replace review. It reduces how much context review has to rebuild.

Tradeoffs

The obvious cost is maintenance. Every work item has one more file to update.

I think that cost is acceptable because the entries are short and scoped. A documentation-only change might add a single task row and one note. A larger feature might link to a spec, backlog item, PR, and tech-debt note.

The other risk is duplication. If progress files start repeating commit history or spec content, they become stale. The fix is to keep them as indexes: task state, ownership, links, and handoff notes.

The system works if each file owns a distinct concern.

Result

Spec-driven development made intent explicit. tracking/ makes execution context explicit.

Together, they give agents two things they need before changing code:

the product contract
the current state of the work

That has become the important distinction in my project. Specs tell an agent what good looks like. Tracking tells the agent where the branch is and what remains.

The article I wrote in February focused on clarity before implementation. This layer adds clarity during implementation.

That is the practical difference: less reconstruction, fewer stale assumptions, and a cleaner handoff from one agent session to the next.