File editing is the highest-frequency "actuator" in coding agents—and it's where agents most often fail for reasons that have little to do with "can the model code?" and everything to do with interfaces, determinism, and feedback.
This post is a practical blueprint for building a file-edit subsystem that:
If you already have an agent loop, treat this as the missing "edit reliability layer" that turns good plans into correct changes on disk.
Most failures in "edit the codebase" aren't deep reasoning failures. They're representation ↔ applier mismatches:
So the core engineering goal is not "make the model smarter." It's: make edits deterministic, verifiable, recoverable, and cheap to repair.
At scale, reliability comes from one move: normalize everything into an internal Edit IR and make application deterministic and instrumented.
Your agent loop becomes:

You don't need a huge IR. A small set of operations covers 95% of real workflows:
CreateFile(path, content)DeleteFile(path)ReplaceExact(path, old, new, expected_replacements=1)RewriteFile(path, content) (used sparingly / as fallback)If you support patch-like operations, you can still compile them down into these primitives (e.g., multiple ReplaceExacts).
Key principle: the IR is what you test. Adapters are allowed to be messy; the applier must be boring.
A strict applier prevents silent corruption. A helpful applier prevents retry spirals.
For content-based operations like search/replace:
old == "".old matches 0 times.old matches >1 times (unless expected_replacements allows it).When an apply fails, return an error payload that includes:
NO_MATCH vs MULTIPLE_MATCHES vs OUT_OF_DATE vs PARSE_ERRORold (so it can revise)That makes "repairing the edit" a mechanical task for the model, not guesswork.
Edits are only meaningful relative to a specific file version. In real agent runs, files change due to:
Fix: include hash/version checks at every step.
When you view a file, return {path, hash, excerpt}.
When the model proposes an edit, require it to reference the base hash it saw.
Before applying, the applier re-computes the current hash:
OUT_OF_DATE and force re-read.This one rule dramatically reduces "phantom edits" and makes failures legible.

Here are the dominant edit representations you'll encounter. The point is not to "pick the best one forever," but to:
Examples (conceptually):
apply_patch-style): model outputs patch operations; your harness applies and reports results.view / str_replace / insert / create): model calls structured editor commands.replace(old_string, new_string)): strict unique match requirements; sometimes with correction loops.Why these win: they maximize "well-formedness" and make errors explicit.
A robust text-only format is:
<<<<<<< SEARCH=======>>>>>>> REPLACEWhy it's strong: no line numbers; deterministic to apply; easy to reject ambiguity.
Failure mode: SEARCH mismatch or matches multiple times.
@@ hunks)Useful for PR-like workflows and interoperability, but harder to apply robustly unless you invest in a flexible patcher.
Failure mode: malformed hunks, brittle context, line-number/offset confusion.
A blunt but often effective fallback—especially for small files or when edits are widespread.
Failure mode: unintended changes, elision ("… existing code …"), merge conflicts.
Models don't "prefer formats" as an abstract aesthetic choice. They're more reliable when the interface matches:
apply_patch)Use a patch tool when you can expose it as a first-class action. It's designed for iterative, multi-step coding workflows:
view + str_replace)This tool family is essentially "editor RPC":
view) before edit,str_replace with enough context for uniqueness,This style typically emphasizes:
old_string must match whitespace precisely and uniquely,If your environment supports checkpointing/restore, integrate it into your fallback ladder. Rollback is not a luxury; it's how you prevent error propagation.
When an edit fails, don't "try again" blindly. Escalate deterministically.
A battle-tested ladder:
Minimal targeted edit
str_replace / replaceResync + widen context
old anchor (more lines; preserve indentation)Switch representation
Sandbox + validate
Rollback

Below is a strict, production-oriented "exact replace with uniqueness" applier sketch. It intentionally refuses ambiguous edits.
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable
@dataclass(frozen=True)
class ReplaceOp:
path: Path
old: str
new: str
expected_replacements: int = 1
base_hash: str | None = None # optional drift control hook
class EditError(RuntimeError):
def __init__(self, code: str, message: str, *, data: dict | None = None):
super().__init__(message)
self.code = code
self.data = data or {}
def _file_hash(text: str) -> str:
# Replace with a real hash (sha256) in production.
# Keeping it simple here to focus on control flow.
return str(len(text)) + ":" + str(text.count("\n"))
def apply_replace_ops(ops: Iterable[ReplaceOp], *, root: Path) -> list[dict]:
"""
Deterministic applier:
- enforces sandbox root
- enforces unique matching (or explicit expected_replacements)
- supports drift control via base_hash
- returns a structured receipt list (one per file)
"""
root = root.resolve()
by_file: dict[Path, list[ReplaceOp]] = {}
for op in ops:
if op.old == "":
raise EditError("EMPTY_OLD", "Refusing empty 'old' for replace (ambiguous).")
fp = (root / op.path).resolve()
if not str(fp).startswith(str(root)):
raise EditError("OUT_OF_ROOT", f"Refusing to edit outside root: {fp}")
by_file.setdefault(fp, []).append(op)
receipts: list[dict] = []
for fp, file_ops in by_file.items():
if not fp.exists():
raise EditError("FILE_NOT_FOUND", f"File not found: {fp}", data={"path": str(fp)})
text = fp.read_text(encoding="utf-8")
current_hash = _file_hash(text)
# Optional drift guard: if any op specifies base_hash, require it.
for op in file_ops:
if op.base_hash is not None and op.base_hash != current_hash:
raise EditError(
"OUT_OF_DATE",
f"File changed since read: {fp}",
data={"path": str(fp), "base_hash": op.base_hash, "current_hash": current_hash},
)
original_text = text
for op in file_ops:
count = text.count(op.old)
if count != op.expected_replacements:
raise EditError(
"MATCH_COUNT_MISMATCH",
f"{fp}: expected {op.expected_replacements} match(es), found {count}.",
data={
"path": str(fp),
"expected": op.expected_replacements,
"found": count,
# In production: include candidate snippets around occurrences.
},
)
text = text.replace(op.old, op.new, op.expected_replacements)
if text != original_text:
fp.write_text(text, encoding="utf-8")
receipts.append({
"path": str(fp),
"changed": text != original_text,
"before_hash": current_hash,
"after_hash": _file_hash(text),
})
return receiptsThe receipt is what keeps the model "grounded" in reality.
A good receipt contains:
Example receipt payload:
{
"ok": false,
"error": {
"code": "MATCH_COUNT_MISMATCH",
"message": "src/foo.py: expected 1 match(es), found 2.",
"data": {
"path": "src/foo.py",
"expected": 1,
"found": 2,
"candidates": [
{"line_start": 42, "line_end": 60, "excerpt": "...\n..."},
{"line_start": 113, "line_end": 131, "excerpt": "...\n..."}
]
}
},
"next_action_hint": "Widen `old` context to uniquely identify the intended occurrence."
}This is the difference between "agent stuck retrying" and "agent repairs edit in one step."
Whole-file rewrite is a legitimate tool—especially when:
But you must guard against elision and accidental churn:
Rewrite guardrails
A good policy is: "rewrite is allowed, but only when validation is cheap and the diff is explainable."
A recurring pattern in strong systems is splitting:
This can be done with:

You'll see the same reliability moves repeated across scaffolds:
The important part is not which tool does it—it's that these are convergent solutions to the same mechanical problem.
expected_replacements or rejects ambiguity.NO_MATCH, MULTIPLE_MATCHES)The ecosystem will keep shipping new diff syntaxes, editor tools, and "apply models." Don't anchor your system to any single representation.
Anchor it to:
That's what turns "LLM can propose good changes" into "agent reliably lands correct changes."