Peeking Into Claude Code
- Install chain (old npm era vs Bun binary)
- Decompiling the Bun binary
- From binary to claude.js (and what the Bun markers are)
- Prompt assembly and system reminders
- Tools and schemas
- Non-interactive mode
- MITM vs Bun hook (and why HTTP_PROXY didn’t help)
- Context management and git state
- Skills and plugins
- Automating the whole pipeline
- Appendix: tool schema reference
Claude Code is awesome. So why not figure out the harness and prompts behind it?
Install chain (old npm era vs Bun binary)
There are two eras:
Old way:
npm install @anthropic-ai/claude-code
npm pack @anthropic-ai/claude-code
tar -xvf anthropic-ai-claude-code-*.tgz
You could just unpack the minified JS and read it.
New way:
curl -fsSL https://claude.ai/install.sh | bash
That redirects to a GCS bootstrap script which does the following:
- Reads latest.
- Downloads manifest.json.
- Selects the platform artifact (darwin-arm64 for my machine).
- Downloads the claude binary.
- Verifies its SHA-256 checksum.
- Runs claude install.
As of this run, latest resolved to 2.1.50, and the checksum matched.
Decompiling the Bun binary
Quick fingerprint:
strings claude-2.1.50-darwin-arm64 | tail -n1
# ---- Bun! ----
This tells you that it was "compiled" by Bun.
From binary to claude.js (and what the Bun markers are)
The Bun binary isn’t a single JS file; it’s a bundle with a module table embedded near the end of the file. Bun drops markers and a trailer so its runtime can locate the module graph.
The ---- Bun! ---- string is the obvious marker, but the real work is the module table that sits near it. The table contains entries like:
- module path pointer
- payload offset
- payload length
- flags
The main trick is the entry size. The upstream tool expected 36‑byte entries, but this binary uses 52‑byte entries, so the parser “walked off” the buffer and failed. I wrote a custom extractor that:
- Scans for Bun markers.
- Reads the trailer and module table offsets.
- Tries multiple entry sizes (52, 40, 36, 32, 28) and scores them based on pointer sanity.
- Extracts the bundled payloads into a folder.
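The scoring idea can be sketched like this. It is a simplification of what my extractor does, not its exact code, and the field offsets (u32 payload offset at byte 0, u32 length at byte 4) are assumptions for illustration; the real layout varies by Bun version:

```javascript
// Score candidate module-table entry sizes by "pointer sanity":
// for each stride, read the assumed u32 offset/length fields of every
// record and count how many describe a payload that stays inside the
// file. The right stride scores near 1.0; wrong strides misalign the
// fields and score much lower.
function scoreEntrySizes(buf, tableStart, entryCount, candidates) {
  const scores = {};
  for (const size of candidates) {
    let sane = 0;
    for (let i = 0; i < entryCount; i++) {
      const base = tableStart + i * size;
      if (base + 8 > buf.length) break; // record runs off the buffer
      const offset = buf.readUInt32LE(base);
      const length = buf.readUInt32LE(base + 4);
      if (length > 0 && offset + length <= buf.length) sane++;
    }
    scores[size] = sane / entryCount;
  }
  return scores;
}
```

Picking the highest-scoring stride is what let the parser survive the 36-byte-to-52-byte entry change.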
Output highlights:
- .../bundled/claude (main JS wrapper, ~10.7 MB)
- .../bytecode/claude.bytecode (~97 MB)
- extra .node and .wasm modules (ripgrep, tree-sitter, resvg, etc.)
Then de‑minify the JS wrapper into a readable claude.js. This step uses bun-decompile.
Result:
.../deminified/claude-openai/deminified/claude.js
That file is what the prompt/tool/schema extractor consumes.
Prompt assembly and system reminders
The system prompt is not a single static string. It is assembled from section builders like # System, # Doing tasks, and # Using your tools. That’s why “same prompt, different behavior” happens when mode/tool state/env context changes.
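The assembly pattern looks roughly like this. This is a reconstruction of the shape I saw in the de-minified code, with invented builder bodies and context fields, not the actual implementation:

```javascript
// Sketch of section-based system prompt assembly: each section is a
// builder that inspects the current context and can return null to
// drop itself. Changing mode, tool state, or env context changes
// which sections survive, hence "same prompt, different behavior".
function buildSystemPrompt(ctx) {
  const builders = [
    () => "# System\nYou are an interactive agent that helps users with software engineering tasks.",
    () => (ctx.tools.length ? "# Using your tools\n..." : null),
    () => (ctx.nonInteractive ? "You are running in non-interactive mode..." : null),
  ];
  return builders
    .map((build) => build())
    .filter((section) => section !== null)
    .join("\n\n");
}
```

The same builder list yields different final prompts depending on `ctx`, which matches what the traces show between interactive and `--print` runs.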
Here’s a real prompt snippet from a trace (interactive mode):
You are an interactive agent that helps users with software engineering tasks.
System reminders are a real control channel. Example reminder text:
Tool results and user messages may include <system-reminder> or other tags.
There’s even a non-interactive reminder that changes behavior in --print mode:
You are running in non-interactive mode and cannot return a response to the user until your team is shut down.
Tools and schemas
I extracted the tool schemas from the de-minified bundle. Example schema snippet:
WI8 = NR(() => y.strictObject({
file_path: y.string().describe("The absolute path to the file to read"),
The extracted tool universe for this version is 30 tools:
AskUserQuestion, Bash, Edit, EnterPlanMode, EnterWorktree, ExitPlanMode, Glob, Grep,
ListMcpResourcesTool, LSP, mcp, NotebookEdit, Read, ReadMcpResourceTool, SendMessage,
Skill, StructuredOutput, Task, TaskCreate, TaskGet, TaskList, TaskOutput, TaskStop,
TaskUpdate, TeamCreate, TeamDelete, TodoWrite, ToolSearch, WebFetch, Write
Non-interactive mode
The prompt actually changes with -p / --print. In my --print capture, the system prompt includes the non-interactive reminder above, and the tools array is smaller.
Tools in the --print capture (18 total):
AskUserQuestion, Bash, Edit, EnterPlanMode, EnterWorktree, ExitPlanMode, Glob, Grep,
NotebookEdit, Read, Skill, Task, TaskOutput, TaskStop, TodoWrite, WebFetch, WebSearch, Write
Tools that did not appear in the --print tools list:
ListMcpResourcesTool, LSP, mcp, ReadMcpResourceTool, SendMessage, StructuredOutput,
TaskCreate, TaskGet, TaskList, TaskUpdate, TeamCreate, TeamDelete, ToolSearch
MITM vs Bun hook (and why HTTP_PROXY didn’t help)
MITM showed telemetry/config/update endpoints, but it did not show the real model prompt/response. The Bun-compiled binary does not respect HTTP_PROXY/HTTPS_PROXY in the way you’d expect, so I stopped fighting it and hooked Bun directly.
The Bun preload hook patches fetch and dumps /v1/messages requests + SSE responses. This gets you the full system[], tools[], and tool-use event stream.
Here is the exact run pattern (print mode):
BUN_OPTIONS='--preload <path_to_dir>/trace-claude-messages.cjs' \
CLAUDE_TRACE_DIR=artifacts/trace/messages \
claude -p "Reply exactly: TRACE_FETCH_OK2" --output-format json \
> artifacts/trace/trace-fetch-out.json \
2> artifacts/trace/trace-fetch-err.log
The interesting part is the request URL: it hits loopback first:
http://127.0.0.1:<port>/v1/messages?beta=true
That explains why MITM mostly showed telemetry unless you hook Bun at runtime.
Full hook source:
/* eslint-disable no-console */
const fs = require("fs");
const path = require("path");
const os = require("os");
const { randomUUID } = require("crypto");
const outDir =
process.env.CLAUDE_TRACE_DIR ||
path.join(process.cwd(), "artifacts", "trace", "messages");
fs.mkdirSync(outDir, { recursive: true });
const maxBytes = Number(process.env.CLAUDE_TRACE_MAX_BYTES || 8 * 1024 * 1024);
const onlyMessages = process.env.CLAUDE_TRACE_ONLY_MESSAGES !== "0";
const redactAuth = process.env.CLAUDE_TRACE_REDACT_AUTH !== "0";
const originalFetch = globalThis.fetch?.bind(globalThis);
if (!originalFetch) {
throw new Error("globalThis.fetch is not available");
}
function lowerHeaders(input) {
const out = {};
const h = new Headers(input || {});
for (const [k, v] of h.entries()) {
const key = k.toLowerCase();
if (
redactAuth &&
(key === "authorization" ||
key === "x-api-key" ||
key === "cookie" ||
key === "set-cookie")
) {
out[key] = "***";
} else {
out[key] = v;
}
}
return out;
}
async function readBodySafe(body) {
if (!body) return { text: "", truncated: false };
try {
const txt = await body.text();
if (Buffer.byteLength(txt, "utf8") > maxBytes) {
return { text: txt.slice(0, maxBytes), truncated: true };
}
return { text: txt, truncated: false };
} catch (err) {
return { text: <!--CODE_BLOCK_1127-->, truncated: false };
}
}
function shouldCapture(url) {
if (!onlyMessages) return true;
return /\/v1\/messages(\?|$)/.test(url);
}
globalThis.fetch = async function tracedFetch(input, init = {}) {
const request = new Request(input, init);
const url = request.url;
if (!shouldCapture(url)) {
return originalFetch(input, init);
}
const id = randomUUID();
const ts = new Date().toISOString();
const reqClone = request.clone();
const reqBody = await readBodySafe(reqClone);
let response;
let fetchErr;
try {
response = await originalFetch(request);
} catch (err) {
fetchErr = err;
}
const baseRecord = {
id,
ts,
pid: process.pid,
hostname: os.hostname(),
request: {
method: request.method,
url,
headers: lowerHeaders(request.headers),
body: reqBody.text,
body_truncated: reqBody.truncated,
},
};
if (fetchErr) {
const rec = {
...baseRecord,
error: String(fetchErr),
stack: fetchErr && fetchErr.stack ? String(fetchErr.stack) : null,
};
fs.writeFileSync(
path.join(outDir, <!--CODE_BLOCK_1128-->),
JSON.stringify(rec, null, 2)
);
throw fetchErr;
}
// Return the response immediately so the caller can start reading the
// SSE stream without waiting. Capture the response body in the background.
const respClone = response.clone();
readBodySafe(respClone)
.then((respBody) => {
const record = {
...baseRecord,
response: {
status: response.status,
status_text: response.statusText,
headers: lowerHeaders(response.headers),
body: respBody.text,
body_truncated: respBody.truncated,
},
};
fs.writeFileSync(
path.join(outDir, <!--CODE_BLOCK_1129-->),
JSON.stringify(record, null, 2)
);
})
.catch((err) => {
const record = {
...baseRecord,
response: {
status: response.status,
status_text: response.statusText,
headers: lowerHeaders(response.headers),
body: <!--CODE_BLOCK_1130-->,
body_truncated: false,
},
};
fs.writeFileSync(
path.join(outDir, <!--CODE_BLOCK_1131-->),
JSON.stringify(record, null, 2)
);
});
return response;
};
console.error(
<!--CODE_BLOCK_1132-->
);
Context management and git state
Prompt assembly includes dynamic environment state: cwd, platform, shell, permission mode, tool availability. There are also prompt sections that explicitly reference task management and environment context.
Git state is first-class too. The EnterWorktree tool and related hooks show that repo state is meant to be part of the agent loop, and you can see that in the tool schemas and traces.
Skills and plugins
Skills are exposed as tools (Skill, AskUserQuestion) and appear in the tool schemas. MCP adapters (mcp, ListMcpResourcesTool, ReadMcpResourceTool) are also part of the tool surface.
Automating the whole pipeline
I ended up with two automation layers:
Binary + decompile pipeline
- versioned artifacts under claude-code/versions/<version>/...
- diffs under claude-code/diffs/<old>_to_<new>.md
Prompt/tool/schema extraction
- scripts/extract-claude-intel.ts
- scripts/render-claude-intel-report.ts
- scripts/extract-tool-descriptions.py
For model I/O inspection I use the Bun preload hook so I always get the real /v1/messages payloads, not just the telemetry.
Extraction scripts (self‑contained)
I’m not linking the repo, so here are the actual snippets and what they do.
1) extract-claude-intel.ts — parse the de‑minified bundle
This reads claude.js, finds prompt anchors, tool implementation blocks, schema snippets, and system reminders, then writes a raw JSON blob.
function extractByPatterns(text: string, lines: string[], patterns: RegExp[], before = 8, after = 16): LineContext[] {
const out: LineContext[] = [];
for (const pattern of patterns) {
let m: RegExpExecArray | null;
const re = new RegExp(pattern.source, pattern.flags.includes("g") ? pattern.flags : <!--CODE_BLOCK_1147-->);
while ((m = re.exec(text)) !== null) {
const line = lineNumberAt(text, m.index);
out.push({
line,
match: m[0].slice(0, 120),
snippet: getLineWindow(lines, line, before, after),
});
if (out.length >= 50) return out.sort((a, b) => a.line - b.line);
}
}
return out.sort((a, b) => a.line - b.line);
}
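The snippet relies on two small helpers not shown above, `lineNumberAt` and `getLineWindow`. Hypothetical versions (my own, written to match how the snippet calls them):

```javascript
// lineNumberAt: 1-based line number of a character index in the text.
function lineNumberAt(text, index) {
  let line = 1;
  for (let i = 0; i < index; i++) {
    if (text[i] === "\n") line++;
  }
  return line;
}

// getLineWindow: slice of `before` lines above and `after` lines below
// a 1-based line number, joined back into a context snippet.
function getLineWindow(lines, line, before, after) {
  const start = Math.max(0, line - 1 - before);
  const end = Math.min(lines.length, line + after);
  return lines.slice(start, end).join("\n");
}
```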
function parseToolBlocks(text: string): ToolRecord[] {
const tools: ToolRecord[] = [];
const assignRe = /([A-Za-z_$][\w$]*)\s*=\s*\{/g;
let m: RegExpExecArray | null;
while ((m = assignRe.exec(text)) !== null) {
const symbol = m[1];
const openBraceIndex = text.indexOf("{", m.index);
const closeBraceIndex = findMatchingBrace(text, openBraceIndex);
const block = text.slice(openBraceIndex, closeBraceIndex + 1);
if (!block.includes("name:")) continue;
if (!block.includes("inputSchema")) continue;
if (!block.includes("call(") && !block.includes("async call(")) continue;
// extract name + input/output schema expressions
// store the block + snippets for later rendering
}
return tools;
}
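`parseToolBlocks` leans on a `findMatchingBrace` helper that isn't shown. A plausible version, sketched under the assumption that string literals are the main hazard (template literals and regex literals in real minified JS would need handling too):

```javascript
// Scan forward from an opening "{" and return the index of its
// matching "}". Braces inside single- or double-quoted strings are
// skipped so they don't corrupt the depth count. Returns -1 if the
// brace is never closed.
function findMatchingBrace(text, openIndex) {
  let depth = 0;
  let inString = null; // the quote char we're inside, or null
  for (let i = openIndex; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === inString) inString = null;
    } else if (ch === '"' || ch === "'") {
      inString = ch;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}
```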
What it gives me:
- prompt anchors + snippets
- system reminder blocks
- tool implementation blocks
- schema snippets (Zod/NR)
Raw output:
claude-intel.json
2) render-claude-intel-report.ts — normalize + render markdown
This takes the raw JSON and turns it into human‑readable artifacts.
const tools = intel.tools
.map((t) => {
const resolvedName = resolveExpr(t.nameExpr, constMap) ?? t.nameExpr;
const resolvedInput = resolveExpr(t.inputSchemaExpr, constMap) ?? t.inputSchemaExpr;
const resolvedOutput = resolveExpr(t.outputSchemaExpr, constMap) ?? t.outputSchemaExpr;
return { ...t, resolvedName, resolvedInput, resolvedOutput };
})
.sort((a, b) => a.resolvedName.localeCompare(b.resolvedName));
await Bun.write(<!--CODE_BLOCK_1150-->, systemMd.join("\n"));
await Bun.write(<!--CODE_BLOCK_1151-->, implMd.join("\n"));
await Bun.write(<!--CODE_BLOCK_1152-->, JSON.stringify(schemaRecords, null, 2));
await Bun.write(<!--CODE_BLOCK_1153-->, schemaMd.join("\n"));
await Bun.write(<!--CODE_BLOCK_1154-->, sandboxMd.join("\n"));
Outputs:
- system-prompts.md
- tool-implementations.md
- tool-schemas.json + tool-schemas.md
- sandbox-capabilities.md
3) extract-tool-descriptions.py — description strings
This scrapes the long tool descriptions from the bundle and writes:
- tool-descriptions.json
- tool-descriptions.md
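The scraping idea is simple enough to sketch. This is my own rough JavaScript equivalent of the approach, not the actual Python script; the key name and length threshold are assumptions:

```javascript
// Find long double-quoted strings assigned to a description-like key
// in the bundle text and keep only the ones above a length threshold,
// which filters out short UI labels and leaves the real tool docs.
function scrapeDescriptions(bundleText, minLength = 200) {
  const re = /description\s*:\s*"((?:[^"\\]|\\.)*)"/g;
  const out = [];
  let m;
  while ((m = re.exec(bundleText)) !== null) {
    if (m[1].length >= minLength) out.push(m[1]);
  }
  return out;
}
```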
How I run it
# 1) Extract raw intel from the de‑minified bundle
bun scripts/extract-claude-intel.ts \
claude-code/versions/2.1.50/analysis/claude-intel/claude-intel.json \
claude-code/versions/2.1.50/deminified/claude-openai/deminified/claude.js \
claude-code/versions/2.1.50/analysis/claude-intel
# 2) Render reports
bun scripts/render-claude-intel-report.ts \
claude-code/versions/2.1.50/analysis/claude-intel/claude-intel.json \
claude-code/versions/2.1.50/deminified/claude-openai/deminified/claude.js \
claude-code/versions/2.1.50/analysis/claude-intel
# 3) Extract tool descriptions
python3 scripts/extract-tool-descriptions.py \
claude-code/versions/2.1.50/deminified/claude-openai/deminified/claude.js \
claude-code/versions/2.1.50/analysis/claude-intel/tool-descriptions.json \
claude-code/versions/2.1.50/analysis/claude-intel/tool-descriptions.md
Appendix: tool schema reference
These are the tool parameters as observed in captured /v1/messages payloads.
Observed in trace payloads
AskUserQuestion
- questions: Questions to ask the user (1-4 questions). Type: array
- questions[].question: The complete question to ask the user. Should be clear, specific, and end with a question mark. Example: "Which library should we use for date formatting?" If multiSelect is true, phrase it accordingly, e.g. "Which features do you want to enable?" Type: string
- questions[].header: Very short label displayed as a chip/tag (max 12 chars). Examples: "Auth method", "Library", "Approach". Type: string
- questions[].options: The available choices for this question. Must have 2-4 options. Each option should be a distinct, mutually exclusive choice (unless multiSelect is enabled). There should be no 'Other' option, that will be provided automatically. Type: array
- questions[].options[].label: The display text for this option that the user will see and select. Should be concise (1-5 words) and clearly describe the choice. Type: string
- questions[].options[].description: Explanation of what this option means or what will happen if chosen. Useful for providing context about trade-offs or implications. Type: string
- questions[].options[].markdown: Optional preview content shown in a monospace box when this option is focused. Use for ASCII mockups, code snippets, or diagrams that help users visually compare options. Supports multi-line text with newlines. Type: string
- questions[].multiSelect: Set to true to allow the user to select multiple options instead of just one. Use when choices are not mutually exclusive. Type: boolean. Default: false
- answers: User answers collected by the permission component. Type: object
- annotations: Optional per-question annotations from the user (e.g., notes on preview selections). Keyed by question text. Type: object
- metadata: Optional metadata for tracking and analytics purposes. Not displayed to user. Type: object
- metadata.source: Optional identifier for the source of this question (e.g., "remember" for /remember command). Used for analytics tracking. Type: string
Bash
- command: The command to execute. Type: string
- timeout: Optional timeout in milliseconds (max 600000). Type: number
- description: Clear, concise description of what this command does in active voice. Never use words like "complex" or "risk" in the description - just describe what it does. Type: string
  For simple commands (git, npm, standard CLI tools), keep it brief (5-10 words):
  - ls → "List files in current directory"
  - git status → "Show working tree status"
  - npm install → "Install package dependencies"
  For commands that are harder to parse at a glance (piped commands, obscure flags, etc.), add enough context to clarify what it does:
  - find . -name "*.tmp" -exec rm {} \; → "Find and delete all .tmp files recursively"
  - git reset --hard origin/main → "Discard all local changes and match remote main"
  - curl -s url | jq '.data[]' → "Fetch JSON from URL and extract data array elements"
- run_in_background: Set to true to run this command in the background. Use TaskOutput to read the output later. Type: boolean
- dangerouslyDisableSandbox: Set this to true to dangerously override sandbox mode and run commands without sandboxing. Type: boolean
Edit
- file_path: The absolute path to the file to modify. Type: string
- old_string: The text to replace. Type: string
- new_string: The text to replace it with (must be different from old_string). Type: string
- replace_all: Replace all occurrences of old_string (default false). Type: boolean. Default: false
EnterPlanMode
- (no parameters)
EnterWorktree
- name: Optional name for the worktree. A random name is generated if not provided. Type: string
ExitPlanMode
- allowedPrompts: Prompt-based permissions needed to implement the plan. These describe categories of actions rather than specific commands. Type: array
- allowedPrompts[].tool: The tool this prompt applies to. Allowed: Bash. Type: string
- allowedPrompts[].prompt: Semantic description of the action, e.g. "run tests", "install dependencies". Type: string
Glob
- pattern: The glob pattern to match files against. Type: string
- path: The directory to search in. If not specified, the current working directory will be used. IMPORTANT: Omit this field to use the default directory. DO NOT enter "undefined" or "null" - simply omit it for the default behavior. Must be a valid directory path if provided. Type: string
Grep
- pattern: The regular expression pattern to search for in file contents. Type: string
- path: File or directory to search in (rg PATH). Defaults to current working directory. Type: string
- glob: Glob pattern to filter files (e.g. "*.js", "*.{ts,tsx}") - maps to rg --glob. Type: string
- output_mode: Output mode: "content" shows matching lines (supports -A/-B/-C context, -n line numbers, head_limit), "files_with_matches" shows file paths (supports head_limit), "count" shows match counts (supports head_limit). Defaults to "files_with_matches". Allowed: content, files_with_matches, count. Type: string
- -B: Number of lines to show before each match (rg -B). Requires output_mode: "content", ignored otherwise. Type: number
- -A: Number of lines to show after each match (rg -A). Requires output_mode: "content", ignored otherwise. Type: number
- -C: Alias for context. Type: number
- context: Number of lines to show before and after each match (rg -C). Requires output_mode: "content", ignored otherwise. Type: number
- -n: Show line numbers in output (rg -n). Requires output_mode: "content", ignored otherwise. Defaults to true. Type: boolean
- -i: Case insensitive search (rg -i). Type: boolean
- type: File type to search (rg --type). Common types: js, py, rust, go, java, etc. More efficient than include for standard file types. Type: string
- head_limit: Limit output to first N lines/entries, equivalent to "| head -N". Works across all output modes: content (limits output lines), files_with_matches (limits file paths), count (limits count entries). Defaults to 0 (unlimited). Type: number
- offset: Skip first N lines/entries before applying head_limit, equivalent to "| tail -n +N | head -N". Works across all output modes. Defaults to 0. Type: number
- multiline: Enable multiline mode where . matches newlines and patterns can span lines (rg -U --multiline-dotall). Default: false. Type: boolean
NotebookEdit
- notebook_path: The absolute path to the Jupyter notebook file to edit (must be absolute, not relative). Type: string
- cell_id: The ID of the cell to edit. When inserting a new cell, the new cell will be inserted after the cell with this ID, or at the beginning if not specified. Type: string
- new_source: The new source for the cell. Type: string
- cell_type: The type of the cell (code or markdown). If not specified, it defaults to the current cell type. If using edit_mode=insert, this is required. Allowed: code, markdown. Type: string
- edit_mode: The type of edit to make (replace, insert, delete). Defaults to replace. Allowed: replace, insert, delete. Type: string
Read
- file_path: The absolute path to the file to read. Type: string
- offset: The line number to start reading from. Only provide if the file is too large to read at once. Type: number
- limit: The number of lines to read. Only provide if the file is too large to read at once. Type: number
- pages: Page range for PDF files (e.g., "1-5", "3", "10-20"). Only applicable to PDF files. Maximum 20 pages per request. Type: string
SendMessage
- type: Message type: "message" for DMs, "broadcast" to all teammates, "shutdown_request" to request shutdown, "shutdown_response" to respond to shutdown, "plan_approval_response" to approve/reject plans. Allowed: message, broadcast, shutdown_request, shutdown_response, plan_approval_response. Type: string
- recipient: Agent name of the recipient (required for message, shutdown_request, plan_approval_response). Type: string
- content: Message text, reason, or feedback. Type: string
- summary: A 5-10 word summary of the message, shown as a preview in the UI (required for message, broadcast). Type: string
- request_id: Request ID to respond to (required for shutdown_response, plan_approval_response). Type: string
- approve: Whether to approve the request (required for shutdown_response, plan_approval_response). Type: boolean
Skill
- skill: The skill name. E.g., "commit", "review-pr", or "pdf". Type: string
- args: Optional arguments for the skill. Type: string
Task
- description: A short (3-5 word) description of the task. Type: string
- prompt: The task for the agent to perform. Type: string
- subagent_type: The type of specialized agent to use for this task. Type: string
- model: Optional model to use for this agent. If not specified, inherits from parent. Prefer haiku for quick, straightforward tasks to minimize cost and latency. Allowed: sonnet, opus, haiku. Type: string
- resume: Optional agent ID to resume from. If provided, the agent will continue from the previous execution transcript. Type: string
- run_in_background: Set to true to run this agent in the background. The tool result will include an output_file path - use Read tool or Bash tail to check on output. Type: boolean
- max_turns: Maximum number of agentic turns (API round-trips) before stopping. Used internally for warmup. Type: integer. Range: ..9007199254740991
- isolation: Isolation mode. "worktree" creates a temporary git worktree so the agent works on an isolated copy of the repo. Allowed: worktree. Type: string
TaskOutput
- task_id: The task ID to get output from. Type: string
- block: Whether to wait for completion. Type: boolean. Default: true
- timeout: Max wait time in ms. Type: number. Default: 30000. Range: 0..600000
TaskStop
- task_id: The ID of the background task to stop. Type: string
- shell_id: Deprecated: use task_id instead. Type: string
TeamCreate
- team_name: Name for the new team to create. Type: string
- description: Team description/purpose. Type: string
- agent_type: Type/role of the team lead (e.g., "researcher", "test-runner"). Used for team file and inter-agent coordination. Type: string
TeamDelete
- (no parameters)
TodoWrite
- todos: The updated todo list. Type: array
- todos[].content: Type: string
- todos[].status: Allowed: pending, in_progress, completed. Type: string
- todos[].activeForm: Type: string
WebFetch
- url: The URL to fetch content from. Type: string. Format: uri
- prompt: The prompt to run on the fetched content. Type: string
WebSearch
- query: The search query to use. Type: string
- allowed_domains: Only include search results from these domains. Type: array
- blocked_domains: Never include search results from these domains. Type: array
Write
- file_path: The absolute path to the file to write (must be absolute, not relative). Type: string
- content: The content to write to the file. Type: string
Schemas not observed in trace payloads yet (gated)
These tools are in the de-minified bundle but did not appear in the live /v1/messages tool arrays I captured.
- MCP tools only show up when MCP servers are connected and active. I have strong opinions on why MCPs are less efficient, so I don't really care about it.
- ToolSearch shows up when deferred/tool-discovery mode is enabled.
- LSP shows up when a language server is initialized.
- StructuredOutput likely requires structured-output mode flags.
So the parameters below are best-effort snippets and may be incomplete:
- ListMcpResourcesTool: server (optional, server name filter)
- LSP: operation (enum: goToDefinition, findReferences, hover, documentSymbol, workspaceSymbol, goToImplementation, prepareCallHierarchy, incomingCalls, outgoingCalls)
- mcp: passthrough object (no fixed fields in snippet)
- ReadMcpResourceTool: server (MCP server name)
- StructuredOutput: passthrough object (no fixed fields in snippet)
- ToolSearch: query (search tools; supports select:<tool_name>)