// DICTIONARY_MANUAL
AI Engineering Dictionary
A structured reference index explaining emerging AI terms, architecture patterns, failure modes, and developer conventions. Use the homepage console for instant interactive search, or browse the sections below.
§01
Section 1 — The Model
9 entries compiledArtificial Intelligence (AI)verified
A general term describing computer systems that perform tasks historically requiring human intelligence — like writing code, reasoning through bugs, or understanding natural language.Inferenceverified
The execution phase where a trained model processes input tokens to generate output tokens.Model Providerverified
The host infrastructure that serves model inference, either via cloud-based APIs or local serving engines.Modelverified
The compiled, frozen parameters that calculate next-token probability distributions. A model is completely stateless and pure.Next-token Predictionverified
The core autoregressive mechanism of generative models, calculating token probabilities and sampling one token at a time.Non-determinismverified
The operational characteristic where identical prompts output different token sequences due to temperature scaling, nucleus sampling, and floating-point math variations.Parametersverified
The internal floating-point weights and biases of a neural network, optimized during training, that define the model's behavior.Tokenverified
The basic numerical chunk (character fragment or sub-word) that a model reads and writes, converted from text via a tokenizer.Trainingverified
The computational process of optimizing a model's parameters by exposing it to datasets and adjusting weights via backpropagation.§02
Section 2 — Model provider request
14 entries compiledAgentverified
An autonomous software system that embeds an LLM within a stateful execution loop, enabling it to call tools, interact with files, and iteratively accomplish complex goals.Cache Tokensverified
The portion of input tokens that matched an active prefix cache, resulting in significantly reduced bills and near-instant processing.Context Windowverified
The absolute maximum token limit (capacity) that a model can process, read, and write in a single API request.Contextverified
The combined body of system instructions, conversation logs, files, and schemas injected into the model request to guide its behavior and provide facts.Harnessverified
The client-side application code that drives the model, parses tool calls, maintains conversation logs, and executes command operations on the local machine.Input Tokensverified
The numerical text fragments (prompts, system rules, history, and schemas) sent in a request to the model provider.Model Provider Requestverified
The network API payload containing prompt messages, system templates, parameters, and tool definitions sent to a model provider.Output Tokensverified
The numerical text fragments generated by the model in response to a request, billed at a premium rate and processed sequentially.Prefix Cacheverified
An optimization system that stores pre-processed prompt segments in GPU memory, skipping repetitive calculations for identical context prefixes.Sessionverified
The active span of a conversation thread, representing the sequence of user queries, tool calls, and model responses accumulated in memory.Statefulverified
The operational design where a client application (harness) maintains a persistent record of messages, files, and variables across multiple stateless model queries.Statelessverified
The architectural characteristic of AI models where each API request has no memory of prior queries, requiring the client to send the entire conversation history in every call.System Promptverified
The root-level instruction block in an API request that establishes the model's role, constraints, formatting rules, and tool access boundaries.Turnverified
A single request-response exchange in a session, composed of user input (and potential tool results) followed by the model's output.§03
Section 3 — Environment
7 entries compiledEnvironmentverified
The boundary of directories, files, systems, databases, and network APIs that an agent can see and modify using its tools.Filesystemverified
The storage interface where an agent reads source documents, inspects files, and writes edits using file-operation tools.Permission Modeverified
The configuration tier (Bypass, Ask, or Strict) that dictates which tool classes run automatically and which require developer approval.Permission Requestverified
A checkpoint in an agent loop that prompts the developer for approval before executing a sensitive tool call.Tool Callverified
The structured request generated by a model during inference, specifying a tool name and arguments it wants the client harness to run.Tool Resultverified
The execution output (data or error logs) sent back to the model provider by the client harness after running a tool call.Toolverified
An external function or API made available to a model, defined via a JSON schema, allowing the agent loop to execute operations on the host system.§03
Section 3 — Agent Tooling
1 entries compiled§04
Section 4 — Sandbox
4 entries compiledAgent Modeverified
The runtime configuration (e.g. architect, builder, interpreter) that sets the model's role instructions and restricts the tools it can access.Hallucinationverified
A model failure mode where the LLM generates factually false statements, non-existent code functions, or phantom API parameters that sound plausible.Sandboxverified
An isolated computing environment (container, VM, or restricted shell) that restricts the files and commands an agent can access, limiting the damage of automated actions.Sycophancyverified
A model failure mode where the LLM submissively agrees with a user's incorrect statements or preferences to appear cooperative, prioritizing sycophantic agreement over technical accuracy.§05
Section 5 — Parametric knowledge
5 entries compiledAttention Budgetverified
The finite mathematical capacity each token has to distribute influence across other context tokens, which dilutes as prompt length grows.Attention Relationshipverified
The mathematical connection computed between pairs of tokens inside the context window that represents how they influence and depend on each other.Contextual Knowledgeverified
The active facts, source code files, and logs loaded inside the model's context window that it can read directly at query time.Knowledge Cutoffverified
The calendar date past which a model has no pre-trained parametric knowledge of events, codebase changes, or library updates.Parametric Knowledgeverified
The frozen world facts and coding capability compiled directly into the model's parameters during training, which cannot be modified during inference.§06
Section 6 — Attention degradation
5 entries compiledAttention Degradationverified
The gradual decline in a model's constraint-following and reasoning performance as prompt context length increases, caused by attention budget dilution.Clearingverified
Ending the current conversation session and starting a fresh one with a completely empty context window to wipe out accumulated noise.Handoffverified
The process of transferring task progress, decisions, and next steps from a bloated chat session to a fresh one, preserving focus while resetting the context window.Primary Sourceverified
The original, raw source of truth (e.g. active code files, terminal test logs, database rows) rather than summaries or descriptions of them.Smart Zoneverified
The early phase of a session where the context window is small, keeping the model sharp, highly focused, and accurate.§07
Section 7 — Secondary source
5 entries compiledCompactionverified
An in-memory session reset where the active chat history is summarized by the model, throwing away the detailed transcript to free up context window space.Handoff Artifactverified
A persistent file written to the environment by an agent to record plans, status, and decisions, used to brief a fresh successor session.Secondary Sourceverified
A compiled, lossy description or summary of a primary source (e.g. readmes, design docs, compaction summaries) that trades detail for lower token costs.Specverified
A high-level handoff artifact (like a PRD or design doc) stored in the environment that defines a project's goals, constraints, and ticket checklist across multiple sessions.Ticketverified
A granular handoff artifact that scopes a single session of work, designed to be completed before the model drifts out of the smart zone.§08
Section 8 — Autocompact
5 entries compiledAGENTS.mdverified
A project brief file loaded by the harness at startup, detailing the project overview, folder layout, commands, and constraints for coding agents.Autocompactverified
Compaction triggered automatically by the client harness when context size crosses a threshold (often 80%), risking the quiet loss of task constraints.Context Pointerverified
A reference path or URL link in one document pointing to another, allowing the agent to load the detail only when the task requires it.Memory Systemverified
The client-side database or filesystem infrastructure that saves user preferences and project facts across sessions to simulate stateful continuity.Progressive Disclosureverified
The optimization technique of loading only the specific context required for the active task, hiding detailed files behind context pointers until needed.§09
Section 9 — Skills and Subagents
6 entries compiledAway From Keyboard (AFK)verified
A working pattern where the developer leaves the agent to run unattended, deferring all review to the end of the session.Automated Checkverified
A deterministic verification tool run locally (lints, typechecks, builds, test suites) that gives the agent binary pass/fail logs to self-correct from.Automated Reviewverified
The process where a secondary model (with a fresh context window) reviews the diff generated by a working agent to catch design flaws, security risks, or contract breaks.Human-in-the-loopverified
A working pattern where the developer actively monitors, redirects, and collaborates with the agent in real time, catching mistakes before they build up.Skillverified
A pre-packaged, teachable capability (instructions, scripts, templates) loaded into the context window dynamically using progressive disclosure.Subagentverified
A secondary agent spawned by a parent agent to execute a specific sub-task in a separate, isolated context window, returning a brief summary result.§10
Section 10 — Human and Vibe review
6 entries compiledDesign Conceptverified
The shared mental model of what is being built, held in common between developer and agent, separate from any physical file or code asset.Developer Experience (DX)verified
The quality of the interaction between a human developer and a codebase toolchain, characterized by fast feedback, clean documentation, and ease of work.Grillingverified
A planning technique where the agent Socratically interviews the developer, one decision at a time, to resolve ambiguities before committing to code or specs.Human Reviewverified
The final verification gate where a developer reads the primary code diff produced by an agent to judge its correctness, architecture, and safety.Prototypingverified
A development technique where the agent builds a quick, visual version of a feature, allowing you to react to a physical asset rather than discussing concepts in abstract text.Vibe Codingverified
A working pattern where the developer accepts the agent's code modifications blindly without conducting code diff reviews, judging progress strictly by runtime behavior.§11
Section 11 — Agent experience
1 entries compiled§99
embeddings-vectors
2 entries compiledEmbeddingsverified
High-dimensional coordinate lists (vectors) that represent the semantic meaning of text, images, or audio, placing related concepts close to each other in a continuous geometric space.Vector Databasesverified
Specialized database systems designed to store, index, and query high-dimensional vector embeddings rapidly using Approximate Nearest Neighbor (ANN) search algorithms.