// DICTIONARY_MANUAL

AI Engineering Dictionary

A structured reference index explaining emerging AI terms, architecture patterns, failure modes, and developer conventions. Use the homepage console for instant interactive search, or browse the sections below.

§02

Section 2 — Model provider request

14 entries compiled
Agentverified
An autonomous software system that embeds an LLM within a stateful execution loop, enabling it to call tools, interact with files, and iteratively accomplish complex goals.
Cache Tokensverified
The portion of input tokens that matched an active prefix cache, resulting in significantly reduced bills and near-instant processing.
Context Windowverified
The absolute maximum token limit (capacity) that a model can process, read, and write in a single API request.
Contextverified
The combined body of system instructions, conversation logs, files, and schemas injected into the model request to guide its behavior and provide facts.
Harnessverified
The client-side application code that drives the model, parses tool calls, maintains conversation logs, and executes command operations on the local machine.
Input Tokensverified
The numerical text fragments (prompts, system rules, history, and schemas) sent in a request to the model provider.
Model Provider Requestverified
The network API payload containing prompt messages, system templates, parameters, and tool definitions sent to a model provider.
Output Tokensverified
The numerical text fragments generated by the model in response to a request, billed at a premium rate and processed sequentially.
Prefix Cacheverified
An optimization system that stores pre-processed prompt segments in GPU memory, skipping repetitive calculations for identical context prefixes.
Sessionverified
The active span of a conversation thread, representing the sequence of user queries, tool calls, and model responses accumulated in memory.
Statefulverified
The operational design where a client application (harness) maintains a persistent record of messages, files, and variables across multiple stateless model queries.
Statelessverified
The architectural characteristic of AI models where each API request has no memory of prior queries, requiring the client to send the entire conversation history in every call.
System Promptverified
The root-level instruction block in an API request that establishes the model's role, constraints, formatting rules, and tool access boundaries.
Turnverified
A single request-response exchange in a session, composed of user input (and potential tool results) followed by the model's output.