AI Glossary (WIP)

This is a short list of common AI terms and their definitions, compiled as I was becoming familiar with the basics.

Agent

AI system that pursues a goal autonomously, planning and completing complex, multi-step tasks. An agent is a model running inside a harness: it receives an end goal, figures out how to tackle the job, then works toward it step by step. In the observe-think-act loop, an agent can call tools and feed the results back in, so context builds across the loop rather than resetting each step.
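
As a rough illustration of the observe-think-act loop, the sketch below wires a scripted stand-in for the model to one toy tool. The names `fake_model`, `lookup_population`, and `run_agent`, along with the data, are hypothetical placeholders rather than any real API; a real agent would call an actual model where `fake_model` sits.

```python
# Hypothetical sketch: a scripted stand-in plays the model so the loop runs offline.

def lookup_population(city: str) -> int:
    """Toy tool with hard-coded, made-up data."""
    return {"Springfield": 1000, "Shelbyville": 800}[city]

def fake_model(context: list[str]) -> dict:
    """Stand-in for a model: picks the next action from the context so far."""
    seen = [line for line in context if line.startswith("result:")]
    if len(seen) == 0:
        return {"action": "tool", "city": "Springfield"}
    if len(seen) == 1:
        return {"action": "tool", "city": "Shelbyville"}
    total = sum(int(line.split(":")[1]) for line in seen)
    return {"action": "finish", "answer": total}

def run_agent(goal: str, max_steps: int = 5):
    context = [f"goal: {goal}"]                        # context builds across steps
    for _ in range(max_steps):
        decision = fake_model(context)                 # think
        if decision["action"] == "finish":
            return decision["answer"], context
        result = lookup_population(decision["city"])   # act
        context.append(f"result:{result}")             # observe and feed back in
    return None, context                               # stopped by the step limit

answer, context = run_agent("Total population of both towns")
print(answer)  # 1800
```

The point to notice is that `context` is never reset inside the loop: each tool result is appended, so the next decision can build on it.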

AI Safety

Field focused on making AI systems beneficial and safe to use.

Artificial Intelligence (AI)

Development of computer systems that perform tasks typically requiring human intelligence. AI systems learn from data to make decisions, recognize patterns, and solve problems, instead of relying only on explicit, pre-programmed instructions.

Bias

Systematic skew or unfairness in a model's output, often inherited from skewed training data.

Budget

Cap on how much an agent can spend before it has to stop, measured in tokens, steps, tool calls, time, or dollars. Budgets keep a loop from running away or burning resources, and they give an orchestrator a backstop for handing work to subagents.
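
One possible way to picture a budget is a small counter that every step must charge before it runs. This is only a sketch: the `Budget` class, the `BudgetExceeded` exception, and the numbers are made up for illustration, and real harnesses track budgets in their own ways.

```python
from dataclasses import dataclass

class BudgetExceeded(Exception):
    """Raised when a step would push the agent past one of its caps."""

@dataclass
class Budget:
    max_steps: int
    max_tokens: int
    steps: int = 0
    tokens: int = 0

    def charge(self, tokens: int) -> None:
        """Record one step and its token cost; raise once a cap is crossed."""
        self.steps += 1
        self.tokens += tokens
        if self.steps > self.max_steps or self.tokens > self.max_tokens:
            raise BudgetExceeded(f"steps={self.steps}, tokens={self.tokens}")

def run_until_budget(budget: Budget, tokens_per_step: int) -> int:
    """A loop that would never stop by itself; the budget ends it."""
    completed = 0
    try:
        while True:
            budget.charge(tokens_per_step)
            completed += 1
    except BudgetExceeded:
        return completed

completed = run_until_budget(Budget(max_steps=10, max_tokens=500), tokens_per_step=120)
print(completed)  # 4
```

Here the token cap trips first: four steps cost 480 tokens, and the fifth would reach 600.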

Computer Vision

AI interpreting images and video.

Context

The AI model’s working memory: the information it can access and use when generating a response. Context resets with each conversation; it is not saved, and it is not training. The context window is the maximum amount of information a model can consider at once, including the prompt, conversation history, and response. Models tend to attend best to content at the beginning of the context (primacy bias) and at the end (recency bias); content in the middle can get overlooked.

Context Engineering

Managing everything (prompt wording, tools, memory, retrieved data, and history) in the model's context window, so the model has what it needs and isn't buried in what it doesn't. The successor to prompt engineering once agents enter the picture, since an agent's context shifts every step.

  • Selection - Choosing what to put in the window: the right tools, the relevant documents, the memory that matters for this step, and leaving the rest out.

  • Retrieval - Pulling in outside information on demand rather than holding it all in the window at once.

  • Compaction - Summarizing or trimming history as it grows so the window doesn't fill with stale turns.

  • Isolation - Giving each agent or subagent only the context for its own task, instead of one shared pile.
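
To make compaction a little more concrete, here is a minimal sketch that keeps the newest turns that fit a size limit and replaces the rest with a one-line note. It assumes word count as a crude stand-in for tokens, and the `compact` function is a hypothetical helper; a real system would more likely summarize the dropped turns with a model.

```python
def compact(history: list[str], max_words: int) -> list[str]:
    """Keep the newest turns that fit; replace older ones with a short note."""
    kept: list[str] = []
    used = 0
    for turn in reversed(history):            # walk from newest to oldest
        words = len(turn.split())
        if used + words > max_words:
            break
        kept.append(turn)
        used += words
    dropped = len(history) - len(kept)
    kept.reverse()                            # restore chronological order
    if dropped:
        kept.insert(0, f"[{dropped} earlier turns omitted]")
    return kept

history = [
    "user: hello there",
    "assistant: hi, how can I help",
    "user: summarize the report",
    "assistant: here is the summary",
]
window = compact(history, max_words=10)
print(window)
```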

Eval

Test that measures whether a model's or agent's output is actually good, scored against set criteria. Evals tell you whether a change helped or hurt by running the same cases before and after and comparing the results. The judge can be a person, a rule, or another model grading the output.
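
A before-and-after eval can be sketched in a few lines. Everything here is illustrative: `before` and `after` are hypothetical stand-ins for two versions of a system, and `rule_judge` is the simplest kind of rule-based judge (a substring check), not a recommended grading method.

```python
def rule_judge(output: str, expected: str) -> bool:
    """Rule-based judge: pass if the expected answer appears in the output."""
    return expected.lower() in output.lower()

def run_eval(system, cases: list[tuple[str, str]]) -> float:
    """Return the fraction of cases the system passes."""
    passed = sum(rule_judge(system(prompt), expected) for prompt, expected in cases)
    return passed / len(cases)

# Hypothetical "before" and "after" versions of the system under test.
def before(prompt: str) -> str:
    return {"capital of France?": "Paris"}.get(prompt, "I don't know")

def after(prompt: str) -> str:
    answers = {"capital of France?": "Paris", "2 + 2?": "The answer is 4"}
    return answers.get(prompt, "I don't know")

cases = [("capital of France?", "paris"), ("2 + 2?", "4"), ("largest planet?", "jupiter")]
score_before = run_eval(before, cases)
score_after = run_eval(after, cases)
print(score_before < score_after)  # True
```

Because both versions run on the same cases, the difference in score can be attributed to the change rather than to the test set.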

General AI (AGI)

Not-yet-achieved AI system with human-level reasoning across any domain.

Guardrail

Constraint that limits what an agent is allowed to do or say. Boundary around an agent's autonomy: the loop decides freely, the guardrails decide what it isn't allowed to choose.
Examples: block unsafe actions, off-limits topics, or outputs that break a required format.
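
A guardrail can be as simple as a check that sits between what the agent proposes and what actually runs. The sketch below assumes made-up action names and topics, and the helper names (`action_allowed`, `text_allowed`, `guarded_run`) are hypothetical.

```python
BLOCKED_ACTIONS = {"delete_database", "send_payment"}        # hypothetical action names
OFF_LIMITS_TOPICS = {"medical diagnosis", "legal advice"}    # hypothetical topics

def action_allowed(action: str) -> bool:
    """An action passes unless it is on the blocked list."""
    return action not in BLOCKED_ACTIONS

def text_allowed(text: str) -> bool:
    """Text passes unless it mentions an off-limits topic."""
    lowered = text.lower()
    return not any(topic in lowered for topic in OFF_LIMITS_TOPICS)

def guarded_run(action: str, run) -> str:
    """The agent proposes freely; the guardrail decides whether it runs."""
    if not action_allowed(action):
        return f"refused: {action}"
    return run()

print(guarded_run("search_web", lambda: "10 results"))      # 10 results
print(guarded_run("delete_database", lambda: "deleted"))    # refused: delete_database
print(text_allowed("Here is some legal advice for you"))    # False
```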

Hallucination

When a model confidently generates false information.

Harness

Software scaffolding around a model that turns it into a working application. The model only generates text; the harness handles the system prompt, context window, output parsing, and tool calls. When the harness runs an observe-think-act loop, deciding what to send the model next and what to do with each response, the result is an agent.

Inference

Running a trained model to generate output from new input, such as producing a response in a chat.

Large Language Model (LLM)

AI trained on massive text datasets.

LLMs are a type of ML. ML is a type of AI.

Latency

How long the model takes to respond.

Loop

Repeated cycle in which a system observes a result, decides what to do next, acts, then feeds the outcome back in and starts again.

  • Agentic loop - Observe-think-act: the agent checks its own work, reacts, and refines until it reaches the goal or hits a stopping condition like a step limit or budget.

  • Reflection loop - Agent reviews its own output, spots problems, and revises; stops when the work is good enough or after a set number of passes.

  • Evaluation loop - Agent's output is scored against criteria, and the score drives the next attempt; stops when the output clears the bar or after a set number of tries.

  • Tool-use loop - Reason-act-observe: agent calls a tool, reads the result, then calls another or finishes when it has what it needs.

  • Human-in-the-loop - Person reviews, approves, or corrects the agent's work at set checkpoints; stops or continues based on their call.

  • Feedback loop - Any cycle where the result is measured and fed back to improve the next attempt; may run indefinitely unless given a target.

  • Training loop - Model predicts, gets feedback, and adjusts, repeated until it stops improving or a set number of passes is reached.
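
The loops above share one shape: try, measure, feed the result back in, and stop on a condition. As one example, an evaluation loop might look like the sketch below. The scoring criteria and the `revise` step are hypothetical stand-ins; in practice the revision would come from a model.

```python
REQUIRED = ["budget", "timeline", "owner"]    # hypothetical scoring criteria

def score(draft: str) -> float:
    """Fraction of required terms the draft mentions."""
    return sum(term in draft for term in REQUIRED) / len(REQUIRED)

def revise(draft: str) -> str:
    """Stand-in for the agent's next attempt: add the first missing term."""
    for term in REQUIRED:
        if term not in draft:
            return f"{draft} {term}"
    return draft

def evaluation_loop(draft: str, bar: float = 1.0, max_tries: int = 5):
    """Revise until the score clears the bar or the tries run out."""
    tries = 0
    while score(draft) < bar and tries < max_tries:
        draft = revise(draft)
        tries += 1
    return draft, tries

final, tries = evaluation_loop("plan with budget")
print(final)  # plan with budget timeline owner
print(tries)  # 2
```

Both stopping conditions from the definition appear in the `while` test: the bar and the try limit.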

Machine Learning (ML)

Systems that learn patterns from data. Deep learning is a subset of ML using layered neural networks inspired by the brain.

LLMs are a type of ML. ML is a type of AI.

Memory

State an agent keeps beyond a single response, so it can carry information forward. Short-term memory is what's available within the current session, usually held in the context window. Long-term memory is persisted outside the context window in a file or database, and pulled back in when relevant, so an agent can recall things across sessions.
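
The split between short-term and long-term memory can be sketched with a list and a file. This assumes a JSON file as the store; the `LongTermMemory` class is a hypothetical illustration, and real systems often use a database or vector store instead.

```python
import json
import tempfile
from pathlib import Path

class LongTermMemory:
    """Facts persisted to a JSON file so they survive across sessions."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> dict:
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        self.path.write_text(json.dumps(data))

    def recall(self, key: str):
        return self._load().get(key)

store = Path(tempfile.mkdtemp()) / "memory.json"

# Session 1: short-term memory is a list that lives only as long as the session.
session_one = ["user: my favorite color is green"]
LongTermMemory(store).remember("favorite_color", "green")
del session_one                      # session ends; short-term memory is gone

# Session 2: a fresh object reads the fact back from disk.
recalled = LongTermMemory(store).recall("favorite_color")
print(recalled)  # green
```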

Model

Massive file of numbers (weights or parameters) that work together mathematically to generate text responses. The model has no memory, personality, or tools. It generates answers token by token based on probability.
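
To show what "token by token based on probability" means, here is a toy sketch. The probability table is invented, and it looks only at the previous token; a real model computes these probabilities from billions of weights and the whole context, and usually samples rather than always taking the top choice.

```python
# Toy "weights": for each token, the probability of each possible next token.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<end>": 0.3},
    "dog": {"sat": 0.2, "<end>": 0.8},
    "sat": {"<end>": 1.0},
}

def generate(max_tokens: int = 10) -> list[str]:
    """Greedy decoding: always pick the most probable next token."""
    tokens, current = [], "<start>"
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[current]
        current = max(probs, key=probs.get)   # highest-probability next token
        if current == "<end>":
            break
        tokens.append(current)
    return tokens

print(generate())  # ['the', 'cat', 'sat']
```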

Model Context Protocol (MCP)

Open standard that lets AI models connect with external tools, data sources, and services, making tools plug-and-play. An MCP server exposes tools (model-controlled actions), resources (app-controlled context like files or DB rows), and prompts (user-invoked templates). By the time an MCP server’s capabilities reach the model, they have been flattened into ordinary tools to call.
Examples: search the web, access files, query databases, interact with apps like Slack or Gmail.

Multi-Agent

Systems where multiple AI agents collaborate on a task. Orchestration is the coordination of those agents or steps to complete complex tasks.

Natural Language Processing (NLP)

AI understanding and generating human language.

Orchestration

Coordinating multiple agents or steps so they work together toward one goal. The orchestrator is the layer that decides what runs and in what order, and passes results between the pieces. In a multi-agent setup, the orchestrator is usually itself an agent directing the others.

Persona

Defined character, identity, or personality assigned to a model to shape its tone, voice, and style of response. Overlaps with role prompting, but focuses on personality and communication style rather than expertise or task focus.

Prompt Engineering

Crafting inputs to get better AI outputs.

  • Standard prompting - Asking a direct question or giving a direct instruction in a short prompt.

  • Zero-shot prompting - Giving the model a specific, low-complexity instruction without examples.

  • One-shot prompting - Including one example in the prompt that shows the format or style of the desired output. Shot means example, not attempt; it’s distinct from the slang "I one-shotted it" meaning solved in one try.

  • Few-shot prompting - Including a few diverse examples for complex cases, with details such as a description, technical requirements, implementation and integration notes, and the expected deliverable with its data structure.

  • Role prompting - Assigning an identity or expertise for the tone or focus of the output.

  • Chain of thought prompting - Asking the model to break down its reasoning into intermediate steps before giving a final answer (such as including “let’s think step by step” in a zero-shot prompt). Reasoning models now do this on their own, so the move is often to set how much they reason rather than to ask for it in the prompt.

  • Emotional prompting - Including emotional language or stakes to influence the model's response.

  • Prompt chaining - Breaking a complex task into separate prompts where each output feeds into the next.

  • Negative prompting - Specifying what you don’t want in the output.

  • Meta-prompting - Having a model write or improve prompts for itself or another model.
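
The difference between zero-shot, one-shot, and few-shot prompting comes down to how many examples go into the prompt. The sketch below builds all three from one hypothetical helper, `build_prompt`; the "Input:/Output:" layout is just one common convention, not a required format.

```python
def build_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Zero-shot with no examples, one-shot with one, few-shot with several."""
    parts = [instruction]
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    parts.append(f"Input: {query}\nOutput:")      # left open for the model to complete
    return "\n\n".join(parts)

instruction = "Classify the sentiment as positive or negative."
examples = [("I loved it", "positive"), ("Total waste of time", "negative")]

zero_shot = build_prompt(instruction, [], "Pretty good overall")
one_shot = build_prompt(instruction, examples[:1], "Pretty good overall")
few_shot = build_prompt(instruction, examples, "Pretty good overall")
print(few_shot)
```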

Retrieval Augmented Generation (RAG)

Giving an LLM the ability to look up relevant information from an external source before generating a response. This lets the model stay up to date without retraining.
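
The retrieve-then-generate idea can be sketched without any model at all. This example assumes a crude word-overlap ranking and made-up documents; the `retrieve` and `build_rag_prompt` helpers are hypothetical, and real RAG systems typically rank with embeddings and a vector index.

```python
DOCUMENTS = [
    "The office is closed on public holidays.",
    "Expense reports are due on the fifth of each month.",
    "The cafeteria serves lunch from noon to two.",
]

def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().replace("?", "").split())

    def overlap(doc: str) -> int:
        return len(q_words & set(doc.lower().replace(".", "").split()))

    return sorted(documents, key=overlap, reverse=True)[:k]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    """Put the retrieved text in front of the question before generation."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_rag_prompt("When are expense reports due?", DOCUMENTS)
print(prompt)
```

Updating the answer only requires editing `DOCUMENTS`; nothing about the model changes, which is the "without retraining" part.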

Skill

Modular set of instructions or best practices that improve AI output for a specific task. A skill extends what the model knows how to do; it's like a playbook or recipe that the model reads before starting a task. A skill can instruct the model to call tools, including tools that came from an MCP server.
Examples: write text documents, work with PDFs, create UI components, write tests, review pull requests.

Structured Output

Model returns its response in a specific, predictable shape that matches a provided template, schema, or data format, rather than free-form text.
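
On the receiving side, structured output is what lets code consume a model's response directly. The sketch below assumes the response arrives as a JSON string and checks it against a hypothetical two-field schema; `parse_structured` is an illustrative helper, not a standard validator.

```python
import json

SCHEMA = {"name": str, "age": int}    # hypothetical expected shape

def parse_structured(raw: str, schema: dict) -> dict:
    """Parse JSON and check that each required field has the expected type."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in schema.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} missing or not {expected_type.__name__}")
    return data

good = parse_structured('{"name": "Ada", "age": 36}', SCHEMA)
print(good["age"] + 1)  # 37

try:
    parse_structured('{"name": "Ada", "age": "thirty-six"}', SCHEMA)
except ValueError as err:
    print(err)  # field 'age' missing or not int
```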

Subagent

Agent spawned by another agent to handle a scoped piece of work and report its result back. Splitting a job across subagents keeps each one focused on a narrow task with its own context, instead of one agent juggling everything. The agent that spawns them coordinates the results.
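
The hand-off pattern can be sketched with plain functions. Both `subagent` and `orchestrator` are hypothetical stand-ins (a real subagent would run its own model loop); the part worth noticing is that each subagent receives only its own slice of the work.

```python
def subagent(task: str, context: str) -> str:
    """Stand-in subagent: sees only the context it was handed."""
    return f"{task}: {len(context.split())} words"

def orchestrator(sections: dict[str, str]) -> list[str]:
    """Spawn one subagent per section, then collect the reports."""
    reports = []
    for title, text in sections.items():
        reports.append(subagent(f"count {title}", text))   # isolated context per task
    return reports

sections = {"intro": "a short opening", "body": "the main argument goes here"}
reports = orchestrator(sections)
print(reports)  # ['count intro: 3 words', 'count body: 5 words']
```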

System Prompt

Set of instructions given to the model before the user conversation begins that defines its role, behavior, and guidelines for how it should respond. In consumer AI products, the provider writes the system prompt; if you're building an application with an API, you write it yourself.

Token

Unit of text a model processes, also used to measure usage and cost. One token is about ¾ of a word. Input tokens (text given to the model) usually cost less than output tokens (text generated by the model).
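
The ¾-of-a-word rule of thumb makes for quick back-of-envelope estimates. In the sketch below the prices are hypothetical placeholders, not any provider's real rates, and actual token counts depend on the tokenizer.

```python
import math

WORDS_PER_TOKEN = 0.75               # rule of thumb: one token is about 3/4 of a word
PRICE_PER_MILLION_INPUT = 3.00       # hypothetical dollars per 1M input tokens
PRICE_PER_MILLION_OUTPUT = 15.00     # hypothetical dollars per 1M output tokens

def estimate_tokens(word_count: int) -> int:
    """Convert a word count to an approximate token count."""
    return math.ceil(word_count / WORDS_PER_TOKEN)

def estimate_cost(input_words: int, output_words: int) -> float:
    """Approximate dollar cost of one request under the assumed prices."""
    input_tokens = estimate_tokens(input_words)
    output_tokens = estimate_tokens(output_words)
    return (input_tokens * PRICE_PER_MILLION_INPUT
            + output_tokens * PRICE_PER_MILLION_OUTPUT) / 1_000_000

print(estimate_tokens(750))        # 1000
print(estimate_cost(750, 1500))    # about 0.033
```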

Tool

A function that extends what the model can do; it's a new verb. Where a skill is consumed by being read, a tool is consumed by being called.

Training

Process by which an AI model learns from data to develop its capabilities. Billions of times over, the model takes in data, makes a prediction, and uses feedback to make better predictions. People then fine-tune the model with curated data and use Reinforcement Learning from Human Feedback (RLHF) to steer it toward preferred responses.
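
The predict, get feedback, adjust cycle can be shown at toy scale. This sketch assumes a "model" with a single weight and made-up data following y = 2x, trained with plain gradient descent; real training applies the same idea to billions of weights.

```python
# Toy data generated by the rule y = 2x; the "model" is a single weight w.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def train(epochs: int = 200, learning_rate: float = 0.01) -> float:
    w = 0.0                                   # start knowing nothing
    for _ in range(epochs):
        for x, y in data:
            prediction = w * x                # predict
            error = prediction - y            # feedback: how wrong was it?
            w -= learning_rate * error * x    # adjust the weight to shrink the error
    return w

w = train()
print(round(w, 3))  # 2.0
```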

Workflow

System where the steps are fixed in advance by a person, and the model fills in each one, following a set path every time. A workflow is predictable, wired up ahead of the task like "summarize this, then translate it, then email it"; an agent is more flexible, figuring out its own steps from a goal like "get this translated and sent".

*AI was used for research

Amanda Hinton