OpenAI Academy Reference Guide

Codex for Builders Training

A business-oriented, graduate-level course for applying Codex responsibly across software delivery, review, automation, and team adoption.

Primary source: Codex for Builders, OpenAI Academy, Aug. 7, 2025; updated Jun. 2, 2026

Overview

This course translates the OpenAI Academy Codex for Builders material into an operating guide for business-oriented builders, product owners, technical program managers, analysts, and leaders who need to understand how Codex changes software work without becoming full-time developers.

The Academy resource describes Codex as an agentic software teammate for accelerating builder productivity. It identifies the main usage surfaces: Codex App, Codex CLI, Codex web, the Codex tab in ChatGPT iOS, the IDE extension, and GitHub integration. It also highlights common uses: understanding large codebases, drafting technical designs, debugging, planning migrations, implementing features, and automating advanced CI/CD workflows.

What You Will Be Able To Do

  • Explain Codex in business terms: what it does, where it fits, and when it should not be used without oversight.
  • Choose the right Codex surface for a task: app, CLI, IDE, web/cloud, iOS, or GitHub review.
  • Write strong prompts using goal, context, constraints, and done criteria.
  • Evaluate Codex output through tests, diffs, review evidence, and risk controls.
  • Use team guidance such as AGENTS.md to make behavior more consistent.
  • Design a practical adoption plan with training, governance, metrics, and escalation paths.

Course Structure

Each section includes a detailed lesson and a randomized multiple-choice exam. When you answer a question, the guide immediately shows why the correct answer is correct and why the other options are incorrect. The final exam draws from all sections and randomizes order each time it is opened.

Prerequisites

This training assumes college-graduate-level business judgment, comfort with digital tools, and enough technical literacy to understand repositories, change requests, testing, and software delivery risk. You do not need to be a professional software engineer, but you should be willing to read structured prompts, review code-adjacent evidence, and ask precise questions.

Access Needed

  • An OpenAI or ChatGPT plan that includes the Codex surfaces you intend to use.
  • Access to the relevant code repository, usually through GitHub for cloud tasks and pull request review.
  • Local development access if you will use the CLI or IDE extension.
  • Permission from your organization to use AI coding tools with the data, repositories, and systems involved.
  • A Cloudflare account if you want to deploy or maintain this training guide online.

Baseline Concepts

Repository
A structured folder containing source code, configuration, documentation, and change history.
Branch
A line of work where changes can be developed before merging into a main code line.
Pull request
A review process for proposing, discussing, testing, and merging changes.
Diff
The visible set of changes between two versions of files.
Test suite
Automated checks that help confirm the software still behaves as expected.
Sandbox
A boundary that limits what an agent can read, write, or access while performing work.
Approval policy
The rules for when Codex must ask before taking actions such as using the network or changing files outside the workspace.

How To Use This Course

  1. Read the Overview and Prerequisites first.
  2. Use the Start button in the voice panel when you want narration for the current lesson.
  3. Complete each section exam before moving to the next section.
  4. Use the final exam as a readiness check before applying Codex in a live business workflow.
  5. Keep a note of policy questions that arise, especially around repository access, confidential data, approval settings, and deployment authority.

Section 1: What Codex Is and Why It Matters

The Academy page positions Codex as an agentic software teammate. That phrase matters. A conventional chatbot answers questions. A coding agent can inspect files, reason about dependencies, edit code, run commands, validate changes, and report evidence. For a business user, the key shift is from asking for advice to delegating bounded technical work.

Codex is not a replacement for accountable engineering judgment. It is a productivity layer that can accelerate analysis, drafting, implementation, testing, refactoring, and review when the task is well framed. The stronger the business context and completion criteria, the more useful the output becomes.

Core Business Interpretation

  • Speed: Codex can reduce the time between a business request and a technical draft, prototype, patch, or review.
  • Quality: Codex can run checks and surface issues, but quality depends on verification instructions and human review.
  • Access to technical work: Semi-technical users can describe outcomes in natural language and collaborate with engineers through evidence.
  • Consistency: Reusable guidance, configuration, and repository instructions help Codex follow team standards.

Common Builder Use Cases

The Academy material lists several high-value uses: learning a new or large codebase, drafting technical designs and docs, debugging issues, planning migrations, implementing features, and using headless CLI workflows for CI/CD automation. For business teams, these map to faster discovery, better handoffs, clearer estimates, lower documentation debt, and more repeatable delivery processes.

What Codex Should Not Be Asked To Do Alone

  • Make production changes without review, testing, and release controls.
  • Handle regulated or confidential data without approved policies.
  • Bypass security, compliance, or procurement processes.
  • Convert vague business wishes into shipped features without product owner validation.

Section 2: Choosing the Right Codex Surface

The Academy page states that Codex is one unified product with clients for the places developers work. The practical question is not whether Codex can help, but where the task should be run.

Surface Selection Guide

| Surface | Best For | Business Consideration |
| --- | --- | --- |
| Codex App | Local planning, implementation, review, visual/frontend feedback, and longer interactive work. | Good for guided collaboration where a user wants to inspect progress. |
| Codex CLI | Terminal-first repository work, automation, repeatable commands, and advanced workflows. | Best when the user or team is comfortable with command-line tooling. |
| IDE Extension | Editor-attached coding where open files and selected text provide context. | Useful for engineers and technical analysts working inside a code editor. |
| Codex Web or Cloud | Parallel tasks, delegated work, GitHub-connected repositories, and remote execution. | Useful when work should run away from the local machine or from another device. |
| ChatGPT iOS Codex tab | Starting, approving, or following up on tasks from mobile. | Good for lightweight oversight, not deep technical review. |
| GitHub Integration | Pull request review, review comments, and follow-up fixes. | Strong for governed team workflows because evidence lives in the PR. |

Decision Rules

Use local surfaces when the work depends on local files, local tools, or close inspection. Use cloud/web when the repository is in GitHub and the work can be delegated in parallel. Use GitHub integration when the unit of work is a pull request and the desired outcome is review feedback or a targeted fix.

For business users, the most important operating question is: where will the evidence be easiest to review? If the answer is a pull request, GitHub may be the right surface. If the answer is a working local prototype, the app or IDE may be better. If the answer is a repeatable automation job, the CLI is usually strongest.
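
The decision rules above can be sketched as a small function. This is only an illustration: the Task fields, the order in which the rules are checked, and the fallback choice are assumptions made for this example, not part of the Academy material or of Codex itself.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description; every field name is illustrative."""
    is_pull_request: bool = False        # the unit of work is an existing pull request
    repeatable_automation: bool = False  # a scripted job that will run repeatedly
    needs_local_tools: bool = False      # depends on local files, tools, or close inspection
    repo_on_github: bool = False         # the repository is connected through GitHub
    parallelizable: bool = False         # the work can be delegated and run in parallel


def suggest_surface(task: Task) -> str:
    """Suggest a Codex surface by applying the decision rules in a fixed order."""
    if task.is_pull_request:
        return "GitHub integration"
    if task.repeatable_automation:
        return "Codex CLI"
    if task.needs_local_tools:
        return "Codex App or IDE extension"
    if task.repo_on_github and task.parallelizable:
        return "Codex web/cloud"
    # Assumed fallback: prefer a surface where progress is easy to inspect.
    return "Codex App or IDE extension"


print(suggest_surface(Task(repo_on_github=True, parallelizable=True)))
```

A real team would likely add its own rules, for example routing anything that touches sensitive repositories to a surface with stricter review.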

Section 3: Prompting and Delegation

Codex quality improves when prompts include enough context to reduce guessing. The Codex manual recommends a practical four-part default: goal, context, constraints, and done criteria. This is business-friendly because it mirrors how strong managers delegate work.

The Four-Part Prompt

  1. Goal: State the outcome. Example: "Create a customer export page that lets operations download filtered CSV files."
  2. Context: Point to the relevant repository folders, tickets, screenshots, errors, policies, or examples.
  3. Constraints: Name standards Codex must follow, such as no new dependencies, accessibility requirements, or security limits.
  4. Done when: Define proof, such as tests passing, a specific bug no longer reproducing, or a reviewable diff being ready.

When To Ask For A Plan First

For ambiguous, risky, or multi-step work, ask Codex to plan before implementing. A plan lets the user confirm scope, spot missing assumptions, and keep business priorities visible. This is especially important for migrations, workflow automation, compliance-sensitive changes, and tasks touching shared components.

Good Delegation Pattern

Goal: Add a simple training progress tracker to this static course.
Context: Work in index.html, styles.css, and app.js. Preserve the current tab layout.
Constraints: No backend. Use localStorage only. Keep the interface accessible.
Done when: Progress persists after refresh, exams still randomize, and no console errors appear.
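
Teams that reuse this pattern may want a small helper that refuses to produce a prompt when one of the four parts is blank. The sketch below is one possible approach; build_prompt is a hypothetical helper written for this guide, not a Codex API.

```python
def build_prompt(goal: str, context: str, constraints: str, done_when: str) -> str:
    """Assemble a four-part delegation prompt and reject any blank part."""
    parts = {
        "Goal": goal,
        "Context": context,
        "Constraints": constraints,
        "Done when": done_when,
    }
    # Collect the names of parts that are empty or whitespace only.
    missing = [name for name, text in parts.items() if not text.strip()]
    if missing:
        raise ValueError("Missing prompt parts: " + ", ".join(missing))
    return "\n".join(f"{name}: {text.strip()}" for name, text in parts.items())


prompt = build_prompt(
    goal="Add a simple training progress tracker to this static course.",
    context="Work in index.html, styles.css, and app.js. Preserve the current tab layout.",
    constraints="No backend. Use localStorage only. Keep the interface accessible.",
    done_when="Progress persists after refresh, exams still randomize, and no console errors appear.",
)
print(prompt)
```

The check is deliberately simple: it catches a missing part, but it cannot judge whether the goal or the done criteria are specific enough. That remains a human review step.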

Prompt Anti-Patterns

  • "Make it better" without defining what better means.
  • Asking for implementation before clarifying a business rule.
  • Omitting verification steps and then assuming the result is correct.
  • Combining unrelated tasks that should be reviewed separately.

Section 4: Codebase Work, Testing, and Review

Codex can help understand large codebases, debug issues, implement changes, write tests, and review diffs. The business value comes from compressing the cycle between question, investigation, change, and evidence.

Evidence-Oriented Workflow

  1. Ask Codex to inspect the relevant files and summarize the current behavior.
  2. Ask for a plan if the change has risk or ambiguity.
  3. Let Codex implement a narrow change.
  4. Require verification: tests, linting, type checks, screenshots, or reproduction steps.
  5. Review the diff and ask Codex to explain tradeoffs, risks, and residual gaps.

What Business Reviewers Should Look For

  • Does the output match the business requirement, not just the technical task?
  • Did Codex touch only the expected files or areas?
  • Are test results included, and are they relevant?
  • Are assumptions explicitly stated?
  • Is there a rollback or mitigation plan for higher-risk work?
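
The second question in the checklist, whether Codex touched only the expected files, can be partly automated. The sketch below assumes the reviewer already has a list of changed paths (for example from git diff --name-only) and a list of allowed glob patterns; the function name and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_files: list[str], allowed_patterns: list[str]) -> list[str]:
    """Return the changed paths that match none of the allowed glob patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Illustrative inputs only; a real list would come from the diff under review.
changed = ["index.html", "app.js", "deploy/config.yml"]
allowed = ["index.html", "styles.css", "app.js"]

print(out_of_scope(changed, allowed))  # ['deploy/config.yml']
```

An empty result does not prove the change is correct. It only confirms that the edit stayed inside the agreed area, which narrows what the reviewer has to read.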

GitHub Review

The Codex manual describes GitHub code review as a high-signal review pass on pull request diffs. Codex can be triggered with @codex review, can follow review guidance in AGENTS.md, and can be asked to fix a flagged issue when permissions allow. This is useful for teams because review comments and fixes remain attached to the PR.

Section 5: Security, Approvals, and Governance

Codex security is built around boundaries. The Codex manual explains two central controls: sandbox mode, which defines what Codex can technically access, and approval policy, which defines when Codex must ask before acting. Business leaders should understand these controls because they determine the risk profile of agentic work.

Sandbox Mode

A sandbox limits where Codex can write and whether it can reach the network. Local CLI and IDE defaults generally keep network access off and limit writes to the active workspace. Cloud tasks run in isolated managed environments. These constraints reduce the chance that a task affects unrelated files, systems, or data.

Approval Policy

Approval policy controls when the agent pauses for permission. A common pattern is workspace write with on-request approvals: Codex can work inside the project but asks before crossing important boundaries, such as using the network or writing outside the workspace. This keeps routine work fast while preserving oversight.
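
To make the interaction between a workspace boundary and an approval rule concrete, here is a deliberately simplified model. It is not how Codex implements sandboxing or approvals; the action labels, the function name, and the rule that reads never ask are assumptions chosen for illustration.

```python
import posixpath


def needs_approval(action: str, target: str, workspace: str) -> bool:
    """Simplified model: pause for approval when an action crosses a boundary.

    action: "read", "write", or "network" (hypothetical labels).
    target: absolute POSIX path for read/write; ignored for "network".
    workspace: absolute POSIX path of the directory the agent may write to.
    """
    if action == "network":
        return True  # in this model, any network use asks first
    if action == "write":
        # Normalize first so "/repo/../etc/hosts" is not mistaken for a workspace path.
        path = posixpath.normpath(target)
        root = posixpath.normpath(workspace)
        inside = path == root or path.startswith(root.rstrip("/") + "/")
        return not inside  # writes outside the workspace ask first
    return False  # reads do not ask in this simplified model


print(needs_approval("write", "/repo/src/app.js", "/repo"))  # False
print(needs_approval("write", "/etc/hosts", "/repo"))        # True
```

The point for business readers is the shape of the rule: work inside the boundary proceeds, and anything that crosses it produces a pause that a named person must resolve.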

Governance Checklist

  • Classify repositories by sensitivity before enabling Codex workflows.
  • Define which data may be used in prompts, screenshots, logs, or attachments.
  • Keep network access scoped and intentional.
  • Require human review for production-impacting changes.
  • Document who may approve escalations, new dependencies, releases, or deployment changes.
  • Review outputs for prompt injection risk when Codex uses web or external content.

Business Principle

Do not measure Codex maturity by how much autonomy it has. Measure maturity by how reliably the organization can delegate, verify, approve, and audit work at the right risk level.

Section 6: Team Customization with Instructions, Config, Skills, and MCP

Codex becomes more reliable when repeated expectations are encoded. The Codex manual recommends using AGENTS.md for durable repository guidance, configuration for consistent behavior, MCP for external systems, skills for reusable workflows, and automation for stable repeated tasks.

AGENTS.md

AGENTS.md is a repository instruction file for Codex. It can describe project layout, build commands, test commands, engineering conventions, PR expectations, constraints, do-not rules, and what "done" means. Guidance can exist globally, at the repository root, and in subdirectories. More specific guidance closer to the current work takes precedence.
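
The layering described above can be pictured as an ordered list of candidate files, from most general to most specific. The sketch below only computes that list from two paths. The function name is hypothetical, the global location ~/.codex is an assumption, and the code does not reproduce how Codex actually discovers or merges guidance.

```python
from pathlib import PurePosixPath


def guidance_chain(repo_root: str, work_dir: str, global_dir: str = "~/.codex") -> list[str]:
    """List candidate AGENTS.md locations from most general to most specific."""
    root = PurePosixPath(repo_root)
    work = PurePosixPath(work_dir)
    # relative_to raises ValueError if work_dir is not inside repo_root.
    relative = work.relative_to(root)

    chain = [str(PurePosixPath(global_dir) / "AGENTS.md"), str(root / "AGENTS.md")]
    current = root
    for part in relative.parts:
        current = current / part
        chain.append(str(current / "AGENTS.md"))
    return chain


for candidate in guidance_chain("/repo", "/repo/services/billing"):
    print(candidate)
```

Under the precedence rule stated in this section, entries later in the list would win when guidance conflicts, because they sit closer to the current work.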

Configuration

Configuration can set defaults such as model choice, reasoning effort, sandbox mode, approval policy, profiles, and MCP servers. Business users do not need to memorize every setting, but they should know that consistent configuration reduces inconsistent agent behavior across teams.

MCP and Skills

The Model Context Protocol (MCP) connects Codex to external tools or data sources when authorized. Skills package reusable workflows and instructions, such as a security review process or a document generation process. Adopt either one only when the team has a repeatable need and clear ownership.

Automation

Automate stable workflows only after the team has proven the manual version works. Good candidates include recurring review checks, report generation, documentation updates, and narrow CI/CD support. Poor candidates include ambiguous product decisions, unsupervised sensitive data handling, or broad production changes.

Section 7: Business Adoption, Metrics, and Change Management

Adopting Codex is not just a tooling rollout. It changes how work is described, delegated, reviewed, and measured. The successful pattern is to start with bounded workflows, gather evidence, train users, and expand based on results.

Pilot Workflow

  1. Select two or three low-to-medium-risk workflows, such as documentation updates, test creation, codebase explanation, or PR review support.
  2. Define the expected inputs, prompts, review steps, and completion evidence.
  3. Run a short pilot with a small group of builders and reviewers.
  4. Capture before-and-after metrics: cycle time, review quality, escaped defects, rework, and user satisfaction.
  5. Update guidance, prompts, and AGENTS.md based on repeated friction.

Useful Metrics

  • Time from request to first reviewable artifact.
  • Percentage of Codex changes with relevant verification evidence.
  • Review comments resolved without engineering rework.
  • Number of repeated mistakes converted into durable instructions.
  • Adoption by role and workflow, not just total usage.
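
Several of these metrics can be computed from a simple log of pilot tasks. The sketch below covers three of them. The record fields and the sample values are invented for illustration and are not results from any real pilot.

```python
from collections import Counter
from statistics import median

# Hypothetical pilot log; field names and values are illustrative only.
tasks = [
    {"hours_to_first_artifact": 4.0, "has_verification": True, "role": "analyst"},
    {"hours_to_first_artifact": 9.5, "has_verification": False, "role": "engineer"},
    {"hours_to_first_artifact": 2.5, "has_verification": True, "role": "engineer"},
    {"hours_to_first_artifact": 6.0, "has_verification": True, "role": "product owner"},
]


def pilot_summary(records: list[dict]) -> dict:
    """Summarize time to first artifact, verification rate, and adoption by role."""
    if not records:
        raise ValueError("No pilot records to summarize")
    hours = [r["hours_to_first_artifact"] for r in records]
    verified = sum(1 for r in records if r["has_verification"])
    return {
        "median_hours_to_first_artifact": median(hours),
        "verification_rate": verified / len(records),
        "tasks_by_role": dict(Counter(r["role"] for r in records)),
    }


summary = pilot_summary(tasks)
print(summary["median_hours_to_first_artifact"])  # 5.0
print(summary["verification_rate"])               # 0.75
```

The median is used here instead of the mean so that one unusually slow task does not dominate a small pilot sample; that choice is a judgment call, not a requirement.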

Training Emphasis

Teach people to delegate clearly, inspect evidence, and escalate uncertainty. Do not train users to blindly trust generated code or to treat Codex as a shortcut around existing accountability.

Section 8: Capstone Operating Playbook

This section turns the course into an operating model. The goal is to give a semi-technical business user a repeatable way to request, supervise, and evaluate Codex-assisted work.

The Codex Work Order

Before starting a meaningful task, write a short work order:

Business outcome:
User or stakeholder:
Relevant repository or files:
Known constraints:
Security or data sensitivity:
Preferred Codex surface:
Verification required:
Definition of done:
Human approver:
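
If work orders are stored in a tracker or spreadsheet export, a structured version makes incomplete requests easy to spot. The sketch below mirrors the template fields above. The class and method names are hypothetical, and a blank-field check is only a first filter before human review.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkOrder:
    """Structured form of the work order template; all fields default to blank."""
    business_outcome: str = ""
    stakeholder: str = ""
    repository_or_files: str = ""
    known_constraints: str = ""
    data_sensitivity: str = ""
    preferred_surface: str = ""
    verification_required: str = ""
    definition_of_done: str = ""
    human_approver: str = ""

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


draft = WorkOrder(
    business_outcome="Operations can download filtered CSV exports.",
    preferred_surface="Codex web/cloud",
)
print(draft.missing_fields())
```

A draft like this one would be sent back to its author before any task starts, since it names no approver, no verification steps, and no definition of done.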

Readiness Checklist

  • The task has a clear owner and business outcome.
  • The right repository, files, screenshots, or error logs are available.
  • The task is scoped small enough to review.
  • Security and data sensitivity are understood.
  • Testing or validation steps are known.
  • The reviewer knows what evidence to inspect.

After-Action Review

After each meaningful Codex-assisted task, ask: What did Codex do well? What did it misunderstand? Which instruction should become durable? Which test or review step caught the most risk? This creates a feedback loop that improves the system over time.

Final Exam

The final exam covers all sections. It randomizes question order and answer order each time this tab is opened. A score of 80% or higher indicates readiness to participate in a governed Codex pilot, assuming your organization has approved access, data, and security policies.