OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

Madisony
Last updated: November 19, 2025 7:56 pm

Contents
  • Performance Benchmarks: Incremental Gains Across Key Tasks
  • Technical Architecture: Long-Horizon Reasoning via Compaction
  • Platform Integration and Use Cases
  • Cybersecurity and Safety Constraints
  • Deployment Context and Developer Usage
  • Outlook

OpenAI has introduced GPT‑5.1-Codex-Max, a new frontier agentic coding model now available in its Codex developer environment. The release marks a significant step forward in AI-assisted software engineering, offering improved long-horizon reasoning, efficiency, and real-time interactive capabilities. GPT‑5.1-Codex-Max will now replace GPT‑5.1-Codex as the default model across Codex-integrated surfaces.

The new model is designed to serve as a persistent, high-context software development agent, capable of managing complex refactors, debugging workflows, and project-scale tasks across multiple context windows.

It comes on the heels of Google releasing its powerful new Gemini 3 Pro model yesterday, yet still outperforms or matches it on key coding benchmarks:

On SWE-Bench Verified, GPT‑5.1-Codex-Max achieved 77.9% accuracy at extra-high reasoning effort, edging past Gemini 3 Pro’s 76.2%.

It also led on Terminal-Bench 2.0, with 58.1% accuracy versus Gemini’s 54.2%, and matched Gemini’s score of 2,439 on LiveCodeBench Pro, a competitive coding Elo benchmark.

When measured against Gemini 3 Pro’s most advanced configuration, its Deep Thinking model, Codex-Max also holds a slight edge in agentic coding benchmarks.

Performance Benchmarks: Incremental Gains Across Key Tasks

GPT‑5.1-Codex-Max demonstrates measurable improvements over GPT‑5.1-Codex across a range of standard software engineering benchmarks.

On SWE-Lancer IC SWE, it achieved 79.9% accuracy, a significant increase from GPT‑5.1-Codex’s 66.3%. In SWE-Bench Verified (n=500), it reached 77.9% accuracy at extra-high reasoning effort, outperforming GPT‑5.1-Codex’s 73.7%.

Performance on Terminal Bench 2.0 (n=89) showed more modest improvements, with GPT‑5.1-Codex-Max achieving 58.1% accuracy compared to 52.8% for GPT‑5.1-Codex.

All evaluations were run with compaction and extra-high reasoning effort enabled.

These results indicate that the new model offers a higher ceiling on both benchmarked correctness and real-world usability under extended reasoning loads.
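For readers who want the gaps side by side, the short Python sketch below turns the scores quoted above into absolute gains (percentage points) and relative gains. The numbers are the ones reported in this article; the `gains` helper and the `scores` layout are illustrative names, not anything published by OpenAI.

```python
# Accuracy scores (%) as quoted in the article: (GPT-5.1-Codex, GPT-5.1-Codex-Max).
# Nothing here is independently measured.
scores = {
    "SWE-Lancer IC SWE": (66.3, 79.9),
    "SWE-Bench Verified": (73.7, 77.9),
    "Terminal-Bench 2.0": (52.8, 58.1),
}


def gains(old: float, new: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent)."""
    points = round(new - old, 1)
    relative = round(100 * (new - old) / old, 1)
    return points, relative


results = {name: gains(old, new) for name, (old, new) in scores.items()}

for name, (points, relative) in results.items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

The absolute gains come out to 13.6, 4.2, and 5.3 points respectively.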

Technical Architecture: Long-Horizon Reasoning via Compaction

A major architectural improvement in GPT‑5.1-Codex-Max is its ability to reason effectively over extended input-output sessions using a mechanism called compaction.

This enables the model to retain key contextual information while discarding irrelevant details as it nears its context window limit, effectively allowing for continuous work across millions of tokens without performance degradation.
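OpenAI has not published how compaction is implemented, so the following is only a toy sketch of the general idea described above: once a session's history approaches a token budget, older messages are collapsed into a short summary while the most recent ones are kept verbatim. Every name here (`CompactingSession`, `threshold`, `keep_recent`, `summary_budget`) is hypothetical, token counting is approximated by whitespace-separated words, and the "summary" is plain truncation standing in for a model-written one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CompactingSession:
    limit: int                 # hypothetical context window size, in "tokens"
    threshold: float = 0.8     # compact once usage exceeds this fraction of limit
    keep_recent: int = 2       # newest messages that are always kept verbatim
    summary_budget: int = 4    # maximum size of the summary, in "tokens"
    messages: List[str] = field(default_factory=list)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude stand-in for a real tokenizer: count whitespace-separated words.
        return len(text.split())

    def used(self) -> int:
        return sum(self.count_tokens(m) for m in self.messages)

    def summarize(self, old: List[str]) -> str:
        # Placeholder for a model-written summary: keep only the first few words.
        words = " ".join(old).split()
        return " ".join(words[: self.summary_budget])

    def add(self, message: str) -> None:
        self.messages.append(message)
        over = self.used() > self.threshold * self.limit
        if over and len(self.messages) > self.keep_recent:
            old = self.messages[: -self.keep_recent]
            recent = self.messages[-self.keep_recent:]
            # Replace everything older than the recent window with one summary.
            self.messages = [self.summarize(old)] + recent


session = CompactingSession(limit=20)
for i in range(1, 5):
    # Four six-word messages: m1w1 ... m1w6, m2w1 ... m2w6, and so on.
    session.add(" ".join(f"m{i}w{j}" for j in range(1, 7)))

print(session.messages)
print(session.used(), "of", session.limit, "tokens used")
```

In this toy run the history never exceeds the 20-token limit even though 24 tokens were added, because the oldest material is repeatedly folded into the four-word summary.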

The model has been internally observed to complete tasks lasting more than 24 hours, including multi-step refactors, test-driven iteration, and autonomous debugging.

Compaction also improves token efficiency. At medium reasoning effort, GPT‑5.1-Codex-Max used approximately 30% fewer thinking tokens than GPT‑5.1-Codex for comparable or better accuracy, which has implications for both cost and latency.
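To make the cost implication concrete, here is a small worked example in Python. Only the 30% figure comes from the article; the baseline token count and the price per million tokens are made-up values chosen for round numbers, not OpenAI pricing.

```python
REDUCTION_PCT = 30  # "approximately 30% fewer thinking tokens", per the article


def thinking_tokens_after(baseline_tokens: int, reduction_pct: int = REDUCTION_PCT) -> int:
    """Tokens used after applying the percentage reduction (integer arithmetic)."""
    return baseline_tokens * (100 - reduction_pct) // 100


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of a token count at a given price per million tokens."""
    return tokens / 1_000_000 * usd_per_million


baseline = 200_000   # hypothetical thinking tokens for one task on GPT-5.1-Codex
price = 10.0         # hypothetical price in USD per million tokens

reduced = thinking_tokens_after(baseline)
saved = cost_usd(baseline, price) - cost_usd(reduced, price)

print(f"{baseline} -> {reduced} thinking tokens, saving about ${saved:.2f} per task")
```

Under these assumed figures, a task that needed 200,000 thinking tokens would need 140,000. Latency would be expected to drop as well, since fewer tokens are generated, though the article gives no timing numbers.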

Platform Integration and Use Cases

GPT‑5.1-Codex-Max is currently available across multiple Codex-based environments, which refer to OpenAI’s own integrated tools and interfaces built specifically for code-focused AI agents. These include:

  • Codex CLI, OpenAI’s official command-line tool (@openai/codex), where GPT‑5.1-Codex-Max is already live.

  • IDE extensions, likely developed or maintained by OpenAI, though no specific third-party IDE integrations were named.

  • Interactive coding environments, such as those used to demonstrate frontend simulation apps like CartPole or Snell’s Law Explorer.

  • Internal code review tooling, used by OpenAI’s engineering teams.

For now, GPT‑5.1-Codex-Max is not yet available via public API, though OpenAI states this is coming soon. Users who want to work with the model in terminal environments today can do so by installing and using the Codex CLI.

It is not currently confirmed whether or how the model will integrate into third-party IDEs unless they are built on top of the CLI or future API.

The model is capable of interacting with live tools and simulations. Examples shown in the release include:

  • An interactive CartPole policy gradient simulator, which visualizes reinforcement learning training and activations.

  • A Snell’s Law optics explorer, supporting dynamic ray tracing across refractive indices.

These interfaces exemplify the model’s ability to reason in real time while maintaining an interactive development session, effectively bridging computation, visualization, and implementation within a single loop.
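For a sense of the computation at the core of an optics explorer like the one mentioned above, Snell's law (n1 · sin θ1 = n2 · sin θ2) fits in a few lines of Python. This is a generic textbook sketch, not code from OpenAI's demo; the function name `refraction_angle` is an illustrative choice.

```python
import math
from typing import Optional


def refraction_angle(theta1_deg: float, n1: float, n2: float) -> Optional[float]:
    """Angle of the refracted ray in degrees, or None on total internal reflection.

    Applies Snell's law: n1 * sin(theta1) = n2 * sin(theta2).
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if abs(s) > 1.0:
        # No real solution for theta2: the ray is totally internally reflected.
        return None
    return math.degrees(math.asin(s))


# Air (n = 1.0) into glass (n = 1.5) at 30 degrees of incidence.
print(refraction_angle(30.0, 1.0, 1.5))

# Glass into air at a steep angle: total internal reflection.
print(refraction_angle(60.0, 1.5, 1.0))
```

An interactive explorer would re-run a calculation of this kind whenever the user changes the incidence angle or either refractive index, then redraw the rays.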

Cybersecurity and Safety Constraints

While GPT‑5.1-Codex-Max does not meet OpenAI’s “High” capability threshold for cybersecurity under its Preparedness Framework, it is currently the most capable cybersecurity model OpenAI has deployed. It supports use cases such as automated vulnerability detection and remediation, but with strict sandboxing and disabled network access by default.

OpenAI reports no increase in scaled malicious use but has introduced enhanced monitoring systems, including activity routing and disruption mechanisms for suspicious behavior. Codex remains isolated to a local workspace unless developers opt in to broader access, mitigating risks like prompt injection from untrusted content.
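The article does not describe how Codex enforces these defaults, so the sketch below only illustrates the two policies in the abstract: writes confined to a workspace directory, and network access off unless explicitly enabled. `SandboxPolicy` and its fields are hypothetical names and do not correspond to any real Codex configuration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SandboxPolicy:
    workspace: Path
    network_enabled: bool = False  # off by default; a developer must opt in

    def allows_write(self, target: Path) -> bool:
        """True only if target resolves to a location inside the workspace."""
        root = self.workspace.resolve()
        # Relative targets are interpreted against the workspace; absolute
        # targets replace it. resolve() collapses any ".." components.
        candidate = (root / target).resolve()
        return candidate == root or root in candidate.parents

    def allows_network(self) -> bool:
        return self.network_enabled


policy = SandboxPolicy(workspace=Path("/tmp/ws"))

print(policy.allows_write(Path("src/main.py")))     # inside the workspace
print(policy.allows_write(Path("../etc/passwd")))   # escapes via ".."
print(policy.allows_write(Path("/etc/passwd")))     # absolute path elsewhere
print(policy.allows_network())
```

Resolving the path before comparing is the important step: a naive string-prefix check would accept `../` escapes that this version rejects.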

Deployment Context and Developer Usage

GPT‑5.1-Codex-Max is currently available to users on ChatGPT Plus, Pro, Business, Edu, and Enterprise plans. It will also become the new default in Codex-based environments, replacing GPT‑5.1-Codex, which was a more general-purpose model.

OpenAI states that 95% of its internal engineers use Codex weekly, and since adoption, these engineers have shipped ~70% more pull requests on average, highlighting the tool’s impact on internal development velocity.

Despite its autonomy and persistence, OpenAI stresses that Codex-Max should be treated as a coding assistant, not a replacement for human review. The model produces terminal logs, test citations, and tool call outputs to support transparency in generated code.

Outlook

GPT‑5.1-Codex-Max represents a significant evolution in OpenAI’s strategy toward agentic development tools, offering greater reasoning depth, token efficiency, and interactive capabilities across software engineering tasks. By extending its context management and compaction strategies, the model is positioned to handle tasks at the scale of full repositories, rather than individual files or snippets.

With continued emphasis on agentic workflows, secure sandboxes, and real-world evaluation metrics, Codex-Max sets the stage for the next generation of AI-assisted programming environments, while underscoring the importance of oversight in increasingly autonomous systems.
