OpenAI study suggests AI may be about to eclipse human expertise in real-world tasks
Fortune

Also: All the news and watercooler chat from Fortune.

October 10, 2025
09:02 AM

CEO Daily

By Geoff Colvin, Senior Editor-at-Large

Geoff Colvin is a senior editor-at-large at Fortune, covering leadership, globalization, wealth creation, the info revolution, and related issues.

Photo: Sam Altman, chief executive officer of OpenAI Inc. (Kyle Grillot/Bloomberg via Getty Images)

In today’s CEO Daily: Geoff Colvin on AI potentially eclipsing human expertise.

The big story: Israeli government approves Gaza deal as troops pull out. The markets: Mostly down on news of China restrictions on rare-earth exports. Plus: All the news and watercooler chat from Fortune.

Good morning. Rarely does a 29-page scholarly paper merit the attention of top-level executives, but every leader should be familiar with a recent study from OpenAI.

It’s the best description yet of how AI can handle real-world tasks, showing which AI models are excelling, and hinting at what it all means for humans in the years ahead.

The paper can be heavy going, but you can get a masterful summary from our AI Editor, Jeremy Kahn. For leaders, three points stand out:

The study is highly realistic.

It examined 44 occupations and 1,320 specialized tasks required by those occupations. For example: the final testing step in manufacturing a cable spooling truck for underground mining operations.

Appropriate professionals (average experience: 14 years) vetted the tasks, all of which are elements of actual work deliverables. Previous research has almost always focused on less realistic tests.

The AI results were graded by expert humans who didn’t know if they were looking at work from AI or from an expert human professional.

The best models are already nearly as good as human industry experts.

The study examined seven AI models from OpenAI, Google’s Gemini, xAI’s Grok, and Anthropic’s Claude.

The winner was Claude Opus 4.1, which came within a few percentage points of reaching parity with human industry experts.

The best models also complete tasks 100 times faster and 100 times cheaper than the industry experts, though the comparisons ignore “the human oversight, iteration, and integration steps required in real workplace settings,” OpenAI says.

The models are improving at a galloping pace.

For example, as OpenAI’s models improved, the percentage of their task outputs that were as good as or better than humans’ outputs more than tripled.

If that rate continues (a big if), OpenAI’s models would be better at these real-world tasks than humans overall in a few months. At least some AI competitors could well be on similar trajectories.

The pace of change described in this new research may be the hardest challenge for leaders.

Consider the two-year cycle of Moore’s Law, which changed the world and inspired new corporate giants while dooming others. In retrospect, those were the days.

John Chambers, who ran Cisco through the internet frenzy and its crash, said recently that 50% of executives “won’t have the skills to adjust to this new innovation economy driven by AI because they were trained to move at the speed of a five-year cycle as opposed to a 12-month cycle.” His warning to leaders is worth remembering: “With the speed the market is moving at now, you have to be able to reinvent yourself, which most CEOs and leaders don’t know how to do, especially with AI.”

Geoff Colvin

Contact CEO Daily via Diane Brady at diane.brady@fortune.com

Top news

Israeli government approves Gaza deal as troops pull out

The IDF now has 24 hours to retreat to an agreed-upon line and Hamas has 72 hours to release all Israeli hostages.

So far, events are going as planned and the mood is upbeat on both sides.

The BBC has live coverage.

China places export controls on rare earth minerals

The new rules curb the supply chain for the semiconductors that are used in phones, computers, AI data centers, cars, solar panels, and other IT kit. China has a virtual monopoly in rare earths.

NY Attorney General indicted

Letitia James is charged with bank fraud and making false statements. The prosecution is part of President Trump’s retribution plan: It was James who secured a $367 million fine against Trump in a civil suit (the fine was later reversed).

Make Argentina Great Again

Yes, the U.S. is bailing out Argentina. Treasury Secretary Scott Bessent confirmed that the Treasury has bought pesos to support the government of Trump ally President Javier Milei. The U.S. is also providing a $20 billion swap line to Argentina. (A swap line allows central banks to exchange fixed amounts of currency on the understanding that the swap will be reversed later and interest will be paid on the repaid currency.)

Moody’s chief economist: roughly half of U.S. states are contracting economically

Moody’s Analytics chief economist Mark Zandi exclusively told Fortune that nearly half of U.S. states are seeing their economies contract, and only 16 are experiencing growth.

Zandi also noted that lower-income households are “hanging on by their fingertips financially…and their world is going into recession pretty quickly.”

KPMG survey identifies quarter where AI sentiment changed

A new KPMG survey of 130 leaders in companies making more than $1 billion annually found that the adoption of agentic AI technology has quadrupled in the past six months. A principal and aIQ program lead at the company told Fortune that the most recent quarter was when the “fear factor” surrounding the technology faded, leading to what she describes as “cognitive fatigue.”

Google restricts WFH to just 4 days per year

Google’s previous policy was to allow staff to work from anywhere for up to four weeks per year. The new rule says that a single WFH day will now count as an entire week.

Federal workers will get back pay

U.S. House Speaker Mike Johnson said furloughed federal workers will get the wages they are owed once the shutdown ends.

S&P 500 futures were up 0.14% this morning. The index closed down 0.28% in its last session. STOXX Europe 600 was flat in early trading. The U.K.’s FTSE 100 was down 0.14% in early trading. Japan’s Nikkei 225 was down 1.01%. China’s CSI 300 was down 1.97%. South Korea’s KOSPI was up 1.73%. India’s Nifty 50 was up 0.51% before the end of the session.

Bitcoin held at $121.4K.

Around the watercooler

  • $1.8 trillion deficit revealed during ‘pointless and wasteful government shutdown,’ budget watchdog says by Nick Lichtenberg
  • Battle over Elon Musk’s trillionaire pay package builds as pension funds face off against Tesla by Amanda Gerut
  • You’re 10 times more likely to have a flight delay during the government shutdown, Transportation Secretary says: ‘These controllers are stressed out’ by Sydney Lake
  • California’s ‘impossible’ dream of ending fossil fuels isn’t working, and now it’s looking at price spikes and shortages by Jordan Blum
  • From WhatsApp friends to a $500 million–plus valuation: These founders argue their tiny AI models are better for customers and the planet by Vivienne Walt

CEO Daily is compiled and edited by Joey Abrams and Jim Edwards.

This is the web version of CEO Daily, a newsletter of must-read global insights from CEOs and industry leaders. Sign up to get it delivered free to your inbox.
