All the major AI labs are developing AI software agents that can operate a computer just like a person, visually parsing pixels, moving the mouse, and pressing keys. These are Computer-Using Agents (CUAs), and I wrote about them in depth last week.

Today’s best CUAs are able to complete about 45% of the tasks in the popular OSWorld benchmark, up from just 6% when the benchmark was created sixteen months ago. What happens when they reach 100%?

In this post, I’ll explore what these benchmarks really measure, what they leave out, and how to prepare for the moment that AI UI execution becomes a solved problem.

Understanding the CUA benchmarks

The OSWorld benchmark defines 369 real desktop tasks: file management, web browsing, multi-app workflows, and so on. Human testers are able to finish 72-74% of them, but CUAs are closing the gap fast: 17% at the beginning of 2025, 45% today, and likely human parity in 2026.

Eventually, one will hit 100%. Then what?

A perfect score only proves a CUA can navigate any UI. It still won’t decide why a task matters, evaluate risk, or resolve ambiguity. The CUA execution layer is just the hands and eyes, not the brain.

Humans provide the scaffolding

Even with 100% capable CUAs, human workers will need to:

Set goals and intent
Define guardrails and checkpoints
Respond to escalations

The value won’t be in the CUA’s raw UI control. It will be in the scaffolding around it.
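To make those three responsibilities a bit more concrete, here is a minimal Python sketch of what scaffolding around a CUA could look like. Every name in it (`TaskSpec`, `Step`, `run_with_scaffolding`, the `execute` and `escalate` callbacks) is hypothetical and made up for illustration. No real CUA product exposes this API, and a real guardrail would be far richer than a word filter.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One UI action the CUA proposes to take (hypothetical type)."""
    description: str
    irreversible: bool = False


@dataclass
class TaskSpec:
    """What the human provides: the goal plus guardrails and checkpoints."""
    goal: str
    forbidden_words: List[str] = field(default_factory=list)  # crude guardrail
    checkpoint_on_irreversible: bool = True                   # pause before risky steps


def run_with_scaffolding(
    spec: TaskSpec,
    steps: List[Step],
    execute: Callable[[Step], None],   # stand-in for the CUA doing the step
    escalate: Callable[[Step], bool],  # stand-in for asking a human; True = approve
) -> List[Tuple[str, str]]:
    """Run each step, blocking guardrail violations and escalating at checkpoints."""
    log: List[Tuple[str, str]] = []
    for step in steps:
        text = step.description.lower()
        # Guardrail: refuse any step that mentions a forbidden word.
        if any(word.lower() in text for word in spec.forbidden_words):
            log.append(("blocked", step.description))
            continue
        # Checkpoint: a human must approve irreversible steps.
        if step.irreversible and spec.checkpoint_on_irreversible:
            if not escalate(step):
                log.append(("declined", step.description))
                continue
        execute(step)
        log.append(("done", step.description))
    return log
```

The point of the sketch is that the interesting decisions (what the goal is, what is off limits, when to stop and ask) all live outside the part that clicks and types.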

CUAs become the universal API

That said, once CUA execution is solved, every legacy desktop app will become an intelligent API surface. A typical use case will involve multiple models working together:

An interface agent (CUA) will be deterministic and sandboxed. There will be one per workspace instance.
Planner / reasoning agents will decide and orchestrate which CUAs to invoke, when, and with what constraints.
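Here is a rough Python sketch of that split, assuming one interface agent per workspace and a planner that picks which one to call and with what limits. All of the names (`InterfaceAgent`, `Planner`, `Invocation`, the `erp-desktop` and `mail-desktop` workspaces, the `max_actions` constraint) are invented for illustration. The plan is hard-coded where a reasoning model would sit, and the interface agent is a stub that only reports what it was asked to do.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Invocation:
    """One planner decision: which CUA, what instruction, what constraint."""
    workspace: str
    instruction: str
    max_actions: int  # hypothetical constraint: cap on UI actions


class InterfaceAgent:
    """Stand-in for a CUA bound to exactly one workspace instance."""

    def __init__(self, workspace: str) -> None:
        self.workspace = workspace

    def run(self, instruction: str, max_actions: int) -> str:
        # A real CUA would drive the UI here; this stub just echoes the call.
        return f"[{self.workspace}] {instruction} (<= {max_actions} actions)"


class Planner:
    """Stand-in for the reasoning layer that orchestrates the CUAs."""

    def __init__(self, agents: List[InterfaceAgent]) -> None:
        # One agent per workspace, looked up by workspace name.
        self.agents: Dict[str, InterfaceAgent] = {a.workspace: a for a in agents}

    def plan(self, goal: str) -> List[Invocation]:
        # Hard-coded for the sketch; a reasoning model would generate this.
        return [
            Invocation("erp-desktop", f"Export report for: {goal}", max_actions=50),
            Invocation("mail-desktop", f"Email the exported report for: {goal}", max_actions=20),
        ]

    def execute(self, goal: str) -> List[str]:
        # Dispatch each planned invocation to the CUA for its workspace, in order.
        return [
            self.agents[inv.workspace].run(inv.instruction, inv.max_actions)
            for inv in self.plan(goal)
        ]
```

Calling `Planner([InterfaceAgent("erp-desktop"), InterfaceAgent("mail-desktop")]).execute("Q3 sales")` returns one result string per invocation. Notice that the interface agents never decide anything: they only do what the planner hands them, inside the limits it sets.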

CUAs plus scaffolding map directly to Stage 4 (“AI uses your computer”), and then evolve into Stages 5 (“AI uses your computer without you”) and 6 (“multi-agent coordination”) in the 7‑stage human-AI collaboration roadmap.

Because each CUA will use existing workspaces and run with its own identity, existing IAM, DLP, session recording, and other guardrails will still apply as they do today. You won’t have to reinvent your security model.

Planning for this future

Don’t get me wrong, it will be a big deal when CUAs hit 100% execution. But this will only be a milestone, not the final destination, on the journey to AI in the workplace. When this happens, the value will shift to how well you design, secure, and govern the orchestration layer that drives the CUAs.

The scarcity (i.e. the value) will be the judgment around knowing what to do, how to do it, and what success looks like. When CUAs hit 100%, that judgment will still come from humans.

At that point, human workers will shift from doing to directing, and the org chart will start to look like a massive orchestration graph. Once CUA execution is solved, the only interface left to optimize will be your own thinking.

Read more & connect

Join the conversation and discuss this post on LinkedIn. You can find all my posts on my author page on the Citrix blog (or via RSS).

Video of my most recent talk

In May I gave the closing keynote at the EUCtech Denmark 2025 conference, called The Future of Work in an AI-Native World. I talked about a lot of what I covered today and walked through how AI will evolve and impact the workplace in the coming years. You can watch it on YouTube.

My upcoming talks

AppManagEvent: Closing Keynote: AI & the Future of Enterprise Apps — Utrecht, Netherlands, Oct 10
MAICON 2025: AI at Work: The Employees’ Revolution! — Cleveland, Ohio, Oct 14-16
