Agent UX: Designing Interfaces for AI That Acts on Your Behalf

27/5/2026 · 11 min read

An agent is not a chatbot with more steps. It is a piece of software the user has handed real authority to — and the entire UX shifts the moment that handoff happens.

For most of the last decade the AI in our products was advisory. Autocomplete suggested, the chat answered, the recommendation system ranked. The user stayed in the driver's seat and the system stayed in its lane. The patterns we developed (undo, confirmation dialogs, optimistic UI) were calibrated for a world where the worst the AI could do was suggest a bad option, and even that did nothing until you accepted it.

Agents broke that calibration. When the AI is reading your inbox, writing the response, sending it on your behalf, and then booking the meeting it just agreed to — the design problem is no longer "how do we surface a suggestion". It is "how does the user trust this thing enough to leave the room, and how do they take control back when they need to". That is a different design discipline, and the patterns we keep reaching for from the suggesting era are quietly the wrong patterns for the agentic era.

The Autonomy Slider Is The Centre Of The Whole Problem

Every agentic product has, implicitly or explicitly, an autonomy slider. At one end the agent only suggests and the user clicks every action. At the other end the agent runs an entire flow unattended and reports back when it is done. Where your product sits on that slider, and how it lets the user move it, is the most important design decision the team will make.

The mistake I see most often is treating the slider as a single global setting: "Auto mode", on or off. In real products the right autonomy level varies by action class. Reading my calendar to plan a week is low-stakes; the agent can act freely. Sending an email to an executive on my behalf is high-stakes; the agent should draft, never send. Buying a flight is irreversible; the agent should always confirm. The user does not want one slider; they want a policy.

The well-designed agentic products in 2026 make the autonomy boundaries legible. The user knows which classes of action the agent will perform without asking, which require confirmation, and which are never going to happen without explicit instruction. The boundary is visible in the UI before the agent acts, not buried in settings the user has to find.
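One way to picture "a policy, not a slider" is a small table keyed by action class. The sketch below is only an illustration: the action class names (calendar.read, email.send, flight.purchase) and the three autonomy levels are my own hypothetical labels for the examples above, not a standard or any product's real API.

```python
from enum import Enum


class Autonomy(Enum):
    ACT = "act without asking"
    CONFIRM = "ask before acting"
    DRAFT_ONLY = "prepare, never execute"


# Hypothetical action classes, mirroring the calendar / email / flight examples.
POLICY = {
    "calendar.read": Autonomy.ACT,
    "email.send": Autonomy.DRAFT_ONLY,
    "flight.purchase": Autonomy.CONFIRM,
}


def autonomy_for(action_class: str) -> Autonomy:
    # Assumption: anything the policy does not name gets the most
    # conservative level rather than a permissive default.
    return POLICY.get(action_class, Autonomy.DRAFT_ONLY)


def describe_policy() -> list[str]:
    # Plain-language lines a UI could show before the agent acts.
    return [f"{name}: {level.value}" for name, level in sorted(POLICY.items())]
```

The same table can drive both the agent's behaviour and the text the user sees, which is one way to keep the boundary legible instead of buried in settings.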

Trust Is Earned In Specifics, Not Promised In General

Every agentic product has a marketing page that says some version of "we are transparent about what the agent is doing". The products that users actually trust have something much more specific than transparency. They have legibility — the user can, at any moment, see what the agent has just done, what it is about to do, and which piece of input it is reasoning from.

The patterns that build that legibility are concrete and small. A live activity feed showing the agent's current action in plain language. Source citations on every claim, with the cited passage visible on hover. A diff view when the agent is about to modify something the user already has. A pending-action queue the user can preview and edit before it ships. None of these are revolutionary — they are the agentic equivalent of "show, don't tell".

The trust signals that do not work are the ones designed by the team rather than discovered through use. The animated thinking indicator, the confidence percentage, the "explanation" that is itself a generated paragraph. Users learn quickly that these signals do not correlate with whether the agent was actually right, and once they have learned that, the signals become noise and then friction.

The Five Patterns That Show Up In Every Good Agentic UI

Across the agentic products I have audited or designed in the last year, five patterns show up over and over. None of them are unique to AI — most are borrowed from air traffic control, medical devices, or industrial automation, all fields that have been designing for delegated authority for decades. They are unfamiliar to most consumer product teams because the consumer product world has never needed them before.

The activity stream. A persistent, scrollable log of what the agent has done, is doing, and is about to do. Written in plain language, timestamped, interruptible. The user can stop reading at any time and trust that the log will be there if they need to audit later.
The pending actions tray. Anything the agent has prepared but not yet executed sits in a tray the user can review, edit, approve, or cancel. Critical for reversibility — by the time an action leaves the tray, the user has explicitly blessed it.
The kill switch. A single, always-visible, unambiguous way to halt the agent immediately. Not buried in a menu. Not behind a confirmation. The user needs to know with absolute certainty that one click stops everything, the same way they trust the brake pedal in a car.
The handoff request. When the agent encounters something outside its confident zone (an ambiguous instruction, a destructive action, a low-confidence output) it stops and surfaces the question to the user rather than guessing. The handoff is fast and contextual, and the agent resumes where it left off once the user answers.
The recap. When the agent finishes a multi-step task, it summarises what it did, what it skipped, what it could not do, and what it noticed. The recap is the user's compressed view of an autonomous session; without it the user has to audit the whole activity stream to know if the work is good.
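To make two of these patterns concrete, here is a minimal sketch of a pending actions tray with a kill switch attached. Every name in it (PendingAction, Tray, propose, flush, halt) is hypothetical, and a real product would persist the tray and run actions asynchronously; the point is only that nothing executes without approval and that halting needs no confirmation.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity so cancel() removes the exact action
class PendingAction:
    description: str
    approved: bool = False


@dataclass
class Tray:
    pending: list[PendingAction] = field(default_factory=list)
    executed: list[str] = field(default_factory=list)
    halted: bool = False

    def propose(self, description: str) -> PendingAction:
        # The agent prepares an action; it does not run yet.
        action = PendingAction(description)
        self.pending.append(action)
        return action

    def approve(self, action: PendingAction) -> None:
        action.approved = True

    def cancel(self, action: PendingAction) -> None:
        self.pending.remove(action)

    def halt(self) -> None:
        # Kill switch: one call, no confirmation, blocks all later flushes.
        self.halted = True

    def flush(self) -> list[str]:
        # Run only what the user approved; nothing leaves the tray after a halt.
        if self.halted:
            return []
        ran = [a.description for a in self.pending if a.approved]
        self.pending = [a for a in self.pending if not a.approved]
        self.executed.extend(ran)
        return ran
```

The executed list doubles as raw material for the activity stream and the recap, so the three surfaces stay consistent with each other.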

Designing The Wait — The Most Under-Thought UX Surface

Agentic tasks take seconds, sometimes minutes. The user is not going to sit there watching a spinner for ninety seconds. What they do with that time, and what the agent shows them in the meantime, decides whether the product feels powerful or unreliable.

The wrong answer is the indeterminate spinner. The user has no idea if the agent is making progress, stuck, or about to do something destructive. The right answer is structured progress — a visible plan the agent is working through, with steps that complete one by one and timestamps the user can map against their own intuition about how long things should take.
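As a rough illustration of structured progress, the sketch below models a visible plan whose steps complete one by one and carry timestamps. The Plan and Step names and the bracketed status labels are assumptions made for the example, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Step:
    label: str
    finished_at: Optional[datetime] = None


class Plan:
    def __init__(self, labels: list[str]) -> None:
        self.steps = [Step(label) for label in labels]

    def complete_next(self, now: Optional[datetime] = None) -> None:
        # Mark the first unfinished step as done and stamp it.
        for step in self.steps:
            if step.finished_at is None:
                step.finished_at = now or datetime.now(timezone.utc)
                return

    def render(self) -> list[str]:
        # One line per step: finished steps show a time, the first
        # unfinished step is "working", the rest are "queued".
        lines = []
        current_marked = False
        for step in self.steps:
            if step.finished_at is not None:
                lines.append(f"[done {step.finished_at:%H:%M:%S}] {step.label}")
            elif not current_marked:
                lines.append(f"[working] {step.label}")
                current_marked = True
            else:
                lines.append(f"[queued] {step.label}")
        return lines
```

Because the whole plan is rendered up front, the user can see what is coming next, not just what has finished.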

The even better answer is the parallel surface. The agent runs in the background; the user can leave the page, return when ready, and get a notification when something requires attention. Email, Slack, push, in-app: the channel matters less than the principle that the agent is a colleague the user delegated to, not a system the user has to babysit. Designing for the asynchronous case is what separates a v0 agentic product from a product users actually live with.

Recovery Flows Are Where The Product Quality Actually Lives

Every agent will make mistakes. The product that handles those mistakes well feels reliable; the product that does not feels broken even when its success rate is higher. Recovery is the load-bearing surface of agentic UX and it is consistently the most under-designed.

Good recovery starts with reversibility as a first-class design goal. Every action the agent takes should ideally be undoable, and where it cannot be undone, the agent should ask first. Drafts instead of sent emails. Soft-deletes instead of permanent ones. Sandboxed previews instead of live edits. The architecture decisions here are made in the first week of building the agent; they cannot be retrofitted.

Beyond reversibility, the recovery flow needs to surface what went wrong in language the user can act on. "I tried to book the meeting but the calendar API returned an error" is useful. "Something went wrong" is not. The error needs to be contextualised to the task the user was trying to accomplish, with a clear next step the user can take — retry, hand off to a human, edit the inputs, abandon.

The most under-designed part of recovery is the agent's own recovery loop. When the agent encounters a failure mid-task, what does it do? Stop and ask? Retry with a variation? Skip and continue? Fall back to a different tool? These are policy decisions, not implementation details, and they should be visible to the user and consistent across the product. The user should be able to predict, before delegating a task, how the agent will behave when it hits trouble.
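Treating failure handling as policy can be as simple as declaring it per step, so the same table can be shown to the user and enforced by the agent. The sketch below assumes three hypothetical step names and three behaviours (ask, retry, skip); falling back to a different tool is left out to keep it short.

```python
from enum import Enum
from typing import Callable


class OnFailure(Enum):
    ASK = "stop and ask"
    RETRY = "retry, then ask"
    SKIP = "skip and continue"


# Hypothetical step names; a real product would define its own.
FAILURE_POLICY = {
    "calendar.fetch": OnFailure.RETRY,
    "enrich.contact": OnFailure.SKIP,
    "email.send": OnFailure.ASK,
}


def run_step(name: str, step: Callable[[], None], max_retries: int = 2) -> str:
    # Assumption: steps without a declared policy stop and ask the user.
    policy = FAILURE_POLICY.get(name, OnFailure.ASK)
    attempts = 1 + max_retries if policy is OnFailure.RETRY else 1
    for _ in range(attempts):
        try:
            step()
            return "done"
        except Exception:
            continue  # try again if attempts remain
    if policy is OnFailure.SKIP:
        return "skipped"
    # ASK, or RETRY that ran out of attempts: hand off to the user.
    return "needs user"
```

Because the policy is data rather than scattered try/except blocks, the user can read it before delegating and predict how the agent will behave when it hits trouble.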

The Failure Modes That Quietly Kill Adoption

Agentic products fail in distinctive ways. The team ships a powerful demo, usage spikes for two weeks, and then it quietly tails off. Looking at session data usually shows the same patterns. Three failure modes account for most of the drop-off.

The first is the over-confident first action. The agent does something slightly wrong in the first session, the user has to undo it, and the implicit contract breaks. The fix is to start every relationship conservative — confirm more, do less, build a track record — and to widen the autonomy band only after the user has had a chance to trust the smaller version.

The second is the under-communicating long task. The agent goes off for two minutes, the user does not know what is happening, they assume it is broken, they close the tab. The fix is the structured progress surface; the right amount of visible activity makes a two-minute task feel deliberate rather than stuck.

The third is the silent failure. The agent finishes a task, reports success, and only later does the user notice that something was wrong. This is the most corrosive failure mode because it teaches the user that they cannot trust the recap. Once that lesson is learned, they have to audit every output by hand, which removes the entire value proposition of the agent. The fix is rigorous self-checking — the agent verifies its own outputs against the original request before reporting completion, and surfaces uncertainty when its self-check is ambiguous.
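A self-check of this kind can be sketched as a comparison between what was requested and what the agent could actually verify. The names below (Recap, self_check) and the three-valued result (True, False, None for "could not verify") are assumptions for illustration; how each item gets verified is the hard part and is out of scope here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Recap:
    done: list[str] = field(default_factory=list)
    missing: list[str] = field(default_factory=list)
    unverified: list[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        # Never report "complete" while anything is missing or uncertain.
        if self.missing:
            return "incomplete"
        if self.unverified:
            return "needs review"
        return "complete"


def self_check(requested: list[str], results: dict[str, Optional[bool]]) -> Recap:
    # results maps an item to True (verified), False (check failed),
    # or None (the agent could not verify it either way).
    recap = Recap()
    for item in requested:
        if item not in results or results[item] is False:
            recap.missing.append(item)
        elif results[item] is None:
            recap.unverified.append(item)
        else:
            recap.done.append(item)
    return recap
```

The "needs review" status is the important one: it gives the agent an honest way to surface uncertainty instead of rounding it up to success.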

When NOT To Build An Agent

The market pressure in 2026 is to put an agent in every product, the same way the 2024 pressure was to put a chat box in every product. Both pressures produce the same outcome — most of the resulting features are not used. The honest counter is knowing when the agentic framing is the wrong framing.

If the task is fast and low-stakes, and the user wants creative control over the output, an agent is the wrong UI. A good editor with inline AI assistance is faster and more satisfying than handing the task to an autonomous loop and reviewing the output. Writing, design, code: the suggesting model is still the right model for most creative work.

If the task is high-stakes and the user cannot verify the output without essentially redoing the work, the agent is also the wrong abstraction. The user is not getting time back; they are getting a different kind of work. Financial decisions, legal advice, medical diagnosis — the right UX is decision support, not delegation.

If the task is rare for any individual user, the trust-building loop never completes. The user has not had enough sessions to learn what the agent is good and bad at, so they do not delegate confidently, so the agent does not get the chance to earn the trust. Agents work best in product surfaces where the user delegates often enough to build calibrated trust over time.

Conclusion: Design For The Relationship, Not The Task

An agentic product is not a series of one-shot tasks. It is a relationship the user builds with a piece of software over weeks and months, and every screen, every confirmation dialog, every recovery flow is a deposit or a withdrawal in that relationship.

The agentic products that win in 2026 will be the ones designed by teams who understood that framing from day one. Not a feature wall that lists what the agent can do. A coherent UX language for delegation, control, recovery and trust, applied consistently across every action the agent can take. The aesthetic of trustworthy agency, not the aesthetic of AI magic.

If you take one thing from this, take this: borrow the patterns from fields that have been designing for delegated authority for decades, rather than reinventing them. Cockpit controls, medical device displays, industrial process monitors — they have all already solved versions of the trust, handoff and recovery problems that consumer agentic products are now bumping into for the first time. The novelty is the model in the middle; the design problem is older than the model.

Design for the relationship. Make the autonomy boundaries legible. Build recovery flows worth trusting. And remember that the user is not measuring your agent on what it can do — they are measuring it on whether they can leave the room and still trust what they will find when they get back.