Amazon Customer Service authentication and the striated user flow I designed to describe the workflow processes. May 2025.

CONTEXT

Customers were hanging up before we could even ask why they were calling.

When I joined Amazon Customer Service, the IVR system had a structural problem that no amount of copy optimization could fix on its own: customers had to navigate more than two minutes of procedural steps (greetings, authentication, item selection) before they could say a single word about why they were calling.

That two-minute barrier created a 92% drop-off rate.

Most customers were hanging up before they ever got to explain their problem. And authentication, which was necessary to connect the call to an Amazon account, was the single biggest hurdle in that gauntlet. It forced an involuntary modal transition: users had to leave their chosen modality and return later, and transitions like that are where multimodal experiences break.

I had to figure out a way to make this system work.

THE ARCHITECTURE

Preliminary user flows I designed (sanitized) with relevant converse workflows. I needed to understand how the two were linked to get analytics showing drop-off points. April 2025.

First, I had to understand what I was actually dealing with.

Authentication at Amazon Customer Service featured five distinct authentication types, each differentiated by how the system delivered the verification link — text, email, app notification, or combinations thereof. Before I could propose any changes, I needed to map exactly what was happening at the system level and where users were falling off.

I built three-layer flow maps of the existing experience. The top layer showed the customer journey: what they saw, heard, and were asked to do. The middle layer showed the system responses, with the under-the-hood processes running in parallel below. The bottom layer showed the two named workflows handling authentication and intent detection. Once I understood which workflows were being triggered under which conditions, I worked with analytics to get real metrics on abandonment at each step.

Then I built a Sankey chart to visualize the flow: where customers dropped off, which steps created the most friction, and which authentication paths had the highest abandonment. This gave leadership the first clear data picture of where design changes could make a measurable difference. It showed that authentication was a problem, but only one part of a larger architectural problem.
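A minimal sketch of the data behind a Sankey chart like this one, assuming the analytics export can be reduced to an ordered list of steps with a caller count at each. The step names follow the procedural steps described earlier; the counts are placeholders invented for illustration (chosen so the overall figure lands on the 92% cited above), not real Amazon data, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical caller counts per step. Placeholder numbers, not real data.
FUNNEL: List[Tuple[str, int]] = [
    ("Greeting", 1000),
    ("Authentication", 400),
    ("Item selection", 150),
    ("Intent detection", 80),
]


def sankey_links(funnel: List[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """Return (source, target, value) links: one 'continued' and one 'dropped' per step."""
    links = []
    for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
        links.append((step, next_step, next_count))                     # callers who continued
        links.append((step, f"Dropped at {step}", count - next_count))  # callers who left
    return links


def step_drop_off(funnel: List[Tuple[str, int]]) -> Dict[str, float]:
    """Share of callers at each step who never reached the following step."""
    return {
        step: (count - next_count) / count
        for (step, count), (_, next_count) in zip(funnel, funnel[1:])
    }


def worst_step(funnel: List[Tuple[str, int]]) -> str:
    """Step with the highest step-level drop-off rate."""
    rates = step_drop_off(funnel)
    return max(rates, key=rates.get)


def overall_drop_off(funnel: List[Tuple[str, int]]) -> float:
    """Share of callers at the first step who never reached the last step."""
    return 1 - funnel[-1][1] / funnel[0][1]


if __name__ == "__main__":
    for source, target, value in sankey_links(FUNNEL):
        print(f"{source} -> {target}: {value}")
    print("Highest-friction step:", worst_step(FUNNEL))
```

The (source, target, value) triples are the shape most Sankey plotting tools expect, so the same list can feed a chart and a per-step abandonment table.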

Sankey diagram (sanitized) mapping customer drop-off across authentication steps, used to identify highest-friction points in the Amazon Customer Service IVR experience. The large red “Greeting” block shows how many customers got that message at the beginning of the experience in a 45-day period, and the two purple blocks marked “Intent detection” show the number of customers who were asked why they were calling. July 2025.

THE CONSTRAINT

Sources confirming my analysis that customers waited too long to state their reason for calling included our own benchmark testing and the Conversation Analysis work of John Heritage. August 2025.

Amazon was the only voice experience doing it this way.

As a multimodal designer, I knew that you want to keep the user in the modality they purposefully selected (which in this case was voice). But although voice is a very accessible modality, confirming security details aloud is inherently insecure. You can't just ask customers questions like 'What's your password?' The involuntary modal transition, switching from voice to text or email, was the safest method available. But designing around those modal transitions is brutal. There are so many ways the experience breaks when you're asking someone to leave voice, act on a text or email, and come back.

I also analyzed benchmark testing across other major voice experiences to understand how they handled authentication. The finding was stark: Amazon was the only one that authenticated before asking the reason for calling. Every other voice experience asked "Why are you calling?" first, then authenticated.

(Sanitized) Direct Dial restructuring vision framework showing proposed Amazon Customer Service IVR architecture with Order Related and Account and Payment authentication checks. May 2025.

Researching this delay in getting the customer’s reason for calling also led me to the work of John Heritage and the field of Conversation Analysis, specifically the principle that institutional voice interactions need to establish the reason for the interaction as early as possible to feel natural and purposeful. That gave me academic backing for what the data was already showing. I worked out a plan to move authentication later in the flow, but it would require months of iterative steps before leadership felt the system was secure enough to handle requests from unidentified users.

So the obstacle was: how do we keep authentication early (for security reasons), reduce the friction it creates (for UX reasons), and do it through a channel transition that's inherently fragile?

If we had to keep authentication early, the way we asked for it had to be persuasive.

THE APPROACH

Cialdini's six principles of behavioral persuasion applied as a framework for designing authentication consent prompts. June 2025.

It couldn't feel like a barrier. It had to feel like protection.

I turned to Robert Cialdini's six principles of behavioral persuasion: Scarcity, Reciprocity, Social Proof, Authority, Liking, and Consistency. Each principle offers a different psychological frame for the same ask. I designed a multi-variant A/B testing framework that ran experiments across authentication types simultaneously in multiple global markets. The hypothesis was simple: different persuasion strategies would reduce friction and raise the likelihood that the user would agree to authenticate. What I didn't know going in was whether the same strategy would work in every market, or for every form of authentication. But by carefully coordinating the variables we tested, I ensured we could get a broader view of how the persuasion strategies were working.
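One common way to split traffic in a multi-variant test is deterministic hashing, sketched below. This illustrates the general technique only. It is not a description of how Amazon's experimentation tooling assigns treatments, and the function name, key format, and "control" arm are assumptions added for the example.

```python
import hashlib
from typing import Sequence

# Cialdini's six principles plus a hypothetical control arm (the existing prompt).
STRATEGIES = (
    "control",
    "scarcity",
    "reciprocity",
    "social_proof",
    "authority",
    "liking",
    "consistency",
)


def assign_variant(
    caller_id: str,
    market: str,
    auth_type: str,
    strategies: Sequence[str] = STRATEGIES,
) -> str:
    """Map a caller to one strategy, stably, within a (market, auth_type) experiment.

    Hashing the experiment key together with the caller ID means the same caller
    always hears the same variant, while each market and authentication type
    gets its own independent split.
    """
    key = f"{market}:{auth_type}:{caller_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(strategies)
    return strategies[bucket]


if __name__ == "__main__":
    # Hypothetical caller ID, market code, and authentication type.
    print(assign_variant("caller-123", "DE", "sms"))
```

Keeping market and authentication type in the hash key is what lets results be read per market and per authentication type rather than only in aggregate.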

 

THE FINDINGS

Multi-variant A/B experiment testing Scarcity and Social Proof persuasion strategies in anonymized authentication consent prompts. September 2025.

One size does not fit all markets.

When the weblab results came back, the data told a more complex story than I expected — and a more interesting one.

In the US, the Social Proof strategy ("Most callers find this quick and easy") actually increased drop-off and agent escalation.

In the UK and India, Scarcity ("We can only access your protected account information after verifying your identity") improved acceptance rates.

In Germany and Canada, Social Proof outperformed every other strategy.

This meant I had to go back to Product and Engineering with a finding that made their lives harder: we couldn't pick one winning strategy and scale it globally. We needed to localize by market. That's a significant technical lift, because the system has to support market-specific copy variations rather than a single global prompt. But the data justified it, and I made that case directly to stakeholders.
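A minimal sketch of what market-specific copy could look like as configuration, assuming a simple lookup keyed by market code. The two quoted prompts and the market pairings for the UK, India, Germany, and Canada come from the findings above; the market codes, the neutral fallback wording, and the function name are hypothetical. The US is left on the fallback here because the findings above only say which strategy hurt in that market.

```python
from typing import Dict

# Prompt copy per strategy. "neutral" is hypothetical fallback wording.
PROMPTS: Dict[str, str] = {
    "scarcity": "We can only access your protected account information after verifying your identity.",
    "social_proof": "Most callers find this quick and easy.",
    "neutral": "To continue, we need to verify your identity.",
}

# Winning strategy per market, per the experiment findings described above.
MARKET_STRATEGY: Dict[str, str] = {
    "UK": "scarcity",
    "IN": "scarcity",
    "DE": "social_proof",
    "CA": "social_proof",
}


def consent_prompt(market: str) -> str:
    """Return the consent prompt for a market, falling back to neutral copy."""
    strategy = MARKET_STRATEGY.get(market.upper(), "neutral")
    return PROMPTS[strategy]


if __name__ == "__main__":
    for code in ("UK", "DE", "US"):
        print(code, "->", consent_prompt(code))
```

The design point is the fallback: a market with no tested winner gets safe default copy instead of inheriting another market's strategy.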

Phase 2 experiment testing Scarcity and Liking strategies in multichannel unauthenticated consent prompts across five global markets. November 2025.

We implemented Scarcity testing for the US market and continued experimenting with other strategies internationally. The framework I built was designed to be reusable, so future teams could keep testing new persuasion strategies without rebuilding the testing architecture from scratch.

 

THE RESULTS

Small percentages. Enormous scale.

Amazon Customer Service markets where authentication persuasion experiments were conducted and planned, with Amazon domestic distribution center locations. December 2025.

My changes were still being implemented when I left Amazon, but the early numbers were already significant:

  • ~3% improvement in authentication acceptance rates

  • ~6% reduction in negative customer sentiment

  • Hundreds of thousands of customer interactions affected annually

  • Reduced human-agent escalation across multiple markets

At Amazon's call volume, a 3% improvement in authentication acceptance isn't a rounding error. It's a meaningful reduction in agent load, customer frustration, and call abandonment, every single day, across multiple continents.
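To illustrate why a small lift matters at scale, here is a sketch of a standard two-proportion z-test run on the same lift at two sample sizes. Every number in it is an assumption made for the example: the 50% baseline, the reading of "3%" as three percentage points, and both sample sizes. None of it is experiment data.

```python
import math
from typing import Tuple


def two_proportion_z(
    accepted_a: int, total_a: int, accepted_b: int, total_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two acceptance rates."""
    rate_a = accepted_a / total_a
    rate_b = accepted_b / total_b
    pooled = (accepted_a + accepted_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical: 50% control vs 53% treatment acceptance, 500 callers per arm.
    z_small, p_small = two_proportion_z(250, 500, 265, 500)
    # The same hypothetical rates with 50,000 callers per arm.
    z_large, p_large = two_proportion_z(25_000, 50_000, 26_500, 50_000)
    print(f"500 per arm:    z={z_small:.2f}, p={p_small:.3g}")
    print(f"50,000 per arm: z={z_large:.2f}, p={p_large:.3g}")
```

With a few hundred callers per arm, a lift this size cannot be told apart from noise. With tens of thousands it is unambiguous, which is why small percentages are worth chasing at high call volume.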

The bigger finding was the one I didn't anticipate: market context directly shapes how customers interpret persuasion. What reads as reassuring in Germany reads as pressure in the US. Designing for that (not homogenizing globally) is how you actually get results at scale.

What this project taught me about design at scale

This project taught me that voice design, and increasingly LLM-powered design, isn't just about what users do. It's about how they interpret framing. The words you choose aren't decoration; they're the mechanism. Change the frame, change the behavior.

It also taught me something about organizational constraints. The "right" answer (moving authentication later in the flow) was architecturally sound but organizationally impossible in the short term. A designer who can only operate in ideal conditions isn't useful at a company like Amazon. You have to be able to find the lever that's actually available to you, and pull it hard enough to matter.

The persuasion framework I built didn't solve the two-minute problem. But it chipped away at the hardest part of it — and left behind a system that could keep chipping.

