
Defending the Digital Front: A Vishing Simulation

Where Evidence-Based Design meets AI Voice: Training the human firewall to resist sophisticated vishing attacks.

Role: Learning Experience Designer & Voice Architect
The Tech Stack:
Retell.ai (Conversation Logic)
ElevenLabs (Speech-to-Speech)
Replit (Custom Middleware & API Integration)
Visual Storyboarding: Gemini 3 Pro Image (Nano Banana Pro Custom Workflow)
Mind map showing the behavioral path from a vishing trigger to secure out-of-band verification.

The Performance Challenge

  • The Problem: Sophisticated vishing lets attackers simulate authentic emotional distress, tricking employees into sharing protected information. The resulting breakdown in 'out-of-band' verification creates a critical security vulnerability and leads to significant unauthorized financial transfers.

  • The Action Map: To bridge the vishing performance gap, I collaborated with cybersecurity experts to isolate the critical high-stakes actions required of front-desk staff. Using MindMup, I mapped a behavioral path that prioritizes out-of-band verification and emotional regulation, ensuring security protocols remain intact even during high-pressure scenarios.

  • The Target Goal: Achieve a 25% increase in successful 'out-of-band' verification by Front-Desk/Finance Admins by the end of Q4, directly mitigating organizational financial loss by preventing unauthorized data disclosure during simulated pressure events.

Action Map: Isolate the critical behaviors required to transition from emotional reaction to protocol-driven response.

The SME Insights & Discovery

  • The Behavioral Gaps:

    • The Helpfulness Reflex: Employees often prioritize "Customer Success" and conflict resolution over security protocols when a caller sounds distressed.

    • MFA Bypass under Pressure: When faced with a perceived "emergency" (like a service disconnection), staff are 40% more likely to skip out-of-band verification to "fix" the problem quickly.

    • Authority Bias: Scammers use technical jargon or reference "upper management" to create a false sense of hierarchy, making the employee feel that questioning the caller is a performance risk.

    • Rapport-Based Trust: Scammers leverage "forced empathy" and shared personal details to create an artificial sense of connection. This social bond makes employees feel that following strict verification protocols would be "rude" or "distrustful," leading to a breakdown in professional boundaries.

The Strategy: Larry wasn't designed to be a "villain," but a pressure cooker.

I intentionally developed Larry as a fast-talking, distressed utility representative to target the 'Social Compliance' gap. By using ElevenLabs Speech-to-Speech, I was able to give Larry authentic vocal tremors and an urgent cadence. This forces the learner to choose between their instinct to be 'helpful' to a person in crisis and their professional obligation to follow security MFA protocols. Larry isn't a test of knowledge; he’s a test of emotional regulation under fire.

  • Designing the Antagonist: "Late Payment Larry"

The Solution

  • Conversational Simulation:

Learners engage in a real-time voice call with 'Late Payment Larry,' a fast-talking utility representative claiming an immediate service disconnection is imminent. This high-fidelity simulation requires the learner to manage their emotional response and stick to security protocols despite Larry's increasingly desperate tone.

  • Design Choice Highlights:

    • Leveraged ElevenLabs Speech-to-Speech to capture authentic emotional stress cues.

    • Implemented Retell.ai to allow for dynamic, off-script learner responses.

    • Engineered the technical infrastructure on Replit, enabling the complex API 'handshakes' required for a stable, high-fidelity voice experience.

The Pedagogy

  • Feedback Loops: I utilized a 'Mentor Voice' to provide consequence-based feedback. If a learner fails to verify Larry's identity, the simulation doesn't just say 'Incorrect'; it demonstrates the organizational impact, such as a simulated notification of an unauthorized fund transfer. This creates a tight feedback loop between action and consequence.

  • Scaffolding: To prevent cognitive overload, I included a 'Think' prompt before high-stakes decision points. This brief pause encourages the learner to shift from 'System 1' intuitive reacting to 'System 2' deliberate processing, giving them the mental space to recall the correct MFA protocols.

3-panel storyboard: Larry's voice waveform, user choosing to help, and a red security alert pop-up.

Direct Feedback Loop: The simulation bypasses generic 'Incorrect' prompts, instead triggering immediate narrative consequences—like an unauthorized transfer alert—to reinforce the stakes of protocol failure.
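
The consequence-based feedback described above can be pictured as a lookup from the learner's action to a narrative outcome. The following is only a minimal sketch, not the simulation's actual logic: the `LearnerAction` values, the `Feedback` fields, and all of the wording are hypothetical names and text chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class LearnerAction(Enum):
    """Hypothetical outcomes of a call with Larry (assumed for illustration)."""
    VERIFIED_OUT_OF_BAND = "verified_out_of_band"
    SHARED_INFORMATION = "shared_information"


@dataclass(frozen=True)
class Feedback:
    passed: bool
    mentor_line: str   # what the Mentor Voice says to the learner
    consequence: str   # narrative event shown to the learner ("" if none)


# Assumed mapping: a failure triggers a narrative consequence
# instead of a bare "Incorrect" message.
FEEDBACK_BY_ACTION = {
    LearnerAction.VERIFIED_OUT_OF_BAND: Feedback(
        passed=True,
        mentor_line="You paused and verified through a separate channel.",
        consequence="",
    ),
    LearnerAction.SHARED_INFORMATION: Feedback(
        passed=False,
        mentor_line="Larry's urgency pulled you past verification.",
        consequence="ALERT: Unauthorized fund transfer initiated.",
    ),
}


def feedback_for(action: LearnerAction) -> Feedback:
    """Return the consequence-based feedback for a learner action."""
    return FEEDBACK_BY_ACTION[action]
```

Keeping the mapping in one table makes it straightforward to review the consequences with SMEs without touching the call logic.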

The Results & Reflection

  • Success Metrics (Projected): Although developed as a high-fidelity prototype, this simulation targets a 25% reduction in successful vishing attempts by replacing passive 'awareness' with active 'stress-testing.' Success is measured by the learner’s ability to initiate out-of-band verification without being prompted, even when faced with high-pressure emotional cues.

  • Lessons Learned: The Design-First Reset. Early versions of this project focused on the script of a scam. However, a 'Design-First' reset led me to realize that the real failure point wasn't a lack of knowledge; it was emotional hijacking. This pivot from a text-based quiz to an AI-powered voice performance transformed the project from a standard training module into a true behavioral intervention.
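
One way to make the success measure above concrete (unprompted out-of-band verification) is to score call transcripts. This is a rough sketch under stated assumptions rather than the project's actual scoring: the `Turn` structure, the speaker labels, and the phrase list are all hypothetical, and simple phrase matching stands in for whatever intent detection a real deployment would use.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Turn:
    speaker: str  # assumed labels: "learner", "larry", or "mentor"
    text: str


# Hypothetical phrases that signal an out-of-band verification attempt.
VERIFICATION_PHRASES = (
    "call you back",
    "official number",
    "number on file",
)


def verified_unprompted(turns: Sequence[Turn]) -> bool:
    """True if the learner initiates verification before any mentor prompt."""
    for turn in turns:
        if turn.speaker == "mentor":
            return False  # the learner needed a prompt first
        if turn.speaker == "learner":
            text = turn.text.lower()
            if any(phrase in text for phrase in VERIFICATION_PHRASES):
                return True
    return False


def unprompted_rate(sessions: Sequence[Sequence[Turn]]) -> float:
    """Share of sessions in which verification was initiated unprompted."""
    if not sessions:
        return 0.0
    return sum(verified_unprompted(s) for s in sessions) / len(sessions)
```

Comparing this rate before and after training would give one possible baseline for the projected improvement.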

Evolution of Design

From Script to Simulation: The Iterative Pivot

This project began as a standard branching scenario script, but SME discovery revealed that the 'gap' was emotional, not informational. By iterating from text to high-fidelity voice AI using Speech-to-Speech technology, I was able to bridge that gap. This section documents that pivot from 'knowing' the policy to 'performing' under pressure.

Text-based branching script showing a linear dialogue path.
Retell AI dashboard showing voice simulation configuration for Larry.

Early Draft

Final Draft

Phase 1: A static, text-based script focused on information delivery rather than behavioral change.

Phase 2: The final high-fidelity architecture in Retell.ai. By engineering complex system prompts and speech-to-speech parameters, I transitioned the project from a static script to a dynamic, emotionally responsive simulation.