Voice cloning technology can now replicate a person's voice from three seconds of audio. The average listener can no longer reliably distinguish a cloned voice from a real one. That is a problem you cannot train your way around – if you're training people to listen.
The organizations getting this right have stopped teaching employees to evaluate the voice. They're teaching them to evaluate the call.
That shift sounds small. It's not. The voice is the one thing attackers can now manufacture on demand. The structure of the call – the urgency, the secrecy, the ask that bypasses your normal process – is the one thing they can't hide. And that structure is consistent across virtually every vishing attack on record, whether the target was a Fortune 500 AP team, a healthcare IT desk, or a cryptocurrency exchange. The patterns are not random. They are a playbook. And your employees can learn to recognize it.
Here are the four structural red flags that show up before, during, and after a vishing call – none of which require your employees to recognize a cloned voice.
Red Flag 1: The call creates urgency that can't wait for your normal process
This is the single most reliable signal in social engineering. Not a weird voice. Not a foreign accent. Urgency that demands you act before you verify.
In 95.3% of vishing attacks, the caller uses some combination of authority and urgency to move the target past their instinct to pause. The formula is consistent: a high-status person (your CFO, your IT director, a vendor you trust) needs something time-sensitive enough that following your normal process would cause a problem. The wire needs to go out before the market closes. The password reset needs to happen before the audit starts. The access code needs to be shared before the CEO lands.
The 2023 MGM Resorts attack – which cost the company an estimated $100 million and disrupted operations for about ten days – started with a ten-minute call to the IT help desk. The caller claimed to be an employee locked out of their account. The request was urgent. The help desk complied. There was no sophisticated technology involved. There was no zero-day exploit. There was a caller who understood that urgency collapses verification.
The behavior to train is not "be suspicious of urgency." That's too vague to act on. The behavior is: any request that asks you to skip or abbreviate your normal verification process is, itself, a red flag – regardless of who is asking and regardless of what happens if you don't. If your AP team has a two-person approval process for wire transfers, a caller asking them to bypass it is the signal. Not the voice. The request.
Most security awareness programs teach employees to recognize a phishing email. They don't build the reflex for what to do when the urgency feels real and the voice sounds right. That reflex only comes from having been in that position before – under pressure, with a convincing caller, in a simulation that makes the stakes feel real. Knowing the rule isn't the same as having the muscle memory to apply it under pressure.
Red Flag 2: The call asks for secrecy
Legitimate requests don't ask you to keep them quiet.
This sounds obvious written down. It doesn't feel obvious in the moment, when the caller is your CFO's voice and the explanation is plausible – an acquisition that isn't public yet, a regulatory matter that needs to stay internal, a personnel issue that HR hasn't announced. Attackers know that authority plus a plausible reason for secrecy is enough to override most people's instinct to loop in a colleague.
In the Scattered Spider campaigns that compromised over 130 organizations in 2023 and 2024, attackers routinely asked their targets not to discuss the call with IT or management until the issue was resolved. The reason given varied by target. The instruction to stay quiet did not.
The behavior your employees need is a binary rule, not a judgment call: if a caller asks you not to tell anyone about the call or the request, the call ends. Full stop. Not "use your judgment about whether the reason sounds legitimate." The presence of a secrecy request is the disqualifier. No legitimate internal process – no compliance audit, no executive action, no HR matter – requires an employee to keep a call secret from their own security team.
Training this behavior means putting employees in a scenario where the secrecy request is buried inside a plausible, high-pressure conversation – not presented as an obvious red flag. The employee who can pause a convincing call and say "I'll need to loop in my manager before I go further" has the skill. The employee who has only read about it does not.
Red Flag 3: The call moves you to a channel you didn't initiate
Multi-channel attacks are now the norm. CrowdStrike's 2025 Global Threat Report documented a 442% increase in vishing between the first and second halves of 2024, and the majority of sophisticated campaigns involve more than one channel – a phishing email that references an upcoming call, a Slack message that precedes a voice request, a Teams message that follows up on a voicemail.
The pattern is deliberate. Each channel makes the next one feel more legitimate. If you already got an email from "IT security" warning you about suspicious activity on your account, the follow-up phone call from "IT security" feels like confirmation, not escalation.
The red flag isn't that a call came in. It's that the call is asking you to move somewhere – to a different number, a different platform, a different system – that you didn't initiate. Attackers create context across channels so the eventual ask feels like the conclusion of a process rather than the beginning of an intrusion.
The behavior: verify by going back to the source, not forward to where the caller is directing you. If someone calls claiming to be from IT and asks you to log into a system or call a number to confirm your identity, hang up and call IT back directly using the number you already have. If someone emails and says to expect a follow-up call, call IT proactively before that call arrives. The direction of initiation matters. If you didn't start the process, you don't know who you're talking to.
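As a rough illustration of "back to the source, not forward," the rule can be sketched as a lookup that never uses anything the caller supplied. This is a minimal sketch, not a real system: the `DIRECTORY` contents, the team names, and the phone numbers are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical internal directory: team name -> number you already have on file.
DIRECTORY = {
    "it-help-desk": "+1-555-0100",
    "accounts-payable": "+1-555-0101",
}


@dataclass
class InboundRequest:
    claimed_team: str                             # who the caller says they are
    caller_supplied_number: Optional[str] = None  # callback number the caller offered


def verification_number(request: InboundRequest) -> Optional[str]:
    """Return the number to call back, taken only from the directory.

    The caller-supplied number is deliberately ignored. If the claimed
    team is not in the directory, return None so the employee escalates
    instead of dialing whatever the caller provided.
    """
    return DIRECTORY.get(request.claimed_team)


request = InboundRequest("it-help-desk", caller_supplied_number="+1-555-9999")
print(verification_number(request))  # the directory number, never the caller's
```

The design choice is that the function has no code path that returns `caller_supplied_number`. The field exists only so the request can be logged.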
This is the behavior that's nearly impossible to train via a one-time module. It requires employees to experience a multi-channel sequence – email, then call, then request – and learn to recognize the shape of the campaign, not just the individual message.
Red Flag 4: The call asks for something that only works once, immediately
Wire transfers. Password resets. MFA codes. Credential confirmations. These requests share a structural feature: they cannot be undone. And attackers know that the window between compliance and discovery is their operating margin.
The AFP's 2025 Payments Fraud and Control Survey found that 79% of organizations experienced attempted or actual payment fraud in 2024. Callback phishing – where an attacker convinces a target to initiate a call to a number the attacker controls – now appears in 43% of business email compromise attacks. The callback format works specifically because it exploits the employee's sense that they initiated the verification. They called the number. It must be legitimate.
Irreversibility is the tell. Not the specific ask, but the fact that the ask cannot be paused, reversed, or reviewed after the fact. A caller requesting an MFA code that expires in thirty seconds is using time pressure to eliminate the verification window. A caller asking for a wire transfer to a new account before end of business is using deadline pressure to do the same thing.
The behavior: any irreversible action requires out-of-band verification, always, without exception. Out-of-band means a channel you initiate, using contact information you already have, independent of anything the caller provided. Not a callback number they give you. Not a Teams message they send you to confirm. A call to the number in your directory. A message to the colleague's known address. The inconvenience is the point. Friction is your defense.
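The four behaviors can also be written down as a small decision function, which makes the point that none of them depends on how the voice sounds. This is a hedged sketch only: the `CallObservation` fields and the action names are hypothetical, and the response to the first and third flags (stop and verify through the directory) is an assumption drawn from the surrounding advice, not a prescribed policy.

```python
from dataclasses import dataclass


@dataclass
class CallObservation:
    asks_to_skip_process: bool      # Red Flag 1: urgency that bypasses normal verification
    asks_for_secrecy: bool          # Red Flag 2: "don't tell anyone about this call"
    redirects_to_new_channel: bool  # Red Flag 3: moves you somewhere you didn't initiate
    irreversible_action: bool       # Red Flag 4: wire, password reset, MFA code, etc.


def triage(call: CallObservation) -> str:
    """Map structural red flags to an action. Voice quality is not an input."""
    if call.asks_for_secrecy:
        # Binary rule: a secrecy request ends the call, whatever the reason given.
        return "end_call"
    if call.asks_to_skip_process or call.redirects_to_new_channel:
        # Assumed response: stop and call back on a number you already have.
        return "stop_and_verify_via_directory"
    if call.irreversible_action:
        # Irreversible actions always get out-of-band verification.
        return "verify_out_of_band_before_acting"
    return "follow_normal_process"


urgent_wire = CallObservation(True, False, False, True)
print(triage(urgent_wire))
```

Note that the function takes no argument for who the caller claims to be. The decision depends only on the structure of the request.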
Why your current training isn't building these reflexes
Each of the four red flags above is structural. None of them require your employees to detect a cloned voice. All of them require your employees to recognize a pattern under pressure – and that pattern only becomes instinct through exposure, not education.
Most security awareness programs teach employees what a vishing attack is. They describe the red flags. They explain the concepts. None of that is wrong. And almost none of it creates the reflex that fires when the call feels urgent, the voice sounds right, and the ask has a plausible reason behind it.
The gap is simulation. Specifically, voice phishing simulation that reproduces the structural conditions of a real attack – not a scripted exercise employees can smell from a distance, but a scenario built around your org's real context: your CFO's name, your IT help desk's actual workflow, your AP team's payment approval process, the vendors your finance team talks to every week. The scenario has to feel real for the reflex to form.
And when an employee fails – when they give the code, approve the wire, or agree to the secrecy request – what happens next determines whether that failure becomes a learning moment or a statistic. A generic reminder three days later does not make it a learning moment. A training module built around exactly what that employee fell for, delivered immediately, does.
How Frame builds the reflex, not just the knowledge
Frame's vishing simulations are built from your organization's actual environment – your org chart, your vendors, your workflows – not a generic script library. Your AP team gets a call from a cloned voice that introduces itself with your CFO's name and title. Your IT help desk gets an employee impersonation that mirrors your real identity verification workflow. Your senior developers get a "partnership call" that follows an email your security team didn't send.
Every simulation is designed around the structural patterns above: urgency, secrecy, channel-switching, irreversible asks. Not because those are interesting training topics, but because they are the consistent architecture of virtually every successful vishing attack on record. Employees who have experienced that architecture in a safe environment recognize it when it arrives for real.
When someone falls for a Frame vishing simulation, their next training is generated automatically around what they missed – not a generic module on social engineering, but a scenario built around the specific red flag they didn't catch. The reflex that didn't fire gets trained directly, immediately, before the real call arrives.
You can't teach your employees to detect a cloned voice. You can teach them to recognize that a call demanding urgency, secrecy, a channel they didn't initiate, and an irreversible action is not a call from your CFO – no matter how much it sounds like one. But only if your training reproduces the pressure they'll be under when it actually matters.
Book a demo and see how Frame builds voice phishing simulations around your actual organization – in minutes.


