Deepfakes, Voice Impersonators Used in Vishing-as-a-Service
High Demand for Vishing Services in Cybercrime Forums
While organizations are grappling with ways to tackle what some researchers say is a 35% spike in email phishing attacks, cybercriminals have upped the ante by moving to more sophisticated techniques - using advanced deepfake and voice impersonation technologies to bypass voice authorization mechanisms and to carry out voice phishing, or vishing, attacks.
Photon Research Team, the research arm of digital risk protection firm Digital Shadows, tells Information Security Media Group that cybercriminals are taking vishing to the next level, using deepfake audio or video technology to make impersonation attempts appear as credible as possible.
The research also found cybercriminals weaponizing commercially available deepfake and voice impersonation tools.
With the help of deepfake audio technology, the researchers say, threat actors can now impersonate their targets and bypass security measures such as voice authentication mechanisms to authorize a fraudulent transaction or spoof the victims' contacts to gather valuable intelligence.
More banks are using voice authorization to authenticate customers. According to a report by biometrics research organization Biometric Update, the voice biometrics market is growing at a 22.8% compound annual growth rate and will be worth $3.9 billion in 2026.
The Photon researchers observed attackers frequently targeting bank account holders with prerecorded messages purporting to originate from their bank, urging them to provide their account credentials over the phone.
Although vishing is not new, Photon researchers point out that advancements in deepfake technologies are making vishing attempts look more credible than they have ever been.
"While there are limits to how far AI can go, the voice impersonation tools available on the open market today are very sophisticated," the Photon Research team says.
The researchers tell ISMG that deep learning AI technology can create very realistic deepfakes, but add that the cost of the impersonation increases in line with its credibility.
So, can the cost barrier stop smaller cybercrime groups from getting in on the action?
What the researchers say suggests otherwise: "If an attacker obtains the right sample during the reconnaissance phase of an attack, they may not need voice impersonation tools, which are too expensive and complex for many cybercriminals. Instead, an attacker may edit their sample to produce whatever sounds they are looking for."
Security measures may require the person being authenticated to say their name, date of birth or a predetermined phrase. In this instance, all an attacker needs to do is play the prerecorded phrase.
The Photon Research Team points out that, unlike the "spray and pray" approach observed in conventional phishing attacks, most successful vishing attacks target individuals rather than companies.
Cybercriminals zero in on a specific person or a group of people who are high-value individuals or people with significant access.
In June 2015, the Times of Israel reported how Thamar Eilam Gindin, an Israeli professor, was the target of a vishing and spear-phishing attack in which an Iranian cybercriminal posed as a BBC Persian correspondent seeking an interview. Gindin was tricked into sharing her email credentials to access a Google Drive document. The attackers then used her credentials to access her social media account and sent malware to her contacts.
In July 2019, The Wall Street Journal wrote about how cybercriminals impersonated the CEO of a UK-based energy company using a voice cloning tool in a successful attempt to receive a fraudulent money transfer of $243,000.
In 2020, SecurityWeek reported that APT-C-23, part of the Hamas-linked Molerats group, targeted Israeli soldiers on social media with fake personas of Israeli women, using voice-altering software to produce convincing audio messages of female voices. The messages reportedly encouraged the Israeli targets to download a mobile app that would install malware on their devices, giving complete control to the attackers.
The Photon researchers say that although voice impersonation services on cybercriminal forums appeal to a small, specialized group of attackers, they have come across a Russian-speaking cybercrime forum in which a section was solely dedicated to vishing-related services.
The researchers found a threat actor advertising a vishing service that creates, clones and hosts customized voice robots, including a "sophisticated telephone interactive response system."
Based on their observations, the researchers say that there's no dearth of customers looking for vishing operators. Some have a specific target they've identified - for instance, one customer wanted to target a cryptocurrency specialist in a multistage social engineering attack.
The base price for vishing-related services is $1,000. Additional customization comes at an added cost.
How Cybercriminals Use Vishing
"Deepfake technology can alter or clone voices in real time, resulting in the artificial simulation of a person’s voice," the Photon researchers say.
To choose their targets, the researchers say cybercriminals use open-source intelligence techniques, engage in active and passive scanning for open ports and vulnerable devices, or sift through leaked databases containing compromised credentials.
Researchers say that after zeroing in on the target, an attacker may pose as a buyer to speak to an authority figure in the target organization and ask innocent questions to elicit a voice sample. They will likely record the conversation and use the sample as a reference to be imitated or spliced later.
Sometimes, simple voice impersonation proves to be insufficient; the Photon Research Team cites an instance in which an attempted vishing attack ran into an obstacle when the targeted company instructed the attackers to send a document to a corporate email address.
Faced with the unexpected impediment, the attackers offered to pay 1,000 rubles - roughly $13.50 - to social engineers on the criminal forum who could trick the victim into clicking on a link he or she would receive on the phone. The researchers call this a "hybrid tactic" - when a vishing attack is used in combination with a regular phishing attack.
The researchers highlight a massive hybrid operation in 2020 in which attackers targeted the new hires of companies by claiming to be IT support, offering to troubleshoot VPN access issues. The attackers succeeded in getting the VPN credentials of unsuspecting new hires either over a "troubleshooting" phone call or by having them enter their credentials on a spoofed VPN access portal.
Weaponizing Voice Impersonation Tools
The Photon Research Team says that it has come across cybercriminals discussing commercially available software that is designed to alter voices to improve human-machine interaction in role-playing games, or RPGs.
Advanced software, such as AV Voice Changer Software Diamond, can be purchased for $99.95, while others such as Voicemod are available for free.
The Photon Research Team says that deepfake technologies and voice impersonation tools are being used not only to carry out cyberattacks, but also to spread disinformation. And this has not gone unnoticed by the authorities.
CNN Business reported that the Pentagon, through the Defense Advanced Research Projects Agency, or DARPA, is collaborating with major research institutions to develop technology to detect deepfakes.