Taiwan Tech Students Develop an App to Prevent AI Voice Fraud and Protect Voice Rights
In recent years, deepfake technology has advanced rapidly, and criminals now use AI voice synthesis to commit fraud, posing significant risks to society. To address this issue, four students from the Department of Information Management at Taiwan Tech (Wen-Ya Wang, Ting-Yu Tsai, Yu-Huan Chen, and Shih-Hsin Mao) developed the “VoiceGuard System”, which applies audio watermarking to counter adversarial attacks, along with deep learning techniques for voice protection and authenticity verification. Their “VoiceGuard App” won second place in the Cybersecurity Application category and third place in the Information Application category at the International ICT Innovative Services Awards 2024.

Team leader Wen-Ya Wang said her inspiration for designing a voice protection system came from personal experience: friends and family had received scam calls that tried to draw them into conversation, and news reports on voice fraud reinforced her concern. With this in mind, she set out to use technology to safeguard voice rights. The “VoiceGuard App” offers two key functions: voice authenticity verification and digital audio protection. It can distinguish between AI-generated and real human voices, and it can embed a unique “audio watermark”, an imperceptible frequency or marker, to prevent voice manipulation by AI software. The audio watermark also serves as a foundation for voice copyright protection.
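The article does not describe how the team's watermark is constructed. As a rough illustration of the general idea of an audio watermark used as an ownership marker, the sketch below uses a textbook spread-spectrum scheme: a low-amplitude pseudo-random sequence derived from a secret key is added to the samples, and its presence is later checked by correlation. The function names, the key, and the strength value are all hypothetical, and this toy does not attempt the anti-synthesis protection the app is reported to provide.

```python
import math
import random


def pseudo_noise(key, n):
    """Return n values of +1.0 or -1.0 derived deterministically from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed_watermark(samples, key, strength=0.02):
    """Add a low-amplitude key-dependent sequence to the audio samples."""
    pn = pseudo_noise(key, len(samples))
    return [s + strength * c for s, c in zip(samples, pn)]


def watermark_score(samples, key):
    """Correlate the audio with the key's sequence.

    The score is close to the embedding strength when the mark is present
    and close to zero when it is absent or the key is wrong.
    """
    pn = pseudo_noise(key, len(samples))
    return sum(s * c for s, c in zip(samples, pn)) / len(samples)


# Stand-in for a voice recording (assumption): two seconds of a 440 Hz tone at 16 kHz.
rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(2 * rate)]
marked = embed_watermark(clean, key=2024)

print("marked, right key:", watermark_score(marked, key=2024))
print("clean,  right key:", watermark_score(clean, key=2024))
print("marked, wrong key:", watermark_score(marked, key=7))
```

A detector would compare the score against a threshold (for example, half the embedding strength). Practical watermarks are shaped to stay inaudible and to survive compression and re-recording, which this sketch makes no attempt to do.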

The “VoiceGuard App” can add imperceptible noise or markers to audio files, preventing the voice from being reused or synthesized by AI software. The audio watermark also serves as a basis for protecting voice copyright.

The “VoiceGuard App” offers two main features: voice authenticity verification and digital audio protection, allowing users to upload audio files for either identification or protection.
For voice authenticity verification, the team collected publicly available human voice databases and AI-generated audio, then used detection models to integrate and classify the samples. They analyzed the features that differ between real and synthetic voices and ultimately trained a recognition system. Under clean background conditions, the system achieved a 99.99% accuracy rate for identifying real human voices and a 99.94% accuracy rate for AI-generated voices. Wen-Ya Wang mentioned that the collaborative development process not only met but exceeded her expectations, providing significant growth and valuable experience in both software and hardware skills.
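The two figures above are reported separately for each class. As a minimal sketch of how such per-class accuracy is usually computed (the function name and the six example labels are hypothetical and are not the team's data), each clip's predicted label is compared with its true label, and the correct fraction is taken within each true class:

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of clips with that true label that were predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical results for six clips: one human clip is misclassified as AI.
truth = ["human", "human", "human", "ai", "ai", "ai"]
preds = ["human", "human", "ai", "ai", "ai", "ai"]

result = per_class_accuracy(truth, preds)
print(result)
```

Reporting the classes separately matters because a single overall accuracy can hide a weakness on one class when the dataset is unbalanced.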

When using the “VoiceGuard App” for verification, the system will notify the user of the likelihood that the audio file is either a real human voice or AI-generated speech.
The app currently supports both audio file uploads and in-app recording, allowing users to verify voice authenticity or add watermarks. Wen-Ya Wang plans to further develop real-time voice recognition and protection features for live calls. Additionally, because the model has been trained primarily on English voice data, the team aims to expand its voice database by collecting more Chinese voice samples, improving support for Chinese speech to better meet the needs of Taiwanese users. In the future, the team hopes to evolve the “VoiceGuard App” into a compliance tool applicable in areas such as communications, voice copyright protection, biometric verification, and film production, making it a trusted tool for safeguarding voice rights.

During the development of the “VoiceGuard App”, visual tools such as waveforms and spectrograms were used to assist in analyzing audio.
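For readers unfamiliar with spectrograms, the following is a minimal sketch of how one is computed: the signal is cut into short overlapping frames, each frame is windowed, and the magnitude of its discrete Fourier transform gives the energy per frequency bin. The frame size, hop, and test tone are illustrative assumptions, the transform is written out naively using only the Python standard library, and nothing here reflects the team's actual tooling.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Magnitude spectrogram: one list of frame_size // 2 + 1 bin magnitudes per frame."""
    # Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT for bin k; a real tool would use an FFT library instead.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Test signal (assumption): a 1000 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(512)]

spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print("frames:", len(spec))
print("strongest frequency in first frame (Hz):", peak_bin * rate / 64)
```

Plotting the frames side by side, with time on one axis and frequency on the other, gives the familiar spectrogram image.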
Team member Shih-Hsin Mao shared that although he was initially unfamiliar with voice-related cybersecurity technologies, he enrolled in relevant courses to expand his knowledge and contribute to the development of the voice model. Yu-Huan Chen, a team member from Indonesia, admitted that she had not expected to win and was pleasantly surprised by the news. “Winning is an encouragement for me, reinforcing my determination to continue pursuing the field of cybersecurity,” she said. Team member Ting-Yu Tsai added that the team’s success in the competition has motivated her to face future challenges with more confidence and determination.
Assistant Professor Jheng-Jia Huang, the team’s advisor, pointed out that voice rights have been increasingly recognized in recent years. For example, the voices of public figures can be misused by criminals to create false statements, seriously damaging their reputation. In Japan, some voice actors have had their voices stolen, affecting their professional work. However, voice copyright protection laws are still incomplete.
Jheng-Jia Huang also mentioned that the technology used in the “VoiceGuard App” meets a high market demand, but that developing a credible system involves several challenges, including the collection of voice samples and model construction. In advising the team, Professor Huang encouraged the students to identify real-life problems and solutions, maintain their creativity and passion, and gradually bring their ideas to life, so that they can become the professionals the industry needs.

The team’s outstanding performance also led to an invitation for them to visit companies and participate in events where they showcased and shared their competition project.
