Image by melojordan94 from Pixabay’s Free Use Content License.
Among the many advancements in AI technology today, South Korean entertainment company HYBE has added what it calls “voice synthesis technology” to the mix. Supertone, an AI voice replication software startup led by president Kyogu Lee, has been developing this tech for HYBE since HYBE acquired a majority stake in the startup in a $32 million deal in 2023.
HYBE Chairman Bang Si-Hyuk states this acquisition brings HYBE closer to its goal of creating “a hyper-realistic and expressive voice that [is not] distinguishable from real humans.”
“I have long doubted that the entities that create and produce music will remain human,” notes Bang Si-Hyuk. “I don’t know how long human artists can be the only ones to satisfy human needs and human tastes. And that’s becoming a key factor for my operation and a strategy for HYBE.”
Kyogu Lee, with his expertise in AI and a PhD in Computer-Based Music Theory and Acoustics from Stanford University, states Supertone’s tech stands out among other AI audio emulators because it is “theoretically capable of creating an infinite number of new and original voices, as well as recreating existing voices.”
NANSY, which stands for Neural Analysis & Synthesis and serves as the pillar of Supertone’s speech tech, “has the special ability to divide and re-assemble voice components (timbre, linguistics, pitch, and loudness) individually and independently, generating natural-sounding voices with unparalleled realism,” Lee explains.
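To make the idea of dividing and re-assembling voice components more concrete, here is a minimal sketch in Python. It is not Supertone’s NANSY, which is a neural model whose internals are not described here. It is a toy signal-processing example that assumes a pure sine tone in place of a voice, measures only two of the four components Lee lists (pitch and loudness), and rebuilds the tone with one component changed while the other stays the same. All function names, the sample rate, and the pitch search range are hypothetical choices made for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; an assumed value for this toy example


def synthesize(pitch_hz, loudness_rms, n_samples=800):
    """Build a sine tone from two independent controls: pitch and loudness."""
    amplitude = loudness_rms * math.sqrt(2)  # a sine's RMS is amplitude / sqrt(2)
    return [
        amplitude * math.sin(2 * math.pi * pitch_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def estimate_loudness(signal):
    """Loudness component: root-mean-square level of the signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def estimate_pitch(signal, min_hz=50, max_hz=500):
    """Pitch component: the lag with the strongest autocorrelation, in Hz."""
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz

    def autocorr(lag):
        return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return SAMPLE_RATE / best_lag


# "Divide": analyze a stand-in voice into separate pitch and loudness values.
original = synthesize(pitch_hz=200, loudness_rms=0.5)
pitch = estimate_pitch(original)
loudness = estimate_loudness(original)

# "Re-assemble": raise the pitch by an octave and leave loudness untouched.
shifted = synthesize(pitch_hz=pitch * 2, loudness_rms=loudness)

print("original pitch (Hz):", pitch, "loudness (RMS):", round(loudness, 3))
print("shifted pitch (Hz):", estimate_pitch(shifted),
      "loudness (RMS):", round(estimate_loudness(shifted), 3))
```

Timbre and linguistic content, the other two components Lee names, are much harder to isolate and are left out of this sketch entirely.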
Supertone recently showcased its AI’s capabilities by recreating the voice of Kim Hyuk Gun, the vocalist of the popular Korean band The Cross, who was paralyzed in an accident. Lee notes, “We collected 20 years of his voice data since debut and used it to train an AI voice in his unique vocal style.”
Supertone’s voice synthesis technology does more than replicate voices, however. As Lee comments, “Supertone’s multilingual pronunciation correction technology unlocks new avenues for artists to communicate with local fans in their native language, reaching out to the global market.” This means major artists like Ariana Grande or Olivia Rodrigo could release singles across languages in their own voices within a day.
Lee adds, “We hope this collaboration will establish a constructive precedent for AI technology supporting artists in overcoming language barriers to connect with global fans and broaden their musical spectrum.”
Given that, up until 2022, HYBE’s greatest revenue driver was what it refers to as “Artist Indirect-Involvement,” wherein the company earns revenue from an artist’s brand or likeness alone, Lee was also asked whether he believed the tech would be used to replicate popular artists like BTS, whom HYBE manages. He stated:
“While Supertone is theoretically capable of creating an infinite number of new and original voices, as well as recreating existing voices, we are devoted to prioritizing the rights of all artists and creators, including those under HYBE.
“Our focus with HYBE artists lies in facilitating seamless communication and interaction with global audiences, transcending all barriers, including language and geography.”
Supertone has also applied this technology to a real-time voice changer called “Supertone Shift,” which enables users to switch between and customize a number of predefined voices.
Lee sees the Supertone Shift “as the ultimate creative tool for a diverse range of content creators, including VTubers, livestreamers, podcasters, and gamers, enhancing the versatility and quality of their outputs.”