15.ai
Type of site | Artificial intelligence, speech synthesis, machine learning, deep learning |
---|---|
Available in | English |
Founder(s) | 15 |
URL | 15 |
Commercial | No |
Registration | None |
Launched | Initial release: March 12, 2020; stable release: v24.2.1 / September 2021 |
Current status | Under maintenance |
15.ai is a
Launched in early 2020, 15.ai began as a
Credited as the impetus behind the popularization of AI
With the rising popularity of 15.ai, several commercial alternatives have emerged, leading to cases of misattribution and theft. In January 2022, it was discovered that Voiceverse NFT, a company with which voice actor Troy Baker had announced a partnership, had plagiarized 15.ai's work as part of its platform.[14][15][16]
On September 8, 2022, a year after its last stable release (v24.2.1), 15.ai was temporarily taken down in preparation for an upcoming update. As of April 2024, the site is still offline, though the project's Twitter avatar has since been updated.[17]
Features
Available characters include
The deep learning model used by the application is nondeterministic: each time speech is generated from the same string of text, the intonation of the speech will be slightly different. The application also supports manually altering the emotion of a generated line using emotional contextualizers (a term coined by this project): sentences or phrases that convey the emotion of the take and serve as a guide for the model during inference.[9][12][13] Emotional contextualizers are representations of the emotional content of a sentence, deduced via transfer-learned emoji embeddings using DeepMoji, a deep neural network sentiment analysis algorithm developed by the MIT Media Lab in 2017.[20][21] DeepMoji was trained on 1.2 billion emoji occurrences in Twitter data from 2013 to 2017, and has been found to outperform human subjects in correctly identifying sarcasm in Tweets and other online modes of communication.[22][23][24]
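To make the term "nondeterministic" concrete, the following is a toy illustration only; it is not 15.ai's model, and the function name `toy_pitch_contour`, the 200 Hz base pitch, and the Gaussian jitter are all assumptions chosen for the example. It shows the general idea that when inference involves random sampling, the same input text yields a slightly different result on each call, while fixing the random seed makes a take repeatable.

```python
import random


def toy_pitch_contour(text, rng=None):
    """Return one jittered pitch value (Hz) per word of `text`.

    Hypothetical stand-in for a sampling-based synthesizer: the random
    jitter plays the role of the run-to-run variation in intonation.
    """
    rng = rng or random.Random()  # unseeded by default, so every call differs
    base_hz = 200.0               # assumed base pitch, for illustration only
    return [round(base_hz + rng.gauss(0.0, 10.0), 1) for _ in text.split()]


line = "The cake is a lie"

# Two takes of the same line: same length, (almost surely) different values.
take_a = toy_pitch_contour(line)
take_b = toy_pitch_contour(line)

# Seeding the generator makes a take reproducible.
seeded_a = toy_pitch_contour(line, random.Random(15))
seeded_b = toy_pitch_contour(line, random.Random(15))
```

In this sketch, `take_a` and `take_b` will generally differ, whereas `seeded_a` and `seeded_b` are identical.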
15.ai uses a multi-speaker model—hundreds of voices are trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to such emotional context.
The application supports a simplified version of a set of English phonetic transcriptions known as
{AA1 R P AH0 B EH2 T}
to denote /ˈɑːrpəˌbɛt/, the pronunciation of the word ARPABET).[12] The following is a table of phonemes used by 15.ai and the CMU Pronouncing Dictionary:[30]
ARPABET | Rspl. | IPA | Example |
---|---|---|---|
AA | ah | ɑ | odd |
AE | a | æ | at |
AH0 | ə | ə | about |
AH | u, uh | ʌ | hut |
AO | aw | ɔ | ought |
AW | ow | aʊ | cow |
AY | eye | aɪ | hide |
EH | e, eh | ɛ | Ed |
ER | ur, ər | ɝ, ɚ | hurt |
EY | ay | eɪ | ate |
IH | i, ih | ɪ | it |
IY | ee | i | eat |
OW | oh | oʊ | oat |
OY | oy | ɔɪ | toy |
UH | uu | ʊ | hood |
UW | oo | u | two |
AB | Description |
---|---|
0 | No stress |
1 | Primary stress |
2 | Secondary stress |
ARPABET | Rspl. | IPA | Example |
---|---|---|---|
B | b | b | be |
CH | ch, tch | tʃ | cheese |
D | d | d | dee |
DH | dh | ð | thee |
F | f | f | fee |
G | g | ɡ | green |
HH | h | h | he |
JH | j | dʒ | gee |
K | k | k | key |
L | l | l | lee |
M | m | m | me |
N | n | n | knee |
NG | ng | ŋ | ping |
P | p | p | pee |
R | r | r | read |
S | s, ss | s | sea |
SH | sh | ʃ | she |
T | t | t | tea |
TH | th | θ | theta |
V | v | v | vee |
W | w, wh | w | we |
Y | y | j | yield |
Z | z | z | zee |
ZH | zh | ʒ | seizure |
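As a rough illustration of how the curly-brace notation and the tables above fit together, the following Python sketch extracts braced ARPABET groups from a line of text, splits each vowel into its symbol and stress digit, and looks up the IPA value from the tables. It is not 15.ai's parser: the function names (`parse_phoneme`, `extract_phonemes`, `to_ipa`) are hypothetical, ER is mapped to only one of its two listed IPA values, a vowel without a stress digit is accepted as an assumption, and no IPA stress or length marks are produced.

```python
import re

# IPA values copied from the phoneme tables above.
VOWELS = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
}
CONSONANTS = {
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "r", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

BRACED = re.compile(r"\{([^{}]*)\}")        # text inside one {...} group
PHONEME = re.compile(r"([A-Z]+)([012])?")   # symbol plus optional stress digit


def parse_phoneme(token):
    """Split a token such as 'AA1' into ('AA', '1'); consonants get None."""
    match = PHONEME.fullmatch(token)
    if match is None:
        raise ValueError(f"not an ARPABET token: {token!r}")
    symbol, stress = match.groups()
    if symbol in VOWELS:
        return symbol, stress
    if symbol in CONSONANTS and stress is None:
        return symbol, None
    raise ValueError(f"unknown or mis-stressed phoneme: {token!r}")


def extract_phonemes(text):
    """Return one list of (symbol, stress) pairs per {...} group in text."""
    return [[parse_phoneme(tok) for tok in group.split()]
            for group in BRACED.findall(text)]


def to_ipa(phonemes):
    """Join table IPA values; unstressed AH (AH0) becomes schwa."""
    out = []
    for symbol, stress in phonemes:
        if symbol == "AH" and stress == "0":
            out.append("ə")
        else:
            out.append(VOWELS.get(symbol) or CONSONANTS[symbol])
    return "".join(out)


groups = extract_phonemes("Say {AA1 R P AH0 B EH2 T} clearly.")
print(groups[0][0])       # ('AA', '1')
print(to_ipa(groups[0]))  # ɑrpəbɛt
```

The stress digits are kept alongside each vowel rather than discarded, since placing IPA stress marks would additionally require splitting the phonemes into syllables, which this sketch does not attempt.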
Background
Speech synthesis
In 2016, with the proposal of
For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of scientific researchers in the field of deep learning speech synthesis.[35][36] The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.[9][37]
Copyrighted material in deep learning
In a landmark 2013 case between Google and the Authors Guild, the court ruled that Google Books, a service that searches the full text of printed copyrighted books, was transformative and thus met all requirements for fair use.[38] The case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a discriminative model or a non-commercial generative model was deemed legal.[39] The legality of commercial generative models trained on copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegation of copyright infringement via direct competition would be difficult to prove.[39]
Development
15.ai was designed and created by an anonymous research scientist affiliated with the Massachusetts Institute of Technology known by the alias 15.[40] The project began development while the developer was an undergraduate. The developer has stated that they are capable of paying the high cost of running the site out of pocket.[9]
According to posts made by the developer on Hacker News, 15.ai costs several thousand dollars per month to operate; they are able to support the project thanks to a successful startup exit.[41] The developer has stated that during their undergraduate years at MIT, they were paid the minimum hourly rate (approximately $14 an hour in Massachusetts[42]) to work on a related project that eventually evolved into 15.ai. They also stated that the democratization of voice cloning technology is not the only function of the website; in response to a user asking whether the research could be conducted without a public website, the developer wrote:
[...] The website has multiple purposes. It serves as a
content, even if they can't hire someone to voice their projects.
It also demonstrates the progress of my research in a far more engaging manner: by being able to use the actual model, you can discover things about it that even I wasn't aware of (such as getting characters to make gasping noises or moans by placing commas in between certain phonemes).
It also doesn't let me get away with
The algorithm used by the project to facilitate the cloning of voices with minimal viable data has been dubbed DeepThroat
The developer has also worked closely with the Pony Preservation Project from /mlp/, the
In addition, the developer has stated that the logo of 15.ai, which features a robotic Twilight Sparkle, is an homage to the fact that her voice (as originally portrayed by Tara Strong) was indispensable to the implementation of emotional contextualizers.[41]
Reception
15.ai has been met with largely positive reception. Liana Ruppert of
15.ai has also been acclaimed overseas, especially in Japan. Takayuki Furushima of Den Fami Nico Gamer described 15.ai as "like magic," and Yuki Kurosawa of Automaton Media called it "revolutionary."[13][12]
Computer scientist and technology entrepreneur
Impact
Fandom content creation
15.ai has been frequently used for content creation in various fandoms, including the My Little Pony: Friendship Is Magic, Team Fortress 2, Portal, and SpongeBob SquarePants fandoms. Numerous videos and projects containing speech from 15.ai have gone viral.[9][4][5] However, some viral videos and projects feature speech that was not generated by 15.ai, and many do not properly credit the source of their synthetic speech. As a consequence, videos and projects made with other speech synthesis software have been mistaken for 15.ai works, and vice versa. Because of this misattribution and lack of proper credit, 15.ai's terms of service forbid mixing 15.ai-generated and non-15.ai-generated speech in the same video or project.[48]
The My Little Pony: Friendship Is Magic fandom has seen a resurgence in video and musical content creation as a direct result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some
Viral videos from the Team Fortress 2 fandom that feature voices from 15.ai include Spy is a
Some users have created AI virtual assistants using 15.ai and external voice control software. One user on Twitter created their own personal GLaDOS desktop assistant using the voice control system VoiceAttack that is able to boot up applications, utter corresponding random dialogues, and thank the user in response to actions.[12][13]
Troy Baker / Voiceverse NFT plagiarism scandal
Troy Baker @TroyBakerVA I’m partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP’s they create. We all have a story to tell. You can hate. Or you can create. What'll it be?
January 14, 2022[tweet 1]
In December 2021, the developer of 15.ai posted on
On January 14, 2022, it was discovered that Voiceverse NFT, a company that video game and
15 @fifteenai I've been informed that the aforementioned NFT vocal synthesis is actively attempting to appropriate my work for their own benefit. After digging through the log files, I have evidence that some of the voices that they are taking credit for were indeed generated from my own site.
January 14, 2022[tweet 3]
Voiceverse Origins @VoiceverseNFT Hey @fifteenai we are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again.
January 14, 2022[tweet 4]
A week prior to the announcement of the partnership with Baker, Voiceverse made a (now-deleted) Twitter post directly responding to a (now-deleted) video posted by Chubbiverse, an NFT platform with which Voiceverse had partnered. The video showcased an AI-generated voice, and Voiceverse claimed that the voice had been generated using its platform, remarking "I wonder who created the voice for this? ;)"
Following the tweet, Voiceverse admitted to plagiarizing voices from 15.ai as their own platform, claiming that their
The initial partnership between Baker and Voiceverse was met with severe backlash and universally negative reception.
Reactions from voice actors
Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about impersonation and fraud, unauthorized use of an actor's voice in pornography, and the potential of AI being used to make voice actors obsolete.[8][10][11]
See also
- Audio deepfake
- Character.ai
- DALL-E
- Deepfake
- Midjourney
- NovelAI
- Stable Diffusion
- Synthetic media
- WaveNet
Notes
- ^ The phrase "high-fidelity" in TTS research is often used to describe vocoders that are able to reconstruct waveforms with very little distortion, and is not simply synonymous with "high quality." See the papers for HiFi-GAN,[1] GAN-TTS,[2] and parallel WaveNet[3] for unbiased examples of this usage of terminology.
- ^ Translated from original quote written in Spanish: "La dirección es 15.AI y funciona tan fácil como parece."[18]
References
- Notes
- arXiv:2010.05646v2 [cs].
- arXiv:1909.11646v2 [cs].
- ^ DeepMind. Archived from the original on June 18, 2022. Retrieved June 5, 2022.
- ^ a b c d e Zwiezen, Zack (January 18, 2021). "Website Lets You Make GLaDOS Say Whatever You Want". Kotaku. Archived from the original on January 17, 2021. Retrieved January 18, 2021.
- ^ a b c d Ruppert, Liana (January 18, 2021). "Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App". Game Informer. Game Informer. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
- ^ a b c Clayton, Natalie (January 19, 2021). "Make the cast of TF2 recite old memes with this AI text-to-speech tool". PC Gamer. Archived from the original on January 19, 2021. Retrieved January 19, 2021.
- ^ Rock, Paper, Shotgun. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
- ^ a b c d e Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". The Batch. Archived from the original on August 7, 2020. Retrieved April 5, 2020.
- ^ a b c d e f g Chandraseta, Rionaldi (January 19, 2021). "Generate Your Favourite Characters' Voice Lines using Machine Learning". Towards Data Science. Archived from the original on January 21, 2021. Retrieved January 23, 2021.
- ^ a b c Ng, Andrew (March 7, 2021). "Weekly Newsletter Issue 83". The Batch. Archived from the original on February 26, 2022. Retrieved March 7, 2021.
- ^ a b c d Lopez, Ule (January 16, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Wccftech. Archived from the original on January 16, 2022. Retrieved June 7, 2022.
- ^ a b c d e f g h i j Kurosawa, Yuki (January 19, 2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる". AUTOMATON. Archived from the original on January 19, 2021. Retrieved January 19, 2021.
- ^ a b c d e Yoshiyuki, Furushima (January 18, 2021). "『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に". Denfaminicogamer. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
- ^ a b c d e Williams, Demi (January 18, 2022). "Voiceverse NFT admits to taking voice lines from non-commercial service". NME. Archived from the original on January 18, 2022. Retrieved January 18, 2022.
- ^ a b c d e f Wright, Steve (January 17, 2022). "Troy Baker-backed NFT company admits to using content without permission". Stevivor. Archived from the original on January 17, 2022. Retrieved January 17, 2022.
- ^ a b c d Henry, Joseph (January 18, 2022). "Troy Baker's Partner NFT Company Voiceverse Reportedly Steals Voice Lines From 15.ai". Tech Times. Archived from the original on January 26, 2022. Retrieved February 14, 2022.
- ^ "15 on Twitter: "(I probably won't open Twitter again until I finally get this up and running.)" / Twitter". Twitter. Retrieved June 6, 2023.
- ^ a b c Villalobos, José (January 18, 2021). "Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras". LaPS4. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
- ^ Moto, Eugenio (January 20, 2021). "15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras". Yahoo! Finance. Archived from the original on March 8, 2022. Retrieved January 20, 2021.
- S2CID 2493033.
- ^ Corfield, Gareth (August 7, 2017). "A sarcasm detector bot? That sounds absolutely brilliant. Definitely". The Register. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
- ^ "An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter". MIT Technology Review. August 3, 2017. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
- ^ "Emojis help software spot emotion and sarcasm". BBC. August 7, 2017. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
- ^ Lowe, Josh (August 7, 2017). "Emoji-Filled Mean Tweets Help Scientists Create Sarcasm-Detecting Bot That Could Uncover Hate Speech". Newsweek. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
- arXiv:1910.11997 [eess].
- arXiv:1910.10838 [eess].
- ^ Klautau, Aldebaro (2001). "ARPABET and the TIMIT alphabet" (PDF). Archived from the original (PDF) on June 3, 2016. Retrieved September 8, 2017.
- ^ "Phonetics" (PDF). Columbia University. 2017. Archived (PDF) from the original on June 19, 2022. Retrieved June 11, 2022.
- CiteSeerX 10.1.1.832.2872. Archived from the original on June 11, 2022. Retrieved June 11, 2022. Table 3.2
- ^ "The CMU Pronouncing Dictionary". CMU Pronouncing Dictionary. July 16, 2015. Archived from the original on June 3, 2022. Retrieved June 4, 2022.
- arXiv:1810.07217 [cs.CL].
- arXiv:1910.01709 [cs.CL].
- ^ "Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"". August 30, 2018. Archived from the original on November 11, 2020. Retrieved June 5, 2022.
- arXiv:1712.05884 [cs.CL].
- arXiv:1808.10128 [cs.CL].
- arXiv:1905.06791 [cs.CL].
- ^ a b c d Phillips, Tom (January 17, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Eurogamer. Archived from the original on January 17, 2022. Retrieved January 17, 2022.
- ^ - F.2d – (2d Cir, 2015). (temporary cites: 2015 U.S. App. LEXIS 17988; Slip opinion (October 16, 2015))
- ^ a b Stewart, Matthew (October 31, 2019). "The Most Important Court Decision For Data Science and Machine Learning". Towards Data Science. Archived from the original on February 21, 2022. Retrieved February 21, 2022.
- ^ "15". Twitter. June 9, 2022. Retrieved June 9, 2022.
- ^ a b c "15.ai". Hacker News. June 12, 2022. Archived from the original on June 13, 2022. Retrieved June 13, 2022.
- UROP. Archived from the original on June 19, 2022. Retrieved June 13, 2022.
- ^ "15.ai – About". 15.ai. February 20, 2022. Retrieved February 20, 2022.
- ^ a b c Branwen, Gwern (March 6, 2020). ""15.ai", 15, Pony Preservation Project". Gwern.net. Gwern. Archived from the original on March 18, 2022. Retrieved June 17, 2022.
- ^ Scotellaro, Shaun (March 14, 2020). "Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices". Equestria Daily. Archived from the original on June 23, 2021. Retrieved June 11, 2022.
- ^ "Pony Preservation Project (Thread 108)". 4chan. Desuarchive. February 20, 2022. Retrieved February 20, 2022.
- Marginal Revolution (blog). Archived from the original on June 19, 2022. Retrieved June 16, 2022.
- ^ "15.ai – FAQ". 15.ai. January 18, 2021. Retrieved January 18, 2021.
- ^ Scotellaro, Shaun (May 15, 2022). "Full Simple Animated Episode – The Tax Breaks (Twilight)". Equestria Daily. Archived from the original on May 21, 2022. Retrieved May 28, 2022.
- ^ The Terribly Taxing Tribulations of Twilight Sparkle. April 27, 2014. Archived from the original on June 30, 2022. Retrieved April 28, 2022.
- ^ Phillips, Tom (January 14, 2022). "Video game voice actor Troy Baker is now promoting NFTs". Eurogamer. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
- ^ McWhertor, Michael (January 14, 2022). "The Last of Us voice actor wants to sell 'voice NFTs,' drawing ire". Polygon. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
- ^ "Last Of Us Voice Actor Pisses Everyone Off With NFT Push". Kotaku. January 14, 2022. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
- ^ Purslow, Matt (January 14, 2022). "Troy Baker Is Working With NFTs, but Fans Are Unimpressed". IGN. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
- ^ Strickland, Derek (January 31, 2022). "Last of Us actor Troy Baker heeds fans, abandons NFT plans". Tweaktown. Archived from the original on January 31, 2022. Retrieved January 31, 2022.
- ^ Peterson, Danny (January 31, 2022). "'The Last of Us' actor Troy Baker reverses course on NFTs amid fan backlash". We Got This Covered. Archived from the original on February 14, 2022. Retrieved February 14, 2022.
- ^ Peters, Jay (January 31, 2022). "The voice of Joel from The Last of Us steps away from NFT project after outcry". The Verge. Archived from the original on February 4, 2022. Retrieved February 4, 2022.
- Tweets
- ^ @TroyBakerVA (January 14, 2022). "I'm partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create. We all have a story to tell. You can hate. Or you can create. What'll it be?" (Tweet) – via Twitter.
- ^ @fifteenai (December 13, 2021). "I have no interest in incorporating NFTs into any aspect of my work. Please stop asking" (Tweet) – via Twitter.
- ^ @fifteenai (January 14, 2022). "I've been informed that the aforementioned NFT vocal synthesis is actively attempting to appropriate my work for their own benefit. After digging through the log files, I have evidence that some of the voices that they are taking credit for were indeed generated from my own site" (Tweet) – via Twitter.
- ^ @VoiceverseNFT (January 14, 2022). "Hey @fifteenai we are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again" (Tweet) – via Twitter.
- ^ @fifteenai (January 14, 2022). "Go fuck yourself" (Tweet) – via Twitter.
- ^ @VoiceverseNFT (January 7, 2022). "I wonder who created the voice for this? ;)" (Tweet). Archived from the original on January 7, 2022 – via Twitter.
- ^ @fifteenai (January 14, 2022). "Sounds like a scam" (Tweet) – via Twitter.
- ^ @fifteenai (January 14, 2022). "Give proper credit or remove this post" (Tweet) – via Twitter.
- ^ @fifteenai (January 14, 2022). "Certainly not you :)" (Tweet) – via Twitter.
- ^ @fifteenai (January 14, 2022). "Go fuck yourself" (Tweet) – via Twitter.
- ^ @yongyea (January 14, 2022). "The NFT scheme that Troy Baker is promoting is already finding itself in trouble after stealing and profiting off of somebody else's work. Who could've seen this coming" (Tweet) – via Twitter.
- ^ @BronyStruggle (January 15, 2022). "actual" (Tweet) – via Twitter.
- YouTube (referenced for view counts and usage of 15.ai only)
- ^ "SPY IS A FURRY". YouTube. Archived from the original on June 13, 2022. Retrieved June 14, 2022.
- ^ "Spy is a Furry Animated". YouTube. Archived from the original on June 14, 2022. Retrieved June 14, 2022.
- ^ "[SFM] – Spy's Confession – [TF2 15.ai]". YouTube. Archived from the original on June 30, 2022. Retrieved June 14, 2022.
- ^ "Among Us Struggles". YouTube. Retrieved July 15, 2022.
- ^ "The UPDATED 2b2t Timeline (2010–2020)". YouTube. Archived from the original on June 1, 2022. Retrieved June 14, 2022.
- TikTok
- ^ "She said " 👹 "". TikTok. Retrieved July 15, 2022.
External links
- Official website
- 15 on Twitter
- The Tax Breaks (Twilight) (15.ai)