Apple breaks silence on claims it used 'swiped YouTube videos' to train AI
A new report claimed that tech giants, including Apple, Nvidia, Anthropic, and Salesforce, used data from "thousands of YouTube videos" to train AI. The investigation, conducted by Proof News and published in Wired, alleged that subtitles from 173,000 YouTube videos were swiped for the companies' AI models.
Called "YouTube Subtitles," the dataset contains video transcripts from educational channels like Khan Academy, MIT, and Harvard, as well as the Wall Street Journal, NPR, and the BBC. Material from YouTube stars like PewDiePie, Marques Brownlee, and MrBeast was discovered, too.
We haven't heard from Anthropic and Salesforce yet (we reached out for comment), but Apple has issued a response to Wired's report.
Will Apple use this data for Apple Intelligence and other AI services?
The short answer is no, but here's the longer response for those who don't identify with the "TLDR" crowd:
In an email to Mashable, Apple said that its open-source language model, OpenELM, indeed used the dataset, but not in the way some may be thinking.
The OpenELM project is part of Apple's ongoing effort to benefit the broader research community. In other words, according to Apple, the OpenELM model was created for research purposes only, and the company made clear that it will not underpin any of its machine learning-powered hardware or AI services, including Apple Intelligence.
For the uninitiated, Apple Intelligence is the company's new suite of AI features, which was revealed at WWDC 2024 (Apple's annual event where it spills the beans on what's to come with its software offerings, including iOS and iPadOS).
Apple Intelligence, for example, can help summarize text, whether it's an email or text message, for quicker interactions with friends, loved ones, co-workers, and more. It will also underpin more entertainment-focused features like Genmoji, which generates new iOS emojis with a prompt. There's also Image Playground, which lets users create AI-generated images on the fly.
As for the AI features that do serve its customers, Apple highlighted that it offers websites an option to opt out of having their content used for AI training. Apple said that its generative models are built and fine-tuned using high-quality data, including licensed content from publishers and stock image companies, alongside publicly available data on the web.
To put it succinctly, Apple doesn't deny that its open-source language model, OpenELM, used the dataset, but wants to make clear that it will not underpin any of its AI services, including Apple Intelligence.
What does Nvidia have to say?
We also reached out to Nvidia for comment, but the company, known for bringing AI to its gaming hardware and services, declined to issue a statement.
We will update this article if we hear anything from Anthropic and Salesforce.
July 18, 2024 at 11:24PM