Apple breaks silence on claims it used 'swiped YouTube videos' to train AI

A new report claimed that tech giants including Apple, Nvidia, Anthropic, and Salesforce used data from "thousands of YouTube videos" to train AI. The investigation, conducted by Proof News and published in Wired, alleged that subtitles from 173,000 YouTube videos were swiped for the companies' AI models.

Called "YouTube Subtitles," the dataset contains video transcripts from educational channels like Khan Academy, MIT, and Harvard, as well as the Wall Street Journal, NPR, and the BBC. Material from YouTube stars like PewDiePie, Marques Brownlee, and MrBeast were discovered, too.

We reached out to Anthropic and Salesforce for comment but haven't heard back yet; Apple, however, has issued a response to Wired's report.

Will Apple use this data for Apple Intelligence and other AI services?

The short answer is no, but here's the longer response for those who don't identify with the "TLDR" crowd:

In an email to Mashable, Apple said that its open-source language model, OpenELM, indeed used the dataset, but not in the way some may be thinking.

The OpenELM project is a part of Apple's ongoing effort to benefit the broader research community. In other words, according to Apple, the OpenELM model was created for research purposes only and will not underpin any of Apple's machine learning-powered hardware or AI services, including Apple Intelligence.

For the uninitiated, Apple Intelligence is the company's new suite of AI features, which were revealed at WWDC 2024 (Apple's annual event where the company spills the beans on what's to come with its software offerings, including iOS and iPadOS).

Apple Intelligence, for example, can help summarize text, whether it's an email or text message, for quicker interactions with friends, loved ones, coworkers, and more. It will also underpin more entertainment-focused features like Genmoji, which generates new iOS emojis from a prompt. There's also Image Playground, which lets users create AI-generated images on the fly.

New Genmoji feature coming to iOS 18. Credit: Apple

When it comes to AI utilities for its consumers, Apple highlighted that it offers websites an option to opt out of having their content used for AI training. Apple assured that its generative models are built and fine-tuned using high-quality data, including licensed content from publishers and stock image companies, alongside publicly available data on the web.

To put it succinctly, Apple doesn't deny that its open-source language model, OpenELM, used the dataset, but wants to make clear that it will not underpin any of its AI services, including Apple Intelligence.

What does Nvidia have to say?

We also reached out to Nvidia for comment, but the company, known for bringing AI to much of its gaming hardware and services, declined to issue a statement.

We will update this article if we hear anything from Anthropic and Salesforce.