Google's AI generated a 'podcast' from one of my articles and it's incredibly convincing and creepy just how well it can mimic humans talking
A podcast at the click of a button—Google's new AI tool can do it. It's called NotebookLM and it's essentially a summary bot. Input a document, hit generate, and out pops a briefing doc, FAQ, or study guide. What's more, it can generate a podcast covering the document's contents, hosted by fleeting ephemeral beings with chirpy American accents.
Take, for example, an article I wrote back in 2023 called "Cache is king when it comes to designing the gaming CPUs of the next 20 years", in which I spoke to a handful of silicon engineering experts about what's next for chip design. I fed the article into NotebookLM, waited around four minutes, and out popped a 10-minute-long podcast.
Take a listen in the Soundcloud embed below.
"What the ****!"
You'll have to excuse the expletive, but that was my honest reaction to hearing it for the first time. A well-summarised document is one thing, but it's the natural cadence of the conversation and the lifelike emotion that have thrown me for a loop.
They (I'm already acting like these are real people) even introduce the podcast… as a podcast. I have to remind myself when listening that these aren't real people—they're the product of me feeding a hyperlink into a box on a website. A computer feigning two humans sharing thoughts and feelings. I don't know why but this feels deeply strange to me.
I'm not the only one who feels this way about the new AI tool. We played a version generated from our RTX 4090 review to the rest of the PC Gamer team and received such responses as:
"This is real existential crisis inducing."
And:
"this is ****ing terrifying."
And:
"the interruptions and responses from the co-host are freaking me out."
And:
"I'm moving to the woods I can't take it."
My point is, NotebookLM is spectacularly impressive and terrifying in equal measure. What's more, the analogies and references made throughout the recording aren't drawn like-for-like from the subject matter—in this instance, a PC Gamer article. They're mostly made up by the AI.
For example, the reference to how 3D V-Cache is like building a skyscraper instead of a bigger warehouse. I wish I'd come up with that, but that's all AI generated. And that's just another reason why the whole thing is frightfully good.
Though a podcast is as much about the hosts as it is the content. And so far we've not had any other 'hosts' (AI vocaloids) lending their voices to anything we've uploaded to NotebookLM. That's sure to wear a bit thin with time. Not to mention there are unlikely to be any hilarious gaffes with two Google-programmed bots behind the mic.
I suppose what I'm saying is this doesn't feel like an actual, credible threat to successful podcasts, nor a replacement for them. The PC Gamer Chat Log is safe for another day. Though as we've seen with other forms of AI generation, it may still change the dynamic of what's deemed to be worth the effort. For example, AI-generated art didn't immediately wipe out all human-made art, of course not, but then you probably wouldn't paint 300 stunning images just to run a single D&D campaign for your friends. You might do that with AI, unless you're totally opposed to its use, which would also be completely fair.
The same goes here. I wouldn't record a podcast for every article I ever wrote, but if I could do that with a couple of button presses? Something not at all worth the effort only months ago now takes next to no effort at all.
Don't worry, I'll spare you the hundreds of articles on long-since-released graphics card specifications. But you get the idea. Things are possible now that weren't remotely viable only months ago.
There's an elephant in the room, though, and it is pretending to be a human being with thoughts, emotions, and vocal cords. An eerily impressive natural language tool that's this easy and accessible is dangerous in the wrong hands. How easy would it be to catfish someone if you can voice any text, any document, with a sleek, conversational human tone? Now that's terrifying.
But as proof of concept for what AI can do, I've found nothing that's evoked a response out of me quite like NotebookLM.
So, what is NotebookLM?
NotebookLM is a free tool available over at NotebookLM.google. It has an incredibly boring-sounding name, though it's functionally pretty exciting.
It's built around Google's Gemini AI model—the same one being rolled out to new Android phones and being used to generate AI snippets in web searches that I've suggested may break the business of the internet. NotebookLM, however, is intended to be a study buddy—an AI capable of summarising documents, listening to audio, and saving you time taking notes. This could have totally changed how I revised for exams in school, but I was born 20 years too early—missed it by a hair.
It's been available in the US through 2024, though recent improvements during the summer and a global rollout have seen it land in the hands of more users of late.
The podcast feature, called "Audio Overview", is also a recent addition, arriving in September.
For now, the software is only capable of speaking in English, and a note on the Google blog post about its rollout says it will "sometimes introduce inaccuracies". That's a given, as all AI models, even the best, are prone to making stuff up sometimes. This is often called "hallucinating", but that's really just a fancy-sounding term for when the AI is a bit pants (bad).
One feature that appears to be headed to Audio Overview is the ability to interrupt the speakers and, presumably, change the direction of the conversation or issue on-the-fly corrections. It's not certain yet, but Google notes in its blog post that "you can't interrupt them yet", which would be a weird thing to say if interruption weren't an intended feature at some point.
Let's think on that for a second: the ability to interrupt a podcast host mid-conversation and tell them what you'd like them to talk about. It's giving strange, highly personalised live show with passive-aggressive overtones…
Google isn't the only firm playing around with AI-powered bots that sound like humans. OpenAI is also in the market with its own voice assistant to match the one in the movie Her. That was human-to-bot contact, but no less odd for it.
No doubt this conversational AI stuff is going to get real weird, real quick.