Summary
- Google Gemini introduces Audio Overviews, allowing users to create podcasts from uploaded documents.
- Audio Overviews utilize AI to generate realistic voices and engaging discussions on document content.
- Audio Overviews provide a convenient way to extract information from documents in a podcast format.
They say that you’re never more than six feet from a rat, and these days, the same is probably true of podcasters. It seems that almost everyone on the planet either has a podcast or is going to start one.
With Google Gemini, you can now create your own bespoke podcasts using a feature called Audio Overviews. All you need to do is upload a document, and Gemini will create a short podcast in which two AI hosts take a deep dive into its contents.
What are Audio Overviews in Google Gemini?
Audio Overviews is a new feature in Gemini that was previously available in NotebookLM, Google’s AI-powered note-taking app. It summarizes information in a unique way: instead of giving you a bland text summary, Audio Overviews generates an audio file of a podcast in which two AI-generated hosts discuss the information that you want summarized.
The hosts have a back-and-forth conversation discussing the topic of whatever they are summarizing and asking questions of each other to glean more information on specific key points. The overall result is what sounds like a real podcast with two informed people discussing the topic at hand.
In my testing, Gemini generated Audio Overviews that ranged between five and fifteen minutes in length, depending on how much content was in the uploaded documents. The 15-minute podcast, for example, was generated from a 146-page manual for an SLR camera, while even a single-page PDF of a garbage collection schedule generated a podcast that was five minutes long.

What Can You Use to Generate Audio Overviews?
You can create Audio Overviews from a wide range of different sources in Gemini. You can upload a document, and Gemini will turn whatever information the document holds into your own bespoke podcast. These don’t just have to be text documents, either; you can upload a Google Slides presentation, and Gemini will create an Audio Overview based on the content of the slides.
Another really useful option is that you can generate Audio Overviews from a Deep Research report. Deep Research is a feature that generates a report on any topic you choose by coming up with a plan of what to research, finding the appropriate content on the web, and then collating the information that it finds into a report. The results are in the form of a detailed written response that breaks down everything that was discovered, but these reports can often be quite long and fairly dry.
Once you’ve generated a Deep Research report, however, you can get Gemini to turn it into an Audio Overview. Then, instead of having to read through the entire report, you can sit back and listen to two AI-generated podcasters discuss it in detail. It can make it easier to digest the information from a Deep Research report, without having to read through all the details.
Audio Overviews seem like they would be a great way to get information from web pages with a lot of information on them, but currently, there’s no way to generate an Audio Overview from a web link. However, you can copy the content to a text file, or save the contents of the web page as a PDF, and then Gemini will happily create an Audio Overview from the content. I saved the Wikipedia page on the history of Brazil as a PDF, and Gemini created a podcast from the file discussing Brazil’s history, which was useful and informative.
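If you would rather not copy and paste by hand, the "copy the content to a text file" workaround can be roughed out in a few lines of Python. This is only a minimal sketch using the standard library's html.parser module. The names TextExtractor and html_to_text are made up for illustration, and it assumes you have already saved the page's HTML yourself. It does not talk to Gemini at all; it just produces plain text that you can save and upload.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page (hypothetical helper)."""

    SKIP_TAGS = {"script", "style"}  # content that is not readable text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the readable text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

You could then write the result to a file with something like Path("page.txt").write_text(html_to_text(saved_html), encoding="utf-8") and upload that file to Gemini. A real web page will need more cleanup than this (menus, footers, and so on), so saving the page as a PDF may still be the easier route.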

You can’t generate Audio Overviews from most image files, either, but I found that if I saved an image as a PDF, Gemini would at least try to generate an Audio Overview from the file. If there’s no readable text in the image, however, the Audio Overview generation will fail. If the image does contain text, it will work; I was able to get the AI-generated podcast hosts to have an enthusiastic and in-depth discussion about a PDF of an image of my local waste collection schedule.
How to Create an Audio Overview
When you upload a document to Gemini by clicking the “+” icon, you should see a suggestion pop up above the prompt window that you can click to generate your Audio Overview. If it doesn’t pop up, however, all you need to do is ask Gemini to generate an Audio Overview from the document and, as long as it’s a valid document with readable text, an Audio Overview will be generated.
You can upload a wide range of files, although not all of them may be suitable for generating an Audio Overview. Supported file types include the following:
• C, CPP, PY, JAVA, PHP, and SQL files
• TXT, DOC, DOCX, PDF, RTF, DOT, DOTX, HWP, and HWPX files
• PPTX, XLS, and CSV files
• Google Docs and Google Slides
If you have a Gemini Advanced subscription, you can also upload HTML, XLSX, TSV, and Google Sheets files.
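If you have a folder full of files and want to know which ones are worth trying, the list above is easy to turn into a quick check. The following is a minimal Python sketch, not anything provided by Google: the function name can_upload and the advanced flag are assumptions made for illustration, and the extension sets simply mirror the list in this article. Google Docs, Slides, and Sheets are not covered because they are not local files with an extension.

```python
from pathlib import Path

# Extensions this article lists as uploadable for all users.
BASE_EXTENSIONS = {
    "c", "cpp", "py", "java", "php", "sql",
    "txt", "doc", "docx", "pdf", "rtf", "dot", "dotx", "hwp", "hwpx",
    "pptx", "xls", "csv",
}

# Extra extensions this article lists for Gemini Advanced subscribers.
ADVANCED_EXTENSIONS = {"html", "xlsx", "tsv"}


def can_upload(filename: str, advanced: bool = False) -> bool:
    """Return True if the file's extension appears in the lists above."""
    ext = Path(filename).suffix.lower().lstrip(".")
    allowed = BASE_EXTENSIONS
    if advanced:
        allowed = BASE_EXTENSIONS | ADVANCED_EXTENSIONS
    return ext in allowed


print(can_upload("camera-manual.PDF"))        # True
print(can_upload("photo.jpg"))                # False
print(can_upload("page.html"))                # False
print(can_upload("page.html", advanced=True)) # True
```

Bear in mind that passing this check only means the file type can be uploaded. Gemini still needs readable text in the file to generate an Audio Overview from it.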
As mentioned above, you can upload images to Gemini, but you won’t be able to generate an Audio Overview from image files. However, if you save an image as a PDF, it’s possible to create an Audio Overview, as long as the image contains some readable text.
Generating an Audio Overview from a Deep Research report is also easy to do; once you’ve generated the Deep Research report, you should see an option to generate an Audio Overview for the report. However, I found that this doesn’t always happen. If the option doesn’t appear, you can just ask Gemini to generate an Audio Overview, and it will create one for you.
How Good Are Gemini’s Audio Overviews?
Since AI chatbots burst onto the scene, a lot of the things they can do have felt a little bit like magic. It still blows my mind that, in a matter of moments, AI can generate images of things that have never existed in images before, such as a unicorn with three legs rollerblading at a disco. Gemini’s Audio Overviews can feel a little like magic, too.
That’s because the results are genuinely impressive. For a start, the voices are very realistic, and make it feel like you’re listening to real people talking. The way they interact is also really well done, with interruptions and the hosts talking over each other on occasion.
In my testing, the feature has usually been very good at picking out the key points of a document and discussing them in a very accessible way. I tried uploading the manual for an old Canon EOS 3 film camera I own, and the hosts had a highly informative discussion about the eye-tracking autofocus feature.
I also uploaded an unpublished screenplay and the hosts talked through the key points of the plot in a very entertaining way, picking up a lot of the humor, and most of the central parts of the plot. The results aren’t always perfect, however; the screenplay summary missed a key part of the plot which is required to understand both the title of the screenplay and its poignant last line.
Audio Overviews Are (Mostly) a Great Way to Access Information
Some AI features can feel like companies showcasing what the AI can do rather than genuinely useful features. The Audio Overview feature doesn’t feel like that, however.
Reading through a long document isn’t always the easiest way to extract the key information from it. Listening to two people discussing the information can make it easier to distill the key facts without having to sift through it all yourself. Having two people discussing it is a clever touch, as often one of the hosts will ask the question you’ve been thinking about yourself.
In particular, I found Audio Overviews to be very useful for Deep Research reports. These reports are often long walls of text, and while they are packed with useful information, reading through the entire report can feel like something of a chore. An Audio Overview of the report is far easier to digest, and the AI does a pretty good job of extracting the important information rather than waffling on about less important facts.
That’s not to say that Audio Overviews are perfect. I found that I often have the same problem with Audio Overviews that I have when listening to audiobooks: I start to tune out and miss what’s being said. I then have to rewind the Audio Overview to catch up on what I’ve missed.
This isn’t the fault of the Audio Overviews, of course, but I’m sure I’m not the only person who suffers from this problem. For me, they work best when I have no other distractions, such as when I’m out for a walk with headphones on, but your mileage may vary.
The podcasts don’t always feature all the information that you might want to extract, either. The Audio Overview for the screenplay did extract most of the central plot points, but it missed something that wasn’t necessarily central to the plot but was certainly a central theme of the script.
If you don’t enjoy reading through large amounts of text to extract the information that you want, then Audio Overviews can be a useful alternative. You can turn almost anything you want into your own bespoke podcast and have other people explain the key information to you, rather than having to read it for yourself. Hopefully, Google will add the ability to generate Audio Overviews of content from websites at some point, because right now you still have to jump through a few hoops to make it happen.