
How An Audiobook Is Made

Before you go any further into the process, it’s important that you understand how an audiobook is made. Knowing this will help you appreciate a) the work involved, and b) why narrators charge what they do.

This will also put you in a better position to judge whether a potential narrator is able to bring the goods or is brand new to the business. Neither is good or bad in itself; it is just one more tool for evaluating who to hire based on your needs and budget.

This section, and this whole resource at large, is based on a few assumptions:

  1. You are the rights-holder of the written work that is being turned into an audiobook

  2. The narrator is also the producer for this work

There are cases where the narrator is just that, and the editing work is farmed out by the narrator to someone else. In these cases, the narrator is paying for the editor’s work out of the fee that you are paying them.

Step 0: Reading The Book

While not every narrator does this, many do, or at least will skim through the book to see if there are any words that they want pronunciation help with, or to get an idea of characters if voices are required.

So right away, this may take some time to do before the recording even starts, and while they are doing this, they aren’t completing other work, such as recording other projects or reading other books for future audiobook projects.

As I just mentioned, not every narrator does this, as we all have our different ways of approaching projects, hence why this is “Step 0”.

Step 1: Recording

This is where it all really starts. A narrator will go into whatever space they have and spend hours talking to themselves. This area (if they have set it up properly) will have sound treatment (where extraneous sound is removed/reduced and there is no echo or reverb in the room so they can get clean, high-quality audio).

To give you an idea, I converted a closet in my office into a recording space that is roughly 22” wide by 26” deep. There are no fans in this space (as the mic would pick them up). It gets hot in the afternoons, in the summer, and whenever I read a long chapter. (Insert full of hot air jokes here.) This is not atypical for a lot of home studios.

How long will this take? Narrators will read anywhere from 2.5 to 3 words per second. For a point of reference, NaNoWriMo challenges people to write 50k words in 30 days. A 50k word novel would ultimately be in the area of 5.5-6 hours when complete.

The calculator here can help you estimate how long your manuscript will be when it is complete.
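For a rough back-of-the-envelope version of that estimate, here is a minimal Python sketch. The function name `finished_hours` is made up for illustration, and the only inputs are the word count and the 2.5 to 3 words per second quoted above. Note that the raw division comes out a little under the 5.5-6 hour figure; one plausible reason (an assumption on my part) is that finished audio also includes pauses, chapter headings, and credits that a pure word count does not capture.

```python
def finished_hours(word_count, words_per_second):
    """Estimate finished audio length in hours from a word count.

    Illustrative helper only: it ignores pauses, chapter headings,
    and opening/closing credits.
    """
    if words_per_second <= 0:
        raise ValueError("words_per_second must be positive")
    seconds = word_count / words_per_second
    return seconds / 3600


# 50k-word NaNoWriMo novel at the slow and fast ends of the range
slow = finished_hours(50_000, 2.5)  # roughly 5.6 hours
fast = finished_hours(50_000, 3.0)  # roughly 4.6 hours
print(f"Estimated finished length: {fast:.1f} to {slow:.1f} hours")
```

Treat the result as a lower bound and lean toward the slower reading speed when budgeting.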

But as you can imagine, no narrator is going to offer a flawless read in one take. No, we are going to flub words, we are going to want to offer a different interpretation or delivery of dialogue, we are going to mess up and need to try again.

There are two methods in which a narrator will record an audiobook and account for mistakes or alternate takes:

Punch and Roll:

This is where we will have a monitor in the booth with us, or be recording in front of a computer screen. If we mess up, we click a button in our Digital Audio Workstation (or DAW) and it jumps back 5 seconds or so, plays what we just recorded through our headphones, and then starts recording again at a predetermined point, recording over the mistake.

Dog Clicker:

In this case, when we mess up we will snap our fingers right next to the mic, or possibly use a clicker (like the ones used in training dogs, hence the name). This will create a marker in the waveform of our recorded track, and we can easily see where we need to edit out the bad takes. The recording continues the entire time, capturing everything.

There are benefits and drawbacks to both, ones that really get too “inside baseball” for you to worry about. Suffice it to say that the recording of the book will likely take about 30% longer than the actual finished time. For our example of the 50k NaNoWriMo book, actual recording time is likely closer to 7-8 hours.

Step 3: Editing

This is one of the longer steps in the process. Once the book is recorded, the narrator will need to go back through and take out the mouth sounds and breath sounds. (You may not realize it, but the human mouth makes a lot of weird, sometimes gross noises, and the mic picks up everything.)

If the narrator did Punch and Roll, this will be most of the editing they need to do, save for mistakes they didn’t realize they made or odd sounds that the mic picked up. Punch and Roll editing takes less time, but the recording process is longer.

If the narrator did Dog Clicker recording, the recording likely went faster but now they need to go through and not only edit out the mouth/breath sounds, but also all of the bad takes; this editing process takes longer.

Note: you will hear breath sounds in the finished product. That is normal. They should not be excessive or distracting.

Step 4: Quality Check

(Sometimes narrators might combine this step with Step 3, and that is purely a choice of how they like to work.)

Quality Checking (or QC or QA, for quality assurance) is where the narrator listens to all of the recordings with the manuscript in front of them, making sure that they read the words exactly as they appear on the page. If any mistakes are detected, the narrator will need to go back and re-record those lines.

Note: the finished recording should match the manuscript word-for-word, unless you, the author, have agreed to something other than that.

Time Check: At this point, the narrator has likely spent 1.5-2x the amount of “finished” time on steps 3 and 4, meaning for every hour of finished audio produced, they will have spent 1.5-2 hours on these steps.

Step 5: Mastering

This is where we apply the “finishing touches” to the files so that they meet the technical specs required by whichever site they are being submitted to for distribution. This involves applying equalization, normalization, sometimes compression, etc., to make everything sound good when it hits Audible, iTunes, or wherever else it may be.

This is not a lengthy process, usually 30 seconds-2 minutes per track, depending on the track length and computer processing power.

At some point, as agreed upon by you and the narrator, they may have been submitting tracks to you for your approval, and you may have been offering feedback to them here and there. This may have required the narrator to go back and re-record some sections of audio.

When all is said and done, the narrator has invested roughly 3 hours for every hour of finished audio between recording and editing. For our NaNoWriMo example, the 5.5-6 hour finished book of 50k words likely took them 16.5 to 18 hours to produce.
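To see where that 3-to-1 ratio comes from, the per-step numbers above can be added up in a short Python sketch. The function name `production_hours` and its parameter names are hypothetical; the factors are the ones given in this section (recording at about 30% longer than finished time, editing and QC at 1.5-2x finished time). Mastering is left out because it only adds minutes per track.

```python
def production_hours(finished_hours, recording_factor=1.3,
                     edit_qc_factor=(1.5, 2.0)):
    """Return (low, high) total hours of work for a finished length.

    Illustrative only: the factors are this guide's rough estimates,
    not industry constants, and mastering time is not included.
    """
    recording = finished_hours * recording_factor
    low = recording + finished_hours * edit_qc_factor[0]
    high = recording + finished_hours * edit_qc_factor[1]
    return low, high


# The 50k-word example: 5.5 to 6 hours of finished audio
for length in (5.5, 6.0):
    low, high = production_hours(length)
    print(f"{length} finished hours: {low:.1f} to {high:.1f} hours of work")
```

The totals work out to roughly 2.8 to 3.3 hours of work per finished hour, which is why 3 hours per finished hour is a fair rule of thumb.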

 
 
 

Here is a very summarized version of how an audiobook is made. This was part of a video series I produced with one of the authors I work with.

This is the recording booth I built.

Dog Clicker Recording
