Chapter 4. Audio Descriptions

Carli Spina

While most people have encountered captioned videos, audio descriptions remain less well known. Unfortunately, they are also less prevalent than captions across virtually all platforms. It is important that video creators strive to address this disparity by including audio descriptions in their content where needed, but the first step in this process is understanding what audio descriptions are, how they differ from captions, and what is required to provide high-quality audio descriptions.

What Are Audio Descriptions?

Audio description goes by several names that all refer to the same concept, including video description, described video, audio described video, verbal description, visual description, audio-narrated description, and descriptive narration. However, audio description is the most frequently used term, particularly in the United States, where it has been the preferred term of several government institutions.

Regardless of the terminology used, the basic concept remains the same. Audio description is defined by the US Access Board as

Narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone. Audio description is a means to inform individuals who are blind or who have low vision about visual content essential for comprehension. Audio description of video provides information about actions, characters, scene changes, on-screen text, and other visual content. Audio description supplements the regular audio track of a program. Audio description is usually added during existing pauses in dialogue.1

A similar definition appears in the documentation for the Web Content Accessibility Guidelines (WCAG) 2.1, which also notes that audio description is not necessary where all the visual information is already explained in the audio track.2 One example is a video that shows text on the screen that is also read aloud as part, or all, of the main audio track. Considering this example can help to clarify the purpose and importance of audio description. It is a means of representing the visual information of a video in an alternate format that allows it to be perceived by another sense. The symbol for audio description is shown in figure 4.1.

While audio description is primarily used to make video content accessible, it can also be used in other settings to improve accessibility. In any arena where visual elements are key, audio description can be adapted as a method of access. It has been used to improve the accessibility of museums, art displays, fashion shows, parades, dance performances, fireworks displays, video games, and virtual reality experiences. Overall, it is a versatile process that can be adapted to many different settings.

Why Is Audio Description Important?

For most video content, the visual information presented on the screen is not represented in the audio track. This information remains unknown to anyone who is unable to see the video. Audio descriptions are, therefore, vital to make this content accessible to those whose vision makes it impossible for them to perceive the visual information of the video.

While many people might think that the audience for audio descriptions consists solely of those who are completely unable to see, this stereotypical assumption about what it means to be blind or visually impaired is overly restrictive. For this reason, the National Federation of the Blind “encourage[s] people to consider themselves as blind if their sight is bad enough—even with corrective lenses—that they must use alternative methods to engage in any activity that people with normal vision would do using their eyes.”3 In the United States, there are approximately half a million children under the age of eighteen and almost twenty-seven million adults over the age of eighteen who are blind or visually impaired. Worldwide, over 250 million people have vision impairments. For these people, audio descriptions are vital to allow them to participate in visual content, whether a Hollywood blockbuster or educational videos assigned in their classes. Without audio descriptions, they must rely on merely the sound content of the video, which in many cases may be incomprehensible without the visual elements, or rely on finding someone who can describe the video to them. Audio descriptions offer meaningful, independent opportunities to interact with video for a segment of the population that is otherwise excluded.

Audio descriptions can also be useful to improve the experience of users who are able to view the content being described. As Joel Snyder has argued, audio description “is useful for anyone who wants to truly notice and appreciate a more full perspective on any visual event.”4 It can be useful particularly as an educational tool, as it draws attention to the most central and important visual elements of the content in a way that can help teach viewers how to recognize, describe, and understand these elements. However, just as audio description is not yet as common as captioning, its use by those who can see the content is not as common as the use of captions by those who can hear the audio.

In some countries, audio descriptions may be legally required in specific circumstances. For example, in the United States, the Federal Communications Commission has set standards pursuant to the 21st Century Communications and Video Accessibility Act that require TV stations to provide access to minimum amounts of audio described content.5 In addition, the Justice Department has issued rules saying that specific types of movie theaters must support audio description to be in compliance with the Americans with Disabilities Act (ADA).6 While neither of these rules specifically states whether audio descriptions are required for online video content, the fact that the ADA requires equitable access to public accommodations has led some to argue that audio descriptions are required for institutions that are subject to these legal provisions, particularly government entities.7

Audio descriptions are also important from a standards point of view. WCAG provides the framework for web accessibility standards for many institutions, and audio descriptions are incorporated into these guidelines. In order to achieve Level A compliance, WCAG 2.1 requires audio description or a “media alternative” for prerecorded media in most situations.8 To achieve Level AA compliance, all prerecorded video must have audio description.9 Level AAA compliance requires extended audio descriptions, that is, descriptions that run longer than the natural pauses in the main audio track, so the video is paused to make room for them.10 These may not be needed in all video content, but they can be useful in cases where additional details would provide better access to the content.

A Brief History of Audio Descriptions

The concept of verbally describing the content of films is not a new one. An early example was the verbal description of a 1929 showing of a film for members of the New York Association for the Blind and the New York League for the Hard of Hearing.11 After the event, the idea percolated from a few different independent sources. Chet Avery of the Department of Education was among the first to suggest the idea in the 1960s, and it was then independently the subject of a master’s thesis that proposed that audio descriptions should be offered over the radio in sync with programs on television.12 In the 1980s, audio description work began to gather interest in the world of theater,13 which later led to interest in incorporating these descriptions into television programming. In the mid-1980s, Margaret Pfanstiehl, who had worked on description for theater programming, partnered with Barry Cronin, who had independently had the idea for audio description of television.14 WGBH, Boston’s PBS channel, “creat[ed] a national video description service by training describers” in efforts that would ultimately lead to WGBH founding its Descriptive Video Service in 1988.15 This work was also done by two other organizations founded in the same year, which worked on television, video, and, notably, educational videos.16

In the online environment, audio descriptions have also become somewhat more prevalent than in the past. When streaming videos debuted, few had audio descriptions associated with them. However, access to them has begun to improve with major streaming video players incorporating audio descriptions in some of their content. Tools and services also exist to support the creation and sharing of audio descriptions for online videos. However, a majority of videos available online do not have audio descriptions, meaning that there is clearly still more work to be done to ensure equitable access to the internet.

Integrated Description versus Separate Audio Files versus Text-Based Description

Integrated Description

There are three main ways to add audio descriptions to a video. One method, sometimes referred to as integrated description, involves incorporating narrated descriptions of visual elements into the primary audio track of the video. With this method, all users will experience the audio descriptions. This can be done in more than one way. In some cases, it might be possible to naturally incorporate all of the important visual elements into the video narration. For example, a tutorial demonstrating how to complete a task uses integrated description if its narration also describes every visual element shown in the video.

The other main way of offering integrated description in a video is to add audio descriptions to the natural pauses in the narrative or dialogue of the main audio track. These descriptions are often recorded in a different voice from the main speech or narration to make it clear which content is description and which is the main audio. In this approach, all users will experience the audio descriptions, and they will sound more like a traditional set of audio descriptions than in the first version of integrated description. This approach can be particularly useful when working with platforms that do not support alternative audio tracks for audio descriptions. At the same time, it means that users cannot opt out of hearing the audio descriptions. While this method will improve access for those who cannot see the video content, especially because it does not require that they know how to access a separate audio track, it can also be distracting or confusing for other users.

Separate Audio Files

As a second method, audio descriptions can be offered in a separate audio file that can be selected or deselected in a manner similar to closed captions. This method improves accessibility for those who are helped by audio descriptions, but does require a video delivery platform that both supports separate audio tracks and also offers an accessible way to listen to these audio tracks. Each of these aspects is important. Not all online video platforms offer the ability to have alternative audio tracks, and at the same time, not all online video platforms are completely accessible, so it is important to consider both of these factors when determining which approach to audio descriptions is appropriate in a particular setting. In the case of platforms that do meet these two criteria, this approach to audio descriptions offers greater flexibility and customizability so users can access the information in the way that best fits their needs.

Text-Based Description

Alternatively, with some media players, it is possible to provide descriptions as text files that are read aloud using either built-in browser functionality or a screen reader. This third method does not require the describer to create a recording of the descriptions, but it does require that the file have time stamps to synchronize the reading of the descriptions with the video. Often these descriptions are made available as WebVTT files. In some cases, users can set their preferences so the video will automatically pause when the descriptions are read.
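To make the role of time stamps concrete, the following is a minimal sketch in Python of how a describer might turn a list of timed descriptions into the text of a WebVTT file. The `DescriptionCue` class, the function names, and the sample cues are hypothetical and exist only for illustration. Only the overall file layout, a `WEBVTT` header followed by cues with `start --> end` timing lines, comes from the WebVTT format itself.

```python
from dataclasses import dataclass


@dataclass
class DescriptionCue:
    """One text description tied to a span of the video (hypothetical helper)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # the description to be read aloud


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as a WebVTT timestamp (hh:mm:ss.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: list[DescriptionCue]) -> str:
    """Return the contents of a WebVTT file holding the given cues in time order."""
    lines = ["WEBVTT", ""]
    for cue in sorted(cues, key=lambda c: c.start):
        if cue.end <= cue.start:
            raise ValueError(f"Cue must end after it starts: {cue.text!r}")
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample cues invented for this example.
    sample = [
        DescriptionCue(12.0, 15.5, "A title card reads: Chapter 4."),
        DescriptionCue(3.0, 6.0, "A librarian walks between tall shelves."),
    ]
    print(build_webvtt(sample))
```

Whether and how a given media player reads such a file aloud depends on the player and on the user's assistive technology, so the output should be tested on the intended platform.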

Best Practices in the Creation of Audio Description

By their very nature, audio descriptions are quite subjective. Unlike captions, which try to recreate the soundtrack, the heart of audio description is determining which visual elements are important and the optimal way to describe them. Because the goal generally is for the audio descriptions to fit within the natural pauses in the audio track, a major task for anyone creating audio descriptions is to decide what information is important and convey it in a concise manner. This makes audio description a creative process. As Joel Snyder states:

Audio Description is a kind of literary art form in itself, to a great extent. . . . It provides a verbal version of the visual—the visual is made verbal, and aural, and oral. Using words that are succinct, vivid, and imaginative, AD conveys the visual image that is not fully accessible to a segment of the population and not fully realized by the rest of us—the rest of us, sighted folks who see but who may not observe.17

Because of this element of creativity, it can be difficult to boil the creation of audio description down to a series of rote steps. Instead, it is a process that requires practice and refinement over time. For this reason, many audio descriptions are created by professional audio describers.

However, there are best practices that can help to ensure that the audio descriptions will be effective in providing access:

- It is helpful to always explicitly consider what is missing from the experience of the video if sound is the only point of access. It can be easy to slip away from this point of view, so audio describers, particularly those who are new to the process, should start their description process by carefully considering this aspect. Some questions to consider:
  - Who is visible?
  - Who is speaking?
  - What is happening silently on the screen or is happening with sounds that do not make the action obvious?
  - Are silent elements conveying information? These can include facial expressions, gestures, and text shown on the screen but not spoken.
- Because time is always a factor in descriptions, particularly descriptions that must fit within the natural pauses in the original audio track, prioritization is key. It is vital to understand what the most important information is and focus on clearly conveying that information.
- Be as concise as possible. The goal is not to fill each and every pause, but instead to convey the necessary information as briefly and clearly as possible. As Sabine Braun notes,

  It can be assumed that sighted viewers do not process everything they see. . . . In other words, the visual mode is rather impressionistic. By contrast, the sequential nature of the verbal mode seems to encourage a more complete processing of the information offered. Extensive descriptions can, therefore, lead to a cognitive processing overload in the recipient.18

- Context is important. The setting may be important, for example, in a film, but may be less important in other works, such as educational videos that perhaps have no other setting than a close-up of a chalkboard or a set of slides. Thus, while the setting should likely always be offered, the amount of detail provided will vary.
- Text shown on screen should always be included in audio descriptions, as it will not be accessible otherwise.
- Do not offer commentary or flourishes. Though interpretation is inherent in the description process, audio descriptions should be as objective and neutral as possible. Audio descriptions should be unobtrusive, almost fading into the background. This is particularly important in the case of dramatic works.
- Think about the audience when deciding the level of complexity for vocabulary and syntax. The audio descriptions should be aimed at the same audience as the video itself.
- As with captions, do not censor content that is not otherwise censored or obscured in the video. The goal is to offer those using audio descriptions the same access to the content as other audience members.
- Audio describing requires practice, both in the sense that describers will likely have to write, refine, and even practice the script for each project and also in the sense that it takes practice across multiple projects to gain this skill.
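
The advice about prioritizing and keeping descriptions within natural pauses can be checked roughly while drafting a script. The following is a minimal sketch in Python, not an established tool. It assumes a narration rate of 160 words per minute, an illustrative figure that does not come from any guideline and should be replaced with the actual narrator's pace. All function names and the sample text are hypothetical.

```python
# Hypothetical narration rate; an assumption for illustration only.
ASSUMED_WORDS_PER_MINUTE = 160.0


def estimated_seconds(text: str, wpm: float = ASSUMED_WORDS_PER_MINUTE) -> float:
    """Estimate how long a description takes to read aloud at the given rate."""
    return len(text.split()) * 60.0 / wpm


def overflow_seconds(
    text: str,
    pause_start: float,
    pause_end: float,
    wpm: float = ASSUMED_WORDS_PER_MINUTE,
) -> float:
    """Return how far the description overruns the pause (0.0 if it fits)."""
    available = pause_end - pause_start
    return max(0.0, estimated_seconds(text, wpm) - available)


if __name__ == "__main__":
    # Sample draft and pause times invented for this example.
    draft = "A woman in a red coat crosses the empty street and unlocks the shop door."
    over = overflow_seconds(draft, pause_start=42.0, pause_end=45.0)
    if over > 0:
        print(f"Trim about {over:.1f} seconds of description, or reprioritize.")
    else:
        print("The draft fits within the pause.")
```

A word count is only a crude proxy for spoken length, so a check like this can flag drafts that are clearly too long but cannot replace reading the script aloud against the video.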

Though the creation of audio description can be less straightforward than the creation of captions, it is equally important. Without descriptions, videos with visual elements that are vital to complete comprehension and appreciation will exclude some users. As a result, it is important that audio descriptions be included in the process of developing accessible videos.

Notes

1. US Access Board, “Information and Communication Technology (ICT) Standards and Guidelines, Notice of Proposed Rulemaking, 36 CFR Parts 1193 and 1194,” February 2015, 157, https://www.access-board.gov/attachments/article/1702/ict-proposed-rule.pdf (page discontinued).
2. Web Accessibility Initiative, “Understanding Success Criterion 1.2.5: Audio Description (Prerecorded),” Web Content Accessibility Guidelines 2.1, W3C, accessed October 24, 2020, https://www.w3.org/WAI/WCAG21/Understanding/audio-description-prerecorded.html.
3. National Federation of the Blind, “Blindness Statistics,” January 2019, https://www.nfb.org/resources/blindness-statistics.
4. Joel Snyder, “Audio Description: The Visual Made Verbal,” International Congress Series 1282 (September 2005): 937.
5. Federal Communications Commission, “Audio Description,” May 21, 2020, https://www.fcc.gov/audio-description.
6. Department of Justice, Civil Rights Division, “Nondiscrimination on the Basis of Disability by Public Accommodations: Movie Theaters; Movie Captioning and Audio Description,” Final Rule, 28 CFR Part 36, CRT Docket No. 126, AG Order No. RIN 1190-AA63, November 21, 2016, https://www.ada.gov/regs2016/movie_rule.htm.
7. Elisa Edelberg, “Legal Requirements for Audio Description,” 3Play Media, last updated June 3, 2019, https://www.3playmedia.com/2017/03/22/legal-requirements-audio-description.
8. Web Accessibility Initiative, “Understanding Success Criterion 1.2.3: Audio Description or Media Alternative (Prerecorded),” Web Content Accessibility Guidelines 2.1, W3C, accessed October 24, 2020, https://www.w3.org/WAI/WCAG21/Understanding/audio-description-or-media-alternative-prerecorded.
9. Web Accessibility Initiative, “Understanding Success Criterion 1.2.5.”
10. Web Accessibility Initiative, “Understanding Success Criterion 1.2.8: Media Alternative (Prerecorded),” Web Content Accessibility Guidelines 2.1, W3C, accessed October 24, 2020, https://www.w3.org/WAI/WCAG21/Understanding/media-alternative-prerecorded.html.
11. “Blind and Deaf at Movie: One Hundred Applaud Talking Film at Special Showing,” New York Times, August 28, 1929, 28.
12. Jaclyn Packer, Katie Vizenor, and Joshua A. Miele, “An Overview of Video Description: History, Benefits, and Guidelines,” Journal of Visual Impairment and Blindness 109, no. 2 (2015): 84.
13. Snyder, “Audio Description,” 936.
14. Packer, Vizenor, and Miele, “Overview of Video Description,” 85.
15. Packer, Vizenor, and Miele, “Overview of Video Description,” 85.
16. Packer, Vizenor, and Miele, “Overview of Video Description,” 85.
17. Snyder, “Audio Description,” 936–37.
18. Sabine Braun, “Audiodescription Research: State of the Art and Beyond,” Translation Studies in the New Millennium 6 (2008): 21.

Figure 4.1. Audio description symbol

Published by ALA TechSource, an imprint of the American Library Association.