Video games are a rapidly evolving form of entertainment that leverages huge technological advancements to create new forms of interaction and to tell ever deeper stories. It’s no wonder that graphics have been at the forefront of these advancements, pushing to blur the line between film and video games. One of the most important aspects of film, however, is audio. With every advancement in film projection, there was a proportional change in audio resolution or spatialization. This, unfortunately, has not been the case with games. While games have seen multiple jumps in visual fidelity per generation, audio has seen only slow, linear change. Let's take a brief stroll through the history of visual and audio advancements in games to illustrate the point.

Note: We will not cover anything before the 8-bit generation, such as monochrome displays, oscilloscope graphics displays, and arcade cabinets.

3rd Generation (NES)

Graphic Advancements

  • The NES supported 64 8x8 or 8x16 sprites

  • The NES could display ~48-52 colors on-screen from specified color palettes

  • 256x240 interlaced output

Audio Advancements

  • Mono output

  • 5 simultaneous voices

  • 4 sound generators (waveform and noise generators)

  • 1 low-quality sample player


4th Generation (SNES and Sega Genesis)

Graphic Advancements

  • Resolution up to 512x448 interlaced (SNES)

  • 128 simultaneous sprites at 8x8 or multiples thereof (SNES)

  • 15-bit color palette capable of displaying 256 simultaneous colors (SNES)

  • Parallax scrolling, and the Mode 7 graphics mode provided additional visual effects (blending, pixelization, etc.)

  • The SNES had the Super FX chip, which allowed hundreds of polygons to be rendered.

Audio Advancements

  • Stereo audio

  • 8 simultaneous voices

  • Rudimentary DSP for effects like echo and panning


5th Generation (PlayStation, Sega Saturn, Nintendo 64)

Graphic Advancements

  • 750x756 interlaced output resolution (Sega Saturn, N64)

  • 16M-color, 24-bit palette

  • 207,000 simultaneous colors displayed on screen (N64)

  • 150,000 polygons/sec (Sega Saturn, N64)

  • 600,000 flat textures/sec

  • Textures, shading, bitmap

  • Anti-aliasing, Z-buffering (N64)

  • Mipmapping, texture filtering for sprites (N64)

  • 16k simultaneous sprites (Sega Saturn)

Audio Advancements

  • 16-bit audio, 44.1 kHz PCM audio

  • Stereo output

  • 32 sound channels (Sega Saturn)

  • Internal DSP with pitch modulation, digital reverb, and ADSR


6th Generation (PS2, Xbox, GameCube)

Graphic Advancements

  • 480-1080 interlaced output

  • 32-bit color palette

  • 75M polygon fill rate (PS2)

  • 932 megapixel/sec texture fillrate (Xbox)

  • FSAA, bump mapping, anisotropic filtering, alpha blending, diffuse, specular, particle, physics simulations

Audio Advancements

  • 64-bit PCM audio, 48 kHz

  • 256 simultaneous voices

  • Stereo output

  • DSP including Dolby Pro Logic, Dolby Digital 5.1, and DTS 5.1

  • Spatial sound frameworks, interactive music frameworks


7th Generation (PS3, Xbox 360, Wii)

Graphic Advancements

  • 720-1080 progressive output

  • 128-bit color

  • 500M polygon fill rate (Xbox 360)

  • 4.4 gigapixel/sec texture fill rate (PS3)

  • Normal mapping, dynamic tessellation, animation blending, subsurface scattering, ambient occlusion, soft body dynamics, crowd simulations, volumetric lighting, volumetric fog, resolution upscaling

Audio Advancements

  • LPCM audio up to 192 kHz

  • Up to 7.1 channel audio

  • DSP including Dolby TrueHD and DTS-HD

  • Spatial sound frameworks with dynamic EQ

  • Adaptive audio systems and interactive audio


8th Generation (PS4 and Xbox One)

Graphic Advancements

  • Up to 4K at 60 fps (Xbox One X, PS4 Pro)

  • 1.7B polygon fill rate (Xbox One)

  • Up to 187 gigapixel/sec texture fill rate (Xbox One X)

  • HDR output (HDR10, Dolby Vision)

  • Global illumination, physically based rendering, screen-space ambient occlusion and reflections, AI-assisted upscaling

Audio Advancements

  • DSP including Dolby Atmos, DTS:X, and Dolby

  • Advancements in spatial modeling


As you can tell, graphics make multiple huge advancements every generation, bringing games closer and closer to cinema quality. Now let's take a look at audio. The huge shifts in audio happen only in line with dramatic shifts in technology. The CD generation brought in huge fidelity improvements. The DVD generation brought huge changes to audio middleware. The latest generation is finally allocating computational power to DSP that can simulate 3D sound with HRTFs or use ray tracing to simulate more accurate sound environments. Even with those once-a-generation shifts, game audio still lacks the immediacy and emphasis of music scoring found in movies. This matters all the more because we’re approaching a point of diminishing returns with graphics advancements. We’ll need to start looking at new ways to improve the game experience, and the obvious area to start with is audio.

This generation will see huge improvements to the spatial rendering of sounds with HRTFs and ambisonics. With these improvements, we hope to also see game scores begin to have the immediacy of movie scores. Today, music changes are usually driven by events in the game. There is an inevitable lag with this method because the game has to trigger an event, the event is sent to the audio engine, and the audio engine cues up a stinger to play at the next available beat point. To really change this, we have to think about music as a way to inform games of when game events can trigger. Imagine a rousing musical score giving beat information to the game so that select punches can be timed to points that make musical sense. Imagine those events, in turn, triggering changes in the audio. This is the key to truly interactive music, and it starts with music.
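To make the difference between the two approaches concrete, here is a minimal sketch in Python. It assumes a score with a fixed tempo, and every name in it (BeatClock, stinger_lag, plan_punch) is hypothetical and made up for illustration, not taken from any real audio middleware. The first function measures the lag when the game fires an event and the audio engine has to wait for the next beat. The second flips the relationship: the game asks the music when the next beat is and schedules both the punch and the music cue for that moment.

```python
import math
from dataclasses import dataclass


@dataclass
class BeatClock:
    """Hypothetical beat clock for a score with a fixed tempo (an assumption)."""
    bpm: float

    @property
    def seconds_per_beat(self) -> float:
        return 60.0 / self.bpm

    def next_beat_time(self, now: float) -> float:
        """Time in seconds of the first beat at or after `now`."""
        beats_elapsed = now / self.seconds_per_beat
        # The small epsilon keeps a time that sits exactly on a beat
        # from being pushed to the following beat by rounding error.
        return math.ceil(beats_elapsed - 1e-9) * self.seconds_per_beat


def stinger_lag(clock: BeatClock, event_time: float) -> float:
    """Game-driven music: the stinger waits for the next beat after the event."""
    return clock.next_beat_time(event_time) - event_time


def plan_punch(clock: BeatClock, input_time: float) -> dict:
    """Music-driven gameplay: the punch and its music cue share one beat."""
    beat = clock.next_beat_time(input_time)
    return {"punch_time": beat, "cue_time": beat}


if __name__ == "__main__":
    clock = BeatClock(bpm=120.0)  # 0.5 seconds per beat
    print("lag when the game leads:", stinger_lag(clock, 1.1))
    print("plan when the music leads:", plan_punch(clock, 1.1))
```

In the first case the punch lands on screen at 1.1 seconds and the music answers on the next beat, so the player hears the response late. In the second case the game holds the punch until the beat, so the hit and the musical accent arrive together. A real score with tempo changes would need a tempo map rather than a single bpm value.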

This generation can truly be the generation of game audio. All of the pieces are there. There is strong middleware support. There are finally hardware resources allocated to it. There is a strong need within the industry to find ways to advance the craft. If developers take on the challenge of putting audio on par with graphics, we might finally meet the ideal of cinema, and even exceed it.
