Author: Hoak <[email protected]>
Date: 7/29/2004 1:58:13 AM
Subject: Better Sound Render

As a Sound Designer, I find the current state of what's regarded as 'State-Of-The-Art' in game sound rendering laughable, to be generous. To be fair, until recently sound render capability and fidelity in games really hasn't been much of a concern, and with good reason; games and game design haven't offered the level of play detail and subtlety to take advantage of much more than crude 'positional' sound render capabilities, and most game Fans listen to game sound on the most abject sound hardware (even when they're under the illusion they have a State-Of-The-Art rig).

There is a slew of issues and challenges unique to game sound rendering that will only be overcome when some generous or concerned Programmer takes up the onus of seriously addressing them -- to date no one really has. The fundamental issues and serious limitations of game sound render that put it way below the bar of what's technically feasible can be summarized (in no particular order):

· limited dynamic range
· crude sub-mixing of multiple sound channels
· gross compression/companding
· simplistic, crude compression and companding algorithms
· gross interactions between mixer, compression and companding
· lack of sound and level designer control over aforesaid parameters
· complete lack of even the most basic engineering documentation of the aforesaid
· no (or very crude) steradian-based boundary effects
· cheap canned positional libraries
· cheap canned DSP libraries
· very poor perspective (first to third person) and proximity effects
· crap-tastic tools (worst in the industry)
· no security
· poor sync
· undocumented black-box sound manipulation

As just about everything that can be wrong with sound render in games is wrong, even the smallest concerted attempt at addressing some of these issues with the crudest of solutions would be a godsend. In many cases the issues and limitations of crusty sound renderer 'back planes' and features could be overcome by the simple expedient of documenting how they perform, and at the very least offering Sound & Level Designers the means to disable mixer compression, AGC, and DSP effects and features, and to create these effects statically/manually.

Arguably the largest issue confronting fidelity in game sound render is automating the dynamic mixing of an indefinite and changing number of sound sources, at changing positions, without letting them overload. In essence, sound renderers are required to automate the task of a live-show Sound Engineer setting up for multiple performers -- different kinds of music, different instruments, different numbers of performers and musical genres, all on-the-fly -- a nearly impossible task with no automation, only crude DSP, and very crude compression schemes.
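To put the overload problem in concrete terms, here's a back-of-the-napkin sketch (my own illustration, not anyone's shipping mixer) of what the renderer is up against: naively summing an arbitrary, changing number of source buffers, then taming the sum with the crudest kind of soft-clip limiter -- exactly the sort of blunt instrument current renderers reach for automatically.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // One block of floating-point samples from a single active sound source.
    using Buffer = std::vector<float>;

    // Naively sum an arbitrary, changing number of sources into one output
    // buffer, then soft-clip the sum so it can't overload digital full scale.
    // tanh() stands in for the compressor/limiter a real mixer would apply.
    Buffer mixSources(const std::vector<Buffer>& sources, std::size_t blockSize)
    {
        Buffer out(blockSize, 0.0f);
        for (const Buffer& src : sources)
            for (std::size_t i = 0; i < blockSize && i < src.size(); ++i)
                out[i] += src[i];

        // Without this stage, ten simultaneous full-scale sources would sit
        // roughly 20 dB over full scale and clip horribly.
        for (float& sample : out)
            sample = std::tanh(sample);   // gentle saturation instead of a hard clip
        return out;
    }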

The current solution has been to use massive amounts of audio compression and companding (not to be confused with digital data compression), reducing dynamic range on a ghastly scale -- and while this is a better-sounding solution than gross digital overload, the dynamic range achieved and the double-digit distortion figures obviate any need for high fidelity audio hardware beyond the cheapest EAX-compatible card and discount headphones that aren't physically painful to wear.

The value of decent sound render capability won't be readily apparent unless or until it's available for a capable Sound Designer and Game Designer to collaborate on a design that exploits it. And it won't be the 'knock your socks off' kind of thing like HUGE explosions and cheap positional panning effects of jets or magical plasma balls screaming past or behind you... The value and benefit will be subtle, deep, and immersive: acoustics will match the virtual render space, and transitions from one space to the next will give a lot of the subtle cues and atmosphere of the kind that can raise the hair on your neck, make you shiver, uncomfortable, awe-struck, or hand you those spooky jolts of adrenaline you don't quite understand. Wind won't have to sound like the choofing on a news anchorman's microphone, and the subtle sonic density of the game world will be rich on a scale larger than the visual difference from Doom to Doom III. With decent sub-mixing and control over compression schemes, compelling dynamic range and transient effects can be had without a mud-slide of slew distortion, offering even the most jaded drama and intensity freaks a case of the "WoWoW's"; with decent render capability there is the potential for realistic sound that will consistently lift you out of your seat.

A big improvement doesn't require massive engineering, a million lines of code, or even any real innovation; a lot of the code for high quality DSP, compression, and automated mixing is available for free in open source DSP, DirectX, and VST audio plug-in projects. An engine architecture that could host DX or VST plug-ins would be a real breakthrough, allowing Sound Designers to do what Level Designers are finally able to do with Crytek's Sandbox and DooM III's new integrated Level Tools: sound design and post production in real time.
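To illustrate what 'hosting plug-ins' would mean at the code level, here's a hypothetical, stripped-down processing interface -- the names are mine, and this is not the real VST or DirectX plug-in API, just the shape of the idea: the engine hands each loaded effect a block of samples, and the Sound Designer chooses and orders the effects on each mix bus.

    #include <cstddef>
    #include <memory>
    #include <utility>
    #include <vector>

    // Hypothetical minimal effect interface, loosely in the spirit of a
    // VST-style process callback.  A real host would also handle parameters,
    // latency, state, and plug-in discovery.
    class AudioEffect {
    public:
        virtual ~AudioEffect() = default;
        // Process one block of interleaved stereo samples in place.
        virtual void process(float* samples, std::size_t frameCount) = 0;
    };

    // A mix bus owns an ordered chain of effects chosen by the Sound Designer.
    class MixBus {
    public:
        void addEffect(std::unique_ptr<AudioEffect> fx) { chain.push_back(std::move(fx)); }

        void process(float* samples, std::size_t frameCount) {
            for (auto& fx : chain)
                fx->process(samples, frameCount);   // reverb, compressor, EQ, ...
        }
    private:
        std::vector<std::unique_ptr<AudioEffect>> chain;
    };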

I'm well aware that when I hoof and choof with a rant like this I'm suggesting a lot of work for a programmer; but in this case there are some low/no-code solutions that would be a huge step in the right direction. My Big Three are (in order of importance/value):

· documentation
· separate, assignable sound channels
· on/off control of some of the game's mixing and DSP

Documentation is number one; having to design sound for black-box DSP and mixing, where you don't have a clue about the thresholds, range, or actual specifications of the DSP being applied by the game's sound renderer, you're fucked into an enormously wasteful cycle of trial and error; this is probably why there are so many amateur game sound designers that 'seem' to get acceptable results, and so few real Sound Designers/Engineers that are willing to bother or take the work seriously.

The whole EAX, Miles, AC3, HRTF positional foo-faw in current sound renderers is, to be blunt -- crap. The difference calculated between first and third person perspectives, and between indoor and outdoor sounds, is so simplistic that it's cruder than the graphics equivalent of using DooM's 2D sprites for player models. TrueCombat's first programmer took an initial step in the right direction by offering three separate assignable 'spatial' channels for weapon sounds; but a lot more can be achieved with this method: separate channels for first and third person sounds, and the ability to trigger an indefinite number of channels with one event simultaneously. This last is not only an outstanding means of overcoming the limitations of dynamic range for truly loud events, it allows for the creation of very realistic sounds with complex and much more realistic harmonic motion. I was fortunate to have one programmer offer this as a feature for one mod project, so I can speak from experience and say it works very well.
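To show what I mean by one event triggering several assignable channels at once, here's a hypothetical sketch (names and structure are my own, not TrueCombat's code): a single weapon-fire event fans out to a first-person layer, a third-person layer, and a distant tail layer, each with its own level.

    #include <cstdio>
    #include <string>
    #include <vector>

    // Hypothetical layered sound event: one gameplay trigger, several channels.
    struct SoundLayer {
        std::string sample;          // e.g. "rifle_mech_fp.wav"
        float       gainDb;          // per-layer level set by the Sound Designer
        bool        firstPersonOnly; // only heard from the shooter's perspective
    };

    struct SoundEvent {
        std::vector<SoundLayer> layers;
    };

    // Stand-in for the engine's voice allocator; here it just logs what it would play.
    void playSample(const std::string& sample, float gainDb)
    {
        std::printf("play %s at %.1f dB\n", sample.c_str(), gainDb);
    }

    // Fire every layer of the event that applies to this listener's perspective.
    void triggerEvent(const SoundEvent& ev, bool listenerIsFirstPerson)
    {
        for (const SoundLayer& layer : ev.layers) {
            if (layer.firstPersonOnly && !listenerIsFirstPerson)
                continue;
            playSample(layer.sample, layer.gainDb);
        }
    }

    // Example: a rifle shot built from three layers instead of one overloaded sample.
    SoundEvent makeRifleShot()
    {
        return { {
            { "rifle_mech_fp.wav",  -3.0f,  true  },  // close mechanical detail, first person only
            { "rifle_crack_3p.wav", -6.0f,  false },  // body of the shot, heard by everyone
            { "rifle_tail_far.wav", -12.0f, false }   // long distant tail/reflection
        } };
    }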

Being able to manually set up and control mixing and DSP effects would allow for the use of much more capable and powerful DSP in post-processing a sound for a game. Listen to a SoundBlaster Audigy II's laughable reverbs next to convolution reverbs sampled from real spaces, like Waves IR1, and the difference is something anyone can hear and appreciate. Being able to control the threshold of the AGC and companding and the ratio of compression would be a dream come true, but just being able to turn the crappy processing off or set a threshold would be a huge leap forward.
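For what it's worth, the control I'm asking for amounts to exposing a handful of numbers. Here's a minimal sketch of a bare-bones feed-forward compressor with designer-settable threshold, ratio, and bypass -- the parameter names are my own assumptions, not any engine's actual API.

    #include <cmath>

    // Hypothetical designer-facing compressor settings.
    struct CompressorSettings {
        bool  bypass      = false;   // "just let me turn it off"
        float thresholdDb = -12.0f;  // level above which gain reduction starts
        float ratio       = 4.0f;    // 4:1 -> 4 dB of overshoot becomes 1 dB
    };

    // Instantaneous gain computer (no attack/release smoothing), just enough
    // to show how threshold and ratio shape the output level.
    float compressSample(float in, const CompressorSettings& s)
    {
        if (s.bypass || in == 0.0f)
            return in;

        float levelDb = 20.0f * std::log10(std::fabs(in));
        if (levelDb <= s.thresholdDb)
            return in;                                  // below threshold: untouched

        // Above threshold, the overshoot is divided by the ratio.
        float outDb = s.thresholdDb + (levelDb - s.thresholdDb) / s.ratio;
        float gain  = std::pow(10.0f, (outDb - levelDb) / 20.0f);
        return in * gain;
    }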

While I'm sure commercial Sound Designers have access to better tools, they can't be that much better considering the pathetic results they get. Call Of Duty is a case in point: they not only had all the resources of Activision, id Software, and Creative Labs at hand -- they had a budget to do on-location sound recording of all the weapons and sound emitters in the game; and the results they got for the tools, time, and money they had were, in my humblest of opinions, abominable. One man, at home, alone, did a better job on Red Orchestra with DSP-convolved sounds created for free with a couple hundred dollars in tools...

These results and improvements that can be had are not small; a good analogy is that you can tell the difference between a stereo music CD and monophonic AM radio through the cheapest and crudest audio hardware -- even a three-inch PM speaker driven by a half-watt audio amplifier is adequate for nearly anyone to appreciate the difference.

Rant Off...

Minneapolis, 2004