In my opinion, ears and microphones have much in common. In fact, even a great microphone can only come close to mimicking our ears. The ear possesses a series of filters, compressors, and equalizers that attempt to give us the best scenario for intelligibility while also protecting our hearing from damage.
Sound waves travel through the outer, middle, and inner ear, through the smallest bones in the human body, then are naturally amplified while journeying through the liquid-filled cochlea, which is lined with hair cells of differing sensitivities. From there, sound is transmitted electrically, converting sound waves into information that the brain can translate into the world around us.
Each ear also has a Eustachian tube to regulate pressure on both sides of the eardrum. If the pressure becomes uneven, we may experience temporary hearing loss and pain until the pressure is equalized. An omnidirectional microphone has a similar feature: a tube in the omni microphone prevents the diaphragm from becoming stuck at one pressure.
Sound is a vital cog in the human design. It lets us understand one another and signals us to dangerous situations or the joy of music, working 24/7 in the background within an invisible realm. Truly magical!
Within this design, the frequencies between 1 kHz and 3 kHz are intensified roughly 20:1 through the middle ear. This is the target range in which humans communicate. The larger eardrum forces pressure through the much smaller oval window of the cochlea; that additional strength is needed to drive sound through the high-impedance, liquid-filled cochlea. Between 0.5 and 5 kHz is where humans require sensitivity the most. The cochlea separates the frequencies and also boosts certain regions, such as 4 kHz, to ensure intelligibility. In my opinion, if sound enters the ear accurately through a well-placed premium microphone, the ear will not need to over-process, because the sound is understood from the start.
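As a back-of-the-envelope sketch of where that roughly 20:1 figure comes from: the middle ear concentrates the force collected by the large eardrum onto the much smaller oval window, with a little extra leverage from the ossicles. The area and lever values below are typical textbook numbers, my assumptions rather than figures from this article.

```python
import math

# Typical textbook values (assumptions, not from this article):
eardrum_area_mm2 = 55.0      # effective area of the tympanic membrane
oval_window_area_mm2 = 3.2   # area of the oval window of the cochlea
ossicular_lever_ratio = 1.3  # mechanical advantage of the ossicles

# Pressure gain: the same force focused onto a smaller area, times the lever.
pressure_gain = (eardrum_area_mm2 / oval_window_area_mm2) * ossicular_lever_ratio
gain_db = 20 * math.log10(pressure_gain)

print(f"pressure gain ~ {pressure_gain:.1f}:1 ({gain_db:.1f} dB)")
```

With these assumed values the result lands in the low twenties to one, on the order of the 20:1 figure above, which is why the middle ear can push sound into the high-impedance fluid of the cochlea at all.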
I find it interesting that these same frequencies are often boosted during a news program to increase intelligibility. Mic placement on the chest/tie is convenient, but it is not the best place for a microphone because of the 800 Hz buildup on the chest (not from the chest). This causes the "muddy" sound often heard when a microphone is used this way. Add a few more microphones to the equation, in an acoustic space that is not treated properly, and we have the perfect recipe for unintelligibility.
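When a chest mic cannot be avoided, mixers commonly dial out that buildup with a gentle cut around 800 Hz. A minimal sketch of such a corrective filter, using the well-known RBJ Audio EQ Cookbook peaking-filter coefficients; the -6 dB depth and Q of 1.4 are hypothetical values for illustration, not a prescription.

```python
import cmath
import math

def peaking_eq_coeffs(fs, f0, gain_db, q):
    """RBJ Audio EQ Cookbook peaking-filter coefficients (b, a)."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * A, -2 * math.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * math.cos(w0), 1 - alpha / A]
    return b, a

def magnitude_db(b, a, fs, f):
    """Biquad magnitude response at frequency f, in dB."""
    z = cmath.exp(-2j * math.pi * f / fs)
    num = b[0] + b[1] * z + b[2] * z * z
    den = a[0] + a[1] * z + a[2] * z * z
    return 20 * math.log10(abs(num / den))

# Hypothetical corrective cut for the chest-mic buildup: -6 dB at 800 Hz.
b, a = peaking_eq_coeffs(fs=48000, f0=800, gain_db=-6.0, q=1.4)
print(f"gain at 800 Hz: {magnitude_db(b, a, 48000, 800):.1f} dB")
```

The point of the article stands, though: this is a band-aid. A better-placed, better-behaved microphone needs less of this processing in the first place.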
A microphone that can accurately reproduce these attributes at all frequencies and angles should be the engineer's target solution, whether one microphone is open at the same time or ten. Select a microphone that allows you to transmit your speech naturally, without the need for excessive processing and equalization.
A great microphone should treat on- and off-axis sound the same regarding frequency. The only difference should be that the off-axis sound is lower in level. Natural! Like our ear/brain relationship.
There are studies suggesting that when a sound system is designed, the destination to design for is the mid-brain, where intelligibility is actually received. Common belief holds that with proper uniform speaker coverage, good signal quality, and a wide frequency response, a live sound system will satisfy all listeners with clear, intelligible sound.
Recognizing this, there are recently released studies that say, "The Eyes Point to The Ears."
Most mammals (humans included) have a special protection function in their mid-brain which automatically points their eyes in the direction of an abrupt change in the sound field. This “potential danger (or desired focused sound)” is then reconciled with what is being seen. Priority is given to any sound which might be associated with a “danger or loudness,” while sounds with a lower priority must wait in short-term memory.
Since everything that we hear passes through the mid-brain on its way to higher brain centers, the automatic protection function may either facilitate or impede the upstream transmission of sound impulses (including speech), based on how closely their origin compares with what is being seen.
When the listener's visual and aural stimuli are closely matched, the time needed to sort out the effects of competing interference is greatly reduced. But when visual and aural stimuli do not match, a significant processing delay is introduced that can rob the mind of the processing time needed for comprehension.
It is hard to concentrate on a speaker while also contending with reflective surfaces in poorly designed studios, multiple open mics, and possibly inexperienced sound mixers, all of which can distract our attention away from the source.
Your microphones need to possess the ability to portray sound in a linear fashion as our ear/brain relationship does. This will improve comprehension and intelligibility in harsh conditions and maintain a natural sound in perfect conditions. It starts at the microphone!
Intelligibility is King, and the Microphone, its Throne!
A new common practice in live speech events is placing sound reinforcement in front of the podium so the ears point at the source and prioritize that sound: the person speaking! This keeps the brain happy and undistracted, since it no longer has to reconcile the elements of the presentation.
The old "phoneme" model of how we make sense of speech, in which we process a continuous stream of individual speech components like beads on a string, has been challenged by recent science.
The syllable is now thought to be the basic organizational unit of speech. Within the syllable, particular components are more critical to intelligibility, while others serve as placeholders. Together they form a data packet with a specific energy envelope that is the foundation of an intelligible speech stream.
Equally important are the rapid transitions between syllables that are decisive for meaning. These extremely short, decisive transients serve as "markers" which are used by the brain to parse similar syllables.
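The "energy envelope" and "marker" ideas above can be made concrete with a toy sketch, entirely my own illustration rather than anything from the cited studies: a short-time energy envelope of a signal makes syllable-like energy packets, and the abrupt transitions between them, visible to a machine in much the way the brain is thought to use them for parsing.

```python
import math

def energy_envelope(samples, frame=160):
    """Mean-square energy per non-overlapping frame (10 ms at 16 kHz)."""
    return [sum(s * s for s in samples[i:i + frame]) / frame
            for i in range(0, len(samples) - frame + 1, frame)]

fs = 16000
# Two 100 ms tone bursts separated by 50 ms of silence: two toy "syllables".
burst = [math.sin(2 * math.pi * 200 * n / fs) for n in range(1600)]
silence = [0.0] * 800
signal = burst + silence + burst

env = energy_envelope(signal)
# The envelope collapses in the gap; that rapid drop and rise is the "marker".
gap = [i for i, e in enumerate(env) if e < 0.01]
print(gap)  # frames 10-14 are the low-energy transition between the packets
```

Real speech is far messier than two tone bursts, but the principle is the same: it is these short, sharp envelope transitions that a microphone's impulse response must preserve.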
Face-to-face clarity is the ultimate paradigm. When you deliver that, you've delivered the speech stream with the deck properly stacked for the listener.
I have worked in production and post-production for a few prominent broadcast facilities. It is a fast-paced environment where visual appeal is paramount and sound sometimes takes a back seat. In my opinion, sound is 51 percent of the video product. It is crucial that the message being sent is clear, uncolored by bad acoustics, and intelligible to the mass audience.
The microphone is the first point of acquisition in the sound chain, yet it often gets the least amount of the budget and attention. The goal is to make it sound clear and open, as if there is no microphone at all, whether the audience is watching a video broadcast or sitting in the same room.
The goal, then, is to find a microphone that can deliver:
1. A linear frequency response both on and off axis. This delivers a high level of intelligibility, especially when two or more individuals are speaking at the same time. Comb filtering is greatly reduced, resulting in a natural sound, much like the way the human ear and brain work together.
2. A fast impulse response, which ensures that consonants and other fast transients are reproduced accurately. This may be achieved by a high voltage suspended on the backplate of the microphone. The microphone needs to react quickly to sources such as the human voice or a tambourine.
3. The ability to handle high SPLs, a very important attribute whether the mic is on a boom or body-worn. This ensures that the microphone will not distort.
4. In a facility, sound from microphones takes multiple paths, such as internal communication systems, on-air sound, digital telephone systems, and social media feeds. Intelligibility needs to be the paramount goal. Without premium sound, listeners may become fatigued, even if the production values are amazing.
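The comb filtering mentioned in point 1 is easy to quantify. When the same voice reaches a second open mic slightly later and the two signals are summed, notches appear wherever the delayed copy arrives out of phase. A minimal sketch of where those notches land; the 34.3 cm spacing is a hypothetical example chosen to give a round 1 ms delay.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def comb_notches(extra_distance_m, max_freq_hz):
    """Notch frequencies when an equal-level delayed copy is summed.

    Cancellation occurs at odd multiples of half the delay's period:
    f = (2k + 1) / (2 * delay).
    """
    delay_s = extra_distance_m / SPEED_OF_SOUND
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_freq_hz:
        notches.append(round((2 * k + 1) / (2 * delay_s), 1))
        k += 1
    return notches

# A second open mic 34.3 cm farther from the talker -> 1 ms extra delay.
print(comb_notches(0.343, 5000))  # [500.0, 1500.0, 2500.0, 3500.0, 4500.0]
```

Note that the notches fall squarely in the 0.5 to 5 kHz band where the ear needs sensitivity most, which is exactly why undisciplined multi-mic setups hurt intelligibility so badly.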
Sound is invisible, and our ears never shut down; remember, they work even when we are asleep.
It is worth investing in the invisible: good sound relaxes the body and puts us on an even playing field with the world around us. When we do not notice the sound during a production, it usually means it was done properly.
Intelligibility is King, and the Microphone, its Crown!
Selecting the proper microphones for the right application is an integral part of transmitting the clear sonic detail required for a successful production.
Italicized passages are influenced by New Best Practices for Speech Intelligibility by SDA Consulting, Sound Design & Applied Psychoacoustics, and their contributors:
Steven M. Chase and Eric D. Young, “Cues for Sound Localization Are Encoded in Multiple Aspects of Spike Trains in the Inferior Colliculus,” April 2008; Steven Greenberg, “Understanding Speech Understanding: Towards a Unified Theory of Speech Perception,” 1996, and “A Multi-tier Framework for Understanding Spoken Language,” in Listening to Speech: An Auditory Perspective, 2006; David Griesinger, “Phase Coherence as a Measure of Acoustic Quality, part one: the Neural Mechanism; part two: Perceiving Engagement; part three: Hall Design,” 2010; Jennifer M. Groh, Making Space: How the Brain Knows Where Things Are, 2014; Norbert Kopco, I-Fan Lin, Barbara G. Shinn-Cunningham, and Jennifer M. Groh, “Reference Frame of the Ventriloquism Aftereffect,” 2009; Dominic Massaro, www.talkingbrains.org and “Tests of auditory/visual integration efficiency within the framework of the fuzzy logical model of perception,” 2000; David Perez-Gonzalez, Manuel S. Malmierca, and Ellen Covey, “Novelty detector neurons in the mammalian auditory midbrain,” 2005; Bernt C. Skottun, Trevor M. Shackleton, Robert H. Arnott, and Alan R. Palmer, “The ability of inferior colliculus neurons to signal differences in interaural delay,” April 2001; Hans Wallach, Edwin B. Newman, and Mark R. Rosenzweig, “The Precedence Effect in Sound Localization,” 1949; Sungyub Yoo, “Speech Decomposition and Enhancement,” 2005.