Thursday, August 25, 2011

Kashu-do (歌手道): The Singer's Formant and the Singer's Formant Region: Defining "Ring"

Superficial understanding of vocal science is one of the greatest enemies of progress in science-based vocal pedagogy.  When a self-described science-based teacher refers to the acoustic region between 2kHz and 3kHz as the Singer's Formant, it is not always clear what they mean.  Too often, one will point to one of the 3 potential peaks that lie between 2000 and 3200 Hz and say: "see, there is the ring in the voice!" To this a self-described "Bel Canto" teacher will say, "well I don't hear much ring!"

The five formants (regions of acoustic strength) in the vocal tract occur at approximately 500Hz, 1000Hz, 2000Hz, 2500Hz and 3000Hz.  These are not precise numbers, and their values shift depending on adjustments (vowel changes) in the vocal tract.  The first formant of the [i] vowel, for example, lies around 280Hz.  This means that the lowest area of acoustic intensity when the vocal tract is shaped to the [i] vowel is around 280Hz.  When the tract is shaped to [a], the lowest area of acoustic intensity is around 800Hz.  Any space will have acoustic regions that amplify sound.  Some concert halls are friendly to high voices and others to low voices.  This is simply a way of explaining the nature of formants.  What is unusual about the human voice as an instrument is that its resonator, the vocal tract, is flexible and changeable.  It can readjust to intensify any given pitch.  The region of the singer's formant is particularly interesting because the human ear is extremely sensitive to the area between 2000 and 3200Hz.

Now to the argument between the "self-described" science-based teacher and the "self-described" Bel Canto teacher.  The former points to strong energy in the SF area and calls it the ring in the voice.  The latter claims not to hear any ring.  Empirically they would both be correct and therefore, paradoxically, both wrong in their assessments.  The reasoning is the following: strong energy in the region between 2000 and 3200 Hz will be perceived as particularly strong by the human ear and could be enough to make a singer easily discernible in the presence of an orchestra.  Practically, that could be enough (especially given the low expectations of modern-day opera).  But to the Bel Canto teacher who seeks a particularly intense experience relative to ring, the mere presence of the upper three formants in strength is not enough.  And so he will discredit the science-based teacher for not understanding the nature of ring.  This would not be entirely wrong.  However, most voices exhibit energy in the SF area precisely in this way, and it is enough for most orchestral situations.  Nevertheless, this does not constitute "ring" in the traditional sense, nor does it in fact meet the scientific definition of "ring", which indeed coincides with the results the Bel Canto teacher expects.

The scientific definition of "ring" is not a mere presence of strong energy in the upper formants but rather a cluster effect of two of the three formants.  If we take the note Bb4 (c. 460Hz), the fifth harmonic (5th multiple of the fundamental frequency) would be at 2300Hz, squarely between the third formant (c. 2100 Hz) and the fourth (c. 2500Hz).  Effective vocal tract tuning (including vowel, laryngeal depth, aryepiglottic fold diameter, etc.) would raise the third formant and lower the fourth such that both energies would impact the fifth harmonic.  Most of the vocal energy in the upper part of the spectrum would center on the fifth harmonic.  The third formant of the [u] vowel falls just below 2300 Hz and its fourth formant just above 2300 Hz.  This would support a constant assertion by Gioacchino Livigni that the Old School tenors (before Corelli and Del Monaco) tended to pursue this strategy of clustering around the 5th harmonic on notes Bb and above.  A similar strategy is possible on the 6th harmonic (c. 2760 Hz).  This would require tuning the fourth and fifth formants of the back vowels ([u, o, a]) respectively up and down to cluster around the 6th harmonic, or the third and fourth formants of the front vowels ([i, e, E]).  This is the strategy of the tenors of the second half of the 20th century.  This is equally viable.  What is not as efficient is spreading the three formants on three different harmonics.  Because the human ear is particularly sensitive to the SF region (c. 2000-3000 Hz), even if the energy is spread between three harmonics, as long as it is strong, it will have a strong impact on the listener's ear even with an orchestra.  It will, however, not be the same powerful impact that is heard from top singers, who by some personal sensitivity can achieve the complex tuning of the vocal tract necessary to cluster the formants as described above.
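
The harmonic arithmetic above is easy to check by hand, but a minimal Python sketch may help readers who want to try other notes.  It assumes the approximate figures given in this post (Bb4 at about 460Hz, formants near 2100, 2500 and 3000Hz); the function names are hypothetical and chosen only for illustration.  Real formant values vary by singer and vowel, so this only shows which harmonic falls between two formants, not how to tune them.

```python
def harmonic_frequencies(f0_hz, count):
    """Return the first `count` harmonics (integer multiples) of f0_hz."""
    return [n * f0_hz for n in range(1, count + 1)]


def harmonics_between(f0_hz, low_hz, high_hz, count=10):
    """Return (harmonic number, frequency) pairs strictly between two formants."""
    return [
        (n, n * f0_hz)
        for n in range(1, count + 1)
        if low_hz < n * f0_hz < high_hz
    ]


# Approximate values taken from the post; all are assumptions, not measurements.
BB4_HZ = 460
F3_HZ, F4_HZ, F5_HZ = 2100, 2500, 3000

print(harmonic_frequencies(BB4_HZ, 6))          # harmonics 1 through 6 of Bb4
print(harmonics_between(BB4_HZ, F3_HZ, F4_HZ))  # candidate for an F3/F4 cluster
print(harmonics_between(BB4_HZ, F4_HZ, F5_HZ))  # candidate for an F4/F5 cluster
```

With these numbers the fifth harmonic (2300Hz) is the only one between the third and fourth formants, and the sixth (2760Hz) is the only one between the fourth and fifth, which matches the two strategies described above.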

The fact is that science and tradition agree, if one takes the time to really understand both.  The constant assertion that science is not far along enough to make a difference in vocal pedagogy is just the easy judgment of those who are either not interested in understanding science as deeply as they need to for it to be relevant or are simply afraid to have their techniques proven inadequate.  In truth, science is there less to discredit anyone's approach than to help us refine the approaches dictated by our instincts.  Inspired teachers begin with great instincts.  Great teachers go beyond their instincts and educate themselves with all the available facts.  We must never forget that the most celebrated voice teacher at the height of the Bel Canto period was Manuel Garcia, for whom Rossini composed the role of Conte di Almaviva in Barbiere di Siviglia.  He is the same Manuel Garcia whose son, Manuel Garcia, Jr., invented the first mirror laryngoscope, basically the same tool used by doctors and ENTs for superficial pharyngeal/laryngeal analysis.  Garcia, Jr. is considered the first true science-based voice teacher.

PS.  Since my latest PC crashed, I acquired a Mac.  I have tried to calibrate the frequency region of Audacity for spectrographic analysis but to no avail.  It is possible that the version I have has a bug.  I would welcome advice from other Mac users who use Audacity or who have recommendations for another spectrographic analyzer.  Spectrograms would enhance this post, but unfortunately VoceVista does not work on the Mac and will not any time soon.

© 08/25/2011


Justin Petersen said...

Hey J-R,

Also interesting that Manuel Garcia I's children: Maria Malibran, Manuel Garcia II, and Pauline Viardot were all successful singers/teachers in their own right. What a family!!

What exercises would you utilize to encourage the "ring" in the voice that you are referring to?



MKR said...

Ron, I remain bewildered by the term "formant," at least as you explain it here:

The first formant of the [i] vowel for example lies around 280Hz. This means that the lowest area of acoustic intensity when the vocal tract is shaped to the [i] vowel is around 280Hz. When the tract is shaped to [a] the lowest area of acoustic intensity is around 800Hz.

I do not see how to make sense of this. If I sing, for example, at A2, be the vowel [i], [a], or any other, the first peak in the acoustic wave form will be at the fundamental, around 110 Hz. Since I know that you know this perfectly well, I have to assume that when you say "the lowest area of acoustic intensity," you don't mean anything that I would understand by that phrase--in particular, you don't mean the lowest frequency range in which the signal has a peak of intensity. So what do you mean?

Jean-Ronald LaFond said...

Yes Miles, the spectrograph will show the fundamental pitch A 110 and its multiples. If the vocal tract were a neutral area (formant-less), each successive multiple (harmonic, overtone) would get weaker. When you sing [i] on that A2, you should notice that the 3rd peak (most likely) would be taller than the others around it. If you sing [a], it would be the 7th or 8th peak, depending on the quality of the vowel. Formants are frequency bands that draw the energy of the vocal tract to them. So wherever the formants fall, if there is a harmonic nearby, it will enhance it. Sometimes the drawing of the energy is so strong that the fundamental itself ends up being quite weak and even invisible. Hope this makes sense.
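
For anyone who wants to see the numbers behind that answer, here is a minimal Python sketch, assuming the first-formant values quoted in the post (about 280Hz for [i] and 800Hz for [a]) and A2 at 110Hz.  The function name is hypothetical, and picking the single closest harmonic is a simplification: a real formant is a band, so neighboring harmonics are boosted too.

```python
def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic number, frequency) of the harmonic closest to a formant."""
    # Divide the formant frequency by the fundamental and round to the
    # nearest whole multiple; never go below the fundamental itself.
    n = max(1, round(formant_hz / f0_hz))
    return n, n * f0_hz


A2_HZ = 110  # fundamental of the sung note

# First-formant values are the approximate figures from the post (assumptions).
print(nearest_harmonic(A2_HZ, 280))  # [i]: the 3rd harmonic, 330 Hz
print(nearest_harmonic(A2_HZ, 800))  # [a]: the 7th harmonic, 770 Hz
```

A slightly higher first formant for [a] would move the result to the 8th harmonic (880Hz), which is why the peak depends on the quality of the vowel.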

Daniel James Shigo said...

Really enjoyed this blog post! Just wanted to draw attention to one matter.

It is a commonly held view, but García did not, in fact, invent the laryngoscope. At least three other gentlemen, Bozzini, Czermak and Babington, had invented devices to see the vocal folds.

See the article here:

García was, however, the first person to use it and publish his results. And for that he justly achieved world-wide fame.

Arachne said...

Re your problems with Audacity, have you tried Amadeus Pro? This is what I use on my Mac. I've tried others, but found this was far and away the best.

Jean-Ronald LaFond said...

What a wonderful community of experts!

First, thank you Dear Arachne for suggesting Amadeus Pro! I had worked with it a while back but totally forgot it existed since working with VV.

And thank you Dan for the scholarly report on the Laryngoscope issue. I will leave the post as is to reflect the necessity of your correction.

Thank you for making the blog a place of information.

Much love,


Jean-Ronald LaFond said...

Good point Justin about the Garcia children! What a vocal family indeed!

KG said...

So, how does one go about tuning the 3rd, 4th, and 5th formants? It is my understanding from reading Sundberg, Titze, and a lot of other stuff that the fourth and fifth formants are essentially fixed by nature and can only really be affected substantially by lowering or raising the larynx, which lowers or raises all formants. The third, if I remember correctly, depends on the tip of the tongue...

From what I have observed in my students, lighter voices tend to have higher F4 and F5 and lower voices lower ones. I, for instance, seem to have F5, even with a completely depressed larynx, no lower than about 3300.

Perhaps the especially ringy voices are simply gifts of nature, and not "tuned" in the same way F1 and F2 can be "tuned." This would also explain why different singers have certain notes that just ring better than everything else--like Lauri-Volpi's Bb.

Just some thoughts--I'm sure I'm off-base at least partially. I just get frustrated sometimes by the nature of science in singing, as it seems a lot better at describing what is happening than at explaining how to make it happen.


Blue Yonder said...

Hi TS! A little tech tip: There's a commercial program called Parallels that will let you run Windows inside of OS X on your Mac (it runs within its own window). Then, you can install & run Windows-only applications like VoceVista. For more info:

Jean-Ronald LaFond said...

Thank you for this information Blue Yonder! I will investigate!


loosestrife said...

It's a year later and you've probably found something full-featured that works for you, but in a pinch Gerry Beauregard, a Canadian in Singapore who produces a range of audio tools, has one available online. You can read about it at .