Audio Analysis
What Is Acoustic Voice Analysis Used For?
Acoustic analysis of voice function is frequently used in clinics to obtain meaningful information about underlying laryngeal pathology through careful examination of the signals emitted from the mouth. Acoustic studies are non-invasive as they focus solely on sound. They can be performed using live or recorded sound samples.
What Should Be Considered When Performing Acoustic Voice Analysis?
It is important to use appropriate techniques and protocols to ensure the accuracy of the data used for acoustic analysis. Acoustic recording should ideally be performed using a professional grade condenser microphone placed 3-4 cm from the mouth, at an angle of 45-90 degrees (off-axis). A head-mounted microphone can also be used to maintain a consistent distance between the mouth and the microphone. Background noise and room echo should be minimised and a soundproof room should be used if possible. Data should either be recorded directly to a digital computer or stored using digital audio tape (DAT) technology.
What Information Is Obtained from Analytical Tools in Acoustic Voice Assessment?
The most commonly used analytical tools in acoustic voice assessment are as follows:
- Fundamental Frequency
- Intensity
- Voice Range Profile
- Spectrography and Spectral Measurements
- Perturbation Measurements
- Nonlinear Measurements
What Is Fundamental Frequency (F0) in Acoustic Voice Analysis?
Fundamental frequency refers to the number of vibrations of the vocal folds per second, expressed in hertz (Hz). It is the primary factor in pitch perception. Fundamental frequency (F0) can be measured from a prolonged vowel at a comfortable pitch within the patient's F0 range or during speech or reading. Typical speaking F0 values vary with age, gender, psychological state, intensity, and the speaking task.
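As a rough illustration of measuring F0 from a prolonged vowel, the sketch below picks the strongest autocorrelation peak within a plausible pitch range. It is a minimal example under stated assumptions, not the method used by any particular clinical tool: the function name `estimate_f0`, the search limits `f0_min` and `f0_max`, and the synthetic 200 Hz test tone are all hypothetical choices made for this example.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=60.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame by autocorrelation peak picking.

    f0_min and f0_max are assumed search limits, not clinical norms.
    Returns None if no positive autocorrelation peak is found.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]          # remove any DC offset
    lag_min = int(sample_rate / f0_max)      # shortest period considered
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation; shorter lags use more terms,
        # which biases the choice toward the first period, not a multiple.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag

# Synthetic stand-in for a sustained vowel: a pure 200 Hz tone, 50 ms long.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
f0 = estimate_f0(tone, sample_rate)
```

Real voice recordings are noisier than this test tone, so practical tools add voicing detection, interpolation between lags, and octave-error checks on top of this basic idea.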
Normal Values of Fundamental Frequency (F0)
In prepubescent girls and boys, the average is around 220 to 240 Hz; in adult women, it is 200-220 Hz, and in adult men, it is 100-120 Hz. The gender-related difference diminishes later in life, with the speaking fundamental frequency (F0) decreasing in women and increasing in men.
Fundamental frequency (F0) is primarily influenced by the length, mass, and tension of the vocal fold. These parameters may vary due to normal personal differences or the presence of pathology. Adult vocal folds are longer and have greater mass than those of children, resulting in a lower F0; similarly, male vocal folds are longer and heavier than those of females, resulting in a lower F0.
- Excessively high F0 can result from restricted vocal fold length and mass (as in laryngeal web or androgen deficiency in adolescent males) or from increased musculoskeletal tension and laryngeal posturing.
- Excessively low F0 can result from increased vocal fold mass (as in vocal fold edema or polypoid degeneration or exposure to androgens in women) or decreased tension (as in superior laryngeal nerve neuropathy or impaired cricothyroid function).
What Does Intensity Represent in Acoustic Voice Analysis?
Intensity is the physical correlate of loudness and is measured in decibels (dB). It can be measured directly with a sound level meter or derived from the amplitude of a properly calibrated acoustic voice signal. As with fundamental frequency (F0), intensity can be measured from prolonged vowels, reading, or speech, or at the minimum and maximum intensity limits (to measure dynamic range). Vocal intensity tends to rise as F0 rises, and it varies with physical and psychological state. On average, intensity is typically 2 dB higher in children than in adults and in men than in women. Intensity depends on subglottic pressure (Ps) and the amplitude of vocal fold vibration: increasing Ps during normal phonation increases vibration amplitude and mucosal wave movement.
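To make the idea of deriving intensity from a calibrated signal concrete, here is a minimal sketch. It assumes a calibration tone was recorded through the same microphone and gain settings and that its level was read from a sound level meter at the same time. The function names, the 94 dB reference value, and the synthetic signals are illustrative assumptions, not part of any standard protocol described here.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_db(samples, reference_rms, reference_db):
    """Level in dB relative to a calibration recording of known level.

    reference_rms: RMS amplitude of the calibration recording (assumed).
    reference_db:  its level as read from a sound level meter (assumed).
    """
    value = rms(samples)
    if value <= 0 or reference_rms <= 0:
        raise ValueError("RMS amplitudes must be positive")
    return reference_db + 20.0 * math.log10(value / reference_rms)

# Hypothetical calibration tone and a voice frame ten times larger in amplitude.
calibration = [0.01 * math.sin(2 * math.pi * i / 40) for i in range(400)]
voice_frame = [0.10 * math.sin(2 * math.pi * i / 40) for i in range(400)]
voice_level = level_db(voice_frame, rms(calibration), 94.0)
```

A tenfold increase in amplitude corresponds to a 20 dB increase in level, which is why the logarithm is multiplied by 20. Dynamic range is then simply the difference between the loudest and softest levels obtained this way.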
During Normal Speech
- Increased intensity may be a sign of difficulty monitoring one's own voice, associated with conductive hearing loss.
- Decreased intensity may indicate inadequate respiratory support, insufficient glottic closure, or reduced tissue flexibility that limits vocal fold vibration amplitude. Reduced tissue flexibility can also make it difficult to sustain vocal fold oscillation at low subglottic pressure, which restricts voice production at low intensity.
What Is the Voice Range Profile in Acoustic Voice Analysis?
Voice Range Profile (VRP), or phonetogram, visually presents the physiological limits of the voice system by combining fundamental frequency (F0) and dynamic range data. Voice Range Profile (VRP) can be created with a keyboard and sound level meter or with computer programs that allow for data collection and presentation. Intensity is typically shown in dB, while F0 is displayed as linear frequency (Hz), logarithmic semitones (ST), or a percentage of the total range.
A normal VRP shows a wide dynamic range in the mid-frequencies that narrows at the extremes. As F0 increases, both the lowest and highest intensity values tend to increase. The VRP can be described both quantitatively and qualitatively. Variations have been identified based on age, gender, previous vocal training, and the presence of vocal pathology.
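When F0 is plotted on a semitone scale rather than in hertz, the conversion is logarithmic: each doubling of frequency adds 12 semitones. The sketch below shows that conversion. The function names and the example frequencies are assumptions chosen for illustration.

```python
import math

def hz_to_semitones(frequency_hz, reference_hz):
    """Semitone distance of frequency_hz above (or below) reference_hz."""
    return 12.0 * math.log2(frequency_hz / reference_hz)

def pitch_range_semitones(lowest_hz, highest_hz):
    """Width of an F0 range expressed in semitones."""
    return hz_to_semitones(highest_hz, lowest_hz)

# Hypothetical F0 limits for one speaker: 110 Hz to 440 Hz (two octaves).
range_st = pitch_range_semitones(110.0, 440.0)
```

Expressing the range in semitones makes voices with different absolute pitch, such as adult male and female voices, easier to compare on the same axis.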
What Are Spectrography and Spectral Measurements in Acoustic Voice Analysis?
Spectrography is a powerful analytical technique that provides information about both the glottic source and the vocal tract filter function. A spectrogram is obtained by converting successive short segments of the acoustic waveform from the time domain to the frequency domain using Fourier analysis.
In addition to harmonic and formant information, the spectrogram also provides quantitative information about abnormal voice quality. A normal voice signal contains well-defined harmonics and formants, while a dysphonic voice signal contains weak and irregular harmonic-formant patterns and high-frequency noise.
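The time-to-frequency conversion behind a spectrogram can be sketched as a short-time Fourier transform: the signal is cut into overlapping frames, each frame is tapered with a window, and the magnitude of its discrete Fourier transform is stored. The version below uses only the standard library and a direct (slow) DFT so that every step is visible. The frame length, hop size, Hann window, and test tone are assumptions for this example, and real tools use a fast Fourier transform instead.

```python
import cmath
import math

def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame is a list of DFT magnitudes
    for bins 0 .. frame_len // 2 (bin k is k * sample_rate / frame_len Hz)."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Test signal: 1000 Hz tone sampled at 8000 Hz, so bins are 125 Hz apart.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]
spec = spectrogram(tone)
peak_bins = [frame.index(max(frame)) for frame in spec]
```

For a steady tone every frame peaks in the same bin. In a voiced sound the same computation shows a stack of harmonics at multiples of F0, and noise appears as energy spread between and above them.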
What Are Perturbation Measurements in Acoustic Voice Analysis?
Perturbation refers to cycle-to-cycle variations in fundamental frequency, amplitude, and waveform morphology. Some level of cycle-to-cycle perturbation is expected even in normal voices. However, high degrees of perturbation are associated with the presence of vocal pathology. Physiopathologically, the presence of perturbation in the voice signal indicates irregularity in vocal fold vibration.
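Two widely reported perturbation measures are jitter (cycle-to-cycle variation in period) and shimmer (cycle-to-cycle variation in amplitude). A common "local" definition is the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. The sketch below assumes the cycle periods and peak amplitudes have already been extracted from a sustained vowel. The function name and the sample values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, divided by
    the mean value and expressed as a percentage.

    Applied to cycle periods this gives local jitter; applied to cycle
    peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

# Hypothetical cycle periods in milliseconds and peak amplitudes (arbitrary units).
periods_ms = [5.0, 5.1, 5.0, 5.1]
amplitudes = [0.80, 0.80, 0.80, 0.80]
jitter_percent = local_perturbation_percent(periods_ms)
shimmer_percent = local_perturbation_percent(amplitudes)
```

Both measures depend on reliable detection of individual cycles, so they are only meaningful for signals that are close to periodic.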
What Are Nonlinear Measurements in Acoustic Voice Analysis?
Nonlinear approaches to acoustic voice analysis are based on the idea that the complex and often unpredictable nature of vocal fold vibration cannot be adequately described using linear approaches (e.g. the source-filter theory of speech production). These analytical approaches originate from nonlinear dynamics theory. According to this theory, the outputs of complex systems are not random but arise from nonlinear causes within the system. Such systems are called chaotic. They are governed by their internal states and rules, which makes them deterministic, yet they are nonlinear, unpredictable, low-dimensional (controlled by a relatively small number of parameters), and highly sensitive to initial conditions (small differences grow over time).
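Sensitivity to initial conditions can be illustrated with a toy system that has nothing to do with the larynx: the logistic map, a one-line deterministic rule often used in textbooks on chaos. The sketch below is only an analogy and not a model of vocal fold vibration. The starting values and the parameter `r = 4.0` are assumptions chosen because the map is chaotic at that setting.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), a standard toy
    example of deterministic chaos (not a voice model)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two runs whose starting points differ by one part in a billion.
run_a = logistic_trajectory(0.2, 60)
run_b = logistic_trajectory(0.2 + 1e-9, 60)
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]
```

The rule is fully deterministic and has a single parameter, yet the gap between the two runs grows from a negligible value to the same order as the signal itself within a few dozen steps. This is the behaviour that "deterministic but unpredictable" refers to.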
Which Signal Types Can Be Analysed in Acoustic Voice Analysis?
The sound signals analysed can be roughly divided into three types. Type 1 signals are periodic. Type 2 signals contain strong sub-harmonics and modulations and can be intermittent. Type 3 signals are chaotic. A spectrogram can be used to classify the signals. For perturbation analysis the signal must be Type 1.