Proceeding Paper

A Weakly Informative Prior for Resonance Frequencies †

Marnix Van Soom * and Bart de Boer

AI Lab, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
* Author to whom correspondence should be addressed.
† Presented at the 40th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, online, 4–9 July 2021.
Phys. Sci. Forum 2021, 3(1), 2; https://doi.org/10.3390/psf2021003002
Published: 4 November 2021

Abstract

We derive a weakly informative prior for a set of ordered resonance frequencies from Jaynes’ principle of maximum entropy. The prior facilitates model selection problems in which both the number and the values of the resonance frequencies are unknown. It encodes a weakly inductive bias, provides a reasonable density everywhere, is easily parametrizable, and is easy to sample. We hope that this prior can enable the use of robust evidence-based methods for a new class of problems, even in the presence of multiplets of arbitrary order.

1. Introduction

An important problem in the natural sciences is the accurate measurement of resonance frequencies. The problem can be formalized by the following probabilistic model:
$$p(D, x \mid I) = p(D \mid x)\, p(x \mid I) \equiv L(x)\, \pi(x), \tag{1}$$
where D is the data, x = {x_k}_{k=1}^K are the K resonance frequencies of interest, and I is the prior information about x. As an example instance of (1), we refer to the vocal tract resonance (VTR) problem discussed in Section 5, for which D is audio recorded from the mouth of a speaker, x is a set of K VTR frequencies, and the underlying model is a sinusoidal regression model. Any realistic problem will include additional model parameters θ, but these have been silently ignored by formally integrating them out of (1), i.e., p(D, x | I) = ∫ dθ p(D, x, θ | I).
In this paper, we assume that the likelihood L(x) ≡ p(D | x) is given, and our task is to choose an uninformative prior π(x) ≡ p(x | I) from limited prior information I. A conflict arises, however:
The uninformative priors π most commonly chosen to express limited prior information I are, in practice, often precluded by that same I.    (2)
The goal of this paper is to describe this conflict (2) and to show how it can be resolved by adopting a specific choice for π . This allows robust inference of the number of resonances K in the important case of such limited prior information I, which in turn enables accurate measurement of the resonance frequencies x with standard methods such as nested sampling [1] or reversible jump MCMC [2].

2. Notation

The symbol π is intended to convey a vague notion of a generally uninformative or weakly informative prior. Definite choices for π are indicated with the subscript i:
$$\pi_i(x) \equiv p(x \mid \beta_i, I_i), \qquad (i = 1, 2, 3), \tag{3}$$
where β i is a placeholder for the hyperparameter specific to π i . Note that in the plots below and for the experiments in Section 5, the values of the β i are always set according to Table 1.
Each π i uniquely determines a number of important high-level quantities since the likelihood L ( x ) and data D are assumed to be given. These quantities are the evidence for the model with K resonances:
$$Z_i(K) = \int d^K x \, L(x)\, \pi_i(x), \tag{4}$$
the posterior:
$$P_i(x) = \frac{L(x)\, \pi_i(x)}{Z_i(K)}, \tag{5}$$
and the information:
$$H_i(K) = \int d^K x \, P_i(x) \log \frac{P_i(x)}{\pi_i(x)}, \tag{6}$$
which measures the amount of information obtained by updating from the prior π_i to the posterior P_i, i.e., H_i(K) = D_KL(P_i | π_i), where D_KL is the Kullback–Leibler divergence.
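To make (4)–(6) concrete, here is a minimal numerical sketch in Python for a hypothetical one-dimensional problem; the Gaussian stand-in likelihood and flat prior are our own placeholders, not the VTR model of Section 5.

```python
import numpy as np

# Hypothetical 1-D illustration of Eqs. (4)-(6): evidence Z, posterior P, and
# information H computed on a grid. The Gaussian stand-in likelihood and the
# flat prior are placeholders, not the paper's VTR model.
x = np.linspace(100.0, 5000.0, 20_000)                  # frequency grid (Hz)
dx = x[1] - x[0]
L = np.exp(-0.5 * ((x - 1200.0) / 50.0) ** 2)           # stand-in likelihood L(x)
prior = np.full_like(x, 1.0 / (x[-1] - x[0]))           # flat prior pi(x)

Z = np.sum(L * prior) * dx                              # evidence, Eq. (4)
P = L * prior / Z                                       # posterior density, Eq. (5)
H = np.sum(P * np.log((P + 1e-300) / prior)) * dx       # information (nats), Eq. (6)
print(f"log Z = {np.log(Z):.2f}, H = {H:.2f} nats")
```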

3. Conflict

The uninformative priors π referenced in (2) are of the independent and identically distributed type:
$$\pi(x) = \prod_{k=1}^{K} g(x_k \mid \beta), \tag{7}$$
where g ( x | β ) is any wide distribution with hyperparameters β . A typical choice for g is the uniform distribution over the full frequency bandwidth; other examples include diffuse Gaussians or Jeffreys priors [3,4,5,6,7,8,9].
Second, the limited prior information I in (2) about K implies that the problem will involve model selection, since each value of K implicitly corresponds to a different model for the data. It is, thus, necessary to evaluate and compare the evidence Z(K) = ∫ d^K x L(x) π(x) for each plausible K.
The conflict between these two elements is due to the label switching problem, a well-known issue in mixture modeling, e.g., [10]. The likelihood functions L(x) used in models parametrized by resonance frequencies are typically invariant to switching the labels k; i.e., the index k of the frequency x_k has no distinguishable meaning in the model underlying the data. The posterior P(x) ∝ L(x) π(x) will inherit this exchange symmetry if the prior is of type (7). Thus, if the model parameters x are well determined by the data D, the posterior landscape will consist of one primary mode, defined as a mode living in the ordered region:
$$R_K(x_0) = \{\, x \mid x_0 \le x_1 \le x_2 \le \dots \le x_K \,\} \quad \text{with } x_0 > 0, \tag{8}$$
and (K! − 1) induced modes, which are identical to the primary mode up to a permutation of the labels k and, thus, live outside of the region R_K(x_0). The trouble is that correctly taking these induced modes into account during the evaluation of Z(K) requires a surprising amount of extra work on top of tuning the MCMC method of choice; this is the label switching problem in our setting. In fact, there is currently no widely accepted solution to the label switching problem in the context of mixture models either [11,12]. This, then, is how in (2) uninformative priors π are "precluded" by the limited information I: the latter implies model selection, which in turn requires evaluating Z(K), which is hampered by the label switching problem due to the exchange symmetry of the former. Therefore, it seems better to avoid the problem altogether by encoding our preference for primary modes directly into the prior. This means abandoning the uninformative prior π in favor of the weakly informative prior π_3, which is proposed in Section 4 as a solution to the conflict.
We use the VTR problem to briefly illustrate the label switching problem in Figure 1. The likelihood L ( x ) is described implicitly in Section 5 and is invariant to switching the labels k because the underlying model function (23) of the regression model is essentially a sum of sinusoids, one for each x k . As frequencies can be profitably thought of as scale variables ([13], Appendix A), the uninformative prior (7) is represented by
$$\pi_1(x) \equiv p(x \mid x_0, x_{\max}, I_1) = \prod_{k=1}^{K} h(x_k \mid x_0, x_{\max}), \tag{9}$$
where β_1 ≡ (x_0, x_max) denotes a common lower and upper bound, and
$$h(x \mid a, b) = \begin{cases} \dfrac{1}{\log(b/a)}\, \dfrac{1}{x} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \quad \text{with } a > 0,\ b < \infty \tag{10}$$
is the Jeffreys prior, the conventional uninformative prior for a scale variable (although any prior of the form (7) that is sufficiently uninformative would yield essentially the same results). We have visualized the posterior landscape P_1(x) in Figure 1 by using the pairwise marginal posteriors P_1(x_k, x_ℓ) plotted in blue. Note the exchange symmetry of P_1, which manifests as an (imperfect) reflection symmetry around the dotted diagonal x_k = x_ℓ bordering the ordered region R_3(x_0). The primary mode is identified by the black dot; all other modes are induced modes. Integrating over all K! modes to obtain Z(K) quickly becomes intractable for K ≥ 4.
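For illustration, sampling from the bounded Jeffreys prior (10), and hence from π_1 in (9), can be done by inverting its CDF. The following sketch is ours (function names are not from the paper's code):

```python
import numpy as np

def sample_jeffreys(a, b, size, rng):
    """Draw from h(x | a, b) of Eq. (10) by inverting its CDF
    F(x) = log(x/a) / log(b/a), i.e., x = a * (b/a)**u with u ~ Uniform(0, 1)."""
    u = rng.uniform(size=size)
    return a * (b / a) ** u

def sample_pi1(K, x0, xmax, size, rng):
    """Draw `size` i.i.d. vectors x = (x_1, ..., x_K) from pi_1 of Eq. (9)."""
    return sample_jeffreys(x0, xmax, (size, K), rng)

rng = np.random.default_rng(0)
# 5 draws with the Table 1 settings x0 = 200 Hz, xmax = 5500 Hz; note that the
# components are unordered, which is exactly what gives rise to the K! modes.
print(sample_pi1(K=3, x0=200.0, xmax=5500.0, size=5, rng=rng))
```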

A Simple Way Out?

A simple way out of the conflict is to break the exchange symmetry by assuming specialized bounds for each x_k:
$$\pi_2(x) \equiv p(x \mid a, b, I_2) = \prod_{k=1}^{K} h(x_k \mid a_k, b_k), \tag{11}$$
where β_2 ≡ (a, b), with a = {a_k}_{k=1}^K and b = {b_k}_{k=1}^K being hyperparameters specifying the individual bounds. However, in order to enable the model to detect doublets (a resolved pair of two close frequencies, such as the primary mode in the leftmost panel of Figure 1), it is necessary to assign overlapping bounds in (a, b), presumably by using some heuristic. The necessary degree of overlap increases as the detection of higher-order multiplets such as triplets (which can and do occur) is desired, but the more overlap in (a, b), the more the label switching problem returns. Despite this issue, there will be cases where we have sufficient prior information I to set the (a, b) hyperparameters without too much trouble; the VTR problem is such a case, for which the overlapping values of (a, b) up to K = 5 are given in Table 1.

4. Solution

Our solution to the conflict (2) is a chain of K coupled Pareto distributions:
$$\pi_3(x) \equiv p(x \mid \bar{x}_0, I_3) = \prod_{k=1}^{K} \operatorname{Pareto}(x_k \mid x_{k-1}, \lambda_k) \tag{12}$$
where
$$\operatorname{Pareto}(x \mid x_*, \lambda) = \begin{cases} \dfrac{\lambda\, x_*^{\lambda}}{x^{\lambda+1}} & \text{if } x \ge x_* \\ 0 & \text{otherwise} \end{cases} \quad \text{with } x_* > 0,\ \lambda > 0, \tag{13}$$
and the hyperparameter β_3 ≡ x̄_0 is defined as
$$\bar{x}_0 \equiv (\bar{x}_0, \bar{x}), \qquad \bar{x}_0 := x_0, \qquad \bar{x} = \{\bar{x}_k\}_{k=1}^{K}, \qquad \lambda_k = \frac{\bar{x}_k}{\bar{x}_k - \bar{x}_{k-1}}. \tag{14}$$
From Figure 2, it can be seen that π_3 encodes weakly informative knowledge about K ordered frequencies: (12) and (13) together imply that π_3(x) is defined only for x̄ ∈ R_K(x_0), while being nonzero only for x ∈ R_K(x_0). In other words, its support is precisely the ordered region R_K(x_0), which automatically solves the label switching problem underlying the conflict, as the exchange symmetry of π is broken. This is illustrated in Figure 2, where P_3 contracts to a single primary mode, which is just what we would like.
The K + 1 hyperparameters x̄_0 in (14) are a common lower bound x_0 plus the K expected values of the resonance frequencies x̄. While the former is generally easily determined, the latter may seem difficult to set given the premise of this paper that we have only limited prior information I. Why do we claim that π_3 is only weakly informative if it is parametrized by the expected values of the very things it is supposed to be only weakly informative about? The answer is that, for any reasonable amount of data, inference based on π_3 is completely insensitive to the exact values of x̄. Therefore, any reasonable guess for x̄_0 will suffice in practice. For example, for the VTR problem, we simply applied a heuristic where we take x̄_k = k × 500 Hz (see Table 1). This insensitivity is due to the maximum entropy status of π_3 and indicates the weak inductive bias it entails. On a more prosaic level, the heavy tails of the Pareto distributions in (12) ensure that the prior will eventually be overwhelmed by the data, no matter how a priori improbable the true value of x is. More prosaically still, in Section 5.1 below we show quantitatively that, for the VTR problem, π_3 is about as (un)informative as π_2.
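To make (12)–(14) concrete, the following minimal Python sketch (helper names are ours, not from the paper's repository) computes the rate parameters λ_k from the hyperparameter x̄_0 and evaluates log π_3(x), which is −∞ outside the ordered region R_K(x_0):

```python
import numpy as np

def rates(xbar, x0):
    """Rate parameters lambda_k = xbar_k / (xbar_k - xbar_{k-1}) of Eq. (14),
    with xbar_0 := x0; `xbar` holds the K expected frequencies in Hz."""
    xbar_full = np.concatenate(([x0], np.asarray(xbar, dtype=float)))
    return xbar_full[1:] / np.diff(xbar_full)

def log_pi3(x, xbar, x0):
    """log pi_3(x) from Eq. (12): a chain of Pareto(x_k | x_{k-1}, lambda_k)
    densities, supported on the ordered region R_K(x0)."""
    x = np.asarray(x, dtype=float)
    x_prev = np.concatenate(([x0], x[:-1]))
    if np.any(x < x_prev):                     # outside R_K(x0): zero density
        return -np.inf
    lam = rates(xbar, x0)
    return np.sum(np.log(lam) + lam * np.log(x_prev) - (lam + 1.0) * np.log(x))

# Heuristic hyperparameters from Table 1: x0 = 200 Hz and xbar_k = k * 500 Hz
K, x0 = 3, 200.0
xbar = 500.0 * np.arange(1, K + 1)
print(rates(xbar, x0))                         # approx. [1.67, 2.0, 3.0]
print(log_pi3([480.0, 1100.0, 1600.0], xbar, x0))
```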

4.1. Derivation of π 3

Our ansatz consists of interpreting the x as a set of K ordered scale variables that are bounded from below by x 0 . Starting from (9) and not bothering with the bounds ( a , b ) , we obtain the improper pdf
$$m(x) \propto \begin{cases} \prod_{k=1}^{K} \dfrac{1}{x_k} & x \in R_K(x_0) \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$
We can simplify (15) using the one-to-one transformation x ↔ u defined as
$$x \to u:\quad u_k = \log \frac{x_k}{x_{k-1}} \quad (k = 1, 2, \dots, K), \qquad u \to x:\quad x_k = x_0 \exp\!\Big(\sum_{\kappa=1}^{k} u_\kappa\Big) \quad (k = 1, 2, \dots, K), \tag{16}$$
which yields (with abuse of notation for brevity)
$$m(u) \propto \begin{cases} 1 & u \ge 0 \\ 0 & \text{otherwise.} \end{cases} \tag{17}$$
Since model selection requires proper priors, we need to normalize m(u) by adding extra information (i.e., constraints) to it; we propose to simply fix the K first moments ⟨u⟩ = {⟨u_k⟩}_{k=1}^K. This will yield the Pareto chain prior π_3(u) directly, expressed in u space rather than x space. The expression for π_3(u) is found by minimizing the Kullback–Leibler divergence [14]
$$D_{\mathrm{KL}}(\pi_3 \mid m) = \int d^K u \, \pi_3(u) \log \frac{\pi_3(u)}{m(u)}, \quad \text{subject to} \quad \langle u \rangle \equiv \int d^K u \, u\, \pi_3(u) = \bar{u}, \tag{18}$$
where ū = {ū_k}_{k=1}^K are the supplied first moments. This variational problem is equivalent to finding π_3(u) by means of Jaynes' principle of maximum entropy, with m(u) serving as the invariant measure [15]. Since the exponential distribution Exp(x | λ) is the maximum entropy distribution for a random variable x ≥ 0 with a fixed first moment ⟨x⟩ = 1/λ, the solution to (18) is
π 3 ( u ) = k = 1 K Exp ( u k | λ k ) ,
where the rate hyperparameters are λ_k = 1/ū_k, and
$$\operatorname{Exp}(x \mid \lambda) = \begin{cases} \lambda \exp\{-\lambda x\} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases} \quad \text{with } \lambda > 0. \tag{20}$$
Transforming (19) to x space using (16) finally yields (12), but we still need to express λ_k in terms of x̄, since we might find it hard to pick reasonable values of ū_k = ⟨log(x_k/x_{k−1})⟩ from limited prior information I. For this, we will need the identity
$$\langle x_k \rangle \equiv \int d^K x \, x_k \, \pi_3(x) = \frac{\lambda_k}{\lambda_k - 1} \langle x_{k-1} \rangle \quad (k = 1, 2, \dots, K). \tag{21}$$
Constraining ⟨x_k⟩ = x̄_k for all k (with x̄_0 := x_0), (21) becomes x̄_k = λ_k x̄_{k−1}/(λ_k − 1); solving for λ_k, we obtain λ_k = x̄_k/(x̄_k − x̄_{k−1}), in agreement with (14). Note that the existence of the first marginal moments ⟨x_k⟩ requires that λ_k > 1.

4.2. Sampling from π 3

Sampling from π_3 is trivial because the u_k are independent in u space (19). To produce a sample x ∼ π_3(x) given the hyperparameter x̄_0, compute the corresponding rate parameters {λ_k}_{k=1}^K from (14) and use them in (19) to obtain a sample u ∼ π_3(u). The desired x is then obtained from u using the transformation (16).
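A minimal Python sketch of this procedure (function names are ours):

```python
import numpy as np

def sample_pi3(xbar, x0, size, rng):
    """Draw `size` samples x ~ pi_3(x) given the hyperparameter (x0, xbar).

    Steps: lambda_k from Eq. (14); independent u_k ~ Exp(lambda_k), Eq. (19);
    then x_k = x0 * exp(u_1 + ... + u_k), Eq. (16)."""
    xbar_full = np.concatenate(([x0], np.asarray(xbar, dtype=float)))
    lam = xbar_full[1:] / np.diff(xbar_full)                       # Eq. (14)
    u = rng.exponential(scale=1.0 / lam, size=(size, lam.size))    # Eq. (19)
    return x0 * np.exp(np.cumsum(u, axis=1))                       # Eq. (16)

rng = np.random.default_rng(0)
# Three draws with the Table 1 heuristic (x0 = 200 Hz, xbar_k = k * 500 Hz);
# every row is ordered, x_1 <= x_2 <= ... <= x_K, i.e., it lies in R_K(x0).
print(sample_pi3(xbar=500.0 * np.arange(1, 6), x0=200.0, size=3, rng=rng))
```

Since E[x_k] = x̄_k by construction (see (21)), the column means of `samples` approach the supplied x̄_k for large `size`, although this Monte Carlo check converges slowly when some λ_k are close to 1 because of the heavy Pareto tails.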

5. Application: The VTR Problem

We now present a relatively simple but realistic instance of the problem of measuring resonance frequencies, which will allow us to illustrate the above ideas. The VTR problem consists of measuring human vocal tract resonance (VTR) frequencies x for each of five representative vowel sounds taken from the CMU ARCTIC database [16]. The VTR frequencies x describe the vocal tract transfer function T ( x ) and are fundamental quantities in acoustic phonetics [17]. The five vowel sounds are recorded utterances of the first vowel in the words W = { shore , that , you , little , until } . In order to achieve high-quality VTR frequency estimates x ^ , only the quasi-periodic steady-state part of the vowel sound is considered for the measurement. The data D, thus, consists of a string of highly correlated pitch periods. See Figure 3 for an illustration of these concepts.
The measurement itself is formalized as inference using the probabilistic model (1). The model assumed to underlie the data is the sinusoidal regression model introduced in [18]; due to limited space, we only describe it implicitly. The sinusoidal regression model assumes that each pitch period d ∈ D can be modeled as
$$d_t = f(t; A, \alpha, x) + \sigma e_t \quad \text{where } e_t \sim N(0, 1), \quad (t = 1, 2, \dots, T), \tag{22}$$
where d = { d t } t = 1 T is a time series consisting of T samples. The model function
$$f(t; A, \alpha, x) = \sum_{k=1}^{K} \big[ A_k \cos(x_k t) + A_{K+k} \sin(x_k t) \big] \exp\{-\alpha_k t\} + \sum_{\ell=1}^{L} A_{2K+\ell}\, t^{\ell-1} \tag{23}$$
consists of a sinusoidal part (first ∑) and a polynomial trend correction (second ∑). Note the additional model parameters θ = {A, α, σ, L}. Formally, given the prior p(θ) ([18], Section 2.2), the marginal likelihood L(x) is then obtained as L(x) = ∫ dθ L(x, θ) p(θ), where the complete likelihood L(x, θ) is implicitly given by (22) and (23). Practically, we just marginalize out θ from samples obtained from the complete problem p(D, x, θ | I).
For inference, the computational method of choice is nested sampling [1] using the dynesty library [19,20,21,22,23], which scales roughly as O(K²) [24]. Since the VTR problem is quite simple (H_i(K) ≈ 30 nats), we only perform single nested sampling runs and take the obtained log Z_i(K) and H_i(K) as point estimates. Full details on the experiments and data are available at https://github.com/mvsoom/frequency-prior.
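As a rough sketch of how π_3 plugs into nested sampling, the following shows a dynesty setup in which the prior enters through a prior transform mapping the unit hypercube to x space; the log-likelihood below is a placeholder Gaussian, not the marginal sinusoidal-regression likelihood actually used in the paper, and all helper names are ours.

```python
import numpy as np
from dynesty import NestedSampler

K, x0 = 3, 200.0
xbar = 500.0 * np.arange(1, K + 1)              # Table 1 heuristic (Hz)
xbar_full = np.concatenate(([x0], xbar))
lam = xbar_full[1:] / np.diff(xbar_full)        # rate parameters, Eq. (14)

def prior_transform(q):
    """Map a point q in the unit hypercube to x ~ pi_3 via the inverse CDF of
    Exp(lambda_k), Eq. (19), followed by the u -> x map of Eq. (16)."""
    u = -np.log1p(-q) / lam
    return x0 * np.exp(np.cumsum(u))

def log_likelihood(x):
    """Placeholder Gaussian log-likelihood; substitute the marginal likelihood
    L(x) of the sinusoidal regression model (Section 5) here."""
    return -0.5 * np.sum(((x - np.array([700.0, 1200.0, 2600.0])) / 50.0) ** 2)

sampler = NestedSampler(log_likelihood, prior_transform, ndim=K)
sampler.run_nested()
res = sampler.results
print("log Z =", res.logz[-1], "+/-", res.logzerr[-1], ", H =", res.information[-1], "nats")
```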

5.1. Experiment I: Comparing π 2 and π 3

In Experiment I, we perform a high-level comparison between π_2 and π_3 in terms of evidence (4) and information (6). The values of the hyperparameters used in the experiment are listed in Table 1. We did not include π_1 in this comparison, as the label switching problem prevented convergence of the nested sampling runs for K ≥ 4. The (a, b) bounds for π_2 were based on loosely interpreting the VTRs as formants and consulting formant tables from standard works [25,26,27,28,29,30]. These allowed us to compile bounds up to the fifth formant, such that K_max = 5. For π_3, we simply applied a heuristic where we take x̄_k = k × 500 Hz. We selected x_0 empirically (although a theoretical approach is also possible [31]), and x_max was set to the Nyquist frequency. The role of x_max is to truncate π_3 in order to avoid aliasing effects, since the support of π_3(x_k) is unbounded from above. We implemented this by using the following likelihood function in the nested sampling program:
$$L'(x) = \begin{cases} L(x) & \text{if } x_k \le x_{\max} \text{ for all } (k = 1, 2, \dots, K) \\ 0 & \text{otherwise.} \end{cases} \tag{24}$$
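In code, this truncation is a thin wrapper around the untruncated log-likelihood (a sketch with our own names; −∞ plays the role of the zero branch of (24)):

```python
import numpy as np

def truncated_loglike(x, loglike, x_max=5500.0):
    """Eq. (24): return log L(x) if all x_k <= x_max, and log 0 = -inf otherwise."""
    return loglike(x) if np.all(np.asarray(x) <= x_max) else -np.inf
```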
First, we compare the influence of π_2 and π_3 on model selection. Given D ∈ W, the posterior probability of the number of resonances K is
$$p_i(K) = \frac{Z_i(K)}{\sum_{K'} Z_i(K')} \quad (K = 1, 2, \dots, K_{\max}). \tag{25}$$
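Since nested sampling returns log Z_i(K), (25) is best evaluated in log space; a minimal sketch (the log-evidence values below are hypothetical, for illustration only):

```python
import numpy as np
from scipy.special import logsumexp

def posterior_over_K(logZ):
    """Normalize log-evidences {log Z(K)}_{K=1,...,Kmax} into p(K), Eq. (25)."""
    logZ = np.asarray(logZ, dtype=float)
    return np.exp(logZ - logsumexp(logZ))

# Hypothetical log-evidence values for K = 1, ..., 5
print(posterior_over_K([-1520.3, -1498.7, -1490.2, -1489.9, -1492.5]))
```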
The results in the top row of Figure 4a are striking: while p 2 ( K ) shows individual preferences based on D, p 3 ( K ) prefers K = K max unequivocally.
Second, in Figure 4b, we compare π 2 and π 3 directly in terms of differences in evidence [ log Z i ( K ) ] and uninformativeness [ H i ( K ) ] for each combination ( D , K ) .
Arrows pointing eastward indicate Z_3(K) > Z_2(K). The π_3 prior dominates the π_2 prior in terms of evidence for almost all values of K, indicating that π_3 places its mass in regions of higher likelihood or, equivalently, that the data were much more probable under π_3 than under π_2. This implies that the hint from π_3 of more structure beyond K_max should be taken seriously; we investigate this in Section 5.2.
Arrows pointing northward indicate H 3 ( K ) > H 2 ( K ) , i.e., π 3 is less informative than π 2 , since more information is gained by updating from π 3 to P 3 than from π 2 to P 2 . It is observed that π 2 and π 3 are roughly comparable in terms of (un)informativeness.

5.2. Experiment II: ‘Free’ Analysis

We now freely look for more structure in the data by letting K vary up to K_max = 10. This goes beyond the capabilities of π_1 (because of the label switching problem) and of π_2 (because no data are available to set the (a, b) bounds). The great advantage of π_3 is thus that we can use a simple heuristic to set x̄_0 and let the model do the discovering, without worrying about convergence issues or the validity of the obtained evidence values. The bottom row in Figure 4a shows that model selection for the VTR problem is well defined, with the most probable values of K < 10, except for D = until. That case is investigated in Figure 3, where the need for more VTRs (higher K) is apparent from the unmodeled broad peak centered at around 3000 Hz in the FFT power spectrum (right panel). Incidentally, this spectrum also shows that spectral peaks are often resolved into more than one VTR, which underlines the importance of using a prior that enables trouble-free handling of multiplets of arbitrary order. A final observation from the spectrum is that the inferred x̂_k differ substantially from the supplied values in x̄ (Table 1), which hints at the weak inductive bias underlying π_3.

6. Discussion

It is only when the information in the prior is comparable to the information in the data that the prior probability can make any real difference in parameter estimation problems or in model selection problems.
([32], p. 9)
Although the weakly informative prior for resonance frequencies π 3 is meant to be overwhelmed, its practical advantage (i.e., solving the label switching problem) will nonetheless persist, making a real difference in model selection problems even when “the information in the prior” is much smaller than “the information in the data”. In this sense, π 3 is quite unlike the prior referenced in the above quote. Since it will be overwhelmed, all it has to do is provide a reasonable density everywhere (which it does), be easily parametrizable (which it is), and be easy to sample from (which it is).
Thus, we hope that this prior can enable the use of robust evidence-based methods for a new class of problems, even in the presence of multiplets of arbitrary order. The prior is compatible with off-the-shelf exploration algorithms and solves the label switching problem without any special tuning or post processing. It would be interesting to compare it to other approaches, e.g., [33], especially in terms of exploration efficiency. It is valid for any collection of scale variables that is intrinsically ordered, of which frequencies and wavelengths seem to be the most natural examples. Some examples of recent work where the prior could be applied directly are:
  • Nuclear magnetic resonance (NMR) spectroscopy [34];
  • Resonant ultrasound spectroscopy (a standard method in material science) [35];
  • The analysis of atomic spectra [36], such as X-ray diffraction [37];
  • Accurate modeling of instrument noise (in this case LIGO/Virgo noise) [38];
  • Model-based Bayesian analysis in acoustics [39].

Author Contributions

Conceptualization, writing, methodology, and analysis: M.V.S. Supervision: B.d.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Flemish AI plan and by the Research Foundation Flanders (FWO) under grant number G015617N.

Acknowledgments

We would like to thank Roxana Radulescu, Timo Verstraeten, and Yannick Jadoul for helpful comments.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Skilling, J. Nested Sampling for General Bayesian Computation. Bayesian Anal. 2006, 1, 833–859.
  2. Green, P.J. Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination. Biometrika 1995, 82, 711–732.
  3. Mark, Y.Z.; Hasegawa-Johnson, M. Particle Filtering Approach to Bayesian Formant Tracking. In Proceedings of the IEEE Workshop on Statistical Signal Processing, St. Louis, MO, USA, 28 September–1 October 2003.
  4. Zheng, Y.; Hasegawa-Johnson, M. Formant Tracking by Mixture State Particle Filter. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, QC, Canada, 17–21 May 2004; Volume 1, pp. 1–565.
  5. Yan, Q.; Vaseghi, S.; Zavarehei, E.; Milner, B.; Darch, J.; White, P.; Andrianakis, I. Formant Tracking Linear Prediction Model Using HMMs and Kalman Filters for Noisy Speech Processing. Comput. Speech Lang. 2007, 21, 543–561.
  6. Mehta, D.D.; Rudoy, D.; Wolfe, P.J. Kalman-Based Autoregressive Moving Average Modeling and Inference for Formant and Antiformant Tracking. J. Acoust. Soc. Am. 2012, 132, 1732–1746.
  7. Shi, Y.; Chang, E. Spectrogram-Based Formant Tracking via Particle Filters. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Hong Kong, China, 6–10 April 2003; Volume 1, p. 1.
  8. Deng, L.; Lee, L.J.; Attias, H.; Acero, A. Adaptive Kalman Filtering and Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model. IEEE Trans. Audio Speech Lang. Process. 2007, 15, 13–23.
  9. Luberadzka, J.; Kayser, H.; Hohmann, V. Glimpsed Periodicity Features and Recursive Bayesian Estimation for Modeling Attentive Voice Tracking. Int. Congr. Acoust. 2019, 9, 8.
  10. Stephens, M. Dealing with Label Switching in Mixture Models. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2000, 62, 795–809.
  11. Celeux, G.; Kamary, K.; Malsiner-Walli, G.; Marin, J.M.; Robert, C.P. Computational Solutions for Bayesian Inference in Mixture Models. arXiv 2018, arXiv:1812.07240.
  12. Celeux, G.; Frühwirth-Schnatter, S.; Robert, C.P. Model Selection for Mixture Models – Perspectives and Strategies. arXiv 2018, arXiv:1812.09885.
  13. Bretthorst, G.L. Bayesian Spectrum Analysis and Parameter Estimation; Springer: Berlin/Heidelberg, Germany, 1988.
  14. Knuth, K.H.; Skilling, J. Foundations of Inference. Axioms 2012, 1, 38–73.
  15. Jaynes, E.T. Prior Probabilities. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 227–241.
  16. Kominek, J.; Black, A.W. The CMU Arctic Speech Databases. In Proceedings of the Fifth ISCA Workshop on Speech Synthesis, Pittsburgh, PA, USA, 14–16 June 2004.
  17. Van Soom, M.; de Boer, B. A New Approach to the Formant Measuring Problem. Proceedings 2019, 33, 29.
  18. Van Soom, M.; de Boer, B. Detrending the Waveforms of Steady-State Vowels. Entropy 2020, 22, 331.
  19. Speagle, J.S. Dynesty: A Dynamic Nested Sampling Package for Estimating Bayesian Posteriors and Evidences. arXiv 2019, arXiv:1904.02180.
  20. Feroz, F.; Hobson, M.P.; Bridges, M. MULTINEST: An Efficient and Robust Bayesian Inference Tool for Cosmology and Particle Physics. Mon. Not. R. Astron. Soc. 2009, 398, 1601–1614.
  21. Neal, R.M. Slice Sampling. Ann. Stat. 2003, 31, 705–767.
  22. Handley, W.J.; Hobson, M.P.; Lasenby, A.N. POLYCHORD: Nested Sampling for Cosmology. Mon. Not. R. Astron. Soc. 2015, 450, L61–L65.
  23. Handley, W.J.; Hobson, M.P.; Lasenby, A.N. POLYCHORD: Next-Generation Nested Sampling. Mon. Not. R. Astron. Soc. 2015, 453, 4384–4398.
  24. Buchner, J. Nested Sampling Methods. arXiv 2021, arXiv:2101.09675.
  25. Peterson, G.E.; Barney, H.L. Control Methods Used in a Study of the Vowels. J. Acoust. Soc. Am. 1952, 24, 175–184.
  26. Hillenbrand, J.; Getty, L.A.; Clark, M.J.; Wheeler, K. Acoustic Characteristics of American English Vowels. J. Acoust. Soc. Am. 1995, 97, 3099–3111.
  27. Vallée, N. Systèmes Vocaliques: De La Typologie Aux Prédictions. Ph.D. Thesis, Université Stendhal, Grenoble, France, 1994.
  28. Kent, R.D.; Vorperian, H.K. Static Measurements of Vowel Formant Frequencies and Bandwidths: A Review. J. Commun. Disord. 2018, 74, 74–97.
  29. Vorperian, H.K.; Kent, R.D.; Lee, Y.; Bolt, D.M. Corner Vowels in Males and Females Ages 4 to 20 Years: Fundamental and F1–F4 Formant Frequencies. J. Acoust. Soc. Am. 2019, 146, 3255–3274.
  30. Klatt, D.H. Software for a Cascade/Parallel Formant Synthesizer. J. Acoust. Soc. Am. 1980, 67, 971–995.
  31. de Boer, B. Acoustic Tubes with Maximal and Minimal Resonance Frequencies. J. Acoust. Soc. Am. 2008, 123, 3732.
  32. Bretthorst, G.L. Bayesian Analysis. II. Signal Detection and Model Selection. J. Magn. Reson. 1990, 88, 552–570.
  33. Buscicchio, R.; Roebber, E.; Goldstein, J.M.; Moore, C.J. Label Switching Problem in Bayesian Analysis for Gravitational Wave Astronomy. Phys. Rev. D 2019, 100, 084041.
  34. Wilson, A.G.; Wu, Y.; Holland, D.J.; Nowozin, S.; Mantle, M.D.; Gladden, L.F.; Blake, A. Bayesian Inference for NMR Spectroscopy with Applications to Chemical Quantification. arXiv 2014, arXiv:1402.3580.
  35. Xu, K.; Marrelec, G.; Bernard, S.; Grimal, Q. Lorentzian-Model-Based Bayesian Analysis for Automated Estimation of Attenuated Resonance Spectrum. IEEE Trans. Signal Process. 2019, 67, 4–16.
  36. Trassinelli, M. Bayesian Data Analysis Tools for Atomic Physics. Nucl. Instrum. Methods Phys. Res. Sect. B Beam Interact. Mater. Atoms 2017, 408, 301–312.
  37. Fancher, C.M.; Han, Z.; Levin, I.; Page, K.; Reich, B.J.; Smith, R.C.; Wilson, A.G.; Jones, J.L. Use of Bayesian Inference in Crystallographic Structure Refinement via Full Diffraction Profile Analysis. Sci. Rep. 2016, 6, 31625.
  38. Littenberg, T.B.; Cornish, N.J. Bayesian Inference for Spectral Estimation of Gravitational Wave Detector Noise. Phys. Rev. D 2015, 91, 084034.
  39. Xiang, N. Model-Based Bayesian Analysis in Acoustics—A Tutorial. J. Acoust. Soc. Am. 2020, 148, 1101–1120.
Figure 1. The exchange symmetry of the posterior P_1(x) for a well-determined instance of the VTR problem from Section 5 with K := 3. The pairwise marginal posteriors P_1(x_k, x_ℓ) are shown using the isocontours of kernel density approximations calculated from posterior samples of x. For each panel, the diagonal x_k = x_ℓ is plotted as a dotted line, and the ordered region R_3(x_0) is shaded in grey. The black dot marks the mean of the primary mode for this problem.
Figure 2. Contraction of prior (π_3) to posterior (P_3) for the application of π_3 to the VTR problem used in Figure 1. The pairwise marginal prior π_3(x_k, x_ℓ) is obtained by integrating out the third frequency; for example, π_3(x_1, x_2) = ∫ dx_3 π_3(x). Unlike P_1 in Figure 1, P_3 exhibits only a single mode, which coincides with the primary mode marked by the black dot.
Figure 3. The VTR problem for the case (D := until, K := 10). Left panel: the data D, i.e., the quasi-periodic steady-state part, consist of 3 highly correlated pitch periods. Right panel: inferred VTR frequency estimates {x̂_k}_{k=1}^K for K := 10 at 3 sigma. They describe the power spectral density of the vocal tract transfer function |T(x)|², represented here by 25 posterior samples and compared to the Fast Fourier Transform (FFT) of D. All x̂_k are well resolved, and most have error bars too small to be seen on this scale.
Figure 4. (a) Model selection in Experiment I (top row) and Experiment II (bottom row). (b) In Experiment I, π_2 and π_3 are compared in terms of evidence [log Z_i(K)] and uninformativeness [H_i(K)] for each (D, K). The arrows point from π_2 to π_3 and are color-coded by the value of K. For small values of K, the arrow lengths are too small to be visible on this scale.
Table 1. The values of the hyperparameters β_i used throughout the paper. All quantities are given in units of Hz.

k                  0     1     2     3     4     5     6     7     8     9    10
a = {a_k}               200   600  1400  2900  3500
b = {b_k}              1100  3500  4000  4500  5500
x̄_0 = {x̄_k}      200   500  1000  1500  2000  2500  3000  3500  4000  4500  5000
other              x_0 = 200,  x_max = 5500

