Dirichlet Mixed Process Integrated Bayesian Estimation for Individual Securities
Abstract
1. Introduction
2. Estimation Methods
2.1. Kernel Estimation
2.2. Dirichlet Process
2.2.1. Parameters of Dirichlet Process Mixture Models
2.2.2. Density Distribution of a Dirichlet Process Mixture Model
2.2.3. Prior Distribution
- $p(w) = \mathrm{Dirichlet}(\alpha/K, \ldots, \alpha/K)$ represents a symmetric Dirichlet prior on the mixing proportions $w = (w_1, \ldots, w_K)$. This prior assumes that, before observing any data, all components are equally likely, reflecting a state of maximum uncertainty. The vector $(\alpha/K, \ldots, \alpha/K)$ specifies that each component has the same prior weight, ensuring that no particular component is favored a priori.
- $p(s_k) \propto 1/s_k$ is a common, often improper, prior for scale parameters, reflecting a vague prior belief. More specifically, $p(s_1, \ldots, s_K) = \prod_{k=1}^{K} p(s_k)$, implying independence between the scale parameters of each component.
- $p(\sigma^2 \mid s)$ represents the prior on the variances $\sigma_k^2$, conditional on the scale parameters $s_k$. A common choice is an inverse gamma distribution, but the specific form depends on the model assumptions. Written as $\prod_{k=1}^{K} p(\sigma_k^2 \mid s_k, \mu_{k-1})$, it implies that the prior for each $\sigma_k^2$ may depend on the previous mean $\mu_{k-1}$; if the dependence is instead on the current means, the notation is better expressed as $\prod_{k=1}^{K} p(\sigma_k^2 \mid s_k, \mu_k)$.
- $p(\mu \mid \sigma^2)$ is the prior on the means $\mu_k$, conditional on the variances $\sigma_k^2$. Often, this is a product of normal distributions, i.e., $p(\mu_1, \ldots, \mu_K \mid \sigma_1^2, \ldots, \sigma_K^2) = \prod_{k=1}^{K} \mathcal{N}(\mu_k \mid \mu_0, \sigma_k^2 / \kappa_0)$, where $\mu_0$ and $\kappa_0$ are hyperparameters. This implies that the prior for each mean is normal, centered at $\mu_0$, with variance scaled by the component variance $\sigma_k^2$.
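As a concrete illustration, the prior structure above can be sketched in Python. This is a minimal sketch, not the paper's implementation: the hyperparameter values ($K$, $\alpha$, $\mu_0$, $\kappa_0$, and the inverse-gamma shape/scale $a$, $b$) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5                   # number of mixture components (assumed)
alpha = 1.0             # Dirichlet concentration (assumed)
mu0, kappa0 = 0.0, 0.1  # hyperparameters of the normal prior on means (assumed)
a, b = 2.0, 1.0         # inverse-gamma shape/scale for the variances (assumed)

# Symmetric Dirichlet prior on the mixing proportions:
# every component receives the same prior weight alpha/K.
w = rng.dirichlet(np.full(K, alpha / K))

# Inverse-gamma prior on the component variances, drawn
# independently across components (IG(a, b) = 1 / Gamma(a, 1/b)).
sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=K)

# Normal prior on the means, centered at mu0 with variance
# scaled by the component variance: mu_k ~ N(mu0, sigma_k^2 / kappa0).
mu = rng.normal(mu0, np.sqrt(sigma2 / kappa0))
```

A single draw from this hierarchy yields one complete parameter configuration $(w, \sigma^2, \mu)$; the mixing proportions sum to one, and each mean inherits the scale of its own component variance.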
2.2.4. Posterior Distribution
2.3. Clustering the Number of Components
2.4. Algorithms
- Step 1: Setup and initialization.
  - Specify the maximum number of clusters, K, determined by the coefficient F in Equation (10).
  - Input: data $x_1, \ldots, x_n$, the number of iterations $T$, and an evaluation grid spanning the range of the data.
  - Initialize the parameters:
    - Mixing proportions: $w_k = 1/K$, for $k = 1, \ldots, K$.
    - Cluster parameters: $\theta_k \sim G_0$, where $G_0$ is the base distribution (e.g., a normal-inverse-gamma prior for the mean and variance).
    - Cluster assignments: $z_i \sim \mathrm{Uniform}\{1, \ldots, K\}$, for $i = 1, \ldots, n$.
- Step 2.1: Sample cluster assignments ($z$). For each observation $x_i$:
  - Compute the conditional probability of each cluster $k$, as in Equation (14).
  - Normalize the probabilities using numerical stabilization (e.g., subtracting the maximum log-probability before exponentiating).
  - Sample $z_i$ from the resulting categorical distribution.
- Step 2.2: Sample mixing proportions ($w$):
  - Compute the number of points in each cluster: $n_k = \sum_{i=1}^{n} \mathbb{1}(z_i = k)$.
  - Sample $w$ from its Dirichlet full conditional, as in Equation (5).
- Step 2.3: Sample cluster parameters ($\theta$). For each cluster $k$ with $n_k > 0$:
  - Compute the posterior distribution of $\theta_k$ given the data assigned to cluster $k$.
  - Sample $\theta_k$ from this posterior, assuming the likelihood and $G_0$ are conjugate (e.g., normal-inverse-gamma for normal components).
- Step 2.4: Store the draws from the current iteration:
  - Cluster assignments: $z^{(t)}$.
  - Mixing proportions: $w^{(t)}$.
  - Cluster parameters: $\theta^{(t)}$.
- Step 3: Density estimation.
  - For each stored iteration $t$, compute the mixture density on the grid: $\hat{f}^{(t)}(x) = \sum_{k=1}^{K} w_k^{(t)} \, \mathcal{N}(x; \mu_k^{(t)}, \sigma_k^{2\,(t)})$.
  - Compute the average density: $\hat{f}(x) = \frac{1}{T} \sum_{t=1}^{T} \hat{f}^{(t)}(x)$.
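The steps above can be assembled into a minimal Gibbs sampler for the truncated (finite-$K$) approximation of the mixture. This is a sketch under stated assumptions, not the paper's implementation: for brevity the component variance is held fixed and only the means receive a conjugate normal update, rather than the full normal-inverse-gamma base distribution, and the data, $K$, $T$, and hyperparameter values are all illustrative. Equations (5), (10), and (14) are replaced by their standard finite-mixture counterparts.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic bimodal data standing in for a return series (assumed).
x = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])
n = x.size

K = 5                   # truncation level: maximum number of clusters (assumed)
T = 200                 # Gibbs iterations (assumed)
alpha = 1.0             # Dirichlet concentration (assumed)
mu0, tau0 = 0.0, 10.0   # normal prior on component means (assumed)
sigma2 = 0.25           # component variance, held fixed for brevity

grid = np.linspace(x.min() - 1.0, x.max() + 1.0, 200)

# Step 1: initialization.
w = np.full(K, 1.0 / K)
mu = rng.normal(mu0, np.sqrt(tau0), K)

density = np.zeros_like(grid)
for t in range(T):
    # Step 2.1: sample assignments z from their full conditional,
    # normalizing in log space for numerical stability.
    logp = np.log(w) - 0.5 * (x[:, None] - mu) ** 2 / sigma2
    logp -= logp.max(axis=1, keepdims=True)
    p = np.exp(logp)
    p /= p.sum(axis=1, keepdims=True)
    u = rng.random((n, 1))
    z = np.minimum((p.cumsum(axis=1) < u).sum(axis=1), K - 1)

    # Step 2.2: sample mixing proportions from the Dirichlet full conditional.
    nk = np.bincount(z, minlength=K)
    w = np.clip(rng.dirichlet(alpha / K + nk), 1e-300, None)  # guard underflow

    # Step 2.3: conjugate normal update for each occupied cluster mean;
    # empty clusters are refreshed from the prior.
    for k in range(K):
        if nk[k] > 0:
            prec = 1.0 / tau0 + nk[k] / sigma2
            mean = (mu0 / tau0 + x[z == k].sum() / sigma2) / prec
            mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))
        else:
            mu[k] = rng.normal(mu0, np.sqrt(tau0))

    # Step 3: accumulate the mixture density on the grid.
    comp = np.exp(-0.5 * (grid[:, None] - mu) ** 2 / sigma2)
    comp /= np.sqrt(2.0 * np.pi * sigma2)
    density += (w * comp).sum(axis=1)

density /= T  # average density across iterations
```

The averaged `density` is the Monte Carlo estimate $\hat{f}(x)$ of Step 3; it integrates to approximately one over the grid, with small leakage from prior-drawn means of empty clusters falling outside the grid.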
2.5. Measures of Discrepancy Between Kernel and Dirichlet Methods
3. Data and Estimation Procedures
3.1. Description of Data
3.2. Estimation Procedures
- Step 1: Split the stock price series into training and testing sets.
- Step 2: Estimate the prior probability density using the kernel and DMP methods.
- Step 3: Compute the posterior probability density using the kernel and DMP methods.
- Step 4: Compare estimation accuracy across methods using common measures.
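For Step 4, the accuracy measures reported in the results tables (MAE, RMSE, R2, MAD, MAPE, NMSE, RAE, RRSE) can be computed as follows. The definitions are the standard ones surveyed in the forecast-accuracy literature (e.g., Hyndman & Koehler, 2006) and the MAD definition (median absolute error) is an assumption, as the paper's exact formula is not reproduced here:

```python
import numpy as np

def accuracy_measures(y_true, y_pred):
    """Discrepancy measures between a reference density (y_true)
    and an estimated density (y_pred) evaluated on a common grid."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    abs_err = np.abs(err)
    sse = (err ** 2).sum()                          # sum of squared errors
    sst = ((y_true - y_true.mean()) ** 2).sum()     # total sum of squares
    return {
        "MAE": abs_err.mean(),
        "RMSE": np.sqrt((err ** 2).mean()),
        "R2": 1.0 - sse / sst,
        "MAD": np.median(abs_err),                  # assumed: median absolute error
        "MAPE": 100.0 * (abs_err / np.abs(y_true)).mean(),
        "NMSE": sse / sst,                          # normalized MSE, = 1 - R2
        "RAE": abs_err.sum() / np.abs(y_true - y_true.mean()).sum(),
        "RRSE": np.sqrt(sse / sst),
    }

# Illustrative call on placeholder densities.
y_true = np.linspace(0.1, 1.0, 10)
y_pred = y_true + 0.05
m = accuracy_measures(y_true, y_pred)
```

Under these definitions NMSE = 1 − R2 and RRSE = √NMSE, which is consistent with the values reported in the tables below (e.g., for WMT/KDE, R2 = 0.9271, NMSE = 0.0729, RRSE = 0.2700).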
4. Empirical Results and Discussions
4.1. Empirical Results
4.2. Discussions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Amewou-Atisso, M., Ghosal, S., Ghosh, J. K., & Ramamoorthi, R. V. (2003). Posterior consistency for semi-parametric regression problems. Bernoulli, 9(2), 291–312. [Google Scholar] [CrossRef]
- Barron, A. R. (1988). The exponential convergence of posterior probabilities with implications for Bayes estimators of density functions. Department of Statistics, University of Illinois. Available online: http://www.stat.yale.edu/~arb4/publications_files/convergence%20of%20bayer%27s%20estimator.pdf (accessed on 31 May 2025).
- Barron, A. R., Schervish, M. J., & Wasserman, L. (1999). The consistency of posterior distributions in nonparametric problems. The Annals of Statistics, 27(2), 536–561. [Google Scholar] [CrossRef]
- Boness, A. J., Chen, A. H., & Jatusipitak, S. (1974). Investigations of nonstationarity in prices. The Journal of Business, 47(4), 518–537. [Google Scholar] [CrossRef]
- Calinski, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics—Theory and Methods, 3(1), 1–27. [Google Scholar] [CrossRef]
- Chae, M., & Walker, S. G. (2017). A novel approach to Bayesian consistency. Electronic Journal of Statistics, 11, 4723–4745. [Google Scholar] [CrossRef]
- Choudhuri, N., Ghosal, S., & Roy, A. (2004). Bayesian estimation of the spectral density of a time series. Journal of the American Statistical Association, 99(468), 1050–1059. [Google Scholar] [CrossRef]
- Cont, R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1(2), 223–236. [Google Scholar] [CrossRef]
- Diaconis, P., & Freedman, D. (1986). On the consistency of Bayes estimates. The Annals of Statistics, 14, 1–26. [Google Scholar] [CrossRef]
- Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis (Vol. 3). Wiley. [Google Scholar]
- Escobar, M. D., & West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90(430), 577–588. [Google Scholar] [CrossRef]
- Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1, 209–230. [Google Scholar] [CrossRef]
- Freedman, D. A. (1963). On the asymptotic behavior of Bayes’ estimates in the discrete case. The Annals of Mathematical Statistics, 34(4), 1386–1403. [Google Scholar] [CrossRef]
- Friedman, H. P., & Rubin, J. (1967). On some invariant criteria for grouping data. Journal of the American Statistical Association, 62(320), 1159–1178. [Google Scholar] [CrossRef]
- Geman, S., & Hwang, C. R. (1982). Nonparametric maximum likelihood estimation by the method of sieves. The Annals of Statistics, 10, 401–414. [Google Scholar] [CrossRef]
- Ghosal, S., Ghosh, J. K., & Ramamoorthi, R. V. (1999). Posterior consistency of Dirichlet mixtures in density estimation. The Annals of Statistics, 27(1), 143–158. [Google Scholar] [CrossRef]
- Griffin, J. E., & Steel, M. F. J. (2006). Inference with non-Gaussian Ornstein-Uhlenbeck processes for stochastic volatility. Journal of Econometrics, 134(2), 605–644. [Google Scholar] [CrossRef]
- Hubert, L. J., & Levin, J. R. (1976). A general statistical framework for assessing categorical clustering in free recall. Psychological Bulletin, 83(6), 1072. [Google Scholar] [CrossRef]
- Hyndman, R. J., & Koehler, A. B. (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), 679–688. [Google Scholar] [CrossRef]
- Li, Y., Schofield, E., & Gönen, M. (2019). A tutorial on Dirichlet process mixture modeling. Journal of Mathematical Psychology, 91, 128–144. [Google Scholar] [CrossRef]
- Lindsay, B. G. (1983). The geometry of mixture likelihoods: A general theory. The Annals of Statistics, 11, 869–894. [Google Scholar] [CrossRef]
- Martin, G. M., Frazier, D. T., Maneesoonthorn, W., Loaiza-Maya, R., Huber, F., Koop, G., Maheu, J., Nibbering, D., & Panagiotelis, A. (2024). Bayesian forecasting in economics and finance: A modern review. International Journal of Forecasting, 40(2), 811–839. [Google Scholar] [CrossRef]
- Milligan, G. W., & Cooper, M. C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50(2), 159–179. [Google Scholar] [CrossRef]
- Murtagh, F., & Legendre, P. (2014). Ward’s hierarchical agglomerative clustering method: Which algorithms implement Ward’s criterion? Journal of Classification, 31(3), 274–295. [Google Scholar] [CrossRef]
- Müller, P., Quintana, F., & Rosner, G. (2004). A method for combining inference across related nonparametric Bayesian models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 66(3), 735–749. [Google Scholar] [CrossRef]
- Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2), 249–265. [Google Scholar] [CrossRef]
- Petrone, S., & Wasserman, L. (2002). Consistency of Bernstein polynomial posteriors. Journal of the Royal Statistical Society Series B: Statistical Methodology, 64(1), 79–100. [Google Scholar] [CrossRef]
- Rasmussen, C., & Ghahramani, Z. (2001). Infinite mixtures of Gaussian process experts. Advances in Neural Information Processing Systems, 14. Available online: https://proceedings.neurips.cc/paper/2055-infinite-mixtures-of-gaussian-process-experts (accessed on 6 January 2024).
- Roeder, K., & Wasserman, L. (1997). Practical Bayesian density estimation using mixtures of normals. Journal of the American Statistical Association, 92(439), 894–902. [Google Scholar] [CrossRef]
- Roweis, S., & Ghahramani, Z. (1999). A unifying review of linear Gaussian models. Neural Computation, 11(2), 305–345. [Google Scholar] [CrossRef] [PubMed]
- Saxena, A., Prasad, M., Gupta, A., Bharill, N., Patel, O. P., Tiwari, A., Er, M. J., Ding, W., & Lin, C. T. (2017). A review of clustering techniques and developments. Neurocomputing, 267, 664–681. [Google Scholar] [CrossRef]
- Schwartz, L. (1965). On Bayes procedures. Zeitschrift Für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 4(1), 10–26. [Google Scholar] [CrossRef]
- Silverman, B. W. (2018). Density estimation for statistics and data analysis. Routledge. Available online: https://www.taylorfrancis.com/books/mono/10.1201/9781315140919/density-estimation-statistics-data-analysis-bernard-silverman (accessed on 1 January 2024).
- Teh, Y. W. (2010). Dirichlet process. Encyclopedia of Machine Learning, 1063, 280–287. [Google Scholar]
- Tierney, L. (1998). A note on Metropolis-Hastings kernels for general state spaces. Annals of Applied Probability, 8, 1–9. [Google Scholar] [CrossRef]
- Verdinelli, I., & Wasserman, L. (1998). Bayesian goodness-of-fit testing using infinite-dimensional exponential families. The Annals of Statistics, 26(4), 1215–1241. [Google Scholar] [CrossRef]
- Walker, S. (2004). New approaches to Bayesian consistency. The Annals of Statistics, 32(5), 2028–2043. [Google Scholar] [CrossRef]
- Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. [Google Scholar] [CrossRef]
- Zellner, A. (1971). Bayesian and non-Bayesian analysis of the log-normal distribution and log-normal regression. Journal of the American Statistical Association, 66(334), 327–330. [Google Scholar] [CrossRef]
- Zellner, A., & Chetty, V. K. (1965). Prediction and decision problems in regression models from the Bayesian point of view. Journal of the American Statistical Association, 60(310), 608–616. [Google Scholar] [CrossRef]
Stock | Count | Mean | Std | Min | 25% | 50% | 75% | Max | ADF p-Value | KPSS p-Value | JB p-Value | LB p-Value |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WMT | 1217 | 48.605 | 9.759 | 32.178 | 43.172 | 46.009 | 51.329 | 83.086 | 0.995 | 0.010 | 0.000 | 0.000 |
BA | 1217 | 196.706 | 40.451 | 95.010 | 170.000 | 198.490 | 217.180 | 345.395 | 0.001 | 0.097 | 0.000 | 0.000 |
JNJ | 1217 | 147.631 | 12.327 | 96.589 | 142.410 | 149.978 | 156.081 | 170.244 | 0.126 | 0.010 | 0.000 | 0.000 |
AAPL | 1217 | 149.268 | 39.563 | 54.450 | 125.117 | 149.018 | 174.561 | 235.961 | 0.679 | 0.010 | 0.010 | 0.000 |
JPM | 1217 | 135.586 | 34.668 | 68.481 | 111.444 | 134.814 | 148.985 | 224.341 | 0.979 | 0.010 | 0.000 | 0.000 |
XOM | 1217 | 75.223 | 29.744 | 24.809 | 48.300 | 79.291 | 102.124 | 123.246 | 0.906 | 0.010 | 0.000 | 0.000 |
Stock | Method | MAE | RMSE | R2 | MAD | MAPE | NMSE | RAE | RRSE |
---|---|---|---|---|---|---|---|---|---|
WMT | KDE | 0.0037 | 0.0058 | 0.9271 | 0.0026 | 86.1549 | 0.0729 | 0.2351 | 0.2700 |
WMT | DPM | 0.0021 | 0.0027 | 0.9846 | 0.0015 | 61.6504 | 0.0154 | 0.1304 | 0.1240 |
WMT | BP | 0.0085 | 0.0140 | 0.5797 | 0.0040 | 218.2602 | 0.4203 | 0.5342 | 0.6483 |
BA | KDE | 0.0004 | 0.0006 | 0.9719 | 0.0003 | 50.9255 | 0.0281 | 0.1321 | 0.1677 |
BA | DPM | 0.0004 | 0.0005 | 0.9845 | 0.0003 | 42.3142 | 0.0155 | 0.1132 | 0.1245 |
BA | BP | 0.0009 | 0.0013 | 0.8936 | 0.0008 | 124.1110 | 0.1064 | 0.2773 | 0.3262 |
JNJ | KDE | 0.0015 | 0.0022 | 0.9712 | 0.0010 | 43.5283 | 0.0288 | 0.1376 | 0.1696 |
JNJ | DPM | 0.0014 | 0.0018 | 0.9808 | 0.0010 | 48.2336 | 0.0192 | 0.1243 | 0.1385 |
JNJ | BP | 0.0041 | 0.0068 | 0.7283 | 0.0014 | 76.4473 | 0.2717 | 0.3730 | 0.5212 |
AAPL | KDE | 0.0013 | 0.0015 | 0.8251 | 0.0012 | 74.7754 | 0.1749 | 0.4008 | 0.4182 |
AAPL | DPM | 0.0010 | 0.0012 | 0.8982 | 0.0008 | 43.9056 | 0.1018 | 0.3029 | 0.3191 |
AAPL | BP | 0.0018 | 0.0021 | 0.6661 | 0.0015 | 123.2281 | 0.3339 | 0.5674 | 0.5779 |
JPM | KDE | 0.0015 | 0.0019 | 0.8442 | 0.0010 | 48.6411 | 0.1558 | 0.3776 | 0.3947 |
JPM | DPM | 0.0012 | 0.0015 | 0.8980 | 0.0010 | 41.0208 | 0.1020 | 0.3135 | 0.3193 |
JPM | BP | 0.0025 | 0.0032 | 0.5625 | 0.0022 | 81.5329 | 0.4375 | 0.6386 | 0.6614 |
XOM | KDE | 0.0036 | 0.0044 | 0.5053 | 0.0025 | 60.6087 | 0.4947 | 0.7211 | 0.7033 |
XOM | DPM | 0.0023 | 0.0032 | 0.7395 | 0.0017 | 42.4779 | 0.2605 | 0.4564 | 0.5104 |
XOM | BP | 0.0059 | 0.0071 | −0.3138 | 0.0059 | 96.4258 | 1.3138 | 1.1944 | 1.1462 |
Metric | Mean (BP) | Mean (DPM) | Mean (KDE) | Wilcoxon (KDE–DPM) | Wilcoxon (KDE–BP) | Wilcoxon (DPM–BP) | Kruskal–Wallis |
---|---|---|---|---|---|---|---|
MAE | 0.004 | 0.001 | 0.002 | 0.031 | 0.031 | 0.031 | 0.130 |
RMSE | 0.006 | 0.002 | 0.003 | 0.031 | 0.031 | 0.031 | 0.130 |
R2 | 0.519 | 0.914 | 0.841 | 0.031 | 0.031 | 0.031 | 0.018 |
MAD | 0.003 | 0.001 | 0.001 | 0.094 | 0.031 | 0.031 | 0.220 |
MAPE | 120.001 | 46.600 | 60.772 | 0.063 | 0.031 | 0.031 | 0.002 |
NMSE | 0.481 | 0.086 | 0.159 | 0.031 | 0.031 | 0.031 | 0.018 |
RAE | 0.597 | 0.240 | 0.334 | 0.031 | 0.031 | 0.031 | 0.058 |
RRSE | 0.647 | 0.256 | 0.354 | 0.031 | 0.031 | 0.031 | 0.018 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Khoi, P.D.; Trong, T.M.; Gan, C. Dirichlet Mixed Process Integrated Bayesian Estimation for Individual Securities. J. Risk Financial Manag. 2025, 18, 304. https://doi.org/10.3390/jrfm18060304