The Negative Binomial Distribution Efficiency in Finite Mixture of Semi-parametric Generalized Linear Models

Author
Faculty of Mathematical Science and Computer, Allameh Tabataba’i University, Tehran, Iran
Abstract
Introduction

Selecting an appropriate statistical model for the response variable is one of the most important problems in finite mixtures of generalized linear models. The Poisson distribution is one distribution that causes difficulty in finite mixtures of semi-parametric generalized linear models, since its variance is constrained to equal its mean. In this paper, to overcome overdispersion and the computational burden, we propose a finite mixture of semi-parametric generalized linear models based on the negative binomial distribution (GFMMNB) in place of the finite mixture of semi-parametric generalized linear models based on the Poisson distribution (GFMMP). The efficiency of GFMMNB relative to GFMMP, measured by the weighted generalized mean square error (WGMSE), is shown for both simulated and real data.
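As a rough illustration of the overdispersion issue, the sketch below simulates counts from a Poisson and from a negative binomial distribution with the same mean and compares their variance-to-mean ratios. The values of mu and r and the helper dispersion_index are illustrative assumptions and are not taken from the paper; the negative binomial is parameterized so that its variance is mu + mu^2 / r.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative values only (assumptions, not from the paper).
mu, r = 5.0, 2.0      # common mean and negative binomial size parameter
n = 100_000           # number of simulated counts

# Poisson counts: variance equals the mean.
pois = rng.poisson(mu, size=n)

# Negative binomial counts with mean mu: NumPy uses (n, p), so p = r / (r + mu).
nb = rng.negative_binomial(r, r / (r + mu), size=n)

def dispersion_index(x):
    """Sample variance divided by sample mean (hypothetical helper)."""
    return x.var(ddof=1) / x.mean()

pois_index = dispersion_index(pois)   # should be close to 1
nb_index = dispersion_index(nb)       # should be close to 1 + mu / r

print(f"Poisson dispersion index:           {pois_index:.3f}")
print(f"Negative binomial dispersion index: {nb_index:.3f}")
```

A ratio well above 1 in observed counts is the usual sign that a Poisson component will understate the variability, which is the situation the negative binomial component is meant to handle.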

Material and methods

We first introduce the finite mixture of semi-parametric generalized linear models based on the Poisson distribution (GFMMP). We then introduce the finite mixture of semi-parametric generalized linear models based on the negative binomial distribution (GFMMNB) as an alternative to GFMMP. The parameters of the proposed model are estimated with the EM algorithm, computed in two steps. GFMMNB and GFMMP are compared through their efficiency, measured by the weighted generalized mean square error (WGMSE), on both simulated and real data.
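The two-step structure of the EM algorithm can be sketched on a deliberately simplified case. The code below is not the GFMMNB estimator: it assumes a two-component negative binomial mixture with no covariates, no nonparametric part, and a known size parameter r shared by both components, so that the M-step has a closed form. The function name em_nb_mixture, the starting values, and the simulated data are all hypothetical.

```python
import numpy as np

def em_nb_mixture(y, r, n_iter=200, tol=1e-8):
    """EM for a two-component negative binomial mixture (simplified sketch).

    Assumes a known size parameter r shared by both components.
    Returns the estimated mixing weights and component means.
    """
    y = np.asarray(y, dtype=float)
    # Crude starting values: lower and upper quartiles, shifted away from zero.
    mu = np.array([np.quantile(y, 0.25), np.quantile(y, 0.75)]) + 0.5
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log(weight_j * NB(y_i | mu_j, r)), dropping terms that are
        # identical across components because r is shared.
        log_w = (np.log(weights)
                 + r * np.log(r / (r + mu))
                 + y[:, None] * np.log(mu / (r + mu)))
        m = log_w.max(axis=1, keepdims=True)            # log-sum-exp for stability
        log_norm = m + np.log(np.exp(log_w - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_w - log_norm)                 # posterior membership probabilities

        # M-step: with r fixed, the weighted MLE of each mean is a weighted average.
        nk = resp.sum(axis=0)
        weights = nk / y.size
        mu = (resp * y[:, None]).sum(axis=0) / nk

        ll = log_norm.sum()                             # log-likelihood up to a constant
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weights, mu

# Simulated example with illustrative true values (assumptions).
rng = np.random.default_rng(0)
r_true = 10.0
y = np.concatenate([
    rng.negative_binomial(r_true, r_true / (r_true + 2.0), size=1200),   # mean 2
    rng.negative_binomial(r_true, r_true / (r_true + 30.0), size=800),   # mean 30
])
pi_hat, mu_hat = em_nb_mixture(y, r=r_true)
print("mixing weights:", pi_hat)
print("component means:", mu_hat)
```

In a regression setting each component mean would depend on covariates and the M-step would become a weighted model fit instead of a weighted average, but the alternation between computing membership probabilities and updating parameters stays the same.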

Results and discussion

The real example and the simulation study comparing the GFMMNB and GFMMP models show that the proposed method is very competitive in terms of estimation accuracy and computational speed. The reported results also show good agreement between the simulation study and the real data for the GFMMNB model.

In addition, the numerical results reported in the tables indicate that the accuracy of the GFMMNB model improves as n increases. A larger n is therefore recommended for more accurate results.

Conclusion

The following conclusions were drawn from this research.


· The estimators for the proposed model are easily computed with the EM algorithm, which reduces the amount of calculation required.
· Confidence intervals for the parameters of the GFMMNB model are more accurate than those of the GFMMP model.
· The main characteristic of the proposed method is that it improves the finite mixture model and can be solved easily by an iterative method.

