If you are looking for the early working paper "Recall, Precision, and Average Precision", which is cited by a Wikipedia article on information retrieval, please refer to and/or cite this published paper instead.

For a sortable list, visit https://uwaterloo.ca/scholar/m3zhu/publications.

For preprints, scroll down to the bottom.

Book

Zhu M (2023). Essential Statistics for Data Science: A Concise Crash Course, Oxford University Press.

Journal articles, book chapters, and refereed conference proceedings

Jian J, Sang P, Zhu M (forthcoming). Two Gaussian regularization methods for time-varying networks. Journal of Agricultural, Biological, and Environmental Statistics, accepted and to appear.

Hofert M, Prasad A, Zhu M (2023). Dependence model assessment and selection with DecoupleNets. Journal of Computational and Graphical Statistics, 32(4), 1272 - 1286.

Hofert M, Prasad A, Zhu M (2023). RafterNet: Probabilistic predictions in multi-response regression. The American Statistician, 77(4), 406 - 416.

Hofert M, Prasad A, Zhu M (2022). Applications of multivariate quasi-random sampling with neural networks. In Monte Carlo and Quasi-Monte Carlo Methods, MCQMC 2020, A. Keller, Ed., Springer, 273 - 289.

Hofert M, Prasad A, Zhu M (2022). Multivariate time-series modeling with generative neural networks. Econometrics and Statistics, 23, 147 - 164.

Hofert M, Prasad A, Zhu M (2021). Quasi-random sampling for multivariate distributions via generative neural networks. Journal of Computational and Graphical Statistics, 30(3), 647 - 670.

Cheng L, Zhu M (2021). First-order correction of statistical significance for screening two-way epistatic interactions. In Epistasis: Methods and Protocols, K. C. Wong, Ed., Springer, 181 - 190.

Wu Y, Qin Y, Zhu M (2020). High-dimensional covariance matrix estimation using a low-rank and diagonal decomposition. Canadian Journal of Statistics, 48(2), 308 - 337.

Hofert M, Oldford W, Prasad A, Zhu M (2019). A framework for measuring association of random vectors via collapsed random variables. Journal of Multivariate Analysis, 172, 5 - 27.

Zhang C, Wu Y, Zhu M (2019). Pruning variable selection ensembles. Statistical Analysis and Data Mining, 12(3), 168 - 184.

Wu Y, Qin Y, Zhu M (2019). Quadratic discriminant analysis for high-dimensional data. Statistica Sinica, 29(2), 939 - 960.

Cheng L, Zhu M (2019). Compositional epistasis detection using a few prototype disease models. PLoS ONE, 14(3):e0213236.

Zhu M (2019). On fitting complex models to noisy data. In Proceedings of the International Conference on Statistics: Theory and Applications, 34.1 - 34.7.

Xin L, Zhu M, Chipman HA (2017). A continuous-time stochastic block model for basketball networks. The Annals of Applied Statistics, 11(2), 553 - 597.

Murdoch WJ, Zhu M (2016). Expanded alternating optimization for matrix factorization and penalized regression. In Proceedings of the 22nd International Conference on Computational Statistics, 217 - 229.

Zhu M (2015). Use of majority votes in statistical learning. Wiley Interdisciplinary Reviews: Computational Statistics, 7(6), 357 - 371.

Su W, Yuan Y, Zhu M (2015). A relationship between the average precision and the area under the ROC curve. In Proceedings of the ACM SIGIR 2015 International Conference on the Theory of Information Retrieval, 349 - 352.

Soltan-Ghoraie L, Burkowski F, Zhu M (2015). Using kernelized partial canonical correlation analysis to study directly coupled side chains and allostery in small G proteins. Bioinformatics, 31(12), i124 - i132.

Cheng L, Zhu M, Poss JW, Hirdes JP, Glenny C, Stolee P (2015). Opinion versus practice regarding the use of rehabilitation services in home care: An investigation using machine learning algorithms. BMC Medical Informatics and Decision Making, 15:80.

Yuan Y, Su W, Zhu M (2015). Threshold-free measures for assessing the performance of medical screening tests. Frontiers in Public Health, 3:57.

Soltan-Ghoraie L, Burkowski F, Zhu M (2015). Sparse networks of directly coupled, polymorphic and functional side chains in allosteric proteins. Proteins: Structure, Function, and Bioinformatics, 83(3), 497 - 516.

Armstrong JJ, Hirdes JP, Zhu M, Stolee P (2015). Rehabilitation therapies for older clients of the Ontario home care system: Regional variation and client-level predictors of service provision. Disability and Rehabilitation, 37(7), 625 - 631.

Soltan-Ghoraie L, Burkowski F, Li SC, Zhu M (2014). Residue-specific side-chain polymorphisms via particle belief propagation. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11(1), 33 - 41.

Zhu M (2014). Making personalized recommendations in e-commerce. In Statistics in Action: A Canadian Outlook, J. F. Lawless, Ed., Chapman & Hall, 259 - 268.

Zhu M, Cheng L, Armstrong JJ, Poss JW, Hirdes JP, Stolee P (2014). Using machine learning to plan rehabilitation for home care clients: Beyond "black-box" predictions. In Machine Learning in Healthcare Informatics, S. Dua, U. R. Acharya, P. Dua, Eds., Springer, 181 - 207. [errata]

Nguyen J, Zhu M (2013). Content-boosted matrix factorization techniques for recommender systems. Statistical Analysis and Data Mining, 6(4), 286 - 301.

Xin L, Zhu M (2012). Stochastic stepwise ensembles for variable selection. Journal of Computational and Graphical Statistics, 21(2), 275 - 294.

Zhu M, Wang S, Xin L (2012). On individual neutrality and collective decision making. The Mathematical Scientist, 37, 141 - 146.

Young SS, Yuan F, Zhu M (2012). Chemical descriptors are more important than learning algorithms for modeling. Molecular Informatics, 31(10), 707 - 710.

Armstrong JJ, Zhu M, Hirdes JP, Stolee P (2012). K-means cluster analysis of rehabilitation service users in the home health care system of Ontario: Examining the heterogeneity of a complex geriatric population. Archives of Physical Medicine and Rehabilitation, 93(12), 2198 - 2205.

Forbes P, Zhu M (2011). Content-boosted matrix factorization for recommender systems: Experiments with recipe recommendation. In Proceedings of the 5th ACM Conference on Recommender Systems, 261 - 264.

Su W, Chipman HA, Zhu M (2011). Pseudo-likelihood inference underestimates model uncertainty: Evidence from Bayesian nearest neighbours. Journal of the Iranian Statistical Society, 10(2), 167 - 180.

Zhu M, Fan G (2011). Variable selection by ensembles for the Cox model. Journal of Statistical Computation and Simulation, 81(12), 1983 - 1992.

Fan G, Zhu M (2011). Detection of rare items with TARGET. Statistics and Its Interface, 4(1), 11 - 17.

Gu H, Kenney T, Zhu M (2010). Partial generalized additive models: An information-theoretic approach for dealing with concurvity and selecting variables. Journal of Computational and Graphical Statistics, 19(3), 531 - 551.

Zhu M, Hastie TJ (2010). Letter to the editor. Journal of the American Statistical Association, 105(490), 880.

Hoshino R, Oldford RW, Zhu M (2010). Two-stage approach for unbalanced classification with time-varying decision boundary: Application to marine container inspection. In Proceedings of the ACM SIGKDD Workshop on Intelligence and Security Informatics, Washington, DC, USA, July 25, 2010.

Laflamme-Sanders A, Zhu M (2008). LAGO on the unit sphere. Neural Networks, 21(9), 1220 - 1223.

Zhu M (2008). Kernels and ensembles: Perspectives on statistical learning. The American Statistician, 62(2), 97 - 109.

Zhu M, Zhang Z, Hirdes JP, Stolee P (2007). Using machine learning algorithms to guide rehabilitation planning for home care clients. BMC Medical Informatics and Decision Making, 7:41.

Zhu M, Chen W, Hirdes JP, Stolee P (2007). The K-nearest neighbors algorithm predicted rehabilitation potential better than current clinical assessment protocol. Journal of Clinical Epidemiology, 60, 1015 - 1021.

Zhu M (2006). Discriminant analysis with common principal components. Biometrika, 93(4), 1018 - 1024.

Zhu M, Chipman HA (2006). Darwinian evolution in parallel universes: A parallel genetic algorithm for variable selection. Technometrics, 48(4), 491 - 502. [R code]

Zhu M, Su W, Chipman HA (2006). LAGO: A computationally efficient approach for statistical detection. Technometrics, 48(2), 193 - 205. [R code]

Zhu M, Ghodsi A (2006). Automatic dimensionality selection from the scree plot via the use of profile likelihood. Computational Statistics and Data Analysis, 51(2), 918 - 930. [R code]

Kustra R, Shioda R, Zhu M (2006). A factor analysis model for functional genomics. BMC Bioinformatics, 7:216.

Zhu M, Hastie TJ, Walther G (2005). Constrained ordination analysis with flexible response functions. Ecological Modelling, 187(4), 524 - 536. [errata]

Zhu M (2004). On the forward and backward algorithms of projection pursuit. The Annals of Statistics, 32(1), 233 - 244.

Zhu M, Lu AY (2004). The counter-intuitive non-informative prior for the Bernoulli family. Journal of Statistics Education, 12(2), online.

Zhu M, Hastie TJ (2003). Feature extraction for nonparametric discriminant analysis. Journal of Computational and Graphical Statistics, 12(1), 101 - 120. [errata]

Hastie TJ, Zhu M (2001). Discussion of "Dimension reduction and visualization in discriminant analysis" by Cook and Yin. Australian and New Zealand Journal of Statistics, 43(2), 179 - 185.

Refereed articles and abstracts from RNA Diagnostics, Inc.

Parissenti AM, Guo B, Pritzker LB, Pritzker KPH, Wang X, Zhu M, Shepherd LE, Trudeau ME (2015). Tumor RNA disruption predicts survival benefit from breast cancer chemotherapy. Breast Cancer Research and Treatment, 153, 135 - 144.

Trudeau ME, Pritzker LB, Parissenti AM, Wang X, Zhu M, Guo B, Shepherd LE, Chapman JW, Pritzker KPH (2012). A novel RNA test to guide primary systematic breast cancer chemotherapy. Annals of Oncology, 23(suppl 2), ii19.

Newspaper column

Zhu M (2016). Tyranny in the name of science. In Ottawa Citizen, September 8, 2016, A8.

Trade magazines, newsletters, bulletins, and others

Zhu M (2010). Predictive analytics: Managing fundamental tradeoffs. Analytics, September-October 2010, 18 - 21. [PDF version]

Zhu M (2008). How to draw a trilinear plot? ASA Statistical Computing and Graphics Newsletter, 19(1), 7 - 9.

Preprints and newly submitted articles (aka technical reports)

Jian J, Zhu M, Sang P (2023). Restricted Tweedie stochastic block models. Submitted.

Wu Y, Qin Y, Zhu M (2018). Estimation of multiple large covariance matrices by a partially common diagonal and low-rank matrix decomposition. Submitted.

Cheng H, Zhu M, Chan VW, Michela JL (2014). Single-index response surface models. Preprint. [R code]