INTRODUCTION OF SELECTED PAPERS

"High Dimensional Statistical Analysis and its Application to ALMA Map of NGC 253"

The Astrophysical Journal Supplement Series, in press.

arXiv:
2203.04535
Keywords:
methods: statistical — ISM: lines and bands — galaxies: ISM — galaxies: star formation — galaxies: individual (NGC 253) — radio lines: galaxies

"Asymptotic Properties of Hierarchical Clustering in High-Dimensional Settings "

Journal of Multivariate Analysis, 199 (2024), 105251.

DOI:
10.1016/j.jmva.2023.105251 (Open Access)
Keywords:
Clustering behavior, High-dimension low-sample-size, Multiclass,
Ward's linkage function

"Automatic Sparse PCA for High-Dimensional Data"

Statistica Sinica, in press.

DOI:
10.5705/ss.202022.0319 (Supplement)
arXiv:
2209.14891
Keywords:
Clustering, Large p small n, PCA consistency,
Shrinkage PC directions, Thresholding

"Geometric Classifiers for High-Dimensional Noisy Data"

Special Issue: 50th Anniversary Jubilee Edition, Journal of Multivariate Analysis,
188 (2022), 104850. (Editor's invited paper)

DOI:
10.1016/j.jmva.2021.104850 (Open Access)
Keywords:
Data transformation, HDLSS, Large p small n,
Noise-reduction methodology, Quadratic classifier, SSE model

"Clustering by Principal Component Analysis with Gaussian Kernel in High-Dimension, Low-Sample-Size Settings"

Journal of Multivariate Analysis, 185 (2021), 104779.
[This paper was selected as a top cited article.]

DOI:
10.1016/j.jmva.2021.104779 (Open Access)
Keywords:
HDLSS, Non-linear, PCA, PC score, Radial basis function kernel, Spherical data

"Asymptotic properties of distance-weighted discrimination and its bias correction for high-dimension, low-sample-size data"

Japanese Journal of Statistics and Data Science, 4 (2021), 821–840.

DOI:
10.1007/s42081-021-00135-x
Keywords:
Bias-corrected DWD, Discriminant analysis, HDLSS, Large p small n, Weighted DWD

"Hypothesis Tests for High-Dimensional Covariance Structures"

Annals of the Institute of Statistical Mathematics, 73 (2021), 599-622.

DOI:
10.1007/s10463-020-00760-5
Keywords:
Cross-data-matrix methodology, Diagonal structure, HDLSS,
Intraclass correlation model, Test of eigenvector, Unbiased estimate

"Geometric Consistency of Principal Component Scores for High-dimensional Mixture Models and Its Application"

Scandinavian Journal of Statistics, 47 (2020), 899-921.

DOI:
10.1111/sjos.12432 (Open Access)
Keywords:
Clustering, Geometric representation, HDLSS, Microarray,
Mixture model, PCA, PC score

"Bias-Corrected Support Vector Machine with Gaussian Kernel in High-Dimension, Low-Sample-Size Settings"

Annals of the Institute of Statistical Mathematics, 72 (2020), 1257-1286.

DOI:
10.1007/s10463-019-00727-1
Keywords:
Geometric representation, HDLSS, Imbalanced data, Radial basis function kernel

"High-Dimensional Quadratic Classifiers in Non-Sparse Settings"

Methodology and Computing in Applied Probability, 21 (2019), 663-682.

DOI:
10.1007/s11009-018-9646-z (Open Access)
arXiv:
1503.04549
Keywords:
Asymptotic normality, Bayes error rate, Feature selection, Heterogeneity, Large p small n

"Inference on High-Dimensional Mean Vectors under the Strongly Spiked Eigenvalue Model"

Japanese Journal of Statistics and Data Science, 2 (2019), 105-128.
[This paper is one of the Springer Nature 2019 Highlights (#2019HighlightsAuthor), a selection of the most popular articles and book chapters published by Springer Nature in 2019, reflecting top research that made an impact.]

DOI:
10.1007/s42081-018-0029-z (Open Access)
Keywords:
Asymptotic normality, Data transformation, Eigenstructure estimation, Large p small n, Noise reduction methodology, Spiked model

"Distance-Based Classifier by Data Transformation for High-Dimension, Strongly Spiked Eigenvalue Models"

Annals of the Institute of Statistical Mathematics, 71 (2019), 473-503.

DOI:
10.1007/s10463-018-0655-z
arXiv:
1710.10768
Keywords:
Asymptotic normality, Data transformation, Discriminant analysis,
Large p small n, Noise reduction methodology, Spiked model

"Equality Tests of High-Dimensional Covariance Matrices under the Strongly Spiked Eigenvalue Model"

Journal of Statistical Planning and Inference, 202 (2019), 99-111.

DOI:
10.1016/j.jspi.2019.02.002
Keywords:
HDLSS, Large p small n, Noise-reduction methodology, SSE model, Two-sample test

"A quadratic classifier for high-dimension, low-sample-size data under the strongly spiked eigenvalue model"

Stochastic Models, Statistics and Their Applications, Proceedings of the 14th Workshop on Stochastic Models, Statistics and their Applications (2019), 131-142.

DOI:
10.1007/978-3-030-28665-1_10
Keywords:
Classification, Eigenstructure, Geometrical quadratic discriminant analysis, HDLSS, Noise reduction methodology, SSE model

"A Test of Sphericity for High-Dimensional Data and Its Application for Detection of Divergently Spiked Noise"

Sequential Analysis, 37 (2018), 397-411.
[This paper was awarded the Abraham Wald Prize in Sequential Analysis 2019.]

DOI:
10.1080/07474946.2018.1548850
Keywords:
Cross-data-matrix method, Gene expression data, HDLSS, Noise detection, Noise-reduction method, Sphericity

"The JSS Award Lecture: High-Dimensional Statistical Analysis: New Developments of Theories and Methodologies"

Journal of the Japan Statistical Society Series J, 48 (2018), 89-111.

DOI:
10.11329/jjssj.48.89
Keywords:
Data transformation, Discriminant analysis, Geometric representation, HDLSS, Noise-reduction methodology, PCA, Two-sample test

"A Survey of High Dimension Low Sample Size Asymptotics"

Australian & New Zealand Journal of Statistics, Special Issue in Honour of
Peter Gavin Hall, 60 (2018), 4-19.
[This paper has been recognized as TOP DOWNLOADED ARTICLE 2017-2018.]
[This paper has been recognized as TOP DOWNLOADED ARTICLE 2018-2019.]

DOI:
10.1111/anzs.12212
Keywords:
Canonical correlations, Classification, Geometric representation, Hypothesis testing, PCA

"Two-Sample Tests for High-Dimension, Strongly Spiked Eigenvalue Models"

Statistica Sinica, 28 (2018), 43-62.

arXiv:
1602.02491
DOI:
10.5705/ss.202016.0063 (Supplement)
Keywords:
Asymptotic normality, Eigenstructure estimation, Large p small n,
Noise reduction methodology, Spiked model

"Support Vector Machine and Its Bias Correction in High-Dimension, Low-Sample-Size Settings"

Journal of Statistical Planning and Inference, 191 (2017), 88-100.

arXiv:
1702.08019
DOI:
10.1016/j.jspi.2017.05.005
Keywords:
Distance-based classifier, HDLSS, Imbalanced data, Large p small n, Multiclass classification

"Statistical inference for high-dimension, low-sample-size data"

Sugaku Expositions (American Mathematical Society), 30 (2017), 137-158.

DOI:
10.1090/suga/421

"Non-asymptotic results for Cornish-Fisher expansions"

Journal of Mathematical Sciences, 218 (2016), 363-368.

arXiv:
1604.00539
DOI:
10.1007/s10958-016-3036-2
Keywords:
Computable bounds, Non-asymptotic results, Cornish-Fisher expansions

"High-Dimensional Inference on Covariance Structures via the Extended Cross-Data-Matrix Methodology"

Journal of Multivariate Analysis, 151 (2016), 151-166.

arXiv:
1503.06492
DOI:
10.1016/j.jmva.2016.07.011 (Open Archive)
Keywords:
Correlations test, Cross-data-matrix methodology, Graphical modeling, Large p small n, Pathway analysis, RV-coefficient
R code available.

"Reconstruction of a High-Dimensional Low-Rank Matrix"

Electronic Journal of Statistics, 10 (2016), 895–917.

DOI:
10.1214/16-EJS1128
Keywords:
Eigenstructure, HDLSS, Noise-reduction methodology, PCA, Singular value decomposition

"Asymptotic Properties of the First Principal Component and Equality Tests of Covariance Matrices in High-Dimension, Low-Sample-Size Context"

Journal of Statistical Planning and Inference, 170 (2016), 186-199.

arXiv:
1503.07302
DOI:
10.1016/j.jspi.2015.10.007
Keywords:
Contribution ratio, Equality test of covariance matrices, HDLSS, Noise-reduction methodology, PCA

"Geometric Classifier for Multiclass, High-Dimensional Data"

Sequential Analysis, Special Issue: Celebrating Seventy Years of Charles Stein's 1945 Seminal Paper on Two-Stage Sampling, 34 (2015), 279-294.

DOI:
10.1080/07474946.2015.1063256 (Open Access)
Keywords:
Asymptotic normality, Geometric classifier, HDLSS, Sample size determination, Two-stage procedure

"Asymptotic Normality for Inference on Multisample, High-Dimensional Mean Vectors under Mild Conditions"

Methodology and Computing in Applied Probability, 17 (2015), 419-439.

DOI:
10.1007/s11009-013-9370-7 (Open Access)
Keywords:
Asymptotic normality, Confidence region, Cross-data-matrix methodology, Large p small n, Microarray, Two-stage procedure

"A Distance-Based, Misclassification Rate Adjusted Classifier for Multiclass, High-Dimensional Data"

Annals of the Institute of Statistical Mathematics, 66 (2014), 983-1010.

DOI:
10.1007/s10463-013-0435-8
Keywords:
Asymptotic normality, Distance-based classifier, HDLSS, Sample size determination, Two-stage procedure

"The JSS Research Prize Lecture: Effective Methodologies for High-Dimensional Data"

Journal of the Japan Statistical Society Series J, 43 (2013), 123-150.


"PCA Consistency for the Power Spiked Model in High-Dimensional Settings"

Journal of Multivariate Analysis, 122 (2013), 334-354.

DOI:
10.1016/j.jmva.2013.08.003 (Open Archive)
Keywords:
Cross-data-matrix methodology, HDLSS, Large p small n, Microarray data, Noise-reduction methodology


"Correlation Tests for High-Dimensional Data Using Extended Cross-Data-Matrix Methodology"

Journal of Multivariate Analysis, 117 (2013), 313-331.

DOI:
10.1016/j.jmva.2013.03.007 (Open Archive)
Keywords:
Cross-data-matrix methodology, Graphical modeling, HDLSS, High-dimensional regression, Pathway analysis, Two-stage procedure

"Effective PCA for High-Dimension, Low-Sample-Size Data with Noise Reduction via Geometric Representations"

Journal of Multivariate Analysis, 105 (2012), 193-215.

DOI:
10.1016/j.jmva.2011.09.002 (Open Archive)
Keywords:
Consistency, Discriminant analysis, Eigenvalue distribution, Geometric representation, HDLSS, Inverse matrix, Noise reduction, Principal component analysis

"Inference on High-Dimensional Mean Vectors with Fewer Observations Than the Dimension"

Methodology and Computing in Applied Probability, 14 (2012), 459-476.

DOI:
10.1007/s11009-011-9233-z
Keywords:
Classification, Confidence region, HDLSS, Sample size determination, Two-stage estimation, Variable selection

"Two-Stage Procedures for High-Dimensional Data"

Sequential Analysis (Editor's special invited paper), 30 (2011), 356-399.
[This paper was awarded the Abraham Wald Prize in Sequential Analysis 2012.]

DOI:
10.1080/07474946.2011.619088 (Open Access)
Keywords:
Asymptotic normality, Classification, Confidence region, HDLSS, Lasso, Pathway analysis, Regression, Sample size determination, Testing equality of covariance matrices, Two-sample test, Variable selection

"Authors’ Response"

Sequential Analysis, 30 (2011), 432-440.

DOI:
10.1080/07474946.2011.619102
Keywords:
Classification, Confidence region, Cross-data-matrix methodology, HDLSS, Robustness, Sample size determination, Two-sample test, Variable selection

"Effective PCA for High-Dimension, Low-Sample-Size Data with Singular Value Decomposition of Cross Data Matrix"

Journal of Multivariate Analysis, 101 (2010), 2060-2077.
[This paper was ranked among the Top 25 Most Downloaded Articles.]

DOI:
10.1016/j.jmva.2010.04.006 (Open Archive)
Keywords:
Consistency, Eigenvalue distribution, HDLSS, Microarray data analysis, Mixture model, Principal component analysis, Singular value

"Intrinsic Dimensionality Estimation of High-Dimension, Low Sample Size Data with D-Asymptotics"

Communications in Statistics - Theory and Methods, Special Issue Honoring Akahira, M.
(ed. Aoshima, M.), 39 (2010), 1511-1521.

DOI:
10.1080/03610920903121999
Keywords:
Dual covariance matrix, Effective dimension, HDLSS, Large p small n, Maximum eigenvalue

"PCA Consistency for Non-Gaussian Data in High Dimension, Low Sample Size Context"

Communications in Statistics - Theory and Methods, Special Issue Honoring Zacks, S.
(ed. Mukhopadhyay, N.), 38 (2009), 2634-2652.

DOI:
10.1080/03610910902936083 (Open Access)
Keywords:
Consistency, Dual covariance matrix, Eigenvalue distribution, HDLSS, Large p small n, Principal component analysis, Random matrix theory, Sample size