From Genes to Tumors: Dimension Reduction in Logistic Regression
Course Project · High-Dimensional Data Analysis
Logistic Regression
PCA
Sparse PCA
High-Dimensional Data
Classifying prostate tumor vs. healthy tissue samples using gene expression data with 6,000+ variables, comparing stepwise logistic regression, PCA, and sparse PCA.
For a course project in High-Dimensional Data Analysis, I classified prostate tumor versus healthy tissue samples using gene expression data with more than 6,000 variables. By comparing stepwise logistic regression, PCA, and sparse PCA, I learned how dimension reduction can improve prediction while introducing trade-offs among accuracy, parsimony, and interpretability.
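As a rough illustration of this kind of comparison (not the project's actual code or data), the sketch below feeds PCA and sparse PCA components into a logistic regression and scores each pipeline with cross-validation. Everything specific here is an assumption made for the example: the synthetic matrix `X` standing in for the expression data, the sample and gene counts, the five components, the sparsity penalty `alpha`, and the use of scikit-learn. Stepwise logistic regression is left out because scikit-learn does not provide it directly.

```python
import numpy as np
from sklearn.decomposition import PCA, SparsePCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a gene expression matrix (assumed sizes, not the real data):
# 100 samples, 200 "genes", of which the first 20 differ between the two classes.
n_samples, n_genes, n_informative = 100, 200, 20
y = rng.integers(0, 2, size=n_samples)        # 0 = healthy, 1 = tumor (hypothetical labels)
X = rng.normal(size=(n_samples, n_genes))
X[:, :n_informative] += 2.0 * y[:, None]      # shift informative genes for class 1


def make_model(reducer):
    """Standardize genes, reduce dimension, then fit a logistic regression."""
    return make_pipeline(StandardScaler(), reducer, LogisticRegression(max_iter=1000))


models = {
    "PCA": make_model(PCA(n_components=5, random_state=0)),
    "Sparse PCA": make_model(SparsePCA(n_components=5, alpha=1.0, random_state=0)),
}

# Mean 5-fold cross-validated accuracy for each pipeline.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")

# Interpretability check: how many genes carry nonzero weight in each sparse component?
sparse_pca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
sparse_pca.fit(StandardScaler().fit_transform(X))
nonzero_per_component = (sparse_pca.components_ != 0).sum(axis=1)
print("Nonzero loadings per sparse component:", nonzero_per_component)
```

Because the reduction step sits inside the pipeline, it is refit within each cross-validation fold, so the held-out samples do not influence the components. The count of nonzero loadings gives one concrete handle on the parsimony and interpretability side of the trade-off: ordinary PCA components load on every gene, while sparse PCA components typically load on a subset.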