Comparison of Statistical Learning and Predictive Models on Breast Cancer Data and King County Housing Data

Author(s): Cai, Yunjiao
Author(s): Fu, Zhuolun
Author(s): Zhao, Yuzhe
Author(s): Hu, Yilin
Author(s): Ding, Shanshan
Date Accessioned: 2017-09-19T12:37:11Z
Date Available: 2017-09-19T12:37:11Z
Publication Date: 2017-09
Abstract: In this study, we evaluate the predictive performance of popular statistical learning methods, such as discriminant analysis, random forests, support vector machines, and neural networks, through real data analysis. Two datasets, Breast Cancer Diagnosis in Wisconsin (WDBC) and House Sales in King County (KC), are analyzed separately to find the best predictive model for each. Linear and Quadratic Discriminant Analysis are applied to the WDBC dataset; Linear Regression and Elastic Net are applied to the KC house dataset. Random Forest, Gradient Boosting, Support Vector Machines, and Neural Networks are applied to both datasets. Individual models and stacked models are trained on the training sets using repeated cross-validation, with accuracy as the criterion for classification and R-squared for regression. The final models are evaluated on the test sets.
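The repeated cross-validation step mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the report's code: the nearest-centroid classifier is a toy stand-in for the models listed above, the two-class data are synthetic rather than WDBC, and every function name (`repeated_kfold_indices`, `fit_centroids`, `predict`, `cv_accuracy`) is hypothetical.

```python
import random
from statistics import mean


def repeated_kfold_indices(n_samples, n_splits, n_repeats, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        folds = [order[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx


def fit_centroids(X, y):
    """Toy stand-in model (assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cv_accuracy(X, y, n_splits=5, n_repeats=3, seed=0):
    """Average held-out accuracy over all folds of all repeats."""
    scores = []
    for train_idx, test_idx in repeated_kfold_indices(len(X), n_splits, n_repeats, seed):
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


# Synthetic, well-separated two-class data (illustrative only, not WDBC).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

score = cv_accuracy(X, y)
print(f"mean repeated-CV accuracy: {score:.3f}")
```

Averaging over several reshuffled repeats reduces the dependence of the estimate on any single fold assignment, which is the usual motivation for comparing candidate models on a repeated cross-validation score before a final check on a test set.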
URL: http://udspace.udel.edu/handle/19716/21667
Language: en_US
Publisher: Department of Applied Economics and Statistics, University of Delaware, Newark, DE.
Part of Series: APEC Research Reports; RR17-10
Keywords: Machine learning
Keywords: Prediction
Keywords: Classification
Keywords: Regression
Keywords: Stacking
Type: Working Paper
Files
Name: RR17-10.pdf
Size: 2.25 MB
Format: Adobe Portable Document Format