- Lumping, splitting, bias, variance, smoothing and shrinking
- Exploring “personalized medicine” challenges through the eyes of statistical street smarts.
- Try out the live app …. Here is the project code
- Updated Mar 9, 2019
The module “Bias, variance, smoothing, and shrinking” begins with the “lump versus split dilemma” sub-module, in the context of a study treating patients and recording responses. The dilemma: in deciding whether to pursue “personalized medicine” for one small group of patients, how should we use the data from patients in other groups (if at all)?
This example serves as a jumping-off point for individual explorations into a web of connected statistical topics important for data scientists. We examine both frequentist and Bayesian approaches, and we touch on the role of prior information and the effects of having many additional features.
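To make the dilemma concrete, here is a minimal sketch of three estimates for a small group's mean response: "split" (use only that group), "lump" (pool everyone), and a shrinkage compromise between the two. Everything in it is an assumption made for illustration: the response values are invented, the names `responses`, `SIGMA2`, and `TAU2` are hypothetical, and the variances are simply picked rather than estimated. The shrinkage weight is the standard normal-normal posterior weight, with the pooled mean standing in for the prior mean; this is not necessarily how the module or its app computes anything.

```python
from statistics import mean

# Made-up responses for illustration only; not data from the module.
responses = {
    "group_a": [1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 1.0, 1.2],
    "group_b": [0.9, 1.4, 1.1, 1.0, 1.3, 0.7, 1.2, 1.0],
    "small_group": [2.4, 1.8],
}


def split_estimate(data, name):
    """Split: use only the target group's own patients."""
    return mean(data[name])


def lump_estimate(data):
    """Lump: pool every patient, ignoring group labels."""
    return mean([x for xs in data.values() for x in xs])


def shrink_weight(n, sigma2, tau2):
    """Weight placed on the group's own mean in a normal-normal model.

    sigma2: assumed within-group (patient-to-patient) variance.
    tau2:   assumed between-group variance of the true group means.
    """
    return tau2 / (tau2 + sigma2 / n)


def shrink_estimate(data, name, sigma2, tau2):
    """Shrink: weighted average of the split and lump estimates."""
    w = shrink_weight(len(data[name]), sigma2, tau2)
    return w * split_estimate(data, name) + (1 - w) * lump_estimate(data)


# Assumed variances, chosen by hand for this sketch.
SIGMA2 = 0.25
TAU2 = 0.125

split = split_estimate(responses, "small_group")
lump = lump_estimate(responses)
shrunk = shrink_estimate(responses, "small_group", SIGMA2, TAU2)

print(f"split:  {split:.3f}")
print(f"lump:   {lump:.3f}")
print(f"shrunk: {shrunk:.3f}")
```

With only two patients in the small group, the shrunken estimate lands between the other two. Raising `TAU2` (groups are believed to differ a lot) pushes it toward the split estimate, and lowering it pushes it toward the lumped one, which is the bias-variance trade-off in miniature.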
Along the way, students are asked a few thought questions and are invited to report any “Questions” and “Aha’s” (observations, discoveries, delightful connections) that arise.
Students interested in the technical details of the derivations or in the source code can follow the links provided to relevant documents.