TradeMark Study Guide iCAS CSPA Exam 3: Predictive Modeling Methods and Techniques

Retired

This book is no longer available for sale.

Fall 2021 Exam

About the Book

This study guide equips you with the information you need to successfully pass iCAS CSPA Exam 3. It organizes and summarizes the syllabus materials to help guide you through your study process. It also contains exam questions for each topic and recommendations for supplemental study materials. Written by someone without experience in R and statistics prior to taking the exam, this guide is designed to help all exam takers regardless of their background.

About the Author

Tyson Mohr, FSA, CPCU, CLU, FLMI, CSPA

Tyson Mohr is an actuary with 10+ years of experience in the insurance industry. He has expertise in life insurance, capital modeling, model risk management, and Enterprise Risk Management. He is currently a manager of a data science team focusing on predictive modeling for P&C rating and underwriting. He has a passion for both learning and teaching.

Tyson lives in Bloomington, Illinois with his wife and 3 sons.

Table of Contents

  • Introduction
    • Biographical Note
    • Study Guide Structure
    • Study Tips
    • Strategies for Test Day
    • Feedback
  • A1: Types of Data, Missing and Incomplete Data
    • a. Describe types of data such as discrete and continuous data. Describe special issues that arise in data from surveys
    • b. Describe key patterns of missing data values, including censoring, truncation, missing-at-random, and missing-completely-at-random.
    • c. Describe key underlying causes of missing data. Identify appropriate ways to deal with missing values in a given situation, and identify the advantages and disadvantages of each.
  • A2: Linear Model Diagnostics
    • Extra: Thoughts on how to approach statistical content
    • a. Interpret linear model output such as confidence intervals for parameter estimates and for predictions. Perform, interpret, and act upon standard diagnostics on linear models, including assessment and treatment.
    • b. Understand and apply the hat matrix, hat values, residuals (raw, standardized, Studentized, and Pearson), and Cook’s D to detect outliers and influential observations
    • c. Apply residual plots, marginal model plots, and added variable plots to assess quality of fit and the impact of each predictor
    • d. Use QQ plots to diagnose non-normal errors
    • e. Use F-tests, residual plots, component-plus-residual plots, and CERES plots to identify non-linear dependencies
    • f. Use residual plots and spread-level plots to identify heteroscedasticity; determine when transformation of the target variable (possibly via Box-Cox) is an appropriate remedy, and when weighted regression is appropriate.
    • g. Identify collinearity via variance-inflation factors and generalized variance-inflation factors and discuss possible ways to deal with collinearity
  • A3: Classical Models—Generalized Linear Models and Their Diagnostics
    • a. Understand the assumptions behind different forms of the Generalized Linear Model and be able to select the appropriate model
    • b. Understand the relationship between mean and variance for various models within the GLM family
    • c. Understand how to select the appropriate link function and distribution for the dependent variable.
    • d. Understand the Tweedie as compound gamma-Poisson and also as the GLM with a power-law variance function.
    • e. Be able to describe the reason for a double GLM and two ways in which a double GLM might be fit. Be able to describe similarities and differences between a double GLM and a weighted GLM
    • f. Use appropriate diagnostics to evaluate the fit of a GLM
    • g. Describe the effect of non-canonical link function
    • h. Define deviance and its relationship to a GLM
  • Supplemental Study 1
  • B1: Validation Holdout vs Cross-Validation and Tuning Parameters
    • a. Explain and contrast holdout and Cross-Validation approaches and the best use of each
    • b,c. For a given dataset and model, use cross-validation to estimate the accuracy of model predictions. Why might this estimate be inaccurate?
  • B2: Evaluation: Goodness of Fit Metrics, Bootstrapping, Bias-Variance Tradeoff, and Presentation of Results
    • a. Define and apply ROC curves, AUC, Lorenz curves, and Gini index
    • b. Estimate variance of model estimates.
    • c. Describe why your model may be biased.
    • d. Describe how to build a model to minimize the expected mean squared error.
    • e,f. What exhibits do you show for the holdout data? What presentation material do you prepare and show?
  • B3: Classification Models and Special Considerations
    • a. Describe and apply the ROC curve in evaluating a classification model
    • b. Define and describe the Bayes error
    • c. Apply linear regression, logistic regression, linear discriminant analysis, quadratic discriminant analysis, and nearest neighbors to fit classification models. Compare and contrast these methods as to when each might be preferable
    • d. Fit a logistic regression by penalized maximum likelihood, and describe when that should be preferred to maximum likelihood
    • e. Describe how unbalanced training datasets can influence classifiers and why that is a problem
    • f,g. Identify algorithmic solutions to using unbalanced training sets, including various undersampling, oversampling, and cost-sensitive learning approaches. Discuss the advantages and drawbacks of each.
  • B4: Shrinkage and Feature Selection Methods
    • a,b. Apply forward stepwise selection. Define “best subset” selection.
    • c. Define a shrinkage method and explain which penalty term corresponds to which method (ridge, lasso)
    • d,e. Use shrinkage methods (lasso and ridge) to improve linear model predictions. Select the tuning parameter for the penalty term. Comment on how this is done.
  • Supplemental Study 2
  • A4: Causal Inference from Observational Data
    • a,b. Understand coarsened exact matching (CEM) for estimating causal effects and explain the strengths and weaknesses of it. Discuss the process for using CEM to estimate causal effects.
    • c. Distinguish causal effects from predictions
    • Extra: Introduction to Experiments
    • d. Explain SATT (sample average treatment effect on the treated)
  • B5: Non-Linear Effects and Additive Models
    • a. Be able to discuss several ways of capturing non-linear relationships in regressions and GLM models, including polynomials, step functions, splines, smoothing splines, and local regression
    • b. Be able to build general additive models (GAM)
  • B6: Single Trees
    • a-d. Build regression and classification trees. Use a tree to determine an estimate for an observation. Discuss reasons for pruning and methods to prune. Implement pruning
  • B7: Ensemble Methods, Random Forests, and Boosting
    • a,b. Be able to fit bagged tree models, boosted tree models, and random forests to data. Be able to use each to get estimates for a new observation. Discuss how each of these methods works, and what its pros and cons are
  • B8: Principal Components Analysis and Unsupervised Learning
    • a. Differentiate between supervised and unsupervised learning tasks.
    • Extra: Overview of clustering
    • b,d. Describe the choices involved in using k-means and hierarchical clustering and the implications thereof. Summarize potential issues with using clustering and ways to mitigate them.
    • c. Interpret a dendrogram.
    • e. Cluster data using k-means and hierarchical clustering
  • Supplemental Study 3
  • Appendix: Resources on Statistics Fundamentals
  • Appendix: Overview of R Techniques
    • Introduction
    • R Fundamentals
    • Most Important Processes
    • Packages and Functions
  • Appendix: Fall 2021 Sample Exam Answers
  • Change Log
