Lecture Applied data science: Classification
Classification
Overview
1. Introduction
2. Application
3. EDA
4. Learning Process
5. Bias-Variance Tradeoff
6. Regression (review)
7. Classification
8. Validation
9. Regularisation
10. Clustering
11. Evaluation
12. Deployment
13. Ethics
Lecture outline
- Classification - Logistic regression review
- Classification evaluation metrics
- The expected value framework
Classification problems
The response is categorical, e.g. credit card default (Yes/No) or favourite movie type (Action/Drama/Animation).
Example techniques: logistic regression, classification trees, K-NN, etc.
Logistic regression formulation
Logistic regression coefficients are estimated by maximising the likelihood function.
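The slide's own formula did not survive extraction. As a reminder, the standard textbook form of the model and its likelihood is sketched below. The notation (k predictors, coefficients \beta_j, observations (x_i, y_i) with y_i in {0, 1}) is an assumption and may differ from the symbols used in the lecture.

```latex
% Probability of the positive class given predictors X_1, ..., X_k
p(X) = \Pr(Y = 1 \mid X)
     = \frac{e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}
            {1 + e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}

% Equivalent form: the log odds are linear in the predictors
\log \frac{p(X)}{1 - p(X)} = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k

% Likelihood maximised over the coefficients
L(\beta_0, \dots, \beta_k) = \prod_{i:\, y_i = 1} p(x_i) \prod_{i:\, y_i = 0} \bigl(1 - p(x_i)\bigr)
```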
Logistic regression example
Full data set (training + test)
               responding
               Yes     No
student_Yes    127   2817
student_No     206   6850
Total          333   9667
Training set
               responding
               Yes     No
student_Yes     84   1959
student_No     150   4808
Total          234   6767
Test set
               responding
               Yes     No
student_Yes     43    858
student_No      56   2042
Total           99   2900
Logistic regression results
Logistic regression results interpretation
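The fitted coefficients from the slide are not recoverable here. As one way to see how a logistic regression coefficient is read, the sketch below fits a model with student status as the only predictor, using the training-set counts from the example. This single-predictor model is an assumption made for illustration; the lecture's model may include other predictors. With one binary predictor the maximum-likelihood fit reproduces the observed proportions, so the coefficients have a closed form.

```python
import math

# Training-set counts from the example (responding Yes / No by student status).
student_yes = {"Yes": 84, "No": 1959}
student_no = {"Yes": 150, "No": 4808}

def log_odds(counts):
    """Log odds of responding, computed from Yes/No counts."""
    return math.log(counts["Yes"] / counts["No"])

# Intercept: log odds of responding for non-students.
beta0 = log_odds(student_no)
# Slope: change in log odds when student = 1 (the log odds ratio).
beta1 = log_odds(student_yes) - beta0
# exp(beta1) is the odds ratio of students relative to non-students.
odds_ratio = math.exp(beta1)

def predict(is_student):
    """Predicted probability of responding for is_student in {0, 1}."""
    z = beta0 + beta1 * is_student
    return 1.0 / (1.0 + math.exp(-z))

print(f"beta0={beta0:.4f}, beta1={beta1:.4f}, odds ratio={odds_ratio:.4f}")
print(f"P(respond | student)={predict(1):.4f}, P(respond | non-student)={predict(0):.4f}")
```

A positive beta1 means the odds of responding are higher for students than for non-students in the training set. The predicted probabilities equal the raw training proportions, 84/2043 and 150/4958.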
Prediction from multiple classifiers
The ROC curve
- Each point corresponds to a confusion matrix.
- Point A is more ‘conservative’ than B, which is more ‘conservative’ than C.
- Points closer to the upper left are preferred. Point (0,1) represents the perfect classifier.
- Points along the diagonal represent random guessing. No classifier should be in the lower right, since it would perform worse than random guessing.
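To make "each point corresponds to a confusion matrix" concrete, here is a minimal sketch that turns classifier scores into ROC points by sweeping the threshold. It assumes the usual ROC axes (false positive rate on the x-axis, true positive rate on the y-axis). The function name `roc_points` and the labels and scores are hypothetical, not taken from the lecture.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Assumes y_true contains both classes, coded 1 (positive) and 0 (negative).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Each threshold defines one confusion matrix; only TP and FP are needed.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical labels and scores for six customers.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Higher thresholds give the more conservative points near the lower left. Lowering the threshold moves along the curve towards (1,1).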
The ROC curves from different classifiers
Confusion matrix (columns: actual class, p = positive, n = negative)
                  p      n
Predicted Yes    46     12
Predicted No     53   2888
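The common classification evaluation metrics can be read straight off this confusion matrix. The sketch below assumes the columns p and n are the actual classes, which is consistent with the column totals (99 and 2900) matching the test-set totals in the example. The variable names are illustrative.

```python
# Cells of the confusion matrix above.
tp, fp = 46, 12    # predicted Yes: actual positive, actual negative
fn, tn = 53, 2888  # predicted No:  actual positive, actual negative

total = tp + fp + fn + tn

accuracy = (tp + tn) / total           # share of all predictions that are correct
precision = tp / (tp + fp)             # share of predicted Yes that are actual Yes
recall = tp / (tp + fn)                # true positive rate (sensitivity)
false_positive_rate = fp / (fp + tn)   # share of actual No predicted as Yes

print(f"accuracy={accuracy:.3f}")
print(f"precision={precision:.3f}")
print(f"recall (TPR)={recall:.3f}")
print(f"FPR={false_positive_rate:.4f}")
```

The pair (FPR, TPR) computed here is the single point this confusion matrix contributes to an ROC curve.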
The expected value analytical framework
The targeted marketing example.
Assume that we sell the product for $200, that the production-related cost is $100, and that the shipping and handling cost is $1. What is the minimum probability of responding that we should target?
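One way to work this out is to find the response probability at which targeting a customer breaks even. The sketch below assumes the $1 cost is paid for every targeted customer whether or not they respond, so a response is worth $200 - $100 - $1 = $99 and a non-response costs $1. That reading matches the $99 / -$1 cost/benefit matrix used later in the lecture, but it is an interpretation, and the function name is hypothetical.

```python
def break_even_probability(benefit_if_response, cost_if_no_response):
    """Response probability p at which targeting has zero expected value.

    Solves p * benefit - (1 - p) * cost = 0 for p.
    """
    return cost_if_no_response / (benefit_if_response + cost_if_no_response)

price = 200
production_cost = 100
handling_cost = 1  # assumed to be incurred for every targeted customer

benefit = price - production_cost - handling_cost  # profit when the customer responds
min_probability = break_even_probability(benefit, handling_cost)

print(f"benefit per response: ${benefit}")
print(f"target customers with P(respond) above {min_probability:.2%}")
```

Under these assumptions any customer with a response probability above 1% has positive expected value.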
Expected value of a classifier
From the above example, let’s use 0.35 as the threshold and assume the cost/benefit matrix below. What would be the total expected value of the logistic regression classifier per customer?
                 Actual Yes   Actual No
Predicted Yes       $99          -$1
Predicted No         $0           $0
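A minimal sketch of the calculation: weight each cell of the cost/benefit matrix by the share of customers that fall in that cell, then sum. It assumes the test-set confusion matrix given earlier (46, 12, 53, 2888) is the one produced at the 0.35 threshold, which the extracted text does not state explicitly.

```python
# Confusion matrix counts, keyed by (predicted, actual).
# Assumed to correspond to the 0.35 threshold.
counts = {
    ("yes", "yes"): 46,
    ("yes", "no"): 12,
    ("no", "yes"): 53,
    ("no", "no"): 2888,
}

# Cost/benefit matrix in dollars, same keys.
values = {
    ("yes", "yes"): 99,
    ("yes", "no"): -1,
    ("no", "yes"): 0,
    ("no", "no"): 0,
}

total = sum(counts.values())

# Expected value per customer: sum over cells of P(cell) * value(cell).
expected_value = sum(counts[cell] / total * values[cell] for cell in counts)

print(f"expected value per customer: ${expected_value:.2f}")
```

Under that assumption the result is (46 × 99 − 12 × 1) / 2999, roughly $1.51 per customer.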
The profit curves
                 Actual Yes   Actual No                    Actual Yes   Actual No
Predicted Yes       $99          -$1      Predicted Yes       $99         -$10
Predicted No         $0           $0      Predicted No         $0           $0
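A profit curve can be sketched by ranking customers from highest to lowest score and accumulating the profit of targeting the top k. The example below compares the two cost/benefit settings above (-$1 and -$10 for a targeted non-responder). The function name `profit_curve` and the labels and scores are hypothetical, not lecture data.

```python
def profit_curve(y_true, scores, value_tp, value_fp):
    """Cumulative profit from targeting the top-k scored customers, for k = 0..n.

    value_tp is the profit from a targeted responder, value_fp the (negative)
    profit from a targeted non-responder. Untargeted customers contribute 0.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    profits = [0.0]  # k = 0: nobody is targeted
    for _, y in ranked:
        profits.append(profits[-1] + (value_tp if y == 1 else value_fp))
    return profits

# Hypothetical actual responses and classifier scores for five customers.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]

curve_low_cost = profit_curve(y_true, scores, value_tp=99, value_fp=-1)
curve_high_cost = profit_curve(y_true, scores, value_tp=99, value_fp=-10)

# The best k is where each curve peaks.
best_k_low = max(range(len(curve_low_cost)), key=curve_low_cost.__getitem__)
best_k_high = max(range(len(curve_high_cost)), key=curve_high_cost.__getitem__)

print(curve_low_cost, best_k_low)
print(curve_high_cost, best_k_high)
```

A higher cost for a targeted non-responder lowers the curve wherever non-responders are included, and in general it can move the peak towards targeting fewer customers.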