A Complete Guide to Area Under Curve (AUC)

This tutorial explains the various methods to calculate the AUC (Area under the ROC Curve) mathematically as well as the steps to implement it in Python, R and SAS.

What is Area under Curve?

Area under Curve (AUC), derived from the Receiver Operating Characteristic (ROC) curve, is used to evaluate the performance of a binary classification model. It measures the discrimination power of a predictive classification model. In simple words, it checks how well the model is able to distinguish between events and non-events.

Example : Suppose you are building a predictive model for a bank to identify customers who are likely to purchase a credit card. In this case, purchase of a credit card is the event (or desired outcome) and non-purchase of a credit card is the non-event.

ROC Curve

The ROC curve is a plot of the proportion of true positives (events correctly predicted to be events) against the proportion of false positives (non-events wrongly predicted to be events) at different probability cutoffs; AUC is the area under this curve. The true positive rate is also called Sensitivity, and the false positive rate is also called (1-Specificity).

The higher the AUC score, the better the model. The diagonal line represents a random classification model; it is equivalent to prediction by tossing a coin. Every point along the diagonal line has the same true positive and false positive rate.

How to Generate ROC Curve?

To generate the ROC curve, we calculate Sensitivity and (1-Specificity) at all possible cutoffs and then plot them. A cut-off is the minimum threshold at or above which a predicted probability is classified as an 'event'.

Let's say the cutoff is 0.5. In the case of propensity to purchase a credit card, customers with a predicted probability greater than or equal to 0.5 would be classified as potential buyers.

Cut-off   Sensitivity   Specificity   1-Specificity
0         1             0             1
0.01      0.979         0.081         0.919
0.02      0.938         0.158         0.842
...       ...           ...           ...
0.99      0.02          0.996         0.004
1         0             1             0
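To make this concrete, here is a minimal Python sketch that computes Sensitivity and (1-Specificity) over a grid of cut-offs, like the table above. The names y_true and y_score are hypothetical stand-ins for your actual labels and predicted probabilities.

import numpy as np

def roc_points(actuals, pred, cutoffs=np.arange(0, 1.01, 0.01)):
    # actuals: array of 0/1 labels, pred: predicted probabilities of the event
    actuals = np.asarray(actuals)
    pred = np.asarray(pred)
    rows = []
    for c in cutoffs:
        predicted_event = pred >= c                      # classify as event at this cut-off
        tp = np.sum(predicted_event & (actuals == 1))    # true positives
        fn = np.sum(~predicted_event & (actuals == 1))   # false negatives
        tn = np.sum(~predicted_event & (actuals == 0))   # true negatives
        fp = np.sum(predicted_event & (actuals == 0))    # false positives
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        rows.append((c, sensitivity, specificity, 1 - specificity))
    return rows

# Example usage (y_true and y_score are placeholders for your own data)
# points = roc_points(y_true, y_score)

Plotting the last column (1-Specificity) on the x-axis against the second column (Sensitivity) on the y-axis gives the ROC curve.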
There are four common methods to calculate AUC, each covered in the sections below.
  1. Calculating AUC using Concordance and Tied Percent
  2. Calculating AUC using Integration Method
  3. Calculating AUC using Mann–Whitney U Test
  4. Calculating AUC using Cumulative Events and Non-Events

Calculating AUC using Concordance and Tied Percent

  1. Calculate the predicted probability from logistic regression or any other binary classification technique.
  2. Divide the data into two datasets: one containing observations where the actual value of the dependent variable is 1 (event) along with their predicted probabilities, and the other containing observations where the actual value is 0 (non-event) along with their predicted probabilities.
  3. Compare each predicted value in the first dataset with each predicted value in the second dataset.

Total Number of pairs to compare = x * y
x : Number of observations in first dataset (actual values of 1 in dependent variable)
y : Number of observations in second dataset (actual values of 0 in dependent variable).

Interpretation of Concordant, Discordant and Tied Percent

Percent Concordant : Percentage of pairs where the observation with the desired outcome (event) has a higher predicted probability than the observation without the outcome (non-event). The higher the concordant percentage, the better the model, but it still needs to be validated on unseen data.

Percent Discordant : Percentage of pairs where the observation with the desired outcome (event) has a lower predicted probability than the observation without the outcome (non-event). The lower the discordant percentage, the better the model, but it still needs to be validated on unseen data.

Percent Tied : Percentage of pairs where the observation with the desired outcome (event) has the same predicted probability as the observation without the outcome (non-event).

AUC : AUC is also known as the c-statistic. It is calculated as Percent Concordant plus 0.5 times Percent Tied.

The Gini coefficient (Somers' D statistic) is closely related to AUC. It is calculated as (2*AUC - 1), or equivalently as (Percent Concordant - Percent Discordant).
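To see how these percentages combine, here is a minimal Python sketch using made-up predicted probabilities (not from any real dataset) that counts concordant, discordant and tied pairs and derives AUC and Gini from them:

# Hypothetical predicted probabilities, for illustration only
event_scores    = [0.80, 0.60, 0.40]   # observations where the actual value is 1
nonevent_scores = [0.70, 0.40, 0.20]   # observations where the actual value is 0

pairs = [(e, n) for e in event_scores for n in nonevent_scores]  # 3 * 3 = 9 pairs
concordant = sum(e > n for e, n in pairs) / len(pairs)
discordant = sum(e < n for e, n in pairs) / len(pairs)
tied       = 1 - concordant - discordant

auc  = concordant + 0.5 * tied
gini = 2 * auc - 1
print(concordant, discordant, tied, auc, gini)

For these made-up scores, Percent Concordant is about 66.7%, Percent Discordant about 22.2% and Percent Tied about 11.1%, giving AUC of roughly 0.72 and Gini of roughly 0.44, which also equals Percent Concordant minus Percent Discordant.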

In this section, you will learn how to calculate AUC using Concordance and Tied Percent in Python, SAS, and R.

In the following code, we will use the pandas library along with the statsmodels library. Make sure that both libraries are installed in your Python environment. If not, install them with : pip install pandas statsmodels

import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm

# Read Data
df = pd.read_csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Convert admit column to binary variable
df['admit'] = df['admit'].astype('int')

# Factor Variables (rank 4 as reference level)
df['rank'] = df['rank'].astype('category')
df['rank'] = df['rank'].cat.reorder_categories([4, 1, 2, 3])

# Logistic Model
mylogistic = smf.glm(formula='admit ~ gre + gpa + rank', data=df,
                     family=sm.families.Binomial()).fit()
print(mylogistic.summary())

# Predict
pred = mylogistic.predict()
finaldata = pd.concat([df, pd.Series(pred, name='pred')], axis=1)

def AUC(actuals, predictedScores):
    fitted = pd.DataFrame({'Actuals': actuals, 'PredictedScores': predictedScores})
    ones  = fitted[fitted['Actuals'] == 1]   # Subset ones
    zeros = fitted[fitted['Actuals'] == 0]   # Subset zeros
    totalPairs = len(ones) * len(zeros)      # total number of pairs to check
    conc = sum(ones['PredictedScores'].apply(lambda x: (x > zeros['PredictedScores']).sum()))
    disc = sum(ones['PredictedScores'].apply(lambda x: (x < zeros['PredictedScores']).sum()))
    concordance = conc / totalPairs
    discordance = disc / totalPairs
    tiesPercent = 1 - concordance - discordance
    AUC  = concordance + 0.5 * tiesPercent
    Gini = 2 * AUC - 1
    return {'Concordance': concordance, 'Discordance': discordance,
            'Tied': tiesPercent, 'AUC': AUC, 'Gini': Gini}

print(AUC(finaldata['admit'], finaldata['pred']))
FILENAME PROBLY TEMP;
PROC HTTP
 URL="https://stats.idre.ucla.edu/stat/data/binary.csv"
 METHOD="GET"
 OUT=PROBLY;
RUN;

OPTIONS VALIDVARNAME=ANY;
PROC IMPORT FILE=PROBLY OUT=WORK.binary REPLACE DBMS=CSV; RUN;

ods graphics on;
Proc logistic data= WORK.binary descending plots(only)=roc;
class rank / param=ref ;
model admit = gre gpa rank;
output out = estprob p= pred;
run;

/*split the data into two datasets - event and non-event*/
Data event nonevent;
Set estprob;
If admit = 1 then output event;
else if admit = 0 then output nonevent;
run;

/*Cartesian product of event and non-event actual cases*/
Proc SQL noprint;
create table pairs as
select a.admit as admit1, b.admit as admit0, a.pred as pred1, b.pred as pred0
from event a cross join nonevent b;
quit;

/*Calculating concordant, discordant and tied percent*/
Data pairs;
set pairs;
concordant = 0;
discordant = 0;
tied = 0;
If pred1 > pred0 then concordant = 1;
else If pred1 < pred0 then discordant = 1;
else tied = 1;
run;

/*Mean values - Final Result*/
proc sql;
select mean(Concordant)*100 as Percent_Concordant,
       mean(Discordant)*100 as Percent_Discordant,
       mean(Tied)*100 as Percent_Tied,
       (calculated Percent_Concordant + 0.5* calculated Percent_Tied)/100 as AUC,
       2*calculated AUC - 1 as somers_d
from pairs;
quit;
# Read Data
df = read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Factor Variables
df$admit = as.factor(df$admit)
df$rank = as.factor(df$rank)

# Logistic Model (rank 4 as reference level)
df$rank = relevel(df$rank, ref = "4")
mylogistic = glm(admit ~ ., data = df, family = "binomial")
summary(mylogistic)

# Predict
pred = predict(mylogistic, type = "response")
finaldata = cbind(df, pred)

AUC <- function(actuals, predictedScores) {
  fitted <- data.frame(Actuals = actuals, PredictedScores = predictedScores)
  ones  <- fitted[fitted$Actuals == 1, ]  # Subset ones
  zeros <- fitted[fitted$Actuals == 0, ]  # Subset zeros
  totalPairs <- nrow(ones) * nrow(zeros)  # total number of pairs to check
  conc <- sum(c(vapply(ones$PredictedScores,
                       function(x) x > zeros$PredictedScores,
                       FUN.VALUE = logical(nrow(zeros)))), na.rm = T)
  disc <- sum(c(vapply(ones$PredictedScores,
                       function(x) x < zeros$PredictedScores,
                       FUN.VALUE = logical(nrow(zeros)))), na.rm = T)
  concordance <- conc / totalPairs
  discordance <- disc / totalPairs
  tiesPercent <- 1 - concordance - discordance
  AUC  <- concordance + 0.5 * tiesPercent
  Gini <- 2 * AUC - 1
  return(list(Concordance = concordance, Discordance = discordance,
              Tied = tiesPercent, AUC = AUC, Gini = Gini))
}

AUC(finaldata$admit, finaldata$pred)

AUC and Concordance

Calculating AUC using Integration Method

The Trapezoidal Rule of numerical integration is used to find the area under the curve. The area of a single trapezoid is as follows:

( x(i+1) – x(i) ) * ( y(i) + y(i+1) ) / 2

In our case, x refers to the values of the false positive rate (1-Specificity) at different probability cut-offs, and y refers to the true positive rate (Sensitivity) at those cut-offs. The vector x needs to be sorted. Any observation whose predicted probability exceeds or equals the probability cut-off is predicted to be an event; otherwise, it is predicted to be a non-event.

( fpr(i+1) – fpr(i) ) * ( tpr(i) + tpr(i+1) ) / 2

fpr represents the false positive rate (1-Specificity). tpr represents the true positive rate (Sensitivity). See the image below showing the step-by-step calculation. It includes only a few cut-offs for demonstration purposes.

AUC Calculation Steps using Integration Method
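As a quick numeric illustration of the trapezoidal rule (separate from the image above), the following Python sketch applies the formula to a few hypothetical (fpr, tpr) points:

# Hypothetical ROC points, sorted by false positive rate
fpr = [0.0, 0.1, 0.4, 1.0]
tpr = [0.0, 0.5, 0.8, 1.0]

# Sum the area of each trapezoid: (fpr[i+1]-fpr[i]) * (tpr[i]+tpr[i+1]) / 2
auc = sum((fpr[i + 1] - fpr[i]) * (tpr[i] + tpr[i + 1]) / 2 for i in range(len(fpr) - 1))
print(auc)  # 0.025 + 0.195 + 0.54 = 0.76 for these points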

In this section, you will learn how to calculate AUC using Integration Method in Python, SAS, and R.

import pandas as pd
import numpy as np
from sklearn.metrics import roc_curve
import statsmodels.formula.api as smf
import statsmodels.api as sm

# Read Data
df = pd.read_csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Convert admit column to binary variable
df['admit'] = df['admit'].astype('int')

# Factor Variables (rank 4 as reference level)
df['rank'] = df['rank'].astype('category')
df['rank'] = df['rank'].cat.reorder_categories([4, 1, 2, 3])

# Logistic Model
mylogistic = smf.glm(formula='admit ~ gre + gpa + rank', data=df,
                     family=sm.families.Binomial()).fit()
print(mylogistic.summary())

# Predict
pred = mylogistic.predict()
finaldata = pd.concat([df, pd.Series(pred, name='pred')], axis=1)

# Calculate ROC curve
fpr, tpr, thresholds = roc_curve(finaldata['admit'], finaldata['pred'])

# Use trapezoidal rule to approximate area under ROC curve
dx = np.diff(fpr)
auroc = np.sum(dx * (tpr[1:] + tpr[:-1])) / 2
print(f'AUROC: {auroc}')

In the SAS program below, we use PROC IML to perform the integration calculations.

FILENAME PROBLY TEMP;
PROC HTTP
 URL="https://stats.idre.ucla.edu/stat/data/binary.csv"
 METHOD="GET"
 OUT=PROBLY;
RUN;

OPTIONS VALIDVARNAME=ANY;
PROC IMPORT FILE=PROBLY OUT=WORK.binary REPLACE DBMS=CSV; RUN;

ods graphics on;
Proc logistic data= WORK.binary descending plots(only)=roc;
class rank / param=ref ;
model admit = gre gpa rank / outroc=performance;
output out = estprob p= pred;
run;

proc sort data=performance;
by _1MSPEC_;
run;

proc iml;
use performance;
read all var {_SENSIT_} into sensitivity;
read all var {_1MSPEC_} into falseposrate;
N = 2 : nrow(falseposrate);
fpr = falseposrate[N] - falseposrate[N-1];
tpr = sensitivity[N] + sensitivity[N-1];
ROC = fpr`*tpr/2;
Gini = 2*ROC - 1;
print ROC Gini;
quit;
# Read Data
df = read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Factor Variables
df$admit = as.factor(df$admit)
df$rank = as.factor(df$rank)

# Logistic Model (rank 4 as reference level)
df$rank = relevel(df$rank, ref = "4")
mylogistic = glm(admit ~ ., data = df, family = "binomial")
summary(mylogistic)

# Predict
pred = predict(mylogistic, type = "response")
finaldata = cbind(df, pred)

# Sensitivity and 1-Specificity at each unique predicted probability cut-off
cutoffs = sort(unique(finaldata$pred), decreasing = TRUE)
tpr = sapply(cutoffs, function(c) mean(finaldata$pred[finaldata$admit == 1] >= c))
fpr = sapply(cutoffs, function(c) mean(finaldata$pred[finaldata$admit == 0] >= c))
tpr = c(0, tpr, 1)
fpr = c(0, fpr, 1)

# Trapezoidal rule: sum of (fpr[i+1]-fpr[i]) * (tpr[i]+tpr[i+1]) / 2
auroc = sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auroc

Calculating AUC using Mann–Whitney U Test

Area under curve (AUC) is directly related to the Mann–Whitney U test, which people in the analytics community also call the Wilcoxon rank-sum test.

This test assumes that the predicted probabilities of events and non-events are two independent continuous random variables. The area under the curve is the probability that an event receives a higher predicted probability than a non-event: AUC = P(Event >= Non-Event).

AUC = U1 / (n1 * n2)
U1 = R1 - n1*(n1 + 1) / 2

where U1 is the Mann–Whitney U statistic and R1 is the sum of the ranks of the predicted probabilities of actual events. It is calculated by ranking all predicted probabilities, selecting only those cases where the dependent variable is 1, and summing their ranks. n1 is the number of 1s (events) in the dependent variable and n2 is the number of 0s (non-events).

n1*n2 is the total number of pairs (or cross product of number of events and non-events). It is similar to what we have done in concordance method to calculate AUC.
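To connect the rank-based formula with the pair-counting view, here is a minimal Python sketch on made-up data; the arrays y and pred below are hypothetical and only serve to illustrate the computation:

import numpy as np
from scipy.stats import rankdata

# Hypothetical labels and predicted probabilities
y    = np.array([1, 1, 0, 0, 0])
pred = np.array([0.9, 0.4, 0.6, 0.3, 0.2])

ranks = rankdata(pred)           # rank all predicted probabilities (ties get average rank)
R1 = ranks[y == 1].sum()         # sum of ranks for actual events
n1 = (y == 1).sum()              # number of events
n2 = (y == 0).sum()              # number of non-events

U1  = R1 - n1 * (n1 + 1) / 2     # Mann-Whitney U statistic
auc = U1 / (n1 * n2)
print(auc)

For these five observations, both the rank formula and direct pair counting give AUC = 5/6, or roughly 0.83.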

In this section, you will learn how to calculate AUC using Mann–Whitney U Test in Python, SAS, and R.

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.api as sm
from scipy.stats import mannwhitneyu

# Read Data
df = pd.read_csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Convert admit column to binary variable
df['admit'] = df['admit'].astype('int')

# Factor Variables (rank 4 as reference level)
df['rank'] = df['rank'].astype('category')
df['rank'] = df['rank'].cat.reorder_categories([4, 1, 2, 3])

# Logistic Model
mylogistic = smf.glm(formula='admit ~ gre + gpa + rank', data=df,
                     family=sm.families.Binomial()).fit()
print(mylogistic.summary())

# Predict
pred = mylogistic.predict()
finaldata = pd.concat([df, pd.Series(pred, name='pred')], axis=1)

# Calculate the AUC using the Mann-Whitney U test
def auc_mann_whitney(y, pred):
    y = np.array(y, dtype=bool)
    n1 = np.sum(y)
    n2 = np.sum(~y)
    U, _ = mannwhitneyu(pred[y], pred[~y], alternative='greater')
    return U / (n1 * n2)

# Example usage
auc = auc_mann_whitney(finaldata['admit'], finaldata['pred'])
print(auc)
# Read Data
df = read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# Factor Variables
df$admit = as.factor(df$admit)
df$rank = as.factor(df$rank)

# Logistic Model (rank 4 as reference level)
df$rank = relevel(df$rank, ref = "4")
mylogistic = glm(admit ~ ., data = df, family = "binomial")
summary(mylogistic)

# Predict
pred = predict(mylogistic, type = "response")
finaldata = cbind(df, pred)

# Calculate the AUC using the Mann-Whitney U (Wilcoxon rank-sum) test
auc_mannWhitney <- function(y, pred) {
  y <- as.logical(y)
  n1 <- sum(y)
  n2 <- sum(!y)
  U <- as.numeric(wilcox.test(pred[y], pred[!y])$statistic)
  U / (n1 * n2)
}

auc_mannWhitney(as.numeric(as.character(finaldata$admit)), finaldata$pred)
FILENAME PROBLY TEMP;
PROC HTTP
 URL="https://stats.idre.ucla.edu/stat/data/binary.csv"
 METHOD="GET"
 OUT=PROBLY;
RUN;

OPTIONS VALIDVARNAME=ANY;
PROC IMPORT FILE=PROBLY OUT=WORK.binary REPLACE DBMS=CSV; RUN;

ods graphics on;
Proc logistic data= WORK.binary descending plots(only)=roc;
class rank / param=ref ;
model admit = gre gpa rank;
output out = estprob p= pred;
run;

%let dependent_var = admit;
%let score_dataset = estprob;
%let score_column  = pred;

ods output WilcoxonScores=WilcoxonScore;
proc npar1way wilcoxon data= &score_dataset.;
class &dependent_var.;
var &score_column.;
run;

data AUC;
set WilcoxonScore end=eof;
retain v1 v2 1;
if _n_=1 then v1=abs(ExpectedSum - SumOfScores);
v2=N*v2;
if eof then do;
 d=v1/v2;
 Gini=d * 2;
 AUC=d+0.5;
 put AUC= GINI=;
 keep AUC Gini;
 output;
end;
run;

proc print noobs;
run;

Calculating AUC using Cumulative Events and Non-Events

In this method, we will see how we can calculate area under curve using decile (binned) data.

Calculate AUC using Cumulative Events and Non-Events

  1. Sort predicted probabilities in descending order, so that customers with a high likelihood to buy a product appear at the top (in the case of a propensity model).
  2. Split the ranked data into 10 parts. This is the same idea as calculating deciles.
  3. Calculate the number of cases in each decile level. It will be the same in each level since we divided the data into 10 equal parts.
  4. Calculate the number of 1s (events) in each decile level. The maximum number of 1s should be captured in the first decile (if your model is performing well!).
  5. Calculate the cumulative percent of 1s in each decile level. The last decile should have 100% as it is cumulative in nature.
  6. Similar to the above step, calculate the cumulative percent of 0s in each decile level.
  7. AUC is then calculated using the trapezoidal rule numeric integration formula, where x is the cumulative % of 0s and y is the cumulative % of 1s.

This method returns only an approximation of the AUC score since we use 10 bins instead of the raw predicted probabilities, as illustrated in the sketch below.
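Here is a minimal Python sketch of the decile-based approximation, assuming hypothetical arrays y (0/1 actuals) and pred (predicted probabilities); the helper auc_from_deciles is illustrative, not a standard library function:

import numpy as np
import pandas as pd

def auc_from_deciles(y, pred, bins=10):
    d = pd.DataFrame({'y': y, 'pred': pred}).sort_values('pred', ascending=False)
    # Split the ranked data into equal-sized bins: decile 0 = highest scores
    d['decile'] = pd.qcut(d['pred'].rank(method='first', ascending=False), bins, labels=False)
    grouped = d.groupby('decile')['y'].agg(events='sum', total='count')
    grouped['nonevents'] = grouped['total'] - grouped['events']
    # Cumulative % of 1s (y-axis) and cumulative % of 0s (x-axis)
    cum_events = np.concatenate([[0], np.cumsum(grouped['events']) / grouped['events'].sum()])
    cum_nonevents = np.concatenate([[0], np.cumsum(grouped['nonevents']) / grouped['nonevents'].sum()])
    # Trapezoidal rule over the decile points
    return np.sum(np.diff(cum_nonevents) * (cum_events[1:] + cum_events[:-1]) / 2)

# Example with randomly generated data (for illustration only)
rng = np.random.default_rng(0)
pred = rng.random(1000)
y = (rng.random(1000) < pred).astype(int)   # labels loosely correlated with scores
print(auc_from_deciles(y, pred))

Because only 10 points are used, the result will typically differ slightly from the exact AUC computed on the raw predicted probabilities.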
