IFRS 9.5.5 introduces an impairment model for financial assets based on Expected Credit Losses (ECL), which requires entities to recognize a loss allowance prior to loss materialization, utilizing forward-looking and historical information. IFRS 9.5.5.1 stipulates that an entity "shall recognise a loss allowance for expected credit losses on a financial asset that is measured in accordance with paragraphs 4.1.2", that is, financial assets "measured at amortised cost" held "to collect contractual cash flows" and whose "contractual terms of the financial asset give rise on specified dates to cash flows that are solely payments of principal and interest on the principal amount outstanding". Referring to the impairment model's input data, IFRS 9.5.5.4 expects entities to consider "all reasonable and supportable information, including that which is forward-looking".
The ECL for Trade Receivables that contain a "significant financing component"1 under IFRS 15, such as credit card receivables, can be measured under the "Simplified Approach". In contrast with the "General Approach", the Simplified Approach allows entities to recognise lifetime expected losses on all these assets without the need to identify significant increases in credit risk. In any case, because the maturities will typically be 12 months or less, the credit loss for 12-month and lifetime ECLs would be the same. IFRS 9.5.5.15 states that "an entity shall always measure the loss allowance at an amount equal to lifetime expected credit losses for...trade receivables or contract assets that result from transactions that are within the scope of IFRS 15, and that…contain a significant financing component in accordance with IFRS 15, if the entity chooses as its accounting policy to measure the loss allowance at an amount equal to lifetime expected credit losses."
Lifetime expected credit loss is the discounted value of expected credit losses that result from possible default events over the expected life of a financial instrument. IFRS 9.5.5.17 clarifies that "An entity shall measure expected credit losses of a financial instrument in a way that reflects: (a) an unbiased and probability-weighted amount that is determined by evaluating a range of possible outcomes; (b) the time value of money." The term "default" is not defined in IFRS 9. IFRS 9:B5.5.37 states that a definition of default should be "consistent with the definition used for internal credit risk management purposes". Entities will need to consider the requirements of this paragraph where it states there is a "rebuttable presumption that default does not occur later than when a financial asset is 90 days past due unless an entity has reasonable and supportable information to demonstrate that a more lagging default criterion is more appropriate".
IFRS 9.5.5.19 indicates that the maximum expected life is generally understood as the contractual life: "The maximum period to consider when measuring expected credit losses is the maximum contractual period (including extension options) over which the entity is exposed to credit risk and not a longer period". The expected period of exposure is more subjective. IFRS 9:B5.5.40 states that when determining expected life "an entity should consider factors such as historical information and experience about: (a) the period over which the entity was exposed to credit risk on similar financial instruments; (b) the length of time for related defaults to occur on similar financial instruments following a significant increase in credit risk; and (c) the credit risk management actions that an entity expects to take once the credit risk on the financial instrument has increased, such as the reduction or removal of undrawn limits."
With specific reference to revolving credit facilities, IFRS 9:B5.5.39 prevails upon the entity to apply discretionary judgement regarding the time horizon of credit exposure. Where financial instruments include both a loan and an undrawn commitment component (such as credit cards and overdraft facilities), the contractual ability to demand repayment and cancel the undrawn commitment does not necessarily limit the exposure to credit losses beyond the contractual period. For those financial instruments, management should measure ECL over the period that the entity is exposed to credit risk and ECL would not be mitigated by credit risk management actions, even if that period extends beyond the maximum contractual period. In the Illustrative Examples, IFRS 9:IE60 provides further guidance on which factors should be taken into consideration when determining the size and time horizon of credit exposure: "At the reporting date the outstanding balance on the credit card portfolio is CU60,000 and the available undrawn facility is CU40,000. Bank A determines the expected life of the portfolio by estimating the period over which it expects to be exposed to credit risk on the facilities at the reporting date, taking into account: (a) the period over which it was exposed to credit risk on a similar portfolio of credit cards; (b) the length of time for related defaults to occur on similar financial instruments; and (c) past events that led to credit risk management actions because of an increase in credit risk on similar financial instruments, such as the reduction or removal of undrawn credit limits."
1 A significant financing component exists if the timing of payments agreed to by the parties to the contract (either explicitly or implicitly) provides the customer or the entity with a significant benefit of financing the transfer of goods or services to the customer. [IFRS 15:60]
- The ECL calculation model should calculate an unbiased and probability-weighted amount to be presented as an impairment to the book value of the financial assets in the Balance sheet.
- This unbiased and probability weighted amount is the difference between the present value of cashflows due under contract and the present value of cashflows that an entity expects to receive.
- The Expected Credit Loss is determined by the probability of default, the size of the exposure to defaulting customers, the expected recoverable amount in the event of default and the discount rate applied.
- The estimated size of the exposure is necessarily related to the expectations of the customers' drawdown of the undrawn commitment component over a defined time frame. The time frame will be governed by subjective evaluations focussing on how long it will take the entity to identify and take remedial action in relation to problem credit.
- The Lifetime Expected Credit Losses will have to incorporate the term structure of the default probability of the assets. In other words, the hazard rate or default intensity, which connotes an instantaneous rate of failure, should be used along with the exponential distribution to compute the cumulative probability of default for a given time horizon.
- The entity should apply a granular and dynamic approach to portfolio segmentation by grouping financial assets based on shared credit characteristics.
- As with all such forward-looking models, expected loss should be considered at an aggregate portfolio level, which generally involves incorporating some expectation of the effect of correlation between the constituent assets.
The future value of the Lifetime Expected Credit Loss of the portfolio at future time $T$ is defined as a function of the probability of default, $PD_T$, the expected exposure at the time of default, $EAD_T$, and the size of the expected loss in the event of default, $LGD_T$. The present value of this future value is obtained by discounting it at the Effective Interest Rate of the portfolio assets, $EIR$. Thus:

$$PV(ECL_T) = \frac{PD_T \times EAD_T \times LGD_T}{(1 + EIR)^{T}}$$
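As a minimal numerical sketch of this calculation (the input values, the function name lifetime_ecl and the 1.5-year horizon are illustrative assumptions, not figures from the standard):

def lifetime_ecl(pd_t, ead_t, lgd_t, eir, t):
    """Present value of the expected credit loss for a single horizon t:
    PD x EAD x LGD, discounted at the effective interest rate."""
    future_ecl = pd_t * ead_t * lgd_t        # expected loss at time t
    return future_ecl / (1 + eir) ** t       # discount back to the reporting date

# Illustrative inputs only: 3% lifetime PD, CU60,000 exposure, 55% LGD,
# 18% effective interest rate and a 1.5-year weighted average lifetime.
print(lifetime_ecl(pd_t=0.03, ead_t=60_000, lgd_t=0.55, eir=0.18, t=1.5))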
$\lambda$ is the hazard rate or default intensity. More precisely, $\lambda\,dt$ is the (instantaneous) probability of default over an infinitesimally small time interval $dt$, conditional on survival to time $t$:

$$P(\text{default in } [t, t + dt] \mid \text{no default before } t) = \lambda\, dt$$
The estimation of the default probability of each credit portfolio constituent is achieved with the logit model, which employs the technique of logistic transformation to generate a sigmoid function bounded by 0 and 1:

$$PD_i = \frac{1}{1 + e^{-z_i}}$$
Where $z_i$ is a linear regression function of the form:

$$z_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \dots + \beta_n x_{i,n}$$
Where $\beta_0, \beta_1, \dots, \beta_n$ are parameters that are estimated statistically and $x_{i,1}, \dots, x_{i,n}$ are scores, ratios and other explanatory variables for obligor $i$, transformed into binary "dummy" variables.
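As an illustration of the logistic transformation, the following sketch scores a single obligor; the coefficient values and the dummy inputs are hypothetical and would in practice come from the fitted regression:

import numpy as np

def logit_pd(betas, x):
    """Probability of default from the linear score z = b0 + b1*x1 + ... + bn*xn,
    passed through the sigmoid 1 / (1 + exp(-z))."""
    z = betas[0] + np.dot(betas[1:], x)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients (intercept plus three weights) and inputs for obligor i.
betas = np.array([-2.5, 0.8, -1.2, 0.4])
x_i = np.array([1, 0, 1])                  # binary dummy variables
print(logit_pd(betas, x_i))                # approximately 0.21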
$PD(T)$ is the average cumulative probability of default of the portfolio over $[0, T]$, that is, the output of the cumulative default time distribution at time horizon $T$, where $T$ denotes the weighted average lifetime of the credit portfolio:

$$PD(T) = 1 - e^{-\lambda T}$$
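A short sketch, under the exponential default-time assumption above, converting an observed 12-month PD into an implied constant hazard rate and then into a cumulative PD over the weighted average lifetime; the 2% PD and the 1.5-year horizon are illustrative:

import numpy as np

def cumulative_pd(hazard_rate, horizon):
    """Cumulative default probability under an exponential default-time
    distribution: PD(T) = 1 - exp(-lambda * T)."""
    return 1.0 - np.exp(-hazard_rate * horizon)

pd_12m = 0.02                              # illustrative observed 12-month PD
lam = -np.log(1.0 - pd_12m)                # implied constant default intensity
print(cumulative_pd(lam, horizon=1.5))     # lifetime PD over 1.5 years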
The Vasicek Model offers an elegant solution allowing the computation of a portfolio default rate, $DR$, which integrates the impact of (negative) assumptions about future economic conditions and the effect of the correlation between the portfolio assets. The model takes three inputs:
* The weighted average standalone probability of default, denoted by $PD$;
* The average correlation of the portfolio assets with the broader economy, denoted by $\rho$;
* A common systematic economic factor (such as GDP growth, general levels of credit quality, etc.), denoted by $Z$.
The default rate for an asymptotic portfolio, having estimated the average default probability, the default correlation parameter and the common market factor, is given by:

$$DR = \Phi\left(\frac{\Phi^{-1}(PD) + \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$

where $\Phi$ denotes the standard normal cumulative distribution function.
$Z$ is a standard normal variable, $Z \sim N(0,1)$, representing the assumed severity of the economic downturn. The higher the probability of default, the greater the correlation coefficient and the larger the assumed market downturn, the smaller the distance to default and the higher the associated default rate for the portfolio.
It may make more intuitive sense if the variable $Z$ is restated as the inverse of the standard normal cumulative distribution applied to a probability input $q$ ranging from 0.5 to 0.999, where the higher the input value, the more severe the assumed economic downturn. This results in:

$$DR = \Phi\left(\frac{\Phi^{-1}(PD) + \sqrt{\rho}\,\Phi^{-1}(q)}{\sqrt{1-\rho}}\right)$$
The correlation coefficient, $\rho$, can be obtained by adapting the Basel II IRB risk-weight formula for corporate exposures, which is based on the Vasicek model and which prescribes that correlations are bounded by upper and lower limits and are a function of the weighted average probability of default. For credit card default correlations, we employ the empirical study of Crook and Bellotti2 to set the lower bound at 0.396% and the upper bound at 4%, and assume that correlation increases with the default probability between these bounds.
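The bounded correlation and the resulting stressed portfolio default rate can be sketched as follows. Only the 0.396% and 4% bounds and the Φ-based Vasicek formula come from the discussion above; the exponential weighting factor k = 50 (borrowed from the Basel IRB corporate formula) and the input values are assumptions for illustration:

import numpy as np
from scipy.stats import norm

def asset_correlation(pd_avg, lower=0.00396, upper=0.04, k=50.0):
    """Correlation bounded by [lower, upper] and increasing in PD; the
    exponential weight k = 50 is an assumption borrowed from Basel IRB."""
    w = (1.0 - np.exp(-k * pd_avg)) / (1.0 - np.exp(-k))
    return lower * (1.0 - w) + upper * w

def vasicek_default_rate(pd_avg, rho, q):
    """Portfolio default rate conditional on a downturn of severity q in (0.5, 0.999),
    mapped to the systematic factor via the inverse standard normal CDF."""
    numerator = norm.ppf(pd_avg) + np.sqrt(rho) * norm.ppf(q)
    return norm.cdf(numerator / np.sqrt(1.0 - rho))

# Illustrative inputs: 3% weighted average standalone PD, 99th-percentile downturn.
pd_avg = 0.03
rho = asset_correlation(pd_avg)
print(rho, vasicek_default_rate(pd_avg, rho, q=0.99))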
A "Two-stage" LGD model is implemented. The "Stage 1" model is a classification model to predict whether the loan will have a recovery rate (RR) greater than zero. The "Stage 2" model a regression-type model to predict the value of the recovered amount of when the recovery rate is expected to be positive. The predicted recovery is the expected value of the two combined models, that is, the product of a binary value representing the event of recovery and the expected recovery value. So, for obligor , predicted will be either:
Or:

$$RR_i = 0 \quad \text{if } p_i < \text{threshold}$$
Where $\widehat{RR}_i$ is the predicted amount of positive RR obtained from a multivariate linear regression, $p_i$ is the probability of a positive RR obtained from a multivariate logistic regression assuming some threshold, and $RR_i$ is the obligor-specific recovery rate.
LGD is therefore:

$$LGD_i = 1 - RR_i$$
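A minimal sketch of the two-stage estimate, fitted here on purely synthetic recovery records; the scikit-learn models, the 0.5 classification threshold and the clipping of the recovery rate to [0, 1] are assumptions for illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for historical recovery data: three obligor features,
# a binary "any recovery" flag and a recovery rate for the recovered cases.
X = rng.normal(size=(500, 3))
recovered = (X[:, 0] + rng.normal(size=500) > 0).astype(int)
rr = np.clip(0.4 + 0.2 * X[:, 1] + 0.05 * rng.normal(size=500), 0, 1)

stage1 = LogisticRegression().fit(X, recovered)                         # Stage 1: P(RR > 0)
stage2 = LinearRegression().fit(X[recovered == 1], rr[recovered == 1])  # Stage 2: RR given RR > 0

def predicted_lgd(x, threshold=0.5):
    """LGD = 1 - RR, where RR is the Stage 2 estimate if the Stage 1
    probability of a positive recovery exceeds the threshold, else 0."""
    p_recovery = stage1.predict_proba(x)[:, 1]
    rr_hat = np.clip(stage2.predict(x), 0, 1)
    rr_pred = np.where(p_recovery >= threshold, rr_hat, 0.0)
    return 1.0 - rr_pred

print(predicted_lgd(X[:5]))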
For credit card portfolios, EAD estimation is bedevilled by the revolving nature of the credit line, which poses challenges to predicting the exposure at the time of default. Additional borrowing in the period prior to default means that taking the current balance of non-defaulted customers does not produce a sufficiently conservative estimate of the amount drawn by the time of default. One solution is to use historic data on defaulted accounts to derive a Credit Conversion Factor (CCF), the proportion of the current undrawn amount that is likely to be drawn down by the time of default. The dependent variable in the regression analysis will be:

$$CCF_i = \frac{EAD_i - B_i}{L_i - B_i}$$

where $EAD_i$ is the observed balance at default, $B_i$ is the drawn balance at the observation date and $L_i$ is the credit limit.
So, for obligor $i$, the predicted $EAD_i$ will be:

$$EAD_i = B_i + CCF_i \times (L_i - B_i)$$
Where $CCF_i$ is the obligor-specific CCF multiplier obtained by applying the multivariate linear regression function to the obligor's data.
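A brief sketch of the EAD estimate via the CCF; the synthetic training history, the feature set and the clipping of the CCF to [0, 1] are illustrative assumptions:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic history of defaulted accounts: behavioural features and the observed
# share of the then-undrawn limit that had been drawn by the time of default.
X_hist = rng.normal(size=(500, 3))
ccf_hist = np.clip(0.5 + 0.2 * X_hist[:, 0] + 0.1 * rng.normal(size=500), 0, 1)
ccf_model = LinearRegression().fit(X_hist, ccf_hist)

def predicted_ead(x, balance, limit):
    """EAD = current balance + CCF x undrawn amount, with the CCF clipped to [0, 1]."""
    ccf = np.clip(ccf_model.predict(x), 0, 1)
    return balance + ccf * (limit - balance)

# Illustrative non-defaulted account: CU600 drawn against a CU1,000 limit.
print(predicted_ead(X_hist[:1], balance=600.0, limit=1000.0))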
2 J. Crook & T. Bellotti (2012) Asset correlations for credit card defaults, Applied Financial Economics, 22:2, 87-95
To avoid any suggestion of the selective usage of raw data and the gaming of model results, the procedure for treating raw data should be transparent and rigorous. For example:
import numpy as np
import pandas as pd
# 1) Retrieve loan data into dataframe
loan_data = pd.read_csv('loan_data_2007_2014.csv')
# 2) Convert string values to integers where necessary. First removing text...
loan_data['emp_length_int'] = loan_data['emp_length'].str.replace(r'\+ years', '', regex=True)
loan_data['emp_length_int'] = loan_data['emp_length_int'].str.replace('< 1 year', str(0), regex=False)
loan_data['emp_length_int'] = loan_data['emp_length_int'].str.replace('n/a', str(0), regex=False)
loan_data['emp_length_int'] = loan_data['emp_length_int'].str.replace(' years', '', regex=False)
loan_data['emp_length_int'] = loan_data['emp_length_int'].str.replace(' year', '', regex=False)
#...then converting string datatype to numeric datatype
loan_data['emp_length_int'] = pd.to_numeric(loan_data['emp_length_int'])
# Similarly, convert the loan term from string to integer, removing the ' months' text
loan_data['term_int'] = pd.to_numeric(loan_data['term'].str.replace(' months', ''))
# 3) Convert string points in time to numeric periods of time where necessary. First converting to datetime format...
loan_data['earliest_cr_line_date'] = pd.to_datetime(loan_data['earliest_cr_line'], format = '%b-%y')
#...then converting to a new passage of time variable
loan_data['mths_since_earliest_cr_line'] = round(pd.to_numeric((pd.to_datetime('2017-12-01')
- loan_data['earliest_cr_line_date'])
/ np.timedelta64(1, 'M')))
loan_data['issue_d_date'] = pd.to_datetime(loan_data['issue_d'], format = '%b-%y')
loan_data['mths_since_issue_d'] = round(pd.to_numeric((pd.to_datetime('2017-12-01')
- loan_data['issue_d_date'])
/ np.timedelta64(1, 'M')))
# 4) Transform all discrete variables into dummy variables and concatenate in single dataframe
loan_data_dummies = [pd.get_dummies(loan_data['grade'], prefix = 'grade', prefix_sep = ':'),
pd.get_dummies(loan_data['sub_grade'], prefix = 'sub_grade', prefix_sep = ':'),
pd.get_dummies(loan_data['home_ownership'], prefix = 'home_ownership', prefix_sep = ':'),
pd.get_dummies(loan_data['verification_status'], prefix = 'verification_status', prefix_sep = ':'),
pd.get_dummies(loan_data['loan_status'], prefix = 'loan_status', prefix_sep = ':'),
pd.get_dummies(loan_data['purpose'], prefix = 'purpose', prefix_sep = ':'),
pd.get_dummies(loan_data['addr_state'], prefix = 'addr_state', prefix_sep = ':'),
pd.get_dummies(loan_data['initial_list_status'], prefix = 'initial_list_status', prefix_sep = ':')]
loan_data_dummies = pd.concat(loan_data_dummies, axis = 1)
# 5) Incorporate new dummy variables into master dataframe
loan_data = pd.concat([loan_data, loan_data_dummies], axis = 1)
# 6) Replace missing values with appropriate alternative value or remove from dataset
loan_data['total_rev_hi_lim'] = loan_data['total_rev_hi_lim'].fillna(loan_data['funded_amnt'])  # other variable
loan_data['annual_inc'] = loan_data['annual_inc'].fillna(loan_data['annual_inc'].mean())  # mean value
loan_data['mths_since_earliest_cr_line'] = loan_data['mths_since_earliest_cr_line'].fillna(0)  # zero value
loan_data['acc_now_delinq'] = loan_data['acc_now_delinq'].fillna(0)  # zero value
loan_data['total_acc'] = loan_data['total_acc'].fillna(0)  # zero value
loan_data['pub_rec'] = loan_data['pub_rec'].fillna(0)  # zero value
loan_data['open_acc'] = loan_data['open_acc'].fillna(0)  # zero value
loan_data['inq_last_6mths'] = loan_data['inq_last_6mths'].fillna(0)  # zero value
loan_data['delinq_2yrs'] = loan_data['delinq_2yrs'].fillna(0)  # zero value
loan_data['emp_length_int'] = loan_data['emp_length_int'].fillna(0)  # zero value
# To remove rows with null values from the dataset instead:
#indices = loan_data[loan_data['emp_length_int'].isnull()].index
#loan_data.drop(indices, inplace=True)
# 7) Search for errors/anomalies/outliers in the dataset. Remove or replace
pd.crosstab(loan_data['home_ownership'],
loan_data['emp_length_int'],
values=loan_data['mths_since_earliest_cr_line'],
aggfunc='min').round(2)
loan_data['mths_since_earliest_cr_line'].describe()
# Replace all negative values (an artefact of the two-digit year parsing) with the column maximum
loan_data.loc[loan_data['mths_since_earliest_cr_line'] < 0,
              'mths_since_earliest_cr_line'] = loan_data['mths_since_earliest_cr_line'].max()
# Alternatively, remove all rows with negative values from the dataset:
#indices = loan_data[loan_data['mths_since_earliest_cr_line'] < 0].index
#loan_data.drop(indices, inplace=True)
The data should be divided into training and testing datasets. All discrete and continuous feature variables should be transformed into dummy variables. The initial transformation of the feature variables of the training dataset into narrow categories of arbitrary size is referred to as "fine classing". The subsequent process of creating new, refined and usually enlarged categories based on the initial ones is known as "coarse classing".
A metric called 'Weight of Evidence' (WoE) is employed to this end. The objective is to lower the number of dummy variables. Weight of Evidence shows to what extent each of the different categories of an independent variable explains the dependent variable. The objective is to obtain categories with a similar WoE. Ideally, each category (bin) should have at least 5% of the observations, each category (bin) should be non-zero for both non-events and events, and the WoE should be monotonic, i.e. either increasing or decreasing across the groupings.
The formula for WoE is:

$$WoE_{category} = \ln\left(\frac{\%\ \text{of good outcomes in the category}}{\%\ \text{of bad outcomes in the category}}\right)$$

where the percentages are each category's share of the total good and bad outcomes respectively. The steps to calculate WoE are:
# Define dependent 'Default' variable and add to loan_data dataframe
loan_data['good_bad'] = np.where(loan_data['loan_status'].isin(['Charged Off', 'Default',
'Does not meet the credit policy. Status:Charged Off',
'Late (31-120 days)']), 0, 1)
# Imports the libraries we need.
from sklearn.model_selection import train_test_split
cr_inp_train, cr_inp_test, cr_tgt_train, cr_tgt_test = train_test_split(loan_data.drop('good_bad', axis = 1),
loan_data['good_bad'],
test_size = 0.2,
random_state = 42)
# WoE function for discrete unordered variables
# The function takes 3 arguments: a feature dataframe, a string, and a target dataframe.
# The function returns a dataframe as a result.
def woe_discrete(df, discrete_variabe_name, good_bad_variable_df):
df = pd.concat([df[discrete_variabe_name], good_bad_variable_df], axis = 1)
df = pd.concat([df.groupby(df.columns.values[0], as_index = False)[df.columns.values[1]].count(),
df.groupby(df.columns.values[0], as_index = False)[df.columns.values[1]].mean()], axis = 1)
df = df.iloc[:, [0, 1, 3]]
df.columns = [df.columns.values[0], 'n_obs', 'prop_good']
df['prop_n_obs'] = df['n_obs'] / df['n_obs'].sum()
df['n_good'] = df['prop_good'] * df['n_obs']
df['n_bad'] = (1 - df['prop_good']) * df['n_obs']
df['prop_n_good'] = df['n_good'] / df['n_good'].sum()
df['prop_n_bad'] = df['n_bad'] / df['n_bad'].sum()
df['WoE'] = np.log(df['prop_n_good'] / df['prop_n_bad'])
df = df.sort_values(['WoE'])
df = df.reset_index(drop = True)
df['diff_prop_good'] = df['prop_good'].diff().abs()
df['diff_WoE'] = df['WoE'].diff().abs()
df['IV'] = (df['prop_n_good'] - df['prop_n_bad']) * df['WoE']
df['IV'] = df['IV'].sum()
return df
# NOTE ON GROUPBY
# Groups the data according to a criterion contained in one column (1st = Grade)
# Does not turn the names of the values of the criterion into index if as_index = False
# Aggregates the data in another column (good_bad) to these groups, using a selected function (mean)
# Syntax: Produces Pandas DataFrame >>> df.groupby('month')[['duration']].sum()
# WoE function for ordered discrete and continuous variables
def woe_ordered_continuous(df, discrete_variabe_name, good_bad_variable_df):
df = pd.concat([df[discrete_variabe_name], good_bad_variable_df], axis = 1)
df = pd.concat([df.groupby(df.columns.values[0], as_index = False)[df.columns.values[1]].count(),
df.groupby(df.columns.values[0], as_index = False)[df.columns.values[1]].mean()], axis = 1)
df = df.iloc[:, [0, 1, 3]]
df.columns = [df.columns.values[0], 'n_obs', 'prop_good']
df['prop_n_obs'] = df['n_obs'] / df['n_obs'].sum()
df['n_good'] = df['prop_good'] * df['n_obs']
df['n_bad'] = (1 - df['prop_good']) * df['n_obs']
df['prop_n_good'] = df['n_good'] / df['n_good'].sum()
df['prop_n_bad'] = df['n_bad'] / df['n_bad'].sum()
df['WoE'] = np.log(df['prop_n_good'] / df['prop_n_bad'])
#df = df.sort_values(['WoE'])
#df = df.reset_index(drop = True)
df['diff_prop_good'] = df['prop_good'].diff().abs()
df['diff_WoE'] = df['WoE'].diff().abs()
df['IV'] = (df['prop_n_good'] - df['prop_n_bad']) * df['WoE']
df['IV'] = df['IV'].sum()
return df
# NOTE: We order the results by the values of a different column.
# WoE Visualization
import matplotlib.pyplot as plt