%matplotlib inline
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np
import math
import seaborn as sns
import scipy
plt.rcParams["figure.figsize"] = (10,10)
from warnings import filterwarnings
filterwarnings('ignore')
Introduction
I’m interested in what a typical default with Afterpay looks like. I have read hundreds of pages of information published by Afterpay, but I’m yet to see them mention the average default size.
Because I’m curious and looking for a way to entertain myself on a long train ride, I decided to work it out myself.
What do we know?
- Late Fees Revenue: 46.1 million AUD (Page 54 FY2019 Annual report)
- Average Transaction Value: Approximately 150 AUD (Page 25 FY2019 Annual report)
Furthermore, we know that the lowest and highest late fees that can be charged on a single transaction are 10 AUD and 68 AUD. The average late fee must therefore lie between those two values.
Let’s think about the different paths a transaction could take.
1. The customer makes good on all their payments on time.
2. The customer makes no payments, including late fees.
3. The customer is continually late making payments but, in the end, makes all the payments required.
4. A combination of 2 and 3, where the customer makes some payments before defaulting.
In cases 2 and 4, there will be a contribution to GROSS LOSS (Afterpay doesn’t get paid what’s owed in total).
In the case of 3 and 4, there will be a contribution to LATE FEES (Afterpay doesn’t get paid on time).
I will use PyMC3 to perform a Monte Carlo simulation, estimating how often cases 3 and 4 occur.
While not strictly necessary, I’m modelling “underlying_sales_aud”, “late_fees_rev_aud”, and “average_transaction_value_aud” as random variables so that they show up in the variable graph.
I’m also going to model average_transaction_value_aud, assuming they have rounded to the nearest 10 AUD.
with pm.Model() as model:
    underlying_sales_aud = pm.Uniform('underlying_sales_aud', lower=5.24715*10**9, upper=5.247249*10**9)
    late_fees_rev_aud = pm.Uniform('late_fees_rev', lower=46.05 * 10**6, upper=46.149 * 10**6)
    average_transaction_value_aud = pm.Uniform('average_transaction_value', lower=144.50, upper=154.49)
    average_late_fee_aud = pm.Uniform('average_late_fee', lower=10, upper=68)

    number_of_transactions = pm.Deterministic('number_of_transactions', underlying_sales_aud / average_transaction_value_aud)
    late_payment_rate = pm.Deterministic('late_payment_rate', late_fees_rev_aud / (number_of_transactions * average_late_fee_aud))
Now that we have instantiated all the random variables, we will take 50,000 draws from them to perform our Monte Carlo simulation.
with model:
    samples = pm.sample_prior_predictive(samples=50_000, random_seed=0)
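Because every input is a plain uniform distribution, the same prior draws can be sketched without PyMC3 at all. The block below is a minimal standard-library version, assuming the same bounds as the model above. The helper name `draw_late_payment_rate` is my own, and the seed will not reproduce PyMC3's exact draws, only the same distribution.

```python
import random
import statistics


def draw_late_payment_rate(rng):
    """Draw one late payment rate using the same uniform bounds as the model."""
    underlying_sales_aud = rng.uniform(5.24715e9, 5.247249e9)
    late_fees_rev_aud = rng.uniform(46.05e6, 46.149e6)
    average_transaction_value_aud = rng.uniform(144.50, 154.49)
    average_late_fee_aud = rng.uniform(10, 68)

    # Same two derived quantities as the Deterministic nodes in the model.
    number_of_transactions = underlying_sales_aud / average_transaction_value_aud
    return late_fees_rev_aud / (number_of_transactions * average_late_fee_aud)


rng = random.Random(0)  # fixed seed so the run is repeatable
rates = [draw_late_payment_rate(rng) for _ in range(50_000)]
mean_rate = statistics.fmean(rates)
print(f"mean late payment rate: {100 * mean_rate:.2f}%")
```

This is only a cross-check on the PyMC3 result. It gives up the variable graph and the summary tooling, which are the reasons I used PyMC3 in the first place.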
Variable Graph
We can graph the relationship between all our variables. From this, we can quickly see which variables are critical dependencies.
pm.model_to_graphviz(model)
Results
We can now visualise the distribution of possible values for the late payment rate.
sns.distplot(100*samples["late_payment_rate"], kde=False, norm_hist=True, bins=100)
plt.title('Distribution of frequency of late payments (%)')
plt.xlabel('Percentage of transactions with late payment (%)')
plt.ylabel('Relative Frequency')
plt.show()
From this chart, we can see that the late payment rate very likely lies between 2% and 14%.
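As a sanity check on those bounds, the two extremes can be worked out by hand. This sketch plugs point values into the same formula as the model, assuming 5.2472 billion AUD of underlying sales (the midpoint of the model's bounds), 46.1 million AUD of late fees and a 150 AUD average transaction. The function name is hypothetical.

```python
# Point values: midpoints of the ranges used in the model.
underlying_sales_aud = 5.2472e9
late_fees_rev_aud = 46.1e6
average_transaction_value_aud = 150.0

number_of_transactions = underlying_sales_aud / average_transaction_value_aud


def late_payment_rate(average_late_fee_aud):
    """Share of transactions that must have incurred a late fee."""
    return late_fees_rev_aud / (number_of_transactions * average_late_fee_aud)


# The smallest possible average fee implies the most late payments, and vice versa.
highest_rate = late_payment_rate(10)
lowest_rate = late_payment_rate(68)
print(f"{100 * lowest_rate:.1f}% to {100 * highest_rate:.1f}%")
```

The rate is inversely proportional to the average late fee, which is why the histogram piles up at the low end rather than being flat.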
Using PyMC3’s summary function, we find a 94% chance that the value is between 1.9% and 9.7%.
pm.summary(samples['late_payment_rate'])
arviz.stats.stats_utils - WARNING - Shape validation failed: input_shape: (1, 50000), minimum_shape: (chains=2, draws=4)
| | mean | sd | hpd_3% | hpd_97% | mcse_mean | mcse_sd | ess_mean | ess_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|---|---|
| x | 0.043 | 0.025 | 0.019 | 0.097 | 0.0 | 0.0 | 48586.0 | 47975.0 | 49350.0 | 46251.0 | NaN |
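The hpd_3% and hpd_97% columns describe a 94% highest-density interval. One common way to estimate such an interval from draws is to take the narrowest window that covers 94% of the sorted samples. The sketch below does that on standard-library draws with the same bounds as the model. The function `narrowest_interval` is my own illustration and I am not claiming it is what ArviZ does internally.

```python
import math
import random


def narrowest_interval(values, mass=0.94):
    """Return the shortest interval that contains `mass` of the values."""
    ordered = sorted(values)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of draws the interval must cover
    # Slide a window of that size along the sorted draws and keep the shortest.
    start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[start], ordered[start + window - 1]


rng = random.Random(0)  # fixed seed so the run is repeatable
rates = []
for _ in range(50_000):
    sales = rng.uniform(5.24715e9, 5.247249e9)
    fees = rng.uniform(46.05e6, 46.149e6)
    transaction_value = rng.uniform(144.50, 154.49)
    late_fee = rng.uniform(10, 68)
    rates.append(fees / ((sales / transaction_value) * late_fee))

low, high = narrowest_interval(rates)
print(f"94% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

Because the density falls away as the rate increases, the narrowest interval hugs the lower bound, so its left edge sits near the smallest rate the model allows.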
Conclusion
Based on our assumptions, we can estimate how often Afterpay customers are late with a payment. Under our model and beliefs, it’s approximately 4.3% of transactions. However, this is almost certainly wrong because we made several implicit assumptions:
1. All payments are the same size.
2. The average late fee is uniformly distributed between 10 AUD and 68 AUD.
In future posts, I want to refine the model further, build a more accurate distribution, narrow its bounds, and try to determine a result we can have more confidence in.