Introduction

This notebook illustrates the concepts in the Exam 7 reading LDF Curve-Fitting and Stochastic Reserving: A Maximum Likelihood Approach by David R. Clark, reproducing the examples in the paper along the way. The following packages are used in this notebook:

easypackages::packages("data.table", "DT", "ggplot2", "stats4")

The approach is based on treating the cumulative reported percentage of claims as a cumulative distribution function: \[ G(x) = \frac{1}{\mathrm{LDF}_x} \] Fitting a distribution provides an expected emergence pattern. Two commonly used distributions for this purpose are the Loglogistic curve \[ G(x|\omega,\theta) = \frac{x^{\omega}}{x^{\omega} + \theta^{\omega}} \] and the Weibull curve \[ G(x|\omega,\theta) = 1 - \exp\left(-\frac{x^{\omega}}{\theta^{\omega}}\right) \] In either case, multiplying \(G(x)\) by an estimate of the ultimate expected loss gives the expected losses to emerge by time \(x\).

The approach further assumes that actual emergence \(C\) follows an overdispersed Poisson distribution: if \(X\) is Poisson with mean \(\lambda\), then \(C = \sigma^2 X\) for some value \(\sigma^2\). Since \[ Pr(X=x) = \frac{\lambda^x e^{-\lambda}}{x!} \] then \[ Pr(C=c) = \frac{\lambda^{c/\sigma^2} e^{-\lambda}}{(c/\sigma^2)!} \] Therefore, \[ E[C] = \sigma^2 \lambda = \mu \] and \[ \mathrm{Var}(C) = \sigma^4 \lambda = \mu \sigma^2 \] There is an implicit assumption that the ratio of variance to the mean is constant, and that it can be approximated by a \(\chi^2\) formula: \[ \frac{\mathrm{Var}(C)}{\mu} = \sigma^2 \approx \frac{1}{n-p} \sum_{z, t} \frac{(c_{z,t} - \mu_{z,t})^2}{\mu_{z,t}} \] where \(n\) is the number of data points, \(p\) is the number of fitted parameters, and \(c_{z,t}\) and \(\mu_{z,t}\) are the actual and expected incremental losses for accident year \(z\) and development period \(t\).
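
As a quick illustration (not from the paper), the mean-variance relationship of the overdispersed Poisson can be checked by simulation; the values of lambda and sigma.squared below are arbitrary:

lambda = 100          # Poisson mean (arbitrary)
sigma.squared = 50    # dispersion parameter (arbitrary)
C = sigma.squared * rpois(n = 100000, lambda = lambda)
mean(C)               # approximately sigma.squared * lambda
var(C) / mean(C)      # approximately sigma.squared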

Clark recommends that this method be applied to data that is arranged in tabular, rather than triangular format. The approach is illustrated on the following dataset:

clark.table = fread("./clark_table.csv")
datatable(clark.table)

The method is applied to a table of incremental losses in each development period:

# Shift each cumulative value forward one development period so that it lines up with
# the next evaluation of the same accident year, then difference to get incrementals.
next.period = clark.table[, .(AY, previous_reported = reported_losses, previous_age = age + 12)]
clark.incremental = merge(clark.table, next.period, by.x = c("AY", "age"), by.y = c("AY", "previous_age"), all.x = TRUE)
clark.incremental[is.na(previous_reported), previous_reported := 0]
clark.incremental[, incremental_reported := reported_losses - previous_reported]
clark.incremental = clark.incremental[, .(AY, age, incremental_reported)]
datatable(clark.incremental)

Loss Development Factor Approach

In this approach, we assume that there is an ultimate loss amount \(U_z\) for accident year \(z\) that is independent of the ultimate losses in other accident years. In this case, the expected amount to emerge between development age \(x\) and \(y\) is given by \[ \mu_{z; x, y} = U_z[G(y|\omega, \theta) - G(x|\omega, \theta)] \]

Selecting parameters using maximum likelihood

Under the assumption that actual losses follow an overdispersed Poisson distribution, the log-likelihood function is \[ \sum_{i,t} \frac{c_{i,t}}{\sigma^2} \log\left(\frac{\mu_{i,t}}{\sigma^2}\right) - \frac{\mu_{i,t}}{\sigma^2} - \log\left(\frac{c_{i,t}}{\sigma^2}!\right) \] where the sum is over all entries in the table. Since \(\mu_{i,t}\) is the only part of this expression that depends on \(\omega\) and \(\theta\), and \(\sigma^2\) is assumed to be fixed, maximizing the log-likelihood is equivalent to maximizing \[ \ell = \sum_{i,t} c_{i,t} \log(\mu_{i,t}) - \mu_{i,t}. \]
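
To make the simplification explicit, each term of the log-likelihood splits into a piece proportional to \(\ell\) and pieces that do not involve \(\omega\) or \(\theta\): \[ \frac{c_{i,t}}{\sigma^2} \log\left(\frac{\mu_{i,t}}{\sigma^2}\right) - \frac{\mu_{i,t}}{\sigma^2} - \log\left(\frac{c_{i,t}}{\sigma^2}!\right) = \frac{1}{\sigma^2}\left[c_{i,t} \log(\mu_{i,t}) - \mu_{i,t}\right] - \frac{c_{i,t}}{\sigma^2}\log(\sigma^2) - \log\left(\frac{c_{i,t}}{\sigma^2}!\right) \]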

For the LDF method, to determine ultimate losses, we can differentiate with respect to \(U_z\) and set the result equal to zero: \[ \frac{\partial \ell}{\partial U_z} = \sum_t \frac{c_{z,t}}{U_z} - [G(x_t) - G(x_{t-1})] = 0 \] where \(x_t\) is the midpoint of development period \(t\). Solving for \(U_z\) gives \[ U_z = \frac{\sum_t c_{z,t}}{\sum_t[G(x_t) - G(x_{t-1})]} \] The denominator telescopes to \(G(x_T)\), the growth function evaluated at the latest midpoint (since \(G(0) = 0\)), and the numerator is simply the latest cumulative loss value for the accident year, so this is the cumulative value divided by \(G(x_T)\), which is the usual LDF estimate of ultimate. For convenience, let’s define a function to calculate the value of the Loglogistic CDF:

G = function(x, omega, theta) {
  return(x^omega/(x^omega + theta^omega))
}

Now, the following function can be used to calculate the ultimate losses:

ult = function(z, omega, theta) {
  numerator = clark.incremental[AY == z, sum(incremental_reported)]
  t = clark.incremental[AY == z, max(age)]
  x_t = t - 6
  denominator = G(x_t, omega, theta)
  return(numerator/denominator)
}

As an example, using the parameters provided by Clark, the ultimate value in 1994 is:

ult(z = 1994, omega = 1.434294, theta = 48.6249)
## [1] 6917862

We can then use this to calculate the expected emergence between two time periods, in this case, assuming a Loglogistic curve:

mu = function(z, x, y, omega, theta) {
  return(ult(z, omega, theta) * (G(y, omega, theta) - G(x, omega, theta)))
}

As an example, the expected emergence for AY 1994 between ages 36 and 48 is

mu(z = 1994, x = 30, y = 42, omega = 1.434294, theta = 48.6249)
## [1] 790263.9

(Note that we use the midpoints of the development periods: development time is measured from the average accident date, i.e. the middle of the accident year, so ages 36 and 48 correspond to \(x = 30\) and \(y = 42\).)

From this, we can calculate the contribution of each individual term to the log-likelihood function, and then sum over all entries in the table. I work with the negative log-likelihood, since this is what R's "mle" function minimizes.

mle.term = function(omega, theta, z, t) {
  # Contribution of one incremental loss to the (quasi-)log-likelihood. The increment
  # observed at age t covers midpoints t-18 to t-6, floored at 0 for the first period.
  c = clark.incremental[AY == z & age == t, incremental_reported]
  return(c * log(mu(z, max(t-18, 0), max(t-6,0), omega, theta)) - mu(z, max(t-18,0), max(t-6,0), omega, theta))
}
neg.loglikelihood = function(omega = 1, theta = 50) {
  # Negative log-likelihood summed over every row of the incremental table
  mle.terms = apply(clark.incremental, 1, function(r) mle.term(omega, theta, z = r['AY'], t = r['age']))
  return(-(sum(mle.terms)))
}

We now use the "mle" function from the stats4 package to estimate the parameters, constraining them to be non-negative:

mle.fit.ldf.ll = mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 2))
mle.fit.ldf.ll
## 
## Call:
## mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 
##     2))
## 
## Coefficients:
##     omega     theta 
##  1.434296 48.624756

Let’s save these results in variables for later convenience:

omega.ldf.ll = mle.fit.ldf.ll@coef[[1]]
theta.ldf.ll = mle.fit.ldf.ll@coef[[2]]

The approach can also be applied using the Weibull distribution. This requires us to re-define the function \(G\):

G = function(x, omega, theta) {
  return(1 - exp(-(x/theta)^omega))
}

Refitting the model:

mle.fit.ldf.weibull = mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 2))
mle.fit.ldf.weibull
## 
## Call:
## mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 
##     2))
## 
## Coefficients:
##     omega     theta 
##  1.296907 48.884451

Saving the parameters for later use:

omega.ldf.weibull = mle.fit.ldf.weibull@coef[[1]]
theta.ldf.weibull = mle.fit.ldf.weibull@coef[[2]]

For the remaining tests, switch back to the loglogistic distribution:

G = function(x, omega, theta) {
  return(x^omega/(x^omega + theta^omega))
}

Assessment of Residuals

The normalized residuals are \[ \frac{c_{z, x, y} - \hat{\mu}_{z, x, y}}{\sqrt{\sigma^2 \hat{\mu}_{z, x, y}}} \] Note that \(\sigma^2 \mu_{z, x, y}\) is the process variance. To calculate the residuals, we first need to determine the scale parameter \(\sigma^2\) using the \(\chi^2\) formula above.

clark.incremental$predicted_mean = apply(clark.incremental, 1, function(r) mu(z = r['AY'], x = max(r['age'] - 18,0), y = max(r['age'] - 6, 0), omega = omega.ldf.ll, theta = theta.ldf.ll))
clark.incremental[, chi_squared_contribution := (incremental_reported - predicted_mean)^2 / predicted_mean]
n = nrow(clark.incremental)
p = 2 + length(unique(clark.incremental$AY))  # omega, theta, and one ultimate per accident year
sigma.squared.ldf = clark.incremental[, sum(chi_squared_contribution)] / (n - p)
print(paste0("The scale parameter is ", sigma.squared.ldf))
## [1] "The scale parameter is 65029.3306566893"

Next, we calculate and plot the normalized residuals by age:

clark.incremental[, normalized_residual := (incremental_reported - predicted_mean) / sqrt(sigma.squared.ldf*predicted_mean)]
ggplot(data = clark.incremental, aes(x = age, y = normalized_residual)) + geom_point()

If the curve fit is appropriate, then the residuals should be randomly scattered around the horizontal \(y=0\) line. In this case, the age 12 and age 48 residuals appear to be overly high, and the age 24 residuals appear to be too low. However, there is generally no consistent pattern.

A second test involves testing the assumption that the variance-to-mean ratio is constant by plotting the residuals against the expected mean. Again, we are looking for residuals that are randomly scattered around 0:

ggplot(data = clark.incremental, aes(x = predicted_mean, y = normalized_residual)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Similarly, to test for accident year effects, we can plot residuals against accident year:

ggplot(data = clark.incremental, aes(x = AY, y = normalized_residual)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

We can also look for calendar year effects:

clark.incremental[, CY := AY + age/12 - 1]
ggplot(data = clark.incremental, aes(x = CY, y = normalized_residual)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Calculation of Reserves

To calculate the reserves, we collapse the incremental values to the most recent diagonal (i.e. total cumulative reported losses), then use the fitted loglogistic curve to determine the cumulative percent reported. From this, we get the LDF and ultimate loss estimate:

clark.diagonal = clark.incremental[, .(cumulative_reported = sum(incremental_reported), age = max(age)), by = AY]
clark.diagonal[, current_year_midpoint := age - 6]
clark.diagonal[, percent_reported := G(x = current_year_midpoint, omega = omega.ldf.ll, theta = theta.ldf.ll)]
clark.diagonal[, LDF := 1/percent_reported]
clark.diagonal[, ultimate_losses := LDF * cumulative_reported]
clark.diagonal[, reserve := ultimate_losses - cumulative_reported]
datatable(clark.diagonal)

From this, we can calculate the total reserve, its process variance, and its coefficient of variation:

total.ultimate.ldf = clark.diagonal[, sum(ultimate_losses)]
total.reserve.ldf = clark.diagonal[, sum(reserve)]
reserve.variance.ldf = total.reserve.ldf * sigma.squared.ldf
reserve.cv.ldf = sqrt(reserve.variance.ldf) / total.reserve.ldf
print(paste0("The total ultimate is ", total.ultimate.ldf))
## [1] "The total ultimate is 69998593.5854499"
print(paste0("The total reserve is ", total.reserve.ldf))
## [1] "The total reserve is 35640503.5854499"
print(paste0("The coefficient of variation of the reserve is ", reserve.cv.ldf))
## [1] "The coefficient of variation of the reserve is 0.0427152277244997"

Projected emergence in the coming year can be estimated by using differences in the values of the growth function:

clark.diagonal[, future_growth := G(x = current_year_midpoint + 12, omega = omega.ldf.ll, theta = theta.ldf.ll)]
clark.diagonal[, estimated_12_mo_emergence := (future_growth - percent_reported) * ultimate_losses]
datatable(clark.diagonal[, .(AY, age, current_year_midpoint, percent_reported, future_growth, ultimate_losses, estimated_12_mo_emergence)])

A key reason for calculating the 12-month emergence is that this is an estimate that is easily testable within a short time frame.

A concern with this approach is that only 77.24% of the losses are reported by month 120, so there is significant extrapolation into the tail. One way to mitigate this is to truncate the extrapolation at some fixed point, such as month 240. This can be done by calculating the LDF at month 240, and dividing all other LDFs by this value:

LDF.240 = 1 / G(x = 234, omega = omega.ldf.ll, theta = theta.ldf.ll)  # 234 is the midpoint corresponding to a 240-month evaluation
clark.diagonal[, truncated_LDF := LDF / LDF.240]
clark.diagonal[, losses_at_240 := truncated_LDF * cumulative_reported]
clark.diagonal[, reserve_at_240 := losses_at_240 - cumulative_reported]
datatable(clark.diagonal[, .(AY, cumulative_reported, LDF, truncated_LDF, losses_at_240, reserve_at_240)])

As before, we can look at the coefficient of variation of the reserve:

total.ultimate.ldf.trunc = clark.diagonal[, sum(losses_at_240)]
total.reserve.ldf.trunc = clark.diagonal[, sum(reserve_at_240)]
reserve.variance.ldf.trunc = total.reserve.ldf.trunc * sigma.squared.ldf
reserve.cv.ldf.trunc = sqrt(reserve.variance.ldf.trunc) / total.reserve.ldf.trunc
print(paste0("The total ultimate is ", total.ultimate.ldf.trunc))
## [1] "The total ultimate is 63345664.9499024"
print(paste0("The total reserve is ", total.reserve.ldf.trunc))
## [1] "The total reserve is 28987574.9499024"
print(paste0("The coefficient of variation of the reserve is ", reserve.cv.ldf.trunc))
## [1] "The coefficient of variation of the reserve is 0.0473640356855139"

An alternative approach to truncation is to use the lighter-tailed Weibull curve:

G = function(x, omega, theta) {
  return(1 - exp(-(x/theta)^omega))
}

In this case, we have:

clark.diagonal[, percent_reported_weibull := G(x = current_year_midpoint, omega = omega.ldf.weibull, theta = theta.ldf.weibull)]
clark.diagonal[, LDF_weibull := 1 / percent_reported_weibull]
clark.diagonal[, ultimate_weibull := LDF_weibull * cumulative_reported]
clark.diagonal[, reserve_weibull := ultimate_weibull - cumulative_reported]
datatable(clark.diagonal[, .(AY, age, current_year_midpoint, percent_reported_weibull, LDF_weibull, ultimate_weibull, reserve_weibull)])

Cape Cod Approach

In this approach, we assume that there is a relationship between the ultimate loss in the various accident years that is determined by a common expected ultimate loss ratio. The ultimate losses in accident year \(z\) are determined by an exposure base \(P_z\), typically earned premium. In this case, the expected amount to emerge between development age \(x\) and \(y\) is given by \[ \mu_{z; x, y} = P_z L [G(y|\omega, \theta) - G(x|\omega, \theta)] \]

Selecting parameters using maximum likelihood

In this case, to maximize \[ \ell = \sum_{i,t} c_{i,t} \log(\mu_{i,t}) - \mu_{i,t}. \] we require that \[ \frac{\partial \ell}{\partial L} = \sum_{i,t} \frac{c_{i,t}}{L} - P_i [G(x_t|\omega, \theta) - G(x_{t-1}|\omega, \theta)] = 0 \] Solving for \(L\), we obtain \[ L = \frac{\sum_{i,t} c_{i,t}}{\sum_{i,t} P_i[G(x_t|\omega, \theta) - G(x_{t-1}|\omega, \theta)]} \] Here, the telescoping sum in the denominator is equal to \[ \sum_i P_i \times \text{(percent reported for AY } i \text{ in current CY)} \] Returning to the loglogistic distribution, we re-define the calculation of the ultimate:

G = function(x, omega, theta) {
  return(x^omega/(x^omega + theta^omega))
}
ELR = function(omega, theta) {
  numerator = clark.incremental[, sum(incremental_reported)]
  # The denominator is the "used-up" premium: earned premium times the expected percent reported to date
  AY.summary = clark.incremental[, .(EP = mean(EP), age = max(age)), by = AY]
  AY.summary[, reported_premium := EP * G(age - 6, omega, theta)]
  denominator = AY.summary[, sum(reported_premium)]
  return(numerator / denominator)
}
ult = function(z, omega, theta) {
  EP = clark.incremental[AY == z, mean(EP)]
  return(EP * ELR(omega, theta))
}

To apply these functions, we require an additional assumption about the amount of on-level earned premium in each accident year. Clark assumes that premium is equal to $10M in the first accident year, increasing by $400K each year.

clark.incremental[, EP := 10000000 + (AY - 1991) * 400000]
clark.diagonal[, EP := 10000000 + (AY - 1991) * 400000]

Using the parameters selected by Clark, the expected loss ratio is:

ELR(omega = 1.447634, theta = 48.0205)
## [1] 0.5977656

To fit the parameters, we can use maximum likelihood as before. (Note that having re-defined the function for ultimate losses, we need no other changes to the fitting procedure.)

mle.fit.cc = mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 2))
mle.fit.cc
## 
## Call:
## mle(minuslogl = neg.loglikelihood, method = "L-BFGS-B", lower = rep(0, 
##     2))
## 
## Coefficients:
##     omega     theta 
##  1.447628 48.021047

Save these coefficients for later use:

omega.cc = mle.fit.cc@coef[[1]]
theta.cc = mle.fit.cc@coef[[2]]

A special consideration when applying the Cape Cod method is validating the assumption that the expected loss ratio is constant across accident years:

clark.diagonal[, CC_percent_reported := G(x = current_year_midpoint, omega = omega.cc, theta = theta.cc)]
clark.diagonal[, reported_premium := EP * CC_percent_reported]
clark.diagonal[, cc_ultimate_LR := cumulative_reported / reported_premium]
datatable(clark.diagonal[, .(AY, EP, age, current_year_midpoint, CC_percent_reported, reported_premium, cc_ultimate_LR)])

Visualizing the loss ratio over time:

ggplot(data = clark.diagonal, aes(x = AY, y = cc_ultimate_LR)) + geom_point() + geom_line() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Since there is no consistent increasing or decreasing trend, this is a good validation that the expected loss ratio is not changing over time.

Assessment of Residuals

As before, in order to calculate the normalized residuals, we first estimate \(\sigma^2\):

clark.incremental$predicted_mean_cc = apply(clark.incremental, 1, function(r) mu(z = r['AY'], x = max(r['age'] - 18,0), y = max(r['age'] - 6, 0), omega = omega.cc, theta = theta.cc))
clark.incremental[, chi_squared_contribution_cc := (incremental_reported - predicted_mean_cc)^2 / predicted_mean_cc]
n = nrow(clark.incremental)
p = 3  # omega, theta, and the ELR
sigma.squared.cc = clark.incremental[, sum(chi_squared_contribution_cc)] / (n - p)
print(paste0("The scale parameter is ", sigma.squared.cc))
## [1] "The scale parameter is 61576.8281698009"

Next, calculate the normalized residuals. For comparison purposes, I’ll also plot the normalized residuals from the LDF approach:

clark.incremental[, normalized_residual_cc := (incremental_reported - predicted_mean_cc) / sqrt(sigma.squared.cc * predicted_mean_cc)]
ggplot(data = clark.incremental, aes(x = age, y = normalized_residual_cc)) + geom_point(color = "blue") + geom_point(aes(y = normalized_residual), colour = "red")

Plotting residuals against the predicted mean, we check the assumption that the variance is proportional to the mean:

ggplot(data = clark.incremental, aes(x = predicted_mean_cc, y = normalized_residual_cc)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Testing for accident year effects, we find no consistent pattern:

ggplot(data = clark.incremental, aes(x = AY, y = normalized_residual_cc)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Testing for calendar year effects, we find no consistent pattern:

ggplot(data = clark.incremental, aes(x = CY, y = normalized_residual_cc)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'

Calculation of Reserves

Reserves can be obtained using this method, with the same truncation point (240 months) as before. Here, truncation is applied by subtracting the current percent reported from the growth function value at the truncation point, and applying this percentage to the expected losses (earned premium times the ELR):

CC.growth.240 = G(x = 234, omega = omega.cc, theta = theta.cc)
clark.diagonal[, CC_truncated_growth := CC.growth.240 - CC_percent_reported]
clark.diagonal[, CC_expected_losses := EP * ELR(omega = omega.cc, theta = theta.cc)]
clark.diagonal[, CC_truncated_reserves := CC_expected_losses * CC_truncated_growth]
print(paste0("The total estimated reserve value is ", clark.diagonal[, sum(CC_truncated_reserves)]))
## [1] "The total estimated reserve value is 29707734.5762609"

Parameter Variance and Variance of the Reserves

The variance of the parameter estimates can be approximated using the Delta Method. Recall that the information matrix is the matrix of (negative) expected values of the second derivatives of the log-likelihood function. The Cramér-Rao bound states that the covariance matrix, \(\Sigma\), is bounded below by the inverse of the information matrix; since we maximized the quasi-likelihood \(\ell\), which omits the factor \(1/\sigma^2\), this works out to \(\sigma^2\) times the inverse of the information matrix computed from \(\ell\). For the Cape Cod method, the information matrix is a \(3\times 3\) matrix, while it is an \((n+2)\times(n+2)\) matrix for the LDF method, where \(n\) is the number of accident years.

Since the Cape Cod reserve is given by \[ R = \sum_z P_z L [G(y_z) - G(x_z)] \] then the variance of the estimate is \[ (\partial R)^T \Sigma (\partial R) \] where \(\partial R\) is the vector of partial derivatives: \[ \frac{\partial R}{\partial L} = \sum_z P_z[G(y_z) - G(x_z)] \] \[ \frac{\partial R}{\partial \theta} = \sum_z P_zL\left[\frac{\partial G(y_z)}{\partial \theta} - \frac{\partial G(x_z)}{\partial \theta}\right] \] \[ \frac{\partial R}{\partial \omega} = \sum_z P_zL\left[\frac{\partial G(y_z)}{\partial \omega} - \frac{\partial G(x_z)}{\partial \omega}\right] \]

The process variance can be derived from the assumption that actual losses emerge according to an overdispersed Poisson process: \[ \text{Process variance } = \sigma^2 \sum \mu_{z, x, y} = \sigma^2 R \] The overall variance of the reserves will be the sum of the process variance and the parameter variance.

To determine the parameter variance, start with the covariance matrix provided by Clark:

covariance.matrix = matrix(data = c(0.002421, -0.002977, 0.242396, -0.002997, 0.007853, -0.401000, 0.242396, -0.401000, 33.021994), nrow = 3, ncol = 3)
covariance.matrix
##           [,1]      [,2]      [,3]
## [1,]  0.002421 -0.002997  0.242396
## [2,] -0.002977  0.007853 -0.401000
## [3,]  0.242396 -0.401000 33.021994
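
As a rough cross-check (not part of Clark's paper), the block of this matrix corresponding to \(\omega\) and \(\theta\) can be approximated directly from the fitted model object: stats4's mle stores the inverse Hessian of the negative quasi-likelihood, which must be scaled by \(\sigma^2\) to approximate a covariance matrix. Note that this gives only a \(2 \times 2\) block, since the ELR was profiled out of our likelihood rather than fitted as a third parameter.

# Approximate covariance of (omega, theta): sigma^2 times the inverse Hessian of the negative quasi-likelihood
sigma.squared.cc * vcov(mle.fit.cc)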

Application of the Delta method requires R implementations of the derivatives of the loglogistic distribution and the reserves. For the derivatives of the reserves, these functions calculate one term in the sum, rather than the whole sum.

G_omega = function(x, omega, theta) {
  return(log(x / theta) * x^omega * theta^omega / (x^omega + theta^omega)^2)
}
G_theta = function(x, omega, theta) {
  return(-omega * x^omega * theta^omega / (x^omega + theta^omega)^2 / theta)
}
# Each of the following returns one accident year's contribution to the corresponding partial derivative
R_L = function(P, L, y, x, omega, theta) {
  return(P * (G(y, omega, theta) - G(x, omega, theta)))
}
R_theta = function(P, L, y, x, omega, theta) {
  return(P * L * (G_theta(y, omega, theta) - G_theta(x, omega, theta)))
}
R_omega = function(P, L, y, x, omega, theta) {
  return(P * L * (G_omega(y, omega, theta) - G_omega(x, omega, theta)))
}

Calculate the derivatives for each entry in the table:

clark.diagonal$R_L = apply(clark.diagonal, 1, function(r) R_L(P = r['EP'], L = ELR(omega.cc, theta.cc), y = 234, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
clark.diagonal$R_theta = apply(clark.diagonal, 1, function(r) R_theta(P = r['EP'], L = ELR(omega.cc, theta.cc), y = 234, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
clark.diagonal$R_omega = apply(clark.diagonal, 1, function(r) R_omega(P = r['EP'], L = ELR(omega.cc, theta.cc), y = 234, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
clark.diagonal$parameter_variance = apply(clark.diagonal, 1, function(r) r[c('R_L', 'R_omega', 'R_theta')] %*% covariance.matrix %*% r[c('R_L', 'R_omega', 'R_theta')])
clark.diagonal$parameter_stdev = sqrt(clark.diagonal$parameter_variance)
clark.diagonal$process_stdev = sqrt(clark.diagonal$CC_truncated_reserves * sigma.squared.cc)
clark.diagonal$total_stdev = sqrt(clark.diagonal$parameter_stdev^2 + clark.diagonal$process_stdev^2)
datatable(clark.diagonal[,.(AY, cumulative_reported, CC_truncated_reserves, process_stdev, process_CV = process_stdev / CC_truncated_reserves, parameter_stdev, parameter_CV = parameter_stdev / CC_truncated_reserves, total_stdev, CV_total = total_stdev / CC_truncated_reserves)])

For the total reserve, the standard deviations are as follows. Note that to calculate the overall parameter standard deviation, we need to go back to the covariance matrix and use derivatives corresponding to the whole reserve, not just the individual terms in the sum.
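
This is because the same parameter estimation error affects every accident year, so the parameter variances are not, in general, additive across years: \[ \mathrm{Var}\left(\sum_z R_z\right) = \left(\sum_z \partial R_z\right)^T \Sigma \left(\sum_z \partial R_z\right) \neq \sum_z (\partial R_z)^T \Sigma \,(\partial R_z) \]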

total.process.stdev = sqrt(clark.diagonal[, sum(process_stdev^2)])
partial_R = as.matrix(clark.diagonal[, .(sum(R_L), sum(R_omega), sum(R_theta))])[1,]
total.parameter.stdev = sqrt(partial_R %*% covariance.matrix %*% partial_R)
total.stdev = sqrt(total.parameter.stdev^2 + total.process.stdev^2)
print(paste0("The total parameter standard deviation is ", total.parameter.stdev))
## [1] "The total parameter standard deviation is 3145421.50649694"
print(paste0("The total process standard deviation is ", total.process.stdev))
## [1] "The total process standard deviation is 1352519.15598873"
print(paste0("The total combined standard deviation is ", total.stdev))
## [1] "The total combined standard deviation is 3423884.41990234"

Note that the standard deviation associated with parameter estimation is a much larger contributor to the overall uncertainty than the process standard deviation.

As an additional application, in a prospective pricing context we can use the covariance matrix to calculate the variance of the expected losses for pricing purposes: the parameter variance comes from the diagonal entry of the covariance matrix corresponding to the ELR, and the process variance is calculated as usual. The example in Clark assumes a prospective premium of $14M:

prospective.losses = 14000000 * ELR(omega = omega.cc, theta = theta.cc)
prospective.process.variance = prospective.losses * sigma.squared.cc
prospective.parameter.variance = covariance.matrix[1,1] * prospective.losses^2
prospective.loss.variance = prospective.process.variance + prospective.parameter.variance
print(paste0("The prospective losses are ", prospective.losses))
## [1] "The prospective losses are 8368774.24553926"
print(paste0("The parameter CV is ", sqrt(prospective.parameter.variance) / prospective.losses))
## [1] "The parameter CV is 0.0492036584005702"
print(paste0("The process CV is ", sqrt(prospective.process.variance) / prospective.losses))
## [1] "The process CV is 0.0857783584822893"
print(paste0("The coefficient of variation of prospective losses is ", sqrt(prospective.loss.variance) / prospective.losses))
## [1] "The coefficient of variation of prospective losses is 0.0988884562722876"

Note: although the intermediate answers above match the numbers in Clark’s paper, the final CV does not. A typical use case for a result like this is as an additional piece of information to compare to numbers produced by other prospective pricing approaches.

Additional applications of this approach include:
  • Calculation of the parameter variance for estimates of emergence in the next twelve months, by changing the value of \(y\) in the derivatives calculated above so that \(y\) is twelve months later than \(x\) (see the sketch after this list). An advantage of this approach is that the variance estimates can be tested against actual emergence relatively quickly.
  • Estimation of the variability of discounted reserves, by incorporating the discount factor into the calculation of \(R\) prior to taking its derivatives. Generally, this results in a significant reduction in the coefficient of variation, since the tail has the greatest parameter variance but also receives the largest discount.
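
A minimal sketch of the first idea, reusing the derivative functions defined above with the same Cape Cod parameters; the R_L_12, R_omega_12, R_theta_12, and parameter.stdev.12mo names are new and purely illustrative:

# Delta method applied to expected emergence over the next 12 months: same calculation, but y is 12 months after x
clark.diagonal$R_L_12 = apply(clark.diagonal, 1, function(r) R_L(P = r['EP'], L = ELR(omega.cc, theta.cc), y = r['current_year_midpoint'] + 12, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
clark.diagonal$R_theta_12 = apply(clark.diagonal, 1, function(r) R_theta(P = r['EP'], L = ELR(omega.cc, theta.cc), y = r['current_year_midpoint'] + 12, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
clark.diagonal$R_omega_12 = apply(clark.diagonal, 1, function(r) R_omega(P = r['EP'], L = ELR(omega.cc, theta.cc), y = r['current_year_midpoint'] + 12, x = r['current_year_midpoint'], omega = omega.cc, theta = theta.cc))
partial_R_12 = as.matrix(clark.diagonal[, .(sum(R_L_12), sum(R_omega_12), sum(R_theta_12))])[1,]
parameter.stdev.12mo = sqrt(partial_R_12 %*% covariance.matrix %*% partial_R_12)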

Commentary

Assumptions Underlying the Model

  • Expected emergence is strictly increasing.
  • Actual emergence in any period has a constant ratio of variance to mean. Theoretically, we should also fit \(\sigma^2\) using maximum likelihood, but this becomes mathematically intractable. (Assuming \(\sigma^2\) is fixed means we are actually using a quasi-likelihood estimator.)
  • For the Cape Cod approach: the expected loss ratio is constant over time.
  • Losses are independent and identically distributed.
    • The independence assumption means that emergence in one period does not affect surrounding periods. This can be tested through residual analysis.
    • Identical distribution means that the emergence pattern is the same across all accident years.

Advantages of the approach

  • Only two parameters are needed to fit Weibull or Loglogistic curves to determine expected emergence.
    • This means that there are a total of 3 parameters in the Cape Cod approach, and \(n+2\) parameters in the LDF approach, where \(n\) is the number of accident years
    • The Cape Cod approach is often preferred over the LDF approach to avoid the risk of overparameterization.
    • The source of the benefit of the Cape Cod method lies in the additional information that is provided by the exposure base.
  • There is no need for the evaluation dates to be evenly spaced, e.g. if the most recent evaluation is at 9 months.
  • The expected emergence pattern is a smooth curve, rather than an erratic curve that follows historical randomness.
  • Use of an overdispersed Poisson distribution has several advantages:
    • It allows us to match the first two moments of any distribution.
    • Maximum likelihood estimate of the overdispersed Poisson model produces the LDF and Cape Cod estimates of ultimate losses.
    • It allows for a point mass at zero, which can occur when there is no change in development in a given year.
  • There is no requirement to have a complete triangle. The method can be applied to the most recent diagonals, for example.

Disadvantages of the approach

  • Not suited to lines of business with negative expected development, such as lines with significant salvage or subrogation potential.
  • Independence assumption means that the method will not handle inflation well.
  • Assumption of identical development patterns across accident years will not handle changes in the mix of business or changes in claim handling practices well.
  • Overall variance on the reserves may be understated because we have not considered the variance in \(\sigma^2\).
  • The variance estimates are not exact: they are an approximation to a lower bound on the variance. This also means that there is more potential for variability than indicated by the models.