Introduction

This notebook illustrates the concepts in the Exam 7 readings Credible Claims Reserves: The Benktander Method by Thomas Mack and Credible Loss Ratio Claims Reserves: The Benktander, Neuhaus, and Mack Methods Revisited by Werner Hürlimann. The main difference between these papers is that Mack focuses on estimating ultimate claims for a single accident year, while Hürlimann applies the methods to the whole development triangle. These concepts are illustrated by reproducing the examples in the papers. The following packages are used in this notebook:

easypackages::packages("data.table", "DT", "ggplot2")

Definitions

Single Accident Year

Mack uses the following notation:
  • \(p_k\) is the percentage of ultimate claims expected to be paid after \(k\) years of development. Assume that \(p_1>0\) and that all claims are paid after \(n\) years, so \(p_n=1\). Note that these are reciprocals of age-to-ultimate factors.
  • \(q_k = 1 - p_k\)
  • \(U_0\) is the expected ultimate claims amount, prior to taking into account any information about actual claims emergence. This may be expressed either as dollar-value losses or as a loss ratio.
  • \(C_k\) is the amount of claims paid up to year \(k\)
  • True ultimate claims and reserves are denoted by \(U\) and \(R\), respectively. Estimators of these quantities are denoted using subscripts that identify the method used to determine the estimate. Note that these are related by \(U = C_k + R\).

The numerical example in Mack’s paper uses the following parameters:

k = 3       # development year of evaluation
p_3 = 0.5   # expected percentage of ultimate claims paid after 3 years
U_0 = 0.9   # prior estimate of ultimate claims, expressed as a loss ratio
C_3 = 0.55  # claims paid to date
The three methods have ultimate and reserve estimates given in the following table:
Method | Reserve Estimate | Ultimate Estimate
--- | --- | ---
Bornhuetter-Ferguson (BF) | \(R_{BF} = q_k U_0\) | \(U_{BF} = q_kU_0 + C_k\)
Chain Ladder (CL) | \(R_{CL} = C_k / p_k - C_k\) | \(U_{CL} = C_k / p_k\)
Gunnar Benktander (GB) | \(R_{GB} = q_k U_{BF}\) | \(U_{GB} = C_k + R_{GB}\)
The Benktander method is a special case of the following procedure:
  1. Replace the ultimate loss estimate with a credibility-weighted average of the Chain Ladder ultimate and the original ultimate loss estimate: \(U_c = cU_{CL} + (1-c) U_0\)
  2. Obtain the corresponding reserve at time \(k\) by using the revised ultimate in the Bornhuetter-Ferguson method: \(R_c = q_k U_c\)
  3. Add the actual claims to obtain the new ultimate claims estimate: \(C_k + R_c = C_k + q_kU_c\).

Note that the Bornhuetter-Ferguson and Chain Ladder methods are special cases of this approach, with \(c = 0\) and \(c = 1\), respectively. Benktander applied the above procedure with \(c = p_k\), so the prior ultimate in this case becomes \[ U_{p_k} = p_k U_{CL} + q_k U_0 = U_{BF} \] and the reserve becomes \[ R_{GB} = q_k U_{BF} \] This is added to the actual claims to obtain the new ultimate claims estimate: \[ U_{GB} = R_{GB} + C_k = q_k^2 U_0 + (1 + q_k)C_k = q_k^2 U_0 + (1 - q_k^2) U_{CL} \]
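As a quick illustration, the three steps can be wrapped in a single function of \(c\). This is a minimal sketch using the parameters defined above; the function name is arbitrary.

credibility.estimates = function(c, p, U_0, C) {
  # Step 1: credibility-weighted average of the Chain Ladder and prior ultimates
  U_c = c * (C / p) + (1 - c) * U_0
  # Step 2: Bornhuetter-Ferguson reserve based on the revised ultimate
  R_c = (1 - p) * U_c
  # Step 3: add actual paid claims to obtain the revised ultimate
  list(reserve = R_c, ultimate = C + R_c)
}
credibility.estimates(c = p_3, p = p_3, U_0 = U_0, C = C_3)  # Benktander: c = p_k

Setting c = 0 or c = 1 instead reproduces the Bornhuetter-Ferguson and Chain Ladder estimates computed below.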

An alternative approach is to take a credibility-weighted average of the Chain Ladder and Bornhuetter-Ferguson reserves: \[ R_c = c R_{CL} + (1-c) R_{BF} \] When we select \(c = p_k\), this becomes \[ R_{p_k} = p_k (C_k / p_k - C_k) + q_k^2 U_0 = q_k(C_k + q_kU_0) = q_k U_{BF} \] The reserve and ultimate estimates for each of the three methods, using the values from Mack’s example, are:

R_BF = (1 - p_3)*U_0
U_BF = C_3 + R_BF
U_CL = C_3 / p_3
R_CL = U_CL - C_3
R_GB = (1 - p_3) * U_BF
U_GB = R_GB + C_3
print(paste0("The Bornhuetter-Ferguson reserve is ", R_BF))
## [1] "The Bornhuetter-Ferguson reserve is 0.45"
print(paste0("The Bornhuetter-Ferguson ultimate is ", U_BF))
## [1] "The Bornhuetter-Ferguson ultimate is 1"
print(paste0("The Chain Ladder reserve is ", R_CL))
## [1] "The Chain Ladder reserve is 0.55"
print(paste0("The Chain Ladder ultimate is ", U_CL))
## [1] "The Chain Ladder ultimate is 1.1"
print(paste0("The Benktander reserve is ", R_GB))
## [1] "The Benktander reserve is 0.5"
print(paste0("The Benktander ultimate is ", U_GB))
## [1] "The Benktander ultimate is 1.05"

Multiple Accident Years

Hürlimann illustrates the ideas in his paper using the following loss triangle (loaded in tabular form for ease of processing):

hurlimann.table = fread("./hurlimann_1.csv")
datatable(hurlimann.table)

Hürlimann uses \(S_{i,k}\) to denote the incremental paid claims for origin period \(i\) during development period \(k\), corresponding to calendar period \(i+k-1\), which is “incremental_paid_claims” in the above table. Then the row sum \[ C_{i,k} = \sum_{1 \leq j \leq k} S_{i,j} \] is the cumulative paid claim amount for origin period \(i\) as of development period \(k\). Assuming \(n\) years appear in the development triangle, the most recent known cumulative amount for origin period \(i\) is \(C_{i, n-i+1}\). \(V_i\) denotes the exposure measure (typically premium) for origin period \(i\). The quantity \[ m_k = \frac{E\left[\sum_{1\leq i \leq n-k+1} S_{i,k}\right]}{\sum_{1\leq i \leq n-k+1} V_i} \] is the expected incremental loss ratio for development period \(k\), which is estimated by dividing the sum of the known values in each column by the sum of the corresponding premiums. This can be calculated from the table above as follows:

hurlimann.parameters = hurlimann.table[, .(m_k = sum(incremental_paid_claims) / sum(exposure)), by = "development_period"]
datatable(hurlimann.parameters)

Summing \(m_k\) across all development periods gives the overall loss ratio, and multiplying by the premium in each accident year gives the expected value of the burning cost, \(U_i^{BC}\) in each accident year: \[ U_i^{BC} = V_i \sum_{1 \leq k \leq n} m_k \]

m = hurlimann.parameters[, sum(m_k)]
burning.cost = hurlimann.table[, .(exposure = max(exposure), expected_burning_cost = max(exposure) * m, reported_claims = sum(incremental_paid_claims)), by = "origin_period"]
datatable(burning.cost)

By truncating the sum at the most recently known incremental payment, we can calculate loss ratio payout factors for each accident year: \[ p_i = \frac{\sum_{1\leq k \leq n-i+1} m_k}{\sum_{1\leq k \leq n} m_k} \] This quantity, together with the expected percentage unpaid \(q_i = 1 - p_i\), can be calculated as follows:

table.with.LR = merge(hurlimann.table, hurlimann.parameters, by = "development_period")
payout.factors = table.with.LR[, .(expected_burning_cost = max(exposure) * m, cumulative_paid_claims = sum(incremental_paid_claims), p_i = sum(m_k) / m), by = "origin_period"]
payout.factors[, q_i := 1 - p_i]
datatable(payout.factors)

The individual loss ratio ultimate is obtained by dividing the cumulative claims by the percentage paid to date: \[ U_i^{ind} = \frac{C_{i, n-i+1}}{p_i} \] This estimate plays a role analogous to the Chain Ladder estimate, though it is not identical: the payment percentage is the ratio of the expected cumulative loss ratio at the current development period to the overall expected loss ratio, rather than a product of age-to-age factors. The ultimate and the corresponding individual loss ratio claims reserve \(R_i^{ind} = U_i^{ind} - C_{i, n-i+1}\) can be calculated as follows:

payout.factors[, individual_ultimate := cumulative_paid_claims / p_i]
payout.factors[, individual_reserve := individual_ultimate - cumulative_paid_claims]
datatable(payout.factors[, .(origin_period, cumulative_paid_claims, individual_ultimate, individual_reserve)])

On the other hand, by applying the unpaid percentage to the expected ultimate burning cost, we obtain the collective loss ratio claims reserve, \[ R_i^{coll} = q_i U_i^{BC} \] and the corresponding ultimate, \[ U_i^{coll} = R_i^{coll} + C_{i, n-i+1} \] This is the loss ratio analogue of the Bornhuetter-Ferguson approach. Applied to the data above:

payout.factors[, collective_reserve := q_i * expected_burning_cost]
payout.factors[, collective_ultimate := collective_reserve + cumulative_paid_claims]
datatable(payout.factors[, .(origin_period, collective_reserve, collective_ultimate)])

More generally, Hürlimann considers a credibility-weighted average of these two approaches: \[ R_i^c = Z_i R_i^{ind} + (1 - Z_i) R_i^{coll} \] The Benktander approach involves using \(Z_i = p_i\) as the credibility weight: \[ R_i^{GB} = p_i R_i^{ind} + q_i R_i^{coll} \] Using the data above:

payout.factors[, GB_reserve := p_i * individual_reserve + q_i * collective_reserve]
payout.factors[, GB_ultimate := GB_reserve + cumulative_paid_claims]
datatable(payout.factors[, .(origin_period, GB_reserve, GB_ultimate)])

The Neuhaus credibility weights are obtained by multiplying the paid claim percentage by the overall loss ratio: \[ Z_i^{WN} = p_i m \] Using the above data:

payout.factors[, neuhaus_reserve := p_i * m * individual_reserve + (1 - p_i * m) * collective_reserve]
payout.factors[, neuhaus_ultimate := neuhaus_reserve + cumulative_paid_claims]
datatable(payout.factors[, .(origin_period, neuhaus_reserve, neuhaus_ultimate)])

Iterated Bornhuetter-Ferguson Method

Single Accident Year

The Benktander approach can be viewed as a special case of the following iterated Bornhuetter-Ferguson technique:
  1. Start with \(U^{(0)} = U_0\) as the initial ultimate claims estimate.
  2. Calculate \(R^{(m)} = q_k U^{(m)}\).
  3. Calculate \(U^{(m+1)} = C_k + R^{(m)}\)
  4. Repeat steps 2 and 3 until the desired number of iterations has been performed

Note that the Benktander reserve is \(R^{(1)}\) and the Benktander ultimate is \(U^{(2)}\).

The following performs 19 iterations of the above algorithm, using the parameters from Mack’s example. Note that array indices in R start at 1, rather than 0, so indices in the code are 1 larger than they are in Mack’s paper. (I have adjusted output indices to match Mack’s convention.)

R = rep(0, 20)
R[20] = NA  # R^(19) is never computed by the loop below
U = rep(0, 20)
U[1] = U_0  # U^(0) is the prior estimate
for (i in 1:19) {
  R[i] = (1 - p_3) * U[i]  # R^(m) = q_k U^(m)
  U[i+1] = C_3 + R[i]      # U^(m+1) = C_k + R^(m)
}
iterated.BF.results = data.frame(m = 0:19, R, U)
iterated.BF.results
##     m         R        U
## 1   0 0.4500000 0.900000
## 2   1 0.5000000 1.000000
## 3   2 0.5250000 1.050000
## 4   3 0.5375000 1.075000
## 5   4 0.5437500 1.087500
## 6   5 0.5468750 1.093750
## 7   6 0.5484375 1.096875
## 8   7 0.5492188 1.098438
## 9   8 0.5496094 1.099219
## 10  9 0.5498047 1.099609
## 11 10 0.5499023 1.099805
## 12 11 0.5499512 1.099902
## 13 12 0.5499756 1.099951
## 14 13 0.5499878 1.099976
## 15 14 0.5499939 1.099988
## 16 15 0.5499969 1.099994
## 17 16 0.5499985 1.099997
## 18 17 0.5499992 1.099998
## 19 18 0.5499996 1.099999
## 20 19        NA 1.100000

Notice that the results converge to the Chain Ladder results. This is due to the fact that \[ U^{(m)} = (1 - q_k^m) U_{CL} + q_k^m U_0 \] and \[ R^{(m)} = (1 - q_k^m) R_{CL} + q_k^m R_{BF} \] since \(q_k^m \rightarrow 0\) as \(m \rightarrow \infty\). We can also perform calculations using these formulas directly:

ultimate.mixture = function(chain.ladder.ultimate, budgeted.loss.ultimate, p, m) {
  return((1 - (1-p)^m) * chain.ladder.ultimate + (1-p)^m*budgeted.loss.ultimate)
}
reserve.mixture = function(chain.ladder.reserve, BF.reserve, p, m) {
  return((1 - (1-p)^m) * chain.ladder.reserve + (1-p)^m*BF.reserve)
}
mixture.results = data.frame(m = 0:19)
mixture.results$R = apply(mixture.results, 1, function(y) reserve.mixture(chain.ladder.reserve = R_CL, BF.reserve = R_BF, p = p_3, m = y['m']))
mixture.results$U = apply(mixture.results, 1, function(y) ultimate.mixture(chain.ladder.ultimate = U_CL, budgeted.loss.ultimate = U_0, p = p_3, m = y['m']))
mixture.results
##     m         R        U
## 1   0 0.4500000 0.900000
## 2   1 0.5000000 1.000000
## 3   2 0.5250000 1.050000
## 4   3 0.5375000 1.075000
## 5   4 0.5437500 1.087500
## 6   5 0.5468750 1.093750
## 7   6 0.5484375 1.096875
## 8   7 0.5492188 1.098438
## 9   8 0.5496094 1.099219
## 10  9 0.5498047 1.099609
## 11 10 0.5499023 1.099805
## 12 11 0.5499512 1.099902
## 13 12 0.5499756 1.099951
## 14 13 0.5499878 1.099976
## 15 14 0.5499939 1.099988
## 16 15 0.5499969 1.099994
## 17 16 0.5499985 1.099997
## 18 17 0.5499992 1.099998
## 19 18 0.5499996 1.099999
## 20 19 0.5499998 1.100000

Multiple Accident Years

Analogous to the iterated Bornhuetter-Ferguson method, there is an iterated collective loss ratio method: given a starting point \(U_i^{(0)}\), define \[ R_i^{(m)} = q_i U_i^{(m)} \] and \[ U_i^{(m+1)} = R_i^{(m)} + C_{i, n-i+1} \] If the starting point is the burning cost estimate \(U_i^{BC}\), then the first iteration gives the collective loss ratio reserve \(R_i^{(0)} = R_i^{coll}\), and the second gives the Benktander reserve \(R_i^{(1)} = R_i^{GB}\). More generally, the mixture is given by \[ U_i^{(m)} = (1 - q_i^m) U_i^{ind} + q_i^m U_i^{(0)} \] and \[ R_i^{(m)} = (1 - q_i^m) R_i^{ind} + q_i^m R_i^{(0)} \] so, as \(m\rightarrow \infty\), this method converges to the individual loss ratio method.
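As a minimal sketch (reusing the payout.factors table built earlier), two iterations starting from the burning cost reproduce the Benktander figures:

iterated.collective = copy(payout.factors)
iterated.collective[, U_m := expected_burning_cost]  # U^(0) = U^BC
for (m in 1:2) {
  iterated.collective[, R_m := q_i * U_m]                     # R^(m-1)
  iterated.collective[, U_m := cumulative_paid_claims + R_m]  # U^(m)
}
# R_m now holds R^(1) and U_m holds U^(2), matching GB_reserve and GB_ultimate
datatable(iterated.collective[, .(origin_period, R_m, U_m, GB_reserve, GB_ultimate)])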

Mean Squared Error and Optimal Credibility

Single Accident Year

Additional Assumptions

An optimal credibility value \(c^*\) can be determined by minimizing the mean squared error: \[ \mathrm{mse}(R_c) = E[(R_c - R)^2] \] In this framework, we view \(U_0\) as an unbiased estimator of \(U\) that is independent of \(C_k\), \(R\), and \(U\). In order to calculate the mean squared error and determine the optimal value of \(c\), additional assumptions are needed:
  1. \(E[C_k/U | U] = p_k\)
  2. \(\mathrm{Var}(C_k / U|U) = p_k q_k \beta^2(U)\) for some function \(\beta^2\) that depends only on \(U\). For notational convenience, we will write \(\alpha^2(U) = U^2 \beta^2(U)\).
Consequences of these assumptions include:
  1. \(E[C_k|U] = p_kU\) is a straightforward rearrangement of assumption 1, and taking the expected value over \(U\) of both sides gives \(E[C_k] = p_kE[U]\).
  2. From the above, we can deduce that \(E[R] = q_k E[U] = E[R_{BF}] = E[R_{CL}]\).
  3. \(\mathrm{Var}(C_k|U) = p_k q_k \alpha^2(U)\) is due to the definition of \(\alpha\).
  4. \(\mathrm{Var}(C_k) = p_kq_k E(\alpha^2(U)) + p_k^2 \mathrm{Var}(U)\) is due to the law of total variance, and observations 1 and 3 above.
  5. Covariance between actual claims emergence and ultimate follows from the law of total covariance: \(\mathrm{Cov}(C_k,U) = \mathrm{Cov}(E[C_k|U], U) = p_k \mathrm{Var}(U)\).
  6. From observation 5 and the definition of \(R\), we obtain \(\mathrm{Cov}(C_k,R) = \mathrm{Cov}(C_k, U) - \mathrm{Var}(C_k) = p_kq_k (\mathrm{Var}(U) - E[\alpha^2(U)])\).

Combining the above, we can arrive at a formula for the variance of the true reserve: \[ \mathrm{Var}(R) = \mathrm{Var}(U) - 2 \mathrm{Cov}(C_k, U) + \mathrm{Var}(C_k) = (1-p_k)^2 \mathrm{Var}(U) + p_kq_k E[\alpha^2(U)] \] which can be regrouped as \[ \mathrm{Var}(R) = q_k E[\alpha^2(U)] + q_k^2(\mathrm{Var}(U) - E[\alpha^2(U)]) \]

Calculation of MSE and optimal credibility

Formulas for the MSE of \(R_c\) can then be obtained through the following process:
  1. Calculate \(\mathrm{mse}(R_{BF})\). Since \(R_{BF}\) is an unbiased estimator of \(R\), \(\mathrm{mse}(R_{BF}) = \mathrm{Var}(R_{BF} - R)\), and the observations above can be used to calculate this expression.
  2. Calculate \(\mathrm{mse}(R_{CL})\), using the same method as above.
  3. Express \(R_c = cR_{CL} + (1-c) R_{BF}\). The resulting MSE combines the MSEs of \(R_{CL}\) and \(R_{BF}\), weighted by \(c^2\) and \((1-c)^2\) respectively, along with a cross term \(2c(1-c)E[(R_{CL} - R)(R_{BF} - R)]\).

Introducing the notation \[ t = \frac{E[\alpha^2(U)]}{\mathrm{Var}(U_0) + \mathrm{Var}(U) - E[\alpha^2(U)]} \] the result of the above process simplifies to \[ \mathrm{mse}(R_c) = E[\alpha^2(U)] \left( \frac{c^2}{p_k} + \frac{1}{q_k} + \frac{(1-c)^2}{t}\right) q_k^2 \] The MSE for the BF and CL methods can be obtained by either setting \(c = 0\) or \(c = 1\) in the above formula, or from the result of steps 1 and 2 above. The resulting formulas are \[ \mathrm{mse}(R_{BF}) = E[\alpha^2(U)]q_k(1 + q_k/t) \] and \[ \mathrm{mse}(R_{CL}) = E[\alpha^2(U)] q_k/p_k \]
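As an illustration of step 1: since \(R_{BF} = q_kU_0\) and \(U_0\) is independent of \(R\), the cross term vanishes and \[ \mathrm{mse}(R_{BF}) = \mathrm{Var}(R_{BF}) + \mathrm{Var}(R) = q_k^2\mathrm{Var}(U_0) + q_k E[\alpha^2(U)] + q_k^2(\mathrm{Var}(U) - E[\alpha^2(U)]) \] Collecting the terms multiplied by \(q_k^2\) into \(q_k^2 E[\alpha^2(U)]/t\) recovers the formula for \(\mathrm{mse}(R_{BF})\) above.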

Determining \(c^*\) is now just a matter of differentiating the formula for \(\mathrm{mse}(R_c)\) with respect to \(c\) and setting the derivative equal to 0: \[ \frac{2c}{p_k} - \frac{2(1-c)}{t} = 0 \] Solving for \(c\), we obtain: \[ c^* = \frac{p_k}{p_k + t} \]

Continuing the numerical example, in order to determine the MSE, we must first estimate \(\mathrm{Var}(U_0)\), \(\mathrm{Var}(U)\), and \(E[\mathrm{Var}(C_k|U)]\). The latter allows us to determine \(E[\alpha^2(U)]\), since \[ \mathrm{Var}(C_k|U) = p_k q_k U^2 \beta^2(U) = p_k q_k \alpha^2(U) \] Taking the expected value of both sides of the equation, we obtain \[ E[\alpha^2(U)] = \frac{E[\mathrm{Var}(C_k|U)]}{p_k q_k} \] Alternatively, if \(\beta^2(U)\) does not depend on \(U\), then \[ E[\alpha^2(U)] = E[U^2] \beta^2 = E[U^2] \frac{\mathrm{Var}(C_k/U|U)}{p_kq_k} \]

A selection for \(\mathrm{Var}(U)\) may be made by assuming a range of reasonable values along with a distributional assumption, and determining the variance corresponding to that range. Mack makes the following assumption:

var_U = 0.35^2

An approach to determining \(\mathrm{Var}(C_k/U|U)\) in the case of constant \(\beta\) is to assume, based on the results of other accident years, a 95% confidence interval, and to divide the width of the interval by 4 to obtain the standard deviation. (This is referred to as the two-sigma approach.) Assuming that \(C_k/U\) has always been between 0.3 and 0.7, we can proceed as follows:

var_C_k_over_U = ((0.7 - 0.3)/4)^2
beta_squared = var_C_k_over_U / (p_3 *(1-p_3))
E_U_squared = var_U + U_0^2
E_alpha_squared = E_U_squared * beta_squared

An estimate of \(\mathrm{Var}(U_0)\) can be obtained from a similar two-sigma approach. The assumption used by Mack is

var_U_0 = 0.15^2

From the above, we can calculate \(t\):

t = E_alpha_squared / (var_U + var_U_0 - E_alpha_squared)
print(paste0("The value of t is ", t))
## [1] "The value of t is 0.346332404828227"

We can now calculate the standard error of each of the four methods, including the optimum credibility method.

MSE = function(p, t, E_alpha_squared, c) {
  return(E_alpha_squared * (1-p)^2 *(c^2/p + 1/(1-p) + (1-c)^2/t))
}
MSE.CL = MSE(p = p_3, t = t, E_alpha_squared = E_alpha_squared, c = 1)
SE.CL = sqrt(MSE.CL)
MSE.BF = MSE(p = p_3, t = t, E_alpha_squared = E_alpha_squared, c = 0)
SE.BF = sqrt(MSE.BF)
MSE.GB = MSE(p = p_3, t = t, E_alpha_squared = E_alpha_squared, c = p_3)
SE.GB = sqrt(MSE.GB)
MSE.opt = MSE(p = p_3, t = t, E_alpha_squared = E_alpha_squared, c = p_3/(p_3+t))
SE.opt = sqrt(MSE.opt)
print(paste0("The standard error for the CL method is ", SE.CL))
## [1] "The standard error for the CL method is 0.19313207915828"
print(paste0("The standard error for the BF method is ", SE.BF))
## [1] "The standard error for the BF method is 0.213483020402092"
print(paste0("The standard error for the Benktander method is ", SE.GB))
## [1] "The standard error for the Benktander method is 0.17333133011663"
print(paste0("The standard error for the optimum credibility method is ", SE.opt))
## [1] "The standard error for the optimum credibility method is 0.172244388753129"

Ranges of best performance

By comparing the formulas for MSE for the various methods, we can identify ranges of values for \(p_k\) and \(t\) over which each method outperforms the others. Comparing the Bornhuetter-Ferguson MSE to the Chain Ladder MSE, after simplification we see that \[ \mathrm{mse}(R_{BF}) < \mathrm{mse}(R_{CL}) \] if and only if \[ p_k < t \] This suggests that the Bornhuetter-Ferguson approach is better than the Chain Ladder method for early development years, when \(p_k\) is small. For the Benktander method, applying the general formula with \(c = p_k\) gives \[ \mathrm{mse}(R_{GB}) = E[\alpha^2(U)] q_k\left(p_kq_k + 1 + q_k^3/t\right) \]

The Benktander method outperforms the Chain Ladder method when \[ \mathrm{mse}(R_{GB}) < \mathrm{mse}(R_{CL}) \] which (after simplification) occurs if and only if \[ t > \frac{p_kq_k}{1+p_k} \]

The Benktander method outperforms the Bornhuetter-Ferguson method when \[ \mathrm{mse}(R_{GB}) < \mathrm{mse}(R_{BF}) \] which occurs if and only if \[ t < 2-p_k \]
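For the parameters in Mack’s example, both inequalities hold, consistent with the standard errors computed earlier:

# Both comparisons return TRUE for Mack's example parameters
t > p_3 * (1 - p_3) / (1 + p_3)  # Benktander beats Chain Ladder
t < 2 - p_3                      # Benktander beats Bornhuetter-Ferguson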

The ranges of parameters over which each method is the best-performing can be visualized in two ways. First, we can plot the boundaries of the regions defined by the two inequalities given above, as done by Mack. Second, we can explicitly calculate the MSE of each method over a range of parameters, select the lowest, and create a raster plot colour-coded by the best-performing method.

CL.boundary = function(p) {return(p * (1-p) / (1+p))}
BF.boundary = function(p) {return(2 - p)}
# The MSE formulas below are scaled by the common factor E[alpha^2(U)] * q_k,
# which does not affect the comparison between methods
CL.MSE.scaled = function(p, t) {return(1/p)}
BF.MSE.scaled = function(p, t) {return(1 + (1-p)/t)}
GB.MSE.scaled = function(p, t) {return(p*(1-p) + 1 + (1-p)^3 / t)}
p_range = 0.005 * 1:199
t_range = 0.01 * 1:200
MSE.test = as.data.table(merge(p_range, t_range))
colnames(MSE.test) = c("p", "t")
MSE.test$CL.boundary = apply(MSE.test, 1, function(y) CL.boundary(p = y['p']))
MSE.test$BF.boundary = apply(MSE.test, 1, function(y) BF.boundary(p = y['p']))
MSE.test$CL.MSE = apply(MSE.test, 1, function(y) CL.MSE.scaled(p = y['p'], t = y['t']))
MSE.test$BF.MSE = apply(MSE.test, 1, function(y) BF.MSE.scaled(p = y['p'], t = y['t']))
MSE.test$GB.MSE = apply(MSE.test, 1, function(y) GB.MSE.scaled(p = y['p'], t = y['t']))
MSE.test[, lowest_MSE := pmin(CL.MSE, BF.MSE, GB.MSE)]
MSE.test[, best_method := ifelse(GB.MSE == lowest_MSE, "Benktander",
                                 ifelse(BF.MSE == lowest_MSE, "Bornheutter-Ferguson",
                                 "Chain Ladder"))]
ggplot(data = MSE.test, aes(x = p, y = t, fill = best_method)) +
  geom_raster() +
  geom_line(aes(x = p, y = CL.boundary)) +
  geom_line(aes(x = p, y = BF.boundary)) +
  labs(title = "Lowest MSE method by payment percentage and volatility")

Multiple Accident Years

The formulas for MSE and the optimal credibility factor also apply to the multiple accident year approach, with each accident year getting its own optimal credibility value, and with \(U_i^{BC}\) replacing \(U_0\) in the formulas. In addition, Hürlimann shows that under the assumption that there exists a factor \(f_i\) such that \[ \mathrm{Var}(U_i) = f_i \mathrm{Var}(U_i^{BC}) \] the minimum-variance optimal credibility is obtained when \[ t_i = \frac{f_i -1 + \sqrt{(f_i + 1)(f_i -1 + 2p_i)}}{2} \] In particular, when \(f_i=1\), \[ t_i = \sqrt{p_i} \] Applying this method to the earlier example:

payout.factors[, t_i := sqrt(p_i)]
payout.factors[, Z_optimal := p_i / (p_i + t_i)]
payout.factors[, optimal_reserve := Z_optimal * individual_reserve + (1 - Z_optimal) * collective_reserve]
payout.factors[, optimal_ultimate := optimal_reserve + cumulative_paid_claims]
datatable(payout.factors[, .(origin_period, t_i, Z_optimal, optimal_reserve, optimal_ultimate)])
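For completeness, the general formula for \(t_i\) can be wrapped in a helper function. This is a sketch in which the factor \(f_i\) is an input the analyst must supply (the function name is arbitrary); \(f_i = 1\) recovers \(t_i = \sqrt{p_i}\) as above.

optimal_t = function(p, f) {
  # Hürlimann's minimum-variance credibility parameter under Var(U_i) = f_i * Var(U_i^BC)
  (f - 1 + sqrt((f + 1) * (f - 1 + 2 * p))) / 2
}
optimal_t(p = payout.factors$p_i, f = 1)  # equals sqrt(p_i), matching t_i above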

Comparisons to Other Methods

Bayesian Model

A Bayesian approach involves making a distributional assumption about \(U\) and \(C_k|U\), and using this to determine \(U|C_k\). Assume that \[ U \sim \mathrm{Lognormal}(\mu, \sigma^2) \] and \[ C_k | U \sim \mathrm{Lognormal}(\nu, \tau^2), \] with \(\nu\) and \(\tau\) chosen so that \(E[C_k|U] = p_kU\) and \(\mathrm{Var}(C_k|U) = p_kq_k\beta^2 U^2\), consistent with the earlier assumptions. Then \[ U | C_k \sim \mathrm{Lognormal}(\mu_1, \sigma_1^2) \] with
  • \(z = \sigma^2 / (\sigma^2 + \tau^2)\)
  • \(\mu_1 = z(\tau^2/2 + \log(C_k / p_k)) + (1-z)\mu\)
  • \(\sigma_1^2 = z\tau^2\)

Using the numerical example from before, we can calculate the parameters for this model:

sigma = sqrt(log(1 + var_U / U_0^2))  # prior parameters chosen so that E[U] = U_0 and Var(U) = var_U
mu = log(U_0) - sigma^2/2
tau = sqrt(log(1 + beta_squared * (1 - p_3) / p_3))  # from the squared coefficient of variation of C_k | U

The posterior parameters are:

z = sigma^2 / (sigma^2 + tau^2)
mu_1 = z * (tau^2/2 + log(C_3 / p_3)) + (1-z)*mu
sigma_1 = sqrt(z) * tau

Apply the formulas for mean and variance of the lognormal distribution:

posterior_mean = exp(mu_1 + sigma_1^2 / 2)
posterior_variance = posterior_mean^2 * (exp(sigma_1^2) - 1)
print(paste0("The estimated ultimate value is ", posterior_mean))
## [1] "The estimated ultimate value is 1.06922885871791"
print(paste0("The standard deviation of the ultimate value is ", sqrt(posterior_variance)))
## [1] "The standard deviation of the ultimate value is 0.188720685057277"

Note that the variance estimate above is the conditional variance, which is more relevant to our use case than the unconditional variance: we are performing these calculations after \(C_k\) is known, so it is fixed. However, since the earlier methods produced unconditional variances, Mack shows for fairness of comparison that the unconditional standard deviation for this model is 17%, which is only slightly better than the Benktander and optimal credibility methods. The Bayesian approach therefore requires strong distributional assumptions that are not justified by the small reduction in mean squared error.

Additional Example

Hürlimann provides an additional example using the following table:

hurlimann.table.2 = fread("./hurlimann_2.csv")
datatable(hurlimann.table.2)

Calculate the values of \(m_k\) for each development period:

example.2.m = hurlimann.table.2[, .(m_k = sum(incremental_paid_claims) / sum(exposure)), by = development_period]
datatable(example.2.m)

The overall loss ratio estimate is:

m = example.2.m[, sum(m_k)]
m
## [1] 0.9650938

Calculate payout factors by dividing the cumulative sum of \(m_k\) by \(m\):

example.2.m = example.2.m[order(development_period)]
example.2.m[, p_k := cumsum(m_k) / m]
example.2.m[, q_k := 1 - p_k]
datatable(example.2.m)

Determine cumulative loss for each origin period:

example.2.cumulative.loss = hurlimann.table.2[, .(development_period = max(development_period), cumulative_paid_loss = sum(incremental_paid_claims), exposure = max(exposure)), by = origin_period]
datatable(example.2.cumulative.loss)

Join the loss ratios for each development period, then calculate reserves by various methods:

payout.factors.2 = merge(example.2.cumulative.loss, example.2.m, by = "development_period")
payout.factors.2[, individual_loss_reserve := cumulative_paid_loss / p_k * q_k]
payout.factors.2[, collective_loss_reserve := exposure * m * q_k]
payout.factors.2[, benktander_loss_reserve := p_k * individual_loss_reserve + q_k * collective_loss_reserve]
payout.factors.2[, neuhaus_loss_reserve := p_k * m * individual_loss_reserve + (1 - p_k * m) * collective_loss_reserve]
payout.factors.2[, t_k := sqrt(p_k)]
payout.factors.2[, optimal_Z := p_k / (p_k + sqrt(p_k))]
payout.factors.2[, optimal_loss_reserve := optimal_Z * individual_loss_reserve + (1 - optimal_Z) * collective_loss_reserve]
datatable(payout.factors.2[, .(origin_period, cumulative_paid_loss, individual_loss_reserve, collective_loss_reserve, benktander_loss_reserve, neuhaus_loss_reserve, t_k, optimal_Z, optimal_loss_reserve)][order(origin_period)])

Calculate the corresponding ultimate values:

payout.factors.2[, individual_ultimate := individual_loss_reserve + cumulative_paid_loss]
payout.factors.2[, collective_ultimate := collective_loss_reserve + cumulative_paid_loss]
payout.factors.2[, benktander_ultimate := benktander_loss_reserve + cumulative_paid_loss]
payout.factors.2[, neuhaus_ultimate := neuhaus_loss_reserve + cumulative_paid_loss]
payout.factors.2[, optimal_ultimate := optimal_loss_reserve + cumulative_paid_loss]
datatable(payout.factors.2[, .(origin_period, individual_ultimate, collective_ultimate, benktander_ultimate, neuhaus_ultimate, optimal_ultimate)][order(origin_period)])