These study notes are based on the Exam 7 syllabus reading Estimating the Premium Asset on Retrospectively Rated Policies by Michael Teng and Miriam Perkins. These notes also incorporate ideas from a discussion of the paper written by Sholom Feldblum.
easypackages::packages("data.table", "DT", "ggplot2")
## Loading required package: data.table
## Warning: package 'data.table' was built under R version 3.4.2
## Loading required package: DT
## Loading required package: ggplot2
## All packages loaded successfully
The premium \(P_n\) for a retrospectively rated policy at the \(n\)th retro adjustment is defined in terms of the basic premium \(B\), the loss conversion factor \(C\), the capped (limited) losses \(L_{C,n}\) reported as of the adjustment, and the tax multiplier \(T\).
The premium is then calculated as \[ P_n = (B + C L_{C,n}) T \] A consequence of retrospective rating is the existence of an asset corresponding to premium that the insurer expects to collect, referred to as Earned but Not Reported (EBNR) premium. The main idea of Teng and Perkins is that, because premium is related to losses by this formula, premium development can be related to loss development through a premium development to loss development (PDLD) ratio; two methods for calculating this ratio are provided. (The scope of the approach is limited to retrospectively rated policies; it does not address premium development on prospectively rated policies due to changes in exposure.)
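Dividing \(P_1 = (B + C L_{C,1}) T\) by the losses reported at the first adjustment, \(L_1 = ELR \times \%L_1 \times SP\) (where \(SP\) is the standard premium, \(ELR\) the expected loss ratio, and \(\%L_1\) the expected percentage of losses reported at the first adjustment), gives the first-adjustment PDLD ratio computed by the PDLD.formula.first.adjustment function below: \[ \frac{P_1}{L_1} = \frac{(B/SP) \, T}{ELR \times \%L_1} + \frac{L_{C,1}}{L_1} \, C \, T \]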
The reason a special method for estimating the premium asset is necessary (rather than simply applying the chain ladder method to premium data) is that there is often a delay in reporting collected premium, so a method that relates the premium asset to incurred losses allows the estimate to be produced more quickly. The following R function calculates the PDLD ratio at the first adjustment from the plan parameters, using the example provided by Teng and Perkins:
PDLD.formula.first.adjustment = function(basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
return(basic_premium_factor * tax_multiplier / (expected_loss_ratio * percent_reported) + loss_capping_ratio * loss_conversion_factor * tax_multiplier)
}
PDLD_1 = PDLD.formula.first.adjustment(basic_premium_factor = 0.2, expected_loss_ratio = 0.70, percent_reported = 0.784, loss_capping_ratio = 0.85, loss_conversion_factor = 1.2, tax_multiplier = 1.03)
print(paste0("The PDLD ratio for the first adjustment is ", PDLD_1))
## [1] "The PDLD ratio for the first adjustment is 1.42596443148688"
In subsequent adjustments, an incremental PDLD ratio is calculated as the ratio of the change in premium to the change in losses. The formula simplifies because the basic premium cancels. For \(n\geq 2\), the incremental PDLD ratio is \[ \frac{P_n - P_{n-1}}{L_n - L_{n-1}} = \frac{L_{C,n} - L_{C, n-1}}{L_n - L_{n-1}}CT \] The quantity \[ \frac{L_{C,n} - L_{C, n-1}}{L_n - L_{n-1}} \] is referred to as the incremental loss capping ratio. Note that if incremental (non-cumulative) loss capping ratios are provided, the cumulative ratio can be calculated as an average of the incremental ratios weighted by the percentage of losses emerging in each adjustment period. The R function for computing the incremental PDLD ratio is:
PDLD.formula.incremental = function(loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
return(loss_capping_ratio * loss_conversion_factor * tax_multiplier)
}
PDLD_2 = PDLD.formula.incremental(loss_capping_ratio = 0.58, loss_conversion_factor = 1.2, tax_multiplier = 1.03)
print(paste0("The PDLD ratio for the second adjustment is ", PDLD_2))
## [1] "The PDLD ratio for the second adjustment is 0.71688"
For convenience, combine the two formulas into a single function, using the adjustment number to differentiate between them:
PDLD.formula = function(adjustment_number, basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
if (adjustment_number == 1) {
result = PDLD.formula.first.adjustment(basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier)
}
else {
result = PDLD.formula.incremental(loss_capping_ratio, loss_conversion_factor, tax_multiplier)
}
return(result)
}
Import the loss capping ratios from Exhibit 4 and calculate the formula-based PDLD ratios using the plan parameters provided in the exhibit:
implied.PDLD = fread("./teng_perkins_exhibit_4-loss_capping_ratios.csv")
implied.PDLD$implied_PDLD = apply(implied.PDLD, 1, function(r) PDLD.formula(adjustment_number = r['adjustment_number'],
basic_premium_factor = 0.2,
expected_loss_ratio = 0.7,
percent_reported = 0.784,
loss_capping_ratio = r['loss_capping_ratio'],
loss_conversion_factor = 1.2,
tax_multiplier = 1.03))
datatable(implied.PDLD)
Teng and Perkins recommend testing the PDLD values implied by the formula against actual historical data to determine whether any bias exists. The following uses the loss and premium data from Exhibit 4.
historical.data = fread("./teng_perkins_exhibit_4-premium_and_loss.csv")
implied.PDLD.test = merge(historical.data, implied.PDLD, by = "adjustment_number")
implied.PDLD.test[, predicted_premium := implied_PDLD * loss]
implied.PDLD.test[, premium_error := premium - predicted_premium]
datatable(implied.PDLD.test)
A visual check can help detect biases:
ggplot(data = implied.PDLD.test, aes(x = adjustment_number, y = premium_error, col = as.factor(policy_effective_year))) + geom_point()
The plot suggests that these ratios are prone to underestimating the premium in the first adjustment period, and overestimating it in subsequent periods.
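As a quick numerical complement to the plot, the average error by adjustment period can be summarized from the same table (a minimal sketch using the columns constructed above):
# Average premium error by adjustment period; a nonzero average suggests bias
implied.PDLD.test[, .(average_premium_error = mean(premium_error)), by = adjustment_number]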
The second method is to calculate PDLD ratios empirically from historical premium and loss data. This approach can be applied to the data in Exhibit 4, which is already on an incremental basis:
historical.data[, empirical_PDLD := premium / loss]
datatable(historical.data)
Plotting the empirical ratios by policy period can help assess whether they are changing over time:
historical.data[, time_period := policy_effective_year + 0.25*policy_effective_quarter]
ggplot(data = historical.data, aes(x = time_period, y = empirical_PDLD, col = as.factor(adjustment_number))) + geom_point() + geom_smooth(method = "lm")
Observations about trends in this graph, by adjustment number, can inform the selection of PDLD ratios in the next step.
Averages and loss-weighted averages can be calculated as follows and compared to the formula-based approach. Selections made by Teng and Perkins have been added manually to the table. (Note that the averages do not match the ones calculated by Teng and Perkins, because they used data from policy years 1983 to 1992, while the table here uses only policy years 1987 to 1992.)
empirical.summary = historical.data[, .(avg_PDLD = mean(empirical_PDLD), wtd_avg_PDLD = sum(empirical_PDLD * loss) / sum(loss)), by = adjustment_number]
PDLD.comparison = merge(empirical.summary, implied.PDLD[,.(adjustment_number, implied_PDLD)], by = "adjustment_number")
PDLD.comparison$selected_PDLD = c(1.750, 0.7, 0.55, 0.45, 0.40, 0.35)
datatable(PDLD.comparison)
Once incremental PDLD ratios have been selected, the cumulative ratios (CPDLD) can be calculated. These are equal to the average PDLD in all subsequent periods, weighted by the percentage of losses expected to emerge in each period. Incremental percentage emergence values from Exhibit 7 are used below.
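In symbols, with \(\%L_k\) denoting the incremental percentage of losses expected to emerge at adjustment \(k\), the cumulative ratio applied at adjustment \(n\) is \[ CPDLD_n = \frac{\sum_{k \geq n} PDLD_k \times \%L_k}{\sum_{k \geq n} \%L_k} \] In the calculation below, the denominator also includes the expected tail emergence after the final adjustment, which generates no further premium.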
PDLD.comparison = PDLD.comparison[order(adjustment_number)]
PDLD.comparison$percentage_emergence = c(0.784, 0.093, 0.044, 0.03, .029, 0.016)
tail.loss.emergence = 0.004
PDLD.comparison = PDLD.comparison[order(-adjustment_number)]
PDLD.comparison[, remaining_emergence := cumsum(percentage_emergence) + tail.loss.emergence]
PDLD.comparison[, remaining_weighted_PDLD := cumsum(percentage_emergence * selected_PDLD)]
PDLD.comparison[, CPDLD := remaining_weighted_PDLD / remaining_emergence]
datatable(PDLD.comparison[,.(adjustment_number, selected_PDLD, percentage_emergence, remaining_emergence, remaining_weighted_PDLD, CPDLD)])
Typically, CPDLD ratios are greater than one during the first period (due to the basic premium charge, and the fact that most losses are ratable), and less than one during subsequent periods (since losses are increasingly subject to per-occurrence and aggregate limits).
The final step in determining the premium asset is to multiply the CPDLD ratios by the expected future losses to obtain the expected future premiums. Ultimate premium is then the expected future premium plus the premium booked as of the most recent adjustment date. The premium asset is the difference between the ultimate premium and the current booked premium. (The difference between the current booked premium and the premium at the prior adjustment can arise from differences in the timing of adjustments, minor adjustments, or interim premium payments.) Note that in this process the two latest policy periods are grouped together and have the same CPDLD ratio applied to them, since neither year has had its first adjustment yet. The following uses the annual summary data from Exhibit 1:
annual.summary = fread("./teng_perkins_exhibit_1-annual_summaries.csv")
annual.summary[, adjustment_number := ifelse(policy_effective_year == 1994 | policy_effective_year == 1993, 1, 1994 - policy_effective_year)]
annual.summary = merge(annual.summary, PDLD.comparison[, .(adjustment_number, CPDLD)], by = "adjustment_number", all.x = TRUE)
annual.summary[is.na(CPDLD), CPDLD := 0]
prior.booked.premium = historical.data[, .(prior_booked_premium = sum(premium)), by = "policy_effective_year"]
annual.summary = merge(annual.summary, prior.booked.premium, by = "policy_effective_year", all.x = TRUE)
annual.summary[is.na(prior_booked_premium), prior_booked_premium := 0]
annual.summary[, expected_future_premium := CPDLD * expected_future_loss_emergence]
annual.summary[, ultimate_premium := expected_future_premium + prior_booked_premium]
annual.summary[, premium_asset := ultimate_premium - premium_booked_12_1994]
datatable(annual.summary)
An additional consideration is that if the premium asset is not secured, then a provision for collectibility risk should be added.
The discussion by Feldblum compares the approach of Teng and Perkins to other approaches due to Fitzgibbon and Berry. The key idea is that, because the retrospective premium formula is a constant plus a multiple of the losses, the relationship between premium and losses can be estimated by linear regression. Implicitly, this gives a way of estimating the combined effect of the average plan parameters. (Fitzgibbon and Berry work with the retro adjustment as a percentage of standard premium. Since the sample data does not include standard premium, the examples below are based on raw premium and loss amounts; the form of the model is the same.) To apply this approach, first calculate cumulative losses and premiums:
regression.data = historical.data[, .(policy_effective_year, policy_effective_quarter, adjustment_number, loss, premium, time_period)][order(adjustment_number)]
regression.data[, cumulative_loss := cumsum(loss), by = "time_period"]
regression.data[, cumulative_premium := cumsum(premium), by = "time_period"]
datatable(regression.data)
Visualize the above data:
ggplot(data = regression.data, aes(x = cumulative_loss, y = cumulative_premium)) + geom_point() + geom_smooth(method = "lm") + coord_cartesian(ylim = c(0, 200000))
Parameters of this model are as follows:
fitzgibbon.model = lm(cumulative_premium ~ cumulative_loss, data = regression.data)
summary(fitzgibbon.model)
##
## Call:
## lm(formula = cumulative_premium ~ cumulative_loss, data = regression.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16082 -6442 -1197 4243 25436
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.560e+04 2.984e+03 5.228 1.28e-06 ***
## cumulative_loss 1.147e+00 3.383e-02 33.893 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9562 on 82 degrees of freedom
## Multiple R-squared: 0.9334, Adjusted R-squared: 0.9326
## F-statistic: 1149 on 1 and 82 DF, p-value: < 2.2e-16
The intercept term can be interpreted as the contribution of the basic premium, and the “cumulative_loss” coefficient can be interpreted as the combined effect of loss capping and loss conversion. The slope quantifies the premium responsiveness of the book of business; it is generally not equal to 1 due to a variety of factors (loss capping, minimum premium, loss conversion, and premium tax). The slope of the line is referred to as the “swing” of the plan. Plans with narrow swing (low per-occurrence and aggregate limits, low premium responsiveness) are typically sold to small accounts, while plans with wide swing (high per-occurrence and aggregate limits, high premium responsiveness, possibly even above 1) are sold to large accounts. The major drawbacks of the Fitzgibbon approach are that it is not responsive to actual experience (which may show more or less premium responsiveness than initially projected), and that the projection it produces does not change from year to year.
Note that since we expect the loss capping ratio to decrease over time, it makes sense for the slope to decrease in subsequent adjustment periods: the fit should not be a straight line, but rather a sequence of line segments of decreasing slope. (The resulting graph will be concave, meaning that the linear approach overestimates premium at high and low loss ratios.) One approach is to fit an interaction between adjustment number and cumulative loss, without the marginal effects:
regression.data[, loss_period_1 := ifelse(adjustment_number == 1, cumulative_loss, 0)]
regression.data[, loss_period_2 := ifelse(adjustment_number == 2, cumulative_loss, 0)]
regression.data[, loss_period_3 := ifelse(adjustment_number == 3, cumulative_loss, 0)]
regression.data[, loss_period_4 := ifelse(adjustment_number == 4, cumulative_loss, 0)]
regression.data[, loss_period_5 := ifelse(adjustment_number == 5, cumulative_loss, 0)]
regression.data[, loss_period_6 := ifelse(adjustment_number == 6, cumulative_loss, 0)]
fitzgibbon.model.v2 = lm(cumulative_premium ~ loss_period_1 + loss_period_2 + loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6, data = regression.data)
summary(fitzgibbon.model.v2)
##
## Call:
## lm(formula = cumulative_premium ~ loss_period_1 + loss_period_2 +
## loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6,
## data = regression.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10082 -4188 -1924 2806 19392
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.407e+03 2.285e+03 3.241 0.00176 **
## loss_period_1 1.375e+00 3.615e-02 38.026 < 2e-16 ***
## loss_period_2 1.281e+00 3.178e-02 40.291 < 2e-16 ***
## loss_period_3 1.222e+00 2.950e-02 41.426 < 2e-16 ***
## loss_period_4 1.174e+00 2.934e-02 40.012 < 2e-16 ***
## loss_period_5 1.127e+00 3.039e-02 37.087 < 2e-16 ***
## loss_period_6 1.129e+00 3.675e-02 30.729 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6756 on 77 degrees of freedom
## Multiple R-squared: 0.9688, Adjusted R-squared: 0.9663
## F-statistic: 398.1 on 6 and 77 DF, p-value: < 2.2e-16
The decreasing slope is evident, with the exception of the last two adjustment periods, which can be attributed to volatile data. The approach of Teng and Perkins does not include an intercept, but we could still fit a regression in the style of their approach:
teng.perkins.regression = lm(cumulative_premium ~ loss_period_1 + loss_period_2 + loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6 + 0, data = regression.data)
summary(teng.perkins.regression)
##
## Call:
## lm(formula = cumulative_premium ~ loss_period_1 + loss_period_2 +
## loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6 +
## 0, data = regression.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10302 -3361 -1247 3642 22384
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## loss_period_1 1.47332 0.02070 71.19 <2e-16 ***
## loss_period_2 1.36497 0.01932 70.63 <2e-16 ***
## loss_period_3 1.29772 0.01913 67.84 <2e-16 ***
## loss_period_4 1.24475 0.02074 60.01 <2e-16 ***
## loss_period_5 1.19318 0.02382 50.10 <2e-16 ***
## loss_period_6 1.19396 0.03270 36.51 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7155 on 78 degrees of freedom
## Multiple R-squared: 0.9965, Adjusted R-squared: 0.9962
## F-statistic: 3684 on 6 and 78 DF, p-value: < 2.2e-16
The slopes increase to account for the lack of an intercept term.
There are two sources of uncertainty associated with estimating premium emergence: uncertainty in the amount of future loss emergence, and uncertainty in how responsive premium will be to those losses (the PDLD ratios). Because the Teng and Perkins approach starts from the premium booked on actual loss emergence to date and applies the CPDLD ratios only to the remaining expected losses, the projection is updated as experience develops. In other words, this method allows the projections to “get back on track” if actual experience differs from expectations.
A key drawback of the Teng and Perkins approach is that, by excluding an intercept, it reduces its explanatory power: the drivers of premium responsiveness can no longer be allocated between the basic premium and the loss conversion / loss capping effects during the first adjustment. This means that a lengthening of the loss reporting pattern can have a negative impact on the accuracy of the projection, because a reduction in losses reported at the first adjustment affects the fitted premium responsiveness but not an intercept; in effect, the point at which the slope changes sits too far to the left. Feldblum proposes a simple adjustment to accommodate this: calculate the average basic premium as a ratio to the standard loss ratio, and subtract this from the first CPDLD, in essence removing the intercept.
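A minimal sketch of this adjustment, taking the description above literally and reusing the plan parameters from the earlier examples (the exact form of the ratio, e.g. whether the tax multiplier is included, should be checked against Feldblum's discussion):
# Illustrative only: remove the basic premium contribution from the first
# cumulative PDLD ratio, as described above
basic_premium_factor = 0.2
expected_loss_ratio = 0.70
CPDLD_1 = PDLD.comparison[adjustment_number == 1]$CPDLD
CPDLD_1_adjusted = CPDLD_1 - basic_premium_factor / expected_loss_ratio
print(CPDLD_1_adjusted)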