These study notes are based on the Exam 7 syllabus reading Estimating the Premium Asset on Retrospectively Rated Policies by Michael Teng and Miriam Perkins. These notes also incorporate ideas from a discussion of the paper written by Sholom Feldblum.

Introduction

easypackages::packages("data.table", "DT", "ggplot2")
## Loading required package: data.table
## Loading required package: DT
## Loading required package: ggplot2
## All packages loaded successfully
The premium \(P_n\) for a retrospectively rated policy at the \(n\)th retro adjustment is defined in terms of the following quantities:

  • \(B\), the basic premium
  • \(C\), the loss conversion factor
  • \(L_{C,n}\), the capped losses (losses limited by the per-occurrence and aggregate provisions of the plan) as of the \(n\)th adjustment
  • \(T\), the tax multiplier

The premium is then calculated as \[ P_n = T (B + CL_{C,n}) \] A consequence of retrospective rating is the existence of an asset corresponding to premium that the insurer expects to collect, referred to as Earned but Not Reported (EBNR) premium. The main idea of Teng and Perkins is that, because premium is related to losses by this formula, premium development can be related to loss development through a premium development to loss development (PDLD) ratio; they provide two methods for calculating this ratio. (The scope of the approach is limited to retrospectively rated policies; it does not address premium development on prospectively rated policies due to changes in exposure.)
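
A minimal R sketch of this formula, using hypothetical values for illustration (the function and the numbers below are not from the paper):

# Retro premium formula P = T * (B + C * L_C), applied to hypothetical inputs
retro.premium = function(basic_premium, capped_losses, loss_conversion_factor, tax_multiplier) {
  return(tax_multiplier * (basic_premium + loss_conversion_factor * capped_losses))
}
retro.premium(basic_premium = 50000, capped_losses = 120000, loss_conversion_factor = 1.2, tax_multiplier = 1.03)
## [1] 199820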

The reason a special method for estimating the premium asset is necessary (rather than simply applying the chain ladder to premium data) is that there is often a time delay in reporting collected premium, so a method that relates the premium asset to incurred losses allows the estimate to be produced more quickly.

Formula-based Approach

The formula-based approach begins with the following expression for the PDLD ratio in the first period: \[ P_1 / L_1 = \frac{BT}{L_1} + \frac{L_{C,1}}{L_1} CT \] The parameters used should be the average values of the parameters on the retro plans sold. The formula can be simplified by using the expected amount of losses to emerge at the first adjustment. Specifically, if \(S\) is the standard premium, \(E\) is the expected loss ratio, and \(R_1\) is the expected percentage of losses emerged at the first adjustment, so that the expected first-adjustment losses are \(SER_1\), then the formula can be re-written as \[ P_1 / L_1 = \frac{BT}{SER_1} + \frac{L_{C,1}}{L_1} C T \] Note that \(B/S\) is the basic premium factor in the retro rating plan. The quantity \(L_{C,n}/L_n\) is the loss capping ratio. Changes in the loss capping ratio over time drive changes in the PDLD ratio over time, since as losses develop an increasing portion of them becomes subject to the per-occurrence and aggregate limits of the plan.

The following R function calculates the first-adjustment PDLD ratio from the plan parameters, using the example values provided by Teng and Perkins:

PDLD.formula.first.adjustment = function(basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
  return(basic_premium_factor * tax_multiplier / (expected_loss_ratio * percent_reported) + loss_capping_ratio * loss_conversion_factor * tax_multiplier)
}
PDLD_1 = PDLD.formula.first.adjustment(basic_premium_factor = 0.2, expected_loss_ratio = 0.70, percent_reported = 0.784, loss_capping_ratio = 0.85, loss_conversion_factor = 1.2, tax_multiplier = 1.03)
print(paste0("The PDLD ratio for the first adjustment is ", PDLD_1))
## [1] "The PDLD ratio for the first adjustment is 1.42596443148688"

In subsequent adjustments, an incremental PDLD ratio is calculated using the ratio of the change in premium to the change in losses. The formula simplifies due to the cancellation of the basic premium. For \(n\geq 2\), the incremental PDLD ratio is \[ \frac{P_n - P_{n-1}}{L_n - L_{n-1}} = \frac{L_{C,n} - L_{C, n-1}}{L_n - L_{n-1}}CT \] The quantity \[ \frac{L_{C,n} - L_{C, n-1}}{L_n - L_{n-1}} \] is referred to as the incremental loss capping ratio. Note that if non-cumulative loss capping ratios are provided, then the cumulative ratio can be calculated as an average weighted by the percentage of losses emerging in each adjustment period (a sketch of this conversion follows the function below). The R function for computing this ratio is:

PDLD.formula.incremental = function(loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
  return(loss_capping_ratio * loss_conversion_factor * tax_multiplier)
}
PDLD_2 = PDLD.formula.incremental(loss_capping_ratio = 0.58, loss_conversion_factor = 1.2, tax_multiplier = 1.03)
print(paste0("The PDLD ratio for the second adjustment is ", PDLD_2))
## [1] "The PDLD ratio for the second adjustment is 0.71688"

For convenience, combine the two formulas into a single one, using the number of the adjustment to differentiate between the two:

PDLD.formula = function(adjustment_number, basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier) {
  if (adjustment_number == 1) {
    result = PDLD.formula.first.adjustment(basic_premium_factor, expected_loss_ratio, percent_reported, loss_capping_ratio, loss_conversion_factor, tax_multiplier)
  }
  else {
    result = PDLD.formula.incremental(loss_capping_ratio, loss_conversion_factor, tax_multiplier)
  }
  return(result)
}

Import the loss capping ratios from Exhibit 4 and calculate the formula-based PDLD ratios using the plan parameters provided in the exhibit:

implied.PDLD = fread("./teng_perkins_exhibit_4-loss_capping_ratios.csv")
implied.PDLD$implied_PDLD = apply(implied.PDLD, 1, function(r) PDLD.formula(adjustment_number = r['adjustment_number'],
                                                                     basic_premium_factor = 0.2,
                                                                     expected_loss_ratio = 0.7,
                                                                     percent_reported = 0.784,
                                                                     loss_capping_ratio = r['loss_capping_ratio'],
                                                                     loss_conversion_factor = 1.2,
                                                                     tax_multiplier = 1.03))
datatable(implied.PDLD)

Teng and Perkins recommend testing the PDLD values implied by the formula against actual historical data to determine whether any bias exists. The following uses the loss and premium data from Exhibit 4.

historical.data = fread("./teng_perkins_exhibit_4-premium_and_loss.csv")
implied.PDLD.test = merge(historical.data, implied.PDLD, by = "adjustment_number")
implied.PDLD.test[, predicted_premium := implied_PDLD * loss]
implied.PDLD.test[, premium_error := premium - predicted_premium]
datatable(implied.PDLD.test)

A visual check can help detect biases:

ggplot(data = implied.PDLD.test, aes(x = adjustment_number, y = premium_error, col = as.factor(policy_effective_year))) + geom_point()

The plot suggests that these ratios are prone to underestimating the premium in the first adjustment period, and overestimating it in subsequent periods.

Advantages of the approach

  • Easily explainable and intuitive
  • Can adjust to changes in retrospective plan parameters (e.g. due to changes in mix of business) that may distort other methods

Disadvantages of the approach

  • Use of average parameters is a possible source of bias.
  • Formula approach will not reflect deviations in loss experience from expectations.
  • Significant workload may be required to calculate the average plan parameters. A key consideration is that they need to be the parameters that were actually charged (e.g. including underwriting deviation).
  • While the method can detect decreasing premium responsiveness over time, it does not detect decreasing responsiveness at high incurred loss ratios. (Decreasing responsiveness at high loss ratios is a result of aggregate limits.)

Empirical PDLD Ratios

The empirical approach involves calculating ratios based on actual historical data:
  1. Separate data into homogeneous groups by size of account and type of rating plan.
  2. If data is provided in a cumulative format, it should first be converted to incremental values for each adjustment period (see the sketch following this list).
    • For losses, the first adjustment period occurs at 18 months, and each subsequent period is in 12-month intervals.
    • For premium, the first adjustment period occurs at 27 months, and each subsequent period is in 12-month intervals.
  3. Calculate ratios of incremental premium to incremental losses for each policy effective period and adjustment period.
  4. Review empirical ratios for trends, compare to formula-based ratios, and make a selection for each adjustment period. Pay particular attention to situations in which the formula-based and empirical PDLD ratios diverge. (The main drivers of differences are changes in parameters over time, or loss experience that differs from expectations.)
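
A minimal sketch of the cumulative-to-incremental conversion in step 2, using data.table (already loaded above); the column names and values are hypothetical:

cumulative.example = data.table(policy_period = rep(c("1990", "1991"), each = 3),
                                adjustment_number = rep(1:3, times = 2),
                                cumulative_loss = c(100, 130, 145, 90, 120, 132),
                                cumulative_premium = c(150, 170, 178, 140, 160, 168))
setorder(cumulative.example, policy_period, adjustment_number)
# Incremental values: subtract the prior adjustment's cumulative amount within each policy period
cumulative.example[, incremental_loss := cumulative_loss - shift(cumulative_loss, fill = 0), by = policy_period]
cumulative.example[, incremental_premium := cumulative_premium - shift(cumulative_premium, fill = 0), by = policy_period]
cumulative.example[, empirical_PDLD := incremental_premium / incremental_loss]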

The approach can be applied to the data in Exhibit 4, which is already on an incremental basis:

historical.data[, empirical_PDLD := premium / loss]
datatable(historical.data)

Plotting the empirical ratios against time can help assess whether they are changing systematically:

historical.data[, time_period := policy_effective_year + 0.25*policy_effective_quarter]
ggplot(data = historical.data, aes(x = time_period, y = empirical_PDLD, col = as.factor(adjustment_number))) + geom_point() + geom_smooth(method = "lm")

Observations about this graph, such as trends in the empirical ratios over time within each adjustment number, can inform the selections made below.

Averages and loss-weighted averages can be calculated as follows, and compared to the formula-based approach. The selections made by Teng and Perkins have been added manually to the table. (Note that the averages do not match the ones calculated by Teng and Perkins, because they used data from policy years 1983 to 1992, while the table here only uses policy years 1987 to 1992.)

empirical.summary = historical.data[, .(avg_PDLD = mean(empirical_PDLD), wtd_avg_PDLD = sum(empirical_PDLD * loss) / sum(loss)), by = adjustment_number]
PDLD.comparison = merge(empirical.summary, implied.PDLD[,.(adjustment_number, implied_PDLD)], by = "adjustment_number")
PDLD.comparison$selected_PDLD = c(1.750, 0.7, 0.55, 0.45, 0.40, 0.35)
datatable(PDLD.comparison)

Advantages of the approach

  • More responsive to changes in loss experience than the formula-based method

Disadvantages of the approach

  • Does not account for changes in the parameters of plans written over time.

Cumulative PDLD Ratios

Once incremental PDLD ratios have been selected, the cumulative ratios (CPDLD) can be calculated. These are the averages of the selected PDLD ratios for the current and all subsequent periods, weighted by the percentage of losses expected to emerge in each period. Incremental percentage emergence values from Exhibit 7 are used below.
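
In symbols, if \(R_k\) denotes the expected percentage of losses emerging at the \(k\)th adjustment and \(PDLD_k\) the selected incremental ratio, then \[ CPDLD_n = \frac{\sum_{k \geq n} R_k \cdot PDLD_k}{\sum_{k \geq n} R_k} \] (In the calculation below, the expected tail emergence is also included in the denominator.)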

PDLD.comparison = PDLD.comparison[order(adjustment_number)]
PDLD.comparison$percentage_emergence = c(0.784, 0.093, 0.044, 0.030, 0.029, 0.016)
tail.loss.emergence = 0.004
# Sort in reverse adjustment order so that cumulative sums accumulate from the latest
# adjustment backward, producing "remaining" quantities as of each adjustment
PDLD.comparison = PDLD.comparison[order(-adjustment_number)]
PDLD.comparison[, remaining_emergence := cumsum(percentage_emergence) + tail.loss.emergence]
PDLD.comparison[, remaining_weighted_PDLD := cumsum(percentage_emergence * selected_PDLD)]
PDLD.comparison[, CPDLD := remaining_weighted_PDLD / remaining_emergence]
datatable(PDLD.comparison[,.(adjustment_number, selected_PDLD, percentage_emergence, remaining_emergence, remaining_weighted_PDLD, CPDLD)])

Typically, CPDLD ratios are greater than one during the first period (due to the basic premium charge, and the fact that most losses are ratable), and less than one during subsequent periods (since losses are increasingly subject to per-occurrence and aggregate limits).

The final step in determining the premium asset is to multiply the CPDLD ratios by expected future losses to get the expected future premiums. Ultimate premiums are then calculated by adding the booked premium as of the most recent adjustment date to the expected future premium. The premium asset is the difference between the ultimate premium and the current booked premium. (The difference between the current booked premium and the premium at the prior adjustment could be due to differences in timing of adjustments, minor adjustments, or interim premium payments.) Note that in this process, the two latest policy periods are grouped together and have the same CPDLD ratio applied to them, since neither year has had its first adjustment yet. The following uses the annual summary data from Exhibit 1:

annual.summary = fread("./teng_perkins_exhibit_1-annual_summaries.csv")
# The 1993 and 1994 policy years have not yet had their first adjustment, so both receive
# the first CPDLD ratio; older policy years are assigned an adjustment number based on their age as of 12/1994
annual.summary[, adjustment_number := ifelse(policy_effective_year == 1994 | policy_effective_year == 1993, 1, 1994 - policy_effective_year)]
annual.summary = merge(annual.summary, PDLD.comparison[, .(adjustment_number, CPDLD)], by = "adjustment_number", all.x = TRUE)
# Policy years beyond the last retro adjustment contribute no expected future premium
annual.summary[is.na(CPDLD), CPDLD := 0]
prior.booked.premium = historical.data[, .(prior_booked_premium = sum(premium)), by = "policy_effective_year"]
annual.summary = merge(annual.summary, prior.booked.premium, by = "policy_effective_year", all.x = TRUE)
annual.summary[is.na(prior_booked_premium), prior_booked_premium := 0]
annual.summary[, expected_future_premium := CPDLD * expected_future_loss_emergence]
annual.summary[, ultimate_premium := expected_future_premium + prior_booked_premium]
annual.summary[, premium_asset := ultimate_premium - premium_booked_12_1994]
datatable(annual.summary)
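
In summary, the calculation above amounts to, for each policy period, \[ \text{Ultimate Premium} = CPDLD \times \text{Expected Future Losses} + \text{Booked Premium at Prior Adjustment} \] \[ \text{Premium Asset} = \text{Ultimate Premium} - \text{Current Booked Premium} \]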

An additional consideration is that if the premium asset is not secured, then a provision for collectibility risk should be added.

Fitzgibbon’s and Berry’s Approaches

The approach of Teng and Perkins is compared to other approaches, due to Fitzgibbon and Berry, in the discussion by Feldblum. The key idea is that, since the retrospective premium formula is a constant plus a multiple of the losses, the relationship between premium and losses can be determined by linear regression. Implicitly, this gives a way of estimating the combined effect of the average plan parameters. (Fitzgibbon and Berry do this on the basis of the retro adjustment as a percentage of standard premium. Since the sample data doesn't include standard premium, the examples below are based on raw premium and loss amounts. Regardless, the form of the model is the same.) To apply this approach, first calculate cumulative losses and premiums:

regression.data = historical.data[, .(policy_effective_year, policy_effective_quarter, adjustment_number, loss, premium, time_period)][order(adjustment_number)]
regression.data[, cumulative_loss := cumsum(loss), by = "time_period"]
regression.data[, cumulative_premium := cumsum(premium), by = "time_period"]
datatable(regression.data)

Visualize the above data:

ggplot(data = regression.data, aes(x = cumulative_loss, y = cumulative_premium)) + geom_point() + geom_smooth(method = "lm") + coord_cartesian(ylim = c(0, 200000))

Parameters of this model are as follows:

fitzgibbon.model = lm(cumulative_premium ~ cumulative_loss, data = regression.data)
summary(fitzgibbon.model)
## 
## Call:
## lm(formula = cumulative_premium ~ cumulative_loss, data = regression.data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -16082  -6442  -1197   4243  25436 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     1.560e+04  2.984e+03   5.228 1.28e-06 ***
## cumulative_loss 1.147e+00  3.383e-02  33.893  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9562 on 82 degrees of freedom
## Multiple R-squared:  0.9334, Adjusted R-squared:  0.9326 
## F-statistic:  1149 on 1 and 82 DF,  p-value: < 2.2e-16

The intercept term can be interpreted as the contribution of the basic premium, and the “cumulative_loss” parameter can be interpreted as the effect of loss capping and loss conversion. The slope allows us to quantify the premium responsiveness of the book of business. It is generally not equal to 1 due to a variety of factors (loss capping, minimum premium, loss conversion, and premium tax). The slope of the line is referred to as the “swing” of the plan. Plans with narrow swing (low per-occurrence and aggregate limits, low premium responsiveness) are typically sold to small accounts, and plans with wide swing (high per-occurrence and aggregate limits, high premium responsiveness, possibly even above 1) would be sold to large accounts. The major drawbacks of the Fitzgibbon approach are that it is not responsive to the actual experience (which may show more or less premium responsiveness than initially projected), and that the projection produced does not change from year to year.
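
As a rough cross-check against the plan parameters used earlier, the incremental PDLD formula implies a slope against reported losses of approximately the loss capping ratio times \(CT\); for the first-adjustment parameters above, \(0.85 \times 1.2 \times 1.03 \approx 1.05\).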

Note that since we expect the loss capping ratio to decrease over time, it makes sense for the slope to decrease in subsequent adjustment periods: the fit should not be a straight line, but rather a sequence of line segments of decreasing slope. (The resulting graph will be concave, meaning that the linear approach overestimates premium at high and low loss ratios.) One approach is to fit an interaction between adjustment number and cumulative loss, without the main effects:

regression.data[, loss_period_1 := ifelse(adjustment_number == 1, cumulative_loss, 0)]
regression.data[, loss_period_2 := ifelse(adjustment_number == 2, cumulative_loss, 0)]
regression.data[, loss_period_3 := ifelse(adjustment_number == 3, cumulative_loss, 0)]
regression.data[, loss_period_4 := ifelse(adjustment_number == 4, cumulative_loss, 0)]
regression.data[, loss_period_5 := ifelse(adjustment_number == 5, cumulative_loss, 0)]
regression.data[, loss_period_6 := ifelse(adjustment_number == 6, cumulative_loss, 0)]
fitzgibbon.model.v2 = lm(cumulative_premium ~ loss_period_1 + loss_period_2 + loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6, data = regression.data)
summary(fitzgibbon.model.v2)
## 
## Call:
## lm(formula = cumulative_premium ~ loss_period_1 + loss_period_2 + 
##     loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6, 
##     data = regression.data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -10082  -4188  -1924   2806  19392 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7.407e+03  2.285e+03   3.241  0.00176 ** 
## loss_period_1 1.375e+00  3.615e-02  38.026  < 2e-16 ***
## loss_period_2 1.281e+00  3.178e-02  40.291  < 2e-16 ***
## loss_period_3 1.222e+00  2.950e-02  41.426  < 2e-16 ***
## loss_period_4 1.174e+00  2.934e-02  40.012  < 2e-16 ***
## loss_period_5 1.127e+00  3.039e-02  37.087  < 2e-16 ***
## loss_period_6 1.129e+00  3.675e-02  30.729  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6756 on 77 degrees of freedom
## Multiple R-squared:  0.9688, Adjusted R-squared:  0.9663 
## F-statistic: 398.1 on 6 and 77 DF,  p-value: < 2.2e-16

The decreasing slope is evident, with the exception of the last two adjustment periods, which can be attributed to volatile data. The approach of Teng and Perkins does not include an intercept, but we could still fit a regression in the style of their approach:

teng.perkins.regression = lm(cumulative_premium ~ loss_period_1 + loss_period_2 + loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6 + 0, data = regression.data)
summary(teng.perkins.regression)
## 
## Call:
## lm(formula = cumulative_premium ~ loss_period_1 + loss_period_2 + 
##     loss_period_3 + loss_period_4 + loss_period_5 + loss_period_6 + 
##     0, data = regression.data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -10302  -3361  -1247   3642  22384 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## loss_period_1  1.47332    0.02070   71.19   <2e-16 ***
## loss_period_2  1.36497    0.01932   70.63   <2e-16 ***
## loss_period_3  1.29772    0.01913   67.84   <2e-16 ***
## loss_period_4  1.24475    0.02074   60.01   <2e-16 ***
## loss_period_5  1.19318    0.02382   50.10   <2e-16 ***
## loss_period_6  1.19396    0.03270   36.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7155 on 78 degrees of freedom
## Multiple R-squared:  0.9965, Adjusted R-squared:  0.9962 
## F-statistic:  3684 on 6 and 78 DF,  p-value: < 2.2e-16

The slopes increase to account for the lack of an intercept term.

There are two sources of uncertainty associated with estimating premium emergence:
  1. Actual reported losses may differ from the expected losses. This means there is uncertainty with respect to the point at which the slope of the graph changes.
  2. The relationship between reported losses and premium may differ from initial projections.
There are two assumptions underlying the Teng and Perkins approach that allow it to be more flexible than a linear regression model:
  1. The premium responsiveness during subsequent adjustments is independent of premium responsiveness during preceding adjustments. This allows the method to respond to actual emergence and actual premium responsiveness, by starting the next line segment at the point corresponding to the actual loss and premium (but keeping the slope the same).
  2. The slope of the line segment depends only on the time period, not the starting loss or retrospective premium. This means that the point at which the slope of the line changes is not “fixed”, but is equal to the amount of loss that has emerged as of the current adjustment.

In other words, this method allows the projections to “get back on track” if actual experience differs from expectations.
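
A minimal sketch of this idea, using hypothetical actual values together with the incremental PDLD ratios selected above:

# Hypothetical cumulative premium booked as of the second adjustment
actual_cumulative_premium = 150000
# Hypothetical expected incremental loss emergence for adjustments 3 through 6
future_incremental_loss = c(4000, 2500, 2000, 1500)
# Selected incremental PDLD ratios for adjustments 3 through 6 (from the selections above)
selected_slopes = c(0.55, 0.45, 0.40, 0.35)
# Each future line segment starts at the actual point and uses only the period-specific slope,
# so the projection responds to actual emergence while retaining the selected slopes
projected_ultimate_premium = actual_cumulative_premium + sum(selected_slopes * future_incremental_loss)
print(projected_ultimate_premium)
## [1] 154650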

A key drawback of the approach used by Teng and Perkins is that, by excluding an intercept, it reduces its explanatory power, since the drivers of premium responsiveness can no longer be allocated between the basic premium and the loss conversion / loss capping effects during the first adjustment. This means that a lengthening of reporting patterns can have a negative impact on the accuracy of the projection, because a reduction in losses at the first adjustment period impacts premium responsiveness but not the intercept – the point at which the slope changes is too far to the left. Feldblum proposes a simple adjustment to accommodate this: calculate the average basic premium as a ratio to the standard loss ratio, and subtract this from the first CPDLD, in essence, removing the intercept.
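
A minimal sketch of this adjustment; the form of the basic premium ratio below (average basic premium factor divided by the expected loss ratio) is an assumption made for illustration, and the CPDLD value is approximate:

CPDLD_1 = 1.49                                # approximate first cumulative PDLD from the table above
basic_premium_to_standard_loss = 0.2 / 0.7    # assumed form: basic premium factor / expected loss ratio
adjusted_CPDLD_1 = CPDLD_1 - basic_premium_to_standard_loss
print(round(adjusted_CPDLD_1, 3))
## [1] 1.204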