Introduction

These study notes are based on Chapter 8 of the Exam 9 syllabus reading Investments by Bodie, Kane, and Marcus. This chapter explains how single-index models can be used to simplify the estimation of correlations between securities. This chapter corresponds to learning objective A5 on the syllabus.

easypackages::packages("dplyr", "ggplot2", "multcomp")
options(scipen = 999)

Formulation of the model

The full Markowitz model is often not used in practice for several reasons:

In a single factor model, we assume that the return on security \(i\) is given by \[ r_i = E(r_i) + \beta_i m + \epsilon_i \] where \(m\) is some common factor with a mean of 0 that drives the returns of all securities and \(\epsilon_i\) is the firm-specific uncertainty, assumed to be uncorrelated with other securities. An important variation is the single index model, in which \(m\) is the excess rate of return \(R_M\) on a broad index of securities, and the excess return \(R_i = r_i - r_f\) is modelled: \[ R_i = \alpha_i + \beta_i R_M + \epsilon_i \] In this form, \(\beta_i R_M\) is the systematic risk premium, and \(\alpha_i\) is the non-market premium. Due to competitive pressures (e.g. investors bid up the price of stocks with high values for \(\alpha\)), \(\alpha_i\) should tend to 0 over time. As a result, statistical methods are not reliable for estimating \(\alpha\), because estimates based on historical data are unlikely to be valid in the future. The value of \(\alpha\) is determined through security-specific analysis, and represents the premium attributable to private information about the security.

In this form, we can calculate the variance of the return as \[ \sigma_i^2 = \beta_i^2 \sigma_M^2 + \sigma^2(\epsilon_i) \] where \(\sigma^2(\epsilon_i)\) is the firm-specific risk and \(\beta_i^2\sigma_M^2\) is the systematic risk. The covariance of two returns is: \[ \mathrm{Cov}(R_i, R_j) = \mathrm{Cov}(\beta_i R_M + \epsilon_i, \beta_j R_M + \epsilon_j) = \beta_i \beta_j \sigma_M^2 \] using the fact that the \(\epsilon\) values are uncorrelated with each other, as well as the market.
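As a minimal sketch of how these two formulas populate a full covariance matrix, the following uses made-up values for the betas, residual standard deviations, and market standard deviation (none of these numbers come from the reading):

```r
# Hypothetical inputs chosen purely for illustration
beta = c(A = 1.0, B = 1.2, C = 0.5)
residual.sd = c(A = 0.30, B = 0.30, C = 0.30)
market.sd = 0.15

# Every entry starts as beta_i * beta_j * sigma_M^2
index.cov = outer(beta, beta) * market.sd^2
# Diagonal entries additionally include the firm-specific variance
index.cov = index.cov + diag(residual.sd^2)
index.cov
```

Under the model's assumptions, the whole matrix is determined by one \(\beta\) and one residual variance per security plus the market variance, rather than by a separate estimate for each pair of securities.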

To understand the impact of diversification in the context of the single-index model, assume we have an equally-weighted portfolio of \(n\) stocks. The excess return of this portfolio is \[ R_P = \frac{1}{n} \sum_{1\leq i \leq n} \alpha_i + \frac{R_M}{n} \sum_{1\leq i \leq n} \beta_i + \frac{1}{n} \sum_{1\leq i \leq n} \epsilon_i = \alpha_P + \beta_P R_M + \epsilon_P \] where \(\alpha_P\), \(\beta_P\), and \(\epsilon_P\) are averages of these parameters across the portfolio. Therefore, the portfolio variance is \[ \sigma_P^2 = \beta_P^2 \sigma_M^2 + \sigma^2(\epsilon_P) \] Note that the systematic portion, \(\beta_P^2 \sigma_M^2\), is fixed and does not decrease as the number of stocks in the portfolio increases. For the portfolio-specific risk, \[ \sigma^2(\epsilon_P) = \sum_{1\leq i \leq n} \frac{1}{n^2} \sigma^2(\epsilon_i) = \frac{1}{n} \overline{\sigma^2}, \] where \(\overline{\sigma^2}\) is the average residual variance of the stocks in the portfolio. Therefore, \(\sigma^2(\epsilon_P) \rightarrow 0\) as \(n\rightarrow \infty\), leaving behind only the non-diversifiable risk \(\beta_P^2 \sigma_M^2\).
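The effect of \(n\) on the two variance components can be sketched numerically. The parameters below are hypothetical: every stock is assumed to have \(\beta = 1\) and a residual standard deviation of 0.30, with a market standard deviation of 0.15.

```r
# Hypothetical parameters, identical for every stock in the portfolio
market.sd = 0.15
residual.sd = 0.30
n = c(1, 10, 100, 1000)

systematic.var = 1^2 * market.sd^2   # beta_P^2 * sigma_M^2, does not depend on n
residual.var = residual.sd^2 / n     # average residual variance divided by n
portfolio.sd = sqrt(systematic.var + residual.var)
data.frame(n, residual.var, portfolio.sd)
```

As \(n\) grows, the portfolio standard deviation falls towards the market standard deviation of 0.15 but cannot go below it.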

Advantages of the single-index model include:

Disadvantages of the single-index model include:

Regression Estimates

The general linear regression formulas can be used to estimate \(\alpha_i\) and \(\beta_i\): \[ \beta_i = \frac{\mathrm{Cov}(R_i, R_M)}{\sigma^2_M} \] and \[ \alpha_i = \overline{R_i} - \beta_i \overline{R_M} \] The usual linear regression tests can be used to assess the significance of these estimates.
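As a quick check that these closed-form expressions agree with what `lm` reports, the sketch below applies both to a small hypothetical data set (the seed, the true \(\alpha\) of 0.02, and the true \(\beta\) of 0.8 are arbitrary choices for illustration):

```r
set.seed(1)  # arbitrary seed; the data below are hypothetical
market = rnorm(60, mean = 0.06, sd = 0.15)
stock = 0.02 + 0.8 * market + rnorm(60, sd = 0.30)

# Closed-form estimates from the formulas above
beta.hat = cov(stock, market) / var(market)
alpha.hat = mean(stock) - beta.hat * mean(market)

# Same estimates from a fitted linear model
fit = lm(stock ~ market)
c(alpha = alpha.hat, beta = beta.hat)
coef(fit)
```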

The approach can be illustrated using simulated data, which I have generated using the following parameters:

set.seed(12345)
stock.simulation = data.frame(time_period = 1:60)
stock.simulation = stock.simulation %>% mutate(market_return = rnorm(60, mean = 0.06, sd = 0.15))
stock.simulation = stock.simulation %>% mutate(stock_A_excess_return = rnorm(60, mean = market_return, sd = 0.30),
                                               stock_B_excess_return = rnorm(60, mean = 1.2 * market_return, sd = 0.30),
                                               stock_C_excess_return = rnorm(60, mean = 0.5 * market_return - 0.01, sd = 0.30),
                                               stock_D_excess_return = rnorm(60, mean = 0.5 * market_return, sd = 0.30),
                                               stock_E_excess_return = rnorm(60, mean = 0.5 * market_return + 0.01, sd = 0.30))

In the following visualizations, the black line is the market index and the red line is one of the stocks. The first graph shows the stock that is highly sensitive to market swings:

ggplot(data = stock.simulation, aes(x = time_period, y = market_return)) + geom_line() + geom_line(aes(y = stock_B_excess_return), colour = "red") + ylim(-1, 1)

The second graph is the stock that is less sensitive to market swings:

ggplot(data = stock.simulation, aes(x = time_period, y = market_return)) + geom_line() + geom_line(aes(y = stock_D_excess_return), colour = "red") + ylim(-1, 1)

The estimates for \(\alpha\) and \(\beta\) can be determined by fitting a linear regression model:

stock.A.model = lm(stock_A_excess_return ~ market_return, data = stock.simulation)
summary(stock.A.model)
## 
## Call:
## lm(formula = stock_A_excess_return ~ market_return, data = stock.simulation)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.61519 -0.24743 -0.01429  0.21577  0.67717 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.06805    0.04748   1.433 0.157171    
## market_return  1.01206    0.25088   4.034 0.000162 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3243 on 58 degrees of freedom
## Multiple R-squared:  0.2191, Adjusted R-squared:  0.2056 
## F-statistic: 16.27 on 1 and 58 DF,  p-value: 0.0001623

Interpretations of these results include:

To determine the systematic risk component, multiply \(\beta\) by the standard deviation of the market index:

market.sd = sd(stock.simulation$market_return)
systematic.sd.A = market.sd * stock.A.model$coefficients['market_return']
systematic.sd.A
## market_return 
##     0.1703076

The overall variance of the stock is:

systematic.sd.A^2 + 0.3243^2
## market_return 
##     0.1341752

An important consideration when interpreting the results is that the \(p\)-value for \(\beta\) is assessing whether it is significantly different from zero; however, the more relevant test for this use case is whether it is significantly different from 1. This test can be performed using the “glht” function in the “multcomp” package:

summary(glht(stock.A.model, "market_return = 1"))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lm(formula = stock_A_excess_return ~ market_return, data = stock.simulation)
## 
## Linear Hypotheses:
##                    Estimate Std. Error t value Pr(>|t|)
## market_return == 1   1.0121     0.2509   0.048    0.962
## (Adjusted p values reported -- single-step method)

This \(p\)-value makes it clear that the estimate is not significantly different from 1. The \(t\)-value can be calculated more explicitly by subtracting 1 from the parameter estimate and dividing by the standard error:

se = summary(stock.A.model)$coefficients['market_return', 'Std. Error']
(stock.A.model$coefficients['market_return'] - 1) / se
## market_return 
##    0.04808628
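The corresponding two-sided \(p\)-value can also be recovered from the \(t\) distribution with 58 residual degrees of freedom. The sketch below re-enters the rounded estimate and standard error from the Stock A regression output, so it is a cross-check under that rounding rather than part of the original analysis:

```r
estimate = 1.01206   # beta estimate for stock A, copied from the summary output
std.error = 0.25088  # its standard error, copied from the summary output
df = 58              # residual degrees of freedom

t.stat = (estimate - 1) / std.error
p.value = 2 * pt(-abs(t.stat), df)  # two-sided p-value
c(t = t.stat, p = p.value)
```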

The results can be visualized as a scatterplot, along with the regression line, which is called the security characteristic line for this stock:

ggplot(data = stock.simulation, aes(x = market_return, y = stock_A_excess_return)) + geom_point() + geom_abline(intercept = stock.A.model$coefficients['(Intercept)'], slope = stock.A.model$coefficients['market_return'], colour ="red")

Compare the above results to the other simulated values:

stock.B.model = lm(stock_B_excess_return ~ market_return, data = stock.simulation)
summary(stock.B.model)
## 
## Call:
## lm(formula = stock_B_excess_return ~ market_return, data = stock.simulation)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.60714 -0.25945 -0.00778  0.22219  0.84841 
## 
## Coefficients:
##               Estimate Std. Error t value    Pr(>|t|)    
## (Intercept)   -0.03532    0.04857  -0.727        0.47    
## market_return  1.51097    0.25666   5.887 0.000000209 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3318 on 58 degrees of freedom
## Multiple R-squared:  0.374,  Adjusted R-squared:  0.3632 
## F-statistic: 34.66 on 1 and 58 DF,  p-value: 0.0000002092

In this case, the estimate of \(\beta\) is much higher. Test for significance:

summary(glht(stock.B.model, "market_return = 1"))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lm(formula = stock_B_excess_return ~ market_return, data = stock.simulation)
## 
## Linear Hypotheses:
##                    Estimate Std. Error t value Pr(>|t|)  
## market_return == 1   1.5110     0.2567   1.991   0.0512 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

The moderate \(t\)-value suggests that there is some evidence that this stock is more sensitive than average to market swings, consistent with the way in which it was generated.

stock.C.model = lm(stock_C_excess_return ~ market_return, data = stock.simulation)
summary(stock.C.model)
## 
## Call:
## lm(formula = stock_C_excess_return ~ market_return, data = stock.simulation)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6883 -0.1583  0.0079  0.1952  0.5736 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)    0.01178    0.03961   0.297  0.76724   
## market_return  0.58050    0.20932   2.773  0.00745 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2706 on 58 degrees of freedom
## Multiple R-squared:  0.1171, Adjusted R-squared:  0.1019 
## F-statistic: 7.691 on 1 and 58 DF,  p-value: 0.00745
summary(glht(stock.C.model, "market_return = 1"))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lm(formula = stock_C_excess_return ~ market_return, data = stock.simulation)
## 
## Linear Hypotheses:
##                    Estimate Std. Error t value Pr(>|t|)  
## market_return == 1   0.5805     0.2093  -2.004   0.0497 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

In this case, the \(\beta\) estimate is significantly different from 0 and, marginally (\(p = 0.0497\)), from 1, as expected. The estimate for \(\alpha\) is not significant even though we generated this data using a small negative value of \(\alpha\).

stock.D.model = lm(stock_D_excess_return ~ market_return, data = stock.simulation)
summary(stock.D.model)
## 
## Call:
## lm(formula = stock_D_excess_return ~ market_return, data = stock.simulation)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.60281 -0.16584 -0.02363  0.17769  0.86462 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -0.008613   0.039693  -0.217    0.829
## market_return  0.294308   0.209739   1.403    0.166
## 
## Residual standard error: 0.2711 on 58 degrees of freedom
## Multiple R-squared:  0.03283,    Adjusted R-squared:  0.01616 
## F-statistic: 1.969 on 1 and 58 DF,  p-value: 0.1659
summary(glht(stock.D.model, "market_return = 1"))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lm(formula = stock_D_excess_return ~ market_return, data = stock.simulation)
## 
## Linear Hypotheses:
##                    Estimate Std. Error t value Pr(>|t|)   
## market_return == 1   0.2943     0.2097  -3.365  0.00136 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

In this case the \(\beta\) estimate is significantly different from 1 but not from 0 (\(p = 0.166\)), and at 0.29 it is well below the value of 0.5 used to generate the data.

stock.E.model = lm(stock_E_excess_return ~ market_return, data = stock.simulation)
summary(stock.E.model)
## 
## Call:
## lm(formula = stock_E_excess_return ~ market_return, data = stock.simulation)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8588 -0.1704  0.0247  0.1679  0.6204 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)    0.05307    0.04415   1.202  0.23428   
## market_return  0.77903    0.23331   3.339  0.00147 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3016 on 58 degrees of freedom
## Multiple R-squared:  0.1612, Adjusted R-squared:  0.1468 
## F-statistic: 11.15 on 1 and 58 DF,  p-value: 0.001474
summary(glht(stock.E.model, "market_return = 1"))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lm(formula = stock_E_excess_return ~ market_return, data = stock.simulation)
## 
## Linear Hypotheses:
##                    Estimate Std. Error t value Pr(>|t|)
## market_return == 1   0.7790     0.2333  -0.947    0.348
## (Adjusted p values reported -- single-step method)

In this case, the \(\beta\) estimate is not significantly different from 1, despite the use of a value of 0.5 for generating the data. As with Stock C, the \(\alpha\) is not significant, even though a positive \(\alpha\) was used to generate the data.

After performing a regression analysis, there may be a need to further adjust \(\beta\) for the following reasons:

One approach to address the problem of changes in \(\beta\) over time is to calculate \(\beta\) for different time periods, then construct a forecasting model based on these estimates.
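The first half of that approach, estimating \(\beta\) separately by time period, might look like the sketch below. The data are hypothetical, and the block length of 30 periods is an arbitrary assumption rather than something prescribed by the reading:

```r
set.seed(2)  # arbitrary seed; hypothetical data
returns = data.frame(period = 1:120, market = rnorm(120, mean = 0.06, sd = 0.15))
returns$stock = 1.2 * returns$market + rnorm(120, sd = 0.30)

# Assign each observation to one of four consecutive 30-period blocks
returns$block = ceiling(returns$period / 30)

# Estimate beta separately within each block
block.beta = sapply(split(returns, returns$block),
                    function(d) coef(lm(stock ~ market, data = d))[["market"]])
block.beta
```

The per-block estimates would then serve as the inputs to whatever forecasting model is chosen.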

Portfolio Optimization

When the single index model is used for portfolio optimization, with \(w_i\) denoting the weight of security \(i\), the expected excess return of the portfolio is \[ E(R_P) = \alpha_P + \beta_P E(R_M) \] where \[ \alpha_P = \sum_{1\leq i \leq n} w_i \alpha_i \] and \[ \beta_P = \sum_{1 \leq i \leq n} w_i \beta_i \] The variance is given by \[ \sigma_P^2 = \beta_P^2 \sigma_M^2 + \sigma^2(\epsilon_P) \] where \[ \sigma^2(\epsilon_P) = \sum_{1 \leq i \leq n} w_i^2 \sigma^2(\epsilon_i) \] As before, the Sharpe ratio is \[ S_P = \frac{E(R_P)}{\sigma_P} \] and the optimal portfolio maximizes this quantity subject to the constraint that the sum of the weights is 1.

When the single index model is used, the optimization process simplifies to an algorithm involving explicit algebraic formulas. The approach assumes that there is a subset \(A\) of securities for which a detailed analysis has been performed, and for which we have estimates for \(\alpha\), \(\beta\), and \(\sigma^2(\epsilon)\). This is referred to as the active portfolio. Along with these securities, we also include a market index portfolio \(M\), referred to as the passive portfolio. The general approach is to take advantage of any existing knowledge about the \(\alpha_i\) values, without losing out on the diversification potential provided by the securities for which limited information is known.

The approach is to construct an initial active portfolio, then determine a weighting of this portfolio with the market portfolio:

  1. Calculate initial weights for the active portfolio, \(w_i\), that are proportional to \(\frac{\alpha_i}{\sigma^2(\epsilon_i)}\) and sum to 1. (If short positions are prohibited, remove any securities with negative \(\alpha_i\) from the active portfolio.)

  2. Based on the above, determine \(\alpha_A\), \(\beta_A\), and \(\sigma^2(\epsilon_A)\) for the active portfolio.

  3. The initial position in the active portfolio is \[ w_A^0 = \frac{\alpha_A / \sigma^2(\epsilon_A)}{E(R_M) / \sigma^2_M} \]

  4. Adjust the initial weight of the active portfolio based on its \(\beta\) value: \[ w_A^* = \frac{w_A^0}{1 + (1 - \beta_A) w_A^0} \]

  5. The optimal weights for the risky portfolio are now \(w_M^* = 1 - w_A^*\) and \(w_i^* = w_A^* w_i\).

  6. The expected return of the optimal risky portfolio is now \[ E(R_P) = w_A^*\alpha_A + (w_M^* + w_A^* \beta_A) E(R_M) \]

  7. The variance of the optimal risky portfolio is now \[ \sigma_P^2 = (w_M^* + w_A^*\beta_A)^2 \sigma_M^2 + [w_A^* \sigma(\epsilon_A)]^2 \]

The resulting portfolio has a Sharpe ratio \(S_P\) that satisfies \[ S_P^2 = S_M^2 + \left[\frac{\alpha_A}{\sigma(\epsilon_A)}\right]^2 \] The quantity \(\alpha_A / \sigma(\epsilon_A)\) is called the information ratio, and is a measure of the extra return that can be obtained through security analysis relative to the security’s contribution of firm-specific risk. (Note that the information ratio is the ratio of \(\alpha\) to the residual standard deviation. In contrast, in the optimization algorithm, it is the ratio to residual variance that is used as weights.)
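The seven steps can be collected into a single function, sketched below in base R. The function name, its argument names, and the three-security inputs are all hypothetical, and the sketch allows short positions. The last line checks the Sharpe ratio identity numerically.

```r
# Sketch of the seven steps; all names here are hypothetical
optimal.risky.portfolio = function(alpha, beta, residual.var, market.premium, market.var) {
  w = (alpha / residual.var) / sum(alpha / residual.var)            # step 1
  alpha.A = sum(w * alpha)                                          # step 2
  beta.A = sum(w * beta)
  resid.var.A = sum(w^2 * residual.var)
  w0 = (alpha.A / resid.var.A) / (market.premium / market.var)      # step 3
  w.A = w0 / (1 + (1 - beta.A) * w0)                                # step 4
  w.M = 1 - w.A                                                     # step 5
  excess = w.A * alpha.A + (w.M + w.A * beta.A) * market.premium    # step 6
  variance = (w.M + w.A * beta.A)^2 * market.var + w.A^2 * resid.var.A  # step 7
  list(market.weight = w.M, security.weights = w.A * w,
       sharpe = excess / sqrt(variance),
       information.ratio = alpha.A / sqrt(resid.var.A))
}

# Made-up inputs for three analysed securities
result = optimal.risky.portfolio(alpha = c(0.02, -0.01, 0.015),
                                 beta = c(1.3, 0.8, 1.0),
                                 residual.var = c(0.09, 0.04, 0.0625),
                                 market.premium = 0.06, market.var = 0.15^2)
# Difference between S_P^2 and S_M^2 + IR^2; zero up to floating-point error
identity.gap = result$sharpe^2 - ((0.06 / 0.15)^2 + result$information.ratio^2)
identity.gap
```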

The above concepts can be illustrated using the following data:

BKM.8.data = read.csv("./Data/BKM_8.csv")
BKM.8.data
##        X SD_excess_return correlation_with_SP_500   alpha
## 1     HP           0.3817                    0.72  0.0150
## 2   DELL           0.2901                    0.58 -0.0100
## 3    WMT           0.1935                    0.43 -0.0050
## 4 TARGET           0.2611                    0.66  0.0075
## 5     BP           0.1822                    0.35  0.0120
## 6  SHELL           0.1988                    0.46  0.0025

Along with the above, we use the following assumptions about the S&P 500:

sp.500.expected.excess = 0.06
sp.500.sd.excess = 0.1358

First, determine the \(\beta\) for each stock in the active portfolio, the corresponding systematic risk, and the residual standard deviation:

BKM.8.data = BKM.8.data %>% mutate(beta = correlation_with_SP_500 * SD_excess_return / sp.500.sd.excess, systemic_sd = beta * sp.500.sd.excess, residual_sd = sqrt(SD_excess_return^2 - systemic_sd^2))
BKM.8.data
##        X SD_excess_return correlation_with_SP_500   alpha      beta
## 1     HP           0.3817                    0.72  0.0150 2.0237408
## 2   DELL           0.2901                    0.58 -0.0100 1.2390133
## 3    WMT           0.1935                    0.43 -0.0050 0.6127025
## 4 TARGET           0.2611                    0.66  0.0075 1.2689691
## 5     BP           0.1822                    0.35  0.0120 0.4695876
## 6  SHELL           0.1988                    0.46  0.0025 0.6734021
##   systemic_sd residual_sd
## 1    0.274824   0.2648899
## 2    0.168258   0.2363202
## 3    0.083205   0.1746974
## 4    0.172326   0.1961554
## 5    0.063770   0.1706758
## 6    0.091448   0.1765183

Determine the initial weights in the active portfolio:

BKM.8.data = BKM.8.data %>% mutate(weight_proportion = alpha / residual_sd^2)
weight.scale = sum(BKM.8.data$weight_proportion)
BKM.8.data = BKM.8.data %>% mutate(initial_weight = weight_proportion / weight.scale)
BKM.8.data %>% dplyr::select(X, alpha, systemic_sd, weight_proportion, initial_weight)
##        X   alpha systemic_sd weight_proportion initial_weight
## 1     HP  0.0150    0.274824         0.2137767      0.3831228
## 2   DELL -0.0100    0.168258        -0.1790598     -0.3209044
## 3    WMT -0.0050    0.083205        -0.1638314     -0.2936126
## 4 TARGET  0.0075    0.172326         0.1949218      0.3493317
## 5     BP  0.0120    0.063770         0.4119432      0.7382694
## 6  SHELL  0.0025    0.091448         0.0802344      0.1437931

Calculate the alpha, residual variance, and beta of the active portfolio:

alpha.active = sum(BKM.8.data$alpha * BKM.8.data$initial_weight)
alpha.active
## [1] 0.02226265
active.residual.var = sum(BKM.8.data$residual_sd^2 * BKM.8.data$initial_weight^2)
active.residual.var
## [1] 0.0398983
beta.active = sum(BKM.8.data$beta * BKM.8.data$initial_weight)
beta.active
## [1] 1.084643

Calculate the initial weight for the active portfolio:

initial.active.weight = (alpha.active / active.residual.var) / (sp.500.expected.excess / sp.500.sd.excess^2)
initial.active.weight
## [1] 0.1715026

Adjust the weight based on beta:

final.active.weight = initial.active.weight / (1 + (1 - beta.active) * initial.active.weight)
final.active.weight
## [1] 0.1740289

Determine the weight of each stock in the active portfolio:

BKM.8.data = BKM.8.data %>% mutate(final_weight = initial_weight * final.active.weight)
BKM.8.data %>% dplyr::select(X, initial_weight, final_weight)
##        X initial_weight final_weight
## 1     HP      0.3831228   0.06667444
## 2   DELL     -0.3209044  -0.05584665
## 3    WMT     -0.2936126  -0.05109708
## 4 TARGET      0.3493317   0.06079382
## 5     BP      0.7382694   0.12848023
## 6  SHELL      0.1437931   0.02502416

Determine the expected return of the optimum portfolio:

active.expected.excess = beta.active * sp.500.expected.excess + alpha.active
active.expected.excess
## [1] 0.08734124
optimum.expected.excess = final.active.weight * active.expected.excess + (1 - final.active.weight) * sp.500.expected.excess
optimum.expected.excess
## [1] 0.06475817

The standard deviation of the optimum portfolio is:

optimum.variance = ((1 - final.active.weight) + final.active.weight * beta.active)^2 * sp.500.sd.excess^2 + final.active.weight^2 * active.residual.var
optimum.sd = sqrt(optimum.variance)
optimum.sd
## [1] 0.1421172

The Sharpe ratio of the optimum portfolio is:

optimum.expected.excess / optimum.sd
## [1] 0.4556672

Compare this to the Sharpe Ratio of the S&P 500:

sp.500.expected.excess / sp.500.sd.excess
## [1] 0.4418262

Compare this to the Sharpe Ratio of the active portfolio:

active.expected.excess / sqrt(active.residual.var + beta.active^2 * sp.500.sd.excess^2)
## [1] 0.3519251

Notice that the Sharpe ratio of the combination of the active portfolio and the market index is superior to either.
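The Sharpe ratio of the optimum portfolio can also be cross-checked against the identity \(S_P^2 = S_M^2 + [\alpha_A / \sigma(\epsilon_A)]^2\). The sketch below re-enters the rounded values printed earlier in this example, so agreement is only expected up to rounding:

```r
# Values copied from the printed output of the worked example
alpha.A = 0.02226265         # alpha of the active portfolio
residual.var.A = 0.0398983   # residual variance of the active portfolio
market.sharpe = 0.06 / 0.1358

information.ratio = alpha.A / sqrt(residual.var.A)
implied.sharpe = sqrt(market.sharpe^2 + information.ratio^2)
implied.sharpe
```

The result should agree with the Sharpe ratio of roughly 0.4557 obtained from the expected excess return and standard deviation of the optimum portfolio.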

Tracking portfolios

The purpose of a tracking portfolio is to eliminate systematic risk from a portfolio, retaining only the firm-specific components. Suppose that \[ R_P = \alpha + \beta R_M + e_P \] where \(\alpha > 0\) and \(\beta > 1\). The objective is to construct a portfolio with the same \(\beta\) but no non-systematic risk. This can be done by taking a levered position in the market index; the tracking portfolio \(T\) will consist of

  • A position of \(1 - \beta\) in T-bills

  • A position of \(\beta\) in the market index

By construction, the portfolio \(T\) has \(\alpha_T = 0\) and \(\beta_T = \beta\). By taking a short position in \(T\) along with the original portfolio \(P\), the combined portfolio has an excess return of: \[ R_C = R_P - R_T = \alpha + \beta R_M + e_P - \beta R_M = \alpha + e_P \] Thus, the only remaining risk is the non-systematic component.
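This cancellation can be illustrated with simulated returns. The seed and all parameter values below (\(\alpha = 0.02\), \(\beta = 1.4\), and the return distributions) are hypothetical choices for the sketch:

```r
set.seed(3)  # arbitrary seed; all parameters below are hypothetical
alpha = 0.02
beta = 1.4
market.excess = rnorm(1000, mean = 0.06, sd = 0.15)
e.P = rnorm(1000, sd = 0.10)

R.P = alpha + beta * market.excess + e.P
# T-bills earn zero excess return, so T's excess return is beta times the market's
R.T = (1 - beta) * 0 + beta * market.excess
R.C = R.P - R.T

all.equal(R.C, alpha + e.P)   # combined return is alpha plus firm-specific noise
cor(R.C, market.excess)       # sample correlation with the market, close to zero
```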