• Introduction

Introduction

In this bonus practical I will show you how to do a stratification analysis.

The idea behind stratification is to study statistical properties of a series during different periods

An example of this could be to compare the performance of several stocks during periods where the benchmark index was volatile. Stratification would then involve:

  • stratifying periods according to some series

  • comparing the performance of the series during these periods.

Let’s do this simple example. We want to:

  • Identify months where the ZAR Spot experienced extreme price volatility.

    • Identify extreme as its own top & bottom quintile (20%) by SD.
  • Compare the performance of different local indexes during these periods, in particular which provided the best and worst performance, compared to overall performance.

    • NB to winsorize returns at 1% and 99% to filter out extreme returns (check results for yourself if we skip the winzorised step)…
library(rmsfuns)
pacman::p_load("tidyr", "tbl2xts","devtools","lubridate", "readr", "PerformanceAnalytics", "ggplot2", "dplyr")

Idxs <- 
  
  fmxdat::SA_Indexes %>% arrange(date) %>% 
  
  group_by(Tickers) %>% mutate(Return = Price / lag(Price)-1) %>% 
  
  ungroup() %>% 
  
  select(date, Tickers, Return) %>% filter(!is.na(Return)) %>% 
  
  mutate(YearMonth = format(date, "%Y%B"))

# Consider only indexes with data from before 20080101, and use this as a common start date too...:
# Can you argue why?

Idx_Cons <- 
  
  Idxs %>% group_by(Tickers) %>% filter(date == first(date)) %>% 
  
  ungroup() %>% filter(date < ymd(20080101)) %>% 
  
  pull(Tickers) %>% unique

Idxs <- 
  
  Idxs %>% 
  
  filter(Tickers %in% Idx_Cons) %>% 
  
  filter(date > ymd(20080101))

# Winzorising:

Idxs <-
  
  Idxs %>% group_by(Tickers) %>% 
  
  mutate(Top = quantile(Return, 0.99), Bot = quantile(Return, 0.01)) %>% 
  
  mutate(Return = ifelse(Return > Top, Top, 
                         
                         ifelse(Return < Bot, Bot, Return))) %>% ungroup()



zar <- 
  
  fmxdat::PCA_EX_Spots  %>% 
  
  filter(date > ymd(20080101)) %>% filter(Spots == "ZAR_Spot") %>% 
  
  select(-Spots)


ZARSD <- 
  
zar %>% 
  
  mutate(YearMonth = format(date, "%Y%B")) %>% 
  
  group_by(YearMonth) %>% summarise(SD = sd(Return)*sqrt(52)) %>% 
  
  # Top Decile Quantile overall (highly volatile month for ZAR:
  mutate(TopQtile = quantile(SD, 0.8),
         
         BotQtile = quantile(SD, 0.2))



Hi_Vol <- ZARSD %>% filter(SD > TopQtile) %>% pull(YearMonth)

Low_Vol <- ZARSD %>% filter(SD < BotQtile) %>% pull(YearMonth)


# Create generic function to compare performance:

Perf_comparisons <- function(Idxs, YMs, Alias){
  # For stepping through uncomment:
  # YMs <- Hi_Vol
  Unconditional_SD <- 
    
  Idxs %>% 
    
    group_by(Tickers) %>% 
    
    mutate(Full_SD = sd(Return) * sqrt(252)) %>% 
    
    filter(YearMonth %in% YMs) %>% 
    
    summarise(SD = sd(Return) * sqrt(252), across(.cols = starts_with("Full"), .fns = max)) %>% 
    
    arrange(desc(SD)) %>% mutate(Period = Alias) %>% 
    
    group_by(Tickers) %>% 
    
    mutate(Ratio = SD / Full_SD)
    
    Unconditional_SD
  
}

perf_hi <- Perf_comparisons(Idxs, YMs = Hi_Vol, Alias = "High_Vol")

perf_lo <- Perf_comparisons(Idxs, YMs = Low_Vol, Alias = "Low_Vol")
End

That’s it.

This is simply an illustration of how you can go about doing simple stratifications.

This can of course be made more involved & more robust - in particular doing more advanced statistical analyses for the different stratified periods.