Following on from last week's post, I will continue to go through the paper Regression models based on log-incremental payments by Stavros Christofides [1]. In the previous post I introduced the model from the first 15 pages up to section F. Today I will progress with sections G to K, which illustrate the model with a more realistic incremental claims payments triangle from a UK Motor Non-Comprehensive account:
# Page D5.17
tri <- t(matrix(
  c(3511, 3215, 2266, 1712, 1059, 587, 340,
    4001, 3702, 2278, 1180, 956, 629, NA,
    4355, 3932, 1946, 1522, 1238, NA, NA,
    4295, 3455, 2023, 1320, NA, NA, NA,
    4150, 3747, 2320, NA, NA, NA, NA,
    5102, 4548, NA, NA, NA, NA, NA,
    6283, NA, NA, NA, NA, NA, NA), ncol=7))
The rows show origin period data, e.g. accident years, underwriting years or years of account, and the columns present the development periods or lags. The triangle appears to be fairly well behaved. The payments for the last two years, in rows 6 and 7, appear to be slightly higher than those in rows 2 to 5, and the values in row 1 are lower than those of the later years. The last payment of £1,238 in the third row stands out a bit as well.
Before I plot the data, I will transform the triangle into a data frame and add extra columns:
m <- dim(tri)[1]; n <- dim(tri)[2]
dat <- data.frame(
  origin=rep(0:(m-1), n),
  dev=rep(0:(n-1), each=m),
  value=as.vector(tri))
## Add dimensions as factors
dat <- with(dat, data.frame(origin, dev, cal=origin+dev,
                            value, logvalue=log(value),
                            originf=factor(origin),
                            devf=as.factor(dev),
                            calf=as.factor(origin+dev)))
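Because as.vector() unrolls a matrix column by column, the origin index has to cycle fastest in the data frame. The following is a minimal, self-contained sketch of a sanity check on that ordering; the helper names cell_value and na_matches_cal are my own and not part of the paper or the code above.

```r
# Rebuild the triangle row by row (same values as above)
tri <- matrix(
  c(3511, 3215, 2266, 1712, 1059, 587, 340,
    4001, 3702, 2278, 1180,  956, 629,  NA,
    4355, 3932, 1946, 1522, 1238,  NA,  NA,
    4295, 3455, 2023, 1320,   NA,  NA,  NA,
    4150, 3747, 2320,   NA,   NA,  NA,  NA,
    5102, 4548,   NA,   NA,   NA,  NA,  NA,
    6283,   NA,   NA,   NA,   NA,  NA,  NA),
  ncol = 7, byrow = TRUE)
m <- nrow(tri); n <- ncol(tri)

dat <- data.frame(
  origin = rep(0:(m - 1), n),          # cycles fastest
  dev    = rep(0:(n - 1), each = m),   # constant within each column
  value  = as.vector(tri))             # column-major unrolling
dat$cal <- dat$origin + dat$dev

# Row 3, column 5 of the triangle is origin 2, development period 4
cell_value <- dat$value[dat$origin == 2 & dat$dev == 4]

# A cell is unobserved exactly when it lies beyond the latest calendar period
na_matches_cal <- all(is.na(dat$value) == (dat$cal > m - 1))

print(cell_value)
print(na_matches_cal)
```

If the two rep() calls were swapped, the first check would pick up a different cell, so this is a cheap way to catch a transposed triangle.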
I am particularly interested in the decay of claims payments in the development year direction for each origin year on the original and log-scale. The interaction.plot function of the stats package does an excellent job of this:
op <- par(mfrow=c(2,1), mar=c(4,4,2,2))
## Original scale
with(dat, interaction.plot(x.factor=dev, trace.factor=origin,
                           response=value))
points(dat$devf, dat$value, pch=16, cex=0.5)
## Log scale
with(dat, interaction.plot(x.factor=dev, trace.factor=origin,
                           response=logvalue))
points(dat$devf, dat$logvalue, pch=16, cex=0.5)
par(op)
Indeed, the origin years 1 to 4 (rows 2 to 5) look quite similar, and the decay of claims in the development year direction appears to be linear on the log-scale from development year 1 onwards.
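The visual impression of a linear decay can be backed up with a quick numerical check. This is a rough sketch of my own, not a step from the paper: it fits one intercept per origin year with a single common slope to the log-payments from development year 1 onwards, and compares that with separate slopes for those origin years that have at least three points. The names sub, common_slope and per_origin are illustrative.

```r
tri <- matrix(
  c(3511, 3215, 2266, 1712, 1059, 587, 340,
    4001, 3702, 2278, 1180,  956, 629,  NA,
    4355, 3932, 1946, 1522, 1238,  NA,  NA,
    4295, 3455, 2023, 1320,   NA,  NA,  NA,
    4150, 3747, 2320,   NA,   NA,  NA,  NA,
    5102, 4548,   NA,   NA,   NA,  NA,  NA,
    6283,   NA,   NA,   NA,   NA,  NA,  NA),
  ncol = 7, byrow = TRUE)
m <- nrow(tri); n <- ncol(tri)
dat <- data.frame(
  origin = rep(0:(m - 1), n),
  dev    = rep(0:(n - 1), each = m),
  value  = as.vector(tri))
dat$logvalue <- log(dat$value)

# Observed cells from development year 1 onwards
sub <- subset(dat, dev >= 1 & !is.na(value))

# One intercept per origin year, one common slope in the development direction
fit <- lm(logvalue ~ factor(origin) + dev, data = sub)
common_slope <- coef(fit)[["dev"]]

# Separate slope per origin year, where at least three points are available
per_origin <- sapply(split(sub, sub$origin), function(d) {
  if (nrow(d) < 3) return(NA_real_)
  coef(lm(logvalue ~ dev, data = d))[["dev"]]
})

print(common_slope)
print(round(per_origin, 3))
```

If the individual slopes scatter closely around the common one, a single slope parameter for development years 1 to 6 looks like a reasonable simplification.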
Based on those observations Christofides suggests two models; the first one will have a unique level for each origin year and a unique level for the zero development period. The parameters for development periods 1 to 6 are assumed to follow a linear relationship with the same slope \(s\):
\begin{align}
\ln(P_{ij}) & = Y_{ij} = a_i + d_j + \epsilon_{ij}
&\mbox{for } i,\,j \mbox{ from } 0 \mbox{ to } 6\\
\mbox{where } d_0 &= d,\quad d_j = s \cdot j
&\mbox{for } j > 0
\end{align}
and \(\epsilon_{ij} \sim N(0, \sigma^2)\). The second model will be a reduced version of the above with only two levels for the origin years 5 and 6. Hence, I add four more columns to my data frame:
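Those columns are not reproduced in this excerpt. As a minimal sketch of how the first model alone might be set up with lm, assume two hypothetical helper columns: d0, an indicator for development period 0, and s, which simply carries the development period so that its coefficient is the common slope. The column names and the formula are my assumptions, not necessarily those used in the paper or in the rest of the post.

```r
tri <- matrix(
  c(3511, 3215, 2266, 1712, 1059, 587, 340,
    4001, 3702, 2278, 1180,  956, 629,  NA,
    4355, 3932, 1946, 1522, 1238,  NA,  NA,
    4295, 3455, 2023, 1320,   NA,  NA,  NA,
    4150, 3747, 2320,   NA,   NA,  NA,  NA,
    5102, 4548,   NA,   NA,   NA,  NA,  NA,
    6283,   NA,   NA,   NA,   NA,  NA,  NA),
  ncol = 7, byrow = TRUE)
m <- nrow(tri); n <- ncol(tri)
dat <- data.frame(
  origin = rep(0:(m - 1), n),
  dev    = rep(0:(n - 1), each = m),
  value  = as.vector(tri))
dat$logvalue <- log(dat$value)
dat$originf  <- factor(dat$origin)

# Hypothetical helper columns (names are assumptions for this sketch)
dat$d0 <- as.numeric(dat$dev == 0)  # picks up the level d at development period 0
dat$s  <- dat$dev                   # equals j, so s * j vanishes at j = 0

# No global intercept: each origin year gets its own level a_i.
# lm() drops the unobserved (NA) cells by default.
fit1 <- lm(logvalue ~ 0 + originf + d0 + s, data = dat)

print(coef(fit1))            # a_0, ..., a_6, d and s
print(summary(fit1)$sigma)   # estimate of sigma
```

This gives nine parameters for 28 observed cells: seven origin levels, the level d for development period 0 and the slope s.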