I'm looking at some prevalence data at the moment, where we believe the rate of increase in the condition has slowed in the last few years. I've tried fitting splines to the data, but the resulting fits are rather curvy. Mathematical complexity tends to make public health researchers unduly nervous, so I thought I'd try something simpler - segmented regression, which fits straight lines joined at estimated breakpoints. The segmented package in R does a very nice job of this. I wanted to fit these segmented regressions to every group, for every subcategory of interest, so I wrote a lattice panel function:
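library(lattice)
library(segmented)

panel.segmented_lm <- function(x, y, groups, subscripts, ...)
{
  # Group labels for the points in this panel
  g <- as.factor(groups[subscripts])
  for (group in levels(g)) {
    # This group's points, sorted by x so the fitted line is drawn left to right
    in_group <- g == group
    ord <- order(x[in_group])
    x2 <- x[in_group][ord]
    y2 <- y[in_group][ord]
    if (length(x2) == 0 || length(y2) == 0)
      next
    lm_fit <- lm(y2 ~ x2)
    # If segmented regression fails, fall back to the simple linear regression
    segmented_fit <- tryCatch(
      segmented(lm_fit, seg.Z = ~x2, psi = list(x2 = c(mean(x2)))),
      error = function(e) {
        lm_fit
      })
    panel.lines(x2, predict(segmented_fit), col = "black")
  }
}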
There's substantial room for improvement. The colours should vary by group, and I don't think I should have to iterate through the groups manually, but it's a start!
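One way to tackle both of those points might be to hand the looping over to lattice's panel.superpose, which calls a panel.groups function once per group and passes it that group's line colour as col.line. What follows is only a rough sketch under that assumption: the names fitted_segmented_or_lm and panel.segmented_group are made up for illustration, and the data are simulated rather than anything real.

```r
library(lattice)
library(segmented)

# Hypothetical helper: fitted values from a one-breakpoint segmented fit,
# falling back to the plain linear fit if segmented() raises an error.
fitted_segmented_or_lm <- function(x, y) {
  d <- data.frame(x = x, y = y)
  lm_fit <- lm(y ~ x, data = d)
  fit <- tryCatch(
    segmented(lm_fit, seg.Z = ~x, psi = list(x = mean(x))),
    error = function(e) lm_fit
  )
  fitted(fit)
}

# Hypothetical panel.groups function: panel.superpose calls it once per
# group with that group's x, y and line colour.
panel.segmented_group <- function(x, y, col.line = "black", ...) {
  if (length(x) == 0)
    return(invisible(NULL))
  ord <- order(x)  # draw the line left to right
  panel.lines(x[ord], fitted_segmented_or_lm(x, y)[ord], col = col.line)
}

# Simulated data (not real): two groups whose slope flattens after 2002.
set.seed(1)
d <- expand.grid(year = 1990:2010, group = c("a", "b"))
d$value <- ifelse(d$year < 2002,
                  0.5 * (d$year - 1990),
                  6 + 0.1 * (d$year - 2002)) + rnorm(nrow(d), sd = 0.3)

p <- xyplot(value ~ year, data = d, groups = group,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)  # points, coloured by group
              panel.superpose(x, y, ..., panel.groups = panel.segmented_group)
            })
print(p)
```

Because the tryCatch fallback is silent, a group whose line comes out perfectly straight may simply be one where segmented() failed, so it is worth checking the fits separately before reading anything into the plot.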
Oh, and in case you're wondering why I'm being cagey about what I'm analysing and why, there's a good reason for that. Some people aren't going to like the results very much, so we have to have an unassailable case before we publish.