Monday, November 11, 2013
In the comments section of my previous post on cancer rates in the US states, my friend Aaron Blaisdell noted that: "…comparing states that are roughly comparable in terms of number of seniors per 1000 individuals, latitude appears to have the largest effect on rates of cancer."
Good point, so I collected data on the latitudes of US states, built a more complex model (with several multivariate controls), and analyzed it with WarpPLS 4.0.
The coefficient of association for the effect of latitude on cancer rates (path coefficient) turned out to be 0.35. Its P value was lower than 0.001, meaning that, if there were no real association, a coefficient at least this large would be expected to arise by chance less than a tenth of a percent of the time. Loosely speaking, we can be about 99.9 percent confident that this is not a false positive.
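To make the meaning of a P value more concrete, here is a minimal sketch of a permutation test for a simple bivariate correlation. This is not how WarpPLS calculates its P values, and the data below are synthetic rather than the state data used in this post; the function name and the numbers are only illustrative assumptions.

```python
import numpy as np


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided permutation P value for the Pearson correlation of x and y.

    Shuffling y breaks any real link with x, so the shuffled correlations
    show what "no association" looks like for this sample size.
    """
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(x, y)[0, 1])
    at_least_as_large = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(y)
        if abs(np.corrcoef(x, shuffled)[0, 1]) >= observed:
            at_least_as_large += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Synthetic example: 50 hypothetical "states" with a built-in association.
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)
print(f"permutation P value: {permutation_p_value(x, y):.4f}")
```

A small P value here says only that a correlation this strong is rare when the link between the two variables is broken by shuffling. It says nothing about which variable causes which.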
This was calculated controlling for: (a) the proportion of seniors in the population (population age); (b) the proportion of obese individuals in the population (obesity rates); and (c) the possible moderating effect of latitude on the effect of population age on cancer rates. The graph below shows this multivariate-adjusted association.
What is cool about a multivariate analysis is that you can control for certain effects. For example, since we are controlling for proportion of seniors in the population (population age), the fact that we have a state with a very low proportion of seniors (Alaska) does not tilt the effect toward that outlier as much as it would if we had not controlled for the proportion of seniors. This is a mathematical property that is difficult to grasp, but it is what makes multivariate adjustment such a powerful technique.
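The idea of controlling for other variables can be sketched with an ordinary linear regression on standardized variables. This is a simplified stand-in for the WarpPLS analysis, not a reproduction of it: the data are synthetic, the variable names are only labels, and the generating coefficients are assumptions chosen for illustration.

```python
import numpy as np


def standardize(values):
    """Convert a variable to z-scores (mean 0, standard deviation 1)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def standardized_coefficients(outcome, predictors):
    """Least-squares coefficients with every variable z-scored.

    Each coefficient is the association of one predictor with the outcome,
    holding the other predictors constant.
    """
    X = np.column_stack([standardize(p) for p in predictors])
    z = standardize(outcome)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    return beta


# Synthetic data for 50 hypothetical states (NOT the real state data).
rng = np.random.default_rng(0)
n = 50
latitude = rng.normal(size=n)
seniors = rng.normal(size=n)
obesity = rng.normal(size=n)
seniors[0] = -4.0  # one state with a very low proportion of seniors
cancer = (0.35 * latitude + 0.35 * seniors + 0.10 * obesity
          + rng.normal(scale=0.8, size=n))

# Moderating effect of latitude on the seniors -> cancer link,
# represented here as a product (interaction) term.
interaction = standardize(latitude) * standardize(seniors)

unadjusted = standardized_coefficients(cancer, [latitude])[0]
adjusted = standardized_coefficients(
    cancer, [latitude, seniors, obesity, interaction])[0]

print(f"latitude, no controls:   {unadjusted:.2f}")
print(f"latitude, with controls: {adjusted:.2f}")
```

With no controls, the standardized coefficient is simply the correlation between latitude and cancer rates. With controls, the part of the variation in cancer rates that is explained by the other variables, including the outlier state's low proportion of seniors, is accounted for before the latitude coefficient is estimated.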
I should note that the 99.9 percent confidence mentioned above refers to the coefficient of association. That is, we are quite confident that the coefficient of association is not zero; that is it. The P value does not support the hypothesized direction of causality (latitude -> cancer) or exclude the possibility of a major confounder causing the effect.
Nonetheless, among the newest features of WarpPLS 4.0 (still a beta version) are several causality assessment coefficients: path-correlation signs, R-squared contributions, path-correlation ratios, path-correlation differences, Warp2 bivariate causal direction ratios, Warp2 bivariate causal direction differences, Warp3 bivariate causal direction ratios, and Warp3 bivariate causal direction differences. Without going into a lot of technical detail, which you can get from the User Manual without even having to install the software, I can tell you that all of these causality assessment coefficients support the hypothesized direction of causality.
Also, while we cannot exclude the possibility of a major confounder causing the effect, we included two possible confounders in the analysis and controlled for their effects. They were the proportion of seniors in the population (population age) and the proportion of obese individuals in the population (obesity rates).
Having said all of the above, I should also say that the effect is similar in magnitude to the effect of population age on cancer rates, which I discussed in the previous post linked above. That is, it is not the type of effect that would be clearly noticeable in a person’s normal life.
Sunlight exposure? Maybe.
We do know that our body naturally produces as much as 10,000 IU of vitamin D from a few minutes of sun exposure when the sun is high. Getting that much vitamin D from dietary sources is very difficult, even after “fortification”.