Tuesday, September 28, 2010

Income, obesity, and heart disease in US states

The figure below combines data on median income by state (bottom-left and top-right) with a plot of heart disease death rates against the percentage of the population with a body mass index (BMI) greater than 30. The data are recent, and have been provided by CNN.com and creativeclass.com, respectively.


Heart disease deaths and obesity are strongly associated with each other, and both are inversely associated with median income. US states with lower median income tend to have higher rates of obesity and heart disease deaths.

The reasons are probably many, complex, and closely interconnected. Low income is usually associated with high rates of stress, depression, smoking, alcoholism, and poor nutrition. Compounding the problem, these are normally associated with consumption of cheap, addictive, highly refined foods.

Interestingly, this is primarily an urban phenomenon. If you were to use hunter-gatherers as your data sources, you would probably see the opposite relationship. For example, non-westernized hunter-gatherers have no income (at least not in the “normal” sense), but typically have a lower incidence of obesity and heart disease than mildly westernized ones. The latter have some income.

Tragically, the first few generations of fully westernized hunter-gatherers usually find themselves in the worst possible spot.

Wednesday, September 22, 2010

Low nonexercise activity thermogenesis: Uncooperative genes or comfy furniture?

The degree of nonexercise activity thermogenesis (NEAT) seems to be a major factor influencing the amount of fat gained or lost by an individual. It also seems to be strongly influenced by genetics, because NEAT is largely due to involuntary activities like fidgeting.

But why should this be?

The degree to which different individuals will develop diseases of civilization in response to consumption of refined carbohydrate-rich foods can also be seen as influenced by genetics. After all, there are many people who eat those foods and are thin and healthy, and that appears to be in part a family trait. But whether we consume those products or not is largely within our control.

So, it is quite possible that NEAT is influenced by genetics, but the fact that NEAT is low in so many people should be a red flag, just as the fact that so many people who eat refined carbohydrate-rich foods are obese should be a red flag. Moreover, modern isolated hunter-gatherers tend to have low levels of body fat. Given the importance of NEAT for body fat regulation, it is not unreasonable to assume that NEAT is elevated in hunter-gatherers, compared to modern urbanites. Hunter-gatherers live more like our Paleolithic ancestors than modern urbanites do.

True genetic diseases, caused by recent harmful mutations, are usually rare. If low NEAT were truly a genetic “disease”, those with low NEAT should be a small minority. That is not the case. It is more likely that the low NEAT that we see in modern urbanites is due to a maladaptation of our Stone Age body to modern life, in the same way that our Stone Age body is maladapted to the consumption of foods rich in refined grains and seeds.

What could have increased NEAT among our Paleolithic ancestors, and among modern isolated hunter-gatherers?

One thing that comes to mind is lack of comfortable furniture, particularly comfortable chairs (photo below from: prlog.org). It is quite possible that our Paleolithic ancestors invented some rudimentary forms of furniture, but they would have been much less comfortable than modern furniture used in most offices and homes. The padding of comfy office chairs is not very easy to replicate with stones, leaves, wood, or even animal hides. You need engineering to design it; you need industry to produce that kind of thing.


I have been doing a little experiment with myself, where I do things that force me to sit tall and stand while working in my office, instead of sitting back and “relaxing”. Things like putting a pillow on the chair so that I cannot rest my back on it, or placing my computer on an elevated surface so that I am forced to work while standing up. I tend to move a lot more when I do those things, and the movement is largely involuntary. These are small but constant movements, a bit like fidgeting. (It would be interesting to tape myself and actually quantify the amount of movement.)

It seems that one can induce an increase in NEAT, which is largely due to involuntary activities, by doing some voluntary things like placing a pillow on a chair or working while standing up.

Is it possible that the unnaturalness of comfy furniture, and particularly of comfy chairs, is contributing (together with other factors) to not only making us fat but also having low-back problems?

Both obesity and low-back problems are widespread among modern urbanites. Yet, from an evolutionary perspective, they should not be. They likely impaired survival success among our ancestors, and thus impaired their reproductive success. Evolution “gets angry” at these things; over time it wipes them out. In my reading of studies of hunter-gatherers, I don’t recall a single instance in which obesity and low-back problems were described as being widespread.

Friday, September 17, 2010

Strong causation can exist without any correlation: The strange case of the chain smokers, and a note about diet

Researchers like to study samples of data and look for associations between variables. Often those associations are represented in the form of correlation coefficients, which go from -1 to 1. Another popular measure of association is the path coefficient, which usually has a narrower range of variation. What many researchers seem to forget is that the associations they find depend heavily on the sample they are looking at, and on the ranges of variation of the variables being analyzed.
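For readers who want to see what a correlation coefficient actually computes, here is a minimal sketch in Python. The function name pearson and the small data lists are made up for illustration; they are not taken from any study discussed here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A variable that never changes cannot be correlated with anything
        raise ValueError("correlation is undefined when a variable does not vary")
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]                   # illustrative values only
print(pearson(x, [2, 4, 6, 8, 10]))   # 1.0: the two rise together perfectly
print(pearson(x, [10, 8, 6, 4, 2]))   # -1.0: one falls as the other rises
```

Note the check for zero variance: when one of the variables does not vary at all in the sample, the coefficient is not even defined.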

A forgotten warning: Causation without correlation

Often those who conduct multivariate statistical analyses on data are unaware of certain limitations. Many times this is due to lack of familiarity with statistical tests. One warning we do see a lot though is: Correlation does not imply causation. This is, of course, absolutely true. If you take my weight from 1 to 20 years of age, and the price of gasoline in the US during that period, you will find that they are highly correlated. But common sense tells me that there is no causation whatsoever between these two variables.

So correlation does not imply causation, all right, but there is another warning that is rarely seen: There can be strong causation without any correlation. Of course this can lead to even more bizarre conclusions than the “correlation does not imply causation” problem. If there is strong causation between variables B and Y, and it is not showing as a correlation, another variable A may “jump in” and “steal” that “unused correlation”, so to speak.

The chain smokers “study”

To illustrate this point, let us consider the following fictitious case, a study of “100 cities”. The study focuses on the effect of smoking and genes on lung cancer mortality. Smoking significantly increases the chances of dying from lung cancer; it is a very strong causative factor. Here are a few more details. Between 35 and 40 percent of the population are chain smokers. And there is a genotype (a set of genes), found in a small percentage of the population (around 7 percent), which is protective against lung cancer. All of those who are chain smokers die from lung cancer unless they die from other causes (e.g., accidents). Dying from other causes is a lot more common among those who have the protective genotype.

(I created this fictitious data with these associations in mind, using equations. I also added uncorrelated error into the equations, to make the data look a bit more realistic. For example, random deaths occurring early in life would reduce slightly any numeric association between chain smoking and cancer deaths in the sample of 100 cities.)

The table below shows part of the data, and gives an idea of the distribution of percentage of smokers (Smokers), percentage with the protective genotype (Pgenotype), and percentage of lung cancer deaths (MLCancer). (Click on it to enlarge. Use the "CTRL" and "+" keys to zoom in, and "CTRL" and "-" to zoom out.) Each row corresponds to a city. The rest of the data, up to row 100, has a similar distribution.


The graphs below show the distribution of lung cancer deaths against: (a) the percentage of smokers, at the top; and (b) the percentage with the protective genotype, at the bottom. Correlations are shown at the top of each graph. (They can vary from -1 to 1. The closer they are to -1 or 1, the stronger the association, negative or positive, between the variables.) The correlation between lung cancer deaths and percentage of smokers is slightly negative and statistically insignificant (-0.087). The correlation between lung cancer deaths and percentage with the protective genotype is negative, strong, and statistically significant (-0.613).


Even though smoking significantly increases the chances of dying from lung cancer, the correlations tell us otherwise. The correlations tell us that smoking does not seem to cause lung cancer deaths, and that having the protective genotype seems to significantly decrease cancer deaths. Why?

If there is no variation, there is no correlation

The reason is that the “researchers” collected data only about chain smokers. That is, the variable “Smokers” includes only chain smokers. If this were not a fictitious case, focusing the study on chain smokers could be seen as a clever strategy employed by researchers funded by tobacco companies. The researchers could say something like this: “We focused our analysis on those most likely to develop lung cancer.” Or, this could have been the result of plain stupidity when designing the research project.

By restricting their study to chain smokers the researchers dramatically reduced the variability in one particular variable: the extent to which the study participants smoked. Without variation, there can be no correlation. No matter what statistical test or software is used, no significant association will be found between lung cancer deaths and percentage of smokers based on this dataset. No matter what statistical test or software is used, a significant and strong association will be found between lung cancer deaths and percentage with the protective genotype.

Of course, this could lead to a very misleading conclusion: that smoking does not cause lung cancer, and that the real cause is genetic.
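The effect of restricting the range of a variable can be reproduced with a small simulation. What follows is only a sketch under assumptions of my own choosing: the linear equation, its coefficients (1.0 for smokers, -4.0 for the protective genotype), the noise level, and the function names are all hypothetical, and are not the equations used to build the 100-cities table. The outcome is a mortality index in arbitrary units. The same causal equation is used twice; the only thing that changes is how much the percentage of smokers varies across cities.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def simulate_cities(n_cities, smoker_range, seed):
    """Return (r_smokers, r_genotype) for one simulated set of cities.

    Hypothetical model (an assumption, not the post's own equations):
        index = 60 + 1.0 * smokers - 4.0 * genotype + noise
    """
    rng = random.Random(seed)
    smokers, genotype, deaths = [], [], []
    for _ in range(n_cities):
        s = rng.uniform(*smoker_range)  # percentage of smokers in the city
        g = rng.uniform(4.0, 10.0)      # percentage with the genotype (around 7)
        noise = rng.gauss(0.0, 8.0)     # uncorrelated error
        smokers.append(s)
        genotype.append(g)
        # The constant 60 only shifts the index; it does not affect correlations
        deaths.append(60.0 + 1.0 * s - 4.0 * g + noise)
    return pearson(smokers, deaths), pearson(genotype, deaths)

if __name__ == "__main__":
    cases = (("35 to 40 percent", (35.0, 40.0)), ("0 to 40 percent", (0.0, 40.0)))
    for label, smoker_range in cases:
        r_s, r_g = simulate_cities(100, smoker_range, seed=1)
        print(f"smokers {label}: r(smokers)={r_s:+.2f}, r(genotype)={r_g:+.2f}")
```

Under these assumed coefficients, the theoretical correlation between smokers and the mortality index is roughly 0.1 when the percentage of smokers is confined to 35 to 40, and roughly 0.7 when it ranges from 0 to 40, even though the causal effect of smoking is identical in both cases. Individual runs with 100 cities will scatter around those values.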

A note about diet

Consider the analogy between smoking and consumption of a particular food, and you will probably see what this means for the analysis of observational data regarding dietary choices and disease. This applies to almost any observational study, including the China Study. (Studies employing experimental control manipulations would presumably ensure enough variation in the variables studied.) In the China Study, data from dozens of counties were collected. One may find a significant association between consumption of food A and disease Y.

There may be a much stronger association between food B and disease Y, but that association may not show up in statistical analyses at all, simply because there is little variation in the data regarding consumption of food B. For example, all those sampled may have eaten food B; about the same amount. Or none. Or somewhere in between, within a rather small range of variation.
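A short derivation makes the same point. Assume, purely for illustration, that disease Y depends linearly on consumption of food B plus an error term that is uncorrelated with B; the symbols below are my own notation, not taken from any of the studies mentioned.

```latex
\begin{align}
Y &= \beta B + \varepsilon, \qquad \operatorname{Cov}(B, \varepsilon) = 0 \\
\operatorname{Cov}(B, Y) &= \beta \sigma_B^2, \qquad
\operatorname{Var}(Y) = \beta^2 \sigma_B^2 + \sigma_\varepsilon^2 \\
\operatorname{Corr}(B, Y) &= \frac{\beta \sigma_B^2}{\sigma_B \sqrt{\beta^2 \sigma_B^2 + \sigma_\varepsilon^2}}
 = \frac{\beta \sigma_B}{\sqrt{\beta^2 \sigma_B^2 + \sigma_\varepsilon^2}} \\
\lim_{\sigma_B \to 0} \operatorname{Corr}(B, Y) &= 0 \quad \text{for any fixed } \beta \text{ and } \sigma_\varepsilon > 0
\end{align}
```

In words: no matter how large the causal effect (beta) is, the correlation shrinks toward zero as the variation in consumption of food B (sigma B) shrinks within the sample.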

Statistical illiteracy, bad choices, and taxation

Statistics is a “necessary evil”. It is useful to go from small samples to large ones when we study any possible causal association. By doing so, one can find out whether an observed effect really applies to a larger percentage of the population, or is actually restricted to a small group of individuals. The problem is that we humans are very bad at inferring actual associations from simply looking at large tables with numbers. We need statistical tests for that.

However, ignorance about basic statistical phenomena, such as the one described here, can be costly. A group of people may eliminate food A from their diet based on coefficients of association resulting from what seem to be very clever analyses, replacing it with food B. The problem is that food B may be equally harmful, or even more harmful. And that effect may not show up in statistical analyses unless the data have enough variation in the consumption of food B.

Readers of this blog may wonder why we explicitly use terms like “suggests” when we refer to a relationship that is suggested by a significant coefficient of association (e.g., a linear correlation). This is why, among other reasons.

One does not have to be a mathematician to understand basic statistical concepts. And doing so can be very helpful in one’s life in general, not only in diet and lifestyle decisions. Even in simple choices, such as what to bet on. We are always betting on something. For example, any investment is essentially a bet. Some outcomes are much more probable than others.

Once I had an interesting conversation with a high-level officer of a state government. I was part of a consulting team working on an information technology project. We were talking about the state lottery, which was a big source of revenue for the state, comparing it with state taxes. He told me something to this effect:

Our lottery is essentially a tax on the statistically illiterate.

Thursday, September 2, 2010

How to lose fat and gain muscle at the same time? Strength training plus a mild caloric deficit

Ballor et al. (1996) conducted a classic and interesting study on body composition changes induced by aerobic and strength training. This study gets cited a lot, but apparently for the wrong reasons. One of these reasons can be gleaned from this sentence in the abstract:

    “During the exercise training period, the aerobic training group … had a significant … reduction in body weight … as compared with the [strength] training group …”

That is, one of the key conclusions of this study was that aerobic training was more effective than strength training as far as weight loss is concerned. (The authors refer to the strength training group as the “weight training group”.)

Prior to starting the exercise programs, the 18 participants had lost a significant amount of weight through dieting, for a period of 11 weeks. The authors do not provide details on the diet, other than that it was based on “healthy” food choices. What this means exactly I am not sure, but my guess is that it was probably not particularly high or low in carbs/fat, included a reasonable amount of protein, and led to a caloric deficit.

The participants were older adults (mean age of 61; range, 56 to 70), who were also obese (mean body fat of 45 percent), but otherwise healthy. They managed to lose an average of 9 kg (about 20 lbs) during that 11-week period.

Following the weight loss period, the participants were randomly assigned to either a 12-week aerobic training (four men, five women) or weight training (four men, five women) exercise program. They exercised 3 days per week. These were whole-body workouts, with emphasis on compound (i.e., multiple-muscle) exercises. The figure below shows what actually happened with the participants.


As you can see, the strength training group (WT) gained about 1.5 kg of lean mass, lost 1.2 kg of fat, and thus gained some weight. The aerobic training group (AT) lost about 0.6 kg of lean mass and 1.8 kg of fat, and thus lost some weight.
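The net weight changes follow from simple addition. The sketch below uses only the approximate figures quoted above, and assumes that the change in body weight equals the change in lean mass plus the change in fat mass; the function name is made up for illustration.

```python
# Approximate changes in kg as quoted in this post: (lean mass, fat mass)
changes = {"WT": (+1.5, -1.2), "AT": (-0.6, -1.8)}

def net_weight_change(lean_kg, fat_kg):
    """Net weight change, assuming weight change = lean change + fat change."""
    return round(lean_kg + fat_kg, 1)

for group, (lean, fat) in changes.items():
    print(group, net_weight_change(lean, fat))  # WT 0.3, then AT -2.4
```

So the strength training group ended up about 0.3 kg heavier, and the aerobic training group about 2.4 kg lighter.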

Which group fared better? In terms of body composition changes, clearly the strength training group fared better. But my guess is that the participants in the strength training group did not like seeing their weight going up after losing a significant amount of weight through dieting. (An analysis of the possible psychological effects of this would be interesting; a discussion for another blog post.)

The changes in the aerobic training group were predictable, and were the result of compensatory adaptation. Their bodies changed to become better adapted to aerobic exercise, for which a lot of lean mass is a burden, as is a lot of fat mass.

So, essentially the participants in the strength training group lost fat and gained muscle at the same time. The authors say that the participants generally stuck with their weight-loss diet during the 12-week exercise period, but not in a very strict way. It is reasonable to conclude that this induced a mild caloric deficit in the participants.

Exercise probably induced hunger, and possibly a caloric surplus on exercise days. If that happened, the caloric deficit must have occurred on non-exercise days. Without some caloric deficit there would not have been fat loss, as extra calories are stored as fat.

There are many self-help books and programs online whose main claim is to have a “revolutionary” prescription for concurrent fat loss and muscle gain – the “holy grail” of body composition change.

Well, it may be as simple as combining strength training with a mild caloric deficit, in the context of a nutritious diet focused on unprocessed foods.

Reference:

Ballor, D.L., Harvey-Berino, J.R., Ades, P.A., Cryan, J., & Calles-Escandon, J. (1996). Contrasting effects of resistance and aerobic training on body composition and metabolism after diet-induced weight loss. Metabolism, 45(2), 179-183.