

When neither variable clearly depends on the other, it often does not matter which scale is put on which axis of the scatter diagram. However, if the intention is to make inferences about one variable from the other, the observations from which the inferences are to be made are usually put on the baseline (the horizontal axis).

As an example of association without causation, a plot of monthly deaths from heart disease against monthly sales of ice cream would show a negative association. However, it is hardly likely that eating ice cream protects from heart disease! It is simply that the mortality rate from heart disease is inversely related, and ice cream consumption positively related, to a third factor, namely environmental temperature.

Regression lines give us useful information about the data from which they are calculated. They show how one variable changes on average with another, and they can be used to find out what one variable is likely to be when we know the other, provided that we ask this question within the limits of the scatter diagram. To project the line at either end (to extrapolate) is always risky because the relationship between x and y may change or some kind of cut-off point may exist. For instance, a regression line might be drawn relating the chronological age of some children to their bone age, and it might be a straight line between, say, the ages of 5 and 10 years, but to project it up to the age of 30 would clearly lead to error. Computer packages will often produce the intercept from a regression equation, with no warning that it may be totally meaningless. Consider a regression of blood pressure against age in middle-aged men: the intercept is the predicted blood pressure at age zero, far outside the range of the data.
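The bone age example can be made concrete with a small simulation. This is only a sketch: the `bone_age` function, its cut-off at 18 years, the noise level, and the sample of 30 children are all assumptions invented for illustration, not data from the text. A straight line is fitted to ages 5 to 10 and then asked about age 8 (inside the scatter) and age 30 (far outside it).

```python
import numpy as np


def bone_age(chronological_age):
    """Hypothetical 'true' relationship, assumed for illustration only:
    bone age tracks chronological age until a cut-off at 18 years."""
    return np.minimum(chronological_age, 18.0)


rng = np.random.default_rng(0)

# Simulated children aged 5 to 10, with a little measurement noise.
ages = np.linspace(5.0, 10.0, 30)
observed = bone_age(ages) + rng.normal(0.0, 0.3, size=ages.size)

# Ordinary least squares line; for deg=1 polyfit returns the slope first.
slope, intercept = np.polyfit(ages, observed, deg=1)


def predict(age):
    """Predicted bone age from the fitted straight line."""
    return intercept + slope * age


for age in (8.0, 30.0):
    truth = float(bone_age(age))
    print(f"age {age:4.1f}: fitted {predict(age):5.1f}, assumed truth {truth:5.1f}")
```

Within the observed range the line tracks the assumed relationship closely. At age 30 it keeps climbing, because nothing in the data between 5 and 10 tells it that a cut-off exists.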
It is reasonable to think of the height of children as dependent on age rather than the converse. But consider a positive correlation between mean tar yield and nicotine yield of certain brands of cigarette. The nicotine liberated is unlikely to have its origin in the tar: both vary in parallel with some other factor or factors in the composition of the cigarettes. The yield of the one does not seem to be “dependent” on the other in the sense that, on average, the height of a child depends on his age. Confusion over which variable is “dependent” is a triumph of common sense over misleading terminology, because often each variable is dependent on some third variable, which may or may not be mentioned.
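A minimal simulation, under invented assumptions, shows how two yields can be strongly correlated although neither depends on the other. The variable `common` stands in for the unnamed "other factor" in the composition of the cigarettes. Its distribution, the coefficients, and the noise levels are hypothetical values chosen only to make the point.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical shared factor; tar and nicotine each depend on it,
# but neither is computed from the other.
common = rng.normal(size=n)
tar = 10.0 + 3.0 * common + rng.normal(scale=1.0, size=n)
nicotine = 1.0 + 0.2 * common + rng.normal(scale=0.1, size=n)


def residuals(y, x):
    """What is left of y after removing its straight-line dependence on x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (intercept + slope * x)


# Correlation as it would appear on a scatter diagram of the two yields.
raw_r = np.corrcoef(tar, nicotine)[0, 1]

# Correlation once the shared factor has been removed from both.
adjusted_r = np.corrcoef(residuals(tar, common),
                         residuals(nicotine, common))[0, 1]

print(f"raw correlation:      {raw_r:.2f}")
print(f"after removing common: {adjusted_r:.2f}")
```

The raw correlation is high, yet it all but disappears once the shared factor is accounted for. In practice the third variable is often unmeasured, which is exactly why the terminology misleads.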

The words “independent” and “dependent” could puzzle the beginner because it is sometimes not clear what is dependent on what.

In Exploring Data Tables, Trends, and Shapes, ed. New York: Wiley.

As Steve said in 2011, there are now better modern methods available in Stata. For example, search for the work of Vincent Verardi.

rreg is just one flavour of robust regression. I talk to robustniks and look at their literature, and it's pretty clear that rreg is way behind the state of the art. I wrote the following in 2011; see also the back and forth in that thread, including an endorsement by Steve Samuels.

The help file has it right: -rreg- is "one version of robust regression". When -rreg- was written the method seemed a good all-round flavour of robust regression, but it is doubtful whether it now looks like _the_ method of choice to anyone in 2011. If you ever used -rreg- for real, you'd be obliged to explain it and defend the choice in any serious forum.

"I used robust regression" means virtually nothing. There are probably hundreds of ways to do robust regression (quite apart from what robustness means). "I used -rreg- as implemented in Stata" counts for little outside this community. "I used robust regression as codified by Li (1985)" obliges you to explain why you didn't use something more recent (to fad- and fashion-followers) or something else that someone else fancies for some reason of their own. The literature would keep you busy indefinitely.

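To see why "I used robust regression" says so little, here is a rough sketch in Python (not Stata) of two of the many possible flavours: iteratively reweighted least squares with Huber weights and with Tukey biweight weights. This is not -rreg- and does not try to reproduce it. The simulated data, the MAD-based scale, the tuning constants 1.345 and 4.685, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np


def weighted_fit(X, y, w):
    """Weighted least squares, done by scaling rows with sqrt(w)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef


def huber_weights(u, c=1.345):
    """Full weight for small scaled residuals, c/|u| beyond the cut-off."""
    a = np.abs(u)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))


def biweight_weights(u, c=4.685):
    """Smoothly decreasing weight that reaches zero at |u| >= c."""
    return np.where(np.abs(u) < c, (1.0 - (u / c) ** 2) ** 2, 0.0)


def irls(x, y, weight_fn, n_iter=50):
    """Iteratively reweighted least squares for y = intercept + slope * x."""
    X = np.column_stack([np.ones_like(x), x])
    coef = weighted_fit(X, y, np.ones_like(y))  # start from ordinary least squares
    for _ in range(n_iter):
        resid = y - X @ coef
        # Residual scale from the median absolute deviation (an assumed choice).
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        coef = weighted_fit(X, y, weight_fn(resid / max(scale, 1e-12)))
    return coef  # array: [intercept, slope]


# Simulated data: true line y = 1 + 2x, with four gross outliers at high x.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 40)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=x.size)
y[-4:] += 30.0

X = np.column_stack([np.ones_like(x), x])
ols = weighted_fit(X, y, np.ones_like(y))
huber = irls(x, y, huber_weights)
biweight = irls(x, y, biweight_weights)

for name, coef in (("OLS", ols), ("Huber", huber), ("biweight", biweight)):
    print(f"{name:9s} intercept {coef[0]:6.2f}  slope {coef[1]:5.2f}")
```

Both robust fits resist the outliers far better than ordinary least squares, but they need not agree with each other. Changing the weight function, the scale estimate, or the tuning constant gives yet another "robust regression", which is the reason a write-up has to name the method and defend it.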