level 13
When a binary outcome variable is modeled using logistic regression, it is assumed that the logit transformation of the probability of the outcome has a linear relationship with the predictor variables. This makes the interpretation of the regression coefficients somewhat tricky. On this page, we will walk through the concept of the odds ratio and use it to interpret logistic regression results in a couple of examples.
2018-02-23 14:02
Everything starts with the concept of probability. Let's say that the probability of success of some event is .8. Then the probability of failure is 1 - .8 = .2. The odds of success are defined as the ratio of the probability of success to the probability of failure. In our example, the odds of success are .8/.2 = 4. That is to say that the odds of success are 4 to 1. If the probability of success is .5, i.e., a 50-50 chance, then the odds of success are 1 to 1.
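The definition above can be checked with a few lines of Python. This is only a minimal sketch; the function name `odds` is an illustrative choice, not something from the original page.

```python
def odds(p):
    """Odds of success for a probability p strictly between 0 and 1."""
    return p / (1 - p)

# The two cases discussed in the text: p = .8 and p = .5.
# Floating-point arithmetic may show a tiny rounding error for p = .8.
print(odds(0.8))
print(odds(0.5))
```

The function is undefined at p = 1 (division by zero), which matches the fact that the odds grow without bound as p approaches 1.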
The transformation from odds to log of odds is the log transformation. Like the transformation from probability to odds, it is monotonic. That is to say, the greater the odds, the greater the log of odds, and vice versa. The table below shows the relationship among the probability, odds and log of odds. We have also shown the plot of log odds against odds.
p        odds        logodds
.001     .001001     -6.906755
.01      .010101     -4.59512
.15      .1764706    -1.734601
.2       .25         -1.386294
.25      .3333333    -1.098612
.3       .4285714    -.8472978
.35      .5384616    -.6190392
.4       .6666667    -.4054651
.45      .8181818    -.2006707
.5       1           0
.55      1.222222    .2006707
.6       1.5         .4054651
.65      1.857143    .6190392
.7       2.333333    .8472978
.75      3           1.098612
.8       4           1.386294
.85      5.666667    1.734601
.9       9           2.197225
.999     999         6.906755
.9999    9999        9.21024
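A few rows of the table can be recomputed as a quick check. The sketch below uses natural logarithms and only a subset of the probabilities; the helper name `log_odds` is an assumption made for illustration.

```python
import math

def log_odds(p):
    """Log odds (logit) of a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))

# A few of the probabilities listed in the table.
probabilities = [0.001, 0.01, 0.15, 0.5, 0.8, 0.999]

rows = [(p, p / (1 - p), log_odds(p)) for p in probabilities]
for p, o, lo in rows:
    print(f"{p:<8g}{o:<14.7g}{lo:.6f}")

# Monotonicity: as p increases, odds and log odds increase with it.
odds_column = [o for _, o, _ in rows]
log_odds_column = [lo for _, _, lo in rows]
assert odds_column == sorted(odds_column)
assert log_odds_column == sorted(log_odds_column)
```

The two assertions at the end are a small check of the monotonicity claim for these particular values, not a general proof.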
Why do we go to all the trouble of transforming probability to log odds? One reason is that it is usually difficult to model a variable that has a restricted range, such as a probability. This transformation is an attempt to get around the restricted range problem. It maps probability ranging between 0 and 1 to log odds ranging from negative infinity to positive infinity.
Another reason is that among all of the infinitely many choices of transformation, the log of odds is one of the easiest to understand and interpret. This transformation is called the logit transformation. The other common choice is the probit transformation, which will not be covered here.
A logistic regression model allows us to establish a relationship between a binary outcome variable and a group of predictor variables. It models the logit-transformed probability as a linear function of the predictor variables. More formally, let y be the binary outcome variable indicating failure/success with 0/1, and let p be the probability that y equals 1, p = prob(y=1). Let x1, …, xk be a set of predictor variables. Then the logistic regression of y on x1, …, xk estimates the parameter values β0, β1, …, βk of the following equation by maximum likelihood.
logit(p) = log(p/(1-p))= β0 + β1*x1 + … + βk*xk
In terms of probabilities, the equation above is translated into
p= exp(β0 + β1*x1 + … + βk*xk)/(1+exp(β0 + β1*x1 + … + βk*xk)).
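The two equations above are inverses of each other, which a short Python sketch can illustrate. The function names and the coefficient values in the last line are hypothetical, chosen only to exercise the formulas; they are not estimates from any model on this page.

```python
import math

def logit(p):
    """log(p / (1 - p)) for a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))

def inv_logit(eta):
    """exp(eta) / (1 + exp(eta)): maps a linear predictor back to a probability."""
    return math.exp(eta) / (1 + math.exp(eta))

def predicted_probability(beta0, betas, xs):
    """Probability implied by logit(p) = beta0 + beta1*x1 + ... + betak*xk."""
    eta = beta0 + sum(b * x for b, x in zip(betas, xs))
    return inv_logit(eta)

# Round trip: applying inv_logit to logit(p) recovers p (up to rounding).
print(inv_logit(logit(0.3)))

# Hypothetical coefficients: beta0 = -1.0, beta1 = 0.5, evaluated at x1 = 2.0.
# The linear predictor is 0, so the implied probability is .5.
print(predicted_probability(-1.0, [0.5], [2.0]))
```

This version of `inv_logit` follows the formula in the text literally; for very large positive values of the linear predictor it would overflow, so a production implementation would likely use a more careful form.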
Let’s start with the simplest logistic regression, a model without any predictor variables. In an equation, we are modeling
logit(p)= β0
Logistic regression                               Number of obs   =        200
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood = -111.35502                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
         hon |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   intercept |   -1.12546   .1644101    -6.85   0.000    -1.447697   -.8032217
------------------------------------------------------------------------------
This means log(p/(1-p)) = -1.12546. What is p here? It turns out that p is the overall probability of being in the honors class (hon = 1). Let's take a look at the frequency table for hon.
hon | Freq. Percent Cum.
------------+-----------------------------------
0 | 151 75.50 75.50
1 | 49 24.50 100.00
------------+-----------------------------------
Total | 200 100.00
So p = 49/200 = .245. The odds are .245/(1-.245) = .3245 and the log of the odds (logit) is log(.3245) = -1.12546. In other words, the intercept from the model with no predictor variables is the estimated log odds of being in honors class for the whole population of interest. We can also transform the log of the odds back to a probability: p = exp(-1.12546)/(1+exp(-1.12546)) = .245, if we like.
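The arithmetic in this paragraph can be reproduced directly from the frequency table. The sketch below uses only the counts 49 and 200 shown above; the variable names are illustrative.

```python
import math

# Counts from the frequency table for hon.
n_honors = 49
n_total = 200

p = n_honors / n_total       # overall proportion in the honors class
odds = p / (1 - p)           # odds of being in the honors class
log_odds = math.log(odds)    # compare with the intercept, -1.12546

# Transform the log odds back to a probability.
p_back = math.exp(log_odds) / (1 + math.exp(log_odds))

print(round(p, 3), round(odds, 4), round(log_odds, 5), round(p_back, 3))
```

Rounded to the precision used in the text, the values agree with .245, .3245 and -1.12546.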
Logistic regression with a single dichotomous predictor variable
Now let’s go one step further by adding a binary predictor variable, female, to the model. Writing it in an equation, the model describes the following linear relationship.
logit(p) = β0 + β1*female
Logistic regression                               Number of obs   =        200
                                                  LR chi2(1)      =       3.10
                                                  Prob > chi2     =     0.0781
Log likelihood = -109.80312                       Pseudo R2       =     0.0139

------------------------------------------------------------------------------
         hon |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .5927822   .3414294     1.74   0.083    -.0764072    1.261972
   intercept |  -1.470852   .2689555    -5.47   0.000    -1.997995   -.9437087
------------------------------------------------------------------------------
Before trying to interpret the two parameters estimated above, let’s take a look at the crosstab of the variable hon with female.
           |        female
       hon |      male     female |     Total
-----------+----------------------+----------
         0 |        74         77 |       151
         1 |        17         32 |        49
-----------+----------------------+----------
     Total |        91        109 |       200
In our dataset, what are the odds of a male being in the honors class, and what are the odds of a female being in the honors class? We can calculate these odds by hand from the table: for males, the odds of being in the honors class are (17/91)/(74/91) = 17/74 = .23; for females, the odds of being in the honors class are (32/109)/(77/109) = 32/77 = .42. The ratio of the odds for females to the odds for males is (32/77)/(17/74) = (32*74)/(77*17) = 1.809. So the odds for males are 17 to 74, the odds for females are 32 to 77, and the odds for females are about 81% higher than the odds for males.
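The hand calculation can be repeated in Python using only the four cell counts from the crosstab. The dictionary layout and variable names below are assumptions made for this sketch. Taking logs of the results is one way to compare them with the coefficients in the Stata output above.

```python
import math

# Cell counts from the crosstab of hon by female.
counts = {
    "male":   {"hon0": 74, "hon1": 17},
    "female": {"hon0": 77, "hon1": 32},
}

# Odds of being in the honors class within each group.
odds_male = counts["male"]["hon1"] / counts["male"]["hon0"]        # 17/74
odds_female = counts["female"]["hon1"] / counts["female"]["hon0"]  # 32/77

# Ratio of the odds for females to the odds for males.
odds_ratio = odds_female / odds_male

print(round(odds_male, 2), round(odds_female, 2), round(odds_ratio, 3))

# On the log scale, for comparison with the reported coefficients
# (intercept -1.470852, female .5927822).
print(round(math.log(odds_male), 6), round(math.log(odds_ratio), 6))
```

For these counts, the log of the odds for males and the log of the odds ratio agree with the reported intercept and female coefficient to the precision shown.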