StatTools : Logistic Regression Explained and Code Fragments
Logistic regression is an extension of linear regression, where the outcome is the probability of a binomial (0/1) variable.

The general formula is z = const + b1x1 + b2x2 + b3x3 + ... etc, where x1, x2, x3 and so on are independent variables, either binary (0/1) or ordinal (0, 1, 2, 3 ... etc). The product of the coefficient and the group designation for each independent variable is the log odds ratio of that group to the reference group, which is designated 0.

After obtaining z, the probability of the outcome is calculated by y = 1 / (1 + exp(-z))

Different statistical packages have different default approaches to handling independent variables in logistic regression.

Where an independent variable is binary (0/1), the coefficient is the log odds ratio of group 1 to group 0.

Where an independent variable has more than 2 groups, e.g. 3 groups of 0, 1, and 2, there are two options.

The first option is to treat the 3 groups together as a single numerical variable, so that the log odds ratio of each group to group 0 is the product of the group designation and the coefficient: log odds ratio Grp 1 / Grp 0 = b, Grp 2 / Grp 0 = 2b.

The second option is to transform the groups into binary dummy variables, the number of dummy variables being the number of groups - 1. In the case of 3 groups, two dummy variables d1 and d2 are created, so that d1=0 and d2=0 for group 0, d1=1 and d2=0 for group 1, and d1=0 and d2=1 for group 2. The coefficient of each dummy variable is then the log odds ratio of its group to group 0.

Both options for handling independent variables with multiple groups are available in R, with the following conventions:

Where groups are represented numerically, such as 0, 1, 2, 3..., R treats the variable as a single numerical variable with one coefficient (the first option).

Where groups are represented by names in text, such as one, two, three, R converts the groups into the appropriate number of dummy variables (the second option).
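
As a rough illustration of the two options for a variable with groups 0, 1 and 2, the sketch below works through the arithmetic in plain Python rather than R. It shows only the calculation of z and the resulting probability, not how any package fits the model. The coefficient values and the helper names logistic and dummy_code are assumptions made up for this example.

```python
import math


def logistic(z):
    """Convert the linear predictor z (the log odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def dummy_code(group, n_groups):
    """Return n_groups - 1 binary dummy variables for a group code.

    Group 0 is the reference group, so all of its dummies are 0.
    """
    return [1 if group == k else 0 for k in range(1, n_groups)]


# Hypothetical coefficients, chosen only for illustration (not fitted to data).
const = -1.0
b = 0.5                # option 1: one coefficient for the numerical group code
b_d1, b_d2 = 0.4, 1.3  # option 2: one coefficient per dummy variable

for group in (0, 1, 2):
    # Option 1: the log odds ratio to group 0 is group * b, i.e. 0, b, 2b.
    z_numeric = const + b * group

    # Option 2: each non-reference group has its own log odds ratio to group 0.
    d1, d2 = dummy_code(group, 3)
    z_dummy = const + b_d1 * d1 + b_d2 * d2

    print(
        "group", group,
        "dummies", (d1, d2),
        "p(option 1) =", round(logistic(z_numeric), 3),
        "p(option 2) =", round(logistic(z_dummy), 3),
    )
```

Under option 1 the log odds ratios of groups 1 and 2 are forced to be b and 2b, whereas under option 2 they are estimated separately, which is why the two approaches can give different probabilities for the same data.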