This report analyzes injury traffic crashes using logistic regression. The purpose of the study is to use real-world data collected in Orange County, California, to learn how crash characteristics relate to the probability that a crash results in injury. The data cover crashes that occurred in 1998 on six Orange County freeways: Interstates 5 and 405 and State Routes 22, 55, 57, and 91. The dataset includes information on crash typology. The raw data were processed, and candidate explanatory variables were identified through exploratory analysis. The processed data were then imported into SAS to estimate logistic regression coefficients, and several logistic regression models focusing on different interactions among the explanatory variables were fitted. Finally, the best model was selected using deviance as the goodness-of-fit measure. The final model gives the following results. Crashes involving speeding and alcohol use have a higher probability of injury than crashes due to other causes. Weekend crashes have a higher probability of injury than weekday crashes. Crashes off the road have a higher probability of injury than crashes that occur on the road. Finally, State Route 91 was identified as the freeway with the highest risk of injury crashes among the freeways included in this study.
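
The model-fitting and deviance-comparison steps described above can be illustrated with a minimal sketch. The study itself used SAS; the Python code below is only an illustration under stated assumptions. The data are synthetic, the indicator names (`speed_alcohol`, `weekend`, `off_road`) and the coefficients used to simulate them are hypothetical, and none of the numbers it produces are results of the study.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so the model has an intercept term."""
    return np.column_stack([np.ones(X.shape[0]), X])

def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson; returns the coefficients."""
    Xd = add_intercept(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))   # predicted injury probability
        w = p * (1.0 - p)                      # Bernoulli variance weights
        grad = Xd.T @ (y - p)                  # score vector
        hess = Xd.T @ (Xd * w[:, None])        # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def deviance(X, y, beta):
    """Deviance for binary data: -2 times the maximized log-likelihood."""
    p = 1.0 / (1.0 + np.exp(-add_intercept(X) @ beta))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)         # guard against log(0)
    return -2.0 * np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Synthetic crash records (assumption: three hypothetical 0/1 indicators).
rng = np.random.default_rng(0)
n = 2000
speed_alcohol = rng.integers(0, 2, n)
weekend = rng.integers(0, 2, n)
off_road = rng.integers(0, 2, n)

# Made-up coefficients used only to simulate the injury outcome.
logit = -1.0 + 0.8 * speed_alcohol + 0.4 * weekend + 0.6 * off_road
injury = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Two nested candidate models: one predictor versus all three.
X_reduced = speed_alcohol.reshape(-1, 1).astype(float)
X_full = np.column_stack([speed_alcohol, weekend, off_road]).astype(float)

beta_reduced = fit_logistic(X_reduced, injury)
beta_full = fit_logistic(X_full, injury)
dev_reduced = deviance(X_reduced, injury, beta_reduced)
dev_full = deviance(X_full, injury, beta_full)

# Exponentiated slopes are odds ratios for each crash characteristic.
odds_ratios = np.exp(beta_full[1:])

print("deviance (reduced):", dev_reduced)
print("deviance (full):   ", dev_full)
print("odds ratios (full):", odds_ratios)
```

A lower deviance indicates a closer fit to the data. For nested models, the drop in deviance can be compared against a chi-square distribution with degrees of freedom equal to the number of added parameters. An odds ratio above 1 means the characteristic is associated with a higher probability of injury.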