A data mining approach to gain insight into traffic violations of young drivers in developing countries

Document Type: Research Paper


1 Department of Civil, Sirjan University of Technology

2 Civil Engineering Department, Shahid Bahonar University, Kerman, Iran

3 Postdoctoral Researcher, McMaster Institute for Transportation & Logistics (MITL) McMaster University, Hamilton, ON, Canada


In developing countries, young adults make up a relatively large share of the population, and traffic violations committed by young drivers are frequent. Consequently, a large portion of the road crashes recorded each year involves this age group. This paper studies the traffic rule violations of young drivers in Iran. Focusing on the behavior of young drivers and understanding the mechanisms that affect the occurrence of violations among them can help promote traffic safety. For this purpose, 567 drivers aged 18 to 40 were studied. Several data mining approaches, including descriptive analysis, correlation analysis, multinomial logistic regression (MLR), and Random Forest (RF), were used to provide insight into the traffic violations of young drivers and to propose potential countermeasures to reduce them. Results indicated that exceeding the speed limit, red-light running, and angry driving are the most frequent violations. The frequency of using a mobile phone while driving, a source of distraction, was found to be highly correlated with other violations. As the frequency of previous traffic fines, the number of days with access to a private car, and the frequency of previous crashes increase, more diverse types of violations with high frequencies are expected in the future. In addition, the frequency of risky violations was found to be higher among men and those with lower education levels.
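
The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which coefficient was used; the example below assumes Spearman rank correlation, a common choice for ordinal self-reported frequencies. The 0 to 4 response scale, the variable names, and the values are hypothetical placeholders, not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder responses (0 = never ... 4 = very often).
# These values are invented for illustration and are NOT from the study.
phone_use = [0, 1, 1, 2, 3, 4, 2, 0]
speeding = [1, 1, 2, 2, 4, 4, 3, 0]

rho = spearman(phone_use, speeding)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near 1 would indicate that respondents who report more frequent phone use also tend to report more frequent speeding; it says nothing about causation.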