Thursday, September 17, 2020

Application of computer analysis of accident data

 Computer Analysis of Accident Data

  • Data collected from accident sites covers several aspects and is subject to the judgment of the person collecting it.
  • The aim of accident data collection and event reconstruction is to identify the causes of road accidents, with the ultimate goal of reducing the damage caused by traffic accidents.
  • Because population growth has increased the number of vehicles on the road and, consequently, the number of accidents, the volume of accident data has grown enormously.
  • To manage this large volume of data and analyse it in a form that is useful to policy planners, data mining technologies are used.
  • 'WEKA' is a popular open-source data mining program that is commonly used to analyse large data sets.
  • The results of data mining help organizations such as transportation departments to explore the accident data recorded by police information systems, discover patterns that predict future behavior, and take effective decisions to reduce accidents.
  • Road accidents are predicted using machine learning algorithms and advanced analysis techniques, including deep learning architectures such as convolutional neural networks and long short-term memory networks.
  • Data sources for road accident forecasting can be classified according to their origin and characteristics: open data, measurement technologies, onboard equipment and social media data.
  • Road accident forecasting and traffic accident prediction are driven by traffic engineering, data analysis and machine learning.
  • The main areas of interest of models obtained from computer analysis of accident data are:
    • detection of problematic areas for circulation
    • real-time detection of traffic incidents
    • road accident forecasting and
    • prediction of the severity of the consequences suffered by those involved in a road accident
  • The study of road accident prediction is therefore a relevant and current field of scientific knowledge, open to innovation in algorithms and data analysis techniques that respond to the challenge of creating a safer mobility environment while taking into account the peculiarities of each country or region, i.e., traffic composition, weather conditions, road conditions and demography.
  • Data about accidents can be gathered by installing equipment on vehicles, for example satellite positioning systems (GPS, GLONASS, Galileo), cameras and sensors, to record data such as acceleration, unexpected braking events and sudden lane changes, as well as information about the driver's behavior and status, such as drowsiness and stress level.
  • Another emerging data source suitable for proposing models of road accident prediction is social media
  • Data from government sources such as police forces, traffic police and road concessionaires can be characterized as historical, since it spans several decades, and as reliable, because it is backed by the custody processes of the entities responsible for it.
  • Open data can be defined as data that is produced and funded with public money and made available and accessible to the public without restriction.
  • Road traffic information is usually among the most readily available types of data.
  • Measurement technologies include all kinds of equipment that form part of the road infrastructure, such as radar, cameras, or equipment embedded in the road itself.
  • Using analytic methods, researchers seek to characterize the information and variables of a road accident in order to discover hidden patterns, profile behaviors, generate rules and draw inferences.
  • These patterns are useful to:
    • profile drivers or drivers' behavior on the road
    • delimit unsafe areas for driving
    • generate classification rules related to road accident data
    • select the variables to be used in real-time accident models and
    • select relevant variables to be used to train other methods, such as artificial neural networks and deep learning algorithms.
  • Clustering is a method of partitioning objects into groups (clusters), so that the objects in each cluster share common characteristics while being clearly different from the objects in other clusters.
  • Common characteristics can be interpreted as the level of correlation of objects according to the characteristics on which clustering techniques are applied.
  • Unlike classification methods, clustering does not require that the data be previously marked with any particular category in order to distinguish different groups within the data. 
  • The absence of these previous categories or classes indicates that the objective of clustering is to find an underlying structure in the information and achieve a more compact representation of it instead of discriminating future data into categories.
  • The main advantages of clustering algorithms are that they do not require prior data processing, work well with large data sets, and their results can be interpreted graphically. 
  • On the other hand, clustering algorithms can converge to a local optimum instead of the global optimum of their objective function.
  • Clustering algorithms use a distance function to measure similarity between continuous attributes and a similarity measure for qualitative attributes.
  • Techniques based on similarity functions include K-means clustering and K-nearest neighbor (the latter is normally used as a supervised classifier, but it relies on the same kind of distance measure).
  • Clustering techniques whose similarity function is based on probability distributions work on the premise that each cluster has an underlying probability distribution from which its data elements are generated. An example of this type of algorithm is latent class clustering (LCC).
  • For data sets with both qualitative and quantitative attributes, clustering techniques such as two-step clustering are used.
  • Batch clustering, in combination with fuzzy C-means and real-time clustering, has been used to study abrupt braking events in real time.
  • From the batch clustering results, correlations were obtained that indicate potentially dangerous places for driving, according to the time of day.
  • K-means clustering and an association rules model have been used to determine the variables that influence road accidents, yielding a 6-cluster model that was used as input to the association rules model.
  • Computer analysis of accident data found that accident severity, type of road, road lighting and type of surrounding area were important factors in accidents.
  • Real traffic data is used to predict the number of accidents on a road or intersection and to identify risk factors, using clustering to group roads and find risk patterns.
  • The number of clusters was evaluated and selected using the Bayesian information criterion (BIC).
  • A decision tree builds a classification model in the form of a tree (dendrogram): each node represents one of the input variables and has as many branches as that variable has possible values.
  • Decision trees are useful tools in pattern classification applications.
  • The decision tree method of analysis is exploratory, not inferential.
  • Rule learners and classifiers do not require prior data processing, work well with large data sets and can be interpreted graphically; however, their results tend to be less accurate than those of other methods.
  • Road Accident Data Management System (RADMS) is Geographic Information System (GIS) based software, funded by the World Bank, that is used for collecting, comparing and analyzing road accident data. It is currently used by the government of Tamil Nadu.
  • RADMS is a comprehensive traffic-management system which helps to study and analyse traffic accidents in a scientific manner.
  • The various components of RADMS are:
    • Creation of GIS database
    • Web based access and data flow
    • Report generation and plotting results on maps
    • Analysis and identification of black spots so that police and transport departments can take up the necessary measures
    • RADMS generates the following twelve types of reports for analysis and suggestion of remedial measures:
      • Driver report
      • Vehicle report
      • Road report
      • Yearly report
      • Enforcement
      • Collision type
      • Time period report
      • Alcohol usage report
      • Person report
      • Landmark report
      • Weather report
      • General report
  • RADaR (Road Accident Data Recorder) is a robust road crash database intended to help reduce road accidents.
  • RADaR is an end-to-end solution for road accident data recording and reporting as it helps identify the factors contributing to road accidents
  • RADaR is designed as an application for Android tablets with connectivity to a web-based database server.
  • It uses GPS/GPRS to record the exact accident location in a global coordinate system and transmits the data to a web-based central server.
  • It also provides a facility to take photographs of the accident scene and upload them to the network.
  • It features pictorial, menu-driven recording of the road layout of the crash site, with the collision diagram plotted on the layout for scientific investigation.
  • RADaR can draw vehicle registration and driver license data from national databases.
  • The pilot studies for RADaR were carried out in New Delhi (India) and Addis Ababa (Ethiopia).
  • An AI machine-learning method is used to create decision trees that distinguish the characteristics of accidents.
  • To identify the factors causing accidents, Data Mining (DM) techniques such as Decision Trees (DTs) are used, as they allow certain decision rules to be extracted. These rules could be used in future road safety campaigns, thereby enabling managers to implement priority actions.
  • Artificial Neural Network (ANN) models are used for the analysis and prediction of accidents. In this technique, the number of vehicles, the number of accidents and the population are selected and used as model parameters. Sigmoid and linear functions are used as activation functions with the feed-forward back-propagation algorithm.
  • The ANN model has been shown to perform better than the statistical methods in use.
  • Since the volume of data collected from accident sites is huge, it falls under the domain of 'big data'.
  • Traffic on highways is monitored and large amounts of data are processed daily to predict the probability of accidents based on highway conditions such as road surface, lighting, turns, etc.
  • Accident prediction is based on different queries, and Hadoop has been used to process this big data.
  • Execution time on Hadoop is much lower than with sequential techniques.
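
The K-means clustering mentioned in the notes above groups objects by a distance function. The following is a minimal Python sketch of that idea, not the method of any study mentioned here. The function names `euclidean` and `kmeans`, the two features (hour of day and peak braking deceleration) and the sample values are all hypothetical and chosen only for illustration.

```python
import math
import random


def euclidean(a, b):
    """Distance function for continuous attributes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (labels, centroids) for a basic K-means run.

    The seed fixes the random starting centroids; on harder data a
    different seed may end in a different local optimum.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Hypothetical records: (hour of day, peak braking deceleration in m/s^2).
events = [
    (8.0, 2.1), (8.5, 2.4), (9.0, 1.9),     # morning, mild braking
    (22.0, 6.8), (23.0, 7.2), (22.5, 6.5),  # night, harsh braking
]

if __name__ == "__main__":
    labels, centroids = kmeans(events, k=2)
    for event, label in zip(events, labels):
        print(event, "-> cluster", label)
```

No class labels are supplied: the two groups emerge from the distances alone, which is the difference from classification noted above. In practice the features would usually be scaled first so that one attribute does not dominate the distance, and the number of clusters could be compared using a criterion such as BIC.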
