Tuesday, September 29, 2020

BAYESIAN STATISTICS

  • Frequentist (classical) statistics tests whether an event (hypothesis) occurs or not. It calculates the probability of an event in the long run of a repeated experiment. Bayesian statistics is different from classical statistics
  • Bayesian statistics is a mathematical framework to update your beliefs as you observe more data

For example, let's consider an accident

  • Try recollecting the events that occurred just before an accident (This is called PRIOR)
  • Occurrence of event (Accident) = DATA
  • As more DATA is observed = BELIEF is UPDATED = POSTERIOR
  • Accident is an event with TWO outcomes (YES or NO)
  • Point data (YES or NO)
  • Accident is a RANDOM VARIABLE (probability of YES = X & probability of NO = 1-X)
  • Start with the objective: P(Accident | Data)
  • CONDITIONAL PROBABILITY -> If the PRIOR says there cannot be an accident (prior probability of zero), the BELIEF cannot be changed by any amount of data.
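
The accident example can be put into numbers with a small sketch of Bayes' rule for a YES/NO event. All the probabilities below are made-up values for illustration, and the function name `posterior` is hypothetical. The second call shows the last point in the list: a prior of zero stays at zero whatever the data says.

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Bayes' rule for a yes/no hypothesis H after one observation."""
    # Evidence = total probability of the data under both outcomes.
    evidence = p_data_given_h * prior + p_data_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the observed data has zero probability under the model")
    return p_data_given_h * prior / evidence


# Assumed numbers: P(accident) = 0.01 before seeing any data; a wet road
# is observed in 60% of accident cases and 20% of non-accident cases.
p_accident_given_wet = posterior(0.01, 0.6, 0.2)

# A prior that rules the accident out cannot be moved by the data.
p_with_zero_prior = posterior(0.0, 0.6, 0.2)

print(round(p_accident_given_wet, 4), p_with_zero_prior)
```

With these assumed inputs the belief in an accident rises from 1% to roughly 3% after observing a wet road.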

Bayes' rule

  • More data = More precise predictions
  • Data is huge (difficult to compute) -> CHOP IT INTO PIECES (LARGE or unmanageable to SMALL CHUNKS or manageable pieces)
  • Belief propagation in graphical models

Approximation -> Variational Bayes

  • More data to update beliefs
  • Use previous outputs as priors and subsequently update
  • Belief influences results (very subjective); similarly, the PRIOR is subjective
  • Frequentists NOT INTERESTED in SINGLE EVENTS
  • BAYESIAN statistics deals with UNCERTAIN EVENTS whereas FREQUENTIST statistics deals with REPEATABLE EVENTS
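
The idea of chopping data into chunks and using the previous output as the next prior can be sketched with a Beta prior on the accident probability. This assumes a simple conjugate Beta-Bernoulli model, which is not stated in these notes; the data and the helper `update_beta` are hypothetical.

```python
def update_beta(alpha, beta, outcomes):
    """Conjugate update of a Beta(alpha, beta) belief with 0/1 outcomes."""
    successes = sum(outcomes)
    failures = len(outcomes) - successes
    return alpha + successes, beta + failures


# Hypothetical observations: 1 = accident, 0 = no accident.
data = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
chunks = [data[i:i + 4] for i in range(0, len(data), 4)]

# Start from a uniform prior, Beta(1, 1), and update chunk by chunk:
# the posterior after one chunk is the prior for the next.
belief = (1, 1)
for chunk in chunks:
    belief = update_beta(*belief, chunk)

# Updating with all the data at once gives the same posterior.
all_at_once = update_beta(1, 1, data)

print(belief, all_at_once)
```

Under this model the order and size of the chunks do not matter, which is what makes piecewise updating safe here.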

Bayesian statistics can:

  • incorporate prior knowledge easily
  • update beliefs easily
  • tackle a wider set of problems as probabilities are BELIEFS
  • However, Bayesian statistics MUST SPECIFY A MODEL

BELIEFs are SUBJECTIVE

Frequentist statistics

  • has non-parametric methods
  • probabilities are objective
  • hard to cheat
  • is focussed on repeatable events
  • prior knowledge is introduced using an ad-hoc format
  • requires a huge amount of data

FREQUENTIST and BAYESIAN statistics use the SAME RULES OF PROBABILITIES
Difference exists in set-up of WHAT IS RANDOM
Bayesian statistics uses UNCERTAINTY IN KNOWLEDGE
Frequentist statistics uses INTRINSIC RANDOMNESS
Usage of either method is acceptable depending on DATA AVAILABLE and CONSISTENCY
BAYESIAN STATISTICS IS A FUNDAMENTALLY DIFFERENT APPROACH TO STATISTICS

It is an associated set of MATHEMATICAL tools
In BAYESIAN approach, DATA is FIXED and PARAMETERS may VARY
Frequentist statisticians talk about CONFIDENCE INTERVALS while BAYESIAN statisticians talk about CREDIBLE INTERVALS
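
A credible interval is read directly off the posterior distribution of the parameter. Below is a minimal sketch that builds the posterior on a grid of candidate values and takes an equal-tailed 95% interval. It assumes a uniform prior and a binomial likelihood; the counts (3 accidents in 12 trials) and both function names are hypothetical.

```python
def grid_posterior(successes, trials, n_grid=1001):
    """Posterior over a probability p on an evenly spaced grid (uniform prior)."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Binomial likelihood up to a constant; the uniform prior adds nothing.
    unnormalised = [p ** successes * (1 - p) ** (trials - successes) for p in grid]
    total = sum(unnormalised)
    return grid, [u / total for u in unnormalised]


def credible_interval(grid, probs, mass=0.95):
    """Equal-tailed interval holding `mass` of the posterior probability."""
    tail = (1 - mass) / 2
    cumulative = 0.0
    lower = upper = None
    for p, w in zip(grid, probs):
        cumulative += w
        if lower is None and cumulative >= tail:
            lower = p
        if upper is None and cumulative >= 1 - tail:
            upper = p
            break
    return lower, upper


grid, probs = grid_posterior(successes=3, trials=12)
lower, upper = credible_interval(grid, probs)
print(lower, upper)
```

The interval says "given the data and the prior, the parameter lies in this range with 95% probability", a statement a confidence interval does not make.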
 

Bayes' theorem talks about

  • Posterior
  • Likelihood
  • Prior
  • Evidence
  • Posterior = (Likelihood * Prior) / Evidence
  • Strength of Prior

Usefulness of Bayesian statistics in case of

  • Sparse data
  • Abundant data and
  • Uniform prior
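
How much the prior matters in each of these cases can be seen in a small sketch. It assumes a Beta prior on a probability, for which the posterior mean has a closed form; the prior strengths and data counts are made up for illustration.

```python
def posterior_mean(prior_alpha, prior_beta, successes, trials):
    """Posterior mean of a probability under a Beta prior and binomial data."""
    return (prior_alpha + successes) / (prior_alpha + prior_beta + trials)


# Assumed priors: a strong one centred on 0.2 (worth 100 pseudo-observations)
# and a uniform one, Beta(1, 1).
strong_prior = (20, 80)
uniform_prior = (1, 1)

# Assumed data sets with the same observed rate of 0.6.
sparse = (3, 5)          # 3 successes in 5 trials
abundant = (600, 1000)   # 600 successes in 1000 trials

results = {
    "strong prior, sparse data": posterior_mean(*strong_prior, *sparse),
    "uniform prior, sparse data": posterior_mean(*uniform_prior, *sparse),
    "strong prior, abundant data": posterior_mean(*strong_prior, *abundant),
    "uniform prior, abundant data": posterior_mean(*uniform_prior, *abundant),
}
for label, value in results.items():
    print(f"{label}: {value:.3f}")
```

With sparse data the strong prior keeps the estimate near 0.2, while with abundant data both priors end up close to the observed rate of 0.6.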

Source of priors
 

Mathematical tools in Bayesian statistics

  • Analytical methods
  • Grid approximation
  • Markov chain Monte Carlo (MCMC) simulation

MCMC

  • It is an algorithm for exploring parameter space
  • Time spent at each point approximates the parameter distribution
  • Examples include Metropolis-Hastings, Gibbs sampling, etc.
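
A minimal sketch of the idea, using a random-walk Metropolis sampler (the special case of Metropolis-Hastings with a symmetric proposal). It assumes a uniform prior and binomial data (3 successes in 12 trials, made up for illustration); the step size, sample count and burn-in length are arbitrary choices, not recommendations.

```python
import math
import random


def log_posterior(p, successes, trials):
    """Log of the unnormalised posterior: uniform prior times binomial likelihood."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis(successes, trials, n_samples=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for a single probability parameter."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        # The chain stays longer at points with higher posterior density.
        samples.append(current)
    return samples


samples = metropolis(successes=3, trials=12)
kept = samples[2000:]  # drop an assumed burn-in period
mcmc_mean = sum(kept) / len(kept)
print(round(mcmc_mean, 3))
```

For this simple model the exact posterior is Beta(4, 10) with mean 4/14, so the sample mean can be checked against the analytical answer. MCMC earns its keep in models where no such closed form exists.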
Bayesian methods perform extremely well in complex (hierarchical) models
Bayesian methods should be used in case of complex models with many interacting parameters
Bayesian methods are preferred when assumptions CANNOT be made regarding estimates and the data is messy (disorganised, or with missing data (gaps))
 

Bayesian road safety analysis

  • Bayesian statistics for determining hazardous road locations
  • Hazardous locations that are prone to traffic accidents are called "BLACK SPOTS". Identifying these black spots helps in scheduling road safety policies.
  • A Bayesian estimation of the model via a Markov Chain Monte Carlo (MCMC) approach is used.
  • Black spots are dangerous locations where accidents occur. Treating black spots is a well-known and frequently used means of improving road safety.
  • Black spots are spatial concentrations of interdependent high-frequency accident locations.
  • From a statistical point of view, road accidents are treated as random events. As a matter of fact, they are indeed an unintentional result of human behaviour.
  • Hence, it is impossible to predict the exact circumstances of every accident.
  • There are several statistical models to analyse black spot data. 
  • Most accident counts follow the Poisson probability law. In order to correct the extra-Poisson variation (overdispersion) found in accident counts, negative binomial regression models are used.
  • Most recently, Bayesian techniques have been used to handle problems in traffic safety. 
  • To estimate accident frequencies, a hierarchical Bayesian Poisson model is used.
  • Identification of sites that are more dangerous than others (black spots) helps in better scheduling road safety policies.
  • Bayesian estimation for the model using a Markov chain Monte Carlo is proposed. 
  • The problem of identifying black spots is difficult since accidents are rare events, and the observed data is not necessarily a good indicator as it is only a sample drawn from an underlying probability distribution.
  • Policy making has a tremendous impact on society as it can reduce the accidents at a particular site.
  • The hierarchical procedure for ranking sites takes into account fatalities and injuries at all levels and combines this information by means of a cost function to rank the sites.
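
The ranking idea can be sketched in a much simplified form. The sketch below is not the hierarchical MCMC model described above: it swaps in a conjugate Gamma prior with fixed, assumed hyperparameters so that the posterior accident rate has a closed form. The site data, the prior values and the cost weights are all hypothetical.

```python
def posterior_rate(count, years, prior_shape, prior_rate):
    """Posterior mean yearly rate for Poisson counts with a Gamma prior."""
    return (prior_shape + count) / (prior_rate + years)


# Hypothetical sites: years observed, fatal accidents, injury accidents.
sites = {
    "A": {"years": 3, "fatal": 0, "injury": 9},
    "B": {"years": 3, "fatal": 2, "injury": 3},
    "C": {"years": 1, "fatal": 0, "injury": 1},
}

# Assumed prior (mean of one event per year) and assumed cost weights.
PRIOR_SHAPE, PRIOR_RATE = 1.0, 1.0
COST_WEIGHTS = {"fatal": 10.0, "injury": 1.0}


def expected_cost(site):
    """Weighted sum of the posterior rates for each severity level."""
    return sum(
        weight * posterior_rate(site[severity], site["years"], PRIOR_SHAPE, PRIOR_RATE)
        for severity, weight in COST_WEIGHTS.items()
    )


costs = {name: expected_cost(site) for name, site in sites.items()}
ranking = sorted(costs, key=costs.get, reverse=True)
print(costs, ranking)
```

Site C is observed for only one year, so its estimate is pulled strongly toward the prior. This is the kind of smoothing that keeps rare-event counts from dominating a ranking, although the actual result depends heavily on the assumed prior and weights.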
