Epistemic uncertainty

Below is an excerpt from John Ridgway's review of "Waltzing with Bears", a book about software project risk management. The general concept of epistemic vs. aleatory uncertainty is very important.

[…] Do yourself a favour, ignore what the book says about risk analysis [for software projects] and go and buy a good book on Bayesian Methods and Decision Theory. You don't have to take my word for this, just type in 'epistemic uncertainty and Monte Carlo' into your Internet search engine and take it from there. In the meantime, here are some background notes to help explain my remarks:

There are two types of uncertainty: epistemic and aleatory. As the name suggests, epistemic uncertainty results from gaps in knowledge. For example, one may be uncertain of an outcome because one has never used a particular technology before. Such uncertainty is essentially a state of mind and hence subjective. Aleatory uncertainty results from variability that is intrinsic to the behaviour of some systems [like throwing dice] (alea is the Latin for die). For example, I can be confident regarding the long term frequency of throwing sixes but I remain uncertain of the outcome of any given throw of a die. This uncertainty can be objectively determined.

There are two branches of probability theory: Frequentist and Bayesian.

Frequentist probability theory is used to analyse systems that are subject to aleatory uncertainty.

Bayesian probability theory is used to analyse epistemic uncertainty.
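To make the contrast concrete, here is a minimal sketch of my own (it is not from the review). It assumes a hypothetical team that has used a new technology on three tasks and delivered all three on time. The uniform Beta(1, 1) prior and the function names are illustrative assumptions.

```python
def frequentist_estimate(successes, trials):
    """Long-run frequency estimate: observed successes over trials."""
    return successes / trials


def bayesian_posterior_mean(successes, trials, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of a success probability under a Beta prior.

    The Beta(1, 1) default is an assumed, deliberately vague prior that
    stands in for "we have never used this technology before".
    """
    return (prior_alpha + successes) / (prior_alpha + prior_beta + trials)


# Hypothetical record: 3 tasks attempted, 3 delivered on time.
on_time, attempted = 3, 3
freq = frequentist_estimate(on_time, attempted)
bayes = bayesian_posterior_mean(on_time, attempted)
print(f"frequentist: {freq:.2f}, Bayesian posterior mean: {bayes:.2f}")
```

With so little data the frequency estimate claims certainty, while the Bayesian figure still reflects how little is actually known.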

For most risk assessments a [software] Project Manager has to undertake, there is both epistemic and aleatory uncertainty, but epistemic uncertainty is always significant due to the novelty of the situation under assessment.

Standard Monte Carlo Simulation uses frequentist probability theory to analyse risk.

When Monte Carlo is used to model schedule risk, the [software] schedule uncertainties are being treated as if they are aleatory, even though they are predominantly epistemic. This is now considered to be unrealistic and is known to give incorrect results. The main problem is that the second order uncertainties [in software projects] are often too large to be ignored, i.e. the required shape for the chosen probability distribution curves is more important than the tool vendors would have you believe and yet they are usually imprecisely known.
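A rough sketch of the second-order problem, under assumptions of my own: a hypothetical three-task schedule, triangular distributions, and two equally defensible guesses at where the most likely duration sits within the same low/high range. None of these numbers come from the review.

```python
import random


def simulate_schedule(tasks, n_trials=20000, seed=0):
    """Standard Monte Carlo: sample each task duration, sum, repeat.

    tasks is a list of (low, mode, high) triangular parameters in days.
    Returns the simulated total durations, sorted ascending.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # random.triangular takes (low, high, mode), in that order.
        totals.append(sum(rng.triangular(low, high, mode)
                          for low, mode, high in tasks))
    totals.sort()
    return totals


def percentile(sorted_values, q):
    """Value at quantile q (0..1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


# Same low/high ranges; only the assumed shape (the mode) differs.
optimistic = [(5, 6, 20), (10, 12, 40), (3, 4, 15)]
pessimistic = [(5, 15, 20), (10, 30, 40), (3, 11, 15)]

p80_optimistic = percentile(simulate_schedule(optimistic), 0.8)
p80_pessimistic = percentile(simulate_schedule(pessimistic), 0.8)
print(f"80th percentile, optimistic shape:  {p80_optimistic:.1f} days")
print(f"80th percentile, pessimistic shape: {p80_pessimistic:.1f} days")
```

Both runs look equally precise, yet the answer moves by days purely because of a shape nobody actually knows.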

Using standard Monte Carlo to analyse schedule risk also requires unrealistic assumptions to be made regarding the correlations between the probabilities for the individual outcomes, e.g. that there are no correlations or that they are all of the same nature. In practice, there are correlations to be considered when analysing schedule risk and they are of both a positive and negative nature.
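How much the correlation assumption matters can be seen in a small sketch of my own. It assumes two hypothetical tasks with normally distributed durations (mean 30 days, standard deviation 5 days) and compares the spread of the total under negative, zero and positive correlation. The normal model and the figures are illustrative only.

```python
import math
import random
import statistics


def total_duration_spread(rho, n_trials=20000, seed=1):
    """Standard deviation of the summed duration of two tasks.

    rho is the assumed correlation between the two task durations.
    Durations are modelled as normal, which is an illustrative choice.
    """
    rng = random.Random(seed)
    mean_a, sd_a = 30.0, 5.0
    mean_b, sd_b = 30.0, 5.0
    totals = []
    for _ in range(n_trials):
        z_a = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Build a second standard normal with correlation rho to z_a.
        z_b = rho * z_a + math.sqrt(1.0 - rho * rho) * noise
        totals.append(mean_a + sd_a * z_a + mean_b + sd_b * z_b)
    return statistics.pstdev(totals)


spread_negative = total_duration_spread(-0.8)
spread_independent = total_duration_spread(0.0)
spread_positive = total_duration_spread(0.8)
print(f"rho = -0.8: {spread_negative:.1f} days")
print(f"rho =  0.0: {spread_independent:.1f} days")
print(f"rho = +0.8: {spread_positive:.1f} days")
```

Assuming independence when the tasks are in fact positively correlated understates the spread of the total; assuming it when they are negatively correlated overstates it.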

As a result of the above drawbacks, many expert authorities are warning against the use of Monte Carlo Simulation where the historical data upon which the analysis is premised is incomplete. For example, its use in ecological and economic models is now controversial (see the World Congress on Risk website).

In software development there is a growing appreciation of the importance of Bayesian Methods in analysing problems in software quality assurance.

Bayesian methods are appropriate in situations where there are gaps in information (i.e. where there is epistemic uncertainty). They involve the creation of Bayesian Belief Networks (BBNs) to model causal relationships. Data is fed into the model to enable the probability of specified outcomes to be calculated given the current body of knowledge.

Even more interesting, Bayes' Theorem can be used to assess the likelihood that pre-conditions exist in the light of outcomes becoming known.
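As a worked illustration of reasoning back from an outcome to a pre-condition, here is a minimal sketch of my own. The scenario (a milestone slips; were the requirements unstable?) and all three probabilities are hypothetical, chosen only to show the arithmetic of Bayes' Theorem.

```python
def posterior(prior, p_outcome_given_cause, p_outcome_given_no_cause):
    """Bayes' Theorem: P(cause | outcome) from a prior and two likelihoods."""
    evidence = (prior * p_outcome_given_cause
                + (1.0 - prior) * p_outcome_given_no_cause)
    return prior * p_outcome_given_cause / evidence


# Hypothetical beliefs before the milestone date:
prior_unstable = 0.3        # P(requirements are unstable)
p_slip_if_unstable = 0.8    # P(milestone slips | unstable requirements)
p_slip_if_stable = 0.2      # P(milestone slips | stable requirements)

# The milestone has slipped; update the belief in the pre-condition.
p_unstable_given_slip = posterior(prior_unstable,
                                  p_slip_if_unstable,
                                  p_slip_if_stable)
print(round(p_unstable_given_slip, 2))
```

With these assumed numbers, the belief that the requirements are unstable rises from 0.3 to about 0.63 once the slip is observed.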

BBNs can be used in any situation where one is trying to calculate the likelihood of an outcome, or an unknown situation, when there is only limited information. They are useful in Decision Theory when a risk-based decision is required in the face of epistemic uncertainty.
