Interpreting PDFs and CDFs in the Petroleum Industry
Engineering Honours Degree 2008
University of Adelaide
Exploration and production in the oil industry is teeming with risk and uncertainty. Some of the most important decisions, at almost every stage of the business, are based on how this risk and uncertainty is represented. This applies not only to exploration but also to production and downstream product marketing, to name a few. The oil industry is regarded as a classic illustration of the need for sophisticated and systematic approaches to risk management. A vital factor in the analysis of risk and probability is the communication and interpretation of the data. All the modelling and hard work is useless if the key points of the work cannot be conveyed to the decision makers.
The aim of this study was to investigate how accurately probability density functions and cumulative distribution functions convey different pieces of information, such as the mean, the mode, and percentiles (P10, P50, and P90). Following a review of the literature on similar studies, two surveys were administered to fourth-year and third-year petroleum engineering students to investigate their interpretation of probability density functions and cumulative distribution functions. The students were asked to estimate a range of values from the displays. Analysis of the results showed that the probability density function communicated the mode of a distribution considerably better than the cumulative distribution function and, in general, gave a lower variance in estimates of the mean. The cumulative distribution function, by contrast, was better overall at communicating values such as P10, P50, P90, and P(X > a). In addition, the cumulative distribution function could confuse users who did not fully understand how the display worked. Finally, under most conditions there was little or no correlation between the accuracy of respondents' results and their confidence in their answers.
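To make the quantities named above concrete, the following is a minimal sketch of how the mode, mean, percentiles, and P(X > a) relate to a probability density function and a cumulative distribution function. It is an illustration only and not part of the study: the lognormal shape, the parameter values, and all function names are assumptions. It also assumes that Pn denotes the value with a cumulative probability of n%; some petroleum conventions define P10 and P90 the other way round, as exceedance probabilities.

```python
import math
from statistics import NormalDist

# Hypothetical lognormal parameters, chosen only for illustration.
# ln(X) is normally distributed with mean MU and standard deviation SIGMA.
MU, SIGMA = math.log(100.0), 0.5
_log_dist = NormalDist(MU, SIGMA)

def pdf(x):
    """Density of X at x > 0 (the curve a reader sees on a PDF display)."""
    return _log_dist.pdf(math.log(x)) / x

def cdf(x):
    """P(X <= x) (the curve a reader sees on a CDF display)."""
    return _log_dist.cdf(math.log(x))

def percentile(p):
    """Value x such that P(X <= x) = p (assumed cumulative convention)."""
    return math.exp(_log_dist.inv_cdf(p))

def exceedance(a):
    """P(X > a), read from a CDF display as one minus the curve height."""
    return 1.0 - cdf(a)

# The mode is the peak of the PDF; the mean is its centre of mass.
mode = math.exp(MU - SIGMA ** 2)
mean = math.exp(MU + SIGMA ** 2 / 2)

# Percentiles are read directly off the CDF at heights 0.10, 0.50, 0.90.
p10, p50, p90 = (percentile(p) for p in (0.10, 0.50, 0.90))

if __name__ == "__main__":
    print(f"mode={mode:.1f} mean={mean:.1f}")
    print(f"P10={p10:.1f} P50={p50:.1f} P90={p90:.1f}")
    print(f"P(X > 150)={exceedance(150.0):.3f}")
```

In this sketch the mode is the location of the peak of the density curve, whereas the percentiles and P(X > a) are heights or inverse heights of the cumulative curve, which is consistent with each display communicating a different set of values more directly. For a right-skewed distribution such as this one, the mode lies below the median and the median below the mean.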
One could conclude that users may not fully understand the relative probabilities of a distribution if they are given only a cumulative distribution function rather than a probability density function. However, if they are required to determine exact values, such as the P10, P50, and P90, for their analysis of a decision, they will obtain them more accurately with a cumulative distribution function. These choices are vital