History and Theory
In 1951 Danie Krige, a South African statistician, considered the problem of predicting concentration values for gold deposits in Witwatersrand given the location and value of boring samples. Creating boreholes in the ground is rather expensive (in time and money) so we have the classic mathematician's goal: extrapolating the most information from doing the least amount of work (drilling the least amount of holes).
Now Krige new the first law of geography 18 years before it was written down:
The great contribution of Krige and later Georges Matheron in a more technical sense, was to translate this quote into mathematics. Consider the hard-earned boreholes of Figure 1 as given data pairs: (x,y) = (location, amount of gold); notice that the location x can be coordinates in 1,2, or 3 dimensional space. Let me write down some assumptions about this data:
- The amount of gold follows a known average spatial trend. (Let's say from empirical studies or from assumed regressions)
- Deviations from this average trend are normally distributed and depends on how far away from other gold you stand.
How do you declare dependence on how far away something is? Well the distance between two points in space is measured by Pythagoras, since he isn't around much anymore use his formula: distance between a and b = ||a - b||. So we need a function which depends on ||a-b|| to describe the spatial correlations of the deviations in assumption 2. Here comes the technical framework; consider the mean trend as a linear regression of known functions in a vector p and deviations from such as a random process z which has a correlation function R which depends on the distance between two points
Notice that the spatial correlation is an a priori choice and ends up significantly effecting performance of this method. A theoretically tractable choice is the classic Gaussian relationship denoted below.
A graph of this choice is seen on the right side for a particular choice of the shape parameter epsilon. This choice ends up being vital as well as it changes how sharply the dependence from the gold drops off; there are ways to statistically pick this value (Cross-Validation, Likelihood Maximization, etc. ) which are omitted here.
The solution to the statistical model posed boils down to a general weighted least squares solution. For detailed codes, visit my codes section.
Numerical Experiment in 1D
Consider the following 1 dimensional test problem, the circles are given data and we are to approximate by Kriging the underlying red-lined function. The Kriging interpolant was calculated and is shown on the right hand side below.
As we are solving a statistical model, we can also include a confidence estimation of the prediction by assuming normality and calculating a usual z-score:
Numerical Experiment in 3D
To stay true to the original gold problem explored by Krige, viability of the method should be explored while considering 3D coordinates as the data. In practice, using vectorized MATLAB code makes this equivalent to solving the 1D case. Below is shown the original data set from Figure 1 with the Kriging solution at the right.
References
- Meshfree Approximation Methods with MATLAB, Fasshauer (2007)
- Interpolation of spatial data: some theory for kriging, Stein (2012)