Some aspects of seismic interpretation, such as picking horizons and well-imaged faults, can be easily explained to a new interpreter. Other aspects, such as recognizing carbonate buildups, karst collapse, mass transport processes or volcanic intrusions, require not only an understanding of the underlying geologic processes, but also an understanding of their 3-D seismic data response. Although an experienced interpreter might be adept in using seismic data to map each of these features, they might also be challenged in explaining to the novice interpreter in a quantitative manner how they constructed their map or geobody.
Although seismic attributes quantify lateral and vertical changes in morphology, continuity, thickness and amplitude of the seismic data, they are rarely correlated to a specific feature. For example, while coherence is routinely used to map faults and channel edges, coherence anomalies also delineate angular unconformities, the interior of salt diapirs, shale diapirs, karst collapse and slumps, as well as areas having low signal-to-noise ratio. For this reason, skilled interpreters usually use more than one attribute, along with the seismic amplitude data, to isolate target facies on large 3-D volumes using crossplotting and/or 3-D visualization.
Machine learning (ML) promises to accelerate the task of mapping one or more seismic facies in 3-D. Unlike the human interpreter, computers do not tire and thereby provide a consistent (though perhaps consistently inaccurate) analysis of every voxel in the largest 3-D volume. However, ML algorithms are less skilled than the novice interpreter and, at least for today’s algorithms, generally have no understanding of geologic processes. With ML, skilled interpreters do not need to explain how or why a given event was picked; rather, they only need to define zones (by drawing polygons, for example) representative of the target facies, as well as the background facies from which they wish to differentiate them. These interpreted areas of the seismic volume serve as training data for the ML algorithm, which then tries to mimic the skilled interpreter.
There are two competing ML techniques in use today in seismic facies classification: relatively simple techniques that use conventional attributes as input, and more complex (“deeper”) techniques that generate their own internal attributes as part of the workflow.
In this article, we address the first technique. The challenge, as with interactive interpretation, is to determine which attributes have the largest impact on the classification. Equally important, understanding the sensitivity of the classification result to the input attributes provides valuable information not only about the ML model, but also for subsequent interactive interpretation, and even about the geologic features themselves. We illustrate the process through the application of Shapley additive explanations, or “SHAP,” to a random forest architecture.
Specifically, our goal is to explain and analyze the global and local behavior of the ML model used to distinguish between mass transport deposits (MTDs), salt and conformal siliciclastic sediments in a Gulf of Mexico data set. The 3-D prestack time-migrated seismic volume was acquired by PGS and is characterized by a bin size of 37.5 by 25 meters, a record length of two seconds and a sampling interval of four milliseconds.
In figure 1, we show the seismic amplitudes along line AA’. We observe two salt diapirs characterized by chaotic, low-amplitude reflectors (orange arrows). In addition, a coherent noise event (yellow arrow) is visible within the otherwise chaotic, low-amplitude salt. MTDs (green arrows) tend to be characterized by intercalations of higher amplitude reflectors (red arrow) and more discontinuous, chaotic, lower amplitude reflectors (blue arrow), whereas most of the conformal sediments are characterized by high-amplitude, subparallel coherent reflectors. However, conformal sediments closer to the edges of the salt diapirs exhibit lower seismic amplitude and lower signal-to-noise ratio. Finally, we see some low-frequency noise (purple arrows).
Shapley Additive Explanations
SHAP explains a complex ML model by means of a simpler explanation model, a linear additive feature attribution method. By computing Shapley values from cooperative game theory, SHAP assigns each input attribute an importance value based on its impact on the model prediction.
For local interpretability, positive SHAP values indicate an increase in the probability of a particular seismic facies, whereas negative SHAP values indicate a decrease. To interpret the global behavior of the model, SHAP combines several local explanations: attributes with higher average absolute SHAP values are more important than attributes with lower ones.
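As a minimal sketch of the idea, the following computes exact Shapley values for a toy three-attribute scoring function by enumerating every coalition of attributes. The function `toy_model`, its coefficients, the input values and the all-zero baseline are hypothetical and are not taken from the study; practical SHAP implementations for tree models use much faster algorithms than this brute-force enumeration. The point it illustrates is additivity: the base value plus the sum of the Shapley values reproduces the model output.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all coalitions (only feasible for few attributes).

    Attributes outside a coalition are replaced by their baseline value,
    which is one simple way of treating them as "unknown".
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy "model": a score with one interaction term (not the study's model)
def toy_model(z):
    energy, dip_deviation, bandwidth = z
    return 0.5 * energy - 0.3 * dip_deviation + 0.2 * energy * bandwidth

x = [2.0, 1.0, 0.5]          # illustrative attribute values at one voxel
baseline = [0.0, 0.0, 0.0]   # illustrative "attributes unknown" reference

phi = shapley_values(toy_model, x, baseline)
base_value = toy_model(baseline)
prediction = toy_model(x)

for name, contribution in zip(["energy", "dip deviation", "bandwidth"], phi):
    print(f"{name:14s} {contribution:+.3f}")
print(f"base + sum(phi) = {base_value + sum(phi):.3f}, prediction = {prediction:.3f}")
```

In this toy case the interaction term is shared equally between the two attributes that take part in it, which is the behavior that makes Shapley values attractive for correlated, interacting seismic attributes.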
Our proposed workflow to distinguish between MTDs, salt, and conformal sediments consists of the following steps:
- Compute nine seismic attributes measuring changes in morphology, energy, frequency, continuity and reflector dip.
- Apply a Kuwahara median filter to the seismic attributes to sharpen the edges of the seismic facies, smooth their internal response and suppress seismic noise.
- Generate training and validation data sets by picking polygons surrounding the target seismic facies on six vertical lines throughout the seismic volume. We then randomly select the same number of samples for each seismic facies to balance our data.
- Train and validate a random forest ML algorithm using an 80/20-percent split ratio.
- Apply SHAP to interpret the global and local behavior of the model and the contribution of each attribute to the seismic facies analysis.
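The data preparation steps in the list above (balancing the picked samples and holding out 20 percent for validation) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper `balance_and_split`, the per-facies sample counts, the random seeds and the synthetic attribute vectors are all hypothetical, and the sketch assumes a simple random split.

```python
import random

def balance_and_split(samples, labels, train_fraction=0.8, seed=0):
    """Undersample every class to the size of the smallest one, then split randomly.

    samples: list of attribute vectors (one per picked voxel)
    labels:  list of facies names, same length as samples
    Returns (train, validation), each a list of (attribute_vector, label) pairs.
    """
    rng = random.Random(seed)
    by_class = {}
    for vector, label in zip(samples, labels):
        by_class.setdefault(label, []).append(vector)

    # Same number of samples for each facies
    n_per_class = min(len(vectors) for vectors in by_class.values())
    balanced = []
    for label, vectors in by_class.items():
        for vector in rng.sample(vectors, n_per_class):
            balanced.append((vector, label))

    # Random (not stratified) 80/20 split
    rng.shuffle(balanced)
    n_train = int(round(train_fraction * len(balanced)))
    return balanced[:n_train], balanced[n_train:]

# Synthetic stand-in for picked voxels: nine attribute values per sample.
# The counts are hypothetical and deliberately unbalanced.
data_rng = random.Random(42)
counts = {"MTD": 50, "salt": 80, "conformal": 120}
samples, labels = [], []
for facies, n in counts.items():
    for _ in range(n):
        samples.append([data_rng.random() for _ in range(9)])
        labels.append(facies)

train, validation = balance_and_split(samples, labels)
print(len(train), "training samples,", len(validation), "validation samples")
```

Undersampling to the smallest class is only one way to balance the data; it discards picks from the larger classes but keeps the base probability of every facies equal.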
To study the overall importance of our input seismic attributes in the ML model, we apply SHAP to our training data set. In figure 2, we observe that the total energy makes the highest contribution to the classification (it has the largest average absolute SHAP value), followed in order by the dip deviation, spectral bandwidth, GLCM entropy, coherence, energy deviation, GLCM contrast, the covariance of dip and energy gradient, and reflector convergence.
In addition, using SHAP we can examine the impact of each input attribute on each of the seismic facies. For the MTD and salt seismic facies, the most important attribute is the total energy, whereas for the conformal sediments the dip deviation shows the highest impact, followed by the total energy. For salt and MTDs, the second most important attributes are the spectral bandwidth and the dip deviation, respectively.
Therefore, we note that the attribute importance is dynamic and changes with the seismic facies analyzed. Also, from additional studies (not shown in this article), we found that the quality of the input attributes affects their importance in the classification.
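To make the global and per-facies rankings concrete, the sketch below ranks attributes by their mean absolute SHAP value, first for each facies and then summed over facies, which is one common convention for a multi-class summary. The SHAP values here are synthetic numbers for two voxels and three attributes, chosen only to show that the ranking can differ by facies; they are not results from the study, and the helper names are hypothetical.

```python
def mean_abs_shap(shap_rows):
    """Mean absolute SHAP value of each attribute (column) over all samples (rows)."""
    n = len(shap_rows)
    return [sum(abs(row[j]) for row in shap_rows) / n for j in range(len(shap_rows[0]))]

def rank(scores, names):
    """Attribute names sorted from most to least important."""
    return [name for _, name in sorted(zip(scores, names), reverse=True)]

names = ["total energy", "dip deviation", "spectral bandwidth"]

# Synthetic SHAP values (NOT from the study): one row per voxel, one column per attribute.
shap_by_facies = {
    "MTD":       [[0.20, 0.10, 0.02], [-0.16, -0.06, 0.04]],
    "salt":      [[-0.10, 0.04, 0.06], [0.12, -0.04, -0.06]],
    "conformal": [[-0.10, -0.14, -0.08], [0.04, 0.10, 0.02]],
}

# Importance of each attribute for each facies
per_facies = {facies: mean_abs_shap(rows) for facies, rows in shap_by_facies.items()}
for facies, scores in per_facies.items():
    print(f"{facies:10s}", rank(scores, names))

# Global importance: sum of the per-facies mean absolute SHAP values
global_scores = [sum(per_facies[f][j] for f in per_facies) for j in range(len(names))]
print("global    ", rank(global_scores, names))
```

Using absolute values matters: signed SHAP values of opposite sign at different voxels would otherwise cancel and hide an attribute that is influential everywhere.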
To analyze the impact of the seismic attributes at a particular voxel within the 3-D seismic volume, we compute the SHAP force plots in which SHAP values are considered as “forces” that increase or decrease the probabilities for the target seismic facies.
In figure 3, we analyze the SHAP force plot at voxel A, located inside a salt diapir along line GG’, to interpret how the ML model uses the seismic attributes to obtain the final probabilities for each seismic facies. At voxel A, the random forest architecture misclassified the salt as an MTD: it assigned a 45-percent probability to MTD, 42 percent to conformal sediment and only 13 percent to salt.
In the SHAP force plot (figure 3), the prediction starts from the base value, which is the average predicted probability of each seismic facies over the data set, that is, the prediction made when none of the input attributes are known. In this study, each seismic facies starts with a base probability of 33.3 percent. Next, considering the effect of the seismic attributes in the model (figure 3a), we observe that the total energy, dip deviation, spectral bandwidth and coherence attributes increase the probability of being an MTD (positive SHAP values, shown as red arrows), whereas the GLCM entropy and the covariance of dip and energy gradient decrease it (negative SHAP values, shown as blue arrows), giving a final probability of 45 percent.
For the salt seismic facies (figure 3b), we note that the dip deviation, covariance of dip and energy gradient, and coherence increase the probability, whereas the spectral bandwidth and the total energy decrease it, for a net reduction from the base value of 33 percent to the final 13 percent. Finally, for the conformal sediments (figure 3c), the dip deviation and coherence attributes decrease the probability, whereas the GLCM entropy, covariance of dip and energy gradient, total energy, energy deviation and reflector convergence increase it, for a final probability of 42 percent.
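The bookkeeping behind the three force plots can be checked with a few lines of arithmetic. The sketch below uses only the probabilities reported above for voxel A and the 33.3-percent base value; it assumes that the SHAP values explain class probabilities, as described in the text, and the variable names are illustrative. The individual per-attribute SHAP values are not listed in the article, so only their net sum for each facies is computed.

```python
base = 1.0 / 3.0  # base probability of each facies (33.3 percent)
predicted = {"MTD": 0.45, "salt": 0.13, "conformal": 0.42}  # probabilities reported at voxel A

# By additivity, the SHAP values of all attributes for one facies sum to
# (predicted probability - base value).
net_shap = {facies: p - base for facies, p in predicted.items()}

for facies, net in net_shap.items():
    print(f"{facies:10s} net SHAP contribution = {net:+.3f}")

# The class probabilities sum to one, so the net contributions sum to zero.
total = sum(net_shap.values())
print(f"sum over facies = {total:+.3f}")
```

The net contributions come out at about +0.117 for MTD, -0.203 for salt and +0.087 for conformal sediments. They cancel across the three facies, so any attribute that pushes the voxel toward MTD must pull it away from salt, conformal sediments or both.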
In-Context Interpretation Using SHAP Values
To perform an in-context interpretation to evaluate how changes in the seismic response affect the model predictions, we compute the SHAP values for all Kuwahara-filtered seismic attributes and target seismic facies and co-render them with the seismic amplitudes indicating how the ML architecture “sees” the geology.
Figure 4 shows the SHAP values co-rendered with the seismic amplitudes along line GG’ for the total energy attribute and the MTDs, salt and conformal sediments. As in the SHAP force plots used for the local interpretability analysis, positive SHAP values (red colors) increase the probability, whereas negative SHAP values (blue colors) decrease it. For the MTD seismic facies (figure 4a), we note that the ML architecture correctly identifies MTDs in the area. However, the lower amplitude, lower quality reflectors surrounding the salt diapirs (white arrows) and some high-amplitude conformal sediments also increase the probability of MTDs.
For salt seismic facies (figure 4b), we observe that in general the model correctly classifies lower amplitude, chaotic salt diapirs. However, dipping reflectors and noisy areas close to the edges of the salt also increase the probability for this seismic facies (orange arrows). Finally, the model correctly classifies the conformal sediments (figure 4c). However, some overlap with the MTD seismic facies exists (yellow arrows).
A similar analysis can be performed for each target seismic facies using the remaining eight seismic attributes to understand their impact on the classification.
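One simple way to produce such a co-rendered display is to blend a red or blue color over a grayscale amplitude value, with an opacity that grows with the magnitude of the SHAP value. The sketch below is an assumption about how this could be done, not a description of the software used for figure 4; the function `corender`, its scaling parameters and the 0.7 maximum opacity are hypothetical.

```python
def corender(amplitude, shap_value, amp_max, shap_max, max_opacity=0.7):
    """Blend a red/blue SHAP color over a grayscale amplitude pixel.

    Returns an (r, g, b) tuple with components between 0 and 1.
    amp_max and shap_max are the clipping limits of the two color bars.
    """
    # Grayscale background: -amp_max maps to black, +amp_max to white
    gray = 0.5 + 0.5 * max(-1.0, min(1.0, amplitude / amp_max))
    # Normalized SHAP value between -1 and 1
    strength = max(-1.0, min(1.0, shap_value / shap_max))
    # Positive SHAP values are red, negative values are blue
    color = (1.0, 0.0, 0.0) if strength > 0 else (0.0, 0.0, 1.0)
    # SHAP values near zero are transparent, so the amplitude shows through
    alpha = max_opacity * abs(strength)
    return tuple((1.0 - alpha) * gray + alpha * c for c in color)

neutral = corender(amplitude=0.0, shap_value=0.0, amp_max=1.0, shap_max=1.0)
supports = corender(amplitude=0.0, shap_value=1.0, amp_max=1.0, shap_max=1.0)
opposes = corender(amplitude=0.0, shap_value=-1.0, amp_max=1.0, shap_max=1.0)
print(neutral, supports, opposes)
```

Keeping voxels with small SHAP values transparent leaves the reflector geometry visible, which is what allows the interpreter to relate the attribute's influence to specific reflectors.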
Although our multi-attribute random forest architecture has an accuracy of 91.46 percent in differentiating between MTDs, salt and conformal siliciclastic sediments, there are still some errors. The SHAP global analysis shows that the total energy and dip deviation (a non-parallelism attribute) play the largest role in the classification. Reinspecting the seismic amplitude data confirms that in this seismic volume, salt exhibits low amplitude and chaotic dips (that is, noise migrated into the salt diapirs), conformal sediments exhibit mixed amplitudes but nearly parallel dip, and MTDs show mixed-amplitude rotated blocks of otherwise conformal sediments in some areas and moderate-amplitude chaotic data in other areas. Other attributes help further discriminate between these three seismic facies. Areas of misclassification indicate the need to include such areas in the training data.
ML algorithms lack the contextual insight of a human interpreter, so coherent noise inside salt diapirs can be misclassified by the algorithm as conformal sediments or MTDs. In addition, we observe that dipping parallel conformal sediments close to the edges of the salt diapirs, which are associated with lower signal-to-noise ratio and lower amplitudes, tend to be misclassified as MTDs.
Finally, to analyze how changes in the seismic reflectors impact the classification, we study the SHAP values co-rendered with the seismic amplitudes. We note that overlap between MTDs, salt and conformal sediments exists in the study area.
We would like to thank PGS for providing the seismic data for use in research and education. Also, we thank the sponsors of the Attribute Assisted Seismic Processing and Interpretation consortium for their support.