"Womp Womp! Your browser does not support canvas :'("

DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays

Publicly accessible License 

DEEPEN stands for DE-risking Exploration of geothermal Plays in magmatic ENvironments.

As part of the development of the DEEPEN 3D play fairway analysis (PFA) methodology for magmatic plays (conventional hydrothermal, superhot EGS, and supercritical), weights needed to be developed for use in the weighted sum of the different favorability index models produced from geoscientific exploration datasets. This was done using two different approaches: one based on expert opinions, and one based on statistical learning. This GDR submission includes the datasets used to produce the statistical learning-based weights.
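The weighted sum described above can be illustrated with a minimal sketch. The layer names, weights, and grid values are hypothetical, and the sketch assumes each favorability index model is an array on a shared grid and that the weights are normalized to sum to one; the source does not specify either detail.

```python
import numpy as np

def combine_favorability(layers, weights):
    """Weighted sum of favorability index models defined on a shared grid.

    layers:  dict mapping dataset name -> array of favorability values
    weights: dict mapping dataset name -> non-negative weight
    """
    total = sum(weights[name] for name in layers)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = np.zeros_like(next(iter(layers.values())), dtype=float)
    for name, grid in layers.items():
        # Dividing by the total keeps the result on the same scale as the inputs.
        combined += (weights[name] / total) * np.asarray(grid, dtype=float)
    return combined

# Hypothetical 2x2 favorability grids (values in [0, 1]) and weights.
layers = {
    "resistivity": np.array([[0.2, 0.8], [0.5, 1.0]]),
    "seismicity": np.array([[0.6, 0.4], [0.5, 0.0]]),
}
weights = {"resistivity": 3.0, "seismicity": 1.0}
combined = combine_favorability(layers, weights)
```

In this sketch the weights could come from either source the submission mentions, expert opinion or statistical learning; only the values in `weights` would change.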

While expert opinions allow us to include more nuanced information in the weights, expert opinions are subject to human bias. Data-centric or statistical approaches help to overcome these potential human biases by focusing on and drawing conclusions from the data alone. The drawback is that, to apply these types of approaches, a dataset is needed. Therefore, we attempted to build comprehensive standardized datasets mapping anomalies in each exploration dataset to each component of each play. This data was gathered through a literature review focused on magmatic hydrothermal plays along with well-characterized areas where superhot or supercritical conditions are thought to exist. Datasets were assembled for all three play types, but the hydrothermal dataset is the least complete due to its relatively low priority.

For each known or assumed resource, the dataset states what anomaly in each exploration dataset is associated with each component of the system. The data is only semi-quantitative: values are either high, medium, or low relative to background levels. In addition, the dataset has significant gaps, as not every possible exploration dataset has been collected and analyzed at every known or suspected geothermal resource area, in the context of all possible play types. The following training sites were used to assemble this dataset:
- Conventional magmatic hydrothermal: Akutan (from AK PFA), Oregon Cascades PFA, Glass Buttes OR, Mauna Kea (from HI PFA), Lanai (from HI PFA), Mt St Helens Shear Zone (from WA PFA), Wind River Valley (from WA PFA), Mount Baker (from WA PFA).
- Superhot EGS: Newberry (EGS demonstration project), Coso (EGS demonstration project), Geysers (EGS demonstration project), Eastern Snake River Plain (EGS demonstration project), Utah FORGE, Larderello, Kakkonda, Taupo Volcanic Zone, Acoculco, Krafla.
- Supercritical: Coso, Geysers, Salton Sea, Larderello, Los Humeros, Taupo Volcanic Zone, Krafla, Reykjanes, Hengill.
**Disclaimer: Treat the supercritical fluid anomalies with skepticism. They are based on assumptions due to the general lack of confirmed supercritical fluid encounters and samples at the sites included in this dataset, at the time of assembling the dataset. The main assumption was that the supercritical fluid in a given geothermal system has shared properties with the hydrothermal fluid, which may not be the case in reality.
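One way to prepare the semi-quantitative high/medium/low entries described above for numerical analysis is to map them to ordinal codes while keeping gaps explicit. The sketch below is only illustrative: the ordinal mapping (low = 1, medium = 2, high = 3), the feature names, and the two example records are assumptions, not the encoding or contents of the published dataset.

```python
# Assumed ordinal codes; the dataset itself only records high/medium/low.
LEVELS = {"low": 1.0, "medium": 2.0, "high": 3.0}

def encode(records, features):
    """Return one row per site; missing or blank entries become None."""
    rows = []
    for record in records:
        row = []
        for feature in features:
            value = record.get(feature)
            row.append(LEVELS[value.strip().lower()] if value else None)
        rows.append(row)
    return rows

def coverage(rows, features):
    """Fraction of sites with a recorded anomaly for each feature."""
    return {
        feature: sum(row[i] is not None for row in rows) / len(rows)
        for i, feature in enumerate(features)
    }

# Hypothetical records: site "A" has no seismicity entry (a data gap).
features = ["resistivity", "heat_flow", "seismicity"]
records = [
    {"site": "A", "resistivity": "Low", "heat_flow": "High"},
    {"site": "B", "resistivity": "medium", "heat_flow": "high", "seismicity": "low"},
]
encoded = encode(records, features)
gaps = coverage(encoded, features)
```

Reporting coverage per feature makes the gaps visible before any statistical method is applied; how missing values are then handled (dropping features, imputing, and so on) is a separate choice that the source does not describe.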

Once the datasets were assembled, principal component analysis (PCA) was applied to each. PCA is an unsupervised statistical learning technique (that is, it does not require labeled data) that summarizes the directions of variance in the data. This approach was chosen because our labels are uncertain: we do not know with 100% confidence that superhot resources exist at all of the assumed positive areas. We also have no data for known non-geothermal areas, which would make it challenging to apply a supervised learning technique. To generate weights from the PCA, the PCA loading values were analyzed. Loading values represent how much each feature contributes to each principal component, and therefore to the overall variance in the data.
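A minimal sketch of how loading values might be turned into weights follows. The source does not state the exact procedure, so this assumes one plausible scheme: take the absolute principal-axis coefficients of each feature, scale them by each component's explained variance ratio, sum over the retained components, and normalize. The function name, the number of components, and the synthetic data are all hypothetical.

```python
import numpy as np

def pca_loading_weights(X, n_components=2):
    """Derive normalized feature weights from PCA loadings (assumed scheme)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # explained variance ratio per component
    k = min(n_components, vt.shape[0])
    # Absolute loadings, scaled by how much variance each component explains.
    importance = (np.abs(vt[:k]) * explained[:k, None]).sum(axis=0)
    return importance / importance.sum()

# Synthetic ordinal data (rows = sites, columns = features).
# The third feature is constant, so it carries no variance.
X = [
    [1, 1, 2],
    [2, 1, 2],
    [3, 3, 2],
    [1, 2, 2],
    [3, 2, 2],
]
weights = pca_loading_weights(X, n_components=2)
```

Under this scheme a feature that never varies across sites, like the third column here, receives a weight of essentially zero, which matches the idea that loadings reflect a feature's contribution to the overall variance.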

Citation Formats

National Renewable Energy Laboratory. (2023). DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays [data set]. Retrieved from https://dx.doi.org/10.15121/1995526.
Taverna, Nicole, Caliandro, Nils, and King, Rachel. DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays. United States: N.p., 30 Jun, 2023. Web. doi: 10.15121/1995526.
Taverna, Nicole, Caliandro, Nils, & King, Rachel. DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays. United States. https://dx.doi.org/10.15121/1995526
Taverna, Nicole, Caliandro, Nils, and King, Rachel. 2023. "DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays". United States. https://dx.doi.org/10.15121/1995526. https://gdr.openei.org/submissions/1509.
@div{oedi_7599, title = {DEEPEN Global Standardized Categorical Exploration Datasets for Magmatic Plays}, author = {Taverna, Nicole, Caliandro, Nils, and King, Rachel.}, doi = {10.15121/1995526}, url = {https://gdr.openei.org/submissions/1509}, place = {United States}, year = {2023}, month = {06}}


Data from Jun 30, 2023

Last updated Sep 15, 2023

Submitted Jul 5, 2023


National Renewable Energy Laboratory


Nicole Taverna


Nicole Taverna

National Renewable Energy Laboratory

Nils Caliandro

National Renewable Energy Laboratory

Rachel King

National Renewable Energy Laboratory

DOE Project Details

Project Name DE-risking Exploration of geothermal Plays in magmatic ENvironments (DEEPEN)

Project Lead Lauren Boyd

Project Number 37178


Submission Downloads