Deep Green Unannotated Protein Structures
The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms: Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of Deep Green proteins and genes was done using various informatics tools and published data sets and is presented in Knoshaug, Sun, et al., 2023, submitted. The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions with the highest pLDDT score, which AlphaFold identifies as the best folded structure (ranked_0). These data enable in-depth structural characterization to aid functional characterization, leading to a deeper understanding of plant biology.
References:
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P. and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589.
Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.
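Because the deposited models are selected by pLDDT, it may help to see how that score can be read back from a structure file. The sketch below is only illustrative: it assumes the files are standard AlphaFold PDB-format outputs, in which the per-residue pLDDT is stored in the B-factor column. The function names and the two-residue fragment are hypothetical and are not taken from the dataset.

```python
from statistics import mean

def plddt_per_residue(pdb_text: str) -> list[float]:
    """Return one pLDDT value per residue, read from C-alpha ATOM records.

    Assumes PDB fixed-width columns: atom name in columns 13-16 and
    B-factor (where AlphaFold writes pLDDT) in columns 61-66.
    """
    scores = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores

def mean_plddt(pdb_text: str) -> float:
    """Average pLDDT over all residues in the model."""
    return mean(plddt_per_residue(pdb_text))

# Hypothetical two-residue fragment for illustration only (not dataset content).
EXAMPLE_PDB = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 85.50           N",
    "ATOM      2  CA  MET A   1      12.560  13.329   2.290  1.00 85.50           C",
    "ATOM      3  N   ALA A   2      13.100  14.600   2.800  1.00 92.10           N",
    "ATOM      4  CA  ALA A   2      14.540  14.800   3.000  1.00 92.10           C",
])

if __name__ == "__main__":
    scores = plddt_per_residue(EXAMPLE_PDB)
    # Residues with pLDDT of at least 70 are commonly treated as confidently modeled.
    confident = sum(score >= 70 for score in scores)
    print(f"residues: {len(scores)}, mean pLDDT: {mean_plddt(EXAMPLE_PDB):.1f}, "
          f"confident residues: {confident}")
```

To apply this to a downloaded model, read the file contents into a string and pass it to `mean_plddt`. Reading only C-alpha atoms yields one value per residue, since AlphaFold assigns the same pLDDT to every atom of a residue.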
Citation Formats
TY - DATA
AB - The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms; Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of Deep Green proteins and genes was done using various informatics tools and published data sets and is presented in Knoshaug, Sun, et al., 2023, submitted. The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions having the highest pLDDT score and thus identified as the best folded structure (ranked_0). These data enable others to do in-depth structural characterizations to aid in functional characterization leading to deeper understanding of plant biology. References: Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P. and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589. Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.
AU - Knoshaug
AU - Sun
AU - Nag
AU - Nguyen
AU - Mattoon
AU - Zhang
AU - Liu
AU - Chen
AU - Cheng
AU - Zhang
AU - St. John
AU - Umen
DB - Open Energy Data Initiative (OEDI)
DP - Open EI | National Renewable Energy Laboratory
DO -
KW - Donald Danforth Plant Science Center
KW - unannotated proteins
KW - protein structure
KW - Arabidopsis thaliana
KW - Setaria viridis
KW - Chlamydomonas reinhardtii
KW - energy crop
KW - model species
KW - green lineage
KW - AlphaFold
LA - English
DA - 2023/04/20
PY - 2023
PB - National Renewable Energy Laboratory
T1 - Deep Green Unannotated Protein Structures
UR - https://data.openei.org/submissions/8267
ER -
Knoshaug, et al. Deep Green Unannotated Protein Structures. National Renewable Energy Laboratory, 20 April, 2023, NREL. https://data.nrel.gov/submissions/216.
Knoshaug, Sun, Nag, Nguyen, Mattoon, Zhang, Liu, Chen, Cheng, Zhang, St. John, & Umen. (2023). Deep Green Unannotated Protein Structures. [Data set]. NREL. National Renewable Energy Laboratory. https://data.nrel.gov/submissions/216
Knoshaug, Sun, Nag, Nguyen, Mattoon, Zhang, Liu, Chen, Cheng, Zhang, St. John, and Umen. Deep Green Unannotated Protein Structures. National Renewable Energy Laboratory, April 20, 2023. Distributed by NREL. https://data.nrel.gov/submissions/216
@misc{OEDI_Dataset_8267,
title = {Deep Green Unannotated Protein Structures},
author = {Knoshaug and Sun and Nag and Nguyen and Mattoon and Zhang and Liu and Chen and Cheng and Zhang and St. John and Umen},
abstractNote = {The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms; Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of Deep Green proteins and genes was done using various informatics tools and published data sets and is presented in Knoshaug, Sun, et al., 2023, submitted. The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions having the highest pLDDT score and thus identified as the best folded structure (ranked_0). These data enable others to do in-depth structural characterizations to aid in functional characterization leading to deeper understanding of plant biology. References: Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P. and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589. Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.},
url = {https://data.nrel.gov/submissions/216},
year = {2023},
howpublished = {NREL, National Renewable Energy Laboratory, https://data.nrel.gov/submissions/216},
note = {Accessed: 2025-05-04}
}
Details
Data from Apr 20, 2023
Last updated Jan 17, 2025
Submitted Apr 20, 2023
Organization
National Renewable Energy Laboratory
Contact
Eric Knoshaug
Authors
Original Source
https://data.nrel.gov/submissions/216
Research Areas
Keywords
Donald Danforth Plant Science Center, unannotated proteins, protein structure, Arabidopsis thaliana, Setaria viridis, Chlamydomonas reinhardtii, energy crop, model species, green lineage, AlphaFold
DOE Project Details
Project Name Deep Green: Structural and Functional Genomic Characterization of Conserved Unannotated Green Lineage Proteins
Project Number ERW9098