  • Research article
  • Open access

Calculation of the relative metastabilities of proteins using the CHNOSZ software package

Abstract

Background

Proteins of various compositions are required by organisms inhabiting different environments. The energetic demands for protein formation are a function of the compositions of proteins as well as geochemical variables including temperature, pressure, oxygen fugacity and pH. The purpose of this study was to explore the dependence of metastable equilibrium states of protein systems on changes in the geochemical variables.

Results

A software package called CHNOSZ implementing the revised Helgeson-Kirkham-Flowers (HKF) equations of state and group additivity for ionized unfolded aqueous proteins was developed. The program can be used to calculate standard molal Gibbs energies and other thermodynamic properties of reactions and to make chemical speciation and predominance diagrams that represent the metastable equilibrium distributions of proteins. The approach takes account of the chemical affinities of reactions in open systems characterized by the chemical potentials of basis species. The thermodynamic database included with the package permits application of the software to mineral and other inorganic systems as well as systems of proteins or other biomolecules.

Conclusion

Metastable equilibrium activity diagrams were generated for model cell-surface proteins from archaea and bacteria adapted to growth in environments that differ in temperature and chemical conditions. The predicted metastable equilibrium distributions of the proteins can be compared with the optimal growth temperatures of the organisms and with geochemical variables. The results suggest that a thermodynamic assessment of protein metastability may be useful for integrating bio- and geochemical observations.

Background

Owing to the growing body of compositional data for microbial proteins and the exploration of environments that are extreme from the human standpoint, it has become possible in recent years to draw correlations between the compositions of proteins and environmental parameters such as temperature [1]. Accounting for the underlying causes of the observed correlations between environmental parameters and protein composition is an ongoing challenge. Biochemical approaches are based in part on the notion that proteins from thermophilic and hyperthermophilic organisms should have greater structural stabilities than their mesophilic counterparts [2]. Compositional features of thermophilic proteins that may enhance their structural stabilities include increased numbers of hydrophobic residues, stronger charge interactions on the protein surfaces, and other properties of the amino acid sequence [3]. However, it has also been suggested that, at least for sulfur, the elemental makeup of proteins is correlated with the chemical compositions of the environment [4]. This study was motivated by the desire to explore a possible thermodynamic explanation for the relationship between protein composition and the extracellular environment, which is shaped in part by geochemical constraints.

A thermodynamic assessment of protein metastability provides a framework for describing the relationship between geochemistry and protein composition that until now has received relatively little attention. The geochemical literature abounds with examples of theoretical calculation of the compositions of stable and/or metastable equilibrium reference states as a way to predict the distributions of, and reaction pathways among, minerals and inorganic or organic aqueous species [5, 6]. In recent years, the calculation [7–11] and experimental investigation [12–14] of metastable equilibrium states in biogeochemical systems have gained traction. The primary advantage of extending a framework of this type to proteins and other biomacromolecules is that it places biochemical reactions in the same context as observations on the inorganic systems to which microbial metabolic pathways are coupled. Temperature, pressure, oxidation state and pH are just some of the variables that are commonly measured in geochemical studies and that also appear explicitly in the thermodynamic representation of protein metastability reactions.

This study was undertaken in order to explore the thermodynamic relationships between geochemical variables and protein composition for model proteins from a number of organisms adapted to different environments. The cell-surface glycoproteins in archaea and the surface-layer proteins in bacteria [15, 16] were chosen for this purpose because they are intimately associated with the extracellular aquatic and mineralogical setting.

Because experimental values of the standard molal Gibbs energies of the model proteins were not available, they were calculated using previously reported group additivity and equations of state algorithms that are referenced to ionized unfolded aqueous proteins [17, 18]. These values are requisite for calculating the composition of the metastable equilibrium state in an open system described by chemical potentials of basis species, or perfectly mobile components [19–22]. The predicted chemical activities of species can then be displayed on chemical predominance and/or speciation diagrams whose axes correspond to intensive chemical variables. Because of the lack of integration of algorithms for calculating thermodynamic properties of proteins in available geochemical equilibrium software packages, the task of calculating and graphically representing the metastable equilibrium distributions of the proteins was managed through development of the CHNOSZ software package, which is introduced in this study.

The implementation of the thermodynamic algorithms and data into the package is described first below. The results of the calculations for the model system of proteins are then described and are displayed primarily in the form of diagrams depicting the calculated metastable equilibrium distributions of the proteins. The graphical depictions shown below are only limited portrayals of the metastable equilibrium states of systems of proteins, which are in fact multidimensional functions of thermodynamic variables. The predicted response of at least one of the metastability reactions between proteins from hyperthermophilic and mesophilic organisms appears to be aligned with the differences in temperature, pressure and oxidation state between their environments. However, more tests in other systems will be required to assess the generality of the approach. Some potential implications of the findings are addressed briefly in the concluding remarks, and the paper is finished with a section devoted to the methods adopted for writing protein metastability reactions and computing their thermodynamic properties.

Implementation

The CHNOSZ software package consists of source code, data files, and documentation. It is written for the cross-platform R software environment [23]. The package can be freely downloaded from the project website at http://www.chnosz.net. The features of the package, its basic program structure, and the thermodynamic database are summarized in the following paragraphs.

Features

CHNOSZ was developed in order to ease calculations of 1) the standard molal thermodynamic properties of chemical species and reactions as a function of temperature and pressure, 2) the standard molal thermodynamic properties and equations of state parameters of neutral and ionized proteins using group additivity algorithms, 3) the chemical affinities of formation reactions of species of interest from basis species describing the system, and to assist in 4) generating metastable equilibrium activity diagrams for systems of biomolecules and/or other species.

The functions provided in CHNOSZ are suitable for either interactive use or scripted operation. The diagrams that are produced can be viewed on screen or saved as postscript files. Because the thermodynamic database includes the chemical formulas of species in addition to their standard molal thermodynamic properties, functions operating on user-input chemical reactions have the option to check, and possibly automatically correct, the mass balance of the reactions. This feature can speed up user interaction with the program and the writing of program scripts. The program has been designed with features in mind and is not presently optimized for speed. Most of the diagrams shown below can be produced in under a minute, but temperature-pressure diagrams of the same resolution require substantially more computational time, owing to the number of times the equations of state subroutines are called.

The package was developed with the goal of analyzing protein reactions, but the range of systems that can be studied using the software is limited only by the species available in the thermodynamic database, to which the user can make either temporary or persistent additions or updates. Complete documentation of the functions, including examples derived from the geochemical literature and this study, is provided with the package. Usage of the major functions in CHNOSZ is summarized below.

Standard molal properties

The relationships among the primary functions provided in CHNOSZ and some of the accessory functions are depicted in the flowchart shown in Fig. 1. Calculation of the standard molal thermodynamic properties of species and chemical reactions as a function of temperature and pressure is implemented in the primary function subcrt. The name of this function is a variation of the name of the SUPCRT92 software package [24]. The temperature and pressure ranges of calculations possible using subcrt are the same as those for SUPCRT92.

Figure 1

Functions and data flow in the CHNOSZ program. Data sources are represented by ellipses, and functions by boxes. Computations in CHNOSZ are initiated by the user accessing the primary functions, shown in bold font. The accessory functions, shown in normal font, perform many of the underlying calculations.

The accessory function water implements two computational options for calculating the thermodynamic and electrostatic properties of liquid H2O as a function of temperature and pressure. The first of these options provides an interface to the FORTRAN subroutine named H2O92D.F that was distributed with SUPCRT92 [24] and that is included in the CHNOSZ source package. The calculation of the properties of liquid H2O in this case is consistent with data and equations from Refs. [25–27] and others (see Ref. [24]). The stated temperature and pressure limits of applicability for these calculations, described in Ref. [24], are from 0.01°C and PSAT (i.e., 1 bar at temperatures below 100°C and the saturation vapor pressure of H2O at higher temperatures) to 2250°C and 30000 bar. However, electrostatic properties of the solvent, which are required by the revised Helgeson-Kirkham-Flowers (HKF) equations of state for aqueous species, cannot be computed above 1000°C and 5000 bar. An alternative computational option for the properties of liquid H2O corresponds to the IAPWS-95 formulation for thermodynamic properties [28] coupled with equations for electrostatic properties taken from Ref. [29].

The functions denoted by eos in Fig. 1 actually consist of two functions: hkf, for calculating as a function of temperature and pressure the standard molal thermodynamic properties of aqueous species using the revised HKF equations of state [30–33], and cgl, for calculating the properties of crystalline, gaseous and liquid (except H2O) species. The heat capacity equation implemented in CHNOSZ for these species contains up to six terms, as used in Ref. [34]; the first three terms are those in the Maier-Kelley equation [35, 36], which is used in the SUPCRT92 package.
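To make the role of the heat capacity equation concrete, the following is a sketch of how the three Maier-Kelley terms enter a Gibbs energy calculation at the reference pressure. It assumes the commonly written form of the equation with coefficients $a$, $b$ and $c$ (some sources attach the opposite sign to $c$), a reference temperature $T_r$, and the usual apparent standard molal Gibbs energy convention; the three additional terms available in CHNOSZ are not reproduced here.

```latex
% Maier-Kelley heat capacity (three-term form; sign convention on c is an assumption)
C_P^\circ(T) = a + bT + cT^{-2}

% Integrals needed for enthalpy and entropy changes between T_r and T
\int_{T_r}^{T} C_P^\circ \, dT
  = a\,(T - T_r) + \frac{b}{2}\left(T^2 - T_r^2\right)
    - c\left(\frac{1}{T} - \frac{1}{T_r}\right)

\int_{T_r}^{T} \frac{C_P^\circ}{T} \, dT
  = a \ln\frac{T}{T_r} + b\,(T - T_r)
    - \frac{c}{2}\left(\frac{1}{T^2} - \frac{1}{T_r^2}\right)

% Apparent standard molal Gibbs energy at the reference pressure
\Delta G^\circ(T) = \Delta G^\circ(T_r) - S^\circ(T_r)\,(T - T_r)
  + \int_{T_r}^{T} C_P^\circ \, dT
  - T \int_{T_r}^{T} \frac{C_P^\circ}{T} \, dT
```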

The accessory function info provides a bridge between the thermodynamic and protein databases and the other functions. The function known as makeup is concerned with conversion between various computer- and human-readable representations of the chemical compositions of species. Its primary purpose is to transform the chemical formulas of species contained in the thermodynamic database (e.g., 'C4H6NO4-' for aspartate) into dataframe objects (which in R are similar to matrices with named columns and rows) so that other functions or makeup itself can perform further calculations on the stoichiometries of species. This function is also responsible for transforming a compositional dataframe back into a one-line chemical formula, and for calculating the reaction coefficients of basis species in formation reactions of the species of interest. It is with the aid of this function that subcrt checks whether a user-input chemical reaction is balanced with respect to mass and charge and automatically corrects the reaction if the necessary basis species have been defined.
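The formula-to-composition conversion described above can be illustrated with a minimal sketch in base R. This is not the makeup function from CHNOSZ; the names parse_formula and compose_formula are hypothetical, a plain named vector stands in for the dataframe, and only simple formulas (element symbols, optional counts, optional trailing charge, no parentheses) are handled.

```r
# Sketch only: parse a one-line chemical formula into element counts plus charge (Z).
parse_formula <- function(formula) {
  charge <- 0
  # A trailing "+", "-", "+2", "-3", ... is read as the charge
  m <- regmatches(formula, regexpr("[+-][0-9]*$", formula))
  if (length(m) == 1) {
    sgn <- if (substr(m, 1, 1) == "-") -1 else 1
    mag <- substr(m, 2, nchar(m))
    charge <- sgn * (if (nchar(mag) == 0) 1 else as.numeric(mag))
    formula <- substr(formula, 1, nchar(formula) - nchar(m))
  }
  # Tokens are an element symbol followed by an optional (possibly decimal) count
  tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9.]*", formula))[[1]]
  symbols <- sub("[0-9.]*$", "", tokens)
  counts <- sub("^[A-Za-z]+", "", tokens)
  counts[counts == ""] <- "1"
  counts <- as.numeric(counts)
  # Sum repeated elements; element names come out in alphabetical order
  c(vapply(split(counts, symbols), sum, numeric(1)), Z = charge)
}

# Sketch only: turn a composition vector back into a one-line formula.
compose_formula <- function(makeup) {
  z <- makeup[["Z"]]
  el <- makeup[names(makeup) != "Z"]
  body <- paste0(names(el), ifelse(el == 1, "", as.character(el)), collapse = "")
  tail <- if (z == 0) "" else
    paste0(if (z < 0) "-" else "+", if (abs(z) == 1) "" else abs(z))
  paste0(body, tail)
}

aspartate <- parse_formula("C4H6NO4-")   # C = 4, H = 6, N = 1, O = 4, Z = -1
roundtrip <- compose_formula(aspartate)  # "C4H6NO4-"
```

Keeping the charge as an extra "element" Z in the same vector is a design choice made here so that charge balance can later be handled with the same linear algebra as mass balance.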

Examples of the usage of the info and subcrt functions are shown in the program transcript in Fig. 2. The standard molal thermodynamic properties at 25°C and 1 bar and the equations of state parameters of chicken lysozyme (LYSC_CHICK, accession no. P00698 in the Swiss-Prot database [37]) can be retrieved using the code shown in Fig. 2a. The properties and parameters whose values appear in the example are standard molal Gibbs energy (ΔG°) and enthalpy (ΔH°) of formation from the elements (cal mol⁻¹), standard molal entropy (S°), heat capacity ($C_P^\circ$) and c1 (cal K⁻¹ mol⁻¹), standard molar volume (V°) (cm³ mol⁻¹), a1 (cal bar⁻¹ mol⁻¹), a2 and ω (cal mol⁻¹), a3 (cal K bar⁻¹ mol⁻¹), and a4 and c2 (cal K mol⁻¹). The parameters a1, a2, a3, a4, c1, c2 and ω are species-dependent coefficients in the revised HKF equations of state. Note that the properties and parameters of proteins returned by info are those of nonionized proteins; the ionization contributions to thermodynamic properties of proteins are calculated using a separate function. Sample code for calculating the standard molal thermodynamic properties of LYSC_CHICK as a function of temperature at PSAT is shown in Fig. 2b, where the units are °C (T), bar (P), g cm⁻³ (ρ, density of water) and those listed above for the standard molal properties. The reaction-balancing feature of subcrt is demonstrated in Fig. 2c for Reaction 1 (below). In this mode, all the user has to do is identify the basis species in the system and the reaction coefficients of the proteins, and the program finds the correct quantities of basis species to add to the reaction.

Figure 2

Transcript of CHNOSZ session to calculate thermodynamic properties of proteins and reactions. Commands at the prompt (>) were entered to calculate (a) the standard molal thermodynamic properties at 25°C and 1 bar and equations of state parameters of nonionized chicken lysozyme (LYSC_CHICK), (b) the standard molal thermodynamic properties of lysozyme as a function of temperature at PSAT and (c) the standard molal properties of the nonionized counterpart to Reaction 1 as a function of temperature at PSAT.

Chemical affinities and metastability diagrams

The primary function subcrt and the related accessory functions permit calculation of the standard molal Gibbs energies of protein formation reactions and corresponding values of the equilibrium constants (K r in Eqn. M7). Calculation of the activity products and chemical affinities of reactions (Q r and A r in Eqn. M7) is implemented in the sequence of primary functions basis, species, affinity that is depicted in Fig. 1.

Two conditions are required of a valid set of basis species in CHNOSZ: 1) the number of basis species is equal to the number of elements (plus charge, if present), and 2) the stoichiometric matrix denoting the elemental composition (and charge, if present) of the basis species, which is square according to condition (1), is non-singular and therefore has a real inverse. These two conditions ensure that a formation reaction for any species of interest in the system can be written using only positive or negative real numbers as reaction coefficients on the basis species. The basis species themselves can be any species that are present in the thermodynamic database, including nonionized proteins. The function basis also permits redefining the physical states of basis species (if a corresponding species in that state is present in the thermodynamic database) and/or setting the activities (a) or fugacities (f) of the basis species to be used in the following calculations. These values have default settings given by log a = -3 for aqueous species, log f = 0 for gases and log a = 0 for other species. The function basis can also be used to assign a buffer to one or more basis species so that the activities or fugacities of those basis species are taken from the buffer system.
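The two conditions on the basis species amount to a small linear algebra check, sketched here in base R. The helper names is_valid_basis and formation_coeffs are hypothetical and are not CHNOSZ functions. The basis species assumed are CO2, H2O, NH3, H2S, O2 and H+ (those appearing in Reaction 1), and aspartate, whose formula is quoted earlier in the text, serves as the species of interest.

```r
# Sketch only: validity check for a basis set and formation-reaction coefficients.
elements <- c("C", "H", "N", "O", "S", "Z")   # Z denotes charge
basis_mat <- matrix(c(
  1, 0, 0, 2, 0, 0,   # CO2
  0, 2, 0, 1, 0, 0,   # H2O
  0, 3, 1, 0, 0, 0,   # NH3
  0, 2, 0, 0, 1, 0,   # H2S
  0, 0, 0, 2, 0, 0,   # O2
  0, 1, 0, 0, 0, 1    # H+
), nrow = 6, byrow = TRUE,
dimnames = list(c("CO2", "H2O", "NH3", "H2S", "O2", "H+"), elements))

# Condition 1: square matrix; condition 2: non-zero determinant
is_valid_basis <- function(B) {
  nrow(B) == ncol(B) && abs(det(B)) > 1e-8
}

# Coefficients n such that sum(n[i] * basis[i]) reproduces the composition
formation_coeffs <- function(B, composition) {
  n <- solve(t(B), composition[colnames(B)])
  setNames(as.numeric(n), rownames(B))
}

aspartate <- c(C = 4, H = 6, N = 1, O = 4, S = 0, Z = -1)
n_asp <- formation_coeffs(basis_mat, aspartate)
# i.e. 4 CO2 + 2 H2O + NH3 = C4H6NO4- + 3 O2 + H+

# Counter-example: swapping O2 for H2CO3 (= CO2 + H2O) makes the rows dependent
bad_basis <- basis_mat
bad_basis["O2", ] <- c(1, 2, 0, 3, 0, 0)
rownames(bad_basis)[5] <- "H2CO3"
ok_good <- is_valid_basis(basis_mat)
ok_bad <- is_valid_basis(bad_basis)
```

Negative coefficients in n_asp correspond to basis species that appear on the product side when the formation reaction is written out.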

After defining the basis species, the user can select any number of species of interest using the primary function species. The user may also call species to remove species or to alter the chemical activities or fugacities of the species of interest to be used in the calculations of chemical affinity. These values default to log a = -3 for aqueous species, log f = 0 for gases and log a = 0 for other species.

The function affinity permits calculation of log Q r and A r of formation reactions (such as those represented generically by Reaction M1) using Eqn. (M7) taking into account the activities and/or fugacities of the basis species and the species of interest. The contributions of the Q r and K r terms to the calculation are denoted conceptually in Fig. 1 by the two arrows, from the top and left, respectively, pointing toward the box labeled affinity. The calculations of chemical affinity can be carried out at a single point in temperature, pressure, chemical activity space, or as a function of one or two of T, P and logarithms of chemical activity or fugacity of the basis species. The accessory function buffer is invoked by affinity if one or more basis species were previously associated with a buffer system; the activities or fugacities of the basis species constrained in this way are then used by the program to calculate log Q r using Eqn. (M5).
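The relationship among the equilibrium constant, the activity product and the chemical affinity can be sketched in a few lines of base R. The sketch assumes the form A = 2.303RT(log K - log Q), which is Eqn. (2) of this paper rearranged for the affinity. The function names, the two-species reaction and its log K value are hypothetical and chosen only for illustration; this is not the affinity function from CHNOSZ.

```r
# Sketch only: chemical affinity from log K and log Q.
R_cal <- 1.9872  # gas constant, cal K^-1 mol^-1

# coeffs: signed reaction coefficients (products > 0, reactants < 0)
# log_a:  log10 activities or fugacities, named identically to coeffs
log_Q <- function(coeffs, log_a) {
  stopifnot(identical(names(coeffs), names(log_a)))
  sum(coeffs * log_a)
}

# Affinity in cal mol^-1; log(10) is used in place of the rounded 2.303
affinity_cal <- function(logK, coeffs, log_a, T_kelvin = 298.15) {
  log(10) * R_cal * T_kelvin * (logK - log_Q(coeffs, log_a))
}

# Hypothetical reaction A = 2 B with log K = 1, both species at activity 10^-3
coeffs <- c(A = -1, B = 2)
log_a <- c(A = -3, B = -3)
lq <- log_Q(coeffs, log_a)              # (-1)(-3) + (2)(-3) = -3
A_r <- affinity_cal(1, coeffs, log_a)   # positive: reaction favored as written
```

A positive affinity means the reaction is favored in the direction written, zero corresponds to (metastable) equilibrium, and a negative value favors the reverse direction.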

The results of the calculations performed by affinity are accepted as input by diagram, which produces the diagrams using plotting functions provided in the R distribution. Many options are available for adding labels and legends and otherwise customizing the plot style.

Thermodynamic database

The database of thermodynamic properties packaged with CHNOSZ is contained in a file named OBIGT.csv. Work on this database was motivated by a software project developed by H. C. Helgeson and coworkers, named OrganoBioGeoTherm, that provides a Windows interface to the SUPCRT92 program (J. J. Donovan, personal communication).

The thermodynamic data file has records for over 2500 inorganic, organic and biochemical crystalline, gaseous, liquid and aqueous species. The thermodynamic data were originally taken from the data file distributed with the SUPCRT92 package. Updates since that time were taken from the SLOP98 data file downloaded from http://geopig.asu.edu and from recent reports of thermodynamic data and revised HKF equations of state parameters for aqueous inorganic and organic species, as well as proteins and other species of biogeochemical interest [38–40, and others]. The records in the data file include the names, states and chemical formulas of the species, up to two literature citations, and values of the standard molal thermodynamic properties at 25°C and 1 bar and equations of state parameters. The comma-separated-value (.csv) file format permits rapid reading of the data file by the CHNOSZ program or other software as well as addition to or modification of the file contents by the user. The CHNOSZ package also provides utility functions that can be used to export or import thermodynamic data to or from the SUPCRT92 data file format.

The data file protein.csv of amino acid compositions of proteins has records for over 200 proteins including those referred to in the present study. The user can add the composition of a protein to CHNOSZ by modifying this file, or at run time by inputting the amino acid composition of the protein at the command line or requesting a search of the online Swiss-Prot database (http://www.expasy.org) [37] through the function called protein.

Results

The model cell-surface proteins used in this study are listed in Table 1. The selected organisms were chosen to represent diverse geochemical environments. It can be seen from the optimal growth temperatures given in Table 1 that three of the organisms (M. jannaschii, M. sociabilis and M. fervidus) are hyperthermophilic, others such as M. voltae are mesophilic, and one organism (M. burtonii) is psychrotolerant. The chemical formulas and standard molal Gibbs energies of the proteins shown in Table 1 are those calculated for the nonionized aqueous proteins. Although the real proteins form crystalline or paracrystalline lattices on the cell surface [41], we are restricted at this time to using an aqueous group additivity model for lack of a crystalline analog. The present formulation is also restricted to the polypeptide molecules of proteins and does not take account of the presence of the carbohydrate chains in the glycoproteins. The standard molal Gibbs energies of ionized proteins were calculated in the present study by combining those of the nonionized proteins with ionization contributions (see Ref. [18] and the Methods).

Table 1 Model proteins used in the present study.

The relative metastabilities of the model proteins were calculated as a function of temperature, pressure and chemical activities or fugacities of basis species. Results of the calculations are presented below primarily on metastable equilibrium activity diagrams depicting either the predominant protein species as a function of two intensive variables, or on speciation diagrams showing the metastable equilibrium chemical activities of proteins as a function of a single variable. The computations were carried out using the CHNOSZ software package together with a program script for use with the package that is provided in Additional File 1.

Predominance diagrams

To assess the relative metastabilities of surface-layer proteins from different organisms as a function of temperature, pressure and oxidation state, we can first write a reaction between the cell-surface proteins from M. voltae and M. jannaschii as

$$
\begin{aligned}
&\tfrac{1}{553}\,\mathrm{C_{2575}H_{4040.935}N_{645}O_{884}S_{11}^{-56.065}}\,(\mathrm{CSG\_METVO},\,aq) + 0.164\,\mathrm{CO}_{2(aq)} + 0.031\,\mathrm{H_2O} + 0.041\,\mathrm{NH}_{3(aq)} + 0.006\,\mathrm{H_2S}_{(aq)} \\
&\quad\rightleftharpoons\ \tfrac{1}{530}\,\mathrm{C_{2555}H_{3976.130}N_{640}O_{865}S_{14}^{-55.870}}\,(\mathrm{CSG\_METJA},\,aq) + 0.163\,\mathrm{O}_{2(g)} + 0.004\,\mathrm{H^+},
\end{aligned}
\tag{1}
$$

which is a specific statement of Reaction M2 for the ionized proteins. The coefficient in front of each of the protein formulas is the reciprocal of the number of amino acid residues in the corresponding protein. Hence, protein length is conserved in Reaction 1. Let us now write a specific statement of Eqn. (M8) for Reaction 1 as

$$
\log K_1 = \frac{\boldsymbol{A}_1}{2.303RT}
 + \log\frac{a_{\mathrm{CSG\_METJA}}^{1/530}}{a_{\mathrm{CSG\_METVO}}^{1/553}}
 + \log\frac{f_{\mathrm{O_2}(g)}^{0.163}\,a_{\mathrm{H^+}}^{0.004}}
            {a_{\mathrm{CO_2}(aq)}^{0.164}\,a_{\mathrm{H_2O}}^{0.031}\,a_{\mathrm{NH_3}(aq)}^{0.041}\,a_{\mathrm{H_2S}}^{0.006}},
\tag{2}
$$

where R stands for the gas constant and log K1 and A1 denote, respectively, the logarithm of the equilibrium constant and the chemical affinity of Reaction 1.
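The coefficients on the basis species in Reaction 1 follow from mass and charge balance alone, which can be checked with a short base R sketch. It uses only the formulas, charges and lengths given in Reaction 1, together with the basis species CO2, H2O, NH3, H2S, O2 and H+. The variable names are hypothetical and the calculation does not call CHNOSZ.

```r
# Sketch only: recover the basis-species coefficients of Reaction 1 by mass/charge balance.
basis_names <- c("CO2", "H2O", "NH3", "H2S", "O2", "H+")
elements <- c("C", "H", "N", "O", "S", "Z")   # Z denotes charge
B <- matrix(c(
  1, 0, 0, 2, 0, 0,   # CO2
  0, 2, 0, 1, 0, 0,   # H2O
  0, 3, 1, 0, 0, 0,   # NH3
  0, 2, 0, 0, 1, 0,   # H2S
  0, 0, 0, 2, 0, 0,   # O2
  0, 1, 0, 0, 0, 1    # H+
), nrow = 6, byrow = TRUE, dimnames = list(basis_names, elements))

# Per-residue compositions (formula and charge divided by protein length)
metvo <- c(C = 2575, H = 4040.935, N = 645, O = 884, S = 11, Z = -56.065) / 553
metja <- c(C = 2555, H = 3976.130, N = 640, O = 865, S = 14, Z = -55.870) / 530

# Basis-species coefficients in the formation reaction of each protein residue
n_vo <- solve(t(B), metvo[elements])
n_ja <- solve(t(B), metja[elements])

# Amounts added on the CSG_METVO (reactant) side; negative values belong
# on the CSG_METJA (product) side
reactant_side <- setNames(as.numeric(n_ja - n_vo), basis_names)
print(round(reactant_side, 3))
```

The positive entries are the reactants written alongside CSG_METVO and the negative entries (O2 and H+) are the products written alongside CSG_METJA. With these inputs the H2S coefficient comes out at about 0.0065, so its third decimal place depends on rounding; Reaction 1 lists it as 0.006.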

The equal-activity boundary shown in Fig. 3a between CSG_METVO and CSG_METJA is consistent with metastable equilibrium between the proteins, or A1 = 0. The location of the boundary can be calculated by combining Eqn. (2) with A1 = 0, the equilibrium constant of the reaction, and the reference activities of the basis species and proteins. In this study, the reference activities of the proteins were set to 10⁻³ and those of the basis species set to the values listed in the Methods.
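The algebra behind the boundary calculation can be written out as a sketch. Only the rearrangement of Eqn. (2) is shown; no numerical values of log K1 or of the basis-species activities are assumed. Because the coefficient on H+ depends on the ionization states of the proteins, and log K1 is computed from the Gibbs energies of the ionized proteins, the local slope derived in the last line should be read as applying only near the pH for which Reaction 1 is written.

```latex
% Setting A_1 = 0 in Eqn. (2) and solving for log f_O2(g):
\begin{aligned}
0.163\,\log f_{\mathrm{O_2}(g)} ={}& \log K_1
  - \left(\tfrac{1}{530}\log a_{\mathrm{CSG\_METJA}}
        - \tfrac{1}{553}\log a_{\mathrm{CSG\_METVO}}\right)
  - 0.004\,\log a_{\mathrm{H^+}} \\
 & + 0.164\,\log a_{\mathrm{CO_2}(aq)} + 0.031\,\log a_{\mathrm{H_2O}}
   + 0.041\,\log a_{\mathrm{NH_3}(aq)} + 0.006\,\log a_{\mathrm{H_2S}}
\end{aligned}

% With both protein activities equal to 10^{-3}, the protein term is small but non-zero
% because the exponents 1/530 and 1/553 differ:
\tfrac{1}{530}(-3) - \tfrac{1}{553}(-3)
  = -3\left(\tfrac{1}{530} - \tfrac{1}{553}\right) \approx -2.4\times10^{-4}

% Using pH = -log a_H+, the local slope of the boundary on a log f_O2(g)-pH diagram,
% with log K_1 and the other activities held fixed, is
\frac{\partial \log f_{\mathrm{O_2}(g)}}{\partial\,\mathrm{pH}}
  = \frac{0.004}{0.163} \approx 0.025
```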

Figure 3

Relative metastabilities of proteins. $\log f_{\mathrm{O_2}(g)}$-pH diagrams at 25°C and 1 bar were constructed using activities of the basis species given in the Methods. Predominance field boundaries correspond to metastable equilibrium activities of proteins equal to 10⁻³. The diagrams were made for (a) all of the proteins listed in Table 1 and (b) the proteins listed in Table 1 except for those appearing in the first diagram. The dashed line appearing in each diagram represents the lower (reducing) stability limit of H2O.

In Reaction 1 it can be noted that O2(g) appears on the same side of the reaction as $\mathrm{C_{2555}H_{3976.130}N_{640}O_{865}S_{14}^{-55.870}}$ (CSG_METJA, aq); hence, the metastability of this protein is increased relative to that of $\mathrm{C_{2575}H_{4040.935}N_{645}O_{884}S_{11}^{-56.065}}$ (CSG_METVO, aq) by decreasing $\log f_{\mathrm{O_2}(g)}$, which can be seen in Fig. 3a. It is also apparent from Fig. 3a that at pH 7, the formation of CSG_METJA is predicted to be favored by increasing pH. However, at pHs less than ~6, increasing pH favors formation of CSG_METVO. This observation is consistent with the variation in the charges of the proteins as a function of pH, which are shown normalized to the lengths of the proteins in Fig. 4a. For example, at pH 2, the charge per residue of CSG_METJA is greater than that of CSG_METVO, and a statement of Reaction 1 written for the proteins in their calculated ionization states at this pH would have H+ as a reactant instead of a product. The standard molal Gibbs energies of the ionized proteins which were used to calculate log K1 are depicted in Fig. 4b per residue of protein.

Figure 4

Properties of archaeal surface-layer proteins. Shown are calculated values of the net charge per residue (a) and standard molal Gibbs energy of formation from the elements (b) at 25°C and 1 bar for surface-layer proteins from archaeal species listed in Table 1. The computed charges per residue of CSG_METFE and CSG_METSC are indistinguishable from one another in (a).

Figure 3a was generated in CHNOSZ using a sequence of commands similar to the following. The complete program script for this and the other figures is provided in Additional File 1:

# Define the basis species for the chemical system
basis("CHNOS+")
# Define the proteins of interest (Table 1)
species(c("CSG_METSC", "CSG_METJA", "CSG_METFE", "CSG_HALJP", "CSG_METVO",
          "CSG_METBU", "SLAP_ACEKI", "SLAP_BACST", "SLAP_BACLI", "SLAP_AERSA"))
# Calculate affinities on a grid of pH and log fO2(g)
a <- affinity(pH = c(0, 14), O2 = c(-85, -60))
# Draw the predominance diagram
diagram(a)
(3)

Execution of the first command shown in Example 3 defines the basis species characterizing the chemical system. Here, 'CHNOS+' is a keyword that identifies the basis species used in this paper and that appear in Reaction 1. The second command defines the species of interest, corresponding to the proteins listed in Table 1. With the third command, the chemical affinities of the formation reactions of each of the proteins are calculated on a two-dimensional grid as a function of pH and $\log f_{\mathrm{O}_{2(g)}}$ and the results assigned to a temporary object. Finally, the fourth command instructs the program to produce a metastable equilibrium activity diagram for the system, which in this case is a predominance diagram as a function of pH and $\log f_{\mathrm{O}_{2(g)}}$. The reference temperature and pressure and activities of the basis species and proteins are not explicitly specified in Example 3, and are set to default values by the program that correspond to those described in the Methods.

The approach used in CHNOSZ to make predominance diagrams does not rely on writing metastability reactions as represented by Reaction 1 but instead on using formation reactions for the proteins. For example, a specific statement of Reaction M1 for CSG_METJA in its computed ionization state at 25°C, 1 bar and pH 7 is

$$2555\,\mathrm{CO}_{2(aq)} + 1042\,\mathrm{H_2O} + 640\,\mathrm{NH}_{3(aq)} + 14\,\mathrm{H_2S}_{(aq)} \rightleftharpoons \mathrm{C_{2555}H_{3976.130}N_{640}O_{865}S_{14}}^{-55.870}_{(\mathrm{CSG\_METJA},\,aq)} + 2643.5\,\mathrm{O}_{2(g)} + 55.870\,\mathrm{H^+}.$$
(4)

Using CHNOSZ, the chemical affinities of Reaction 4 and its counterparts for any other specified proteins of interest are first computed using Eqn. (M7). The chemical affinities of the formation reactions are then compared with one another to determine the theoretically predominant protein given the input conditions, which is the one with the highest chemical affinity of formation per residue. In this way, it is possible to generate predominance diagrams like those shown in Figs. 3a and 3b for any number of proteins. The diagram shown in Fig. 3a was produced using all ten proteins listed in Table 1, but only some of the proteins predominate at different points in the diagram. Removing these proteins from consideration leads to the results shown in Fig. 3b, where the metastability relationships among some of the less metastable proteins are depicted.
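The comparison of affinities per residue can be sketched in base R. This is only an illustration under stated assumptions, not the CHNOSZ implementation: it borrows the constant terms 0.189 and 0.593 that appear in Eqns. (12) and (13) (which apply to pH 7 and log fO2(g) = -80), assumes protein activities of 10^-3 and lengths of 553 and 530 residues, and uses hypothetical variable names.

```r
# Constant terms of A/2.303RT per residue at unit residue activity,
# taken from Eqns. (12) and (13) (pH 7, log fO2 = -80); illustrative input
const_per_residue <- c(CSG_METVO = 0.189, CSG_METJA = 0.593)
n_residues <- c(CSG_METVO = 553, CSG_METJA = 530)  # protein lengths
log_a_protein <- -3                                # assumed log activity of each protein

# Affinity (A/2.303RT) of the whole-protein formation reactions:
# the whole-protein reaction is n times the residue reaction,
# apart from the activity term of the protein itself
affinity_whole <- n_residues * const_per_residue - log_a_protein

# Normalize by protein length and pick the protein with the highest value
affinity_per_residue <- affinity_whole / n_residues
predominant <- names(which.max(affinity_per_residue))
print(affinity_per_residue)
print(predominant)
```

Under these assumptions CSG_METJA has the higher affinity per residue, in line with its greater metastable equilibrium activity computed later in the text for the same conditions.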

Chemical activity (speciation) diagrams

To calculate the chemical activities of proteins in metastable equilibrium, let us consider two ways of writing the formulas of proteins in chemical reactions. The first is represented in Reaction 1 above, in which are entered the whole formulas of proteins. If the conditions are such that metastable equilibrium between the proteins in this reaction corresponds to activities of the proteins each equal to $10^{-3}$, we have in Eqn. (2) $\log\left(a_{\mathrm{CSG\_METJA}}^{1/530} / a_{\mathrm{CSG\_METVO}}^{1/553}\right) = -0.0002$. If we decrease $\log f_{\mathrm{O}_{2(g)}}$ by a single unit, it follows from Eqn. (2) that $\log\left(a_{\mathrm{CSG\_METJA}}^{1/530} / a_{\mathrm{CSG\_METVO}}^{1/553}\right) = -0.0002 + 0.163 = 0.1628$. Accordingly, supposing that $a_{\mathrm{CSG\_METJA}}$ is held constant at $10^{-3}$, the activity of CSG_METVO would be $\sim 10^{-93}$, a vanishingly small quantity. The relative metastabilities of proteins computed using this approach are shown graphically in Fig. 5a, where it can be seen that the logarithms of activities of the non-predominant proteins drop precipitously.
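The arithmetic behind the figure of ~10^-93 can be checked with a few lines of base R. This is a sketch that uses only the numbers quoted in the paragraph above (lengths 530 and 553, activities of 10^-3, and the shift of 0.163); the variable names are chosen here for illustration.

```r
n_JA <- 530        # length of CSG_METJA
n_VO <- 553        # length of CSG_METVO
log_a_JA <- -3     # log activity of CSG_METJA, held constant
log_a_VO0 <- -3    # starting log activity of CSG_METVO

# log(a_JA^(1/530) / a_VO^(1/553)) at the starting activities
log_ratio_initial <- log_a_JA / n_JA - log_a_VO0 / n_VO

# Value quoted in the text after lowering log fO2 by one unit
log_ratio_new <- -0.0002 + 0.163

# Solve log_ratio_new = log_a_JA / n_JA - log_a_VO / n_VO for log_a_VO
log_a_VO <- n_VO * (log_a_JA / n_JA - log_ratio_new)
print(log_ratio_initial)
print(log_a_VO)
```

Because the exponent 1/553 multiplies the logarithm of the activity, a shift of only 0.163 in the ratio translates into roughly 90 log units in the activity of the whole protein.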

Figure 5

Metastable equilibrium chemical activities of proteins. Logarithms of chemical activities of the proteins listed in Table 1 were calculated at 25°C and 1 bar using reactions written for (a) whole protein formulas or (b) residue equivalents of the proteins. Activities of the basis species were set to the values given in the Methods, and initial activities of the proteins were set to $10^{-3}$. The vertical dashed lines represent the lower stability limit of H2O.

Let us propose to write the formulas of proteins in metastability reactions as residue equivalents instead of whole protein formulas. The chemical formula or any standard molal thermodynamic property of a residue equivalent of a protein is defined to be that of the protein divided by the length of the protein. In contrast, assuming activity coefficients of proteins and residue equivalents to be unity, the chemical activity of the residue equivalent of the $j$th protein ($a_{\mathrm{residue},j}$) is equal to the chemical activity of the protein ($a_j$) multiplied by the length of the protein ($n_j$):

$$a_{\mathrm{residue},j} = n_j \times a_j.$$
(5)
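As a quick check of these definitions, the residue equivalent of CSG_METJA can be computed from its whole formula in base R. This is a sketch: the formula, charge and length (530) are those quoted in the text for pH 7, while the vector layout (with the charge stored as element `Z`) and the variable names are assumptions made for illustration.

```r
# Whole-protein formula and charge (Z) of ionized CSG_METJA at pH 7
formula_JA <- c(C = 2555, H = 3976.130, N = 640, O = 865, S = 14, Z = -55.870)
n_JA <- 530   # length of the protein in residues

# The formula of the residue equivalent is the protein formula divided by length
residue_JA <- formula_JA / n_JA

# The activity of the residue equivalent is the protein activity times length
a_JA <- 1e-3
a_residue_JA <- n_JA * a_JA

print(round(residue_JA, 3))
print(a_residue_JA)
```

Rounded to three decimal places, the coefficients reproduce the residue formula of CSG_METJA used in the reactions that follow.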

We can rewrite Reaction 1 in terms of the residue equivalents of the proteins as

$$\mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}_{(\mathrm{residue,\,CSG\_METVO},\,aq)} + 0.164\,\mathrm{CO}_{2(aq)} + 0.031\,\mathrm{H_2O} + 0.041\,\mathrm{NH}_{3(aq)} + 0.006\,\mathrm{H_2S}_{(aq)} \rightleftharpoons \mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}_{(\mathrm{residue,\,CSG\_METJA},\,aq)} + 0.163\,\mathrm{O}_{2(g)} + 0.004\,\mathrm{H^+}.$$
(6)

In Reaction 6, the coefficients on the reactant and product residue equivalents are both set to unity. Hence, in both Reactions 1 and 6 protein length is conserved. Using Eqn. (M8) we can write for Reaction 6,

$$\log K_6 = \boldsymbol{A}_6/2.303RT + \log\frac{a_{\mathrm{residue,CSG\_METJA}}}{a_{\mathrm{residue,CSG\_METVO}}} + \log\frac{f_{\mathrm{O}_{2(g)}}^{0.163}\,a_{\mathrm{H^+}}^{0.004}}{a_{\mathrm{CO}_{2(aq)}}^{0.164}\,a_{\mathrm{H_2O}}^{0.031}\,a_{\mathrm{NH}_{3(aq)}}^{0.041}\,a_{\mathrm{H_2S}}^{0.006}}.$$
(7)

Let us now consider conditions such that the metastable equilibrium activities of the proteins are each equal to $10^{-3}$. From Eqn. (5) we have $a_{\mathrm{residue,CSG\_METJA}} = 0.530$ and $a_{\mathrm{residue,CSG\_METVO}} = 0.553$, so $\log(a_{\mathrm{residue,CSG\_METJA}}/a_{\mathrm{residue,CSG\_METVO}}) = -0.018$. Now, if $\log f_{\mathrm{O}_{2(g)}}$ is decreased by one unit, it follows from Eqn. (7) that to maintain metastable equilibrium, $\log(a_{\mathrm{residue,CSG\_METJA}}/a_{\mathrm{residue,CSG\_METVO}}) = -0.018 + 0.163 = 0.145$. Supposing $a_{\mathrm{residue,CSG\_METJA}}$ to be held constant at 0.530 ($a_{\mathrm{CSG\_METJA}} = 10^{-3}$), $a_{\mathrm{residue,CSG\_METVO}}$ would be 0.380 ($a_{\mathrm{CSG\_METVO}} = 10^{-3.16}$). This type of assessment leads to the results shown graphically in Fig. 5b, where it can be seen that the metastable equilibrium activities of the proteins as a function of $\log f_{\mathrm{O}_{2(g)}}$ are within a few log units of each other, even for the non-predominant proteins.
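The numbers in this paragraph can be reproduced with a short base R sketch. It uses only values stated in the text (lengths 530 and 553, protein activities of 10^-3, and the shift of 0.163 per unit of log fO2(g)); the variable names are illustrative, and the log ratio is rounded to three decimals to mirror the text.

```r
n <- c(CSG_METJA = 530, CSG_METVO = 553)           # protein lengths
a_protein <- c(CSG_METJA = 1e-3, CSG_METVO = 1e-3) # starting protein activities

# Eqn. (5): activities of the residue equivalents (0.530 and 0.553)
a_residue <- n * a_protein

# log(a_residue,JA / a_residue,VO) at the starting point, about -0.018
log_ratio <- log10(a_residue[["CSG_METJA"]] / a_residue[["CSG_METVO"]])

# Lowering log fO2 by one unit raises the ratio by 0.163 (Eqn. 7)
log_ratio_new <- round(log_ratio, 3) + 0.163

# Hold a_residue,JA at 0.530 and solve for CSG_METVO
a_res_VO_new <- a_residue[["CSG_METJA"]] / 10^log_ratio_new
log_a_VO_new <- log10(a_res_VO_new / n[["CSG_METVO"]])
print(a_res_VO_new)
print(log_a_VO_new)
```

In contrast to the whole-formula calculation, the activity of CSG_METVO changes by only a fraction of a log unit here, because the residue equivalents enter the reaction with unit coefficients.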

The diagram shown in Fig. 5b was actually constructed using CHNOSZ by taking account of the formation reactions of residue equivalents of the proteins, instead of the metastability reaction represented by Reaction 6. To demonstrate this procedure, let us write the formation reaction for the residue equivalent of CSG_METVO as

$$4.656\,\mathrm{CO}_{2(aq)} + 1.935\,\mathrm{H_2O} + 1.166\,\mathrm{NH}_{3(aq)} + 0.020\,\mathrm{H_2S}_{(aq)} \rightleftharpoons \mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}_{(\mathrm{residue,\,CSG\_METVO},\,aq)} + 4.825\,\mathrm{O}_{2(g)} + 0.101\,\mathrm{H^+}$$
(8)

and that for the residue equivalent of CSG_METJA as

$$4.821\,\mathrm{CO}_{2(aq)} + 1.966\,\mathrm{H_2O} + 1.208\,\mathrm{NH}_{3(aq)} + 0.026\,\mathrm{H_2S}_{(aq)} \rightleftharpoons \mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}_{(\mathrm{residue,\,CSG\_METJA},\,aq)} + 4.988\,\mathrm{O}_{2(g)} + 0.105\,\mathrm{H^+}.$$
(9)

Specific statements of Eqn. (M8) for Reactions 8 and 9 are, respectively,

$$\boldsymbol{A}_8/2.303RT = \log K_8 - \log\left(a_{\mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}\,(\mathrm{residue,\,CSG\_METVO},\,aq)}\; f_{\mathrm{O}_{2(g)}}^{4.825}\, a_{\mathrm{H^+}}^{0.101}\right) + \log\left(a_{\mathrm{CO}_{2(aq)}}^{4.656}\, a_{\mathrm{H_2O}}^{1.935}\, a_{\mathrm{NH}_{3(aq)}}^{1.166}\, a_{\mathrm{H_2S}}^{0.020}\right)$$
(10)

and

$$\boldsymbol{A}_9/2.303RT = \log K_9 - \log\left(a_{\mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}\,(\mathrm{residue,\,CSG\_METJA},\,aq)}\; f_{\mathrm{O}_{2(g)}}^{4.988}\, a_{\mathrm{H^+}}^{0.105}\right) + \log\left(a_{\mathrm{CO}_{2(aq)}}^{4.821}\, a_{\mathrm{H_2O}}^{1.966}\, a_{\mathrm{NH}_{3(aq)}}^{1.208}\, a_{\mathrm{H_2S}}^{0.026}\right).$$
(11)

At metastable equilibrium, $\boldsymbol{A}_8 = \boldsymbol{A}_9$, i.e. the chemical affinities of the formation reactions of the residue equivalents are equal. Values of $\log K_8 = -367.714$ and $\log K_9 = -379.687$ can be obtained using standard molal Gibbs energies at 25°C and 1 bar of the basis species and of the ionized proteins at pH 7 (see Fig. 4b). Let us also substitute the reference activities of the basis species described in the Methods and $\log f_{\mathrm{O}_{2(g)}} = -80$ to write

$$\boldsymbol{A}_8/2.303RT = 0.189 - \log a_{\mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}\,(\mathrm{residue,\,CSG\_METVO},\,aq)}$$
(12)

and

$$\boldsymbol{A}_9/2.303RT = 0.593 - \log a_{\mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}\,(\mathrm{residue,\,CSG\_METJA},\,aq)}.$$
(13)

There are three unknowns in Eqns. (12) and (13). Conservation of protein length leads to a third equation:

$$a_{\mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}\,(\mathrm{residue,\,CSG\_METVO},\,aq)} + a_{\mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}\,(\mathrm{residue,\,CSG\_METJA},\,aq)} = 1.083,$$
(14)

where the value on the right-hand side corresponds to initial activities of the proteins each equal to $10^{-3}$. Solving Eqns. (12)–(14) gives $a_{\mathrm{C_{4.656}H_{7.307}N_{1.166}O_{1.598}S_{0.020}}^{-0.101}\,(\mathrm{residue,\,CSG\_METVO},\,aq)} = 0.307$, $a_{\mathrm{C_{4.821}H_{7.502}N_{1.208}O_{1.632}S_{0.026}}^{-0.105}\,(\mathrm{residue,\,CSG\_METJA},\,aq)} = 0.776$ and $\boldsymbol{A}_{(8\ \mathrm{or}\ 9)}/2.303RT = 0.703$.
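Because the residue equivalents have unit coefficients, Eqns. (12)–(14) admit a closed-form solution, sketched below in base R. The function name `solve_metastable` and its arguments are hypothetical and are not part of CHNOSZ, and the inputs are the rounded constants 0.189, 0.593 and 1.083 printed above, so the results agree with the values in the text only to within rounding.

```r
# Solve  A = const_j - log10(a_j)  (same A for every protein)
# together with  sum(a_j) = total  for the residue activities a_j.
# Setting a_j = 10^(const_j - A) and summing gives
#   a_j = total * 10^const_j / sum(10^const)
solve_metastable <- function(const, total) {
  w <- 10^const
  a <- total * w / sum(w)
  list(a_residue = a,
       affinity = unname(const[1] - log10(a[1])))  # common A/2.303RT
}

# Constants from Eqns. (12) and (13); total from Eqn. (14)
const <- c(CSG_METVO = 0.189, CSG_METJA = 0.593)
result <- solve_metastable(const, total = 1.083)
print(result$a_residue)  # residue activities of CSG_METVO and CSG_METJA
print(result$affinity)   # common value of A/2.303RT
```

The same function accepts any number of proteins, since each additional protein contributes one more constant term and one more unknown activity.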

The addition of any protein to the system increases by one the number of unknowns in Eqn. (14) but also provides another equation in the form of Eqns. (12) and (13). The procedure to set up and solve these equations has been encoded in a general form in CHNOSZ and was used to produce the diagrams shown in Fig. 5. The CHNOSZ program includes options to analyze the protein formation reactions using whole protein formulas or their residue equivalents, which were used to construct Figs. 5a and 5b, respectively. The logarithm of total activity of protein residues is 0.8211 in each of these figures, which corresponds to the sum of the activities of the residue equivalents of the ten model proteins whose starting activities are $10^{-3}$.

Another way of representing the chemical speciation in a protein system is on a degree of formation diagram. The degree of formation of the kth protein (α k ) can be calculated from

$$\alpha_k = a_{\mathrm{residue},k} \Big/ \sum_{j=1}^{\hat{j}} a_{\mathrm{residue},j},$$
(15)

where $\hat{j} = \hat{k}$ denotes the number of proteins in the system, $\sum_{1}^{\hat{j}} a_{\mathrm{residue},j}$ represents the total activity of protein residues, and $\sum_{1}^{\hat{k}} \alpha_k = 1$. The degrees of formation of the proteins corresponding to the logarithms of activities shown in Fig. 5b are depicted in the figure in Additional File 2. This degree of formation diagram aids in visualization of the computed relative abundances of the proteins on a non-logarithmic scale.
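As a minimal sketch of Eqn. (15), the degrees of formation can be computed from the residue-equivalent activities quoted earlier for the two-protein example (0.307 for CSG_METVO and 0.776 for CSG_METJA). The function name is hypothetical and is not part of CHNOSZ.

```python
def degrees_of_formation(residue_activities):
    """Return alpha_k = a_residue,k / sum_j a_residue,j for each protein (Eqn. 15)."""
    total = sum(residue_activities.values())
    return {name: a / total for name, a in residue_activities.items()}


# Residue-equivalent activities quoted in the text for the two-protein example.
activities = {"CSG_METVO": 0.307, "CSG_METJA": 0.776}

alpha = degrees_of_formation(activities)
for name, value in alpha.items():
    print(f"{name}: alpha = {value:.3f}")
```

For these inputs the degrees of formation are about 0.28 and 0.72, and they sum to one as Eqn. (15) requires.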

The residue-equivalent approach was used in this study only to produce the diagrams shown in Fig. 5b and Additional File 2. The predominance diagrams shown elsewhere were produced using whole protein formulas in the formation reactions. Extending the residue-equivalent method to these diagrams would subtly alter the positions of the predominance field boundaries, more so for reactions between proteins that differ significantly in length. The differences in the locations of the predominance field boundaries can be assessed in part by comparing the locations of the crossover between predominant proteins in Figs. 5a and 5b.

Temperature and pressure diagrams

The approach described above for constructing Fig. 3 in CHNOSZ was used to produce the diagrams shown in Figs. 6a and 6b. These diagrams portray the metastabilities among the predominant model proteins as a function of temperature or pressure and $\log f_{\mathrm{O_{2(g)}}}$. It is immediately apparent that $\log f_{\mathrm{O_{2(g)}}}$ and temperature have a close relationship along a reaction boundary. For all of the reactions represented by the predominance field boundaries in Fig. 6a, increasing temperature is accompanied by an increase in the value of $\log f_{\mathrm{O_{2(g)}}}$. Hence, for a small positive increment in temperature at constant $\log f_{\mathrm{O_{2(g)}}}$, the metastability of CSG_METJA increases relative to that of CSG_METVO. If values of $\log f_{\mathrm{O_{2(g)}}}$ instead correspond as a function of temperature to the water stability limit (shown by the dashed line in Fig. 6a), increasing temperature would actually favor the formation of CSG_METVO relative to CSG_METJA.

Figure 6

Temperature and pressure dependence of protein metastability. Predominance diagrams were constructed as a function of $\log f_{\mathrm{O_{2(g)}}}$ and (a) temperature at $P_{\mathrm{SAT}}$ or (b) pressure at 25°C. The dashed line in each diagram represents the lower stability limit of H2O.

We can recover nominal values of $\log f_{\mathrm{O_{2(g)}}}$ in the natural environments of M. voltae and M. jannaschii from geochemical data. The first of these organisms was originally isolated from the sediment of an estuary [42] and the other inhabits submarine hydrothermal vent environments [43]. Values of $a_{\mathrm{H_{2(aq)}}}$ (activity of dissolved hydrogen) were taken from [44] and converted to $\log f_{\mathrm{O_{2(g)}}}$ using the law of mass action for H2O ⇌ H2(aq) + 0.5O2(g) evaluated at 25°C and 1 bar, giving a nominal range of $\log f_{\mathrm{O_{2(g)}}}$ for estuarine sediment of -73 to -70. Values of $\log f_{\mathrm{O_{2(g)}}}$ prevailing in mixed hydrothermal vent fluid and seawater at 100°C are in the range of -65 to -60 [45]. The first of these ranges would plot near the CSG_HALJP–CSG_METVO boundary in Fig. 6a at 25°C and the second one near the boundary between CSG_METVO and CSG_METJA at 100°C. This observation might support the notion that proteins from hyperthermophilic organisms like M. jannaschii are thermodynamically favored relative to those from mesophilic organisms by increasing temperature accompanied by changes in the geochemical oxidation state.
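The conversion from hydrogen activity to oxygen fugacity follows from the law of mass action for H2O ⇌ H2(aq) + 0.5O2(g), that is, log K = log a_H2(aq) + 0.5 log fO2(g) - log a_H2O. The sketch below solves this for log fO2(g). The value of log K and the hydrogen activities are assumptions chosen for illustration: the log K is an approximate figure for 25°C and 1 bar estimated from standard Gibbs energies, and none of these numbers are taken from the paper or from reference [44].

```python
def log_fO2_from_log_aH2(log_a_H2, log_K, log_a_H2O=0.0):
    """Law of mass action for H2O <=> H2(aq) + 0.5 O2(g).

    log K = log a_H2 + 0.5 * log fO2 - log a_H2O, solved for log fO2.
    """
    return 2.0 * (log_K - log_a_H2 + log_a_H2O)


# ASSUMPTION: approximate equilibrium constant at 25 degC and 1 bar (illustrative only).
LOG_K_25C = -44.65

# ASSUMPTION: hypothetical dissolved hydrogen activities, not data from the source.
for log_a_H2 in (-8.0, -9.0, -10.0):
    log_fO2 = log_fO2_from_log_aH2(log_a_H2, LOG_K_25C)
    print(f"log a_H2 = {log_a_H2:6.1f}  ->  log fO2 = {log_fO2:6.1f}")
```

Because the stoichiometric coefficient on O2(g) is 0.5, each unit decrease in log a_H2(aq) raises log fO2(g) by two units at fixed temperature and pressure.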

It appears in Fig. 6b that increasing pressure also generally favors those proteins in lower oxidation states, but that the dependence of equilibrium $\log f_{\mathrm{O_{2(g)}}}$ values on pressure is small relative to their dependence on temperature.

Proteins as chemical activity buffers

The chemical activities of basis species buffered by reacting protein assemblages correspond to the locations of the (pseudo)invariant points on metastable equilibrium predominance diagrams. Equal activities of three proteins correspond to the triple point, which is a pseudoinvariant point, in the predominance diagram shown in Fig. 3b. The number of independent variables on the axes of this diagram is two; in an eight-dimensional predominance diagram (of temperature, pressure and six chemical activities) one could distinguish the true invariant points in this system where nine proteins coexist with equal metastable equilibrium activities.

Let us ask what the activities of CO2(aq), H2O, NH3(aq) and H2S(aq) are if they are buffered by a hypothetical metastable assemblage made up of the proteins from the METXX organisms listed in Table 1, at T = 100°C, P = 1000 bar, pH 7, $\log f_{\mathrm{O_{2(g)}}}$ = -58 and activities of proteins equal to $10^{-3}$. At this temperature, pressure and pH the calculated charge of the cell-surface protein from M. jannaschii is -64.933. Consider then the formation reaction for this protein, which except for charge is equal to Reaction 4:

$$\begin{aligned}
&2555\,\mathrm{CO_{2(aq)}} + 1042\,\mathrm{H_2O} + 640\,\mathrm{NH_{3(aq)}} + 14\,\mathrm{H_2S_{(aq)}} \\
&\quad \rightleftharpoons \mathrm{C_{2555}H_{3967.067}N_{640}O_{865}S_{14}^{-64.933}}\,(\mathrm{CSG\_METJA},\,aq) + 2643.5\,\mathrm{O_{2(g)}} + 64.933\,\mathrm{H^+}
\end{aligned}$$
(16)

A rearranged statement of Eqn. (M8) for this reaction can be written as

$$\begin{aligned}
&530\,\bar{A}_{16}/2.303RT - 2555\log a_{\mathrm{CO_{2(aq)}}} - 640\log a_{\mathrm{NH_{3(aq)}}} - 14\log a_{\mathrm{H_2S_{(aq)}}} - 1042\log a_{\mathrm{H_2O}} \\
&\quad = \log K_{16} - \log a_{\mathrm{C_{2555}H_{3967.067}N_{640}O_{865}S_{14}^{-64.933}\,(CSG\_METJA,\,aq)}} - 2643.5\log f_{\mathrm{O_{2(g)}}} - 64.933\log a_{\mathrm{H^+}}
\end{aligned}$$
(17)

where $\bar{A}_{16} \equiv A_{16}/530$, and the right-hand side works out to -4772.316 at the conditions stated above. At metastable equilibrium, the values of $\bar{A}_{16}$ and its counterparts for the other proteins in the hypothetical assemblage are all equal. It follows that we can combine Eqn. (17) with its counterparts for the four other proteins, dropping the subscripts on $\bar{A}$, to write

$$\begin{bmatrix}
530 & 2555 & 1042 & 640 & 14 \\
571 & 2812 & 1066 & 747 & 16 \\
553 & 2575 & 1070 & 645 & 11 \\
278 & 1362 & 519 & 355 & 4 \\
571 & 2815 & 1071 & 747 & 14
\end{bmatrix}
\times
\begin{bmatrix}
\bar{A}/2.303RT \\
-\log a_{\mathrm{CO_{2(aq)}}} \\
-\log a_{\mathrm{H_2O}} \\
-\log a_{\mathrm{NH_{3(aq)}}} \\
-\log a_{\mathrm{H_2S_{(aq)}}}
\end{bmatrix}
=
\begin{bmatrix}
-4772.316 \\
-5785.204 \\
-5021.307 \\
-2683.266 \\
-5825.691
\end{bmatrix},$$
(18)

where the rows on the right-hand side and in the stoichiometric matrix on the left-hand side correspond to the proteins from the METXX organisms listed in Table 1. Solving Eqn. (18) gives $\bar{A}/2.303RT = -0.739$, $\log a_{\mathrm{CO_{2(aq)}}} = -8.44$, $\log a_{\mathrm{H_2O}} = 7.92$, $\log a_{\mathrm{NH_{3(aq)}}} = 27.92$ and $\log a_{\mathrm{H_2S_{(aq)}}} = -13.09$. These values signify that the formation reactions of the proteins per residue are energetically unfavorable ($\bar{A}$ is negative) and that the hypothetical protein assemblage may not be metastably present (for example, the large positive values of $\log a_{\mathrm{H_2O}}$ and $\log a_{\mathrm{NH_{3(aq)}}}$ differ from probable natural ranges). Unambiguous identification of a natural metastable protein assemblage may require more comprehensive calculations coupled with insight gained from experiments and observations in the field.
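The linear system of Eqn. (18) can be solved directly. The sketch below is not the CHNOSZ implementation; it is a small Gauss-Jordan elimination in exact rational arithmetic, written only to illustrate the calculation. It uses an equivalent sign convention in which the minus signs are moved from the unknowns into the stoichiometric matrix, so the unknowns are $\bar{A}/2.303RT$ and the logarithms of activity themselves. The names `ROWS`, `RHS` and `solve_exact` are hypothetical.

```python
from fractions import Fraction

# Signed stoichiometric matrix equivalent to Eqn. (18); columns correspond to
# A/2.303RT, log a_CO2(aq), log a_H2O, log a_NH3(aq), log a_H2S(aq).
ROWS = [
    (530, -2555, -1042, -640, -14),
    (571, -2812, -1066, -747, -16),
    (553, -2575, -1070, -645, -11),
    (278, -1362, -519, -355, -4),
    (571, -2815, -1071, -747, -14),
]
# Right-hand side of Eqn. (18), kept as strings so the decimals convert exactly.
RHS = ["-4772.316", "-5785.204", "-5021.307", "-2683.266", "-5825.691"]


def solve_exact(rows, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with Fractions."""
    n = len(rows)
    # Augmented matrix [A | b] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


solution = solve_exact(ROWS, RHS)
names = ["A/2.303RT", "log a_CO2", "log a_H2O", "log a_NH3", "log a_H2S"]
for name, value in zip(names, solution):
    print(f"{name:>10s} = {float(value):.3f}")
```

The printed values can be compared with those quoted in the text. Because the rows of the matrix are nearly proportional to one another (the proteins have similar compositions), the solution is sensitive to rounding of the right-hand side, which is one reason exact arithmetic is used in this sketch.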

The pseudoinvariant point representing the buffer assemblage described above is shown in Figs. 7a and 7b. The same pseudoinvariant point is present in both figures, but different variables are projected onto each diagram. The temperature-pressure relationships appearing in Fig. 7a suggest that metastability of CSG_METJA increases relative to that of CSG_METVO with increasing temperature and/or pressure, but that the sensitivity to temperature is much greater than that to pressure. These relationships are also apparent in Figs. 6a and 6b. In the projection of Fig. 7a not all of the proteins at the pseudoinvariant point are visible, but in Fig. 7b the convergence of the five predominance fields is apparent. Note the similarity in Figs. 7b and 3a of the reaction boundary between CSG_METVO and CSG_METJA, as well as the nearly horizontal boundary between CSG_METSC and CSG_METFE, which would be expected from the closeness of their ionization states as a function of pH (see Fig. 4a for the ionization states at 25°C).

Figure 7

Metastabilities of proteins around a pseudoinvariant point. Activities of CO2(aq), NH3(aq), H2S(aq) and H2O were calculated using a buffer consisting of the proteins from the METXX organisms listed in Table 1. The variables set in the buffer calculation were activities of proteins equal to $10^{-3}$, T = 100°C, P = 1000 bar, $\log f_{\mathrm{O_{2(g)}}}$ = -58 and pH 7. The diagrams show the variation of protein metastability as a function of (a) temperature and pressure and (b) $\log f_{\mathrm{O_{2(g)}}}$ and pH. The dashed line in (b) represents the lower stability limit of H2O at this temperature and pressure.

Concluding remarks

A computer program called CHNOSZ was introduced in this paper for producing metastable equilibrium chemical activity diagrams for proteins. The methods used here were borrowed from geochemistry, and the program with the accompanying thermodynamic database is suitable for performing thermodynamic calculations in inorganic and mineral systems as well as organic and biochemical systems, or combinations thereof.

To investigate the utility of the program for a geochemical description of protein reactions, metastability diagrams were produced for surface-layer proteins from a number of bacteria and archaea. The diagrams show either the metastably predominant proteins as a function of two intensive variables or the metastable equilibrium chemical activities of proteins as a function of one variable. The primary variables of interest in this study were $\log f_{\mathrm{O_{2(g)}}}$, pH, temperature and pressure. It was found that the predicted metastable equilibrium state of the system responded dramatically to changes in these variables. Representing the proteins in reactions by their residue equivalents instead of whole protein formulas gave rise to predicted equilibrium states in which many proteins coexist metastably with comparable chemical activities.

In the preceding sections we have considered the theoretical metastable equilibrium relationships among only a few model proteins. Because the software is now available to do so, a plethora of predictions concerning the energetically favorable outcomes of any number of overall protein mutation reactions is now within reach. Consideration of the results presented above, and of the wide range of model systems that could potentially be investigated in a similar manner, leads to the conclusion that the metastable equilibrium distribution of proteins in many cases does not mirror geobiochemical reality. Nevertheless, the ability to quantify the characteristics of metastable equilibrium reference states as a function of geochemical variables may be of utility in identifying specific pathways in evolution where the resulting proteins are relatively energetically favored. These particular outcomes may reflect a tendency for natural selection to increase the fit between phenotypes and their environments [46].

A thermodynamic and geochemical perspective on the relative metastabilities of proteins permits a quantitative integration of observations on the geosphere and biosphere. This study has only scratched the surface of the myriad possible environments and organisms, the properties and chemical compositions of which are becoming better constrained through experiment and observation. As these data grow in abundance, they will provide other opportunities where thermodynamic description of the chemical speciation of proteins can be tested and calibrated.

Methods

The thermodynamic conventions and relations used to compute the relative metastabilities of proteins in the present study are summarized below. The computational assessment depends first on the adoption of standard states for the species appearing in chemical reactions.

Standard state conventions

The standard state convention adopted for aqueous species other than H2O corresponds to unit activity of a hypothetical one molal solution referenced to infinite dilution at any temperature and pressure [30, 47]. The conventional standard molal thermodynamic properties of both the aqueous electron and proton are taken to be zero at all temperatures and pressures [48]. For gases, the standard state convention is unit fugacity of the hypothetical pure ideal gas at 1 bar and any temperature. The standard state convention adopted for solids and liquids, including H2O, corresponds to unit activity of the pure substance at any temperature and pressure.

Protein formation and metastability reactions

The compositions of species of interest, such as proteins, are represented by linear combination of the compositions of basis species in a system (for an application in geochemical systems, see Ref. [49]). The number of basis species is the minimum required to write formation reactions for all possible species of interest. There are no thermodynamic restrictions on the actual identities of the basis species, and the basis species do not necessarily correspond to thermodynamic components in the system of interest [50]. Hence, the choice of basis species may be constrained by the chemical activities that can be measured in a system or that are thought to behave as perfectly mobile components [22]. The basis species used in the present study are CO2(aq), H2O, NH3(aq), H2S(aq), H+ and O2(g).

Let a generic chemical formula for the jth ionized protein be written as $\mathrm{C}_{C_j}\mathrm{H}_{H_j}\mathrm{N}_{N_j}\mathrm{O}_{O_j}\mathrm{S}_{S_j}^{Z_j}$, where $C_j$, $H_j$, $N_j$, $O_j$, $S_j$ and $Z_j$ denote the number of moles of the corresponding element (or charge) in one mole of protein. These coefficients can be non-integer and positive or negative (e.g., $Z_j$ is usually negative at alkaline pH). The formation reaction from basis species of one mole of the jth protein can be written as

$$\begin{aligned}
&C_j\,\mathrm{CO_{2(aq)}} + N_j\,\mathrm{NH_{3(aq)}} + S_j\,\mathrm{H_2S_{(aq)}} + Z_j\,\mathrm{H^+} + \left(\frac{H_j - 3N_j - 2S_j - Z_j}{2}\right)\mathrm{H_2O} \\
&\quad + \left(\frac{O_j - 2C_j - \left(\frac{H_j - 3N_j - 2S_j - Z_j}{2}\right)}{2}\right)\mathrm{O_{2(g)}} \rightleftharpoons \mathrm{C}_{C_j}\mathrm{H}_{H_j}\mathrm{N}_{N_j}\mathrm{O}_{O_j}\mathrm{S}_{S_j}^{Z_j}.
\end{aligned}$$
(M1)

The reaction coefficients on the basis species in Reaction M1 are completely determined by the chemical formulas of the protein and of the basis species. Depending on the sign of the coefficients in front of the basis species, they would appear in specific statements of Reaction M1 as reactants or products.
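As a minimal sketch of how the coefficients in Reaction M1 follow from a protein formula, the function below evaluates them for the ionized CSG_METJA formula written in Reaction 16. The function name and the dictionary layout are hypothetical and are not part of CHNOSZ. Positive coefficients are read as reactants and negative coefficients as products, as described above.

```python
def formation_coefficients(C, H, N, O, S, Z):
    """Coefficients of the basis species in Reaction M1 for one mole of protein.

    Positive values place the species on the reactant side; negative values
    place it on the product side.
    """
    n_H2O = (H - 3 * N - 2 * S - Z) / 2
    n_O2 = (O - 2 * C - n_H2O) / 2
    return {"CO2": C, "NH3": N, "H2S": S, "H+": Z, "H2O": n_H2O, "O2": n_O2}


# Formula of ionized CSG_METJA as written in Reaction 16.
coeffs = formation_coefficients(C=2555, H=3967.067, N=640, O=865, S=14, Z=-64.933)

for species, n in coeffs.items():
    side = "reactant" if n > 0 else "product"
    print(f"{species:>4s}: {abs(n):10.3f} ({side})")
```

For this formula the sketch gives 1042 H2O as a reactant, and 2643.5 O2(g) and 64.933 H+ as products, in agreement with Reaction 16.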

A generic metastability reaction between two proteins (j = 1 and j = 2) can be written as

$$\begin{aligned}
&\frac{1}{n_1}\mathrm{C}_{C_1}\mathrm{H}_{H_1}\mathrm{N}_{N_1}\mathrm{O}_{O_1}\mathrm{S}_{S_1}^{Z_1}
+ \left(\frac{C_2}{n_2} - \frac{C_1}{n_1}\right)\mathrm{CO_{2(aq)}}
+ \left(\frac{N_2}{n_2} - \frac{N_1}{n_1}\right)\mathrm{NH_{3(aq)}}
+ \left(\frac{S_2}{n_2} - \frac{S_1}{n_1}\right)\mathrm{H_2S_{(aq)}} \\
&\quad + \left(\frac{Z_2}{n_2} - \frac{Z_1}{n_1}\right)\mathrm{H^+}
+ \left(\frac{\left(\frac{H_2}{n_2} - \frac{H_1}{n_1}\right) - 3\left(\frac{N_2}{n_2} - \frac{N_1}{n_1}\right) - 2\left(\frac{S_2}{n_2} - \frac{S_1}{n_1}\right) - \left(\frac{Z_2}{n_2} - \frac{Z_1}{n_1}\right)}{2}\right)\mathrm{H_2O} \\
&\quad + \left(\frac{\left(\frac{O_2}{n_2} - \frac{O_1}{n_1}\right) - 2\left(\frac{C_2}{n_2} - \frac{C_1}{n_1}\right) - \left(\frac{\left(\frac{H_2}{n_2} - \frac{H_1}{n_1}\right) - 3\left(\frac{N_2}{n_2} - \frac{N_1}{n_1}\right) - 2\left(\frac{S_2}{n_2} - \frac{S_1}{n_1}\right) - \left(\frac{Z_2}{n_2} - \frac{Z_1}{n_1}\right)}{2}\right)}{2}\right)\mathrm{O_{2(g)}} \\
&\quad \rightleftharpoons \frac{1}{n_2}\mathrm{C}_{C_2}\mathrm{H}_{H_2}\mathrm{N}_{N_2}\mathrm{O}_{O_2}\mathrm{S}_{S_2}^{Z_2},
\end{aligned}$$
Gmaeqaaaqaaiabd6gaUnaaBaaabaGaeGOmaidabeaaaaGaeyOeI0YaaSaaaeaacqWGtbWudaWgaaqaaiabigdaXaqabaaabaGaemOBa42aaSbaaeaacqaIXaqmaeqaaaaaaiaawIcacaGLPaaacqGHsisldaqadaqaamaalaaabaGaemOwaO1aaSbaaeaacqaIYaGmaeqaaaqaaiabd6gaUnaaBaaabaGaeGOmaidabeaaaaGaeyOeI0YaaSaaaeaacqWGAbGwdaWgaaqaaiabigdaXaqabaaabaGaemOBa42aaSbaaeaacqaIXaqmaeqaaaaaaiaawIcacaGLPaaaaeaacqaIYaGmaaaacaGLOaGaayzkaaaabaGaeGOmaidaaaGccaGLOaGaayzkaaGaee4ta80aaSbaaSqaaiabikdaYiabcIcaOiabdEgaNjabcMcaPaqabaaakeaacqWImhYGjuaGdaWcaaqaaiabigdaXaqaaiabd6gaUnaaBaaabaGaeGOmaidabeaaaaGccqqGdbWqdaWgaaWcbaGaem4qam0aaSbaaWqaaiabikdaYaqabaaaleqaaOGaeeisaG0aaSbaaSqaaiabdIeainaaBaaameaacqaIYaGmaeqaaaWcbeaakiabb6eaonaaBaaaleaacqWGobGtdaWgaaadbaGaeGOmaidabeaaaSqabaGccqqGpbWtdaWgaaWcbaGaem4ta80aaSbaaWqaaiabikdaYaqabaaaleqaaOGaee4uam1aa0baaSqaaiabdofatnaaBaaameaacqaIYaGmaeqaaaWcbaGaemOwaO1aaSbaaWqaaiabikdaYaqabaaaaOGaeiilaWcaaaaaaaa@46CD@
(M2)

which corresponds to the difference between specific statements of Reaction M1 for $j = 2$ and $j = 1$, divided by $n_2$ or $n_1$, respectively. Here, $1/n_1$ and $1/n_2$ denote the conservation coefficients for the corresponding proteins. Reaction M2 is balanced with respect to mass and charge for any values of $n_1$ and $n_2$. If $n_1 = n_2 = 1$, Reaction M2 denotes the mass balance constraints for the formation of one mole of product protein at the expense of one mole of reactant protein. Other values may be chosen for $n_1$ and $n_2$, depending on what is specified about the conservation constraints in the system. For example, if $n_1 = C_1$ and $n_2 = C_2$, the protein metastability reaction conserves carbon [18] (i.e., the coefficient on CO2(aq) in Reaction M2 becomes zero). The protein metastability reactions considered in the present study are written for $n_j$ equal to the length of the $j$th protein.
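The bookkeeping in Reaction M2 can be checked numerically. The following is a minimal Python sketch, not part of the CHNOSZ package: the function names and the two protein formulas are hypothetical and chosen only for illustration. It computes the basis-species coefficients on the left-hand side of Reaction M2 with exact fractions and verifies the balance of each element and of charge.

```python
from fractions import Fraction

# Elemental makeup and charge (Z) of the basis species in Reaction M2.
BASIS = {
    "CO2": {"C": 1, "O": 2},
    "NH3": {"N": 1, "H": 3},
    "H2S": {"H": 2, "S": 1},
    "H+": {"H": 1, "Z": 1},
    "H2O": {"H": 2, "O": 1},
    "O2": {"O": 2},
}


def metastability_coefficients(p1, p2, n1, n2):
    """Coefficients of the basis species on the left-hand side of Reaction M2.

    p1, p2: dicts with integer entries C, H, N, O, S, Z for the reactant
    and product protein; n1, n2: the conservation denominators.
    """
    d = {k: Fraction(p2[k], n2) - Fraction(p1[k], n1) for k in "CHNOSZ"}
    h2o = (d["H"] - 3 * d["N"] - 2 * d["S"] - d["Z"]) / 2
    o2 = (d["O"] - 2 * d["C"] - h2o) / 2
    return {"CO2": d["C"], "NH3": d["N"], "H2S": d["S"],
            "H+": d["Z"], "H2O": h2o, "O2": o2}


def is_balanced(p1, p2, n1, n2):
    """True if every element and the charge balance across Reaction M2."""
    coeffs = metastability_coefficients(p1, p2, n1, n2)
    for k in "CHNOSZ":
        left = Fraction(p1[k], n1) + sum(
            c * BASIS[s].get(k, 0) for s, c in coeffs.items())
        if left != Fraction(p2[k], n2):
            return False
    return True


# Hypothetical formulas (not real proteins), used only to exercise the code.
protein_1 = {"C": 50, "H": 80, "N": 14, "O": 16, "S": 1, "Z": 0}
protein_2 = {"C": 62, "H": 95, "N": 17, "O": 20, "S": 0, "Z": -1}

# Conserving residues (assumed lengths 10 and 12) or conserving carbon.
print(is_balanced(protein_1, protein_2, 10, 12))
print(metastability_coefficients(protein_1, protein_2, 50, 62)["CO2"])
```

Exact rational arithmetic is used here so that the balance test is not affected by floating-point rounding; the second call illustrates that the CO2(aq) coefficient vanishes when the denominators are set to the carbon numbers.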

Relation of reaction energetics to activities of basis species

The standard Gibbs energy of the $r$th formation or metastability reaction ($\Delta G_r^{\circ}$) can be expressed as

$$\Delta G_r^{\circ} = \sum_i \hat{n}_{i,r}\,\Delta G_i^{\circ},$$
(M3)

where $\hat{n}_{i,r}$ and $\Delta G_i^{\circ}$ denote, respectively, the stoichiometric reaction coefficient and standard molal Gibbs energy of formation from the elements of the $i$th basis species or protein in the reaction. For products in a reaction, $\hat{n}_{i,r} > 0$. The corresponding equilibrium constant of the reaction ($K_r$) is given by

$$\log K_r = -\Delta G_r^{\circ} / 2.303RT.$$
(M4)
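Eqns. (M3) and (M4) amount to a weighted sum followed by a change of units. A minimal Python sketch is given below; the function names, the species labels "A" and "B" and their formation energies are placeholders assumed for illustration, not values from the thermodynamic database. Energies are in kcal mol$^{-1}$, and $\ln 10$ is used in place of the rounded factor 2.303.

```python
import math

R = 1.9872e-3  # gas constant, kcal mol^-1 K^-1


def standard_gibbs_energy_of_reaction(coefficients, formation_energies):
    """Eqn. (M3): sum of coefficient times standard Gibbs energy of formation.

    coefficients: species -> reaction coefficient (negative for reactants).
    formation_energies: species -> standard molal Gibbs energy of formation.
    """
    return sum(n * formation_energies[s] for s, n in coefficients.items())


def log_equilibrium_constant(delta_g_r, temperature):
    """Eqn. (M4): log K = -(Delta G) / (ln(10) R T), temperature in kelvin."""
    return -delta_g_r / (math.log(10) * R * temperature)


# Placeholder reaction A -> B with made-up formation energies (kcal/mol).
coefficients = {"A": -1, "B": 1}
formation_energies = {"A": -10.0, "B": -12.0}

delta_g_r = standard_gibbs_energy_of_reaction(coefficients, formation_energies)
print(delta_g_r, log_equilibrium_constant(delta_g_r, 298.15))
```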

The equilibrium constant, like $\Delta G_r^{\circ}$, is a standard-state property that is independent of composition and depends on temperature and pressure. The non-standard-state counterpart to $K_r$ is the activity product of the reaction ($Q_r$), which can be computed using

$$Q_r \equiv \prod_i a_i^{\hat{n}_{i,r}},$$
(M5)

where $a_i$ represents the chemical activity of the $i$th species in the reaction. For gaseous species, $a_i$ in Eqn. (M5) is replaced by the fugacity of the species ($f_i$). As a first approximation, activity and fugacity coefficients are taken to be unity in this study.

The activity or fugacity of the $i$th aqueous or gaseous component is related to its chemical potential ($\mu_i$) by [6]

$$\mu_i = \mu_i^{\circ} + RT\ln\frac{f_i}{f_i^{\circ}} = \mu_i^{\circ} + RT\ln a_i,$$
(M6)

where $\mu_i^{\circ}$ denotes the standard chemical potential of the $i$th species and $f_i^{\circ}$ stands for the fugacity of the species in its standard state, which is unity for gases.

The chemical affinities of reactions ($A_r$) can be computed from [51]

$$A_r = 2.303RT\log(K_r/Q_r),$$

(M7)

which can be combined with Eqn. (M5) to write for Reaction M2

$$
\begin{aligned}
\log K_r = {} & A_r/2.303RT
+ \log\frac{a_{\mathrm{C}_{C_2}\mathrm{H}_{H_2}\mathrm{N}_{N_2}\mathrm{O}_{O_2}\mathrm{S}_{S_2}^{Z_2}}^{1/n_2}}{a_{\mathrm{C}_{C_1}\mathrm{H}_{H_1}\mathrm{N}_{N_1}\mathrm{O}_{O_1}\mathrm{S}_{S_1}^{Z_1}}^{1/n_1}} \\
& + \log\left(
a_{\mathrm{CO_{2(aq)}}}^{\hat{n}_{\mathrm{CO_2}}}\,
a_{\mathrm{H_2O}}^{\hat{n}_{\mathrm{H_2O}}}\,
a_{\mathrm{NH_{3(aq)}}}^{\hat{n}_{\mathrm{NH_3}}}\,
f_{\mathrm{O_{2(g)}}}^{\hat{n}_{\mathrm{O_2}}}\,
a_{\mathrm{H_2S_{(aq)}}}^{\hat{n}_{\mathrm{H_2S}}}\,
a_{\mathrm{H^+}}^{\hat{n}_{\mathrm{H^+}}}
\right).
\end{aligned}
$$
(M8)

In an equilibrium state, $A_r = 0$ for metastability reactions, and Eqn. (M8) reduces to the logarithmic analog of the law of mass action equation for Reaction M2.
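The relation between the activity product and the chemical affinity can be sketched in a few lines of Python. This is an illustrative helper, not the CHNOSZ implementation; the function names and the species labels and activities in the example are assumptions. Activities are supplied as base-10 logarithms, so that Eqn. (M5) becomes a weighted sum.

```python
import math

R = 1.9872e-3  # gas constant, kcal mol^-1 K^-1


def log_activity_product(coefficients, log_activities):
    """Logarithmic form of Eqn. (M5): log Q = sum of n_i * log a_i.

    For gases, the entry in log_activities is the log fugacity.
    """
    return sum(n * log_activities[s] for s, n in coefficients.items())


def chemical_affinity(log_k, log_q, temperature):
    """A = ln(10) R T (log K - log Q), in kcal/mol; temperature in kelvin."""
    return math.log(10) * R * temperature * (log_k - log_q)


# Placeholder reaction A -> B with equal (made-up) activities of 10^-3.
log_q = log_activity_product({"A": -1, "B": 1}, {"A": -3.0, "B": -3.0})

# A positive affinity means the reaction as written is favorable;
# zero affinity corresponds to (metastable) equilibrium.
print(chemical_affinity(1.5, log_q, 298.15))
print(chemical_affinity(log_q, log_q, 298.15))
```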

Reference activities of basis species and proteins

The reference temperature and pressure correspond to 25°C and 1 bar, respectively. The reference chemical activities of basis species used in this study are given by $\log a_{\mathrm{H_2O}} = 0$, $\log a_{\mathrm{CO_{2(aq)}}} = -3$, $\log a_{\mathrm{NH_{3(aq)}}} = -4$, $\log a_{\mathrm{H_2S_{(aq)}}} = -7$ and $\log a_{\mathrm{H^+}} = -7$ (pH 7). The reference value for $\log a_{\mathrm{H_2O}}$ corresponds to pure water, and the others are nominal values that generally fall within the compositional ranges of hydrothermal fluids and seawater [52]. The reference chemical activities of proteins are taken to be $10^{-3}$, which is a nominal value that is similar to experimental concentrations used in protein unfolding studies [53].

Equations of state

The standard molal thermodynamic properties of aqueous species as a function of temperature and pressure can be evaluated using the revised Helgeson-Kirkham-Flowers (HKF) equations of state [30–33, 54, 55]. The temperature dependence of the standard molal thermodynamic properties of crystalline, gaseous and liquid species other than H2O is calculated using a standard equation for heat capacity [34, 35, 56]. For the basis species other than H+ and e-, values of the standard molal thermodynamic properties and of the equations of state parameters were taken from Refs. [55, 57] (CO2(aq), NH3(aq) and H2S(aq)) and [58, 59] (O2(g)). The equations of state adopted for liquid H2O in the present study are those used in the SUPCRT92 software package [24].

Group additivity algorithms for ionized proteins

The standard molal properties and revised HKF equations of state parameters of ionized proteins are calculated in the present study using group additivity algorithms and data taken from Ref. [18] and outlined briefly below. The standard molal Gibbs energy of the $j$th unfolded protein with net charge denoted by $Z_j$ ($\Delta G^{\circ}_{UP_j^{Z_j}}$) can be written as

$$\Delta G^{\circ}_{UP_j^{Z_j}} = \Delta G^{\circ}_{UP_j^{0}} + \Delta G^{\circ}_{\mathrm{ion},j},$$
(M9)

where $\Delta G^{\circ}_{UP_j^{0}}$ stands for the standard molal Gibbs energy of the completely neutral (nonionized) unfolded protein and $\Delta G^{\circ}_{\mathrm{ion},j}$ stands for the contribution of ionization of sidechain and terminal groups to the standard molal Gibbs energy of the ionized protein. The latter term can be calculated by first writing

$$\Delta G^{\circ}_{\mathrm{ion},j} = \sum_i n_{i,j}\,\alpha_i\,\Delta G^{\circ}_{\mathrm{ion},i},$$
(M10)

where, for the $i$th type of ionizable sidechain or backbone group, $n_{i,j}$ represents the number of moles of the group in one mole of protein, $\alpha_i$ denotes the degree of ionization of the group ($0 < \alpha_i < 1$), and $\Delta G^{\circ}_{\mathrm{ion},i}$ corresponds to the standard molal Gibbs energy of ionization of the group. As a simple approximation, values of $\alpha_i$ and $\Delta G^{\circ}_{\mathrm{ion},i}$ in Eqn. (M10) were taken to be equal for all occurrences of a given ionizable group. It may be possible to refine this approach in the future by taking account of interactions of charged residues on the protein surfaces (Ref. [60] and others since).

Although $\Delta G^{\circ}_{UP_j^{Z_j}}$ and $\Delta G^{\circ}_{\mathrm{ion},j}$ in Eqns. (M9) and (M10) are functions only of temperature and pressure for any protein in a defined charge state, $\alpha_i$ in Eqn. (M10) is a function of temperature, pressure, and solution pH [18]. Hence, $\Delta G^{\circ}_{\mathrm{ion},j}$ and $\Delta G^{\circ}_{UP_j^{Z_j}}$ can be effectively computed for a given protein in different ionization states as a function of pH as well as of temperature and pressure. The net charge of the $j$th protein ($Z_j$) as a function of temperature, pressure and pH can be calculated using

$$Z_j = \sum_i n_{i,j}\,\alpha_i\,Z_i,$$
(M11)

where $Z_i$ denotes the charge (+1 or -1) of the $i$th ionized group and $\alpha_i$ (also in Eqn. M10) is given by

$$\alpha_i = \frac{1}{1 + 10^{Z_i(\mathrm{pH} - \mathrm{p}K_i)}},$$
(M12)

where p$K_i$ represents the negative logarithm of the equilibrium constant for the deprotonation reaction of the $i$th ionizable group.
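Eqns. (M10) to (M12) can be combined in a short Python sketch. The function names, the two group types and their p$K$ and ionization energies below are placeholders assumed for illustration; they are not the values of Ref. [18], and the temperature and pressure dependence of p$K_i$ is not modeled here.

```python
def degree_of_ionization(charge, ph, pk):
    """Eqn. (M12): alpha = 1 / (1 + 10^(Z (pH - pK)))."""
    return 1.0 / (1.0 + 10.0 ** (charge * (ph - pk)))


def net_charge(groups, ph):
    """Eqn. (M11): sum over group types of n * alpha * Z."""
    return sum(
        g["n"] * degree_of_ionization(g["Z"], ph, g["pK"]) * g["Z"]
        for g in groups)


def ionization_gibbs_energy(groups, ph):
    """Eqn. (M10): sum over group types of n * alpha * dG_ion (kcal/mol)."""
    return sum(
        g["n"] * degree_of_ionization(g["Z"], ph, g["pK"]) * g["dG_ion"]
        for g in groups)


# Made-up group inventory: three acidic and three basic groups per protein.
groups = [
    {"name": "acidic", "n": 3, "Z": -1, "pK": 4.0, "dG_ion": 5.0},
    {"name": "basic", "n": 3, "Z": +1, "pK": 10.0, "dG_ion": -13.0},
]

for ph in (2.0, 7.0, 12.0):
    print(ph, net_charge(groups, ph), ionization_gibbs_energy(groups, ph))
```

With these placeholder groups the net charge is positive at low pH, where only the basic groups are ionized, and negative at high pH, where only the acidic groups are ionized.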

For a protein composed of a single polypeptide chain, the values of $\Delta G^{\circ}_{UP_j^{0}}$ in Eqn. (M9) can be calculated from the group additivity algorithm represented by [17, 18]

$$\Delta G^{\circ}_{UP_j^{0}} = \Delta G^{\circ}_{[AABB]} + (n_j - 1)\,\Delta G^{\circ}_{[UPBB]} + \sum_{i=1}^{20} n_{[SC]_i}\,\Delta G^{\circ}_{[SC]_i},$$
(M13)

where $\Delta G^{\circ}_{[AABB]}$, $\Delta G^{\circ}_{[UPBB]}$ and $\Delta G^{\circ}_{[SC]_i}$ denote the standard molal Gibbs energies of the amino acid backbone group, unfolded protein backbone group, and the $i$th type of amino acid sidechain group, respectively, $n_{[SC]_i}$ stands for the number of moles of the $i$th type of amino acid sidechain group in one mole of the protein, and

$$n_j = \sum_{i=1}^{20} n_{[SC]_i}$$
(M14)

represents the total number of amino acid residues, or length of the protein. Values of $n_{[SC]_i}$ for the model proteins considered in the present study were retrieved from the Swiss-Prot/UniProt protein sequence database [37] (see Table 1).
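The group additivity sum in Eqns. (M13) and (M14) can be written as a short Python sketch. The function names and all group energies below are made-up placeholders used to show the arithmetic; they are not the group contributions of Refs. [17, 18].

```python
def protein_length(sidechain_counts):
    """Eqn. (M14): total number of residues from the sidechain counts."""
    return sum(sidechain_counts.values())


def neutral_unfolded_gibbs_energy(sidechain_counts, dg_aabb, dg_upbb,
                                  dg_sidechain):
    """Eqn. (M13): one amino acid backbone group, (n_j - 1) unfolded protein
    backbone groups, and the sidechain groups weighted by their counts."""
    n_j = protein_length(sidechain_counts)
    sidechains = sum(
        n * dg_sidechain[aa] for aa, n in sidechain_counts.items())
    return dg_aabb + (n_j - 1) * dg_upbb + sidechains


# Hypothetical tripeptide-sized example with placeholder energies (kcal/mol).
counts = {"Gly": 2, "Ala": 1}
dg_groups = {"Gly": -1.0, "Ala": 2.0}

print(protein_length(counts))
print(neutral_unfolded_gibbs_energy(counts, -80.0, -20.0, dg_groups))
```

The result of this function corresponds to the neutral-protein term in Eqn. (M9); the ionization contribution of Eqn. (M10) would be added to it to obtain the energy of the ionized protein.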

The thermodynamic properties of unfolded aqueous proteins calculated using the above equations are taken in a first approximation to be representative of the proteins of interest, which may be folded and/or present in crystalline form in cells. Two observations lend support to the applicability of the unfolded protein reference state for the present calculations: 1) The standard molal Gibbs energies of protein folding would tend to cancel each other in metastability reactions, in which proteins appear on both sides of the reaction. 2) The Gibbs energy of unfolding for a small to average-sized protein is about two or three orders of magnitude smaller than the standard molal Gibbs energy for the unfolded protein itself. For example, the Gibbs energy of unfolding of chicken lysozyme is ~14.5 kcal mol$^{-1}$ at 25°C [61], but the standard molal Gibbs energy of this protein at 25°C and 1 bar is approximately $-4.2 \times 10^{3}$ kcal mol$^{-1}$ (see Figs. 2a and 2b). The size of the unfolding property in this case is much smaller than the ca. ± 5% uncertainty ascribed to the group additivity algorithm [18]. It should be noted, however, that the compositional consequences of protein folding include changes in ionization state and preferential surface exposure of charged residues [1], which would be manifested by changes in the reaction coefficients of basis species that might affect the outcome of metastability calculations to a greater extent than the differences in Gibbs free energy alone.
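The comparison of magnitudes for chicken lysozyme can be made explicit with the two values quoted in the text and the ca. 5% uncertainty of the group additivity algorithm; the rounding below is approximate:

$$
\frac{14.5\ \mathrm{kcal\ mol^{-1}}}{4.2 \times 10^{3}\ \mathrm{kcal\ mol^{-1}}} \approx 3.5 \times 10^{-3},
\qquad
0.05 \times 4.2 \times 10^{3}\ \mathrm{kcal\ mol^{-1}} = 210\ \mathrm{kcal\ mol^{-1}}.
$$

On these figures the Gibbs energy of unfolding is between two and three orders of magnitude smaller than the Gibbs energy of the unfolded protein, and it is roughly 7% of the stated uncertainty.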

References

  1. Fukuchi S, Nishikawa K: Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J Mol Biol. 2001, 309 (4): 835-843. 10.1006/jmbi.2001.4718.

  2. Jaenicke R: Protein stability and molecular adaptation to extreme conditions. Eur J Biochem. 1991, 202 (3): 715-728. 10.1111/j.1432-1033.1991.tb16426.x.

  3. Sanchez-Ruiz JM, Makhatadze GI: To charge or not to charge? Trends Biotechnol. 2001, 19 (4): 132-135. 10.1016/S0167-7799(00)01548-1.

  4. Mazel D, Marlière P: Adaptive eradication of methionine and cysteine from cyanobacterial light-harvesting proteins. Nature. 1989, 341 (6239): 245-248. 10.1038/341245a0.

  5. Zen E-A: Components, phases, and criteria of chemical equilibrium in rocks. Am J Sci. 1963, 261 (10): 929-942.

  6. Helgeson HC: Evaluation of irreversible reactions in geochemical processes involving minerals and aqueous solutions. I. Thermodynamic relations. Geochim Cosmochim Acta. 1968, 32 (8): 853-877. 10.1016/0016-7037(68)90100-2.

  7. Shock EL: Organic acid metastability in sedimentary basins. Geology. 1988, 16 (10): 886-890. 10.1130/0091-7613(1988)016<0886:OAMISB>2.3.CO;2.

  8. Shock EL: Geochemical constraints on the origin of organic compounds in hydrothermal systems. Orig Life Evol Biosph. 1990, 20 (3–4): 331-367. 10.1007/BF01808115.

  9. Shock EL, Schulte MD: Amino acid synthesis in carbonaceous meteorites by aqueous alteration of polycyclic aromatic hydrocarbons. Nature. 1990, 343 (6260): 728-731. 10.1038/343728a0.

  10. Helgeson HC, Knox AM, Owens CE, Shock EL: Petroleum, oil-field waters, and authigenic mineral assemblages: Are they in metastable equilibrium in hydrocarbon reservoirs? Geochim Cosmochim Acta. 1993, 57 (14): 3295-3339. 10.1016/0016-7037(93)90541-4.

  11. Helgeson HC, Amend JP: Relative stabilities of biomolecules at high temperatures and pressures. Thermochim Acta. 1994, 245: 89-119. 10.1016/0040-6031(94)85072-0.

  12. Seewald JS: Evidence for metastable equilibrium between hydrocarbons under hydrothermal conditions. Nature. 1994, 370 (6487): 285-287. 10.1038/370285a0.

  13. Andersson E, Holm NG: The stability of some selected amino acids under attempted redox constrained hydrothermal conditions. Orig Life Evol Biosph. 2000, 30: 9-23. 10.1023/A:1006668322298.

  14. McCollom TM, Seewald JS: Abiotic synthesis of organic compounds in deep-sea hydrothermal environments. Chem Rev. 2007, 107 (2): 382-401. 10.1021/cr0503660.

  15. Sleytr UB, Messner P, Pum D, Sára M, Eds: Crystalline Bacterial Cell Surface Proteins. 1996, Austin, Texas: R. G. Landes Company

  16. Sára M, Sleytr UB: S-layer proteins. J Bacteriol. 2000, 182 (4): 859-868. 10.1128/JB.182.4.859-868.2000.

  17. Amend JP, Helgeson HC: Calculation of the standard molal thermodynamic properties of aqueous biomolecules at elevated temperatures and pressures. II. Unfolded proteins. Biophys Chem. 2000, 84 (2): 105-136. 10.1016/S0301-4622(00)00116-2.

  18. Dick JM, LaRowe DE, Helgeson HC: Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences. 2006, 3 (3): 311-336.

  19. Thompson JB: The thermodynamic basis for the mineral facies concept. Am J Sci. 1955, 253 (2): 65-103.

  20. Korzhinskii DS: Physicochemical Basis of the Analysis of the Paragenesis of Minerals. 1959, New York: Consultants Bureau, Inc

  21. Korzhinskii DS: Thermodynamic potentials of open systems whose acidity and redox potential are determined by external conditions. Dokl Acad Sci USSR. 1963, 152 (2): 175-177. [A.G.I. Translation].

  22. Korzhinskii DS: The theory of systems with perfectly mobile components and processes of mineral formation. Am J Sci. 1965, 263 (3): 193-205.

  23. R Development Core Team: R: A Language and Environment for Statistical Computing. 2008, R Foundation for Statistical Computing, Vienna, Austria, [http://www.R-project.org]

  24. Johnson JW, Oelkers EH, Helgeson HC: SUPCRT92: A software package for calculating the standard molal thermodynamic properties of minerals, gases, aqueous species, and reactions from 1 to 5000 bar and 0 to 1000°C. Comp Geosci. 1992, 18 (7): 899-947. 10.1016/0098-3004(92)90029-Q.

  25. Levelt-Sengers JMH, Kamgarparsi B, Balfour FW, Sengers JV: Thermodynamic properties of steam in the critical region. J Phys Chem Ref Data. 1983, 12: 1-28.

  26. Haar L, Gallagher JS, Kell GS: NBS/NRC Steam Tables. 1984, Washington, D. C.: Hemisphere

  27. Johnson JW, Norton D: Critical phenomena in hydrothermal systems: state, thermodynamic, electrostatic, and transport properties of H2O in the critical region. Am J Sci. 1991, 291 (6): 541-648.

  28. Wagner W, Pruss A: The IAPWS formulation 1995 for the thermodynamic properties of ordinary water substance for general and scientific use. J Phys Chem Ref Data. 2002, 31 (2): 387-535. 10.1063/1.1461829.

  29. Archer DG, Wang PM: The dielectric constant of water and Debye-Hückel limiting law slopes. J Phys Chem Ref Data. 1990, 19 (2): 371-411.

  30. Helgeson HC, Kirkham DH, Flowers GC: Theoretical prediction of the thermodynamic behavior of aqueous electrolytes at high pressures and temperatures. IV. Calculation of activity coefficients, osmotic coefficients, and apparent molal and standard and relative partial molal properties to 600°C and 5 Kb. Am J Sci. 1981, 281 (10): 1249-1516.

  31. Tanger JC, Helgeson HC: Calculation of the thermodynamic and transport properties of aqueous species at high pressures and temperatures: Revised equations of state for the standard partial molal properties of ions and electrolytes. Am J Sci. 1988, 288: 19-98.

  32. Shock EL, Helgeson HC: Calculation of the thermodynamic and transport properties of aqueous species at high pressures and temperatures: Standard partial molal properties of organic species. Geochim Cosmochim Acta. 1990, 54 (4): 915-945. 10.1016/0016-7037(90)90429-O.

  33. Shock EL, Oelkers EH, Johnson JW, Sverjensky DA, Helgeson HC: Calculation of the thermodynamic properties of aqueous species at high pressures and temperatures: Effective electrostatic radii, dissociation constants, and standard partial molal properties to 1000°C and 5 kbar. J Chem Soc Faraday Trans. 1992, 88 (6): 803-826. 10.1039/ft9928800803.

  34. Robie RA, Hemingway BS: Thermodynamic Properties of Minerals and Related Substances at 298.15 K and 1 Bar (10^5 Pascals) Pressure and at Higher Temperatures. 1995, Bulletin 2131, U. S. Geol. Surv

  35. Maier CG, Kelley KK: An equation for the representation of high-temperature heat content data. J Am Chem Soc. 1932, 54 (8): 3243-3246. 10.1021/ja01347a029.

  36. Richard L, Helgeson HC: Calculation of the thermodynamic properties at elevated temperatures and pressures of saturated and aromatic high molecular weight solid and liquid hydrocarbons in kerogen, bitumen, petroleum, and other organic matter of biogeochemical interest. Geochim Cosmochim Acta. 1998, 62 (23–24): 3591-3636. 10.1016/S0016-7037(97)00345-1.

  37. Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003, 31: 365-370. 10.1093/nar/gkg095.

  38. Amend JP, Plyasunov AV: Carbohydrates in thermophile metabolism: Calculation of the standard molal thermodynamic properties of aqueous pentoses and hexoses at elevated temperatures and pressures. Geochim Cosmochim Acta. 2001, 65 (21): 3901-3917. 10.1016/S0016-7037(01)00707-4.

  39. LaRowe DE, Helgeson HC: Biomolecules in hydrothermal systems: Calculation of the standard molal thermodynamic properties of nucleic-acid bases, nucleosides, and nucleotides at elevated temperatures and pressures. Geochim Cosmochim Acta. 2006, 70 (18): 4680-4724. 10.1016/j.gca.2006.04.010.

  40. LaRowe DE, Helgeson HC: The energetics of metabolism in hydrothermal systems: Calculation of the standard molal thermodynamic properties of magnesium-complexed adenosine nucleotides and NAD and NADP at elevated temperatures and pressures. Thermochim Acta. 2006, 448 (2): 82-106. 10.1016/j.tca.2006.06.008.

  41. Nakamura S, Aono R, Mizutani S, Takashina T, Grant WD, Horikoshi K: The cell-surface glycoprotein of Haloarcula japonica TR-1. Biosci Biotechnol Biochem. 1992, 56 (6): 996-998.

  42. Whitman WB, Ankwanda E, Wolfe RS: Nutrition and carbon metabolism of Methanococcus voltae. J Bacteriol. 1982, 149 (3): 852-863.

  43. Jones WJ, Leigh JA, Mayer F, Woese CR, Wolfe RS: Methanococcus jannaschii sp. nov., an extremely thermophilic methanogen from a submarine hydrothermal vent. Arch Microbiol. 1983, 136 (4): 254-261. 10.1007/BF00425213.

  44. Hoehler TM, Alperin MJ, Albert DB, Martens CS: Thermodynamic control on hydrogen concentrations in anoxic sediments. Geochim Cosmochim Acta. 1998, 62 (10): 1745-1756. 10.1016/S0016-7037(98)00106-9.

  45. McCollom TM, Shock EL: Geochemical constraints on chemolithoautotrophic metabolism by microorganisms in seafloor hydrothermal systems. Geochim Cosmochim Acta. 1997, 61 (20): 4375-4391. 10.1016/S0016-7037(97)00241-X.

  46. Waddington CH: The Evolution of an Evolutionist. 1975, Edinburgh: Edinburgh University Press

  47. Lewis GN, Randall M: Thermodynamics. 1961, New York: McGraw-Hill, [Rev. by K. S. Pitzer and L. Brewer].

  48. Drever JI: The Geochemistry of Natural Waters. 1997, Upper Saddle River, New Jersey: Prentice Hall, 3rd edition

  49. Lichtner PC: Continuum formulation of multicomponent-multiphase reactive transport. Reactive Transport in Porous Media, Reviews in Mineralogy. Edited by: Lichtner PC, Steefel CI, Oelkers EH. 1996, Washington, D. C.: Mineralogical Society of America, 34: 1-81.

  50. Helgeson HC: Solution chemistry and metamorphism. Researches in Geochemistry. Edited by: Abelson PH. 1967, New York: Wiley, 2: 362-404.

  51. Prigogine I, Defay R: Chemical Thermodynamics. 1954, London: Longmans, Green and Co

  52. Amend JP, Shock EL: Energetics of amino acid synthesis in hydrothermal ecosystems. Science. 1998, 281 (5383): 1659-1662. 10.1126/science.281.5383.1659.

  53. Pfeil W, Privalov PL: Thermodynamic investigations of proteins. 1. Standard functions for proteins with lysozyme as an example. Biophys Chem. 1976, 4: 23-32. 10.1016/0301-4622(76)80003-8.

  54. Shock EL, Helgeson HC: Calculation of the thermodynamic and transport properties of aqueous species at high pressures and temperatures: Correlation algorithms for ionic species and equation of state predictions to 5 kb and 1000°C. Geochim Cosmochim Acta. 1988, 52 (8): 2009-2036. 10.1016/0016-7037(88)90181-0.

  55. Shock EL, Helgeson HC, Sverjensky DA: Calculation of the thermodynamic and transport properties of aqueous species at high pressures and temperatures: Standard partial molal properties of inorganic neutral species. Geochim Cosmochim Acta. 1989, 53 (9): 2157-2183. 10.1016/0016-7037(89)90341-4.

  56. Helgeson HC, Owens CE, Knox AM, Richard L: Calculation of the standard molal thermodynamic properties of crystalline, liquid, and gas organic molecules at high temperatures and pressures. Geochim Cosmochim Acta. 1998, 62 (6): 985-1081. 10.1016/S0016-7037(97)00219-6.

  57. Schulte MD, Shock EL, Wood RH: The temperature dependence of the standard-state thermodynamic properties of aqueous nonelectrolytes. Geochim Cosmochim Acta. 2001, 65 (21): 3919-3930. 10.1016/S0016-7037(01)00717-7.

  58. Kelley KK: Contributions to the Data in Theoretical Metallurgy XIII: High Temperature Heat Content, Heat Capacities and Entropy Data for the Elements and Inorganic Compounds. 1960, Bulletin 584, U. S. Bureau of Mines

  59. Wagman DD, Evans WH, Parker VB, Schumm RH, Halow I, Bailey SM, Churney KL, Nuttall RL: The NBS tables of chemical thermodynamic properties. Selected values for inorganic and C1 and C2 organic substances in SI units. J Phys Chem Ref Data. 1982, 11: 1-392.

  60. Linderstrøm-Lang KU: On the ionization of proteins. CR Trav Lab Carlsberg. 1924, 15 (7): 1-29.

  61. Pfeil W, Privalov PL: Thermodynamic investigations of proteins. 3. Thermodynamic description of lysozyme. Biophys Chem. 1976, 4: 41-50. 10.1016/0301-4622(76)80005-1.

  62. Lauerer G, Kristjansson JK, Langworthy TA, König H, Stetter KO: Methanothermus sociabilis sp. nov., a second species within the Methanothermaceae growing at 97°C. Syst Appl Microbiol. 1986, 8 (1–2): 100-105.

  63. Stetter KO, Thomm M, Winter J, Wildgruber G, Huber H, Zillig W, Janécovic D, König H, Palm P, Wunderl S: Methanothermus fervidus, sp. nov., a novel extremely thermophilic methanogen isolated from an Icelandic hot spring. Zentralbl Bakteriol Hyg Abt I Orig C. 1981, 2 (2): 166-178.

  64. Horikoshi K, Aono R, Nakamura S: The triangular halophilic archaebacterium Haloarcula japonica strain TR-1. Experientia. 1993, 49 (6–7): 497-502. 10.1007/BF01955151.

  65. Franzmann PD, Springer N, Ludwig W, Demacario EC, Rohde M: A methanogenic archaeon from Ace Lake, Antarctica: Methanococcoides burtonii sp. nov. Syst Appl Microbiol. 1992, 15 (4): 573-581.

  66. Leigh JA, Wolfe RS: Acetogenium kivui gen. nov., sp. nov., a thermophilic acetogenic bacterium. Int J Syst Bacteriol. 1983, 33 (4): 886-886.

  67. Wilson KS, Vorgias CE, Tanaka I, White SW, Kimura M: The thermostability of DNA-binding protein HU from bacilli. Protein Eng. 1990, 4: 11-22. 10.1093/protein/4.1.11.

  68. Warth AD: Relationship between the heat resistance of spores and the optimum and maximum growth temperatures of Bacillus species. J Bacteriol. 1978, 134 (3): 699-705.

  69. Guérin-Faublée V, Charles S, Chomarat M, Flandrois J-P: Reappraisal of the effect of temperature on the growth kinetics of Aeromonas salmonicida. Lett Appl Microbiol. 1997, 25 (5): 363-366. 10.1046/j.1472-765X.1997.00229.x.

Acknowledgements

I would like to acknowledge the late Professor Harold C. Helgeson for his friendship and advice during the Ph.D. research project that provided the foundation for this paper. This work was supported by grants EAR-0309829 from the U.S. National Science Foundation and DE-FG02-03ER15418 from the U.S. Department of Energy.

Author information

Corresponding author

Correspondence to Jeffrey M Dick.

Additional information

Competing interests

The authors declare that they have no competing interests.

Electronic supplementary material

12932_2008_91_MOESM1_ESM.txt

Additional file 1: Program script for generating figures. This text file contains the program script used to make the diagrams in Figs. 3, 4, 5, 6, 7. Use the commands listed at the top of the file to generate one or all of the figures on screen or in postscript format. (TXT 7 KB)

12932_2008_91_MOESM2_ESM.pdf

Additional file 2: Degree of formation diagram. This file contains the degree of formation diagram related to Fig. 5b (see text) together with the program script used to make the figure. This additional material is the source of the graphical abstract for this paper. (PDF 39 KB)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Dick, J.M. Calculation of the relative metastabilities of proteins using the CHNOSZ software package. Geochem Trans 9, 10 (2008). https://doi.org/10.1186/1467-4866-9-10

  • DOI: https://doi.org/10.1186/1467-4866-9-10

Keywords