Three-Dimensional Molecular Modeling of Bovine Caseins

Three-dimensional (3-D) structures derived from X-ray crystallography are important in elucidating structure-function relationships for many proteins. However, not all food proteins can be crystallized. The caseins of bovine milk (αs1-, κ-, and β-) are one class of non-crystallizable proteins. The complete primary and partial secondary structures of these proteins are known, but homologous proteins of known crystallographic structure cannot be found. Therefore, sequence-based predictions of secondary structure were made and adjusted to conform with data from Raman and Fourier-transformed infra-red spectroscopy. With this information, 3-D structures for these caseins were built using the Sybyl molecular modeling programs. The κ-casein structure contained two anti-parallel β-sheets which are predominantly hydrophobic. The αs1-casein structure also contained a hydrophobic domain composed of β-sheets as well as a hydrophilic domain; these two are connected by a segment of α-helix. Both the κ- and αs1-casein structures are unrefined models in that they have been manipulated to remove unrealistic bonds but have not been energy-minimized. Nevertheless, the models account for the tendency of these caseins to associate. The β-casein model appears to follow a divergent structural pattern. When subjected to energy minimization, it yielded a loosely packed structure with an axial ratio of 2 to 1, a hydrophobic C-terminal domain, and a hydrophilic N-terminal end. All three casein structures showed good agreement with the literature concerning their global biochemical and physico-chemical properties.


Introduction
The skim milk system contains a unique biocolloid, the casein micelle. This colloidal complex is in dynamic equilibrium with its environment. Changes in the state of the casein micelle system occur during milk secretion and processing (Farrell, 1988; Schmidt, 1982). Because of its innate importance to the milk system, the casein micelle has been studied extensively. From this large basis set of physical data, various investigators (for reviews see: Farrell, 1988; Schmidt, 1982) have attempted, with varying degrees of success, to formulate models which will adequately account for the properties of the casein micelle and allow for predictions of its behavior in real systems (Farrell, 1988). These models often fall short because of a serious gap in the data basis set. The combination of X-ray crystallography and molecular biology has contributed greatly to our understanding of the mechanisms of action of globular proteins. While the techniques of molecular biology are now being applied to casein, crystallographic structures will probably not be realized. It is this gap in our knowledge, the reconciliation of the physical chemical data for the caseins with molecular structure, that we have attempted to bridge with the tools of three-dimensional molecular modeling. These models are not an end in themselves; it is hoped that they will represent a new starting point in the examination and exploration of casein structure and function.

Predictions of secondary structures
Selection of appropriate conformational states for the individual amino acid residues was accomplished by comparing the results of sequence-based predictive techniques, primarily those of Chou and Fasman (1978), Garnier et al. (1978), and Cohen et al. (1983, 1986).
Assignments of secondary structure (α-helix, β-sheet or β-turn) for the amino acid sequences were made when either predicted by more than one method, or strongly predicted by one and not predicted against by the others. In addition, because of the large number of proline residues in the caseins, proline-based turn predictions were made using the data of Benedetti et al. (1983) and Ananthanarayanan et al. (1984).
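The acceptance rule stated above (predicted by more than one method, or strongly predicted by one and not predicted against by the others) can be sketched in a few lines of Python. This is only an illustration: the vote labels ("strong", "weak", "none", "against"), the function name `consensus`, and the tie handling are assumptions introduced here and are not part of the original procedure.

```python
from typing import Dict, List

STATES = ("helix", "sheet", "turn")

def consensus(votes_by_method: List[Dict[str, str]]) -> str:
    """Apply the two-part acceptance rule to one residue.

    Each dict holds one method's (hypothetical) vote per state:
    "strong", "weak", "none" (default) or "against".
    """
    scored = []
    for state in STATES:
        votes = [m.get(state, "none") for m in votes_by_method]
        n_for = sum(v in ("weak", "strong") for v in votes)
        n_strong = votes.count("strong")
        n_against = votes.count("against")
        # Rule: more than one method agrees, or one strong vote with no dissent.
        if n_for >= 2 or (n_strong >= 1 and n_against == 0):
            scored.append((n_for + n_strong, state))
    if not scored:
        return "unassigned"
    scored.sort(reverse=True)
    # Assumption: equal support for two states is reported, not resolved.
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return "ambiguous"
    return scored[0][1]
```

A residue left "unassigned" or "ambiguous" by such a rule would still need the manual reconciliation with spectroscopic data that the text describes.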

Molecular modeling
The three-dimensional structures of the caseins were approximated using molecular modeling methods contained within the program Sybyl (Tripos, St. Louis, MO) and implemented with an Evans and Sutherland (St. Louis, MO) PS390¹ interactive computer graphics display interfaced with a Silicon Graphics (Mountain View, CA) W-4D35 processor. In building the macromolecule we used a library, or dictionary, of geometric parameters, i.e., bond lengths between specified atoms, bond angles, and van der Waals radii, which are compatible with those found by X-ray crystallographic analysis. The molecular modeling software package contains this library with average geometric parameters for each amino acid. Initial construction of the models followed the procedures previously developed (Kumosinski et al., 1991a, 1991b, 1993). The construction of a model was accomplished by drawing upon the predicted behavior of the polypeptide chain from its amino acid sequence (see Secondary structures section above) and reconciling these predictions with spectroscopic data. Next, by examining the "domain" patterns arising from secondary structural predictions, combined with hydropathy plots (Farrell, 1988), peptide segments of from 60 to 100 residues were selected. Each peptide was built amino acid by amino acid, and assigned φ and ψ angles characteristic of the respective predicted structures for each residue. All ω angles were assigned the conventional trans configuration. In addition, aperiodic structures were in the extended rather than totally random configuration. This is a very important point, especially for the caseins, which by spectroscopic and physical measurements do appear to have non-classical structures and are not "random walk" peptides. The choice of extended structure attempts to address this issue. The Sybyl subroutine "SCAN" was used, on the side chains only, to adjust torsional angles and relieve inappropriate van der Waals contacts. The individual pieces were then joined together to produce the total polypeptide model and readjusted with the "SCAN" subroutine. The models for αs1- and κ-caseins were generated in this fashion. For β-casein, further refinement was necessary.
¹ Mention of brand or firm name does not constitute endorsement by USDA over products of a similar nature not mentioned.

Molecular force field-energy minimization
Quantum mechanical calculations have greatly facilitated the development of structure-function relationships for small molecules. However, many problems of biological interest, such as protein conformation, still require elementary model empirical energy functions. Though the functions are crude, this approach has been applied successfully to the study of hydrocarbons, oligonucleotides, peptides, and proteins.
Here, for β-casein, we have employed a potential energy model which is simply described as a collection of overlapping balls for the atoms, with given van der Waals radii, connected by springs which mimic the vibrational character of the bonds. In addition, the atoms are assigned van der Waals attractive and repulsive forces, as well as electrostatic forces, which quantitate bonded and non-bonded interactions. Molecular mechanics or force field methods employ a combination of potential energy functions to optimize a structure. Three requirements for these force field calculations are an equation to calculate energy as a function of molecular geometry, parameterization (a set of best values for experimentally obtainable molecular properties), and an algorithm to calculate new atomic coordinates. Empirical energy approaches are based on the assumption that one can replace a Born-Oppenheimer energy surface for a molecule, or system of molecules, by an analytical function. The potential energy function chosen is generally given as a sum of bond energies and non-bonded interaction terms (Weiner et al., 1984, 1986). Three terms are used to represent the difference in energy between a geometry in which the bond lengths, bond angles, and dihedral angles have ideal values, and the actual geometry. The remaining terms represent non-bonded van der Waals and electrostatic interactions. The current version of AMBER by Kollman and coworkers (Weiner et al., 1986) includes the option of calculating hydrogen bond energies. This option was chosen for the β-casein work. The parameters used with this force field should include atomic partial charges calculated by Kollman using a united atom approach, in which only essential hydrogens are used. The calculated energy was minimized using a conjugate gradient algorithm.
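The structure of such an energy function (three bonded terms plus van der Waals, electrostatic, and optional hydrogen-bond terms) can be made concrete with a short Python sketch. The functional forms below are the ones commonly written for this class of force field (harmonic bonds and angles, a cosine series for dihedrals, 12-6 van der Waals, Coulomb, and 12-10 hydrogen bond); they are given here as an assumption for illustration, and every parameter value and function name is a placeholder rather than a value taken from this work.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in electron
# units and distances in angstroms (assumed value, for illustration only).
COULOMB = 332.06

def bond_energy(k_r, r, r_eq):
    """Harmonic bond stretch: K_r (r - r_eq)^2."""
    return k_r * (r - r_eq) ** 2

def angle_energy(k_theta, theta, theta_eq):
    """Harmonic angle bend: K_theta (theta - theta_eq)^2, angles in radians."""
    return k_theta * (theta - theta_eq) ** 2

def dihedral_energy(v_n, n, phi, gamma):
    """Torsion term: (V_n / 2) [1 + cos(n phi - gamma)], angles in radians."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def van_der_waals(a_ij, b_ij, r):
    """12-6 term: A/r^12 - B/r^6."""
    return a_ij / r ** 12 - b_ij / r ** 6

def electrostatic(q_i, q_j, r, dielectric=1.0):
    """Coulomb term between two partial charges."""
    return COULOMB * q_i * q_j / (dielectric * r)

def hydrogen_bond(c_ij, d_ij, r):
    """12-10 hydrogen-bond term: C/r^12 - D/r^10."""
    return c_ij / r ** 12 - d_ij / r ** 10

def total_energy(bonds, angles, dihedrals, vdw, charges, hbonds):
    """Sum all terms; each argument is a list of parameter tuples."""
    return (sum(bond_energy(*t) for t in bonds)
            + sum(angle_energy(*t) for t in angles)
            + sum(dihedral_energy(*t) for t in dihedrals)
            + sum(van_der_waals(*t) for t in vdw)
            + sum(electrostatic(*t) for t in charges)
            + sum(hydrogen_bond(*t) for t in hbonds))
```

A minimizer such as conjugate gradient would then adjust the coordinates that determine r, theta, and phi so as to lower the value returned by `total_energy`.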

Results and Discussion
Rationale for generation of secondary structure models
Various methodologies for sequence-based secondary structural predictions are currently available, and these methods have been applied to the caseins (Creamer et al., 1981; Loucheux-Lefebvre et al., 1978; Raap et al., 1983). In our initial approaches, a sequence-based prediction was generated from each of three basis sets for the major proteins of milk: αs1-, β-, and κ-caseins. The results were ambiguous; for some portions of the sequences, structures (α-helix, etc.) were consistently predicted, while for others they were not. The relatively high proline contents of the caseins pose a problem since this residue, while occasionally found in both β-sheet and α-helical structures, is generally not favorable to either. In attempting to generate a consensus sequence-based prediction, we first focused upon solving the proline problem in a way commensurate with known behavior of this residue in proteins and model peptides.

Footnotes to Table 1: (a) αs1-Casein is a single polypeptide chain of 199 residues. (b) Byler et al. (1988). (c) κ-Casein is a single polypeptide chain of 169 residues. (d) Curley, Kumosinski and Farrell, preliminary data with 10% error. (e) β-Casein is a single polypeptide chain of 209 residues. Only this structure has been subjected to energy minimization, and so it has an "initial" structure similar to the two others and a "final" structure following minimization.

The proline residues of the caseins are somewhat evenly dispersed throughout the sequences, probably ruling out the Type I and Type II polyproline structures suggested by Garnier (1966). It has been documented, however, that in synthetic peptides of known crystal structure, proline frequently occupies the second position of either a four-residue β-turn or a three-residue γ-turn (Benedetti et al., 1983; Ananthanarayanan et al., 1984; Rose et al., 1985). Moreover, recent Raman and Fourier-transformed infra-red (FTIR) spectroscopic analyses of the caseins have suggested, by reference to known X-ray data and force constant calculations, that up to 35% of the residues in caseins appear to be in β- or γ-turns (Byler et al., 1988). To assess the probability that proline residues in the caseins might be located in turns, we examined the tetrapeptides containing proline in the second position for similarity with synthetic peptides known to form β- and γ-turns. These comparisons led to the hypothesis that most of the proline residues were in turns of some type. When the models were built, normal β-turns containing proline residues generally resulted in unfavorable van der Waals contacts with surrounding residues, leading us to assign some proline residues initially to the γ-turn conformation. β-Turns, other than those based on proline, were predicted by the Chou and Fasman (1978) and Cohen et al. (1986) methods. Many of the proline residues of β-casein, unlike those of αs1- and κ-casein, are in Pro-Pro or Pro-X-Pro sequences. These sequences occur primarily in the C-terminal half of the molecule; thus these two special cases are important in β-casein. Using the method developed for κ-casein, where model peptides His-Pro-Pro-His and His-Pro-His-Pro-His were built and energy-minimized to test the best φ and ψ angles to be inserted in these sequences (Kumosinski et al., 1991a), new models for β-casein, where -X- is hydrophobic, were constructed. These peptide models result in structures which do not unduly constrain the polypeptide chains and which have minimized energies of approximately -8 kcal/residue, a value equivalent to that attained by energy minimization of X-ray crystallographic structures. For all of the above reasons, the total number of residues assigned to turn structures was only slightly in excess of the total predicted by spectroscopic methods (Table 1).
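The sequence scan described above (tetrapeptides with proline in the second position, plus Pro-Pro and Pro-X-Pro motifs) is easy to express in Python. The sketch below is a hypothetical helper, not the procedure actually used; the function names are introduced here, and the demonstration string is a toy sequence built from the two model peptides named in the text, not a casein sequence.

```python
import re
from typing import List, Tuple

def proline_second_tetrapeptides(seq: str) -> List[Tuple[int, str]]:
    """Return (1-based start, tetrapeptide) for every window with Pro at position 2."""
    return [(i + 1, seq[i:i + 4])
            for i in range(len(seq) - 3)
            if seq[i + 1] == "P"]

def proline_motifs(seq: str) -> Tuple[List[int], List[int]]:
    """Return 1-based start positions of Pro-Pro and Pro-X-Pro (X != Pro) motifs.

    Lookahead patterns are used so that overlapping motifs are all reported.
    """
    pro_pro = [m.start() + 1 for m in re.finditer(r"(?=PP)", seq)]
    pro_x_pro = [m.start() + 1 for m in re.finditer(r"(?=P[^P]P)", seq)]
    return pro_pro, pro_x_pro

# Toy one-letter sequence containing His-Pro-Pro-His and His-Pro-His-Pro-His.
toy = "AHPPHAHPHPHA"
windows = proline_second_tetrapeptides(toy)
pp_sites, pxp_sites = proline_motifs(toy)
```

Each window returned by `proline_second_tetrapeptides` would then be compared, by hand or otherwise, with the synthetic peptides known to form β- and γ-turns.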
In a similar fashion, "consensus" scores for α-helix and β-sheet were arrived at by choosing from those regions having the highest predicted probability of a given structure to yield values in accord with the FTIR and Raman data (Byler et al., 1988; and Curley, Kumosinski and Farrell, preliminary data with 10% error). In this case, all residues previously assigned to proline-based turns were eliminated first from consideration in α-helical or extended β-structures. A concern was that the assignment of all prolines to turns might decrease significantly the content of α-helix or β-sheet, since proline can be the first residue in these structures. However, the net results of these calculations for β-, κ-, and αs1-casein are compared with the spectroscopic data in Table 1, and the α-helical and β-structures are not overly reduced. Finally, all residues not included in these periodic structures were then considered to be in an extended aperiodic conformation. The initial conformational assignments for κ-casein are shown with its sequence in Figure 1, for αs1-casein in Figure 2, and for β-casein in Figure 3. The κ-casein model is similar to that of Loucheux-Lefebvre et al. (1978) and, with the exception of the β- and γ-turns, the αs1- and β-models are comparable to those of Creamer et al. (1981).

Rationale for generation of three-dimensional models
The secondary structural assignments which had been reconciled with FTIR and Raman spectroscopic data were used as a point of departure for the generation of three-dimensional models for αs1-, β-, and κ-casein.
Idealized φ and ψ angles assigned initially for each structural element are given in Table 2. The proteins were assembled as described in Materials and Methods. Since the molecules were assembled from the N- to the C-terminal, the procedure is analogous to the mechanism taking place in vivo, whereby the proteins are synthesized and presumably fold after insertion into the lumen of the endoplasmic reticulum (Farrell and Thompson, 1988).

Figure 3. (A) Sequence of β-casein A2; P below S denotes serine phosphate. (B) Summary of initial secondary structural assignments made for β-casein A2; P denotes proline; SP denotes serine phosphate; α-helix denoted by scroll (e.g., residues 1-6); β-sheet by sawtooth (e.g., residues 25 to 27) and turns by turn-like structures (e.g., residues 17-20).

Three-dimensional molecular model of κ-casein
The computer model generated as described above for κ-casein is shown in Figure 4. Here, the hydrophobic side chains are colored green; serine phosphates, aspartic, and glutamic acid side chains are red; while the lysine and arginine groups are colored purple. In descriptive terms, the protein can be thought of as being represented by a "horse and rider".

Chemistry of κ-casein and the three-dimensional model
Chymosin: κ-Casein differs from αs1- and β-caseins in that it is soluble over a broad range of calcium ion concentrations (Waugh and Von Hippel, 1956). It was this calcium solubility which led these workers, upon discovering the κ-fraction, to assign to it the role of casein micelle stabilization. It is also the κ-casein fraction which is most readily cleaved by chymosin (rennin) (Jollès et al., 1962; Kalan and Woychik, 1965); the resulting products are termed para-κ-casein and the macropeptide. It would appear that κ-casein is the key to micelle structure in that it stabilizes the calcium-insoluble αs1- and β-caseins.
The action of chymosin on the casein micelle is primarily the hydrolysis of the highly sensitive Phe-Met peptide bond (residues 105-106) of κ-casein. Sequence data (Mercier et al., 1973) show this bond to be in a proline-rich region of the molecule between Pro-His-Pro (residues 99-101) and Pro-Pro (residues 109-110), which perhaps accounts for the high susceptibility of this specific bond to chymosin. From studies of model peptides, it has been suggested that the residues lying between Pro 101 and Pro 109 occur in a β-sheet structure (Raap et al., 1983). Using other predictive methods, there is an equal chance that these residues could be in an α-helical conformation. In either case, the Pro-X-Pro and Pro-Pro residues cause the formation of a kink which neatly presents the otherwise hydrophobic Phe-Met on the surface of the molecule. In our model, we show the α-helix, which represents the "horse's bit" in our descriptive "horse and rider" model. From model building considerations, this sequence represents the minimum number of residues for a stable α-helix. Cleavage of the Phe-Met (105-106) bond would render the helical conformation untenable; the helix would unwind and a considerable amount of configurational entropy would be added to the hydrolysis reaction. The change from β-sheet to extended conformation, though significant, would be lesser in its energy contribution. In either case, helix or sheet, the Pro-Pro and Pro-X-Pro turns present this segment on the surface and make it readily accessible to chymosin. All models for casein micelle structure must in some way account for this feature of κ-casein.
Hydrophobic interactions: Earlier speculation (Hill and Wake, 1969) that κ-casein might be a linear amphiphile seems to be only partially true. The amino-terminal fifth of the molecule has a relatively high charge frequency (0.34); however, the net charge is zero and this part of the protein is also relatively hydrophobic. Residues 35 to 68 represent an exceptionally hydrophobic area with almost no charge. It is precisely within this region that the majority of the residues found in the "legged" structures of the κ-casein molecule occur. These non-stranded, highly hydrophobic β-sheets make ideal sites for sheet-sheet interactions with other κ-casein molecules or with hydrophobic domains of αs1- and β-caseins. Indeed, the concentration-dependent self-association profile of the reduced form of purified κ-casein can be fitted with a model for polymerization at a critical micelle concentration (Vreeman et al., 1981). Several investigators (Rose, 1968; Downey and Murphy, 1970; Ali et al., 1980; Davies and Law, 1983) have noted that β- and κ-caseins diffuse out of the casein micelle at low temperatures. As one decreases the temperature, hydrophobic stabilization energy decreases, and κ-casein is able to dissociate from the micelle. Finally, the importance of the hydrophobic region in casein-casein interactions can be supported by the research of Woychik and Wondolowski (1969). Of nine tyrosine residues in κ-casein, seven are located between residues 35 and 68; nitration of 7 tyrosines in κ-casein severely inhibited its ability to stabilize αs1-casein. As can be seen in Figure 4, the "legged" structures are constituted from this region and contain 7 tyrosine residues.
Sulfhydryl-disulfide interactions: κ-Casein contains two Cys residues: one resides at amino acid residue 11 and the other at amino acid residue 88. Whether these can form intra- or intermolecular disulfide bonds, and the effects of such bonding on micelle stabilization, have not been clearly established. The occurrence of free sulfhydryl groups in the milk-protein complex has been reported by Beeby (1964) but not by others (Swaisgood et al., 1964). The latter authors reported significant S-S cross-linking in purified κ-casein. However, Woychik et al. (1966) demonstrated that reduced and alkylated κ-casein stabilized αs1-casein against calcium precipitation as well as native κ-casein. Pepper and Farrell (1982) found that in soluble whole casein, in the absence of Ca2+, κ-casein occurred as a polydisperse high molecular weight complex. At low protein concentrations this complex could be separated from the other caseins by size exclusion gel chromatography in the absence of urea. The addition of reducing agents converted the κ-casein to the sulfhydryl form, which exhibited concentration-dependent associations, both with itself and with other caseins. Thus, although κ-casein represents only 13% of the casein, many κ-casein molecules must be somewhat contiguous in the micelle in order to form these disulfide-linked aggregates. This finding has been confirmed by Groves et al. (1992), who demonstrated that κ-casein occurs as a series of disulfide-linked polymers ranging from monomer to octamer. It is interesting to note that Cys-11 is located between two segments predicted to be in α-helical conformations and is found on the "rider's" left, while Cys-88 is located in a predicted β-turn on the opposite side. In our model, both of these residues are located near the surface of the molecule and are directed away from each other. This could account for the ability of the κ-casein molecule to form the inter-chain disulfide-bonded polymers as discussed above, and quantitated by Rasmussen et al. (1992).
Sites for glycosylation and phosphorylation of κ-casein: Of the major components of the casein complex, only κ-casein can be glycosylated. Nearly all of the carbohydrate, as well as the phosphate, associated with κ-casein is bound to the macropeptide (Eigel et al., 1984), which is the highly soluble portion released by chymosin hydrolysis. The major site for glycosylation, Thr-133, is found in our model on the back of the "horse" and is on a β-turn. The sites of phosphorylation, Ser-149 and Thr-145, are on the back portion and are found in β-turns as well.

Three-dimensional molecular model of αs1-casein
The computer model generated for αs1-casein as described above is shown in Figure 6, where it is displayed from carboxy- to amino-terminal (left to right). The molecule is colored as described for κ-casein. The best representation shows the molecule to be composed (right to left) of a hydrophilic amino-terminal portion, a segment of rather hydrophobic β-sheet, the phosphopeptide segment, and then a very hydrophobic carboxy-terminal domain. For clarity, the backbone without side chains is shown in Figure 7.

Chemistry of αs1-casein and the three-dimensional model
Sites of phosphorylation in αs1-casein: αs1-Casein B is a single polypeptide chain with 199 amino acid residues and a molecular weight of 23,619 daltons (D) (Mercier et al., 1971). The αs1-B molecule contains eight phosphate residues, all in the form of serine monophosphates. Seven of these phosphoserine residues are clustered in an acidic portion of the molecule bounded by residues 43 and 80 (the second fifth of the molecule from the amino-terminal end). This highly acidic segment contains 12 carboxylic acid groups as well as the seven phosphoserines. The model shows the phosphoserine residues to be located on β-turns, which is compatible with other known phosphorylated residues in proteins. This cluster forms a highly hydrophilic domain on the right shoulder of the molecule. This cluster is the major site for calcium binding, which in turn can be thermodynamically linked to the physico-chemical properties of αs1-casein (Farrell et al., 1988).
The αs1-casein A deletion and chymosin cleavage sites: The rare αs1-casein A genetic variant exhibits interactions which are highly temperature dependent. The A variant is the result of the sequential deletion of 13 amino acid residues bounded by residues 13 and 27; the majority of these deleted amino acids are apolar (Farrell et al., 1988), but Glu 14, Glu 18, and Arg 22 are also deleted. The deleted segment encompasses a region predicted to be in a β-sheet. This sheet provides a spacer-arm between the hydrophilic amino-terminal region ("five o'clock" in Figure 6) and the phosphopeptide region. Deletion of this spacer-arm brings the hydrophilic section closer to the phosphate-rich shoulder. In addition, the Phe-Phe bond (residues 23 and 24) is removed; this represents a major chymosin cleavage site (Mulvihill and Fox, 1979), and its absence may account for the poor quality products prepared from αs1-casein A milks (Thompson et al., 1969). Additional chymosin cleavage sites have been suggested (Mulvihill and Fox, 1979), but these do not appear to be relatively exposed for enzymatic attack.
Hydrophobic interactions: For αs1-casein, the high degree of hydrophobicity exhibited by the segment containing residues 100 to 199 is probably responsible, in part, for the pronounced self-association of the [...] 1982). This self-association approaches a limiting size under conditions of lowered ionic strength; the highly charged phosphopeptide region can readily account for this phenomenon through charge repulsions. It is noteworthy that at ionic strengths > 0.5, αs1-casein is salted out of solution at 37°C. This segment (residues 100-199) of αs1-casein is highly enriched in hydrophobic residues and contains a segment of non-stranded β-sheet reminiscent of those found in κ-casein. This region is rich in tyrosine and, as noted above for κ-casein, nitration of αs1-casein also leads to decreased stability in reconstituted micellar structures (Woychik and Wondolowski, 1969).

β-Casein energy minimization of constructed model
The β-casein molecule is a single-chain polypeptide with five phosphoserine residues and a molecular weight of 23,980 D (Ribadeau-Dumas et al., 1972). The high proline content of this molecule, relative to the other caseins, made construction of its model important for comparison with those developed above. The initial structure of β-casein was open and too diffuse; it was quite different from those of αs1- and κ-casein (Kumosinski et al., 1993). The model was, therefore, energy-minimized using the Kollman force field potential (Weiner et al., 1986). Only essential hydrogen bonding protons were used in order to increase the speed of the calculation. Hence, van der Waals radii of carbon atoms were increased to account for the lack of hydrogens. This united atom approach is widely used when dealing with proteins in excess of thirty residues. Electrostatic interactions were added to the calculation by the use of united atom partial charges according to Weiner et al. (1986). Atoms at a distance of more than 0.8 nm (8 Å) were not considered as contributing to van der Waals and electrostatic interactions; the "non-bonded cut-off" used was thus 0.8 nm. The structure was minimized to a limit of ± 1 kcal and the results for each type of energy, as delineated in Equation 1, are presented in Table 3. Here, this total energy corresponds to a value of -13 kcal per residue, which is consistent with values obtained from energy minimization of structures derived from X-ray crystallography.
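The effect of the 0.8 nm non-bonded cut-off can be illustrated with a brute-force pair search in Python. This is a minimal sketch under stated assumptions: coordinates are in nanometres, the function name `pairs_within_cutoff` is hypothetical, and the exclusion of directly bonded neighbours that a real force-field code applies is omitted for brevity.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

def pairs_within_cutoff(coords_nm: Sequence[Sequence[float]],
                        cutoff_nm: float = 0.8) -> List[Tuple[int, int]]:
    """Return index pairs (i < j) of atoms separated by at most cutoff_nm.

    Pairs farther apart than the cut-off are simply skipped, so they add
    nothing to the van der Waals or electrostatic sums.
    """
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(coords_nm), 2)
            if math.dist(a, b) <= cutoff_nm]

# Four illustrative atom positions (nm); not taken from any casein model.
atoms = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.7, 0.0)]
close_pairs = pairs_within_cutoff(atoms)
```

The all-pairs loop scales with the square of the atom count, which is one reason a cut-off (and the united atom approach) shortens the calculation for a chain of roughly 200 residues.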

Energy-minimized molecular model of β-casein
The energy-minimized three-dimensional molecular model of β-casein A2 is presented as a colored stick model in Figure 8. The molecule is colored as described for κ-casein except for plasmin and chymosin cleavage sites (Fox, 1981; Visser, 1981), which are colored red-orange and orange, respectively; the backbone residues are traced by a double yellow ribbon. It is apparent from Figure 8 that although the structure of β-casein A2 appears more compact than the previously derived structures for κ- and αs1-caseins (Figures 4 and 6), it contains considerable asymmetric character. The backbone of the model demonstrates loops through which water can easily pass. The tertiary structure of β-casein is not that of a globular protein nor that of a solvent-denatured random coil. On inspection, the model can be thought to have a "crab-like" appearance with two large distorted hydrophilic arms. The lower one (at the "four o'clock" position) results from multiple proline-based turns (residues 85 to 119). The other large hydrophilic arm (residues 28 to 55) results from a combination of secondary structures which includes extended, sheet, turns and a segment of α-helix (residues 28 to 34); phosphoserine 35 occurs in this arm. Both arms contain relatively high charge frequency (Table 1) and the well-characterized plasmin cleavage sites, which occur at residues 28-29, 105-106 and 107-108.
Figure 8 shows that the overall shape of the molecule is asymmetric and can be approximated by a prolate ellipsoid with an axial ratio of two to one. Furthermore, one end of the structure is predominantly hydrophilic, as noted above, and the rest consists mostly of a highly hydrophobic domain (left side of Figure 8). In β-casein, Pro-Pro and Pro-X-Pro sequences appear at intervals and result in highly convoluted hydrophobic segments or loops through which water may pass. The central X-residues of the Pro-X-Pro sequences are invariably nonpolar, as are 60% of residues flanking Pro-X-Pro and Pro-Pro units. These rigid hydrophobic segments may act as anchors to position portions of the sequence away from the surface (much as defined secondary structures can result in specific folding patterns for globular proteins) and thus to provide interaction sites for hydrophobically driven association reactions. In contrast, proline-based γ-turns in the models of the αs1- and κ-caseins result in hydrophobic, anti-parallel β-sheets (Figures 4 and 6). This model for β-casein gives the impression of a large distorted surfactant molecule. This is most likely the result of the numerous turns introduced by the Pro-Pro and Pro-X-Pro residues. The kinds and types of proline-based turns are given in Table 4. The surfactant nature of β-casein is underscored by a dipole moment of 1557 Debye and a net charge of -18.25 atomic charge units calculated from this structure using the Kollman (Weiner et al., 1986) [...] structure with the same net charge. A radius of gyration of 2.3 nm was calculated by assuming a solid prolate ellipsoid of revolution with axial ratio of 2 to 1.

Footnotes to Table 4: (1) Middle amino acids of the four-residue β-turn. (2) All three residues of the γ-turn. (3) Residues in two successive Pro-X-Pro sequences resulting in a spring structure with the polypeptide chain proceeding in the same direction.
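The relation between the radius of gyration and the dimensions of a solid prolate ellipsoid can be written out explicitly. For a uniform solid ellipsoid of revolution with semi-axes a (long) and b (short, twice), the standard result is Rg² = (a² + 2b²)/5. The Python sketch below applies that formula; the function names are introduced here for illustration, and the semi-axes it returns are derived quantities, not values reported in the text.

```python
import math
from typing import Tuple

def rg_prolate(a: float, b: float) -> float:
    """Radius of gyration of a uniform solid ellipsoid with semi-axes a, b, b."""
    return math.sqrt((a ** 2 + 2.0 * b ** 2) / 5.0)

def semi_axes_from_rg(rg: float, axial_ratio: float) -> Tuple[float, float]:
    """Invert the formula for a fixed axial ratio p = a / b.

    With a = p * b, Rg^2 = b^2 (p^2 + 2) / 5, so b = Rg * sqrt(5 / (p^2 + 2)).
    """
    b = rg * math.sqrt(5.0 / (axial_ratio ** 2 + 2.0))
    return axial_ratio * b, b

# Values quoted in the text: Rg = 2.3 nm, axial ratio 2 to 1.
a_nm, b_nm = semi_axes_from_rg(2.3, 2.0)
```

Under these assumptions the quoted radius of gyration corresponds to semi-axes of roughly 4.2 nm and 2.1 nm.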
A backbone structure with labeled proline residues and a relaxed stereo view are also presented in Figures 9A and B. Labeled in Figure 9C are the plasmin and chymosin cleavage sites, which all show accessibility for surface interactions with enzymes. Note that Trp-143 is relatively exposed in the monomer model. Pearce (1975) demonstrated that there is a blue shift of fluorescence when β-casein self-associates with increasing temperature. Thus Trp-143 would be exposed in the monomer, but buried in the polymer, accounting for the blue shift reported to occur on aggregation (Pearce, 1975). Also contained in the hydrophilic end of the structure are the phosphoserine residues, which are located in a turn region as was the case for αs1-casein (Figure 6). In fact, an extended turn region which resembles the crab's head (Figures 8, 9) ranges from residue 14 to 22 and contains 4 of the 5 phosphoserines in β-casein.

Secondary structure analysis
Theenergy-minimized three-dimensional structure of ,6-casein A 2 was compared with reported secondary structural analysis via Raman spectroscopy (Table 1).
Figure 10 shows a Ramachandran plot of the φ, ψ backbone dihedral angles (open circles) calculated from the refined β-casein A2 structure using the Tripos Sybyl molecular modeling software; areas of φ, ψ angles which have been observed for known secondary structures (Rose et al., 1985) are denoted. From this plot, the number of residues present within the area assigned to a particular structure can be estimated to obtain a rough value for the global amounts of α-helix, β-sheet, or turns. However, this type of analysis alone can lead to overestimations since it does not take into account the required minimum residue length for a periodic structure (five to six residues). In addition, φ and ψ angles may not exactly represent the secondary structure due to changes in backbone bond lengths or bonding, so that visual inspection of the secondary structure by use of a chain trace or ribbon, as in Figures 8 and 9, should also be employed in analyzing the secondary structure of a model.
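The counting procedure described above, including the minimum-length caveat, can be sketched in a few lines of Python. This is an illustration only: the rectangular φ, ψ boxes are rough assumed boundaries rather than the regions of Rose et al. (1985), turns are lumped into "other", and the function names are hypothetical.

```python
from itertools import groupby

# Illustrative (phi, psi) boxes in degrees; assumed, not taken from the paper.
REGIONS = {
    "helix": ((-100.0, -30.0), (-80.0, -10.0)),
    "sheet": ((-180.0, -45.0), (90.0, 180.0)),
}
MIN_RUN = 5  # minimum residues for a periodic structure (five to six in the text)


def classify(phi, psi):
    """Assign one residue to a region by its backbone dihedral angles."""
    for name, ((phi_lo, phi_hi), (psi_lo, psi_hi)) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return name
    return "other"


def assign_states(angles, min_run=MIN_RUN):
    """Classify residues, then demote helix/sheet runs shorter than min_run."""
    raw = [classify(phi, psi) for phi, psi in angles]
    states = []
    for name, group in groupby(raw):
        run = list(group)
        if name != "other" and len(run) < min_run:
            states.extend(["other"] * len(run))
        else:
            states.extend(run)
    return states


def fractions(states):
    """Global fraction of residues in each state."""
    n = len(states)
    if n == 0:
        return {"helix": 0.0, "sheet": 0.0, "other": 0.0}
    return {name: states.count(name) / n for name in ("helix", "sheet", "other")}


# Hypothetical input: six helical residues, two isolated sheet-like residues,
# and two residues outside both boxes.
angles = [(-60.0, -45.0)] * 6 + [(-120.0, 130.0)] * 2 + [(60.0, 60.0)] * 2
result = fractions(assign_states(angles))
```

In this example a raw count would credit 20% of the residues to β-sheet, whereas the run-length filter assigns them to "other", which is the overestimation the text warns about.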
The above procedure was used to analyze the β-casein A2 structure; the global secondary structural results are in good agreement with those obtained in D2O via Raman spectroscopy (Table 1). In addition to these classical structures, other types of non-classical structure are predicted. One type is the β-spiral arising from multiple proline turns, as observed for residues 196-206 and 61-71 (Figures 9A and B). Another type of non-classical structure is the distorted hydrophilic arm discussed in the previous section, which also results from multiple proline turns contained within residues 85-90 and 100-120.

Plasmin and chymosin cleavage sites
Several investigators (for reviews see: Fox, 1981; Visser, 1981) have determined the sites of β-casein which are cleaved by the proteolytic enzymes plasmin and chymosin. It is of interest to further examine the β-casein A2 three-dimensional model to ascertain whether these proteolytic sites are buried or exposed on its surface. The principal plasmin cleavage sites, residues 28-29, 105-106 and 107-108, are colored red-orange on the model (Figure 8). All appear to be relatively exposed on the surface of the proposed structure. In fact, the cleavage sites are all located on the hydrophilic side of the molecule and particularly on the long distorted arms. Hence, all sites are not only solvent-accessible but also sufficiently exposed that in solution plasmin would readily hydrolyze these sites under the reported conditions of temperature and salt concentrations (Fox, 1981; Visser, 1981).
The chymosin cleavage sites are exposed on the monomer surface as well but are located on the hydrophobic (left) end of the proposed structure (see Figure 8). This model would predict that hydrophobic interactions, due to temperature-dependent self-association or interactions with other caseins, would inhibit chymosin action on β-casein. In fact, Visser (1981) and Creamer (1976) have shown that successful chymosin action on β-casein in solution occurs predominantly either at low temperatures, where hydrophobically driven self-association is minimized, or at lower pH. Berry and Creamer (1975) successfully limited hydrolysis to residues 189-190 on β-casein at 2 °C; the resulting large polypeptide, i.e., residues 1-189, showed a marked decrease in self-association relative to the parent β-casein by gel-permeation chromatography. In our model, residues 189-190 are located in a β-sheet structure (Figures 3 and 8), which is followed by a unique β-spiral (Figures 9A and B). This unique sheet-spiral structure is perpendicular to the hydrophobic surface of the model and is accessible to enzyme action in the monomer but not if hydrophobic self-associations occur.

Correlation of β-casein model with solution physicochemical studies
The proposed three-dimensional structure for β-casein A2 has a radius of gyration (Rg) of approximately 2.3 nm and dimensions of 2.1 by 4.2 nm, assuming a prolate ellipsoid of axial ratio two to one. Since the Sybyl molecular modeling software does not provide adequate surface (S) and volume (V) calculations, a V of 77.6 nm³ was estimated from the dimensions of the prolate ellipsoid model by assuming an S/V value of 2 nm⁻¹. This latter value was calculated by Kumosinski and Pessen (1982, 1985) for pepsin, which has a large surface area. Using this S/V ratio in conjunction with molecular weight and partial specific volume, a sedimentation constant of 1.7 S could be calculated for the β-casein. The actual value at 2 °C is 1.5 S (Payens, 1979; Payens and van Markwijk, 1963). This calculated value of 1.7 S would be an upper limit, and in reality a more precise surface area calculation might indeed yield a sedimentation coefficient much closer to the value of 1.5 S experimentally determined by Payens and coworkers.
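As a quick numerical check on the figures quoted above, the volume and radius of gyration of a uniform solid prolate ellipsoid follow from its semi-axes a (long) and b (short): V = (4/3)πab² and Rg² = (a² + 2b²)/5. The sketch below assumes that the quoted dimensions of 4.2 and 2.1 nm are the semi-axes; that reading is an assumption, not something the text states, but under it the standard formulas return values close to the reported 77.6 nm³ and 2.3 nm. Function names are hypothetical.

```python
import math


def prolate_volume(a, b):
    """Volume of a prolate ellipsoid with semi-axes a (long) and b (short)."""
    return 4.0 / 3.0 * math.pi * a * b * b


def prolate_rg(a, b):
    """Radius of gyration of a uniform solid prolate ellipsoid."""
    return math.sqrt((a * a + 2.0 * b * b) / 5.0)


# Assumed semi-axes in nm (axial ratio 2 to 1).
a_nm, b_nm = 4.2, 2.1
volume_nm3 = prolate_volume(a_nm, b_nm)  # close to 77.6 nm^3
rg_nm = prolate_rg(a_nm, b_nm)           # close to 2.3 nm
```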
Andrews et al. (1979) studied the aggregation of β-casein by performing small-angle X-ray scattering experiments at 4 °C and 25 °C at protein concentrations ≥ 10 mg/ml. They calculated an Rg of 4.6 ± 0.2 nm for a presumed β-casein monomer at 4 °C. However, they did not estimate the molecular weights of either the monomer or the observed polymer. Here, there is serious disagreement between their monomer Rg value of 4.6 nm and that determined from our structure, 2.3 nm. The value of 4.6 nm is extraordinarily high for a protein of 24,000 D molecular weight and would be equivalent to a globular protein of molecular weight 312,000 D (Kumosinski and Pessen, 1982, 1985), or to completely denatured β-casein in a random coil with no secondary structure (Payens, 1979). To be consistent with an Rg of 4.6 nm and a sedimentation constant of 1.5 S, the monomer would have an axial ratio of 8 to 1 for the previously calculated volume of 77.6 nm³. The resulting ellipsoidal particle would have dimensions of 2.6 nm by 2.6 nm by 20.8 nm; even this hypothetical value represents a more compact structure than that of Payens and van Markwijk (1963), who calculated an axial ratio of 12 to 1 for β-casein at 4 °C, and Waugh et al. (1970), who calculated a 9 to 1 ratio. Noelken and Reibstein (1968) showed via sedimentation equilibrium studies that β-casein is monomeric at 2.5 °C, but in the presence of ethylenediaminetetraacetic acid (EDTA) and at much lower concentration than the small-angle X-ray scattering experiments of Andrews et al. (1979). Waugh et al. (1970) suggested that reported differences in the physical chemical data of β-casein could be due to the presence or absence of Ca2+ resulting in some conformational changes. In the absence of consistent data, it would be prudent to assume that β-casein at 2 °C is present as a species intermediate between a totally random coil and a globular protein. A structure such as that described in the above paragraph, a prolate ellipsoid of revolution with an axial ratio of two to one and a volume of 77.6 nm³, would indeed accommodate the intermediate type of structure suggested by Garnier (1966), Noelken and Reibstein (1968), and Waugh et al. (1970). In fact, a globular protein of equal molecular weight (24,500 D) would have a volume of only 40 nm³ (Kumosinski and Pessen, 1982, 1985), which is significantly lower than the 77.6 nm³ calculated for the refined β-casein A2 model. This large volume is indicative of the relatively loose packing density which may characterize a structure intermediate between a random-walk polymer and a globular protein and which is also in accord with recent small-angle X-ray scattering results for whole casein (Pessen et al., 1991).
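The hypothetical 8-to-1 monomer discussed above can be explored numerically. Given a volume and an axial ratio, the semi-axes of a prolate ellipsoid and the radius of gyration of the corresponding uniform solid follow directly. This is a rough sketch with hypothetical function names, and it uses only the solid-ellipsoid formula Rg² = (a² + 2b²)/5. For 77.6 nm³ and a ratio of 8 it gives full axes of about 2.6 by 21 nm and an Rg near 4.8 nm, which approximates rather than reproduces the 2.6 by 20.8 nm and 4.6 nm quoted in the text.

```python
import math


def semi_axes_from_volume(volume, axial_ratio):
    """Semi-axes (a, b) of a prolate ellipsoid with a = axial_ratio * b."""
    b = (3.0 * volume / (4.0 * math.pi * axial_ratio)) ** (1.0 / 3.0)
    return axial_ratio * b, b


def solid_ellipsoid_rg(a, b):
    """Radius of gyration of a uniform solid prolate ellipsoid."""
    return math.sqrt((a * a + 2.0 * b * b) / 5.0)


VOLUME_NM3 = 77.6  # volume quoted in the text

# Compare the 2-to-1 model with the hypothetical 8-to-1 particle.
shapes = {}
for ratio in (2.0, 8.0):
    a, b = semi_axes_from_volume(VOLUME_NM3, ratio)
    shapes[ratio] = {"a": a, "b": b, "rg": solid_ellipsoid_rg(a, b)}
```

The comparison shows how strongly Rg depends on elongation at fixed volume: going from a 2-to-1 to an 8-to-1 ellipsoid roughly doubles the radius of gyration.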
Four explanations can be offered for the discrepancy between the experimentally observed Rg at 2 °C and that of the model. The first, attributable initially to Garnier (1966), is that between 4 and 8 °C a conformational change occurs which increases the degree of structure in β-casein. It is then this more structured molecule (perhaps similar to the model presented here) which participates in the aggregation reaction. The second, suggested by Pearce (1975), is that at 2 °C β-casein is unstructured, but as the self-association reaction occurs with increasing temperature, the monomer changes conformation in response to polymer formation (perhaps in a cooperative fashion), resulting in a more compact polymer structure. A third concept, drawn from Waugh et al. (1970), suggests that residual Ca2+ might cause conformational changes, leading to more compact structures. The fourth, offered here, is that the 4.6 nm Rg (Andrews et al., 1979) represents, in fact, polymers of a more structured monomer, since the monomer molecular weight is only observed at very low concentrations with chelating agents in the cold. Evidence for a physical change of some sort in β-casein between 4 and 8 °C has been presented by several investigators. Garnier (1966) observed changes in the ORD (optical rotatory dispersion) spectrum of bovine β-casein. Pearce (1975) observed marked native fluorescence changes between 2 and 10 °C, and most recently Slattery and coworkers (Javor et al., 1991) have observed changes in fluorescent dye binding and native fluorescence for human β-casein in this temperature region. The net result of all of these changes, regardless of the pathway, is that polymers of β-casein are more structured than β-casein monomers observed at 2 °C and low concentrations in the presence of EDTA. The model presented here may thus represent either the kinetically active species in the polymerization reaction or the final, more compact β-casein monomer within its polymer.

Concluding Remarks
It must be stressed that the structures presented in this work represent preliminary models. They have been partially refined, and poor van der Waals side-chain contacts have been removed, but only the β-casein model is an energy-minimized structure. For αs1- and κ-casein, electrostatic interactions, hydrogen bond formation, and backbone interactions were not taken into account. The refinement of these structures through the use of tools such as the Kollman force field (Weiner et al., 1986) will be the thrust of future work. However, it is imperative to note that even these non-refined molecular modeling techniques (when combined with predicted secondary structures and spectroscopic results) can reveal important structure-function relationships.
For β-casein A2, an energy-minimized three-dimensional model was constructed using a combination of sequence-based secondary structure prediction, global secondary structural results from Raman spectroscopy, and molecular modeling techniques. This structure is in agreement with biochemical cleavage results for plasmin and chymosin action on β-casein. It is also in agreement with other experimentally derived results from sedimentation and small-angle X-ray scattering experiments, and it provides a molecular basis for the temperature-dependent self-association of β-casein. The models of all three caseins are static, not dynamic, and do not allow for a choice between the proposed mechanisms of conformational change. However, these structures should be viewed as working models with the flexibility to be changed as more precise experiments are performed to ascertain the validity and predictability of this three-dimensional structure, which of course is the nature of any scientific hypothesis. In future work, molecular dynamics calculations will be performed on these structures to test their stability when a kinetic energy equivalent to a bulk temperature is applied. However, the current models represent a starting point and are assumed to represent an average dynamic structure. These structures will serve to allow comparisons with physico-chemical data and information from small-angle X-ray scattering. Finally, all of this information can in turn be used in the genetic design of novel bovine proteins with altered functionality by the use of site-directed mutagenesis (Richardson et al., 1992).

Discussion with Reviewers

H. E. Swaisgood: What is the expected effect of omitting solvent water from these calculations?
Authors: As you are well aware, the caseins in general have a pronounced tendency to self-associate primarily, though not exclusively, through hydrophobic interactions. In the in vacuo models one of the "signatures" for αs1- and κ-caseins is the proline-based hydrophobic turn. With neither water nor a hydrophobic calculation, these hydrophobic areas appear on the monomer surface. These hydrophobic areas may thus serve as protein-protein interaction sites and in fact become buried in the polymer, which is in line with many experimental observations. In point of fact, the absence of these terms may be a positive element in these studies. For β-casein, the absence of water in the calculations may allow the molecule to contract somewhat more than it would as a monomer, but most physical evidence suggests this occurs on association, so that the net result is that the β-monomer could represent the monomer-within-the-polymer of associated β-caseins. In both cases, the monomers may thus ultimately serve as potential building blocks for understanding casein micelle structure.

H. E. Swaisgood: How was the possibility of multiple energy minima handled?
Authors: In de novo calculations, the potential for false energy minima is great. However, in these studies we superimposed a number of constraints, including secondary structural predictions; thus not every bond begins from a random start, and the most favored sections persist. The conjugate gradient algorithm used here circumvents part but not all of the problem. Molecular dynamics calculations are needed to further test the validity of the structures presented. These are currently underway and show good promise, but the "final" answer is really not at hand. We feel that we are following a logical pattern to arrive at the best working model possible. Note the word "working" is emphasized.

M. N. Liebman: With respect to the conformation of β-casein in solution, could not the Ca2+ effects be monitored with FTIR to address its potential contribution or lack thereof?
Authors: That is a very good suggestion, and we have some preliminary data to suggest that for whole caseins there are subtle conformation changes with Ca2+. Following the conformation of β-casein in a temperature-controlled FTIR cell is an excellent suggestion for further work.

M. N. Liebman: I would encourage the adoption of conservative analysis of the projected models and encourage their comparison with other proteins to try to identify structural motifs, components, and characteristics which might be supportive of their segmental similarity or compliance with other aspects of protein structure.
Authors: These are working models and must be altered and improved as more data are acquired. In essence, these models should represent the true scientific dialectic at work.

L. K. Creamer: With respect to the nitration of tyrosine and its effects on κ-casein, nitration as used by Woychik and Wondolowski was only employed for a short time until it was found that it commonly caused cross-links between tyrosine residues. Thus, these data need to be interpreted with caution. (In our hands, at about the same time, we obtained heavily cross-linked proteins.)
Authors: This is an interesting observation because our thesis is that the hydrophobic areas containing tyrosines serve as sites for protein-protein interactions. If the cross-links are between κ-casein molecules, it serves to illustrate the point. If they are within a molecule, it would tell us something regarding the through-space distance of non-adjacent tyrosine residues in the molecule, and these distances could be calculated from the software available and could help refine the model.

L. K. Creamer: One of the problems in all data-based, or statistical, prediction systems is how to deal with uncommon amino acid residues. It is not stated how the authors dealt with phosphoserine, but from the result that phosphoserine was commonly found at β-bends, it seems likely that phosphoserine was treated as though it were serine. This is fine for predicting the structure of the unphosphorylated protein but would have drawbacks for the final protein with its high concentration of negative charges in very small volumes, such as occur in αs1- and β-casein in the absence of calcium.
Authors: The Sybyl programs allow construction of phosphoserine, and this was done. In all the past and future work we use the phosphorylated form. In some recent trials we have also compared "dephosphorylated" forms, and there are differences. The energy-minimized β-casein began and ended with the phosphorylated serine residues found by sequence analysis.

L. K. Creamer: One of the major driving forces for proteins to adopt particular structures is the so-called hydrophobic effect (see, for example, Dill, Biochemistry 29: 7133-7155). In the model-building described, the interactions of water with the protein, or the proline peptides, were apparently neglected in the energy-minimization calculations. Will any of the future work include water interactions as a part of the calculations?
Authors: Addition of solvent water is a good idea for extension of this work as our ability to do protracted calculations increases. A future need in molecular modeling for proteins and membranes would be the formulation of a reliable molecular modeling calculation for the hydrophobic effect. Such calculations are in their infancy at this time. As noted above in answer to Dr. Swaisgood's question, the absence of these calculations may not be a serious drawback, since the exposed hydrophobic surfaces on our monomers may ultimately be buried on polymerization, and the dominant physical feature of all caseins is their ability as purified proteins to self-associate or to enter into mixed associations with other caseins in micelle formation.

Figure 4. Three-dimensional molecular model of κ-casein. The peptide backbone is colored cyan, hydrophobic side chains green, acidic side chains red and basic side chains purple. Note: Figure 5 is on page 240.

Figure 6. Three-dimensional molecular model of αs1-casein. The peptide backbone is colored cyan, hydrophobic side chains green, acidic side chains red and basic side chains purple.

Figure 8. Three-dimensional model of β-casein A2. The hydrophobic side chains are colored green, acidic side chains are red and basic side chains purple. Side chain atoms of plasmin cleavage sites are colored red-orange, chymosin cleavage sites are orange and the C-terminal Val 209 is magenta. All other atoms are colored cyan, and the backbone atoms are replaced by a double ribbon which traces the backbone and is colored yellow.

Figure 5. (A) Chain trace of κ-casein; prolines indicated. The model may be represented by a "horse and rider", the "horse" representing the legged structures and residues occupying the lower two-thirds of the figure, while the "rider" centers about Pro 92. (B) Stereo view of the three-dimensional molecular model of κ-casein, showing an α-carbon chain trace without side chains, except that phenylalanine-methionine 105-106 are labeled. Note: Fig. 6 is on color plate, p. 239.

Figure 7. (A) Chain trace of αs1-casein; prolines (P) indicated. Area I represents the hydrophilic N-terminal region; area II the αs1-A deletion peptide; area III the major phosphopeptide region; area IV regions of hydrophobic β-sheet; and area V the C-terminal region. (B) Stereo view of the three-dimensional molecular model of αs1-casein, showing an α-carbon chain trace without side chains except phosphoserines; the N- and C-terminal ends of the molecule are labelled. The ticked line represents the suggested stereo center, where the center of the stereo viewer should be placed.

sheet region: comprising residues 10 to 25; 29 to 34; 39 to 45 and 41 to 55, which are connected by γ- or β-turns. The overall dimensions of the κ-casein monomer predicted by this model are 8 x 6.2 x 5.8 nm. In the following section we will attempt to reconcile some known features of the chemistry of κ-casein with this molecular model. To aid in this discussion the back-

Figure 9. (A) Chain trace of β-casein A2 with prolines (P) and serine phosphates (SP) indicated. (B) Stereo view, relaxed, of β-casein A2 structure showing an α-carbon chain trace without side chains, but with the N- and C-terminals labeled. (C) Same orientation as in A; plasmin and chymosin cleavage sites are shown with residues labeled.

Table 1. Comparison of adjusted sequence-based predictions with spectroscopic data.

Table 2. Dihedral (φ, ψ) angles assigned to specific conformational states in the initial structures for the caseins.

Table 3. Summary of the energy components (in kcal/mol) for the energy-minimized β-casein A2 model.

Table 4. Type of proline turn and sequence in β-casein.