Phenomenological Consideration of Protein Crystal Nucleation; the Physics and Biochemistry behind the Phenomenon

Physical and biochemical aspects of protein crystal nucleation can be distinguished in an appropriately designed experimental setting. From a physical perspective, the diminishing number of nucleation-active particles (and/or centers), and the appearance of nucleation exclusion zones, are two factors that act simultaneously and retard the initially fast heterogeneous nucleation, thus leading to a logistic time dependence of nuclei number density. Experimental data for protein crystal (and small-molecule droplet) nucleation are interpreted on this basis. Homogeneous nucleation considered from the same physical perspective reveals a difference—the nucleation exclusion zones lose significance as a nucleation decelerating factor when their overlapping starts. From that point on, a drop of overall system supersaturation becomes the sole decelerating factor. Despite the different scenarios of both heterogeneous and homogeneous nucleation, S-shaped time dependences of nuclei number densities are practically indistinguishable due to the exponential functions involved. The biochemically conditioned constraints imposed on the protein crystal nucleation are elucidated as well. They arise because of the highly inhomogeneous (patchy) protein molecule surface, which makes bond selection a requisite for protein crystal nucleation (and growth). Relatively simple experiments confirm this assumption.


Introduction
Biomolecule structures are essential when it comes to understanding the mechanisms of life and human genomes, and developing novel protein-based pharmaceuticals. The most powerful method for structure-function studies of biomolecules is X-ray diffraction (with complementary neutron diffraction), with Nuclear Magnetic Resonance considered as an ancillary tool only. Both X-ray and neutron diffraction require well-diffracting crystals [1]. Growing such crystals of newly-expressed proteins is, however, the major obstacle in protein structure determination. There is no recipe for their growth; usually, a trial-and-error approach is applied. Despite the numerous state-of-the-art crystallization tools employed (such as robots, automation and miniaturization of crystallization trials, Dynamic Light Scattering, crystallization screening kits, etc.), researchers' creativity and acumen remain indispensable.
Protein crystal nucleation is a prerequisite for the crystal growth of newly-expressed proteins. However, there is no theory that could adequately predict crystallization conditions. Quite often, the classical nucleation theory (CNT) is employed to give a (physical) rendition of the protein crystal nucleation process. While providing a logical explanation of the fluctuation-based mechanism and the origin of the nucleation barrier, CNT fails to predict nucleation rates correctly. In some cases, the deviations are of many orders of magnitude, e.g., [2]. In that work, applying microfluidics technologies, a localized DC electric field, and gel crystallization, the authors studied the spatial and temporal location of the nucleation event. They used a confinement effect coupled to an external localized DC electric field to evoke a desired nucleation and growth of lysozyme crystals in 20 mg/mL lysozyme, 0.7 M NaCl, in 1% agarose gel.
A reason for the discrepancy between some experiments and the CNT could be the uncertainty in determining the energy of the interface arising between the new phase and the mother phase. An interface energy variation of only 10% can alter the nucleation rate substantially, because the rate depends exponentially on the nucleation energy barrier, which in turn is proportional to the third power of the interface free energy. The issue with CNT lies in the assumption that an emerging nucleus already has the order and density of the bulk crystal. The interface is described as a sharp surface with a specific (per unit area) free energy, usually not available from direct measurements. However, Wölk et al. [3] have shown that in cases for which CNT was devised originally, such as homogeneous nucleation of water droplets, a simple empirical modification to the CNT nucleation rate (expressed by the Becker-Döring formula) yields a robust function for predicting water nucleation rates over broad ranges of temperature and supersaturation.
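The sensitivity of the nucleation rate to the interface energy can be illustrated with a minimal Python sketch. It assumes only the CNT scalings stated above (barrier proportional to the cube of the interface energy, rate proportional to exp(−ΔG*/k_BT)) at fixed supersaturation and fixed pre-exponential factor. The function names and the barrier of 30 k_BT are hypothetical, chosen for illustration only; they are not values from the cited works.

```python
import math

def barrier_ratio(gamma_change: float) -> float:
    """Factor by which the barrier changes when the interface energy is
    scaled by (1 + gamma_change), using the CNT scaling barrier ~ gamma**3."""
    return (1.0 + gamma_change) ** 3

def rate_factor(barrier_in_kT: float, gamma_change: float) -> float:
    """Ratio J_new / J_old for J ~ exp(-barrier / k_B T), with the
    supersaturation and the pre-exponential factor held fixed."""
    new_barrier = barrier_in_kT * barrier_ratio(gamma_change)
    return math.exp(-(new_barrier - barrier_in_kT))

if __name__ == "__main__":
    barrier = 30.0  # hypothetical barrier in units of k_B T
    for change in (-0.10, 0.10):
        print(f"gamma change {change:+.0%}: rate factor "
              f"{rate_factor(barrier, change):.3g}")
```

With these assumed numbers, a 10% increase in interface energy raises the barrier by about 33% and lowers the rate by more than four orders of magnitude, while a 10% decrease raises the rate by more than three orders of magnitude.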
The so-called two-step nucleation mechanism (TSNM) denies the simultaneous densification and ordering during a single nucleation event. While preserving the basic CNT concept of a fluctuation-based nucleation mechanism, TSNM assumes that nucleation is initiated via an intermediate condensed liquid appearing in the bulk solution. Being only densified, the intermediate phase preserves some similarity to the mother phase. Therefore, the phase-transition energy barrier is lowered below the one needed for the direct mother-phase-to-crystal transition occurring via the CNT mechanism. The second step in TSNM is the formation of crystal nuclei inside the highly concentrated regions. Thus, TSNM resembles Ostwald's rule of stages, which stipulates that a thermodynamically less-stable phase appears first, and then a polymorphic transition toward a stable phase occurs. Ten Wolde and Frenkel [4] have predicted theoretically the existence of amorphous precursors, which have been further confirmed experimentally by Vivares et al. [5], Sauter et al. [6], and Schubert et al. [7]. Sleutel and Van Driessche [8] have observed non-classical nucleation for the 3D liquid-to-crystal transition of glucose isomerase: the local increases in density and crystallinity do not occur simultaneously, but rather sequentially. They have demonstrated that at high concentrations (~100 mg/mL), glucose isomerase can form mesoscopic liquid-like aggregates (the molecules in them retain enough mobility), which are potential precursors of crystalline clusters. These aggregates are stable with respect to the parent liquid, and metastable compared with the crystalline phase. In contrast, glucose isomerase 2D crystal nucleation proceeds classically [9], and the existence of a critical crystal size was proved. It was also observed that the interior of all clusters is in the crystalline state and that the cluster dynamics are determined by single molecular attachment and detachment events. Whitelam presents a molecular model designed to study crystallization in the presence and absence of amorphous intermediates [10]. Based on computer simulation, he suggests tuning the relative strengths of the specific and nonspecific interactions. Thus, the relative efficiencies of the various pathways leading towards the final crystalline state have been studied. Most recently, direct transmission electron microscopic observations by Yamazaki et al. [11] have suggested a significant departure from the initial TSNM assumption. The authors never observed the formation of crystalline phases inside amorphous solid particles consisting of lysozyme molecules, which are like those previously assumed to consist of a dense liquid.
Although governed by physical laws established previously for small molecule crystallization, protein crystal nucleation is an extremely complex process. The complexity arises from the subtle interplay between process physics and biochemistry. It is the large size of the protein molecules and their highly inhomogeneous and patchy surface [12] that make the molecular-kinetic mechanism of protein crystal nucleation so specific. The protein crystal nucleation rate is reduced by a biochemical constraint associated with the strict selection of crystalline bonds. Based on experiments, this paper differentiates physical from biochemical aspects of protein crystal nucleation.

Experimental Results
Any attempt to formulate accurate predictions by amending and overcoming CNT limitations should rest upon interpretation of some basic experimental observations. For instance, experimental data show that the nuclei number density (n) of a new phase (crystals, droplets) depends simultaneously on both time (t) and supersaturation (Δμ), i.e., n = n(t, Δμ). S-shaped dependences of n vs. t at constant supersaturation have long been known for electrochemical new-phase nucleation, e.g., [13,14]. But they remained unelucidated [15] until recently, when it was shown that they obey a logistic functional dependence [16]. The same function also governs insulin crystal nucleation, large amounts of data for which can be found in [17]. Using custom-made quasi-two-dimensional all-glass cells with intentionally introduced air bubbles, n vs. t dependences were measured in this study simultaneously at four typical places: in the solution bulk, at the glass support, at the air/solution interface, and at the three-phase boundary solution/glass/air. Stationary nucleation rates were determined from the linear parts of the corresponding plots, and energy barriers for nucleus formation and nucleus sizes were estimated. By simply focusing the microscope on the upper glass plate of the cell, heterogeneous on-glass crystal nucleation is differentiated from the one in the bulk solution. It is also argued that the latter proceeds heterogeneously, on some (unknown) foreign particles of biological origin. Seven different supersaturations have been studied with BioChemika-insulin, showing that crystal nucleation in bulk solution prevails greatly [17].
Using digitized original experimental data from [14,18], logistic dependences (with very high goodness of fit, R^2) are presented in Figures 1 and 2. Such a time dependence has also been established for bovine β-lactoglobulin crystal nucleation, which proceeds by a TSNM [6]. Good logistic fits of insulin crystal nucleation data for seven different supersaturations are shown in Figure 3, where appropriate (supersaturation-dependent) parameters are used. The ratios n/n_s, showing the degree to which the saturated crystal-nuclei number densities (n_s) are approached, are plotted vs. t/t_p using Equation (2); here t_p is the time for reaching n_s, t_p = 2t_c, and t_c is the time when half of n_s is reached (namely, the mid-point of the corresponding sigmoid). Plots in Figure 4a are for bulk insulin crystal nucleation, and those in Figure 4b are for on-glass crystal nucleation. This issue will be considered below.
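One simple way to check whether n vs. t data follow a logistic dependence is to linearize it. The sketch below assumes the standard logistic form n = n_s / (1 + exp(−k(t − t_c))), assumes that the plateau value n_s is already known, and uses synthetic data. The function names and parameter values are illustrative, and this is not necessarily the fitting procedure used in [16,17].

```python
import math

def logistic(t, n_s, k, t_c):
    """Assumed logistic form: n(t) = n_s / (1 + exp(-k (t - t_c)))."""
    return n_s / (1.0 + math.exp(-k * (t - t_c)))

def fit_logistic_linearized(times, counts, n_s):
    """Estimate (k, t_c) for a known plateau n_s.

    Uses ln(n_s/n - 1) = -k*t + k*t_c, a straight line in t,
    fitted by ordinary least squares.
    """
    points = [(t, math.log(n_s / n - 1.0))
              for t, n in zip(times, counts) if 0.0 < n < n_s]
    m = len(points)
    mean_t = sum(t for t, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    k = -slope                 # slope of the line is -k
    return k, intercept / k    # intercept of the line is k * t_c

if __name__ == "__main__":
    # Synthetic, noise-free data (hypothetical parameters, arbitrary units).
    times = [float(t) for t in range(0, 21, 2)]
    counts = [logistic(t, 100.0, 0.5, 10.0) for t in times]
    print(fit_logistic_linearized(times, counts, 100.0))
```

For real data, points close to the plateau carry large relative errors after the logarithmic transform, so a direct nonlinear fit may be preferable; the linearized version is shown only because it makes the roles of k and t_c explicit.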

In conclusion, being common for both small inorganic molecules and large bio-molecules, the S-shaped dependences of nuclei number densities on time underline a common physical aspect of the nucleation processes.

Logistic Time Dependence of Protein and Small Molecule Crystal Nucleation
The fluctuation-based concept of CNT supposes that a new distribution of larger clusters starts replacing the equilibrium one immediately after supersaturation is established in the system. Since the larger the cluster is, the longer it takes for it to emerge, the critically sized cluster should appear last. Importantly, the nucleus is the cluster of maximum energy and minimum concentration. Therefore, many subcritical clusters, smaller than the critical nucleus by a single molecule only, are formed in the meantime. However, the rearrangement of the cluster size distribution does not end with the emergence of the very first critical cluster. Accommodation to the supersaturated system state continues gradually, leading to an enhanced supply of nuclei. Thus, the nucleation rate increases throughout the initial non-steady-state nucleation period.
By definition, the momentary nucleation rate (dn/dt) during the initial non-steady-state nucleation period, i.e., the rate at any point of the n vs. t graph, is given by the number (n) of nuclei formed in a unit volume (1 cm^3) divided by the (infinitesimally short) nucleation time (t). Denoting the frequency of molecule attachments leading to the formation of nuclei by k (s^−1) gives dn/dt = kn. Here, the attachment frequency k is defined as the frequency of molecule attachments to clusters which are smaller than the critical nucleus by a single molecule, minus the frequency of molecule detachments.
The attachment frequency k depends on supersaturation, which, however, remains constant during the whole nucleation process. The reason is that, owing to the extremely small nucleus volume (typically about 10^−19 cm^3), nucleation per se does not change the overall supersaturation, even during the most intensive nucleation (e.g., n approaching 10^6 cm^−3). Thus, beginning with a single nucleus, the nucleation process advances exponentially with time. Nonetheless, no unlimited nuclei augmentation is physically feasible. Experimental results show that after a rapid initial increase, the nucleation process gradually decelerates to an almost constant nucleation rate, until saturated nuclei number densities (n_s) are reached in the plateau regions of the n vs. t dependences (Figure 3). Nucleation rate changes have been attributed [16] to two retardation factors acting simultaneously for heterogeneous nucleation (different for homogeneous nucleation).
A basic assumption of CNT is that cluster sizes change continuously, which is a good approximation to reality only for large critical clusters. The consideration presented here does not suffer from such a limitation: irrespective of the mechanism involved, either CNT or TSNM, it is capable of accounting for discrete cluster size changes as well.

Heterogeneous Nucleation
During solution crystallization, heterogeneous nucleation is the pervasive process. It is the energy barrier that makes it the preferred nucleation process: the heterogeneous nucleation energy barrier is only a fraction of the energy barrier of homogeneous nucleation. Two nucleation retardation factors acting simultaneously during heterogeneous nucleation have been anticipated in [16]: (1) occupation of nucleation-active particles and/or centers (generally known as nucleants), associated with the nucleation process itself; and (2) appearance of nucleation exclusion zones (NEZ) formed around growing nuclei. NEZ gradually engulf some of the active nucleants: those situated close enough to the formed nuclei come to lie in the arising NEZ and are deactivated. This process starts soon after nucleation onset. However, as seen, NEZ do not change the overall system supersaturation.
Equation (1) shows that the maximum nucleation rate is reached when the nucleation acceleration and deceleration tendencies equilibrate, at time t_c, when n = n_s/2:

$$\left(\frac{dn}{dt}\right)_{\max} = \frac{k\,n_s}{4} \qquad (5)$$

which is the (quasi-)stationary nucleation rate mentioned above.
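The maximum in Equation (5) can be checked numerically. The sketch below assumes that Equation (1) has the standard logistic rate form dn/dt = kn(1 − n/n_s); this form is an assumption made here because it reproduces Equation (5), and the function names and parameter values are hypothetical.

```python
def logistic_rate(n, k, n_s):
    """Assumed logistic rate law: dn/dt = k * n * (1 - n / n_s)."""
    return k * n * (1.0 - n / n_s)

def max_rate_on_grid(k, n_s, steps=1000):
    """Scan n from 0 to n_s on a uniform grid and return the pair
    (n at maximum rate, maximum rate)."""
    best_n, best_rate = 0.0, 0.0
    for i in range(steps + 1):
        n = n_s * i / steps
        rate = logistic_rate(n, k, n_s)
        if rate > best_rate:
            best_n, best_rate = n, rate
    return best_n, best_rate

if __name__ == "__main__":
    k, n_s = 0.01, 1.0e6  # hypothetical values: s^-1 and cm^-3
    n_at_max, rate_max = max_rate_on_grid(k, n_s)
    print(n_at_max, rate_max, k * n_s / 4.0)
```

Under this assumed rate law, the scan locates the maximum at n = n_s/2 with the value kn_s/4, in agreement with Equation (5).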

Unambiguity of the Logistic Nucleation Time-Dependence
Figures 1 and 2 exemplify the high goodness of fit (R^2 > 0.99) of logistic plots. Considered from a physical perspective (as presented above), the good fit of experimental n/n_s vs. t/t_c data for insulin crystal nucleation (Figure 4a,b) shows the logistic nucleation time-dependence more stringently. Firstly, recalling that n = n_s when t = 2t_c, x = k(t − t_c) results in x_ns = kt_c = const.; this explains the self-adjustment between k and t_c occurring for all supersaturations. Secondly, the (orange) logistic curves in Figure 4a,b result from the logistic equation with 2kt_c = 10 (see the insets in the figures). Hence, these are standard logistic functional plots with ±kt_c = 5. Due to the exponential nature of the function, the standard logistic function covers practically its whole range of values within x = ±5 on both sides of its midpoint (Figure 5); in the case under consideration, the latter is at n_s/2. It is logical to conclude that an x-value from −6 to −5 can be attributed to the so-called nucleation induction time.
Further, an almost linear increase of n_s with Δμ is observed for insulin (BioChemika, ≥85% (GE), ~24 IU/mg) crystal nucleation in bulk solution (Figure 6). However, it is highly improbable that sets of nucleants possessing nucleation-promoting abilities which correspond exactly to each supersaturation used are present. It is rather a situation where fewer nucleants are engulfed by NEZ (and thus deactivated) at higher supersaturations.
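The normalized curve with 2kt_c = 10 can be sketched as follows. The code assumes the standard logistic form with x = k(t − t_c) and t_p = 2t_c, so that x = 2kt_c (t/t_p − 1/2); the function name is illustrative.

```python
import math

def normalized_density(t_over_tp, two_k_tc=10.0):
    """n/n_s as a function of t/t_p for the standard logistic function.

    With t_p = 2 t_c, the argument is x = k (t - t_c) = 2 k t_c (t/t_p - 1/2).
    """
    x = two_k_tc * (t_over_tp - 0.5)
    return 1.0 / (1.0 + math.exp(-x))

if __name__ == "__main__":
    for ratio in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"t/t_p = {ratio:.2f}  n/n_s = {normalized_density(ratio):.4f}")
```

With ±kt_c = 5, n/n_s is below 1% at t = 0 and above 99% at t = t_p, which is the sense in which n_s is taken as reached at t = 2t_c.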

Rate of Homogeneous Nucleation
Notwithstanding the substantially higher supersaturation required, homogeneous nucleation is indispensable in systems without nucleants. Removal of all nucleants is not an easy task for a protein solution, albeit achievable in the vapor phase. For instance, liquid droplets nucleate homogeneously upon rapid expansion and cooling of exceptionally pure water vapor. An exponential increase in water droplet nucleation rate has been measured by means of different techniques [19]. However, homogeneous nucleation could also pose an issue because no unlimited nuclei augmentation is physically feasible. An evident obstacle to observing nucleation rate limits may be the uncountable number of nucleated droplets. Additional experimental work is needed to compare nucleation rate measurement data to theoretical considerations. Until then, a theoretical approach to the issue is worth attempting.
Like the heterogeneous case, homogeneous nucleation should be a self-limiting process. Again, there are two factors decelerating it. The first one is like the one in heterogeneous nucleation, namely, an increase in the number of NEZ appearing around nuclei and diminishing the volume where nucleation can still occur. The second decelerating factor, namely, a drop in the system's overall supersaturation, is different. It comes into play only during prolonged nuclei growth and consumption of a noticeable amount of molecules. Furthermore, while the two retardation factors act in parallel in heterogeneous nucleation, in a homogeneous process they act consecutively, separated by an intermediate period. This constitutes the substantial difference between the two processes.

Effect of NEZ
As already mentioned (see Section 2.2), the overall supersaturation remains constant initially. Thus, the probability (n/α) for NEZ appearance, where α (s) is the time needed for the formation of the very first NEZ, also remains constant. The initial nucleation rate (dn/dt)_init can be expressed as: [...] where [...]. Preserving the exponential character during the initial nucleation stage, the graphical n vs. t track of the homogeneous nucleation is indistinguishable from the corresponding part of the heterogeneous nucleation curve. However, knowing that merely one decelerating factor is acting, the homogeneous n vs. t dependence should be steeper and relatively longer.

Effect of Decreasing Supersaturation
Increasing in number, sooner or later the NEZ start overlapping. From then on, the first nucleation decelerating factor loses its importance in favor of the second one, the decrease in the system's overall supersaturation. In the intermediate period, new nuclei appear in the remaining interstices between NEZ, but there is a substantial deceleration in the exponential increase of n. When Δμ decreases below the nucleation-limiting threshold, the n vs. t dependence should reach a plateau, corresponding to a zero nucleation rate. The supersaturation dependence of the nucleation rate is given by the well-known equation of Volmer. For the second homogeneous nucleation stage, it should be written as:

$$\left(\frac{dn}{dt}\right)_{\text{second}} = A \exp\left(-\frac{\Delta G^{*}}{k_B T}\right) = A \exp\left(-\frac{B}{k_B T\,\Delta\mu^{2}}\right) \qquad (7)$$

where A is a pre-exponential coefficient which denotes the number of nuclei that appear in a unit volume (1 cm^3) per unit time (1 s); ΔG* is the thermodynamic energy barrier for nucleus formation; k_B T is the thermal energy; the constant B for a homogeneously formed spherical nucleus is B = 16πΩ^2γ^3/3 (because ΔG* = 16πΩ^2γ^3/(3Δμ^2)); γ is the specific interface free energy; and Ω is the volume of a crystal building block. Qualitatively, this behavior of the system gives an S-shaped dependence of n on the elapsed time, t. However, despite the different scenarios involved in heterogeneous and homogeneous nucleation, the exponential functions make their S-shaped time-dependent nuclei number densities practically indistinguishable. Equation (7) shows that a symmetric S-shaped (logistic) curve may describe homogeneous nucleation only provided Δμ depends linearly on t, i.e., Δμ = −st, where −s is the line slope. Under constant temperature, however, the supersaturation decrease results from nuclei growth itself, making the linear dependence physically infeasible. Since the new-phase particles nucleate at different time-points, they grow to different sizes, and the size difference is amplified due to the Gibbs-Thomson effect [20]: smaller crystals grow slower than larger crystals, the reason being that the larger the crystal, the lower the saturation with which it stands in equilibrium. That is why, along with an increase in number, nucleated crystals accelerate their growth with time and increase the rate of supersaturation depletion. In view of the extremely high sensitivity of (dn/dt)_second to the Δμ-value expressed by Equation (7), only the precise function of Δμ on t (but not its linear substitute) is meaningful. Thus, in contrast to the symmetric S-shaped (logistic) curve describing heterogeneous nucleation, a non-symmetric S-shaped curve should describe the n vs. t dependence for homogeneous nucleation.
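How steeply the second-stage rate falls as supersaturation is depleted can be seen from a short numerical sketch. It assumes a Volmer-type rate A·exp(−B/(k_BT Δμ^2)) with Δμ expressed in units of k_BT, so that the only parameter in the exponent is the dimensionless group b = B/(k_BT)^3. The function name and the values a = 1 and b = 30 are hypothetical, chosen only to illustrate the trend.

```python
import math

def second_stage_rate(delta_mu, a=1.0, b=30.0):
    """Volmer-type rate a * exp(-b / delta_mu**2).

    delta_mu is the supersaturation in units of k_B T and b is the
    dimensionless group B / (k_B T)**3; both defaults are hypothetical.
    The rate is taken as zero when there is no supersaturation left.
    """
    if delta_mu <= 0.0:
        return 0.0
    return a * math.exp(-b / delta_mu ** 2)

if __name__ == "__main__":
    reference = second_stage_rate(3.0)
    for delta_mu in (3.0, 2.5, 2.0, 1.5, 1.0):
        relative = second_stage_rate(delta_mu) / reference
        print(f"delta_mu = {delta_mu:.1f} k_BT  relative rate = {relative:.3g}")
```

With these assumed values, a modest depletion of Δμ already lowers the rate by more than an order of magnitude, which is why the exact time course of Δμ, rather than a linear substitute, shapes the n vs. t curve.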

Biochemical Specificity of the Protein Crystal Nucleation
In proteins, it is only the molecule surface structure that dictates the protein's ability to bind to partners. This is because the molecular interactions in the protein bulk are concealed under the amino acid residues situated on the molecular surface. Because of millions of years of natural selection, physiological protein-protein bonds are highly specific. Proteins operate within the cellular context at typical concentrations of up to 300 mg/mL. Therefore, any non-specific inter-protein interaction may be fatal. It is known that physiological protein-protein bonds result from strong hydrophobic interactions, in which the contacting areas occupy relatively large portions of the protein molecule surface.
In contrast, the protein crystal lattice contacts are hydrophilic, polar and smaller in size [21]. Yet again, it is only the molecular surface structure that dictates proteins' ability to bind to partners in a crystallization setting. In such a setting, a limited number of discrete patches, which are the only attractive portions of the molecule, appear on the protein surface. If supersaturation is extremely high, amorphous precipitation will occur even under crystallization conditions; such disordered aggregation is a result of very strong hydrophobic protein-protein interactions. Therefore, it is logical to assume that the attraction strength between crystallizing protein molecules should be fine-tuned. Attraction should be strong enough to promote crystallization, but not so strong as to provoke amorphous precipitation. This means that protein crystal lattice contacts are also formed by a selection of the most appropriate patches on the protein molecule surface. Selection preferences have been revealed using X-ray diffraction data for protein crystal lattice contacts available in the Protein Data Bank [21,22].
Strict selection of crystal lattice contacting patches is also evidenced by relatively simple experiments [23]. Periodically alternated layer-by-layer crystal overgrowth has been observed with the unique protein couple apo- and holoferritin. Despite the dramatically different core, their surface structure is identical. Overlaying crystal layers of uniform thickness have been deposited using equimolar protein concentrations under the same solution conditions (pH value, CdSO4, and buffer concentrations). Since no reentrant corners have been observed (Figure 7), those crystals should be single crystals composed of alternating apo- and holoferritin layers, rather than polycrystals. Crystals of each protein are used as substrates for sequential, contiguous crystallization of the counterpart protein in a completely repeatable process. A monocrystalline overgrowth of three to four alternating layers of apo- on holoferritin, and vice versa, was achieved [23]. The layers are clearly distinguishable because they differ in color (apoferritin crystals are yellowish; holoferritin crystals are reddish-brown).
Crystals 2017, 7, 193
In contrast, no homoepitaxial monocrystalline overgrowth is possible with proteins possessing differing molecule surfaces. Apoferritin crystals have been purposefully introduced into solutions designed for lysozyme crystallization. No single-crystalline overgrowth, but merely the formation of polycrystalline lysozyme-apoferritin aggregates, has been observed [23]. This shows that molecule attachment to the protein crystal lattice does not occur at random. It requires selection of the binding partner. It is worth noting that no binding selection is needed for small-molecule crystallization, e.g., by electrodeposition of metal alloys.
Selection of protein-protein patchy interactions has been accounted for by the so-called bond selection mechanism (BSM) [24]. It assumes that a successful collision between protein molecules, leading to the formation of a crystalline connection, requires not only sufficiently close proximity of the protein molecules (or of molecules to clusters), but also their proper spatial orientation. Because relatively small fractions of the molecule surface are occupied by contacting patches, the arising steric restriction on protein-protein association delays the nucleation process significantly. Thus, based on the biochemical specificity of proteins, BSM explains the slow protein crystal nucleation kinetics [25]. Although requiring unusually high supersaturation, protein crystal nucleation is orders of magnitude slower than the process with small-molecule substances, e.g., during electrochemical nucleation [13,14]. Recalling that the crystal nucleation rate changes with process stages, one can only compare the (quasi-)stationary nucleation rates expressed by Equation (5). As seen in Section 2.2.1, the k-values determining nucleation frequency are 6 to 8 orders of magnitude lower for protein crystallization than the k-values for small-molecule new-phase nucleation, e.g., electrochemical nucleation (also proceeding in solutions). So, due to BSM, a much lower attachment frequency (ν_R*) of molecules to the critical cluster must be in place in the pre-exponential coefficient of Volmer's equation:

$$\frac{dn}{dt} = c_1\,\nu_R^{*}\,Z \exp\left(-\frac{\Delta G^{*}}{k_B T}\right)$$

where c_1 is the solute concentration, and Z is known as the Zeldovich factor.

Materials and Methods
Insulin crystal nucleation kinetics was studied via the so-called nucleation and growth separation principle. Two different insulin preparations, from BioChemika (≥85% (GE), ~24 IU/mg) and from SIGMA, Denmark (Lot # 080M1589V), were used under identical crystallization conditions. BioChemika insulin was shown to be more prone to crystal nucleation than SIGMA insulin. Because more crystals ensure better statistics, BioChemika insulin was preferred in our studies. Sufficient details allowing replication of the experimental studies are provided in the original paper [17].

Conclusions
The early stages of crystal nucleation dictate crystal polymorph selection, which is of great interest to the pharmaceutical industry. Unfortunately, our understanding of these stages remains insufficient [26]. Because of the molecular scale involved, numerous specifics of nucleation remain largely unknown. Even with state-of-the-art measurements, it is exceptionally challenging to probe the processes in real time. Moreover, new-phase embryos are not labeled, making it impossible to distinguish them in the vast ensemble of constantly growing and decaying clusters of different sizes. The aim of this paper is to shed some additional light on the problem.
The physical aspect of crystal nucleation is considered from a fluctuation-based perspective that covers both CNT and TSNM. Logistic functional dependences according to Equations (1) and (2), i.e., symmetric S-shaped curves, characterize heterogeneous nucleation, while homogeneous nucleation obeys non-symmetric S-shaped functional dependences. Owing to their highly inhomogeneous (patchy) surface, proteins are characterized by highly directional interactions, which substantially delay protein crystal nucleation. This is a biochemical constraint imposed on the process. Provided the molecule surface patches that enable crystal lattice formation are known, the so-called BSM hypothesis may offer clues to proper polymorph selection. Suitable crystal polymorphs can be grown by appropriately changing the solution conditions (and/or protein molecule surface residues), thus activating or deactivating different surface patches. However, it is also worth noting that the precipitants used as crystallizing agents play a specific role [27].
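The symmetry of the logistic S-shaped curve mentioned above can be illustrated with a minimal sketch. It assumes the standard logistic form, not necessarily the exact parametrization of Equations (1) and (2); the function name and the parameters `k` and `t_half` are hypothetical, and the numerical values are illustrative rather than measured data.

```python
import math


def logistic_nuclei_density(t, n_s, k, t_half):
    """Standard logistic curve (assumed form): saturates at n_s and
    passes through n_s / 2 at t = t_half; k sets the steepness."""
    return n_s / (1.0 + math.exp(-k * (t - t_half)))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the experiments.
    n_s, k, t_half = 100.0, 0.5, 20.0
    for dt in (0.0, 5.0, 10.0):
        before = logistic_nuclei_density(t_half - dt, n_s, k, t_half)
        after = logistic_nuclei_density(t_half + dt, n_s, k, t_half)
        # Point symmetry about the midpoint: the two values sum to n_s.
        print(dt, round(before, 3), round(after, 3), round(before + after, 3))
```

Values taken at equal time offsets before and after `t_half` sum to `n_s`, which is the point symmetry about the midpoint that a non-symmetric S-shaped curve would not show.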

Figure 3. Experimental data for insulin crystal nucleation in bulk solution, $n$ vs. $t$, at a series of dimensionless supersaturations, $\ln(c/c_e)$, where $c$ is the actual insulin concentration and $c_e$ is the equilibrium concentration. The corresponding dimensionless supersaturations are given on the right-hand side. For the color references, refer to the web version of this article.

Figure 4. Logistic functional plots of $n/n_s$ vs. $t/t_p$ for insulin crystal nucleation. All experimental data for the dimensionless supersaturations studied (numbers on the right-hand side) fall on the orange logistic curves (for the color references, refer to the web version of this article). The dashed straight lines through the points (0, 0) and (1, 1) are a guide for the eye only; the experimental points at $t/t_p < 0.5$ lie below the dashed straight line, while the points at greater $t/t_p$ values (up to 1) lie above it. Experimental data are plotted for: (a) bulk nucleation; (b) on-glass nucleation.

Figure 5. Standard logistic function; (0, 0.5) are the coordinates of the midpoint; in the case considered, it is $n_s/2$.
Crystals 2017, 7, 193
nucleants possessing nucleation-promoting abilities which correspond exactly to each supersaturation used are present. It is rather a situation where lesser nucleants are engulfed by NEZ (and thus deactivated) at higher supersaturations.

Figure 6. Dependence of $n_s$ in bulk insulin solution vs. dimensionless supersaturation, $\Delta\mu/k_B T = \ln(c/c_e)$.