1. Introduction
Graphene flakes (GFs) are finite molecular fragments of an otherwise infinite graphene sheet. Since the discovery of graphene [1], perfect high-symmetry GFs have become attractive model systems for computational and purely theoretical studies. Among such high-symmetry GFs, those with a hexagonal shape are the most prominent among GFs grown on Cu substrates, especially when bottom-up approaches such as chemical vapor deposition (CVD) are used [2,3,4]. The regular hexagonal shape has been theoretically proven to be a global energy minimizer for 2D graphene-based structures using Lennard-Jones-like potentials [5,6]. Furthermore, two theoretical results of key importance to the present work can be found in the works of Harary and Harborth [7] and Fülep and Sieben [8]. Using combinatorics and graph theory, they proved that building a polyhex structure by adding hexagons in a spiral way, one hexagon after another, ensures the minimal possible edge perimeter and site (vertex) perimeter for a given number of cells. From a physics point of view, a structure with the minimal possible perimeter can be assumed to be equivalent to the structure with the minimal formation energy, i.e., the most stable structure. In the case of GFs, this should correspond to the fewest possible dangling bonds.
However, a free (non-passivated) graphene edge undergoes so-called edge reconstruction to overcome the valence deficit of the edge atoms [9]. This leads to an overall lowering of the system's energy. It is widely accepted that, of the two most abundant graphene edge configurations (armchair and zig-zag), the armchair configuration has a lower energy per carbon atom and is thus more stable. The local geometry of armchair edges allows for the formation of a triple bond between two adjacent carbon (C) atoms with dangling bonds [9]. However, it has been shown experimentally that GFs interacting with a metal substrate are most stable when structured as perfect hexagons with full zig-zag edges [10]. Moreover, using CVD, especially with almost perfect epitaxial matches such as on Cu (111) surfaces, one can grow macroscopic single-crystal graphene with atomically smooth, full zig-zag edges and a perfect hexagonal shape [11,12]. In contrast, using a plasma torch to produce free GFs results in most GFs having a circular (dodecagonal) shape with mostly armchair edges [13].
It is also known that, even for GFs with minimal perimeters, the number of possible isomer structures grows exponentially [14,15]. Thus, an exhaustive search for all structurally stable GFs is not realistically feasible, even in theory, especially for large GFs. The lack of uniqueness, which follows from the exponential growth of the number of structures, naturally extends to all potentially possible growth paths. The usual approach to such problems is to obtain lower and upper bounds of the quantity of interest. In our case, this is the GF's perimeter, an indicator that corresponds directly to the number of dangling bonds. Nevertheless, the advantages of determining the most stable GF structures are evident. The overall thermodynamic stability is directly correlated to thermal and chemical stability, which are fundamental to all bottom-up formation processes and related process parameters such as reagent flow, pressure, growth speed, and nucleation time. Moreover, important physical properties of graphene, such as band gap opening, are known to be sensitive to size and shape [16]. For example, hexagonal flakes have the most prominent metallic properties among the various GF topologies.
The model is based on a multidisciplinary approach that combines techniques and results from combinatorics and graph theory, supplemented by quantum chemical calculations. A schematic representation of the general idea and its realization in the final model is shown in Figure S1 in the Supplementary Materials. Using this approach, we derive a formal relationship that shows explicitly how the GF energy depends on its structural characteristics. It is shown that the model predicts the GF energy with a relative deviation of about 2–3%, most likely maintaining this accuracy for clusters with up to about 10,000 atoms.
Our original motivation for this study was to determine the construction of graphene clusters with the fewest possible dangling bonds for a given size. It is known from purely mathematical reasoning that such configurations are ground states for graphene nanoflakes [5,6]. However, we found that the idea and methodology can be applied not only to the description of ground-state configurations but also to most morphological forms of graphene. The initial idea of accurately quantifying the edges of graphene clusters in terms of the type and number of bonds was extended to a comprehensive qualitative and quantitative description of entire clusters. The concept of representing the formation energy of compounds as a sum of their constituent bonds is longstanding and can be traced back to Herndon (1974) [17]. We found two works on graphene flakes that are relevant to our study. In the first one, Hendra and Witek [18] correlated various geometrical features (length, width, size, etc.) of rectangular graphene flakes with calculated density functional tight-binding total energies. In the second paper, Fthenakis [19] focused on GFs with a perfect hexagonal shape and provided an analytical expression for the formation/cohesive energy as a function of size, which makes it closer to our work than the first one. The limitation of studying only the perfect hexagonal shape is that there are many possible structures between two consecutive perfect flakes.
As discussed previously, the GFs with the fewest possible dangling bonds are the most stable structures, i.e., the ones with minimal formation energy [5,6,7]. In general, the formation energy of a compound can be represented as the sum of the constituent bond energies [17].
2. Theoretical Model
The model presented here is based on the idea that the formation energy (binding energy) Ebind of a GF Hm is the sum of the energy contributions of all the bonds in the flake. Hence, to calculate the formation energy of a given GF Hm, we need to classify and enumerate the chemical bonds in Hm correctly. The formation energy can generally be represented as
where i and j index the bond types and the individual bonds of each type, respectively. In our case, the bonds are classified into 4 types. A graphical presentation of all types of bonds is given in Figure 1.
Three of these types have a straightforward basis in chemistry and graph theory. The fourth type emerges as a robust conclusion only after the analysis of optimized geometries. Our assumptions are supported in this respect by the work of A. V. Vorontsov and E. V. Tretyakov [20], through both semiempirical PM7 and DFT calculations. All 4 types are based on bond length values as the most important factor determining bond strength. The enumeration of each type is explicit and exact and is based on combining results from graph theory on planar hexagonal graphs.
Some useful definitions of the main physical characteristics of GFs and their graph theory correspondence are listed in Table 1.
With V’3, we denote deg(3) vertices that have at least one connection with a V2 vertex; with V”3, we denote deg(3) vertices that are connected only to V3 vertices. With these definitions and the ones from Table 1, we make the following correspondences between the graph vertices and the hybridization states of the GF atoms:
V”3 → spc2, V2 → spe1, V’3 → spe2.
The colors encoded in Figure 1 are as follows: blue edges, spe1-spe1 bonds; red edges, spe1-spe2 bonds plus spe2-spe2 bonds (in the bay); green edges, spe2-spc2 bonds; black edges, spc2-spc2 bonds. The subscripts e and c denote the flake boundary (edge) and core, respectively.
The boundaries are defined as the union of all bonds and atoms that form the following types of C-C bonds spe1-spe1, spe1-spe2, and spe2-spe2 (bay). Transition bonds (zone) are defined as the union of {all C-C bonds spe2-spc2}, and the GF core is defined as all atoms and bonds that form spc2-spc2 types of interaction.
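The vertex classes and the resulting bond zones can be illustrated with a small script. This is only a sketch: the lattice encoding, the function names `flake_graph`, `atom_labels`, and `bond_type_counts`, and the coronene test case are choices made here for illustration and are not taken from the original work. Atoms are labelled purely by graph degree (V2 → e1, V’3 → e2, V”3 → c2), so the special bonding conventions adopted for bay motifs are not reproduced.

```python
from collections import Counter

# Six unit steps of a triangular lattice, listed in cyclic order.
STEPS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
ORDER = {"e1": 0, "e2": 1, "c2": 2}  # fixed order so keys read edge-to-core


def flake_graph(centers):
    """Adjacency sets of the carbon skeleton for hexagons at the given centers.

    Assumed encoding: hexagon centers are lattice points (i, j) with
    (i - j) % 3 == 0, and the six neighbouring points are the ring atoms.
    """
    adj = {}
    for ci, cj in centers:
        ring = [(ci + di, cj + dj) for di, dj in STEPS]
        for a, b in zip(ring, ring[1:] + ring[:1]):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def atom_labels(adj):
    """Label atoms by degree: V2 -> e1, V'3 -> e2, V''3 -> c2."""
    labels = {}
    for atom, nbrs in adj.items():
        if len(nbrs) == 2:
            labels[atom] = "e1"
        elif any(len(adj[m]) == 2 for m in nbrs):
            labels[atom] = "e2"
        else:
            labels[atom] = "c2"
    return labels


def bond_type_counts(adj):
    """Count bonds by the labels of their two end atoms."""
    labels = atom_labels(adj)
    counts = Counter()
    for a, nbrs in adj.items():
        for b in nbrs:
            if a < b:  # visit each bond once
                pair = sorted((labels[a], labels[b]), key=ORDER.get)
                counts["-".join(pair)] += 1
    return counts


# Coronene (C24): one central hexagon plus its six neighbours.
coronene = [(0, 0), (1, 1), (2, -1), (1, -2), (-1, -1), (-2, 1), (-1, 2)]
graph = flake_graph(coronene)
counts = bond_type_counts(graph)
```

For this test flake the script finds 24 atoms and four populated bond classes: e1-e1, e1-e2, e2-c2, and c2-c2. No e2-e2 bonds occur, because the flake has no bay.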
As shown in Figure 1b, when a single boundary cell shares only two edges, there are 3 consecutive equivalent C atoms with dangling bonds: (-spe1-spe1-spe1-). From a chemical point of view, only two bonding options are possible in this segment, and only one triple bond can form. The two possibilities are therefore (-spe1≡spe1-spe2-) and (-spe2=spe2=spe2-). The analysis of the bond lengths in the studied structures shows that the first scenario is found more often; hence, we adopt it in our theoretical model. Flakes with such a structural motif are observed whenever a new wall of the flake starts to form.
For the flake structure generation, we use the spiral construction by Harary and Harborth [7], shown in Figure 2.
According to the same authors, this construction ensures the smallest possible perimeter for a specific number of cells. However, it should be noted that it is not the only possible construction.
For bond enumeration, we use the following relations:
—Harary and Harborth [
7]
—Fülep and Sieben [
8]
—Harary and Harborth [
7]
where ⌈·⌉ denotes the ceiling function applied to its argument; Pe(n) is the edge perimeter and corresponds to the total number of bonds at the boundary (edge) of the flake; Pv(n), the site perimeter, is equal to the number of carbon atoms with a valence deficit (dangling bonds); v(n) is the number of vertices, i.e., the number of all carbon atoms; e is the number of edges (bonds); and n is the number of cells (hexagons).
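For orientation, the closed forms commonly quoted for minimal-perimeter (spiral) polyhexes can be written as below. This is a sketch under the assumption that these are the relations meant by Equations (2)–(5); the shorthand c(n) is introduced here only for compactness and is not part of the original notation.

```latex
\begin{align*}
c(n)   &= \left\lceil \sqrt{12n - 3}\, \right\rceil, \\
P_e(n) &= 2\,c(n), \\
P_v(n) &= c(n) + 3, \\
v(n)   &= 2n + c(n) + 1, \\
e(n)   &= n + v(n) - 1 .
\end{align*}
```

As a quick check, coronene (n = 7) gives c = 9 and hence P_e = 18, P_v = 12, v = 24, and e = 30, in agreement with the C24 flake.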
Then, based on Equations (2)–(5) and taking into account that there are 4 types of bond classification, we can rewrite Equation (1) as follows:
where x(n) is the number of (-spe1-spe1-) triple bonds, y(n) is the number of (-spe1-spe2-) bonds, z(n) is the number of (-spe2-spc2-) bonds, and t(n) is the number of (-spc2-spc2-) bonds; the coefficients α, β, γ, and δ denote the corresponding bond energy weights (measures).
The number of triple bonds (-spe1≡spe1-) for hexagonal topologies is at least 6. It can be seen (in Figure 1 and Figure 2) that the number of triple bonds for a hexagonal flake generated by the spiral construction is either 6 or 7, depending on the completion of the flake wall. In general, x = 6 + b. If the flake has completed walls, then b = 0 and x = 6. If the current wall is not completely built, i.e., the flake has a bay, then b = 1 and x = 7 [21,22]. The only exception appears when a new wall starts being built, i.e., when the first hexagonal cell connects to an already fully built zig-zag edge (Figure 1b). Here, we take b = 0, despite the presence of one bay. Figure 1 shows flakes with one bay (b = 1), for which x = 6 + b = 7. For the spiral construction, the possible values of b are 0 or 1, depending on the geometry of the flake. Generally, in the spiral construction, the parameter b can be expressed as a function of n, which would allow fully automatic generation. However, this is formally somewhat tedious and beyond the scope of our research.
Figure 2. The Harary–Harborth construction of hexagonal systems with the fewest possible external vertices and edges. The system is constructed by adding hexagons one by one along the indicated spiral line [7,22].
The total number of bonds in a graphene boundary is Pe(n), so the number of (-spe1-spe2-) bonds is y(n) = Pe(n) − x. On the other hand, it is clear from Figure 1 that there is a one-to-one correspondence between z(n) and the number of spe2 atoms. So, z(n) is the difference between the total number of edge atoms and the number of atoms with dangling bonds: z(n) = Pe(n) − Pv(n). Equation (5) is the Euler characteristic for a planar graph. Finally, for the number of core bonds t(n), we have t(n) = total number of bonds − (x + y + z). Substituting e from Equation (5), together with x(n), y(n), and z(n), into t(n) yields t(n) = n + v − 1 − 2Pe(n) + Pv(n). Finally, considering Pe(n), Pv(n), e, and v from Equations (2)–(5), we obtain the following for x(n), y(n), z(n), and t(n):
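As a minimal sketch of this bookkeeping, the counts can be evaluated numerically as below. The closed forms used for Pe(n), Pv(n), and v(n) are assumptions (the forms commonly quoted for minimal-perimeter polyhexes), the function names are hypothetical, and the bay parameter b is passed in by hand rather than derived from n.

```python
from math import isqrt


def ceil_sqrt(m: int) -> int:
    """Smallest integer r with r * r >= m, using exact integer arithmetic."""
    r = isqrt(m)
    return r if r * r == m else r + 1


def bond_counts(n: int, b: int = 0) -> dict:
    """Bond counts x, y, z, t for a spiral flake with n cells.

    Assumed closed forms: Pe = 2c, Pv = c + 3, v = 2n + c + 1,
    with c = ceil(sqrt(12n - 3)). The relations for e, x, y, z, t
    follow the text: e = n + v - 1, x = 6 + b, y = Pe - x,
    z = Pe - Pv, t = e - (x + y + z).
    """
    c = ceil_sqrt(12 * n - 3)
    pe = 2 * c          # edge perimeter (assumed form)
    pv = c + 3          # site perimeter (assumed form)
    v = 2 * n + c + 1   # number of carbon atoms (assumed form)
    e = n + v - 1       # Euler relation for a planar graph
    x = 6 + b           # triple bonds at the boundary
    y = pe - x          # remaining boundary bonds
    z = pe - pv         # transition bonds
    t = e - (x + y + z) # core bonds
    return {"atoms": v, "bonds": e, "x": x, "y": y, "z": z, "t": t}


c24 = bond_counts(7)    # coronene, completed walls (b = 0)
c54 = bond_counts(19)   # next perfect hexagon, completed walls (b = 0)
```

For the two perfect hexagons used here, the core-bond count t equals the total number of bonds of the next smaller perfect hexagon (6 for C24 and 30 for C54), which is a convenient sanity check on the assumed forms.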
Substituting x(n), y(n), z(n), and t(n) from Equations (7)–(10) into Equation (6), and after performing some algebraic transformations, we obtain
Equation (11) is the main result of our model. The coefficients α, β, γ, and δ can be determined by an appropriate fitting procedure applied to a pre-calculated set of GF energies. For the latter, the PM6 method was used. The sum of the squared differences {Emodel − EPM6} was chosen as the minimization criterion in the fitting procedure. It should be noted that Equation (11) is a nonlinear and non-smooth function, which requires a careful choice of fitting methodology. The Generalized Reduced Gradient (GRG) and Evolutionary methods were used in this case. Because of the mentioned features of Equation (11), the GRG method cannot guarantee that the calculated solution is a global minimum. The Evolutionary method, on the other hand, is suitable for non-smooth functions, but it is much more time-consuming and, due to its nature, also does not guarantee that the resulting solution is a global minimum. For these reasons, we used the following strategy: for the initial fitting, the GRG method was used until an optimal solution was found; then, in several successive steps, we used the Evolutionary method until we obtained a stable (unchanging) solution. Finally, the following values are obtained for the coefficients in Equation (11)
Now we can rewrite Equation (11) in a more useful form by replacing the coefficients with their numerical values and converting them to eV units of energy to obtain the following:
A natural way of obtaining the four unknown coefficients {α, β, γ, δ} is to construct and solve a system of four equations in the form of Equation (6), as follows:
By solving Equation (13) using a precise selection of flakes {Hn1, Hn2, Hn3, Hn4}, one can obtain results that are very close to the predictions of Equation (12). For example, using the information about the clusters {C59, C99, C144, C157}, a set of values of the coefficients {α, β, γ, δ} is derived that leads to a variant of Equation (12) with good qualitative and quantitative agreement of the predicted energy values. This example is not an isolated case. As shown in Table S1 in the Supplementary Materials, most flake predictions of Equation (12) fall within a 2% error margin from the PM6 energy, which is comparable to the results of different quantum computational methods.
Judging by the values of the coefficients {α, β, γ, δ}, their direct interpretation as the corresponding bond strengths does not seem reasonable. By definition, α, β, γ, and δ represent the bond energies of, respectively, α: (spe1-spe1), implying a bond order of 3; β: (spe1-spe2) ∪ (spe2-spe2), which has an order between single and double bonds but is closer to a double bond; γ: (spe2-spc2), which has a bond order between 1 and 2 but closer to 1; and δ: (spc2-spc2), which represents a delocalized bond with an order between a double and single bond. One might thus expect the values of the coefficients to be ordered as α > β ≥ δ > γ, but instead, the order is β > α > δ > γ.
First, we must consider that all flake formation energies are positive, resulting from the fact that in PM6, as in all semiempirical methods, only valence electrons are considered. Therefore, the absolute value of the PM6 energy is expected to be much larger than that of non-pseudopotential methods such as ab initio ones. Secondly, in the PMx methods, the parametrization for carbon yields a slightly positive formation energy for graphite, which has a standard enthalpy of formation ΔHf = 0 by definition [23]. With this parametrization, the formation energy of all flakes should be positive, as they must possess a higher formation energy than "infinite" graphite. The value of δ is very close to zero but remains positive. This is reasonable, as this coefficient represents bonds in the core of the flake, similar to those in ideal graphene. The negative value of γ may seem unusual, as we expect it to have the highest energy because this type of bond is the longest. However, as indicated in Equation (11), its functional dependence is not straightforward, and it does not change the sign of the second term. Also, in the last term of Equation (11), there is a negative sign in front of γ, so its effect on this term is to raise the energy, as expected.
The third term in Equation (11) describes the impact of triple bonds through the factor (α − β). If α > β, this term would increase the energy, which would be physically incorrect; hence, α < β. Moreover, the result β > α is extremely significant. Examining Equation (12), it becomes clear that because β > α, the third term in the equation is negative. It should be noted that this is the only negative term, leading to a decrease in energy. In other words, β > α implies that the formation of a triple bond (carbyne) along the GF boundary, due to boundary reconstruction, results in a greater energy reduction than the formation of a double (carbene) bond. In summary, the functional dependence described by Equation (11) is complex and requires further analysis.
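The term-by-term discussion above can be followed against a collected form of the bond-sum expression. The sketch below assumes the commonly quoted closed forms P_e = 2c, P_v = c + 3, and v = 2n + c + 1 with c = ⌈√(12n − 3)⌉, which are not spelled out in the text; the symbol c and the ordering of the terms are choices made here and may differ from Equation (11).

```latex
\begin{align*}
E(n) &= \alpha\,x + \beta\,y + \gamma\,z + \delta\,t, \\
x &= 6 + b, \qquad y = 2c - 6 - b, \qquad z = c - 3, \qquad t = 3n - 2c + 3, \\
E(n) &= 3\delta\,n + (2\beta + \gamma - 2\delta)\,c + (\alpha - \beta)(6 + b) + 3(\delta - \gamma),
\qquad c = \left\lceil \sqrt{12n - 3}\, \right\rceil .
\end{align*}
```

In this grouping, the third term carries the factor (α − β) and the last term carries γ with a negative sign, which is the structure the sign arguments rely on. The term in c is the source of the non-smooth dependence on n.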
3. Computational Method, Results, and Details
Here, a total of 121 graphene flakes were studied. A full list of all cluster energies predicted by the model, the PM6 energies, and the structural characteristics is given in the Supplementary Materials. The smallest cluster is C24, and the largest is C294. Each structure was generated by adding one hexagon to the previous one (Figure 2). The initial geometry of all clusters was chosen to be the same as that of perfect graphene, with all bonds set to 1.42 Å and all angles set to 120°. Geometry optimization was performed using the PM6 semiempirical method within the Gaussian09 software package [24]. A singlet spin state was assumed for all clusters. Since the investigated effect is expected to be mostly structural and geometric, we employed the PM6 method, as it provides reliable geometries for graphene structures [25,26]. If a global energy minimum was not found after geometry optimization, the optimization was performed again with a slightly changed initial geometry. This procedure was repeated until a global energy minimum was found. The energy set {E24, E27, E30, …, E292, E294} represents the most important data from the PM6 calculations, as we used it directly to build our theoretical model.
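As a sketch of how such an energy set can be turned into coefficients, note that once the bond counts are fixed, the bond-sum model is linear in α, β, γ, and δ, so ordinary least squares is one simple option. This is a swapped-in technique for illustration, not the GRG/Evolutionary procedure used in this work. The closed forms for the bond counts, the function names, and the demo values are all assumptions; in particular, the coefficients below are placeholders and not the fitted values.

```python
from fractions import Fraction
from math import isqrt


def bond_counts(n, b):
    """Return [x, y, z, t] for a spiral flake with n cells and bay parameter b.

    Assumed forms: c = ceil(sqrt(12n - 3)), Pe = 2c, Pv = c + 3, v = 2n + c + 1.
    """
    root = isqrt(12 * n - 3)
    c = root if root * root == 12 * n - 3 else root + 1
    return [6 + b, 2 * c - 6 - b, c - 3, 3 * n - 2 * c + 3]


def solve_exact(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination in exact rationals."""
    size = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(size):
        # Pick any row with a non-zero entry in this column as the pivot row.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * w for u, w in zip(aug[r], aug[col])]
    return [float(aug[i][size] / aug[i][i]) for i in range(size)]


def fit_coefficients(flakes, energies):
    """Least-squares estimate of (alpha, beta, gamma, delta) via the normal equations."""
    rows = [bond_counts(n, b) for n, b in flakes]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    aty = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(4)]
    return solve_exact(ata, aty)


# Synthetic round trip with placeholder coefficients (NOT the fitted values).
placeholder = [1.5, 2.25, -0.5, 0.125]
flakes = [(n, n % 2) for n in range(7, 40)]  # b chosen arbitrarily for the demo
energies = [sum(k * w for k, w in zip(bond_counts(n, b), placeholder))
            for n, b in flakes]
recovered = fit_coefficients(flakes, energies)
```

With real data, `energies` would hold the PM6 values and `flakes` the corresponding (n, b) pairs. The round trip above only confirms that the four bond counts are linearly independent over the demo set, so the coefficients are identifiable.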
Examples of optimized geometries of the GFs C54, C57, C59, and C64 (front and side views) are given in Figure 3.
A significant edge reconstruction is observed in all flakes. In general, bond length contraction appears in almost all edge bonds. The 4-type bond classification adopted in the model agrees well with the computational results.
In addition, the bond length analysis of most graphene flakes also validated our assumption of 4 bond types. Normally, the spiral construction as a building algorithm starts at C6. However, we decided to start with a larger structure, namely C24. The reason for this is that the properties of the small flakes are expected to deviate significantly from those of the larger flakes. This is also evident from our results. In the case of C24, the predicted energy has the largest deviation, even though it possesses a perfect bond length separation (Figure 4). Additionally, C24 is the smallest flake in which the 4-type bond classification is reasonable. Moreover, it is known that, for some small flakes such as C14, C18, and C22, the polyhex form is unstable, and after geometry optimization, it reorganizes into a large monocyclic structure [27].
The bond length distribution for some clusters is presented in Figure 4.
As we already mentioned, separating the spe2-spc2 C-C bonds as a distinct bond type is a result of the analysis of the bond length distribution. Indeed, from Figure 4, a relatively good separation of the {spe2-spc2} bonds from the remaining bond types can be observed. This is most obvious for highly symmetric flakes such as C24, C54, and C96, which possess D6h symmetry. The existence of this "transition bonds" region can be explained by the fact that the flake core is a network of three-coordinated carbon atoms, all in the sp2 hybridization state with no valence deficits. The lowest energy state is the one with the optimal delocalization of π-electrons, and this is possible only if the flake core is flat. On the other hand, the edge region contains unsaturated C atoms and can be stabilized via reconstructions. Thus, the edge has a distorted, out-of-plane structure. Hence, the flake core and edge have two mutually opposing preferred geometric states. An overall energy minimum will be reached if these two regions are separated as much as possible; hence, a transition zone of bonds with longer bond lengths appears.
Figure 4 shows a substantial overlap and scatter between the {spe1-spe2} and {spc2-spc2} bonds, especially for larger flakes. From this, one might assume that the 4-type classification is unnecessary and that a 3-type classification, which unifies the {spe1-spe2} and {spc2-spc2} bonds in one group, would be reasonable. This approach is erroneous, as it contradicts the fundamental distinction between the boundary and core regions. It would lead to qualitative and quantitative differences, because Equation (11) would change significantly. Fthenakis [19] proposes a 3-type bond model (2 types for the boundary and 1 for the core). We tested this 3-type bond model and found that its deviations from the PM6 energies are 1.5–2 times larger. Thus, a 4-type model is the optimal one with respect to precision and simplicity. The energies EPM6 and Emodel (see Equation (12)) are shown in Figure S2 (Supplementary Materials). Their qualitative and quantitative similarities are evident when plotted together (Figure 5).
From the slope of the error distribution in Figure 5 (right) and relying on the linear fit, we can conclude that the model can predict the energy of GFs with up to 10,000 atoms within a 2–3% deviation. A more detailed analysis regarding the extrapolation of the model to larger clusters can be found in the Supplementary Materials. Equation (12) allows us to make a relative estimation of the binding energy for any GF size and shape, including GFs with topologies and geometries different from the ones we used to build the model. We also tested the reliability of the model on the following six structures: (a) C108 (rectangular flake), (b) C144 (hexagonal flake with an internal 6C-atom defect), (c) C282 (dodecagonal flake), (d) C108 (one-side-elongated hexagonal flake), (e) C123 (elongated rhomboidal flake), and (f) C114 (armchair hexagonal flake).
Numerical values of the energies and the corresponding errors are given in Table 2. Each of the six flakes in Figure 6 represents unique morphological features. For example, flake (a) can be seen as a nanoribbon. Flake (b) represents a flake with an internal defect. Flakes (d) and (e) can be seen as hexagonal C96 on which, in the case of (d), only one of the sides has been grown by 2 layers of cells and, in the case of (e), two sides have been grown. The significantly larger error in the case of C144 can be explained by the fact that the internal edges differ substantially from the external ones. For example, no reconstruction of the internal edges leads to triple bonding. The overall reconstruction in this region is not as dominant as in the cases with external edges. This explains why the model predicts a lower energy for such flakes than their actual value (Table 2).
Flakes C282 and C114 are even more special. As mentioned previously, C282 is an example of the smallest flake with a satisfactory minimal perimeter (dangling bonds) and the maximal number of bays, b = 6. According to our main equation, such a flake must represent the global energy minimum. Indeed, comparing the energy of a spirally constructed C282 with that of the dodecagonal flake shown in Figure 6, there is a significant energy difference of more than 15 eV. We compare the two structures in Figure S3 (Supplementary Materials). Moreover, the growth of such dodecagonal flakes is an isotropic process; i.e., it allows simultaneous growth/etching on all six sides of the hexagonal flake. Graphene clusters with such a structure (1 < b ≤ 6) present a strong argument that a theoretical model of the growth mechanism of graphene clusters should contain such structures. Indeed, this growth pattern has been observed both experimentally [28] and theoretically in Monte Carlo simulations [29]. C114 is an example of a hexagonal flake with a fully armchair edge. As has already become clear, in the current model, which is a theory for minimal-perimeter flakes, zig-zag edges are the dominant type of boundary. On the other hand, as one can see from Table 2, the current model predicts the energy of a fully armchair flake well. Yet, we believe that establishing an analogous theoretical model for armchair graphene flakes is of emerging importance for a deeper understanding of graphene flakes.
As mentioned above, we found two publications related to our research [18,19]. The former is for hydrogen-passivated flakes, while the latter allows for only a partial energy comparison with our results. Since the model in [19] is constructed solely for perfect hexagonal structures (C24, C54, C96…), it is expected to provide a more accurate estimation for such structures than our model. In contrast, our model is about 6 to 7 times more accurate for structures that differ from perfect hexagons.
We also highlight several important aspects of the model in terms of potential applications: (i) The model can be a tool for the rough but fast estimation of the energy of a large number of cluster structures, both those isomeric to the ones studied in this work and those with a different topology. (ii) The model can be the basis for studying and constructing potential mechanisms of cluster growth near equilibrium conditions. (iii) The model can be used to establish the relationship between structural characteristics and physical properties of graphene-like nanoclusters. Geometry, topology, and size can be mathematically correlated to the band gap (hence excitation energy), frequencies of normal vibrational modes, etc. (iv) The model can be of interest as an algorithmic basis for machine learning in the field of graphene nanocluster research.
Some parallels can be drawn between our model and Monte Carlo approaches to graphene growth: (i) Both methods rely on elementary events: a single aromatic ring attachment in our case, and usually anything from a single C atom up to a single aromatic ring in the case of kinetic Monte Carlo. (ii) Energy parameters for elementary events come from independent methods, usually ab initio MD. (iii) Regarding graphene, in both approaches, the main task is to investigate the possible growth paths.
The potential improvement of the model can be achieved along the following lines: (i) The direct quantitative improvement of the predicted energies can be achieved by replacing the semiempirical PM6 method with more accurate ab initio methods. (ii) Conceptual improvements to the model can be based on the discovery of further bond decompositions of the edge and core of the cluster. (iii) Finally, we emphasize the importance of a model describing clusters with dominant armchair edges.