Review on Y6-Based Semiconductor Materials and Their Future Development via Machine Learning

Non-fullerene acceptors are promising candidates for achieving high efficiency in organic solar cells (OSCs). Y6-based acceptors, a new group of n-type semiconductors, have attracted tremendous attention since a power-conversion efficiency (PCE) of 15.7% was reported in 2019. Since then, researchers have sought to improve the efficiency from different directions, including choosing new donors, tuning the Y6 structure, and device engineering. In this review, we first summarize the basic principles and parameters of OSCs, the properties of Y6 materials, and the seven critical methods of modifying the Y6 structure to improve the PCE that have been developed in the latest three years. Finally, we share perspectives on the possibilities, necessities, challenges, and potential applications of designing multifunctional organic devices with desired performance via machine learning.


Introduction
With increasing global energy demand, developing clean and renewable energy to replace traditional sources such as fossil fuels has become an urgent problem. Among the many clean energy resources being developed, solar cells have been an area of focus because they are pollution-free, quiet, inexpensive to operate, and free of regional restrictions [1]. At present, inorganic materials [2,3] such as silicon-based semiconductors play a key role because of their high efficiency and high charge-carrier mobility. However, organic solar cells (OSCs) are promising alternatives owing to their multifunctional integration [4,5], superior processability [6,7], structural versatility, light weight, and low cost compared with their inorganic counterparts.
Organic solar cells mainly include Schottky or single-layer solar cells, double-layer heterojunction solar cells, and bulk-heterojunction (BHJ) solar cells. The BHJ solar cell showed a significant improvement in efficiency [8]. In a BHJ solar cell, the active layer, a phase-separated blend of interpenetrating donor and acceptor materials, is the essential component to be designed. In this review, we mainly focus on the material design of acceptors. One of the first efficient acceptors, fullerene (C60), was reported by Sariciftci et al. [9]. Since then, soluble fullerene-based acceptors such as PC61BM and PC71BM have been developed and have attracted extensive attention [10][11][12][13]. This attention is mainly ascribed to the unique features of these acceptors, such as a large π-conjugated system, a rigid molecular skeleton, and efficient photo-induced electron transfer [9,14]. A single-junction device containing PC71BM processed from hydrocarbon solvents has been fabricated, with the highest conversion efficiency reaching 11.5% [15]. However, this efficiency is low compared with state-of-the-art organic solar cells, which restricts further development. The limitation mainly stems from the fullerene materials themselves: (1) fullerenes and their derivatives absorb weakly, both in spectral range and in extinction coefficient [16]; (2) their energy levels are hard to adjust substantially; (3) the spherical fullerene structure crystallizes and aggregates easily, causing long-term instability of the blend morphology [17,18].
In this review, the primary working mechanism and parameters of OSCs are first introduced. Secondly, the properties of Y6 materials, along with the design strategy and structures, are summarized. Thirdly, we summarize seven different approaches developed in the latest three years for synthesizing new materials. Finally, we focus on key challenges and offer some guidance on accelerating material design and improving the PCE, mainly by using machine learning as a powerful tool to screen, design, and understand the materials.

Basic Principle of Organic Solar Cells
Before going through the latest methods of efficiency improvement, it is necessary to understand the basic principles and parameters of OSCs. The abbreviations listed in Table 1 and the molecular structures of donors, small-molecule acceptors, and solvents shown in Figure 1 are used throughout to keep the article concise. A conventional solar cell of BHJ architecture, with an active layer of phase-separated, interpenetrating donor and acceptor materials, is shown in Figure 2a. Apart from the active layer, the device contains electrodes and interlayers, with the interlayers located between the electrodes and the active layer. Metal oxides, salts, polymer blends, organic acids, small molecules, and conjugated compounds are promising interlayer materials; they serve as charge-selective transport layers and can form ohmic contacts between the electrodes and the active layer [30,31].
The basic principle of the organic solar cell is shown in Figure 2a and includes four crucial processes: exciton formation, exciton diffusion, exciton dissociation, and charge transport and collection [32]. Under optical excitation (Process 1), electrons in the donor are excited from the highest occupied molecular orbital (HOMO) to the lowest unoccupied molecular orbital (LUMO), leaving holes in the HOMO (Process 2). This creates an exciton, i.e., a Coulomb-bound electron-hole pair. When the exciton diffuses to the donor-acceptor interface, it separates into a lower-energy state. The electron transfers to the LUMO of the acceptor, and the hole remains in the HOMO of the donor, creating a charge-transfer (CT) state (Process 3). The electrons and the holes are then collected by the electrodes (Process 4). Eventually, an electromotive force is generated.

Power Conversion Efficiency
A typical graph demonstrating device performance is the I-V curve shown in Figure 2b. The most important parameter is the power-conversion efficiency (PCE or $\eta$), the ability to convert solar energy into electricity, which is defined as:

$$\eta = \frac{V_{OC} \, I_{SC} \, FF}{P_{in}}$$

where $V_{OC}$ is the open-circuit voltage; $I_{SC}$ is the short-circuit current, i.e., the current at zero voltage; and $P_{in}$ is the incident solar power, normally standardized as AM1.5 sunlight.
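As a quick numerical illustration of PCE = V_OC × J_SC × FF / P_in, the following minimal Python sketch evaluates the efficiency from device parameters. It assumes the current is given as a current density (mA cm⁻²) and the incident power as an irradiance, with the standard AM1.5G value of 100 mW cm⁻² as the default. The function name and the sample device values are illustrative assumptions, not data from any cited reference.

```python
def power_conversion_efficiency(v_oc, j_sc, ff, p_in=100.0):
    """Return the PCE in percent.

    v_oc : open-circuit voltage in V
    j_sc : short-circuit current density in mA cm^-2
    ff   : fill factor as a fraction between 0 and 1
    p_in : incident irradiance in mW cm^-2 (100 assumed for AM1.5G)
    """
    if not 0.0 <= ff <= 1.0:
        raise ValueError("fill factor must lie between 0 and 1")
    if p_in <= 0.0:
        raise ValueError("incident power must be positive")
    # V * mA cm^-2 = mW cm^-2, so the ratio to p_in is dimensionless.
    return 100.0 * v_oc * j_sc * ff / p_in


# Illustrative (hypothetical) device parameters, not measured values.
pce = power_conversion_efficiency(v_oc=0.83, j_sc=25.0, ff=0.75)
print(f"PCE = {pce:.2f} %")
```

Working with current density rather than current removes the device area from the calculation, which is why efficiencies of cells with different active areas can be compared directly.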

Fill Factor
FF is the fill factor describing the extraction of the photogenerated current [33,34], and its value can be obtained by

$$FF = \frac{V_{max} \, I_{max}}{V_{OC} \, I_{SC}} = \frac{V_{max} \, J_{max} \, A}{V_{OC} \, J_{SC} \, A}$$

where $V_{max}$ and $I_{max}$ are the voltage and current at the maximum power point, so that the numerator is the maximum rectangular area enclosed by the curve shown in Figure 2b. $A$ is the active area, and $J$ is the current density, i.e., the current divided by the area through which the current flows. Ideally, the curve is rectangular, and thus the FF is 1. However, three crucial parameters mainly affect the FF: the series resistance $R_s$, the shunt resistance $R_{sh}$, and the diode behavior [34]. The slope in region I is determined by $1/R_s$, which is mainly attributed to the resistance of the active layer, the electrodes, and the interfaces between them. Region II is governed by the photodiode equation (the active layer can be considered as pn junctions containing n-type and p-type semiconductors):

$$I = I_s \left[ \exp\left( \frac{eV}{n k_B T} \right) - 1 \right]$$

where $I_s$ is the dark current; $k_B$ is the Boltzmann constant; $e$ is the elementary charge; $T$ is the temperature; and $n$ is a correction factor controlled by recombination and dissociation. The sharp slope in region III is dominated by $1/R_{sh}$, which exists due to current leakage through pinholes and edges in the device.
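To make the link between the diode behavior and the fill factor concrete, the sketch below scans an idealized illuminated J-V curve and takes the ratio of the maximum power to V_OC × J_SC. It is a simplified model under stated assumptions: series and shunt resistances are neglected, the photocurrent is treated as a constant that is counted positive, and the parameter values (`j_ph`, `j_s`) are hypothetical rather than taken from a measured device.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp=300.0):
    """Return k_B * T / e in volts."""
    return K_B * temp / E_CHARGE


def illuminated_current_density(v, j_ph, j_s, n=1.0, temp=300.0):
    """Photocurrent minus the diode term J_s * (exp(eV / (n k_B T)) - 1)."""
    return j_ph - j_s * (math.exp(v / (n * thermal_voltage(temp))) - 1.0)


def fill_factor(j_ph, j_s, n=1.0, temp=300.0, steps=20000):
    """Estimate FF = P_max / (V_OC * J_SC) by scanning the J-V curve."""
    # Open-circuit voltage: the bias at which the net current vanishes.
    v_oc = n * thermal_voltage(temp) * math.log(j_ph / j_s + 1.0)
    j_sc = illuminated_current_density(0.0, j_ph, j_s, n, temp)
    p_max = 0.0
    for i in range(steps + 1):
        v = v_oc * i / steps
        p_max = max(p_max, v * illuminated_current_density(v, j_ph, j_s, n, temp))
    return p_max / (v_oc * j_sc)


# Hypothetical current densities in mA cm^-2.
ff_small_dark_current = fill_factor(j_ph=25.0, j_s=1e-9)
ff_large_dark_current = fill_factor(j_ph=25.0, j_s=1e-6)
print(f"FF with small dark current: {ff_small_dark_current:.3f}")
print(f"FF with large dark current: {ff_large_dark_current:.3f}")
```

In this idealized picture a larger dark current lowers both V_OC and FF; real devices fall below these values once the series and shunt resistances discussed above are included.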

Short Circuit Current Density
$J_{SC}$ was found to be the dominant factor of the PCE [35]. It is determined by the photons absorbed and by the free charge carriers generated and collected [35]. In principle, the smaller the bandgap, the greater the overlap between the absorption of the active layer and the solar spectrum, and the greater the potential to achieve a high $J_{SC}$. However, for the same voltage loss, a smaller bandgap also results in a lower $V_{OC}$. Li and coworkers [36] found that an ideal NFA with a bandgap of about 1.4 eV could reach a PCE of 19.8%.

Voltage Loss
Voltage loss ($V_{loss}$) originates from three losses [37,38] shown in Figure 2c and can be written as

$$V_{loss} = \frac{E_g}{e} - V_{OC} = \Delta V_1 + \Delta V_2 + \Delta V_3$$

where $\Delta V_1$ represents the intrinsic loss caused by radiative recombination above the bandgap; $\Delta V_2$ denotes the additional radiative recombination below the bandgap; $\Delta V_3$ stands for the loss caused by non-radiative recombination; and $E_g$ can be determined either from the absorption onset ($E_g^{onset}$) or, more accurately, from the intersection of the absorption and emission spectra ($E_g^{inter}$).
We can see that $V_{OC}$ would reach $E_g/e$ if there were no loss. $\Delta V_1$ is unavoidable and is typically between 0.25 and 0.3 V [38]. The other losses, however, can be controlled through high illumination efficiencies and an aligned band structure (a smaller energy difference between the two LUMOs) [38]. For example, the losses can be reduced significantly, from around 0.9 eV to about 0.5 eV, when a fullerene acceptor is replaced with an NFA [17,23].
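The bookkeeping behind the voltage-loss budget can be sketched in a few lines of Python. The sketch assumes the total loss is E_g/e − V_OC and that it splits into the three contributions ΔV1, ΔV2, and ΔV3 described above; the function names and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def voltage_loss(e_g_ev, v_oc):
    """Total voltage loss in V: E_g / e - V_OC (E_g in eV, V_OC in V)."""
    # Dividing an energy in eV by the elementary charge gives volts
    # with the same numerical value.
    return e_g_ev - v_oc


def below_gap_radiative_loss(e_g_ev, v_oc, dv1, dv3):
    """Return dV2 as the remainder V_loss - dV1 - dV3 (all in V)."""
    dv2 = voltage_loss(e_g_ev, v_oc) - dv1 - dv3
    if dv2 < -1e-12:
        raise ValueError("dV1 + dV3 exceeds the total voltage loss")
    return max(dv2, 0.0)


# Hypothetical example: bandgap 1.40 eV, V_OC 0.85 V,
# radiative loss above the gap 0.27 V, non-radiative loss 0.23 V.
total = voltage_loss(1.40, 0.85)
dv2 = below_gap_radiative_loss(1.40, 0.85, dv1=0.27, dv3=0.23)
print(f"V_loss = {total:.2f} V, dV2 = {dv2:.2f} V")
```

Because ΔV1 is essentially fixed by the bandgap, a budget of this kind makes it clear that any gain in V_OC has to come from the ΔV2 and ΔV3 terms.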

Synthesis and Properties of Y6 Material
For simplicity, the synthesis and properties of Y6 described below are all taken from Yuan et al. [20].

Material Design
Y6 (Figure 3a), also known as BTP-4F [21], BTP-4F-8 [39], and BTPTT-4F [40], is an n-type organic semiconductor, or non-fullerene acceptor. The molecular design was based on a recently used push-pull strategy [41][42][43][44], A-DAD-A, where "A" represents the electron-accepting (electron-withdrawing) moiety, shown in green in Figure 3a, and "D" denotes the electron-donating moiety, shown in blue in Figure 3a. Firstly, commercially available 2,1,3-benzothiadiazole (BT) [45,46] was employed as the central core; it not only serves as a building block for low-bandgap materials [8] but is also a potential candidate for thick-film devices [47] owing to the high hole mobility of the BT unit [48]. Secondly, based on the BT unit, a novel dithienothiophen[3,2-b]-pyrrolobenzothiadiazole (TPBT) unit was introduced to extend the conjugation length. Thirdly, long linear and branched alkyl chains on the "D" units were used to provide high solubility and the opportunity for preferable packing, which is discussed later in the "Progress of the Improvement in Y6 Material" section. Finally, 2-(5,6-difluoro-3-oxo-2,3-dihydro-1H-inden-1-ylidene)malononitrile (2FIC) was used as the end-capping unit, which is believed to promote intermolecular interactions through non-covalent bonding [49,50] and to enhance optical absorption [20]. Zhang and coworkers [23] reported that the delocalization of exciton and electron wave functions in Y6 and its distinctive two-dimensional π-π molecular packing in solution and thin films illustrate a good material design for highly efficient organic solar cells.
Figure 3. (a) Structure of Y6 and (b) absorption spectra of Y6 in solution and film [20]. Reprinted with permission from Ref. [20]. Copyright 2019 Elsevier Inc. (c) Synthesis of Y6 material with yield rates in each step [20]. Adapted with permission from Ref. [20]. Copyright 2019 Elsevier Inc.

Material Synthesis
Dark-blue Y6 powder was synthesized in only four steps with an overall yield of 12% [20]. The raw materials and the well-known named reactions used in the process, including Stille coupling, Cadogan reductive cyclization, the Vilsmeier-Haack reaction, and Knoevenagel condensation, are shown in Figure 3c.

Absorption Behaviors
The absorption spectra of Y6 in solution (black line) and film (red line) are shown in Figure 3b. Y6 exhibits strong and broad absorption in the 600-1000 nm region, extending to 1100 nm in the near-infrared, with a maximum absorption peak at 821 nm and an absorption coefficient of 1.07 × 10⁵ cm⁻¹. The absorption of Y6 is stronger and red-shifted compared with that of another common NFA, ITIC [19]. Moreover, low absorption in the range of 400 nm to 600 nm is observed for Y6 both in solution and in the thin film, indicating potential applications in transparent optoelectronic devices. The optical bandgap ($E_g$) of Y6 is 1.33 eV (close to 1.4 eV), calculated from $E_g = 1240 / \lambda_{onset}$ (with $E_g$ in eV and $\lambda_{onset}$ in nm), where the absorption onset ($\lambda_{onset}$) of Y6 is 931 nm.
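The conversion between absorption onset and optical bandgap can be checked with a short Python sketch. It uses the rounded constant 1240 eV nm for the product of Planck's constant and the speed of light, as in the text; the function names are illustrative.

```python
HC_EV_NM = 1240.0  # rounded value of h*c in eV nm, as used in the text


def optical_bandgap_ev(onset_nm):
    """Optical bandgap in eV from the absorption onset wavelength in nm."""
    if onset_nm <= 0.0:
        raise ValueError("onset wavelength must be positive")
    return HC_EV_NM / onset_nm


def onset_wavelength_nm(bandgap_ev):
    """Absorption onset in nm corresponding to a bandgap in eV."""
    if bandgap_ev <= 0.0:
        raise ValueError("bandgap must be positive")
    return HC_EV_NM / bandgap_ev


# Absorption onset of Y6 reported in the text: 931 nm.
print(f"E_g(Y6) = {optical_bandgap_ev(931.0):.2f} eV")
# Onset wavelength that would correspond to a 1.4 eV bandgap.
print(f"onset for 1.4 eV = {onset_wavelength_nm(1.4):.0f} nm")
```

The second call shows that a 1.4 eV bandgap corresponds to an onset somewhat below 900 nm, so the 931 nm onset of Y6 sits slightly to the red of that value.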
The absorption range of Y6 is complementary to that of PM6, which absorbs between 400 and 600 nm. Together, the two materials make full use of the photons in the solar spectrum, which is one of the main reasons for the high $J_{SC}$ of the OSCs.

Band Properties
To evaluate the practical HOMO and LUMO levels, the band properties of Y6 were measured by electrochemical cyclic voltammetry [20]. The measurements used an anhydrous acetonitrile (CH3CN) solution with Ag/AgCl as the reference electrode and ferrocene/ferrocenium (Fc/Fc+) as the internal reference. The HOMO and LUMO energy levels of Y6 are −5.65 eV and −4.10 eV, respectively [20].
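The text does not state how the voltammetry onsets were converted into energy levels. A minimal sketch of one widely used empirical convention is given below: the orbital energy is taken as the negative of the onset potential (versus Fc/Fc+) plus an assumed ferrocene level of 4.8 eV below vacuum. Both the 4.8 eV offset (other values are also used in the literature) and the onset potentials are assumptions; the onsets are back-calculated so that the reported levels of −5.65 eV and −4.10 eV are reproduced, and they are not measured values from Ref. [20].

```python
FC_LEVEL_EV = 4.8  # assumed Fc/Fc+ level below vacuum in eV (convention-dependent)


def energy_level_ev(onset_vs_fc_v, fc_level_ev=FC_LEVEL_EV):
    """Orbital energy (eV vs vacuum) from an onset potential (V vs Fc/Fc+)."""
    return -(onset_vs_fc_v + fc_level_ev)


def electrochemical_gap_ev(homo_ev, lumo_ev):
    """Difference between the LUMO and HOMO energies in eV."""
    return lumo_ev - homo_ev


# Hypothetical onsets, back-calculated from the reported energy levels.
homo = energy_level_ev(0.85)    # oxidation onset -> HOMO
lumo = energy_level_ev(-0.70)   # reduction onset -> LUMO
gap = electrochemical_gap_ev(homo, lumo)
print(f"HOMO = {homo:.2f} eV, LUMO = {lumo:.2f} eV, gap = {gap:.2f} eV")
```

The electrochemical gap obtained from the reported levels (1.55 eV) is larger than the 1.33 eV optical bandgap, which is commonly observed because the two measurements probe different processes.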
The voltage loss of the PM6:Y6-based device is 0.67 V, based on $E_g^{onset}$, which was much smaller than that of fullerene devices and of some non-fullerene devices with a smaller bandgap at that time [23,38]. PM6 is also known as PBDB-TF [51], PBDB-T-F [52], or PBDB-T-2F [23].
Other properties such as sufficient thermal stability, high electron and hole mobilities, and good morphology are also reported in Ref. [20]. In conclusion, compared with ITIC-series [53] and M-series [54] solar cells, higher utilization of sunlight, lower energy loss, and a high fill factor give rise to high efficiency in both conventional and inverted device structures [55,56], each reaching up to 15.7% [20].

Progress of the Improvement in Y6 Material
After the successful synthesis of the Y6 NFA, researchers have tried different methods to improve the material, including new acceptors that retain the main properties of Y6; new donors that match Y6, including P2F-Ehp [40], PtzBI-dF [57], PM7 [58], and D18 [59]; and device engineering such as the ternary strategy [60][61][62], the quaternary strategy [63], and layer-by-layer structures [64]. For Y6 derivatives, there are mainly seven material engineering methods, shown in Figure 4, and the following discussion is based on this figure.

Fine-Tuning the Flanking Unit
Altering the end group provides an effective method to tune the energy levels, as shown in Figure 5a. P3HT is the simplest and cheapest donor and thus the most likely to be industrialized [69]. However, in a P3HT:Y6 system, the large mismatch between the LUMO and HOMO levels of the active-layer components gives rise to a high voltage loss (0.86 V), a low $V_{OC}$ (0.45 V), and consequently a low efficiency (2.41%) [65]. With a weaker, non-fluorinated electron-withdrawing group capping the core, Y5 shows better-aligned energy levels with P3HT, leading to a PCE of 3.60% [65]. In 2020, Yang and coworkers [65] reported a new acceptor, TPBT-RCN, in which the end-capping groups of Y6 are replaced by an even weaker electron-withdrawing group. The weaker end group significantly blue-shifts the absorption onset of TPBT-RCN relative to those of Y5 and Y6. Nevertheless, better-matched energy levels, more photon absorption in the visible spectrum, and higher mobility lead to a comparable $J_{SC}$ of 16.49 mA cm⁻², a higher $V_{OC}$ of 0.81 V, and an efficiency of 5.11% [65].
In 2020, Hou and his group [66] synthesized a novel n-type semiconductor, ZY-4Cl, with another end group that enables a different band structure and a better morphology with nanoscale phase separation. Consequently, $J_{SC}$ and FF improve significantly to 16.49 mA cm⁻² and 0.65, respectively, with a small decline in $V_{OC}$ compared with that of TPBT-RCN. As a result, a P3HT:ZY-4Cl device achieves 9.5%, which was the highest reported efficiency of P3HT-based OSCs at that time [66].
In 2019, Cui and coworkers [21] replaced the F-substituted terminal unit of Y6 with a Cl-substituted one and reported the acceptor BTP-4Cl, later also known as Y7 [26]. Usually, chlorination results in a downshifted LUMO level and thus a reduced $V_{OC}$ in the device [70,71]. However, an unusual behavior was observed: BTP-4Cl (Y7) shows a red-shifted absorption compared with BTP-4F (Y6), yet it also delivers an increased $V_{OC}$ because $V_{loss}$ decreases by 0.04 eV. Together, these properties give a higher efficiency of 16.5% [21] when matched with the PM6 donor.

Creating a New Core Moiety
Synthesizing a new center core is another method to change the electronic properties, especially the band properties shown in Figure 5b.
In 2020, Zhu and coworkers [68] employed a novel imide-functionalized quinoxaline (QI) as the central unit and synthesized QIP-4Cl and QIP-4F, resulting in a deeper HOMO and a higher LUMO level, as shown in Figure 5b. These levels yield an impressively high $V_{OC}$ of 0.94 V and thus the highest efficiency of 13.3% [68] for P2F-Ehp:QIP-4Cl. This efficiency was among the best for QI-based binary devices at that time. These materials also suggest that the end group can alter the band structure as well.
In late 2020, Zhang and coworkers [67] reported Y6Se, obtained by substituting selenium for sulfur in the central building block. This strategy decreased the radiative ($\Delta V_2$) and non-radiative ($\Delta V_3$) recombination losses and also provided slightly broader absorption, higher mobility, and better photostability. Together, these gave the highest efficiency at that time, 17.7% [67].

Altering the Donor Units in A-DAD-A
Modifying the donor units in A-DAD-A, especially through the emerging asymmetric molecular design, is an effective method to control molecular stacking.
In 2020, Cai and coworkers [72] first removed a thiophene from one side of the ladder benzothiophene unit, resulting in an asymmetric molecular structure, Y21, with a larger dipole moment than that of Y6. This larger dipole moment reinforces the molecular packing, and, consequently, the efficiency of the PM6:Y21 device is 15.4% [72].
In 2021, Gao and coworkers [73] laterally fused four and five thiophene units onto benzothiadiazole to produce BP4T-4F and BP5T-4F. In addition, a highly asymmetric isomer of BP4T-4F (ABP4T-4F), with one thiophene on one side and three thiophenes on the other, was also synthesized and studied. As seen in Figure 6a-c, the signal of BP4T-4F is the most distinct, indicating favorable face-on packing, whereas that of ABP4T-4F is the least distinct. Blended with PM6, the PCEs of BP4T-4F, BP5T-4F, and ABP4T-4F are 17.1% [73], 16.7% [73], and 15.2% [73], respectively. Gao and coworkers [74] also fused six thiophene units into the Y6 core and constructed a new acceptor, BP6T-4F. In addition, an isomer of BP6T-4F, namely ABP6T-4F, was synthesized based on an asymmetric isomerization strategy. Here, however, breaking the symmetry has the opposite effect on efficiency. In Figure 6d,e, the diffraction signal of ABP6T-4F is stronger in the $q_r$ direction (stronger π-π stacking) even though it is weaker in the $q_z$ direction (weaker crystallization). The advantages of the symmetry-breaking strategy in this molecule far outweigh the disadvantages, resulting in a significantly higher efficiency for ABP6T-4F (15.8% [74]) than for BP6T-4F (6.4% [74]).

Tuning the Alkyl Chain R1
Side-chain modification at the beta position of the thiophene unit, R1, is also a flexible strategy, mainly used to control intermolecular stacking.
In 2021, Chai and coworkers [75] focused on the side-chain orientation. They designed and synthesized three isomeric NFAs named o-BTP-PhC6, m-BTP-PhC6, and p-BTP-PhC6. DFT (density functional theory) calculations (Figure 7a), in which the 2-ethylhexyl chains attached to the pyrrole rings were replaced by 2-methylpropyl groups to accelerate the calculations, show that the hexyl chain of m-BTP-PhC6 is tilted relative to the molecular plane, in contrast to the vertical chain of o-BTP-PhC6 and the horizontal chain of p-BTP-PhC6. Owing to this unique orientation, m-BTP-PhC6 exhibits the most advantageous intermolecular stacking among the three isomers. Figure 7b shows that the m-BTP-PhC6 peak is the most prominent in the $q_z$ direction between 1.5 Å⁻¹ and 2 Å⁻¹, where $q_z$ is the magnitude of the scattering vector in the z-direction; this indicates a face-on orientation beneficial to light harvesting. When the acceptor is mixed with PTQ10, the PCE of the device based on m-BTP-PhC6 reaches 17.7% [75], which is significantly better than that of the devices based on o-BTP-PhC6 (16.0% [75]) and p-BTP-PhC6 (17.1% [75]) and was the best value for a non-fullerene OSC based on PTQ10 at that time.
In 2021, Li and coworkers [22] created the L8-R acceptors (L8-BO, L8-HD, and L8-OD) by introducing branched chains at R1. Compared with the two kinds of π-π stacking in Y6, the L8-R molecules show three stacking patterns (Figure 7c), computed with Zerner's intermediate neglect of differential overlap (ZINDO) method [76]; these provide more charge-hopping channels and are thus more favorable for charge transport. Among the three NFAs, the electron mobility of L8-BO (2-butyloctyl substitution) is 6.79 × 10⁻⁴ cm² V⁻¹ s⁻¹, which is higher than the mobilities of Y6 (4.49 × 10⁻⁴ cm² V⁻¹ s⁻¹), L8-OD (4.87 × 10⁻⁴ cm² V⁻¹ s⁻¹), and L8-HD (5.54 × 10⁻⁴ cm² V⁻¹ s⁻¹) [22]. In addition, the TEM images in Figure 7d show that L8-BO has the best morphology, being uniform with no obvious aggregation to impede charge transfer. Together, these factors improve device performance. When L8-BO is blended with PM6, the efficiency of the single-junction photovoltaic device is 18.32% [22].
These results show that Y6-based materials still have room for improvement, with efficiencies possibly reaching 20% via different methods.

Extending the Conjugation Length
Extending the conjugation length is useful for absorbing more photons in the near-infrared region. The absorption and photovoltaic properties of Y6 and some tailored red-shifted materials are shown in Table 2.
In 2020, He and coworkers [77] successfully employed different types of alkyl thiophenes to extend the absorption onset of Y6. As seen from Table 2, H1, H2, and H3 show a bathochromic shift to 1016 nm. Interestingly, the different alkyl chains on the thiophenes again lead to different packing, namely, edge-on for H1, mixed edge-on and face-on for H2, and dominantly face-on for H3. Consequently, H3 exhibits the highest efficiency, 13.75%, among the three new materials. Li and coworkers [24] later applied H3 in semi-transparent organic solar cells and achieved a PCE of 8.26% [24] and an average photopic transmittance (APT) of 47.72% [24].
Hai and coworkers [78] attached vinylene to one side (BTP-1V-1F) and to both sides (BTP-2V-2F) of the Y6 core. Although BTP-1V-1F is not red-shifted as markedly as H3 and BTP-2V-2F (Table 2), its efficiency (14.24% [78]) is one of the highest values for binary single-junction OSCs based on ultra-narrow-bandgap NFAs with a bandgap below 1.29 eV [78].
Figure 7. (a,b) Side-chain orientation and molecular packing of o-BTP-PhC6, m-BTP-PhC6, and p-BTP-PhC6 [75]. (c) Electronic coupling (meV) for L8-R (L8-BO, L8-HD, L8-OD) and Y6 dimers with different crystal packing motifs. The electronic couplings for dimers of L8-R and Y6 in crystals were computed [22]. Adapted with permission from Ref. [22]. Copyright 2021 Springer Nature Limited. (d) TEM patterns of PM6:Y6 and PM6:L8-R blend films [22]. Adapted with permission from Ref. [22]. Copyright 2021 Springer Nature Limited.

Modulating the Side Chain R2
Tailoring the side chain on the pyrrole unit, R2, is mostly used to increase solubility and to enable large-scale devices while maintaining nearly the same absorption and band profiles.
Jiang and coworkers [79] first synthesized N-C11 by swapping the R1 and R2 chains of Y6. They found that N-C11 shows much lower solubility and a larger domain size (56.4 nm compared with 21 nm for Y6, measured by resonant soft X-ray scattering, RSoXS), which impairs device efficiency. It is therefore vital to keep the branched alkyl chain on the nitrogen atom to retain steric hindrance, and the branching position of the alkyl chain on the pyrrole unit was optimized further. The group also synthesized N3 and N4 by shifting the branching point of R2. Comparing the three molecules (Y6, N3, and N4) with different branching positions, they found that the acceptor branched at the third position has the best solubility, morphology (Figure 8a), and electronic properties, thus achieving the best efficiency of 16.0% [79]. Finally, a highly efficient organic solar cell containing PM6 with a PCE of 16.74% [79] was obtained using the ternary strategy, by adding a small amount of the PC71BM acceptor.

In 2019, Hong and coworkers [39] reported a novel small-molecule acceptor, BTP-4F-12, synthesized by increasing the length of the branched alkyl chain R2 of the Y6 non-fullerene acceptor (from 8 carbons to 12 carbons). This strategy improved the efficiency of the single-junction binary OSC with PM6 to 16.4% [39] owing to improved crystallinity and electron mobility. Further studies showed that replacing the donor with T1, which has even better solubility, makes it possible to fabricate OSCs from different kinds of eco-compatible solvents, with the results shown in Figure 8b. More importantly, with tetrahydrofuran (THF), a non-chlorinated and non-aromatic solvent, a high PCE of 14.4% [39] was achieved for an active area of 1.07 cm² by blade coating rather than traditional spin coating [69].
Dong and coworkers [7] in 2020 also extended the side chains and reported a non-fullerene acceptor, DTY6. When blended with the donor PM6, DTY6-based OSCs show an excellent PCE of 16.3% [7] when processed from the non-halogenated solvent o-xylene (XY), whereas Y6-based OSCs show poor device performance (PCE < 11% [7]) when processed from XY. Owing to the aggregation of Y6 in XY (Figure 8c), very large domains appear in Y6-based blend films, which lowers the efficiency of hole transfer from Y6 to PM6 and raises the non-radiative recombination loss to 0.28 eV [7]. In contrast, films based on DTY6 display a more reasonable domain size (Figure 8c), which ensures effective hole transfer from DTY6 to PM6 and a low non-radiative recombination loss (0.24 eV [7]). Opaque PM6:DTY6-based devices with a large active-layer area of 18 cm² were fabricated from o-xylene via blade coating, with a PCE of 14.4% [7].
Cui and coworkers [80] in 2019 applied this strategy to Y7 materials and synthesized a series of NFAs, BTP-4Cl-X (X = 8, 12, or 16). They blended the BTP-4Cl-X acceptors with the polymer donor PM6 and prepared OSCs by spin coating as well as by blade coating, with different active-layer areas. The results in Figure 8d show that BTP-4Cl-12 achieves the highest efficiency with both processing methods and both areas, indicating the most favorable processability and thus the most suitable morphology. More specifically, small cells with an active area of 0.09 cm² (0.06 cm² mask area) prepared by spin coating and larger cells with an active area of 0.81 cm² (1 cm² mask area) prepared by blade coating achieved PCEs of 17% and 15.5%, respectively [80].

Polymer Acceptor
All-polymer solar cells (all-PSCs), composed of a blend of a polymer donor and a polymer acceptor, have attracted much attention due to their better flexibility, easier large-scale production, and higher transparency [81][82][83][84][85]. However, the PCE of all-polymer blends still lags far behind that of their counterparts based on polymer donors and small-molecule acceptors. This is mainly due to the lack of high-performance polymer acceptors with a high absorption coefficient and a low bandgap, and to poor miscibility with polymer donors caused by the self-aggregation of polymers [86]. In this section, we discuss some tailored polymer structures (Figure 4) based on Y6.
Wang and coworkers [86] synthesized a series of PYT polymers based on the Y6 structure with low, medium, and high molecular weights, namely PYT_L, PYT_M, and PYT_H, all of which exhibit a low optical bandgap of 1.40-1.44 eV and a high absorption coefficient of over 10⁵ cm⁻¹. The relationship between molecular weight and various physical properties was also studied. PYT_M has better miscibility, higher mobility, and lower loss. Therefore, the final PYT_M:PM6 device shows a high efficiency of 13.44%, better than those based on PYT_L (12.55%) and PYT_H (8.61%) [86].
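As a rough orientation, an optical bandgap can be converted into an absorption-onset wavelength through the photon-energy relation. The rounded constant hc ≈ 1240 eV nm is a standard value and is not taken from [86]; the onset wavelengths below are therefore estimates derived here, not reported results.

```latex
\lambda_{\text{onset}} = \frac{hc}{E_g} \approx \frac{1240\ \text{eV nm}}{E_g},
\qquad
E_g = 1.40\ \text{eV} \;\Rightarrow\; \lambda_{\text{onset}} \approx 886\ \text{nm},
\qquad
E_g = 1.44\ \text{eV} \;\Rightarrow\; \lambda_{\text{onset}} \approx 861\ \text{nm}.
```

Under this estimate, the reported bandgap range corresponds to an absorption edge in the near infrared.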
Compared with the non-fluorinated compound PYT, the PYF-T reported by Yu and coworkers [87] in 2021 has a stronger and red-shifted absorption spectrum, stronger molecular packing, and higher electron mobility. At the same time, fluorination of the terminal group of PYF-T downshifts its energy levels and improves the match with the donor PM6, allowing more efficient charge transfer and a smaller voltage loss. As a result, the device efficiency of PM6:PYF-T is 14.1% [87]. A similar strategy was applied to another polymer acceptor, PFA1, by Peng and coworkers [88] in 2020, giving a dramatically improved efficiency of 15.11% when paired with PTzBi-Si, whereas the non-fluorinated counterpart gives a PCE of only 4.01% [88].
Two regiospecific polymer acceptors named PYF-T-o and PYF-T-m were prepared by Yu and coworkers [89] in 2021. Compared with the random polymer PYF-T and with PYF-T-m, both of which show weaker conjugation and a weaker intramolecular charge-transfer effect, PYF-T-o shows a more regular molecular arrangement, stronger and red-shifted absorption, and more ideal phase separation, all of which raise the device efficiency of PM6:PYF-T-o to 15.2% [89].
Based on PYF-T, Yu and coworkers [90] in 2021 reported PY2F-T, which carries two fluorine atoms on its end group, and obtained an all-polymer solar cell with an efficiency of 15.22% [90]. After Sun and coworkers [91] introduced the fluorine-free PYT as a third component into the PM6:PY2F-T host system, the PCE improved to as high as 17.2% [91] owing to a balance between material crystallization and phase separation.

Future and Perspectives
Exploring ever more powerful donor or acceptor materials should be the top priority for achieving high-efficiency OSCs. The design of organic materials is far more fundamental and complex than device-engineering issues. In terms of material development, it remains important to keep designing and modifying new materials, with an emphasis on ML-guided [92] molecular design.

Possibilities
Since the first report of OSCs in 1973 [93], around 2000 donor molecules for OSCs have been synthesized and tested in photovoltaic cells [94]. These donors provide sufficient data for high-dimensional and/or deep ML. Additionally, computing speed and ML have developed rapidly, especially with the prevalence of deep learning and graphics processing units (GPUs). Sun and coworkers [94] in 2019 established a dataset containing 1719 donor materials and used a variety of ML algorithms for binary classification. In the end, the best ML algorithm was independently validated by synthesizing 10 new OSC donor materials. The prediction accuracy for these 10 new materials was 80% [94], in good agreement with the experimental classification results. Together, these developments make it possible to use ML to assist molecular design.
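The validation step described above amounts to comparing predicted class labels with experimentally assigned ones. The following is a minimal sketch of that bookkeeping only; the labels, the threshold argument, and the function names are hypothetical placeholders and are not data or code from [94].

```python
def binary_labels(pce_values, threshold):
    """Label a material 1 (high efficiency) if its PCE is at or above
    the threshold, else 0. The threshold is an assumption of this sketch."""
    return [1 if pce >= threshold else 0 for pce in pce_values]


def accuracy(predicted, observed):
    """Fraction of materials whose predicted class matches the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Placeholder labels for ten hypothetical new materials (NOT data from [94]).
observed = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

print(f"accuracy = {accuracy(predicted, observed):.0%}")
```

With ten validation materials, each misclassification changes the accuracy by ten percentage points, so such a figure is a coarse estimate of model quality.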

Necessities
Employing ML is not only possible but also desirable and necessary. As discussed earlier in this review, from fullerene-based acceptors to non-fullerene acceptors and on to the successful Y6 material, scientists have relied on a large amount of trial and error in different aspects such as tuning the side chains, trying new flanking units, and designing new building blocks. This is admirable, since these experiments provide a huge amount of data and experience that is of great use for further development. However, the experience and accuracy of human beings are often inferior to those of computers. It is important to note that many trials fail, and the successful reports are inspiring but few. This is especially true because improving some properties usually degrades at least one other. For example, when JSC is improved, the other two parameters, VOC and FF, usually decrease.
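This trade-off matters because the three parameters enter the efficiency as a product. Using the standard definition of the power-conversion efficiency, with P_in denoting the incident light power density (a symbol introduced here for illustration):

```latex
\mathrm{PCE} = \frac{J_{SC}\, V_{OC}\, \mathrm{FF}}{P_{\mathrm{in}}},
\qquad
\frac{\Delta \mathrm{PCE}}{\mathrm{PCE}} \approx
\frac{\Delta J_{SC}}{J_{SC}} + \frac{\Delta V_{OC}}{V_{OC}} + \frac{\Delta \mathrm{FF}}{\mathrm{FF}}
\quad \text{(for small changes)}.
```

A gain in JSC therefore raises the PCE only if it outweighs the combined relative losses in VOC and FF.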

Future of Machine Learning in OSCs
The main process of combining ML with traditional design is shown in Figure 9. Datasets, including the target values to be predicted (Ys) and the relevant factors or features (Xs) such as molecular structure (M.S) and processing conditions (P.C), need to be built. Computers can only process numbers and cannot recognize molecular structures as human beings do. Therefore, molecular structures as well as other relevant parameters need to be converted to numbers by feature extractors. The Xs and Ys are then fed into the machine, which learns a model or a function via ML. After the model has been trained to the lowest loss or the highest accuracy, molecular designs can be obtained quickly.
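The workflow can be sketched in a few lines of standard-library Python: a feature extractor turns records into numbers (Xs), and a model is fitted to the targets (Ys) by minimizing a loss. Everything specific here is an assumption made for illustration: the record fields, the scaling, the linear model, the gradient-descent settings, and the synthetic targets, which are generated from an arbitrary toy rule and are not measurements.

```python
def extract_features(record):
    """Hypothetical feature extractor: map a record to a numeric vector (X)."""
    return [
        record["side_chain_carbons"] / 10.0,    # molecular structure (scaled)
        record["n_fluorine"] / 10.0,            # molecular structure (scaled)
        1.0 if record["blade_coated"] else 0.0,  # processing condition
    ]


def predict(weights, bias, x):
    """Linear model: bias + w . x."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def mean_squared_loss(weights, bias, xs, ys):
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit the linear model by batch gradient descent on the mean squared loss."""
    n = len(ys)
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, x) - y for x, y in zip(xs, ys)]
        grad_b = 2.0 * sum(errors) / n
        grad_w = [2.0 * sum(e * x[j] for e, x in zip(errors, xs)) / n
                  for j in range(len(weights))]
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def toy_target(x):
    """Arbitrary rule used only to generate synthetic Ys (not a physical model)."""
    return 1.0 + 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]


records = [  # synthetic records, invented for this sketch
    {"side_chain_carbons": 8, "n_fluorine": 4, "blade_coated": False},
    {"side_chain_carbons": 12, "n_fluorine": 4, "blade_coated": False},
    {"side_chain_carbons": 16, "n_fluorine": 4, "blade_coated": False},
    {"side_chain_carbons": 8, "n_fluorine": 0, "blade_coated": True},
    {"side_chain_carbons": 12, "n_fluorine": 2, "blade_coated": True},
    {"side_chain_carbons": 16, "n_fluorine": 0, "blade_coated": False},
]
xs = [extract_features(r) for r in records]
ys = [toy_target(x) for x in xs]

initial_loss = mean_squared_loss([0.0] * 3, 0.0, xs, ys)
weights, bias = train(xs, ys)
final_loss = mean_squared_loss(weights, bias, xs, ys)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.6f}")
```

In practice the linear model would be replaced by whichever algorithm suits the dataset; the point of the sketch is only the order of the steps: extract features, fit, then evaluate the loss.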
The tools for the ML workflow are available and user-friendly for material scientists. The key issue is setting up a high-quality database and feature extractor for a given material. Although scientists have carried out substantial research [35][94][95][96][97][98][99][100][101][102] on ML for OSCs, there are still problems that will be discussed in Section 6.3.1. More applications will be discussed in Section 6.3.2.

Key Challenges
(a) New and More Datasets
Some reports [35][98][99][100] on machine learning are based on fullerene-based single-junction OSCs, which have limited efficiency and attract less attention than NFAs. The dataset of 1719 OSC donor materials set up by Sun and coworkers [94] includes both new and old materials. The older entries are probably meaningless and even harmful for ML, because the same materials would reach much higher efficiency today owing to accumulated experience and significantly improved device-engineering strategies. Therefore, to achieve high ML accuracy, we must first establish high-quality data. Efficient methods include web crawling and natural language processing (NLP) [103], which could be developed further. Additionally, building public online platforms to update, check, and train on the data is an effective method, which could act as a magnet for research attention. In addition, more predicted values, including efficiency, transparency, FF, stability, processability parameters, and thermal properties, could be included in the dataset.

(b) Suitable Fingerprints
Most feature extractors used in ML for organic materials are based on fingerprints. There are more than 10 kinds of fingerprints and more than 15 software applications that can convert molecular structures to the corresponding fingerprints [104]. Fingerprints, which contain from 166 bits (MACCS) [105] to 13,824 bits (TGT) [106], are now mature, especially in drug discovery [107,108]. However, these fingerprints are not well suited to ML for OSCs. Firstly, the more bits a fingerprint contains, the more information it conveys, but the more time-consuming ML becomes. Moreover, these fingerprints were mainly developed for pharmaceutics, so some structural features that are trivial for pharmaceutics but may be significant for the design of OSCs are poorly captured, owing to the totally different molecular design purposes. Secondly, each additional dimension significantly increases the amount of input data needed [109]. Even if a 166-bit fingerprint contained nearly all the useful information, a 166-dimensional input would be a disaster when at most 1719 relatively low-quality data points are available.
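The scale of this mismatch can be made concrete with a crude counting argument, sketched below using only the numbers quoted above (166 and 13,824 bits, 1719 data points). The two measures, samples per input dimension and the largest fraction of distinct bit patterns the dataset could cover, are illustrative choices made here and are not a formal sample-size rule.

```python
def samples_per_dimension(n_samples, n_bits):
    """Average number of data points available per input dimension."""
    return n_samples / n_bits


def max_pattern_coverage(n_samples, n_bits):
    """Upper bound on the fraction of the 2**n_bits possible bit patterns
    that n_samples molecules can cover (each molecule covers at most one)."""
    return min(1.0, n_samples / 2 ** n_bits)


N_SAMPLES = 1719  # size of the donor dataset cited in the text

for name, n_bits in [("3-bit toy input", 3), ("MACCS", 166), ("TGT", 13824)]:
    print(
        f"{name:>15}: {samples_per_dimension(N_SAMPLES, n_bits):8.3f} samples/bit, "
        f"coverage <= {max_pattern_coverage(N_SAMPLES, n_bits):.3e}"
    )
```

Real fingerprints are sparse and strongly correlated, so the effective dimensionality is lower than the bit count suggests; the sketch only illustrates why shorter, OSC-specific descriptors are attractive for small datasets.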

Applications
Possible applications mainly include the following, sorted roughly by difficulty of implementation.
(a) Filtering before Synthesis
Filtering a material before synthesis may be the easiest application. Before synthesizing a new material, we can feed it into the trained model to see whether it is predicted to be highly efficient. If the predicted value is not high, the design can be iterated until an acceptable value is reached. This avoids some unnecessary costs of making low-efficiency materials. Note that, at the same accuracy, the algorithm should favor a high rather than a low efficiency ceiling. Besides, the predicted values can also include properties such as mobility and absorption onset, so that the design of materials for other areas, such as conducting polymers, can be guided as well.
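The filtering step can be sketched as a simple screening loop. The model below is a hypothetical stand-in for a trained predictor, and the candidate names, feature vectors, and threshold are invented for illustration.

```python
def screen_candidates(candidates, predict, threshold):
    """Return (name, predicted value) pairs at or above the threshold,
    sorted from the highest to the lowest prediction."""
    scored = [(name, predict(features)) for name, features in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def toy_model(features):
    """Hypothetical stand-in for a trained model (arbitrary coefficients)."""
    return 10.0 + 2.0 * features[0] - 1.0 * features[1]


# Hypothetical candidates described by two made-up numeric features.
candidates = {
    "candidate_A": [3.0, 1.0],
    "candidate_B": [1.0, 2.0],
    "candidate_C": [2.0, 0.0],
}

shortlist = screen_candidates(candidates, toy_model, threshold=14.0)
for name, score in shortlist:
    print(f"{name}: predicted value {score:.1f}")
```

Only the shortlisted candidates would proceed to synthesis; rejected designs are modified and screened again, which is the iteration described above.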
Based on the learned functions, the required molecules can be identified, and even the highest achievable efficiency or mobility can be derived. Take a three-dimensional input as an example: we can solve for the maximum of the function and obtain the corresponding 3-bit fingerprint, which can in turn be mapped back to the corresponding molecule.
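For the three-dimensional example, the maximum can be found by simply enumerating all eight fingerprints. The sketch below assumes a hypothetical learned function with arbitrary coefficients and a hypothetical lookup table from fingerprints back to molecule names; neither comes from the cited work.

```python
from itertools import product


def learned_function(bits):
    """Hypothetical stand-in for a trained model on a 3-bit fingerprint."""
    b0, b1, b2 = bits
    return 1.0 + 2.0 * b0 + 1.5 * b1 - 0.5 * b2 + 1.0 * b0 * b2


def best_fingerprint(model, n_bits):
    """Exhaustively search all 2**n_bits fingerprints for the highest prediction."""
    return max(product((0, 1), repeat=n_bits), key=model)


# Hypothetical mapping from each fingerprint back to a molecule label.
fingerprint_to_molecule = {
    bits: "molecule_" + "".join(str(b) for b in bits)
    for bits in product((0, 1), repeat=3)
}

best = best_fingerprint(learned_function, 3)
print(best, learned_function(best), fingerprint_to_molecule[best])
```

Exhaustive enumeration is feasible only for very short fingerprints; for inputs with hundreds of bits, a guided search would be needed, and mapping a fingerprint back to a valid, synthesizable molecule is itself a nontrivial step.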

(b) Predicting and even Unifying More Things Together
After materials are synthesized or chosen, the processing conditions can also be optimized via machine learning. Du and coworkers [102] in 2021 combined robotics and ML to screen processing conditions in terms of efficiency and photostability. Furthermore, if we set up a dataset with more predicted values or combine different datasets (Figure 10), we can combine material properties with different processing conditions to design materials or devices by extrapolation. This is particularly promising because efficiency is no longer the primary factor restricting the application of OSCs. Research now also focuses on the multifunctional integration and applications of OSCs mentioned in the Introduction. For example, it is relatively easy to achieve either a highly transparent or a highly efficient device, but it is much more difficult to achieve both characteristics simultaneously, together with stability, processability, and mechanical properties. Such a dataset and algorithm would act as a powerful tool to guide the design of multifunctional devices. ML finds rules from the beginning to the end of the process and becomes relatively accurate when given more data and important features. By contrast, it would be difficult for human beings to learn and consider all kinds of processing conditions, properties, and performances at the same time using ab initio methods, especially when the materials are in a complex external environment.
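One simple way to handle several target properties at once is to keep only the designs that are not outperformed in every property by another design, that is, the Pareto front. This technique is named here as an illustration and is not described in the cited works; the device names and the (efficiency, transparency) scores are hypothetical and in arbitrary units.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective
    and strictly better in at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(devices):
    """Keep the devices that no other device dominates."""
    return {
        name: objectives
        for name, objectives in devices.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in devices.items()
            if other_name != name
        )
    }


# Hypothetical predicted (efficiency, transparency) scores, arbitrary units.
devices = {
    "device_A": (0.9, 0.2),
    "device_B": (0.6, 0.6),
    "device_C": (0.5, 0.5),
    "device_D": (0.2, 0.9),
}

front = pareto_front(devices)
print(sorted(front))
```

Further objectives such as stability or processability can be appended to each tuple without changing the functions.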

(c) Aid Theories Behind
Understanding the theory behind the devices, especially at the atomic and even electronic levels, plays a significant role. It involves quantum mechanics and the solution of the Schrödinger equation. Although we can apply first-principles methods and existing software packages, it is very challenging and time-consuming to build an understanding from the bottom up. However, we can use machine learning to guide it. For example, we can obtain the weights and even the hidden equations that relate possible variables to the value of FF. These weights and equations can be an important guide for establishing principles and designing materials, especially when we do not fully understand the factors behind a phenomenon.

Conclusions
Recent years have witnessed remarkable efficiency improvements in organic solar cells since the high efficiency of the Y6 NFA was reported in 2019. This review started with the necessary working mechanism and summarized the key design strategies, structures, and various properties of PM6:Y6-based OSCs over the past three years. Modifying the Y6 material by tuning the side chains, increasing or decreasing the number of rings in the fused unit, altering the end unit, and changing atoms can serve as effective methods to further improve the PCE. As more and more data become available, and with the rapid improvement in computing speed and ML algorithms, the solutions offered by ML may shed light on the future development of highly efficient and multifunctional OSCs and of other applicable fields. By improving the technologies and overcoming challenges such as low-quality data and less efficient feature extractors, an era of more machine filtering and assistance before experiments may arrive in the near future.
