Accelerated Discovery of the Polymer Blends for Cartilage Repair through Data-Mining Tools and Machine-Learning Algorithm

In designing successful cartilage substitutes, the selection of scaffold materials plays a central role, among several other important factors. In an empirical approach, selecting the most appropriate polymer(s) for cartilage repair is expensive and time-consuming, as it traditionally requires numerous trials. Moreover, it is humanly impossible to go through the huge library of literature available on the potential polymer(s) and to correlate the physical, mechanical, and biological properties that might be suitable for cartilage tissue engineering. Hence, the objective of this study is to implement an inverse design approach to predict the best polymer(s)/blend(s) for cartilage repair by using a machine-learning algorithm (i.e., multinomial logistic regression (MNLR)). Initially, a systematic bibliometric analysis on cartilage repair was performed by using the bibliometrix package in R. Then, a database was created by extracting the mechanical properties of the most frequently used polymers/blends from the PoLyInfo library by using data-mining tools. Finally, an MNLR algorithm was run with the mechanical properties of the polymers that are similar to those of cartilages as the input and the polymer(s)/blends as the predicted output. The MNLR algorithm used in this study predicts the polyethylene/polyethylene-graft-poly(maleic anhydride) blend as the best candidate for cartilage repair.


Introduction
Cartilages are connective tissues mostly present at the long bones of the human body. Their primary functions are to provide lubrication and to act as a cushion against friction during movement. Damage to these tissues can occur due to trauma, obesity, aging, osteoarthritis, and several other factors. Often, even a minute tear in the cartilage leads over time to further irreversible damage [1,2]. Patients with disintegrating cartilages experience debilitating joint pain followed by restricted movement [3,4]. Alarmingly, more than 200 million people around the globe suffer from osteoarthritis [5].
Chondrocytes play a significant role by producing the extracellular matrix (ECM) sought for the repair of cartilages. However, the chondrocytes have only a limited capacity for self-renewal; this makes the cartilage repair difficult [6,7]. Therefore, the insertion of cartilage substitutes is deemed to be the potential solution. The damaged cartilages are often replaced by using several surgical procedures such as total knee replacement, microfracture, and mosaicplasty. Moreover, as a possible therapeutic option, the chondrocytes extracted from the donors are transplanted to the damaged area to reduce the severity of the disease [8]. However, the rejection of the implanted chondrocytes by the recipient(s) makes this procedure unpredictable, and finding the right donor is also troublesome. Especially in the case of chondrocytes, instability of the monolayer is a crucial obstacle [9,10]. Therefore, it is evident that none of these techniques offers long-lasting solutions to cartilage damage-related diseases [11,12].
Recently, tissue engineering has delivered promising results in the field of cartilage regeneration and repair. The 3D scaffolds play a crucial role in replacing biological tissues through the development of fully functioning load-bearing biomaterials [13,14]. Fabricated 3D scaffolds should have the capacity to considerably mimic the characteristics and functions of the extracellular matrix (ECM) of the cartilages [15]. The 3D scaffolds should acquire mechanical integrity and appropriate cell attachment, cell adhesion, and cell proliferation. Over the years, the polymers have shown tremendous potential to be molded as 3D scaffolds having the abovementioned properties [16,17]. By using many different combinations of natural and synthetic polymers, many have attempted to develop fully functioning and weight-bearing cartilages. However, the complexity of natural cartilages makes it very challenging to create the designed biological substitutes. As a whole, the advanced biomaterials typically fall short either in biomechanics or in functioning [18,19].
Moreover, the discovery of novel combinatory materials for cartilage tissue engineering typically takes a long time (i.e., 10-20 years) from the material design to commercialization. In particular, the material design procedure is one of the most tedious, time-consuming, and costly affairs in this regard [20,21]. Because going through material design and development is such a lengthy procedure, most of the time the developed product turns out to either be outdated, or the initial hypothesis of the researchers becomes irrelevant or inadequate due to the advancement of research in the respective field. Moreover, during the period of evolution of designing a commercial product, an immense amount of data is being generated in the relevant field(s). Manually, it becomes laborious and time-consuming to find and interpret the data patterns or to extract any meaningful information out of them. With the advancements in information technology, information can be retrieved from these data to implement knowledge discovery by data mining through machine-learning algorithms [22][23][24]. The process and tools of data mining provide immense help in executing the algorithms needed for material informatics [25,26]. In material informatics, a vast amount of data in the form of experimental outcomes from the previous research is being retrieved, and using the tools of machine learning, facilitation of knowledge discovery is implemented [23]. Indeed, the fourth dimension of material science involves extracting information from the literature with the aid of machine-learning algorithms. In this way, the knowledge can be retrieved by discovering the association between the data, pattern recognition, and clustering without any human intervention. This approach leads to speedy design and development of novel materials, as once the information is attained, a minimal amount of trial and error is needed to be carried out [27].
Consequently, the use of material informatics in developing new materials from the potential polymer(s) is currently in great demand. More precisely, in recent years, the implementation of material informatics and machine learning in materials science (i.e., polymer design, feature selection) has increased exponentially [26,28,29]. In material science, experimental design can be carried out through direct design and inverse design approaches. A direct or conventional design approach involves the prediction of the properties of the fabricated materials by taking "materials" as the input. Recently, with the advancement of machine learning, a new technique of material design, namely inverse design, can be implemented. Inverse design is a fully data-driven approach that predicts the target materials by putting the relevant material properties (i.e., molecular structures, physical, mechanical, thermal, biological, etc.) as the input [30,31].
For example, Venkatraman et al. (2018) used an evolutionary algorithm for the virtual screening of several classes of monomers while developing a batch of polymeric materials with a high refractive index, in order to determine which chemical groups have a major effect on increasing the refractive indices of the developed materials [32]. Similarly, Tao et al. (2021) carried out a comparative study on the capability of 79 different machine-learning algorithms to predict the glass transition temperature of polymers. The random forest was found to be ideal for predicting the glass transition temperature when a large database of polymers was used as input [33]. The heat capacity of polymers was also predicted with good accuracy by Ishikiriyama (2021) using an artificial neural network. Using data from the ATHAS data bank, the artificial neural network predicted the heat capacity with minimal error [34]. Very recently, Chen et al. (2021) synthesized a hand-crafted new polymer using machine-learning techniques. That study involved the creation of a polymerization database comprising information on the reactants, homopolymers, and polymerization paths, which was used to predict the synthesis pathway of a new polymer with the targeted properties [35]. In another study, Le (2020) used the Gaussian process regression method to predict the tensile strength of nanocomposites by setting the types and mechanical properties of the polymer matrices, the types and properties of the carbon nanotubes used as nanofillers, and the incorporation parameters as inputs [36]. While Venkatraman et al. (2018) [32] and Le (2020) [36] adopted a direct design approach, Kim et al. [30] developed a deep-learning neural network inverse design model to predict high-performance organic molecules by creating a relationship between their structure and material properties. Very recently, in 2020, Kim et al. [37] employed the inverse design approach through a neural network algorithm in which the properties of 31,713 known zeolites were used as input to predict 121 porous nanostructures.
To the best of the authors' knowledge, no study has yet been conducted to predict the polymer(s)/blend(s) that mimic human cartilages by using a machine-learning algorithm. The primary objective of this study is to implement an inverse design approach to obtain the target polymer(s)/blend(s) that exhibit properties similar to those of human cartilage. In this study, the null hypothesis assumes that the predictions of polymer/blend names in the database and in the subset are similar, whereas the alternative hypothesis is that the predictions for the two sets are dissimilar. This research was carried out in four steps. First, a systematic bibliometric analysis was carried out by using the citation data of review articles in the field of cartilage tissue engineering. Second, the relevant database was created from the PoLyInfo library by using data-mining tools. Third, a machine-learning technique (i.e., multinomial logistic regression) was used to run both single-property and multiple-property optimizations. In the final step, the machine-learning algorithm was employed to predict the polymer(s)/blends that possess functional properties similar to those of human cartilages (e.g., tensile modulus, tensile strength, and elongation at break).

Bibliometric Analysis
Bibliometric analysis is a powerful tool that gives researchers an overview of the direction in which a specific research field is heading. Its benefit lies in extracting the original articles and their citation summaries to run an overall publication analysis in a particular field of interest [38,39]. From the large group of polymers and polymer subgroups available in the market, the objective of this study was to discover the polymers/composites that are among the best for use in cartilage repair. To retrieve the major groups of polymers/composites, a bibliometric analysis was carried out. Using "cartilage" as the keyword, the titles, abstracts, and citation reports of review journal articles were extracted, and the bibliometric analysis was run in R. The top ten highly cited articles are summarized in Table 1. Each review paper linked to cartilage repair was manually reviewed, and the names of the major polymers/composites mentioned in these papers were extracted and listed. These polymers/composites were selected based on their recurrent usage in cartilage tissue engineering. The final selection of the polymers/blends was based on the availability of data in the PoLyInfo database in January 2021 and is summarized in Table 2.

Database Creation
For the success and durability of biomaterials, mechanical properties play a substantial role [50,51]. Specifically, a primary symptom of cartilage disease (i.e., osteoarthritis) is the deterioration of the mechanical properties of the cartilages [52]. Concerning the biomechanical properties of cartilages, the tensile strength, tensile modulus, and elongation at break are the most sought mechanical properties, because the main function of cartilage is to withstand the stress and compressive force exerted on the body part(s) of interest at any given moment [53]. The key mechanical properties of native articular cartilages were extracted from the literature by using data-mining tools and are summarized in Table 3. The tensile strength, tensile modulus, and elongation of natural cartilages reported in Table 3 are 35 MPa, 3-100 MPa, and 2-140%, respectively. However, at strains of 15% or less, the tensile modulus reaches only 5 to 10 MPa [54]. Therefore, the database of the polymers/composites was created taking into account these key mechanical properties of natural cartilages. PoLyInfo is a section of the NIMS materials database that extracts numerical data from relevant sources (i.e., academic articles) [58]. In this study, data-mining tools were used to retrieve from the PoLyInfo database the numerical values of the major mechanical properties of the polymers/blends used in cartilage repair. The summarized database (Table 2) includes a collection of 97 polymers/blends and their related mechanical properties. The ranges of the extracted values for each mechanical property were chosen as the input (independent variable), whereas the names of the polymers/blends were taken as the output (dependent variable, categorical in nature) for the machine-learning algorithm. The input and output variables were chosen in this way to implement the inverse design approach shown in Figure 1. Through this design approach, the names of the polymers/blends were predicted by using the properties of natural cartilage extracted from journal articles (summarized in Table 3).
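The input/output arrangement described above can be sketched as follows. This is only an illustration of the data layout, not the authors' code: the class name, field names, and the two records are hypothetical placeholders rather than PoLyInfo entries, while the cartilage query uses the values quoted in the text (3-100 MPa, 35 MPa, 2-140%).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PolymerRecord:
    """One database row: property ranges (inputs) and a blend name (output)."""
    name: str                                   # categorical output label
    tensile_modulus_mpa: Tuple[float, float]    # (min, max)
    tensile_strength_mpa: Tuple[float, float]   # (min, max)
    elongation_pct: Tuple[float, float]         # (min, max)


def to_features(rec: PolymerRecord) -> List[float]:
    """Flatten the three (min, max) ranges into one numeric input vector."""
    return [*rec.tensile_modulus_mpa, *rec.tensile_strength_mpa, *rec.elongation_pct]


# Hypothetical records for illustration only (not real PoLyInfo data).
records = [
    PolymerRecord("blend A (hypothetical)", (5.0, 120.0), (10.0, 40.0), (3.0, 150.0)),
    PolymerRecord("blend B (hypothetical)", (800.0, 1500.0), (20.0, 60.0), (50.0, 300.0)),
]

X = [to_features(r) for r in records]   # numerical inputs
y = [r.name for r in records]           # categorical outputs

# Query built from the cartilage values quoted in the text:
# modulus 3-100 MPa, strength 35 MPa (single value), elongation 2-140 %.
cartilage_query = [3.0, 100.0, 35.0, 35.0, 2.0, 140.0]
```

A fitted classifier would then be asked which label in `y` is most probable for `cartilage_query`, which is the inverse direction of predicting properties from a material.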

Multinomial Logistic Regression (MNLR)
For dealing with the categorical dependent variable with multiple levels, very few modeling techniques are available. Among those few techniques, multinomial logistic regression (MNLR) is one of the most suitable machine-learning algorithms used to model the data having multiple factors and levels. The dataset used to implement the multinomial logistic regression technique is typically categorical and has multiple levels. This approach can deduce the probability of occurrence of the output in the dataset. This regression is distinct from its linear regression as it implements a sigmoidal behavior to its data [59].
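As a minimal sketch of how such a model turns numerical inputs into class probabilities, the snippet below applies a softmax to one linear score per class. The coefficients and labels are hypothetical and chosen only to make the example readable; they are not fitted values from this study.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert per-class scores into probabilities that sum to 1."""
    shift = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """Class probabilities from one linear score per class."""
    scores = [b + sum(w_i * x_i for w_i, x_i in zip(w, x))
              for w, b in zip(weights, biases)]
    return softmax(scores)


def predict(x: Sequence[float],
            weights: Sequence[Sequence[float]],
            biases: Sequence[float],
            labels: Sequence[str]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = predict_proba(x, weights, biases)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical one-feature model with three classes (illustration only).
labels = ["polymer A (hypothetical)", "polymer B (hypothetical)", "polymer C (hypothetical)"]
weights = [[0.0], [1.0], [-1.0]]
biases = [0.0, 0.0, 0.0]

best_label, best_prob = predict([2.0], weights, biases, labels)
```

With these made-up coefficients the scores for an input of 2.0 are 0, 2, and -2, so the second label receives the highest probability.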
In multinomial logistic regression, the model calculates the probability of one level of the response being chosen over the others. The probability mass function is given by Equation (1), and Equation (2) can be used to calculate the log-likelihood function, wherein $x\beta_j$ is computed by using Equation (3), which relates it to $\ln \Pr(y = j)$. To evaluate modeled data having a categorical response variable, it is crucial to develop a relationship between the log odds and the explanatory variables. This relationship is given by Equation (4), where $x$ is the explanatory variable, the $\beta$s are the regression coefficients of the factor(s), and $p$ is the predicted probability. For a multiclass regression problem, the relationship between the input and output is given by Equation (5), where $k$ is the number of classes and the $\beta$s are the regression coefficients of the factor(s) [61].
The overall workflow of MNLR is depicted as a flowchart in Figure 2. The initial step is the preprocessing of the data, because the software cannot by itself distinguish factor (categorical) variables from numerical ones; each parameter therefore had to be declared as either numerical or categorical. The final step of the data preprocessing is the removal of the outlier(s) from the dataset. To check the accuracy of the prediction, the data were divided into training and testing sets. The training set was then fed into the algorithm and the likelihood ratio test was performed. The null deviance and the residual deviance were noted, and the model's goodness of fit was confirmed. By using the testing data without the output, a new prediction was retrieved. Once the difference between the observed and the predicted values (i.e., the residual) was at a minimum, the prediction was made by using the tensile modulus, tensile strength, and elongation at break of the natural cartilages.
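For reference, a standard textbook formulation of the multinomial logit that is consistent with the symbols defined above is sketched below. It is not necessarily identical to the article's Equations (1)-(5). Taking class $k$ as the reference category (with $\beta_k = 0$) and writing $q$ for the number of explanatory variables are assumptions made here.

```latex
\begin{align*}
\Pr(y = j \mid x) &= \frac{\exp(x\beta_j)}{\sum_{m=1}^{k} \exp(x\beta_m)},
  \qquad j = 1, \dots, k, \quad \beta_k = 0, \\
\ln L &= \sum_{i=1}^{n} \sum_{j=1}^{k} \mathbf{1}[y_i = j] \, \ln \Pr(y_i = j \mid x_i), \\
x\beta_j &= \ln \frac{\Pr(y = j \mid x)}{\Pr(y = k \mid x)}, \\
\ln \frac{p}{1 - p} &= \beta_0 + \beta_1 x_1 + \dots + \beta_q x_q
  \qquad \text{(two-class special case)} .
\end{align*}
```

The third line follows from the first: the ratio of the two class probabilities is $\exp(x\beta_j)/\exp(x\beta_k)$, and $\exp(x\beta_k) = 1$ when $\beta_k = 0$.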

Bibliometric Analysis
The most convenient and least time-consuming approach to obtain an overall standing (i.e., trend, current progress, etc.) of any research field is the bibliometric analysis. It enables the researchers to summarize the overall research trends and to develop the link between the variables in the field(s). The bibliometric analysis can be used to analyze the most evaluated component(s) in the area of the research [62,63]. Particularly in tissue engineering, a huge number of materials/blends/composites are being investigated to evaluate their efficacy to replace damaged or degrading cartilages. Among them, polymers are at the frontline in creating biomaterial substitutes (i.e., scaffolds) [64,65]. In cartilage tissue engineering, several different types and combinations of polymers are being investigated to mimic articular cartilages [8,66]. To select the most suitable polymer(s) and/or the combination of polymers, the bibliometric analysis was used in this study. The review papers were extracted from the Web of Science by using the "cartilages" and "polymers" as the keywords. The review articles' citation details were downloaded for the period from 2005-2020. By using the bibliometrix package in R program [67], a list of highly cited review papers was extracted and the top ten cited papers are being summarized in Table 1.
Upon running the bibliometric analysis using "cartilages" and "polymers" as keywords, the most recurrent words were displayed in the form of the wordcloud as shown in Figure 3. All of the keywords shown in the wordcloud appeared more than 70 times in the published literature. In Figure 3, the keywords are displayed in larger to smaller fonts depending on their recurrence in the literature. It is evident from Figure 3 that cartilage, scaffolds properties, collagen, polymers, hydrogels, mechanical strengths, and chondrocytes are found to be among the most recurrent keywords. In other words, these are the most important parameters to consider while designing a new material for cartilages repair. In this study, our focus was limited to the mechanical strength of the polymer(s) to mimic the articular cartilages. Considering the mechanical properties (i.e., tensile strength, tensile modulus, and elongation, etc.), based on the recurrent mentions in the review papers and the data available in the PoLyInfo database, the list of polymers/blends has been prepared to be used in the machine learning algorithm (shown in Table 2).
Figure 3. Wordcloud of the most recurrently (>70 times) occurring words found in the articles' abstracts using "polymers" and "cartilages" as the keywords.
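The keyword-recurrence step behind such a wordcloud can be illustrated with a short sketch. The study itself used the bibliometrix package in R. The Python below is a stand-in that only counts words across abstracts and keeps those above a threshold. The abstracts, the stopword list, and the function name are hypothetical, and the toy threshold is 1 rather than the 70 used for Figure 3.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Small illustrative stopword list (an assumption, not the one used in the study).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def keyword_counts(abstracts: Iterable[str], min_count: int) -> Dict[str, int]:
    """Count words over all abstracts and keep those occurring more than min_count times."""
    counts: Counter = Counter()
    for text in abstracts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return {word: n for word, n in counts.items() if n > min_count}


# Hypothetical abstracts for illustration only.
abstracts = [
    "Polymer scaffolds for cartilage repair",
    "Cartilage scaffolds and hydrogels",
    "Hydrogels support cartilage chondrocytes",
]

frequent = keyword_counts(abstracts, min_count=1)
```

In a wordcloud, the font size of each retained word would then scale with its count.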

Selection and Preprocessing of the Database
Depending on the load or the direction of stretching, the components of cartilages, especially the collagen fibrils and proteoglycans, move towards the direction of the load. Initially, when the tensile stress is less, only the collagen fibers' realignment occurs [68]. Once the cartilage experiences large deformation, the collagen attains a large amount of tensile stiffness due to the stretching of collagen fibers. Once the tension is removed, the collagen fibrils and proteoglycans move back to their normal position. Indeed, the viscoelasticity of cartilages in tension is best described by the mechanical properties, such as tensile strength, elongation at break, and tensile modulus [68,69]. Therefore, in this study, the ranges of the tensile strength, elongation at break, and tensile modulus have been considered for the database to take account of the viscoelastic behavior of cartilages.
Typically, in the inverse design approach, the properties of the polymers/blends are used as the input, whereas the output is the blend most suitable for the intended application. In this study, the input is the numerical range of the selected properties, and the output is a categorical variable (i.e., a string or text). The scatter plots in Figure 4 represent the raw data for tensile modulus (Figure 4a,b), tensile strength (Figure 4c,d), and elongation at break (Figure 4e,f). It is important to note that the database was created from the list of polymers/blends most recurrently used in cartilage repair. The raw data retrieved from the database contained outliers that had to be screened out before running the machine-learning algorithm. After the outliers were removed, the most concentrated data zones were selected for all three properties of interest. As shown in Figure 4, the tensile strength data are so concentrated that they almost form a straight line, whereas the tensile modulus and elongation data are somewhat more scattered. The blue rectangular boxes in Figure 4a-f mark the numerical ranges selected for tensile modulus, tensile strength, and elongation, respectively. After outlier removal, the ranges of the tensile modulus, tensile strength, and elongation were 0-2 GPa, 0-0.2 GPa, and 0-400%, respectively. These ranges are in agreement with the mechanical properties of human cartilages presented in Table 3.
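One simple way to express this screening step is a window filter that keeps only rows falling inside the selected ranges. The text does not state how the outliers were actually identified, so the filter below is an assumed stand-in. The column names and sample rows are hypothetical, while the limits are the ranges quoted above.

```python
from typing import Dict, List, Tuple

# Ranges quoted in the text after outlier removal.
LIMITS: Dict[str, Tuple[float, float]] = {
    "tensile_modulus_gpa": (0.0, 2.0),
    "tensile_strength_gpa": (0.0, 0.2),
    "elongation_pct": (0.0, 400.0),
}


def within_limits(row: Dict[str, float],
                  limits: Dict[str, Tuple[float, float]] = LIMITS) -> bool:
    """True if every listed property of the row lies inside its window."""
    return all(low <= row[key] <= high for key, (low, high) in limits.items())


def drop_outliers(rows: List[Dict[str, float]],
                  limits: Dict[str, Tuple[float, float]] = LIMITS) -> List[Dict[str, float]]:
    """Keep only the rows that fall inside all property windows."""
    return [row for row in rows if within_limits(row, limits)]


# Hypothetical rows for illustration only.
rows = [
    {"tensile_modulus_gpa": 0.8, "tensile_strength_gpa": 0.03, "elongation_pct": 120.0},
    {"tensile_modulus_gpa": 3.5, "tensile_strength_gpa": 0.05, "elongation_pct": 10.0},   # modulus too high
    {"tensile_modulus_gpa": 0.2, "tensile_strength_gpa": 0.01, "elongation_pct": 650.0},  # elongation too high
    {"tensile_modulus_gpa": 1.9, "tensile_strength_gpa": 0.19, "elongation_pct": 400.0},
]

cleaned = drop_outliers(rows)
```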

Multinomial Logistic Regression (MNLR)
In this study, numerical independent variables (i.e., the inputs) and categorical response variables (i.e., the outputs) were used. The response variable comprised 97 different polymers/blends and thus had multiple levels; hence, multinomial logistic regression (MNLR) was deemed suitable for modeling the response as a factor [59,70,71]. The numerical factors were at two levels, consisting of the minimum and maximum values of the tensile strength at yield, tensile modulus, and elongation at break. The input was either an individual factor or a combination of multiple factors for multivariable optimization. Figure 5 shows the schematic diagram of the whole simulation process. Once the data are divided into training (75%) and testing (25%) sets, the model formula and the vectors to which the data are assigned must be specified to run the machine-learning algorithm. Once the algorithm's simulation is completed, values such as the goodness of fit, the likelihood ratio, and the model's capacity to reject the null hypothesis are reviewed. Once the model efficacy is verified, the user-defined inputs are inserted into the algorithm to predict the output. For example, taking the tensile modulus of the blends as the input (i.e., a single factor), the training dataset is modeled. After modeling with the training data, the range of the tensile modulus of the cartilages was used as the testing input to predict the polymer blends whose properties are most similar to those of the cartilages. The predicted responses were the blends poly(glycolic acid)//poly(lactic acid) and poly(methyl methacrylate)//poly(epsilon-caprolactone).
The goodness of fit of the model was assessed by comparing its residual deviance ($D_m = -2\,LL_m = 1466.6345$) with the null deviance ($D_0 = -2\,LL_0 = 1763.898$) of the model that includes only the intercepts. The deviance is a measure of how poorly a model reproduces the observed data. The likelihood ratio test ($G = D_0 - D_m = 297.26296$, df = 94, p < 0.001) compares these two deviances. The null hypothesis is rejected, indicating a statistically significant decrease in deviance when the predictor (X) is included in the model. This means that the model fits the data better than the null model in terms of the correspondence between the observed and predicted conditional probabilities. The goodness of fit of the modeled data was interpreted by using the p-value; the residual deviance and its corresponding p-value are summarized in Table 4. It is evident from Table 4 that the null hypothesis was rejected for all of the independent variables, and thereby, the p-value is significant for all of the parameters (p < 0.05).
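The likelihood ratio arithmetic can be checked with a few lines of code. The sketch below uses the two deviances and the degrees of freedom quoted in the text. The function names are hypothetical, and the chi-square tail probability is computed with a closed-form series that is exact only for even degrees of freedom (df = 94 here), which avoids any dependency beyond the standard library.

```python
import math
from typing import Tuple


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0      # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(null_deviance: float,
                          residual_deviance: float,
                          df: int) -> Tuple[float, float]:
    """Return the statistic G = D0 - Dm and its chi-square p-value."""
    g = null_deviance - residual_deviance
    return g, chi2_sf_even_df(g, df)


# Deviances and df as quoted in the text.
g_stat, p_value = likelihood_ratio_test(1763.898, 1466.6345, 94)
```

The difference of the two quoted deviances is about 297.2635, which agrees with the reported G up to the rounding of the deviances, and the resulting p-value is far below 0.001.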
The MNLR model was fitted by using the neural network package in R, with 100 iterations [72]. The residual values were plotted against the fitted values to generate the scatter plot (Figure 6a), considering all three independent variables (i.e., tensile modulus, tensile strength, and elongation at break of the natural cartilages) used in this study for multivariable optimization. The scatter plot presented in Figure 6a supports the assumptions of data independence, homoscedasticity, and linearity. On inserting a tensile modulus of 3–100 MPa, an elongation of 2–140%, and a tensile strength of 35 MPa into the already fitted model, the multinomial regression model predicted the polyethene/polyethene-graft-poly(maleic anhydride) blend as the most suitable one for cartilage repair. The predicted results, along with the residual deviance for all other individual and combinatory testing inputs, are summarized in Table 5.
To confirm whether the predictions made by the MNLR model are accurate and relevant to cartilage tissue engineering, the names of the predicted polymers/blends were used as search keywords in PubMed and ScienceDirect. The search results are summarized in the form of a pie chart, as shown in Figure 6b. It was found that polyethylene and polylactic acid were mentioned together with cartilage tissue engineering 10,603 and 6214 times, respectively. The rise in the use of polycaprolactone (PCL) in the field of cartilage tissue engineering is attributed to its minimal intermolecular interaction and the high mobility of its chain segments [73,74]. This mobility facilitates the design of PCL scaffolds in the form of composites, foams, and fibers [75,76]. Blending natural polymers such as chitosan and collagen with PCL has been shown to improve its crosslinking as well as its mechanical properties.
Moreover, the application of crosslinking agents to hydrophobic polymers is known to improve their water uptake and convert them to hydrophilic polymers [77,78]. The mechanical properties of PCL are highly influenced by its molecular weight. For instance, a scaffold made of PCL with a molecular weight of 15,000 g/mol is known to exhibit brittle characteristics, whereas a scaffold made of PCL with a molecular weight of 40,000 g/mol is soft and semicrystalline in nature [73,74]. A second attractive feature of PCL and polyethylene is their biodegradability: they dissolve readily in the presence of enzyme activity and follow a natural metabolic pathway. PCL is also used as a crosslinking agent in many studies, as it blends easily with most polymers [73,74].
Similarly, the mechanical properties of polyethylene are highly influenced by its molecular weight. The best wear performance is observed with polyethylene consisting of 1 million repeating units. High-density polyethylene is known to have a lesser degree of branching, which accounts for its favorable intermolecular forces and tensile strength. Along with the molecular weight, crosslinking agents and crystallinity also affect its mechanical properties. Normally, polymers with a lower crystallization temperature contain numerous amorphous regions, which weaken the overall mechanical strength of the material. However, the addition of nanofillers, acting as nucleation agents, to polyethylene is known to increase its crystallinity and mechanical properties [79,80].
In addition, the mechanical properties of polymers or blends are influenced by several factors, such as the molecular weight, the degree of polymerization, and the crosslinking agent. These influences vary from one polymer to another. For example, as mentioned earlier, an increase in the molecular weight of PCL (above 40,000 g/mol) improves its mechanical properties. However, in the case of poly(ethylene glycol) diacrylate, a rise in the molecular weight of the polymer blend improves its mechanical properties but has a negative impact on cell growth [81]. The crystallinity of the polymer also contributes to its mechanical properties: the mechanical strength increases with a rise in crystallinity. However, studies indicate that cells attach more readily to the amorphous regions than to the crystalline regions, because the amorphous regions have a rougher surface [80]. More specifically, chondrocytes are known to attach at a higher concentration to PGA than to PCL [80].
Moreover, polylactic acid and polycaprolactone belong to the group of linear aliphatic polyesters [82], and polycaprolactone is known to increase cell viability by 20% [83]. Even the byproducts of the degradation of polylactic acid (i.e., water and carbon dioxide) are non-toxic in nature [84]. In addition, both polypropylene and polyethylene are widely used in developing implants, as they are inexpensive and easy to mold into the desired shape [85][86][87][88]. They are known to initiate a minimal immune response and to have superior mechanical (i.e., viscoelastic) properties and biocompatibility [89][90][91]. In particular, both PLA and PCL can be modified to exhibit the viscoelastic properties required for mimicking cartilages [92][93][94][95][96]. Furthermore, polypropylene has proven to be an excellent candidate for the development of cartilages in nasal reconstructive surgery [97]. Overall, all the polymers/blends mentioned in the pie chart (Figure 6b) have been employed in the field of cartilage tissue engineering [89,90,98,99].

Conclusions
The design of new biomaterials is a complex, tedious, and time-consuming affair. Designing cartilage substitutes is even more intricate due to their unique properties/functionality and their diverse locations in the human body. Among many parameters, viscoelasticity is one of the most important to be taken into consideration in designing cartilage substitutes. More importantly, the viscoelasticity of the cartilages may not be attributed to any single property; rather, it is better represented by a set of mechanical properties such as tensile strength, tensile modulus, and elongation at break. Therefore, it is expected that the best polymer matrices/blends for cartilage repair must exhibit these properties, as far as possible, within the ranges of the properties of the natural articular cartilages. This study attempts to use the inverse design approach with a machine-learning algorithm (i.e., multinomial logistic regression) to predict the most suitable polymers/blends for cartilage substitutes by using the ranges of the tensile modulus, elongation at break, and tensile strength of the natural cartilages as inputs. Both single and multivariable optimizations were conducted, so that the output was predicted by using both individual and combinatory properties of the cartilages. Considering all three properties of interest, poly(epsilon-caprolactone)/poly(bisphenol A carbonate) and polyethene//polyethene-graft-poly(maleic anhydride) were found to be the best polymer(s)/blends for cartilage repair using the multinomial logistic regression technique. All of the polymer(s)/blend(s) predicted through this machine-learning algorithm are FDA-approved for use in cartilage tissue engineering; more importantly, they possess tensile biomechanical properties similar to those of the natural cartilages, and may initiate only minimal immune responses in the body environment.
However, the limitation of this study lies in the low level of goodness of fit of the modeled data, which is largely attributed to the response variable being categorical in nature. Different machine-learning algorithms may be explored to handle categorical variable(s) with multiple levels. Moreover, the biological properties of the natural cartilages may be included as inputs in future research, although there is still a lack of an appropriate database to correlate the properties of the stem cells with the polymer matrices/blends to be used in cartilage repair. Hence, it is crucial to encourage researchers to report biological data to the journals in a uniform format, which will eventually help to create such a database. As a result, data-mining and machine-learning approaches can be employed to predict the list of suitable polymers and/or to predict their properties for use in several different tissue engineering applications.

Data Availability Statement:
The data are included with the manuscript, and the source code will be made available in the Supplementary section.