Classification of BOF Slag by Data Mining Techniques According to Chemical Composition

Abstract: In the process of converting pig iron into steel, several co-products are generated, among which basic oxygen furnace (BOF) slag stands out because of the large amount produced (about 126 kg of BOF slag per ton of steel). Great efforts have been made over the years to find applications that minimize the environmental impact and increase sustainability while generating added value. Valorizing BOF slag is difficult due to its heterogeneity, strength, and, above all, its swelling, which prevents its use in civil engineering projects. This work focuses on the heterogeneity issue. If many different types of steel are manufactured, then different types of slag may also be generated, and for each type of BOF slag there is an adequate valorization option. Not all of the slag can be valorized, but classification can be a tool for reducing the amount that must go to landfill and for minimizing the environmental impact. An analysis by means of data mining techniques yields a classification of BOF slag, and each of the resulting types is better suited to certain valorization alternatives. In the plant used as an application example, eight different slag clusters were obtained, which were then linked to their potential applications with the aim of increasing the amount valorized.


Introduction
There has been a progressive increase in environmental awareness over the last few years. For industrial processes to be sustainable, their useless co-product materials must be valorized, thus preventing new raw materials from being consumed and diminishing the need for new landfills. A large part of this material is sent to high-capacity nonhazardous landfills because of the volume generated, and, thus, there is a big environmental impact.
The steel-making sector produces a great amount of steel (about 1000 million tons), which generates huge amounts of useless co-product materials throughout the process. Slag, which is produced in several stages of the process, represents the greatest amount of the useless co-product materials [1][2][3][4]. Slag from installations such as the blast furnace is successfully valorized [5,6], but other slags do not reach this valorization rate. Linz-Donawitz (LD) steel-making is the most common process, whereby oxygen is blown through carbon-rich molten pig iron and changes it into low-carbon steel. It is a basic process in which fluxes of burnt lime or dolomite are added to promote the removal of impurities and protect the lining of the converter. As a result, 126 kg of basic oxygen furnace (BOF) slag is generated per ton of steel [7]. The significance of BOF slag as a useless co-product material does not come from its potential hazardousness, but rather from the huge volume generated [8,9].
The valorization of BOF slag is largely limited by its free lime and free magnesia content, which leads to highly problematic swelling and prevents civil engineering applications. Furthermore, the presence of phosphorus is a limiting factor for its reuse in iron and steel industrial processes [10]. There are several possibilities regarding treatment and valorization [8,11,12]; however, both are limited by the characteristics of the material. If this material could be used in the aforementioned applications, the slag sent to landfill would be minimized and the life of landfills would increase.
Due to its low specific value, the transportation costs of slag are another limitation to take into account. Moreover, BOF slag is heterogeneous, and its composition depends on the additives used (selected according to availability), the raw materials, and the characteristics of the steel produced. Steel shops produce a wide variety of products with different compositions, which, afterwards, correspond to different treatment processes.
In this paper, the aim was to perform a taxonomy of the different types of slag associated with steel production, as shown in Figure 1. Furthermore, an assessment was carried out to determine the optimum applications of valorization for each of these groups.
Sustainability 2020, 12, 3301

BOF Slag
LD steel mill slag (Figure 2) is produced via the process of transforming pig iron obtained from the blast furnace [13] when the impurities are removed by oxidation.
In the LD process for transforming pig iron from the blast furnace into steel, refinement is performed by injecting pressurized oxygen into the bath, which contains both raw materials and the additions (mainly calcium oxide, dolostone, and spar) used to form the slag [14].

During the process in Figure 3, the raw materials are introduced into the converter. Next, oxygen is blown through a cooled lance until the carbon and other impurities are removed from the pig iron. In this process, carbon is released as a gas (as CO and CO2) [15] and the impurities form a slag, which floats on the liquid steel. Afterwards, the slag is separated from the steel and sent to a yard, where it is irrigated until it reaches a temperature below 50 °C [16].
Figure 3. Steel-making process.
Table 1 displays the average chemical composition of LD steel mill slag, which was obtained via an analysis performed [17] for the ArcelorMittal plant in Avilés (Spain). The main components of this type of slag are Fe, CaO, and SiO2. The free lime content limits the recyclability of this material in roads, cement, and other applications, where the expansibility of the final product is a critical parameter [18,19].

Table 1. Average chemical components of LD steel mill slag (only the rows below are recoverable; values are given as in the source).
Component    Value
P2O5         1-3
Cu           0.03
Mo           0.08
As           <1 ppm
Cd           <0.5 ppm
B            0.17

Methodology used for Slag Classification
Given that historical data from the operation of a European steel mill were available, deriving the classification from the process results was considered the best option. Data mining techniques were used to extract information from the existing datasets (in this particular case, the characteristics that allow different types of slag to be classified). It is necessary to extract reliable data and to follow a suitable methodology; in this case, the Cross Industry Standard Process for Data Mining (CRISP-DM) was used for the knowledge extraction, classifying, and modeling processes [20][21][22].
CRISP-DM methodology was developed for use in data mining projects at an industrial level [23], and the different stages associated with this methodology are displayed in Figure 4.
Figure 4. Stages associated with the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology.
The first stage was understanding the industrial problem and then collecting the necessary data. This task was performed in collaboration with steel shop technologists. All of the necessary data were assessed and extracted from the tracking system archive store and from the analysis performed by the Environmental Department in a specific tracking campaign that lasted 11 months. Nineteen variables were collected overall, which defined the type of steel to be manufactured and the composition of slag produced during casting. Information on 2600 castings was captured [21].
Given that the data volume was not large, an MS Access database was considered for storage [24].
According to the study performed on the process, it was determined that the technique to be implemented must be able to establish a classification without previous knowledge of either the number of types or different groups, or the criteria that would define the separation between different groups [25].
Therefore, the classification can be made exclusively from the data, without prior assumptions. For this purpose, the unsupervised artificial neural network known as the Self-Organizing Map (SOM) was chosen [24,26]. These neural networks enable the projection of an n-dimensional space into an m-dimensional discrete space (generally m equals 1 or 2) while preserving topological ordering: the closest input patterns correspond to the closest points on the map. Their training is unsupervised; in other words, it is not necessary to know the output for every record in order to generate the classifier. Instead, the network itself chooses the types and separates the data accordingly. Using this method, it was possible to detect the categories more effectively than with methods that required previous indications on the number of groups, and those which were strongly influenced by outliers [27].
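To make the SOM idea concrete, the following is a minimal sketch in plain Python of a rectangular map trained with a Gaussian neighborhood. It is not the implementation used in the study: the function names (`train_som`, `best_matching_unit`), the linearly decaying learning rate and radius, and the initialization from randomly chosen records are all assumptions made for illustration, and the inputs are assumed to be scaled to comparable ranges beforehand.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) of the neuron closest to x."""
    return min(
        weights,
        key=lambda rc: sum((a - b) ** 2 for a, b in zip(weights[rc], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius
    # Assumption: each neuron starts at a randomly chosen input record.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.01    # shrinking neighborhood
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the neuron towards x, more strongly near the winner.
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, each casting can be assigned to a cell with `best_matching_unit(weights, record)`, so that castings with similar compositions fall in the same or neighboring cells.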
In order to achieve a better understanding of the data, basic statistics, histograms, and graphic representations were made. The information was filtered in order to remove missing or null values due to a lack of measurement on that casting.
The data were then filtered in order to detect collection and transmission errors according to the limit values provided by technologists. Outliers were identified using Sammon projections [28], and the castings that did not conform to the feasible parameters of the process were removed. Given the low percentage of outliers detected, the dataset was considered to be of sufficient quality for correct SOM application.
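The first two filtering steps (missing values and technologist-provided limits) can be sketched as follows. This is only an illustration under assumptions: the record layout (one dictionary per casting), the function name `filter_castings`, and the limit values in the usage note are hypothetical, and the Sammon-projection outlier step is not reproduced here.

```python
def filter_castings(records, limits):
    """Keep castings with no missing values and every component within limits.

    records: list of dicts, one per casting (hypothetical layout).
    limits:  {component: (lowest_allowed, highest_allowed)}.
    """
    kept = []
    for rec in records:
        valid = True
        for name, (low, high) in limits.items():
            value = rec.get(name)
            # Missing measurements and out-of-range values are both rejected;
            # a NaN also fails the range comparison and is rejected.
            if value is None or not (low <= value <= high):
                valid = False
                break
        if valid:
            kept.append(rec)
    return kept
```

For example, with the made-up limits `{"CaO": (0, 100), "MgO": (0, 100)}`, a casting lacking an MgO measurement or reporting a CaO value of 140 would be discarded.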


Classification of BOF Slag
Once the dataset was analyzed and the anomalous cases were removed, the networks were trained and compared against all of the data. Several tests were performed, modifying the number of artificial neurons in the network; good results in identifying the clusters were found from 64 neurons onward when taking the chemical components of slag as inputs.
Once the network was trained, the k-means technique was applied in order to group together the representative elements of each cell and thereby obtain the minimum number of clusters that produced the lowest classification error. In this case, eight clusters were identified, which are displayed in Figure 5, with each color representing a different slag cluster.
Each cell was linked to an artificial neuron of the neural network, and the closeness between the cells indicated the similarities among data that had fallen in each cell. This allowed the k-means technique to group together cells with similar elements into a bigger cluster.
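The second-level grouping described above can be sketched as a plain k-means run over the representative (prototype) vectors of the map cells. The code below is a minimal illustration, not the study's implementation: the helper names (`dist2`, `kmeans`, `sse`) are hypothetical, and the deterministic farthest-point initialization is an assumption chosen to make the sketch reproducible.

```python
def dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Group prototype vectors into k clusters; returns (labels, centers)."""
    # Assumption: farthest-point initialization instead of random seeding.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: dist2(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def sse(points, labels, centers):
    """Within-cluster sum of squared errors, used to compare values of k."""
    return sum(dist2(p, centers[lab]) for p, lab in zip(points, labels))
```

Running `kmeans` for several values of `k` and comparing `sse` is one simple way to look for the smallest number of clusters beyond which the error no longer drops appreciably.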
From the analysis of the data of each slag cluster, a taxonomy could be performed on the chemical components that defined each of the eight clusters. In order to make it easier to use, the variation ranges of the slag chemical components were coded into qualitative levels. To this end, k-means clustering into three groups was applied to each component (reduced to two groups in some cases, for example, magnesium oxide), and the groups were coded as high (H), medium (M), and low (L). The main results derived from the previous studies are displayed in Table 2.
Once these groups were identified, it was considered necessary to identify what type of steel was produced when a certain type of slag was obtained as a useless co-product. Thus, it would be possible to identify the potential applications of each one of these groups of slag.
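The coding of a component's values into qualitative levels can be sketched with a one-dimensional k-means. This is an illustrative sketch only: the function name `code_levels`, the evenly spaced starting centers, and the sample values in the usage note are assumptions, not data from Table 2.

```python
def code_levels(values, k=3, iters=100):
    """Code numeric values as L/M/H (k=3) or L/H (k=2) via 1-D k-means."""
    names = {2: ["L", "H"], 3: ["L", "M", "H"]}[k]
    low, high = min(values), max(values)
    # Assumption: initial centers spread evenly over the observed range.
    centers = [low + (high - low) * i / (k - 1) for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        new_centers = [
            sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    # Rank the centers so the smallest maps to "L" and the largest to "H".
    order = sorted(range(k), key=lambda j: centers[j])
    level = {j: names[pos] for pos, j in enumerate(order)}
    return [level[nearest(v)] for v in values]
```

For the made-up values `[1.0, 1.2, 5.0, 5.1, 9.0, 9.3]`, the call `code_levels(values)` yields `['L', 'L', 'M', 'M', 'H', 'H']`; passing `k=2` gives the two-level coding mentioned for components such as magnesium oxide.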
After the dataset analysis, 84 steel grades were found to have been produced during the sampling period in the plant from which the data were taken. Given the high number of grades, it was decided to group the steel grades with similar chemical compositions.
In order to find the steel grades with similar chemical components, the SOM technique was also used; thus, in each of the neural network cells, steel cases with similar behavior could be found.
For this case, a grid was taken as the network topology, since the aim was to identify generic clusters instead of very detailed ones. For this reason, a topology of 3 × 3 artificial neurons was chosen. Figure 6 shows the average value of each chemical component of steel in each of the nine artificial neurons. Artificial neurons with similar behaviors were found (highlighted in blue and red in Figure 6), since similar castings could be distributed into two different artificial neurons, e.g., the two cases circled in blue and the ones circled in red.
Figure 6. Average value of each chemical component of steel (C, Mn, Si, S, P, Cu) in each of the nine artificial neurons.
The next step involved clustering castings according to their types of steel and slag by means of neural networks. With this dataset, a pivot table was created in which steel grades and types of slag were cross-tabulated and the number of castings was counted. The steel grades with a similar distribution across all of the neurons were merged, since they presented similar behaviors in the network.
When this reduction was finished, there were 18 steel grades and 8 types of slag. Therefore, a cross-reference (Table 3) could be generated, showing which steel grades generated a given type of slag. The distribution of castings across the cells of the pivot table was irregular. There were some cases, such as steel class 17, that corresponded almost exclusively to slag cluster 5 (both classifications were done independently, one using the slag components and the other using the components of the resulting steel). Slag clusters 3 and 6 corresponded to high ferric oxide percentages and represented most of the castings. There were also cases, such as steel class 15, that always presented slag with relatively low steel concentrations, corresponding to slag clusters 1 and 2. The ultimate aim of this study, however, was to link the classifications with prospective applications, which will be detailed in the following section.
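A cross-reference of this kind amounts to counting castings per pair of steel class and slag cluster. The sketch below shows one way to build it with the standard library; the input layout (a list of pairs) and the function names `cross_reference` and `slag_shares` are assumptions for illustration, and no values from Table 3 are reproduced.

```python
from collections import Counter, defaultdict


def cross_reference(castings):
    """Return {steel_class: Counter({slag_cluster: number_of_castings})}.

    castings: iterable of (steel_class, slag_cluster) pairs (assumed layout).
    """
    table = defaultdict(Counter)
    for steel_class, slag_cluster in castings:
        table[steel_class][slag_cluster] += 1
    return table


def slag_shares(table, steel_class):
    """Fraction of a steel class's castings that fall in each slag cluster."""
    row = table[steel_class]
    total = sum(row.values())
    return {cluster: count / total for cluster, count in row.items()}
```

A steel class whose shares are concentrated in a single slag cluster behaves like the near one-to-one cases described above, whereas evenly spread shares indicate that the steel class alone does not determine the slag type.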

Relations between Slag Clusters and Potential Applications
Given the extensive research aimed at valorizing slag, it is of interest to establish a relationship between potential applications and slag clusters [18,19,29,30]. If this is done for every possible application, a better valorization of each slag cluster can be obtained.
There are many possibilities when valorizing BOF slag; in order to analyze the potential of the clusters, this relation was studied for the three potential applications proposed as application examples (aggregate for road constructions, railway ballast, and environmental remediation).
The use of slag as an aggregate for road construction is one of the most studied valorizations. It is one of the applications with the largest volume of material to be valorized. The main restriction on its use as an aggregate in the surface layer is the expansion that the slag may cause. This expansion is linked to the quantities of CaO and MgO; thus, low values of both CaO and MgO are required. The most interesting slag clusters for this application are displayed in Table 4. In this way, the slag whose chemical composition is best suited for use in road construction can be selected.
Due to the similar characteristics of slag and gravel, it is possible to use it as a railway ballast [30]. However, the iron content in the slag presents a problem for its valorization due to the conductivity that it may generate. Thus, slag used for this purpose must have low iron content.
The most interesting clusters for this application are displayed in Table 5. Slag cluster 1 presents the best properties for its use as a railway ballast.
Due to its high lime content (and pH), slag is also of interest as an acid neutralizer, e.g., for acid mine water. For this use, the slag must have a high CaO content and a low SiO2 content (to obtain a high pH value).
The most interesting clusters for this application are the ones displayed in Table 6. The most suitable characteristics for valorization as a material for environmental remediation, due to a high pH, are found in cluster 8.
This analysis must be performed for each of the potential valorizations, which may be different for each steel shop depending on the availability of aggregates in the environment, the available cement factories and other factories, and the transportation distance. The aim is to know what the ideal slag distribution is in order to minimize treatments and to maximize by-product valorization. In this way, it is possible to maximize the valorization of BOF slag and to minimize the amount of this co-product in landfills.
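The matching between clusters and applications can be sketched as a lookup over the qualitative codes. The criteria below are taken from the text (low CaO and MgO for road aggregate, low iron for railway ballast, high CaO and low SiO2 for environmental remediation); everything else is an assumption for illustration, including the function name `candidates`, the use of exact level matching, and the placeholder cluster labels in the usage note, which do not correspond to the clusters in Tables 4-6.

```python
# Required qualitative level per component, as stated in the text.
CRITERIA = {
    "road aggregate": {"CaO": "L", "MgO": "L"},
    "railway ballast": {"Fe": "L"},
    "environmental remediation": {"CaO": "H", "SiO2": "L"},
}


def candidates(cluster_codes, application):
    """Return the clusters whose coded composition meets every criterion.

    cluster_codes: {cluster_id: {component: "L" | "M" | "H"}} (assumed layout).
    """
    wanted = CRITERIA[application]
    return sorted(
        cluster_id
        for cluster_id, codes in cluster_codes.items()
        if all(codes.get(comp) == level for comp, level in wanted.items())
    )
```

With hypothetical codes such as `{"A": {"CaO": "L", "MgO": "L", "Fe": "H", "SiO2": "M"}, "B": {"CaO": "H", "MgO": "M", "Fe": "L", "SiO2": "L"}}`, cluster "A" would be a candidate for road aggregate, while cluster "B" would be a candidate for both railway ballast and environmental remediation. A plant could extend `CRITERIA` with further applications as their requirements become known.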

Discussion
Eight slag clusters with different characteristics were obtained using data mining techniques. This classification helps to overcome the recycling limitations of many potential applications. Each cluster is more suitable for some valorizations than for others. For example, this co-product could be used in road construction due to its physical characteristics, but its expansion is a limitation; thus, the best type of slag to valorize in this application is that which has low CaO and MgO content. This result reduces the amount of slag sent to landfill and allows the by-product to substitute for other raw materials.
Some examples regarding the technical feasibility were shown. It is mandatory to know all of the potential applications for this by-product in detail in order to analyze the best valorization options for each cluster in each plant.
When the technical feasibility of BOF slag is known for all of its potential applications, it is necessary to analyze the environmental impact of each application. The same slag cluster could be technically adequate for more than one application. For this reason, the environmental impact could be critical in making a decision about the final valorization of each cluster.

Conclusions
In this paper, a methodology to classify the BOF slag generated in the LD process was presented. For the particular steel shop analyzed, eight clusters were obtained. As this co-product can be used in many applications (albeit with some limitations), this methodology allows the most adequate valorizations to be determined. Each type of slag presents a series of characteristics that make it suitable for certain valorizations.
This methodology uses well-proven and relatively easy techniques, in which technical experts in each plant can be trained. The classification here was performed using data mining techniques, which is feasible as long as historical data on the composition of slag are available.
The implementation of the approach described in this paper would allow every steel shop to focus their slag valorization activities on the most suitable types.