Benchmarking of Laser Powder Bed Fusion Machines

This paper presents the methodology and results of an extensive benchmarking of laser powder bed fusion (LPBF) machines conducted across five top machine producers and two end users. The objective was to understand the influence of the individual machine on the final quality of predesigned specimens, given a specific material and from multiple perspectives, in order to assess the current capabilities and limitations of the technology and compare them with the capabilities of an 11-year-old machine belonging to one of the end users participating in this investigation. The collected results give a clear representation of the status of LPBF technology considering its maturity in terms of process capabilities and potential applications in a production environment.


Introduction and State of the Art
For about the last 40 years, additive manufacturing (AM), initially known as rapid prototyping, has experienced huge technological advancements. AM is distinguished from other existing technologies as it presents unparalleled design freedom and the possibility of mass customization [1,2]. Complexity and customization in AM were deeply analyzed by Conner et al. by developing a reference system for products that considered customization, volume, and complexity [3], and by Quinlan et al., who analyzed the capability of AM to provide "complexity for free" and compared this capacity with conventional manufacturing technology [4].
According to the ASTM F42 committee, all AM processes can be classified into seven main categories [5]: binder jetting, directed energy deposition, material extrusion, material jetting, powder bed fusion, sheet lamination, and vat photopolymerization. The applications of AM technologies are numerous across all industries, but their full potential has not yet been exploited [6]. Gu et al. claimed that in the next few years, prototyping production using polymers will no longer be a research focus, since it will reach its full maturity, while the focus will be more on AM techniques for the industrial production of metallic components that cannot be produced easily with conventional technologies [7]. Among the AM technologies that are capable of processing metals and are most widely used for industrial applications, powder bed fusion, also called selective laser melting (SLM) or laser powder bed fusion (LPBF), stands out. Today, LPBF technologies have been demonstrated to be among the most versatile metal AM processes, capable of producing metal components that have complex shapes and are otherwise impossible to produce with conventional manufacturing technologies [8].
LPBF, like all powder-based systems, produces parts by spreading dry powder on a building platform and scanning with a laser the areas selected from the CAD design, which is converted to STL format. The platform is then lowered by one step, a new layer of powder is spread on the surface, and the process starts again. One of the main advantages of LPBF is its relatively high resolution, including for internal features [9].
Among the main applications of LPBF are aerospace and aviation engineering, biomedical, and tooling [10].
Several LPBF systems are currently available on the market, each with its own strengths and limitations. However, it is still debated to what extent powder bed fusion metal AM is ready for industrial production.
In addition, given the broad variety of available metal AM systems, it should be investigated how much the choice of a specific LPBF machine influences the quality of the final products. The present work addresses both questions through a machine benchmarking study.
One of the most detailed works dedicated to the benchmarking of AM machines was conducted by Mahesh et al. in 2002. The study describes benchmarking as a process to identify the "highest standards of excellence for products, services and processes" [11]. Benchmarking is a known method to evaluate the capabilities and limitations of a process, in this case an AM process, but is also a useful instrument to identify optimization approaches and sources of information [12]. In their work, Mahesh and colleagues also proposed for the first time a benchmark classification of AM processes, depending on the main purpose of the benchmark. These were divided into three groups: 1. Geometrical benchmark, used to evaluate the geometrical and dimensional accuracy of the additively manufactured products; 2. Mechanical benchmark, a standard design of components to evaluate the AM parts' mechanical properties such as tensile strength, shrinkage, creep properties, etc.; and 3. Process benchmark including all the benchmarking artifacts used to optimize the process, for example, to define the best process parameters for a given outcome [11].
Many benchmarking artifacts have been designed and tested in recent years, and a comprehensive review was presented by Rebaioli and Fassi in 2017 [13]. From this extensive review, we recognize the need for a standardized procedure to evaluate AM processes. In their paper from 2014, Moylan et al. attempted to define general rules for a standard artifact: it should be large enough to evaluate the system performance, but should not consume too much material to be printed. A standard artifact should include small, medium, and large features, and have both holes and bosses. It must be easy to measure and present some real part features [14]. These guidelines, among other considerations, were followed by Moylan et al. for their artifact, which was developed at the National Institute of Standards and Technology (NIST) and was proposed as a standard test artifact for AM machines [15].
The literature review of benchmarking makes clear that most studies and recommendations have not focused on a specific AM technology; the aim was mostly to compare different technologies. Table 1 presents selected examples of benchmarking research that was particularly important for understanding and comparing AM processes.

Goals and Structure of the Paper
In the present work, the aim was not to compare different AM technologies, but rather to compare different machines belonging to the same LPBF family. The aim was to understand the performance of different systems and assess the current capabilities and limitations of the process. For these reasons, an innovative artifact design was formulated to highlight the differences and similarities among the considered systems.
The rest of this paper is structured as follows: in Section 2, the design of this extensive benchmarking project is summarized, referring to a previous paper where this design was introduced by Moshiri et al. [27]. Next, in Section 3, the methodology followed in this work is explained in detail, considering both the data transfer with the participants and the actual analysis undertaken on the specimens. For confidentiality, the identities of the participants are not disclosed; they consist of five current state-of-the-art LPBF machine producers and two end users whose production LPBF systems are used daily. Finally, one machine placed at one of the end users was 11 years old. This system was included in order to highlight the degree of performance that more modern systems have reached after approximately one decade of process development and advancements. In Section 4, the collected results are presented and the various systems are compared. Our final remarks and conclusions are presented in Section 5.

Design of the Artifact
The reason for proposing a new design is that none of the benchmarking research in the published literature to date has been prepared considering a holistic evaluation of all aspects of a specific AM technology, in this case LPBF. The objective of this innovative design is to combine aspects from geometrical, mechanical, and process benchmarks, according to [11], in order to have an overall characterization of the actual technological readiness level of LPBF for its implementation as an industrial production technology (see Figure 1).
Figure 1. Key elements considered for the benchmarking design approach, adapted from [27].
To limit the open variables as much as possible, this investigation focused on one process (LPBF), and both the design of the benchmarking specimens (prepared by the authors and sent to the participants) and the material were locked. The material, maraging steel grade 300 (1.2709), was chosen because it is a common material for LPBF processes and is particularly relevant in tooling applications [28]. The powder required for the evaluation was new, never previously used or sieved in other jobs, to avoid any cross-contamination that would have introduced deviations in the results unrelated to the machine. Only the technology, intended as the specific machine, was the open variable.
All of the samples were analyzed as built; no post-process was allowed, apart from cutting the parts from the building platform. The benchmarking project was designed to allow an evaluation of multiple aspects (see Table 2).

Dimension of feature | Various features visible in Figure 3, with minimum dimensions that go beyond the machines' currently known limitations (see Table 3).
Surface roughness | Important factor considering industrial production and the current need for post-processing parts to achieve the specified roughness.
Repeatability, same job, different positions | Evaluated by repeating production of the same part in the 4 corners of the machine's building volume and in the centre of the building platform; extremely important considering the industrial need for a robust and reliable process.
Repeatability, different jobs | Same as above; each job repeated 3 times.

The dimensions of the features mentioned in Table 2 are summarized in Table 3, and were compared with the minimum dimensions the most capable machine manufacturers claimed to be able to produce. Designing features that have a high probability of failure is critical to highlight the limits of the current capabilities of a given technology. Most of the designs in the literature were based on other AM technologies or did not include this "fail test", so it would not have been possible to easily determine the technological capabilities using such designs. The design of the complete benchmarking job is presented in Figure 2. The spiral samples include all the features mentioned in Table 3, plus others used for a comprehensive characterization of the machine performance. The sample presented in Figure 3 contains the following features:
• Four tall samples (with the shape of tensile test specimens);
• Five cylinders with diameter 20 mm and height 60 mm.
All features were checked to be measurable with the intended available equipment, in accordance with the design-for-metrology guidelines [29,30].

Materials and Methods
Each participant received the STL file for each sample that had to be inserted on the building platform, with an indication of where to place it (see Figure 2b). The choice of sending the STL files directly was made as all of the participants received the designs with exactly the same mesh sizes to avoid having the quality of the products depend on the meshing accuracy. Participants were asked to additively manufacture the parts with new maraging steel powder and not post-process any of the parts, apart from cutting them from the building platform, and to specify the technology used for the separation operation. The companies were informed about the analysis that was going to be conducted and the aspects that were going to be evaluated. Specific aspects of the AM process such as the choice of process parameters and the type of support structures to connect the parts to the platform were left to the companies under the assumption that they were in fact in the best position to understand the equipment and its usage to ensure the best results.
As already mentioned, the material chosen was maraging steel grade 300 (18% Ni Maraging 300, 1.2709), with the chemical composition presented in Table 4. The bulk density of the material considered was 8.1 g/cm³ [31,32]. The analysis of the parts was then conducted as follows:
• All spiral samples received (five samples per job for three jobs and seven participants) were coated with a thin layer of TiO2 and measured with a GOM Atos ScanBox 5108 16M 3D scanner. From the results collected, it was possible to evaluate the repeatability among positions and jobs and the distortions of the parts as an indicator of residual stresses. It was also possible, with further analysis of the scan, to determine the deviations of a feature's dimensions from the nominal CAD design. The data from the scanner were analyzed with GOM Inspect software (as was done, for example, in [23,33]). The authors' choice of a contactless system (a 3D scanner) to measure geometrical features, instead of more conventional instruments such as micrometers and calipers, reflects a preference for obtaining more measurement points, and thus more reliable results, over an easier and faster measurement technique [34].
• Figure 4 shows the sample and the direction of the surface roughness measurements: X and Y in the XY Cartesian plane, Z and T in the ZX plane. The surface roughness was measured with a Taylor Hobson Form Talysurf 50 profilometer. Each measurement was performed only on the central samples for each job. The measurements were conducted according to DIN EN ISO 4288:1998 and 3274:1998. The specifications used are reported in Table 5. In this report, the Ra parameter was used for the roughness evaluation, since it is one of the most frequently adopted texture parameters and allows for an easier comparison with the performance reported in the existing literature [35].
• The homogeneity evaluation was conducted through density measurements on the small parallelepiped (15 × 15 × 10 mm) using the Archimedes principle, as in [36], and by analyzing under an optical microscope the residual defects on two polished surfaces (on the XY and ZX planes) of all central big parallelepipeds (30 × 30 × 20 mm). The samples were mirror-polished on the two surfaces and observed under an optical microscope, and the pictures collected were processed with ImageJ software, as in [37]. The images were converted to black and white, and a specific tool in the software was used to count the darker pixels, which were considered residual defects. In many industrial applications, the surface of LPBF products needs to be post-processed, in particular polished, to obtain the required surface quality. When defects such as residual porosities or inclusions are present close to the surface, the surface quality after polishing will not be acceptable, which is why it was important to evaluate this aspect.
• Rockwell C (HRC) was the method chosen for hardness evaluation, as in [38], and it was performed on a clean, slightly ground surface in the XY and ZX planes.
• Moreover, the time necessary for producing each job was recorded and compared.
As already mentioned, all but one of the machines were among the newest technology on the market, and in this paper they are identified with letters. Table 6 shows the shareable information from the participants (i.e., type of company, number of lasers, technology used to separate the parts from the building platform, and layer thickness used for production). Company E was not able to produce three complete jobs, so no results from E are presented in this research.

Results
In this section, all the results collected from the above-mentioned analysis are presented. Figure 5 shows the CAD drawing of the part with reference dimensions, and Figures 6 and 7 show some of the manufactured samples. The white residuals on the surface of some parts came from the TiO2 coating used for the 3D scanning of the parts. Examining the parts while they were still attached to the building platform allowed some useful observations, such as in Figure 8, where a significant part distortion that caused delamination of the parts from the platform can be seen. Another aspect immediately visible from the pictures is the effect of the choice of tool used to cut the part from the building platform. We noticed that wire-cut electrical discharge machining (EDM) left the bottom surface smoother and more precise, with the risk of rust forming on the surface if a water dielectric was chosen, as happened for some participants.

3D Scanner Results: Accuracy, Repeatability, and Complex Feature Production
All the results generated by the 3D scanner of the spiral samples were produced by using Gaussian best fit alignment, focusing on three surfaces to obtain a more comprehensive and robust comparison of the dimensions and distortions of parts. The first two surfaces used for the alignment are presented in Figure 9. An additional surface for alignment, focusing on the thin wall of the spiral sample, was used to more accurately evaluate the distortion of the part, as presented in Figure 10. An example of a color plot generated from the 3D scanner is given in Figure 11. On the right side of the color plot, the scale bar shows the upper and lower limits, chosen as ±0.1 mm, since greater distortions would make the part unacceptable for most applications.
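As a minimal sketch of how the ±0.1 mm limits of the color plot could be applied numerically, the following Python snippet computes the share of scanned surface points whose deviation from the nominal CAD lies inside that band. The function name, the sample values, and the idea of condensing a scan into a single in-tolerance fraction are illustrative assumptions; they are not part of the GOM Inspect workflow used in this study.

```python
def fraction_within_tolerance(deviations_mm, limit_mm=0.1):
    """Return the fraction of signed deviations with |d| <= limit_mm.

    deviations_mm: signed distances (mm) between scanned points and the
    nominal CAD surface, e.g. exported from a 3D scan comparison.
    """
    if not deviations_mm:
        raise ValueError("at least one deviation is required")
    inside = sum(1 for d in deviations_mm if abs(d) <= limit_mm)
    return inside / len(deviations_mm)


# Hypothetical deviations for one spiral sample (not measured data).
example_deviations = [0.02, -0.05, 0.12, -0.15, 0.08, 0.0, -0.09, 0.11]
share = fraction_within_tolerance(example_deviations)
print(f"{share:.1%} of points lie within the tolerance band")
```

A single fraction hides where on the part the deviations occur, so it would complement rather than replace the color plot.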
The generated color plot was used to evaluate the repeatability of the machines across jobs and positions. A comparison of the color plots between different suppliers across the jobs for each company is shown in Figure 12 for all central positions. It is evident in Figure 12 that repeatability across jobs is not ensured for all machines, for example, in the color plots for participants A, B, D, F, and G, where the deviations differed for all three jobs. In addition, repeatability was not ensured across different positions, as presented in Figure 13 for company G. Within the same printing job, the dimensional analysis highlights that the parts produced were affected by some repeatability issues depending on their position on the building platform. Similar observations could be made for all participants. A distortion evaluation was conducted both by looking at the entire color plot (for some companies the distortions were particularly evident, such as for company C in Figure 12) and by comparing the deviations of the thin walls from the nominal design. A comparison is presented in Figure 15. For some companies such as C, distortions on the thin walls were extremely evident and already visible during the visual inspection, as shown in Figure 16.
From the 3D scanner results, all features were analyzed to evaluate how much they deviated from the nominal CAD design. The first objective of features such as pins and crosses is to show immediately whether or not the machine is capable of producing them, as a pass/fail evaluation. In the following tables, the pass/fail test results are presented for each supplier and for all pins and crosses.
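The repeatability comparison across positions and jobs was made visually on the color plots. One possible way to express the same idea as numbers, given here only as a sketch, is to separate the spread of a measured deviation across positions within a job from its spread across jobs at a fixed position. The data layout, the function name, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def repeatability_summary(measurements):
    """Summarise the spread of one measured quantity.

    measurements: dict mapping (job, position) -> deviation in mm.
    Returns (position_spread, job_spread):
      position_spread: std. dev. across positions, averaged over jobs
      job_spread:      std. dev. across jobs, averaged over positions
    """
    by_job, by_position = {}, {}
    for (job, position), value in measurements.items():
        by_job.setdefault(job, []).append(value)
        by_position.setdefault(position, []).append(value)
    position_spread = mean(pstdev(values) for values in by_job.values())
    job_spread = mean(pstdev(values) for values in by_position.values())
    return position_spread, job_spread


# Hypothetical thin-wall deviations (mm) for 3 jobs x 5 positions.
positions = ["centre", "corner1", "corner2", "corner3", "corner4"]
data = {}
for job, offset in [(1, 0.00), (2, 0.02), (3, -0.01)]:
    for i, pos in enumerate(positions):
        data[(job, pos)] = 0.03 * i + offset

pos_spread, job_spread = repeatability_summary(data)
print(f"spread across positions: {pos_spread:.3f} mm")
print(f"spread across jobs:      {job_spread:.3f} mm")
```

Comparing the two numbers would indicate whether a machine varies more with position on the platform or from one job to the next.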
Starting with the pins, Table 7 shows which features were successfully produced and which were not. The analysis and dimensions were assessed with an optical microscope. The features are numbered in the same ascending order as in Section 2, repeated here: diameter of pins 1-8: 0.1 mm, 0.2 mm, 0.3 mm, 0.5 mm, 1 mm, 2 mm, 3 mm, and 4 mm. The symbol legend is as follows: ✓ acceptable; ≈ uncertain; ✗ not built. A feature was considered acceptable when it was fully built, with approximately the dimensions defined in the CAD that could be measured with the optical microscope, also tolerating slightly bent pins because of normal handling. A feature was considered uncertain when it was properly built but too bent to be measured, so its dimension could not be confirmed against the nominal. The optical microscope analysis revealed that most of the smallest pins (i.e., 0.3 mm diameter or less) were deformed or completely bent. For the largest pins, it was possible to distinguish the contour line from the internal filling of the laser scan track, as presented in Figure 17. With GOM Inspect analytical software from the 3D scanner, the deviation of the dimensions of the printed features from the nominal design was plotted, as shown in Figure 18. In the graph, the samples are defined on the x-axis, and the y-axis shows the deviation from the nominal CAD design (referred to as 0). Each line color represents a different pin, while each peak/valley is a different sample. On the top, the companies are indicated for each region of the graph. Some of the lines of the deviation graph look interrupted or far from the average; this is again because most of the smallest pins were broken or bent (probably due also in part to handling), and therefore it was not possible to obtain a proper measurement. It is interesting to observe that most of the companies produced pins that were smaller than the nominal CAD dimension.
The same investigation was conducted for the crosses, and Table 8 shows a summary of the pass/fail dimensions. Crosses 1-8 had the following dimensions (vertical and horizontal wall thickness): 0.10 mm, 0.15 mm, 0.20 mm, 0.25 mm, 0.30 mm, 0.35 mm, 0.40 mm, and 0.50 mm, respectively. The symbol legend is as follows: ✓ acceptable; ≈ existing feature, oversize; ✓ existing feature, not acceptable; ✗ not built. Measurement of the crosses was carried out with an optical microscope and a 3D scanner. Some pictures collected with the optical microscope are presented in Figure 19, and also show the shapes of the crosses that were considered to be not acceptable. Again, the deviation analysis of the real dimensions of the features against the nominals from the GOM software is reported in Figure 20. The geometry of the features is one reason why most of the small crosses were built while pins of comparable size were not. The pins, whose dimensions were almost as small as the beam focus (around 80 µm for most of the suppliers), were built with a single laser shot, while the crosses were made of an entire scan track, which is inherently more robust.
The same type of evaluation was conducted for the holes, in particular using the deviation analysis produced by the GOM software. The deviation graph of the real dimensions compared to the nominal dimensions is plotted in Figure 21. Examples of holes are given in Figure 22. Observing and comparing the deviation graphs of the average diameters of pins and holes, it is possible to consider the beam offset. As reported by Moylan et al., a too large beam offset can be recognized when the diameters of the pins are smaller than the nominal, while the holes are larger. On the contrary, a too small beam offset generates larger pins and smaller holes than the nominal [14]. This tendency can be recognized for company C, for which the beam offset was too large, and to some extent for company G, which had a too small beam offset.
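The beam offset reasoning reported from Moylan et al. [14] can be written as a simple decision rule on the signs of the average pin and hole deviations. The sketch below is only an illustration of that rule: the function name, the optional threshold, and the example numbers are assumptions, and no such automated classification was part of the original analysis.

```python
def classify_beam_offset(pin_deviation_mm, hole_deviation_mm, threshold_mm=0.0):
    """Classify the beam offset from average diameter deviations.

    Each deviation is (measured diameter - nominal diameter) in mm.
    Pins smaller and holes larger than nominal suggest a too large offset;
    pins larger and holes smaller than nominal suggest a too small offset.
    threshold_mm is a hypothetical dead band for measurement noise.
    """
    if pin_deviation_mm < -threshold_mm and hole_deviation_mm > threshold_mm:
        return "beam offset too large"
    if pin_deviation_mm > threshold_mm and hole_deviation_mm < -threshold_mm:
        return "beam offset too small"
    return "no clear beam offset trend"


# Hypothetical average deviations (mm), not values from Figures 18 and 21.
print(classify_beam_offset(-0.05, 0.04))   # undersized pins, oversized holes
print(classify_beam_offset(0.03, -0.02))   # oversized pins, undersized holes
print(classify_beam_offset(0.03, 0.02))    # both oversized: other causes likely
```

When pins and holes deviate in the same direction, the rule deliberately returns no verdict, since effects other than beam offset are then more plausible.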
The last important feature considered in this evaluation is the unsupported pyramids (see Figure 23). All the companies managed to produce the pyramids, even the most inclined ones at 25°, for all positions and jobs. Only one of the participants, D, presented some issues with the down-facing surface, most likely due to an unsuitable choice of process parameters for this type of surface, which is particularly critical in LPBF, and/or damage on the recoated surface that is also visible on the top. Figure 24 shows the pyramids and the recoating issues.

Surface Roughness: Accuracy
The surface roughness was evaluated in four directions (Figure 4), using the Ra parameter and repeating each measurement at least three times. Table 9 shows the collection of results, showing the company, direction, average (avg, in µm), and standard deviation (std.-dev). All the data presented were analyzed using Minitab software with one-way ANOVA to recognize particular trends depending on the job number for each direction. For companies A, F, and H, there were no significant differences among jobs for any direction; all of the other companies revealed some discontinuities. To summarize, Figure 25 shows the means comparison charts from the one-way ANOVA analysis for all directions.
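For readers without access to Minitab, the core of a one-way ANOVA can be reproduced in a few lines. The sketch below computes only the F statistic for Ra values grouped by job; the function name and the Ra values are hypothetical, and turning F into a p-value additionally requires the F distribution (available, for example, through `scipy.stats.f_oneway`).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    groups: list of lists of measurements, one list per job.
    F = (between-group mean square) / (within-group mean square).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more values than groups")
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical Ra values (µm) in one direction, three repeats per job.
ra_by_job = [
    [7.1, 7.3, 7.0],  # job 1
    [7.2, 7.4, 7.1],  # job 2
    [8.0, 8.2, 7.9],  # job 3
]
print(f"F = {one_way_anova_f(ra_by_job):.2f}")
```

A large F relative to the critical value for the given degrees of freedom would indicate that at least one job differs from the others in that direction.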

Homogeneity
The homogeneity of the products was evaluated both by analyzing the density and by measuring the quantity of residual defects after polishing two sides (XY and ZX planes) of the cubes. The density results are presented in Table 10, and relative density was calculated as follows:
$$\rho_{\mathrm{rel}} = \frac{\rho_{\mathrm{measured}}}{\rho_{\mathrm{bulk}}} \times 100\%$$
where the theoretical bulk density $\rho_{\mathrm{bulk}}$ was considered to be 8.1 g/cm³, as mentioned in Section 3. As expected, density for most of the participants was above 99%, except for company D, again most likely because of the recoating issue that did not ensure proper distribution of the powder.
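A minimal sketch of the density evaluation is given below, combining the standard Archimedes relation (mass in air divided by the apparent mass loss in the immersion fluid, multiplied by the fluid density) with the relative density against the 8.1 g/cm³ bulk value. The immersion fluid, its density of 0.998 g/cm³ (water near room temperature), the function names, and the example masses are assumptions; the paper does not report these details.

```python
def archimedes_density(mass_air_g, mass_fluid_g, fluid_density_g_cm3=0.998):
    """Density (g/cm^3) from weighings in air and immersed in a fluid.

    fluid_density_g_cm3 defaults to water near 20 °C (an assumption).
    """
    if mass_air_g <= mass_fluid_g:
        raise ValueError("mass in air must exceed the immersed mass")
    return mass_air_g / (mass_air_g - mass_fluid_g) * fluid_density_g_cm3


def relative_density_percent(density_g_cm3, bulk_density_g_cm3=8.1):
    """Relative density in percent of the theoretical bulk density."""
    return density_g_cm3 / bulk_density_g_cm3 * 100.0


# Hypothetical weighings for one 15 x 15 x 10 mm sample (not measured data).
rho = archimedes_density(mass_air_g=18.10, mass_fluid_g=15.86)
print(f"density: {rho:.3f} g/cm^3")
print(f"relative density: {relative_density_percent(rho):.2f} %")
```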
For evaluation of the residual defects, the central larger parallelepiped was polished on two surfaces: XY (parallel to the building platform) and XZ (one of the growing sides). Using ImageJ software, as in [37], the number of defects was calculated to produce the results presented in Table 11 and plotted in Figure 26. The defects identified with this method were recognized as residual porosities or inclusions in the material. The participant that showed the highest percentage of residual defects on the surfaces was C (see Figure 27, showing two selected pictures for each face).
In some areas of the XZ surfaces, the sample from company C presented multiple defects that were considerably bigger than those in the samples from other companies and than those on the XY face of the same sample, which explains the very large deviations of values seen in Figure 26. To precisely define the source of these differences, further analysis would be required.
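The defect quantification described above amounts to thresholding a grayscale micrograph and counting the dark pixels. The sketch below illustrates that principle on a small nested list; it is not the ImageJ procedure itself, and the function name, the threshold of 100 on a 0-255 scale, and the tiny example image are assumptions chosen for illustration.

```python
def defect_area_percent(gray_image, threshold=100):
    """Percentage of pixels darker than `threshold`.

    gray_image: list of rows, each a list of 0-255 grayscale values.
    Pixels below the threshold are counted as residual defects
    (pores or inclusions appear dark on a mirror-polished surface).
    """
    total = sum(len(row) for row in gray_image)
    if total == 0:
        raise ValueError("image contains no pixels")
    dark = sum(1 for row in gray_image for px in row if px < threshold)
    return 100.0 * dark / total


# Hypothetical 4 x 5 micrograph patch with two dark (defect) pixels.
patch = [
    [210, 215, 208, 212, 209],
    [211, 40, 213, 210, 214],
    [209, 212, 211, 35, 210],
    [213, 210, 212, 211, 209],
]
print(f"defect area: {defect_area_percent(patch):.1f} %")
```

The result depends strongly on the chosen threshold and on image illumination, so the same settings would need to be used for all samples being compared.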

Mechanical Properties: Rockwell Hardness C
Rockwell hardness C (HRC) was measured on two sides, XY and XZ, of the as-built parts. The means charts are presented in Figure 28. As expected, no relevant differences were recorded from the conventional HRC value for non-heat-treated maraging steel grade 300 [39]. One-way ANOVA was conducted to identify any variation across jobs and positions, but the variation was negligible.

Build Speed Comparison
The average time required to print each job was provided by the participants as a reference for the building speed of the machine. Table 12 reports the times along with the number of lasers that the specific machine used for the job. It is clear that the build time was influenced mainly by the number of lasers and not by how new the machine was. Other factors that certainly influenced the build time are the process parameters chosen and the scan strategy.

Tall Parts Production
Tall specimens with the shape of tensile test samples were used to evaluate the capability of the machines to build relatively tall parts (more than 80 mm, with a 10 mm base diameter). Figure 29 shows selected pictures. Apart from company G, all participants managed to produce most of the samples.

Discussion and Conclusions
In this work, the authors present the results from an extensive benchmarking activity conducted on LPBF machines, comparing the performance of five state-of-the-art machines operated by their respective manufacturers and two state-of-the-art machines operated by their respective owners (end users), and contrasting the newest machine capabilities with a machine more than 10 years old. The old machine, identified with the letter H, was an EOS M270. Table 13 presents an overview of the results collected with related discussion.
Table 13. Results overview and discussion.

General Aspect | Specifications | Results and Discussion
Accuracy | Dimension of feature | Most features measured went beyond the limitations stated by machine manufacturers in Table 3.
Accuracy | Surface roughness | For most companies, surface roughness showed discontinuities in the Ra value among jobs, depending on the direction; in general, across all samples, Ra was never lower than 5.0 ± 0.3 µm (G-T-2).
Repeatability | Same job, different positions; different jobs | Color plot analysis showed that none of the companies achieved total repeatability among positions and/or jobs; however, differences between companies could be seen.
Complex feature | Spiral shape: mold's cooling channel | All companies managed to produce good quality spiral features, handling complex geometries well.
Homogeneity | Residual defects | Most companies had a density higher than 99%, as expected, in the Archimedes analysis; in the polished surface analysis some differences in residual defects could be seen, even if most participants did not exceed 0.1% of defects on both surfaces.
Residual stress | Part distortions | The thin wall showed, especially in some cases, large distortions coming from the process that would need to be considered for the production of end products with tight tolerances; during the repeatability analysis it was also possible to see the distortion of the overall spiral sample.
Mechanical properties | Rockwell hardness C | Measurements did not reveal any discontinuity in the products from any company, recording the expected HRC value.
Build speed | | The time required for production was an important input for assessing the readiness of LPBF for industrial production, as well as an indication of the recent direction of technology development.
Tall parts production | Tall samples | Tall parts production (with a small base) presented issues for only one company, confirming that it does not represent a current problem.
The outcomes of the benchmarking work can be summarized as follows:
• Considering the good results from company H, which used the oldest machine, it is evident that user experience and expertise play large roles in the final quality of the delivered product.
• The part design presented for realizing holistic benchmarking of LPBF machines was successful in achieving the goal.
• According to this work, even the newest machines did not outperform the older machine.
• One of the major trends revealed by analyzing the newest systems is the focus of machine manufacturers on building larger and faster machines; this goal is currently achieved in the industry by increasing the number of lasers, but little attention is given to other factors (such as higher repeatability, the ability to print tiny features, or improved surface roughness), as was observed during this work.
LPBF, and AM in general, is a technology that is trying to move beyond prototyping applications and create its own space in industrial production. To help this technology achieve this ambitious goal, it is extremely important that the process become more robust, in terms of better repeatability and less dependence on the user's experience.
An additional consideration can be made regarding the suitability of using a single benchmarking component to compare different AM technologies, particularly considering the multiple attempts made in recent years to standardize a design for additive manufacturing. Although the seven categories of AM technology all essentially build parts layer upon layer, they are profoundly different from each other with respect to their basic working principles and the materials they process (for example, in PBF, both metal and polymer powders can be processed), so it is very complex, if not impossible, to create a unique design without considering the final application of products built with the technology. Moreover, considering the fast development of all AM technologies, specific geometrical aspects such as the minimum feature size (both external and internal) and the components' overall size should always be updated. As a final reflection on how this work fits into the landscape of previous benchmarking research for AM, it is the first attempt to evaluate the influence of the specific machine used, across multiple manufacturers, from a holistic perspective, requiring the design of not only a single artifact but a complete job that integrates design-for-metrology concepts to ensure the measurability of all specimens and features.