## Abstract

## 1. Introduction

## 2. Parallel Computing

the number of processing units, or the size of the problem, depending on the hardware's parallel architecture. Note that only a small proportion of contemporary algorithms can be decomposed into completely independent pieces, which is the condition required to reach the theoretical linear speedup.
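To illustrate why linear speedup is rarely reached, the following minimal sketch evaluates Amdahl's law, a standard model that is not taken from this paper. It assumes a fixed problem size and a single hypothetical parameter, `parallel_fraction`, denoting the share of the runtime that can be distributed over the processing units; the function name and the sample values are illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Ideal speedup for a fixed problem size under Amdahl's law.

    parallel_fraction: share of the serial runtime that parallelises
                       perfectly (assumed value, between 0 and 1).
    n_units:           number of processing units (at least 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; the parallel part is divided by n_units.
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


# Only a fully independent workload scales linearly with the unit count.
fully_parallel = amdahl_speedup(1.0, 8)
# A workload with a 10% serial share falls well short of 10x on 10 units.
mostly_parallel = amdahl_speedup(0.9, 10)
```

Under these assumptions, linear speedup appears only when `parallel_fraction` equals 1; any serial remainder bounds the achievable speedup by the reciprocal of the serial share, regardless of how many units are added.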

## 3. Time Integration for Numerical Simulation Computation

), solving such problems in just a single time step [31,35,37,39], as schematically depicted in Figure 5.
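The contrast between stepwise explicit integration and a single implicit step can be sketched on a toy problem. The example below is not the mould-filling model of this paper: it assumes a scalar linear relaxation equation, du/dt = -k (u - u_inf), and all names (`k`, `u_inf`, `big_dt`) are hypothetical. It compares one large backward-Euler step with the forward-Euler scheme.

```python
def explicit_euler_step(u: float, dt: float, k: float, u_inf: float) -> float:
    """Forward Euler: the rate is evaluated at the current state."""
    return u + dt * (-k * (u - u_inf))


def implicit_euler_step(u: float, dt: float, k: float, u_inf: float) -> float:
    """Backward Euler: solves u_new = u + dt * (-k * (u_new - u_inf))."""
    return (u + dt * k * u_inf) / (1.0 + dt * k)


u0, u_inf, k = 0.0, 1.0, 1.0
big_dt = 1000.0

# One very large implicit step lands close to the steady state u_inf.
one_shot = implicit_euler_step(u0, big_dt, k, u_inf)

# The same step size overshoots badly with the explicit scheme, which is
# only stable for k * dt < 2 on this problem.
overshoot = explicit_euler_step(u0, big_dt, k, u_inf)

# The explicit scheme needs many small steps to approach the same state.
u = u0
for _ in range(1000):
    u = explicit_euler_step(u, 0.01, k, u_inf)
```

For this linear problem the implicit update has a closed form; in a realistic simulation each implicit step would require solving a (generally nonlinear) system of equations, so the saving in step count comes at a higher cost per step.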

## 4. Problem Transformation

## 5. Search-Space Reduction

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

**Figure 4.** An example of the temporal evolution of a mould-filling simulation adopting the explicit time-integration scheme.

**Figure 5.** An example of the one-shot computation of a mould-filling simulation adopting the implicit time-integration scheme.

**Figure 7.** An example of search-space reduction for a car hood model, with the red region indicating the restricted search space excluded from the search process.

**Figure 9.** Mesh-density refinement: global mesh-density refinement (entire domain) and local mesh-density refinement (small rectangular area).

