Recent works have shown that the task of path planning for MVS image acquisition can be efficiently addressed by applying submodularity to the candidate view selection [10,11], enabling approximation guarantees on the solution using greedy methods. With respect to our notation in Section 3.1, submodularity is a property of a set function $f:{2}^{\mathcal{P}}\to \mathbb{R}$ that assigns each subset $\mathcal{T}\subseteq \mathcal{P}$ a value $f(\mathcal{T})$.

$f(\cdot)$ is submodular if for every ${\mathcal{T}}_{1}\subseteq {\mathcal{T}}_{2}\subseteq \mathcal{P}$ and every element $\mathit{p}\in \mathcal{P}\setminus {\mathcal{T}}_{2}$ it holds that $\Delta (\mathit{p}|{\mathcal{T}}_{1})\ge \Delta (\mathit{p}|{\mathcal{T}}_{2})$, where the marginal gain of a viewpoint candidate $\mathit{p}$ toward a trajectory $\mathcal{T}$ is given by $\Delta (\mathit{p}|\mathcal{T}):=f(\mathcal{T}\cup \{\mathit{p}\})-f(\mathcal{T})$. An equivalent and more commonly used definition of submodularity for ${\mathcal{T}}_{1},{\mathcal{T}}_{2}\subseteq \mathcal{P}$ is given by $f({\mathcal{T}}_{1}\cup {\mathcal{T}}_{2})+f({\mathcal{T}}_{1}\cap {\mathcal{T}}_{2})\le f({\mathcal{T}}_{1})+f({\mathcal{T}}_{2})$. In other words, submodularity implies that adding an element to a small subset yields a large reward, while adding the same element to a larger subset leads to diminishing returns. In our path planning problem, as we add more viewpoint candidates to our trajectory, the marginal benefit of adding another viewpoint with large overlap to the set decreases. Adding the same viewpoint to a smaller set with limited coverage, on the other hand, yields a larger reward. This property hinders explicit modeling of stereo matching, as already pointed out by Hepp et al. [11]: adding a viewpoint to a smaller subset ${\mathcal{T}}_{1}$ which does not allow stereo matching yields less reward (zero, as the viewpoint is not matchable) than adding it to a larger set ${\mathcal{T}}_{2}$ to which it is matchable, which violates the submodularity condition. For that reason, a submodular function $f(\cdot)$ has to be defined which approximates stereo matching in terms of contributions from single views for 3D modeling. In addition, $f(\cdot)$ must be monotone non-decreasing, which means that adding more elements to the set cannot decrease its value. It has been shown that a simple greedy algorithm provides a solution to the NP-hard maximization of monotone submodular functions with a reasonable approximation guarantee [58]. Similar to Hepp et al. [11], we constrain our submodular objective function

$f(\mathcal{T})={\sum }_{{\mathit{s}}_{j}\in \mathcal{S}}\min \left(1,\frac{1}{v}{\sum }_{{\mathit{p}}_{i}\in \mathcal{T}}I({\mathit{p}}_{i},{\mathit{s}}_{j})\right)$ (7)

to limit the maximum reward for each surface point ${\mathit{s}}_{j}$ to 1, where the factor $1/v$ reduces the reward obtained from a single view in order to enforce at least $v$ different views capturing the same surface point. Since this objective function is monotone non-decreasing and submodular, we can transform the individual information rewards $I({\mathit{p}}_{i},{\mathcal{S}}_{{\mathit{p}}_{i}})$ from Equation (4) for all viewpoint candidates into additive information rewards ${I}_{i}^{\mathrm{add}}$ using the simple greedy algorithm given in Algorithm 1. Note that the submodular function in Equation (7) on its own does not explicitly incorporate stereo matching, as it only considers contributions of single viewpoints based on their distance and observation angles toward the object surface. However, stereo matching is approximated in two ways. First, the matchability graph ensures paths along which viewpoints exhibit large overlap with preceding viewpoints. Second, the greedy algorithm incorporates the observation angle segments by penalizing the information rewards of camera viewpoints that intersect already seen surface points in the same observation angle segments. This decreases the additive information rewards of cameras with only small parallax angles and therefore avoids ego-motions in the optimized path that are obstructive for stereo matching.
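To make the greedy reward assignment concrete, the following is a minimal sketch: a capped coverage objective in the spirit of Equation (7) is maximized greedily, and each viewpoint's marginal gain at pick time becomes its additive information reward ${I}_{i}^{\mathrm{add}}$. The viewpoint names, per-point rewards, and $v=2$ are illustrative assumptions, not the actual implementation.

```python
# Sketch: greedy extraction of additive information rewards from a capped
# (hence submodular) coverage objective. Toy data; all names are illustrative.
from typing import Dict, List

V_MIN = 2  # enforce at least v = 2 views per surface point (assumption)

# rewards[viewpoint][surface_point]: single-view information reward in [0, 1]
rewards: Dict[str, Dict[str, float]] = {
    "p0": {"s0": 0.9, "s1": 0.8},
    "p1": {"s0": 0.9, "s2": 0.8},
    "p2": {"s0": 0.9, "s1": 0.7, "s2": 0.7},
}
surface_points = {s for r in rewards.values() for s in r}

def objective(selected: List[str]) -> float:
    """f(T) = sum_j min(1, (1/v) * sum_{p in T} I(p, s_j)) -- capped coverage."""
    return sum(
        min(1.0, sum(rewards[p].get(s, 0.0) for p in selected) / V_MIN)
        for s in surface_points
    )

def greedy_additive_rewards(candidates: List[str]) -> Dict[str, float]:
    """Repeatedly pick the viewpoint with the highest marginal gain Delta(p | T)."""
    selected: List[str] = []
    additive: Dict[str, float] = {}
    remaining = set(candidates)
    while remaining:
        gains = {p: objective(selected + [p]) - objective(selected) for p in remaining}
        best = max(gains, key=gains.get)
        additive[best] = gains[best]  # marginal gain at pick time = I_best^add
        selected.append(best)
        remaining.remove(best)
    return additive

add = greedy_additive_rewards(list(rewards))
# The additive rewards telescope: they sum to the objective of the full set,
# and late picks earn less than their stand-alone value (diminishing returns).
assert abs(sum(add.values()) - objective(list(rewards))) < 1e-9
```

In this toy instance the last viewpoint chosen receives a smaller additive reward than its stand-alone objective value, because the per-point cap of 1 is already partially saturated by earlier picks.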

The greedy method iteratively computes the marginal reward of each viewpoint with respect to the current reconstructability of each surface point and adds the viewpoint with the highest additive information reward ${I}_{i}^{\mathrm{add}}$ to the output set. After each iteration, the reconstructability of all surface points is updated according to the rewards of the previously selected viewpoint. The marginal rewards of remaining viewpoints with intersection segments similar to those of already considered viewpoints are reduced, and these viewpoints are therefore less likely to be chosen in the next iteration. This procedure is repeated until the marginal rewards of all viewpoints have been considered and assigned to the output set. After executing the greedy method, each viewpoint candidate ${\mathit{p}}_{i}$ is coupled with a marginal information reward ${I}_{i}^{\mathrm{add}}$ representing its value for the reconstructability of the object. Roberts et al. [10] presented an efficient way to transform additive rewards into a standard additive orienteering problem, formulated as a mixed-integer programming (MIP) problem, which can be solved with off-the-shelf solvers.

An orienteering problem can be considered a combination of a traveling salesman problem and a knapsack problem. In other words, the optimization needs to find a closed path that maximizes the collected rewards under a time or travel budget constraint. However, a suitable travel budget is hard to predict for some scenes, and the optimization will almost always exhaust the full budget due to the purely additive nature of the rewards, which always increase the coverage of the model. Given an overestimated path length ${L}^{\mathrm{eucl}}$, a similar amount of total rewards can be obtained with a shorter trajectory by penalizing lengthy paths with a regularization factor $\lambda $. With respect to the semantic restriction on the airspace, the optimized trajectory must not exceed a user-defined path length ${L}^{\mathrm{sem}}$ above restricted objects. In summary, the optimization objective can be formulated as

$\underset{\mathcal{T}}{\max }\;{\sum }_{{\mathit{p}}_{i}\in \mathcal{T}}{I}_{i}^{\mathrm{add}}-\lambda {\sum }_{{\mathit{e}}_{k}\in \mathcal{E}}{w}_{k}^{\mathrm{eucl}}\quad \text{s.t.}\quad {\sum }_{{\mathit{e}}_{k}\in \mathcal{E}}{w}_{k}^{\mathrm{eucl}}\le {L}^{\mathrm{eucl}},\quad {\sum }_{{\mathit{e}}_{k}\in \mathcal{E}}{w}_{k}^{\mathrm{sem}}\le {L}^{\mathrm{sem}}$

where ${I}_{i}^{\mathrm{add}}$ denotes the additive rewards of the nodes along a path $\mathcal{T}$ with traversed Euclidean distance ${\sum }_{{\mathit{e}}_{k}\in \mathcal{E}}{w}_{k}^{\mathrm{eucl}}$ and traversed distance above semantically restricted airspaces ${\sum }_{{\mathit{e}}_{k}\in \mathcal{E}}{w}_{k}^{\mathrm{sem}}$. The regularization reduces the maximum path length ${L}^{\mathrm{eucl}}$ needed for similar optimization results and thus leads to shorter paths. The second constraint allows the optimization to select nodes in restricted but not prohibited airspaces, while encouraging the most efficient and shortest path through these conditionally accessible airspaces.
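For illustration, the following minimal sketch evaluates such an orienteering objective by brute-force enumeration on a toy graph rather than with a MIP solver; the node rewards, edge weights, budgets, and regularization factor are illustrative assumptions only.

```python
# Sketch: reward-maximizing closed path under a Euclidean travel budget and a
# semantic-airspace budget, with a length penalty lambda. Toy instance only;
# a real planner would hand the equivalent MIP to an off-the-shelf solver.
from itertools import permutations

I_add = {"a": 0.0, "b": 0.9, "c": 0.7, "d": 0.4}  # additive node rewards
# Symmetric edge weights: (Euclidean length, length above restricted airspace).
w = {
    ("a", "b"): (1.0, 0.0), ("b", "c"): (1.0, 0.5), ("c", "d"): (1.0, 0.5),
    ("d", "a"): (1.0, 0.0), ("a", "c"): (1.5, 0.0), ("b", "d"): (1.5, 1.0),
}
w.update({(v, u): d for (u, v), d in w.items()})  # make edges bidirectional

L_EUCL, L_SEM, LAM = 4.0, 1.0, 0.1  # travel budgets and regularization factor

def score(tour):
    """Collected rewards minus lambda * path length; None if a budget is violated."""
    edges = list(zip(tour, tour[1:] + tour[:1]))  # close the path
    eucl = sum(w[e][0] for e in edges)
    sem = sum(w[e][1] for e in edges)
    if eucl > L_EUCL or sem > L_SEM:
        return None
    return sum(I_add[n] for n in tour) - LAM * eucl

# Enumerate all closed paths starting at "a" that respect both budgets.
tours = [
    t
    for r in range(2, len(I_add) + 1)
    for t in permutations(I_add, r)
    if t[0] == "a" and score(t) is not None
]
best = max(tours, key=score)
```

On this instance the full tour through all four nodes stays exactly within both budgets and collects the highest penalized reward; tightening `L_SEM` would force the planner onto shorter detours around the restricted airspace.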