Analytical Solution for the Problem of Point Location in Arbitrary Planar Domains

Santos, Vitor

doi:10.3390/a17100444


Department of Mechanical Engineering, Institute for Electronics Engineering and Informatics of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal

Algorithms 2024, 17(10), 444; https://doi.org/10.3390/a17100444

Submission received: 27 July 2024 / Revised: 19 September 2024 / Accepted: 3 October 2024 / Published: 5 October 2024

(This article belongs to the Special Issue Numerical Optimization and Algorithms: 2nd Edition)


Abstract

This paper presents a general analytical solution for the problem of locating points in planar regions with an arbitrary geometry at the boundary. The proposed methodology goes beyond the traditional solutions, which are restricted to polygonal regions. The method originated from the explicit evaluation of the contour integral using the Residue and Cauchy theorems, then evolved toward a technique very similar to the winding number and, finally, simplified into a variant of the ray-crossing approach that is slightly more informed and more universal than the classic approach used for decades. The very close relation between both techniques also emerges during the derivation of the solution. The resulting algorithm is simpler and potentially faster than the current state of the art for point location in arbitrary polygons because it uses fewer operations. For polygonal regions, it is also applicable without further processing to special cases of degeneracy, it can be implemented in fully integer arithmetic, and it can be vectorized for parallel computation. The major novelty, however, is the extension of the technique to virtually any shape or segment delimiting a planar domain, be it linear, a circular arc, or a higher-order curve.

Keywords:

Cauchy theorem; residue theorem; Jordan curve theorem; generalized polygons; complex calculus; parametric curves; Bézier segments; winding number; negative real axis intersection

1. Introduction

Testing whether a point is enclosed by a polygon is a problem that programmers in many fields, such as computer graphics, geographic information systems, machine vision, or robotics, have certainly come across more than once, as many authors have asserted for a long time [1]. Indeed, the issue is very important and is the object of several concerns in Computational Geometry. It has already been studied and solved for practically all cases of polygonal regions, despite the fact that some earlier algorithms occasionally needed iterative computations, were ambiguous in special cases of on-boundary point locations, or showed numerical instability in situations of extreme proximity to borders and vertices. Nonetheless, these limitations are often not very relevant in practical implementations, and the algorithms are actually used in many applications in the software industry. What these algorithms are not aimed at, though, is the analytical evaluation of whether a point lies inside a more complex region bounded by mixed linear and curved segments. Actually, no published solution known to the author indicates how it is possible, with the very same base algorithm and in a straightforward and scalable procedure, to determine analytically, i.e., without linearizing the contour or performing geometric approximations, whether a point is inside or outside regions such as those illustrated in Figure 1.

2. Related Work

The first known reference to an algorithm for the point location problem dates back to 1962, by M. Shimrat [2]. That early algorithm had some limitations, pointed out soon after by R. Hacker [3] and later by W. Randolph Franklin, who also states that he delivered, in 1970, code written in FORTRAN for this same problem [4]. The books by Preparata and Shamos, published in 1985 [5], and by Sedgewick, published in 1990 [6], also covered the point-in-polygon problem and have been useful references for many other authors and programmers who have developed specific implementations with adaptations for their own set-ups. Even nowadays, it is easy to find papers in conferences and journals, in very diverse contexts, where this problem is addressed, even though the novelties stabilized long ago around the two main principles described further below.

Eventually, one of the most well-known references in the literature became the publication Point in Polygon Strategies, in 1994, by Eric Haines [7], followed some years later by the work of Hormann and Agathos, in 2001 [8], whose algorithm is the one used in Matlab for point location in arbitrary polygons.

A clarification is recommended here to better focus the scope of the paper. Actually, there are two kinds of challenges in this problem: the “point in region” problem, and the “point location in subdivisions of the plane”. Although related, there are some formal differences in these problems.

In the first problem, there is only one region, and we are testing whether a point lies inside or outside of it. In the second problem, there is a group of regions (typically a subdivision of the plane), and we need to identify which specific region the point belongs to. The first problem is to be solved using containment tests specific to the region (e.g., point-in-polygon), and it has constant-time complexity relative to the number of regions but may depend on the complexity of the region’s boundary. The second problem typically involves building a data structure for efficient spatial querying across many regions, with logarithmic or sublinear complexity relative to the number of regions in the subdivision. The second problem usually involves triangulated regions (using Delaunay triangulation), and several works stand out in that front, ranging from the fundamentals from Voronoi diagrams [9] and Delaunay triangulation, including later optimizations for better data structure management [10], up to actual point location algorithms in such triangular subdivisions of the plane [11,12].

This paper focuses specifically on the first problem and proposes an alternative solution by including additionally the ability to extend the regions to general curve-shaped boundaries. So, in this paper, the expression “point location” refers to assessing whether some point lies inside or outside a given planar region.

2.1. Ray-Crossing Approaches

For a long time, one of the most popular algorithms was the one based on the idea of defining an infinite line starting at the point being analyzed: the number of intersections with the polygon boundary gives indications for the solution, as described, for example, by Preparata and Shamos [5]. Briefly stated, a point belongs to a polygon if an infinite straight line starting on it intersects the polygon an odd number of times in one direction (Figure 2). This algorithm is also known as the ray-crossing [13], the crossing number [14] algorithm, and the even-odd rule algorithm [15].

One traditional issue with the ray-crossing algorithm was that it may require further tests to account for the cases where the intersections of the line coincide with one or more vertices or even with an entire edge of the polygon itself. Implementations of this algorithm can be found in many places, and the earlier most well-known implementations occur in the works of Sedgewick [6], Haines [7], and O’Rourke [16].
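As an illustration of the even-odd rule described above, the following minimal Python sketch casts a horizontal ray toward +x and counts edge crossings. It is not taken from any of the cited implementations: the function name `crossing_number_inside` and the half-open vertex rule are assumptions chosen for this example, and points lying exactly on the boundary are not handled consistently by such a simple version.

```python
def crossing_number_inside(point, polygon):
    """Even-odd test: cast a ray from `point` toward +x and count crossings.

    `polygon` is a list of (x, y) vertices; the closing edge is implicit.
    Hypothetical helper written for illustration only.
    """
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Half-open rule: an edge counts only if its endpoints lie on
        # different sides of the ray's height, so a vertex on the ray is
        # attributed to exactly one of its two edges.
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge meets the horizontal line y = py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside   # odd number of crossings means inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(crossing_number_inside((2, 2), square))
print(crossing_number_inside((5, 2), square))
```

The loop visits each edge once, which is the O(n) cost mentioned in the text; the extra vertex rule is an example of the additional steps needed for special cases.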

The ray-crossing algorithm is simple and of limited computational cost, O(n), but, for more functionality, it may require additional steps, which reduces its elegance. Additionally, applying the technique to generally shaped areas other than polygons, although possible in theory, could pose huge computational and practical difficulties since systems of simultaneous equations might need to be managed and solved. Moreover, although it was surely not intended for that purpose, it may fail for more general self-intersecting (non-simple) domains, as illustrated in Figure 3, where the point e would be evaluated as being “outside”, which contradicts both the mathematical concept based on the winding number [17] and the human perception of “insideness” on a folded (and/or twisted) polygon.

2.2. Winding Number Based Approaches

The problem of non-simple (self-intersecting) polygons has nonetheless been practically solved with algorithms based on the winding number [7], also named nonzero-rule algorithms, which have been refined and even made as efficient as the ray-crossing approach [14]. Their formulation, based on the sum of all the oriented angles centered at the test point and delimited by each pair of successive vertices of the polygon, yields the winding number around a point (the number of times a closed curve winds around a given point in the positive sense), from which one concludes whether the point is inside or outside the region. If

α_{1}, α_{2}, \dots, α_{M}

are the M oriented angles defined by the testing point and the M segments defined with the M vertices of a polygon, the winding number is calculated by

W_{N} = \frac{1}{2 π} \sum_{i = 1}^{M} α_{i}

. If

W_{N}

is equal to zero, then the number of winds of the polygon around the point is zero, meaning that the point is “outside” the polygon. Any other value for

W_{N}

will indicate that the point is “inside” the polygon.

Figure 4 illustrates the geometric principle of the procedure for a simple polygon.

However, the traditional winding number-based algorithms may be ambiguous if the point lies on the boundary, and nearly all implementations show some degree of false positives and missed detections, as demonstrated by S. Schirra in [18]. The technique described generically in Figure 4 is often seen in the literature as the calculation of the winding number contribution of each linear segment (

\bar{p_{1} p_{2}}

) relative to a point a, where all of the contributions are added up. Mathematically, the procedure is simple, as presented, for example, in [19]; for one segment and a given test point a, the contribution is:

\left\{ \begin{matrix} \vec{u} = p_{1} - a \\ \vec{v} = p_{2} - a \\ α = \operatorname{atan2} (\vec{u} \times \vec{v}, \vec{u} \cdot \vec{v}) \end{matrix} \right. .

(1)

Demonstration of (1) is straightforward because

\vec{u} \times \vec{v} = | \vec{u} | | \vec{v} | sin (α)

and

\vec{u} \cdot \vec{v} = | \vec{u} | | \vec{v} | cos (α)

. In this context

\vec{u} \times \vec{v}

denotes the third component of the cross product

(u_{x}, u_{y}, 0) \times (v_{x}, v_{y}, 0)

. Possible alternative notations would be to replace

\vec{u} \times \vec{v}

in (1) by

det \begin{matrix} [{\vec{u}}^{⊺} & {\vec{v}}^{⊺} & 1^{⊺}] \end{matrix}

, where

1 = \begin{matrix} [1 & 1 & 1] \end{matrix}

, or

{(\vec{u} \times \vec{v})}_{3}

or, of course,

(u_{x} v_{y} - u_{y} v_{x})

.
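A minimal Python sketch of the angle-sum procedure of expression (1) is given below, assuming the polygon is supplied as a list of (x, y) vertices with an implicit closing edge. The function name `winding_number` and the final rounding step are assumptions made for this illustration, and the test point is assumed not to lie on the boundary.

```python
import math


def winding_number(a, polygon):
    """Sum the oriented angles of expression (1) over all edges.

    Returns the winding number of the closed polyline `polygon` around `a`.
    Hypothetical helper; `a` is assumed to be off the boundary.
    """
    ax, ay = a
    total = 0.0
    n = len(polygon)
    for i in range(n):
        p1 = polygon[i]
        p2 = polygon[(i + 1) % n]
        ux, uy = p1[0] - ax, p1[1] - ay      # u = p1 - a
        vx, vy = p2[0] - ax, p2[1] - ay      # v = p2 - a
        cross = ux * vy - uy * vx            # third component of u x v
        dot = ux * vx + uy * vy              # u . v
        total += math.atan2(cross, dot)      # oriented angle of this edge
    return round(total / (2.0 * math.pi))


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
# Five-pointed star drawn by joining every second vertex of a pentagon.
pentagram = [(math.cos(math.radians(90 + 144 * k)),
              math.sin(math.radians(90 + 144 * k))) for k in range(5)]

print(winding_number((2, 2), square))      # interior point
print(winding_number((5, 2), square))      # exterior point
print(winding_number((0, 0), pentagram))   # centre of the star
```

For the star, every edge subtends 144° at the centre, so the five contributions add up to two full turns; a non-zero result classifies the centre as inside, in contrast to the even-odd rule.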

When a point falls over the boundary of a region, more than one interpretation may arise. An example of that occurs in Figure 5, where two adjacent regions

R_{1}

and

R_{2}

(delimited by different shades and without the border lines being drawn to visually enhance the effect) seem to share a point a because it falls precisely on the boundaries of the regions. This raises the question of whether point a belongs to both regions, to only one of them, or neither of them!

The possibility of belonging to neither of the regions must be discarded since a lies in an area covered by both regions together, and it would be absurd to affirm that a belongs to neither region while belonging to their union. So, there remains the possibility of belonging to only

R_{1}

or

R_{2}

, or both at the same time. This is where the variation of interpretation can take place. If regions are analyzed independently, and using the winding number approach, any conclusion (belonging or not to

R_{1}

or

R_{2}

) can be taken depending on the circulation sense of the polygonal lines, as described next. If, on the other hand, the two regions are taken as a degenerate self-intersecting polygon where the common edge is actually a double edge, it will be demonstrated further that a belongs naturally to the compound region (

R_{1} \cup R_{2}

), independently of the circulation sense.

As mentioned earlier, when the testing point is over the boundary, the winding number approach has a variable behavior, depending on the circulation sense, which can be illustrated with this simple example based on a triangle with vertices

p_{1} = (- 1, 0)

,

p_{2} = (0, - 1)

,

p_{3} = (0, 1)

and a test point

a = (0, 0)

. For the longest side of the triangle, which joins

p_{2}

and

p_{3}

and contains a, expression (1) is applied with

\vec{u} = p_{2} - (0, 0)

and

\vec{v} = p_{3} - (0, 0)

. Then,

\vec{u} \times \vec{v} = 1 \times 1 \times sin (π) = 0

and

\vec{u} \cdot \vec{v} = 1 \times 1 \times cos (π) = - 1

, and consequently

α = atan 2 (0, - 1) = π

. If we swap

p_{2}

with

p_{3}

, which is equivalent to reversing the circulation sense, the result is the same for this segment (because it still holds that

α_{23} = + π

), which indicates that the approach does not distinguish the sense of circulation (winding number contribution) along a segment when the point is over it. But, for the remaining two segments, the winding number contribution changes sign with the circulation sense and, therefore, the overall winding number changes! In conclusion, there are cases where the point will be considered inside and others outside the polygon, depending on the circulation sense. Figure 6 shows the elements that provide a simple example demonstrating this situation, which is occasionally mentioned by some authors, but also usually dismissed as irrelevant or avoided with specific argumentation, such as in [4].
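The dependence on the circulation sense can be reproduced with a short Python sketch that applies expression (1) to the triangle of this example, with the test point on the edge from (0, −1) to (0, 1). The helper name `winding_angle_sum` is an assumption for this illustration. Integer coordinates are used on purpose: with floating-point inputs, the cross product of the edge containing the point may evaluate to −0.0, for which `math.atan2` returns −π instead of +π.

```python
import math


def winding_angle_sum(a, vertices):
    """Sum of the oriented angles of expression (1), in radians."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        p, q = vertices[i], vertices[(i + 1) % n]
        ux, uy = p[0] - a[0], p[1] - a[1]
        vx, vy = q[0] - a[0], q[1] - a[1]
        # With integer inputs the cross product of the edge through `a`
        # is exactly 0 (never -0.0), so atan2 yields +pi for that edge.
        total += math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return total


p1, p2, p3 = (-1, 0), (0, -1), (0, 1)
a = (0, 0)                                   # lies on the edge p2-p3

ccw = winding_angle_sum(a, [p1, p2, p3])     # counter-clockwise circulation
cw = winding_angle_sum(a, [p1, p3, p2])      # clockwise circulation
print(ccw / (2 * math.pi), cw / (2 * math.pi))
```

In the counter-clockwise sense the three contributions are π/2, π, and π/2, giving a winding number of 1; in the clockwise sense they are −π/2, π, and −π/2, giving 0, so the same point is classified differently.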

Although apparently inconvenient and potentially unexplainable (the sense of circulation was not expected to determine the conclusion of the point inclusion test), this behavior is actually predicted by the Jordan curve theorem.

Indeed, an important concept related to the problem of point location is the well-known Jordan curve theorem (JCT), which, in a simple formulation such as the one in [20], states that “Every Jordan curve (a non-self-intersecting continuous loop in the plane) separates the plane into exactly two components”. This means that any point is either in one region or in the other and, implicitly, there is no distinct third region such as “the border”. So, formally, any result derived from the JCT, namely those based on the winding number, may be ambiguous in assessing to which of the two regions a point lying on the boundary belongs, and this is an unavoidable fact that must be managed when using techniques based on circulation or winding number. Actually, for simple curves, when the observer travels along the curve counter-clockwise, the points on this curve (boundary) will, by convention, belong to the region on the left-hand side; so, the identification of the “inside” region depends on the traveling (circulation) sense of the observer, and this is where the ambiguity may arise.

2.3. Generalization to Generic Shapes

The generalization of the problem to domains beyond polygons (simple or non-simple) can be carried out by using the winding number approach. However, the literature has not shown a definitive viable implementation of it, and that includes the case of S. Gatilov in [19], who gives a methodology to calculate the winding angle associated with a circular arc. Gatilov’s approach relies on the winding number technique described in (1), with its ambiguity for points over segments, which in the case of arcs would be points over the chord of the arc. That is not viable since the point will not even be on the boundary, and the calculation may fail depending on the sense of the circulation! Let us consider the example in Figure 7 with

a = (0, 0)

. The contribution of the arc

p_{1} ⌢ p_{2}

alone to the winding number depends on the circulation sense, as happens with any segment, but, in the case of this arc, the value is not symmetric: it is

π

or 0! Since the contributions of the other two segments are symmetric depending on the circulation sense (

\pm π

), this all adds up to ambiguity in detecting whether a lies inside or outside the region, which is absurd because point a is clearly inside the region.

Actually, the problem of point location in regions that may have curve segments was first mentioned, although not actually solved in practice, by Edelsbrunner and Maurer in 1981 [21], and the general problem is seen by Kirkpatrick in 1983 as a challenge that may require a completely new approach:

[…] While the algorithm [proposed by Kirkpatrick] can be adapted to certain other situations (for example, when all internal regions are star shaped), the general problem of optimal search in subdivisions formed from arbitrary curve segments may require a totally new approach. ([22])

The winding number approach has been a well-accepted solution for the problem of point location, and the winding number of a polygonal curve Γ around some point a is easy to calculate by determining the number of intersections of Γ with the real (horizontal) axis, as first pointed out in [23] and later implemented into a specific algorithm by [1]. However, the literature adds nothing specific concerning the winding number of other curves for the purpose of point inclusion, although algorithms exist to calculate the winding number contribution of Bézier curves [24].

2.4. Background and Scope of This Paper

This paper demonstrates and gives implementations of a unified approach that covers the traditional solutions based on ray-crossing or winding numbers, solves the cases where those algorithms may find limitations or ambiguities, and extends and demonstrates a practical, viable solution for generic contours that can be expressed in a parametric form, which is its major novelty.

In the early 1990s, information was not so easy to obtain or track and, faced with the need, the author proposed in his Ph.D. thesis [25] an approach for the problem of point location in polygons based on the Cauchy and Residue theorems, which is closely related to the winding number approach, although the latter was unfamiliar to the author at the time.

The algorithm operates by explicitly evaluating contour integrals in the complex plane of a specially chosen function. The solution, found by accidental simplification (it was not fully demonstrated then), proved to work well with a very simple and elegant mathematical formulation. Only later, when formal full mathematical demonstrations were needed, was the current solution developed, as presented in this paper. Hence, it can be said that the solution originated from the use of the Cauchy and Residue theorems and only later took its present form. The next sections describe in detail the foundations of the technique, including algorithmic implementations, followed by illustrative results and conclusions.

Code for this research and demonstrations of the algorithms described in the remainder of this paper are publicly available in a GitHub repository (https://github.com/vitoruapt/PointInclusion (accessed on 2 October 2024)).

3. Base Approach and Related Theorems

The relative location of a point in a polygon, or actually in any other form of closed contour in the plane, can be explored through the results provided by the Residue theorem and the Cauchy theorem, which are introduced next.

Theorem 1 (Residue Theorem).

If

Γ

is a closed curve of the plane, and

f (z)

is an analytic function in the domain

R (Γ)

enclosed by Γ (boundary included), except at a finite number of points inside R, then:

\oint_{Γ} f (z) d z = 2 π j \sum_{k} r_{k}

(2)

where

r_{k}

are the residues of

f (z)

at the singular points inside R (with

j = \sqrt{- 1}

).

The residue r of a complex function

f (z)

at a point a, which is a pole of order n of f, is calculated by

r = \frac{1}{(n - 1)!} lim_{z \to a} \frac{d^{n - 1}}{d z^{n - 1}} [{(z - a)}^{n} f (z)]

. A complex function is analytic on a region R if it is complex differentiable at every point in R. A common alternative designation, preferred by some mathematicians, is “holomorphic function”.

Another important fact is given by the Cauchy theorem (or Cauchy–Goursat theorem, since Edouard Goursat (1858–1936) proved this theorem without imposing the condition that

f^{'} (z)

should be continuous inside

R (Γ)

and on the boundary, which was an additional condition left by Cauchy [26]) that states the following:

Theorem 2 (Cauchy–Goursat Theorem).

If

R (Γ)

is a simply or multiply connected region whose boundary Γ is sectionally smooth, and if

f (z)

is analytic in

R (Γ)

, then the following result applies:

\oint_{Γ} f (z) d z = 0 .

(3)

The previous statement is actually the formalization of the common expression stating that a line integral of any analytic function is independent of the path. So, the advantage here is to use functions that are non-analytic on the regions of interest. After selecting a function that is undefined at only one point inside

R (Γ)

, and whose residue is non-null at that point, the Residue and Cauchy theorems support the following reasoning:

f (z) \text{ is analytic inside } Γ \Rightarrow \oint_{Γ} f (z) \, d z = 0

(4)

or, by contraposition:

\oint_{Γ} f (z) \, d z \neq 0 \Rightarrow f (z) \text{ is not analytic inside } Γ .

(5)

Asserting that

f (z)

is not analytic inside

Γ

means that

f (z)

has at least one discontinuity inside

Γ

. Consequently, and being aware that a is the unique pole (discontinuity) of

f (z)

in

R (Γ)

, i.e., the unique point where it is not defined, a function such as

f (z) = \frac{1}{z - a}

allows us to state this final and most valuable conclusion: if the contour integral of

f (z)

along a given contour

Γ

is not null, then its pole lies in the region delimited by that contour, or, in mathematical terms:

\left\{ \begin{matrix} a \text{ is the pole of } f (z) \\ \oint_{Γ} f (z) \, d z \neq 0 \end{matrix} \right. \Leftrightarrow a \in R (Γ) .

(6)
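The conclusion in (6) can be checked numerically before any closed-form evaluation is attempted. The sketch below approximates the contour integral along a closed polygon with the midpoint rule; the names `contour_integral` and `integral_around`, the square contour, and the number of steps are all assumptions made for this illustration. The closed-form evaluation pursued in the rest of the paper avoids this kind of numerical quadrature.

```python
import math


def contour_integral(f, vertices, steps=2000):
    """Midpoint-rule approximation of the integral of f along a closed polygon.

    `vertices` are complex numbers; the closing edge is implicit.
    """
    total = 0j
    n = len(vertices)
    for i in range(n):
        z1, z2 = vertices[i], vertices[(i + 1) % n]
        dz = (z2 - z1) / steps
        for k in range(steps):
            total += f(z1 + (k + 0.5) * dz) * dz   # f at the sub-segment midpoint
    return total


square = [0 + 0j, 4 + 0j, 4 + 4j, 0 + 4j]          # counter-clockwise contour


def integral_around(a):
    """Contour integral of 1 / (z - a) along the square."""
    return contour_integral(lambda z: 1.0 / (z - a), square)


print(abs(integral_around(2 + 2j) - 2j * math.pi))  # pole enclosed
print(abs(integral_around(5 + 2j)))                 # pole outside
```

With the pole enclosed, the value approaches 2πj times the residue; with the pole outside, the integrand is analytic inside the contour and the value approaches zero.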

It is now necessary to draw some considerations to select an appropriate function and the procedures to apply and take advantage of the previous result. Let a boundary

Γ

be decomposed into M parts

Ω_{k}

with

1 \leq k \leq M

. After reminding ourselves that a contour integral is no more than a line integral where the integration path is a closed curve, the following is clear:

\oint_{Γ} f (z) \, d z = \sum_{k = 1}^{M} \int_{Ω_{k}} f (z) \, d z .

(7)

Although other possibilities exist, we can adopt for

f (z)

the one suggested earlier and presented in (8):

f (z) = \frac{1}{z - a} .

(8)

The residue of

f (z)

at pole a has the value 1. The option for

f (z)

given in (8) is due to the fact that the integrand function should be simple, for reasons of computational cost, and that it must possess a non-null residue, since a null residue would lead to inconclusive results. For example, all functions of the type

f (z) = \frac{1}{{(z - a)}^{n}}

, with

n > 1

, or the type

f (z) = \frac{z}{{(z - a)}^{n}}

, with

n > 2

, have a residue of 0 at point a, making them useless for this purpose. On the other hand, there are many other possibilities with a non-null residue at point a (1, or −1 in the case of the exponential form), such as the following:

f (z) = \frac{z}{{(z - a)}^{2}}

,

f (z) = \frac{1}{1 - e^{(z - a)}}

or

f (z) = \frac{1}{sin (z - a)}

. However, in all these (usable) alternatives, there would always be the need to calculate a complex logarithm for the anti-derivative. Therefore, the simplest and most obvious case found was precisely the function in expression (8), but, indeed, the principle is independent of the function, which is, however, useful only if it has poles and non-null residues.
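The residues of the candidate functions can be estimated numerically as a sanity check. The sketch below divides the integral over a small circle around a by 2πj, using the trapezoidal rule; the helper name `residue`, the circle radius, the sample count, and the value chosen for a are assumptions made for this illustration.

```python
import cmath
import math


def residue(f, a, radius=0.5, samples=4096):
    """Estimate the residue of f at the isolated singularity a.

    Computes (1 / (2*pi*j)) times the integral of f over a circle of the
    given radius centred at a, sampled at equally spaced angles.
    """
    total = 0j
    dtheta = 2.0 * math.pi / samples
    for k in range(samples):
        w = radius * cmath.exp(1j * k * dtheta)   # w = z - a on the circle
        total += f(a + w) * (1j * w * dtheta)     # dz = j * w * dtheta
    return total / (2j * math.pi)


a = 1.0 + 2.0j
candidates = {
    "1/(z-a)": lambda z: 1 / (z - a),
    "1/(z-a)^2": lambda z: 1 / (z - a) ** 2,
    "z/(z-a)^3": lambda z: z / (z - a) ** 3,
    "z/(z-a)^2": lambda z: z / (z - a) ** 2,
    "1/(1-exp(z-a))": lambda z: 1 / (1 - cmath.exp(z - a)),
    "1/sin(z-a)": lambda z: 1 / cmath.sin(z - a),
}
for name, f in candidates.items():
    print(name, residue(f, a))
```

Under this check, the second and third candidates give a null residue, while the others give a residue of unit magnitude; the exponential form evaluates to −1 rather than 1, which is still non-null and therefore still usable.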

4. Calculation of the Contour Integral

If

f (z)

is analytic in any path between

z_{1}

and

z_{2}

, then the fundamental theorem of calculus along curves applies:

\int_{Γ_{z_{1} z_{2}}} f (z) d z = F (z_{2}) - F (z_{1}) .

(9)

Expression (9) states a very well-known and useful technique, where

F (z)

is the anti-derivative of

f (z)

; however, it can be directly applied only if

F (z)

is continuous along the integration path and

f (z)

is defined over the integration path; this latter condition may not be satisfied when the point being tested is over the contour. Nonetheless, based on the definition of the Cauchy Principal Value (CPV), the contour integral can still be calculated even if the function being integrated has a pole on the contour. That can be performed by using expression (10), assuming that the pole is enclosed by a circle of radius

δ

, and the fraction of the path outside that circle is named

Γ (\bar{δ})

, where the function

f (z)

is always integrable independently of how small

δ

becomes [27]:

\int_{Γ} f (z) d z = lim_{δ \to 0} \int_{Γ (\bar{δ})} f (z) d z .

(10)

Having solved the possible problem of the non-definition of

f (z)

in a finite number of points, there remains only the issue of the continuity of

F (z)

, which is going to be managed with a methodology derived further.

4.1. The Complex Logarithm

The anti-derivative of the chosen function (8) is the logarithm of a complex number, which, formally, is a multi-valued result. By definition, the logarithm of a complex number z is the number w that satisfies the expression

z = e^{w}

. After this definition, it is then clear that the logarithm of z,

Ln (z)

, is multi-valued; if

z = r e^{j θ}

, using

j = \sqrt{- 1}

, and being true that

r e^{j θ} = r e^{j (θ + 2 k π)} = e^{ln r + j (θ + 2 k π)}

, the following arises:

Ln (z) = ln r + j (θ + 2 k π), k = 0, \pm 1, \pm 2, \dots

(11)

It must be noted that the logarithm of a complex number is written here with a capital letter to stress its multi-valued nature. With multi-valued expressions, it is not possible to establish comparisons or other relational operations without precautions. The meaning of

Ln (z)

depends on the branch where the logarithm is being evaluated, that is, the value of k in (11). However, the logarithm can be restricted to a single branch, called the principal value, where the imaginary part is unique, hence with only one possible argument, called the principal argument (often designated by the mathematical

arg ()

function). The principal value is obtained by making

k = 0

in (11) resulting in:

ln (z) = ln r + j arg (z) = ln r + j θ .

(12)

When restricted to the principal values, logarithms exhibit some particularities like, for example,

ln (e^{- j \frac{5}{4} π})

equals

\frac{3}{4} π j

rather than

- \frac{5}{4} π j

, and, as well,

ln (e^{- j π}) = + π j

and not

- π j

! As expected, the restriction of the function output to an interval will certainly cause discontinuity of that function and, indeed,

ln (z)

is discontinuous along the whole non-positive real axis because it is clear that:

\left\{ \begin{matrix} \lim_{θ \to 0^{-}} \ln e^{j (π + θ)} = \lim_{θ \to 0^{+}} \ln e^{j (π - θ)} = + π j \\ \lim_{θ \to 0^{+}} \ln e^{j (π + θ)} = - π j \end{matrix} \right. .

(13)
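The behavior of the principal value across the negative real axis can be observed with Python's standard `cmath.log`, which returns an imaginary part in (−π, π]. This is only an illustrative sketch; the offset `delta` is an arbitrary small value chosen here, and the points are built explicitly to avoid rounding exactly at the axis.

```python
import cmath
import math

# Principal value: -5*pi/4 is folded back into (-pi, pi] as +3*pi/4.
w = cmath.log(cmath.exp(-1.25j * math.pi))
print(w.imag / math.pi)

# Points just above and just below the negative real axis, as in (13).
delta = 1e-9
above = cmath.log(cmath.exp(1j * (math.pi - delta)))   # imaginary part near +pi
below = cmath.log(cmath.exp(1j * (math.pi + delta)))   # imaginary part near -pi
jump = above - below
print(jump)

# Exactly on the axis (imaginary part +0.0), the principal argument is +pi.
on_axis = cmath.log(complex(-1.0, 0.0))
print(on_axis)
```

The difference `jump` approaches 2πj as `delta` shrinks, which is the discontinuity that the following derivation has to account for.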

These issues of the principal value and continuity also affect the logarithms of products (or quotients) when expanded to sums (or subtractions) of logarithms; subtraction or addition of complex logarithms may not always give a result with the imaginary part restricted to the principal value and that is why the following definitions are stated for the logarithm of a quotient, or equivalently, subtraction of logarithms:

\ln \frac{z_{1}}{z_{2}} = \left\{ \begin{matrix} \ln z_{1} - \ln z_{2} & \Leftarrow - π < \arg z_{1} - \arg z_{2} \leq + π \\ \ln z_{1} - \ln z_{2} - 2 π j & \Leftarrow \arg z_{1} - \arg z_{2} > + π \\ \ln z_{1} - \ln z_{2} + 2 π j & \Leftarrow \arg z_{1} - \arg z_{2} \leq - π \end{matrix} \right.

(14)

and, conversely,

\ln z_{1} - \ln z_{2} = \left\{ \begin{matrix} \ln \frac{z_{1}}{z_{2}} & \Leftarrow - π < \arg z_{1} - \arg z_{2} \leq + π \\ \ln \frac{z_{1}}{z_{2}} + 2 π j & \Leftarrow \arg z_{1} - \arg z_{2} > + π \\ \ln \frac{z_{1}}{z_{2}} - 2 π j & \Leftarrow \arg z_{1} - \arg z_{2} \leq - π \end{matrix} \right. .

(15)
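The three cases of (14) can be compared against a direct evaluation of the principal logarithm of the quotient. In the sketch below, `log_quotient` is a hypothetical helper written for this check, and the sample pairs are arbitrary values chosen to exercise each case.

```python
import cmath
import math


def log_quotient(z1, z2):
    """Principal logarithm of z1 / z2 built from ln z1 - ln z2 as in (14)."""
    diff = cmath.phase(z1) - cmath.phase(z2)     # arg z1 - arg z2
    value = cmath.log(z1) - cmath.log(z2)
    if diff > math.pi:
        value -= 2j * math.pi                    # second case of (14)
    elif diff <= -math.pi:
        value += 2j * math.pi                    # third case of (14)
    return value


pairs = [
    (1 + 1j, 2 + 0.5j),      # difference of arguments within (-pi, pi]
    (-1 + 1j, -1 - 1j),      # difference above +pi
    (-1 - 1j, -1 + 1j),      # difference below -pi
]
for z1, z2 in pairs:
    print(log_quotient(z1, z2), cmath.log(z1 / z2))
```

For each pair, the two printed values agree up to rounding, whereas the plain difference of logarithms would be off by 2πj in the last two cases.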

4.2. Line Integral for Linear Segments

Returning to the genesis of the algorithm, the essence of the entire procedure is then to evaluate explicitly

\oint_{Γ} \frac{d z}{z - a}

and verify whether its value is null or not. It will not be null if a lies within or over the contour

Γ

. By applying (9) to a segment of a path between points

z_{1}

and

z_{2}

, the following result would potentially occur:

\int_{[z_{1} z_{2}]} \frac{d z}{z - a} = ln (z_{2} - a) - ln (z_{1} - a) .

(16)

4.2.1. Concerns When Calculating the Line Integral

Despite the elegant solution given by (16), there remains the uncertainty of its applicability due to the possible non-continuity of

ln (z)

along the path

[z_{1} z_{2}]

. That non-continuity will occur precisely when the path from

z_{1}

to

z_{2}

crosses the negative real axis (NRA), that is, in a neighborhood where z on the path passes from the 2nd quadrant (

Q 2

) to the 3rd quadrant (

Q 3

) or vice-versa. The following notation is adopted to define the situations of a path crossing the NRA:

(17)

So, the problem is then to calculate the line integral in such conditions of discontinuity, which is explained next. The discontinuity of

ln (z)

occurs when z crosses the negative real axis, but this effect may occur with other functions apart from the logarithm. So, let us assume that some generic function f can be parametrized on

θ

(to ease the practical calculation, knowing that

z = r e^{j θ}

), and its definite integral is to be evaluated from

(π - α)

to

(π + β)

, where

α

and

β

are positive angles smaller than

π

radians, i.e.,

α, β \in] 0, π [

. With

δ

denoting an infinitesimal positive value, the following can be written:

\int_{π - α}^{π + β} f (θ) d θ = \int_{π - α}^{π - δ} f (θ) d θ + \int_{π - δ}^{π + δ} f (θ) d θ + \int_{π + δ}^{π + β} f (θ) d θ .

(18)

From the three terms on the right side of (18), the first and the last are always defined for any value of

δ

(for the functions involved in this analysis). So, taking the limit when

δ \to 0

will also force the middle term to be zero because the following holds true for any function

f (θ)

that is integrable in a neighborhood of θ = π:

lim_{δ \to 0} \int_{π - δ}^{π + δ} f (θ) d θ = \int_{π}^{π} f (θ) d θ = 0 .

(19)

Therefore, with

F (θ)

denoting the anti-derivative of

f (θ)

, expression (18) can be developed into (20):

\begin{aligned} \int_{π - α}^{π + β} f (θ) \, d θ &= \lim_{δ \to 0^{+}} \left[ F (θ) \big|_{π - α}^{π - δ} + F (θ) \big|_{π + δ}^{π + β} \right] \\ &= \lim_{δ \to 0^{+}} \left[ F (π - δ) - F (π - α) + F (π + β) - F (π + δ) \right] \\ &= F (π + β) - F (π - α) + \lim_{δ \to 0^{+}} \left[ F (π - δ) - F (π + δ) \right] . \end{aligned}

(20)

Expression (20) shows that, when the integration path crosses the negative horizontal axis, the line integral of a function f has an additional term of

{lim}_{δ \to 0^{+}} [F (π - δ) - F (π + δ)]

, whose concrete value depends obviously on the continuity of

F (θ)

. If

F (θ)

happens to be continuous on the value

θ = π

, then that additional term is null, and we fall back to the fundamental theorem (9). Returning to the case under discussion, using

F (θ) = ln (r e^{j θ})

and taking into account (13), the following holds:

\lim_{δ \to 0^{+}} \left[ \ln (r e^{j (π - δ)}) - \ln (r e^{j (π + δ)}) \right] = [\ln (r) + π j] - [\ln (r) - π j] = + 2 π j,

(21)

which, by recovering the original variable (

z = r e^{j θ}

), yields:

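\int_{[z_{1} z_{2}]} \frac{d z}{z} = \ln z_{2} - \ln z_{1} + 2 π j \quad \text{(path crossing the NRA from } Q 2 \text{ to } Q 3 \text{)} .

(22)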

A similar reasoning is easy to replicate for the case where the sense of the path is reversed, e.g., swap the integral limits in (20), that is, when the crossing occurs from 3rd quadrant (

Q 3

) to 2nd quadrant (

Q 2

), and the result for that case is:

\int_{[z_{1} z_{2}]} \frac{d z}{z} = ln (z_{2}) - ln (z_{1}) - 2 π j .

(23)

In summary, it can be concluded that the line integral for a path between two points

z_{1}

and

z_{2}

of

\frac{1}{z - a}

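is given by:

\int_{[z_{1} z_{2}]} \frac{d z}{z - a} = \{\begin{matrix} ln (z_{2} - a) - ln (z_{1} - a) + 2 π j & \Leftarrow \text{NRA crossing from } Q 2 \text{ to } Q 3 \\ ln (z_{2} - a) - ln (z_{1} - a) - 2 π j & \Leftarrow \text{NRA crossing from } Q 3 \text{ to } Q 2 \\ ln (z_{2} - a) - ln (z_{1} - a) & \Leftarrow \text{no NRA crossing} \end{matrix} .

(24)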

Considering points

z_{1}

and

z_{2}

to be such that

\{\begin{matrix} (z_{1} - a) \in Q 2 & \Rightarrow + \frac{π}{2} < arg (z_{1} - a) < π \\ (z_{2} - a) \in Q 3 & \Rightarrow - π < arg (z_{2} - a) < - \frac{π}{2} \end{matrix},

(25)

makes it simple to assert that, for this case, we have:

lim (max {arg (z_{2} - a)} - min {arg (z_{1} - a)}) = - π,

(26)

and consequently, the following always holds true:

arg (z_{2} - a) - arg (z_{1} - a) < - π .

(27)

Similarly, if we have the reverse situation,

\{\begin{matrix} (z_{1} - a) \in Q 3 \\ (z_{2} - a) \in Q 2 \end{matrix},

(28)

then, the following can also be demonstrated:

arg (z_{2} - a) - arg (z_{1} - a) > + π .

(29)

In summary, when NRA intersection occurs from

Q 2

to

Q 3

, we have the first case in (24), which, by using the equality of case 3 from (15), as certified by (27), allows us to state the following:

\begin{matrix} \int_{[z_{1} z_{2}]} \frac{d z}{z - a} & = ln (z_{2} - a) - ln (z_{1} - a) + 2 π j \\ = (ln \frac{z_{2} - a}{z_{1} - a} - 2 π j) + 2 π j = ln \frac{z_{2} - a}{z_{1} - a} \end{matrix}

(30)

Similarly, when NRA intersection occurs from

Q 3

to

Q 2

, we have the second case in (24), which allows us to state the following by using the equality of case 2 from (15), as confirmed by (29):

\begin{matrix} \int_{[z_{1} z_{2}]} \frac{d z}{z - a} & = ln (z_{2} - a) - ln (z_{1} - a) - 2 π j \\ = (ln \frac{z_{2} - a}{z_{1} - a} + 2 π j) - 2 π j = ln \frac{z_{2} - a}{z_{1} - a} \end{matrix}

(31)

In conclusion, and by comparing expressions (24) and (15), the following final result is obtained for any

a \neq z_{1}, z_{2}

:

\int_{[z_{1} z_{2}]} \frac{d z}{z - a} = ln \frac{z_{2} - a}{z_{1} - a} .

(32)

A non-null value for expression (32) as the indicator of point inclusion in polygons is precisely the methodology first used by the author in [25] (pp. 149–152), in the early 1990s, as mentioned before, and whose validity has just been formally demonstrated.

4.2.2. Simplifying the Calculation of the Contour Integral

The result expressed by (32) establishes the full methodology because it gives, directly and straightforwardly, the circulation along a path made up of linear segments. In the particular case of

Γ

defined as M linear segments, and with

f (z) = \frac{1}{z - a}

, expression (7) with the results of expression (32) yields the following generic result:

\begin{matrix} \oint_{Γ} f (z) d z = & ln \frac{z_{2} - a}{z_{1} - a} + ln \frac{z_{3} - a}{z_{2} - a} + \dots + ln \frac{z_{M} - a}{z_{M - 1} - a} + ln \frac{z_{1} - a}{z_{M} - a} \\ = & \sum_{n = 1}^{M} ln \frac{z_{(n mod M) + 1} - a}{z_{n} - a} . \end{matrix}

(33)
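Expression (33) can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the function name `contour_integral` and the sample square are illustrative assumptions, and `cmath.log` supplies the principal-branch logarithm.

```python
import cmath

def contour_integral(polygon, a):
    """Sum ln((z_next - a) / (z_n - a)) over a closed polygon, as in (33).

    `polygon` is a list of complex vertices and `a` is the test point;
    both names are hypothetical and used only for this illustration.
    """
    total = 0j
    m = len(polygon)
    for n in range(m):
        z_n = polygon[n]
        z_next = polygon[(n + 1) % m]   # wraps back to the first vertex
        total += cmath.log((z_next - a) / (z_n - a))
    return total

# Counter-clockwise square used as a sample contour (an assumption).
square = [0 + 0j, 4 + 0j, 4 + 4j, 0 + 4j]
inside = contour_integral(square, 1 + 1j)
outside = contour_integral(square, 6 + 1j)
```

For this counter-clockwise square, `inside` is close to 2πj and `outside` is close to 0, up to floating-point rounding.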

Furthermore, the complex logarithms can be avoided altogether, because the possible variations of the contour calculation occur only in the imaginary part, as is intuitive from the previous statements and easily demonstrated by expression (34). Indeed, with

z_{k} = r_{k} e^{j θ_{k}}

and

z_{m} = r_{m} e^{j θ_{m}}

and therefore

ℜ \{ln \frac{z_{k}}{z_{m}}\} = ℜ \{ln \frac{r_{k}}{r_{m}} + j arg \frac{z_{k}}{z_{m}}\} = ln \frac{r_{k}}{r_{m}}

, the calculation of the real part of (33) for the polygon with M vertices is always null, as shown next:

\begin{matrix} ℜ \{\oint_{Γ} f (z) d z\} & = ℜ \{ln \frac{z_{2}}{z_{1}} + \dots + ln \frac{z_{M}}{z_{M - 1}} + ln \frac{z_{1}}{z_{M}}\} \\ = ln \frac{r_{2}}{r_{1}} + \dots + ln \frac{r_{M}}{r_{M - 1}} + ln \frac{r_{1}}{r_{M}} \\ = ln (\frac{r_{2}}{r_{1}} \frac{r_{3}}{r_{2}} \dots \frac{r_{M}}{r_{M - 1}} \frac{r_{1}}{r_{M}}) = ln (1) = 0 . \end{matrix}

(34)

Finally, it may then be stated that the circulation can be obtained by simply calculating

arg ()

operations using (35):

\oint_{Γ} f (z) d z = j \sum_{n = 1}^{M} arg \frac{z_{(n mod M) + 1} - a}{z_{n} - a} .

(35)

Recalling the definition in (14), and with

z_{1} = r_{1} e^{j θ_{1}}

and

z_{2} = r_{2} e^{j θ_{2}}

, the definition of the

arg ()

function for a quotient of two complex numbers is, of course, given by (36):

arg \frac{z_{1}}{z_{2}} = \{\begin{matrix} θ_{1} - θ_{2} & \Leftarrow - π < θ_{1} - θ_{2} \leq + π \\ θ_{1} - θ_{2} - 2 π & \Leftarrow θ_{1} - θ_{2} > + π \\ θ_{1} - θ_{2} + 2 π & \Leftarrow θ_{1} - θ_{2} \leq - π \end{matrix} .

(36)

Resorting to the actual geometric problem, and denoting the testing point by

a = (x_{a}, y_{a})

, and a generic point to delimit segments by

P_{n} = (x_{n}, y_{n})

, it is immediate to state that

θ_{n} = arg (P_{n} - a) = atan 2 (y_{n} - y_{a}, x_{n} - x_{a})

(37)

and, hence, Algorithm 1 can be promptly established. Moreover, since the circulation of

f (z)

always has a null real part (34), only the imaginary part is relevant and, therefore, the calculation given by (35) is preferred, being simpler than the calculation using expression (33).

Algorithm 1: Polygon inclusion test using real arctan ()
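The steps behind (35) to (37) can be sketched in Python as follows. This is an illustration in the spirit of Algorithm 1, not a transcription of the original listing; the names `wrap_angle` and `winding_number_atan2` are hypothetical, and the sketch assumes that the test point does not coincide with a vertex.

```python
import math

def wrap_angle(d):
    """Map an angle difference into ]-pi, +pi], following the cases in (36)."""
    if d > math.pi:
        return d - 2 * math.pi
    if d <= -math.pi:
        return d + 2 * math.pi
    return d

def winding_number_atan2(polygon, a):
    """Winding number of `polygon` (list of (x, y) tuples) around point `a`."""
    xa, ya = a
    # Expression (37): one atan2 evaluation per vertex.
    theta = [math.atan2(y - ya, x - xa) for x, y in polygon]
    m = len(theta)
    # Sum of wrapped angle differences between consecutive vertices.
    total = sum(wrap_angle(theta[(n + 1) % m] - theta[n]) for n in range(m))
    return round(total / (2 * math.pi))

square = [(0, 0), (4, 0), (4, 4), (0, 4)]   # counter-clockwise sample polygon
```

A non-zero return value indicates inclusion; the sign reflects the orientation of the vertex sequence.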

In conclusion, in a closed contour, the net value of (35) will be null if, along the path, there are no transitions between quadrants

Q 2

and

Q 3

, but if they occur, additional terms of

\pm 2 π j

must be added at each time, as expressed by (36). So, the following proposition can be asserted:

Theorem 3 (NRA intersection theorem).

In a region enclosing a singularity of a function

f (z)

, the value of the contour integral (hence, the point inclusion) depends exclusively on the number and sense of path transitions over the negative real axis (NRA).

This section has demonstrated how the contour integral is indeed the translation of the winding number approach introduced earlier as one of the well-known techniques for point location in polygons. Moreover, as stated in the NRA intersection theorem, it is also clear that the issues of the winding number calculation (angle contributions) occur only when the path crosses the NRA, which resembles very much the ray crossing method.

4.2.3. Points on the Boundary of Polygons with Undefined Orientation

Algorithm 1 is valid for all combinations of point a and polygon

{P_{1}, P_{2}, \dots, P_{M}}

, and requires no post-processing; however, there can be ambiguity in a special situation described next, which is nevertheless easily solvable with a more general approach.

The special situation just mentioned concerns the points over the boundary when the orientation of the polygon (coarsely, the order in which the vertices are browsed) is not the direct orientation, which may give ambiguous results, as expected by the Jordan curve theorem, and mentioned at the beginning of this paper. Indeed, the contour integral

\oint_{Γ} \frac{d z}{z - a}

, when a is over the path

Γ

, can result either in 0 or

2 π j

or, equivalently, a winding number of 0 or 1. Additionally, if the polygon is not simple (i.e., is self-intersecting), the polygon orientation is not adjustable (by reversing the order of vertices, for example) because the orientation changes locally after each self-intersection of the boundary line! For those cases, the contour integral is nonetheless always defined, and Figure 8 shows several examples of points near a self-intersecting polygon.

In Figure 8, several observations immediately arise: the first is that clear “outside” and “inside” points are always well determined (winding number of zero or non-zero, respectively). A second observation is that points on the boundary may yield a winding number of 0 or 1, depending on the circulation sense.

There is also a particular case of a point being twice on the boundary (in the intersections), which is always considered “inside”, no matter the sense of circulation. The mentioned ambiguity is, however, solvable because it is detectable: when the polygon orientation is in doubt, and points on the boundary must also be covered, a double calculation can be performed, one with the given vertex order and the other with the reverse order; if one calculation yields zero and the other non-zero, then the point is on the boundary, and a further decision can be taken (normally, to consider it included in the polygon). This avoids the need for a special analysis to detect points on the boundary, which would be inadvisable for generally shaped contours, as addressed later in this paper.

The challenge of points over the boundary is recurrent in all algorithms in the literature. Even one of the most common approaches in the state of the art (the already cited work of Hormann and Agathos [8] used in Matlab) has a special treatment for the points on the boundary that the authors call boundary version and is included in Algorithm 7 in their paper, and deals efficiently with points on boundaries of polygons.

In conclusion, if points anywhere on the boundary need to be precisely assessed, and it is not known whether the sequence of points

{P_{1}, P_{2}, \dots, P_{M}}

of a simple polygon is given in the positive circulation sense, or whether the polygon is simple at all, then Algorithm 1 must be called twice, the second time with the polygon vertices in reverse order. If the two calls yield different results, then the point lies on the boundary, and it is (usually) to be considered inside the polygon (Algorithm 2).

Algorithm 2 requires double the computational resources since it calls the actual calculation algorithm twice. This is required only for the absolutely general case of totally random polygons and testing points, although total randomness will hardly generate situations of points exactly over the border. In any case, alternatives are presented later.

Despite being effective and elegant, Algorithm 1 is computationally demanding in practice because one trigonometric function (arctan) is evaluated for each side of the polygon, making it less efficient than other alternatives. That, too, is a reason for changing the paradigm of the calculation. The new paradigm aligns with the existing state-of-the-art algorithms for polygons but also goes beyond them, both by lowering the computational cost and by extending to shapes other than polygons with linear segments. That paradigm is the parametric definition of the integration path.

Algorithm 2: Unambiguous test for generalized polygons
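The double evaluation described above can be sketched as follows. This is a hedged Python illustration rather than the original listing of Algorithm 2; the helper `winding` repeats the arctan-based calculation so that the block is self-contained, the names and the three string labels are assumptions, and the test point is assumed not to coincide with a vertex.

```python
import math

def winding(polygon, a):
    """Arctan-based winding number, with differences wrapped as in (36)."""
    xa, ya = a
    theta = [math.atan2(y - ya, x - xa) for x, y in polygon]
    total = 0.0
    m = len(theta)
    for n in range(m):
        d = theta[(n + 1) % m] - theta[n]
        if d > math.pi:
            d -= 2 * math.pi
        elif d <= -math.pi:
            d += 2 * math.pi
        total += d
    return round(total / (2 * math.pi))

def classify(polygon, a):
    """Evaluate the polygon in both vertex orders and compare the outcomes."""
    forward = winding(polygon, a) != 0
    backward = winding(polygon[::-1], a) != 0
    if forward != backward:
        return "boundary"      # the two orientations disagree
    return "inside" if forward else "outside"

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

With this sample square, a point on the bottom edge such as `(2, 0)` yields different results for the two orientations and is therefore reported as a boundary point, which a caller would normally treat as included.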

4.3. Parametric Definition of the Integration Path

An observation that can be made about the technique described in the previous sections is that calculating the line integral for a function with a non-continuous anti-derivative requires care because the sense in which the integration path crosses the NRA results in different contributions, as seen. For linear segments, the method just described circumvents that issue; however, for the non-straight paths discussed later, that may not be so clear. Hence, let us adopt an alternative approach to obtain (32) by using parametric integration paths, because that explicitly defines a path from a start to an end, along with the variation of the parameter that describes the curve. Let us use a parameter t that varies from 0 to 1 to cover the full path segment, and let us start with the case of linear segments. With

z_{1}

and

z_{2}

the extremities of a linear path, any point of the path from

z_{1}

to

z_{2}

is given by:

z = z_{1} + t (z_{2} - z_{1}), 0 \leq t \leq 1 .

(38)

After adjusting the integration variable,

d z = (z_{2} - z_{1}) d t

, the line integral can now be calculated as follows:

\int_{[z_{1} z_{2}]} \frac{d z}{z - a} = \int_{0}^{1} \frac{(z_{2} - z_{1}) d t}{z_{1} + t (z_{2} - z_{1}) - a}

(39)

that results in:

{[ln (z_{1} + t (z_{2} - z_{1}) - a)]}_{0}^{1} = ln (z_{2} - a) - ln (z_{1} - a),

(40)

which is the same as (16), but shares the same risks of discontinuity because the anti-derivative is still a logarithm, despite the fact that the integration variable is now t. Therefore, the solution is exactly the same as before, given in (22) and (23). In the case of NRA crossing, and depending on the sense (from Q2 to Q3, or from Q3 to Q2), there will be a contribution of

\pm j 2 π

to the contour integral, or

\pm 1

in the NRA crossings counter.
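The relation between (39), (40), and (32) can be illustrated numerically. The sketch below is an assumption-laden Python example, not part of the paper: it approximates the parametric integral (39) with a midpoint rule (the function name and step count are arbitrary choices) for a sample segment that crosses the NRA from Q2 to Q3.

```python
import cmath

def segment_integral_numeric(z1, z2, a, steps=20000):
    """Midpoint-rule approximation of (39) over the parameter t in [0, 1]."""
    dz = z2 - z1
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += dz / (z1 + t * dz - a)
    return total * h

# Sample segment (an assumption): starts in Q2, ends in Q3, seen from a = 0.
z1, z2, a = -2 + 1j, -1 - 3j, 0j
numeric = segment_integral_numeric(z1, z2, a)
closed_form = cmath.log((z2 - a) / (z1 - a))       # expression (32)
naive = cmath.log(z2 - a) - cmath.log(z1 - a)      # expression (40), uncorrected
```

Here `numeric` agrees with `closed_form`, while `naive` differs from both by 2πj, which is the correction associated with a crossing from Q2 to Q3.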

Although it could have been stated and adopted earlier, working with a segment

\bar{z_{1} z_{2}}

with respect to a point a is actually equivalent to working with the segment

\bar{(z_{1} - a) (z_{2} - a)}

with respect to

a = 0

. So, from now onwards, when z points are used, it is assumed that they result from subtracting a from the original points, i.e., (

z - a

), and that the reference point in analysis is the system origin

(0, 0)

.

5. NRA Crossing in Parametric Paths

We can now settle the procedures to calculate the contour integral along a closed path or, equivalently, the inclusion of a point by a region based solely on the intersection of that path with the NRA.

5.1. NRA Crossings and Their Sense for Linear Segments

What we seek to detect is whether there is an intersection and what its sense is. This is straightforward to calculate from (38), which can be expanded into:

\{\begin{matrix} ℜ {z} = x = x_{1} + t (x_{2} - x_{1}) & \Leftarrow 0 \leq t \leq 1 \\ ℑ {z} = y = y_{1} + t (y_{2} - y_{1}) & \Leftarrow 0 \leq t \leq 1 \end{matrix} .

(41)

An NRA intersection occurs at some point

z_{c}

when

ℑ {z_{c}} = 0

and

ℜ {z_{c}} \leq 0

in

z_{c} = z_{1} + t_{c} (z_{2} - z_{1}), 0 \leq t_{c} \leq 1

, i.e., intersection occurs for the value

t_{c}

that verifies:

\{\begin{matrix} y_{1} + t_{c} (y_{2} - y_{1}) = 0 \\ x_{1} + t_{c} (x_{2} - x_{1}) \leq 0 \end{matrix} \Leftrightarrow \{\begin{matrix} t_{c} = \frac{y_{1}}{y_{1} - y_{2}} \\ t_{c} (x_{1} - x_{2}) \geq x_{1} \end{matrix} .

(42)

Notice that the case

y_{1} = y_{2}

represents a horizontal segment, and “intersection” only occurs for

y_{1} = y_{2} = 0

. However, that is not actually an intersection of the NRA, as explained later, and this special situation is to be discarded early in the algorithms (also to avoid the division by 0). Apart from that case, if a valid

t_{c}

is found, then it is necessary to determine the sense of the NRA crossing, and that is easily verifiable by the derivative of the imaginary component

ℑ {z}

in relation to t. If the derivative is positive for that value of

t_{c}

, it means that the

y = ℑ {z}

component is increasing with t, therefore the path segment is passing from

Q 3

to

Q 2

, that is, there is a contribution of

- 1

to the intersection counting (

I_{C}

), or

+ 1

for the reverse case (

Q 2

to

Q 3

). This can be formalized as:

I_{C} = - sgn (\frac{d y}{d t} |_{t = t_{c}}) .

(43)

For linear segments, the previous expression is constant for any point of the segment where the NRA crossing might occur:

I_{C} = - sgn (y_{2} - y_{1}) = sgn (y_{1} - y_{2}) .

(44)
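Conditions (42) and the sense rule (44) translate into a few lines of code. The following Python sketch is illustrative only (the function name and return convention are assumptions); it takes segment endpoints already expressed relative to the test point and does not yet apply any special treatment to segments that merely start or end on the NRA.

```python
def nra_crossing(p1, p2):
    """Return (t_c, I_C) if the segment p1 -> p2 meets the NRA, else None.

    Points are (x, y) tuples given relative to the test point (origin).
    """
    x1, y1 = p1
    x2, y2 = p2
    if y1 == y2:
        return None                  # horizontal segment: discarded early
    t_c = y1 / (y1 - y2)             # first condition of (42)
    if not 0 <= t_c <= 1:
        return None                  # the supporting line meets y = 0 outside the segment
    if t_c * (x1 - x2) < x1:
        return None                  # intersection lies on the positive real axis
    i_c = 1 if y1 > y2 else -1       # expression (44): sgn(y1 - y2)
    return t_c, i_c
```

For instance, the segment from `(-2, 1)` to `(-1, -3)` crosses the NRA at `t_c = 0.25` going from Q2 to Q3, so its contribution is `+1`; traversed in the opposite direction, the contribution is `-1`.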

If, by chance, the testing point happens to be an intersection point of the polygon boundary with the NRA, either along the path segment (

t_{c} \neq 0

and

t_{c} \neq 1

) or strictly on a vertex (

t_{c}

is 0 or 1), that can be promptly detected because its real coordinate would be zero:

x_{1} + t_{c} (x_{2} - x_{1}) = 0

. This indicates that the point is over the boundary, and a decision can be made immediately without further calculations. Formally: if

x_{1} = t_{c} (x_{1} - x_{2}) = t_{c} Δ x

, then the point is on the boundary, and can be dispatched earlier in the algorithm.

There are also the particular cases of

t_{c} = 0

or

t_{c} = 1

; these represent the cases where the segment starts or ends exactly on the NRA. In those cases, this means that another segment starts or ends there too! To leave the contributions separate, but allowing them to add up or cancel, for those cases, the intersection contribution calculated above should be halved, that is, a

\pm 0.5

contribution for segments that originate or end on the NRA is to be applied. In other words, each crossing of the NRA can be accounted with

\pm 1

, but reaching or leaving it can be accounted as a half-crossing, or

\pm 0.5

. Formally, this can be stated as in (45):

t_{c} = 0 \lor t_{c} = 1 \Rightarrow I_{C} = - 0.5 sgn (\frac{d y}{d t} |_{t = t_{c}}) .

(45)

Figure 9 illustrates the situation of

I_{C}

values for several testing points (

a_{1}, a_{2}, \dots, a_{9}

) in a polygonal self-intersecting region; the associated table details the partial and total values of NRA intersection counting. In each case (point

a_{i}

), the imaginary axis would be a vertical line that crosses the enclosed domain exactly at each of the

a_{i}

points.

In summary, if a path does not intersect the NRA, the calculation of the line integral is simple (actually, not necessary in practice) because there are no discontinuities in the process and, in the end, all terms of the closed path cancel out; only the intersections of the path with the NRA affect the contour evaluation or the winding number. Also, notice that horizontal segments (

y_{1} = y_{2}

) never intersect the NRA; they may even lie over it but never actually cross it, so their contribution to the

I_{C}

is null, as can be seen in the segment that contains the

a_{2}

point in Figure 9.

The major novelties in this approach for polygonal regions are that the existence of NRA crossing can be determined by detecting the value of a single parameter t and the sense of crossing can be determined by a simple comparison operation.

5.2. The Case of Points on the Border, Again!

In Figure 9, specific situations arise if the testing point is exactly an NRA crossing point, which is to say, the testing point is on the border. Once again, this situation prevents a single algorithm from being applied seamlessly to points on the border. For example, still in Figure 9, if the testing point occurs on the first crossing point (the one on the left of

a_{1}

), since it has no other crossing on its left, this implies being outside the region; so, its own intersection contribution (

I_{C}

) should be counted to consider it inside. But, if this reasoning is applied to the 4th intersection (between

a_{3}

and

a_{4}

), then the overall contribution at that point (

+ 1 - 0.5 + 0.5 - 1

) would result in 0 and that point would be considered outside the region, contrary to the intention of considering it as inside. The conclusion is that there is no single operation that decides whether a point on the border is always considered inside or outside the region: this is unavoidable, as supported by the Jordan curve theorem presented earlier, where the sense of crossing (circulation) affects the winding contribution, and therefore the conclusion about inclusion. The solution, as already mentioned, is to check those situations before further calculation.

5.3. Universal Algorithm for Arbitrary Polygons

All of the previous procedures are integrated in Algorithm 3, which shows all the steps to test polygonal inclusion of any point a on any polygon with M vertices (for any

M > 0

, but only

M > 2

is really meaningful), normal or self-intersecting.

Algorithm 3: Universal inclusion test in arbitrary polygons
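The procedure of Section 5.1 can be assembled into a short routine. The Python sketch below follows the steps described in the text (sign test, calculation of the crossing parameter, boundary detection, half crossings) but is not a transcription of the original listing, so its line numbers do not match those cited for Algorithm 3; the names are hypothetical, and exact comparisons with zero assume coordinates for which the arithmetic is exact.

```python
def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def point_in_polygon(polygon, a):
    """NRA-crossing inclusion test; boundary points are reported as inside."""
    xa, ya = a
    pts = [(x - xa, y - ya) for x, y in polygon]   # move the test point to the origin
    m = len(pts)
    count = 0.0
    for n in range(m):
        x1, y1 = pts[n]
        x2, y2 = pts[(n + 1) % m]
        if sign(y1) == sign(y2):
            continue                      # no crossing; also covers y1 == y2
        t_c = y1 / (y1 - y2)              # safe: y1 != y2 at this point
        x_c = x1 + t_c * (x2 - x1)
        if x_c > 0:
            continue                      # intersection on the positive real axis
        if x_c == 0:
            return True                   # test point lies on the boundary
        contribution = sign(y1 - y2)      # expression (44)
        if y1 == 0 or y2 == 0:
            contribution *= 0.5           # half crossing, expression (45)
        count += contribution
    return count != 0

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

With this sample square, interior points, points on an edge, and vertices all return `True`, while exterior points, including those aligned with a horizontal edge, return `False`.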

The computational cost of Algorithm 3 is

O (M)

, and the arithmetic operations involved are only sums (or subtractions) and at most M floating point divisions per polygon, although optimizable, as described further.

In Algorithm 3, the points on the border, including vertices, are considered inside the polygon, but, if wanted otherwise, this can be easily modified to exclude them. It can be done either with an optional initial checking for a being one of the vertices or during the normal flow of the algorithm after test from line 12 that results from the NRA crossing conditions in expression (42). The decision on line 13 of Algorithm 3 could be set to false if the point on the boundary is defined as being outside.

Notice also that the apparent risk of division by zero in line 11 of the algorithm never occurs because of early analysis in line 6 of the algorithm; indeed, if

Y_{1} == Y_{2}

, we also have

sgn (Y_{1}) == sgn (Y_{2})

and the algorithm skips to the next segment since there is no contribution of the current linear segment to the value of

I_{C}

. Illustration of the functionality and results of Algorithm 3 can be found in the GitHub repository indicated earlier.

5.4. Multi-Ring Polygons and Multi-Polygons

There is a category of planar domains commonly used in Geographical Information Systems (GIS), namely to represent contours of countries and other geographical entities, that accounts both for multiple separate polygons (multi-polygons) and for polygons that include “holes” defined by one (or more) separate closed lines fully included in the outer polygon; each of these separate contours is named a ring. A simple polygon has one ring, and a polygon with two “holes” has three rings [28]. There are also strict cases of multi-ring multi-polygons where rings nest inside each other. All these situations can be handled with the same concepts described earlier using the sense of circulation of a contour. Figure 10 illustrates three situations of multi-ring polygons.

The figure also illustrates the need for defining correctly the sequence of the polygon vertices in order to apply the same concepts of winding sense and, therefore, obtain the number of NRA intersections. Extension of the algorithm to this type of region is achieved by simply applying the procedure to each of the 2, 3, or more contours (rings) and summing the separate results of NRA intersections.
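Summing the per-ring results can be sketched as follows. This Python example is an illustration under stated assumptions: the outer ring is given counter-clockwise and the hole clockwise, the names `ring_count` and `in_multi_ring` are hypothetical, and the test point is assumed not to lie on any ring.

```python
def sign(v):
    return (v > 0) - (v < 0)

def ring_count(ring, a):
    """Signed NRA intersection count of one closed ring around point a."""
    xa, ya = a
    count = 0.0
    m = len(ring)
    for n in range(m):
        x1, y1 = ring[n][0] - xa, ring[n][1] - ya
        x2, y2 = ring[(n + 1) % m][0] - xa, ring[(n + 1) % m][1] - ya
        if sign(y1) == sign(y2):
            continue
        x_c = x1 + (y1 / (y1 - y2)) * (x2 - x1)
        if x_c > 0:
            continue
        half = 0.5 if (y1 == 0 or y2 == 0) else 1.0
        count += half * sign(y1 - y2)
    return count

def in_multi_ring(rings, a):
    """Sum the counts of all rings; a non-zero total means inclusion."""
    return sum(ring_count(r, a) for r in rings) != 0

outer = [(0, 0), (10, 0), (10, 10), (0, 10)]   # counter-clockwise
hole = [(3, 3), (3, 7), (7, 7), (7, 3)]        # clockwise
```

A point inside the hole, such as `(5, 5)`, collects `+1` from the outer ring and `-1` from the hole, so the total is zero and the point is reported as outside the domain.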

5.5. Optimizing Algorithm 3

Algorithm 3 can be computationally alleviated by avoiding unnecessary divisions in case the NRA intersection does not occur. Indeed, as required by expression (42), for an NRA intersection to occur we must have the following:

0 \leq \frac{y_{1}}{y_{1} - y_{2}} \leq 1

. In other words, if either

\frac{y_{1}}{y_{1} - y_{2}} > 1

or

\frac{y_{1}}{y_{1} - y_{2}} < 0

, then the NRA intersection does not occur, and no more calculations need to be performed. So, a few comparisons (3 at most) can be made before asserting the need to perform the division to find the actual

t_{c}

of NRA intersection. Since

y_{1} - y_{2} = Δ y

, consider the following:

\frac{y_{1}}{Δ y} < 0 \Rightarrow (Δ y > 0 \land y_{1} < 0) \lor (Δ y < 0 \land y_{1} > 0)

(46)

\frac{y_{1}}{Δ y} > 1 \Rightarrow (Δ y > 0 \land y_{2} > 0) \lor (Δ y < 0 \land y_{2} < 0),

(47)

which can be combined. We are allowed to state that there is no NRA intersection if the following occurs:

(Δ y > 0 \land (y_{1} < 0 \lor y_{2} > 0)) \lor (Δ y < 0 \land (y_{1} > 0 \lor y_{2} < 0)) .

(48)

Hence, just before line 11 of Algorithm 3 a few lines could be added to optimize the procedure, demonstrating also that at most 3 comparisons are needed to conclude whether there is NRA intersection or not before calculating

t_{c} = Y_{1} / Δ Y

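, as proposed next: if Δ Y > 0, the segment is skipped when Y_{1} < 0 \lor Y_{2} > 0; else, the segment is skipped when Y_{1} > 0 \lor Y_{2} < 0.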

Previous expressions can be further optimized because when they are reached, it is already known that

sgn (Y_{1}) \neq sgn (Y_{2})

. Therefore,

Y_{1} < 0 \lor Y_{2} > 0

is equivalent to only one of them, e.g.,

Y_{1} < 0

. A similar reasoning applies to the “else”. This further reduces the number of comparisons from 3 to 2.

In case there is an intersection (i.e., the previous tests concluded that there may be an NRA intersection because

0 \leq t_{c} \leq 1

), it is then necessary to calculate its precise value to test the second requirement of expression (42), that is, check the condition

t_{c} (x_{1} - x_{2}) \geq x_{1}

, which at first glance seems unavoidable to calculate through the division. But, actually, that division operation can be avoided and replaced by two multiplications and a comparison, as is shown next: being

x_{1} - x_{2} = Δ x

and

t_{c} Δ x \geq x_{1} \Leftrightarrow \frac{y_{1}}{Δ y} Δ x \geq x_{1}

, if

Δ y > 0

then, for NRA crossing, we need to have:

y_{1} Δ x \geq x_{1} Δ y

, otherwise, we need to have the following:

y_{1} Δ x \leq x_{1} Δ y

.

The conclusion is summarized by expression (49):

(Δ y > 0 \land y_{1} Δ x \geq x_{1} Δ y) \lor (Δ y < 0 \land y_{1} Δ x \leq x_{1} Δ y) .

(49)

This last simplification not only avoids the mathematical division but also allows the algorithm to fully operate in integer arithmetic representation, which would improve its usage in an approach based purely on the integer representation of points/vertices.

Yet, in line 17, Algorithm 3 still requires to check whether

t_{c}

is equal to 0 or to 1 to conclude if the “crossing” is actually a “touching”, which would imply an adjustment on the intersection counter. But that checking turns out redundant because those cases correspond to

y_{1} = 0

or

y_{2} = 0

, that is, there is no need to calculate

t_{c}

for those cases either.

In summary, the algorithm in its full extension, covering all these optimizations, is detailed as Algorithm 4. It can be noted, in line 32, that a multiplication by

0.5

is present, which, although being a statistically rare situation, could be modified for further optimization; that would be to count each full crossing as an integer of value

\pm 2

, by, for example, changing line 30 into

I_{P} \leftarrow sgn (Δ Y) + sgn (Δ Y)

, and a half crossing being half of that value (

\pm 1

) with the final conclusions still holding, and integer computations could be present throughout the entire algorithm.

In conclusion, besides the simplicity of the formulation, the algorithm has also the virtue of a straightforward operation with degeneracy situations, such as point over segments or over vertices, or even as null length segments, which could happen in situations of rounding the point representation to integers.

5.6. Vectorization of Algorithm 4

Although not necessarily a breakthrough in reducing computational costs, Algorithm 4 can be easily vectorized for parallel computing, at least for the test of the inclusion of one point. Figure 11 shows a fully operational excerpt of Matlab code (listing) that accepts a matrix P with the polygon vertices (with the last vertex replicated from the first) and a point A. The procedure is vector based and operates on an entire polygon “at once”.

Despite the possibility of some finer tuning, the vectorized approach, nevertheless, forces all the operations to be performed through the entire chain of tests. Some tests could be dismissed early in a sequential approach, but the vector approach forces all the operations for all cases. Hence, it may not be suited for better performance in all computational setups.

Algorithm 4: An optimized version of Algorithm 3
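The optimizations of Section 5.5 can be combined in a division-free routine. The Python sketch below is an illustration of those ideas, not the original listing of Algorithm 4 (its line numbers therefore do not correspond to those cited in the text). It assumes integer coordinates, counts a full crossing as ±2 and a half crossing as ±1 so that no fractional value is needed, and uses hypothetical names throughout.

```python
def sign(v):
    return (v > 0) - (v < 0)

def point_in_polygon_int(polygon, a):
    """Division-free NRA-crossing test; boundary points count as inside."""
    xa, ya = a
    m = len(polygon)
    count = 0                                  # in units of half crossings
    for n in range(m):
        x1, y1 = polygon[n][0] - xa, polygon[n][1] - ya
        x2, y2 = polygon[(n + 1) % m][0] - xa, polygon[(n + 1) % m][1] - ya
        if sign(y1) == sign(y2):
            continue                           # no intersection with the real axis
        dx, dy = x1 - x2, y1 - y2              # dy != 0 is guaranteed here
        lhs, rhs = y1 * dx, x1 * dy            # two multiplications replace the division
        if lhs == rhs:
            return True                        # intersection at the origin: boundary point
        if (dy > 0 and lhs < rhs) or (dy < 0 and lhs > rhs):
            continue                           # intersection on the positive real axis
        # Touching (y1 == 0 or y2 == 0) counts half of a full crossing.
        count += sign(dy) if (y1 == 0 or y2 == 0) else 2 * sign(dy)
    return count != 0

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

Because only additions, multiplications, and comparisons are involved, the outcome is exact for integer inputs, including degenerate cases such as test points on edges or vertices.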

5.7. Comparison to the State-of-the-Art Algorithm

For several years, Matlab has been using a version based on the algorithm from Hormann and Agathos [8], valid for point location in arbitrary polygons. According to the original paper, that algorithm is based on three steps, which, on the optimized variant, includes the following operations:

Evaluation of the determinant: performs 2 multiplications and 1 subtraction per segment;
Quadrant classification: uses 6 comparisons per vertex;
Determination of winding number: performs 2 subtractions/sums and up to 4 + 1 comparisons per vertex.

Algorithm 4 has more steps (though simpler) than Algorithm 3 and requires:

Initial test: 1 sign comparison per vertex;
Coordinate differences: 2 per vertex;
Y coordinate conditions: up to 3 comparisons per vertex;
Test NRA: 2 multiplications and 2 comparisons per vertex;
Test special cases end points: 2 comparisons with zero;
Crossing counter: 1 addition per NRA crossing.

The comparison is given side-by-side in Table 1, where it can be seen that Algorithm 4 has fewer comparisons (1 + 2 + 2 + 2 = 7 vs. 10 + 1 = 11), and potentially far fewer if tests dismiss early; there is a similar number of subtractions/sums: 2 + 1 vs. 3; the multiplications are equal in number, but the advantage is that in Algorithm 4 they are needed only when x-axis intersection occurs. In the previous analysis,

2 M

subtractions were omitted; they correspond to the relocation of the polygon around the point

(0, 0)

, which is an operation common to all.

An illustrative example of simple benchmarking is shown in Figure 12 with a polygon with more than 2200 sides and thousands of testing points in two situations: on the left, 5000 completely random points, and on the right more than 2200 points near the border. Results are shown for Algorithm 3 and the state-of-the-art algorithm in Matlab [8]. The result shown is the average of 100 runs and had the same outcome in both algorithms. In both cases, Algorithm 3 proposed in this paper outperforms by a factor of about 5.

6. NRA Crossing and Sense for Circular Arcs

For circular arcs, a few changes occur relative to linear segments. Most apparent is the fact that circular arcs can cross the NRA more than once or may even be tangent to it (Figure 13). Moreover, the parametric definition of arcs is a little more elaborate than for straight lines, as described next.

6.1. Parametric Expression for Circular Arcs

Any point of a circular arc with center at

z_{0}

, and evolving in the positive sense (counter-clockwise—

C C W

), between point

z_{1} = z_{0} + r e^{j θ_{1}}

and point

z_{2} = z_{0} + r e^{j θ_{2}}

is expressed by

z = z_{0} + r e^{j θ}

, where

θ

covers the interval defined by

θ_{1}

and

θ_{2}

, and

r = |z_{0} - z_{1}| = |z_{0} - z_{2}|

.

To simplify the parametrization and unify the procedures used for the linear segment, the following parametric representation is used:

z = z_{0} + r e^{j [θ_{1} + t (θ_{2} - θ_{1})]}, 0 \leq t \leq 1 .

(50)

However, to ensure an unambiguous interpretation of the path described for the positive sense, it must be

θ_{1} \leq θ_{2}

where it is expected that the angles are obtained by the following operations:

θ_{1} = arg (z_{1} - z_{0})

and

θ_{2} = arg (z_{2} - z_{0})

. But if that is not the case, then one of the angles must be converted by modulus

2 π

, that is, force

θ_{1}

and

θ_{2}

to fall in the interval

] - 2 π, + 2 π]

, by applying either

θ_{2} \leftarrow θ_{2} + 2 π

or

θ_{1} \leftarrow θ_{1} - 2 π

, depending on the necessary case.

All the situations in Figure 13 respect the proper condition,

θ_{1} \leq θ \leq θ_{2}

, but different situations are illustrated in Figure 14 where

θ

encounters a discontinuity if restricted to the interval

] - π, + π]

. As the figure explains, in one case the solution for a proper parametrization is to convert

θ_{2}

into

θ_{2} + 2 π

and the other to convert

θ_{1}

into

θ_{1} - 2 π

.

On the other hand, if a negative (or clockwise,

C W

) sense is intended, an equivalent operation must be taken into account, as illustrated in Figure 15.

In summary, to ensure the proper and unambiguous parametrization of a circular segment, the following steps have to be carried out when defining a circular boundary segment:

Indicate $z_{1}$ , $z_{2}$ , and the center $z_{0}$ (or some other means to obtain $z_{0}$ , like a third point $z_{p}$ , or a radius and the relative placement of the center);
Obtain $θ_{1} = arg (z_{1} - z_{0})$ and $θ_{2} = arg (z_{2} - z_{0})$ ;
Define circulation sense ( $C C W$ or $C W$ );
Check the potential angle adjustments given by (51):

\{\begin{matrix} θ_{2} \leq θ_{1} \land C C W & \{\begin{matrix} θ_{2} < 0 & \Rightarrow θ_{2} \leftarrow θ_{2} + 2 π \\ θ_{2} \geq 0 & \Rightarrow θ_{1} \leftarrow θ_{1} - 2 π \end{matrix} \\ θ_{1} \leq θ_{2} \land C W & \{\begin{matrix} θ_{1} < 0 & \Rightarrow θ_{1} \leftarrow θ_{1} + 2 π \\ θ_{1} \geq 0 & \Rightarrow θ_{2} \leftarrow θ_{2} - 2 π \end{matrix} \end{matrix} .

(51)

In expression (51), the equality between $θ_{1}$ and $θ_{2}$ is also tested in order to include the situation of complete circles, where $θ_{1}$ and $θ_{2}$ would coincide but one of them must actually be $2 π$ larger than the other so that a full circle is covered.
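
The adjustment rules of expression (51) can be sketched in a few lines of Python. This is only an illustrative sketch and not the code from the author's repository: the function name `adjust_arc_angles` and the boolean flag `ccw` are assumptions introduced here, and the input angles are assumed to come from `arg`, that is, to lie in $] - π, + π]$.

```python
import math


def adjust_arc_angles(theta1, theta2, ccw=True):
    """Return (theta1, theta2) adjusted as in expression (51).

    Hypothetical helper: after the call, theta grows from theta1 to theta2
    for a CCW arc and decreases from theta1 to theta2 for a CW arc.
    """
    two_pi = 2.0 * math.pi
    if ccw and theta2 <= theta1:
        if theta2 < 0:
            theta2 += two_pi
        else:
            theta1 -= two_pi
    elif (not ccw) and theta1 <= theta2:
        if theta1 < 0:
            theta1 += two_pi
        else:
            theta2 -= two_pi
    return theta1, theta2


# CCW arc that passes through the discontinuity at +/- pi.
t1, t2 = adjust_arc_angles(0.75 * math.pi, -0.75 * math.pi, ccw=True)
print(t1 <= t2)  # True

# Complete circle: coincident angles end up 2*pi apart.
t1, t2 = adjust_arc_angles(0.0, 0.0, ccw=True)
print(math.isclose(t2 - t1, 2.0 * math.pi))  # True
```

Because the comparisons in (51) include equality, the full-circle case needs no separate branch in this sketch.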

With the circular path segments duly defined and parametrized, the calculation of the NRA intersections is as straightforward as for the linear segment. For a circular arc to cross the NRA, there must exist one (or two) values of t that generate the crossing; let us call them $t_{c}$:

x_{c} + j y_{c} = (x_{0} + j y_{0}) + r e^{j [θ_{1} + t_{c} (θ_{2} - θ_{1})]}, 0 \leq t_{c} \leq 1

(52)

or in expanded view:

\{\begin{matrix} x_{c} = x_{0} + r cos [θ_{1} + t_{c} (θ_{2} - θ_{1})] & 0 \leq t_{c} \leq 1 \\ y_{c} = y_{0} + r sin [θ_{1} + t_{c} (θ_{2} - θ_{1})] & 0 \leq t_{c} \leq 1 \end{matrix} .

(53)

For the same conditions as before, $x_{c} \leq 0$ and $y_{c} = 0$, as seen in the earlier section on NRA crossing of parametric paths, we then have:

\{\begin{matrix} x_{0} \leq - r cos [θ_{1} + t_{c} (θ_{2} - θ_{1})] & 0 \leq t_{c} \leq 1 \\ y_{0} = - r sin [θ_{1} + t_{c} (θ_{2} - θ_{1})] & 0 \leq t_{c} \leq 1 \end{matrix} .

(54)

However, since the circle can cross the NRA twice, two solutions would be expected for $t_{c}$ in the second equation of (54); indeed, the complete set of solutions in $θ$ of an equation of the type $x = sin (θ)$ is given by the following:

θ = \{\begin{matrix} arcsin x \pm 2 k π & k = 0, 1, 2, \dots \\ π - arcsin x \pm 2 k π & k = 0, 1, 2, \dots \end{matrix}

(55)

that, when restricted to the $[- 2 π, + 2 π]$ interval, simplifies into, at most, the following 5 solutions:

θ = \{\begin{matrix} arcsin x \\ arcsin x \pm 2 π \\ - arcsin x \pm π \end{matrix} .

(56)

Returning to the variables of (54), where $sin θ = - y_{0} / r$ with $θ = θ_{1} + t_{c} (θ_{2} - θ_{1})$, we have

\{\begin{matrix} t_{c_{1}} = \frac{θ_{1} + arcsin \frac{y_{0}}{r}}{θ_{1} - θ_{2}} \\ t_{c_{2, 3}} = \frac{θ_{1} + arcsin \frac{y_{0}}{r} \pm 2 π}{θ_{1} - θ_{2}} \\ t_{c_{4, 5}} = \frac{θ_{1} - arcsin \frac{y_{0}}{r} \pm π}{θ_{1} - θ_{2}} \end{matrix}

(57)

where all those potential solutions, besides needing to verify the condition $0 \leq t_{c} \leq 1$, must also verify the first condition of (54), that is, $x_{0} \leq - r cos [θ_{1} + t_{c} (θ_{2} - θ_{1})]$.

Naturally, at most two of the solutions from (57) are valid (two intersections of the x axis), but there can be only one, or even none at all. As just mentioned, both potential solutions (let us name them $t_{c_{A}}$ and $t_{c_{B}}$) must be tested against expression (54). Nonetheless, there are obvious cases that can be tested beforehand to accelerate the numeric algorithms: one is $| y_{0} | > r$, which makes an intersection impossible, and the other is $| y_{0} | = r$, which implies that the circle supporting the arc is tangent to the x axis.

For circular arcs, tangency may be interpreted either as a non-intersection or as a double touching of the NRA (entering and leaving the NRA at the same point). The net contribution to the line integral is, therefore, zero. Hence, if $| y_{0} | \geq r$, the circular segment does not affect the intersection counter $I_{C}$. The exception occurs when the tangency point ($| y_{0} | = r$) happens for $t_{c}$ equal to 0 or 1, which is a “half intersection”, as described next.
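
The search for the crossing parameters can be sketched as follows: generate the five candidates of (57), then keep those that satisfy $0 \leq t_{c} \leq 1$ and the first condition of (54). This is a minimal Python sketch under stated assumptions, not the `NRAintArc()` function of the repository: the name `arc_nra_parameters` and the tolerance `tol` are introduced here, the center $(x_{0}, y_{0})$ is assumed to be already expressed relative to the test point, and the angles are assumed to be adjusted by (51), so that $θ_{1} \neq θ_{2}$.

```python
import math


def arc_nra_parameters(x0, y0, r, theta1, theta2, tol=1e-12):
    """Values of t_c in [0, 1] where the arc meets the negative real axis.

    Hypothetical helper. Tangency (|y0| = r) is treated as no contribution;
    tangency exactly at an arc extremity is not handled in this sketch.
    """
    if abs(y0) >= r:
        return []
    a = math.asin(y0 / r)
    d = theta1 - theta2
    numerators = (
        theta1 + a,                  # t_c1 of (57)
        theta1 + a + 2.0 * math.pi,  # t_c2
        theta1 + a - 2.0 * math.pi,  # t_c3
        theta1 - a + math.pi,        # t_c4
        theta1 - a - math.pi,        # t_c5
    )
    valid = []
    for num in numerators:
        t_c = num / d
        if -tol <= t_c <= 1.0 + tol:
            theta = theta1 + t_c * (theta2 - theta1)
            if x0 <= -r * math.cos(theta) + tol:  # first condition of (54)
                valid.append(min(max(t_c, 0.0), 1.0))
    return sorted(valid)


# Unit circle centered on the test point, CCW arc from pi/2 to 3*pi/2:
# the arc meets the NRA once, halfway along its length (t_c close to 0.5).
print(arc_nra_parameters(0.0, 0.0, 1.0, 0.5 * math.pi, 1.5 * math.pi))

# The same arc centered far to the right never reaches the NRA.
print(arc_nra_parameters(5.0, 0.0, 1.0, 0.5 * math.pi, 1.5 * math.pi))  # []
```

The early return for $| y_{0} | \geq r$ corresponds to the pre-checks described above and avoids evaluating `asin` outside its domain.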

6.2. Sense of NRA Crossing for Circular Arcs

As before, in case there is an NRA intersection, its sense must be determined, and expression (43) can be used and adapted for the case of arcs:

I_{C} = - sgn {(θ_{2} - θ_{1}) r cos [θ_{1} + t_{c} (θ_{2} - θ_{1})]} .

(58)

Expression (58) can be further simplified, avoiding the need to calculate the trigonometric function. Knowing that $r > 0$ and taking $θ (t_{c}) = (θ_{1} + t_{c} (θ_{2} - θ_{1})) mod 2 π$, we have:

I_{C} = \{\begin{matrix} sgn (θ_{1} - θ_{2}) & \Leftarrow - \frac{π}{2} < θ (t_{c}) < \frac{π}{2} \\ sgn (θ_{2} - θ_{1}) & \Leftarrow Otherwise \end{matrix} .

(59)

Also, as for the linear segments, the particular cases of $t_{c} = 0$ or $t_{c} = 1$ correspond to “half intersections” (one extremity of the arc lies on the NRA) and, therefore, the $I_{C}$ calculated with (59) must be adjusted in the same way, as given by (45), or formally stated:

t_{c} = 0 \lor t_{c} = 1 \Rightarrow I_{C} \leftarrow 0.5 \times I_{C} .
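
A minimal Python sketch of the sense evaluation follows. It applies expression (58) directly (with the factor $r > 0$ dropped) rather than the quadrant test of (59), and then halves the result at the arc extremities. The function name `arc_crossing_contribution` and the tolerance `tol` are assumptions made for this illustration.

```python
import math


def arc_crossing_contribution(theta1, theta2, t_c, tol=1e-12):
    """Contribution I_C of one NRA crossing of a circular arc, as in (58).

    Hypothetical helper: the sign is -sgn((theta2 - theta1) * cos(theta)),
    and the value is halved when t_c is 0 or 1 (half intersection).
    """
    theta = theta1 + t_c * (theta2 - theta1)
    s = (theta2 - theta1) * math.cos(theta)
    if s == 0.0:
        return 0.0  # tangency: no sense can be assigned here
    i_c = -1.0 if s > 0.0 else 1.0
    if abs(t_c) <= tol or abs(t_c - 1.0) <= tol:
        i_c *= 0.5
    return i_c


# CCW arc crossing the NRA downward (Q2 to Q3).
print(arc_crossing_contribution(0.5 * math.pi, 1.5 * math.pi, 0.5))  # 1.0
# Same arc traversed CW: upward crossing (Q3 to Q2).
print(arc_crossing_contribution(1.5 * math.pi, 0.5 * math.pi, 0.5))  # -1.0
# Arc that starts on the NRA and moves into Q3: half intersection.
print(arc_crossing_contribution(math.pi, 1.5 * math.pi, 0.0))        # 0.5
```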

7. Application to Bézier Curves

Most planar domains can be defined by closed splines or other arbitrary curve segments, each delimited by two extreme (anchor) points and some control points and expressed parametrically, most frequently in polynomial form. Each segment can, however, be treated in the same way as demonstrated earlier for linear and circular segments. For example, Bézier curves of order n in the complex plane can be defined by:

B_{n} (t) = \sum_{k = 0}^{n} t^{k} (\frac{n!}{(n - k)!} \sum_{i = 0}^{k} \frac{{(- 1)}^{i + k} z_{i}}{i! (k - i)!})

(60)

where $z_{i}$ are the control points, starting at $z_{0}$ and ending at $z_{n}$ (the anchor points), and with $0 \leq t \leq 1$.

Segments are described by polynomials, which are easy both to differentiate and to solve for roots, except for higher orders (5, 6, or more), where root finding becomes a challenge for analytical approaches, although numerical approaches remain possible. Bézier curves can also be expressed in the more compact notation:

B_{n} (t) = \sum_{i = 0}^{n} \binom{n}{i} {(1 - t)}^{n - i} t^{i} z_{i}, 0 \leq t \leq 1

(61)

from which follows an example of a Bézier segment of the 3rd order, which is one of the most commonly used and thus representative of many challenges in computational geometry:

B_{3} (t) = {(1 - t)}^{3} z_{0} + 3 {(1 - t)}^{2} t z_{1} + 3 (1 - t) t^{2} z_{2} + t^{3} z_{3}

(62)

or, in full expansion:

\begin{matrix} B_{3} (t) = (3 z_{1} - z_{0} - 3 z_{2} + z_{3}) t^{3} + 3 (z_{0} - 2 z_{1} + z_{2}) t^{2} + 3 (z_{1} - z_{0}) t + z_{0} \end{matrix} .

(63)
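
The equivalence between the power form (60), the Bernstein form (61), and the expanded cubic (63) can be checked numerically. The following Python sketch is only an illustration: the function names and the control points are hypothetical, and complex numbers are used to represent points of the plane.

```python
from math import comb, factorial


def bezier_bernstein(ctrl, t):
    """Evaluate the curve with the Bernstein form (61); ctrl = [z_0, ..., z_n]."""
    n = len(ctrl) - 1
    return sum(comb(n, i) * (1 - t) ** (n - i) * t ** i * z
               for i, z in enumerate(ctrl))


def bezier_power_coeffs(ctrl):
    """Coefficients c_k of t**k obtained from expression (60), k = 0..n."""
    n = len(ctrl) - 1
    coeffs = []
    for k in range(n + 1):
        inner = sum((-1) ** (i + k) * ctrl[i] / (factorial(i) * factorial(k - i))
                    for i in range(k + 1))
        coeffs.append(factorial(n) / factorial(n - k) * inner)
    return coeffs


# Arbitrary (hypothetical) cubic control points.
ctrl = [0 + 0j, 1 + 2j, 3 - 1j, 4 + 1j]
z0, z1, z2, z3 = ctrl
c = bezier_power_coeffs(ctrl)

# Explicit coefficients of (63), ordered from t**0 to t**3.
expected = [z0, 3 * (z1 - z0), 3 * (z0 - 2 * z1 + z2), 3 * z1 - z0 - 3 * z2 + z3]
print(all(abs(a - b) < 1e-12 for a, b in zip(c, expected)))  # True
```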

As described earlier, detecting the NRA crossing of such a segment requires the solution of $B_{3} (t) = 0$ for its imaginary part (the y coordinate), which yields up to 3 results for t in the $[0, + 1]$ interval. If fewer than 3 NRA intersections occur, some solutions for t are complex, fall outside that interval, or do not satisfy $x \leq 0$, and are to be discarded. To determine the sense of the path at the intersection points with the NRA, the derivative of $B_{3} (t)$ with respect to t is required, and that too is a direct operation easily allowed by the polynomial representation of the curve segment:

\begin{matrix} \frac{d B_{3} (t)}{d t} = 3 (3 z_{1} - z_{0} - 3 z_{2} + z_{3}) t^{2} + 6 (z_{0} - 2 z_{1} + z_{2}) t + 3 (z_{1} - z_{0}) \end{matrix} .

(64)
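
Expressions (63) and (64) translate directly into code. The sketch below, in Python with complex numbers as points, uses hypothetical function names and illustrative control points; it also shows that subtracting a test point changes only the constant coefficient, so the derivative coefficients can be computed once per segment.

```python
def cubic_coeffs(z0, z1, z2, z3, a=0j):
    """Coefficients (c3, c2, c1, c0) of B_3(t) - a, following (63)."""
    return (3 * z1 - z0 - 3 * z2 + z3,
            3 * (z0 - 2 * z1 + z2),
            3 * (z1 - z0),
            z0 - a)


def cubic_derivative_coeffs(z0, z1, z2, z3):
    """Coefficients (d2, d1, d0) of dB_3/dt, following (64)."""
    c3, c2, c1, _ = cubic_coeffs(z0, z1, z2, z3)
    return (3 * c3, 2 * c2, c1)


# Illustrative control points and two illustrative test points.
ctrl = (0j, 2 + 2j, -1j, 2 + 1j)
ca = cubic_coeffs(*ctrl, a=0.9 + 0.6j)
cb = cubic_coeffs(*ctrl, a=1.1 + 0.4j)

# Only the constant term depends on the test point.
print(ca[:3] == cb[:3])  # True
print(ca[3] == cb[3])    # False
```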

Figure 16 shows a region defined by a cubic Bézier curve and two linear segments. In this case, the Bézier anchor points are $z_{0} = (0, 0)$ and $z_{3} = (2, 1)$, and the two control points are $z_{1} = (2, 2)$ and $z_{2} = (0, - 1)$. The two linear segments are $\overline{z_{3} z_{4}}$ and $\overline{z_{4} z_{0}}$. Points $a_{1} = (0.9, 0.6)$ and $a_{2} = (1.1, 0.4)$ will be tested, but only $a_{1}$ is included in the region.

To test points $a_{1}$ and $a_{2}$, the procedure seems to require shifting the “vertices” relative to each testing point and recalculating the resulting polynomials. Actually, that turns out to be unnecessary in the parametric form because the new polynomial only changes by an offset, and the generic expression (63) simply becomes $B_{3} (t) \leftarrow B_{3} (t) - a_{i}$, where $a_{i}$ is the point being tested.

An additional relevant aspect for computational purposes is that the expression for the Bézier polynomial derivative (64) remains the same whatever the testing point $a_{i}$, which saves computation resources when testing the inclusion of many different points for the same region. The generic equations to test the NRA crossing and contributions ($I_{C}$) for a given $a_{i} = (a_{x_{i}}, a_{y_{i}})$ are:

\{\begin{matrix} x_{B_{3}} = 8 t^{3} - 12 t^{2} + 6 t - a_{x_{i}} \\ y_{B_{3}} = 10 t^{3} - 15 t^{2} + 6 t - a_{y_{i}} \\ \frac{d y_{B_{3}}}{d t} = 30 t^{2} - 30 t + 6 \end{matrix} .

To solve the problem for this particular situation, the procedure is implemented as follows:

Find the roots of $y_{B_{3}} = 10 t^{3} - 15 t^{2} + 6 t - a_{y_{i}}$ ;
Test whether the found roots satisfy the condition of NRA intersection, that is, $x_{B_{3}} = 8 t^{3} - 12 t^{2} + 6 t - a_{x_{i}} \leq 0$ ;
For each root that satisfies the previous expression, there is an NRA intersection whose sense needs to be assessed by the sign of $\frac{d y_{B_{3}}}{d t} = 30 t^{2} - 30 t + 6$ .
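
The three steps above can be sketched in Python for this particular segment. This is an illustration under stated assumptions and not the `NRAintBez()` function of the repository: the roots are located by bisection between the critical points of the cubic (a technique chosen here for simplicity), roots exactly at $t = 0$, $t = 1$, or at a tangency are not handled, and the sign convention $I_{C} = - sgn (d y / d t)$ is taken from expression (58).

```python
import math


def cubic_roots_unit(c3, c2, c1, c0, iters=80):
    """Simple real roots of c3*t^3 + c2*t^2 + c1*t + c0 inside ]0, 1[.

    The interval is split at the critical points of the cubic, so each piece
    is monotonic and holds at most one root, which bisection then locates.
    """
    def f(t):
        return ((c3 * t + c2) * t + c1) * t + c0

    knots = [0.0, 1.0]
    a, b, c = 3.0 * c3, 2.0 * c2, c1  # derivative: a*t^2 + b*t + c
    disc = b * b - 4.0 * a * c
    if a != 0.0 and disc > 0.0:
        s = math.sqrt(disc)
        knots += [t for t in ((-b - s) / (2 * a), (-b + s) / (2 * a))
                  if 0.0 < t < 1.0]
    elif a == 0.0 and b != 0.0 and 0.0 < -c / b < 1.0:
        knots.append(-c / b)
    knots.sort()

    roots = []
    for lo, hi in zip(knots, knots[1:]):
        if f(lo) * f(hi) < 0.0:
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


def bezier_ic(ax, ay):
    """Net NRA contribution of the example Bézier segment for point (ax, ay)."""
    total = 0
    for t in cubic_roots_unit(10.0, -15.0, 6.0, -ay):   # step 1: roots of y
        x = ((8.0 * t - 12.0) * t + 6.0) * t - ax
        if x <= 0.0:                                    # step 2: on the NRA
            dy = (30.0 * t - 30.0) * t + 6.0
            total += -1 if dy > 0.0 else 1              # step 3: sense
    return total


print(bezier_ic(0.9, 0.6))  # -1
print(bezier_ic(1.1, 0.4))  # 0
```

For $a_{1}$, only one of the three roots of $y_{B_{3}}$ has $x_{B_{3}} \leq 0$; for $a_{2}$, two roots qualify and their contributions cancel.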

Table 2 shows some numerical results that illustrate the functionality of the algorithm for a cubic Bézier segment for two test points $a_{1}$ and $a_{2}$.

Figure 17 and Figure 18 illustrate the plots concerning NRA intersections by the cubic Bézier segment for the cases of points $a_{1}$ and $a_{2}$, respectively.

The intersection contributions $I_{C}$ (detailed in Table 2) clearly reflect the inclusion status of the tested points. For $a_{1}$, the total $I_{C}$ is $- 1$, and for $a_{2}$, the total $I_{C}$ is $+ 1 - 1 = 0$. Actually, the $I_{C}$ contributions from the two linear segments were not analyzed, but it is simple to confirm that they do not contribute at all because they do not cross the NRA: the horizontal segment will never contribute, and the vertical segment does not cross the NRA for the points being tested.

With higher order curves, such as these 3rd order Bézier curves, it may happen that the curve is tangent to the NRA at the intersection point, and expression (43) would result in 0 (inconclusive). This is an easy situation to detect (one further comparison) and concerns a case where all three roots $t_{c}$ are equal. There are two solutions for that very special case to allow the detection of the intersection sense: one would be to calculate expression (43) in a “small” neighborhood of $t_{c}$, for example, at $t_{c} - δ$ instead of $t_{c}$ (where the value of $δ$ can be in the range of the precision involved, e.g., $10^{-6}$); the other solution, formally more elegant, is simply to check the quadrants of $z_{1}$ and $z_{2}$, i.e., use the value of $sgn (y_{1} - y_{2}) = sgn (Δ Y_{b})$.

Application to the Challenges Given

By using the previous techniques for linear, circular, and Bézier segments in an integrated and complementary form, it is possible to test situations such as those illustrated in Figure 1 for a set of random points, with the results shown in Figure 19.

From the point of view of programming, three functions were created and used: NRAintLin(), NRAintArc() and NRAintBez(), whose source code is also available at the GitHub repository indicated earlier. These functions return the NRA count for each respective segment when invoked according to the type of segment: 3 linear and 3 circular arcs on the region on the left, and 2 linear, 3 circular arcs and 2 Bézier segments on the region on the right. These three functions are essential to establish the general algorithm for regions delimited by linear, circular, or Bézier segments, which is the subject of the next section.

8. Algorithm for Arbitrary Planar Domains

After the statements, demonstrations, and examples described in the previous sections, it is now straightforward to establish the basis of a general algorithm for arbitrary planar domains that can be described by parametric equations, which have an associated sense of circulation along the boundary. The procedure can be stated as follows:

Set up the closed region by segments defined parametrically in such a way that each segment is defined by two extremes $z_{1}$ and $z_{2}$ , a parameter t, being $0 \leq t \leq 1$ , and optional points or parameters depending on the type of the segments (linear, circular arcs, Bézier curves, etc.)
- Linear segments do not require any further parameters or points;
- Circular arcs require a third point (center), or else a radius along with the definition of its orientation;
- Other curves (of second or higher order) require additional elements to fully define them (3rd order Bézier curves require two more points).
Normalize the positions of the domain delimiting points (vertices) by subtracting the test point a from all vertices.
Check the intersections of this newly relocated region with the negative half of the horizontal axis (x). Intersections occur from $Q 2$ to $Q 3$ or vice-versa.
Full intersections are counted as an integer value ( $+ 1$ or $- 1$ , depending on the sense, $Q 2$ to $Q 3$ or $Q 3$ to $Q 2$ , respectively), and partial ones as a half value ( $+ 0.5$ or $- 0.5$ ) if the segment starts or ends exactly on the horizontal axis, where the positive or negative contribution is again determined by whether the segment goes to or comes from quadrant $Q 2$ or $Q 3$ .
The sense of evolution of the path must be determined at these intersection/touching points: it is straightforward for linear segments, but for other types of segments, the derivative of y with respect to t must be evaluated at that point to determine the sense in which the path crosses or approaches the horizontal axis.

Algorithm 5 formalizes the required steps.

Algorithm 5: Universal inclusion test in arbitrary regions

A most relevant point when comparing this general algorithm to the algorithms presented in the preceding sections is the input of data: besides the testing point and the list of vertices that require no changes at all, there are now more parameters to be given to distinguish and define the types of segments.

The algorithm then requires the following data as inputs:

The testing point a (same as before)
The vertices (extreme points) of the region segments (same as before)
A set of numbers indicating the type of segment or number of control points (e.g., 0 for linear, 1 for circular, 2 for 3rd order Bézier segments, etc.)
A set of control points for each segment

The steps of the algorithm that vary with the type of segment occur mainly in lines 8 and 9, where the parametric equations of the segment must be obtained and solved, and also in line 15 to calculate a derivative and obtain the sense of NRA crossing.
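
The overall structure of such a procedure can be sketched as a loop that dispatches each segment to a handler according to its type code. The Python sketch below is not Algorithm 5 itself: the names `nra_linear`, `HANDLERS`, and `inclusion_count` are hypothetical, only the linear handler is written out (arc and Bézier handlers would be registered under the codes 1 and 2), and test points lying exactly on the border are not detected, although the algorithms in this paper do handle them.

```python
def nra_linear(z1, z2):
    """NRA contribution of a straight segment z1 -> z2.

    Both extremes are complex numbers already relative to the test point.
    """
    y1, y2 = z1.imag, z2.imag
    if y1 == y2 or y1 * y2 > 0.0:
        return 0.0                    # horizontal, or never reaches the x axis
    t_c = y1 / (y1 - y2)
    x_c = z1.real + t_c * (z2.real - z1.real)
    if x_c > 0.0:
        return 0.0                    # meets the positive side of the axis
    i_c = 1.0 if y1 > y2 else -1.0    # +1 for Q2 to Q3, -1 for Q3 to Q2
    if y1 == 0.0 or y2 == 0.0:
        i_c *= 0.5                    # half intersection at an extremity
    return i_c


# Handlers indexed by segment type code (0 for linear, as in the input list).
HANDLERS = {0: nra_linear}


def inclusion_count(a, segments):
    """Sum the NRA contributions of all segments for the test point a.

    segments is a list of (type_code, points) pairs, where points holds the
    extremes of the segment plus any control points it needs.
    """
    total = 0.0
    for kind, points in segments:
        shifted = [p - a for p in points]   # normalize relative to a
        total += HANDLERS[kind](*shifted)
    return total


# CCW square with vertices (+/-1, +/-1).
v = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
square = [(0, (v[i], v[(i + 1) % 4])) for i in range(4)]
print(inclusion_count(0j, square))      # 1.0 (inside)
print(inclusion_count(3 + 0j, square))  # 0.0 (outside)
```

A dictionary of handlers keeps the loop independent of the segment types, which mirrors the observation that only the parametric equations and the derivative change from one type to another.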

As mentioned before, the GitHub repository made available also includes the code and illustration of the application of this general algorithm with several examples besides the ones in Figure 19. No benchmarking is performed with this general algorithm because, in the literature, there are no other algorithms for this same purpose to compare with.

9. Discussion and Conclusions

This paper presents a methodology to solve, in analytical form, the problem of point inclusion in planar domains for regions of virtually unlimited complexity, namely with boundary curves of order higher than linear segments. Specifically, the cases of regions bounded arbitrarily by linear segments, circular arcs, and Bézier curves have been derived and implemented, and the methodology for regions with other shapes has been described and is ready to be applied with the general algorithm.

The approaches described are handled in analytical form, so they are independent of any numerical approximations, hence independent of scale. Nonetheless, when performing actual calculations in floating point operations, the limitation of the numeric representation of the computational system naturally occurs. However, due to the nature of the operations (sums and multiplications in the Algorithms 3 and 4 for polygonal regions only), the propagation of uncertainty is minimal, and operations can even be performed in integer representation, which is suitable for some embedded systems. On the other hand, this is not the case for the general approach of non-linear segments (covered in Algorithm 5) because boundary segments are non-linear, and non-linear calculations are required, including trigonometric or root operations. Since the accumulated chain of non-linear operations is short, uncertainty propagation is contained, and results have shown that very good accuracy (in 12 or more decimal places) is kept in the final outcome. This can be checked, for example, by “zooming” into the plots extensively and verifying the correct result of point inclusion. The mentioned GitHub repository allows these tests to be performed by other users.

The paper also demonstrates why a simple winding number based technique for detecting point inclusion will always be ambiguous if the testing point is on the border. Nonetheless, the proposed algorithms detect those situations early, and the technique needs only a single pass, i.e., it requires no iterative procedure or further verifications.

Besides this advantage of a single operation, the possibility of extending the algorithm to regions of more complex shapes opens wider directions in the field of simulation and computational geometry. The usage for curves of high orders needs to deal with negative real axis intersections, which is equivalent to finding the roots of polynomials or of transcendental equations. This implies unavoidable additional computational costs when compared to straight polygonal shapes. Nevertheless, the pure analytical formulation of the method may remain an advantage against the alternatives of linearization or approximation of curves to polygonal lines.

Along these lines, a final remark is worth mentioning, which concerns a potential alternative to this approach for general shapes. We could think of the possibility of triangulating a region such as the ones shown in Figure 19 and apply point location techniques cited in Section 2. Although the technique is usually applied for a problem not addressed in this paper, a region could be triangulated and the test of inclusion of a point in the overall region could be performed by identifying which subregion, if any, includes that point. Besides the variable computational costs of this extreme solution, two main limitations immediately arise. The first is concerning self-intersecting regions and regions with holes (like the ones in Figure 10) that require additional care and pre-processing to be triangulated. Secondly, and even more relevant, if the region boundaries are curved, they must be linearized (sampled) to perform the triangulation. This has two drawbacks: the computational demands of the preprocessing and an additional computational cost that depends on the scale. The dependency on scale is particularly relevant near the boundaries: if very high accuracy is required, the samples over the curved boundaries must be more dense; hence, a larger number of triangles is generated in the mesh. This dependency on scale (with consequences on the accuracy in areas closer to the borders) does not occur in the proposed approach because it is analytical.

The following list summarizes the advantages of the approach proposed in this paper and the reasons why it performs better than existing solutions.

It involves fewer operations: besides the common sums and subtractions, at most only one floating point division is required per segment (or two multiplications instead) and also fewer comparisons, for the case of polygonal shaped regions. This makes it faster than state of the art solutions (about 5 times as fast was demonstrated for polygons with many sides and a large number of data points for the algorithm implemented in Matlab [8]).
It has no restrictions on which points define the polygons (or the generically shaped region) nor on the testing points, which can lie on the border or even coincide with the vertices, and no particular post-processing or additional operations are required.
It accepts the geometry of boundary segments to be other than straight lines and is applicable virtually to any type of curve that can be expressed parametrically.
The parametric formulation of the boundary makes it straightforward to define the curve segments and their extremes (anchor points).
The parametric definition is intrinsically associated with the concept of sense or circulation, which allows the detection of the sense of axis intersection, and not only the fact that there are intersections; and the sense of intersection is crucial to obtaining the global winding number.
The parametric description allows a separation of coordinates, making algebraic operations independent, and allows us to overcome limitations in degenerate situations that would occur in two coordinate point representations but do not occur in separate coordinate representations.

Although the formal and analytic procedures for point location in generic planar domains have been fully established and demonstrated in this paper, the future may hold improvements in the computational component of the algorithms for specific curves. This can be true, namely for some curves of very high complexity, as well as for the detection of specific situations of point location relative to those more complex boundaries, in order to accelerate the calculation operations.

Funding

An early part of this work was developed under the Ph.D. Grant BD-1657/91_IA from JNICT, Programa CIENCIA, Portugal.

Data Availability Statement

Algorithms and data to support the results of this paper are available at a GitHub repository: https://github.com/vitoruapt/PointInclusion.

Acknowledgments

The author acknowledges the EC Joint Research Center of Ispra, Italy, for hosting him as a Ph.D. student in the early 1990s, where the basic idea that led to this research started.

Conflicts of Interest

The author declares no conflicts of interest.

References

Alciatori, D.; Miranda, R. A Winding Number and Point-in-Polygon Algorithm; Technical Report; Department of Mechanical Engineering, Colorado State University: Fort Collins, CO, USA, 1995.
Shimrat, M. Algorithm 112: Position of Point Relative to Polygon. Commun. ACM 1962, 5, 434.
Hacker, R. Certification of Algorithm 112: Position of Point Relative to Polygon. Commun. ACM 1962, 5, 606.
Franklin, W.R. PNPOLY-Point Inclusion in Polygon Test. 1994–2006. Available online: https://wrfranklin.org/Research/Short_Notes/pnpoly.html (accessed on 2 October 2024).
Preparata, F.P.; Shamos, M.I. Computational Geometry: An Introduction; Springer: New York, NY, USA, 1985.
Sedgewick, R. Algorithms in C; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1990.
Haines, E. Point in Polygon Strategies. In Graphics Gems IV; Heckbert, P.S., Ed.; Academic Press Professional, Inc.: San Diego, CA, USA, 1994; pp. 24–46.
Hormann, K.; Agathos, A. The point in polygon problem for arbitrary polygons. Comput. Geom. Theory Appl. 2001, 20, 131–144.
Aurenhammer, F. Voronoi diagrams—A survey of a fundamental geometric data structure. ACM Comput. Surv. 1991, 23, 345–405.
Devillers, O. The Delaunay Hierarchy. Int. J. Found. Comput. Sci. 2002, 13, 163–180.
Devroye, L.; Mücke, E.P.; Zhu, B. A Note on Point Location in Delaunay Triangulations of Random Points. Algorithmica 1998, 22, 477–482.
Mücke, E.P.; Saias, I.; Zhu, B. Fast randomized point location without preprocessing in two- and three-dimensional Delaunay triangulations. Comput. Geom. 1999, 12, 63–83.
O’Rourke, J. How Do I Find If a Point Lies within a Polygon. 2003. Available online: http://www.faqs.org/faqs/graphics/algorithms-faq/ (accessed on 2 October 2024).
Sunday, D. Inclusion of a Point in a Polygon. 2012. Available online: https://geomalgorithms.com/a03-_inclusion.html (accessed on 2 October 2024).
Foley, J.D.; van Dam, A.; Feiner, S.K.; Hughes, J.F. Computer Graphics: Principles and Practice, 2nd ed.; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1990.
O’Rourke, J. Computational Geometry in C; Cambridge University Press: Cambridge, UK, 1998.
Needham, T. Visual Complex Analysis; Oxford University Press: Oxford, UK, 1998.
Schirra, S. How Reliable Are Practical Point-In-Polygon Strategies? In Proceedings of the 16th Annual European Symposium on Algorithms (ESA ’08), Karlsruhe, Germany, 15–17 September 2008; pp. 744–755.
Gatilov, S. Efficient Angle Summation Algorithm for Point Inclusion Test and Its Robustness. J. Reliab. Comput. 2013, 19, 1–25.
Ross, F.; Ross, W.T. The Jordan curve theorem is non-trivial. J. Math. Arts 2011, 5, 213–219.
Edelsbrunner, H.; Maurer, H.A. A space-optimal solution of general region location. Theor. Comput. Sci. 1981, 16, 329–336.
Kirkpatrick, D. Optimal search in planar subdivisions. SIAM J. Comput. 1983, 12, 28–35.
Guibas, L.; Ramshaw, L.; Stolfi, J. A kinetic framework for computational geometry. In Proceedings of the 24th Annual Symposium on Foundations of Computer Science (1983), Washington, DC, USA, 7–9 November 1983; pp. 100–111.
Jackowski, B. Computing the area and winding number for a Bézier curve. In TUGboat; Tex Users Group: Portland, OR, USA, 2012; Volume 33, Available online: https://tug.org/TUGboat/tb33-1/tb103jackowski.pdf (accessed on 2 October 2024).
Santos, V. Robot Autonomous Navigation: Sensorial Data Interpretation and Local Navigation. Ph.D. Thesis, Universidade de Aveiro, Aveiro, Portugal, 1995. Available online: http://hdl.handle.net/10773/17931 (accessed on 2 October 2024).
Wylie, C.; Barrett, L. Advanced Engineering Mathematics; McGraw-Hill: New York, NY, USA, 1982.
Kanwal, R. Linear Integral Equations; Birkhäuser Boston: Boston, MA, USA, 1996.
OGC. OpenGIS Simple Features Specification for SQL, Revision 1.1; Technical Report; Open Geospatial Consortium, Inc.: Arlington, VA, USA, 1999.

Figure 1. Examples of shapes bounded by linear, circular, or Bézier segments, including twisting and self-intersection, for which a single algorithmic process to determine the relative location of the illustrated points has been lacking. The case on the right is much more complex than the one on the left and can be challenging, even for humans.

Figure 2. The traditional ray-crossing algorithm is based on the parity of intersection points. An odd number of intersections of a horizontal line starting on the point being tested along one of its two sides indicates that the point is inside the polygon, and an even number of intersections (also 0) indicates that the point is outside.

Figure 3. The classic ray-crossing algorithm is not meant to deal with non-simple or self-intersecting polygons. Point e would be detected as “outside”, but it is “inside”.

Figure 4. Algorithm based on the winding number. If $\sum_{i} α_{i} = 0$, then the testing point (b in this case) is outside the polygon.

Figure 5. Example of two adjacent regions $R_{1}$ and $R_{2}$ where point a lies precisely on the border, raising the question to which region it belongs.

Figure 6. Example of the behavior of the winding number technique for a point on the border, $a = (0, 0)$, of a triangle when the circulation sense is reversed (the order $p_{1} \to p_{2} \to p_{3} \to p_{1}$ is always assumed). For the inclusion of a, the sum of the three angles is expected to be $\sum α_{i j} = \pm 2 π$, which does not occur on the right (clockwise circulation sense) because $α_{12} = + π$, causing a net $\sum α_{i j} = 0$. For the sake of clarity, auxiliary vectors $\vec{u}$ and $\vec{v}$ (from Equation (1)) are shown only for the first oriented segment ($p_{1} \to p_{2}$).

Figure 7. Case where point a is without doubt included in the delimited region, including a circular arc, independently of the circulation sense, but which could fail by the application of the traditional winding number technique proposed by [19].

Figure 8. Several examples of winding numbers for testing points (* in the image) in different situations on a non-simple (self-intersecting) polygon. On the left, the direct sense of circulation was used when starting on $P_{1}$ and moving toward $P_{2}$, and on the right the reverse sense was used (also starting on $P_{1}$ but toward a new $P_{2}$). Some points over the boundary result in an ambiguous winding number that depends on the circulation sense.

Figure 9. Contributions of NRA intersections/touchings on the left side of nine example test points. The sense of NRA reaching/crossing is relevant for the final sum. Since the imaginary axis would lie over each $a_{i}$ point, the numbers in the table indicate the $I_{C}$ partial and total values, i.e., the counting of intersections (crossings) of the NRA. For example, the total down $I_{C}$ for $a_{6}$ is given by summing all downward NRA intersections at its left: $+ 3 = + 1 + 0.5 + 0.5 + 1$ and, similarly, for the total up $I_{C}$, the count is $- 2 = - 0.5 - 1 - 0.5$. The net total is $I_{C} = + 1$, hence $a_{6}$ is inside. Curiously, the result for $a_{7}$ is $I_{C} = + 2$, that is, we could consider that $a_{7}$ is inside “twice”, which is graspable from the figure!

Figure 10. Multi-ring polygons as used in GIS contexts. The orientations are indicative, but appropriate relative senses must be ensured to correctly represent, for example, a polygon with a “hole”.

Figure 11. Example of vectorization of Algorithm 4 in Matlab code.

Figure 12. Example of points tested in a 2216-side polygon: 5000 random on the left and more than 2200 random points very close to the borders on the right. As indicated at the top of each image, Algorithm 3 ran about 5 times faster than the state-of-the-art (SOTA) Matlab’s native algorithm [8]. Tested in a HP ZBook Fury 15 with Intel Core i7-10750H CPU 2.60 GHz with 32 GBytes of RAM in Ubuntu Linux 22.04.3.

Figure 13. Examples of arc segments with center in $z_{0}$, starting at point $z_{1}$ and ending at point $z_{2}$. Depending on $z_{0}$, the very same arc can intersect the negative real axis (NRA) zero times, once ($I_{1}$), twice ($I_{1}$ and $I_{2}$), or be tangent ($I_{1} \equiv I_{2}$), which, for the purposes of path integral evaluation, is equivalent to intersecting the NRA twice, in opposite senses.
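The four situations listed in this caption (zero, one, two, or two coincident intersections) can be explored numerically. The following Python sketch is not the paper's algorithm: it assumes the test point has already been shifted to the origin, that the arc is given by a hypothetical parametrization $z_{0} + r e^{iθ}$ with $θ$ increasing from `theta1` to `theta2` over at most one full turn, and it only locates the intersections without evaluating their sense.

```python
import math


def arc_nra_intersections(z0, r, theta1, theta2, tol=1e-12):
    """Angles in [theta1, theta2] where z0 + r*exp(i*theta) lies on the NRA.

    Assumes theta1 <= theta2 (CCW, monotonic) and a sweep of at most 2*pi.
    A tangency yields two coincident angles, mirroring I_1 = I_2.
    """
    s = -z0.imag / r                      # sin(theta) at a real-axis hit
    if abs(s) > 1:
        return []                         # circle never reaches the real axis
    base = math.asin(s)
    hits = []
    for theta in (base, math.pi - base):  # the two solutions of sin(theta) = s
        # Bring theta to the first equivalent angle not below theta1.
        k = math.ceil((theta1 - theta) / (2 * math.pi) - tol)
        theta += 2 * math.pi * k
        x = z0.real + r * math.cos(theta)
        if theta <= theta2 + tol and x < 0:
            hits.append(theta)
    return hits


half = arc_nra_intersections(-2 + 0j, 1.0, -math.pi / 2, math.pi / 2)
full = arc_nra_intersections(-2 + 0j, 1.0, -math.pi / 2, 3 * math.pi / 2)
none = arc_nra_intersections(2 + 0j, 1.0, -math.pi / 2, 3 * math.pi / 2)
tangent = arc_nra_intersections(-2 + 1j, 1.0, -math.pi, 0.0)
print(len(half), len(full), len(none), len(tangent))
```

Moving the center `z0` while keeping the same radius and angles changes the number of hits, which is the point the figure makes.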

Figure 14. Example of arc segments where the CCW evolution of $\theta$ from $\theta_{1}$ to $\theta_{2}$ would not be monotonic when $\theta_{i} \in\, ]-\pi, +\pi]$, but that can be made monotonic by taking the equivalent angles: on the left, $\theta_{2} \leftarrow \theta_{2} + 2\pi$, that is, use $\theta_{2b}$ instead of $\theta_{2a}$, and on the right, $\theta_{1} \leftarrow \theta_{1} - 2\pi$, that is, use $\theta_{1b}$ instead of $\theta_{1a}$.

Figure 15. Example of arc segments where the CW evolution of $\theta$ from $\theta_{1}$ to $\theta_{2}$ would not be monotonic when $\theta_{i} \in\, ]-\pi, +\pi]$, but that can be made monotonic by taking the equivalent angles: on the left, $\theta_{1} \leftarrow \theta_{1} + 2\pi$, that is, use $\theta_{1b}$ instead of $\theta_{1a}$, and on the right, $\theta_{2} \leftarrow \theta_{2} - 2\pi$, that is, use $\theta_{2b}$ instead of $\theta_{2a}$.
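The angle adjustments described in the captions of Figures 14 and 15 can be sketched in Python as follows. The function name `make_monotonic` is hypothetical. As a simplifying assumption, the sketch always adjusts $θ_{2}$; the alternative adjustments of $θ_{1}$ shown in the captions give the same pair of angles up to a common shift of $2π$, and which representative is preferable in the paper's algorithms is not settled here.

```python
import math


def make_monotonic(theta1, theta2, ccw):
    """Return (theta1, theta2) with a monotonic evolution of theta.

    Inputs are assumed to lie in ]-pi, +pi]. For a CCW arc theta must
    increase from theta1 to theta2; for a CW arc it must decrease.
    Simplification: only theta2 is shifted by +/- 2*pi.
    """
    two_pi = 2.0 * math.pi
    if ccw and theta2 < theta1:
        theta2 += two_pi      # theta2 <- theta2 + 2*pi
    elif not ccw and theta2 > theta1:
        theta2 -= two_pi      # theta2 <- theta2 - 2*pi
    return theta1, theta2


# CCW arc that passes through the NRA: 135 deg -> -135 deg becomes 135 -> 225.
t1, t2 = make_monotonic(3 * math.pi / 4, -3 * math.pi / 4, ccw=True)
print(math.degrees(t1), math.degrees(t2))

# CW arc that passes through the NRA: -135 deg -> 135 deg becomes -135 -> -225.
u1, u2 = make_monotonic(-3 * math.pi / 4, 3 * math.pi / 4, ccw=False)
print(math.degrees(u1), math.degrees(u2))
```

Arcs that do not pass through the NRA are already monotonic in $]-π, +π]$ and are returned unchanged.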

Figure 16. Region delimited by a cubic Bézier and two linear segments. Points $z_{1}$ and $z_{2}$ (not plotted in the figure) are the Bézier control points.

Figure 17. Bézier real axis intersections for testing point $a_{1}$. The “*” marks are the points where the curve intersects the horizontal axis, the green circle represents the point being tested when shifted to the system origin, and the square indicates an NRA crossing that effectively contributes to the intersection counting. See Table 2 for more details.

Figure 18. Bézier real axis intersections for testing point $a_{2}$. The “*” marks are the points where the curve intersects the horizontal axis, the green circle represents the point being tested when shifted to the system origin, and the squares indicate NRA crossings that effectively contribute to the intersection counting. See Table 2 for more details.

Figure 19. Results of algorithms applied to the regions of Figure 1. Due to limited resolution in the display above, some points may seem to be on the border, but actually, none of them are.

Table 1. Comparison of estimated number of operations per vertex/segment in the algorithms.

	Algorithm 3	Algorithm 4	Algorithm from [8]
Comparisons	1 + 3 + 2	1 + 2 + 2 + 2	10 + 1
Sums/subtractions	2 + 1	2 + 1	3
Multiplications	1	2	2
Divisions	1	0	0

Table 2. Detailed analysis of operations concerned with NRA crossings by a cubic Bézier segment for two testing points in the case illustrated in Figures 16, 17 and 18.

Testing Points	$t_{i}$	$x_{B_{3}} (t_{i})$	NRA crossing?	$\frac{d y_{B_{3}}}{d t} (t_{i})$	$I_{C}$	$\sum I_{C}$
$a_{1}$	$t_{1} = 0.1517$	$- 0.2379$	yes	$2.139$	$- 1$	$- 1$
	$t_{2} = 0.4312$	$0.0974$	no	$- 1.358$	0
	$t_{3} = 0.9171$	$0.6805$	no	$3.719$	0
$a_{2}$	$t_{1} = 0.0829$	$- 0.6805$	yes	$3.719$	$- 1$	0
	$t_{2} = 0.5688$	$- 0.0974$	yes	$- 1.358$	$+ 1$
	$t_{3} = 0.8483$	$0.2379$	no	$2.139$	0
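The steps tabulated in Table 2 (find the parameters $t_{i}$ where the curve meets the real axis, keep those with negative abscissa, and sign them by the vertical derivative) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `bezier_nra_count` and the tolerance `eps` are hypothetical, the control points used below are invented because the ones of Figure 16 are not listed here, and crossings at the segment endpoints or tangencies are simply ignored.

```python
import numpy as np


def bezier_nra_count(z0, z1, z2, z3, a, eps=1e-9):
    """Signed count of NRA crossings of a cubic Bezier for test point a.

    Sign convention as in Table 2: dy/dt > 0 gives -1, dy/dt < 0 gives +1.
    Endpoint hits (t = 0 or 1) and tangencies are skipped in this sketch.
    """
    # Shift the curve so that the test point sits at the origin.
    p0, p1, p2, p3 = (complex(z) - complex(a) for z in (z0, z1, z2, z3))
    # Power-basis coefficients, highest degree first.
    coeffs = np.array([
        p3 - 3 * p2 + 3 * p1 - p0,
        3 * (p2 - 2 * p1 + p0),
        3 * (p1 - p0),
        p0,
    ])
    x_poly, y_poly = coeffs.real, coeffs.imag
    dy_poly = np.polyder(y_poly)
    total = 0
    for t in np.roots(y_poly):            # parameters where y(t) = 0
        if abs(t.imag) > eps or not (eps < t.real < 1 - eps):
            continue                      # complex root or outside ]0, 1[
        x = np.polyval(x_poly, t.real)
        dy = np.polyval(dy_poly, t.real)
        if x < 0 and abs(dy) > eps:       # on the NRA, with a defined sense
            total += -1 if dy > 0 else 1
    return total


# Invented C-shaped curve going downward on the left of the origin.
ctrl = (-1 + 1j, -2 + 1j, -2 - 1j, -1 - 1j)
print(bezier_nra_count(*ctrl, a=0))          # one downward crossing
print(bezier_nra_count(*ctrl, a=-3 + 0j))    # crossing lies on the positive side
print(bezier_nra_count(*ctrl[::-1], a=0))    # reversed curve, upward crossing
```

For this invented curve the only root inside $]0, 1[$ is $t = 0.5$, where the abscissa is $-1.75$ and the derivative is $-3$, so the contribution is $+1$, in line with the sign pattern of Table 2.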

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Santos, V. Analytical Solution for the Problem of Point Location in Arbitrary Planar Domains. Algorithms 2024, 17, 444. https://doi.org/10.3390/a17100444
