Article

Towards Nonlinearity: The p-Regularity Theory

by Ewa Bednarczuk 1, Olga Brezhneva 2,*, Krzysztof Leśniewski 3,*, Agnieszka Prusińska 4 and Alexey A. Tret’yakov 4,5
1 Department of CAD/CAM Systems Design and Computer-Aided Medicine, Faculty of Mathematics and Information Sciences, Warsaw University of Technology, 00-661 Warszawa, Poland
2 Department of Mathematics, Miami University, Oxford, OH 45056, USA
3 Systems Research Institute, Polish Academy of Sciences, 02-106 Warsaw, Poland
4 Faculty of Science, University of Siedlce, 08-110 Siedlce, Poland
5 Dorodnicyn Computing Center, Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, Moscow 119333, Russia
* Authors to whom correspondence should be addressed.
Entropy 2025, 27(5), 518; https://doi.org/10.3390/e27050518
Submission received: 19 January 2025 / Revised: 25 April 2025 / Accepted: 3 May 2025 / Published: 12 May 2025
(This article belongs to the Section Complexity)

Abstract:
We present recent advances in the analysis of nonlinear problems involving singular (degenerate) operators. The results are obtained within the framework of p-regularity theory, which has been successfully developed over the past four decades. We illustrate the theory with applications to degenerate problems in various areas of mathematics, including optimization and differential equations. In particular, we address the problem of describing the tangent cone to the solution set of nonlinear equations in singular cases. The structure of p-factor operators is used to propose optimality conditions and to construct novel numerical methods for solving degenerate nonlinear equations and optimization problems. The numerical methods presented in this paper represent the first approaches targeting solutions to degenerate problems such as the Van der Pol differential equation, boundary-value problems with small parameters, and partial differential equations where Poincaré’s method of small parameters fails. Additionally, these methods may be extended to nonlinear degenerate dynamical systems and other related problems.

1. Introduction

Many fundamental results in nonlinear analysis and classical numerical methods in Banach spaces X and Y rely on the regularity of the mapping F : X → Y at a given point x̄ ∈ X. The regularity of a Fréchet differentiable mapping F is commonly understood as the surjectivity of its Fréchet derivative F′(x̄). However, a growing number of applications in areas such as partial differential equations, control theory, and optimization require the development of special approaches to deal with nonregular problems.
We present the theory of p-regularity, which originated in the 1980s with the aim of providing constructive tools for the analysis of nonregular problems. To date, the theory of p-regularity has found successful applications in various contexts and different areas of mathematics, as discussed in numerous papers. This paper highlights the most distinguished applications of the theory of p-regularity, with the goal of reviewing important results and indicating potential and promising directions for its future development and applications.
The theory of p-regularity, also known as higher-order regularity theory, offers a framework for studying nonlinear problems in situations where regularity assumptions are not satisfied. It focuses on utilizing higher-order derivatives to analyze and understand the behavior of mappings having first-order derivatives that are not onto or lack regularity.
Definition 1
(cf. Definition 1.16 of [1]). Let F : X → Y be a continuously differentiable mapping from an open set U ⊆ X of a Banach space X into a Banach space Y. A vector x̄ ∈ U is called a regular point of F if F′(x̄) maps X onto the entire space Y, expressed as Im F′(x̄) = Y. If Im F′(x̄) ≠ Y, we refer to x̄ as a singular (nonregular, irregular, degenerate) point of F.

1.1. Recollection of the Fundamental Results in the Regular Case

Regularity is a common assumption in many fundamental results of real and functional analysis, such as the inverse function theorem and the Implicit Function Theorem (IFT). In this section, we revisit some of these results. The theorems presented in this section have their roots in the following classical result.
Theorem 1 
(Banach open mapping principle [2], see also [3]). Let X and Y be Banach spaces. For any linear and bounded single-valued mapping A : X → Y, the following properties are equivalent:
 (a) 
A is surjective.
 (b) 
A is open.
 (c) 
There is a constant κ > 0 such that for all y ∈ Y there exists x ∈ X with Ax = y and ∥x∥ ≤ κ∥y∥.
The Banach open mapping principle can be extended to nonlinear mappings in various ways. One such result is stated in the next theorem. Let B_X(x, t) denote the open ball in X with center at x and radius t.
Theorem 2 
(Graves’ Theorem [4]). Let X and Y be Banach spaces, and let F : X → Y be a continuous function with F(0) = 0. Let A be a linear operator from X onto Y, and let κ > 0 be the corresponding constant from Theorem 1. Suppose that there exist constants δ > 0 with δ < κ⁻¹ and ε > 0 such that
∥F(x₁) − F(x₂) − A(x₁ − x₂)∥ ≤ δ∥x₁ − x₂∥
for all x₁, x₂ ∈ B_X(0, ε). Then, the equation y = F(x) has a solution x ∈ B_X(0, ε) whenever ∥y∥ ≤ (κ⁻¹ − δ)ε.
Note that the assumption of differentiability of F at 0 is not made, as the concept of a strictly differentiable function was introduced several years after the publication of Graves’ work [4]. Instead, the surjectivity of the operator A is used. The proof of Theorem 2, along with its reformulation in terms of a strictly differentiable function F and related discussions, can be found in [5]. For historical remarks, refer to [6].
Both the inverse function theorem and the Implicit Function Theorem can be deduced from Theorem 3 below; see Theorem 1.20 in Section 1.2 of [1] for details. We should also mention that certain variants of the inverse function theorem can be derived directly from the Implicit Function Theorem; see Section 4.2 for details.
To state Theorem 3, let us recall the definition of the Banach constant, denoted by C(A), for a bounded linear operator A between Banach spaces X and Y, as given in [1]:
C(A) = sup{ r ≥ 0 : B_Y(0, r) ⊆ A(B_X(0, 1)) } = inf{ ∥y∥ : y ∉ A(B_X(0, 1)) }.
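In finite dimensions with Euclidean norms, the Banach constant of a surjective matrix A coincides with its smallest singular value, since the minimum-norm preimage of y is the pseudoinverse image A⁺y. A numeric sketch (the matrix is our own example, not from the text):

```python
import numpy as np

def banach_constant(A, n_samples=20000, seed=0):
    """Estimate C(A) = sup{ r >= 0 : B_Y(0, r) inside A(B_X(0, 1)) } by sampling
    unit directions y: the largest reachable radius in direction y is 1/||A^+ y||,
    since x = A^+ y is the minimum-norm solution of Ax = y."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    Apinv = np.linalg.pinv(A)
    Y = rng.normal(size=(n_samples, m))
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    return (1.0 / np.linalg.norm(Y @ Apinv.T, axis=1)).min()

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])   # surjective: rank 2
est = banach_constant(A)
sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
print(est, sigma_min)   # the estimate approaches the smallest singular value
```

Each sampled ratio is at least σ_min, so the estimate converges to C(A) = σ_min(A) from above as the sampling gets denser.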
Theorem 3 
(Lyusternik–Graves Theorem, [1]). Let X and Y be Banach spaces. Suppose that F : X → Y is strictly differentiable and regular at x̄ ∈ X. Then, for any positive r < C(F′(x̄)), there exists an ε > 0 such that
B_Y(F(x), rt) ⊆ F(B_X(x, t)),
whenever ∥x − x̄∥ < ε and 0 ≤ t < ε.
For a thorough analysis of numerous consequences of Theorem 3, we refer the reader to the paper by Dmitruk, Milyutin, and Osmolovskii [7], where the theorem is called the “generalized Lyusternik theorem”. In the monographs by Dontchev [3], Ioffe [1], and Dontchev and Rockafellar [8], Theorem 3 is also called the “Lyusternik–Graves theorem”. From a more general point of view, the theorem is treated by Dontchev and Frankowska in [9,10].
One of the consequences of Theorem 3 is the description of tangent vectors to the level set of a continuously differentiable mapping F at regular points (see Section 4.1 below).

1.2. Generalizations

Since the early 1970s, due to theoretical interest and an increasing number of economic and industrial applications, a vast body of literature has been devoted to relaxing the surjectivity assumption of the derivative in the fundamental results, some of which are given above, while maintaining as much of their conclusions as possible. It is beyond the scope of this paper to provide an exhaustive survey of the existing generalizations of the theorems stated above. For the purpose of this paper, we can distinguish between generalizations exploiting higher-order derivatives (see, e.g., Frankowska [11]) and generalizations that attempt to relax the surjectivity assumption of the derivative without referring to higher-order derivatives (see, e.g., Ekeland [12], Hamilton [13], Bednarczuk, Leśniewski, and Rutkowski [14]). The theory of p-regularity belongs to the first group of generalizations, where higher-order derivatives are involved.
In this manuscript, we present the main concepts and results of the p-regularity theory, which has been developing successfully for the last forty years. One of the main goals of the theory of p-regularity is to replace the operator of the first derivative, which is not surjective, by a special mapping that is onto. Nonlinear mappings analyzed within the framework of p-regularity theory are those for which the derivatives up to the order p − 1 are not surjective at a given point x̄, where p ≥ 2. The main concept of the p-regularity theory is the construction of the p-factor operator, which is surjective at the point x̄ (see Definition 4). The special definition and the property of surjectivity of the p-factor operator lead to generalizations of the fundamental results of analysis, including the IFT and some classical numerical methods. The p-factor operator is defined in such a constructive way that it efficiently replaces the nonsurjective first derivative in a variety of situations. The structure of the p-factor operator is used as a basis for analyzing nonregular problems and for constructing numerical methods for solving degenerate nonlinear equations and optimization problems. We discuss these generalizations in this paper.
There are many publications that focus on the case of p = 2 and use a 2-factor operator in a variety of applications. In this work, we consider the more general case of p ≥ 2 and do not make some of the additional assumptions introduced and required in the publications of other authors.
In the framework of metric spaces, the concept related to the problems discussed in the present paper and attempting to generalize the classical results given above is the concept of metric regularity; see, for example, [1,3]. For a function f acting between Banach spaces X and Y and strictly differentiable at the given point x̄, Corollary 5.3 in [3] complements Theorem 5.1 in [3]. It concludes that metric regularity of f at x̄ for f(x̄) is equivalent to surjectivity of the Fréchet derivative Df(x̄) of f at x̄. In the case when X = Y = ℝⁿ, this is the same as the nonsingularity of the Jacobian matrix of f at x̄.
The theory of p-regularity and the apparatus of p-factor operators make it possible to create new methods in computational mathematics to solve nonlinear problems of mathematical physics, such as the Van der Pol differential equation, boundary-value problems with a small parameter, Partial Differential Equations (PDEs) where Poincaré’s method of small parameters fails, nonlinear degenerate dynamical systems, and others. This is associated with the fundamentally new design (after Newton) of a numerical method for solving essentially nonlinear (degenerate) problems, which is proposed and described in this paper. Moreover, the proposed approach will allow us to construct a new type of difference scheme in computational mathematics, stable and quickly convergent, for solving problems of nonlinear mathematical physics.
This pertains to the numerical solution of nonlinear equations such as the nonlinear heat equation, Burgers equation, Korteweg–de Vries equation, Navier–Stokes equation, etc. It also enables us to reorganize Numerical Analysis in a novel way, with solutions obtained being related to the mentioned problems. All of this is applicable to the emerging prospects for developing new technologies and designs in Computational Mathematics for solving problems and models related to Artificial Intelligence, optimization problems, dynamical systems, optimal control problems, etc. New opportunities are emerging for modeling and researching neural networks and creating new architectures for supercomputing.
It is important to highlight the development of fundamentally new and innovative methods in computational mathematics. The resulting schemes were far from any previous designs, emerging from years of research into the structure of degeneracy—specifically, the structure of degenerate mappings and solution sets of degenerate systems. Analyzing these structures requires methods that differ significantly from those used in the analysis of linear problems, leading to entirely new forms of mathematical objects, such as the p-factor operator F^(p)(x̄)[h]^(p−1), where h ∈ X. In its general form, this operator is formally introduced in Definition 4.
In degenerate problems, where the first derivative operator is not surjective, the p-factor operator serves as its replacement. At the same time, this research has uncovered a previously hidden nonlinear world with unexpectedly rich diversity (cf. Theorem 4). The range of possible new methods is remarkably broad, yet it follows a structured framework that remains stable under small changes—similar to regularization techniques. These methods can also be adjusted based on the specific problem being solved.
Moreover, recent studies revealed that the so-called ill-posed problems and essentially nonlinear ones are locally equivalent. This finding suggests that many important problems, such as inverse problems, can be solved using the p-factor method or p-factor regularization. This represents a new and promising direction in both theoretical mathematics and practical applications.

1.3. Aims and Scope

The main focus of this work is on analyzing and solving nonlinear equations of the form
F(x) = 0,    (3)
and optimization problems of the form
min f(x) subject to F(x) = 0,    (4)
where f : X → ℝ and F : X → Y are sufficiently smooth mappings, and X and Y are Banach spaces. Many interesting applied nonlinear problems can be written in one of these forms.
Nonlinear mappings F and problems of the form (3) and (4) can be divided into two classes, called regular (or nonsingular) and singular (or degenerate). The classification depends on the mapping F, which is either regular (that is, F′(x̄) : X → Y is onto for a given x̄ ∈ X) or singular (that is, F′(x̄) is not onto). Roughly speaking, regular mappings are those to which the Implicit Function Theorem arguments can be applied, and singular problems are those to which they cannot, at least not directly, be applied.
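In finite dimensions, checking which class a point falls into is a rank computation: x̄ is regular for F : ℝⁿ → ℝᵐ exactly when the Jacobian F′(x̄) has rank m. A minimal sketch, using the mapping F(x) = (x₁ + x₂, x₁x₂) from Example 1 below:

```python
import numpy as np

def is_regular(jacobian, x_bar, m):
    """x_bar is a regular point of F : R^n -> R^m iff rank F'(x_bar) = m,
    i.e. the Jacobian maps R^n onto all of R^m."""
    J = np.atleast_2d(jacobian(x_bar))
    return np.linalg.matrix_rank(J) == m

# F(x) = (x1 + x2, x1*x2); its Jacobian rows are (1, 1) and (x2, x1).
jac = lambda x: np.array([[1.0, 1.0], [x[1], x[0]]])

print(is_regular(jac, np.zeros(2), 2))            # False: singular at the origin
print(is_regular(jac, np.array([1.0, 0.0]), 2))   # True: rank 2 at (1, 0)
```

The same mapping is thus regular at generic points and singular only where the second Jacobian row becomes proportional to the first.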
The purpose of this paper is to give an overview of methods and tools of the p-regularity theory, and to show how they can be applied to analyze and develop methods for solving singular (irregular, degenerate) nonlinear equations and equality-constrained optimization problems. The development of the theory of p-regularity started in approximately 1983–1984, with the concept of p-regularity introduced by Tret’yakov in [15,16].
One of the main results of the theory of p-regularity is a detailed description of the structure of the zero set {x ∈ X : F(x) = 0} of a nonregular nonlinear mapping F : X → Y. It is interesting to note that there have been several examples in the history of mathematics when fundamental results were obtained independently in the same general time period. One such example related to the theory of p-regularity concerns theorems about the structure of the zero sets of an irregular mapping satisfying a special higher-order regularity condition. The result that we are referring to was obtained simultaneously by Buchner, Marsden, and Schecter [17] and by Tret’yakov [16]. The approaches proposed in [17] and in [16] are the same; the difference lies in the motivation and context for the main result in the two papers. In [17], the structure of the zero set around a point where the derivative is not surjective was studied in the context of bifurcation theory. Theorem 1.3 in [17] is referred to as a blowing-up result. In Fink and Rheinboldt [18], it was noted that Theorem 1.3 in [17] was a powerful generalization of the Morse Lemma, and some interesting counterexamples for a naive approach to the Morse Lemma were found. The same theorem, derived by Tret’yakov [16], is one of the main results of the p-regularity theory. The result led to various theoretical developments and applications of the theory to nonregular (or degenerate) problems in many areas of mathematics. We should note that the results and constructions introduced by Marsden and Tret’yakov are the same in the completely degenerate case.
This paper is organized as follows. We discuss essential nonlinearity and singular mappings in Section 2. We then recall the main concepts and definitions of the p-regularity theory in Section 3. We discuss some classical results of analysis and methods for solving nonlinear problems via the p-regularity theory in Section 4. In each subsection, we focus on singular problems that illustrate that the classical results are not necessarily satisfied in the nonregular case. We present generalizations of the same classical results, which were derived during the last forty years using the constructions and definitions of the p-regularity theory.
In this manuscript, we consider a variety of applications. We start Section 4 with the Lyusternik theorem in Section 4.1. The Lyusternik theorem plays an important role in the description of the solution sets of nonlinear equations and feasible sets of optimization problems in the regular case. However, the classical Lyusternik theorem might not hold if the mapping F is singular at a given point x̄. The first generalization of the classical Lyusternik theorem for p-regular mappings was derived and proved simultaneously in [15,17]. It can be applied to describe the zero set of a p-regular mapping. The Representation Theorem and the Morse Lemma are also presented in Section 4.1. We continue with the consideration of the Implicit Function Theorem in Section 4.2. Numerous books and papers, such as [13,19], discuss the classical Implicit Function Theorem. However, the classical version of the theorem is not applicable when a mapping F : X × Y → Z is not regular, meaning that F′_y(x̄, ȳ) : Y → Z is not onto for some (x̄, ȳ), where the index y denotes the partial derivative with respect to the variable y (for a more detailed explanation of the notation, see the “General Notation” below). We present a generalization of the Implicit Function Theorem for nonregular mappings in Section 4.2. In Section 4.3, we cover the p-factor Newton’s method for solving nonlinear Equation (3) and finding critical points of an unconstrained optimization problem. Optimality conditions for equality-constrained optimization problems and Lagrange multiplier theorems for the regular and degenerate cases are considered in Section 4.4. The modified Lagrange function method for 2-regular problems is covered in Section 4.5. Singular problems of the calculus of variations and optimality conditions for p-regular problems of the calculus of variations are considered in Section 4.6. The existence of solutions to nonlinear equations in regular and degenerate cases is covered in Section 4.7.
Second-order nonlinear ordinary differential equations with boundary conditions are presented in Section 4.8. Newton interpolation polynomials and the p-factor interpolation method are considered in Section 4.9. We make some concluding remarks in Section 5.

General Notation

Let L(X, Y) be the space of all continuous linear operators from X to Y, and for a given linear operator Λ ∈ L(X, Y), let us denote its kernel and image by Ker Λ = {x ∈ X : Λx = 0} and Im Λ = {y ∈ Y : y = Λx for some x ∈ X}, respectively. Also, Λ* : Y* → X* denotes the adjoint of Λ, where X* and Y* denote the dual spaces of X and Y, respectively.
Let p be a natural number and let A : X × X × ⋯ × X (with p copies of X) → Y be a continuous symmetric p-multilinear mapping. The p-form associated with A is the map A[·]^p : X → Y defined by
A[x]^p = A(x, x, …, x)
for x ∈ X, where all instances of x in the expression A(x, x, …, x) are the same. Alternatively, we may simply view A[·]^p as a homogeneous polynomial Q : X → Y of degree p with Q(αx) = α^p Q(x). Accordingly, the space of continuous homogeneous polynomials Q : X → Y of degree p is denoted by Q_p(X, Y).
If F : X → Y is of class C², then its derivative F′ at a point x̄ ∈ X is a continuous linear operator F′(x̄) ∈ L(X, Y). The second derivative F″(x̄) is a bilinear operator on X × X and can be viewed as a mapping from X to L(X, Y). See ([20], Chapter VIII) for further details.
If F : X → Y is of class C^p, we denote by F^(p)(x̄) : X^p → Y the pth-order derivative of F at a given point x̄. This is a symmetric p-multilinear map from X^p to Y. The associated p-form, also called the pth-order mapping, is defined as
F^(p)(x̄)[h]^p = F^(p)(x̄)(h, h, …, h).
In particular, for p = 2, we have
F″(x̄)[h]² = F″(x̄)(h, h).
Furthermore, for a given p-multilinear map, we introduce the following key notation for the p-kernel of the pth-order mapping:
Ker^p F^(p)(x̄) = { h ∈ X : F^(p)(x̄)[h]^p = 0 }.
Here, h represents elements of X that are repeatedly applied in the multilinear mapping. This set is also referred to as the locus of F^(p)(x̄).
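For a concrete toy case (our own example, not from the text): F(x) = x₁² − x₂² on ℝ² has F′(0) = 0, and the 2-kernel Ker² F″(0) is the set {h : h₁² = h₂²}, a union of two lines:

```python
import numpy as np

hessian = 2.0 * np.diag([1.0, -1.0])   # F''(0) for F(x) = x1^2 - x2^2; F'(0) = 0

def two_form(h):
    """F''(0)[h]^2 = F''(0)(h, h): the quadratic form of the second derivative."""
    return h @ hessian @ h

def in_two_kernel(h, tol=1e-12):
    """Membership test for Ker^2 F''(0) = { h : F''(0)[h]^2 = 0 }."""
    return abs(two_form(h)) < tol

print(in_two_kernel(np.array([1.0, 1.0])))    # True: the line h2 = h1
print(in_two_kernel(np.array([1.0, -1.0])))   # True: the line h2 = -h1
print(in_two_kernel(np.array([1.0, 0.0])))    # False: the 2-form equals 2
```

Note that this 2-kernel is a cone (two lines through the origin) rather than a linear subspace, which is one reason p-regularity is later formulated along individual directions h.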
When F : X × Y → Z is a continuously differentiable mapping, we use the notation F′_y(x̄, ȳ) to denote the partial derivative of F with respect to y. Specifically, for (x̄, ȳ) ∈ X × Y, the operator F′_y(x̄, ȳ) is defined as the Fréchet derivative of F with respect to y, satisfying:
lim_{h→0} ∥F(x̄, ȳ + h) − F(x̄, ȳ) − F′_y(x̄, ȳ)h∥ / ∥h∥ = 0,
where h ∈ Y and F′_y(x̄, ȳ) : Y → Z is a continuous linear operator.

2. Essential Nonlinearity and Singular Mappings

Let X, Y be Banach spaces. Let F : X → Y be a mapping in C¹(W), where W is a neighborhood of a point x̄ ∈ X. According to Definition 1, a mapping F is called regular at x̄ if
Im F′(x̄) = Y.    (5)
The following lemma on the local representation of a regular mapping holds.
Lemma 1 
(Lemma 1, Section 1.3.3 of [21]). Let X and Y be Banach spaces, let W be a neighborhood of a point x̄ ∈ X, and let F : X → Y be of class C¹(W). If F is regular at x̄, then there exist a neighborhood U of 0, a neighborhood V ⊆ W, and a diffeomorphism φ : U → V such that:
 1. φ(0) = x̄,
 2. F(φ(x)) = F(x̄) + F′(x̄)x for all x ∈ U,
 3. φ′(0) = I_X (the identity mapping on X).
Lemma 1 states that the diffeomorphism φ transforms F locally into an affine mapping:
F(φ(x)) = F(x̄) + F′(x̄)x for all x ∈ U.    (6)
In other words, φ provides a local reparametrization under which F takes an affine form in U. This result is also known as the local “trivialization theorem” (Theorem 1.26 of [1]).
If the regularity condition (5) is not satisfied, then, in general, F cannot be locally linearized because such a diffeomorphism φ does not exist.
There exist many mappings that do not admit local linearization. The concept of essentially nonlinear mappings, introduced in [22], provides a formal framework for describing such cases.
Definition 2. 
Let V be a neighborhood of a point x̄ in X, and let U ⊆ X be a neighborhood of 0. A mapping F : V → Y, where F ∈ C²(V), is said to be essentially nonlinear at x̄ if there exists a perturbation of the form
F̃(x̄ + x) = F(x̄ + x) + ω(x), where ω(x) = o(∥x∥),
such that there does not exist any nondegenerate transformation φ : U → V, φ ∈ C¹(U), satisfying φ(0) = x̄, φ′(0) = I_X, and such that Equation (6) holds with φ and F̃.
We say that ω(x) = o(∥x∥) as x → 0 if lim_{x→0} ∥ω(x)∥/∥x∥ = 0. For example, if ω : X → ℝ, then ω(x) = ∥x∥² is o(∥x∥).
Definition 3. 
We say that the mapping F : X → Y is singular (or degenerate) at x̄ ∈ X if it fails to be regular; that is, if its derivative is not onto:
Im F′(x̄) ≠ Y.
The following Theorem 4, which establishes the relationship between the two notions of essential nonlinearity and singularity, was derived as Theorem 2.3 in [22]. We provide its proof here to complete our development.
Theorem 4. 
Suppose V is a neighborhood of a given point x̄ ∈ X, and X, Y are Banach spaces. Suppose F : V → Y is in C². If F(x̄) = 0, then F is essentially nonlinear at the point x̄ if and only if F is singular at x̄.
Proof. 
Suppose F is singular at the point x̄ and F(x̄) = 0. Since Im F′(x̄) ≠ Y, there exists a nonzero element ξ ∈ Y such that
ξ ∉ Im F′(x̄).
Thus, ξ ∈ Y ∖ Im F′(x̄). Since F′(x̄) is linear, we can assume ∥ξ∥ = 1.
Assume on the contrary that F is not essentially nonlinear at x̄. Define the mapping F̃ : X → Y by
F̃(x̄ + x) = F(x̄) + F′(x̄)x + ξ∥x∥²,    (9)
where ξ∥x∥² ∉ Im F′(x̄) for x ≠ 0.
By Definition 2, with F̃ defined in (9), there exist a neighborhood U ⊆ X of 0 and a nondegenerate transformation φ : U → V, φ ∈ C¹(U), such that φ(0) = x̄, φ′(0) = I_X, and (6) holds with φ and F̃:
F̃(φ(x)) = F̃(x̄) + F̃′(x̄)x = F(x̄) + F′(x̄)x    (10)
for all x ∈ U.
Since F(x̄) = 0 and F′(x̄)x ∈ Im F′(x̄), it follows from (10) that
F̃(φ(x)) ∈ Im F′(x̄).    (11)
However, using F(x̄) = 0, φ(0) = x̄, and φ′(0) = I_X, we obtain
F̃(φ(x)) = F̃(x̄ + (φ(x) − x̄)) = F(x̄) + F′(x̄)(φ(x) − x̄) + ξ∥φ(x) − x̄∥² = F′(x̄)(φ(x) − x̄) + ξ∥φ(0) + φ′(0)x + ω₁(x) − x̄∥² = F′(x̄)(φ(x) − x̄) + ξ∥x + ω₁(x)∥²,
where ω₁(x) = o(∥x∥). Thus, for small x ≠ 0,
ξ∥x + ω₁(x)∥² ≠ 0.    (12)
Taking into account that ξ∥x∥² ∉ Im F′(x̄) for any x ≠ 0, along with Equation (12) and the fact that F′(x̄)(φ(x) − x̄) ∈ Im F′(x̄), we conclude that
F̃(φ(x)) ∉ Im F′(x̄).
This contradicts (11), and therefore F is essentially nonlinear at x̄.
To prove the converse, suppose that F is essentially nonlinear at x̄ but not singular; that is, suppose F is regular at this point.
By the persistence of the regularity condition, for any perturbation
F̃(x̄ + x) = F(x̄ + x) + ω(x),
where ω(x) = o(∥x∥), the map F̃ remains regular at x̄, and F̃′(x̄) = F′(x̄). Hence, by Lemma 1, F̃ can be written as
F̃(φ(x)) = F̃(x̄) + F̃′(x̄)x,    (14)
where φ(0) = x̄ and φ′(0) = I_X. This contradicts the definition of the essential nonlinearity of the mapping F. □
Under additional splitting assumptions, which are not made here, the representation (14) would be a standard consequence of the IFT, as in, for example, ([23], §2.5).

3. Elements of p -Regularity Theory

For the purpose of describing essentially nonlinear problems, a concept of p-regularity was introduced by Tret’yakov [15,16,24] using the notion of a p-factor operator. In this section, we introduce the main definitions of the p-regularity theory, as presented, for example, in [21,22,24].
Let X and Y be Banach spaces. Suppose that F : X → Y is a C^p-class mapping that is singular (nonregular) at a given point x̄ ∈ X. We construct the p-factor operator under the assumption that the space Y can be decomposed into the (topological) direct sum
Y = Y₁ ⊕ ⋯ ⊕ Y_p,    (15)
where Y₁ = cl(Im F′(x̄)) is the closure of the image of the first derivative of F evaluated at x̄. To define the remaining spaces, let S₁ = Y and let S₂ ⊆ Y be a closed complementary subspace to Y₁, that is, Y = Y₁ ⊕ S₂, if S₂ exists. Next, let P_{S₂} : Y → S₂ be the projection operator onto S₂ along Y₁. Define Y₂ as the closed linear span of the projection of the image of the quadratic map:
Y₂ = cl(span Im P_{S₂}F″(x̄)[h]²).
More generally, define Y_i inductively as follows:
Y_i = cl(span Im P_{S_i}F^(i)(x̄)[h]^i) ⊆ S_i, i = 2, …, p − 1, p > 2,
where S_i is a choice of a closed complementary subspace to Y₁ ⊕ ⋯ ⊕ Y_{i−1} with respect to Y, and P_{S_i} : Y → S_i is the projection operator onto S_i along Y₁ ⊕ ⋯ ⊕ Y_{i−1} for i = 2, …, p. Finally, let Y_p = S_p. The order p is the minimum number for which (15) holds. In particular, for p = 2, we have Y = S₁ = Y₁ ⊕ S₂. When Y is a Hilbert space, there always exists a complementary subspace to Y₁, namely the orthogonal complement Y₂ = Y₁^⊥.
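In ℝᵐ (a Hilbert space), the decomposition for p = 2 is standard linear algebra: Y₁ is the column space of F′(x̄) and Y₂ = Y₁^⊥, with P_{S₂} the orthogonal projector onto Y₂. A sketch for an illustrative Jacobian of our own choosing:

```python
import numpy as np

J = np.array([[1.0, 1.0],
              [0.0, 0.0]])   # F'(x_bar), with Im F'(x_bar) = span{(1, 0)}

# Y1 = Im F'(x_bar): an orthonormal basis is given by the left singular vectors
# of J belonging to the nonzero singular values.
U, s, Vt = np.linalg.svd(J)
r = int(np.sum(s > 1e-12))
Q1 = U[:, :r]                        # orthonormal basis of Y1
P_Y1 = Q1 @ Q1.T                     # orthogonal projector onto Y1
P_S2 = np.eye(J.shape[0]) - P_Y1     # projector onto Y2 = Y1^perp

print(P_Y1)   # projects onto the first coordinate
print(P_S2)   # projects onto the second coordinate
```

The SVD-based basis makes the construction numerically stable even when the rank decision is borderline; the tolerance 1e-12 is our own choice.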
Remark 1. 
The subspaces Y_i in assumption (15) can be replaced, in further considerations, by subspaces constructed using the so-called factorization procedure. Specifically, we define
Y₁ = cl(Im F′(x̄)),
as before. However, instead of Y₂, we use the space Y/Y₁, called the quotient (or factor) space. Note that the quotient space is itself a Banach space (see, e.g., [25]). Moreover, if decomposition (15) holds, then Y₂ is isomorphic to Y/Y₁. For simplicity of presentation, we continue to use assumption (15).
Define the following mappings (see Tret’yakov [24]):
f_i : X → Y_i, f_i(x) = P_{Y_i}F(x), i = 1, …, p,    (16)
where P_{Y_i} : Y → Y_i is the projection operator onto Y_i along Y₁ ⊕ ⋯ ⊕ Y_{i−1} ⊕ Y_{i+1} ⊕ ⋯ ⊕ Y_p with respect to Y for i = 1, …, p. Recall that P_{Y_i} is the projection onto Y_i along (or parallel to) W_i = Y₁ ⊕ ⋯ ⊕ Y_{i−1} ⊕ Y_{i+1} ⊕ ⋯ ⊕ Y_p if Ker P_{Y_i} = W_i.
In our notation, f_i^(k)(x̄) denotes the kth derivative of f_i at x̄. By the construction of the subspaces Y_i, we have
f_i^(k)(x̄) = P_{Y_i}F^(k)(x̄) = 0, i = 1, …, p, k = 1, …, i − 1.    (17)
We define a mapping F as completely degenerate up to order p if
F^(k)(x̄) = 0 for k = 1, …, p − 1.    (18)
Remark 2. 
If the mapping F is completely degenerate up to order p, then (17) implies that each mapping f_i, defined in (16), is also completely degenerate at x̄ up to order i for i = 1, …, p. That is,
f_i^(k)(x̄) = 0, k = 1, …, i − 1, i = 1, …, p.
With all the notation established above, we are now ready to define the p-factor operator.
Definition 4. 
For a fixed vector h ∈ X, h ≠ 0, and mappings f_i defined in (16), the linear operator Ψ_p(h) ∈ L(X, Y₁ ⊕ ⋯ ⊕ Y_p),
Ψ_p(h)x = f₁′(x̄)x + f₂″(x̄)[h]x + ⋯ + f_p^(p)(x̄)[h]^(p−1)x, x ∈ X,    (19)
is called the p-factor operator. Alternatively, the following equivalent form can be used:
Ψ_p(h)x = f₁′(x̄)x + (1/2!)f₂″(x̄)[h]x + ⋯ + (1/p!)f_p^(p)(x̄)[h]^(p−1)x.
Note that when F is regular at x̄, meaning Im F′(x̄) = Y, we have Y₁ = Y. In this case, the p-factor operator reduces to the operator of the first derivative: Ψ₁(h)x = F′(x̄)x for any x ∈ X.
For p = 2, the p-factor operator (19) takes the form
Ψ₂(h)x = f₁′(x̄)x + f₂″(x̄)[h]x, x ∈ X,    (20)
or, equivalently, Ψ₂(h)x = f₁′(x̄)x + (1/2)f₂″(x̄)[h]x for x ∈ X, where h ∈ X and h ≠ 0. In view of (17), the construction of the operator Ψ₂(h) (and Ψ_p(h) in general) is closely tied to the decomposition (15) of the image space. The idea is to use higher-order derivatives of F up to order p to obtain (15).
In particular, for p = 2 and Ψ₂(h) given by (20), we seek those h ∈ X that ensure the equality Im f₂″(x̄)[h] = S₂, where S₂ is the complementary space to Y₁.
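In ℝ², the 2-factor operator (20) is concrete matrix algebra. A sketch for the mapping F(x) = (x₁ + x₂, x₁x₂) of Example 1 below, with Y₁ = span{(1, 0)} and Y₂ = span{(0, 1)}; the particular direction h tried here is our own choice:

```python
import numpy as np

# F(x) = (x1 + x2, x1*x2) at x_bar = (0, 0), as in Example 1 below.
P_Y1 = np.diag([1.0, 0.0])    # projection onto Y1 = span{(1, 0)}
J = np.array([[1.0, 1.0],
              [0.0, 0.0]])    # F'(0)
H2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])   # Hessian of f2(x) = x1*x2 (the Y2-component of F)

def psi2(h):
    """2-factor operator Psi_2(h) = f1'(0) + f2''(0)[h] as a 2x2 matrix:
    first row from P_Y1 F'(0), second row from f2''(0)[h] = (h2, h1)."""
    return P_Y1 @ J + np.vstack([np.zeros(2), H2 @ h])

M = psi2(np.array([1.0, 0.0]))
print(M)                                # [[1. 1.] [0. 1.]]
print(np.linalg.matrix_rank(M) == 2)    # True: Psi_2(h) is surjective along this h
```

Although F′(0) alone has rank 1, adding the second row f₂″(0)[h] restores surjectivity for suitable directions h, which is exactly the role the 2-factor operator plays.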
If a mapping F is completely degenerate up to order p, meaning that (18) holds, and Im F^(p)(x̄)[h]^(p−1) = Y, then the p-factor operator simplifies to Ψ_p(h)x = F^(p)(x̄)[h]^(p−1)x.
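In the completely degenerate case the construction is even simpler; for instance (our own example) F(x) = x₁² − x₂² : ℝ² → ℝ satisfies F′(0) = 0, and Ψ₂(h) is just the 1×2 matrix F″(0)[h]:

```python
import numpy as np

hessian = 2.0 * np.diag([1.0, -1.0])   # F''(0) for F(x) = x1^2 - x2^2, F'(0) = 0

def psi2(h):
    """Completely degenerate case: Psi_2(h)x = F''(0)[h]x, here a 1x2 matrix."""
    return (hessian @ h).reshape(1, 2)

h = np.array([1.0, 1.0])    # a direction in the 2-kernel { h : h1^2 = h2^2 }
print(psi2(h))                               # [[ 2. -2.]]
print(np.linalg.matrix_rank(psi2(h)) == 1)   # True: onto R, so surjective along h
```

Since Ψ₂(h) is nonzero (hence onto ℝ) along every nonzero direction of the 2-kernel, this map is 2-regular at the origin even though its first derivative vanishes there.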
Recall that a bounded linear operator T : X → Y between Banach spaces X and Y is called Fredholm if the kernel of T has finite dimension and the image of T is a closed subspace of finite codimension in Y (see, for example, ([26], Chapter 4) and [27]).
Hence, in the case of a Fredholm operator F′(x̄), the subspace Y₁ = Im F′(x̄) has a complementary finite-dimensional subspace Z₂ such that Y = Y₁ ⊕ Z₂.
With the p-factor operator Ψ_p(h) defined in (19), we are now ready to state a few definitions of various types of p-regularity for a C^p-class mapping F : X → Y.
Definition 5. 
We say that the mapping $F: X \to Y$ is p-regular at a given point $\bar{x}$ along an element $h \in X$ if
$$\operatorname{Im} \Psi_p(h) = Y.$$
Remark 3. 
The condition of p-regularity of the mapping F at the point $\bar{x}$ along $h \in X$ is equivalent to the following condition:
$$f_p^{(p)}(\bar{x})[h]^{p-1}\big(\operatorname{Ker} \Psi_{p-1}(h)\big) = Y_p,$$
where $\Psi_{p-1}(h) = f_1'(\bar{x}) + f_2''(\bar{x})[h] + \cdots + f_{p-1}^{(p-1)}(\bar{x})[h]^{p-2}$. In particular, when $p = 2$, we have $\Psi_1(h) = f_1'(\bar{x})$, and condition (21) reduces to $f_2''(\bar{x})[h]\big(\operatorname{Ker} f_1'(\bar{x})\big) = Y_2$, which follows from elementary algebra.
We also define the k-kernel of the kth-order mapping $f_k^{(k)}(\bar{x})$ as follows:
$$\operatorname{Ker}^k f_k^{(k)}(\bar{x}) = \{h \in X \mid f_k^{(k)}(\bar{x})[h]^k = 0\}.$$
Definition 6. 
We say the mapping F is p-regular at $\bar{x}$ if it is p-regular along any h from the set
$$H_p(\bar{x}) = \Big\{h \in X \ \Big|\ h \in \bigcap_{i=1}^{p} \operatorname{Ker}^i f_i^{(i)}(\bar{x})\Big\} \setminus \{0\},$$
where the i-kernel of the ith-order mapping $f_i^{(i)}(\bar{x})$ is defined in (22).
For a linear surjective operator $\Psi_p(h): X \to Y$ between Banach spaces, we denote its right inverse by $\{\Psi_p(h)\}^{-1}$ (see [28]). Therefore, $\{\Psi_p(h)\}^{-1}: Y \to 2^X$ is multivalued and we have
$$\{\Psi_p(h)\}^{-1}(y) = \{x \in X \mid \Psi_p(h)x = y\}.$$
We define the norm of $\{\Psi_p(h)\}^{-1}$ by
$$\big\|\{\Psi_p(h)\}^{-1}\big\| = \sup_{\|y\| = 1} \inf\{\|x\| \mid x \in \{\Psi_p(h)\}^{-1}(y)\}.$$
We say that $\{\Psi_p(h)\}^{-1}$ is bounded if $\|\{\Psi_p(h)\}^{-1}\| < \infty$.
Definition 7. 
A mapping $F \in C^p$ is called strongly p-regular at a point $\bar{x}$ if there exists $\alpha > 0$ such that
$$\sup_{h \in H_p^\alpha(\bar{x})} \big\|\{\Psi_p(h)\}^{-1}\big\| < \infty,$$
where $\{\Psi_p(h)\}^{-1}$ is the right inverse operator of $\Psi_p(h)$ and
$$H_p^\alpha(\bar{x}) = \big\{h \in X \ \big|\ \|f_i^{(i)}(\bar{x})[h]^i\| \le \alpha \text{ for all } i = 1, \dots, p, \ \|h\| = 1\big\}.$$
The following examples illustrate the construction of the p-factor operator for the cases p = 2 and p = 3 .
Example 1. 
Consider the mapping $F: \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$$F(x) = \begin{pmatrix} x_1 + x_2 \\ x_1 x_2 \end{pmatrix}.$$
Let $\bar{x} = (0, 0)^T$. Then, the Jacobian $F'(\bar{x}) = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ is singular (degenerate) at $\bar{x}$, and $\operatorname{Im} F'(\bar{x}) = \operatorname{span}\{(1, 0)\} \ne \mathbb{R}^2$. Let $Y_1 = \operatorname{span}\{(1, 0)\}$ and $Y_2 = \operatorname{span}\{(0, 1)\}$. To construct the 2-factor operator, we use the projection matrices
$$P_{Y_1} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad P_{Y_2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
According to Equation (16), the mappings $f_1: \mathbb{R}^2 \to Y_1$ and $f_2: \mathbb{R}^2 \to Y_2$ have the form
$$f_1(x) = \begin{pmatrix} x_1 + x_2 \\ 0 \end{pmatrix} \quad \text{and} \quad f_2(x) = \begin{pmatrix} 0 \\ x_1 x_2 \end{pmatrix}.$$
Then
$$f_1'(x) = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad f_2'(x) = \begin{pmatrix} 0 & 0 \\ x_2 & x_1 \end{pmatrix},$$
and
$$f_2''(x)h = \begin{pmatrix} 0 & 0 \\ h_2 & h_1 \end{pmatrix}.$$
Hence, for $h = (h_1, h_2)^T \in \mathbb{R}^2$, the 2-factor operator is defined by
$$\Psi_2(h) = f_1'(\bar{x}) + f_2''(\bar{x})h = \begin{pmatrix} 1 & 1 \\ h_2 & h_1 \end{pmatrix}.$$
It can be verified that the 2-factor operator is surjective whenever $h_1 \ne h_2$.
In this example, we have
$$\operatorname{Ker}^1 f_1'(\bar{x}) = \operatorname{span}\{(1, -1)\} \quad \text{and} \quad \operatorname{Ker}^2 f_2''(\bar{x}) = \operatorname{span}\{(1, 0)\} \cup \operatorname{span}\{(0, 1)\}.$$
This implies that $H_2(\bar{x}) = \emptyset$. Hence, according to Definition 5, the mapping F is 2-regular at $\bar{x}$ along any $h \in X$ with $h_1 \ne h_2$, but it is not 2-regular at $\bar{x}$ in the sense of Definition 6. As we observe, it may happen that F is 2-regular along some $h \in X$ even though $H_2(\bar{x}) = \emptyset$. Therefore, a given mapping F may fail to be 2-regular at a point while being 2-regular along particular directions $h \in X$, $h \ne 0$.
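Because $\Psi_2(h)$ in Example 1 is just a $2 \times 2$ matrix, its surjectivity along a given direction can be checked numerically. A minimal sketch (the helper name `psi2` and the chosen test directions are ours, not from the paper):

```python
import numpy as np

# Example 1: F(x) = (x1 + x2, x1*x2) is degenerate at xbar = (0, 0).
# The 2-factor operator along h is the matrix Psi_2(h) = f1'(xbar) + f2''(xbar)h.
def psi2(h):
    return np.array([[1.0, 1.0],
                     [h[1], h[0]]])

# det Psi_2(h) = h1 - h2, so the operator is surjective exactly when h1 != h2:
assert np.linalg.matrix_rank(psi2((1.0, -1.0))) == 2   # 2-regular along (1, -1)
assert np.linalg.matrix_rank(psi2((1.0, 1.0))) == 1    # degenerate along (1, 1)
```

The rank test mirrors the determinant condition $\det \Psi_2(h) = h_1 - h_2 \ne 0$ stated in the example.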
Example 2. 
Case $p = 3$. Consider the mapping $F: \mathbb{R}^2 \to \mathbb{R}^3$ defined by
$$F(x) = \begin{pmatrix} x_1 + x_2 \\ x_1 x_2^2 \\ x_1^3 \end{pmatrix}.$$
With $\bar{x} = (0, 0)^T = 0$, we obtain
$$F'(x) = \begin{pmatrix} 1 & 1 \\ x_2^2 & 2x_1 x_2 \\ 3x_1^2 & 0 \end{pmatrix} \quad \text{and} \quad F'(0) = \begin{pmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}.$$
Then, with $h = (h_1, h_2)^T$,
$$F'(x)h = \begin{pmatrix} h_1 + h_2 \\ x_2^2 h_1 + 2x_1 x_2 h_2 \\ 3x_1^2 h_1 \end{pmatrix},$$
$$F''(x)[h]h = F''(x)[h]^2 = \begin{pmatrix} 0 \\ 4x_2 h_1 h_2 + 2x_1 h_2^2 \\ 6x_1 h_1^2 \end{pmatrix}, \quad F'''(\bar{x})[h]^2 = \begin{pmatrix} 0 & 0 \\ 2h_2^2 & 4h_1 h_2 \\ 6h_1^2 & 0 \end{pmatrix}.$$
In this example,
$$Y_1 = \operatorname{Im} F'(0) = \operatorname{span}\{(1, 0, 0)\}, \quad Y_2 = \{(0, 0, 0)\}, \quad Y_3 = \operatorname{span}\{(0, 1, 0), (0, 0, 1)\}.$$
To construct the 3-factor operator, we use the projection matrices
$$P_{Y_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad P_{Y_3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Then, using Equation (16), we define $f_1$ and $f_3$ as follows:
$$f_1(x) = P_{Y_1}F(x) = \begin{pmatrix} x_1 + x_2 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad f_3(x) = P_{Y_3}F(x) = \begin{pmatrix} 0 \\ x_1 x_2^2 \\ x_1^3 \end{pmatrix}.$$
By the definition of the 3-factor operator, we obtain
$$\Psi_3(h) = f_1'(\bar{x}) + f_3'''(\bar{x})[h]^2 = \begin{pmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 2h_2^2 & 4h_1 h_2 \\ 6h_1^2 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2h_2^2 & 4h_1 h_2 \\ 6h_1^2 & 0 \end{pmatrix}.$$
For $h = (h_1, 0)$ with $h_1 \ne 0$, the 3-factor operator takes the form
$$\Psi_3(h) = \begin{pmatrix} 1 & 1 \\ 0 & 0 \\ 6h_1^2 & 0 \end{pmatrix}$$
and $\operatorname{cl}\operatorname{Im}\Psi_3(h) = \operatorname{span}\{(1, 0, 0), (1, 0, 1)\}$.
For $h = (0, h_2)$ with $h_2 \ne 0$, the 3-factor operator takes the form
$$\Psi_3(h) = \begin{pmatrix} 1 & 1 \\ 2h_2^2 & 0 \\ 0 & 0 \end{pmatrix}$$
and $\operatorname{cl}\operatorname{Im}\Psi_3(h) = \operatorname{span}\{(1, 1, 0), (1, 0, 0)\}$.
Now, using (22), we determine the elements $h = (h_1, h_2)$ in the kernels by solving the following equations:
$$f_1'(\bar{x})h = \begin{pmatrix} h_1 + h_2 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \quad f_3'''(\bar{x})[h]^3 = \begin{pmatrix} 0 \\ 6h_1 h_2^2 \\ 6h_1^3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Thus, we obtain
$$\operatorname{Ker}^1 f_1'(0) = \{h = (h_1, h_2) \mid h_1 + h_2 = 0\} = \operatorname{span}\{(1, -1)\},$$
and
$$\operatorname{Ker}^3 f_3'''(0) = \{h = (h_1, h_2) \mid 6h_1 h_2^2 = 0, \ 6h_1^3 = 0\} = \operatorname{span}\{(0, 1)\}.$$
Finally, one can verify that
$$\operatorname{Ker}^2 f_2''(0, 0) = \mathbb{R}^2.$$
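The matrices and kernels of Example 2 are also easy to verify numerically. A minimal sketch (the helper names `psi3` and `f3_cubed` are ours):

```python
import numpy as np

# Example 2: F(x) = (x1 + x2, x1*x2^2, x1^3), xbar = 0, p = 3.
def psi3(h):
    h1, h2 = h
    return np.array([[1.0, 1.0],
                     [2 * h2**2, 4 * h1 * h2],
                     [6 * h1**2, 0.0]])

# Psi_3 maps R^2 into R^3, so its image is at most two-dimensional:
assert np.linalg.matrix_rank(psi3((1.0, 0.0))) == 2
assert np.linalg.matrix_rank(psi3((0.0, 1.0))) == 2

# The 3-kernel condition f3'''(0)[h]^3 = (0, 6*h1*h2^2, 6*h1^3) = 0 forces h1 = 0:
def f3_cubed(h):
    h1, h2 = h
    return np.array([0.0, 6 * h1 * h2**2, 6 * h1**3])

assert np.allclose(f3_cubed((0.0, 1.0)), 0.0)        # h = (0, 1) lies in Ker^3
assert not np.allclose(f3_cubed((1.0, -1.0)), 0.0)   # h = (1, -1) does not
```

This confirms that $\operatorname{Ker}^3 f_3'''(0) = \operatorname{span}\{(0, 1)\}$ and that the image of $\Psi_3(h)$ never fills $\mathbb{R}^3$.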

4. Singular Problems and Classical Results via the p -Regularity Theory

4.1. Lyusternik Theorem and Description of Solution Sets

The Lyusternik theorem plays an important role in describing solution sets of nonlinear equations and feasible sets of optimization problems in the regular case. This theorem has practical applications across various fields. It is particularly important in the study of optimization and variational problems. By characterizing the tangent cone, the theorem provides valuable information about critical points and the behavior of solutions in their vicinity. In control theory, the Lyusternik theorem can be used to analyze the stability and controllability of nonlinear control systems. By examining the tangent cone, one can gain insights into system behavior near critical points and determine the conditions necessary for stability and controllability. The Lyusternik theorem is also useful in the development and analysis of optimization algorithms, such as gradient-based methods. By characterizing the tangent cone, the theorem helps in designing efficient algorithms and understanding their convergence properties.
These are just a few examples of the practical applications of the Lyusternik theorem. Its insights into the tangent cone are valuable in many areas, including optimization, control theory, partial differential equations, and geometry, providing a deeper understanding of the behavior of solutions and critical points in a variety of mathematical problems.
Consider a nonlinear mapping $F: U \to Y$, where U is a neighborhood of a point $\bar{x} \in X$. We are interested in the description of the set $M(\bar{x})$:
$$M(\bar{x}) = \{x \in U \mid F(x) = F(\bar{x})\}.$$
This notation highlights the fact that we will focus our attention on x ¯ . It is useful to recall the following definition of tangent vectors and tangent cones (see, for instance, [29]).
Definition 8. 
Let M be a subset of a Banach space X. A vector $h \in X$ is said to be tangent to the set M at a point $\bar{x} \in M$ if there exist $\varepsilon > 0$ and a mapping $r: [0, \varepsilon] \to X$ such that
$$\bar{x} + th + r(t) \in M \quad \text{for all } t \in [0, \varepsilon],$$
and
$$\lim_{t \to 0} \frac{\|r(t)\|}{t} = 0.$$
The set of vectors tangent to M at the point x ¯ is called the tangent cone to the set M at x ¯ , and is denoted by T 1 M ( x ¯ ) .

4.1.1. Lyusternik Theorem in the Regular Case

In the regular case, the Lyusternik theorem (see [30]) can be formulated as follows.
Theorem 5 
(Lyusternik theorem). Let X and Y be Banach spaces, and let U be a neighborhood of x ¯ in X. Assume that F : U Y is Fréchet differentiable on U, and that its derivative F : U L ( X , Y ) is continuous at x ¯ . Suppose further that F is regular at x ¯ .
Then, the tangent cone to the set $M(\bar{x})$ defined in (26) coincides with the kernel of the operator $F'(\bar{x})$:
$$T_1 M(\bar{x}) = \operatorname{Ker} F'(\bar{x}).$$
If F is singular at $\bar{x}$, then in some problems we may have $T_1 M(\bar{x}) \ne \operatorname{Ker} F'(\bar{x})$, as illustrated in the following example.
Example 3. 
Let $X = \mathbb{R}^2$, and let $x = (x_1, x_2) \in \mathbb{R}^2$. Define the mapping $F: \mathbb{R}^2 \to \mathbb{R}$ by
$$F(x) = x_1^2 - x_2^2 + o(\|x\|^2).$$
Then, the derivative of F is given by $F'(x_1, x_2) = [2x_1, -2x_2] + o(\|x\|)$. Evaluating at $\bar{x} = (0, 0)$, we obtain $F(0, 0) = 0$, $F'(0, 0) = (0, 0)$, and $\operatorname{Ker} F'(0, 0) = \mathbb{R}^2$. Calculating
$$T_1 M(0, 0) = \operatorname{span}\{(1, 1)\} \cup \operatorname{span}\{(1, -1)\},$$
we conclude that $T_1 M(0, 0) \ne \operatorname{Ker} F'(0, 0)$.
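The gap between the tangent cone and the kernel in Example 3 can be seen numerically. A minimal sketch, taking the $o(\|x\|^2)$ term to be identically zero so that M is the union of the lines $x_2 = \pm x_1$ (the helper name `dist_to_M` is ours):

```python
import numpy as np

# Example 3 with the o-term set to zero: M = {x : x1^2 = x2^2} is the union of
# the lines x2 = x1 and x2 = -x1. A tangent direction at 0 must lie on one of
# these lines, although Ker F'(0,0) is all of R^2.
def dist_to_M(x):
    # distance from x to the union of the lines x2 = x1 and x2 = -x1
    return min(abs(x[0] - x[1]), abs(x[0] + x[1])) / np.sqrt(2)

t = 1e-3
assert dist_to_M((t, t)) == 0.0        # h = (1, 1): the ray stays inside M
assert dist_to_M((t, 0.0)) > 0.4 * t   # h = (1, 0): dist = t/sqrt(2), not o(t)
```

The second assertion shows that the kernel direction $(1, 0)$ violates the $r(t) = o(t)$ requirement of Definition 8, so it is not tangent to M.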
Example 4. 
Let F : C ( [ 0 , 1 ] ) C ( [ 0 , 1 ] ) be defined as F ( x ( t ) ) = x ( t ) . Then, the set
M = { x ( t ) C ( [ 0 , 1 ] ) F ( x ( t ) ) = F ( 0 ) = 0 } = { 0 }
consists only of the zero function. The derivative of F, given by F ( x ( t ) ) = I , is surjective, where I is the identity operator on C ( [ 0 , 1 ] ) . Moreover, we have Ker F ( 0 ) = { 0 } = T 1 M ( 0 ) .
Example 5. 
Let $M = \big\{x \in C([0, 1]) \ \big|\ \int_0^1 \sin x(t)\,dt = \frac{2}{\pi}\big\}$ and define $\bar{x}(t) = \pi t$. To calculate $T_1 M(\bar{x})$, it is enough to apply Lyusternik's theorem with $F: C([0, 1]) \to \mathbb{R}$ given by $F(x) = \int_0^1 \sin x(t)\,dt$.
Using the trigonometric addition formulas, we obtain
$$F(x + h) - F(x) = \int_0^1 \sin x(t)\big(\cos h(t) - 1\big)\,dt + \int_0^1 \cos x(t)\sin h(t)\,dt.$$
The first term on the right-hand side is $o(\|h\|)$ as $\|h\| \to 0$. In the second term, we use the fact that $\sin h(t) = h(t) + o(\|h\|)$.
Therefore, the derivative of F at $\bar{x}$ is given by
$$F'(\bar{x})x = \int_0^1 x(t)\cos\bar{x}(t)\,dt,$$
which is surjective onto $\mathbb{R}$. By Theorem 5, we conclude that
$$T_1 M(\bar{x}) = \operatorname{Ker} F'(\bar{x}) = \Big\{x \in C([0, 1]) \ \Big|\ \int_0^1 x(t)\cos\bar{x}(t)\,dt = 0\Big\}.$$
The problem of describing solution sets in more general settings (e.g., nonlinear systems of inequalities) is approached qualitatively using metric regularity [31,32] and geometric derivability [33].

4.1.2. A Generalization of the Lyusternik Theorem

Consider the problem of describing the set $M(\bar{x})$ in the nonregular case. As demonstrated in Example 3, the classical Lyusternik Theorem 5 may not hold when F is singular at $\bar{x}$, so that $T_1 M(\bar{x}) \ne \operatorname{Ker} F'(\bar{x})$.
The first generalization of the classical Lyusternik theorem for p-regular mappings was independently derived and proved in [15,17]; see also [21]. This generalization can be used to describe the zero set of a p-regular mapping.
Theorem 6 
(Generalized Lyusternik theorem, [15]). Let X and Y be Banach spaces, and let U be a neighborhood of a point x ¯ X . Assume that F : U Y is a p–times continuously Fréchet differentiable mapping on U. Assume also that F is p-regular at x ¯ .
Then,
T 1 M ( x ¯ ) = H p ( x ¯ ) ,
where the set H p ( x ¯ ) is defined in (23).
The problem of describing the tangent cone to the solution set M ( x ¯ ) of a nonlinear Equation (3) with a singular mapping F has also been studied in other papers (see, for example, [16,34]).
Example 6. 
To illustrate the statement of Theorem 6, define the mapping $F: \mathbb{R}^3 \to \mathbb{R}^2$ by
$$F(x) = \begin{pmatrix} x_1^2 - x_2^2 + x_3^2 \\ x_1^2 - x_2^2 + x_3^2 + x_2 x_3 \end{pmatrix}.$$
Consider $\bar{x} = (0, 0, 0)^T$. A straightforward computation shows that $F'(\bar{x}) = 0$, and
$$F''(\bar{x}) = \left(\begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 2 \end{pmatrix},\ \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 1 \\ 0 & 1 & 2 \end{pmatrix}\right).$$
Also,
$$\operatorname{Ker}^2 F''(\bar{x}) = \operatorname{span}\{(1, 1, 0)^T\} \cup \operatorname{span}\{(1, -1, 0)^T\}.$$
Let $h = (1, 1, 0)^T$ (or $h = (1, -1, 0)^T$); then $\operatorname{Im} F''(\bar{x})h = \mathbb{R}^2$. Hence, the mapping F is 2-regular at $\bar{x} = 0$. The statement of Theorem 6 in this example then reduces to
$$T_1 M(\bar{x}) = H_2(\bar{x}) = \operatorname{Ker}^2 F''(\bar{x}),$$
or
$$T_1 M(\bar{x}) = \operatorname{Ker}^2 F''(\bar{x}) = \operatorname{span}\{(1, 1, 0)^T\} \cup \operatorname{span}\{(1, -1, 0)^T\}.$$
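The 2-regularity claim in Example 6 reduces to a rank computation on the two Hessians. A minimal numerical check (the names `H1`, `H2`, and `factor2` are our notation for the Hessians of the two components and the 2-factor matrix):

```python
import numpy as np

# Example 6: F: R^3 -> R^2 with F'(0) = 0, so first-order regularity fails.
H1 = np.diag([2.0, -2.0, 2.0])                          # Hessian of F1 at 0
H2 = np.array([[2.0, 0, 0], [0, -2, 1], [0, 1, 2]])     # Hessian of F2 at 0

def factor2(h):
    # the 2-factor operator F''(0)[h] as a 2x3 matrix acting on x
    return np.vstack([H1 @ h, H2 @ h])

h = np.array([1.0, -1.0, 0.0])
assert np.linalg.matrix_rank(factor2(h)) == 2     # surjective onto R^2
assert np.allclose(factor2(h) @ h, 0.0)           # h lies in the 2-kernel
```

The same two assertions hold for $h = (1, 1, 0)^T$, confirming 2-regularity along every direction of $H_2(\bar{x})$.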
The next theorem presents another version of Theorem 6, which was formulated in [24] (see also [17,21] for additional results along these lines). To state the result, we denote by $\operatorname{dist}(x, M)$ the distance function from a point $x \in X$ to a set M:
$$\operatorname{dist}(x, M) = \inf_{y \in M} \|x - y\|, \quad x \in X.$$
Theorem 7. 
Let X and Y be Banach spaces, and let U be a neighborhood of a point $\bar{x} \in X$. Assume that $F: X \to Y$ is a p-times continuously Fréchet differentiable mapping in U. Assume also that F is strongly p-regular at $\bar{x}$. Then, there exist a neighborhood $U' \subseteq U$ of $\bar{x}$, a mapping $\xi \mapsto x(\xi): U' \to X$, and constants $\delta_1 > 0$ and $\delta_2 > 0$, such that for all $\xi \in U'$ the following holds:
$$F(\xi + x(\xi)) = F(\bar{x}),$$
$$\operatorname{dist}(\xi, M(\bar{x})) \le \|x(\xi)\| \le \delta_1 \sum_{i=1}^{p} \frac{\|f_i(\xi) - f_i(\bar{x})\|}{\|\xi - \bar{x}\|^{i-1}},$$
where $f_i$ are given by (16), and
$$\operatorname{dist}(\xi, M(\bar{x})) \le \|x(\xi)\| \le \delta_2 \sum_{i=1}^{p} \|f_i(\xi) - f_i(\bar{x})\|^{1/i}.$$
For the proof, see [21].

4.1.3. Representation Theorem

The Representation Theorem is used in nonlinear analysis and is relevant to the study of the local behavior and representation of a mapping F around a special point x ¯ . It also guarantees the existence of certain auxiliary mappings that have desirable properties and relate to the given mapping F and its local representation in some neighborhood of x ¯ .
The Representation Theorem can be used, for example, in the study of optimization problems, particularly in constrained optimization. It helps in establishing the existence of critical points and characterizing their properties, which is essential for finding optimal solutions. The theorem is also relevant to variational methods, partial differential equations, and other areas of mathematical analysis. Moreover, it is useful in various numerical methods and computational techniques for approximating solutions of equations. Its versatility and utility stem from its ability to provide insights into the local behavior and representations of mappings near critical points, with wide-ranging applications in mathematical analysis and optimization.
To simplify the presentation of the next result, we state it for the case of the completely degenerate mapping F defined in (18). Recall that in this case, $\operatorname{Im} F^{(p)}(\bar{x})[h]^{p-1} = Y$, and the p-factor operator can be simplified to $\Psi_p(h)x = \frac{1}{p!}F^{(p)}(\bar{x})[h]^{p-1}x$.
Theorem 8 
([22]). Let X and Y be Banach spaces, and let V be a neighborhood of $\bar{x}$ in X. Suppose that $F: V \to Y$ is of class $C^{p+1}$, and that $F^{(i)}(\bar{x}) = 0$ for $i = 1, \dots, p-1$. Also assume the existence of a constant $C > 0$ such that
$$\sup_{\|h\| = 1} \big\|\{F^{(p)}(\bar{x})[h]^{p-1}\}^{-1}\big\| \le C.$$
Then, there exist a neighborhood U of 0 in X, a neighborhood V of x ¯ in X, and mappings φ : U X and ψ : V X , such that φ and ψ are Fréchet-differentiable at 0 and x ¯ , respectively, and the following hold:
 1. 
φ ( 0 ) = x ¯ , ψ ( x ¯ ) = 0 ;
 2. 
F ( φ ( x ) ) = F ( x ¯ ) + 1 p ! F ( p ) ( x ¯ ) [ x ] p for all x U ;
 3. 
F ( x ) = F ( x ¯ ) + 1 p ! F ( p ) ( x ¯ ) [ ψ ( x ) ] p for all x V ;
 4. 
φ ( 0 ) = ψ ( x ¯ ) = I X .
All assumptions of Theorem 8 are satisfied, for example, by the mapping
$$F(x_1, x_2) = x_1^p - x_2^p + x_1^{p+1} + x_2^{p+1},$$
where $p \ge 2$, $p \in \mathbb{N}$. See also [35] for additional work on the representation theorem.

4.1.4. Morse Lemma

The Morse Lemma is another fundamental result in analysis that relates the behavior of a smooth function near a nondegenerate critical point x ¯ to the local structure of its level sets. The Morse Lemma has several important applications in various areas of mathematics.
The Morse Lemma is used in differential geometry to analyze the behavior of geodesics and study the geometry of manifolds. By considering a function that measures the length or energy of curves on a manifold, the Morse Lemma allows us to understand the critical points of this function and their geometric implications. It provides insights into the existence, stability, and bifurcations of geodesics on a manifold.
The Morse Lemma has important applications in optimization and control theory, where it is used to analyze the behavior of objective functions and control systems near critical points. It helps characterize the local behavior of optimal solutions and understand stability properties. The Lemma can be employed to find critical points, perform sensitivity analysis, and study bifurcations in optimization problems and dynamical systems.
The Morse Lemma is also utilized in singularity theory, which focuses on the properties and classification of singular points or critical points of differentiable mappings. It provides a framework for understanding the local behavior of singularities and the ways in which their structure may change under small perturbations. The Lemma plays a key role in the classification and analysis of singular points and their stability.
The most interesting formulation of the Morse Lemma in the finite-dimensional case is given in the following lemma.
Lemma 2 
(Morse Lemma). Let $\bar{x} \in \mathbb{R}^n$, and let $f: \mathbb{R}^n \to \mathbb{R}$ be a function of class $C^3(\mathbb{R}^n)$, such that $f'(\bar{x}) = 0$ and the Hessian $f''(\bar{x})$ is not degenerate. Then, in a neighborhood V of $\bar{x}$, there exist a curvilinear coordinate system $(y_1, \dots, y_n)$ and an integer $k \in \{0, \dots, n\}$, such that
$$f(x) = f(\bar{x}) + \sum_{i=1}^{k} y_i^2 - \sum_{i=k+1}^{n} y_i^2$$
for all $x \in V$.
Proof. 
Without loss of generality, we may assume that the Hessian matrix $f''(\bar{x})$ is diagonal,
$$f''(\bar{x}) = \operatorname{diag}(\underbrace{1, \dots, 1}_{k}, \underbrace{-1, \dots, -1}_{n-k}),$$
where, for some integer k between 0 and n, the first k diagonal entries are equal to 1 and the remaining $n - k$ entries are equal to $-1$. Otherwise, a change of basis transforms the Hessian into such a diagonal matrix. Then, in this case,
$$f(x) = f(\bar{x}) + \sum_{i=1}^{k} (x_i - \bar{x}_i)^2 - \sum_{i=k+1}^{n} (x_i - \bar{x}_i)^2 + O(\|x - \bar{x}\|^3).$$
Note that if the assumptions of the Morse Lemma hold, then the assumptions of the representation Theorem 8 are satisfied with $p = 2$ and $F = f$. Hence, there exists a mapping $\psi: V \to \mathbb{R}^n$, such that
$$f(x) = f(\bar{x}) + \frac{1}{2} f''(\bar{x})[\psi(x)]^2,$$
where $\psi(x) = x - \bar{x} + o(\|x - \bar{x}\|)$ and $\psi'(\bar{x}) = I_X$. Applying this representation with the curvilinear coordinates $y = \psi(x)$ and the diagonal form of $f''(\bar{x})$ yields the statement of the Morse Lemma 2. □
See [36] for additional work on the Morse Lemma.

4.2. Implicit Function Theorem

In this section, we consider the equation $F(x, y) = 0$, where $F: X \times Y \to Z$ and X, Y, and Z are Banach spaces. Let $(\bar{x}, \bar{y})$ be a given point in $X \times Y$ that satisfies $F(\bar{x}, \bar{y}) = 0$. We are interested in the existence of a mapping $\varphi: U(\bar{x}) \to Y$, defined in a neighborhood $U(\bar{x})$ of $\bar{x}$, that solves the equation $F(x, y) = 0$ near the given point $(\bar{x}, \bar{y})$. This mapping should satisfy the following conditions:
$$F(x, \varphi(x)) = 0 \quad \text{and} \quad \bar{y} = \varphi(\bar{x}).$$

4.2.1. Implicit Function Theorem in the Regular Case

In the case when F : X × Y Z is a continuously differentiable mapping, we denote its (Fréchet) derivative with respect to y at a point ( x ¯ , y ¯ ) X × Y by F y ( x ¯ , y ¯ ) : Y Z .
In the case when F is regular at a point $(\bar{x}, \bar{y})$, meaning that $F_y'(\bar{x}, \bar{y})$ is onto, the classical Implicit Function Theorem (IFT) guarantees the existence of a smooth mapping $\varphi$ defined in a neighborhood $U(\bar{x})$, such that (30) holds and $\|\varphi(x) - \bar{y}\| \le C\|F(x, \bar{y})\|$ for all x in $U(\bar{x})$, where $C > 0$. There are numerous books and papers devoted to the IFT, including [13,19]. Various formulations of the standard IFT exist, and Theorem 9 presents one such statement.
Theorem 9 
(Implicit Function Theorem). Let X, Y, and Z be Banach spaces. Assume that $F: X \times Y \to Z$ is continuously Fréchet differentiable at $(\bar{x}, \bar{y}) \in X \times Y$, $F(\bar{x}, \bar{y}) = 0$, and that $\operatorname{Im} F_y'(\bar{x}, \bar{y}) = Z$. Then, there exist constants $C, C_1 > 0$, a sufficiently small $\delta > 0$, and a mapping $\varphi: B(\bar{x}, \delta) \to Y$ such that, for $x \in B(\bar{x}, \delta)$, the following holds:
$$\varphi(\bar{x}) = \bar{y}, \quad F(x, \varphi(x)) = 0, \quad \|\varphi(x) - \bar{y}\| \le C_1 \|F(x, \bar{y})\| \le C\|x - \bar{x}\|.$$
The situation changes when the mapping F is degenerate (nonregular) at ( x ¯ , y ¯ ) ; that is, when F y ( x ¯ , y ¯ ) is not onto. In this case, the classical IFT cannot be applied to guarantee the (local) existence of a solution y ( x ) . The importance of examining this situation arises from the need to solve various nonlinear problems, many of which, as shown in [22], are singular (degenerate).

4.2.2. Implicit Function Theorem in the Degenerate Case

In this section, we focus on the case when mapping F : X × Y Z is not regular; that is, when F y ( x ¯ , y ¯ ) is not onto.
As an example, consider the mapping $F: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $F(x, y) = x - y^p$, where $p = 2k + 1$ for some $k \in \mathbb{N}$. If $(\bar{x}, \bar{y}) = (0, 0)$, then $F(\bar{x}, \bar{y}) = 0$ and $F_y'(\bar{x}, \bar{y}) = 0$, so the operator $F_y'(\bar{x}, \bar{y})$ is not surjective. The classical IFT is not applicable in this case. However, there exists the mapping $\varphi(x) = x^{1/p}$, such that $F(x, x^{1/p}) = x - (x^{1/p})^p = 0$. Moreover, $|\varphi(x) - \bar{y}| = |F(x, \bar{y})|^{1/p}$ and, similarly to (31), the following inequality holds with $C = 1 > 0$:
$$|\varphi(x) - \bar{y}| \le C|F(x, \bar{y})|^{1/p}.$$
Thus, while the conditions of the standard Implicit Function Theorem are not satisfied in this example, a statement similar to (31) holds. The example serves as a motivation and illustration for the p-order IFT. To our knowledge, the first generalization of the IFT for nonregular mappings was formulated in [24]. Generalizations of the IFT for 2-regular mappings were obtained in [21,36]. We will present a few versions of the IFT for p-regular mappings in this section.
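The behavior of this motivating example can be checked directly. A minimal sketch in plain Python (`phi` is our name for the implicit function $x \mapsto x^{1/p}$, extended to negative x via the real pth root):

```python
# Degenerate IFT example: F(x, y) = x - y^p with p = 2k + 1. At (0, 0) we have
# F_y(0, 0) = 0, so the classical IFT fails, yet phi(x) = x^(1/p) satisfies
# F(x, phi(x)) = 0 and the p-th order estimate |phi(x)| <= |F(x, 0)|^(1/p).
p = 3  # any odd p = 2k + 1 works

def phi(x):
    # real p-th root, defined for negative x as well (p is odd)
    return abs(x) ** (1.0 / p) if x >= 0 else -(abs(x) ** (1.0 / p))

for x in (-0.8, -0.1, 0.0, 0.2, 1.5):
    assert abs(x - phi(x) ** p) < 1e-12               # F(x, phi(x)) = 0
    assert abs(phi(x)) <= abs(x) ** (1.0 / p) + 1e-12  # estimate with C = 1
```

Note that $\varphi$ is continuous but not differentiable at 0, which is exactly the loss of smoothness one expects when the regularity of $F_y'$ degenerates.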
To simplify the presentation, we begin with Theorem 10, which is stated in Euclidean spaces. A slight modification of this theorem was derived in [21]. To formulate the theorem, we first need to define the operator Ψ p ( h ) related to the mapping F : X × Y Z . To do so, and similarly to the mappings introduced in Section 3, we define the following mappings (see [24]):
$$f_i(x, y): X \times Y \to Z_i, \quad f_i(x, y) = P_{Z_i}F(x, y), \quad i = 1, \dots, p,$$
where $P_{Z_i}: Z \to Z_i$ is the projection operator onto $Z_i$ along $Z_1 \oplus \cdots \oplus Z_{i-1} \oplus Z_{i+1} \oplus \cdots \oplus Z_p$ with respect to Z for $i = 1, \dots, p$. The definition of the $Z_i$ is similar to the definition of the subspaces $Y_i$ in Section 3.
Now we are ready to present the definition of the linear operator $\Psi_p(h): Y \to Z_1 \oplus \cdots \oplus Z_p$, which is similar to the operator $\Psi_p(h)$ defined in (19). Since the constructions of the p-factor operators are similar, we retain the same notation to keep the presentation clear and consistent. For a fixed vector $h \in Y$, $h \ne 0$, and the mappings $f_i$ defined in (32), the linear operator $\Psi_p(h) \in \mathcal{L}(Y, Z_1 \oplus \cdots \oplus Z_p)$ is given by
$$\Psi_p(h)y = f_{1y}'(\bar{x}, \bar{y})y + f_{2yy}''(\bar{x}, \bar{y})[h]y + \cdots + f_{p\,\underbrace{y \cdots y}_{p}}^{(p)}(\bar{x}, \bar{y})[h]^{p-1}y, \quad y \in Y,$$
where the subscript $y \cdots y$ indicates that all derivatives are taken with respect to the same variable $y \in Y$.
Before stating Theorem 10, we introduce some additional notation that will be used:
  • In the expression $f_{i\,\underbrace{x \cdots x}_{q}\,\underbrace{y \cdots y}_{r-q}}^{(r)}(\bar{x}, \bar{y})$, r represents the total order of differentiation, performed q times with respect to x and $r - q$ times with respect to y.
  • While the notation $[h]^{r-1}$ appears in the definition (33) of the linear operator $\Psi_p(h)$, the equality $f_{i\,x \cdots x\,y \cdots y}^{(r)}(\bar{x}, \bar{y}) = 0$ signifies that all components of the derivative are equal to zero.
  • The subscript $x \cdots x$ (q times) indicates partial differentiation with respect to the first variable x performed q times.
  • For $r = 0$, the notation $f_i^{(0)}(\bar{x}, \bar{y})$ represents the function value $f_i(\bar{x}, \bar{y})$ itself.
Theorem 10 
(Implicit Function Theorem [22]). Suppose that X, Y, and Z are Euclidean spaces, and let W be a neighborhood of a point ( x ¯ , y ¯ ) in X × Y . Assume that F : W Z is of class C 2 . Suppose F ( x ¯ , y ¯ ) = 0 and there exists a neighborhood U ( x ¯ ) in X, such that the following conditions hold:
  • The Singularity Condition:
    $$f_{i\,\underbrace{x \cdots x}_{q}\,\underbrace{y \cdots y}_{r-q}}^{(r)}(\bar{x}, \bar{y}) = 0, \quad r = 0, \dots, i-1, \quad q = 0, \dots, r-1, \quad i = 1, \dots, p;$$
    $$f_{i\,\underbrace{x \cdots x}_{q}\,\underbrace{y \cdots y}_{i-q}}^{(i)}(\bar{x}, \bar{y}) = 0, \quad q = 1, \dots, i-1, \quad i = 1, \dots, p.$$
  • The pth Order Regularity Condition at the Point ( x ¯ , y ¯ ) :
    The operator $\Psi_p(h)$ defined in (33) satisfies
    $$\Psi_p(h)(Y) = Z$$
    for all $h \in \{\Psi_p(h)\}^{-1}(F(x, \bar{y}))$ and all $x \in U(\bar{x})$ such that $F(x, \bar{y}) \ne 0$.
  • The Banach Condition:
    There exists a constant $c > 0$ such that, for any $z \in Z$ with $\|z\| = 1$, the equation $\Psi_p(h)y = z$ has a solution y with $\|y\| \le c$.
  • The Elliptic Condition with respect to x:
    There exists a constant $m > 0$ such that
    $$\|f_i(x, \bar{y})\| \ge m\|x - \bar{x}\|$$
    for all $x \in U(\bar{x})$ and for all $i = 1, \dots, p$.
If conditions 1 to 4 are satisfied, then for any ε > 0 , there exist δ > 0 and K > 0 such that B ( x ¯ , δ ) U ( x ¯ ) , and there is a map φ : B ( x ¯ , δ ) B ( y ¯ , ε ) satisfying:
 (a) 
φ ( x ¯ ) = y ¯ ;
 (b) 
F ( x , φ ( x ) ) = 0 for all x B ( x ¯ , δ ) ;
 (c) 
$\|\varphi(x) - \bar{y}\| \le K \sum_{i=1}^{p} \|f_i(x, \bar{y})\|^{1/i}$ for all $x \in B(\bar{x}, \delta)$.
The alternative version of the IFT for nonregular mappings, presented as Theorem 11, was proved in [37]. Before stating the theorem, we introduce the following definition (Definition 2.3 in [38]).
Definition 9. 
The mapping $F: X \times Y \to Z$ is called uniformly p-regular over a set M in Y if
$$\sup_{h \in M} \big\|\{\Psi_p(\bar{h})\}^{-1}\big\| < \infty, \quad \bar{h} = \frac{h}{\|h\|}, \quad h \ne 0,$$
where
$$\big\|\{\Psi_p(\bar{h})\}^{-1}\big\| = \sup_{\|z\| = 1} \inf\{\|y\| \mid \Psi_p(\bar{h})[y] = z\}.$$
Additionally, we define the mapping $\Phi_p: Y \to Z_1 \oplus \cdots \oplus Z_p$ by
$$\Phi_p = \Big(f_{1y}'(\bar{x}, \bar{y}),\ \frac{1}{2}f_{2yy}''(\bar{x}, \bar{y}),\ \dots,\ \frac{1}{p!}f_{p\,y \cdots y}^{(p)}(\bar{x}, \bar{y})\Big),$$
where
$$\Phi_p[y]^p = \Big(f_{1y}'(\bar{x}, \bar{y})[y],\ \frac{1}{2}f_{2yy}''(\bar{x}, \bar{y})[y]^2,\ \dots,\ \frac{1}{p!}f_{p\,y \cdots y}^{(p)}(\bar{x}, \bar{y})[y]^p\Big).$$
Under the assumption that $Z_1 \oplus \cdots \oplus Z_p = Z$, we also introduce the corresponding inverse multivalued operator $\Phi_p^{-1}$:
$$\Phi_p^{-1}(z) = \Big\{\eta \in Y \ \Big|\ \Big(f_{1y}'(\bar{x}, \bar{y})[\eta],\ \dots,\ \frac{1}{p!}f_{p\,y \cdots y}^{(p)}(\bar{x}, \bar{y})[\eta]^p\Big) = (z_1, z_2, \dots, z_p)\Big\},$$
where $z_i \in Z_i$, $i = 1, \dots, p$.
Theorem 11 
(The pth-order IFT). Let X, Y and Z be Banach spaces, and let U ( x ¯ ) and U ( y ¯ ) be sufficiently small neighborhoods of x ¯ X and y ¯ Y , respectively. Suppose that F C p + 1 ( X × Y ) and F ( x ¯ , y ¯ ) = 0 . Assume that the mappings f i ( x , y ) , i = 1 , , p , introduced in Equation (32), satisfy the following conditions:
(1)
The Singularity Condition:
$$f_{i\,\underbrace{x \cdots x}_{q}\,\underbrace{y \cdots y}_{r-q}}^{(r)}(\bar{x}, \bar{y}) = 0, \quad r = 1, \dots, i-1, \quad q = 0, \dots, r-1, \quad i = 1, \dots, p; \qquad f_{i\,\underbrace{x \cdots x}_{q}\,\underbrace{y \cdots y}_{i-q}}^{(i)}(\bar{x}, \bar{y}) = 0, \quad q = 1, \dots, i-1, \quad i = 1, \dots, p.$$
(2)
The p-Factor Approximation Condition:
There exists a sufficiently small $\varepsilon > 0$ such that, for all $y_1, y_2 \in U(\bar{y}) \setminus \{\bar{y}\}$, the following holds:
$$\Big\|f_i(x, \bar{y} + y_1) - f_i(x, \bar{y} + y_2) - \frac{1}{i!}f_{i\,y \cdots y}^{(i)}(\bar{x}, \bar{y})[y_1]^i + \frac{1}{i!}f_{i\,y \cdots y}^{(i)}(\bar{x}, \bar{y})[y_2]^i\Big\| \le \varepsilon\big(\|y_1\|^{i-1} + \|y_2\|^{i-1}\big)\|y_1 - y_2\|, \quad i = 1, \dots, p.$$
(3)
The Banach Condition:
There exists a nonempty open set $\Gamma(\bar{x}) \subseteq U(\bar{x})$ in X such that, for any sufficiently small $\gamma > 0$, the intersection of the set $\Gamma(\bar{x})$ with the ball $B(\bar{x}, \gamma)$ is not empty and $\Gamma(\bar{x}) \cap B(\bar{x}, \gamma) \ne \{\bar{x}\}$. Moreover, for $x \in \Gamma(\bar{x})$, there exist an element $h(x) \in Y$ and a constant $c_1$, $0 < c_1 < \infty$, such that
$$\Phi_p[h(x)]^p = -F(x, \bar{y}), \quad \|h(x)\| \le c_1 \sum_{r=1}^{p} \|f_r(x, \bar{y})\|^{1/r}.$$
(4)
The Uniform p-Regularity Condition:
The mapping $F(x, y)$ is uniformly p-regular over the set $\Phi_p^{-1}(-F(x, \bar{y}))$.
If conditions 1 to 4 are satisfied, then there exist a constant $k > 0$, a sufficiently small $\delta \le \gamma$, and a mapping $\varphi: \Gamma(\bar{x}) \cap B(\bar{x}, \delta) \to U(\bar{y})$ such that the following hold for $x \in \Gamma(\bar{x}) \cap B(\bar{x}, \delta)$:
$$\varphi(\bar{x}) = \bar{y},$$
$$F(x, \varphi(x)) = 0,$$
$$\|\varphi(x) - \bar{y}\| \le k \sum_{r=1}^{p} \|f_r(x, \bar{y})\|^{1/r}.$$
There are generalizations of IFT for nonregular mappings derived by other authors. Some examples include a generalization of the IFT and its application to a parametric linear time-optimal control problem presented in [39], generalized IFT applied to ordinary differential equations in [40], and IFT for 2-regular mappings in [41,42].

4.3. Newton’s Method

4.3.1. Classical Newton’s Method for Nonlinear Equations and Unconstrained Optimization Problems

Consider the problem of solving the nonlinear Equation (3), where F : X Y is sufficiently smooth, so that F C p + 1 ( X ) for some p N . Let x ¯ be a solution of (3), that is, F ( x ¯ ) = 0 . Assume that mapping F is singular at the point x ¯ .
In the finite dimensional case, when F ( x ) = ( F 1 ( x ) , , F n ( x ) ) T , X = R n , and Y = R n , the singularity of F at x ¯ means that the Jacobian F ( x ¯ ) of F is singular at x ¯ , as in the following example.
Example 7 
([43]). Consider the function $F: \mathbb{R}^2 \to \mathbb{R}^2$ from Example 1, defined by
$$F(x) = \begin{pmatrix} x_1 + x_2 \\ x_1 x_2 \end{pmatrix},$$
where $\bar{x} = (0, 0)^T$ is a solution to Equation (3) and $F'(\bar{x}) = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ is singular (degenerate) at the point $\bar{x}$.
Consider a sufficiently small $\varepsilon > 0$ and some initial point $x^0 \in B(0, \varepsilon)$. The classical Newton method is defined by
$$x^{k+1} = x^k - F'(x^k)^{-1}F(x^k), \quad k = 0, 1, \dots.$$
If $x^k = (x_1, x_2)$ in this example, we obtain
$$F'(x^k) = \begin{pmatrix} 1 & 1 \\ x_2 & x_1 \end{pmatrix}, \quad F'(x^k)^{-1} = \frac{1}{x_1 - x_2}\begin{pmatrix} x_1 & -1 \\ -x_2 & 1 \end{pmatrix}.$$
Then
$$F'(x^k)^{-1}F(x^k) = \frac{1}{x_1 - x_2}\begin{pmatrix} x_1^2 \\ -x_2^2 \end{pmatrix},$$
$$x^{k+1} = x^k - F'(x^k)^{-1}F(x^k) = \frac{1}{x_1 - x_2}\begin{pmatrix} -x_1 x_2 \\ x_1 x_2 \end{pmatrix}.$$
If $x_1 = x_2$, then $F'(x^k)^{-1}$ does not exist and, hence, method (35) is not applicable. Even in the case when $F'(x^k)^{-1}$ exists, method (35) might diverge. As an example, consider the point $x^k = (t + t^3, t)^T$, where $t > 0$ is sufficiently small. Then
$$x^{k+1} = \frac{1}{t^3}\begin{pmatrix} -(t^2 + t^4) \\ t^2 + t^4 \end{pmatrix} = \Big({-\frac{1}{t} - t},\ \frac{1}{t} + t\Big)^T$$
and $\|x^{k+1} - \bar{x}\| = \|x^{k+1} - 0\| \ge \frac{1}{t} \to \infty$ as $t \to 0^+$. For instance, if $t = 10^{-5}$, then $\|x^{k+1} - 0\| \ge 10^{5}$ and the method (35) is diverging.
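This divergence of the classical Newton step from a near-singular starting point is easy to reproduce numerically. A minimal sketch (the names `F`, `J`, and `x_next` are ours; we take $t = 10^{-4}$ so that the linear solve stays within double-precision accuracy):

```python
import numpy as np

# Classical Newton on F(x) = (x1 + x2, x1*x2) started at x0 = (t + t^3, t):
# det F'(x0) = t^3, and one Newton step jumps to roughly (-1/t, 1/t),
# far away from the root xbar = 0.
def F(x):
    return np.array([x[0] + x[1], x[0] * x[1]])

def J(x):
    return np.array([[1.0, 1.0], [x[1], x[0]]])

t = 1e-4
x0 = np.array([t + t**3, t])
x_next = x0 - np.linalg.solve(J(x0), F(x0))
assert np.linalg.norm(x_next) > 1e3   # one step diverges although x0 is near 0
```

Analytically the step lands at $(-(1/t + t),\ 1/t + t)$, so its norm grows like $1/t$ as the starting point approaches the singular set $x_1 = x_2$.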
For the overview of the existing approaches to Newton-like methods for singular operators, see, e.g., [44].
Now we consider Newton’s method for finding critical points of an unconstrained optimization problem:
min x R 2 f ( x ) ,
where f : R 2 R . The classical Newton’s method applied to problem (36) has the form
x k + 1 = x k ( f ( x k ) ) 1 f ( x k ) .
As an example, consider minimization of function f given by f ( x ) = x 1 2 + x 1 2 x 2 + x 2 4 (see [43]). One of the critical points of the function f is x ¯ = ( 0 , 0 ) T . Let x 0 = ( x 1 0 , x 2 0 ) T where x 1 0 = x 2 0 6 ( 1 + x 2 0 ) . Then
f ( x 0 ) = 2 + 2 x 2 0 2 x 2 0 6 ( 1 + x 2 0 ) 2 x 2 0 6 ( 1 + x 2 0 ) 12 ( x 2 0 ) 2
and det f ( x 0 ) = 0 . Hence, ( f ( x 0 ) ) 1 does not exist, so Newton’s method (37) is not applicable.
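The singularity of the Hessian along the curve $x_1 = x_2\sqrt{6(1 + x_2)}$ can be confirmed numerically. A minimal sketch (the helper name `hess` is ours):

```python
import numpy as np

# Hessian of f(x) = x1^2 + x1^2*x2 + x2^4:
def hess(x):
    return np.array([[2 + 2 * x[1], 2 * x[0]],
                     [2 * x[0], 12 * x[1]**2]])

# Points with x1 = x2*sqrt(6*(1 + x2)) make the determinant vanish exactly:
x2 = 0.1
x0 = np.array([x2 * np.sqrt(6 * (1 + x2)), x2])
assert abs(np.linalg.det(hess(x0))) < 1e-12   # singular: Newton step undefined
```

Indeed, $\det f''(x^0) = (2 + 2x_2)\,12x_2^2 - 4x_1^2 = 24x_2^2(1 + x_2) - 24x_2^2(1 + x_2) = 0$ along this curve.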

4.3.2. The p-Factor Newton’s Method

In this section, we describe a method for solving nonlinear Equation (3), where F : R n R n and the matrix F ( x ¯ ) is singular at the solution point x ¯ (see [43]). The proposed method is based on the construction of the p-factor operator.
There are various publications describing the p-factor-method for solving degenerate nonlinear systems and nonregular optimization problems. Some examples are given in [43,45,46].
Let $h \in \mathbb{R}^n$. Similarly to the definitions in Section 3, we now define $Y_1$ by $Y_1 = \operatorname{Im} F'(\bar{x})$, and define the projection $\bar{P}_1 = P_{Y_1^\perp}$ as the projection of Y onto the orthogonal complement $Y_1^\perp$ of $Y_1$ in Y. Similarly, we can define $Y_2$ as
$$Y_2 = \operatorname{Im}\big(F'(\bar{x}) + \bar{P}_1 F''(\bar{x})h\big), \quad \text{and} \quad \bar{P}_2 = P_{Y_2^\perp}.$$
Continuing in the same way for each $k = 2, \dots, p-1$, we obtain $\bar{P}_{k+1} = P_{Y_{k+1}^\perp}$ and
$$Y_{k+1} = \operatorname{Im}\Big(F'(\bar{x}) + \sum_{i=1}^{k} \bar{P}_i F''(\bar{x})h + \sum_{\substack{i_2 > i_1 \\ i_1, i_2 \in \{1, \dots, k\}}} \bar{P}_{i_2}\bar{P}_{i_1}F'''(\bar{x})[h]^2 + \cdots + \sum_{\substack{i_k > \cdots > i_1 \\ i_1, \dots, i_k \in \{1, \dots, k\}}} \bar{P}_{i_k} \cdots \bar{P}_{i_1}F^{(k+1)}(\bar{x})[h]^{k}\Big).$$
Let h be a fixed vector such that $\|h\| = 1$ and the mapping F is p-regular at the solution $\bar{x}$ along the vector h. Let the matrices $P_i$, $i = 1, \dots, p-1$, be defined as follows:
$$P_1 = \sum_{i=1}^{p-1} \bar{P}_i, \quad P_2 = \sum_{\substack{i_2 > i_1 \\ i_1, i_2 \in \{1, \dots, p-1\}}} \bar{P}_{i_2}\bar{P}_{i_1}, \quad \dots, \quad P_{k+1} = \sum_{\substack{i_k > \cdots > i_1 \\ i_1, \dots, i_k \in \{1, \dots, p-1\}}} \bar{P}_{i_k} \cdots \bar{P}_{i_1}$$
for all $k = 2, \dots, p-1$.
We assume that $\bar{x}$ is a solution of $F(x) = 0$. Now, instead of $F(\bar{x}) = 0$, consider
$$F(\bar{x}) + P_1 F'(\bar{x})h + \cdots + P_{p-1}F^{(p-1)}(\bar{x})[h]^{p-1} = 0.$$
The assumption of p-regularity of the mapping F at the solution $\bar{x}$ along the vector h implies that the p-factor matrix given by
$$F'(\bar{x}) + P_1 F''(\bar{x})h + \cdots + P_{p-1}F^{(p)}(\bar{x})[h]^{p-1}$$
is not singular. Hence, $\bar{P}_p = 0$ and $Y_p = \mathbb{R}^n$.
Then, the p-factor Newton method can be defined as
$$x^{k+1} = x^k - \big(F'(x^k) + P_1 F''(x^k)h + \cdots + P_{p-1}F^{(p)}(x^k)[h]^{p-1}\big)^{-1} \times \big(F(x^k) + P_1 F'(x^k)h + \cdots + P_{p-1}F^{(p-1)}(x^k)[h]^{p-1}\big).$$
The following theorem provides conditions that ensure the quadratic convergence of the p-factor Newton method (39).
Theorem 12 
([43]). Let F ∈ C^p(Rⁿ), and let x̄ be a solution of F(x) = 0. Assume that there exists a vector h ∈ Rⁿ, ‖h‖ = 1, such that the p-factor matrix defined in Equation (38) is nonsingular. Then, for any x₀ ∈ U_ε(x̄) (with ε > 0 sufficiently small) and for the sequence {x_k} generated by the method in Equation (39), the following inequality holds for some constant c > 0:
‖x_{k+1} − x̄‖ ≤ c‖x_k − x̄‖²,  k = 0, 1, … .
In the case of p = 2, the p-factor Newton method (39) reduces to the following:
x_{k+1} = x_k − [ F′(x_k) + P₁F″(x_k)h ]^{−1} ( F(x_k) + P₁F′(x_k)h ),
where P₁ is the orthogonal projection onto (Im F′(x̄))^⊥, and the vector h (‖h‖ = 1) is chosen such that the 2-factor matrix
F′(x̄) + P₁F″(x̄)h
is nonsingular. This condition is equivalent to F being 2-regular at x̄ along h. In this case, the equation
F(x̄) + P₁F′(x̄)h = 0
is satisfied at x̄. Note that (42) implies that x̄ is a locally unique solution of (3).
The 2-factor Newton method presented here can be applied to solve the equation in Example 7. Specifically, instead of using the iterative procedure (35), the 2-factor Newton method given by (41) should be used.
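The difference between the two schemes is easy to see in a minimal one-dimensional sketch (our illustration, not an example from [43]): for F(x) = x², the root x̄ = 0 is degenerate, since F′(0) = 0; then Im F′(0) = {0}, the projection P₁ is the identity, and we may take h = 1.

```python
# Minimal 1-D sketch (our illustration): F(x) = x**2 has the degenerate
# root 0 because F'(0) = 0.  Here Im F'(0) = {0}, so P1 is the identity,
# and with h = 1 the 2-factor step (41) reads
#   x_{k+1} = x_k - (F(x_k) + F'(x_k)*h) / (F'(x_k) + F''(x_k)*h).

def plain_newton_step(x):
    # x - F(x)/F'(x) = x/2: only linear convergence at the double root
    return x - x**2 / (2.0 * x)

def two_factor_step(x, h=1.0):
    F, dF, d2F = x**2, 2.0 * x, 2.0
    return x - (F + dF * h) / (dF + d2F * h)  # simplifies to x**2/(2x + 2)

x_plain = x_factor = 1.0
for _ in range(6):
    x_plain = plain_newton_step(x_plain)
    x_factor = two_factor_step(x_factor)
```

After six steps the plain Newton iterate has only been halved six times (x_plain = 1/64), while the 2-factor iterate has contracted quadratically to below 10⁻²⁰, in line with Theorem 12.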
Example 8 
([43]). Consider the following problem
min_{x ∈ R²} f(x),
where f : R² → R is defined by f(x) = x₁² + x₁²x₂ + x₂⁴. If F(x) = ∇f(x), then F(x) = (2x₁ + 2x₁x₂, x₁² + 4x₂³)ᵀ, and for x̄ = (0, 0)ᵀ, we have F(0, 0) = (0, 0)ᵀ. It can be shown that F is 3-regular at (0, 0) along h = (1, 1)ᵀ.
Namely, according to the previous definitions, in this example,
P̄₁ = [[0, 0], [0, 1]],  P̄₂ = ½[[1, −1], [−1, 1]],
P₁ = P̄₁ + P̄₂ = ½[[1, −1], [−1, 3]],  P₂ = P̄₂P̄₁ = ½[[0, −1], [0, 1]].
Then, the following matrix is nonsingular:
F′(0) + P₁F″(0)h + P₂F‴(0)[h]² = f″(0) + P₁f‴(0)h + P₂f^{(4)}(0)[h]² = [[2, −11], [2, 11]].
Consider the 3-factor method:
x_{k+1} = x_k − [ f″(0) + P₁f‴(0)[h] + P₂f^{(4)}(0)[h]² ]^{−1} ( f′(x_k) + P₁f″(x_k)[h] + P₂f‴(x_k)[h]² ).
Let x_k = (x₁, x₂)ᵀ. Then
x_{k+1} − 0 = x_k − [[2, −11], [2, 11]]^{−1} (2x₁ − 11x₂ + 2x₁x₂ − 6x₂², 2x₁ + 11x₂ + x₁² + 18x₂² + 4x₂³)ᵀ = −(1/44)(11x₁² + 22x₁x₂ + 132x₂² + 44x₂³, 2x₁² − 4x₁x₂ + 48x₂² + 8x₂³)ᵀ,
so that ‖x_{k+1} − 0‖ ≤ 10‖x_k − 0‖² near the solution.
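The estimate above can be checked numerically. The following sketch iterates the closed-form step of the 3-factor method for Example 8; the frozen 3-factor matrix [[2, −11], [2, 11]] and the polynomial map G below are our explicit reconstruction from f(x) = x₁² + x₁²x₂ + x₂⁴ and h = (1, 1)ᵀ.

```python
# Numerical check of the 3-factor iteration in Example 8 (our sketch).
from math import hypot

def G(x1, x2):
    # f'(x) + P1 f''(x)[h] + P2 f'''(x)[h]**2, written out for h = (1, 1)
    return (2*x1 - 11*x2 + 2*x1*x2 - 6*x2**2,
            2*x1 + 11*x2 + x1**2 + 18*x2**2 + 4*x2**3)

def step(x1, x2):
    g1, g2 = G(x1, x2)
    # the frozen 3-factor matrix [[2, -11], [2, 11]] has determinant 44
    # and inverse (1/44) * [[11, 11], [-2, 2]]
    return (x1 - (11*g1 + 11*g2) / 44.0,
            x2 - (-2*g1 + 2*g2) / 44.0)

x = (0.1, 0.1)
errors = []
for _ in range(6):
    x = step(*x)
    errors.append(hypot(*x))
```

Successive errors contract quadratically, ‖x_{k+1}‖ ≤ 10‖x_k‖², and the iterate reaches the degenerate root (0, 0)ᵀ to machine precision in a handful of steps.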

4.4. Optimality Conditions for Equality-Constrained Optimization Problems

In this section, we consider optimization problem (4):
min f ( x ) subject to F ( x ) = 0 ,
where f : X R is a sufficiently smooth function and F : X Y is a sufficiently smooth mapping from a Banach space X to a Banach space Y.

4.4.1. Optimality Conditions: Lagrange Multiplier Theorem

There is an extensive body of literature discussing optimality conditions for regular constrained optimization problems, which are problems that satisfy certain constraint qualifications. One notable reference on this topic is Chapter 3 of the book [47].
The classical optimality conditions state that if x̄ is a regular solution of Problem (4), then there exists a Lagrange multiplier in the form of a constant vector λ̄ ∈ Y*, such that
f′(x̄) = F′(x̄)*λ̄,
where F′(x̄)* : Y* → X* denotes the adjoint of F′(x̄), and X* and Y* denote the dual spaces of X and Y, respectively.
The situation changes in the degenerate case, when the derivative F′(x̄) is not surjective. In such cases, the classical optimality conditions in the form of Equation (43) do not hold, as illustrated in the following example.
Example 9. 
Consider the problem
minimize_{x ∈ R³}  x₂² + x₃  subject to  F(x) = (x₁² − x₂² + x₃², x₁² − x₂² + x₃² + x₂x₃)ᵀ = (0, 0)ᵀ.
Note that the mapping F(x) = (x₁² − x₂² + x₃², x₁² − x₂² + x₃² + x₂x₃)ᵀ was introduced in (28).
In this example, if x̄ = (0, 0, 0)ᵀ, then
f′(x̄) = (0, 0, 1)ᵀ, and F′(x̄) = [[0, 0, 0], [0, 0, 0]].
Hence f′(x̄) ≠ F′(x̄)ᵀλ̄ for every λ̄, and Equation (43) does not hold.

4.4.2. Optimality Conditions for p-Regular Optimization Problems

In this section, we focus on the case when the equality constraints defined by the mapping F(x) are not regular at a solution x̄ of problem (4). We define the p-factor Lagrange function L_p(x, λ(h), h) : X × (Y₁* × ⋯ × Y_p*) × X → R as
L_p(x, λ(h), h) = f(x) + Σ_{i=1}^{p} ⟨λᵢ(h), fᵢ^{(i−1)}(x)[h]^{i−1}⟩,
where x, h ∈ X, λᵢ(h) ∈ Yᵢ* for i = 1, …, p, and the mappings fᵢ(x) are defined in (16). Note that the p-factor Lagrange function is a generalization of the classical Lagrange function and reduces to it in the regular case.
The development of optimality conditions for nonregular problems has become an active area of research (see [16,48,49,50,51] and references therein).
To state the sufficient conditions in Theorem 13, we also introduce an alternative version of the p-factor Lagrange function, L̄_p(x, λ(h), h) : X × (Y₁* × ⋯ × Y_p*) × X → R, which is defined as follows:
L̄_p(x, λ(h), h) = f(x) + Σ_{i=1}^{p} (2/(i(i+1))) ⟨λᵢ(h), fᵢ^{(i−1)}(x)[h]^{i−1}⟩.
To state optimality conditions for p-regular optimization problems, we use the definition of strong p-regularity at x ¯ given in Definition 7. We also use the set H p ( x ¯ ) defined in Equation (23), and the operator Ψ p ( h ) defined in Equation (19).
Theorem 13 
([16], necessary and sufficient conditions for optimality). Assume that X and Y are Banach spaces, U is a neighborhood of a point x̄ in X, f : U → R is a twice continuously Fréchet differentiable function in U, and F : U → Y is a (p + 1)-times Fréchet differentiable mapping in U.
Necessary conditions for optimality.
Assume that for an element h ∈ H_p(x̄), the set Im Ψ_p(h) is closed in Y₁ × ⋯ × Y_p. Suppose that F is p-regular at the point x̄ along the vector h ∈ H_p(x̄). If x̄ is a local minimizer of problem (4), then there exist multipliers λ̄(h) = (λ̄₁(h), …, λ̄_p(h)) ∈ (Y₁* × ⋯ × Y_p*) such that the partial derivative of the function L_p with respect to x, denoted by (L_p)_x, satisfies
(L_p)_x(x̄, λ̄(h), h) = 0.
Sufficient conditions for optimality.
Assume that the set Im Ψ_p(h) is closed in Y₁ × ⋯ × Y_p for every h ∈ H_p(x̄), and that Im Ψ_p(h) = Y₁ × ⋯ × Y_p. Assume also that the mapping F is strongly p-regular at x̄. Suppose that there exist a constant α > 0 and a multiplier λ̄(h) such that Equation (47) is satisfied, and that the second-order partial derivative of the function L̄_p (defined in (46)) with respect to x, denoted by (L̄_p)_{xx}, satisfies
(L̄_p)_{xx}(x̄, λ̄(h), h)[h]² ≥ α‖h‖²
for every h ∈ H_p(x̄). Then x̄ is a strict local minimizer of problem (4).
 Example 10. 
In this example, we continue with the analysis of problem (44) from Section 4.4.1. It can be verified that the point x̄ = (x̄₁, x̄₂, x̄₃)ᵀ = (0, 0, 0)ᵀ is a local minimizer of Equation (44). In Example 6, we showed that the mapping F(x) = (x₁² − x₂² + x₃², x₁² − x₂² + x₃² + x₂x₃)ᵀ is 2-regular at x̄ along the vector h = (1, 1, 0)ᵀ.
In this example, the 2-factor Lagrange function L₂, defined in (45) for p = 2, is given by
L₂(x, λ(h), h) = f(x) + ⟨λ₁(h), f₁(x)⟩ + ⟨λ₂(h), f₂′(x)[h]⟩,
where λ(h) = (λ₁(h), λ₂(h)), λ₁(h) = (0, 0)ᵀ, and λ₂(h) = (α, β)ᵀ. Substituting the given expressions, we obtain the following form:
L₂(x, λ(h), h) = x₂² + x₃ + α(x₁ − x₂) + β(x₁ − x₂ + ½x₃).
Solving the equation
(L₂)_x(x̄, λ(h), h) = 0,
we obtain the following system:
α + β = 0,  2x̄₂ − α − β = 0,  1 + ½β = 0.
Substituting x̄₂ = 0, we obtain β = −2 and α = 2. Hence, the function L̄₂(x, λ(h), h), defined in (46), takes the following form in this example:
L̄₂(x, λ(h), h) = x₂² + x₃ + (2/3)(x₁ − x₂) − (2/3)(x₁ − x₂ + ½x₃) = x₂² + (2/3)x₃.
Recall that the set H₂(x̄) was determined in Example 6 as
H₂(x̄) = Ker² F″(x̄) = span{(1, 1, 0)ᵀ} ∪ span{(1, −1, 0)ᵀ}.
Then, the second derivative of L̄₂, defined in (46) for p = 2, satisfies
(L̄₂)_{xx}(x̄, λ̄(h), h)[h]² = 2h₂² ≥ α‖h‖²
for some α > 0 (here α = 1 works) and for every h ∈ H₂(x̄). Hence, the sufficient conditions in Theorem 13 are satisfied, and we conclude that x̄ is a strict local minimizer of problem (44).
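The sufficient condition can be sanity-checked numerically; the sketch below uses the reduced function L̄₂ = x₂² + (2/3)x₃ obtained above and a central difference for the second directional derivative (exact here up to rounding, since L̄₂ is quadratic in x₂ and linear in x₃).

```python
# Sketch: checking the sufficient condition of Theorem 13 in Example 10.
# Lbar2 is the reduced function x2**2 + (2/3)*x3 derived in the text.

def Lbar2(x1, x2, x3):
    return x2**2 + (2.0 / 3.0) * x3

def second_directional_derivative(h, t=1e-3):
    # central difference for (d/dt)^2 Lbar2(xbar + t*h) at t = 0
    plus = Lbar2(t * h[0], t * h[1], t * h[2])
    minus = Lbar2(-t * h[0], -t * h[1], -t * h[2])
    return (plus - 2.0 * Lbar2(0.0, 0.0, 0.0) + minus) / t**2

# H2(xbar) is spanned by (1, 1, 0) and (1, -1, 0); on both directions the
# quadratic form equals 2*h2**2 = ||h||**2, so alpha = 1 works
values = [second_directional_derivative(h)
          for h in ((1.0, 1.0, 0.0), (1.0, -1.0, 0.0))]
```

Both directions give the value 2 = ‖h‖², confirming the positivity of the form on H₂(x̄).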

4.5. Modified Lagrangian Function Method

4.5.1. The Problem

Consider the following constrained optimization problem:
min f(x), subject to gᵢ(x) ≤ 0, i = 1, …, m,
where f : Rⁿ → R is an objective function, and gᵢ : Rⁿ → R are constraint functions. The goal is to find a vector x̄ ∈ Rⁿ that minimizes f while satisfying all constraints. To solve this, we introduce the modified Lagrangian function L_E(x, λ) : Rⁿ × Rᵐ → R, which incorporates both the objective function and the constraints (see, e.g., [45,52,53]):
L_E(x, λ) = f(x) + ½ Σ_{i=1}^{m} λᵢ² gᵢ(x),
where λ = ( λ 1 , , λ m ) . This modified Lagrangian function transforms the nonlinear optimization problem into a system of nonlinear equations.
Define the mapping G : Rⁿ × Rᵐ → R^{n+m} by
G(x, λ) = ( ∇f(x) + ½ Σ_{i=1}^{m} λᵢ² ∇gᵢ(x),  D(λ)g(x) ),
where D(λ) = diag{λᵢ}, i = 1, …, m, and λ ∈ Rᵐ.
Consider the equation
G(x, λ) = 0.
Let g′(x) be the Jacobian matrix of the mapping g(x). Then, the Jacobian matrix G′(x, λ) of the mapping G(x, λ) is given by
G′(x, λ) = [[ ∇²f(x) + ½ Σ_{i=1}^{m} λᵢ² ∇²gᵢ(x),  (g′(x))ᵀD(λ) ],  [ D(λ)g′(x),  D(g(x)) ]].
Define the set I(x̄) = {j = 1, …, m | gⱼ(x̄) = 0} of active constraints, the set
I₀(x̄) = {j = 1, …, m | λ̄ⱼ = 0, gⱼ(x̄) = 0} ⊆ I(x̄)
of weakly active constraints, and the set I₊(x̄) = I(x̄) ∖ I₀(x̄) of strongly active constraints.
Recall that the strict complementarity condition means that, for each index j = 1, …, m, exactly one of gⱼ(x̄) and λ̄ⱼ is equal to zero. If (x̄, λ̄) is a solution of Problem (52) and, for some index j, both gⱼ(x̄) = 0 and λ̄ⱼ = 0, then the set I₀(x̄) is nonempty and strict complementarity fails. Consequently, G′(x̄, λ̄) is a degenerate matrix. Example 11 illustrates this situation.
Example 11 
([45]). Consider the problem
min_{x ∈ R²} (x₁² + x₂² + 4x₁x₂) subject to x₁ ≥ 0, x₂ ≥ 0.
A direct argument confirms that x ¯ = ( 0 , 0 ) T is a solution of Problem (54) with the corresponding Lagrange multiplier λ ¯ = ( 0 , 0 ) T .
The modified Lagrangian function in this case is
L_E(x, λ) = x₁² + x₂² + 4x₁x₂ − ½λ₁²x₁ − ½λ₂²x₂.
The mapping G is defined by
G(x, λ) = (2x₁ + 4x₂ − ½λ₁², 2x₂ + 4x₁ − ½λ₂², −λ₁x₁, −λ₂x₂)ᵀ,
and, therefore, the Jacobian matrix G′(x̄, λ̄) defined in (53) is singular.
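The singularity of G′(x̄, λ̄) is easy to confirm numerically. In the sketch below, the Jacobian (53) is written out for this example under the convention g₁(x) = −x₁, g₂(x) = −x₂ (the constraints x₁ ≥ 0, x₂ ≥ 0 in the form gᵢ(x) ≤ 0); the helper det is a generic cofactor expansion.

```python
# Sketch: the Jacobian (53) for Example 11, with g1(x) = -x1, g2(x) = -x2.

def G_prime(x1, x2, l1, l2):
    return [[2.0, 4.0, -l1, 0.0],
            [4.0, 2.0, 0.0, -l2],
            [-l1, 0.0, -x1, 0.0],
            [0.0, -l2, 0.0, -x2]]

def det(M):
    # generic cofactor expansion along the first row (fine for tiny matrices)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

J = G_prime(0.0, 0.0, 0.0, 0.0)
# at the solution, rows 3 and 4 vanish, so det(J) = 0 and Newton's method
# cannot be applied to G directly
```

The weakly active constraints wipe out the last two rows at (x̄, λ̄) = (0, 0, 0, 0)ᵀ, which is exactly the degeneracy described above.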

4.5.2. Modified Lagrange Function Method for 2-Regular Problems

In this section, we consider the constrained optimization problem (49) with the modified Lagrangian function L_E(x, λ) defined in (50). We focus on the nonregular case, when the Jacobian matrix G′(x̄, λ̄) defined in (53) is singular at the solution (x̄, λ̄) of (52).
We will show that the mapping G ( x , λ ) defined in (51) is 2-regular at ( x ¯ , λ ¯ ) .
Without loss of generality, assume that I₀(x̄) = {1, …, s}, so that λ̄ⱼ = 0 and gⱼ(x̄) = 0 for all j = 1, …, s. Additionally, we assume that I₊(x̄) = {s + 1, s + 2, …, p}. Introduce the notation l = m − p. Then, the rows of the matrix G′(x̄, λ̄) numbered from (n + 1) to (n + s) contain only zeros. Define the vector h ∈ R^{n+m} as follows:
h T = 0 n T , 1 s T , 0 m s T ,
where 1 s T is an s-dimensional all-one row vector.
Let the mapping Φ : Rⁿ × Rᵐ → R^{n+m} be given by
Φ(x, λ) = G(x, λ) + G′(x, λ)h,
where h is defined in (55).
The following result is well known.
Lemma 3 
([45]). Let an n × n matrix V and an n × p matrix Q satisfy the properties:
 1. 
Q has linearly independent columns, and
 2. 
xᵀVx > 0 for all x ∈ Ker Qᵀ ∖ {0}.
Assume also that D_N is a full-rank diagonal l × l matrix. Then, the matrix A̅ defined by
A̅ = [[V, Q, 0], [Qᵀ, 0, 0], [0, 0, D_N]]
is a nonsingular matrix.
The Linear Independence Constraint Qualification (LICQ) holds for the optimization problem (49) if the gradients of active constraints are linearly independent.
The second-order sufficient optimality condition holds if there exists α > 0 such that
zᵀ ∇²ₓₓL_E(x̄, λ̄) z ≥ α‖z‖²
for all z ∈ Rⁿ that satisfy the conditions
(∇gⱼ(x̄))ᵀz = 0,  j ∈ I(x̄).
Lemma 4 
([45]). Let f, gᵢ ∈ C³(Rⁿ) for i = 1, …, m. Assume that the LICQ (Linear Independence Constraint Qualification) and the second-order sufficient optimality conditions are satisfied at the solution (x̄, λ̄) of (52), and that Φ is the mapping given by Equation (56). Then, the 2-factor operator
Φ′(x, λ) = G′(x, λ) + G″(x, λ)h
is nonsingular at the point (x̄, λ̄).
The proof of Lemma 4 can be derived from Lemma 3.
Indeed, if D(λ) is a diagonal matrix with λⱼ as the j-th diagonal entry,
V = ∇²ₓₓL_E(x̄, λ̄),  D_N = D(g_N(x̄)),  g_N(x) = (g_{p+1}(x), …, g_m(x))ᵀ,
and
Q = ( ∇g₁(x̄), …, ∇g_s(x̄), λ̄_{s+1}∇g_{s+1}(x̄), …, λ̄_p∇g_p(x̄) ),
then Φ′(x̄, λ̄) = A̅, where the matrix A̅ is defined in (57). Lemma 4 implies that the 2-factor Newton method is given by
w_{k+1} = w_k − [ G′(w_k) + G″(w_k)h ]^{−1} ( G(w_k) + G′(w_k)h ),  k = 0, 1, …,
and it can be applied to solve the system (52), where G is defined in (51). As a result, we have the following theorem.
Theorem 14 
([45]). Let x̄ be a solution to (49) and f, gᵢ ∈ C³(Rⁿ) for i = 1, …, m. Assume that the LICQ and the second-order sufficient optimality conditions (58) are satisfied at the point x̄. Then, there exists a sufficiently small open ball B(w̄, ε), where w̄ = (x̄, λ̄), such that the estimate
‖w_{k+1} − w̄‖ ≤ β‖w_k − w̄‖²
holds for the method (59), where w₀ ∈ B(w̄, ε) and β > 0 is a constant independent of k.
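Method (59) can be sketched in code for Example 11. Here n = 2 and s = 2, so (55) gives h = (0, 0, 1, 1)ᵀ; the maps G, G′, and the constant directional derivative G″h below were written out by us for this example (with g₁(x) = −x₁, g₂(x) = −x₂), and the linear solver is a plain Gaussian elimination.

```python
# Sketch of the 2-factor Newton method (59) for Example 11, with
# h = (0, 0, 1, 1) from (55) (n = 2, s = 2) and g1(x) = -x1, g2(x) = -x2.

def G(w):
    x1, x2, l1, l2 = w
    return [2*x1 + 4*x2 - 0.5*l1**2,
            2*x2 + 4*x1 - 0.5*l2**2,
            -l1*x1,
            -l2*x2]

def G_prime(w):
    x1, x2, l1, l2 = w
    return [[2.0, 4.0, -l1, 0.0],
            [4.0, 2.0, 0.0, -l2],
            [-l1, 0.0, -x1, 0.0],
            [0.0, -l2, 0.0, -x2]]

H = [0.0, 0.0, 1.0, 1.0]
# G''(w)h: the (constant) directional derivative of G'(w) along h
G2H = [[0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0],
       [-1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0]]

def solve(A, b):
    # Gaussian elimination with partial pivoting
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

w = [0.02, 0.01, 0.015, 0.01]          # start near (xbar, lambdabar) = 0
for _ in range(8):
    Gp = G_prime(w)
    A = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(Gp, G2H)]
    Gph = [sum(r[j] * H[j] for j in range(4)) for r in Gp]
    rhs = [g + gh for g, gh in zip(G(w), Gph)]
    step = solve(A, rhs)
    w = [wi - si for wi, si in zip(w, step)]
```

The iterate contracts quadratically to w̄ = (0, 0, 0, 0)ᵀ, as Theorem 14 predicts, even though the plain Jacobian G′ is singular there.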
In addition, there are other publications where a modified Lagrange function is used in various contexts, such as [54,55]. Higher-order analysis of optimality conditions has been performed in [56].

4.6. Calculus of Variations

The methods of the calculus of variations are widely used to solve many problems in physics and classical mechanics. However, since the classical approach cannot be directly applied to many of these problems, there is a need to extend or reformulate classical theorems to accommodate irregular cases. Over the years, various types of irregular problems in the calculus of variations have been extensively studied in both mathematics and its applications (see, e.g., [1,3,32,57,58,59,60,61,62]).

4.6.1. Singular Problems of Calculus of Variations

In this section, we consider the following Lagrange problem of finding a curve x = x(t) such that (see [63]):
J₀(x) = ∫_{t₁}^{t₂} f(t, x, ẋ) dt → min
subject to the subsidiary conditions:
Γ ( x ) = 0 , q ( x ( t 1 ) , x ( t 2 ) ) = 0 ,
where
Γ(x)(t) = G(t, x(t), ẋ(t)) = 0 for all t ∈ [t₁, t₂],  ẋ = dx/dt,
X = Cₙ¹([t₁, t₂]),  Y = C_m([t₁, t₂]),  x(t) ∈ X,  Γ ∈ C^{p+1}(X, Y),
G : R × Rⁿ × Rⁿ → Rᵐ,  G(t, x(t), ẋ(t)) = (G₁(t, x(t), ẋ(t)), …, G_m(t, x(t), ẋ(t))),
and
q : Rⁿ × Rⁿ → Rᵏ,  f : R × Rⁿ × Rⁿ → R.
We assume that all mappings and their partial derivatives are continuous with respect to t, x, and ẋ. We denote by x̄(t) a solution to Problem (60) and (61). Although x, ẋ, and each component of x are functions of t (e.g., x = x(t)), we do not always write this dependence explicitly, in order to avoid overcomplicated notation.
In the regular case, when Im Γ′(x̄) = Y, the Euler–Lagrange necessary conditions are satisfied and take the form (see, e.g., [60,64]):
f_x + λ(t)G_x − (d/dt)(f_ẋ + λ(t)G_ẋ) = 0 for all t ∈ [t₁, t₂].
Let λ(t) = (λ₁(t), …, λ_m(t))ᵀ. Then
λ(t)G = λ₁(t)G₁ + ⋯ + λ_m(t)G_m and λ(t)G_x = λ₁(t)∂G₁/∂x + ⋯ + λ_m(t)∂G_m/∂x.
In the singular case, when Im Γ′(x̄) ≠ Y, we can only guarantee that the following equation is satisfied:
λ₀ f_x + λ(t)G_x − (d/dt)(λ₀ f_ẋ + λ(t)G_ẋ) = 0,
where λ₀² + ‖λ(t)‖² = 1. In this case, λ₀ might be equal to 0, which yields no constructive conditions for describing or finding x̄(t).
Example 12 
([63]). Consider the following problem of finding a curve x ( t ) = ( x 1 ( t ) , , x 5 ( t ) ) such that
J₀(x) = ∫₀^{2π} (x₁² + x₂² + x₃² + x₄² + x₅²) dt → min
subject to
Γ(x) = ( ẋ₁ − x₂ + x₃²x₁ + x₄²x₂ − x₅²(x₁ + x₂),  ẋ₂ + x₁ + x₃²x₂ − x₄²x₁ − x₅²(x₂ − x₁) )ᵀ = 0,
xᵢ(0) = xᵢ(2π), i = 1, …, 5,
where Γ : C₅²([0, 2π]) → C₂([0, 2π]), ẋ₁ = dx₁/dt, and ẋ₂ = dx₂/dt.
Here,
f(x) = x₁² + x₂² + x₃² + x₄² + x₅², and qᵢ(x(0), x(2π)) = xᵢ(0) − xᵢ(2π), i = 1, …, 5.
The solution of Problem (64) and (65) is x̄(t) = 0, and Γ′(0) is singular. Indeed, using the differentiation rules in functional spaces, we obtain
Γ′(0)x = (ẋ₁ − x₂, x₁ + ẋ₂)ᵀ
(the partial derivatives with respect to x₃, x₄, and x₅ vanish at 0). Introducing the notation z(t) = x₁(t) and using the methods of differential equations, one can show that the mapping z(t) ↦ z″(t) + z(t), with the boundary condition z(0) = z(2π), is not surjective. Indeed, for y ∈ C[0, 2π] satisfying
∫₀^{2π} sin τ · y(τ) dτ ≠ 0 or ∫₀^{2π} cos τ · y(τ) dτ ≠ 0,
the equation z″(t) + z(t) = y(t) does not have a solution.
With z = z ( t ) , the corresponding Euler–Lagrange equations in this case are as follows:
2λ₀z − λ̇₁ + λ₂ + λ₁x₃² − λ₁x₅² + λ₂x₅² − λ₂x₄² = 0,
2λ₀x₂ − λ̇₂ − λ₁ + λ₁x₄² − λ₁x₅² + λ₂x₃² − λ₂x₅² = 0,
2λ₀x₃ + 2λ₁zx₃ + 2λ₂x₂x₃ = 0,
2λ₀x₄ + 2λ₁x₂x₄ − 2λ₂zx₄ = 0,
2λ₀x₅ − 2λ₁x₅z − 2λ₁x₂x₅ − 2λ₂x₂x₅ + 2λ₂zx₅ = 0,
λᵢ(0) = λᵢ(2π), i = 1, 2.
Unfortunately, we cannot guarantee that λ₀ ≠ 0. For λ₀ = 0, we obtain a series of spurious solutions to the problem (64) and (65):
z(t) = a sin t,  x₂ = a cos t,  x₃ = x₄ = x₅ = 0,  λ₁ = b sin t,  λ₂ = b cos t,  a, b ∈ R.
The derivation of the solutions (66) is based on standard techniques, so we are omitting the technical details from the paper.

4.6.2. Optimality Conditions for p-Regular Problems of Calculus of Variations

To formulate optimality conditions for the problem (60) and (61) in the singular case, we define the p-factor Euler–Lagrange function by
E(x) = f(x) + λ(t)Γ^{(p−1)}(x)[h]^{p−1},
where
Γ^{(p−1)}(x)[h]^{p−1} = g₁(x) + g₂′(x)[h] + ⋯ + g_p^{(p−1)}(x)[h]^{p−1},
λ(t)Γ^{(p−1)}(x)[h]^{p−1} = ⟨λ(t), g₁(x) + g₂′(x)[h] + ⋯ + g_p^{(p−1)}(x)[h]^{p−1}⟩,
λ(t) = (λ₁(t), …, λ_m(t))ᵀ,  h = h(t) ∈ X.
The functions gᵢ(x), i = 1, …, p, are determined for the mapping Γ(x) in a way similar to how the functions fᵢ(x), i = 1, …, p, are defined for the mapping F(x) in Equation (16). Namely,
g_k(x) = P_{Y_k}Γ(x),  k = 1, …, p.
Let
g_k^{(k−1)}(x)[h]^{k−1} = Σ_{i+j=k−1} C^i_{k−1} (g_k)^{(k−1)}_{xⁱ(ẋ)ʲ}(x)[h]ⁱ[ḣ]ʲ,  k = 1, …, p,
where (g_k)^{(k−1)}_{xⁱ(ẋ)ʲ}(x) denotes the (k − 1)-th derivative of g_k taken i times with respect to x and j times with respect to ẋ.
Definition 10. 
Let X = Cₙ¹([t₁, t₂]) and Y = C_m([t₁, t₂]). We say that problem (60) and (61) is p-regular at x̄(t) ∈ X along a vector h(t) ∈ X, h(t) ∈ ⋂_{k=1}^{p} Ker^k g_k^{(k)}(x̄(t)), h(t) ≠ 0, if
Im( g₁′(x̄(t)) + ⋯ + g_p^{(p)}(x̄(t))[h(t)]^{p−1} ) = Y.
Theorem 15 
([63]). Assume that the problem (60) and (61) is p-regular at its solution x̄(t) ∈ X along h = h(t) ∈ X, h ∈ ⋂_{k=1}^{p} Ker^k g_k^{(k)}(x̄(t)). Then, there exists a multiplier λ̂(t) = (λ̂₁(t), …, λ̂_m(t))ᵀ such that the following p-factor Euler–Lagrange equation holds:
E_x(x̄(t)) − (d/dt)E_ẋ(x̄(t)) = f_x(x̄(t)) + ⟨λ̂(t), Σ_{k=1}^{p} Σ_{i+j=k−1} C^i_{k−1} g^{(k−1)}_{xⁱ(ẋ)ʲ}(x̄(t))[h]ⁱ(ḣ)ʲ⟩_x − (d/dt)( f_ẋ(x̄(t)) + ⟨λ̂(t), Σ_{k=1}^{p} Σ_{i+j=k−1} C^i_{k−1} g^{(k−1)}_{xⁱ(ẋ)ʲ}(x̄(t))[h]ⁱ(ḣ)ʲ⟩_ẋ ) = 0,
λᵢ(0) = λᵢ(2π), i = 1, 2.
The proof of Theorem 15 is similar to the one for the singular isoperimetric problem in [65].
We now go back to Example 12 for further consideration. The mapping Γ is 2-regular at x ¯ ( t ) = ( a sin t , a cos t , 0 , 0 , 0 ) T along h ( t ) = ( a sin t , a cos t , 1 , 1 , 1 ) T . This means that in this problem p = 2 .
Consider the following equation:
f_x(x) + (Γ′(x) + P_{Y₂}Γ″(x)h)*λ(t) = 0.
This equation is equivalent to the system of Euler–Lagrange equations
2x₁ − λ̇₁ + λ₂ = 0,
2x₂ − λ̇₂ − λ₁ = 0,
2x₃ + 2λ₁a sin t + 2λ₂a cos t = 0,
2x₄ + 2λ₁a cos t − 2λ₂a sin t = 0,
2x₅ + 2λ₁a(cos t − sin t) + 2λ₂a(sin t − cos t) = 0,
λᵢ(0) = λᵢ(2π), i = 1, 2.
One can verify that the following "false solutions" of (64) and (65),
x₁ = a sin t,  x₂ = a cos t,  x₃ = x₄ = x₅ = 0,
do not satisfy the system (68) if a ≠ 0. This implies that
x₁ = a sin t,  x₂ = a cos t,  x₃ = x₄ = x₅ = 0
are not solutions to the 2-factor Euler–Lagrange Equation (67) from Theorem 15. Therefore, the only solution to Example 12 is x̄(t) = (0, 0, 0, 0, 0)ᵀ. Indeed, the 2-factor Euler–Lagrange equation takes in this case the following form:
−λ̇₁ + λ₂ = 0,  −λ̇₂ − λ₁ = 0,  2λ₁a sin t + 2λ₂a cos t = 0,  2λ₁a cos t − 2λ₂a sin t = 0,  2λ₁a(cos t − sin t) + 2λ₂a(sin t − cos t) = 0,  λᵢ(0) = λᵢ(2π), i = 1, 2.
This system has the solution x ¯ ( t ) = ( 0 , 0 , 0 , 0 , 0 ) T and λ ¯ i ( t ) = 0 , i = 1 , 2 .

4.7. Existence of Solutions to Nonlinear Equations

This section addresses the existence of a solution to an equation of the form (3), F(x) = 0, in the neighborhood of a chosen point x̄. A very general setting is considered, where the function F maps from a Banach space X to a Banach space Y, and the assumptions pertain to the properties of its derivatives in the neighborhood under consideration. This is one of the classical problems of nonlinear analysis, with many important applications, especially in the theory of differential equations (cf. [66,67,68]).
One well-known method for addressing this problem is Newton’s method (see [69]). The solution is obtained as the limit of a recursively defined sequence of approximations. This method is applied in the proof of the first theorem in this section. In particular, the existence of the inverse operator to the derivative of the function at a chosen point is assumed.
The next theorem presented is more general and uses the p-factor construction of the operator for functions of class C p + 1 . A certain limitation of this construction is the assumption of the existence of continuous projections onto subspaces of Y corresponding to successive orders of the derivatives of the function F.

4.7.1. Existence of Solutions to Nonlinear Equations in the Regular Case

Let X and Y be Banach spaces. Consider a mapping F : X Y and a problem of existence of a point x ¯ such that F ( x ¯ ) = 0 . We know that equation F ( x ) = 0 is solvable and has a solution x ¯ when the operator F ( x 0 ) is surjective [27,70]. A modified version of the following theorem was given in [70].
Theorem 16. 
Let X and Y be Banach spaces, let x₀ ∈ X, and let 0 < ε < 1. Assume that F ∈ C²(B(x₀, ε)) and ‖F(x₀)‖ = η for some constant η > 0. Suppose that the derivative F′(x₀) is invertible and that there exist constants δ > 0 and C > 0 such that ‖F′(x₀)⁻¹‖ = δ and sup_{x ∈ B(x₀, ε)} ‖F″(x)‖ = C < +∞. If, moreover, the following conditions are satisfied:
 1. 
δη ≤ ε/2,
 2. 
δCε ≤ 1/4,
 3. 
Cε ≤ 1/2,
then the equation F(x) = 0 has a solution x̄ ∈ B(x₀, ε).
If the first derivative of F at x₀ is not surjective, then Theorem 16 cannot be applied. Consider, for example, the mapping F : R → R defined by
F(x) = (1/7!)x⁷ + x⁵ + 1/10³.
Note that if x₀ = 0, the assumptions of Theorem 16 are not satisfied (F′(0) = 0), but the equation F(x) = 0 still has a solution x̄ ≈ −0.251188.
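A quick numerical check of this claim (our sketch): F changes sign on [−0.3, −0.2], so bisection brackets the root even though the derivative vanishes at x₀ = 0.

```python
# Sketch: locating the root of F(x) = x**7/7! + x**5 + 10**-3 by bisection.
from math import factorial

def F(x):
    return x**7 / factorial(7) + x**5 + 1e-3

def dF(x):
    return x**6 / 720.0 + 5.0 * x**4   # dF(0) = 0: F'(0) is not invertible

lo, hi = -0.3, -0.2                    # F(lo) < 0 < F(hi): a root is bracketed
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if F(lo) * F(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
root = 0.5 * (lo + hi)
```

The computed root agrees with x̄ ≈ −0.251188 (the x⁵ term dominates, since x⁷/7! is negligible at this scale).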

4.7.2. Existence of Solutions to Nonlinear Equations in the Singular Case

In this section, we continue considering the problem introduced in Section 4.7.1. Specifically, let X and Y be Banach spaces, and let F : X → Y. Assume that F(x₀) ≠ 0 for some x₀. We are interested in the existence of a solution x̄ to the equation F(x) = 0 in some open ball B(x₀, ε) around x₀, that is, an x̄ ∈ B(x₀, ε) with F(x̄) = 0. Most of the work on this problem focuses on Newton's method or its modifications, under the assumption that F′(x₀) is regular (see, e.g., [71]).
Now, consider the degenerate case where F ( x 0 ) is not regular. The focus here is on finding a small constant ε > 0 such that the neighborhood B ( x 0 , ε ) contains a solution x ¯ to the equation F ( x ) = 0 . We introduce the following notation and assumptions for some p 2 :
δ = ‖F(x₀)‖,
η = ‖{Ψ_p(h)}⁻¹‖ < ∞,  h ∈ ⋂_{k=1}^{p} Ker^k f_k^{(k)}(x₀),  ‖h‖ = 1,
c = max_{k=1,…,p} sup_{x ∈ B(x₀, ε)} ‖f_k^{(k+1)}(x)‖,  d = 4 max_{k=1,…,p} (1/(k − 1)!) ‖f_k^{(k)}(x₀)‖,
α = min{ 3/(4^{p+2} η),  min_{k=1,…,p} ‖f_k^{(k)}(x₀)‖/(k − 1)! }.
The following theorem was proved in [72].
Theorem 17. 
Let X and Y be Banach spaces, and let F : X → Y be of class C^{p+1}(X). Assume that there exists h ∈ ⋂_{k=1}^{p} Ker^k f_k^{(k)}(x₀), with ‖h‖ = 1, such that F is a p-regular mapping at x₀ ∈ X along h.
Assume also that there exists ω, 0 < ω < ½ν, where ν ∈ (0, 1), such that the following inequalities hold:
 1. 
ηδ ≤ αω^p/(2^p d),
 2. 
(4^{p+2}/3) c ω η ≤ ½.
Then the equation F(x) = 0 has a solution x̄ = x₀ + ωh + x̄(ω) ∈ B_ν(x₀), where x̄(ω) is a fixed point such that ‖x̄(ω)‖ ≤ ½ω.
Recall that if our focus is on finding a radius ε > 0 such that the open ball B(x₀, ε) contains a solution x̄ of F(x) = 0, then Theorem 17 implies that ε = ω + ‖x̄(ω)‖ suffices. For example, we can take ε = (3/2)ω.
As an example of singular nonlinear equation, we consider the problem of existence of local nontrivial solutions of the Boundary Value Problem (BVP) for the ordinary differential equation
y″(t) + y(t) + g(y(t)) = x(t)
with the boundary conditions
y ( 0 ) = y ( π ) = 0 ,
which is degenerate at ȳ(t) = 0. Here, y(t) ∈ C²([0, π]), and g, x are given functions such that
x ∈ C[0, π],  g ∈ C^{p+1},  g(0) = g′(0) = 0.
Remark 4. 
Recall that the operator Ψ_p is defined in (19). The surjectivity of the operator Ψ_p(ωh) for any ω ≠ 0 implies the p-regularity condition of the mapping F at the point x₀ (by the definition). It is also equivalent to the following inequality, with a vector h such that ‖h‖ = 1 and some constant c > 0:
‖{Ψ_p(ωh)}⁻¹y‖ ≤ c‖y‖(1 + 1/ω + 1/ω² + ⋯ + 1/ω^{p−1}).

4.8. Differential Equations

4.8.1. Nonlinear Boundary-Value Problem

The nonlinear BVP analyzed in this section has the form
y″(t) + y(t) + g(y(t)) = x(t)
with boundary conditions
y ( 0 ) = y ( π ) = 0 ,
where y(t) ∈ C²[0, π], x(t) ∈ C[0, π], and g is a C³ function from R to R satisfying
g(0) = g′(0) = 0,  x(0) = x(π) = 0.
We are interested in the problem of the existence of a solution y(t) to the BVP (74) and (75) for given functions x(t) and g.
Introduce the notation
F(x, y) = y″ + y + g(y) − x,
and regard F as a mapping F : X × Y Z , where
X = { x ∈ C[0, π] | x(0) = x(π) = 0 },  Y = { y ∈ C²[0, π] | y(0) = y(π) = 0 },
and Z = C [ 0 , π ] . We can rewrite Equation (74) as
F ( x , y ) = 0 .
The assumptions (75) and (76) imply that ( 0 , 0 ) is a solution of (78): F ( 0 , 0 ) = 0 . Without loss of generality, we restrict our attention to a neighborhood U × V X × Y of the point ( 0 , 0 ) . The problem of existence of a solution y ( t ) to the BVP (74) and (75) for a given function x ( t ) U is equivalent to the problem of existence of an implicit function φ ( x ) : U Y , such that y = φ ( x ) and
F(x, y) = y″ + y + g(y) − x = 0,  y(0) = y(π) = 0.
If F(0, 0) = 0 and the mapping F is regular at (0, 0), that is, if the partial derivative of F with respect to y, denoted F_y(0, 0), is a surjective linear operator, then the classical Implicit Function Theorem (Theorem 9) guarantees the existence of a smooth mapping φ defined on a neighborhood of x̄ = 0 such that F(x, φ(x)) = 0 and φ(0) = 0. In this case, the operator F_y(0, 0) is given by
F_y(0, 0)y = y″ + y + g′(0)y = y″ + y,
since g′(0) = 0.
However, the situation changes in the nonregular case. Consider, for example, the BVP
y″(t) + y(t) = sin t,  y(0) = y(π) = 0,
which has no solution. To see this, multiply both sides of the equation by sin t and integrate from 0 to π . The left-hand side, after integration by parts, evaluates to zero, while the right-hand side is nonzero. In this example, the operator F y ( 0 , 0 ) is not surjective, and, therefore, the classical Implicit Function Theorem does not apply to guarantee the existence of a solution to Equation (78).

4.8.2. Nonlinear Boundary-Value Problem in the Nonregular Case

We consider the boundary-value problem (74) and (75) in the nonregular case, using the definitions and notation introduced in Section 4.8.1. Our analysis is restricted to a neighborhood of the point ( x ¯ ( t ) , y ¯ ( t ) ) = ( 0 , 0 ) , t [ 0 , π ] .
As shown in Section 4.8.1, the operator F y ( 0 , 0 ) is not surjective. In this case, we apply the pth-order Implicit Function Theorem 11 with p = 2 to derive conditions for the existence of an implicit function y = φ ( x ) , and, consequently, for the existence of a solution to the BVP (74) and (75).
To apply Theorem 11, we first introduce some auxiliary spaces and functions for the mapping F ( x , y ) , in accordance with Section 4.2.2.
By the definition of the operator F y ( 0 , 0 ) in (80), its image is the set of all z ( t ) Z , such that there exists y Y satisfying
y″ + y = z(t).
The general solution of (81) has the form:
y(t) = C₁ cos t + C₂ sin t + sin t ∫₀ᵗ cos τ · z(τ) dτ − cos t ∫₀ᵗ sin τ · z(τ) dτ,  C₁, C₂ ∈ R.
Substituting the boundary conditions y ( 0 ) = y ( π ) = 0 yields C 1 = 0 and
∫₀^π sin τ · z(τ) dτ = 0.
Hence,
Z₁ = Im F_y(0, 0) = { z(t) ∈ Z | ∫₀^π sin τ · z(τ) dτ = 0 },
and, as expected, Z₁ ≠ Z. The kernel of F_y(0, 0) is defined by the boundary-value problem
y″ + y = 0,  y(0) = y(π) = 0,
whose solution is y ( t ) = C sin t , with C R . Therefore, Ker ( F y ( 0 , 0 ) ) = span ( sin t ) .
Let Z 2 be a closed complementary subspace to Z 1 . Then, Z 2 = span ( sin t ) , and the projection operator P Z 2 is defined as
P_{Z₂}z(t) = (2/π) sin t ∫₀^π sin τ · z(τ) dτ,  z(t) ∈ Z.
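The projection (83) can be sanity-checked by quadrature (our sketch, using the composite trapezoidal rule): it reproduces sin t and annihilates functions in Z₁, such as sin 2t.

```python
# Sketch: checking the projection (83) by composite trapezoidal quadrature.
from math import sin, pi

def proj_coeff(z, n=4000):
    # P_Z2 z = c * sin(t), where c = (2/pi) * \int_0^pi sin(tau) z(tau) dtau
    h = pi / n
    total = 0.5 * (sin(0.0) * z(0.0) + sin(pi) * z(pi))
    for k in range(1, n):
        t = k * h
        total += sin(t) * z(t)
    return (2.0 / pi) * total * h

c_sin = proj_coeff(sin)                    # sin t spans Z2: coefficient ~ 1
c_sin2 = proj_coeff(lambda t: sin(2 * t))  # sin 2t lies in Z1: coefficient ~ 0
```

Since ∫₀^π sin²τ dτ = π/2 and ∫₀^π sin τ sin 2τ dτ = 0, the two coefficients come out as 1 and 0, respectively, up to quadrature error.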
Next, define the mappings f 1 ( x , y ) and f 2 ( x , y ) by
f 1 ( x , y ) = F ( x , y ) , f 2 ( x , y ) = P Z 2 F ( x , y ) .
For p = 2, the 2-factor operator has the form:
Ψ₂(h)y(t) = y″(t) + y(t) + P_{Z₂}(g″(0)[h]y)(t),
where h = h(x(t)) is a function.
Example 13. 
Consider the following nonlinear BVP:
y″(t) + y(t) + y²(t) = v sin t,  y(0) = y(π) = 0,
where g(y) = y², x(t) = v sin t, F(x, y) = y″ + y + y² − v sin t, v is a constant, and F : X × Y → Z, with X, Y, and Z defined above.
We now verify that all conditions of the pth-order Implicit Function Theorem 11 are satisfied for the mapping F ( x , y ) with a sufficiently small v > 0 and p = 2 . Note that y ¯ ( t ) = 0 is a solution of the homogeneous BVP corresponding to (86), so that F ( x ¯ , y ¯ ) = 0 .
For p = 2 , Condition 1 of Theorem 11 holds for F due to the structure of the mapping g ( y ) , as well as f 1 ( x , y ) and f 2 ( x , y ) introduced in (84).
Condition 2 (the 2-factor-approximation) depends only on the properties of the mapping g(y) = y² and reduces to the existence of a sufficiently small ε > 0 and a neighborhood U(ȳ) of ȳ such that, for all y1, y2 ∈ U(ȳ),
‖P_{Z1}(y1² − y2²)‖ ≤ ‖y1² − y2²‖ ≤ ε ‖y1 − y2‖
and
‖P_{Z2}(y1² − y2²)‖ ≤ ‖y1 + y2‖ ‖y1 − y2‖ ≤ ε ‖y1 − y2‖.
Since y1² − y2² = (y1 + y2)(y1 − y2) and ‖y1 + y2‖ is small in a sufficiently small neighborhood of ȳ = 0, both inequalities hold, and hence Condition 2 is satisfied.
Condition 3 is equivalent to the existence of a neighborhood U(x̄) such that, for x ∈ U(x̄), there exist a function h = h(x(t)) and a constant c1 > 0 satisfying
h″ + h + (2/π) sin t ∫_0^π sin τ h²(τ) dτ = v sin t.
Problem (87) has an explicit solution
h(t) = √(3πv/8) sin t,
which exists only for v > 0. Then, Condition 3 reduces to verifying that there exists a constant c1 > 0 such that
‖√(3πv/8) sin t‖ ≤ c1 ‖v sin t‖^{1/2}.
This inequality is equivalent to
√(3π/8) ≤ c1,
which is satisfied, for instance, by taking c1 = √(3π/8).
To verify Condition 4, we observe that with x = v sin t and h given by (88), the set Φ2^{−1}(F(x, ȳ)) consists of a single element {h}. The operator Ψ2(h̄), defined in (85), with h̄(t) = sin t, takes the form:
Ψ2(h̄) y = y″ + y + (2/π) sin t ∫_0^π 2 y(τ) sin²τ dτ,
which is surjective, and therefore Condition 4 is satisfied.
Having verified all four conditions of Theorem 11, we conclude that there exists a solution y(t) of the BVP (86) satisfying
‖y(t)‖ ≤ c ‖v sin t‖^{1/2} ≤ c v^{1/2},  c > 0.
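The explicit solution (88) can also be checked numerically. In the sketch below (the value of v and the quadrature helper are our own illustrative choices), we substitute h(t) = √(3πv/8) sin t into the left-hand side of (87); since h″ + h = 0 for any multiple of sin t, only the integral term contributes, and it reproduces v sin t:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

v = 0.01                            # an arbitrary small parameter
A = math.sqrt(3 * math.pi * v / 8)  # amplitude from (88)
h_sol = lambda t: A * math.sin(t)

# h'' + h = 0, so the left-hand side of (87) reduces to the integral term.
integral = simpson(lambda tau: math.sin(tau) * h_sol(tau) ** 2, 0.0, math.pi)
residual = max(
    abs((2 / math.pi) * math.sin(t) * integral - v * math.sin(t))
    for t in (0.3, 1.0, 2.0, 3.0)
)
print(residual < 1e-8)  # the integral term equals v sin t up to quadrature error
```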

4.9. Interpolation by Polynomials

In this section, we consider one of the newest applications of the p-regularity theory. There are many books on numerical analysis and numerical methods where the topics of interpolation and polynomial approximation are described in detail (see, for example, [73,74]).

4.9.1. Newton Interpolation Polynomial

Let f ∈ C^{p+1}([a,b]) and consider the equation
f(x) = 0,
where x ∈ [a,b]. For some Δx > 0, define the points xi, i = 0, …, n, as follows:
x0 = a, x1 = x0 + Δx, x2 = x1 + Δx, …, xn = b.
Let
yi = f(xi),  i = 0, 1, …, n.
The problem of interpolation is to find a polynomial P n ( x ) of degree at most n such that P n ( x i ) = y i , i = 0 , , n , and that gives a good approximation of the function f ( x ) .
Let ε = Δx be sufficiently small and assume that |f(x) − Pn(x)| ≤ C1 ε, where C1 ≥ 0 is a constant. Assume that the equation f(x) = 0 has a solution x̄ ∈ (a,b), and the equation Pn(x) = 0 has a solution x̃ ∈ (a,b). Our goal is to use the interpolation polynomial Pn(x) and its solution x̃ to obtain ε²-accuracy for the solution x̄, in the sense that
|x̄ − x̃| ≤ C ε²,
where C ≥ 0 is a constant. In the regular case, this can be obtained by using, for example, the Newton interpolation polynomial Pn(x) with Δx = ε.
Recall that the Newton interpolation polynomial of degree n, related to the data points
(x0, y0), (x1, y1), …, (xn, yn),
is defined by
Pn(x) = α0 + α1(x − x0) + α2(x − x0)(x − x1) + ⋯ + αn(x − x0)(x − x1)⋯(x − x_{n−1}) = Σ_{k=0}^{n} αk ωk(x),
where
ω0(x) = 1,  ωi(x) = (x − x0)(x − x1)⋯(x − x_{i−1}) = ∏_{j=0}^{i−1} (x − xj),  i = 1, …, n.
The coefficients αk are called divided differences and are defined using the following relations:
αk = [y0, …, yk],  k = 0, 1, …, n,
where
[yk] = yk,  k = 0, …, n,
[yk, …, y_{k+j}] = ( [y_{k+1}, …, y_{k+j}] − [yk, …, y_{k+j−1}] ) / ( x_{k+j} − xk ),  k = 0, …, n − j,  j = 1, …, n.
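The recursion (91) translates directly into a short divided-difference routine. The sketch below (function names are ours, for illustration only) builds the coefficients αk and evaluates Pn(x) in the Newton form (90) by Horner's scheme:

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients [alpha_0, ..., alpha_n] from (91)."""
    n = len(xs)
    table = list(ys)             # table[k] starts as [y_k]
    coeffs = [table[0]]
    for j in range(1, n):
        for k in range(n - j):   # table[k] becomes [y_k, ..., y_{k+j}]
            table[k] = (table[k + 1] - table[k]) / (xs[k + j] - xs[k])
        coeffs.append(table[0])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate P_n(x) = sum_k alpha_k * omega_k(x) by Horner's scheme."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

xs = [0.0, 1.0, 2.0, 3.0]
ys = [x ** 3 for x in xs]
alphas = divided_differences(xs, ys)
# A cubic is reproduced exactly by its degree-3 Newton interpolant:
print(abs(newton_eval(xs, alphas, 1.5) - 1.5 ** 3) < 1e-12)
```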
In the following example, we consider a nonlinear function f ( x ) , which is not regular at a solution of the equation f ( x ) = 0 , and investigate whether a solution of the equation P n ( x ) = 0 provides the desired accuracy (89) for the solution x ¯ of f ( x ) = 0 , assuming that | f ( x ) P n ( x ) | C 1 ε holds for a sufficiently small ε .
Example 14. 
Consider the function f(x) = x³. The solution of the equation f(x) = 0 is x̄ = 0. The function f(x) is singular at x̄ = 0 up to the second order because f^{(i)}(0) = 0 for i = 1, 2. The goal in this example is to investigate whether the estimate (89) is satisfied when using the interpolation polynomial P1(x) and a solution of P1(x) = 0 to approximate the solution of f(x) = 0. Using the equations given above with n = 1, we obtain
P1(x) = α0 + α1(x − x0),
where the coefficients α 0 and α 1 are determined by using Equation (91).
Let ε = Δx be sufficiently small and consider the segment [a, b] = [−ε/3, 2ε/3]. The interpolation points are x0 = a = −ε/3 and x1 = b = 2ε/3. Calculating the coefficients, we obtain
α0 = f(x0) = −ε³/27  and  α1 = ( f(x1) − f(x0) ) / ( x1 − x0 ) = ε²/3.
Hence, the interpolation polynomial has the form
P1(x) = −ε³/27 + (ε²/3)(x + ε/3) = −ε³/27 + (ε²/3)x + ε³/9 = 2ε³/27 + (ε²/3)x.
Moreover,
|P1(x) − f(x)| ≤ ε³ ≤ C2 ε,  C2 ≥ 0,
for a sufficiently small ε.
The solution of the equation P1(x) = 0 is
x̃ = −2ε/9,
which is not satisfactory from the approximation-accuracy point of view, since
|x̃ − x̄| = |−2ε/9 − 0| = 2ε/9 > ε²
for a sufficiently small ε, and the desired accuracy (89) is not obtained.
Thus, in the degenerate case, contrary to the regular case, while we have the required accuracy of the approximation for the function f ( x ) = x 3 , the accuracy of the solution is only of the order ε, and not ε 2 .
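The computation in Example 14 is easy to reproduce. The script below (a pure illustration; the value of ε is arbitrary) builds P1 on [−ε/3, 2ε/3] for f(x) = x³ and confirms that its root x̃ = −2ε/9 misses the ε²-accuracy (89):

```python
f = lambda x: x ** 3
eps = 1e-3  # an arbitrary small step

x0, x1 = -eps / 3, 2 * eps / 3
alpha0 = f(x0)
alpha1 = (f(x1) - f(x0)) / (x1 - x0)

# Root of P_1(x) = alpha0 + alpha1 * (x - x0):
x_tilde = x0 - alpha0 / alpha1
print(abs(x_tilde + 2 * eps / 9) < 1e-12)  # matches the closed form -2*eps/9
print(abs(x_tilde) > eps ** 2)             # only O(eps) accuracy, as claimed
```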

4.9.2. The p-Factor Interpolation Method

In this section, we demonstrate that the desired accuracy (89) for the solution of the equation f ( x ) = 0 in the degenerate case can be achieved by using the p-factor interpolation polynomial, rather than the classical Newton interpolation polynomial, to obtain an approximate solution of f ( x ) = 0 .
Let f : ℝ → ℝ be a C^{p+1} function that is singular at a point x̄.
For some p > 1, we associate f with its corresponding p-factor function f̄, defined as
f̄(x) = f(x) + f′(x)h + ⋯ + f^{(p)}(x)[h]^p,
where h ∈ ℝ, h ≠ 0. Similarly to the Newton interpolation method, we construct the p-factor interpolation polynomial P̄n(x) using the function f̄ as follows:
P̄n(x) = Σ_{k=0}^{n} ᾱk ωk(x),
where the functions ωk(x) are defined in the same way as in (90), and the coefficients ᾱk, for k = 0, 1, …, n, are given by
ᾱ0 = [ȳ0] = f̄(x0), with [ȳi] = f̄(xi) for i = 1, …, n,
ᾱ1 = [ȳ0, ȳ1] = ( [ȳ1] − [ȳ0] ) / ( x1 − x0 ),
⋮
ᾱn = [ȳ0, …, ȳn] = ( [ȳ1, …, ȳn] − [ȳ0, …, ȳ_{n−1}] ) / ( xn − x0 ).
Theorem 18. 
Let the equation f(x) = 0 have a solution x̄ ∈ (a,b). Assume that f ∈ C^p([a,b]) is p-regular along h ≠ 0 at the point x̄. Suppose that P̄n(x) is the Newton interpolation polynomial for the associated function f̄, constructed with a sufficiently small interpolation step ε = Δx > 0.
Then, the equation P̄n(x) = 0 has a solution x̂ ∈ (a,b) such that
|x̂ − x̄| ≤ c ε²,
where c > 0 is an independent constant.
We omit the proof, as it is similar to the proof of convergence of the classical iterative Newton method.
As in the previous sections, we say that a function f ∈ C^p([a,b]) is p-regular along h ≠ 0 at the point x̄ ∈ (a,b) if there is a natural number p ≥ 2 such that
f^{(i)}(x̄) = 0,  i = 1, …, p − 1,  and  f^{(p)}(x̄) ≠ 0.
Note that if p = 1 , the definition of a p-regular function f reduces to the standard definition of a regular function, and the p-factor interpolation polynomial P n ¯ ( x ) coincides with the classical Newton interpolation polynomial P n ( x ) .
Example 15. 
We will apply the p-factor interpolation method to the function from Example 14. Define the function f̄ for p = 2 and h = 1 as
f̄(x) = f(x) + f′(x)h + f″(x)[h]² = x³ + 3x² + 6x,
and consider the p-factor interpolation polynomial P̄1(x). Using the same interval as in Example 14, the interpolation points are x0 = a = −ε/3 and x1 = b = 2ε/3. The coefficients are given by
ᾱ0 = f̄(x0) = −ε³/27 + ε²/3 − 2ε
and
ᾱ1 = ( f̄(x1) − f̄(x0) ) / ( x1 − x0 ) = ε²/3 + ε + 6.
Thus, the p-factor interpolation polynomial is
P̄1(x) = ᾱ0 + ᾱ1(x − x0) = −ε³/27 + ε²/3 − 2ε + (ε²/3 + ε + 6)(x + ε/3) = (ε²/3 + ε + 6)x + 2ε³/27 + 2ε²/3.
Hence, for a sufficiently small ε, we have
|P̄1(x) − f(x)| ≤ C3 ε,  C3 ≥ 0.
Solving the equation P̄1(x) = 0, we obtain
x̂ = −( 2ε³/27 + 2ε²/3 ) / ( ε²/3 + ε + 6 ) = −( 2ε³/9 + 2ε² ) / ( ε² + 3ε + 18 ).
Therefore,
|x̂ − x̄| = ( 2ε³/9 + 2ε² ) / ( ε² + 3ε + 18 ) < 3ε²/18 = ε²/6,
and we thus obtain the desired ε²-accuracy stated in estimate (89) for the solution of the equation f(x) = 0.
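Repeating the computation with the 2-factor function f̄(x) = x³ + 3x² + 6x (an illustrative script with an arbitrary ε) confirms the ε²/6 bound obtained above:

```python
fbar = lambda x: x ** 3 + 3 * x ** 2 + 6 * x  # 2-factor function of Example 15
eps = 1e-3  # an arbitrary small step

x0, x1 = -eps / 3, 2 * eps / 3
a0 = fbar(x0)
a1 = (fbar(x1) - fbar(x0)) / (x1 - x0)

# Root of the 2-factor interpolation polynomial:
x_hat = x0 - a0 / a1
print(abs(x_hat) <= eps ** 2 / 6)  # epsilon^2-accuracy for the root of x^3
```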
Let us now compare the use of the classical polynomial P 1 ( x ) with the p-factor interpolation polynomial P 1 ¯ ( x ) in approximating the solution x ¯ of the equation f ( x ) = 0 , for the function f from Example 14. As mentioned earlier, the polynomial P 1 ( x ) is a good approximation for the function f ( x ) in the sense that
|P1(x) − f(x)| ≤ ε³ ≤ C2 ε,  C2 ≥ 0.
However, the solution x ˜ of P 1 ( x ) = 0 does not yield the desired accuracy for x ¯ , since
|x̃ − x̄| = 2ε/9 ≥ C4 ε,  C4 > 0,
and the target accuracy of order ε 2 is not achieved.
In contrast, the p-factor interpolation polynomial P̄1(x) approximates f(x) with accuracy of order ε:
|P̄1(x) − f(x)| ≤ C3 ε.
Thus, using the p-factor interpolation polynomial P 1 ¯ ( x ) , we achieve the desired accuracy for the solution x ¯ of f ( x ) = 0 . Specifically, as shown above, the solution x ^ of P 1 ¯ ( x ) = 0 satisfies estimate (89):
|x̂ − x̄| ≤ ε²/6.
This level of accuracy could not be achieved using the classical interpolation polynomial P 1 ( x ) .

5. Conclusions

In this paper, we described various applications of the theory of p-regularity, including the generalization of the Lyusternik and Implicit Function theorems, the Newton method, optimality conditions for equality and inequality constraints, calculus of variations, and the solvability of nonlinear equations.
We should note that we did not cover all areas where the results of the theory can be applied. In addition, there are other areas of mathematics where the theory of p-regularity (or p-factor analysis) has not yet been applied. For example, we did not provide examples of applying the theory to the analysis of the existence of solutions of singular nonlinear partial differential equations, such as the nonlinear Burgers equation, the nonlinear Laplace equation, and others. We also did not cover results related to the existence of solutions depending on a parameter for the Van der Pol differential equation, the Duffing equation, and others. Other results not covered in this paper include applications of the theory of p-regularity to the analysis of nonlinear dynamical systems and optimality conditions for optimal control problems in the nonregular (degenerate) case. Based on the theory of p-regularity, one can also develop a theory of so-called p-convexity, which can be effective for the analysis of nonlinear problems. Additional information can be found in other studies by the authors.

Author Contributions

Conceptualization, O.B., A.P. and A.A.T.; Methodology, E.B., O.B., A.P. and A.A.T.; Validation, O.B., K.L., A.P. and A.A.T.; Formal analysis, E.B., O.B., K.L., A.P. and A.A.T.; Writing—original draft, E.B., O.B., K.L., A.P. and A.A.T.; Writing—review and editing, E.B., O.B., K.L. and A.A.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author(s).

Acknowledgments

We would like to thank the reviewers for their careful and detailed reading of our manuscript. We are sincerely grateful for their constructive comments, insightful suggestions, and the time and effort they dedicated to evaluating our work. Their feedback has been invaluable in helping us improve the clarity, quality, and overall presentation of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ioffe, A.D. Variational Analysis of Regular Mappings. Theory and Applications, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
  2. Banach, S. Théorie des Opérations Linéaires; Monografie Matematyczne, Warszawa 1932: English Translation; North Holland: Amsterdam, The Netherlands, 1987. [Google Scholar]
  3. Dontchev, A.L. Lectures on variational analysis. In Applied Mathematical Sciences; Springer: Cham, Switzerland, 2021; Volume 205, p. xii+219. [Google Scholar] [CrossRef]
  4. Graves, L. Some mapping theorems. Duke Math. J. 1950, 17, 111–114. [Google Scholar] [CrossRef]
  5. Dontchev, A.L. The Graves theorem revisited. J. Convex Anal. 1996, 3, 45–54. [Google Scholar]
  6. Dontchev, A.L.; Bauschke, H.H.; Burachik, R.; Luke, D.R. The Inverse Function Theorems of L. M. Graves. In Splitting Algorithms, Modern Operator Theory, and Applications; Springer International Publishing: Berlin/Heidelberg, Germany, 2019. [Google Scholar] [CrossRef]
  7. Dmitruk, A.V.; Milyutin, A.A.; Osmolovskii, N.P. Lyusternik’s Theorem and the Theory of Extrema. Russ. Math. Surv. 1980, 35, 11–51. [Google Scholar] [CrossRef]
  8. Dontchev, A.L.; Rockafellar, R.T. Implicit Functions and Solution Mappings, 2nd ed.; Springer Series in Operations Research and Financial Engineering; Springer: New York, NY, USA, 2014. [Google Scholar] [CrossRef]
  9. Dontchev, A.L.; Frankowska, H. Lyusternik-Graves theorem and fixed points I. Proc. Am. Math. Soc. 2011, 139, 521–534. [Google Scholar] [CrossRef]
  10. Dontchev, A.L.; Frankowska, H. Lyusternik-Graves theorem and fixed points II. J. Convex Anal. 2012, 19, 955–973. [Google Scholar]
  11. Frankowska, H. High order inverse function theorems. Ann. Inst. H. Poincaré C Anal. Non Linéaire 1989, 6, 283–303. [Google Scholar] [CrossRef]
  12. Ekeland, I. An inverse function theorem in Fréchet spaces. Ann. Inst. H. Poincaré C Anal. Non Linéaire 2011, 28, 91–105. [Google Scholar] [CrossRef]
  13. Hamilton, R. The inverse function theorem of Nash and Moser. Bull. Am. Math. Soc. 1982, 7, 65–222. [Google Scholar] [CrossRef]
  14. Bednarczuk, E.M.; Leśniewski, K.W.; Rutkowski, K.E. On tangent cone to systems of inequalities and equations in Banach spaces under relaxed constant rank condition. ESAIM Control Optim. Calc. Var. 2021, 27, 22. [Google Scholar] [CrossRef]
  15. Tret’yakov, A.A. Necessary conditions for optimality of p-th order. In Control and Optimization; Moscow State University: Moscow, Russia, 1983; pp. 28–35. [Google Scholar]
  16. Tret’yakov, A.A. Necessary and sufficient conditions for optimality of p-th order. USSR Comput. Math. Math. Phys. 1984, 24, 123–127. [Google Scholar] [CrossRef]
  17. Buchner, M.; Marsden, J.; Schecter, S. Applications of the blowing-up construction and algebraic geometry to bifurcation problems. J. Differ. Equ. 1983, 48, 404–433. [Google Scholar] [CrossRef]
  18. Fink, J.; Rheinboldt, W. A geometric framework for the numerical study of singular points. SIAM J. Numer. Anal. 1987, 24, 618–633. [Google Scholar] [CrossRef]
  19. Krantz, S.; Parks, H. The Implicit Function Theorem: History, Theory, and Applications; Birkhäuser: Boston, MA, USA; Basel, Switzerland; Berlin, Germany, 2002. [Google Scholar]
  20. Dieudonné, J. Foundations of Modern Analysis; Pure and Applied Mathematics; Academic Press: New York, NY USA; London, UK, 1969; Volume 10-I. [Google Scholar]
  21. Izmailov, A.F.; Tret’yakov, A.A. Factor-Analysis of Nonlinear Mappings, 1st ed.; Nauka: Moscow, Russia, 1994. [Google Scholar]
  22. Tret’yakov, A.A.; Marsden, J.E. Factor–analysis of nonlinear mappings: p-regularity theory. Commun. Pure Appl. Anal. 2003, 2, 425–445. [Google Scholar]
  23. Abraham, R.; Marsden, J.E.; Ratiu, T. Manifolds, Tensor Analysis and Applications; Springer: New York, NY, USA, 1988. [Google Scholar]
  24. Tret’yakov, A.A. The implicit function theorem in degenerate problems. Russ. Math. Surv. 1987, 42, 179–180. [Google Scholar] [CrossRef]
  25. Alekseev, V.M.; Tihomirov, V.M.; Fomnin, S.V. Optimal Control; Nauka: Moscow, Russia, 1979. [Google Scholar]
  26. Bühler, T.; Salamon, D.A. Functional Analysis; Springer: Berlin, Germany, 1991. [Google Scholar]
  27. Kantorovitch, L.V.; Akilov, G.P. Functional Analysis; Pergamon Press: Oxford, UK, 1982. [Google Scholar]
  28. Ioffe, A.D.; Tihomirov, V.M. Theory of extremal problems. Stud. Math. Its Appl. 1979, 6, 1–460. [Google Scholar]
  29. Clarke, F.H. Optimization and Nonsmooth Analysis; Wiley: New York, NY, USA, 1983. [Google Scholar]
  30. Liusternik, L.A.; Sobolev, V.J. Elements of Functional Analysis; Hindustan Publishing: Delhi, India, 1961. [Google Scholar]
  31. Dontchev, A.L.; Levis, A.S.; Rockafellar, R.T. The radius of metric regularity. Trans. Am. Math. Soc. 2002, 355, 493–517. [Google Scholar] [CrossRef]
  32. Ioffe, A.D. Metric regularity and subdifferential calculus. Variational Analysis of Regular Mappings. Theory and Applications. Russ. Math. Surv. 2000, 55, 501–558. [Google Scholar] [CrossRef]
  33. Sekiguchi, Y.; Takahashi, W. Tangent and normal vectors to feasible regions with geometrically derivable sets. In Scientiae Mathematicae Online; Japanese Association of Mathematical Sciences: Sakai, Osaka, Japan, 2006; pp. 517–527. [Google Scholar]
  34. Ledzewicz, U.; Schättler, H. A higher order generalization of the Lusternik theorem. Nonlinear Anal. 1998, 34, 793–815. [Google Scholar] [CrossRef]
  35. Izmailov, A.F. Theorems on the representation of nonlinear mapping families and implicit function theorems. Math. Notes 2000, 67, 45–54. [Google Scholar] [CrossRef]
  36. Izmailov, A.F. Some generalizations of the Morse lemma. Tr. Mat. Inst. Steklov (Proc. Steklov Inst. Math.) 1998, 220, 142–156. [Google Scholar]
  37. Brezhneva, O.A.; Tret’yakov, A.A.; Marsden, J.E. Higher-order implicit function theorems and degenerate nonlinear boundary-value problems. Commun. Pure Appl. Anal. 2008, 7, 293–315. [Google Scholar] [CrossRef]
  38. Randrianantoanina, B.; Randrianantoanina, N. (Eds.) Banach Spaces and Their Applications in Analysis. In Honor of Nigel Kalton’s 60th Birthday; De Gruyter: Berlin, Germany; Boston, MA, USA, 2007. [Google Scholar] [CrossRef]
  39. Kostina, E.; Kostyukova, O. Generalized implicit function theorem and its application to parametric optimal control problems. J. Math. Anal. Appl. 2006, 320, 736–756. [Google Scholar] [CrossRef]
  40. Arutyunov, A.V.; Zhukovskiy, S.E. Nonlocal Generalized Implicit Function Theorems in Hilbert Spaces. Differ. Equ. 2020, 56, 1525–1538. [Google Scholar] [CrossRef]
  41. Arutyunov, A.V. On Implicit Function Theorems at Abnormal Points. Proc. Steklov Inst. Math. 2010, 271, 18–27. [Google Scholar] [CrossRef]
  42. Arutyunov, A.V.; Salikhova, K. Implicit Function Theorem in a Neighborhood of an Abnormal Point. Proc. Steklov Inst. Math. 2021, 315, 19–26. [Google Scholar] [CrossRef]
  43. Szczepanik, E.; Prusińska, A.; Tret’yakov, A.A. The p-factor method for nonlinear optimization. Schedae Informaticae 2012, 21, 143–159. [Google Scholar]
  44. Buhmiler, S.; Krejič, N.; Lužanin, Z. Practical quasi-Newton algorithms for singular nonlinear systems. Numer. Algor. 2010, 55, 481–502. [Google Scholar] [CrossRef]
  45. Brezhneva, O.A.; Evtushenko, Y.G.; Tret’yakov, A.A. The 2-factor-method with a modified Lagrange function for degenerate constrained optimization problems. Dokl. Math. 2006, 73, 384–387. [Google Scholar] [CrossRef]
  46. Candelario, G.; Cordero, A.; Torregrosa, J.R.; Vassileva, M.P. Generalized conformable fractional Newton-type method for solving nonlinear systems. Numer. Algor. 2023, 93, 1171–1208. [Google Scholar] [CrossRef]
  47. Bonnans, J.F.; Shapiro, A. Perturbation Analysis of Optimization Problems; Springer: New York, NY, USA; Berlin/Heidelberg, Germany, 2000. [Google Scholar]
  48. Avakov, E.R.; Magaril-Il'yaev, G.G.; Tikhomirov, V.M. The level set of a smooth mapping in a neighborhood of a singular point, and zeros of a quadratic mapping. Russ. Math. Surv. 2013, 68, 401–433. [Google Scholar] [CrossRef]
  49. Brezhneva, O.A.; Tret’yakov, A.A. Optimality conditions for degenerate extremum problems with equality constraints. SIAM J. Control Optim. 2003, 42, 723–745. [Google Scholar] [CrossRef]
  50. Constantin, E. Necessary conditions for weak minima and for strict minima of order two in nonsmooth constrained multiobjective optimization. J. Glob. Optim. 2021, 80, 177–193. [Google Scholar] [CrossRef]
  51. Gfrerer, H. Second-order necessary conditions for nonlinear optimization problems with abstract constraints: The degenerate case. SIAM J. Optim. 2007, 18, 589–612. [Google Scholar] [CrossRef]
  52. Evtushenko, Y.G. Generalized Lagrange multiplier technique for nonlinear programming. J. Optim. Theory Appl. 1977, 21, 121–135. [Google Scholar] [CrossRef]
  53. Bertsekas, D.P. Nonlinear Programming; Athena Scientific: Belmont, MA, USA, 2016. [Google Scholar]
  54. Antipin, A.S.; Vasilieva, O.O. Dynamic method of multipliers in terminal control. Comput. Math. Math. Phys. 2015, 55, 766–787. [Google Scholar] [CrossRef]
  55. Zhiltsov, A.V.; Namm, R.V. The Lagrange multiplier method in a finite-dimensional convex programming problem. Dalnevost. Mat. Zh. 2015, 15, 53–60. [Google Scholar]
  56. Brezhneva, O.A.; Tret’yakov, A.A. When the Karush–Kuhn–Tucker Theorem Fails: Constraint Qualifications and Higher-Order Optimality Conditions for Degenerate Optimization Problems. J. Optim. Theory Appl. 2017, 174, 367–387. [Google Scholar] [CrossRef]
  57. Bradley, J.S.; Everitt, W.N. Inequalities Associated with Regular and Singular Problems in the Calculus of Variations. Trans. Am. Math. Soc. 1973, 182, 303–321. [Google Scholar] [CrossRef]
  58. Konjik, S.; Kunzinger, M.; Oberguggenberger, M. Foundations of the calculus of variations in generalized function algebras. Acta Appl. Math. 2008, 103, 169–199. [Google Scholar] [CrossRef]
  59. Lecke, A.; Baglini, L.L.; Giordano, P. The classical theory of calculus of variations for generalized functions. Adv. Nonlinear Anal. 2019, 8, 779–808. [Google Scholar] [CrossRef]
  60. Gelfand, I.M.; Fomin, S.V. Calculus of Variations; Prentice-Hall: Englewood Cliffs, NJ, USA, 1963. [Google Scholar]
  61. Sivaloganathan, J. Singular minimisers in the Calculus of Variations: A degenerate form of cavitation. Ann. L’Institut Henri Poincaré C Anal. Non Linéaire 1992, 9, 657–681. [Google Scholar] [CrossRef]
  62. Tuckey, C. Nonstandard Methods in the Calculus of Variations; Pitman Research Notes in Math. Ser. 297; Longman Scientific & Technical: Harlow, UK, 1993. [Google Scholar]
  63. Prusińska, A.; Szczepanik, E.; Tret’yakov, A.A. p-th order optimality conditions for singular Lagrange problem in calculus of variations. Elements of p-regularity theory. In System Modeling and Optimization. CSMO 2011. IFIP Advances in Information and Communication Technology; Springer: Berlin/Heidelberg, Germany, 2013; Volume 391, pp. 528–537. [Google Scholar]
  64. Bellman, R.; Adomian, G. The Euler-Lagrange Equations and Characteristics; Springer: Dordrecht, The Netherlands, 1985; pp. 36–58. [Google Scholar] [CrossRef]
  65. Korneva, I.T.; Tret’yakov, A.A. Application of the factor-analysis to the calculus of variations. In Proceedings of Simulation and Analysis in Problems of Decision Making Theory; Computing Center of the Russian Academy of Sciences: Moscow, Russia, 2002; pp. 144–162. (In Russian) [Google Scholar]
  66. Cabada, A.; Aleksić, S.; Tomović, T.V.; Dimitrijević, S. Existence of Solutions of Nonlinear and Non-local Fractional Boundary Value Problems. Mediterr. J. Math. 2019, 16, 119. [Google Scholar] [CrossRef]
  67. Bobisud, L.E. Existence of solutions for nonlinear singular boundary value problems. Appl. Anal. 1990, 35, 43–57. [Google Scholar] [CrossRef]
  68. Zhang, G.; Cheng, S.S. Existence of solutions for a nonlinear system with a parameter. J. Math. Anal. Appl. 2006, 314, 311–319. [Google Scholar] [CrossRef]
  69. Reddien, G.W. On Newton’s method for singular problems. SIAM J. Numer. Anal. 1978, 15, 993–996. [Google Scholar] [CrossRef]
  70. Demidovitch, B.P.; Maron, I.A. Computational Mathematics; Nauka: Moscow, Russia, 1973. [Google Scholar]
  71. Moore, R.E. A Test for Existence of Solutions to Nonlinear Systems. SIAM J. Numer. Anal. 1977, 14, 611–615. [Google Scholar] [CrossRef]
  72. Prusińska, A.; Tret’yakov, A.A. On the Existence of Solutions to Nonlinear Equations Involving Singular Mappings with Non-zero p-Kernel. Set-Valued Anal. 2011, 19, 399–416. [Google Scholar] [CrossRef]
  73. Bradie, B. A Friendly Introduction to Numerical Analysis; Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2006. [Google Scholar]
  74. Burden, R.L.; Faires, J.D. Numerical Analysis; Thomson Brooks/Cole: Pacific Grove, CA, USA, 2005. [Google Scholar]