1. Introduction
A dual number is a number of the form $z = a + b\varepsilon$, where $a, b$ are real numbers and $\varepsilon$ is a special element with the property $\varepsilon \neq 0$, $\varepsilon^2 = 0$. Thus, the set of dual numbers $\mathbb{D}$ can be formally described as follows:
$$\mathbb{D} = \{\, a + b\varepsilon : a, b \in \mathbb{R},\ \varepsilon^2 = 0,\ \varepsilon \neq 0 \,\}.$$
Dual numbers were introduced and developed mainly to simplify the mathematical treatment of infinitesimal quantities, particularly for differentiation and geometric transformations. In standard arithmetic involving real or complex numbers, one does not encounter a non-zero number whose square equals zero, as these number systems are integral domains: the condition $ab = 0$ necessitates that either $a = 0$ or $b = 0$. However, in the realm of algebra, structures need not adhere to integral domain properties. When engaging with rings that incorporate zero-divisors, the aforementioned cancellation law fails to hold, allowing for the existence of elements $a \neq 0$ such that $a^2 = 0$. Such elements are classified as nilpotent (specifically of index 2 in this context).
The following are concrete examples:
- (1)
The non-zero element $2$ with $2^2 \equiv 0 \pmod{4}$ in $\mathbb{Z}/4\mathbb{Z}$;
- (2)
The non-zero element $\varepsilon$ with $\varepsilon \neq 0$ but $\varepsilon^2 = 0$ in the dual numbers $\mathbb{R}[\varepsilon]/(\varepsilon^2)$;
- (3)
The non-zero matrix $N = \left(\begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix}\right)$ with $N^2 = 0$ in the ring of $2 \times 2$ matrices;
- (4)
The non-zero residue class of $x$ with $x^2 = 0$ in the quotient coordinate ring $k[x]/(x^2)$.
Each of these rings has zero-divisors, so nilpotent elements can appear. An element $a$ is nilpotent if $a^n = 0$ for some $n \geq 1$. Index 2 is the simplest case ($a^2 = 0$). All non-zero nilpotents are zero-divisors. Nilpotents live in the nilradical of a ring and behave like algebraic infinitesimals. For example, the dual numbers turn up in differential geometry: $\varepsilon$ models a first-order infinitesimal displacement because every higher-order term vanishes.
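The concrete examples above can be verified mechanically; a small Python sketch (the helper name `mul_mod_x2` is illustrative):

```python
# (1) 2^2 = 4 ≡ 0 (mod 4) in Z/4Z: a non-zero element squaring to zero.
assert (2 * 2) % 4 == 0

# (3) A non-zero 2x2 matrix N with N^2 = 0.
N = [[0, 1], [0, 0]]
N2 = [[sum(N[i][k] * N[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
assert N2 == [[0, 0], [0, 0]]

# (2)/(4) The residue class of x in R[x]/(x^2): represent a + b*x as (a, b);
# multiplication drops the x^2 term, so x * x = 0 in the quotient.
def mul_mod_x2(p, q):
    a, b = p
    c, d = q
    return (a * c, a * d + b * c)  # the b*d term multiplies x^2 = 0

x = (0, 1)  # the residue class of x
assert mul_mod_x2(x, x) == (0, 0)
```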
Schemes allow nilpotents to encode embedded points and tangent directions. The coordinate ring $k[x]/(x^2)$ describes a double point: two copies of a point glued together infinitesimally. Dual (or hyper-dual) numbers underpin automatic differentiation: for $f(x + \varepsilon)$, one keeps only the coefficient of $\varepsilon$ because $\varepsilon^2 = 0$. Nilpotent matrices classify Jordan blocks and control the structure of linear operators. The existence of a non-zero element with square zero is not a paradox; it simply tells you that the algebraic system you are working in is not an integral domain. Once zero-divisors are allowed, cancellation fails, and nilpotent elements, self-annihilating numbers, can and do appear.
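The dual-number mechanism behind forward-mode automatic differentiation can be sketched in a few lines; the `Dual` class and the `d_sin` helper below are illustrative names, not a fixed API:

```python
import math

class Dual:
    """a + b*eps with eps**2 = 0: carries a value and its derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1 eps)(a2 + b2 eps) = a1 a2 + (a1 b2 + b1 a2) eps
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
    __rmul__ = __mul__

def d_sin(z):
    # sin(a + b eps) = sin(a) + cos(a) b eps, since eps^2 = 0
    return Dual(math.sin(z.a), math.cos(z.a) * z.b)

# f(x) = x * sin(x); seeding the eps-coefficient with 1 yields f'(x).
x = Dual(2.0, 1.0)
fx = x * d_sin(x)
# Exact derivative: f'(x) = sin(x) + x cos(x)
assert abs(fx.b - (math.sin(2.0) + 2.0 * math.cos(2.0))) < 1e-12
```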
In 1873, Clifford [1] introduced dual numbers as infinitesimal quantities (Clifford algebra) with $\varepsilon^2 = 0$, studying kinematics and differential geometry. Kim and Shon [2] applied dual numbers to geometry and kinematics, particularly in dual quaternions, to simplify spatial transformations. Kim et al. [3] develop a rigorous Taylor-series calculus for functions taking values in the dual quaternion algebra with a central nilpotent unit. They specify differentiability compatible with the quaternionic structure and prove a Taylor expansion theorem: around any point where the function is (real/quaternionic) analytic, it admits a convergent series whose coefficients are uniquely determined by higher-order derivatives of its primal and dual parts. Dual numbers have extensive applications in robotics, rigid-body motion, and screw theory, effectively representing rotations and translations compactly. Dual numbers became fundamental in forward-mode Automatic Differentiation (AD): if $f$ is differentiable at $x$, then automatic differentiation uses dual numbers, $f(x + \varepsilon) = f(x) + f'(x)\varepsilon$, directly extracting derivatives. Rall [4] analyzes time–memory complexity, checkpointing, and handling of loops, conditionals, and intrinsic functions, contrasting AD with finite differences and symbolic differentiation (AD avoids expression swell). Rall also treats higher-order derivatives (Hessian-vector products, mixed modes, Taylor propagation) and practical implementation routes. Squire and Trapp [5] contrast the complex-step approach with forward/central finite differences (which suffer truncation/roundoff trade-offs) and demonstrate straightforward implementation by enabling complex arithmetic in existing codes. They discuss extensions (directional derivatives, higher-order formulas) alongside limitations, and advocate the method as an alternative to finite differences that is simple, stable, and highly accurate for derivative estimation. Walther [6] presents a unified framework for computing second- and higher-order derivatives of code-defined functions using automatic differentiation (AD), emphasizing techniques and trade-offs relevant to large-scale simulation. Walther systematizes how to obtain Hessians, third- and fourth-order tensors, and derivative actions (e.g., Hessian-vector or tensor-vector products).
Quaternions extend complex numbers to four dimensions, providing a powerful, noncommutative algebra to represent rotations and orientations efficiently in three-dimensional space. A quaternion is a number of the form $q = a + bi + cj + dk$, where $a, b, c, d \in \mathbb{R}$, and the imaginary units $i, j, k$ follow these fundamental rules:
$$i^2 = j^2 = k^2 = ijk = -1.$$
The multiplication rules for quaternions are as follows:
$$ij = k, \quad jk = i, \quad ki = j, \quad ji = -k, \quad kj = -i, \quad ik = -j.$$
Thus, quaternions form a noncommutative, associative algebra denoted by the following:
$$\mathbb{H} = \{\, a + bi + cj + dk : a, b, c, d \in \mathbb{R} \,\}.$$
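The multiplication table above can be encoded directly; a small Python sketch, assuming a tuple layout `(a, b, c, d)` for $a + bi + cj + dk$:

```python
def qmul(p, q):
    """Hamilton product of quaternions p = (a1,b1,c1,d1), q = (a2,b2,c2,d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)
assert qmul(i, j) == k               # ij = k
assert qmul(j, i) == (0, 0, 0, -1)   # ji = -k: noncommutativity
assert qmul(i, i) == (-1, 0, 0, 0)   # i^2 = -1
```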
Hamilton [7] introduced quaternions in 1843, aiming to generalize complex numbers to three dimensions; however, he found a consistent system only in four dimensions. Maxwell, Gibbs, and Heaviside initially used quaternions to describe rotations and electromagnetism; they were later supplanted by vector calculus (Gibbs–Heaviside vector analysis). Since the 20th century, quaternions have seen renewed interest in robotics, computer graphics, and aerospace engineering due to their compactness and efficiency in representing rotations in three-dimensional space, making them essential to modern technologies. Szirmay-Kalos [8] develops the algebraic framework for higher-order duals, proving closure under composition and giving recurrence formulas for propagating derivatives through addition, multiplication, division, and standard intrinsic functions. Szirmay-Kalos also presents seeding strategies for multivariate functions to recover mixed partials/Hessian tensors, including multidirectional seeds.
Hyper-dual numbers extend dual numbers by introducing a second nilpotent element, allowing for the efficient and precise calculation of both first- and second-order derivatives simultaneously. This algebraic structure is crucial to advanced computational techniques, especially in automatic differentiation and optimization. A hyper-dual number is defined as an extension of real numbers using two distinct nilpotent elements, $\varepsilon_1$, $\varepsilon_2$, each satisfying the following:
$$\varepsilon_1^2 = \varepsilon_2^2 = (\varepsilon_1\varepsilon_2)^2 = 0, \qquad \varepsilon_1 \neq 0, \quad \varepsilon_2 \neq 0, \quad \varepsilon_1\varepsilon_2 = \varepsilon_2\varepsilon_1 \neq 0.$$
Thus, a hyper-dual number $z$ is expressed as follows:
$$z = a_0 + a_1\varepsilon_1 + a_2\varepsilon_2 + a_3\varepsilon_1\varepsilon_2, \qquad a_0, a_1, a_2, a_3 \in \mathbb{R}.$$
Formally, the algebra of hyper-dual numbers $\mathbb{HD}$ can be expressed as follows:
$$\mathbb{HD} = \{\, a_0 + a_1\varepsilon_1 + a_2\varepsilon_2 + a_3\varepsilon_1\varepsilon_2 : a_0, a_1, a_2, a_3 \in \mathbb{R} \,\}.$$
Hyper-dual numbers can be seen as an extension of dual numbers designed explicitly to simplify the computation of second-order derivatives without truncation error or numerical approximation. In automatic differentiation (AD), derivatives, especially second-order ones, are computed accurately without numerical approximations or round-off errors. In optimization, algorithms rely on accurate gradient and Hessian calculations. In computational fluid dynamics (CFD), efficient sensitivity analysis is essential for aerodynamic design. The computation of derivatives lies at the heart of many branches of mathematics, including analysis, differential geometry, and numerical methods. In applied contexts, derivatives play a crucial role in optimization, sensitivity analysis, and the numerical solution of differential equations. Traditionally, derivatives are computed either symbolically, which can be algebraically complex and computationally intensive, or numerically, using finite difference approximations, which are susceptible to truncation and rounding errors. Fike and Alonso [9] show how standard operators (addition, multiplication, division, composition, exp/log/trig, etc.) should be overloaded to preserve second-order accuracy, and contrast the approach with finite differences and complex-step methods, highlighting the absence of truncation error and strong numerical stability. Griewank and Walther [10] provide the standard, mathematically rigorous introduction to algorithmic differentiation (AD). They develop AD from first principles, the chain rule applied to a program’s computational graph, and show how to obtain exact (to working precision) derivatives of functions defined by code. Imoto et al. [11] develop a rigorous bridge between hyper-dual arithmetic and matrix calculus for automatic differentiation. They introduce a faithful matrix representation of hyper-dual numbers and prove a fundamental theorem: the representation is an algebra homomorphism that preserves addition, multiplication, and composition with elementary functions.
To address these challenges, alternative algebraic structures such as dual numbers have been introduced. Dual numbers, defined by the relation $\varepsilon^2 = 0$ for a nilpotent element $\varepsilon \neq 0$, provide a first-order Taylor expansion mechanism that enables exact computation of first derivatives through arithmetic overloading. While powerful, classical dual numbers are inherently limited to first-order differentiation and do not extend naturally to higher-order or mixed partial derivatives in multivariable settings. To overcome these limitations, we consider the algebra of hyper-dual numbers, a higher-order generalization of dual numbers capable of encoding second- and higher-order derivative information. Hyper-dual numbers incorporate multiple nilpotent generators satisfying appropriate commutation and annihilation properties, thereby allowing for the propagation of both pure and mixed partial derivatives in a structured and exact manner. Building upon the well-known Complex-Step derivative technique (for first derivatives), Millwater et al. [12] construct a framework where second-order partial derivatives and Hessian components can be extracted accurately without subtractive cancellation errors. They achieve this by evaluating a target function on specially constructed dual-complex inputs and using the algebra to isolate higher-order derivative terms. Peón-Escalante et al. [13] derive compact closed-form higher-order kinematic formulas via repeated dual-number evaluations and mixed seeding strategies, show how to assemble the Jacobians/Hessians needed for sensitivity and synthesis, and present efficient algorithms that slot into existing analysis pipelines with minimal code changes. Their paper develops a systematic framework to obtain higher-order kinematic quantities by using dual-number arithmetic.
In this paper, we develop a rigorous mathematical framework for computing higher-order derivatives using hyper-dual numbers. We begin by formally defining hyper-dual numbers and their algebraic properties, including their role in higher-order Taylor expansions. We then construct hyper-dual extensions of real-valued functions and show how standard calculus rules—such as the chain rule and product rule—naturally emerge within this algebraic setting. The main contributions of this paper are as follows: In
Section 2, we provide a detailed algebraic construction of hyper-dual numbers suitable for encoding second- and higher-order derivative information. In
Section 3, we establish the consistency of hyper-dual arithmetic with classical differential calculus, including partial and mixed derivative formulations. In
Section 4, we demonstrate that this approach enables exact and symbol-free computation of derivatives for smooth functions, offering theoretical advantages over both finite difference methods and symbolic differentiation.
This paper situates hyper-dual number calculus within a broader mathematical context, bridging ideas from differential algebra, automatic differentiation, and multivariable calculus. The proposed framework not only clarifies the theoretical underpinnings of hyper-dual differentiation but also lays the foundation for further developments in algorithmic differentiation and functional analysis.
2. Complex-Step Approximation
Generalized complex numbers are algebraic extensions of real numbers that introduce an imaginary unit whose square can equal any real number ($j^2 = p$, $p \in \mathbb{R}$). They unify various algebraic and geometric ideas, improving mathematical analysis across multiple science and engineering fields. A generalized complex number extends the classical complex number system. It is usually defined by introducing a new imaginary unit $j$ that satisfies $j^2 = p$, where $p \in \mathbb{R}$. Thus, generalized complex numbers (also called hypercomplex numbers or split-complex numbers, depending on $p$) are represented in the following form:
$$z = a + bj, \qquad a, b \in \mathbb{R}.$$
Depending on the choice of $p$, generalized complex numbers yield different algebraic structures: complex numbers ($p = -1$) have the standard imaginary unit $i$ with $i^2 = -1$; dual numbers ($p = 0$) contain a nilpotent unit $\varepsilon$ with $\varepsilon^2 = 0$; split-complex (hyperbolic) numbers ($p = +1$) have a unit $j$ with hyperbolic geometry, $j^2 = +1$, $j \neq \pm 1$.
In 1848, Cockle [14] introduced the term Tessarines, generalized numbers with imaginary units whose squares may differ from $-1$. Clifford [1] introduced algebras now called Clifford algebras, which generalize complex numbers and quaternions and include dual and split-complex numbers. Gentili et al. [15] and others [16,17] explored various hypercomplex systems, including dual numbers and split-complex numbers. Generalized complex numbers are applied in geometry, relativity, quantum theory, optimization, and signal processing. Thus, generalized complex numbers originated from a desire to generalize algebraic structures and facilitate applications in mathematics and physics.
Calculating derivatives is an essential part of mathematical analysis and numerical methods, and various approaches exist to accomplish this, each with its own advantages and disadvantages. The finite-difference method is one of the most common techniques for numerical differentiation. It estimates the derivative of a function using values at discrete points. Specifically, it uses the difference between function values at these points divided by the change in the input variable. The finite-difference method can be further divided into forward, backward, and central differences, each offering different levels of accuracy. These formulas are derived from the Taylor Series:
$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \cdots$$
While this method is relatively simple to implement and understand, it may encounter issues with truncation errors and requires careful selection of the step size to balance accuracy and numerical stability. Forward Difference—First-Order Approximation:
$$f'(x) \approx \frac{f(x + h) - f(x)}{h}.$$
Central Difference—Second-Order Approximation:
$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}.$$
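The two difference formulas and their step-size trade-off can be checked numerically; a small Python sketch (the test function exp and the step sizes are arbitrary choices):

```python
import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h          # O(h) truncation error

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2*h)  # O(h^2) truncation error

f, x = math.exp, 1.0
exact = math.exp(1.0)                     # d/dx e^x = e^x

# At the same moderate step, the central formula is more accurate.
h = 1e-4
assert abs(central_diff(f, x, h) - exact) < abs(forward_diff(f, x, h) - exact)

# At a very small step, roundoff dominates and accuracy degrades again,
# illustrating why the step size must be chosen carefully.
tiny = 1e-13
assert abs(central_diff(f, x, tiny) - exact) > abs(central_diff(f, x, h) - exact)
```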
Also, the Complex-Step Approximation technique provides a more precise alternative to traditional finite-difference methods by applying a small imaginary perturbation to the input variable. Taylor Series with an imaginary step:
$$f(x + ih) = f(x) + ih f'(x) - \frac{h^2}{2} f''(x) - \frac{ih^3}{6} f'''(x) + \cdots$$
First-Derivative Approximation:
$$f'(x) \approx \frac{\operatorname{Im}[f(x + ih)]}{h}.$$
The derivative is obtained from the imaginary part of the function evaluated at this complex point. A key advantage of the Complex-Step method is that it offers much higher accuracy without the subtraction errors commonly seen in finite differences. It is also computationally efficient for problems where function evaluations are not overly costly. However, its implementation can be limited by the nature of the function being differentiated, especially if it is not easily computed for complex numbers.
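Because no subtraction is involved, the step can be taken near machine precision; the formula is easy to exercise in Python (the test function is an arbitrary smooth choice):

```python
import cmath
import math

def complex_step_derivative(f, x, h=1e-20):
    # f'(x) ≈ Im[f(x + i h)] / h: no subtractive cancellation, so h can be tiny.
    return f(complex(x, h)).imag / h

f = lambda z: cmath.exp(z) / cmath.sqrt(z)   # must accept complex input
x = 1.5
# Exact derivative of e^x / sqrt(x): e^x (1/sqrt(x) - 1/(2 x^(3/2)))
exact = math.exp(x) * (1.0 / math.sqrt(x) - 0.5 * x**-1.5)
assert abs(complex_step_derivative(f, x) - exact) < 1e-12 * abs(exact)
```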
The Complex-Step Approximation is an effective numerical method that provides several key benefits, especially for evaluating derivatives. One major advantage of this approach is its high resistance to subtractive cancellation error, a common problem in traditional numerical differentiation methods. This makes Complex-Step particularly useful for achieving accurate derivative estimates. One of the main features of the Complex-Step Approximation is its ability to use an arbitrarily small step size while still accurately computing first derivatives. This removes the usual difficulties in choosing an appropriate step size, which often requires heuristic adjustments to balance precision and stability. By using the complex extension of the input variable, the approximation can stay accurate without the risk of losing significant digits through subtractive cancellation. Apart from its high numerical accuracy, implementing the Complex-Step Approximation is quite easy. The mathematical basis is simple enough for practitioners to quickly apply it in various situations, making it a useful tool for those working in numerical analysis. However, an important question arises when considering higher-order derivative calculations, such as second derivatives. The question of whether the Complex-Step Approximation maintains its advantageous properties when used for second-derivative evaluations is critical for users looking to expand its application. Further investigation into this will determine whether the approximation continues to provide the same level of accuracy and resistance to numerical errors or if additional factors need to be considered.
The Complex-Step method is a useful numerical technique for approximating derivatives of functions. When examining the second derivative of a function $f$, this approach can be used to create more precise formulas. This method estimates the second derivative using two different complex steps instead of just a standard finite difference approach. It involves adding a small imaginary part to the input variable, which results in more accurate derivative calculations. However, despite this improvement, it is important to recognize that the derived formulas can still be affected by subtractive cancellation errors. This type of error happens when two close numerical values are subtracted, leading to a significant loss of precision in the result. Therefore, while the Complex-Step method reduces some numerical issues, practitioners must remain aware of the inherent risks of numerical instability in their calculations.
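One standard complex-step second-derivative formula follows from the Taylor series above, $f''(x) \approx 2\,(f(x) - \operatorname{Re}[f(x + ih)])/h^2$; its numerator contains exactly the kind of subtraction just discussed. A sketch demonstrating the resulting cancellation (test function and step sizes are illustrative):

```python
def complex_step_second_derivative(f, x, h):
    # f''(x) ≈ 2 (f(x) - Re[f(x + i h)]) / h^2
    # The subtraction reintroduces cancellation, so h cannot be made tiny,
    # unlike the first-derivative complex-step formula.
    return 2.0 * (f(x) - f(complex(x, h)).real) / h**2

f = lambda z: z**4
x = 1.0
exact = 12.0 * x**2                 # (x^4)'' = 12 x^2

ok = complex_step_second_derivative(f, x, 1e-4)
bad = complex_step_second_derivative(f, x, 1e-9)
assert abs(ok - exact) < 1e-5                 # moderate h: accurate
assert abs(bad - exact) > abs(ok - exact)     # tiny h: cancellation ruins it
```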
In numerical differentiation, the Complex Step Approximation effectively reduces the impact of subtractive cancellation errors. It does this by using the imaginary part of a complex number, where the first derivative becomes the dominant term. As a result, this method enables us to directly extract the derivative without relying on a difference quotient, which may be susceptible to significant errors due to cancellation. To elaborate on this method, it is suggested that in certain applications, the second derivative should be the primary term in the non-real part of a complex function. This means that by combining real and imaginary components in a structured way, we can achieve more accurate representations of derivatives. To thoroughly analyze and accurately model physical phenomena, it is often essential to obtain both the first and second derivatives of the relevant functions. These derivatives provide crucial insights into the behavior of a system. To effectively differentiate using the Complex Step Approximation, it is essential that each derivative—both first and second—serves as the leading term in its respective calculation. This guarantees that the primary contribution of interest is accurately reflected in the results. The suggested methodology emphasizes the use of a complex number framework, which enables the handling of multiple non-real components. This approach not only improves precision but also expands the scope of analysis in numerical methods by incorporating various aspects of the data involved.
In the context of multi-variable functions, calculating cross derivatives depends significantly on previously established results. This means that the accuracy of cross derivatives is directly tied to the precision of the initial calculations. As the complexity of the function increases, this can lead to compounded errors.
It is important to understand that errors in these calculations can accumulate. As each variable is modified and reevaluated, any inaccuracies may propagate through subsequent operations, negatively impacting the overall reliability of the results. To reduce the impact of cumulative errors in cross-derivative calculations, it is advisable to use perturbation techniques. Specifically, perturbations should be introduced for each variable independently, employing distinct non-real components. This method helps to isolate and minimize the influence of errors related to any specific variable. Ultimately, this method highlights the importance of incorporating various non-real components during the perturbation process. This approach can improve the robustness of our calculations and yield more accurate results when analyzing complex multi-variable functions.
The adjoint method is commonly used in optimization and when derivatives of functionals are needed. The adjoint method takes advantage of the structure of specific mathematical problems, especially those in fluid dynamics, structural optimization, and related fields. It involves solving a system of equations backward in time or space, enabling efficient calculation of derivatives with respect to multiple parameters simultaneously. The main benefit of the adjoint method is its efficiency in scenarios with many outputs linked to various inputs, which greatly lowers computational costs. However, it also requires a deeper understanding of the underlying mathematical structures and can be complex to implement for different problem types. Each of these methods balances accuracy, ease of implementation, and computational efficiency, making the choice of method depend on the specific requirements and constraints of the problem. Careful consideration of these factors is essential for obtaining reliable and efficient derivative calculations across different application areas.
4. Implementation of Hyper-Dual Numbers and Their Calculations
We introduce a new way of defining quaternions by relating them to the properties of the basis of dual numbers. Now, let us consider a different type of number, namely, a hyper-dual number $z$ such that
$$z = x + y_1\varepsilon_1 + y_2\varepsilon_2 + y_3\varepsilon_1\varepsilon_2, \qquad x, y_1, y_2, y_3 \in \mathbb{R},$$
satisfying
$$\varepsilon_1^2 = \varepsilon_2^2 = (\varepsilon_1\varepsilon_2)^2 = 0.$$
We define the properties of the basis that constitute the numbers. To eliminate the inconvenience caused by the non-commutativity of quaternion multiplication, the multiplication of the basis for the new number system is intentionally set to be commutative:
$$\varepsilon_1\varepsilon_2 = \varepsilon_2\varepsilon_1.$$
This setting creates a constraint on the possible values of $\varepsilon_1$ and $\varepsilon_2$. We now give the arithmetic operations. Consider two hyper-dual numbers:
$$z_1 = x_1 + y_{11}\varepsilon_1 + y_{12}\varepsilon_2 + y_{13}\varepsilon_1\varepsilon_2, \qquad z_2 = x_2 + y_{21}\varepsilon_1 + y_{22}\varepsilon_2 + y_{23}\varepsilon_1\varepsilon_2.$$
Here, $x$ is the function value, $y_1$, $y_2$ represent the first derivatives with respect to two independent variables, and $y_3$ represents the mixed second-order partial derivative.
Multiplication is defined as
$$z_1 z_2 = x_1 x_2 + (x_1 y_{21} + y_{11} x_2)\varepsilon_1 + (x_1 y_{22} + y_{12} x_2)\varepsilon_2 + (x_1 y_{23} + y_{11} y_{22} + y_{12} y_{21} + y_{13} x_2)\varepsilon_1\varepsilon_2.$$
The inverse is defined as
$$z^{-1} = \frac{1}{x} - \frac{y_1}{x^2}\varepsilon_1 - \frac{y_2}{x^2}\varepsilon_2 + \left(\frac{2 y_1 y_2}{x^3} - \frac{y_3}{x^2}\right)\varepsilon_1\varepsilon_2,$$
which only exists for $x \neq 0$. This provides a definition for the norm, denoted as $\lVert z \rVert = |x|$. It indicates that when making comparisons between values, we should focus exclusively on the real component. This means that the relationship $z_1 < z_2$ is equivalent to the comparison of their respective real parts, expressed as $x_1 < x_2$. By adhering to this approach, we can ensure that the code behaves consistently and follows the same execution path as any code that processes real-valued data, thereby maintaining predictable and accurate outcomes in our computations.
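This arithmetic can be implemented directly; a minimal Python sketch (the class name and method set are illustrative, not a fixed API):

```python
class HyperDual:
    """x + y1*e1 + y2*e2 + y3*e1*e2 with e1^2 = e2^2 = (e1 e2)^2 = 0."""
    def __init__(self, x, y1=0.0, y2=0.0, y3=0.0):
        self.x, self.y1, self.y2, self.y3 = x, y1, y2, y3

    def __add__(self, o):
        return HyperDual(self.x + o.x, self.y1 + o.y1,
                         self.y2 + o.y2, self.y3 + o.y3)

    def __mul__(self, o):
        # Expand the product and drop every term containing e1^2 or e2^2.
        return HyperDual(
            self.x * o.x,
            self.x * o.y1 + self.y1 * o.x,
            self.x * o.y2 + self.y2 * o.x,
            self.x * o.y3 + self.y1 * o.y2 + self.y2 * o.y1 + self.y3 * o.x)

    def inverse(self):
        if self.x == 0:
            raise ZeroDivisionError("inverse exists only for x != 0")
        x = self.x
        return HyperDual(1/x, -self.y1 / x**2, -self.y2 / x**2,
                         2*self.y1*self.y2 / x**3 - self.y3 / x**2)

    def __lt__(self, o):
        # Comparisons use only the real component.
        return self.x < o.x

z = HyperDual(2.0, 1.0, 1.0, 0.0)
w = z * z.inverse()   # should be the multiplicative identity
assert (w.x, w.y1, w.y2, w.y3) == (1.0, 0.0, 0.0, 0.0)
```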
Derivative Calculations on Hyper-Dual Numbers
This approach allows for the representation of functions in terms of their derivatives at a specific point, effectively capturing the behavior of the function around that point. Hyper-dual numbers, which extend the concept of dual numbers by introducing two infinitesimal components, enable a comprehensive analysis of higher-order derivatives. As such, the Taylor Series not only facilitates the approximation of differentiable functions but also enhances our understanding of their intricacies by incorporating the properties of hyper-dual numbers into the expansion process.
Theorem 1 (Exactness of First and Mixed Derivatives via Hyper-Dual Numbers).
Let $f : \mathbb{R} \to \mathbb{R}$ be a function of class $C^2$, and let
$$\hat{x} = x + h_1\varepsilon_1 + h_2\varepsilon_2 \in \mathbb{HD},$$
where $\mathbb{HD}$ denotes the set of second-order hyper-dual numbers. Then, evaluating $f$ at $\hat{x}$ yields the Taylor Series expansion
$$f(\hat{x}) = f(x) + h_1 f'(x)\varepsilon_1 + h_2 f'(x)\varepsilon_2 + h_1 h_2 f''(x)\varepsilon_1\varepsilon_2.$$
Proof. We use the second-order Taylor expansion
$$f(x + \delta) = f(x) + f'(x)\,\delta + \frac{1}{2} f''(x)\,\delta^2 + \cdots$$
Let $\delta = h_1\varepsilon_1 + h_2\varepsilon_2$; then $\delta^2 = 2 h_1 h_2 \varepsilon_1\varepsilon_2$ and $\delta^3 = 0$, so the expansion terminates exactly. Substituting into the expansion gives
$$f(\hat{x}) = f(x) + h_1 f'(x)\varepsilon_1 + h_2 f'(x)\varepsilon_2 + h_1 h_2 f''(x)\varepsilon_1\varepsilon_2.$$
This completes the proof. □
Proposition 1 (Algebraic Closure of Hyper-Dual Differentiation). Let $f, g : \mathbb{R} \to \mathbb{R}$ be smooth functions. Then the class of hyper-dual extended functions is closed under:
- (1)
Scalar multiplication: $(c f)(\hat{x}) = c \, f(\hat{x})$ for $c \in \mathbb{R}$;
- (2)
Addition: $(f + g)(\hat{x}) = f(\hat{x}) + g(\hat{x})$;
- (3)
Multiplication: $(f g)(\hat{x}) = f(\hat{x}) \, g(\hat{x})$;
- (4)
Composition (under certain conditions): $(g \circ f)(\hat{x}) = g(f(\hat{x}))$.
For instance, we have the following.
Example 1. Let $f$ and $g$ be as in Proposition 1. Then each closure property follows by expanding the hyper-dual components term by term.
We will explore the computation of derivatives for a function $f : \mathbb{R}^n \to \mathbb{R}$, where $\mathbf{x}$ is an $n$-dimensional vector in $\mathbb{R}^n$, expressed as $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Our focus will be on calculating the mixed second partial derivative $\frac{\partial^2 f}{\partial x_i \partial x_j}$ using hyper-dual numbers, which allows us to obtain derivatives with high precision using a single evaluation of the function. To begin, we define a perturbed vector $\hat{\mathbf{x}}$ as follows:
$$\hat{\mathbf{x}} = \mathbf{x} + h_1 \varepsilon_1 \mathbf{e}_i + h_2 \varepsilon_2 \mathbf{e}_j,$$
where $\varepsilon_1$ and $\varepsilon_2$ are infinitesimally small hyper-dual units, $h_1$ and $h_2$ are finite perturbations, and $\mathbf{e}_i$ and $\mathbf{e}_j$ are the standard basis vectors in $\mathbb{R}^n$. The term $\varepsilon_1\varepsilon_2$ effectively introduces a second-order mixed perturbation into the vector.
Utilizing this formulation, we can express the function evaluated at the perturbed vector $\hat{\mathbf{x}}$ as follows:
$$f(\hat{\mathbf{x}}) = f(\mathbf{x}) + h_1 \frac{\partial f}{\partial x_i}(\mathbf{x})\,\varepsilon_1 + h_2 \frac{\partial f}{\partial x_j}(\mathbf{x})\,\varepsilon_2 + h_1 h_2 \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x})\,\varepsilon_1\varepsilon_2.$$
This equation provides us with a comprehensive means to gather essential derivative information through a single evaluation of the function.
Specifically, from this single evaluation, we can extract the first derivatives:
$$\frac{\partial f}{\partial x_i}(\mathbf{x}) = \frac{\varepsilon_1\text{-part of } f(\hat{\mathbf{x}})}{h_1}, \qquad \frac{\partial f}{\partial x_j}(\mathbf{x}) = \frac{\varepsilon_2\text{-part of } f(\hat{\mathbf{x}})}{h_2},$$
and the mixed second partial derivative:
$$\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}) = \frac{\varepsilon_1\varepsilon_2\text{-part of } f(\hat{\mathbf{x}})}{h_1 h_2}.$$
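This single-evaluation extraction can be sketched end to end in Python; the minimal `HD` class and the test function $f(x_1, x_2) = x_1^2 x_2$ are illustrative assumptions, with step sizes $h_1 = h_2 = 1$:

```python
class HD:
    """x + y1*e1 + y2*e2 + y3*e1*e2 with e1^2 = e2^2 = 0."""
    def __init__(self, x, y1=0.0, y2=0.0, y3=0.0):
        self.x, self.y1, self.y2, self.y3 = x, y1, y2, y3
    def __mul__(self, o):
        return HD(self.x*o.x,
                  self.x*o.y1 + self.y1*o.x,
                  self.x*o.y2 + self.y2*o.x,
                  self.x*o.y3 + self.y1*o.y2 + self.y2*o.y1 + self.y3*o.x)

def f(x1, x2):
    # Illustrative target function: f(x1, x2) = x1^2 * x2
    return x1 * x1 * x2

# Perturb x1 along e1 and x2 along e2 (components of the perturbed vector).
x1 = HD(3.0, 1.0, 0.0, 0.0)
x2 = HD(2.0, 0.0, 1.0, 0.0)
out = f(x1, x2)

# One evaluation yields the value, both first partials, and the mixed partial.
assert out.x == 18.0    # f           = x1^2 x2      = 9 * 2
assert out.y1 == 12.0   # df/dx1      = 2 x1 x2      = 2 * 3 * 2
assert out.y2 == 9.0    # df/dx2      = x1^2         = 9
assert out.y3 == 6.0    # d2f/dx1dx2  = 2 x1         = 6
```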
As an illustrative example, consider a smooth function $f$ of a single variable. The evaluation of this function at a hyper-dual point can be expressed in terms of its Taylor expansion around a point $x$. Specifically, we can write the following:
$$f(x + h_1\varepsilon_1 + h_2\varepsilon_2) = f(x) + h_1 f'(x)\varepsilon_1 + h_2 f'(x)\varepsilon_2 + h_1 h_2 f''(x)\varepsilon_1\varepsilon_2,$$
where we take into account the values of the first and second derivatives of $f$ evaluated at $x$. The first derivative $f'(x)$ is associated with the infinitesimal perturbations $\varepsilon_1$ and $\varepsilon_2$, while the second derivative $f''(x)$ becomes relevant in the second-order term associated with $\varepsilon_1\varepsilon_2$.
Through this detailed analysis, we can see how hyper-dual numbers enable efficient calculations of derivatives, offering a powerful tool for both theoretical analysis and practical applications in various fields of mathematics and engineering.
In particular, we have the following.
Example 2. A smooth function $f$ can be evaluated at a hyper-dual argument as follows. For $\hat{x} = x + h_1\varepsilon_1 + h_2\varepsilon_2$, the Taylor Series becomes
$$f(\hat{x}) = f(x) + h_1 f'(x)\varepsilon_1 + h_2 f'(x)\varepsilon_2 + h_1 h_2 f''(x)\varepsilon_1\varepsilon_2.$$
This expression is exact, with no truncation error.
5. Conclusions
In this paper, we have developed a rigorous and unified framework for computing higher-order derivatives using hyper-dual numbers, extending the classical dual number system. By formalizing the algebraic structure of hyper-dual numbers and demonstrating their compatibility with multivariate Taylor expansions, we established a method for the exact evaluation of first and mixed second-order derivatives. Our main contributions include the definition and analysis of hyper-dual numbers as algebraic tools for automatic differentiation; exact derivative computation without symbolic manipulation or finite difference approximations; theoretical validation of correctness through algebraic closure, Taylor expansions, and chain rule propagation; and application potential in numerical optimization, sensitivity analysis, and machine learning. In contrast to traditional numerical differentiation techniques, the hyper-dual number approach provides machine-accurate derivative information in a stable and efficient manner, particularly suitable for algorithmic implementation in modern computational environments.
Building on the results presented here, several directions for further research and development are evident: the extension of the framework to compute third- or higher-order partial derivatives by introducing additional nilpotent components and formalizing the corresponding hyper-dual algebra; adapting the hyper-dual number methodology for differentiating vector- and matrix-valued functions $f : \mathbb{R}^n \to \mathbb{R}^m$, including Jacobian and Hessian tensors; embedding hyper-dual arithmetic into automatic differentiation libraries for broad accessibility and practical deployment in scientific computing software; investigating applications in differential geometry, Lie groups, and continuum mechanics, where higher-order derivatives play a critical role in curvature, stress, and deformation analysis; applying hyper-dual differentiation to discretized partial differential equations to enable derivative-aware solvers in finite element and spectral methods; and exploring deeper algebraic properties such as isomorphism classes, module structures, and potential links to Grassmann or exterior algebras. By continuing to develop and apply hyper-dual number methods, we expect to contribute to a broader class of exact, stable, and efficient derivative computation tools in both pure and applied mathematics.