A Single-Element Plane Grating Monochromator

Hettrick, Michael C.

doi:10.3390/photonics3010003

by

Michael C. Hettrick

Hettrick Scientific, Ho Chi Minh City 70000, Vietnam

Photonics 2016, 3(1), 3; https://doi.org/10.3390/photonics3010003

Submission received: 24 November 2015 / Revised: 19 December 2015 / Accepted: 24 December 2015 / Published: 11 January 2016

Abstract

Concerted rotations of a self-focused varied line-space diffraction grating about its groove axis and surface normal define a new geometric class of monochromator. Defocusing is canceled, while the scanned wavelength is reinforced at fixed conjugate distances and horizontal deviation angle. This enables high spectral resolution over a wide band, and is of particular advantage at grazing reflection angles. A new, rigorous light-path formulation employs non-paraxial reference points to isolate the lateral ray aberrations, with those of power-sum $\leq 3$ explicitly expanded for a plane grating. Each of these 14 Fermat equations agrees precisely with the value extracted from numerical raytrace simulations. An example soft X-ray design (6° deviation angle and 2 × 4 mrad aperture) attains a resolving power $> 25{,}000$ over a three-octave scan range. The proposed rotation scheme is not limited to plane surfaces or monochromators, providing a new degree of freedom in optical design. Grating rotation about its third (meridional) axis may be employed to cancel vertical deflection of the diffracted beam while maintaining the above aberration correction. This enables a simpler (pure rotary) motion for the exit slit and a fixed beam direction both horizontally and vertically.

Keywords:

optical design; grazing incidence; diffraction gratings; monochromators; geometrical aberrations; soft X-ray; varied line-spacing; VLS; in-focus; single-element

1. Introduction

The reflection of light at wavelengths $\lambda \lesssim 100$ nm is encumbered by losses to absorption and scattering, whose reduction favors few and simple (plane or spherical) optical surfaces. This design philosophy inspired the invention of two prior self-focusing grating monochromators suitable for use at grazing incidence [1,2,3]. The first combined rotation and translation of a varied line-space (VLS) concave grating, while the second introduced surface-normal rotation (SNR) and initially employed a constant line-space (CLS) concave grating. To date, these have been the only single-element solutions which remain “in-focus” (no spectral aberration linear versus aperture) with scanned wavelength, yet employ slit positions and ray directions fixed in the direction of dispersion. The endpoint in this progression towards minimization would be reflection from a single plane grating surface, which can exhibit near-invariance of the focal length with graze angle and provide access to the most accurate fabrication methods. These include float-polishing and silicon crystal cleaving to produce atomically-smooth plane surfaces and short-wavelength lithographies which offer a new generation of flat-substrate gratings having ultra-low scatter and unconstrained two-dimensional line patterns [4].

Existing fixed-slit plane grating monochromators require additional (mirror) reflections to focus the incident beam or to maintain this focus as the grating is rotated to scan wavelength. For example, the classical Czerny–Turner design is theoretically free of geometrical aberrations, but requires two concave (in principle, parabolic) mirrors for collimation and refocusing [5]. Designs in which the grating is illuminated by uncollimated light require some form of effective aberration correction to maintain fixed slits upon rotation of the grating, of which five distinct geometric solutions have previously been devised: (I) a CLS grating plus a fixed concave (ideally, elliptical) mirror and a rotating plane mirror [6]; (II) a CLS grating plus a rotating concave (spherical) mirror [7]; (III) a VLS grating plus a fixed concave mirror [8,9], spherical or otherwise; (IV) a VLS grating plus a rotating-translating plane mirror [10]; and (V) a CLS grating rotating about its surface normal plus a fixed (ideally elliptical) concave mirror [3]. The cited references are the original disclosures, with the defining (minimum) optical geometries and characteristic imaging properties being unchanged in numerous reformulations, optimizations, augmentations, rebranding and other derivatives.

Presented here is a new monochromator geometry, in which a self-focusing VLS grating scans wavelength between slits at fixed distances and horizontal deviation angle, without the need for other optics. This paper reports the detailed imaging characteristics of the basic (astigmatic, single-element, plane grating) configuration, particularly the spectral resolution as a function of aperture and scan range. In Section 2, an approximate light-path formulation provides a cogent algebraic and geometric understanding of the new focusing principle (first degree correction) and derives an advantageous conjugate distance ratio to correct the spectral aberration of second degree. Section 3 introduces a rigorous general light-path formulation and Section 4 applies this to obtain the expansion equations for the present dual-rotation plane VLS grating. As analyzed in Section 5, these equations provide to an exacting degree the focusing condition, the required tilts of the object (or entrance slit) and image (or exit slit) and the lateral ray aberrations in both directions. In Section 6, these spatial aberrations are converted to the geometrical spectral resolution and are exemplified using the dimensional parameters of an ultra-high resolution soft X-ray monochromator. In Section 7, independent simulations (numerical raytracings) are performed and quantitatively compared to the light-path calculations. As the present introductory work lays the theoretical foundation on which subsidiary performance characteristics may be added, Section 8 briefly indicates some prospects for future practical enhancements. The new results disclosed in this paper, including the proposed geometry and two precise methods of deriving the component geometrical aberrations, are summarized in Section 9.

2. The Basic Scheme

Figure 1 illustrates a minimal astigmatic configuration of the proposed optical geometry. The coordinate systems are Cartesian and right-handed, with the general frame fixed in the laboratory being ($y - \hat{n}_{o} - z$). The object and image plane frames are also fixed, being perpendicular to the plane of the figure and using coordinate systems ${}_{A}\hat{n} - x - z$ and ${}_{B}\hat{n} - x' - z'$, respectively. The coordinate system of the grating frame is $\omega - \hat{n} - \sigma$, with its origin at the grating pole (P) and the $\sigma$-axis being coincident with the laboratory $z$-axis (pointing towards the viewer). On the grating surface, G is a general point $(\omega, \sigma)$ and E is at an extreme corner $(\pm\breve{\omega}, \pm\breve{\sigma})$ of a pole-centered rectangular aperture. The distances $r = \overline{O_{o}P}$ and $r' = \overline{P\,{}_{P}I_{o}}$ are measured along the principal incident and diffracted rays, which intersect at the grating pole. There the principal angles of incidence ($\alpha$) and diffraction ($\beta$) are measured relative to the grating tangent plane. Their sum is the in-plane angular deviation $2\gamma$ and their difference equals $2\delta$. Thus $\alpha = \gamma - \delta$, $\beta = \gamma + \delta$ and $\gamma$ is the effective graze angle.

At an initial wavelength $\lambda_{o}$, the grating surface normal $\hat{n}_{o}$ is oriented at an angle $\delta_{o}$ relative to its zero order direction (for which $\alpha = \beta$), the grooves are oriented “in-plane” (at $\theta = 0$, thus parallel to the sagittal $\sigma$-axis) and self-focusing is provided by VLS positioning of the grooves. The object point $O_{o}$, the grating pole P and the image point ${}_{P}I_{o}$ for $\theta = 0$ define the “horizontal” plane of Figure 1, whose intersection with the grating tangent plane at its pole forms the meridional $\omega$-axis. The “Gaussian” image plane is shown, which is at the focal distance $r'$ for which the resulting horizontal defocus (first-degree) term ${}_{20}x' = 0$ at $\theta = 0$. As shown in Figure 1 and Figure 2b, the (horizontally) “paraxial” image point ${}_{P}I_{o}$ shall refer to the intersection of the exiting principal ray (for any $\theta$) with this image plane, being independent of the pupil coordinates (“non-aberrant”).

The scanning to longer wavelengths ($\lambda$) consists of a conventional rotation (about the $\sigma$-axis) to the angle $\delta$, coordinated with a second (and larger) rotation to the angle $\theta$ about the grating surface normal ($\hat{n}$-axis). If (as illustrated in Figure 1) the first rotation axis lies on the grating surface, it intersects a stationary grating pole which views the in-plane object point ($O_{o}$) and its paraxial image point (${}_{P}I_{o}$) along principal ray directions whose projections onto the horizontal plane are fixed. The $\sigma$-axis rotation results in a strong defocusing of the spectral width which, as will be shown, may be canceled by the defocusing of opposite sign resulting from the $\hat{n}$-axis rotation.

Figure 1. The Single-Element Plane Grating Monochromator (SEPGM), comprising (minimally) one grazing reflection. Coordinated on-center rotations $\delta$ and $\theta$ of a VLS grating maintain precise self-focusing as the wavelength is scanned. Projection is onto the horizontal plane, except the angle label for $\theta$ is viewed from a slight elevation for clarity. The extreme ray aberration (separation of the image points) is shown greatly exaggerated.

For generality and clarity, most equations will use dimensionless variables in italics, where the scale factor is $r$. These include the conjugate distance ratio $\eta \equiv r'/r$, the meridional pupil coordinate $\omega$, the sagittal pupil coordinate $\sigma$, the grating ruled width coordinate $\upsilon$, the object plane horizontal coordinate $x$ and vertical coordinate $z$, the image plane horizontal coordinate $x'$ and vertical coordinate $z'$ and the rotated image plane coordinates $x''$ (in the spectral or “slit width” direction) and $z''$ (in the “spatial” or “slit length” direction). By convention, the meridional plane is “horizontal” and the sagittal plane is “vertical”, though this directional labeling need not correspond to the ground-level orientations of either the physical instrument or the figures drawn in this paper. Meridional and sagittal shall refer to the positions ($\omega$ and $\sigma$, respectively) on the grating surface, whereas horizontal and vertical shall refer to the lateral coordinates ($x$ or $x'$ and $z$ or $z'$, respectively) within the object or image planes. “Longitudinal” shall nominally refer to the direction of ray propagation. The meridional projection of the physical distance travelled by the principal ray is $(1 + \eta) r$, which is the nominal length of the monochromator relative to the object point.

As the spectral imaging properties at grazing incidence are largely determined by the dimensionless ratio $\rho \equiv \sin\beta / \sin\alpha$, this variable will appear in the light-path equations rather than the rotation angle $\delta$. Conversion between $\rho$ and $\delta$ is given by $\rho = (1 + \tan\delta/\tan\gamma)/(1 - \tan\delta/\tan\gamma)$ and $\tan\delta/\tan\gamma = (\rho - 1)/(\rho + 1)$. The European sign convention is adopted, designating the inside spectral orders ($\rho > 1$, $\delta > 0$) as negative ($m < 0$) and the outside orders ($\rho < 1$, $\delta < 0$) as positive ($m > 0$). Assuming no horizontal focusing optic follows the grating, and the monochromator transmits the scanned wavelength by use of a spatial filter (an exit slit), the grating image must be real ($r' > 0$). This results in $\eta > 0$ when the grating is mounted in a diverging incident beam (a real object, $r > 0$), and $\eta < 0$ in the case of a converging incident beam (a virtual object, $r < 0$). The former corresponds to a self-focusing monochromator, for which the linear axes arrows in Figure 1 and Figure 2 point to positive dimensionless values. However, in the latter case (a virtual object), the negative value of the scale factor $r$ results in negative dimensionless values in the directions of these arrows.
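The conversion between $\rho$ and $\delta$ quoted above can be checked numerically. The following is a minimal Python sketch and not part of the original work: the function names are illustrative, $\gamma = 3°$ is taken from the 6° deviation angle quoted in the abstract, and the value of $\delta$ is a hypothetical choice.

```python
import math

def rho_from_delta(delta, gamma):
    """rho = (1 + tan(delta)/tan(gamma)) / (1 - tan(delta)/tan(gamma))."""
    t = math.tan(delta) / math.tan(gamma)
    return (1.0 + t) / (1.0 - t)

def delta_from_rho(rho, gamma):
    """Inverse relation: tan(delta)/tan(gamma) = (rho - 1)/(rho + 1)."""
    return math.atan(math.tan(gamma) * (rho - 1.0) / (rho + 1.0))

gamma_demo = math.radians(3.0)   # half of the 6 degree deviation angle
delta_demo = math.radians(0.5)   # hypothetical rotation angle

rho_demo = rho_from_delta(delta_demo, gamma_demo)

# Direct evaluation from the definition rho = sin(beta)/sin(alpha),
# with alpha = gamma - delta and beta = gamma + delta.
rho_direct = math.sin(gamma_demo + delta_demo) / math.sin(gamma_demo - delta_demo)

print(rho_demo, rho_direct, math.degrees(delta_from_rho(rho_demo, gamma_demo)))
```

A positive $\delta$ gives $\rho > 1$, which is the inside-order case in the sign convention adopted here.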

2.1. The Standard Light-Path Formulation

The wavefront path length $F$ is the sum of the physical path lengths ($L = \overline{OG}$ and $L' = \overline{GI}$) and the interference shift of $m\lambda N$ between $N$ grooves, where $m$ is the spectral order and $\lambda$ is the physical value of the wavelength. As a powerful tool in the development and performance analysis of optical designs, the path lengths and groove number are usually mathematically decomposed by power series expansion in the pupil coordinates, providing (in dimensionless units):

$$F(\omega, \sigma, \mu) = \sum_{i,j} F_{ij}(\mu)\, \omega^{i} \sigma^{j}; \quad \text{where } F_{ij}(\mu) = \mu N_{ij} + L_{ij}(\mu) + L'_{ij}(\mu) \tag{1}$$

where $\mu \equiv m\lambda/d_{o}$ is the dimensionless wavelength variable given a physical line spacing $d_{o}$ at the grating pole. In what shall hereinafter be called “the standard formulation”, the path-length coefficients derive from the series expansion of the distances to conjugate points (O and I) which have no dependence upon the pupil coordinates $\omega$ or $\sigma$. In particular, I is equated to the paraxial image point ${}_{P}I$. Thus, the series expansion yields equations for $L_{ij}$ and $L'_{ij}$ which depend only upon parameters at the grating pole, namely the grating orientation ($\alpha$, $\beta$ and $\theta$), its position ($\eta$) relative to the object and image points and the shape of its surface (e.g., its radius of curvature).

Employing Fermat’s principle, the image plane lateral ray positions $x'(\omega, \sigma, \mu)$ and $z'(\omega, \sigma, \mu)$ are also power series obtained by differentiation of Equation (1) with respect to the grating pupil coordinates:

$$x'(\omega, \sigma, \mu) = -\frac{h\eta}{\sin\beta}\, \frac{\partial F(\omega, \sigma, \mu)}{\partial \omega} \equiv \sum_{ij} {}_{ij}x'\, \omega^{i-1} \sigma^{j} \quad \text{where } {}_{ij}x'(\mu) = -i\, \frac{h\eta}{\sin\beta}\, F_{ij}(\mu) \tag{2}$$

$$z'(\omega, \sigma, \mu) = h^{3}\eta\, \frac{\partial F(\omega, \sigma, \mu)}{\partial \sigma} \equiv \sum_{ij} {}_{ij}z'\, \omega^{i} \sigma^{j-1} \quad \text{where } {}_{ij}z'(\mu) = j\, h^{3}\eta\, F_{ij}(\mu) \tag{3}$$

where the inclination factor $h \simeq 1$ (given more precisely later). In Section 3, the above standard formulation of the light-path aberrations is shown to be mathematically flawed, and is replaced by a rigorously correct theory. However, to introduce the essential geometrical principles underlying the new design, this complication is temporarily neglected and a concise initial analysis is presented based on Equations (1)–(3) and other simplifying (though not flawed) approximations. For brevity, functional dependences upon the pupil coordinates and the wavelength shall hereinafter be understood without their explicit notation.
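As an illustration of how Equations (2) and (3) turn wavefront coefficients into lateral ray positions within the standard formulation, the sketch below sums the series term by term and compares the result with a numerical derivative of Equation (1). It is only a sketch under stated assumptions: the coefficient values in `F_demo`, the values of $\eta$ and $\beta$, the pupil coordinates and the function names are all hypothetical, and the inclination factor is set to $h = 1$.

```python
import math

def wavefront(F, omega, sigma):
    """Equation (1): F = sum over (i, j) of F_ij * omega^i * sigma^j."""
    return sum(f * omega ** i * sigma ** j for (i, j), f in F.items())

def lateral_ray(F, omega, sigma, eta, beta, h=1.0):
    """Equations (2) and (3), summed term by term.

    x' uses -i * h*eta/sin(beta) * F_ij * omega^(i-1) * sigma^j,
    z' uses  j * h^3*eta        * F_ij * omega^i     * sigma^(j-1).
    h = 1 is the approximate inclination factor (assumption of this sketch).
    """
    x = sum(-i * h * eta / math.sin(beta) * f * omega ** (i - 1) * sigma ** j
            for (i, j), f in F.items() if i > 0)
    z = sum(j * h ** 3 * eta * f * omega ** i * sigma ** (j - 1)
            for (i, j), f in F.items() if j > 0)
    return x, z

# Hypothetical wavefront coefficients keyed by (i, j); not values from the paper.
F_demo = {(2, 0): 1e-4, (0, 2): 0.5, (1, 2): 2e-3, (3, 0): -3e-4}
eta_demo, beta_demo = 4.0, math.radians(4.0)
omega_demo, sigma_demo = 1e-3, 2e-3

x_img, z_img = lateral_ray(F_demo, omega_demo, sigma_demo, eta_demo, beta_demo)
print(x_img, z_img)
```

With these hypothetical numbers the vertical position is dominated by the (0,2) coefficient, which is the astigmatism term in the naming used below.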

The power series terms of first degree will be referred to as defocus (${}_{20}x'$), astigmatism (${}_{02}z'$), horizontal (sagittally-induced) image tilt (${}_{11}x'$) and vertical (meridionally-induced) image tilt (${}_{11}z'$). Though semantic distractions may be discouraged by referencing the higher-degree terms according to only their ($i$, $j$) subscripts and lateral direction, the commonly accepted descriptions (based largely on the image shape) will also be used when this facilitates the discussion. Thus, “meridional coma”, “horizontal coma” or simply “coma” is the (3,0) term (${}_{30}x'$), “sagittal coma”, “astigmatic coma” or “astigmatic curvature” is the horizontal (1,2) term (${}_{12}x'$), “spherical aberration” is the (4,0) term (${}_{40}x'$) and “mixed spherical aberration” is the horizontal (2,2) term (${}_{22}x'$).

2.2. Surface-Normal Rotation Transformation of the Varied Line-Space Coefficients

A varied line-space (VLS) grating is a design in which the groove positions are relatively unconstrained yet possess sufficient symmetry to permit a mechanical ruling. The most common of these symmetries is where the grooves are straight and parallel, thus the groove number ($N$) may be expressed as a 1D power series in the ruled width coordinate $\upsilon$:

$$N = (r/d_{o}) \sum_{k} N_{k}\, \upsilon^{k} \tag{4}$$

where $N_{1} = 1$ and $N_{k}$ (for $k > 1$) are the dimensionless VLS ruling coefficients. Equation (4) corresponds to a relative local groove density $d_{o}/d(\upsilon) = \sum_{k} k N_{k}\, \upsilon^{k-1}$ and to dimensional coefficients $M_{k} = N_{k}/(d_{o}\, r^{k-1})$.

Figure 2a shows that a surface normal rotation $\theta$ (positive = clockwise viewed from above) of ruling coordinate $\upsilon$ relative to the grating meridional coordinate $\omega$ results in the transformation $\upsilon = \omega\cos\theta + \sigma\sin\theta$. The groove number coefficients $N_{ij}$ in the grating frame ($\omega, \sigma$), as used in Equation (1), are thus obtained by that substitution in Equation (4):

$$N_{ij} = c_{ij}\, N_{k} \cos^{i}\theta\, \sin^{j}\theta \tag{5}$$

where $c_{ij} = 1$ for pure meridional terms ($j = 0$); $= 1$ for pure sagittal terms ($i = 0$); $= i + j$ for the mixed terms (1,1), (1,2), (2,1), (3,1) and (1,3); $= 6$ for the mixed term (2,2) and where $k = i + j$. The ($i$,0) terms provide a meridionally-projected groove density of

$$d_{o}/d(\omega) = \sum_{i} i N_{i0}\, \omega^{i-1} = \left[1 + \omega C_{2} + \omega^{2} C_{3} + \omega^{3} C_{4} + \dots\right] \cos\theta \tag{6}$$

where $C_{i} \equiv i N_{i} \cos^{i-1}\theta$, revealing that the relative magnitudes between successive coefficients have decreased by $\cos^{i-1}\theta$. This results in diminished focusing power ($i = 2$) and progressively smaller correction of the higher-degree aberrations ($i = 3, 4, \dots$). Additionally, Equation (5) reveals that nonzero values of $\theta$ create new coefficients of sagittal ($i = 0, j \neq 0$) and mixed ($i \neq 0, j \neq 0$) powers. These new terms result in significant 3D imaging characteristics, including image tilts and the introduction of a dominant mixed aberration.
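The transformation behind Equation (5) can be verified by expanding $(\omega\cos\theta + \sigma\sin\theta)^{k}$ directly. The sketch below assumes, as that binomial expansion implies, that the listed $c_{ij}$ coincide with the binomial coefficients $\binom{i+j}{i}$ (which reproduces the values 1, $i + j$ and 6 given above). The ruling coefficients in `N_demo`, the rotation angle and the function names are hypothetical illustration values, not parameters from the paper.

```python
import math

def n_ij(i, j, N, theta):
    """Equation (5): N_ij = c_ij * N_k * cos^i(theta) * sin^j(theta), k = i + j.

    c_ij is taken as the binomial coefficient C(k, i) (assumption of this
    sketch, consistent with the values listed for k <= 4).
    """
    k = i + j
    return math.comb(k, i) * N[k] * math.cos(theta) ** i * math.sin(theta) ** j

def groove_number_ruling_frame(omega, sigma, N, theta):
    """Dimensionless Equation (4), evaluated at upsilon = omega*cos + sigma*sin."""
    upsilon = omega * math.cos(theta) + sigma * math.sin(theta)
    return sum(N_k * upsilon ** k for k, N_k in N.items())

def groove_number_grating_frame(omega, sigma, N, theta):
    """Same quantity, summed over the grating-frame coefficients N_ij."""
    return sum(n_ij(i, k - i, N, theta) * omega ** i * sigma ** (k - i)
               for k in N for i in range(k + 1))

N_demo = {1: 1.0, 2: 0.4, 3: -0.1, 4: 0.02}   # hypothetical N_k, with N_1 = 1
theta_demo = math.radians(60.0)               # hypothetical rotation angle

n_ruling = groove_number_ruling_frame(0.01, -0.02, N_demo, theta_demo)
n_grating = groove_number_grating_frame(0.01, -0.02, N_demo, theta_demo)
print(n_ruling, n_grating)
```

Both evaluations return the groove number in units of $r/d_{o}$, so their agreement is a direct check of the coefficient table.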

Figure 2. (a) The grating $\omega - \sigma$ plane viewed from above ($+\hat{n}$), detailing the rotational transformation of the varied line spacing between the ruling axis and the meridional axis for any point G on the grating surface, as given by Equation (5); (b) Frontal view (as seen by a distant observer upstream of the entrance slit), detailing the slit rotations in the object and image planes. The paraxial (non-aberrant) deflection out of the meridional plane is shown both originating from the center of the entrance slit (solid line) and originating from one end of this slit (dashed line). The (dimensionless) length of the principal ray is $h\eta$. The presence of aberrations, notably astigmatism, for the extreme non-principal ray (dash-dot line) results in the lateral image position summed from all the power terms given in Section 4. Slit rotation angles $\psi$ (about ${}_{A}\hat{n}$) and $\psi'$ (about ${}_{B}\hat{n}$), given in Section 5, are relative to the grating frame. Thus, an equivalent optical geometry can maintain the entrance slit parallel to $z$ in the laboratory frame, but rotate the grating and exit slit frames by $-\psi$ about ${}_{A}\hat{n}$.

2.3. The Principal Ray Terms

These are not aberrations (pupil-dependent), but are the horizontal ($x'_{10}$) and vertical ($z'_{01}$) image coordinates, whereby the wavefront from this point to the grating pole is stationary. Using Equation (1), this condition ($\partial F/\partial\omega = \partial F/\partial\sigma = 0$) is expressed as:

$$\begin{aligned} F_{10} = {} & \mu\cos\theta + \frac{\cos\alpha - z\tan\psi\sin\alpha}{\sqrt{1 + z^{2}(1 + \tan^{2}\psi)}} \\ & + \frac{(x'_{10}/\eta)\sin\beta - \cos\beta}{\sqrt{1 + (x'_{10}/\eta)^{2} + (z'_{01}/\eta)^{2}}} = 0 \end{aligned} \tag{7}$$

$$F_{01} = \mu\sin\theta - \frac{z}{\sqrt{1 + z^{2}(1 + \tan^{2}\psi)}} - \frac{z'_{01}/\eta}{\sqrt{1 + (x'_{10}/\eta)^{2} + (z'_{01}/\eta)^{2}}} = 0 \tag{8}$$

The leading terms in Equations (7) and (8) are the differential phase shifts, namely $N_{10}$ and $N_{01}$ as given by Equation (5). The middle terms are the differentials of the object distance $\overline{OG}/r$ about $G = P$, obtained by simple trigonometry using Figure 1 and Figure 2b (or Equations (32)–(34)) with $O$ being a point along a line (e.g., an entrance slit) tilted by an angle $\psi$ from the vertical. The last terms are differentials of the image distance $\overline{G\,{}_{P}I}/r$ about $G = P$, also obtained trigonometrically (Equations (35)–(37)) with ${}_{P}I$ being the paraxial image point. Simultaneously meeting Equations (7) and (8) yields quadratics with the following series solutions:

$$z'_{01}/\eta = -\left(1 + \frac{3}{2}\mu^{2}\sin^{2}\theta\right) z + (1 + \Gamma/2)(\mu\sin\theta)\, z^{2} + \mu\sin\theta + \frac{1}{2}\mu^{3}\sin^{3}\theta + \frac{3}{8}\mu^{5}\sin^{5}\theta \tag{9}$$

$$\begin{aligned} \mu\cos\theta &= f_{\beta}\cos\beta - \cos\alpha = (f_{\beta} - \Gamma)\cos\beta = \varepsilon - \left(\frac{1}{2}\mu^{2}\sin^{2}\theta + \frac{1}{8}\mu^{4}\sin^{4}\theta\right)\cos\beta \\ &= \varepsilon - \frac{1}{2}\varepsilon^{2}\cos\beta\tan^{2}\theta + \frac{1}{2}\varepsilon^{3}\cos^{2}\beta\tan^{4}\theta + \frac{1}{8}\varepsilon^{4}(5\cos^{2}\beta\tan^{2}\theta + 1)\cos\beta\tan^{4}\theta \\ &\quad + \frac{1}{8}\varepsilon^{5}(5\cos^{2}\beta\tan^{2}\theta + 3)\cos^{2}\beta\tan^{6}\theta \end{aligned} \tag{10}$$

$$\begin{aligned} \frac{x'_{10}}{\eta}\tan\beta \simeq {} & \left[(1 + \mu^{2}\sin^{2}\theta)\,\mu\sin\theta + \left(\frac{\tan\beta}{\rho}\right)\tan\psi\right] z - \left\{\frac{\mu\cos\theta}{2\cos\beta} + \frac{\mu^{2}\sin^{2}\theta}{\tan^{2}\beta}\,\frac{1}{2}\left[1 + (\Gamma + 3)\tan^{2}\beta\right]\right. \\ & \left. + \frac{\mu\sin\theta}{\rho\tan\beta}(1 + 3\tan^{2}\beta)\tan\psi + \frac{1}{2}\left(\frac{1}{\rho^{2}} - \Gamma\right)\tan^{2}\psi\right\} z^{2} \end{aligned} \tag{11}$$

where $\Gamma \equiv \cos\alpha/\cos\beta$ and $\varepsilon \equiv \cos\beta - \cos\alpha$. Equation (9) is a “law of sagittal reflection” with a vertical object position ($z$) and generalized to include a surface-normal rotation angle ($\theta$). As illustrated in Figure 2b, the latter causes a deflection of the image center out of the horizontal plane and is dominated by the $\mu\sin\theta$ term, which scales linearly with wavelength. A possible method of canceling this (undesired) movement is proposed in Section 8.1; however, this term will be included in all the aberration equations to be derived in Section 3, Section 4 and Section 5. Equation (10) is a generalized “grating equation” for the in-plane object point ($z = 0$), including both the $\cos\theta$ projection of the groove spacing and the fine-correction factor $f_{\beta} = 1 - \frac{1}{2}\mu^{2}\sin^{2}\theta - \frac{1}{8}\mu^{4}\sin^{4}\theta$. The result is essentially exact agreement with the numerical raytracings (discrepancy $< 10^{-13}$ radians). The last equality in Equation (10) is a five-term series solution which retains an accuracy of $< 4.2 \times 10^{-11}$ radians. Alternatively, if the $\mu^{4}$ term is excluded, Equation (10) is quadratic in $\mu$ with the following simple solution being accurate to $< 3.2 \times 10^{-9}$ radians (~0.036 microns at the grating focal distance of ~11 meters):

$$\mu\cos\theta = \frac{\sqrt{1 + 2\varepsilon\cos\beta\tan^{2}\theta} - 1}{\tan^{2}\theta\cos\beta} \tag{12}$$

In the absence of groove axis rotation, Equations (9) and (10) are found to collapse to the results reported previously for a pure surface-normal rotation monochromator [2,3], namely $z'_{01} \simeq r'\mu_{o}\tan\theta$ and $\mu\cos\theta = \mu_{o}$. Equation (11) is the horizontal image position of off-plane points $z$ along an entrance slit, revealing that a vertically displaced object point ($z \neq 0$) results in a horizontally-displaced image point ($x' \neq 0$). The term linear with $z$ corresponds to image tilt, while the quadratic term accounts for image curvature of a straight slit; further analysis is given in Section 5.4 and Section 5.5.
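The relation between the closed-form solution of Equation (12) and the first equality of Equation (10) can be illustrated numerically. The sketch below is hedged: the angles are hypothetical (chosen so that $\alpha + \beta = 6°$), the function names are illustrative, and the fixed-point iteration is simply one convenient way of solving $\mu\cos\theta = f_{\beta}\cos\beta - \cos\alpha$ with the two-term $f_{\beta}$ quoted above; it is not the author's raytrace comparison.

```python
import math

def mu_cos_theta_closed_form(alpha, beta, theta):
    """Equation (12): quadratic solution with the mu^4 term dropped."""
    eps = math.cos(beta) - math.cos(alpha)
    c = math.cos(beta)
    t2 = math.tan(theta) ** 2
    if t2 == 0.0:
        return eps  # no surface-normal rotation: mu*cos(theta) = eps
    return (math.sqrt(1.0 + 2.0 * eps * c * t2) - 1.0) / (t2 * c)

def mu_cos_theta_iterated(alpha, beta, theta, iterations=30):
    """Fixed-point solution of mu*cos(theta) = f_beta*cos(beta) - cos(alpha),
    using f_beta = 1 - s2/2 - s2^2/8 with s2 = mu^2 sin^2(theta)."""
    u = math.cos(beta) - math.cos(alpha)      # starting guess: u = eps
    for _ in range(iterations):
        s2 = (u * math.tan(theta)) ** 2       # mu^2 sin^2(theta) = u^2 tan^2(theta)
        f_beta = 1.0 - 0.5 * s2 - 0.125 * s2 ** 2
        u = f_beta * math.cos(beta) - math.cos(alpha)
    return u

# Hypothetical angles: alpha = 2 deg, beta = 4 deg, theta = 60 deg.
alpha_demo, beta_demo, theta_demo = (math.radians(v) for v in (2.0, 4.0, 60.0))

u_closed = mu_cos_theta_closed_form(alpha_demo, beta_demo, theta_demo)
u_iter = mu_cos_theta_iterated(alpha_demo, beta_demo, theta_demo)
print(u_closed, u_iter, u_closed - u_iter)
```

Since $\beta > \alpha$ here, $\varepsilon$ and $\mu$ are negative, consistent with the sign convention for inside orders; the two solutions differ only by the contribution of the dropped $\mu^{4}$ term.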

2.4. Pure Meridional Aberrations and their Graze Angle-Invariant Approximation

Using the paraxial horizontal image position ($x' = 0$) to construct the path lengths and neglecting the effect of off-plane vertical image positions ($z'$), the grating Equation (10) simplifies to $\mu\cos\theta = \varepsilon \equiv \cos\beta - \cos\alpha$ and the lowest three meridional-only ($j = 0$) wavefront terms of Equation (1) may be written concisely in the following form:

$$F_{i0} = \mu N_{i}\cos^{i}\theta + \frac{1}{2}\left[b_{i0}\,\rho^{2}/\eta^{i-1} + a_{i0}\,(-1)^{i}\right]\sin^{2}\alpha \tag{13}$$

where $i = 2$, 3 or 4, $b_{30} = \cos\beta$, $a_{30} = \cos\alpha$, $b_{40} = 1 - \frac{5}{4}\sin^{2}\beta$, $a_{40} = 1 - \frac{5}{4}\sin^{2}\alpha$, $b_{20} = 1$ and $a_{20} = 1$. At grazing angles ($\gamma \lesssim 15^{\circ}$), the small-angle approximations ($\sin\alpha \simeq \alpha$, $\sin\beta \simeq \beta$, $\cos\alpha \simeq 1$ or $1 - \frac{1}{2}\alpha^{2}$ and $\cos\beta \simeq 1$ or $1 - \frac{1}{2}\beta^{2}$) provide accuracies $\lesssim 1\%$. This yields $a_{i0} \simeq b_{i0} \simeq 1$, $\sin^{2}\alpha \simeq -2(\mu\cos\theta)/(\rho^{2} - 1)$ and a simplified ($\gamma$-invariant) expression for the wavefront:

$$F_{i0} \simeq \left[N_{i}\cos^{i-1}\theta - \frac{\rho^{2}/\eta^{i-1} + (-1)^{i}}{\rho^{2} - 1}\right]\mu\cos\theta \tag{14}$$

at nonzero spectral orders ($\rho \neq 1$). At the initial wavelength (prior to scanning), $\rho = \rho_{o}$ and $\theta = 0$ and all horizontal aberrations vanish ($F_{i0} \simeq 0$) by choice of the following VLS coefficients:

$$N_{i} \simeq \frac{\rho_{o}^{2}/\eta^{i-1} + (-1)^{i}}{\rho_{o}^{2} - 1} \tag{15}$$

As the geometrical imaging properties of VLS plane gratings contrast with intuitions fostered in classical optics, it is emphasized that the required coefficients $N_{i}$ are (for all $i$) nearly independent of $\gamma$ at grazing incidence, with this invariance becoming (asymptotically) exact as $\gamma$ approaches zero. This is opposite to the behavior of classical (curved surface) methods of focusing, which are strongly dependent on a precise $\gamma$ at grazing incidence, and become increasingly so as $\gamma$ decreases. As with the prior VLS self-focusing plane grating geometry (IV in Table 1), this invariance enables a given grating to provide aberration correction at any graze angle given fixed values for $\rho_{o}$, $r$ and $r'$. This offers a flexibility in graze angle and wavelength coverage not available with concave gratings.
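A short numerical sketch of Equations (14) and (15) shows how the VLS coefficients follow from $\rho_{o}$ and $\eta$ alone, with no dependence on $\gamma$. The values of $\rho_{o}$ and $\eta$ used below are hypothetical illustration values, not the design parameters of the paper, and the function names are likewise assumptions of this sketch.

```python
import math

def vls_coefficient(i, rho_o, eta):
    """Equation (15): N_i = (rho_o^2 / eta^(i-1) + (-1)^i) / (rho_o^2 - 1)."""
    return (rho_o ** 2 / eta ** (i - 1) + (-1) ** i) / (rho_o ** 2 - 1.0)

def wavefront_term(i, rho, eta, theta, mu_cos_theta, N_i):
    """Equation (14): small-angle, graze-angle-invariant wavefront term F_i0."""
    bracket = (N_i * math.cos(theta) ** (i - 1)
               - (rho ** 2 / eta ** (i - 1) + (-1) ** i) / (rho ** 2 - 1.0))
    return bracket * mu_cos_theta

rho_o_demo, eta_demo = 1.5, 4.0   # hypothetical initial ratio and distance ratio

coeffs = {i: vls_coefficient(i, rho_o_demo, eta_demo) for i in (2, 3, 4)}
print(coeffs)

# Residual wavefront terms at the initial wavelength (theta = 0, rho = rho_o);
# the value of mu*cos(theta) only scales the result, so any nonzero number works.
residuals = [wavefront_term(i, rho_o_demo, eta_demo, 0.0, -1.8e-3, coeffs[i])
             for i in (2, 3, 4)]
print(residuals)
```

Setting `eta = -1.0` in `vls_coefficient` returns $(-1)^{i-1}$ for any $\rho_{o}$, which is the converging-beam case discussed next.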

For the author’s original VLS plane grating converging-beam geometry (III in Table 1), the conjugate distance ratio is $\eta \simeq -1$, simplifying Equation (15) to $N_{i} \simeq (-1)^{i-1}$ (dimensionally, this is $M_{i} \simeq (r')^{1-i}/d_{o}$). Though this grating mount requires a horizontal focusing mirror to provide the incident converging beam, Equation (14) with $\eta \simeq -1$ reveals the yet stronger $\gamma$-invariance whereby the aberration correction is also independent of $\rho_{o}$. This allows the angles of incidence and diffraction to be changed independently (while still maintaining the fixed focal length $r'$) given only that the dimensional factor $r$ is unchanged. For example, the grating may be configured with this mirror into an erect-field (varied $\beta$) spectrograph [8] with fixed $\alpha$, a constant-deviation (fixed $\alpha + \beta$) monochromator [9] or any other desired combination of the two angles. A derivative of this geometric class (i.e., having unaltered imaging properties) thus provides a fixed difference ($\alpha - \beta$) for “on-blaze” diffraction efficiency to a moveable slit (or, with the addition of a conventional rotating plane pick-off mirror to redirect this $\beta$, to a fixed slit).

However, when $\eta > 0$ ($r > 0$), the focal length is a strong function of $\rho$. This sensitivity is a general characteristic of a self-focusing grating geometry at grazing incidence, whether it be a classical or VLS grating, and whether the surface is plane or curved. In contrast, it has been previously shown [2,3] that a surface-normal ($\hat{n}$) rotation does not change the focal length of a CLS grating, given fixed meridional angles and a fixed axis-symmetric (e.g., plane or spherical) surface curvature. However, in the case of a VLS grating, an $\hat{n}$-rotation angle diminishes the interference term by the factor $\cos^{i-1}\theta$, as seen in Equation (14). This would result in an under-correction of all aberrations initially corrected at $\theta = 0$. The new focusing condition presented here arose by considering if this under-correction could balance the strong defocus induced by the change to $\rho$ upon a conventional (groove axis) rotation.

2.5. The New Focusing Condition

Combining Equations (14) and (15) in the case of no defocusing ($F_{20} = 0$) yields:

$$\cos[\theta(\rho)] \simeq \left(\frac{\rho_{o}^{2} - 1}{\rho^{2} - 1}\right)\left(\frac{\rho^{2} + \eta}{\rho_{o}^{2} + \eta}\right) \tag{16}$$

As $\rho \simeq (1 + \delta/\gamma)/(1 - \delta/\gamma)$, Equation (16) provides a mathematical relationship between the two rotation angles ($\theta$ and $\delta$). However, this is of no physical relevance unless $0 < \cos\theta < 1$. Fortuitously, this is always the case, as may be seen by making the substitution $\rho^{2} = \rho_{o}^{2} + \Delta\rho^{2}$. Equation (16) is then rewritten as $[1 + \Delta\rho^{2}/(\rho_{o}^{2} + \eta)]/[1 + \Delta\rho^{2}/(\rho_{o}^{2} - 1)]$. For negative spectral orders, $\rho_{o} > 1$ and a scan towards longer wavelengths requires $\Delta\rho^{2} > 0$. Since $\eta > 0$ in the present (self-focusing) geometry, $\rho_{o}^{2} + \eta > \rho_{o}^{2} - 1 > 0$, so the numerator is smaller than the denominator and the value of this expression must be between 0 and +1. By the principle of optical reversibility, this must also be the case for positive orders, for which the photon reverses its direction of travel and $\rho$ becomes $1/\rho$. The potential to cancel the defocusings induced by each of the two rotations (the two terms in Equation (14)) may also be understood geometrically. As illustrated by the first animation sequence of Figure 3, scanning towards longer wavelengths (shown visually going from “blue” to “orange”) by a $\sigma$-axis rotation always decreases the focal length. However, due to the $\cos\theta$ reduction in VLS focusing power, the $\hat{n}$-rotation (away from $\theta = 0$) always increases the focal length.

Though Equation (16) relies upon the small-angle approximation and also accounts only for pure horizontal defocusing (the effect of image tilt is addressed in Section 5), it confirms the physical relevance of this scheme. Additionally, it provides some insight by considering the case of unit magnification at

η = ρ_{o}

. By use of Equation (16) and the small-angle approximation to the generalized grating Equation (10), the in-plane rotation and the surface-normal rotation are each seen to increase the scanned wavelength by the same amount. In the more general cases of

η \neq ρ_{o}

or

ρ \neq ρ_{o}

(away from

λ_{o}

), though the contributions are not equal they are comparable, thus the scanning range (

λ / λ_{o}

) scales approximately as the square of that provided by each rotation. Due to this (second) fortuity, the proposed dual rotation is not only the key to a focused single-element plane grating monochromator, but also enables a wider scan in wavelength than does a single rotation for the same change in

ρ

(the latter affecting grating imaging, magnification and efficiency). For the monochromator parameters exemplified in Section 2.6, a factor 8 in scan range results from a factor 2.54 due to the in-plane rotation (changing

ρ

by only a factor of 1.89) times a factor 3.12 due to the surface-normal rotation.
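The arithmetic behind the quoted scan range is short enough to spell out. This sketch only multiplies the two factors given above; the variable names are illustrative.

```python
import math

in_plane_factor = 2.54         # wavelength gain from the groove-axis rotation (quoted above)
surface_normal_factor = 3.12   # wavelength gain from the surface-normal rotation (quoted above)

scan_range = in_plane_factor * surface_normal_factor
octaves = math.log2(scan_range)
print(f"scan range = {scan_range:.2f}, i.e. {octaves:.2f} octaves")
```

The product is just under 8, which corresponds to the three-octave scan range stated for the example design.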

Figure 3. Concerted rotations of a varied line-space (VLS) self-focused plane grating about two axes provide in-focus scanning of the wavelength (please see the animation of this figure in the supplementary material). Tilt and off-plane deflection of the image are not shown here. Slits may be placed at the object and image.

2.6. Two-Point Coma-Correction: Optimization of the Conjugate Distance Ratio

With defocusing eliminated at all scanned wavelengths, the next higher degree aberration to be considered is given by

F_{30}

in Equation (1). At grazing angles, this aberration is typically large and thus determines whether a system which is technically “in-focus” (

F_{20} = 0

) can in fact deliver high resolution at a usable aperture. It is tempting to first consider unit magnification

(η / ρ = 1)

, which would eliminate this aberration in the absence of a varied line-density term

N_{3}

. This is similar to the classical concave grating being free of this aberration on the (unit magnification) Rowland circle. Setting

N_{3} =

0 would also remove this interference term from Equation (14) and thus avoid its multiplication by cos²

θ

(which would re-introduce coma upon surface-normal rotation). However, because

α

and

β

are changing in opposite directions as the wavelength is scanned in the first (groove axis) rotation, the resulting change in ρ ~ β/α allows

η / ρ =

1 at only one wavelength. Even if this correction wavelength is optimally selected (by choice of

N_{3} \neq 0

alone) to be near the center of the scanned spectrum, the growth in coma away from a single correction point is rapid, making it ineffective for all but a very narrow wavelength range.

The high level of coma correction required to maintain ultra-high spectral resolution over wide scanning ranges is realized by determining optimal values for the two free parameters (

η

and

N_{3}

) available from Equation (14) when

i = 3

to correct

F_{30}

at two wavelengths. In general, the simultaneous solution is a quartic; however, it collapses to a quadratic if the two correction points are (tentatively) chosen to be at

θ = 0

(cos

θ = 1

) and

θ =

60° (cos

θ = ½

), resulting in the following closed-form solution:

η_{c} = (ρ_{o}^{2} - 1) + \sqrt{{(ρ_{o}^{2} - 1)}^{2} + ρ_{o}^{2}}

(17)

The corresponding

N_{3 c}

is obtained by setting

F_{30} = 0

in Equation (14) at

θ = 0

and

η = η_{c}

. Minimization of

F_{30}

over the scan range is then obtained by adjusting both

η

and

N_{3}

from these tentative values, such that the two correction points straddle the spectrum center and are symmetrically inset from the edges. In this case,

| F_{30} |

reaches the same maximum at the extreme ends of the range and at a wavelength between the 2 correction points. Such optimization for a scanning range of

λ_{m a x} / λ_{o}

~ 8 (

θ_{m a x}

~ 71.25°) resulted in

η =

1.275

η_{c}

and

N_{3} =

1.08

N_{3 c}

(if the scan range were reduced to

θ_{m a x}

~ 60° (

λ_{m a x} / λ_{o}

~ 3.6), optimization of this coma-correction would yield

η = η_{c}

and

N_{3} =

1.04

N_{3 c}

). This and the preceding meridional analysis provide the following set of dimensionless parameters for an example soft X-ray SEPGM: 2

γ =

6°,

ρ_{o} =

1.4458939,

η =

3.699998982,

N_{2} =

1.432972878,

N_{3} = -

0.8378590,

N_{4} =

1.28655, where

N_{4}

results from using Equation (14) to cancel spherical aberration (

F_{40} = 0

) at

θ \sim 32^{°}

, thereby minimizing its average magnitude over the three-octave scan range.
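As a consistency check on Equation (17), the sketch below computes η_c for the example value of ρ_o and applies the 1.275 scaling quoted above. The function name is an illustrative assumption; N_3c is not computed because it requires Equation (14).

```python
import math

def eta_c(rho_o):
    """Equation (17): conjugate distance ratio for coma correction at theta = 0 and 60 degrees."""
    a = rho_o**2 - 1.0
    return a + math.sqrt(a**2 + rho_o**2)

rho_o = 1.4458939                        # example design value
eta_tentative = eta_c(rho_o)
eta_optimized = 1.275 * eta_tentative    # scaling quoted for theta_max ~ 71.25 degrees
print(f"eta_c = {eta_tentative:.6f}, 1.275 * eta_c = {eta_optimized:.6f}")
```

The scaled value agrees with the listed η = 3.699998982 to within about 0.01%, which is consistent with the factor 1.275 being quoted to four significant figures.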

2.7. A Classification of Plane Grating Geometries

The simple form of Equation (14) also enables a concise comparison of the fundamental differences between various aberration-corrected plane grating geometries. In Table 1, the present class is listed alongside the previous five, comparing the defining categories of the minimum number of reflections (grating + mirror(s)), the grating line density function (the VLS coefficients

N_{i}

), the functional value of

η

defining the grating mount, the grating parameter which is fixed upon a focused (

F_{20} =

0) scan of wavelength and the corrected horizontal aberrations (

F_{i 0} =

0).

Table 1. Plane grating aberration-corrected monochromators with fixed slits.

Geometry	Optics (min)	Approx. VLS $N_{i}$ ($i = 2, 3, 4$)	Grating Mount (a)	Fixed Parameter(s)	Corrected Aberrations
I. Petersen [6]	3	N/A (CLS)	$η = - ρ^{2}$	$ρ$	$x_{20}^{'}$
II. Lu [7]	2	N/A (CLS)	$η = - ρ^{2}$	r′ only (b)	$x_{20}^{'}$
III. Hettrick [9]	2	$≃ {(- 1)}^{i - 1}$	$η ≃ -1$	$α + β$ (c)	$x_{20}^{'}$, $x_{30}^{'}$ (1 λ), $x_{40}^{'}$ (1 λ)
IV. Harada [10]	2	$≃ \frac{ρ_{o}^{2} / η^{i - 1} + {(- 1)}^{i}}{ρ_{o}^{2} - 1}$	$η > 0$	$ρ$	$x_{20}^{'}$, $x_{30}^{'}$ (1 λ), $x_{40}^{'}$ (1 λ)
V. Hettrick [3]	2	N/A (CLS)	$η = - ρ^{2}$	$α, β$ (d)	$x_{20}^{'}$
VI. This Work	1	$≃ \frac{ρ_{o}^{2} / η^{i - 1} + {(- 1)}^{i}}{ρ_{o}^{2} - 1}$	$η > 0$	$α + β$ (e)	$x_{20}^{'}$, $x_{30}^{'}$ (2 λ's), $x_{40}^{'}$ (1 λ)

Notes: (a)

η = - ρ^{2}

is the Monk–Gillieson mount and

η > 0

is a “self-focusing” mount; (b) the grating object distance r changes to be confocal with the spherical mirror image; (c) given

η ≃

−1,

ρ

is unconstrained at grazing angles (allowing fixed

α

, fixed

α + β

, fixed

α - β

, etc.); (d) type “SNR-III” grating system of the 8th figure in the referenced patent; and (e) detailed in the present work for comparison to other monochromator designs with horizontally fixed slit positions; however, other combinations of

α

and

β

may be used by changing the focusing condition.

3. A Rigorous Theory of Light-Path Expansion

The purest of wave aberration methods [11] constructs the physical light-path from object point to actual (floating) image point and employs Fermat’s principle by setting the wave aberration differential to zero for the sum of all terms. This approach does not evaluate the individual aberrations, other than those of lowest degree in each direction (i.e., defocusing and astigmatism). This avoids the difficult task of constructing an accurate reference wavefront, and is convenient for optimizing (“fine-tuning”) the parameters of existing geometries by minimizing the numerical variation in the image point position as the incident ray wanders over the pupil surface. Unfortunately, this provides little intuitive information conducive to algebraic or geometrical understanding, and is therefore not suited to the development or qualitative improvement of new geometries. Indeed, even the (numerical) data from the precise raytracings performed as part of the present work (Section 7) are mathematically reduced to extract the individual coefficients of the algebraic power series, and are thus more convenient for basic design development than is the total wave aberration formalism.

More insight and fundamental progress in optical design is facilitated by isolating and controlling successive expansion terms

ω^{i - 1} σ^{j}

(and

ω^{i} σ^{j - 1}

) of the lateral ray aberrations. Especially in the case of large mixed terms (neither i nor j being zero), this requires a more rigorous light-path formulation than previously presented in the literature. Though complicating the derivation, the resulting separation of the individual terms provides a precise and consistent analytical method indispensable to the development of new designs. In this section, the general equations are given for rigorously expanding the lateral aberrations of a reflection grating. In Section 4, these will be simplified for the case of a plane grating and used to obtain the explicit aberration terms for the basic astigmatic configuration of the new monochromator geometry.

3.1. Flaw in the Standard Theory

Relative to any chosen point in space, the optical wavefront (a surface of constant phase) is a (physically) measurable quantity, as is the lateral image position of a ray diffracted from any point on the pupil. However, their expansion into power series component terms is (though very useful) only a mathematical abstraction. In expanding the image path length relevant to an individual (

{x^{'}}_{i j}

or

{z^{'}}_{i j}

) lateral aberration term, a consistent formulation requires the use of a correspondingly abstract reference image point. This must be chosen to remove the other lateral aberration terms which, upon expansion and pupil differentiation of the light-path, may erroneously form the same power dependence as the intended term. Only in this way do the (mathematically-constructed) terms become separate incremental components, the sum of which is the total (physical) aberration.

In general, the reference point depends not only upon the (i, j) term in the light-path function, but also upon whether a given (i, j) term is differentiated relative to the meridional pupil coordinate to determine (via Fermat’s principle) the

{x^{'}}_{i j}

lateral ray position or relative to the sagittal pupil coordinate to determine

{z^{'}}_{i j}

. Thus, a single light-path function may not be expanded (e.g., into a power series) applicable to both directions. Nonetheless, the individual derivatives (

\partial / \partial ω and \partial / \partial σ

) of the appropriate functions may be expanded for rigorous determination of the separate lateral ray positions, as given in Section 3.3.

Unfortunately, the standard formulation (briefly summarized in Section 2.1) systematically employs only the paraxial image as the reference point (in wavefront terminology, it is usually asserted to use the “Gaussian reference sphere”). The resulting expanded (inferred) power terms are therefore subject to cross-contamination, providing incorrect results for both the individual aberrations and their sum, the latter thereby not matching the actual (measurable) lateral ray position. In the analysis of classical optics, such contamination is usually small and is therefore neglected, because these designs are either rotationally symmetric, have an in-plane dispersion geometry, or are anastigmatic and (ideally) free of second-degree (e.g., meridional coma) lateral aberrations. A historical exception appears to have been in determining the mixed power term of astigmatic coma (

{}_{12}{x^{'}}

), where Beutler [12] correctly added the aberration of astigmatism (

{}_{02}{z^{'}}

) to form the image reference point for a concave grating geometry (particularly on the Rowland circle). Though such special consideration of the lowest-degree vertical aberration is consistent with the image reference being “paraxial”, it contradicts the assertion that the wave aberration is evaluated at the (single point) intersection of the principal ray with the (horizontal) Gaussian image plane [13]. As astigmatism separates the horizontal and vertical image planes, the actual wavefront becomes toroidal. Moreover, inclusion of this one aberration in the reference does not provide a correct determination of all the lateral aberrations.

In the more general sense, this approach leaves uncorrected an underlying problem that the standard formulation is based on the methods developed for the treatment of normal incidence and rotationally symmetric classical optics systems. For such, the horizontal and vertical image planes coincide, neither astigmatism nor coma exist when using an on-axis object point and a single light-path function may be used to determine the lateral aberrations in both directions. Using such methods, the lateral terms which can dominate grazing incidence and asymmetrical systems are not systematically removed from the expansion of higher-degree aberrations. The neglected terms include all of the pure meridional aberrations

x_{i 0}^{'}

(such as horizontal coma) and the mixed terms

x_{i j}^{'}

(in which both i and j are nonzero). Though a comprehensive discussion of the errors resulting from widespread use of the standard formulation is beyond the scope of this paper, Appendix A shows that the standard result for the spherical aberration term of even a (classical) spherical mirror is incorrect at non-unit magnification. This illustrates that the new formulation yields more accurate results even in the absence of surface-normal rotation, varied line-spacing, grazing incidence or even grating diffraction.

The light-path expansion theory developed below reveals that the isolation of different power series terms is generally more complex than previously explained by use of paraxial reference points. The new formulation employs a pupil-dependent (“aberrant”) reference image position tailored to the expansion of each power term, and a Fermat differentiation of infinitesimal pupil variables (

Δ ω

and

Δ σ

) which are independent of this image position and the pupil coordinates (

ω

and

σ

).

3.2. The Reference Path Lengths

The reference path-lengths are

A_{i j}

(

ω, σ; x_{i j}, z_{i j}

) as the distance between a reference point (

x_{i j}, z_{i j}

) in the object plane and the grating pupil coordinate (

ω, σ

), and

B_{i j}

(

ω, σ, ξ_{i j}, ζ_{i j}

) as the distance between (

ω, σ

) and a reference point (

ξ_{i j}, ζ_{i j}

) in the image plane. The general equations in this section include a grating surface of revolution with a radius of curvature

R \equiv R / r

at the pole, where

R > 0

is concave (positive focusing power) and

R < 0

is convex (negative focusing power). In the below equations,

S = 0

provides the exact results for a paraboloidal surface, while a spherical surface is treated (correct to the 5th degree in path-length) by substituting

S = 1

. Given the coordinate systems shown in Figure 1 and Figure 2:

A_{i j} = {(1 + {}_{A}t_{i j})}^{1 / 2}

(18)

in which

{}_{A}t_{i j}

is the sum of the squares of the two lateral (“transverse”) segments:

\begin{matrix} {}_{A}t_{i j} = 2 ω \cos α - 2 x_{i j} ω \sin α - 2 z_{i j} σ + x_{i j}^{2} + z_{i j}^{2} + [1 - (\sin α + x_{i j} \cos α) / R] (ω^{2} + σ^{2}) \\ + \frac{1}{4} [1 - S (\sin α + x_{i j} \cos α) / R] {(ω^{2} + σ^{2})}^{2} / R^{2} \end{matrix}

(19)

where,

x_{i j} = {}_{i j}x ω^{i - 1} σ^{j} and z_{i j} = {}_{i j}z ω^{i} σ^{j - 1}

(20)

In Equation (19),

z

is the vertical distance of an object point from the horizontal (meridional) plane. The terms

x_{10}

and

z_{01}

have no dependence upon the grating pupil coordinates and thus define a source point whose emitted rays fill the grating aperture. Making

x_{10}

a function of

z_{01}

(or vice-versa) can define the two-dimensional curve on which such points may lie in the object plane. For example, a parabolic entrance slit illuminated by a spatially diffuse upstream source may be specified by

x_{10} = z_{01} t a n ψ + z_{01}^{2} / (2 R_{s l i t}),

where

ψ

is the slit’s tilt angle towards the

x

-axis and

R_{s l i t}

is its curvature radius at

z_{01} = 0

. In the case of a linear slit 1/

R_{s l i t} = 0

; if

ψ

is also zero, then the slit is aligned with the vertical

z

-axis. Equations (19) and (20) also allow power series terms (

i + j > 1

) to accommodate dependences between the object point position and the grating pupil position. For example, consider a vertically aligned entrance slit being illuminated by a horizontally aligned linear source at an upstream distance Y. This longitudinal separation of the effective horizontal and vertical object conjugate planes results in

z_{02} ≃ σ / (1 + r / Y)

. Nonzero values for

x_{i j}

or

z_{i j}

can also be constructed to include the geometrical aberrations of optics preceding the grating (or the object plane), e.g., those of a mirror which reduces the (vertical) astigmatism and/or provides horizontal re-focusing of a distant source.
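Equations (18) and (19) can be checked against a direct distance calculation. The sketch below does this for the paraboloidal case (S = 0), for which the text states the result is exact. The Cartesian frame (meridional, sagittal, surface normal), the unit object distance, the reading of α as the angle between the principal ray and the grating surface, and all numerical values are assumptions inferred from the form of Equation (19), not taken from Figures 1 and 2.

```python
import itertools
import math

def t_A(omega, sigma, x, z, alpha, inv_R=0.0, S=0.0):
    """Equation (19), so that the object path is A = sqrt(1 + t_A)."""
    n = math.sin(alpha) + x * math.cos(alpha)      # object height above the tangent plane
    q = omega**2 + sigma**2
    return (2*omega*math.cos(alpha) - 2*x*omega*math.sin(alpha) - 2*z*sigma
            + x**2 + z**2 + (1.0 - n*inv_R) * q
            + 0.25 * (1.0 - S*n*inv_R) * q**2 * inv_R**2)

def exact_object_distance(omega, sigma, x, z, alpha, inv_R):
    """Euclidean distance from the object point to a point on a paraboloid (assumed frame)."""
    obj = (-math.cos(alpha) + x*math.sin(alpha), z, math.sin(alpha) + x*math.cos(alpha))
    surface_point = (omega, sigma, 0.5 * inv_R * (omega**2 + sigma**2))
    return math.dist(obj, surface_point)

alpha, inv_R = math.radians(3.0), 0.05             # illustrative values, not from the paper
x, z = 1e-3, 2e-3                                  # hypothetical object-plane offsets
max_diff = 0.0
for omega, sigma in itertools.product((-0.02, 0.0, 0.02), (-0.01, 0.0, 0.01)):
    series = math.sqrt(1.0 + t_A(omega, sigma, x, z, alpha, inv_R, S=0.0))
    exact = exact_object_distance(omega, sigma, x, z, alpha, inv_R)
    max_diff = max(max_diff, abs(series - exact))
print(f"largest |A - exact| = {max_diff:.2e}")
```

Under these assumptions the two results differ only by floating-point rounding. Setting `inv_R=0.0` reduces `t_A` to the plane-grating case treated in Section 4.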

Again referring to Figure 1 and Figure 2, one formulates:

B_{i j} = η {(1 + {}_{B}t_{i j})}^{1 / 2}

(21)

in which,

\begin{array}{l} {}_{B}t_{i j} = - 2 (\frac{ω \cos β}{η}) + 2 (\frac{ξ_{i j} \sin β}{η}) (\frac{ω}{η}) - 2 (\frac{ζ_{i j}}{η}) (\frac{σ}{η}) + {(\frac{ξ_{i j}}{η})}^{2} + {(\frac{ζ_{i j}}{η})}^{2} \\ + [1 - (η \sin β + ξ_{i j} \cos β) / R] [{(\frac{ω}{η})}^{2} + {(\frac{σ}{η})}^{2}] \\ + \frac{1}{4} {(\frac{η}{R})}^{2} [1 - S (η \sin β + ξ_{i j} \cos β) / R] {[{(\frac{ω}{η})}^{2} + {(\frac{σ}{η})}^{2}]}^{2} \end{array}

(22)

where,

ξ_{i j} = \sum_{(I, J) \neq (i, j)} x_{I J}^{'} = \sum_{(I, J) \neq (i, j)} {}_{I J}x^{'} ω^{I - 1} σ^{J} and ζ_{i j} = \sum_{(I, J) \neq (i, j)} z_{I J}^{'} = \sum_{(I, J) \neq (i, j)} {}_{I J}z^{'} ω^{I} σ^{J - 1}

(23)

The coordinate system of the image plane has as its origin the principal ray position for the in-plane configuration, namely where the object point is at

(x = 0, z = 0)

and where the grating rotation angle

θ =

0. However, all other object points and all nonzero values of

θ

will result in an image position which does not strike this origin in one or both lateral directions. Therefore,

ξ_{i j} = ξ_{i j}^{∦} + x^{'} (0, 0)

and

ζ_{i j} = ζ_{i j}^{∦} + z^{'} (0, 0)

, where the superscript

∦

denotes the component due to aberrations. In the case of the minimal astigmatic configuration of the SEPGM geometry (whose detailed expansion is given in Section 4), the aberration terms assume an object at

(x = 0, z = 0)

. This constrains

x^{'} (0, 0) = 0

for all rotation angles, thus

ξ_{i j}

contains only the aberration terms (

ξ_{i j}^{∦}

). However,

z^{'} (0, 0) \neq 0

when

θ \neq 0,

as given by Equation (9). Thus,

ζ_{i j}

is the sum of an aberration component (

ζ_{i j}^{∦}

) and the off-plane position of the principal ray.
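The construction of the reference image point in Equation (23) can be sketched as a sum over all aberration terms except the one being expanded. The dictionary layout, the function name and every coefficient value below are hypothetical and serve only to show the bookkeeping; `x0` and `z0` stand for the principal-ray positions x′(0, 0) and z′(0, 0).

```python
def reference_point(i, j, omega, sigma, x_coef, z_coef, x0=0.0, z0=0.0):
    """Equation (23) plus the principal-ray offsets.

    x_coef[(I, J)] multiplies omega**(I - 1) * sigma**J,
    z_coef[(I, J)] multiplies omega**I * sigma**(J - 1).
    The (i, j) term being expanded is excluded from both sums.
    """
    xi = x0 + sum(c * omega**(I - 1) * sigma**J
                  for (I, J), c in x_coef.items() if (I, J) != (i, j))
    zeta = z0 + sum(c * omega**I * sigma**(J - 1)
                    for (I, J), c in z_coef.items() if (I, J) != (i, j))
    return xi, zeta

# Hypothetical coefficients, chosen only to make the sums easy to follow.
x_coef = {(2, 0): 0.1, (3, 0): 0.2, (1, 1): 0.3}
z_coef = {(0, 2): 1.0, (1, 1): 0.5}

xi_30, zeta_30 = reference_point(3, 0, 0.5, 0.2, x_coef, z_coef)
xi_11, zeta_11 = reference_point(1, 1, 0.5, 0.2, x_coef, z_coef)
print(xi_30, zeta_30, xi_11, zeta_11)
```

The two calls return different reference points because each excludes a different (i, j) term, which is the sense in which the reference point is tailored to the term being expanded.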

3.3. Fermat Derivation of the Lateral Ray Aberrations

Mathematical separation of the individual aberration power terms requires strict adherence to Fermat’s principle, which specifies the light-path (i.e., the direction along which the optic provides constructive interference) to be the one for which the phase is stationary relative to small offsets in the pupil coordinates. Deviations from this can be converted to the lateral ray deviations (aberrations) from a reference image point. In mathematical terms, the first derivative must be taken relative to the pupil coordinates, while maintaining a fixed reference image point. To ensure a proper formulation, it is therefore convenient to add small offsets (

Δ ω

and

Δ σ

) to the two pupil coordinates (

ω

and

σ

) appearing in Equations (19) and (22), but not to the pupil coordinates appearing in Equations (20) and (23). The Fermat derivatives of Equations (18) and (21) are then taken with respect to

Δ ω

and

Δ σ

. In the absence of these effectively separate variables, i.e., if one simply differentiated with respect to

ω

and

σ

, any aberration (a ray position which depends on the pupil coordinates) included in the construction of the reference image point would move that point during the differentiation, improperly distorting the (spherical) reference wavefront. The same situation occurs in the presence of an aperture-dependent object position (discussed in Section 3.2).

The proper formulation of Fermat’s principle thereby yields:

\begin{array}{l} \partial A_{i j} / \partial (Δ ω) = \frac{1}{2} [\partial {}_{A}t_{i j} / \partial (Δ ω)] {(1 + {}_{A}t_{i j})}^{- 1 / 2} = \{ \cos α - x_{i j} \sin α + [1 - (\sin α + x_{i j} \cos α) / R] ω \\ + \frac{1}{2} [1 - S (\sin α + x_{i j} \cos α) / R] (ω σ^{2} + ω^{3}) / R^{2} \} {(1 + {}_{A}t_{i j})}^{- 1 / 2} \end{array}

(24)

\begin{array}{l} \partial A_{i j} / \partial (Δ σ) = \frac{1}{2} [\partial {}_{A}t_{i j} / \partial (Δ σ)] {(1 + {}_{A}t_{i j})}^{- 1 / 2} = \{ - z_{i j} + [1 - (\sin α + x_{i j} \cos α) / R] σ \\ + \frac{1}{2} [1 - S (\sin α + x_{i j} \cos α) / R] (ω^{2} σ + σ^{3}) / R^{2} \} {(1 + {}_{A}t_{i j})}^{- 1 / 2} \end{array}

(25)

\begin{array}{l} \partial B_{i j} / \partial (Δ ω) = \frac{η}{2} [\partial {}_{B}t_{i j} / \partial (Δ ω)] {(1 + {}_{B}t_{i j})}^{- 1 / 2} \\ = \{ - \cos β + \frac{ξ_{i j}}{η} \sin β + [1 - (η \sin β + ξ_{i j} \cos β) / R] (\frac{ω}{η}) \\ + \frac{1}{2} [1 - S (η \sin β + ξ_{i j} \cos β) / R] [(\frac{ω}{η}) {(\frac{σ}{η})}^{2} + {(\frac{ω}{η})}^{3}] {(\frac{η}{R})}^{2} \} {(1 + {}_{B}t_{i j})}^{- 1 / 2} \end{array}

(26)

\begin{array}{l} \partial B_{i j} / \partial (Δ σ) = \frac{η}{2} [\partial {}_{B}t_{i j} / \partial (Δ σ)] {(1 + {}_{B}t_{i j})}^{- 1 / 2} = \{ - \frac{ζ_{i j}}{η} + [1 - (η \sin β + ξ_{i j} \cos β) / R] (\frac{σ}{η}) \\ + \frac{1}{2} [1 - S (η \sin β + ξ_{i j} \cos β) / R] [{(\frac{ω}{η})}^{2} (\frac{σ}{η}) + {(\frac{σ}{η})}^{3}] {(\frac{η}{R})}^{2} \} {(1 + {}_{B}t_{i j})}^{- 1 / 2} \end{array}

(27)
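The role of the separate offset variable can be illustrated on the object side with the plane-grating form of Equation (24). In this sketch the reference coordinate is given a hypothetical quadratic dependence on the pupil coordinate (coefficient `C_X`); the graze angle and pupil values are likewise illustrative. A central difference taken with the reference point frozen reproduces the Fermat derivative, whereas one that lets the reference point move with the pupil coordinate does not.

```python
import math

ALPHA = math.radians(3.0)    # illustrative angle, not from the paper
C_X = 5.0                    # hypothetical coefficient: x_ij = C_X * omega**2

def path_A(omega, sigma, x, z, alpha=ALPHA):
    """Plane-grating object path A = sqrt(1 + t), with t from Equation (19) at 1/R = 0."""
    t = (2*omega*math.cos(alpha) - 2*x*omega*math.sin(alpha) - 2*z*sigma
         + x**2 + z**2 + omega**2 + sigma**2)
    return math.sqrt(1.0 + t)

def fermat_derivative(omega, sigma, x, z, alpha=ALPHA):
    """Equation (24) at 1/R = 0: derivative with the reference point held fixed."""
    return (math.cos(alpha) - x*math.sin(alpha) + omega) / path_A(omega, sigma, x, z, alpha)

omega, sigma, z, step = 0.02, 0.01, 0.0, 1e-6
x_ref = C_X * omega**2       # reference coordinate evaluated once, then frozen

# Offset applied to the pupil coordinate only (the Delta-omega of the text).
frozen = (path_A(omega + step, sigma, x_ref, z)
          - path_A(omega - step, sigma, x_ref, z)) / (2*step)

# Naive total derivative: the reference point is allowed to move with omega.
moving = (path_A(omega + step, sigma, C_X*(omega + step)**2, z)
          - path_A(omega - step, sigma, C_X*(omega - step)**2, z)) / (2*step)

analytic = fermat_derivative(omega, sigma, x_ref, z)
print(f"frozen - analytic = {frozen - analytic:.2e}")
print(f"moving - analytic = {moving - analytic:.2e}")
```

The residual of the naive derivative is the extra term (∂A/∂x)(dx/dω), which is the contamination that the offset variables are introduced to avoid.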

The reference coordinates

ξ_{i j}

and

ζ_{i j}

in the above equations are each sums of all the other lateral ray aberration power terms

x_{I J}^{'} \neq x_{i j}^{'}

and

z_{I J}^{'} \neq z_{i j}^{'}

, respectively, which when combined and expanded with the other terms give rise to the desired power term.

The reciprocal radicals in Equations (24)–(27) are to be expanded by Taylor series (

1 / \sqrt{1 + t} \approx 1 - \frac{1}{2} t + \frac{3}{8} t^{2} - \frac{5}{16} t^{3} + \frac{35}{128} t^{4}

) to isolate the different power terms. The horizontal lateral aberration

x^{'}

is obtained by Fermat conversion of the wavefront error by summing the

ω^{i - 1} σ^{j}

terms, retained in the

Δ ω

-derivatives of the three path-length components (N, A and B), and multiplying by the “lever arm” distance to the image:

x^{'} (ω, σ, μ) = \sum_{i j} x_{i j}^{'} = \sum_{i j} {}_{i j}{x^{'}} ω^{i - 1} σ^{j} = \sum_{i j} \sum_{k} {}_{i j k}{x^{'}} μ^{k} ω^{i - 1} σ^{j}

(28)

where,

{}_{i j}{x^{'}} = - h η \{ i μ N_{i j} + {[\frac{\partial A_{i j}}{\partial (Δ ω)} + \frac{\partial B_{i j}}{\partial (Δ ω)}]}_{ω^{i - 1} σ^{j} \text{ coefficient}} \} / \sin β - \frac{({}_{i - 1, j}{x^{'}})}{R} \cot β

(29)

and

h = 1 + \frac{1}{2} μ^{2} \sin^{2} θ

is the inclination factor due to an off-plane image coordinate of

μ \sin θ

. As

h η

is thereby the distance between the optic center and the paraxial image point, this conversion is accurate to order

η ω^{7}

, as given by Born and Wolf [14]. The trailing term in Equation (29) accounts for the linear variation of 1/sin

β

with

ω

when the optical surface is not flat (

1 / R \neq 0

).

For the vertical ray aberration (summing the

ω^{i} σ^{j - 1}

terms), the inclination factor

h

enters a second time at the exit pupil (in the same manner as the 1/sin

β

factor in the horizontal equation) and a third time in projecting the transverse aberration (normal to the direction of propagation) onto the image plane. This results in a net factor of h³

≅

1 + \frac{3}{2} {(μ s i n θ)}^{2} + \frac{15}{8} {(μ s i n θ)}^{4} + \dots

, as employed below:

z^{'} (ω, σ, μ) = \sum_{i j} z_{i j}^{'} = \sum_{i j} {}_{i j}z^{'} ω^{i} σ^{j - 1} = \sum_{i j} \sum_{k} {}_{i j k}z^{'} μ^{k} ω^{i} σ^{j - 1}

(30)

where,

{}_{i j}z^{'} = h^{3} η \{ j μ N_{i j} + {[\partial A_{i j} / \partial (Δ σ) + \partial B_{i j} / \partial (Δ σ)]}_{ω^{i} σ^{j - 1} \text{ coefficient}} \}

(31)
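The coefficients 1, 3/2 and 15/8 in the series quoted for h³ are those of the binomial expansion of (1 − ε)^(−3/2) with ε = (μ sin θ)². This suggests, as an inference rather than a statement from the text, that h = 1 + ε/2 is the leading part of (1 − ε)^(−1/2). The sketch below compares the three expressions for a few hypothetical values of ε.

```python
def h_cubed_series(eps):
    """Series quoted above for h^3, with eps = (mu * sin(theta))**2."""
    return 1.0 + 1.5*eps + 1.875*eps**2

def h_cubed_closed(eps):
    """Assumed closed form (1 - eps)^(-3/2), inferred from the series coefficients."""
    return (1.0 - eps)**-1.5

def h_first_order(eps):
    """h = 1 + eps/2 as quoted after Equation (29)."""
    return 1.0 + 0.5*eps

for eps in (1e-3, 1e-2, 5e-2):           # hypothetical values of (mu sin theta)^2
    print(f"eps = {eps:.0e}: series = {h_cubed_series(eps):.8f}, "
          f"closed = {h_cubed_closed(eps):.8f}, "
          f"cube of first-order h = {h_first_order(eps)**3:.8f}")
```

For small ε the quoted series tracks the assumed closed form more closely than the cube of the first-order h, because cubing 1 + ε/2 gives 3/4 rather than 15/8 for the ε² coefficient.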

The above general equations provide the foundation for an accurate mathematical decomposition of the lateral ray aberration into power terms

x_{i j}^{'}

(horizontally) and

z_{i j}^{'}

(vertically). The improvement over the standard formulations lies in the systematic inclusion of all other relevant (i.e., able to form expansion terms of power

ω^{i - 1} σ^{j}

(horizontally) or

ω^{i} σ^{j - 1}

(vertically)) aberrations

x_{I J}^{'}

and

z_{I J}^{'}

in constructing the reference image point (

ξ_{i j}, ζ_{i j})

for calculation of the geometrical path lengths

B_{i j}

used in determining both

x_{i j}^{'}

and

z_{i j}^{'}

.

One could integrate Equations (29) and (31) to form “wavefront” (path-length) coefficients, which is simply Equations (2) and (3) in reverse, yielding

{}_{H}F_{i j} \equiv - ({}_{i j}{x^{'}} s i n β) / (i h η)

for the horizontal direction and

{}_{V}F_{i j} \equiv {}_{i j}{z^{'}} / (j h^{3} η)

for the vertical direction. Though such an exercise adds no new information, it confirms that the individual terms in the decomposition of the wavefront (and thus lateral ray positions) are mathematical abstractions. If they were physical quantities, then

{}_{H}F_{i j} ω^{i} σ^{j}

and

{}_{V}F_{i j} ω^{i} σ^{j}

would not “know” the direction in which they may be differentiated and thus would be equivalent (

{}_{H}F_{i j} \equiv {}_{V}F_{i j} \equiv F_{i j}

). However, for the present astigmatic design, the required reference point (

ξ_{i j}, ζ_{i j}

) for

x_{i j}^{'}

is different from that for

z_{i j}^{'}

at all (i, j) pairs (see Table 2). Thus,

{}_{H}F_{i j} \neq {}_{V}F_{i j}

for all i and j (when i + j

>

1), indeed with the ratios

{}_{V}F_{11} / {}_{H}F_{11}, {}_{V}F_{12} / {}_{H}F_{12} and {}_{V}F_{22} / {}_{H}F_{22}

calculated to be of significant magnitude (10² to 10³).

4. The Aberration Equations for an Astigmatic Plane Grating Monochromator

In this section, the lateral aberration terms of “power-sum”

n \overset{def}{=} (i + j - 1) = 1, 2 \text{ and } 3

are explicitly expanded for a planar (

R = \infty

) grating surface.

4.1. The Fermat Derivatives

Setting the curvature radius to infinity simplifies Equations (19), (22) and (24)–(27) to the following:

\partial A_{i j} / \partial (Δ ω) = (\cos α - x_{i j} \sin α + ω) (1 - \frac{1}{2} {}_{A}^{\infty}t_{i j} + \frac{3}{8} {}_{A}^{\infty}t_{i j}^{2} - \frac{5}{16} {}_{A}^{\infty}t_{i j}^{3} + \frac{35}{128} {}_{A}^{\infty}t_{i j}^{4})

(32)

\partial A_{i j} / \partial (Δ σ) = (σ - z_{i j}) (1 - \frac{1}{2} {}_{A}^{\infty}t_{i j} + \frac{3}{8} {}_{A}^{\infty}t_{i j}^{2} - \frac{5}{16} {}_{A}^{\infty}t_{i j}^{3} + \frac{35}{128} {}_{A}^{\infty}t_{i j}^{4})

(33)

where,

{}_{A}^{\infty}t_{i j} = 2 ω \cos α + (ω^{2} + σ^{2}) - 2 x_{i j} ω \sin α - 2 z_{i j} σ + x_{i j}^{2} + z_{i j}^{2}

(34)

\partial B_{i j} / \partial (Δ ω) = (- \cos β + \frac{ξ_{i j}}{η} \sin β + \frac{ω}{η}) (1 - \frac{1}{2} {}_{B}^{\infty}t_{i j} + \frac{3}{8} {}_{B}^{\infty}t_{i j}^{2} - \frac{5}{16} {}_{B}^{\infty}t_{i j}^{3} + \frac{35}{128} {}_{B}^{\infty}t_{i j}^{4})

(35)

\partial B_{i j} / \partial (Δ σ) = (\frac{σ}{η} - \frac{ζ_{i j}}{η}) (1 - \frac{1}{2} {}_{B}^{\infty}t_{i j} + \frac{3}{8} {}_{B}^{\infty}t_{i j}^{2} - \frac{5}{16} {}_{B}^{\infty}t_{i j}^{3} + \frac{35}{128} {}_{B}^{\infty}t_{i j}^{4})

(36)

where,

{}_{B}^{\infty}t_{i j} = - 2 (\frac{ω \cos β}{η}) + 2 (\frac{ξ_{i j} \sin β}{η}) (\frac{ω}{η}) - 2 (\frac{ζ_{i j}}{η}) (\frac{σ}{η}) + {(\frac{ξ_{i j}}{η})}^{2} + {(\frac{ζ_{i j}}{η})}^{2} + {(\frac{ω}{η})}^{2} + {(\frac{σ}{η})}^{2}

(37)

For the lateral aberrations, the expansions will use the in-plane object point

x_{i j} = z_{i j} = 0

. The resulting series from Equations (32) and (33) are the same as those of the standard light-path formulation:

\begin{matrix} \partial A_{i j} / \partial (Δ ω) = ω \sin^{2} α - \frac{1}{2} (3 ω^{2} \sin^{2} α + σ^{2}) \cos α + \frac{1}{2} (4 a_{40} ω^{3} \sin^{2} α + 2 a_{22} ω σ^{2}) \\ - \frac{1}{2} (5 a_{50} ω^{4} \sin^{2} α + 3 a_{32} ω^{2} σ^{2} - \frac{3}{4} σ^{4}) \cos α \end{matrix}

(38)

\partial A_{i j} / \partial (Δ σ) = σ - ω σ \cos α - (\frac{1}{2} σ^{3} - a_{22} ω^{2} σ) + (\frac{3}{2} ω σ^{3} - a_{32} ω^{3} σ) \cos α

(39)

a_{40} = (1 - \frac{5}{4} \sin^{2} α), \quad a_{22} = (1 - \frac{3}{2} \sin^{2} α), \quad a_{50} = (1 - \frac{7}{4} \sin^{2} α) \text{ and } a_{32} = (1 - \frac{5}{2} \sin^{2} α)

(40)

Due to the symmetry imposed by the in-plane object, there are no odd powers of

σ

in Equation (38) and no even powers of

σ

in Equation (39).
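Equations (38)–(40) can be compared with the closed-form derivatives they approximate. For an in-plane object point and a plane grating, Equations (32)–(34) reduce to (cos α + ω)/A and σ/A with A² = 1 + 2ω cos α + ω² + σ². The sketch assumes that Equation (38) lists only the pupil-dependent terms, so the constant cos α is subtracted before comparing; the angle and pupil coordinates are illustrative.

```python
import math

def series_38(omega, sigma, alpha):
    """Equation (38), with the coefficients of Equation (40)."""
    s2, c = math.sin(alpha)**2, math.cos(alpha)
    a40, a22 = 1 - 1.25*s2, 1 - 1.5*s2
    a50, a32 = 1 - 1.75*s2, 1 - 2.5*s2
    return (omega*s2
            - 0.5*(3*omega**2*s2 + sigma**2)*c
            + 0.5*(4*a40*omega**3*s2 + 2*a22*omega*sigma**2)
            - 0.5*(5*a50*omega**4*s2 + 3*a32*omega**2*sigma**2 - 0.75*sigma**4)*c)

def series_39(omega, sigma, alpha):
    """Equation (39), with the coefficients of Equation (40)."""
    s2, c = math.sin(alpha)**2, math.cos(alpha)
    a22, a32 = 1 - 1.5*s2, 1 - 2.5*s2
    return (sigma - omega*sigma*c
            - (0.5*sigma**3 - a22*omega**2*sigma)
            + (1.5*omega*sigma**3 - a32*omega**3*sigma)*c)

def closed_form(omega, sigma, alpha):
    """Untruncated derivatives for x_ij = z_ij = 0 on a plane grating."""
    A = math.sqrt(1.0 + 2*omega*math.cos(alpha) + omega**2 + sigma**2)
    return (math.cos(alpha) + omega) / A, sigma / A

alpha, omega, sigma = math.radians(10.0), 0.01, 0.01   # illustrative values
d_omega, d_sigma = closed_form(omega, sigma, alpha)
err_38 = abs(series_38(omega, sigma, alpha) - (d_omega - math.cos(alpha)))
err_39 = abs(series_39(omega, sigma, alpha) - d_sigma)
print(f"Eq. (38) residual = {err_38:.2e}, Eq. (39) residual = {err_39:.2e}")
```

Both residuals are of fifth degree in the pupil coordinates, as expected for series truncated after the fourth-degree terms.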

Expansion of Equations (35) and (36) is considerably more involved, due to the presence of an image reference point (

ξ_{i j}

,

ζ_{i j}

) whose components, as given by Equation (23), are themselves power series in both

ω

and

σ

. Equation (29) provides nine aberration terms in the horizontal direction

(x_{20}^{'}, x_{30}^{'}, x_{40}^{'}, x_{11}^{'}, x_{21}^{'}, x_{31}^{'}, x_{12}^{'}, x_{22}^{'}, x_{13}^{'})

and Equation (31) provides nine aberration terms in the vertical direction

(z_{02}^{'}, z_{03}^{'}, z_{04}^{'}, z_{11}^{'}, z_{21}^{'}, z_{31}^{'}, z_{12}^{'}, z_{22}^{'}, z_{13}^{'}

). For a given frontal aperture aspect ratio (

g \overset{def}{=} ϕ_{s} / ϕ_{m}

), the surface aperture ratio (

\overset{˘}{σ} / \overset{˘}{ω}

) scales linearly with the graze angle; the aberrations thereby progressively decrease in magnitude within the n^th-degree power-sum family along the sequence

ω^{n} \to ω^{n - 1} σ \to ω^{n - 2} σ^{2} \to \dots \to σ^{n}

. Thus, unless

ϕ_{s} ≫ ϕ_{m}

, the least-significant horizontal terms are

x_{22}^{'} and x_{13}^{'}

, and the least-significant vertical terms

(z_{13}^{'} and z_{04}^{'})

are those which contribute to the same spectral width terms

(x_{22}^{″} and x_{13}^{″}

) following the image plane rotation given in Section 6. Being exceedingly small for the example monochromator (quantitatively verified by the raytrace extraction equations given in Section 7), these four terms are not explicitly expanded here. Each of the remaining 14 aberrations employs image path-length derivatives derived from a dedicated reference image coordinate pair (

ξ_{i j}, ζ_{i j}

) summed from the other (non-(i,j)) lateral aberrations. The resulting expansion coefficients of

ω^{i - 1} σ^{j}

in Equation (35) and

ω^{i} σ^{j - 1}

in Equation (36) complete the determination of the Fermat-generated horizontal and vertical ray aberrations given by Equations (29) and (31), respectively.

4.2. The Explicit Expansion Terms of Power-Sum $\leq 3$

In accordance with the detailed manual algebraic procedures exemplified for the

{}_{21}{x^{'}} ω σ

term in Appendix B, the 14 aberration power series coefficients are given here in explicit form, following a re-listing of the three Equations (12), (9) and (11) derived in Section 2.3 for the principal ray:

μ = [\sqrt{1 + 2 (c o s β - c o s α) c o s β t a n^{2} θ} - 1] / (t a n^{2} θ c o s θ c o s β)

\frac{{z^{'}}_{01}}{η} = \frac{{}_{01}{z^{'}}}{η} = μ s i n θ + \frac{1}{2} μ^{3} s i n^{3} θ + \frac{3}{8} μ^{5} s i n^{5} θ - (1 + \frac{3}{2} μ^{2} s i n^{2} θ) z + (1 + \frac{𝛤}{2}) (μ s i n θ) z^{2}

\begin{array}{l} \frac{{x^{'}}_{10}}{η} \tan β = [(1 + μ^{2} \sin^{2} θ) μ \sin θ + (\frac{\tan β}{ρ}) \tan ψ] z - \{ \frac{μ \cos θ}{2 \cos β} + \frac{μ^{2} \sin^{2} θ}{2 \tan^{2} β} [1 + (𝛤 + 3) \tan^{2} β] \\ + \frac{μ \sin θ}{ρ \tan β} (1 + 3 \tan^{2} β) \tan ψ + \frac{1}{2} (\frac{1}{ρ^{2}} - 𝛤) \tan^{2} ψ \} z^{2} \end{array}

\frac{{}_{02}{z^{'}}}{η} = 1 + \frac{1}{η} + C_{2} μ s i n θ t a n θ + \frac{3}{2} μ^{2} s i n^{2} θ + \frac{3}{2} C_{2} μ^{3} s i n^{3} θ t a n θ + \frac{15}{8} μ^{4} s i n^{4} θ

(41)

- \frac{{}_{11}{x^{'}}}{η} s i n β = (C_{2} + c o s β) μ s i n θ + (C_{2} c o s β) μ^{2} s i n^{2} θ t a n θ + (\frac{C_{2}}{2} + c o s β) μ^{3} s i n^{3} θ

(42)

\frac{{}_{11}{z^{'}}}{η} = (C_{2} - \frac{c o s β}{η}) μ s i n θ + (\frac{3}{2} C_{2} - \frac{c o s β}{2 η}) μ^{3} s i n^{3} θ

(43)

\begin{matrix} - \frac{{}_{20}{x^{'}}}{η} s i n β = (C_{2} + \frac{1}{Q ρ^{2}} + \frac{1}{Q η}) μ c o s θ + (C_{2} c o s β) (μ^{2} s i n^{2} θ + μ^{4} s i n^{4} θ) \\ + (C_{2} + \frac{1}{Q ρ^{2}}) (\frac{1}{2} μ^{2} s i n^{2} θ + \frac{3}{8} μ^{4} s i n^{4} θ) μ c o s θ \end{matrix}

(44)

\begin{matrix} \frac{{}_{03}z'}{η} = (\frac{3}{2} + C_{3} \tan^{2} θ) μ \sin θ + [3 C_{2} + \frac{Q}{2} (C_{2} + \cos β)^{2}] μ^{2} \sin^{2} θ \tan θ \\ + [\frac{15}{4} + \frac{3}{2} (C_{2}^{2} + C_{3}) \tan^{2} θ] μ^{3} \sin^{3} θ \end{matrix}

(45)

\begin{array}{l} \frac{{}_{12}z'}{η} = - (\cos α + \frac{\cos β}{η}) + (2 C_{3} - C_{2} \frac{\cos β}{η}) μ \sin θ \tan θ \\ + [3 C_{2} - \frac{3}{2} (\cos α + \frac{\cos β}{η}) + (Q C_{2} + \frac{1}{ρ^{2}}) (C_{2} + \cos β)] μ^{2} \sin^{2} θ \end{array}

(46)

\begin{array}{l} - \frac{{}_{21}x'}{η} \sin β = \\ [2 C_{3} + (1 - \frac{1}{η} + Q C_{2} + \frac{1}{ρ^{2}}) C_{2} \cos β + (Q C_{2} + \frac{1}{ρ^{2}}) \cos^{2} β - (\cos α + \frac{\cos β}{η}) \cos β] μ \sin θ \\ + \{ C_{2} + \frac{1}{Q ρ^{2}} + [(2 C_{3} + C_{2}^{2}) \cos β + (\frac{1}{ρ^{2}} - \frac{1}{η} + 2 Q C_{2} + Q \cos β) C_{2} \cos^{2} β] \tan^{2} θ \} μ^{2} \sin θ \cos θ \\ + [C_{3} + 2 C_{2}^{2} + (\frac{7}{2} + Q C_{2} + \frac{1}{ρ^{2}}) C_{2} \cos β + \frac{3}{2} (Q C_{2} + \frac{1}{ρ^{2}}) \cos^{2} β \\ - (\cos α + \frac{\cos β}{η}) \cos β + (Q C_{2}^{2} \cos^{3} β) \tan^{2} θ] μ^{3} \sin^{3} θ \end{array}

(47)

\begin{array}{l} \frac{{}_{21}z'}{η} = (C_{3} - C_{2} \frac{\cos β}{η}) μ \sin θ + (\frac{C_{2}}{ρ^{2}} + \frac{Q C_{2}^{2}}{2} + \frac{1}{2 Q ρ^{4}}) μ^{2} \sin θ \cos θ \\ + [\frac{3}{2} (C_{3} + C_{2}^{2}) + \frac{5 \cos^{2} β + 8}{4 η^{2}} + (\frac{1}{ρ^{2}} - \frac{5}{2 η} + Q C_{2}) C_{2} \cos β] μ^{3} \sin^{3} θ \end{array}

(48)

\begin{array}{l} - \frac{{}_{30}x'}{η} \sin β = [C_{3} + (\frac{Q C_{2}}{2} + \frac{1}{ρ^{2}} - \frac{1}{η}) C_{2} \cos β + (\frac{1}{2 ρ^{2}} - \frac{1}{η}) \frac{\cos β}{Q ρ^{2}} - \frac{3 \cos α}{2 Q ρ^{2}}] μ \cos θ \\ + [(C_{3} + \frac{C_{2}^{2}}{2}) \cos β + (Q C_{2} - \frac{1}{η} + \frac{1}{ρ^{2}}) C_{2} \cos^{2} β] μ^{2} \sin^{2} θ \\ + [\frac{C_{3}}{2} + (C_{2} + \frac{1}{Q ρ^{2}}) C_{2} + (\frac{Q C_{2}}{2} + \frac{1}{ρ^{2}} - \frac{1}{2 η}) C_{2} \cos β \\ + (\frac{1}{2 ρ^{2}} - \frac{1}{2 η}) \frac{\cos β}{Q ρ^{2}} - \frac{3 \cos α}{4 Q ρ^{2}} + (\frac{Q C_{2}^{2} \cos^{3} β}{2}) \tan^{2} θ] μ^{3} \sin^{2} θ \cos θ \end{array}

(49)

\frac{{}_{31}z'}{η} = [C_{4} - \frac{\cos β}{η} C_{3} + (C_{2} + 2 \frac{\cos β}{η}) \frac{\sin^{2} β}{2 η^{2}}] μ \sin θ

(50)

\begin{array}{l} - \frac{{}_{40}x'}{η} \sin β = \{ C_{4} + (Q C_{2} + \frac{1}{ρ^{2}} - \frac{1}{η}) C_{3} \cos β \\ + [4 - \frac{1}{η ρ^{2}} + (\frac{ρ^{2}}{η^{3}} + \frac{1}{η^{2}} - \frac{5}{ρ^{2}}) \sin^{2} β + \frac{\cos^{2} β}{ρ^{4}} + 3 (\frac{1}{η} - \frac{1}{ρ^{2}}) \cos α \cos β] \frac{1}{2 Q ρ^{2}} \\ + [\frac{\sin^{2} β}{η^{2}} - (Q C_{2} + \frac{2}{ρ^{2}}) \frac{1}{η} + 3 (\frac{1}{ρ^{4}} + \frac{Q C_{2}}{ρ^{2}} + \frac{Q^{2} C_{2}^{2}}{3}) \cos^{2} β - 3 \frac{\cos α \cos β}{ρ^{2}}] \frac{C_{2}}{2} \} μ \cos θ \\ + (C_{2} + \frac{1}{Q ρ^{2}} + \frac{1}{Q η}) (\frac{Q C_{2}^{2}}{2} + \frac{C_{2}}{ρ^{2}} - \frac{1}{2 Q η^{2}} + \frac{1}{2 Q ρ^{4}}) μ^{2} \cos^{2} θ \\ + \{ (3 Q^{2} C_{2}^{2} + \frac{6 Q C_{2} - 3 𝛤}{ρ^{2}} + \frac{3}{ρ^{4}} - \frac{2}{η ρ^{2}} - \frac{2 Q C_{2}}{η}) \frac{C_{2}}{2} \cos^{3} β + (C_{2} C_{3} + C_{4}) \cos β \\ + [(2 Q C_{2} + \frac{1}{ρ^{2}} - \frac{1}{η}) C_{3} + (Q C_{2} + \frac{1}{ρ^{2}} - \frac{1}{η}) \frac{C_{2}^{2}}{2}] \cos^{2} β \} μ^{2} \sin^{2} θ \end{array}

(51)

\begin{array}{l} - \frac{{}_{12}x'}{η} \sin β = \frac{1}{2} (\cos β - \cos α) + [C_{3} + (1 + \frac{Q C_{2}}{2}) C_{2} \cos β + Q C_{2} \cos^{2} β + \frac{Q \cos^{3} β}{2}] μ \sin θ \tan θ \\ + \{ C_{2} + 2 \cos β - \frac{\cos α}{4} + [(C_{3} + \frac{C_{2}^{2}}{2}) \cos β + Q C_{2}^{2} \cos^{2} β + Q C_{2} \cos^{3} β] \tan^{2} θ \} μ^{2} \sin^{2} θ \end{array}

(52)

\begin{array}{l} \frac{{}_{22}z'}{η} = 1 + \frac{\cos α \cos β}{η} + [(\frac{2}{ρ^{2}} + \frac{1}{η} + Q C_{2}) \frac{C_{2}}{2} + (\frac{1}{ρ^{4}} + \frac{1}{η ρ^{2}} - \frac{3}{ρ^{2}} + \frac{1}{η^{2}}) \frac{1}{2 Q} \\ + (1 - \frac{1}{η^{2}} - C_{3} - C_{2} \cos α) \tan θ + (3 C_{4} - \frac{2}{η} C_{3} \cos β) \tan^{2} θ] μ \cos θ \end{array}

(53)

\begin{array}{l} - \frac{{}_{31}x'}{η} \sin β = \{ 3 C_{4} + [1 - \frac{2}{η} + \frac{2}{ρ^{2}} + Q (\cos β + 3 C_{2})] C_{3} \cos β - [1 + (\cos β + C_{2}) Q + \frac{1}{ρ^{2}}] \frac{C_{2}}{η} \\ + [\frac{3}{2 ρ^{4}} + \frac{2 + 6 (C_{2} + \cos β) Q - 3 𝛤}{2 ρ^{2}} + Q C_{2} + \frac{3}{2} Q^{2} C_{2}^{2} - (1 + Q \cos β) 𝛤] C_{2} \cos^{2} β + \frac{\cos α}{η} \\ + (\frac{3}{2 ρ^{4}} - \frac{5 𝛤}{2 ρ^{2}} + \frac{3}{2} Q^{2} C_{2}^{2}) \cos^{3} β + (1 - \frac{1}{η ρ^{2}}) \cos β + (C_{2} - 4 \cos β) \frac{\sin^{2} β}{2 η^{2}} \} μ \sin θ \end{array}

(54)

where the substitute variable

Q

\equiv

(μ \cos θ) / \sin^{2} β

is of order unity at grazing incidence. While the principal ray Equations (9) and (11) include the off-plane (

z

) coordinates of a tilted entrance slit (

x = z \tan ψ

), the effect of slit length on the aberrations is negligible at present (cf. the solid and dashed line profile raytracings in Figure 6); thus Equations (41)–(54) do not include any

z

-components.
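As a numerical aside (not part of the original derivation), the closed form for μ re-listed above can be compared with the second-order series μ cos θ ≈ ε − ½ ε² cos β tan² θ that is substituted in Section 5.2, and the substitute variable Q can be evaluated. The Python sketch below assumes that α and β are grazing angles and uses illustrative values (α = 4°, β = 2°, θ = 60°) that are not the design parameters of the example monochromator; the function names are hypothetical.

```python
import math

def mu_exact(alpha, beta, theta):
    """Dimensionless wavelength mu from the closed form re-listed above.

    Written in the algebraically identical form
        mu = 2*eps / (cos(theta) * (1 + sqrt(1 + 2*eps*cos(beta)*tan(theta)**2)))
    which avoids the 0/0 of the printed expression at theta = 0.
    """
    eps = math.cos(beta) - math.cos(alpha)
    root = math.sqrt(1.0 + 2.0 * eps * math.cos(beta) * math.tan(theta) ** 2)
    return 2.0 * eps / (math.cos(theta) * (1.0 + root))

def mu_series(alpha, beta, theta):
    """Series form: mu*cos(theta) = eps - (1/2)*eps**2*cos(beta)*tan(theta)**2."""
    eps = math.cos(beta) - math.cos(alpha)
    return (eps - 0.5 * eps ** 2 * math.cos(beta) * math.tan(theta) ** 2) / math.cos(theta)

def q_factor(mu, theta, beta):
    """Substitute variable Q = mu*cos(theta) / sin(beta)**2."""
    return mu * math.cos(theta) / math.sin(beta) ** 2

# Illustrative values only (assumed, not taken from the paper's design).
alpha = math.radians(4.0)
beta = math.radians(2.0)
theta = math.radians(60.0)

mu = mu_exact(alpha, beta, theta)
print("mu (closed form):", mu)
print("mu (series)     :", mu_series(alpha, beta, theta))
print("Q               :", q_factor(mu, theta, beta))
```

For these assumed angles Q comes out of order unity, in line with the remark above, and the closed form and the series differ only at third order in ε.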

A summary of the accuracy provided by Equations (41)–(54) is given in Table 2, using the ultra-high resolution soft X-ray monochromator (parameterized at the end of Section 2.6) as a test case. The RMS deviations between these light-path equations and independent numerical raytrace simulations (Section 7) are given in the last 5 columns. Also listed are the lateral rays, including aberrations, which form the reference image position

(ξ_{i j}, ζ_{i j})

. The top-down sequence in this table provides the terms needed for the reference image in subsequent expansions. The bracketed {terms} and corresponding {errors} in the S.F. column refer to the standard light-path formulation {paraxial reference image, including astigmatism per Section 3.1}.

**Table 2.** Reference image lateral positions and residual calculation errors.
RMS error in lateral extreme width: unit = 10⁻⁶ m at 10 m

| Term (name) | Power | $ξ_{ij}$ | $ζ_{ij}$ | S.F. | μ⁰ | to μ¹ | to μ² | to μ³ | to μ⁴ |
|---|---|---|---|---|---|---|---|---|---|
| $z'_{02}$ Astigmat. | $σ$ | 0 | $\{z'_{01}\}$ | | 350 | 1.15 | 0.020 | 0.00006 | 0.00003 |
| $x'_{11}$ Hor. Tilt | $σ$ | 0 | $\{z'_{01} + z'_{02}\}$ | | | 20 | 0.10 | 0.00118 | |
| $z'_{11}$ Ver. Tilt | $ω$ | 0 | $\{z'_{01}\}$ | | | 0.114 | N/A | 0.00004 | |
| $x'_{20}$ Defocus | $ω$ | 0 | $\{z'_{01}\} + z'_{11}$ | {93} | | 179 | 0.124 | 0.00735 | 0.000025 |
| $z'_{03}$ (0,3) | $σ^{2}$ | $x'_{11}$ | $\{z'_{01} + z'_{02}\}$ | | | | 0.0045 | 0.00003 | |
| $z'_{12}$ Ver. (1,2) | $ω σ$ | $\{x'_{20}\} + x'_{11}$ | $\{z'_{01} + z'_{02}\} + z'_{11}$ | {10.8} | 10.8 | 0.018 | 0.00009 | | |
| $x'_{21}$ Hor. (2,1) | $ω σ$ | $\{x'_{20}\} + x'_{11}$ | $\{z'_{01} + z'_{02}\} + z'_{11} + z'_{12}$ | {92} | | 0.84 | 0.0013 | 0.00022 | |
| $z'_{21}$ Ver. (2,1) | $ω^{2}$ | $\{x'_{20}\}$ | $\{z'_{01}\} + z'_{11}$ | | | 0.008 | 0.0005 | 0.00005 | |
| $x'_{30}$ Mer. Coma | $ω^{2}$ | $\{x'_{20}\}$ | $\{z'_{01}\} + z'_{11} + z'_{21}$ | {1.4} | | 1.00 | 0.007 | 0.00012 | |
| $z'_{31}$ Ver. (3,1) | $ω^{3}$ | $\{x'_{20}\} + x'_{30}$ | $\{z'_{01}\} + z'_{11} + z'_{21}$ | | | 0.0004 | | | |
| $x'_{40}$ Spherical | $ω^{3}$ | $\{x'_{20}\} + x'_{30}$ | $\{z'_{01}\} + z'_{11} + z'_{21} + z'_{31}$ | {0.6} | | 0.027 | 0.00028 | | |
| $x'_{12}$ Sag. Coma | $σ^{2}$ | $x'_{11}$ | $\{z'_{01} + z'_{02}\} + z'_{03}$ | | 0.19 | 0.018 | 0.00036 | | |
| $z'_{22}$ Ver. (2,2) | $ω^{2} σ$ | $\{x'_{20}\} + x'_{30} + x'_{11} + x'_{21}$ | $\{z'_{01} + z'_{02}\} + z'_{11} + z'_{12} + z'_{21}$ | | 0.43 | 0.016 | | | |
| $x'_{31}$ Hor. (3,1) | $ω^{2} σ$ | $\{x'_{20}\} + x'_{30} + x'_{11} + x'_{21}$ | $\{z'_{01} + z'_{02}\} + z'_{11} + z'_{12} + z'_{21} + z'_{22}$ | | | 0.019 | | | |

As listed in Table 2, the reference points for the highest-degree mixed terms (

x'_{31} and z'_{22}

) include numerous lower-degree aberrations and were therefore undertaken only for the linear sub-term in

μ

. These terms provide the worst accuracies (

\sim

0.02 microns laterally at the image plane), albeit of no practical significance at present. All the other aberration terms were expanded to include the powers in

μ

needed to decrease the lateral errors to the size of an atom (

\sim

0.001 microns) or below, ensuring they will endure any conceivable future physical application.

By comparison, the standard light-path expansion causes significant formulation errors. This may be seen by the defocus term resulting from use of only the paraxial point in forming the reference image:

- \frac{{}_{20}x'}{η} \sin β = (C_{2} + \frac{1}{Q ρ^{2}} + \frac{1}{Q η}) μ \cos θ + \frac{1}{2} μ^{2} \sin^{2} θ + \frac{1}{2} C_{2} μ^{3} \sin^{2} θ \cos θ

Due to the neglect of the

z_{11}^{'}

aberration in forming the image reference position, the

μ^{2}

sub-term in this equation differs significantly from that in the rigorous Equation (44). This results in the calculation error of

\sim

93 microns listed in Table 2. As shown, there is a similar calculation error resulting from the standard formulation of the

{x^{'}}_{21}

term:

- \frac{{}_{21}x' \sin β}{η} ≃ (2 C_{3} + \frac{3 \cos^{2} β - 1}{η}) μ \sin θ

This is due to neglecting three relevant aberrations (

z'_{11}, x'_{11} and z'_{12}

) in forming the reference image. While the above equation lists only the first sub-term (linear with

μ

), those of degree

μ^{2}

and

μ^{3}

were also generated but made little improvement due to the linear term already being flawed.

5. Manipulation and Analysis of the Ray Aberrations

5.1. Horizontal Tilt (Sagittally-Induced)

The lowest-degree mixed horizontal aberration created by the surface-normal rotation is

x_{11}^{'}

(Equation (42)) resulting from the nonzero value of

N_{11} = C_{2} \sin θ

in Equation (5). Noting the linearity of

x_{11}^{'}

with the sagittal pupil coordinate

σ

, this is simply a tilt of the astigmatic image by the angle

ψ'_{s} = - \arctan (x'_{11} / z'_{02})

about the image coordinates

(0, z'_{01}).

Employing the expansions given in Equations (41) and (42):

\tan ψ'_{s} = - \frac{(μ \sin θ) [C_{2} (1 + \frac{1}{2} μ^{2} \sin^{2} θ) + (1 + μ C_{2} \sin θ \tan θ + μ^{2} \sin^{2} θ) \cos β]}{[(1 + \frac{1}{η}) + μ C_{2} \sin θ \tan θ + \frac{3}{2} μ^{2} \sin^{2} θ + \frac{3}{2} C_{2} μ^{3} \sin^{3} θ \tan θ] \sin β}

(55)

where a negative value is counter-clockwise for an upstream observer. As in the case of the single-rotation “pure” SNR monochromator [2,3], a sympathetic rotation of the exit slit by

ψ_{s}^{'}

about

(0, z_{01}^{'})

will cancel the rotated value of

x_{11}^{″}

and thus

Δ λ_{11}

. A stationary rotation axis

(0, 0)

would require adjusting the focusing condition (not detailed here) to include the horizontally offset image center intercepting the slit.

5.2. Vertical Tilt (Meridionally-Induced) and the Rigorous Focusing Condition

Given the above astigmatic image tilt, the lowest-order mixed vertical aberration

(z_{11}^{'})

presents the interesting condition that a perfect horizontal focus for meridional rays (

x_{20}^{'} = 0

) would be stretched into a vertically extended line, projected upon the tilted slit as a defocus normal to its length (multiplied by tan

ψ_{s}^{'}

). In effect, the vertical aberration due to the meridional rays induces a second tilt angle

ψ_{m}^{'} = - arctan (x_{20}^{'} / z_{11}^{'})

. Employing Equations (43) and (44):

\begin{array}{l} \tan ψ'_{m} = - [(C_{2} + \frac{1}{Q ρ^{2}} + \frac{1}{Q η}) μ \cos θ + (C_{2} \cos β) (μ^{2} \sin^{2} θ + μ^{4} \sin^{4} θ) \\ + (C_{2} + \frac{1}{Q ρ^{2}}) (\frac{1}{2} μ^{2} \sin^{2} θ + \frac{3}{8} μ^{4} \sin^{4} θ) μ \cos θ] \\ / \{ [(C_{2} - \frac{\cos β}{η}) + (\frac{3}{2} C_{2} - \frac{\cos β}{2 η}) μ^{2} \sin^{2} θ] (μ \sin θ) \sin β \} \end{array}

(56)

The solution is to purposely “defocus” the grating horizontally to force equality of the two tilts (Equations (55) and (56)). Making the following substitutions:

Q = (μ \cos θ) / \sin^{2} β

,

μ \cos θ = [ε - \frac{1}{2} ε^{2} (\cos β) \tan^{2} θ]

and

C_{2} = 2 N_{2} \cos θ

constrains the single parameter

θ

to provide the following rigorous focusing condition:

\begin{array}{l} 0 = [(\frac{1}{η} + \frac{1}{ρ^{2}}) (1 + \frac{1}{η}) \sin^{2} β - (1 + \frac{1}{η}) ⟦ \frac{{}_{40}x' (θ_{o})}{η} \sin β ⟧ ω_{bal}^{2}] \cos^{4} θ \\ + \{ (1 + \frac{1}{η}) \cos^{2} θ + [(\frac{1}{η} + \frac{1}{ρ^{2}}) \sin^{2} β - ⟦ \frac{{}_{40}x' (θ_{o})}{η} \sin β ⟧ ω_{bal}^{2}] \sin^{2} θ \} (2 N_{2} \cos^{3} θ) ε \\ + [(\frac{3}{η} - 1) (N_{2} \cos β) \cos^{2} θ + [\frac{1}{η} + (\frac{1}{2 η} + \frac{2}{ρ^{2}} + \frac{1}{2 η ρ^{2}}) \sin^{2} β] \cos θ \\ - (\frac{1}{η} + \frac{1}{ρ^{2}}) (N_{2} \sin^{2} β \cos β) \sin^{2} θ] (\sin^{2} θ \cos θ) ε^{2} \\ + \{ 3 N_{2} \cos^{3} θ - [\frac{1}{η} + (\frac{1}{2 η} + \frac{2}{ρ^{2}} + \frac{1}{2 η ρ^{2}}) \sin^{2} β] (\cos β) \sin^{2} θ \\ + [(\frac{3}{η} + \frac{4}{ρ^{2}}) \sin^{2} β + (1 - \frac{1}{η}) \cos^{2} β] N_{2} \sin^{2} θ \cos θ \} (\sin^{2} θ) ε^{3} \end{array}

(57)

accurate to order

ε^{3}

, where

ε \equiv (\cos β - \cos α)

is a small quantity (of order

μ

) and where

ω_{bal} = 0

at present (the terms involving

{}_{40}{x^{'}}

are derived in Section 5.3). If one drops the

ε^{2}

and

ε^{3}

terms, a quadratic in cos θ emerges, whose closed-form root defines

θ \equiv θ_{o}

:

\begin{matrix} 0 = [2 (1 + \frac{1}{η}) - 2 (\frac{1}{η} + \frac{1}{ρ^{2}}) (\sin^{2} β)] (N_{2} ε) \cos^{2} θ_{o} \\ + [(\frac{1}{η} + \frac{1}{ρ^{2}}) (1 + \frac{1}{η}) \sin^{2} β] \cos θ_{o} + 2 (\frac{1}{η} + \frac{1}{ρ^{2}}) (\sin^{2} β) (N_{2} ε) \end{matrix}

(58)

Note that the linear approximation to Equations (57) or (58) has the even simpler solution

\cos θ_{oo} = - [(1 / η + 1 / ρ^{2}) \sin^{2} β] / (2 N_{2} ε)

, being a pure horizontal focus (

x_{20}^{'} = 0)

independent of the image tilt. In the small-angle approximation, this is equivalent to that previously given by Equation (16).
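The two closed-form initial guesses can be evaluated directly. The following Python sketch solves the quadratic of Equation (58) for cos θ_o and compares it with the linear solution cos θ_oo. The parameter values (α = 4°, β = 2°, η = 1, ρ = 1, N₂ = −1) are assumptions chosen only so that a real rotation angle exists; they are not the design values of the example monochromator, and the function names are hypothetical.

```python
import math

def eq58_coefficients(alpha, beta, eta, rho, n2):
    """Coefficients (qa, qb, qc) of qa*c**2 + qb*c + qc = 0 with c = cos(theta_o)."""
    eps = math.cos(beta) - math.cos(alpha)
    a_term = (1.0 / eta + 1.0 / rho ** 2) * math.sin(beta) ** 2
    qa = (2.0 * (1.0 + 1.0 / eta) - 2.0 * a_term) * n2 * eps
    qb = a_term * (1.0 + 1.0 / eta)
    qc = 2.0 * a_term * n2 * eps
    return qa, qb, qc, a_term, eps

def focus_angle_guesses(alpha, beta, eta, rho, n2):
    """Return (theta_oo, theta_o) in radians: linear and quadratic initial guesses."""
    qa, qb, qc, a_term, eps = eq58_coefficients(alpha, beta, eta, rho, n2)
    if n2 * eps == 0.0:
        raise ValueError("N2 * eps must be nonzero")
    cos_oo = -a_term / (2.0 * n2 * eps)           # linear approximation
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        raise ValueError("Equation (58) has no real root for these parameters")
    roots = [(-qb + s * math.sqrt(disc)) / (2.0 * qa) for s in (1.0, -1.0)]
    cos_o = min(roots, key=lambda r: abs(r - cos_oo))  # root nearest the linear guess
    for c in (cos_oo, cos_o):
        if not 0.0 < c <= 1.0:
            raise ValueError("no real rotation angle for these parameters")
    return math.acos(cos_oo), math.acos(cos_o)

# Illustrative values only (assumed).
alpha, beta = math.radians(4.0), math.radians(2.0)
eta, rho, n2 = 1.0, 1.0, -1.0

theta_oo, theta_o = focus_angle_guesses(alpha, beta, eta, rho, n2)
print("theta_oo [deg]:", math.degrees(theta_oo))
print("theta_o  [deg]:", math.degrees(theta_o))
```

Selecting the quadratic root closest to the linear solution is a design choice of this sketch; the second root is small and does not correspond to the focusing branch for the assumed values.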

The final solution for

θ

is obtained by numerical iteration of Equation (57) using the above “initial guess” of

θ_{o}

or

θ_{o o}

. Even at the maximum rotation angle of

θ_{oo} \sim 71.000^{\circ}

, only a small adjustment in

θ

is needed to provide the desired condition

ψ_{m}^{'} = ψ_{s}^{'}

. If using

θ_{o} \sim 71.245^{\circ}

as the initial guess, the residual is negligible (

<

10⁻¹²) after only 2 linear interpolations, refining

θ

by

\sim

0.064°.
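The refinement step can be sketched with a generic secant iteration, which is how "linear interpolation" is read here (an assumption). Because the full residual of Equation (57) is lengthy, the sketch uses the shorter Equation (58) as a stand-in residual and starts from the linear guess θ_oo; the parameter values are the same illustrative assumptions as before (α = 4°, β = 2°, η = ρ = 1, N₂ = −1) and are not those of the example monochromator.

```python
import math

def secant(f, x0, x1, tol=1e-12, max_iter=25):
    """Secant iteration: repeated linear interpolation between the last two points."""
    f0, f1 = f(x0), f(x1)
    steps = 0
    while abs(f1) > tol and steps < max_iter and f1 != f0:
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        f0, f1 = f1, f(x1)
        steps += 1
    return x1, steps

def eq58_residual(theta, alpha, beta, eta, rho, n2):
    """Right-hand side of Equation (58), used here only as a stand-in for Equation (57)."""
    eps = math.cos(beta) - math.cos(alpha)
    a_term = (1.0 / eta + 1.0 / rho ** 2) * math.sin(beta) ** 2
    c = math.cos(theta)
    return ((2.0 * (1.0 + 1.0 / eta) - 2.0 * a_term) * n2 * eps * c * c
            + a_term * (1.0 + 1.0 / eta) * c
            + 2.0 * a_term * n2 * eps)

# Illustrative values only (assumed).
alpha, beta = math.radians(4.0), math.radians(2.0)
eta, rho, n2 = 1.0, 1.0, -1.0
eps = math.cos(beta) - math.cos(alpha)
a_term = (1.0 / eta + 1.0 / rho ** 2) * math.sin(beta) ** 2

theta_oo = math.acos(-a_term / (2.0 * n2 * eps))   # linear initial guess
residual = lambda t: eq58_residual(t, alpha, beta, eta, rho, n2)
root, steps = secant(residual, theta_oo, theta_oo + math.radians(0.1))

print("start [deg]:", math.degrees(theta_oo))
print("root  [deg]:", math.degrees(root), "after", steps, "secant steps")
```

Replacing `eq58_residual` with a function that evaluates the right-hand side of Equation (57) would give the refinement described in the text.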

Equalizing the two image tilts of a point source is critical to providing fine spectral resolution at high rotation angles. In the absence of such a constraint (i.e., if the horizontal defocus alone is corrected), the net resolution of the example monochromator is calculated to degrade by a factor

\sim

6 at

θ \sim 71^{\circ}.

5.3. Balancing of Spherical Aberration and Defocus

Though typically a very small correction, a fine adjustment in

θ

may also be employed to partially balance the spherical aberration term. This is analogous to the classical technique of offsetting the detection plane from the Gaussian (

x_{20}^{'} = 0

) focus to the plane of “least confusion” which minimizes the sum of these two lateral horizontal aberrations of odd power (

x_{20}^{'} \propto ω and x_{40}^{'} \propto ω^{3})

. Unlike the defocus term, the aberrations of high power (including the

ω^{3}

term) vary only slowly with

θ

(except near their correction points). This allows the

x_{40}^{'}

term to be evaluated at the root of Equations (57) or (58) and to then be treated as a constant to be added to

x_{20}^{'}

for a refined determination of the meridionally-induced tilt angle (

ψ_{m}^{'}

) given above. For example, evaluating

x_{40}^{'}

at

θ_{o}

from Equation (58), the full focusing condition shown in Equation (57) employs a nonzero value for

ω_{bal}

and the following constant approximated by the dominant lowest-order (

μ \cos θ

) term of Equation (51):

\begin{array}{l} - (\frac{{}_{40}x' (θ_{o})}{η} \sin β) ≃ ⟦ 4 N_{4} \cos^{3} θ_{o} + [4 - \frac{1}{η ρ^{2}} + (\frac{ρ^{2}}{η^{3}} + \frac{1}{η^{2}} - \frac{5}{ρ^{2}}) \sin^{2} β + (\frac{1}{ρ^{4}} - \frac{3 𝛤}{ρ^{2}} + \frac{3 𝛤}{η}) \cos^{2} β] \frac{1}{1 - ρ^{2}} \\ + [(\frac{1}{ρ^{2}} - 1) N_{2} \cos θ_{o} + \frac{1}{ρ^{2}} - \frac{1}{η}] (2 N_{3} \cos β) \cos^{2} θ_{o} + \{ \frac{\sin^{2} β}{η^{2}} + \frac{2}{η ρ^{2}} - (\frac{1}{η ρ^{2}} - \frac{1}{η}) N_{2} \cos θ_{o} \\ + [\frac{3}{ρ^{4}} (N_{2} (1 - ρ^{2}) \cos θ_{o} + 1 - ρ^{2} 𝛤) + (\frac{1}{ρ^{2}} - 1)^{2} N_{2}^{2} \cos^{2} θ_{o}] \cos^{2} β \} N_{2} \cos θ_{o} ⟧ ε \end{array}

(59)

An optimized balancing of defocus and spherical aberration employs

ω_{bal} \approx 2 / 3

of the semi-meridional aperture, resulting in the root of

θ

changing by

\sim

−0.0067° at the scan wavelength of

λ / λ_{o} = 6.58

, and the extremum convolution of all the aberrations decreasing by

\sim

30% (cf. Figure 6g,h). More substantial improvements are expected at larger apertures, where spherical aberration is increasingly dominant over the lower-power aberrations.

5.4. Rotation of the Entrance Slit

Given an entrance slit at angle

ψ

to the vertical, its image tilt

{ψ^{'}}_{ψ}

derives simply from Equations (9) and (11):

\begin{array}{l} \tan ψ'_{ψ} \equiv [\frac{x'_{10} (z)}{η} - \frac{x'_{10} (- z)}{η}] / [\frac{z'_{01} (z)}{η} - \frac{z'_{01} (- z)}{η}] \\ ≃ - [\frac{\tan ψ}{ρ} + \frac{μ \sin θ}{\tan β} (1 + μ^{2} \sin^{2} θ)] / (1 + \frac{3}{2} μ^{2} \sin^{2} θ) \end{array}

(60)

To maintain focus along the slit length, this must be set equal to the image tilt for a point source from Equation (55). Thus,

{ψ^{'}}_{ψ} = {ψ^{'}}_{s} = {ψ^{'}}_{m} \equiv ψ^{'}

, constraining

ψ

as a function of the scan parameters:

\tan ψ ≃ \frac{2 N_{2} \cos θ - \frac{\cos β}{η} + (4 N_{2} \cos θ - \frac{\cos β}{η}) μ^{2} \sin^{2} θ}{(1 + \frac{1}{η} + 2 μ N_{2} \sin^{2} θ + \frac{3}{2} μ^{2} \sin^{2} θ) \sin β} ρ μ \sin θ

(61)
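Equation (61) can be cross-checked numerically by equating the approximate form of Equation (60) with the point-source tilt of Equation (55), solving for tan ψ, and comparing with the closed form. The Python sketch below does this for illustrative parameter values (α = 4°, β = 2°, θ = 48.27°, η = ρ = 1, N₂ = −1), which are assumptions and not the design values of the example monochromator; the function names are hypothetical.

```python
import math

def mu_exact(alpha, beta, theta):
    """Dimensionless wavelength from Equation (12), in a cancellation-free form."""
    eps = math.cos(beta) - math.cos(alpha)
    root = math.sqrt(1.0 + 2.0 * eps * math.cos(beta) * math.tan(theta) ** 2)
    return 2.0 * eps / (math.cos(theta) * (1.0 + root))

def tan_psi_eq61(mu, theta, beta, eta, rho, n2):
    """Entrance slit tilt from the closed form of Equation (61)."""
    u2 = (mu * math.sin(theta)) ** 2
    cb_eta = math.cos(beta) / eta
    num = 2.0 * n2 * math.cos(theta) - cb_eta + (4.0 * n2 * math.cos(theta) - cb_eta) * u2
    den = (1.0 + 1.0 / eta + 2.0 * mu * n2 * math.sin(theta) ** 2 + 1.5 * u2) * math.sin(beta)
    return num / den * rho * mu * math.sin(theta)

def tan_psi_matched(mu, theta, beta, eta, rho, n2):
    """Solve Equation (60) = Equation (55) for tan(psi), with C2 = 2*N2*cos(theta)."""
    u = mu * math.sin(theta)
    c2 = 2.0 * n2 * math.cos(theta)
    t = mu * c2 * math.sin(theta) * math.tan(theta)
    num55 = u * (c2 * (1.0 + 0.5 * u * u) + (1.0 + t + u * u) * math.cos(beta))
    den55 = (1.0 + 1.0 / eta + t + 1.5 * u * u
             + 1.5 * c2 * u ** 3 * math.tan(theta)) * math.sin(beta)
    tan_s = -num55 / den55                         # Equation (55)
    # Equation (60): tan_s = -[tan(psi)/rho + (u/tan(beta))*(1 + u**2)] / (1 + 1.5*u**2)
    return rho * (-tan_s * (1.0 + 1.5 * u * u) - (u / math.tan(beta)) * (1.0 + u * u))

# Illustrative values only (assumed).
alpha, beta, theta = math.radians(4.0), math.radians(2.0), math.radians(48.27)
eta, rho, n2 = 1.0, 1.0, -1.0
mu = mu_exact(alpha, beta, theta)

closed = tan_psi_eq61(mu, theta, beta, eta, rho, n2)
matched = tan_psi_matched(mu, theta, beta, eta, rho, n2)
print("tan(psi), Equation (61)      :", closed)
print("tan(psi), (60) matched to (55):", matched)
```

For small μ the two values should differ only by terms of relative order μ³, which is the order at which Equation (61) is truncated.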

The required entrance and exit slit rotation angles,

ψ

and

ψ^{'}

, are plotted vs. scan wavelength in Figure 4. While these rotation values are comparable in magnitude, they are opposite in direction. This is due to the present treatment of the entrance slit as being an isotropic emitter (e.g., back-illuminated by a diffuse source), thus each point along the length of the slit has its own principal ray whose vertical coordinate reverses sign at the grating pole. However, if the entrance slit were illuminated by a distant horizontal (vertically-narrow) source, there would be a mapping between the z-values of points along the entrance slit and the sagittal pupil coordinate

σ

. This may be treated by a nonzero value of

z_{02}

, as indicated in Section 3.2 and provided by the general expansion equations. The result would be no sign reversal of the required entrance slit rotation angle, opening the possibility for the entrance and exit slit rotations to be replaced by a third rotation of the grating (about its

ω

-axis), similar to the technique first employed in a SNR monochromator in 1993 as given in Section 4.2d of the cited thesis [15].

Figure 4. The grating scan angles are

δ

(about the groove axis) and

θ

(about the surface normal axis). The entrance slit tilt angle (

ψ

) is slightly smaller in magnitude and opposite in sign (negated here for plotting convenience) than the exit slit tilt angle (

ψ^{'}

).

The sequence of analytical calculations that determines the scan operating parameters is now specified:

(1) The grating first rotation angle

δ

, which sets

α = γ + δ

and

β = γ - δ

;

(2) The grating second nominal rotation angle

θ_{o}

from Equation (58), using above

α and β

;

(3) Exact

θ

obtained numerically from Equation (57), with or without balance of spherical aberration (Equation (59));

(4) The dimensionless wavelength

μ

from Equation (10) or (12), using

α, β and θ

;

(5) The exit slit tilt angle

ψ'

from Equation (56), using above

θ and μ

; and

(6) The entrance slit tilt angle

ψ

from Equation (61), using above

θ and μ

.

Using

α, β, θ and μ

as specified above (and an object position

z

), the principal ray terms and aberrations are then determined from Equations (9), (11) and (41)–(54), respectively. Transformation of the horizontal (

x^{'}

) and vertical (

z^{'}

) lateral positions to spectral resolution

Δ λ / λ

and exit slit height

Δ z^{″}

requires a final rotational transformation at the image plane, as given in Section 6.

5.5. Image Curvature

Spectral curvature along the image length is given by

[x'']_{\text{curv}} = [x''_{10}]_{z^{2}\ \text{term}} + [{}_{12}x'']_{σ^{2}\ \text{term}} + [{}_{11}x'']_{z σ\ \text{term}}

(62)

which is the sum of three components (the paraxial position and two aberrations). The first term is the paraxial image curvature of a straight entrance slit, resulting in a deviation from a straight exit slit (albeit rotated in accordance with Equations (55), (56) or (60)) obtained by the rotational transformation of Equation (67) on the

z^{2}

terms of the horizontal (

x'_{10}

) and vertical (

z'_{01}

) positions (Equations (11) and (9), respectively):

\begin{matrix} [x''_{10}]_{z^{2}\ \text{term}} = - η ⟦ \{ \frac{μ \cos θ}{2 \cos β} + \frac{μ^{2} \sin^{2} θ}{2 \tan^{2} β} [1 + (𝛤 + 3) \tan^{2} β] + \frac{μ \sin θ}{ρ \tan β} (1 + 3 \tan^{2} β) \tan ψ \\ + \frac{1}{2} (\frac{1}{ρ^{2}} - 𝛤) \tan^{2} ψ \} \frac{\cos ψ'}{\tan β} + (1 + \frac{𝛤}{2}) (μ \sin θ) \sin ψ' ⟧ z^{2} \end{matrix}

(63)

This component of the image curvature may be eliminated only by curving the entrance (not exit) slit. However, such correction is both difficult (due to its dependence on

θ

) and unnecessary; for the example monochromator parameters, the uncorrected curvature at the ends of the 74 mm long image of the entrance slit is only

+ 1.05

microns at

θ = 0

and decreases to

- 0.21

microns at

θ = 71.245^{\circ}

. This is confirmed (within 0.1 microns) by the numerical raytracings (1D profile of Figure 6i), revealing that the entrance slit length causes only a slight asymmetry and horizontal shift in the spectral line.

Horizontal curvature of the (vertical) astigmatism from a point source is given by

x'_{12}

(Equation (52)), often referred to as “sagittal coma” or “astigmatic coma”. This term, together with

z'_{03}

from Equation (45) determines the curvature in the spectral direction (

x''_{12}

) using Equation (67):

\begin{array}{l} [{}_{12}x'']_{σ^{2}\ \text{term}} = \\ ⟦ (- η \frac{\cos ψ'}{\sin β}) \{ \frac{1}{2} μ \cos θ + [C_{3} + (1 + \frac{Q C_{2}}{2}) C_{2} \cos β + Q C_{2} \cos^{2} β + \frac{Q \cos^{3} β}{2}] μ \sin θ \tan θ \} \\ + η (\sin ψ') [(\frac{3}{2} + C_{3} \tan^{2} θ) μ \sin θ] ⟧ σ^{2} \end{array}

(64)

In spatial units, this resolution corresponds to 0.36 μm at the minimum scan wavelength, zero at the (passive) correction wavelength (

λ / λ_{o} \sim 1.48

) and 1.16 μm at the maximum wavelength, confirmed by the numerical raytracings within 0.07 μm. As given in Figure 5, these correspond to spectral resolutions (

Δ λ / λ

) of 2 × 10⁻⁶, zero and 3 × 10⁻⁶, respectively.

It is noted that there are no (i,1) aberrations when

θ =

0, where the straight grooves are perpendicular to the meridional plane. In this case, the dominant mixed aberration is astigmatic coma (

x_{12}^{″}

), for which the residual nonzero component of Equation (64) yields simply

Δ λ_{12} / λ = ϕ_{s}^{2} / 8

, independent of

η

. This is the same result previously reported [8,15] for a plane grating in a converging (stigmatic) beam where

η \approx - 1.

While of importance and of historical significance in that original VLS application to fast XUV telescope beams (

ϕ_{s}^{} \sim

0.1), this aberration is negligible for most soft X-ray laboratory applications (e.g.,

Δ λ_{12} / λ =

2 × 10⁻⁶ for

ϕ_{s}^{} =

0.004). It is also noted that the passive correction point seen in Figure 5 may be obtained by zeroing the coefficient of the

μ

term in Equation (52), yielding

θ_{12} ≃

36°, though this aberration remains small across the scan range.

When expanded to include an off-plane (

z \neq 0

) object point, the spectral aberration of sagittally-induced image tilt (

x''_{11}

), otherwise zero by proper rotation of the exit slit (Section 5.1), also has a small curvature component. However, because nonzero values of z are not included in the expansion of the aberration terms (i + j

> 1

) at present, the required horizontal and vertical component terms are not given in Equations (42) and (41), respectively.

5.6. The Horizontal Mixed Aberrations (2,1) and (3,1)

As shown in Figure 5, the dominant spectral aberration at long wavelengths is due to the (2,1) mixed term, which vanishes at one scan wavelength. A good approximation to this “passive correction” point may be obtained by zeroing the coefficient of the linear term in

μ

from Equation (47):

2 C_{3} + (1 - \frac{1}{η} + Q C_{2} + \frac{1}{ρ^{2}}) C_{2} \cos β + (Q C_{2} + \frac{1}{ρ^{2}}) \cos^{2} β - (\cos α + \frac{\cos β}{η}) \cos β = 0

(65)

Simplifying this expression by treating

α

and

β

as small angles yields:

\cos θ_{21} ≃ [1 / (η \cos θ_{21}) - (1 - 1 / η) N_{2}] / N_{3}

(66)

which occurs at

θ_{21} ≃

47.8° (

λ / λ_{o} \sim 2.08

) for the example monochromator. However, this aberration is dominant at larger rotation angles, reaching a full-width comparable to that of the pure meridional (3,0) aberration at the long-wavelength end of the scan range (

θ =

71.25°). As the magnitude of this mixed aberration scales with ωσ, it is highest at the corners of the solid aperture. An elliptically-shaped illumination would thereby halve the aberration with only a 22% intensity loss. However, for clarity the present analysis employs a simple rectangular aperture. The horizontal (2,1) ray aberration was also required to obtain the non-paraxial reference image position in the expansions of the (4,0) and (3,1) horizontal aberrations. As this is the dominant mixed aberration in the horizontal direction, the coefficients to degree

μ^{3}

were expanded so as to maintain accurate results even if

μ

increases (e.g., due to use of a higher graze angle).
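Two statements in this paragraph lend themselves to a quick numerical check. Multiplying Equation (66) through by N₃ cos θ₂₁ turns it into the quadratic N₃ c² + (1 − 1/η) N₂ c − 1/η = 0 in c = cos θ₂₁, which can be solved in closed form. Separately, the effect of an elliptical aperture follows from the peak of |ωσ| on the inscribed ellipse relative to the rectangle corners, together with the area ratio π/4. The Python sketch below illustrates both; the values η = 1.2, N₂ = −1 and N₃ = 2 are assumptions for illustration only, and the function names are hypothetical.

```python
import math

def passive_correction_angles(eta, n2, n3):
    """Real angles theta_21 (radians) solving Equation (66) as a quadratic in cos(theta_21)."""
    if n3 == 0.0:
        raise ValueError("n3 must be nonzero")
    b = (1.0 - 1.0 / eta) * n2
    disc = b * b + 4.0 * n3 / eta          # b**2 - 4*n3*(-1/eta)
    if disc < 0.0:
        return []
    roots = [(-b + s * math.sqrt(disc)) / (2.0 * n3) for s in (1.0, -1.0)]
    return [math.acos(c) for c in roots if 0.0 < c <= 1.0]

def ellipse_vs_rectangle(samples=3600):
    """Peak |omega*sigma| on the inscribed ellipse (rectangle corner = 1) and the area loss."""
    peak = max(abs(math.cos(2.0 * math.pi * k / samples) * math.sin(2.0 * math.pi * k / samples))
               for k in range(samples))
    area_loss = 1.0 - math.pi / 4.0
    return peak, area_loss

# Illustrative values only (assumed).
angles = passive_correction_angles(eta=1.2, n2=-1.0, n3=2.0)
print("theta_21 [deg]:", [math.degrees(a) for a in angles])

peak, loss = ellipse_vs_rectangle()
print("peak |omega*sigma| relative to corners:", peak)
print("fractional area (intensity) loss     :", loss)
```

The aperture check gives a factor of one half for the (2,1) extremum and an area loss of 1 − π/4 (about 21.5%), in line with the halving and the 22% intensity loss quoted above, assuming uniform illumination.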

As will be clear from the spectral resolution plot (Figure 5) and the raytrace diagram (Figure 6), the higher-degree (3,1) aberration causes only a small distortion to the above (2,1) aberration. Therefore, its laborious Fermat expansion (requiring nine lateral aberrations to compose the proper reference image, as listed in Table 2) was performed only for the linear component in

μ

and is given by Equation (54).

5.7. Minor Vertical Aberrations (1,2), (2,1), (3,1) and (2,2)

Given that the horizontal (1,2) aberration is very small, the vertical (1,2) aberration is surprisingly large (11 microns for the example design). This result would not be obtained from the standard light-path formulation, even given the aforementioned “astigmatism exception”. The aberration is correctly determined here by the inclusion of two additional non-paraxial reference points, namely

{}_{11}x^{'} and {}_{11}z^{'}

, as listed in Table 2. Though not being of practical significance in itself for a (highly) astigmatic monochromator, the image plane rotation of Equation (67) transforms this vertical aberration into a non-negligible (

\sim

1 micron) component of the (2,1) image width in the spectral direction. The vertical (1,2) aberration is also required as a non-paraxial reference for the correct expansion of the (dominant) horizontal (2,1) aberration.

Similarly, the vertical (2,1), (3,1) and (2,2) aberrations are of no importance in themselves, as they are comparatively negligible additions to the astigmatism term. However, their explicit expansions have been given in Section 4 to provide the comprehensive set of vertical reference image coordinates required for the expansion of the horizontal (3,0), (4,0) and (3,1) aberrations, respectively. The image plane rotation of Equations (67) and (68) also transforms the (2,1), (3,1) and (2,2) vertical aberrations into non-negligible components of the (3,0), (4,0) and (3,1) spectral aberrations.

6. Spectral Resolution

The spectral resolution equals the grating dispersion times the ray aberration normal to the slit length. The latter is determined by a rotational transformation of the image plane coordinates (

x^{'}, z'

) to those of the

ψ^{'}

-tilted image plane (

x^{″}, z ″

). The power series is carried over to the

x^{″}

coordinate by linearly combining the vertical and horizontal coefficients in accordance with:

{}_{ij}x'' = {}_{ij}x' \cos ψ' - {}_{i - 1, j + 1}z' \sin ψ'

(67)

Division by the linear dispersion per fractional wavelength yields the corresponding wavelength shift coefficients

\frac{{}_{ij}Δ λ}{λ} = \frac{{}_{ij}x'' \sin β}{η μ \cos θ \cos ψ'} = \frac{\sin β}{η μ \cos θ} ({}_{ij}x' - {}_{i - 1, j + 1}z' \tan ψ')

(68)

where tan

ψ'

is given by Equation (56). The wavelength variation modulus (

Δ

) of each term over the full rectangular grating aperture (

ω = \pm \breve{ω}, σ = \pm \breve{σ}

) is its “extremum” (full width) aberration:

Δ λ_{ij} / λ = p_{ij} | {}_{ij}Δ λ / λ | \breve{ω}^{i - 1} \breve{σ}^{j} = p_{ij} | {}_{ij}Δ λ / λ | (ϕ_{m} / \sin α)^{i - 1} ϕ_{s}^{j}

(69)

where

p_{i j} = 2^{1 - (i + j)}

if i is odd and j is even; otherwise

p_{i j} = 2^{2 - (i + j)}

; and in which the full angular frontal apertures are

ϕ_{m} = 2 \breve{ω} \sin α

(meridionally) and

ϕ_{s} = 2 \breve{σ}

(sagittally). Using Equations (67)–(69) with the constituent horizontal (

x^{'}

) and vertical (

z^{'}

) ray aberrations given by the light-path Equations (41)–(54), Figure 5 plots each of the nonzero spectral aberration terms

Δ λ_{i j} / λ

. In the absence of spherical aberration balancing (Section 5.3),

Δ λ_{20}

and

Δ λ_{11}

are each zero relative to the rotated exit slit normal, as devised in Section 5.4 and Section 5.5. The

N_{i}

and

η

corrections for (3,0) and (4,0) specified in Section 2 are verified by Figure 5 to indeed minimize their peak magnitudes over the intended scan range of three octaves.
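The prefactor p_ij of Equation (69) can be checked against a brute-force peak-to-valley search over the rectangular aperture. The Python sketch below uses the form of Equation (69) written in terms of the full angular apertures ϕ_m and ϕ_s. The aperture values follow the 2 × 4 mrad example, whereas the incidence angle α = 4° and the unit coefficients are assumptions for illustration; the function names are hypothetical.

```python
import math

def p_factor(i, j):
    """Prefactor p_ij of Equation (69)."""
    if i % 2 == 1 and j % 2 == 0:
        return 2.0 ** (1 - (i + j))
    return 2.0 ** (2 - (i + j))

def extremum_width(coeff, i, j, phi_m, phi_s, alpha):
    """Full-width aberration from the full angular apertures phi_m and phi_s."""
    return p_factor(i, j) * abs(coeff) * (phi_m / math.sin(alpha)) ** (i - 1) * phi_s ** j

def brute_force_width(coeff, i, j, phi_m, phi_s, alpha, n=201):
    """Peak-to-valley of coeff * omega**(i-1) * sigma**j over the rectangular aperture."""
    w_half = phi_m / (2.0 * math.sin(alpha))   # half-width in omega
    s_half = phi_s / 2.0                       # half-width in sigma
    grid = [2.0 * k / (n - 1) - 1.0 for k in range(n)]  # -1 ... +1, includes 0
    values = [coeff * (w_half * a) ** (i - 1) * (s_half * b) ** j
              for a in grid for b in grid]
    return max(values) - min(values)

# Apertures from the example design; alpha is an assumed illustrative value.
phi_m, phi_s, alpha = 0.002, 0.004, math.radians(4.0)

for (i, j) in [(2, 0), (1, 2), (3, 0), (2, 1), (4, 0), (3, 1)]:
    print((i, j), extremum_width(1.0, i, j, phi_m, phi_s, alpha),
          brute_force_width(1.0, i, j, phi_m, phi_s, alpha))

# Astigmatic coma at theta = 0 with coefficient 1/2 reproduces phi_s**2 / 8.
print(extremum_width(0.5, 1, 2, phi_m, phi_s, alpha))
```

With a coefficient of 1/2 for the (1,2) term, the last line reproduces Δλ₁₂/λ = ϕ_s²/8, i.e., 2 × 10⁻⁶ for ϕ_s = 0.004, as quoted in Section 5.5.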

Figure 5. Geometrical spectral aberrations of a soft X-ray single-element plane grating monochromator (SEPGM) at the Gaussian image plane. The grazing angular deviation is 6° and the incident aperture is 2 mrad horizontally × 4 mrad vertically. The meridional and sagittal aperture dependences of the individual terms are given in the legend, and plotted as extrema (peak-to-valley) over the full grating aperture. The colored curves result from rigorous expansion and Fermat differentiation of light-path functions, while the open circles are algebraic extractions from numerical raytracings at 29 wavelengths. Discrepancies between these two independent methods of analysis are negligible (~ 10⁻¹⁰ in

Δ λ / λ

), being four orders of magnitude smaller than the physical diffraction width at a wavelength of 1 nm. The black curve is a conservative index of the net RMS geometrical resolution, summed from the individual terms (see the text).

An easily calculable index of the net (measurable) spectral resolution is the root-mean-square (RMS) of the wavelength deviations (relative to that diffracted from the grating pole):

\frac{Δ λ_{RMS}}{λ} \overset{def}{=} {\left\{ \frac{1}{4 \overset{˘}{ω} \overset{˘}{σ}} \iint {[\sum_{i, j} ({}_{i j}Δ λ / λ) ω^{i - 1} σ^{j}]}^{2} d ω d σ \right\}}^{1 / 2}

(70)

= \emptyset {\begin{matrix} [\frac{1}{40} ℱ_{20} ℱ_{40} \emptyset^{2} + \frac{1}{12} ℱ_{20}^{2} + \frac{1}{448} ℱ_{40}^{2} \emptyset^{4}] + [\frac{1}{72} ℱ_{30} ℱ_{12} g^{2} + \frac{1}{80} (ℱ_{30}^{2} + ℱ_{12}^{2} g^{4})] \emptyset^{2} \\ + \frac{1}{144} ℱ_{21}^{2} g^{2} \emptyset^{2} + \frac{1}{960} ℱ_{31}^{2} g^{2} \emptyset^{4} \end{matrix}}^{1 / 2}

(71)

where

\emptyset \overset{def}{=} \emptyset_{m}

,

g \overset{def}{=} \emptyset_{s} / \emptyset_{m}

,

ℱ_{i j} \overset{def}{=} ({}_{i j}Δ λ / λ) / \sin^{i - 1} α

and where Equation (71) includes all terms up to a power-sum of 3, except for the insignificant terms of

({}_{22}Δ λ / λ) ω σ^{2}

and

({}_{13}Δ λ / λ) σ^{3}

. This RMS value is also plotted in Figure 5. The surface integral in Equation (70) corresponds to rays distributed uniformly on the grating surface, which differs somewhat from a uniform angular distribution originating at the object (source) point. It is also noted that the presence of asymmetrical aberrations, particularly the dominant (3,0) term, results in a nonzero mean value for the deviation. This offset causes Equation (71) to calculate somewhat larger values than a true “RMS width”, where the deviations would be calculated relative to the mean.
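The numerical factors in Equation (71) can be cross-checked independently. The following Python sketch is not part of the original analysis: it assumes a uniformly filled rectangular aperture with half-widths ω̆ = ϕ_m / (2 sin α) and σ̆ = ϕ_s / 2 (the latter is an assumption chosen to be consistent with the quoted factors), and the helper names `moment` and `rms_coefficient` are illustrative only. Under those assumptions, each factor reduces to a product of moments of a uniform distribution:

```python
from fractions import Fraction

def moment(n):
    """Mean of u**n for u uniform on [-1, 1]; zero for odd n."""
    return Fraction(0) if n % 2 else Fraction(1, n + 1)

def rms_coefficient(term_a, term_b):
    """Numeric factor multiplying F_a * F_b inside the squared RMS sum.

    A term (i, j) varies as omega**(i-1) * sigma**j. With the assumed
    half-widths, the sin(alpha) factors cancel against the definition of F_ij,
    leaving only powers of phi_m/2 and phi_s/2 times uniform moments.
    """
    (ia, ja), (ib, jb) = term_a, term_b
    p = (ia - 1) + (ib - 1)   # total power of omega in the product
    q = ja + jb               # total power of sigma in the product
    coeff = moment(p) * moment(q) / Fraction(2) ** (p + q)
    # A cross product appears twice when the sum is squared.
    return coeff if term_a == term_b else 2 * coeff

for pair in [((2, 0), (2, 0)), ((4, 0), (4, 0)), ((2, 0), (4, 0)),
             ((3, 0), (3, 0)), ((1, 2), (1, 2)), ((3, 0), (1, 2)),
             ((2, 1), (2, 1)), ((3, 1), (3, 1))]:
    print(pair, rms_coefficient(*pair))
```

Products with an odd total power of ω or σ average to zero over the symmetric aperture, which would explain why only the (2,0)(4,0) and (3,0)(1,2) cross products survive in Equation (71).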

The first [bracketed sum] in Equation (71) contains a cross-product of the

ℱ

₂₀ (defocus) and

ℱ

₄₀ (spherical aberration) coefficients; by departing from the (abstract) condition of being “in-focus” (

ℱ

₂₀

=

0), this product can be made negative and thus partially balance the positive

ℱ_{20}^{2}

and

ℱ_{40}^{2}

terms (as accomplished by the focus adjustment derived in Section 5.3). However, such (partial) balancing of these two different terms is aperture-dependent. Similarly, the second [bracketed sum] shows that combined coma may be made smaller than the sum of the positive meridionally-induced and positive sagittally-induced components. While exploited in normal-incidence optics (having

\sim

a 3:1 ratio between these components), grazing angles result in the much larger (

>

10:1) ratios evident in Figure 5, enabling little such coma balancing. The maximum value of sagittally-induced coma (1,2) is 2.9 × 10⁻⁶ at

θ

_max

=

71.25° (being only 1.5 times its in-plane value at

θ =

0) and is small compared to either the (25 times larger) meridionally-induced coma (3,0) or the (50 times larger) dominant aberration (2,1).

More generally, given a design optimized in resolution over a scan range of

q

octaves, and given a ratio of

γ / ϕ

between

\sim

10 and 100, meridional coma dominates the spectral aberration (provided the aperture aspect ratio

g ≲ 2

); fitting to several such sets of parameters yields the following approximate relation for the nominal resolving power (

ℛ

) over the designed scan range:

ℛ \overset{def}{=} 1 / 〈 Δ λ_{RMS} / λ 〉 \sim {(7 γ / ϕ)}^{2} 2^{3 - q}

(72)

This simple result shows a resolving power which scales quadratically with the graze angle and inversely with both the solid angle of acceptance and the spectral range. For the ultra-high resolution design plotted in Figure 5,

γ / ϕ =

25 and

q =

3, for which Equation (72) estimates

ℛ \sim 30, 000

(in agreement with the value of 29,000 obtained by averaging Equation (71) over the scan range). As specified in Section 5.3, optimization of the resolution for higher apertures (e.g., γ/ϕ ~ 8) requires some “least-confusion” defocusing to balance the dominant spherical aberration term. At yet higher apertures, several limitations would emerge: (1) spherical aberration would finally dominate and thus invalidate Equation (72); (2) more than a 50% line-space variation (Section 8.4) would be required; and (3) the large (

>

12%) variation in graze angle in the meridional direction would compromise the average reflectivity.
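As a quick numerical check of Equation (72), the short Python sketch below evaluates the fitted relation for the two aperture ratios mentioned in the text. The function name is hypothetical, and the relation is only the approximate fit stated above, not an exact result.

```python
def nominal_resolving_power(gamma_over_phi, octaves):
    """Approximate resolving power from Equation (72): (7*gamma/phi)**2 * 2**(3 - q)."""
    return (7.0 * gamma_over_phi) ** 2 * 2.0 ** (3 - octaves)

# Ultra-high resolution example of Figure 5: gamma/phi = 25, q = 3 octaves.
print(nominal_resolving_power(25, 3))
# Higher-aperture case: gamma/phi ~ 8, same scan range.
print(nominal_resolving_power(8, 3))
```

The first case evaluates to about 30,600, in line with the estimate of ~30,000 quoted above; each additional octave of scan range halves the estimate.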

In the direction along the slit length, the rotational transformation of the image plane yields

{}_{i j}z ″ = ({}_{i j}z' - {}_{01}z') \cos ψ^{'} + {}_{i + 1, j - 1}x' \sin ψ^{'}

(73)

resulting in the following full-width aberrations in the direction parallel to the slit and relative to its center (

z^{″} = 0

):

Δ z_{i j}^{″} = p_{j i} z_{i j}^{″} (\overset{˘}{ω}, \overset{˘}{σ}) = p_{j i} {}_{i j}z ″ {\overset{˘}{ω}}^{i} {\overset{˘}{σ}}^{j - 1}

(74)

7. Numerical Raytrace Simulations

Independent confirmation of the light-path equations derived in Section 2, Section 3, Section 4, Section 5 and Section 6 is obtained here by three-dimensional numerical raytraces, using the commercial code “BEAM4” developed by M. Lampton [16]. An angular deviation of

2 γ =

6° is chosen, as it provides a single-bounce gold reflectance of

\sim

50% at

λ_{o} =

1 nm in the soft X-ray. To display the optical aberrations, the raytraced source is a point at the entrance slit. However, if 5 micron wide slits are to contribute a dispersive component (Section 8.5) equal to the nominal optical resolving power of

ℛ =

25,000, the object distance must be r

=

3000 mm. This scale converts the design parameters (listed in Section 2.6) to the following dimensional values:

(Image distance)	$r^{'} =$ 11,100 mm
(Line density at the pole)	$M_{1} = 1 / d_{o} =$ 1000 mm⁻¹
(VLS ruling coefficients)	2 $M_{2} =$ 0.955315252 mm⁻²
	3 $M_{3} =$ −2.792863 × 10⁻⁴ mm⁻³
	4 $M_{4} =$ 1.906 × 10⁻⁷ mm⁻⁴

From Equation (4), one may easily determine that the required accuracy for

k M_{k}

is

M_{1} / ℛ / {(2 r \overset{˘}{ω})}^{k - 1},

thus (1000/25000)/(214 mm)

\sim

0.0002 mm⁻² for 2

M_{2}

. This translates to a groove positioning error of

\sim

0.0005 mm (1/2 groove width). This tolerance is 3 orders of magnitude less stringent than demonstrated (0.1

\sim

1 nm) by existing technologies in the fabrication of low scatter gratings [4,17]. To accept the 2 mrad (

ϕ_{m}

) × 4 mrad (

ϕ_{s}

) frontal aperture at every angular orientation (

δ, θ

) across the wavelength scan, the grating must have a physical aperture of 214 mm in diameter. At each value of

ρ

, only the light-path equations given in Section 2, Section 3, Section 4 and Section 5 were used to determine the fixed (“cold”) inputs (

α, θ, ψ and M_{k}

) for raytrace simulations run at 29 sample wavelengths across the three-octave scan range. No numerical optimization was performed by the raytrace routine (e.g., no “auto-focus” used).
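The ruling-accuracy estimate quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical, and the full ruled width of 214 mm is substituted for 2rω̆, as the text does for the k = 2 case.

```python
def ruling_tolerance(k, line_density=1000.0, resolving_power=25000.0,
                     ruled_width=214.0):
    """Allowed error in k*M_k (units mm^-k): M_1 / R / (2 r omega)**(k - 1).

    line_density is M_1 in mm^-1; ruled_width stands in for 2*r*omega in mm.
    """
    return line_density / resolving_power / ruled_width ** (k - 1)

# Dimensional VLS coefficients k*M_k listed in the text (magnitudes only).
listed = {2: 0.955315252, 3: 2.792863e-4, 4: 1.906e-7}

for k, value in listed.items():
    tol = ruling_tolerance(k)
    print(f"k={k}: tolerance {tol:.3e} mm^-{k}, relative {tol / value:.1e}")
```

For k = 2 this gives about 1.9 × 10⁻⁴ mm⁻², matching the ~0.0002 mm⁻² stated for 2M₂.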

7.1. Spot Diagrams

The right-hand panels of Figure 6 are the result of uniformly illuminating the grating rectangular pupil aperture with 10,000 randomly placed rays from the on-axis object point. Eight scan wavelengths were selected to highlight the different characteristic aberrations, and their convolutions, as listed in the caption. In these phase-space plots, horizontal sagittal coma (1,2) appears as a finite width at

ω = 0

(visible only in Figure 6a), meridional coma (3,0) appears as a parabola (e.g., Figure 6a,d, with this also being a component aberration in Figure 6c,e,f,i), spherical aberration (4,0) causes a cubic (S-shaped) asymmetry between the –

ω

and

+ ω

regions (most evident in Figure 6a,f,g and i), the dominant mixed-term aberration of horizontal (2,1) appears as a “bow-tie” shape (in Figure 6b,c,e,f,g,h,i) and the minor mixed-term aberration of horizontal (3,1) causes the widths of the bow-tie to be different at the 2 ends (lopsided), as visible only in Figure 6g,h and i.

The bow-tie shaped aberration (2,1) is absent only for the in-plane orientation (

θ = 0)

at which

λ / λ_{o} = 1

and at the passive correction point given by

θ_{21}

using Equation (66) for which

λ / λ_{o} \sim 2.08

. At those two wavelengths, the phase-space spot diagrams of Figure 6a,d show the classical curves resulting from the addition of the quadratic (coma) and cubic (spherical aberration) terms. Conversely, in Figure 6b these pure meridional aberrations are absent or small, resulting in the near-exclusive presence of the mixed aberration (2,1). It is also noted that Figure 6h confirms the least-confusion balancing (Section 5.3) of the defocus and spherical aberration terms given by Equation (59).

7.2. Line Profiles

The left-most panels in Figure 6 display the simulated (raytraced) spectra of the line doublet near three representative wavelengths within the scan range. The thick line segment shows the “RMS” value calculated by Equation (71). Due to the dominant aberration of coma (3,0) being more highly peaked than a normal distribution, the actual marginal optical resolution is much finer than the usual measure of a full-width-at-half-maximum

=

2.355 RMS, except at the 2 coma-corrected wavelengths (

λ / λ_{o} = 1.26

and 6.58) where the marginal resolution is

\sim

1 RMS. Though Section 7.3 will reveal more quantitative detail for each aberration, it is evident in Figure 6 that the spectral resolution is comparable to or better than the 1/20,000 separating the two raytraced lines, thus confirming the net convolution of the light-path terms as plotted (black curve) in Figure 5.

Figure 6. Numerical raytracings of an ultra-high resolution SEPGM, displayed in phase-space 2D spot diagrams (

x ″

vs.

ω

) and 1D spectral profiles (intensity vs.

x ″

). The meridional pupil coordinate

ω

spans

\sim

140 mm at

λ_{o}

to

\sim

214 mm at 8

λ_{o}

, corresponding to

\sim

2 mrad in angular aperture. Wavelength increases to the top of each panel, in which the two wavelengths (shown here in green and red) are separated by 1 part in 20,000; their dispersed separation (

\sim

9 microns at

λ_{o}

to

\sim

18 microns at 8

λ_{o}

) provides the dimensional scale for the image plane coordinate

x ″

. (a)

λ_{o}

(

θ =

0°): dominant coma (3,0), some spherical aberration (4,0) and slight sagittal coma (1,2); (b) 1.26

λ_{o}

: (3,0) canceled, dominant (2,1) and slight (4,0); (c) 1.43

λ_{o}

: (4,0) canceled, (1,2) nearly canceled, (2,1)–(3,0); (d) 2.08

λ_{o}

: (2,1) canceled, dominant (3,0); (e) 2.5

λ_{o}

: (3,1) canceled, dominant (3,0) near maximum, (2,1)–(4,0); (f) 5.2

λ_{o}

: dominant bow-tie (2,1), (3,0)–(4,0); (g) 6.58

λ_{o}

: (3,0) canceled, dominant (2,1), some (4,0), slight (3,1); (h) per 6(g), but with (2,0) adjusted to balance (4,0) for least-confusion; and (i) 7.99

λ_{o}

: (

θ_{max} \sim

71.25°), (3,0)–(2,1)–(4,0), slight (3,1).

Four objects were used in the 1D simulations: a point (^_____), as used for the 2D diagrams; a 3

μ m

horizontal width (^____), a 20 mm long slit (------) and a 3

μ m

wide × 20 mm long slit (- - - -). The entrance slit was an isotropic source (equivalent to a passive slit backlit by a diffuse source) and tilted in accordance with Equation (61) for its image to coincide with the exit slit. The absence of any significant broadening in the line profiles obviates the need at present to include off-plane object (

z \neq 0

) terms in the aberration expansions Equations (41)–(54) which were derived in Section 4 for a point source (

z = 0

). The 1D spectral profiles exhibit the following aberration ratios:

(b)

Δ λ_{21} \sim 5 Δ λ_{40} \sim 6 Δ λ_{31} \sim 59 Δ λ_{12} \sim 89 Δ λ_{22}

;

(d)

Δ λ_{30} \sim 4 Δ λ_{40} \sim 76 Δ λ_{31} \sim 118 Δ λ_{12} \sim 165 Δ λ_{22}

;

(i)

Δ λ_{21} \sim 1.6 Δ λ_{30} \sim 2.7 Δ λ_{40} \sim 7.5 Δ λ_{31} \sim 41 Δ λ_{12}

.

The small longward shifts of the peak positions (compare the solid and dashed curves) for the 2 shorter wavelengths ((b) and (d)) are due to

\sim

1 micron image curvature of the straight entrance slit, as given by Equation (11). For the longest wavelength trace (i), the shift is seen to be in the opposite (shortward) direction and very small (

\sim

0.2 microns), as predicted by Equation (63). “Pre-emptively” curving the entrance slit by a corresponding amount in the opposite sense (see Section 5.5) is found to eliminate such curvature and shift at the image plane (verified by a raytracing at

λ / λ_{o} \sim 1.26

).

7.3. Power Series Extraction

Precise raytracings provide an alternate (and independent) method of determining the numerical value of the power series aberration terms derived from first principles in Section 3 and Section 4. Image positions (a “spot diagram”) from 21 pupil points are sufficient to extract all 20 component terms of power-sum (i + j − 1)

\leq

3, each free of contamination by any (other) term of power-sum

\leq

6. A numbered set of convenient grating pupil coordinate pairs used is:

\begin{matrix} 0 : (0, 0) 1 : (\frac{1}{2} \overset{˘}{ω}, 0) 2 : (0, \frac{1}{2} \overset{˘}{σ}) 3 : (- \frac{1}{2} \overset{˘}{ω}, 0) 4 : (0, - \frac{1}{2} \overset{˘}{σ}) 5 : (\overset{˘}{ω}, 0) 6 : (0, \overset{˘}{σ}) 7 : (- \overset{˘}{ω}, 0) \\ 8 : (0, - \overset{˘}{σ}) 9 : (\overset{˘}{ω}, \overset{˘}{σ}) 10 : (- \overset{˘}{ω}, \overset{˘}{σ}) 11 : (- \overset{˘}{ω}, - \overset{˘}{σ}) 12 : (\overset{˘}{ω}, - \overset{˘}{σ}) 13 : (\frac{1}{4} \overset{˘}{ω}, 0) 14 : (0, \frac{1}{4} \overset{˘}{σ}) \\ 15 : (- \frac{1}{4} \overset{˘}{ω}, 0) 16 : (0, - \frac{1}{4} \overset{˘}{σ}) 17 : (\frac{1}{2} \overset{˘}{ω}, \frac{1}{2} \overset{˘}{σ}) 18 : (- \frac{1}{2} \overset{˘}{ω}, \frac{1}{2} \overset{˘}{σ}) 19 : (- \frac{1}{2} \overset{˘}{ω}, - \frac{1}{2} \overset{˘}{σ}) 20 : (\frac{1}{2} \overset{˘}{ω}, - \frac{1}{2} \overset{˘}{σ}) \end{matrix}

where

\overset{˘}{ω}

and

\overset{˘}{σ}

denote the meridional and sagittal pupil half-widths, respectively. The difference (or sum) of two image plane positions, each relative to the principal ray, is written succinctly as (for example)

x'_{7}^{5 -} \overset{def}{=}

(x^{'} 5 - x' 0) - (x^{'} 7 - x' 0) \overset{def}{=} x^{'} (\overset{˘}{ω}, 0) - x^{'} (- \overset{˘}{ω}, 0)

, or (as another example)

z'_{19}^{17 +} \overset{def}{=}

(z^{'} 17 - z' 0) + (z^{'} 19 - z' 0) \overset{def}{=} z^{'} (\frac{1}{2} \overset{˘}{ω}, \frac{1}{2} \overset{˘}{σ}) + z^{'} (- \frac{1}{2} \overset{˘}{ω}, - \frac{1}{2} \overset{˘}{σ}) - 2 z' (0, 0)

. It is also found that 13 pupil points (nos. 0 through 12) are sufficient to extract the same 20 terms, however with a small amount of contamination from (other) terms of power-sum

>

3.

The first two terms are simply the principal ray positions:

x'_{10} = x' 0 \overset{def}{=} x' (0, 0)

;

z'_{01} = z' 0 \overset{def}{=} z^{'} (0, 0)

. The remaining 18 are the extrema aberrations (

Δ

being the P-V variation in the lateral ray image positions over the rectangular grating aperture of

ω = \pm \overset{˘}{ω}

and

σ = \pm \overset{˘}{σ}

). Each of these “extraction” Equations (75)–(92) is given in both 21-ray form and (after the “

≅

” sign) 13-ray form, followed by an identification of the lowest power contaminant term(s) for the latter. The indicated percentage level of contamination was determined, at the highest scanned wavelength (

λ / λ_{o} =

8) of the example monochromator, by comparing the results at two different values of the aperture in each direction, thus exposing its (distinct) power dependence, namely

{\overset{˘}{ω}}^{i - 1} {\overset{˘}{σ}}^{j}

for

Δ x'_{i j}

and

{\overset{˘}{ω}}^{i} {\overset{˘}{σ}}^{j - 1}

for

Δ z'_{i j}

.

Δ z_{02}^{'} = \frac{1}{45} ({z^{'}}_{8}^{6 -} \begin{matrix} - 40 {z^{'}}_{4}^{2 -} + 256 {z^{'}}_{16}^{14 -} \end{matrix}) ≅ \frac{8}{3} {z^{'}}_{4}^{2 -} - \frac{1}{3} {z^{'}}_{8}^{6 -}; Δ z'_{06} \sim 0.00001 %

(75)

Δ x_{11}^{'} = \frac{1}{45} ({x^{'}}_{8}^{6 -} \begin{matrix} - 40 {x^{'}}_{4}^{2 -} + 256 {x^{'}}_{16}^{14 -} \end{matrix}) ≅ \frac{8}{3} {x^{'}}_{4}^{2 -} - \frac{1}{3} {x^{'}}_{8}^{6 -}; Δ x_{15}^{'} \sim 0.00003 %

(76)

Δ z_{11}^{'} = \frac{1}{45} ({z^{'}}_{7}^{5 -} \begin{matrix} - 40 {z^{'}}_{3}^{1 -} + 256 {z^{'}}_{15}^{13 -} \end{matrix}) ≅ \frac{8}{3} {z^{'}}_{3}^{1 -} - \frac{1}{3} {z^{'}}_{7}^{5 -}; Δ z_{51}^{'} \sim 0.000006 %

(77)

Δ x_{20}^{'} = \frac{1}{45} ({x^{'}}_{7}^{5 -} \begin{matrix} - 40 {x^{'}}_{3}^{1 -} + 256 {x^{'}}_{15}^{13 -} \end{matrix}) ≅ \frac{8}{3} {x^{'}}_{3}^{1 -} - \frac{1}{3} {x^{'}}_{7}^{5 -}; Δ x_{60}^{'} \sim 0.0020 %

(78)

Δ z_{03}^{'} = \frac{1}{90} ({z^{'}}_{8}^{6 +} \begin{matrix} - 80 {z^{'}}_{4}^{2 +} + 1024 {z^{'}}_{16}^{14 +} \end{matrix}) ≅ \frac{8}{3} {z^{'}}_{4}^{2 +} - \frac{1}{6} {z^{'}}_{8}^{6 +}; Δ z_{07}^{'} \sim 0.00013 %

(79)

\begin{matrix} Δ z_{12}^{'} = \frac{8}{3} ({z^{'}}_{18}^{17 -} + {z^{'}}_{20}^{19 -}) - \frac{1}{6} ({z^{'}}_{10}^{9 -} + {z^{'}}_{12}^{11 -}) ≅ \frac{1}{2} ({z^{'}}_{10}^{9 -} + {z^{'}}_{12}^{11 -}); Δ z_{14}^{'} \sim 0.00005 %; Δ z_{32}^{'} \sim 0.096 % \end{matrix}

(80)

\begin{array}{l} Δ x_{21}^{'} = \frac{8}{3} ({x^{'}}_{18}^{17 -} + {x'}_{20}^{19 -}) - \frac{1}{6} ({x^{'}}_{10}^{9 -} + {x^{'}}_{12}^{11 -}) ≅ \frac{1}{2} ({x^{'}}_{10}^{9 -} + {x^{'}}_{12}^{11 -}); Δ x'_{41} \sim 0.03 %; Δ x'_{23} \sim 0.0001 % \end{array}

(81)

Δ z_{21}^{'} = \frac{1}{90} ({z^{'}}_{7}^{5 +} \begin{matrix} - 80 {z^{'}}_{3}^{1 +} + 1024 {z^{'}}_{15}^{13 +} \end{matrix}) ≅ \frac{8}{3} {z^{'}}_{3}^{1 +} - \frac{1}{6} {z^{'}}_{7}^{5 +}; Δ z_{61}^{'} \sim 0.00003 %

(82)

Δ x_{30}^{'} = \frac{1}{90} ({x^{'}}_{7}^{5 +} \begin{matrix} - 80 {x^{'}}_{3}^{1 +} + 1024 {x^{'}}_{15}^{13 +} \end{matrix}) ≅ \frac{8}{3} {x^{'}}_{3}^{1 +} - \frac{1}{6} {x^{'}}_{7}^{5 +}; Δ x_{70}^{'} \sim 0.0016 %

(83)

Δ z_{31}^{'} = \frac{4}{9} (34 {z^{'}}_{3}^{1 -} - 64 {z^{'}}_{15}^{13 -} - {z^{'}}_{7}^{5 -}) ≅ \frac{4}{3} {z^{'}}_{7}^{5 -} - \frac{8}{3} {z^{'}}_{3}^{1 -}; Δ z_{51}^{'} \sim 0.006 %

(84)

Δ x_{40}^{'} = \frac{4}{9} (34 {x^{'}}_{3}^{1 -} - 64 {x^{'}}_{15}^{13 -} - {x^{'}}_{7}^{5 -}) ≅ \frac{4}{3} {x^{'}}_{7}^{5 -} - \frac{8}{3} {x^{'}}_{3}^{1 -}; Δ x_{60}^{'} \sim 0.30 %

(85)

Δ x_{12}^{'} = \frac{1}{90} ({x^{'}}_{8}^{6 +} \begin{matrix} - 80 {x^{'}}_{4}^{2 +} + 1024 {x^{'}}_{16}^{14 +} \end{matrix}) ≅ \frac{8}{3} {x^{'}}_{4}^{2 +} - \frac{1}{6} {x^{'}}_{8}^{6 +}; Δ x_{16}^{'} < 0.0001 %

(86)

\begin{array}{l} Δ z_{22}^{'} = \frac{16}{3} ({z^{'}}_{20}^{17 -} - {z^{'}}_{18}^{19 -} - 2 {z^{'}}_{4}^{2 -}) - \frac{1}{6} ({z^{'}}_{12}^{9 -} + {z^{'}}_{11}^{10 -} - 2 {z^{'}}_{8}^{6 -}) \\ ≅ \frac{1}{2} ({z^{'}}_{12}^{9 -} + {z^{'}}_{11}^{10 -}) - {z^{'}}_{8}^{6 -}; Δ z_{24}^{'} \sim 0.002 %; Δ z_{42}^{'} \sim 0.10 % \end{array}

(87)

\begin{array}{l} Δ x_{31}^{'} = \frac{16}{3} ({x^{'}}_{20}^{17 -} + {x^{'}}_{19}^{18 -} - 2 {x^{'}}_{4}^{2 -}) - \frac{1}{6} ({x^{'}}_{12}^{9 -} + {x^{'}}_{11}^{10 -} - 2 {x^{'}}_{8}^{6 -}) \\ ≅ \frac{1}{2} ({x^{'}}_{12}^{9 -} + {x^{'}}_{11}^{10 -}) - {x^{'}}_{8}^{6 -}; Δ x_{51}^{'} \sim 0.18 %; Δ x_{33}^{'} \sim 0.0012 % \end{array}

(88)

\begin{array}{l} Δ x_{22}^{'} = \frac{16}{3} ({x^{'}}_{18}^{17 -} - {x^{'}}_{20}^{19 -} - 2 {x^{'}}_{3}^{1 -}) - \frac{1}{6} ({x^{'}}_{10}^{9 -} + {x^{'}}_{11}^{12 -} - 2 {x^{'}}_{7}^{5 -}) \\ ≅ \frac{1}{2} ({x^{'}}_{10}^{9 -} + {x^{'}}_{11}^{12 -}) - {x^{'}}_{7}^{5 -}; Δ x_{42}^{'} \sim 15.9 %; Δ x_{24}^{'} \sim 0.73 % \end{array}

(89)

Δ x_{13}^{'} = \frac{4}{9} (34 {x^{'}}_{4}^{2 -} - 64 {x^{'}}_{16}^{14 -} - {x^{'}}_{8}^{6 -}) ≅ \frac{4}{3} {x^{'}}_{8}^{6 -} - \frac{8}{3} {x^{'}}_{4}^{2 -}; Δ x_{15}^{'} \sim 17.5 %

(90)

Δ z_{04}^{'} = \frac{4}{9} (34 {z^{'}}_{4}^{2 -} - 64 {z^{'}}_{16}^{14 -} - {z^{'}}_{8}^{6 -}) ≅ \frac{4}{3} {z^{'}}_{8}^{6 -} - \frac{8}{3} {z^{'}}_{4}^{2 -}; Δ z_{06}^{'} \sim 36 %

(91)

\begin{array}{l} Δ z_{13}^{'} = \frac{16}{3} ({z^{'}}_{18}^{17 -} + {z^{'}}_{19}^{20 -} - 2 {z^{'}}_{3}^{1 -}) - \frac{1}{6} ({z^{'}}_{10}^{9 -} + {z^{'}}_{11}^{12 -} - 2 {z^{'}}_{7}^{5 -}) \\ ≅ \frac{1}{2} ({z^{'}}_{10}^{9 -} + {z^{'}}_{11}^{12 -}) - {z^{'}}_{7}^{5 -}; Δ z_{15}^{'} \sim 1.34 %; Δ z_{33}^{'} \sim 0.12 % \end{array}

(92)
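The weights in the single-axis extraction formulas can be checked with exact rational arithmetic. The Python sketch below is an illustration only, not the procedure used with the raytrace code: it replaces the raytraced image position by a hypothetical one-dimensional polynomial in a normalized pupil coordinate u (half-width 1), so it covers only the on-axis forms such as Equations (75)–(79), (82)–(86), (90) and (91), not the mixed-term extractions. All function names are assumptions.

```python
from fractions import Fraction as Fr

def odd_difference(coeffs, h):
    """x(h) - x(-h) for the polynomial x(u) = sum(coeffs[n] * u**n)."""
    return sum(a * (h**n - (-h)**n) for n, a in coeffs.items())

def even_sum(coeffs, h):
    """x(h) + x(-h) - 2*x(0) for the same polynomial."""
    return sum(a * (h**n + (-h)**n) for n, a in coeffs.items() if n > 0)

ONE, HALF, QUARTER = Fr(1), Fr(1, 2), Fr(1, 4)

def extract_21(coeffs):
    """21-ray weights: samples at full, half and quarter aperture."""
    d1, d2, d4 = (odd_difference(coeffs, h) for h in (ONE, HALF, QUARTER))
    s1, s2, s4 = (even_sum(coeffs, h) for h in (ONE, HALF, QUARTER))
    linear = (d1 - 40 * d2 + 256 * d4) / 45
    quadratic = (s1 - 80 * s2 + 1024 * s4) / 90
    cubic = Fr(4, 9) * (34 * d2 - 64 * d4 - d1)
    return linear, quadratic, cubic

def extract_13(coeffs):
    """13-ray weights: samples at full and half aperture only."""
    d1, d2 = odd_difference(coeffs, ONE), odd_difference(coeffs, HALF)
    s1, s2 = even_sum(coeffs, ONE), even_sum(coeffs, HALF)
    linear = Fr(8, 3) * d2 - Fr(1, 3) * d1
    quadratic = Fr(8, 3) * s2 - Fr(1, 6) * s1
    cubic = Fr(4, 3) * d1 - Fr(8, 3) * d2
    return linear, quadratic, cubic

# Arbitrary test polynomial containing every power up to u**6.
test_poly = {0: 7, 1: 3, 2: -2, 3: 5, 4: 11, 5: -4, 6: 9}
print(extract_21(test_poly))  # peak-to-valley of the u, u**2 and u**3 terms
print(extract_13(test_poly))  # same targets, contaminated by u**5 and u**6
```

For this polynomial the 21-ray weights return exactly 2a₁, a₂ and 2a₃ (the peak-to-valley extents of the isolated terms over u = ±1), whereas the 13-ray weights pick up the fifth- and sixth-power coefficients, consistent with the contaminant terms identified after each equation.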

The largest absolute contamination in the 13-ray extractions is

Δ x'_{41} \sim 0.03

% times the magnitude of the

Δ x'_{21}

geometric aberration. As shown in Figure 5, the latter results in a maximum spectral aberration (at the longest wavelength) of

Δ λ / λ \sim

10⁻⁴, thus the 13-ray extraction is in error by only the negligible magnitude of

\sim

3 × 10⁻⁸. However, this contamination error was eliminated by use of the 21-ray extraction when calculating the deviations (given in Table 2) between the numerical (raytrace) extractions of Equations (75)–(88) and those calculated from first principles by the new (rigorous) light-path expansions of Equations (41)–(54). There are no adjustable parameters in either method, and thus no “fitting” of one to the other. The extremely small deviations shown in this table can therefore only be the result of both accurate light-path equations and accurate raytracings. For example, the raytracings illuminate the grating surface by the exact and centered rectangular aperture used in the analytical equations (

ω = \pm \overset{˘}{ω}, σ = \pm \overset{˘}{σ}

), whereby

\overset{˘}{ω} \approx \frac{1}{2} ϕ_{m} / \sin α

(though the exact non-linear equation is used) is a strong function of

α

as the grating is rotated. In addition, the formulations take equal care in the horizontal and vertical directions, as both spatial aberrations contribute to the spectral aberration. This is due both (directly) to the image tilt which includes the vertical aberration term

{}_{i - 1, j + 1}{z^{'}}

in the rotational transformation of Equation (68), and (indirectly) to the need for inclusion of the vertical reference image position

ζ_{i j}

in Equation (37).

Equations (75)–(92) may be converted from extrema widths to power series coefficients by use of the transformations:

{}_{i j}{x^{'}} = Δ x_{i j}^{'} / (p_{i j} {\overset{˘}{ω}}^{i - 1} {\overset{˘}{σ}}^{j}) and {}_{i j}{z^{'}} = Δ z_{i j}^{'} / (p_{j i} {\overset{˘}{ω}}^{i} {\overset{˘}{σ}}^{j - 1})

. Using Equations (68)–(70), these terms were then converted to spectral aberrations

Δ λ_{i j} / λ

and plotted in Figure 5 as the open circles at 29 raytraced wavelengths. Note that, as formulated in Section 5, the (tilted) spectral direction and the corresponding adjustment to the focusing condition cancel the effect of the first four aberrations (Equations (41)–(44) or Equations (75)–(78)); therefore, Figure 5 shows no power terms of

Δ λ_{20} / λ

or

Δ λ_{11} / λ

, as they vanish to within the accuracy of the calculations. The last four extraction Equations (89)–(92) result in the 58 raytraced points for

Δ λ_{22} / λ

and

Δ λ_{13} / λ

each contributing less than 10⁻⁶. Due to these exceedingly small magnitudes (below the physical diffraction limit) and the large number of non-paraxial reference image coordinates required for their proper analytical expansion, the explicit equations for these have not been derived.

8. Practical Considerations and Enhancements

8.1. Vertical Deflection of the Principal Ray ( $z_{01}^{'}$ )

The functional dependences of this off-plane deflection (Equation (9)) are most easily seen by applying the small-angle (grazing incidence) approximation to the grating Equation (10):

μ \cos θ ≃ \cos β - \cos α ≃ 2 γ^{2} (1 - ρ) / (1 + ρ)

(93)

thus revealing that

z_{01}^{'} / η

in Equation (9) scales with

\sim γ^{2} \tan θ

. At very shallow graze angles (

γ \sim

½°) for use in the X-ray, the deflection is less than ¼ mrad even at the largest rotation angles considered here (

θ =

71.25°), largely eliminating the need for vertical translation of the exit slit. However,

z_{01}^{'} / η

grows to 5 mrad at

γ \sim

3° (soft X-rays) and to 60 mrad at

γ \sim

10° (extreme UV), being comparable to or exceeding the image length due to astigmatism.
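The accuracy of the small-angle step in Equation (93) can be gauged numerically. The Python sketch below compares cos β − cos α with 2γ²(1 − ρ)/(1 + ρ); it assumes, as the grazing approximations used in this section imply, that the graze angles satisfy α + β = 2γ and β/α = ρ. The function names are hypothetical, and the first approximate equality in Equation (93) is not tested here.

```python
import math

def grazing_angles(gamma, rho):
    """Graze angles (alpha, beta) assuming alpha + beta = 2*gamma, beta/alpha = rho."""
    alpha = 2.0 * gamma / (1.0 + rho)
    return alpha, rho * alpha

def cos_difference_exact(gamma, rho):
    """cos(beta) - cos(alpha) without approximation."""
    alpha, beta = grazing_angles(gamma, rho)
    return math.cos(beta) - math.cos(alpha)

def cos_difference_small_angle(gamma, rho):
    """Right-hand side of Equation (93): 2*gamma**2 * (1 - rho) / (1 + rho)."""
    return 2.0 * gamma**2 * (1.0 - rho) / (1.0 + rho)

for gamma_deg in (0.5, 3.0, 10.0):
    g = math.radians(gamma_deg)
    exact = cos_difference_exact(g, 2.0)
    approx = cos_difference_small_angle(g, 2.0)
    print(f"gamma = {gamma_deg:4.1f} deg: relative error {abs(approx / exact - 1):.1e}")
```

Because the leading correction is of relative order (α² + β²)/12, the approximation error stays below about 1% for all three graze angles considered in this section.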

This deflection may be compensated by the addition of a mirror which adjusts the vertical direction of the principal ray incident to the grating [15]. Absent such a mirror in the present “one-bounce” configuration, a translating iris may be used to provide off-center illumination of the grating, thereby maintaining a centered astigmatic image at the exit slit plane. In the case of the horizontal and vertical object conjugates being coincident (a point-like source), the iris center must align with an off-plane pupil coordinate

σ_{o}

determined by balancing Equations (9) and (41):

σ_{o} ≃ - (μ \sin θ) / (1 + 1 / η)

(94)

Unfortunately, this would not also cancel the off-plane direction of the diffracted rays, which would continue to cause an off-center illumination of any target significantly downstream of the exit slit.

The iris translation requires an incident beam of vertical angular extent

Φ_{s} >

(

ϕ_{s} + σ_{o}

), being 4 mrad

+

6 mrad for the example monochromator. In addition, the lateral aberrations which scale to powers of

σ^{2}

and higher will increase substantially as a result of the off-center grating illumination. Fortunately, these terms are generally insignificant, the largest one affecting the spectral resolution being horizontal (1,2) which is

\sim

2 orders of magnitude smaller than the dominant terms (see Figure 5).

8.2. Image Tilt

Using the grazing incidence approximation

(μ \cos θ) / \sin β ≃ \frac{1}{2} (α^{2} - β^{2}) / β ≃ \frac{1}{2} (1 / ρ^{2} - 1) β ≃ (1 / ρ^{2} - 1) γ ρ / (ρ + 1) = - (1 - 1 / ρ) γ

results in a simplified form of Equation (55):

ψ^{'} / γ ≃ [(C_{2} + 1) (1 - 1 / ρ) / (1 + 1 / η)] \tan θ

(95)

revealing that the tilt is nearly linear with the graze angle. Consequently, rotations of the exit and entrance slits become less critical as the graze angle is reduced. For an X-ray version (

γ \sim

½°), the maximum value of

ψ

from Equation (61) is

\sim

1.3° and represents only

\sim

0.1 mm deviation at either end of a 10 mm long entrance slit. Such a small tilt may be accomplished by simpler and more accurate means than required for larger rotations.

8.3. Zero Order Overlap

The reflected zero order beam from a plane grating is unfocused, thus one must ensure the exit slit does not lie within the specularly reflected diverging beam of undispersed light [10]. Such overlap is avoided if the meridional aperture

ϕ_{m} <

4

γ (ρ - 1) / (ρ + 1)

, assuming

ρ >

1; in the case of

ρ <

1, substitute its reciprocal

=

1/

ρ

. To provide usable dispersion,

ρ

is typically

>

3/2 (or

<

2/3), for which the aperture limitation becomes

ϕ_{m} <

0.8

γ

. A tolerable (e.g.,

<

40%) variation in graze angle across the optic independently constrains

ϕ_{m} < \sim

0.4

γ

, thereby placing the exit slit far into the shadow of the specular zero order light. The unfocused zero order is advantageous when using high peak-power sources, as it avoids a destructively intense focus of the integrated spectrum.
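The zero-order criterion is simple to evaluate for a given geometry. The Python sketch below implements the stated inequality, including the reciprocal substitution for ρ < 1; the function name is hypothetical and angles may be in any consistent unit.

```python
def max_meridional_aperture(gamma, rho):
    """Upper bound on phi_m from the zero-order criterion 4*gamma*(rho-1)/(rho+1).

    For rho < 1 the reciprocal 1/rho is substituted, as stated in the text.
    The result is in the same angular units as gamma.
    """
    if rho <= 0 or rho == 1:
        raise ValueError("rho must be positive and different from 1")
    r = rho if rho > 1 else 1.0 / rho
    return 4.0 * gamma * (r - 1.0) / (r + 1.0)

print(max_meridional_aperture(1.0, 1.5))      # bound in units of gamma
print(max_meridional_aperture(1.0, 2.0 / 3))  # reciprocal case, same bound
```

Both calls return 0.8 (in units of γ) to within floating-point rounding, reproducing the limit ϕ_m < 0.8γ quoted for ρ = 3/2 or 2/3.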

8.4. Required Variation in Spacing

Inserting Equation (15) for

N_{2}

into Equation (6) determines the magnitude of this (dominant) linear component to the fractional variation of spacing across the ruled width:

\frac{Δd}{d} ≃ (\frac{ϕ_{m}}{γ}) \frac{1 + ρ_{o}^{2} / η}{1 - ρ_{o}}

(96)

For values of

ϕ_{m} / γ =

0.04–0.12 (e.g., 2–6 mrad at

γ =

3°),

ρ_{o} \sim

1.5 and

η \sim

3–4, the required variation is 12–36%. Surprisingly, this is not significantly larger than the 10–29% variations required for a converging-beam geometry (

η ≃

−1) and is well within the factor 2 variations previously employed for 200 mm ruled width plane gratings in an astronomical spectrometer [15] (page 37). The sub-50% line density variation required of the present grating design is comparable to those successfully manufactured and provided in numerous (

\sim

50) laboratory VLS spectrometer and monochromator systems designed and constructed by the author over the past 30 years, and is thus commercially achievable using existing fabrication methods.
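Equation (96) can be evaluated directly for the parameter ranges quoted above. The Python sketch below is a plain transcription of that formula (with the magnitude taken, since only the size of the variation matters here); the function name is hypothetical, and η = 4 is picked as one representative value from the stated range.

```python
def fractional_spacing_variation(phi_over_gamma, rho_o, eta):
    """Magnitude of Equation (96): (phi_m/gamma) * (1 + rho_o**2/eta) / (1 - rho_o)."""
    return abs(phi_over_gamma * (1.0 + rho_o**2 / eta) / (1.0 - rho_o))

for eta in (4.0, -1.0):  # diverging-beam example and converging-beam comparison
    low = fractional_spacing_variation(0.04, 1.5, eta)
    high = fractional_spacing_variation(0.12, 1.5, eta)
    print(f"eta = {eta:+.0f}: {100 * low:.1f}% to {100 * high:.1f}%")
```

The results fall close to the 12–36% and 10–29% ranges quoted in the text; the small differences presumably reflect rounding and the precise value of η used there.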

8.5. Dispersive Resolution

Simple differentiation of the grating equation yields the contribution from (entrance; exit) slits of width

Δx

to the fractional FWHM resolution of any single-grating monochromator as

{(Δ λ / λ)}_{dispersive} ≃ \frac{1}{γ} (\frac{Δx / r}{| ρ - 1 |}) (1; ρ / η)

(97)

where

η / ρ

is the grating magnification. For the present typical values of

ρ =

2 and

η =

3.7, or their reciprocals,

{(Δ λ / λ)}_{dispersive} \sim

1.27(

Δx / r) / γ

. Thus, attainment of the average geometrical resolution shown in Figure 5 (

Δ λ / λ \sim

4 × 10⁻⁵ or

Δ λ \sim

10⁻⁴ nm) requires

Δx

/r

\sim

1.65

μ

rad at

γ =

3° (e.g.,

Δ x =

5

μ

m,

r =

3 m and a monochromator length of

(1 + η) r =

14.1 m). The extremely low grating geometrical aberration of

Δ λ \sim

10⁻⁵ nm near the first coma-corrected wavelength of

λ =

1.26 nm matches only exceedingly narrow (

Δx \sim

1

μ

m) slits and a grating slope error tolerance of

\sim

0.05

μ

rad; however the latter is attainable for plane surfaces.
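The slit-width arithmetic of this section can be reproduced as follows. The Python sketch evaluates the two slit contributions of Equation (97) separately and then inverts the combined estimate 1.27 (Δx/r)/γ quoted in the text; that combined factor is taken from the text as given rather than derived here, and the function names are hypothetical.

```python
import math

def dispersive_terms(dx_over_r, gamma, rho, eta):
    """(entrance, exit) slit contributions to delta_lambda/lambda, Equation (97)."""
    common = dx_over_r / (gamma * abs(rho - 1.0))
    return common, common * rho / eta

def required_slit_ratio(target_resolution, gamma, combined_factor=1.27):
    """Slit width over object distance (radians) for a target delta_lambda/lambda,
    using the combined estimate combined_factor * (dx/r) / gamma from the text."""
    return target_resolution * gamma / combined_factor

gamma = math.radians(3.0)
entrance, exit_ = dispersive_terms(5e-3 / 3000.0, gamma, rho=2.0, eta=3.7)
print(f"entrance slit term: {entrance:.2e}, exit slit term: {exit_:.2e}")
print(f"required dx/r: {required_slit_ratio(4e-5, gamma) * 1e6:.2f} microradians")
print(f"monochromator length: {(1 + 3.7) * 3.0:.1f} m")
```

With γ = 3° and a target of 4 × 10⁻⁵, the required Δx/r is about 1.65 μrad, which 5 μm slits at r = 3 m satisfy to within about 1%.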

8.6. Table-Top Version

At the other extreme from the ultra-high resolution performance displayed in Figure 5, a high-throughput version is parameterized by choosing

ϕ / γ =

1/8, providing

\sim

10 times larger solid aperture. For the same graze angle (3°), the frontal aperture becomes 6.5 × 13 mrad and the resolving power is approximated by Equation (72) as

ℛ \sim

3000. Given this lower resolution, Equation (97) indicates a required physical length of only

\sim

2 m using the same slit widths (

\sim

5

μ

m). An extreme UV version (

γ =

8°) of this “table-top” length monochromator would have a frontal aperture of 17.5 × 35 mrad and yield the same resolving power using slit widths of

\sim

15

μ

m.

8.7. X-Ray Version

For a given meridional aperture (

\emptyset_{m}

) and ruled width (

\sim

r

\emptyset_{m} / \sin γ

), the object distance r

\propto

γ

, resulting in

λ / Δ λ_{dispersive} \propto

γ^{2}

from Equation (97). Therefore, at

γ =

1° (providing usable broadband reflectance to

\sim

5 keV in photon energy), an aperture of

\emptyset_{m} =

2 mrad and a grating width of 214 mm requires r

\sim

1 m (a monochromator length

\sim

4.7 m) and yields

λ / Δ λ_{dispersive} \sim

2800. This figure matches the geometrical resolving power given by Equation (72).

Further reduction of the graze angle allows extension of the monochromator (same length and grating size) into the X-ray region (

λ ≲ 0.25

nm). For example,

γ \sim

½° results in

ϕ_{m} =

1 mrad,

λ / Δ λ_{dispersive} \sim

1400 and a scanning range of

λ \sim

0.125–1 nm (

ℏ ν \sim

10–1.25 keV). It is noted that the first use of a pure (single rotation axis) SNR monochromator was at high energies, with a concave grating at

γ =

1° providing strong line spectra to 3 keV and first-order Bremsstrahlung continuum to its scan limit (

\sim

4 keV) as shown in “Figure 4-4” on page 133 of the cited thesis [15]. The present design additionally benefits from recent advances in grating fabrication on smooth plane surfaces [4], and thus should extend efficient broadband grating spectroscopy to photon energies

>

5 keV.

8.8. Further Refinements and Additions

The detailed equations and raytracing results presented in this initial report demonstrate the fundamental spectral characteristics of the new geometry. Outlined here are a few possible variations to the basic single-element configuration, which may offer advantages dependent upon the graze angle, aperture, source size, performance priorities or desired versatility of the monochromator.

8.8.1. Linear Upstream Source

A nonzero value may be included for z₀₂ (an object point whose z-coordinate has a linear dependence on σ), corresponding to an entrance slit illuminated by a distant horizontal line. Such a case is allowed in the general light-path formulation (Section 3) and in Equations (32)–(34) of Section 4.1, but is not included in the current explicit lateral aberration expansions (Section 4.2). Per Section 5.4, this longitudinal separation of the effective object plane in the vertical and horizontal directions may allow the required rotations of entrance and exit slits to be replaced by a (third-axis) rotation of the grating.

8.8.2. A Two-Element Monochromator with Astigmatism Control

A mirror may be added to focus (anastigmatically) or collimate in the vertical direction, the latter being previously used to maintain high spectral resolution without resort to a rotating slit in the case of a pure surface-normal rotation monochromator, as given in Section 4.3 of the cited thesis [15].

8.8.3. Choice of Magnification

In Section 2.6, the conjugate distance ratio (η) and the VLS coefficient N₃ were chosen to eliminate coma (3,0) at two scan wavelengths. As the separation between these wavelengths vanishes, this optimization eliminates coma and its first derivative, resulting in a maximally-broad region nearly free of geometrical aberrations, centered on one wavelength. Alternatively, η may be chosen independently (e.g., to provide a desired magnification = η/ρ), resulting in a narrower high-resolution region at the one scan wavelength where coma is eliminated by choice of N₃.

8.8.4. Curved Grating Surface

The grating surface may be curved, as already provided for in the general light-path formulation (Section 3). Generalizing the graze-angle light-path approximations of Equations (14) and (15) to include a finite curvature radius (R) yields:

F_{i0} ≃ \left\{ N_{i} \cos θ - \frac{[1 - (η / ρ) / (R \sin α)] ρ^{2} / η^{i - 1} + [1 - 1 / (R \sin α)] (-1)^{i}}{ρ^{2} - 1} \right\} μ \cos θ

(98)

N_{i} ≃ \frac{[1 - (η / ρ_{o}) / (R \sin α_{o})] ρ_{o}^{2} / η^{i - 1} + [1 - 1 / (R \sin α_{o})] (-1)^{i}}{ρ_{o}^{2} - 1}

(99)

Combining the above two equations, the focusing condition (F_{20} = 0) becomes:

\cos θ ≃ \left[ \frac{ρ^{2} / η + 1 - (ρ + 1) / (R \sin α)}{ρ_{o}^{2} / η + 1 - (ρ_{o} + 1) / (R \sin α_{o})} \right] \left( \frac{ρ_{o}^{2} - 1}{ρ^{2} - 1} \right)

(100)
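As a minimal sketch of how Equation (100) might be evaluated, the function below returns the right-hand side for a plane grating (1/R = 0) or for a finite radius. The function name, the argument names and the example values of ρ_o and ρ are hypothetical; η = 3.7 is inferred from the example lengths quoted earlier. For the curved case, the incident graze angles α and α_o must be supplied by the caller, and R is taken in the same normalized units as in the equation.

```python
import math


def cos_theta_focus(rho, rho_o, eta, R=None, alpha=None, alpha_o=None):
    """Right-hand side of Equation (100).

    R=None means a plane grating (the curvature terms vanish).
    For finite R, alpha and alpha_o (radians) are the incident graze
    angles at the scanned and initial positions, supplied by the caller.
    """
    def bracket(p, a):
        curvature = 0.0 if R is None else (p + 1.0) / (R * math.sin(a))
        return p * p / eta + 1.0 - curvature

    ratio = bracket(rho, alpha) / bracket(rho_o, alpha_o)
    return ratio * (rho_o ** 2 - 1.0) / (rho ** 2 - 1.0)


if __name__ == "__main__":
    eta = 3.7      # inferred from (1 + eta) * r = 14.1 m at r = 3 m
    rho_o = 1.5    # hypothetical initial scan parameter, for illustration only
    for rho in (1.5, 2.0, 2.5):
        c = cos_theta_focus(rho, rho_o, eta)
        print(f"rho = {rho}: theta = {math.degrees(math.acos(c)):.2f} deg")
```

At ρ = ρ_o the expression reduces to cos θ = 1 (no surface-normal rotation), and for a plane grating it decreases monotonically as ρ grows beyond ρ_o.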

For a given in-plane scan parameter ρ, a radius which is positive (concave), but larger in magnitude than that required for a constant line-space (N₂ = 0) grating to focus, requires an increase in θ, while a negative (convex) radius requires a decrease in θ. From the approximate grating Equation (93), these changes in θ result in a more extended scan range (μ/μ_o) for a concave grating and a (slightly) reduced scan range for a convex grating. An interesting case is when the radius is positive and smaller in magnitude than the constant line-space self-focusing radius, the latter of which completes the equations of this paper on a classical note:

R_{CLS} ≃ \left[ \frac{ρ_{o} + 1}{ρ_{o}^{2} / η + 1} \right] / \sin α_{o}

(101)

In the case of 0 < R < R_CLS, the surface curvature provides too strong a focusing power, requiring the VLS to weaken this by a change in sign of N₂. The result is that the required value of θ decreases sharply (for a given ρ). For the parameters of the example ultra-high resolution monochromator, choosing R ~ 0.85 R_CLS is numerically found to minimize θ; at ρ_max ~ 2.73, θ ~ 38° rather than the plane grating requirement of θ ~ 71°. However, this reduces the maximum scan wavelength from ~8 λ_o to ~3.2 λ_o. Nonetheless, the advantage of this curved grating design is that this θ is still significantly smaller than the value of ~58° required by the plane grating design at the same wavelength. Thus, for this (more limited) scan range, the curved grating results in a significant decrease (factor ~2) in the required slit tilts and a ~30% decrease in the vertical image deflection.

Equation (98) with i = 3 provides three free parameters (η, N₃ and R), potentially enabling the cancelation of coma (x″₃₀ = 0) at an additional (third) scan wavelength or the cancelation of spherical aberration (x″₄₀ = 0) at an additional (second) scan wavelength.

8.8.5. Extended-Range Configurations

Dimensionless Equations (14)–(16) reveal that the horizontal focusing (first, second and third degree) and the VLS ruling parameters (N_i) are nearly independent of graze angles. Therefore, the same physical grating, object and image distances provide nearly the same imaging performance at any small angular deviation 2γ. As with other self-focusing plane grating geometries, this requires the in-plane (initial) value of the scan parameter (ρ_o) to be unchanged, which is arranged by simply re-adjusting the initial grating normal angle δ_o = arctan[(tan γ)(ρ − 1)/(ρ + 1)] per Figure 1.

For example, several stationary exit slits at various horizontal locations could provide different values of λ_o scaling with γ² per Equation (93). Though a linear scaling of wavelength with graze angle would maintain the blaze of phase-diffraction grooves and provide roughly equal reflection coefficients from a gold surface for λ ≲ 10 nm, the γ² scaling is near optimal for obtaining reasonable (~0.2–0.9) reflectances over the entire 10-octave region (λ ~ 0.125–120 nm) where grazing incidence is of advantage. This enables construction of an ultra-wide range monochromator employing multiple (5) exit slits, each providing a band spanning three octaves in wavelength:

2γ = 1° (X-ray band, λ = 0.0625–0.5 nm); reflectance < 10% below 0.125 nm (θ = 60°)
2γ = 2° (Deep SXR band, λ = 0.25–2 nm)
2γ = 4° (SXR band, λ = 1–8 nm)
2γ = 8° (XUV band, λ = 4–32 nm)
2γ = 16° (EUV/FUV band, λ = 16–128 nm)
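The five bands listed above follow directly from the γ² scaling and the three-octave span. The short sketch below regenerates them; it takes the 2γ = 4° band (1–8 nm) as the reference, and the function name and default arguments are illustrative choices rather than anything defined in the paper.

```python
def band_nm(deviation_deg, ref_deviation_deg=4.0, ref_min_nm=1.0, octaves=3):
    """Wavelength band (min, max) in nm for a horizontal deviation angle 2*gamma.

    The minimum wavelength scales with gamma squared relative to the
    reference band, and each band spans the given number of octaves.
    """
    lam_min = ref_min_nm * (deviation_deg / ref_deviation_deg) ** 2
    return lam_min, lam_min * 2 ** octaves


bands = {deviation: band_nm(deviation) for deviation in (1, 2, 4, 8, 16)}

if __name__ == "__main__":
    for deviation, (lam_min, lam_max) in bands.items():
        print(f"2*gamma = {deviation:>2} deg: {lam_min:g} to {lam_max:g} nm")
```

Each doubling of the deviation angle shifts the band by two octaves, so adjacent three-octave bands overlap by one octave.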

A single amplitude reflection grating could provide a constant relative diffraction efficiency and good suppression of higher spectral orders (|m| > 1), or multiple blazed gratings could be interchanged to provide a higher (peak) efficiency in each band. The slit rotations would scale with γ (per Equations (55) or (95)), as would the dispersive limit to the resolving power (per Equation (97)). If used at high spectral resolution, then the exact (γ-dependent) focusing condition of Equation (57) would be enforced by adjustments to the angle θ for each band.

As an alternative to the use of multiple slits, a conventional plane pick-off mirror (rotating-translating or rotating off-axis) could be inserted into the diffracted beam to direct each angular deviation along the same exit axis. Such a plane mirror does not alter the basic focusing characteristics of the optical system and further suppresses the higher spectral orders by preferential absorption; however, this configuration requires at least two reflections (reducing its efficiency).

8.8.6. A Two-Element Time-Compensated Monochromator

A tandem grating configuration can substantially cancel the groove-to-groove phase differences defining wavefront division for an individual grating, thereby providing a common path-length within the aperture of the emerging beam. The new rotation scheme enables the construction of such a monochromator using only two plane grating reflections, eliminating the need for the additional two concave mirrors employed with current state-of-the-art time-compensated monochromators [18].

9. Conclusions

At fixed conjugate distances and horizontal deviation angle (2γ), self-focused grazing-incidence gratings defocus rapidly upon (groove-axis) in-plane rotation. This is due either to the use of a curved (concave) surface or to the change in ρ (ratio of diffracted graze angle to incident graze angle) which determines the focusing condition of plane gratings. However, if varied line-spacing (VLS) provides the focusing power rather than surface curvature, strong defocusing also results from a surface-normal (off-plane) rotation, which can therefore be used to cancel that from the in-plane rotation. These concerted rotations also reinforce the change in wavelength, extending the scan range beyond that available from a single rotation. A new scanning geometry and class of monochromator has thereby been invented which requires only one plane surface, representing a terminal point in the progression of grazing incidence monochromator designs towards fewer and simpler surfaces.

Small-angle approximations to the light-path equations for a plane grating provide a clear understanding and classification of the essential spectral aberration characteristics for different optical geometries. This simplifies the equations, clearly revealing the imaging properties which are (nearly) independent of the graze angle. A more rigorous analysis of the imaging properties exhibited by the new monochromator geometry has precipitated the (incidental) introduction of two mathematical tools of more general applicability in aberration analysis; being an expansion formulation based on non-paraxial reference points and an extraction of aberration series coefficients from raytrace simulations:

(1) The small spectral aberrations and dominance of off-plane terms in the present monochromator geometry have unveiled a flaw in standard light-path formulations. A mathematically rigorous general light-path expansion theory has been introduced, which systematically employs reference wavefronts centered on non-paraxial image points. This procedure, while more complex than the standard approach, correctly isolates each power-term of the path-length series. Interestingly, comparison of the two formulations reveals that the standard approach does not in general provide the correct expansion (or consequently the total aberration), being inexact even in the simple case of a spherical mirror.

(2) An accurate method has been introduced for extracting the individual geometrical terms from the image positions of a small number (13–21) of numerically traced rays. For 14 of the 18 lateral aberrations of power-sum ≤ 3, a detailed comparison between these extractions and the analytical equations (derived from the new light-path formulation) shows essentially exact (~10⁻¹¹ radians) agreement, being several orders of magnitude finer than the physical diffraction width. This precision suggests that such extractions may be used in the future to infer (rather than derive) the Fermat equations, by composing them from an algebraic template of geometric parameters fit to the raytrace extractions. However, such numerical fitting has not been used in the present work; rather, all equations have been derived from first principles.

Initial use of a meridional-only approach (based on the standard theory and the graze-angle-invariant approximation) and final use of the rigorously-developed (non-paraxial image reference) light-path formulation have provided equations for various focusing conditions (horizontal, spectral and least-confusion spectral) by correlation of the two grating rotations. The theoretical analysis has also derived the required exit slit tilt (ψ′) and entrance slit tilt (ψ) as functions of the scan parameters, the VLS ruling coefficients (N_i) for self-focusing and correction of the higher-degree meridional aberrations, an optimized value for the mount parameter (η ≡ r′/r) which provides cancelation of the higher-order (3,0) meridional aberration at two chosen wavelengths and explicit expansion equations of ultra-high accuracy for the 14 leading lateral aberration terms.

The spectral resolving power may be approximated by (7γ/ϕ)² 2^(3−q) over a scan range of q octaves for a collection aperture of ϕ (meridional) × 2ϕ (sagittal) radians. This large product of spectral resolution, solid aperture and scan range is due to the high level of aberration-correction provided by this geometry (particularly the cancelation of coma at 2 wavelengths) combined with the multiplicative change in wavelength provided by the two rotation motions. Such a high figure of merit may be parameterized between these 3 components (resolution, aperture and scan range) as desired, with examples given for an ultra-high resolution (λ/Δλ ~ 25,000) soft X-ray “beamline” version (14 m length), a high collection aperture “table-top” version (2 m length) and an X-ray version (γ ~ ½°).
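The trade-off expressed by this approximation can be illustrated with a few lines of code. This is only a sketch of the stated formula; the function name and the chosen inputs are illustrative, with the ϕ/γ = 1/8 case taken from the table-top example.

```python
import math


def resolving_power(gamma_rad, phi_rad, q_octaves):
    """Approximate resolving power (7*gamma/phi)**2 * 2**(3 - q)."""
    return (7.0 * gamma_rad / phi_rad) ** 2 * 2.0 ** (3 - q_octaves)


gamma = math.radians(3.0)

# Table-top case: phi / gamma = 1/8 over a three-octave scan.
table_top = resolving_power(gamma, gamma / 8.0, 3)

# Same aperture, but scanning only two octaves doubles the estimate.
two_octaves = resolving_power(gamma, gamma / 8.0, 2)

if __name__ == "__main__":
    print(f"table-top, 3 octaves: ~{table_top:.0f}")
    print(f"table-top, 2 octaves: ~{two_octaves:.0f}")
```

For ϕ/γ = 1/8 and q = 3 the expression evaluates to 56² = 3136, consistent with the resolving power of ~3000 quoted for the table-top version. Halving the aperture quadruples the estimate, while each octave removed from the scan range doubles it.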

Supplementary Files

Supplementary File 1

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. A Simple Illustration

Figure A1 compares the two light-path calculations (“standard” and “rigorous”) of spherical aberration (4,0) vs. magnification for the simplest and most common focusing optic, namely a spherically concave mirror.

Figure A1. Calculated aberrations of a spherical concave mirror at the Gaussian focus. The second-degree (“coma”) lateral ray aberration is correctly calculated by either the standard or the rigorous light-path formulations; the open circles are independent extractions of this term from numerical ray tracings. Light-path calculations of the third-degree (“spherical aberration”) lateral ray aberration are compared for the standard (dashed), semi-standard (dot-dashed) and rigorous (solid) formulations, with the open circles being independent extractions of this term from numerical raytracings. In this example, the object distance is 1000 mm, the graze angle is 10° and the acceptance aperture is 20 mrad.

The first-degree lateral aberration (“defocusing”) vanishes for a mirror curvature 2/R = (1 + 1/η) sin²γ, where γ is the graze angle. Given this constraint, there is no aberration (non-paraxial position) to include as the reference image for calculating the next higher aberration (second-degree “coma”), so the latter result is the same as given by the standard formulation:

- {}_{30}x' \frac{\sin γ}{η} = - \frac{3}{4} \left( 1 - \frac{1}{η^{2}} \right) \cos γ \sin^{2} γ

(A1)

Given the same in-focus constraint, the expansion term for spherical aberration simplifies to:

- {}_{40}x' \frac{\sin γ}{η} = e_{1} \sin^{4} γ + e_{2} \cos^{2} γ \sin^{2} γ

(A2)

where,

e_{1} = - \frac{1}{8} \left( 1 - \frac{1}{η} - \frac{1}{η^{2}} + \frac{1}{η^{3}} \right)

(A3)

and,

e_{2} = \frac{1}{8} \left( 9 - \frac{5}{η} - \frac{5}{η^{2}} + \frac{9}{η^{3}} \right)

(A4)

in the standard formulation (paraxial image reference).

However, to formulate ₄₀x′ rigorously, one must use Equations (19), (22), (24), (26) and (29) with x_ij = z_ij = 0 (on-axis point source), α = β = γ (mirror), σ = 0 (no sagittal rays), S = 1 (spherical surface) and the inclusion therein of the existing lateral ray aberration from Equation (A1) as a non-paraxial image reference point (ξ₄₀ = ₃₀x′ ω²). The resulting value of e₁ is unchanged compared to the standard result, due to only the first term of Equation (A2) surviving when γ = π/2 (normal incidence), and in this case there is no ₃₀x′ aberration and thus only the standard paraxial image point. However, the rigorous formulation yields a different value for e₂. In addition to using ₃₀x′ as the (non-paraxial) reference image point and employing the Fermat differential of Equation (26), the grating curvature term of Equation (29) is significant, adding (₃₀x′/R) cot γ = (3/4)(1 − 1/η²)(1 + 1/η) cos²γ sin²γ to the value of −(₄₀x′/η) sin γ. The result is:

e_{2} = \frac{1}{8} \left( 9 + \frac{7}{η} - \frac{5}{η^{2}} - \frac{3}{η^{3}} \right)

(A5)

which equals Equation (A4) of the standard expansion only at η = 1, where ₃₀x′ vanishes for all γ. In the more general case (η ≠ 1 and γ ≠ π/2), ₃₀x′ is nonzero and Equation (A5) differs from Equation (A4). As plotted in Figure A1, this discrepancy is clear and seen to be independently confirmed by the overlaid data points from numerical raytrace extractions. Due to the high angular aperture (20 mrad), the “uncontaminated” (21-ray) version of Equation (85), employing seven rays across the pupil meridian at each magnification, was used to ensure an accurate extraction of ₄₀x′. This results in essentially exact agreement (~0.000005 mm at all but the two lowest magnification points) with the rigorous expansion.

Salient specific results of potential interest for optical designers are apparent from Figure A1: (1) The standard formulation under-calculates spherical aberration for magnifications > 1, while it significantly over-calculates this term for magnifications < 1; (2) Spherical aberration vanishes at η ~ 0.7, which for grazing angles is very near the root of the cubic equation given by e₂ = 0. Including the e₁ term, the exact value of this “sweet spot” is at η = 0.697, 0.697, 0.704, 0.718 and 0.740 at respective graze angles of 3°, 10°, 45°, 60° and 70°, the latter two representing a “near-normal incidence” condition; and (3) The standard formulation (dashed curve) incorrectly identifies ₄₀x′ (“spherical aberration”) as comparable to ₃₀x′ (“coma”) at low magnifications. Figure A1 shows that the extremum (P-V) of ₄₀x′ remains significantly smaller than that of ₃₀x′ even at the very low magnification of η = 0.125. Figure A1 also shows the result of spherical aberration calculations using a semi-standard formulation (dot-dashed curve), whereby the trailing term of Equation (29) is included, but the paraxial reference image is (erroneously) maintained. The equation for e₂ then becomes (12 − 2/η − 8/η² + 6/η³)/8.
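The three expressions for e₂ (standard, semi-standard and rigorous) are easy to compare numerically. The sketch below evaluates them as written in Equations (A4), (A5) and the semi-standard expression above, and locates the zero of the rigorous e₂ by bisection. The function names and the bisection helper are illustrative; only the e₂ term is solved here, so the e₁ (graze-angle dependent) correction to the “sweet spot” is not included.

```python
def e2_standard(eta):
    """Equation (A4): paraxial image reference."""
    return (9 - 5 / eta - 5 / eta**2 + 9 / eta**3) / 8


def e2_semi_standard(eta):
    """Trailing term of Equation (29) included, paraxial reference kept."""
    return (12 - 2 / eta - 8 / eta**2 + 6 / eta**3) / 8


def e2_rigorous(eta):
    """Equation (A5): non-paraxial image reference."""
    return (9 + 7 / eta - 5 / eta**2 - 3 / eta**3) / 8


def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Zero of the rigorous e2 (graze-angle independent part of the sweet spot).
eta_zero = bisect_root(e2_rigorous, 0.5, 1.0)

if __name__ == "__main__":
    for eta in (0.5, 1.0, 2.0):
        print(eta, e2_standard(eta), e2_semi_standard(eta), e2_rigorous(eta))
    print(f"e2_rigorous vanishes near eta = {eta_zero:.3f}")
```

All three expressions coincide at unit magnification (η = 1), and the rigorous e₂ changes sign close to η = 0.7, as stated in the text.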

Compared to the classic horizontal focusing mirror example given above, the errors inherent in the standard light-path formulation are more significant for a surface-normal rotated diffraction grating, and increase further in the presence of varied line-spacing. Firstly, such a geometry gives rise to numerous mixed terms, especially large being those of (i,1), and thus a proliferating 2D power matrix of non-paraxial reference image points. Secondly, the image rotation (1,1) terms cause a transfer of (otherwise insignificant) vertical aberrations into rotated slit-normal (spectral) components. Thirdly, the inclusion of vertical aberrations as image reference coordinates in the rigorous formulation of horizontal light-path terms (and vice versa) requires that even non-dominant aberrations be initially calculated accurately to avoid significant errors in the subsequently expanded higher-degree terms. For the above reasons, the standard formulation does not provide an accurate prediction of the aberrations for the new design, or for other off-plane diffraction mountings in non-stigmatic geometries. As specific examples, refer to the discussion in Section 4 and the Table 2 entries for the horizontal aberrations x′₂₀ and x′₂₁, which are dominant terms of the new optical design and thus require accurate formulation.

Appendix B. A Rigorous Algebraic Expansion in Detail: The (2,1) Horizontal Aberration

From Equation (29):

- \frac{{}_{21}x'(λ)}{η} \sin β = h \left\{ 2 μ N_{21} + \left[ \frac{\partial A_{21}}{\partial (Δ ω)} + \frac{\partial B_{21}}{\partial (Δ ω)} \right]_{ω σ \, \mathrm{coefficient}} \right\}

(B1)

where, from Equation (5)

N_{21} = 3 N_{3} \cos^{2} θ \sin θ

(B2)

from Equation (38):

\frac{\partial A_{21}}{\partial (Δ ω)} = 0

(B3)

and,

\frac{\partial B_{21}}{\partial (Δ ω)} = \left( \cos β - \frac{ξ_{21}}{η} \sin β - \frac{ω}{η} \right) \left( - 1 + \frac{1}{2} {}_{B}^{\infty}t_{21} - \frac{3}{8} {}_{B}^{\infty}t_{21}^{2} + \frac{5}{16} {}_{B}^{\infty}t_{21}^{3} \right)

(B4)

from Equation (35) in which, from Equation (37):

{}_{B}^{\infty}t_{21} = - 2 \left( \frac{ω \cos β}{η} \right) + 2 \left( \frac{ξ_{21} \sin β}{η} \right) \left( \frac{ω}{η} \right) - 2 \left( \frac{ζ_{21}}{η} \right) \left( \frac{σ}{η} \right) + \left( \frac{ξ_{21}}{η} \right)^{2} + \left( \frac{ζ_{21}}{η} \right)^{2}

(B5)

where the reference image terms (ξ₂₁ and ζ₂₁) and their products are derived from Equation (23) as:

\frac{ξ_{21} s i n β}{η} = \frac{s i n β}{η} \sum_{(I, J) \neq (2, 1)} {}_{I J}{x^{'}} ω^{I - 1} σ^{J} = ({}_{20}x' \frac{s i n β}{η}) ω + ({}_{11}x' \frac{s i n β}{η}) σ

(B6)

\begin{matrix} = - [s i n^{2} α + \frac{s i n^{2} β}{η} + 2 μ N_{2} c o s^{2} θ + (μ^{2} s i n^{2} θ) (2 N_{2} c o s β c o s θ + \frac{1}{2} s i n^{2} α) + μ^{3} N_{2} s i n^{2} θ c o s^{2} θ] ω \\ - (μ s i n θ) [2 N_{2} c o s θ (1 + \frac{1}{2} μ^{2} s i n^{2} θ) + (1 + 2 μ N_{2} s i n^{2} θ + μ^{2} s i n^{2} θ) c o s β] σ \end{matrix}

(B7)

\begin{matrix} - \frac{ξ_{21}}{η} s i n β - \frac{ω}{η} = [s i n^{2} α - \frac{c o s^{2} β}{η} + μ C_{2} c o s θ + (μ^{2} s i n^{2} θ) (C_{2} c o s β + \frac{1}{2} s i n^{2} α)] ω \\ + (μ s i n θ) [C_{2} (1 + \frac{1}{2} μ^{2} s i n^{2} θ) + (1 + μ C_{2} s i n θ t a n θ + μ^{2} s i n^{2} θ) c o s β] σ \end{matrix}

(B8)

\frac{ζ_{21}}{η} = \frac{1}{η} \sum_{(I, J) \neq (2, 1)} {}_{I J}{z^{'}} ω^{I} σ^{J - 1} = (\frac{{}_{01}z'}{η}) + (\frac{{}_{02}z'}{η}) σ + (\frac{{}_{11}{z^{'}}}{η}) ω + (\frac{{}_{12}z'}{η}) ω σ

(B9)

\begin{matrix} = (μ \sin θ) (1 + \frac{1}{2} μ^{2} \sin^{2} θ) + [(1 + \frac{1}{η}) + C_{2} μ \sin θ \tan θ + \frac{3}{2} μ^{2} \sin^{2} θ + \frac{3}{2} C_{2} μ^{3} \sin^{3} θ \tan θ] σ \\ + [(C_{2} - \frac{\cos β}{η}) μ \sin θ + (\frac{3}{2} C_{2} - \frac{1}{2} \frac{\cos β}{η}) μ^{3} \sin^{3} θ] ω \\ + \{ - (\cos α + \frac{\cos β}{η}) + (μ \sin θ \tan θ) (2 C_{3} - \frac{1}{η} C_{2} \cos β) \\ + (μ^{2} \sin^{2} θ) [3 C_{2} - \frac{3}{2} (\cos α + \frac{\cos β}{η}) + Q ({C_{2}}^{2} + C_{2} \cos β) + (\frac{1}{ρ^{2}}) (C_{2} + \cos β)] \} ω σ \end{matrix}

(B10)

\begin{array}{rl} {(\frac{ζ_{21}}{η})}^{2} = & (μ^{2} \sin^{2} θ) + [(2 + \frac{2}{η}) (μ \sin θ) + (2 C_{2}) (μ^{2} \sin^{2} θ) (\tan θ) + (4 + \frac{1}{η}) (μ^{3} \sin^{3} θ)] σ \\ & + [(2 C_{2} - 2 \frac{\cos β}{η}) (μ^{2} \sin^{2} θ)] ω \\ & + \{ [(2 + \frac{2}{η}) C_{2} + (- \frac{4}{η} - \frac{2}{η^{2}} - 2 \frac{\cos α}{\cos β}) \cos β] (μ \sin θ) \\ & + (4 C_{3} + 2 {C_{2}}^{2} - \frac{4}{η} C_{2} \cos β) (μ^{2} \sin^{2} θ) (\tan θ) \\ & + [(12 + \frac{3}{η}) C_{2} - (\frac{8}{η} + \frac{1}{η^{2}}) \cos β - 4 \cos α + 2 Q ({C_{2}}^{2} + C_{2} \cos β) \\ & + (\frac{2}{ρ^{2}}) (C_{2} + \cos β)] (μ^{3} \sin^{3} θ) \} ω σ \end{array}

(B11)

Using the substitution

\frac{μ \cos θ}{\sin^{2} β} = \left( \frac{f_{β} \cos β - \cos α}{\sin^{2} β} \right) \equiv Q

(B12)

and expanding (ξ₂₁/η)² to include all the terms in μ⁴/sin²β, which then become terms in μ³:

\begin{array}{l} {[{(\frac{ξ_{21}}{η})}^{2}]}_{ω σ c o e f f .} & = [(\frac{2}{ρ^{2}} + \frac{2}{η}) (C_{2} + c o s β) + (2 {C_{2}}^{2} + 2 C_{2} c o s β) Q] (μ s i n θ) \\ + [(\frac{2}{ρ^{2}} + \frac{2}{η}) C_{2} c o s β + (4 {C_{2}}^{2} c o s β + 2 C_{2} c o s^{2} β) Q] (μ^{2} s i n^{2} θ) (t a n θ) \\ + [(\frac{2}{ρ^{2}} + \frac{1}{η}) C_{2} + (\frac{3}{ρ^{2}} + \frac{2}{η}) c o s β + (2 {C_{2}}^{2} + 3 C_{2} c o s β) Q \\ + (2 Q {C_{2}}^{2} c o s^{2} β) (t a n^{2} θ)] (μ^{3} s i n^{3} θ) \end{array}

(B13)

\begin{array}{r} 2 (\frac{ξ_{21} s i n β}{η}) (\frac{ω}{η}) = [- (\frac{2 C_{2}}{η} + \frac{2 c o s β}{η}) (μ s i n θ) - (\frac{2 C_{2}}{η}) c o s β (μ^{2} s i n^{2} θ) (t a n θ) \\ - (\frac{2 c o s β}{η} + \frac{C_{2}}{η}) (μ^{3} s i n^{3} θ)] ω σ \end{array}

(B14)

\begin{array}{l} - 2 (\frac{ζ_{21}}{η}) (\frac{σ}{η}) & = [- \frac{2}{η} (μ s i n θ) - \frac{μ^{3} s i n^{3} θ}{η}] σ \\ + [(- \frac{2 C_{2}}{η} + 2 \frac{c o s β}{η^{2}}) (μ s i n θ) + (- \frac{3 C_{2}}{η} + \frac{c o s β}{η^{2}}) (μ^{3} s i n^{3} θ)] ω σ \end{array}

(B15)

\begin{array}{rl} {(\frac{ζ_{21}}{η})}^{2} = & (μ^{2} \sin^{2} θ) + [(2 + \frac{2}{η}) (μ \sin θ) + (2 C_{2}) (μ^{2} \sin^{2} θ) (\tan θ) + (4 + \frac{1}{η}) (μ^{3} \sin^{3} θ)] σ \\ & + [(2 C_{2} - 2 \frac{\cos β}{η}) (μ^{2} \sin^{2} θ)] ω \\ & + \{ [(2 + \frac{2}{η}) C_{2} + (- \frac{4}{η} - \frac{2}{η^{2}} - 2 \frac{\cos α}{\cos β}) \cos β] (μ \sin θ) \\ & + (4 C_{3} + 2 {C_{2}}^{2} - \frac{4}{η} C_{2} \cos β) (μ^{2} \sin^{2} θ) (\tan θ) \\ & + [(12 + \frac{3}{η}) C_{2} - (\frac{8}{η} + \frac{1}{η^{2}}) \cos β - 4 \cos α + 2 ({C_{2}}^{2} + C_{2} \cos β) Q \\ & + (\frac{2}{ρ^{2}}) (C_{2} + \cos β)] (μ^{3} \sin^{3} θ) \} ω σ \end{array}

(B16)

Grouping like power terms together:

\begin{matrix} t_{21} = t_{21; 002} μ^{2} + t_{21; 100} ω + t_{21; 102} ω μ^{2} + t_{21; 011} σ μ + t_{21; 012} σ μ^{2} + t_{21; 013} σ μ^{3} \\ + t_{21; 111} ω σ μ + t_{21; 112} ω σ μ^{2} + t_{21; 113} ω σ μ^{3} \end{matrix}

(B17)

where,

t_{21; 002} = \sin^{2} θ

(B18)

t_{21; 100} = - 2 \left( \frac{\cos β}{η} \right)

(B19)

t_{21; 102} = 2 \left( C_{2} - \frac{\cos β}{η} \right) \sin^{2} θ

(B20)

t_{21; 011} = 2 \sin θ

(B21)

t_{21; 012} = 2 C_{2} \tan θ \sin^{2} θ

(B22)

t_{21; 013} = 4 \sin^{3} θ

(B23)

t_{21; 111} = \left[ (2 {C_{2}}^{2} + 2 C_{2} \cos β) Q + (2 + \frac{2}{ρ^{2}}) C_{2} + (\frac{2}{ρ^{2}} - \frac{4}{η} - 2 \frac{\cos α}{\cos β}) \cos β \right] \sin θ

(B24)

t_{21; 112} = \left[ 4 C_{3} + 2 {C_{2}}^{2} + (4 {C_{2}}^{2} \cos β + 2 C_{2} \cos^{2} β) Q + (\frac{2}{ρ^{2}} - \frac{4}{η}) C_{2} \cos β \right] \tan θ \sin^{2} θ

(B25)

\begin{array}{rl} t_{21; 113} = & [(12 + \frac{4}{ρ^{2}}) C_{2} + (\frac{5}{ρ^{2}} - \frac{8}{η}) \cos β - 4 \cos α \\ & + (4 {C_{2}}^{2} + 5 C_{2} \cos β + 2 {C_{2}}^{2} \cos^{2} β \tan^{2} θ) Q] \sin^{3} θ \end{array}

(B26)

From Equation (B4):

{[\frac{\partial B_{21}}{\partial (Δ ω)}]}_{ω σ t e r m} = ⟦ s_{21; 000} + s_{21; 100} ω + s_{21; 101} ω μ + s_{21; 102} ω μ^{2} + s_{21; 011} σ μ + s_{21; 012} σ μ^{2} + s_{21; 013} σ μ^{3} ⟧ \times \begin{matrix} ⟦ \frac{1}{2} {t_{21; 002} μ^{2} + t_{21; 100} ω + t_{21; 102} ω μ^{2} + t_{21; 011} σ μ + t_{21; 012} σ μ^{2} + t_{21; 013} σ μ^{3} + t_{21; 111} ω σ μ \\ + t_{21; 112} ω σ μ^{2} + t_{21; 113} ω σ μ^{3}} \\ - \frac{3}{8} {t_{21; 002} μ^{2} + t_{21; 100} ω + t_{21; 102} ω μ^{2} + t_{21; 011} σ μ + t_{21; 012} σ μ^{2} \\ + t_{21; 013} σ μ^{3} + t_{21; 111} ω σ μ + t_{21; 112} ω σ μ^{2} + t_{21; 113} ω σ μ^{3}}^{2} \\ + \frac{5}{16} {t_{21; 002} μ^{2} + t_{21; 100} ω + t_{21; 102} ω μ^{2} + t_{21; 011} σ μ + t_{21; 012} σ μ^{2} + t_{21; 013} σ μ^{3} \\ + t_{21; 111} ω σ μ + t_{21; 112} ω σ μ^{2} + t_{21; 113} ω σ μ^{3}}^{3} + \dots ⟧_{ω, σ a n d ω σ t e r m s} \end{matrix}

(B27)

where,

s_{21; 000} = \cos β

(B28)

s_{21; 100} = \left( \sin^{2} α - \frac{\cos^{2} β}{η} \right)

(B29)

s_{21; 101} = C_{2} \cos θ

(B30)

s_{21; 102} = \left( C_{2} \cos β + \frac{1}{2} \sin^{2} α \right) \sin^{2} θ

(B31)

s_{21; 011} = (C_{2} + \cos β) \sin θ

(B32)

s_{21; 012} = C_{2} \cos β \tan θ \sin^{2} θ

(B33)

s_{21; 013} = \left( \frac{1}{2} C_{2} + \cos β \right) \sin^{3} θ

(B34)

\begin{array}{l} {[\frac{\partial B_{21}}{\partial (Δ ω)}]}_{ω σ c o e f f i c i e n t} = \\ (\frac{1}{2} s_{21; 000} t_{21; 111} - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 011} t_{21; 100}) μ \\ + (\frac{1}{2} s_{21; 100} t_{21; 012} + \frac{1}{2} s_{21; 101} t_{21; 011} + \frac{1}{2} s_{21; 000} t_{21; 112} + \frac{1}{2} s_{21; 012} t_{21; 100} - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 012}) μ^{2} \\ + (\frac{1}{2} s_{21; 000} t_{21; 113} - \frac{3}{4} s_{21; 000} t_{21; 002} t_{21; 111} - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 013} - \frac{3}{4} s_{21; 000} t_{21; 102} t_{21; 011} \\ + \frac{15}{8} s_{21; 000} t_{21; 002} t_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 100} t_{21; 013} + \frac{1}{2} s_{21; 101} t_{21; 012} + \frac{1}{2} s_{21; 102} t_{21; 011} \\ - \frac{3}{4} s_{21; 100} t_{21; 002} t_{21; 011} + \frac{1}{2} s_{21; 011} t_{21; 102} - \frac{3}{4} s_{21; 011} t_{21; 002} t_{21; 100} + \frac{1}{2} s_{21; 013} t_{21; 100}) μ^{3} \end{array}

(B35)

\begin{array}{l} - \frac{{}_{21}{x^{'}} (λ)}{η} s i n β & = [6 N_{3} c o s^{2} θ s i n θ + \frac{1}{2} s_{21; 000} t_{21; 111} - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 011} \\ + \frac{1}{2} s_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 011} t_{21; 100}] μ \\ + [\frac{1}{2} s_{21; 100} t_{21; 012} + \frac{1}{2} s_{21; 101} t_{21; 011} + \frac{1}{2} s_{21; 000} t_{21; 112} + \frac{1}{2} s_{21; 012} t_{21; 100} \\ - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 012}] μ^{2} \\ + [\frac{1}{2} s_{21; 000} t_{21; 113} - \frac{3}{4} s_{21; 000} t_{21; 002} t_{21; 111} \\ - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 013} - \frac{3}{4} s_{21; 000} t_{21; 102} t_{21; 011} \\ + \frac{15}{8} s_{21; 000} t_{21; 002} t_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 100} t_{21; 013} + \frac{1}{2} s_{21; 101} t_{21; 012} \\ + \frac{1}{2} s_{21; 102} t_{21; 011} - \frac{3}{4} s_{21; 100} t_{21; 002} t_{21; 011} + \frac{1}{2} s_{21; 011} t_{21; 102} \\ - \frac{3}{4} s_{21; 011} t_{21; 002} t_{21; 100} + \frac{1}{2} s_{21; 013} t_{21; 100} \\ + (\frac{1}{2} s i n^{2} θ) (6 N_{3} c o s^{2} θ s i n θ + \frac{1}{2} s_{21; 000} t_{21; 111} - \frac{3}{4} s_{21; 000} t_{21; 100} t_{21; 011} \\ + \frac{1}{2} s_{21; 100} t_{21; 011} + \frac{1}{2} s_{21; 011} t_{21; 100})] μ^{3} \end{array}

(B36)

yielding Equation (47) as it appears in the main text:

- {}_{21}{x^{'}} \frac{s i n β}{η} = [2 C_{3} + (1 - \frac{1}{η} + Q C_{2} + \frac{1}{ρ^{2}}) C_{2} c o s β + (Q C_{2} + \frac{1}{ρ^{2}}) c o s^{2} β - (c o s α + \frac{c o s β}{η}) c o s β] μ s i n θ + {C_{2} + \frac{1}{Q ρ^{2}} + [(2 C_{3} + {C_{2}}^{2}) c o s β + (\frac{1}{ρ^{2}} - \frac{1}{η} + 2 Q C_{2} + Q c o s β) C_{2} c o s^{2} β] t a n^{2} θ} μ^{2} s i n θ c o s θ + [C_{3} + 2 {C_{2}}^{2} + (\frac{7}{2} + Q C_{2} + \frac{1}{ρ^{2}}) C_{2} c o s β + \frac{3}{2} (Q C_{2} + \frac{1}{ρ^{2}}) c o s^{2} β - (c o s α + \frac{c o s β}{η}) c o s β + (Q {C_{2}}^{2} c o s^{3} β) t a n^{2} θ] μ^{3} s i n^{3} θ

(B37)
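The bookkeeping that leads from Equation (B27) to Equation (B35) can be checked mechanically. The sketch below multiplies the s-polynomial by the truncated series −1 + t/2 − 3t²/8 + 5t³/16 (whose coefficients coincide with the leading binomial terms of −(1 + t)^(−1/2)), using arbitrary random numbers for the s and t coefficients, and compares the resulting ωσμ, ωσμ² and ωσμ³ coefficients against the closed-form sums of Equation (B35). The polynomial helpers, the dictionary layout and the random test values are all illustrative assumptions; nothing here reproduces the paper's own software.

```python
import random
from itertools import product

# Keep only powers omega <= 1, sigma <= 1, mu <= 3 (all that (B35) needs).
MAX_POWERS = (1, 1, 3)


def key(code):
    """Convert an index string such as '102' into exponents (1, 0, 2)."""
    return tuple(int(ch) for ch in code)


def mul(p, q):
    """Multiply two polynomials stored as {(i, j, k): coeff}, with truncation."""
    out = {}
    for (a, ca), (b, cb) in product(p.items(), q.items()):
        k = tuple(x + y for x, y in zip(a, b))
        if all(e <= m for e, m in zip(k, MAX_POWERS)):
            out[k] = out.get(k, 0.0) + ca * cb
    return out


def lincomb(terms):
    """Sum of scale * polynomial over (scale, polynomial) pairs."""
    out = {}
    for scale, poly in terms:
        for k, c in poly.items():
            out[k] = out.get(k, 0.0) + scale * c
    return out


def b35(s, t):
    """Closed-form omega*sigma coefficients of mu, mu^2, mu^3 per Equation (B35)."""
    S = lambda c: s[key(c)]
    T = lambda c: t[key(c)]
    mu1 = (0.5 * S("000") * T("111") - 0.75 * S("000") * T("100") * T("011")
           + 0.5 * S("100") * T("011") + 0.5 * S("011") * T("100"))
    mu2 = (0.5 * S("100") * T("012") + 0.5 * S("101") * T("011")
           + 0.5 * S("000") * T("112") + 0.5 * S("012") * T("100")
           - 0.75 * S("000") * T("100") * T("012"))
    mu3 = (0.5 * S("000") * T("113") - 0.75 * S("000") * T("002") * T("111")
           - 0.75 * S("000") * T("100") * T("013")
           - 0.75 * S("000") * T("102") * T("011")
           + 1.875 * S("000") * T("002") * T("100") * T("011")
           + 0.5 * S("100") * T("013") + 0.5 * S("101") * T("012")
           + 0.5 * S("102") * T("011") - 0.75 * S("100") * T("002") * T("011")
           + 0.5 * S("011") * T("102") - 0.75 * S("011") * T("002") * T("100")
           + 0.5 * S("013") * T("100"))
    return mu1, mu2, mu3


rng = random.Random(0)  # arbitrary test values, not physical parameters
t = {key(c): rng.uniform(-1, 1) for c in
     ("002", "100", "102", "011", "012", "013", "111", "112", "113")}
s = {key(c): rng.uniform(-1, 1) for c in
     ("000", "100", "101", "102", "011", "012", "013")}

t2 = mul(t, t)
t3 = mul(t2, t)
series = lincomb([(-1.0, {(0, 0, 0): 1.0}), (0.5, t), (-0.375, t2), (0.3125, t3)])
expanded = mul(s, series)

brute = tuple(expanded.get((1, 1, k), 0.0) for k in (1, 2, 3))
closed = b35(s, t)

if __name__ == "__main__":
    for k, (a, b) in enumerate(zip(brute, closed), start=1):
        print(f"mu^{k}: brute force {a:+.12f}, Equation (B35) {b:+.12f}")
```

Because the t-polynomial has no constant term, powers of t beyond the third cannot contribute to the ωσ coefficients through μ³, so the cubic truncation is sufficient for this comparison.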

Appendix C. Cancelation of Vertical Deflection: 3-Axis Grating Rotation

The grating may additionally be rotated about its third axis, namely that defined in Figure 1 and Figure 2b by the meridional coordinate ω. From Equation (9) it is approximated that such rotation by an angle

Ω ≃ \arcsin \left( \frac{- μ \sin θ}{\sin α + \sin β} \right)

(C1)

will correct for the principal ray vertical deflection (z′₀₁). This new scan angle is plotted in Figure C1 for the ultra-high resolution monochromator parameterized in the paper. Also plotted are the scan angles required of the exit slit (𝜓′) and entrance slit (𝜓) re-oriented by subtracting Ω from Equations (55) and (61), respectively. Using the longest scan wavelength (λ_max/λ_o = 7.9945343) as the most sensitive test, these approximations yield Ω ≃ 4.1488°, 𝜓 ≃ −12.158° and 𝜓′ ≃ 4.3898°.

Figure C1. Scan angles of the grating (δ, θ, Ω) and slits (−𝜓, 𝜓′) for a 3-axis grating rotation where z′₀₁ = 0. Compare to the 2-axis grating rotation (Figure 4) where z′₀₁ ≠ 0.

Numerical raytracings have also been performed to independently, and more precisely, determine the optimum 3-axis grating orientation at the scanned wavelength. This was conveniently provided by the “auto-adjust” routine of BEAM4 [16] in a least-squares minimization of the spectral image width. At λ_max this numerical procedure yielded δ = 1.3951844°, θ = 71.40978°, Ω = 4.1525°, 𝜓 = −12.158° and 𝜓′ = 4.4083°. For reference, the scan angles for the 2-axis grating rotation (Ω = 0) are δ = 1.3915105°, θ = 71.24507°, 𝜓 = −8.009° and 𝜓′ = +8.5386°, resulting in z′₀₁ = −84.03 mm. In the 3-axis scan proposed here, the fixed trajectory of the diffracted principal ray simplifies the exit slit rotation to be that about a stationary axis (located at x′₁₀ = 0, z′₀₁ = 0). Full-aperture numerical raytracings also reveal spot diagrams and spectral resolution of the undeflected images (z′₀₁ = 0) which are nearly identical to those shown in Figure 6 (for which the vertical deflection at λ_max is ~84 mm). Thus, the aberration correction is maintained.
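The numbers quoted in this appendix can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 2-axis slit angles quoted at λ_max are the values given by Equations (55) and (61), so that subtracting the approximate Ω from them should reproduce the approximate 3-axis slit angles. That identification is an inference from the text, and the variable names are illustrative.

```python
# Angles quoted in the text at the longest scan wavelength, in degrees.
omega_approx = 4.1488        # from the approximation of Equation (C1)
omega_raytrace = 4.1525      # from the numerical optimization

psi_2axis = -8.009           # entrance slit angle, 2-axis scan (Omega = 0)
psi_prime_2axis = 8.5386     # exit slit angle, 2-axis scan (Omega = 0)
psi_prime_raytrace = 4.4083  # exit slit angle, 3-axis scan, raytraced

# Re-orient the slits by subtracting Omega, as described in the text.
psi_3axis = psi_2axis - omega_approx
psi_prime_3axis = psi_prime_2axis - omega_approx

print(f"psi   (3-axis, approx) = {psi_3axis:.4f} deg")
print(f"psi'  (3-axis, approx) = {psi_prime_3axis:.4f} deg")
print(f"Omega (raytrace - approx) = {omega_raytrace - omega_approx:.4f} deg")
print(f"psi'  (raytrace - approx) = {psi_prime_raytrace - psi_prime_3axis:.4f} deg")
```

Under these assumptions the subtraction reproduces the quoted approximate slit angles (−12.158° and 4.3898°), and the approximate and raytraced values of Ω differ by about 0.004°.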

From a mechanical engineering standpoint, upgrading from a 2-axis to a 3-axis goniometer for precise rotation of the grating is more compact and convenient than providing vertical translations of the exit slit rotation axis and the downstream target. Note that both these translations would need to be accurately coordinated with the basic 2-axis grating rotation. Furthermore, in the case of complex or large target stations, it is often impractical to even consider such translation.

The proposed grating rotation about its meridional axis will also remove the vertical deflection inherent in the (simpler) pure surface normal rotation monochromator [2]. Given rotation about both its surface normal axis and its meridional axis, a constant line-space concave grating can focus on its Rowland circle and its principal ray remains fixed in the horizontal and vertical directions.

References

1. Hettrick, M.C. In-focus monochromator: Theory and experiment of a new grazing incidence mounting. Appl. Opt. 1990, 29, 4531–4535.
2. Hettrick, M.C. Surface normal rotation: A new technique for grazing incidence monochromators. Appl. Opt. 1992, 31, 7174–7178.
3. Hettrick, M.C. Grating Monochromators and Spectrometers Based on Surface Normal Rotation. U.S. Patent 5,274,435, 1993.
4. Greiner, C.M.; Iazikov, D.; Mossberg, T.W. Diffraction-limited performance of flat-substrate reflective imaging gratings patterned by DUV photolithography. Opt. Express 2006, 14, 11952–11957.
5. Hunter, W.R.; Williams, R.T.; Rife, J.C.; Kirkland, J.P.; Kabler, M.N. A Grating/Crystal Monochromator for the Spectral Range 5 eV to 5 keV. Nucl. Instr. Meth. Phys. Res. 1982, 195, 141–153.
6. Petersen, H. The Plane Grating and Elliptical Mirror: A New Optical Configuration for Monochromators. Opt. Commun. 1982, 40, 402–406.
7. Lu, L.; Cocco, D.; Jark, W. Simple plane grating monochromator for synchrotron radiation. Nucl. Instr. Meth. Phys. Res. A 1994, 339, 604–609.
8. Hettrick, M.C.; Bowyer, S. Varied line-space gratings: New designs for use in grazing incidence spectrometers. Appl. Opt. 1983, 22, 3921–3924.
9. Hettrick, M.C.; Underwood, J.H.; Batson, P.J.; Eckart, M.J. Resolving Power of 35,000 (5 mÅ) in the extreme ultraviolet employing a grazing incidence spectrometer. Appl. Opt. 1988, 27, 200–202.
10. Itou, M.; Harada, T.; Kita, T. Soft x-ray monochromator with a varied-space plane grating for synchrotron radiation: Design and evaluation. Appl. Opt. 1989, 28, 146–153.
11. Namioka, T. Theory of the Concave Grating I. J. Opt. Soc. Am. 1959, 49, 449–460.
12. Beutler, H.G. The Theory of the Concave Grating. J. Opt. Soc. Am. 1945, 35, 311–350.
13. Howells, M.R. X-ray Data Booklet, 3rd ed.; 2009; Section 4.3.B; pp. 18–23. Available online: http://xdb.lbl.gov/xdb-new.pdf (accessed on 16 December 2015).
14. Born, M.; Wolf, E. Principles of Optics, 6th ed.; Pergamon Press Ltd.: Oxford, England, 1980; p. 207.
15. Hettrick, M.C. Optical Design and Experimental Development of Grazing Incidence Fixed Slit Spectrometers for High Resolution Plasma Diagnostics. Ph.D. Thesis, The Graduate University for Advanced Studies, Department of Fusion Science, Gifu, Japan, 1996. Available online: http://www.hettrickscientific.com/pdf/thesis/Hettrick%20phd%20thesis.pdf (accessed on 3 March 2016).
16. Stellar Software. BEAM FOUR: Optical Ray Tracer, Java v. 154. Available online: http://www.stellarsoftware.com (accessed on 22 October 2015).
17. Harada, T.; Kita, T. Mechanically ruled aberration-corrected concave gratings. Appl. Opt. 1980, 19, 3987–3993.
18. Frassetto, F.; Miotti, P.; Poletto, L. Grating Configurations for the Spectral Selection of Coherent Ultrashort Pulses in the Extreme-Ultraviolet. Photonics 2014, 1, 442–454.

© 2016 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

