k -Version of Finite Element Method for BVPs and IVPs

: The paper presents k -version of the finite element method for boundary value problems (BVPs) and initial value problems (IVPs) in which global differentiability of approximations is always the result of the union of local approximations. The higher order global differentiability approximations (HGDA/DG) are always p -version hierarchical that permit use of any desired p -level without effecting global differentiability. HGDA/DG are true C i , C ij , C ijk , hence the dofs at the nonhierarchical nodes of the elements are transformable between natural and physical coordinate spaces using calcu-lus. This is not the case with tensor product higher order continuity elements discussed in this paper, thus confirming that the tensor product approximations are not true C i , C ijk , C ijk approximations. It is shown that isogeometric analysis for a domain with more than one patch can only yield solutions of class C 0 . This method has no concept of finite elements and local approximations, just patches. It is shown that compariso of this method with k -version of the finite element method is meaningless. Model problem studies in R 2 establish accuracy and superior convergence characteristics of true C ij p version hierarchical local approximations presented in this paper over tensor product approximations. Convergence characteristics of p -convergence, k -convergence and pk -convergence are illustrated for self adjoint, non-self adjoint and non-linear differential operators in BVPs. It is demonstrated that h , p and k are three independent parameters in all finite element computations. Tensor product local approximations and other published works on k -version and their limitations are discussed in the paper and are compared with present work.


Introduction
In physical sciences, the mathematical descriptions of the deformation of continuous media (solid or fluent continua) derived using conservation and balance laws of thermodynamics and associated constitutive theories lead to initial value problems (IVPs) or boundary values problems (BVPs). Initial value problems describe evolution, hence the dependent variables in their mathematical description exhibit simultaneous dependence on spatial coordinates and time. In case of boundary value problems that are stationary states of evolutions described by IVPs, the dependent variables only exhibit dependence on spatial coordinates. In the absence of permanent damage to the continuum during deformation, the solutions for the BVPs and IVPs related to deformation of the continua must be continuous and must possess continuous derivatives up to certain order which depends upon the differential operator describing the IVP or the BVP and their theoretical solutions. Physical processes do not admit being discontinuous unless there is damage to the continuum. In many cases what is viewed as discontinuous behavior is often an issue of scale. For example, the shocks in compressible flow, if viewed at a bigger scale, may appear as a discontinuous phenomenon, but on closer examination at a finer scale these are indeed continuous and differentiable. This basic assumption that solution of all BVPs and IVPs in physical sciences are generally analytic is the foundation of the work presented in this paper.

Boundary Value Problems (BVPs)
Consider the following BVP (1) in which a A is differentiable operator, the variable φ is unknown and depends on f , and x is an independent variable that could be x or x, y or x, y, z, but for the sake of simplicity we continue with x only and Ω x is the domain of definition. Let the differential operator contain derivatives of φ up to orders 2m. A theoretical solution φ of (1) must be at least of class C 2m (Ω x ) if derivatives of φ of up to orders 2m are continuous and differentiable. This is due to the fact that the continuity and differentiability aspects of the physics used to derive BVP (1) necessitate this feature of the theoretical solution of (1). A theoretical solution φ of (1) can be of class higher than C 2m (Ω x ), this of course depends upon the non-homogeneous function f (x). Thus, a solution of (1) must belong to subspace V such that V is a subspace of scalar product space H k (Ω x ). k is the order of the scalar product space.
A scalar product space of order k contains functions that have continuous derivatives up to orders k − 1 and the derivative of order k can have isolated discontinuities for some x ∈Ω x , but are square integrable. For the BVP (1), k must at least be 2m + 1 (functions φ of class C 2m ), but k > 2m + 1 or k → ∞ (if for example φ consists of trigonometric functions) are admissible as well. When k = 2m + 1, V is called minimally conforming space, i.e., a solution of (1) cannot be of a class lower than C 2m (Ω x ). When seeking finite element solutions for (1) we discretize the spatial domainΩ x intō Ω T x = eΩ e x in whichΩ e x is a finite element.Ω T x is called discretization ofΩ x or a finite element mesh. We consider φ e h to be local approximation of φ overΩ e x , the domain of an element e and φ h to be the approximation of φ overΩ T x , global approximation of φ for the discretizationΩ T x such that and φ e h (x) = In (4), N i (x) are approximation functions and δ e i are nodal dofs. The approximation functions N i (x) are generally obtained using linear combination of algebraic monomials (interpolation theory) up to degree p. This is convenient as well as prudent. Thus, φ e h of class C p (Ω e x ) is always ensured. However, the differentiability φ h (x) ∀ x ∈Ω T x (global differentiability) depends upon (3), i.e., the union of local approximations (φ e h 's) controls the global differentiability of φ h over the discretizationΩ T x . It is obvious that φ h over each Ω e x is of class C p (Ω e x ), hence the differentiability of φ h ∀ x ∈Ω T x is controlled by the differentiability of φ h over the inter-element boundaries of elementsΩ e x . Thus, we could design local approximations φ e h (in (4)) of some polynomial degree p such that (3) gives us the desired global differentiability of φ h at the inter-element boundaries, always less than the differentiability of order p of φ h over Ω e x , interior of each element. The local approximation φ e h overΩ e x of polynomial order p that yield global differentiability of order q at the inter-element boundaries (q < p) are called local approximations of class C q (Ω e x ). Thus, the global differentiability of φ h is of order q due to the local approximation φ e h yielding global differentiability of order q at the inter-element boundaries. Thus, to achieve global differentiability of φ h of certain order we must design local approximations such that (3) in fact gives us the desired global differentiability of φ h at the inter-element boundaries.

k-Version of Finite Element Method
We note that (Surana et al. [1][2][3]) order of the space k in H k,p (Ω x ) giving global differentiability of order k − 1 is an independent parameter in addition to h and p in all finite element processes, thus all finite element computations are dependent on three independent parameters h, p, and k. Just as h and p must be incorporated in local approximations, k also needs to be considered in designing local approximations φ e h so that (3) would yield desired global differentiability of φ h at the inter-element boundaries. h, p refinement cannot change or alter k as it is independent of h and p. Thus, we have k-version of finite element method in addition to hand p-versions. Derivation of higher order global differentiability local approximations for BVPs, their accuracy, convergence behavior and comparisons with published works is the thrust of the work presented in this paper.

Initial Value Problems (IVPs) and k-Version
Consider an initial value problem in which A is a space-time differential operator. Let 2m 1 and 2m 2 be the highest orders of derivations of φ in space x and time t in the (5), then for a solution of φ we have The scalar product space H (k 1 ,k 2 ) (Ω xt ) contains functions that are admissible in (5). k 1 , k 2 are orders of the space in space and time and we have k 1 ≥ 2m 1 + 1; k 2 ≥ 2m 2 + 1 (7) in which k 1 = 2m 1 + 1 and k 2 = 2m 2 + 1 correspond to minimally conforming space in space and time. If we consider space-time coupled finite element processes for obtaining approximation solutions for (5), then we discretize the space-time domainΩ xt intō Ω T xt = ∪ eΩ e xt .Ω T xt is the discretization of the space-time domainΩ xt in whichΩ e xt is a space-time element. Let φ e h (x, t) be local approximation of φ(x, t) overΩ e xt and φ h (x, t) be approximation of φ(x, t) over the discretizationΩ T xt such that and N i (x, t) are space-time local approximation functions and δ e i are nodal dofs for a space-time element e with domainΩ e xt . For the IVP with highest orders of derivatives of orders 2m 1 and 2m 2 in space and time require φ h (x, t) overΩ T xt to be at least of class 2m 1 in space and of class 2m 2 in time, i.e., k 1 = 2m 1 + 1 and k 2 = 2m 2 + 1 correspond to minimally conforming space in space and time. In general we must have k 1 ≥ 2m 1 + 1 and k 2 ≥ 2m 1 + 1 and k 1 → ∞ and k 2 → ∞ are admissible. In general we can write φ h (x, t) must be of class C qr (Ω T xt ), in which q and r are orders of global differentiability of φ h (x, t) in space and time. Hence φ h (x, t) overΩ e xt must be such that φ h = e φ e h yields global differentiability of orders q and r in space and time at the inter-element boundaries. Only then φ h (x, t) of class C qr (Ω T xt ) is possible. Thus, we see that regardless of BVP or IVP, the local approximation must be designed in such a way that their union yields desired global differentiability at the inter-element boundaries.

Literature Review
The importance of the higher order global differentiability of the finite element solutions for BVPs and IVPs have been recognized for quite some time [4][5][6][7][8][9][10]. More recently, Surana and coworkers [11][12][13][14][15][16] pointed out applications in BVPs and IVPs in which higher order global differentiability approximations were shown to be highly meritorious. First formal presentation and introduction of k-version of finite element for BVPs in which k is the order of the approximation space is due to Surana et al., in the three basic papers [1][2][3] devoted to self adjoint, non-selfadjoint and nonlinear differential operators. In these works, variationally consistent (VC) as well as variationally inconsistent (VIC) integral forms [17,18] were used in the model problem studies. In references [19][20][21] Surana et al., presented 2D HGDA/TP as well as HGDA/DG finite element formulations.
Since the publication of the works by Surana et al., on k-version, there have been many attempts with different techniques to address the higher order global differentiability issues primarily in context with the BVPs. In the following we discuss four typical publications that appeared in 2010-2020 that are typical of higher order global differentiability published works. We regret to point out that none of these works acknowledge or reference the large number of published works and two textbooks by Surana et al., that address systematic and general methodologies for deriving HGDA/DG or HGDA/TP in R 2 and R 3 including model problem studies, applications, error estimation and convergence rates. In reference [22], the authors claim to extend tensor product concepts of 1D approximation to derive what they call as C1-Q k family of finite elements. In reference [23] C 1 basis functions are presented, that contain same dofs at the corner nodes that arise due to tensor product of 1D C 1 functions in natural coordinate space. These cannot be transformed to physical coordinate space [17,18], authors in the paper do this incorrectly. Reference [24] claims the QH8-C1 as new innovation in higher order continuity. No reference is made to any of the published works, the work has no mathematical foundation. In reference [25], the choices of the dofs and the geometrical approach used has no mathematical foundation either. There are many other works of similar type in which some approach (mostly without mathematical basis) is used to achieve C 1 continuity. Isogeometric analysis [26,27] also claims the method to be k-version. In subsequent sections we clearly demonstrate that this is indeed not the case. Unfortunately we also find the motivation of k-version as presented by Surana et al. [1][2][3] and discussed in this paper is misrepresented in many of the published works on isogeometric analysis as well as other works. We point out this subsequently so that the readers of this paper have better understanding of the motivation for k-version of FEM initiated by Surana et al. First, we make some remarks. Remark 1.

1.
In all publications only C 1 global differentiability is addressed for a fixed degree of polynomials of the functions in the local approximation.

2.
No published work on C q , C qr , C qrs (in R 1 , R 2 and R 3 ) with q, r, s > 1 higher order global differentiability local approximations is available.

3.
The published C 1 works are not higher degree, hierarchical, i.e., degree of local approximations cannot be increased to any arbitrary value p while maintaining C 1 nature of approximation, hence hierarchical nature of approximations is naturally not possible.

4.
In summary there is no unified and sound mathematical framework that exists at present for constructing C q , C qr and C qrs ; q, r, s ≥ 1 local approximation with global differentiability in R 1 , R 2 and R 3 that allow the use of arbitrarily higher order degree polynomials and in which the local approximations are hierarchical.

Scope and Approach Used in the Present Work
The scope of work published in this paper is summarized in the following.

1.
The paper presents a unified methodology and mathematical infrastructure for deriving higher order global differentiability approximations of class C q , C qr , C qrs ; p, q, r ≥ 0 for BVPs and IVPs in R 1 , R 2 and R 3 .

2.
Since the global differentiability of approximation φ h for discretizationΩ T x orΩ T xt is due to local approximation φ e h overΩ e x orΩ e xt , i.e., Thus, the local approximation φ e h overΩ e x orΩ e xt must be designed in such a way that desired globally differentiability of φ h (i.e., of orders q or qr or qrs) is achievable at the inter-element boundaries of the discretizations. This approach is essential in all finite element processes. This is due to the fact the union of element approximation must always establish the global approximation φ h over the discretizationΩ T x orΩ T xt .

3.
The approximations in R 1 , R 2 and R 3 of type C q , C qr and C qrs must be p-version hierarchical, i.e., for chosen q, r, s, we must be able to increase p-levels to whatever value we desire without affecting order of global differentiability q, r, s. If possible, lower p-levels must be embedded (complete subset) in the higher p-levels (hierarchical property or embedding property) so that the computations performed at lower plevels can be used when doing computations at higher p-level.

4.
The derivation of the local approximations must always be in natural coordinate space with transparent transformations to physical coordinate space to achieve the desired global differentiability in the physical coordinate space.

5.
In all derivations of the type C q , C qr , C qrs in R 1 , R 2 and R 3 the nodal configuration for geometry and for the dofs must remain the same regardless of the choices of q, r, s and p-levels. This permits use of a single discretization for all studies if h-refinement is not used. 6.
All derivations must initialize with C 0 , C 00 and C 000 p-version hierarchical local approximation of arbitrary p-level in the derivations of C q , C qr , C qrs HGDA/DG. The derivations must retain the hierarchical structure of increasing p-level beyond the minimum polynomial degree needed for q, qr, qrs orders of global differentiability.

C q
(Ω e x ) Local Approximation in R 1 Consider a three node 1D element e with domainΩ e x of length h e in the physical coordinate space x, mapped in the natural coordinate space ξ in a two unit length ( Figure 1). Let 1, 2, 3 be local node numbers then describes mapping of a point in ξ-and x-spaces. N i (ξ) are standard quadratic Lagrange functions. Figure 2 shows a typical inter-element boundary between element e − 1 and e.
non−hierarchical node hierarchical node For the global approximation φ h = e φ e h to be of class C q (Ω T x ), the local approximation φ e h overΩ e x must be of class C q (Ω e x ), i.e., the union of local approximations φ e h must yield global differentiability of order C q at the inter-element boundaries. This requires that we construct local approximation φ e h for an element e such that φ, . . , q are the dofs at the nonhierarchical nodes (boundary nodes) so that at the inter-element boundaries continuity of φ, d i φ dx i ; i = 1, 2, . . . , q will be enforced. We shall see that in the interior (Ω e x ) of each element φ e h , φ h is always of class higher than C q . We begin with a C 0 p-version hierarchical local approximation that is 1D local approximation in natural coordinate space ξ. Using the three node configuration of Figure 1b we can write the following. The pversion hierarchical local approximation (using φ instead of φ e h ) of class C 0 and of p-level p is given by Here φ 1 and φ 3 are nodal dofs at nonhierarchical nodes 1 and 3. This local approximation ensures continuity of φ (only) at the inter-element boundaries (nodes 1 and 3), hence φ e h is class C 0 (Ω e x ) and therefore would yield x ) local approximation will require φ and dφ dx as dofs at non hierarchical nodes 1 and 3 ( Figure 1). Thus, if we begin with (11), then φ is already a nodal dof at nodes 1 and 3 and dφ dx at nodes 1 and 3 can be established by borrowing two terms from the sum in (11) and converting them to dφ dx at nodes 1 and 3. First, we write (11) as (borrowing two terms form the sum in (11)).
Using ( and then substitute them back in (12) to obtain in which we note that (for equally spaced nodes in x space) (15) using (15) and we have (16) in which N 1 1 = JN 1 1 and n 1 (16) is the desired p-version local approximation of class C 1 (Ω e x ) in which if we choose p ξ = 3 then there are no dofs at the hierarchical node 2. Each p-level increase beyond p-level of 3 adds one dof at the hierarchical node 2.
dx 2 as dofs at node 1 and 3 of the element of Figure 1. Thus, compared to C 0 (Ω e x ) local approximation of (11), we need additional two dofs at nodes 1 and 3. We borrow four terms from the sum in (11).
This is the desired C 2 (Ω e x ) approximation that would yield global approximationΩ T x of class C 2 (Ω T x ). 4.3. C q (Ω e x ) Local Approximation Following the procedure presented for local approximation of types C 1 (Ω e ) and C 2 (Ω e ), we can derive the following for C q (Ω e ) type local approximations in one dimension that ensure continuity of φ of order q over the discretizationΩ T x ofΩ x . For φ(ξ) of class C q (Ω e x ) and C q (Ω T x ), we can write The expression for φ(ξ) in (19) allows local approximations that yield global differentiability of order q and will permit increase in p-level up to any arbitrary value p ξ beyond the minimum p-level needed for C q (Ω e x ).

C qr
(Ω e xy ) Local Approximation in R 2 Regarding HGDA or HGDA/DG in R 2 there is no single theory, methodology or even a unique and general mathematical procedure in the published works that can be used to derive them. The basic point of confusion begins with the very definition of what C qr (Ω e xy ) local approximations are. From C qr (Ω e xy ), it is clear that we want global differentiability of orders q and r in x and y, but what is not clear is the fact in context with local approximation how do we achieve this in a unique and hierarchical manner. Definition 1. If n and t are normal and tangential directions at an inter-element boundary Γ (Figure 3), then global differentiability of orders q and r of φ in the n and t directions requires that ∂ i+j φ ∂n i ∂t j i = 0, 1, . . . , q; j = 0, 1, . . . , r i + j < q + r (20) be unique at every point Q along the boundary Γ.  Figure 3 shows a four element discretization in R 2 . If φ is a dependent variable, then the C qr (Ω e xy ) local approximation of continuity require continuity of the derivatives of φ in normal (n) and tangential (t) directions of orders q and r, respectively, along the interelement boundary, i.e., these derivations must be unique on Γ regardless of whether we use element a or b . For example along the inter-element boundary Γ between elements a and b the following must hold in which φ a h and φ b h are local approximation for elements a and b of the four element discretization. The orthogonal normal and tangential directions n and t can be transformed to x, y orthogonal direction, thus (21) implies that the following must hold along an inter-element boundary Γ between elements a and b.
Thus, in a discretizationΩ T xy = eΩ e xy , using (nine node p-version) elements, an el-ementΩ e x shown in Figure 4 in xy space with its map in natural coordinates space ξη shown in Figure 5, must have the nodal dofs at the corner nodes for (nonhierarchical nodes) C qr (Ω e xy ) and C qr (Ω ξη ) differentiability approximations in xy and ξη spaces as shown in Figures 4 and 5. We note that local approximations of class C qr (Ω e xy ) require minimum p-levels of 2q + 1 and 2r + 1. For p-levels p ξ = 2q + 1 and p η = 2r + 1, the hierarchical nodes have no dofs. For p ξ > 2q + 1 and p η > 2r + 1 the dofs at the nonhierarchical corner nodes remain unaffected but additional dofs appear at the hierarchical nodes. It is more meaningful to represent dofs in terms of nodal variable operators (obtained by removing dependent variable). When the nodal variable operators act on a dependent variable they produce dofs. These are shown in Figures 4 and 5 in xy and ξη coordinate spaces. Nodal variable operators for the nonhierarchical nodes of a nine node p-version element in xy and ξη spaces for C 11 , C 22 and C 33 classes are listed in Table 1. Table 1. Nodal variable operators at the nonhierarchical nodes for C 11 , C 22 , C 33 classes in xy and ξη spaces.

1.
We note that dofs for C 11 , C 22 and C 33 listed in Table 1 at the nonhierarchical nodes in ξη space can be transformed into those listed in xy space. Transformation of the dofs at nonhierarchical nodes is always possible in general between C qr (Ω ξη ) and C qr (Ω e xy ) (see Section 6).

2.
Based on Remark 1, it is convenient to generate C qr (Ω ξη ) approximation first (nodal dofs or nodal variable operators and associated functions) and then transform the nodal dofs or nodal variable operators at the nonhierarchical nodes into xy space forΩ e xy . We keep in mind that all derivations initiate with p-version C 00 hierarchical local approximation in ξη spaces, as this approach will allow us to retain the feature of increasing p-levels beyond (2q + 1) and (2r + 1) in ξ and η directions without influencing the dofs at the nonhierarchical nodes responsible for C qr (Ω e xy ) continuity at the inter-element boundaries and will also retain the hierarchical nature of C qr (Ω e xy ) approximation.

Transformation of Dofs between C qr (Ω ξη ) and C qr (Ω xy ) at the Nonhierarchical Nodes
Let define a mapping of points between ξη and xy configurations of a nine node p-version hierarchical element. For suitably chosen mapping, the lengths between ξη and xy are defined by in which In the following we define the rule of transformation between {φ} Using the following notation, we can transform dofs between natural and physical coordinate spaces [17,20]. and From (29), (35) and (37) it follows that we can write the following general expression.
Let us use the abbreviation HGDA/DG to denote higher order global differentiability approximation. The symbol C qr (Ω e xy ) denotes a HGDA/DG of class q and r in respective coordinates x and y onΩ e xy . To show specific details that are needed in (41), we consider C 11 (Ω e xy ), C 22 (Ω e xy ), C 33 (Ω e xy ), . . . local approximations.

2D C 33 HGDA/DG Elements
For this element φ x , φ y , φ x 2 , φ xy , φ y 2 , φ x 3 , φ x 2 y , φ xy 2 , and φ y 3 are the dofs needed at the nonheirarchical nodes in addition to φ. Compared to C 00 p-version element, we need nine additional dofs at each of the nonhierarchical nodes (a total of thirty six). We borrow φ ξ 2 , φ ξ 3 , φ ξ 4 , φ ξ 5 , φ ξ 6 and φ ξ 7 from each of the four mid side hierarchical nodes (a total of twenty four). The remaining twelve dofs are borrowed from node nine as shown in Figure 8. See [17][18][19][20][21] for the rationale of this choice and for more details.
Additional dofs from the center node of C 00 p-version element for C 22 HGDA element Additional dofs from the center node of C 00 p-version element for C 33 HGDA element

2D C ij HGDA/DG Elements
The dofs {δ e } of C 00 p-version element are partitioned into {δ e co }, {δ e m }, {δ e c } and {δ e mc } in which co, m, c and mc stand for corner nodes, mid side nodes, center nodes and mid side plus center node and we can write the following: We note that for 2D C 11 HGDA/DG element, we do not need to borrow dofs from node 9 of 2D C 00 p-version element. Hence, {δ e mc } el = {δ e m } el . If {δ e } xy n are the new dofs at the nonhierarchical nodes of 2D C ij HGDA/DG element, then for 2D C ij HGDA/DG we have (in the natural coordinate space) the following at nodes 1, 3, 5 and 7.
We differentiate (42) with respect to ξ and η (as needed) and evaluate these at each of the non hierarchical nodes to obtain the following: Solving for the dofs to be eliminated, i.e., {δ e mc } el in (45), we obtain Substituting Jacobian of transformation from (41) into the above equation, we can transform the new derivative dofs from ξη space to xy space. Equation (46) can thus be written as Now, substituting {δ e mc } el from (47) into (42), Collecting terms in the (48), we obtain the final form of the C ij HGDA/DG local approximations as follows: Remark 3.

1.
The derivation is based on 2D C 00 p-version hierarchical approximation.

2.
Dofs from the hierarchical nodes of C 00 element are borrowed to generate desired degrees of freedom first in ξη space which are then transformed to xy space.

4.
Dofs at the non hierarchical nodes of 2D C ij HGDA/DG are transformable between ξη and xy spaces transparently (an intrinsic feature of truly C ij elements).

HGDA/TP: Higher Order Global Differentiability Elements Using Tensor Product
Consider a 2D distorted nine node p-version hierarchical element in xy space and its map in ξη space in a two unit square ( Figure 9). We know that tensor product requires the two orthogonal directions in which the 1D approximations are defined, hence tensor product cannot be used in xy space for the distorted 2D element. However, in ξη space 2D functions and the dofs for C qr (Ω ξη ) approximated can be derived using tensor product of 1D C q (Ω ξ ) and C r (Ω η ) approximations. For the C qr (Ω ξη ) 2D element in ξη space the nodal dofs at the nonhierarchical nodes are function φ and its derivatives of φ with respect to ξ, η. A C qr (Ω e xy ) element will require transformation of these dofs to the xy space. We consider details in the following. Consider the 1D C q (Ω ξ ) and C r (Ω η ) hierarchical local approximation in the ξ and η directions (following (19)) using (50) and (51) we can generate 2D approximation functions in the ξη space as well as nodal dofs (also in ξη space) by taking tensor product of 1D approximation and the tensor product of the nodal variable operators [17,18]. The dofs at the nonhierarchical nodes (1, 3, 5, 7) generated in doing so are given by For C 11 (Ω ξη ) we have the following dofs at nodes 1, 3, 5, 7 For C 22 (Ω ξη ) we have the following dofs at nodes 1, 3, 5, 7 Based on Section 6, all dofs in (53) and (54) with respect into ξη cannot be transformed to those with respect to xy.  The issue of not being able to transform the derivative dofs (with respect to ξ and η) at nonhierarchical nodes needs further investigation. We note that (Section 5) a C qr (Ω e xy ) approximation has ∂ i+j φ ∂ξ i η j ; i = 0, 1, . . . , q; j = 0, 1, . . . , r; i + j < q + r dofs at the nonhierarchical nodes. These can be transformed from ξη space to xy space as shown in Section 5.

2.
The dofs at the nonhierarchical nodes in the tensor product element in ξη space are for each order of continuity (i.e., C 11 , C 22 , . . . ), (56) contain additional dofs compared to (55). We note that for C 11 (Ω ξη ), sum of the orders of derivatives with respect to ξ and η should be one which is true in case of (55), but in (56) ∂η∂ξ term violates this condition confirming that HGDA/TP is not a true C 11 (Ω ξη ) approximation in ξ and η. The same argument holds true for C 22 (Ω ξη ), as in (54). Thus, the tensor product in ξη space does not generate dofs at the nonhierarchical nodes for C ij (Ω ξη ) ; i = 1, 2, . . . , q, j = 1, 2, . . . , r local approximation that conform to (55), containing only the needed dofs at the nonhierarchical nodes for C ij continuity.

3.
If we only consider rectangular elements in xy space with ξη for each element parallel to xy and pointing in the same direction, then we can transform (50) and (51) to giving By taking tensor product of expressions (58) and (59), the resulting element will contain ∂ i+j φ ∂ξ i ∂y j ; i = 0, 1, . . . , q; j = 0, 1, . . . , r as dofs at the nonhierarchical corner nodes (nodes 1, 3, 5 and 7). We refer to this element as C qr (Ω e xy ) HGDA/TP element. This element is not a true C ij (Ω e xy ) global differentiability element.

4.
HGDA/TP is not very useful in applications as it requires elements to be rectangular in xy space with ξη pointing in the same directions as xy. Thus, distorted element geometries required for an irregular domain cannot be supported by this higher order global differentiability 2D finite elements based on tensor product. 5.
If we consider a discretization of a domainΩ xy for which HGDA/TP are valid, then for this special case we may ask the following question.
∂y∂x at each of the four nonhierarchical node in case of a HGDA/TP element help in improving local approximation or just adds extra dofs without much improvement compared to the dofs in HGDA/DG C 11 (Ω e x ) element? From the numerical studies presented in a later section we conclusively show that the latter is the case, i.e., ∂ 2 φ ∂y∂x dofs do not result in an improved approximation but add additional dofs compared to HGDA/DG C 11 (Ω e xy ) approximation, hence they result in poor accuracy for a given dofs and also result in lower convergence rates. A somewhat less rigorous but persuasive argument may be since Thus, our conclusion is that in the HGDA/TP approach of deriving local approximation: i. true C iJ (Ω e xy ) ; i = 1, 2 . . . , q, j = 1, 2 . . . , r local approximations are not possible. ii.
HGDA/TP elements are inferior to HGDA/DG elements in all aspects.

HGDA/DG: Triangular and Hexadral Elements
Higher order global differentiability local approximations for triangular and hexadron elements can also be derived [17,18,20,21], but are not presented here for the sake of brevity.

Isogeometric Analysis and k-Version
In 2005 [26,27] isogeometric analysis method was presented. According to the authors Analogues of finite element h-and p-refinement schemes are presented and a new, more efficient, higher order concept, k-refinement is introduced. Thus, in our view isogeometric analysis claims to be an h, p, k method. We first present details of the isogeometric analysis, then compare it with hpk finite element method, but, more importantly, we compare it with the k-version of finite element introduced by Surana et al. [1][2][3][11][12][13][14][19][20][21].
In isogeometric analysis, the complex domain is subdivided into subdomains (not finite elements). Let us refer to a subdomain as a patch. For a patch, one picks points called control points that are not necessarily on the boundary of the patch. Using the coordinates (geometric locations) of these points one constructs a geometric description (say using NURBS or other methods). This gives a relationship of the type for geometry.
Equation (61) {δ i p } are the total dofs for the patch i. If the approximation (62) is of degree p in x, y, z then it is also of class C p (Ω i p ) whereΩ i p is the domain of the ith patch. Thus, within the patch the order of the approximation space of u u u is k = p + 1. We remark that so far we do not have finite elements in the true sense in which (Ω i p ) T = eΩ e , in which (Ω i p ) T is a discretization ofΩ i p . Instead (62) is a statement that one would use in classical methods of approximation in whichΩ p is not a discretized domain (patch). Now, since we have an approximation of displacements in (62), we can proceed with standard method using a method of approximation to obtain algebraic equation for the patchΩ i p . If we choose linear elasticity as an example, then the balance of linear momenta gives We construct an integral form of (63) over the domainΩ consisting of N patches, in whichΩ = iΩ i p . If we consider Galerkin method with weak form, then v v v = δu u u N , (64) can be written as (over the patches).
for ith patch using integration by parts (IBP) (since in linear elasticity adjoint of A A A, i.e., A A A * = A A A), we obtain Substituting (62)

1.
In this process described above used in isogeometric analysis there is no concept of finite element anywhere.

2.
A patch appears like a C 0 p-version finite element in which p-level can be as high as desired. Naturally, ifΩ e is the domain of C 0 p-version finite element with degree of local approximation p for a field variable φ then φ e h (local approximation of φ overΩ e ) is naturally of class C p (Ω e ) and holds over the element domainΩ e . This is not a discovery. This has been known since the advent of the finite element method. The isogeometric analysis with patches is exactly like C 0 p-version finite element mesh in which each finite element is a patch. Within each C 0 element (i.e., a C 0 patch) the solution is of class C p but at the inter-element boundaries (same as inter patch boundaries) the solutions remains of class C 0 . Thus, all isogeometric solutions with more than one patch are globally of class C 0 .

3.
Since only displacements are dofs on the boundaries of the patches, {u N } = N i=1 {u i p } remains of class C 0 at the interpatch boundaries. Thus, in all isogeometric analyses containing more than one patch, the global solutions are not of class p (order of the space k = (p + 1) /2), but are undoubtedly of class C 0 due to their C 0 nature at the interpatch boundaries. Thus, presenting isogeometric analysis as k-version is misleading and misrepresentating.

4.
Computation of the patch stiffness matrix requires integrals (2D or 3D) over an irregular domainΩ i p (in (67)). This can be done in several different maps. For example,Ω i p is subdivided and each subdivision is mapped into two unit square or two unit cube to integrate using Gauss quadrature while still maintaining (61) and (62) for each subdivision. The subdivisions are not finite elements but this is necessary to do to perform integration overΩ i p . The subdivision are not finite elements as subdivisions have no concept of local approximation.

5.
The works published by Surana et al., [1][2][3]12,[19][20][21] are based on true finite element methodology in which global differentiability of order k − 1 is always ensured. There is no comparison of these works with isogeometric analysis as at present it cannot do what k-version [1][2][3][19][20][21] can, i.e., to ensure global differentiability of any desired order. Isogeometric at present always has global differentiability of class C 0 , hence no different than traditional C 0 finite element method analysis.

6.
On another small note, in some published works on isogeometric analysis and in other works we find that some authors beleive that work as referenced in [1][2][3][19][20][21] is motivated due to least squares method. This is wrong. Perhaps a consequence of not reading our published works carefully enough. We always point out that minimally conforming space is dictated by the differential operator and not the integral form as the integral form may be a weak form.

7.
We also wish to point out that in all of the published works on isogeometric analysis, the influence of higher order differentiability of geometry on the accuracy of the computations has never been shown. Is it that the benefits advocated are only because of higher order global differentiability of the displacement approximation over the interior of the patches (Ω i p ) and that the geometry has nothing to do with this? In our own works on true k-version finite elements with h and p, this is true. As a matter of scientific curiosity, this should have been addressed at the onset of the method, but continues to be ignored.

A Priori Error Estimation and Convergence Rates: h, p, k-Versions
Surana et al. [28] have shown that variational consistency [17,18] of the integral forms in BVP and IVP is essential in the derivation of a priori error estimates that establish the dependence of various error norms on h, p, k. This was a fundamental and essential discovery that enabled derivation of a priori error estimates for self-adjoint, non-self adjoint, and nonlinear differential operators. Current published works on a priori error estimates rely on the best approximation property in the B(·, ·) norm that only exists for self-adjoint operators. Thus, in the currently published works a priori error estimates are rarely considered and found for non-self-adjoint and non-linear operators. In this paper we only present final results that are useful in model problem studies. Derivation can be found in the references [17,18,28]. Surana et al., showed that for BVPs the following hold when the integral forms used in finite element processes are variationally consistent. When the differential operator is self-adjoint, GM/WF and LSP yields a VC integral form. For non self-adjoint and non-linear differential operators only LSP based on residual functional yields a VC integral form. Following references [17,28] we have the following a priori error estimates when the integral form is VC regardless of the method used.
Let Aφ − f = 0 in Ω x be the BVP and let φ h be the finite element solution of Aφ − f = 0 overΩ T x = eΩ e x , discretization ofΩ x , thus we have and if E = Aφ h − f ; residual function (72) Here e is true error, c 1 , c 2 , c 3 and c 4 are variables that only depend on the order k of the approximation and 2m is the highest order of the derivatives in Aφ − f = 0.

Convergence Rate
Consider e H q in (71). Taking logarithm of both sides with y = log e H q (77) When we use the equality in (76), this is an equation of a straight line in xy space in which m is the slope and c is the intercept., i.e., if we plot log h versus log e H q on an xy plot, then we obtain a straight line whose slope is (p + 1 − q) and intercept is log(c 3 |φ| p+1 ). Slope (p + 1 − q) is called the rate of convergence of e H q . Higher values of (p + 1 − q) imply faster convergence of φ h to φ measured in e H q . Equation (77) can also be expressed in terms of total dofs for the discretization. If h = max(h e ), h e being characteristic length for an element e, then Remark 5.

1.
We keep in mind that the a priori error estimate (75) only holds for a uniform mesh refinement or at the most for a quasi-uniform mesh refinement.

2.
The quantity e H q requires knowledge of the theoretical solution φ which is generally not possible for a practical application.

3.
When the approximation spaces are minimally conforming, i.e., when k = 2m + 1, 2m is the order of the highest derivation in the BVP Aφ − f = 0, then the integrals over discretization Ω T x are Riemann and the E L 2 can be computed forΩ T x as it does not require knowledge of theoretical solution and we have If the solution φ is sufficiently smooth, we can also use k = 2m. In this case the integrals are Lebesgue over discretizationΩ T x .

4.
In the case of 1D problems (82) holds precisely. However in case of BVP in R 2 and R 3 this is generally not true precisely. For example, for exactly same discretization (h fixed) for HGDA/DG and HGDA/TP elements, the HGDA/TP elements have an additional dof at the nonhierarchical nodes, thus the total dof for HGDA/DG and HGDA/TP are going to be different even though h is same. Thus, even though a priori error estimates are derived using h, but it is perhaps more illustrative to exhibit the real convergence rates if we use dof instead of h. Thus, in all model problem studies we present graphs of error norms versus dof.

Model Problems: Numerical Studies (BVPs)
In this section we present numerical studies using higher order continuity 1D pversion hierarchical formulations for: (1) axial deformation of a rod (self adjoint differential operators), (2) 1D convection diffusion equation (non-self adjoint differential operator), (3) and 1D Burgers equation (non-linear differential operator). We also present numerical studies using HGDA/DG and HGDA/TP higher order continuity 2D element formulations for: (1) Poisson's equation (self-adjoint operator) and (2) 2D convection equation (nonself-adjoint operator) and (3) 2D Burgers Equation (non-linear operator). Various aspects of k-version are illustrated in terms of correct description of physics in the computations, improved accuracy and convergence rates.

1D Axial Deformation
The dimensionless form of the 1D deformation of an axial rod is described by (Au − f = 0) (also see [1]) For this BVP A * , the adjoint of the differential operator is the same as A is naturally linear and the concomitant < Au, v > Γ is zero as the BCs in (85) are homogeneous, hence A is self-adjoint. We consider finite element formulations based on Galerkin method with weak form (GM/WF) as well as based on residual functional, least squares finite element formulation. Details are straight forward and can be found in [17]. The approximation u h of u over discretizationΩ T x = eΩ e x must be such that in which k = 3 is minimally conforming space (in which case u h , du h dx and d 2 u h dx 2 are continuous and u h and du h dx are differentiable for ∀ x ∈Ω T x ). This requires that the local approximation u e h overΩ e x must also belong to the same space as u h . Figure 10 shows a schematic of the problem.

Analytical Solution
We choose dimensionless L to be one and b(x) to be constant and the following for the body force f (x) where σ is a positive constant 0 < σ < ∞ and θ is another constant −∞ < θ < ∞. Then (85) can be written as The analytical solution of (88) can be obtained by integrating it twice and using the boundary conditions. We obtain Remark 6.

2.
When σ = 0, du dx has square root singularity at x = 0, hence this case is ruled out.

Numerical Results
Since the operator A is self-adjoint, the quadratic functional π is given by and the error or the residual functional is given by For k ≥ 3 both π and I can be computed for GM/WF and LSP. Using theoretical value of π we can calculate percentage error in π, g ε π and l ε π for GM/WF and for LSP. We consider L = 10 in., area of section a = 1 in. 2 , and Young's modulus E = 30 × 10 6 psi. The rod is fixed at the left end ( x = 0), and free at the right end ( x = L) (see Figure 10). If we choose L = L 0 = 10 and E 0 = E = 30 × 10 6 = σ 0 then F 0 = E 0 L 2 0 = 30 × 10 8 lb, and a = a We consider a fixed eight element graded mesh as shown in Figure 10b. Solutions are calculated using GM/WF as well as LSM. For each order of space, p-level is increased up to 19.
Plots of g ε π and l ε π versus dofs for GM/WF and LSM are shown in Figure 11a,b. Plots of residual functional I versus dof for GM/WF and LSM are shown in Figure 11c,d.

Discussion of Results
Since discretization is fixed, characteristic length h of the mesh is constant. Thus, from Figure 11a-d, we can observe p-convergence, k-convergence and pk-convergence of the solutions and the functionals. p-convergence: For a fixed k, the sequence of computed solutions for progressively increasing p-level represent p-convergence of the solution and the functionals in Figure 11a-d. For each value of k in Figure 11a-d, we observe progressively increasing slope of the functionals versus dof graphs for progressively increasing p-level, indicating progressively increasing convergence rate (as derived in the a priori error estimation). k-convergence: We note from the graphs of functionals versus dof, for increasing p-level, that with progressively increasing k the graphs for all k-values remain almost parallel to each other, but progressively shift downwards with progressively increasing value of k. Indicating that, increasing k does not result in higher convergence rate but results in better accuracy (lower value of the functionals) for a given degrees of freedom. Thus, k affects accuracy but not the convergence rate. Along the lines of constant p, hence constant h, p and only k varies, thus we have k-convergence. We note that increase in k results in increasing functional values indicating increasing error. This is due to the fact that for fixed p, increasing k results in reduced dofs (obvious from the graphs). We observe from the graphs that for a fixed dofs, increase in k unconditionally results in lower functional values, hence better accuracy of the solution. pk-convergence: We can view pk-convergence at least in the following two ways: (1) for a fixed k we can choose p based on p = 2k − 1, minimum p-level for k. If we connect these points on graphs in Figure 11a-d, we obtain much higher convergence rate than pconvergence for any value of k. (2) The most obvious dramatic and the highest convergence rates are achieved for combination of p and k for which dofs are almost constant indicated by almost vertical lines on the graphs in Figure 11a-d. The numerical studies presented here for this model problem confirm h, p, k as three independent parameters in the finite element computational processes for self-adjoint operators.
(a) Error in quadratic functional π using the Galerkin method (b) Error in quadratic functional π using the least-squares method (c) Error functional I using the Galerkin method (d) Error functional I using the leastsquares method Figure 11. Graphs of π and I versus dof for axial deformation.

1D Convection Diffusion Equation
Consider the following BVP [2]: (93) is non-self-adjoint, hence GM/WF will yield a VIC integral form but the integral form based on LSP is VC. Here, we only consider LSP (see [17] for numerical studies using GM/WF). We consider k ≥ 2 and p-levels up to 19. A theoretical solution of (93) and (94) is given by Obviously, φ is of class C ∞ . We consider the following meshes for Pe = 100 and 10 6 . We consider k ≥ 2 with p-levels up to 19 in LSFEP. For each k, p-levels are increased up to 19 uniformly for each element of the discretization.

Discussion of Results
As in previous model problem, here also characteristic discretization length h is constant as the discretization is fixed, hence we can study p-convergence, k-convergence and pk-convergence of the solution and functionals. Figure 12a,b present residual functional I versus dof graphs for various values of k for both Pe. p-convergence: For a fixed k, the sequence of solutions for progressively increasing p-level represent p-convergence of the solution and the residual functional I in Figure 12a,b. For each value of k in Figure 12a,b we observe progressively increasing slope of residual functional I versus dof graphs for progressively increasing p-levels, indicating progressively increasing convergence rate (as shown in a priori error estimation section). k-convergence: From the graphs of residual functional I versus dof for increasing p-level but for a fixed value of k, we note that with progressively increasing k the I verus dofs graphs for all k values remain almost parallel to each other but shift downwards with increasing k. This confirms that increase in k does not result in higher convergence rate but results in better accuracy (due to lower values of functional I) for a given degrees of freedom. Thus, k affects accuracy but not the convergence rate. In Figure 12a,b, from the curves of constant p, hence constant h and p , thus only k is varying along these curves (k-convergence). We note that increase in k results in increase in the value of residual functional I, indicating increasing error. This is due to the fact that for fixed p, increasing k results in reduced dofs (obvious from the graphs). We observe from the graphs that for a fixed dofs, increase in k unconditionally results in lower values of I, hence better accuracy.
pk-convergence: As in case of previous model problem, here also we can view pkconvergence in at least the following two ways: (1) for a fixed k, we can choose p = 2k − 1, minimum p-level of this choice of k. If we construct these parts on the graph in Figure 12a,b, we observe much higher convergence rate than p-convergence for any value of k. (2) The most dramatic and the highest convergence rates are achieved for combinations of p and k for which dofs are almost constant indicated by nearly vertical lines in the graphs in Figure 12a,b.
The numerical studies presented here for the model problem in which differential operator is non-self adjoint also confirm that h, p and k are three independent parameters in finite element computations and the observed convergence behavior is in accordance with a priori estimates.
has continuous derivatives of all orders. The differential operator dx 2 is nonlinear, therefore only LSP yields VC integral form [17,18]. LSFEP is used to present numerical studies using the following discretizations for k ≥ 2 with p-levels up to 17.

Discussion of Results
In the numerical studies, the characteristic discretization length h is fixed (due to fixed mesh), hence we can study p-convergence, k-convergence and pk-convergence. Figure 13a,b show plots of residual functional I versus dof for Re = 100 and 10 6 for different values of k for progressively increasing p-level up to 17. p-convergence: For a fixed value of k, the sequence of computed finite element solutions for progressively increasing p-level represent p-convergence of the solution and the residual functional I in Figure 13a,b. For each value of k in Figure 13a,b we observe progressively increasing slope of I versus dof graphs for progressively increasing p-level, indicating progressively increasing convergence rate (shown in a priori error estimation). k-convergence: From the I versus dof graphs for increasing p-level but for a fixed value of k, we note that with progressively increasing k, the I versus dof graphs for all values of k remain almost parallel to each other but shift downward with increasing k. This confirms that increase in k does not result in higher convergence rate but results in better accuracy (lower values of I) for a given degrees of freedom. Thus, k affects accuracy but not the convergence rate. From the curves of constant p, hence constant h and p in Figure 13a,b, thus only k varying along these curves (k-convergence), we note that increase in k results in increase of the value of functional I, indicating increasing error. This is due to the fact for fixed p, increasing k results in reduced degrees of freedom (obvious from the graphs). We note from the graphs that for a fixed dof, increase in k unconditionally results in lower values of residual functional I, hence improved accuracy.
pk-convergence: We can observe pk-convergence in at least two ways: (1) for a fixed k we can choose p = 2k − 1, minimum p-level for this choice of k. If we connect these points in the graphs in Figure 13a,b, we observe much higher convergence rate than p-convergence for any value of k. (2) The most dramatic and the highest convergence rates are achieved for combination of p and k for which the dofs are almost constant indicated by nearly vertical lines in the graphs in Figure 13a,b. The numerical studies for the BVP in which the differential operator is nonlinear also confirm that h, p, k are three independent parameters in the finite element computations and the convergence behavior is in accordance with a priori estimates.

Poisson's Equation (BVP)
Consider the two-dimensional steady state Poisson's equation (see [19][20][21] also): u(x, −b) = u(x, b) = u(−a, y) = u(a, y) = 0 (100) f (x, y) is chosen that the theoretical solution of (99) and (100) is given by u(x, y) in (101) satisfies boundary conditions (100). We choose a = b = 1, m = n = 8. The differential operator in (99) is self-adjoint, hence GM/WF and LSP both yield variationally consistent integral forms. If u h is an approximation of u overΩ T xy = eΩ e xy , discretization ofΩ xy , then in which k = 3 is minimally conforming space of approximation φ h and hence φ e h for which integrals are Riemann over (Ω T xy ) . We consider two types of discretization. Mesh A The domainΩ xy is divided into 2 × 2, 4 × 4, . . . uniform discretizations in which elements are square and ξη axes for each element parallel to xy and pointing in the same directions as the xy axes. (Figure 14). For these discretizations we can use both HGDA/DG and HGDA/TP element formulations. he = 1/2 he = 1 (2 × 2) mesh (4 × 4) mesh

Mesh B
Initial mesh for distorted elements is shown in Figure 15. Each element of this mesh is uniformly divided into 2 × 2, 4 × 4, . . . to generate progressively refined meshes. Computations are performed for progressively refined Mesh A and Mesh B using HGDA/DG of class C 11 , C 22 and C 33 as well as using HGDA/TP elements of class C 11 , C 22 , C 33 for p-levels of 3, 5, 7 and 9, keeping in mind that for HGDA/TP elements only Mesh A can be used whereas for HGDA/DG both Mesh A and Mesh B can be used. The p-levels are increased starting with 3 in ξ and η directions. Figures 16-21 show graphs of residual functional I versus dof for solutions for classes C 11 , C 22 and C 33 for Mesh A and Mesh B using HGDA/DG and HGDA/TP formulations. We discuss details in what follows.

1.
In Figures 16-18 It is significant to note that even though HGDA/TP formulations are perfectly valid for Mesh A and are believed to have optimal convergence rates, even then HGDA/DG solutions yield higher convergence rates and yield better accuracy. This needs further investigation and discussion (given later). 4. Figures 19-21 show plots of I versus dof for solutions for class C 11 , C 22 and C 33 at p-levels of 3, 5, 7 and 9 using HGDA/DG formulation with Mesh B and HGDA/TP formulation with Mesh A. We remark that HGDA/TP element formulation is believed to be meritorious formulation for Mesh A. Obviously HGDA/TP formulation cannot be used for Mesh B as the elements of the discretizations are distorted. 5.
HGDA/DG formulation always has higher convergence rate and lower values of I for a given dof for solutions for classes C 11 , C 22 and C 33 for each p-level considered. 6.
Thus, regardless of whether the discretization contains square (or rectangular) elements or distorted elements in xy space, HGDA/DG formulation has higher convergence rate and better accuracy (lower values of I) compared to HGDA/TP elements for Mesh A.

2D Convection Diffusion Equation (BVP)
We consider 2D steady state convection diffusion equation with boundary conditions and u(x, 1) = u(1, y) = 0 where Pe is the Peclet number. A theoretical solution is given by From the theoretical solution, we observe that u(x, y) is analytic for all value of Pe and is of class C jj , where j → ∞ is admissible.
We consider the same discretizations that are described for Mesh A and Mesh B for the Poisson's equation and the same p-levels for solutions for classes C 11 , C 22 , and C 33 . Figures 22-27 show graphs of residual functional I versus dof obtained using Mesh A and Mesh B for solutions for class C 11 , C 22 , C 33 ; p-levels of 3, 5, 7, and 9 for HGDA/DG and HGDA/TP formulations. We discuss results in the following. 1.

2D Burgers Equation (BVP)
Consider the 2D steady state scalar Burgers equation (see [19][20][21] also): with the BCs u(−a, y) = u(a, y) = u(x, −b) = u(x, b) = 0 (108) We consider Re = 10 and choose f (x, y) such that the theoretical solution of (107) and (108) is given by In the numerical studies we consider a = b = 1 and m = n = 8. In this case the differential operator ∂y 2 is nonlinear, hence the GM/WF will yield a VIC integral form. Modified LSP [24] yields VC integral form, hence we consider finite element formulation based on this integral form in the numerical studies. We consider the same discretizations that are described for Mesh A and Mesh B for Poisson's equation and we consider the same p-levels for solutions of classes C 11 , C 22 , C 33 and the formulations based on HGDA/DG and HGDA/TP. Since, in this case the differential operator is nonlinear, the system of algebraic equations resulting from LSP are nonlinear. Their solution is obtained using Newton's linear method with line search [17,18]. A tolerance of 10 −6 or lower is used for assuming computed quantities to be zero. We discuss results in what follows.

Model Problem: IVPs
In the case of IVPs, space-time coupled finite element processes with a space-time variationally consistent integral form with local approximation in H (k) (Ω xt ) = H (k 1 ,k 2 ) (Ω xt ) scalar product spaces containing space-time functions of global differentiability k 1 − 1 in space and k 2 − 1 in time are most meritorious over all others [18]. Since the space-time differential operators are either non-self adjoint or non-linear, only space-time finite element processes based on space-time residual functional (space-time least squares processes) are STVC [18]. In case of initial value problems, rather than discretizing the entire space-time domain, we could consider an increment of time and discretize the space-time strip or slab for the increment of time, with only one element in time, but adequate discretization in space. Need for more elements in time can be addressed by reducing the size of the time increment. We compute converged solution (residual functional I → 0) for the first increment of time. The solution for the second increment of time is calculated using second space-time strip for which initial conditions are determined using the converged solution from the first increment of time, i.e., first space-time strip.
On obtaining converged solution for second space-time strip we move on to third space-time strip and so on until entire evolution is calculated. This process is referred to as space-time strip or slab with time marching. This is the most efficient and meritorious approach of computing evolution described by the IVP. We make some remarks in the following. Remark 7.

1.
If 2m 1 and 2m 2 are the highest orders of the derivatives of the dependent variables in space and time in the initial value problem, then k 1 = 2m 1 + 1 and k 2 = 2m 2 + 1 correspond to the minimally conforming scalar product space H (k 1 ,k2) (·) containing functions of space and time for which all space-time integrals will remain Riemann type over the space-time discretization.

2.
In the space-time strip with time marching method, we must ensure that the computed solution for the first space-time strip is converged before we move on to the second space-time strip as the ICs for the second space-time strip are obtained using the solution for the first space-time strip.

3.
Based on Remark 2, the convergence studies in IVPs can be done only for the first space-time strip. In reference [18], it has been shown using many model problems that the convergence characteristics and various aspects of space-time HGDA/DG and HGDA/TP element formulations as well as their performance in terms of accuracy remain the same as shown for 2D BVPs discussed in this paper. For this reason and also for the sake of brevity, numerical studies are not presented for IVPs.

4.
We emphasize that the space-time finite elements of classes C qr , C qrs . have precisely same formulations as the C qr , C qrs formulations for BVPs. Just like BVPs, here also HGDA/DG elements only have the required dof at the nonhierarchical nodes for desired orders of continuity and the HGDA/TP space-time elements are inferior to HGDA/DG in all aspects for the same reasons that have been discussed earlier in the context with BVPs.

Summary and Conclusions
In the k-version of the finite element method initiated by Surana et al. [1][2][3], the authors showed that k, the order of the approximation space defining global differentiability of approximation over a discretization is an independent parameter in all finite element processes in addition to h and p, hence the terminology k-version of the finite element method in addition to hand p-versions is quite natural. In this paper, we considered the works of Surana et al. [1][2][3][11][12][13][14][15][16][17][18][19][20][21] on k-version finite element method and have presented a unified computational methodology incorporating the k-version that addresses all higher order global differentiability issues in BVPs and IVPs. The k-version of the finite element method presented in this paper is compared with other published works on higher order global differentiability approaches including isogeometric analysis. We present a short summary and draw some conclusions.

1.
The global differentiability of an approximation is due to union of local approximations, i.e., in the k-version presented in this paper, the element local approximation over finite elements are designed such that their union automatically gives the desired globally differentiability everywhere in the discretized domain. This must be intrinsic and fundamental aspect of all finite element processes considering k-version.

2.
The HGDA/DG are p-version and are hierarchical, i.e., in C q , C qr and C qrs approximations, p-levels can be increased beyond those needed for q, qr, qrs orders of continuity. This increase in p-level does not increase dofs at the nonhierarchical nodes that are responsible for the desired degree of global smoothness.

3.
It is shown and established that HGDA/DG are truly higher order global differentiability local approximations. HGDA/TP elements contain additional dofs at the nonhierarchical nodes compared to HGDA/DG elements. The consequence of this is: (a) the HGDA/TP elements have dofs at the hierarchical nodes that cannot be transformed between ξη and xy (ξηζ and xyz) spaces (b) poor convergence rate compared to HGDA/DG (c) poor accuracy compared to HGDA/DG (d) HGDA/TP elements do not have true C q , C qr , C qrs global differentiability. 4.
HGDA/DG elements work perfectly well in discretizations containing distorted element geometries while HGDA/TP elements are restricted to rectangular shapes with additional restrictions on ξη for the elements to be parallel to xy and pointing in the same directions.

5.
Graphs of residual functional and the error in quadratic functional are reported in the paper with respect to dof instead of using characteristic discretization length h. Since HGDA/TP elements have additional dofs at the nonhierarchical nodes compared to HGDA/DG elements, use of dof is a true measure of their performance in terms of convergence rate and the accuracy as h is not effected by additional dof in HGDA/TP elements. 6.
We find some published works on k-version that has been cited in the literature review section of this paper. Unfortunately, the approaches used in those publications are specific to some order of differentiability and are not p-version hierarchical and cannot be extended to orders higher than those presented in the paper. The work presented here allows any desired order of global differentiability local approximations in R 1 , R 2 and R 3 with p-version hierarchical structure. To the best of our knowledge, there is no other published work on the k-version that has these features. 7.
We have considered and presented considerable details of isogeometric analysis. (1) We have clearly shown that isogeometric analysis is nothing more than C 0 finite element discretization with higher p-level, each element being a patch in isogeometric analysis. We have clearly demonstrated that in the isogeometric approach, the global solution for the entire domain is always of class C 0 when the domain contains more than one patch regardless of the solution class within the patch. This is due to the fact that in isogeometric analysis, inter-patch continuity is C 0 . (2) Isogeometric analysis is not a finite element method. A patch is like a complex domain, for which displacement approximation is constructed (no finite elements yet). The patch is subdivided into smaller patches that are mapped in two unit squares or cubes for the sake of integration over the patch. Smaller patches or subdomains of the patch have no local approximation. Each smaller patch or subdomain has to inherit the displacement approximation constructed for the whole patch (no finite element yet either). 8.
In isogeometric analysis, benefits of higher order smoothness of geometry have never been shown. Our work on the k-version shows that the benefit of accuracy and convergence rates are due to local approximation and not due to higher order geometry. 9.
A comparison of isogeometric method with k-version [1][2][3][19][20][21] is meaningless for two reasons (i) isogeometric analysis is not a finite element method (ii) the global approximation in isogeometric is only C 0 , so where is the k-version?
In conclusion this paper presents a true k-version finite element method (HGDA/DG) in which local approximation can be of class C q , C qr , C qrs p-version hierarchical that always yield global differentiability of orders q, qr and qrs in R 1 , R 2 , R 3 . There is no other published works on the k-version of finite element method or higher order global differentiability approximation that has these features. Discretization of (finite element mesh) ofΩ orΩ xy