Review

Selected Payback Statistical Contributions to Matrix/Linear Algebra: Some Counterflowing Conceptualizations

by
Daniel A. Griffith
School of Economic, Political, and Policy Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
Stats 2022, 5(4), 1097-1112; https://doi.org/10.3390/stats5040065
Submission received: 5 October 2022 / Revised: 7 November 2022 / Accepted: 7 November 2022 / Published: 9 November 2022
(This article belongs to the Section Data Science)

Abstract: Matrix/linear algebra continues bestowing benefits on theoretical and applied statistics, a practice it began decades ago (Fisher, for example, used the word matrix in a 1941 publication), through a myriad of contributions, from the recognition of a suite of matrix properties relevant to statistical concepts to matrix specifications of linear and nonlinear techniques. Consequently, focused parts of matrix algebra are topics of several statistics books and journal articles. These contributions mostly have been unidirectional, from matrix/linear algebra to statistics. Nevertheless, statistics offers great potential for making this interface a bidirectional exchange point, the theme of this review paper. Not surprisingly, regression, the workhorse of statistics, provides one tool for such historically based recompense. Another prominent one is the mathematical matrix theory eigenfunction abstraction. A third is special matrix operations, such as Kronecker sums and products. A fourth is multivariable calculus linkages, especially arcane matrix/vector operators as well as the Jacobian term associated with variable transformations. A fifth, and the final idea this paper treats, is random matrices/vectors within the context of simulation, particularly for correlated data. These five discipline-of-statistics subjects are prospectively reviewed here as capable of informing, inspiring, or otherwise furnishing insight to the far more general world of linear algebra.

1. Introduction

The unfolding of the statistics field over recent centuries discloses a transformation of the role of matrix algebra from a mostly notational tool for conveniently expressing statistical problems, to an essential component in the conceptualization, derivation, comprehension, and utilization of more mature, complex, and complicated analytical devices characterizing the modern-day statistical sciences. Matrix/linear algebra benefits began accruing in the theoretical and applied statistics literature in the early 1940s, when Fisher, and then Wilks, used the word matrix for the first time in their publications [1]. Today, it is an indispensable part of the discipline’s subject matter. This one-way flow of contributions has bolstered the mathematization of statistics, with statisticians increasingly recognizing that it has become a necessary prerequisite in many parts of their discipline [2], especially for linear statistical models [3] and multivariate statistics [4] research and coursework. The primary purpose of this paper is to review at least some of the ideas shared between these two cognate subdisciplines, converting them into bilateral interactions by describing their converse issue, namely, the emerging and potential roles of statistics in the conceptualization, derivation, comprehension, and utilization of analytical mechanisms characterizing parts of matrix/linear algebra.
Many academic books and articles address matrix/linear algebra and its sundry relationships with statistics [4,5,6,7,8]. Perhaps foremost among the more targeted of such tomes/pieces is the set treating regression [3], long deemed the workhorse of traditional statistics, in either its linear or its iteratively linear (i.e., nonlinear) forms. Virtually all statisticians, whereas apparently relatively few matrix and linear algebraists and other specialized mathematicians, seem to be very familiar with and well versed in this rather customary data analytic technique, perhaps because it flourished in astronomy and geodesy during the age of discovery rather than in pure mathematics. In its simplest version, it specifies a response variable to be a linear combination of p ≥ 1 covariates, all individually organizing data in vector form. This notion provides a neat solution to the enduring matrix algebra problem of determining which, if any, of its rows and/or columns are collinear. Conceivably, the next ranking mutual opportunity focuses on eigenfunctions―dating back to 18th century mathematics but acquiring their contemporary name from Hilbert in 1940―which occupy a prominent place in matrix algebra theory and applications. Usually, a quantitative scholar first encounters this concept in statistics when studying multivariate principal components and/or factor analysis [4], or, frequently almost in passing, multiple regression multicollinearity complications [3]. It also is an important ingredient in correlated data analyses. Matrix algebra theory accompanying eigenvalues and their paired eigenvectors encompasses the establishment of upper and lower bounds upon the eigenvalues, an effort whose ultimate research goal is to shrink these bounds to exactly match their corresponding eigenvalues, with the well-known Perron-Frobenius theorem [9] historically initializing such an interval definition for the principal eigenvalue of a statistical (e.g., covariance) matrix. Statistics supplies estimation theory extendable to an approximation of eigenvalues, point-estimate quantities whose calculations should be intellectually appealing to matrix algebraists. The method of moments estimator is of particular relevance here, given that a matrix readily supplies the mean and variance of its set of eigenvalues.
A third common interest is special matrix operations, such as Kronecker sums and products, and Hadamard products [9]. Spatial statistics, for example, furnishes some informative insights into these categories of operators. Meanwhile, a fourth salient theme spanning these two disciplines is multivariable calculus, expressly the Jacobian term associated with variable transformations [10]. Again, spatial statistics provides correlated data auto-normal model specifications that highlight not only calculus-based, but also eigenvalue-based, illuminations directed from statistics to matrix/linear algebra. Finally, random matrices/vectors within the context of simulation, chiefly in terms of linkage variation for correlated data, promote accumulating sampling distribution understanding about a wide range of matrices, paralleling the body of knowledge already generated through Erdős-Rényi random matrix synergies; Wishart [11] first introduced such random matrix theory within the environment of multivariate statistics [12]. In summary, these are the five subjects prospectively reviewed in this paper.
Interestingly, matroids, combinatorial structures discovered simultaneously but separately by Whitney [13] and Nakasawa [14] in 1935 (Oxley [15] provides a grand survey of their theory), abstract and generalize the notion of linear independence in vector spaces, which makes them ideal for describing a wide range of objects. Appearing in various forms in algebraic statistics, they span this range of topics, illustrating other gestational possibilities as well as offering an identifiable potential avenue for backflow from statistics to linear algebra.

2. A Linear Regression Contribution to Matrix Algebra

A matrix/linear algebra problem seeking an uncomplicated solution concerns determining all of the distinct linear combination subsets of matrix rows/columns that result in zero eigenvalues for a given matrix. Overlapping linear combinations constitute a complication whose astute management and treatment remain both elusive and a fruitful subject of future research. This section addresses determining the same number of subsets as there are zero eigenvalues, while recognizing the possibility of such overlaps.
Multiple linear regression coupled with a sweep algorithm [16] applied to an n-by-m matrix M gives an elegant solution to this problem; a reader should not confuse this operation, whose name was coined in 1979, with the R base function sweep (see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/sweep (last accessed on 4 October 2022)), named as such in 1988, which differs from it. Designating an arbitrary row/column as a regression response vector Y, and then treating the remaining rows/columns like the regression covariate matrix X (which does not include a vector 1 for the intercept term in this circumstance), enables a statistical package regression routine to invoke its aforementioned sweep operation. This procedure always begins by sweeping the first row/column in matrix X, followed by the next row/column in this matrix if its pivot (a matrix cell element selected to do certain computations that needs to be not only distinct, but also distant, from zero, used to determine row/column permutation swappings) is not less than a near-zero value whose default threshold magnitude often is roughly 1.0 × 10⁻⁹ (i.e., if that row/column is not a linear function of the preceding row/column), then continuing sequentially to each of the next rows/columns if their respective pivots are not less than this threshold amount, until it passes through all of the rows/columns of matrix X. If rows are evaluated first, then the transpose of matrix M allows an analysis of columns. In other words, postulating this setting, the aforementioned sweep algorithm can uncover linearly dependent subsets of matrix rows/columns by specifying its first row/column, C1, as a dependent variable, and its (n − 1) remaining rows/columns, C2–Cn, as covariates. Standard computer software package output from this specification usually includes an enumeration of the existing linearly dependent row/column subsets. A second no-intercept regression that is stepwise in its nature and execution can check whether or not C1 itself is part of a linear combination subset. Simplicity dissipates here when n becomes too large, with prevailing numerical precision resulting in some linear combinations embracing numerous superfluous columns with computed near-zero regression coefficients (e.g., 1.0 × 10⁻¹¹). This rounding error corruption frequently emerges only after n or m is in the thousands.
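This sweep-style reasoning is easy to prototype outside proprietary packages. The following sketch is a rough stand-in for those packaged routines rather than a reproduction of them (the function name, tolerance, and star-graph test matrix are illustrative assumptions): it sequentially sweeps columns, flags any column whose pivot, the residual variation left after projecting it onto the columns already swept, falls below a near-zero threshold, and reports the linear combination that reproduces it.

```python
# A minimal sketch of sweep-style detection of linearly dependent columns;
# names, tolerance, and the test matrix are illustrative, not from the paper.
import numpy as np

def find_dependent_columns(M, tol=1e-9):
    """Map each dependent column to its coefficients on previously swept columns."""
    M = np.asarray(M, dtype=float)
    kept, dependent = [], {}
    for j in range(M.shape[1]):
        if not kept:                            # the first column is always swept
            kept.append(j)
            continue
        X = M[:, kept]                          # columns swept so far (no intercept)
        beta, *_ = np.linalg.lstsq(X, M[:, j], rcond=None)
        resid = M[:, j] - X @ beta              # the "pivot": what projection leaves over
        if np.linalg.norm(resid) < tol * max(1.0, np.linalg.norm(M[:, j])):
            dependent[j] = {k: float(round(b, 6)) for k, b in zip(kept, beta) if abs(b) > tol}
        else:
            kept.append(j)
    return dependent

# Adjacency matrix of an undirected star graph on five nodes (node 0 is the hub):
# columns 2, 3, and 4 duplicate column 1, so three dependencies should be flagged.
C = np.zeros((5, 5))
C[0, 1:] = 1
C[1:, 0] = 1
print(find_dependent_columns(C))                # {2: {1: 1.0}, 3: {1: 1.0}, 4: {1: 1.0}}
```

Applying the same function to the transpose of a matrix examines its rows, mirroring the row/column symmetry discussed above.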
The two matrices presented in Table 1 supply illustrative examples of this approach. One is a square symmetric matrix, meaning that an examination needs to be of only its rows or columns, not both. The second is a non-square asymmetric matrix.
Table 2 enumerates the specimen zero-eigenvalue problems, displaying screenshots of the proprietary output facing users of selected statistical software packages, using matrix column symbolism for convenience. Analyzing columns in the 5-by-5 Table 1 matrix, Minitab 17 reports (Table 2) removal of the third column from a no-intercept multiple linear regression analysis in which the first column is specified as Y. Subsequently converting this arrangement to a stepwise linear regression returns no perfect linear combinations matching the first column. Meanwhile, handling the first of the eight columns in the 5-by-8 Table 1 matrix as Y, executing a no-intercept stepwise linear regression procedure identifies the third column as a perfect match to it. For the 8-by-5 transpose of this second matrix, SAS generates the Table 2 output uncovering the latent linear combination. This matrix, publicly available on the internet (see the Table 1 note), yields the Table 2 reports: although neither the first column nor the first row produces an exact match, a complicated linear combination of each subset is an exact match to this designated row/column. Because of its ease of construction, accompanied by its n − 1 zero eigenvalues, the final example is the adjacency matrix for a complete undirected star graph, which is symmetric, for n = 10. Casting the first column as Y, the SAS regression sweep algorithm uncovers eight linear combination pairs, followed by a stepwise regression revealing the ninth linear combination. Although the possible non-disjoint linear combination pairings number 45, the only feature of interest here is finding one combination for each zero eigenvalue.
Consequently, matrix and linear algebra pedagogy, if not research, could benefit from an operational awareness of standard linear regression implementations, one allowing the treatment of a given matrix as a regression problem. Accordingly, the task of identifying sets of linear combinations of a matrix’s rows/columns that produce zero eigenvalues becomes quite easy and straightforward.

3. Eigenfunctions, Statistics, and Matrix/Linear Algebra

The multivariate normal (MVN) probability density function (PDF) can draw attention to the eigenfunction subject area, one of concomitant functional interest; Abdi [17] furnishes a reader-friendly exposition describing these mathematical entities, which, in brief, reduce any square n-by-n matrix to its constituent parts, and which appear throughout, for example, multivariate statistics. The classical MVN PDF may be written as follows:
$$ f(\mathbf{X}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\det|\boldsymbol{\Sigma}|^{1/2}\,(2\pi)^{p/2}}\; e^{-(\mathbf{X}-\boldsymbol{\mu})^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{X}-\boldsymbol{\mu})/2}, \qquad (1) $$
where bold denotes a matrix/vector, det|•| and superscript T denote, respectively, the matrix determinant and transpose operators, X is a p-dimensional MVN random variable, µ is its p-by-1 vector of means, and Σ is its p-by-p covariance matrix. Adapting Equation (1) to a specific dependent observations problem that is a cornerstone of spatial statistics, one theme shared by matrix/linear algebra and statistics concerns the computation of det|Σ|. Equation (1) translates into the following univariate correlated data [18] expression:
$$ f(\mathbf{X}, \mu, \mathbf{V}\sigma^{2}) = \frac{1}{\det|\mathbf{V}|^{1/2}\,(2\pi)^{n/2}\,\sigma^{n}}\; e^{-(\mathbf{X}-\mu\mathbf{1})^{\mathrm{T}} \mathbf{V}^{-1} (\mathbf{X}-\mu\mathbf{1})/(2\sigma^{2})}, \qquad (2) $$
where µ and σ² are, respectively, the common mean and variance for the n-dimensional univariate random variable X, n is the sample size, 1 is an n-by-1 vector of ones, and V is an n-by-n covariance structure matrix (with positive real diagonal entries that tend to fluctuate around one). The common focal point here is det|V|, which may be rewritten as the product of its matrix V eigenvalues (a well-known eigenfunction property); because this matrix is symmetric, all of them are real numbers. Griffith [18] notes that: (1) conventional matched/paired observations have a very simple block-diagonal eigenvalue structure; (2) time series observations have a simple known eigenvalue structure [19]; and, (3) spatial and network observation dependency structure computational intensities become daunting for massively large n.
Matrix/linear algebra provides the necessary square matrix eigenfunction theory employing graph theoretic articulations of observation dependency structures. Statistics provides accurate matrix determinant approximations through its method of moments estimation technique. By definition, following standard covariance matrix decomposition (e.g., Cholesky, spectral), all diagonal entries of an input n-by-n adjacency-based matrix A (i.e., a function of the 0/1 adjacency matrix C, where cij = 1 if row areal unit i and column areal unit j are adjacent, and cij = 0 otherwise), for which AᵀA = V, are zero, causing the eigenvalues to sum to zero (eigenvalues summing to their matrix trace is a well-known eigenfunction property). One of its most popular spatial statistics specifications (i.e., the simultaneous autoregressive model) is, for the row-standardized (i.e., stochastic) adjacency matrix W = D⁻¹C, where D denotes a diagonal matrix whose dii entries are the sum of the elements in row i of matrix C, the spatial linear operator matrix A = (I − ρW), where I denotes the identity matrix and ρ denotes the observation dependence parameter; this specification is reminiscent of that used in time series analyses. Meanwhile, the sum of squared eigenvalues is a quantity that is directly calculable from the entries in matrix V; for its row-standardized version, W, the specimen for this section, this sum of squares is given by
$$ \mathbf{1}^{\mathrm{T}} \mathbf{D}^{-1} \mathbf{C} \mathbf{D}^{-1} \mathbf{1} = (18PQ + 11P + 11Q + 12)/72 $$
for a regular square tessellation overlaying a P-by-Q (i.e., P rows and Q columns of pixels/squares in the given grid) complete rectangular region; this total, whose formula’s proof is by mathematical induction, delivers the second moment of an eigenvalue set. Furthermore, Griffith [19] outlines an algorithm for quickly and precisely calculating the extreme eigenvalues of matrix V.
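Both moment identities invoked above, the eigenvalues of W summing to its (zero) trace and their squares summing to a quantity computable directly from the matrix entries, are easy to confirm numerically. The sketch below does so for an illustrative rook-adjacency grid; the grid dimensions are arbitrary, and the author's closed-form grid expression is not reproduced, only the entry-based second-moment calculation it summarizes.

```python
# A minimal sketch verifying the first two eigenvalue moments of W = D^-1 C
# directly from matrix entries (illustrative grid; not the closed-form formula).
import numpy as np

P, Q = 6, 5                                    # illustrative grid dimensions
n = P * Q
C = np.zeros((n, n))
for i in range(P):
    for j in range(Q):
        k = i * Q + j
        if i + 1 < P: C[k, k + Q] = C[k + Q, k] = 1   # rook neighbor below
        if j + 1 < Q: C[k, k + 1] = C[k + 1, k] = 1   # rook neighbor to the right

d = C.sum(axis=1)                              # row sums (vertex degrees)
W = C / d[:, None]                             # row-standardized weights matrix
lam = np.linalg.eigvals(W).real                # real: W is similar to a symmetric matrix

second_moment = (1.0 / d) @ C @ (1.0 / d)      # 1' D^-1 C D^-1 1, no eigen-decomposition
print(round(lam.sum(), 8))                     # ~0, the trace of W
print(round((lam ** 2).sum(), 8), round(second_moment, 8))   # matching second moments
```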
Griffith [20] exploits two additional properties: (1) the rank ordering of an eigenvalue set, which is applicable to any matrix; and, (2) a line graph analytical eigenvalue solution (namely 2cos[iπ/(n + 1)], i = 1, 2, …, n) as a foundation for calculating rook adjacency regular square tessellation case approximations. His approximation results match benchmark eigenvalue sets almost exactly, rendering extremely accurate known third and fourth moments for unknown eigenvalue sets as a portion of their quality assessment. For a more general eigenvalue situation, with ascending ranked values denoted by λr (r = 1, 2, …, n), the positive values organize into a near linear trend through eigenvalue zero with a power version of their relative rankings, i.e., λr = [1 − (r − 1)/np]^δ, where np denotes the number of positive eigenvalues, and δ > 0 denotes an inflating positive exponent. The matrix inertia count of zero eigenvalues is n0. Finally, a veritable description of the nn values less than zero is λmin LN[1 + αr/nn]/LN[2 + α − (r/nn)^β], where λmin denotes the extreme smallest eigenvalue and bestows the sign on these negative values. These formulae signify that determining the inertia of a matrix is necessary; Griffith [21] reports that most large planar graph eigenvalue sets have np/n ≈ 0.4, n0 ≈ 0, and hence nn/n ≈ 0.6. The ensuing Table 4 raises the question of whether this percentage is a small-sample property of planar matrices. The Syracuse (n = 7249) empirical example, having 44.7% positive eigenvalues (this specimen is part of composite datasets for a number of papers, with more details about it appearing in Griffith and Luhanga [22]), is somewhat consistent with this conundrum.
Figure 1 portrays a number of specimen scatterplots for a suite of larger planar graphs spanning a wide range of n and relating to convenient publicly available spatial statistics geographic weights adjacency matrices, corroborating the contention that these preceding formulae furnish excellent approximations. Envisaging that each matrix’s set of eigenvalues constitutes a population, the method of moments can calculate the three coefficient (i.e., α, β, and δ) calibrations for these formulae. Given that an adjacency matrix divulges its first and second moment quantities, a remaining task is to estimate the third and/or fourth moments from the characteristics (e.g., row sums, extreme eigenvalues) of that matrix. Here, the third moment was calibrated as a function of: n, maximum row sum, second moment, and minimum eigenvalue (i.e., λmin). Most of the polygons were demarcated by the United States (US) Census Bureau census tracts or blocks; Canada’s polygons are Statistics Canada enumeration areas, and England’s polygons are NUTS-3 areal units.
Figure 1 overlays attest to a very good description by the simple moments-matched calibration, but with several conspicuous discrepancies (e.g., Figure 1a,e) indicating a need for further refinement research. Regardless, Table 3 authenticates the closeness of these simple approximations, particularly regarding their matrix determinants. Some of these outcomes suggest an asymptotic mechanism may be at work here, yet another topic warranting subsequent research.
In conclusion, dating back to Perron [23] and Frobenius [24], if not before, matrix/linear algebra has been, and seemingly continues to be, preoccupied with establishing progressively tighter upper and lower bounds on individual, and particularly extreme, matrix eigenvalues [25,26]. Statistics supports realistic approximations/calibrations/estimations that may well be superior to a buffering of a rank-ordered eigenvalue set with close upper and lower bounds. The matrix determinant pairs listed in Table 3 are almost identical element couplings for the most common case of moderate positive spatial autocorrelation encountered in real-world geospatial socio-economic/demographic data. Extending these results to massively large georeferenced datasets (e.g., [20]), for example, enables the implementation of spatial statistics for any size GIScience problem. In addition, this numerical solution initiates a protocol capable of going beyond adjacency matrices, presumably to any real matrix. This setting portends a potential meaningful contribution to matrix/linear algebra by statistics.

4. Kronecker, Hadamard, and Other Nonstandard Operators: From the Spatial Statistical to the Matrix/Linear Algebra Domain

Kronecker matrix algebra―a generalization of outer product linear algebra operations from vectors to matrices that creates a block tensor product linear map matrix form with respect to a standard basis choice, differing from their usual direct product/sum linear algebra counterparts, which are entirely different operations―first appeared in the early-to-mid 1800s [27], followed by a rather controversial discovery credit history. The Kronecker product (denoted by the symbol ⊗) promotes a narrow range of applications in matrix calculus and theory of equations, system theory and identification, signal processing, and other special engineering and computer science fields [28], such as the computation of search directions for semi-definite programming primal–dual interior–point algorithms. In contrast, Graham [29] advocates a more diverse set of applications by presenting this operation within the statistical problems of general least squares (§7.4), multivariate maximum likelihood estimation (§7.5), and Jacobians of transformations (§7.6), a somewhat scant assemblage, nonetheless. Neudecker [30] discusses its introduction into econometrics, where it has proved extremely useful in time series analyses, yet another cornucopia of consequential application examples. It also is becoming a popular addition to spatial statistics toolboxes [31], likewise enhancing its pool of prospective relatable illustrations. The Kronecker sum (denoted by the symbol ⊕; e.g., Pease [32]) is a more obscure operator that is a version of the Kronecker product. Together, these two operations facilitated the work of spatial statisticians in solving the spatial autocorrelation problem plaguing spatial interaction flows data [33], embedding an n-by-n origin and destination surface partitioning adjacency matrix, C, into existing model specifications with the terms
C ⊗ I, I ⊗ C, C ⊗ C, and C ⊕ C = C ⊗ I + I ⊗ C,
which also can join the arsenal of application contributions from statistics to matrix/linear algebra. The dormant contribution here, which represents a host of empirical applications involving geographic flows spanning journey-to-work, -to-shop, and -to-recreate, as well as migration/mobility and international trade, is bolstered by the Blieberger and Schöbel [34] railroad study employing Kronecker algebra. This collection constitutes a wealth of problem exemplars for next-generation matrix/linear algebra textbooks.
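A small numerical sketch of these operators (the three-node path adjacency matrix is purely illustrative) also displays one property that makes them attractive for flow data: the spectrum of the Kronecker sum consists of all pairwise sums of the spectrum of C, so origin and destination dependency structures combine in a transparent way.

```python
# A minimal sketch of the Kronecker terms in the display above, using an
# illustrative 3-node path-graph adjacency matrix C.
import numpy as np

C = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)        # origin/destination surface adjacency
I = np.eye(C.shape[0])

kron_CI = np.kron(C, I)                        # C (x) I
kron_IC = np.kron(I, C)                        # I (x) C
kron_CC = np.kron(C, C)                        # C (x) C
kron_sum = kron_CI + kron_IC                   # C (+) C, the Kronecker sum

# Eigenvalues of the Kronecker sum are all pairwise sums of the eigenvalues of C.
lam = np.linalg.eigvalsh(C)
pairwise = np.sort(np.add.outer(lam, lam).ravel())
print(np.allclose(np.sort(np.linalg.eigvalsh(kron_sum)), pairwise))   # True
```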
Graham [29] begins showing how application contributions from statistics can enrich the breadth of Kronecker product textbook examples, although, as already noted, his repertoire is rather sparse. This parallels the expansion of subject-matter-oriented statistics and calculus classroom texts during the last half of the twentieth century, following the debate in mathematics about whether mathematics-laden topics of that time could be taught without being framed in a rigorous, systematic theorem-proof context. Unfortunately, the incorporation of such feedback from statistics is atypical, as Bernstein [35] demonstrates; his book includes examples about signal processing, scientific algorithmic computing, matrix calculus, and tensor (i.e., the higher-dimensional analogue of matrices) analysis, but not statistics. This paper certainly argues that his book would benefit greatly by adapting additional applications from statistics.
This lack of two-directional cross-fertilization transcends Kronecker algebra, also pertaining to other esoteric matrix/linear algebra themes, such as Hadamard algebra [36]―synonymously known as the element-wise, entry-wise, and Schur matrix product, it is a binary operation on two matrices of identical dimensions that produces a third matrix in which each cell (i, j) is the product of the corresponding (i, j) elements of the original two matrices. This matrix operation differs from its usual direct product linear algebra counterpart, which is an entirely different operation. During the last half of the last century, Styan [37] recognized this specific idea as both a neglected matrix theory concept, and an approach already finding scarce and scattered application in statistics. Meanwhile, Neudecker et al. [38] helped import it into econometrics practice. Later, Griffith [39] helped usher it into quantitative geography and spatial statistics usage. Again, statistics offers a fertile opportunity for reciprocal contributions, opening another reservoir of application possibilities to what could become a more inclusive matrix/linear algebra.
In conclusion, subsections of matrix/linear algebra presentations, such as those for Kronecker and Hadamard algebra, could gain immensely by more comprehensively diversifying their content through recognizing and formally staging and explicating their more recent novel statistical applications. Not only could their textbooks and research reference volumes be more appealing to a wider audience, but such publications also could spawn synergies fostering new research. This essentially is the story told by applied calculus and applied statistics endeavors. For example, although such topics as trigonometry, an integral ingredient in theoretical calculus courses, may not be treated in an applied calculus course, which avoids all of the material devoted to trigonometric substitutions or integration/differentiation of trigonometric functions, extra material from differential equations (e.g., growth through time), linear algebra (e.g., Markov chains), operations research (e.g., optimizing industry/manufacturing production functions), and/or statistics (e.g., parameter estimation) frequently may be studied that students of mathematics, engineering, physics, and a few other disciplines would not be exposed to until after, if at all, they completed their regular sequence of theoretical calculus courses. Similarly, applied statistics students rarely study proving theorems about estimators, hypothesis tests, and other inferential methods, often focusing instead on the implementation and interpretation of data analysis outcomes, whereas mathematical statistics students tend to engage real-world data less, and even more seldom large amounts of it, including its collection, authentication, and cleaning and quality control. In both cases, the two opposites perpetually advocate for discovery, with efforts in new undertakings like baseball analytics via applied calculus, and new data analytic tool formulations in statistics.

5. The Auto-Normal Model Jacobian Term and Spatial Autocorrelation: More Spatial Statistical Musings for Matrix/Linear Algebra

Auto-models (i.e., the response variable Y is on both sides of an equation’s equal sign), particularly the widely adopted auto-normal specification [e.g., Equation (2)], team multivariable calculus, especially the Jacobian term associated with variable transformations, with spatial statistics and other correlated data in a manner allowing payback through not only calculus-based, but also eigenfunction-based, illuminations directed from statistics to matrix/linear algebra. Graph theory is a backbone to contributions here, being glamorized by what at the beginning of this millennium became known as the new science of networks [40], in part sparked by a social networks revolution. The informative instrument at this juncture is scientific visualization, which serves, for example, many demystifying purposes. The formal teaching of auto-normal model parameter estimation, together with fantasizing about paintings like those displayed in Figure 2, helped inspire this pursuit [41,42,43,44].

5.1. Jacobian Term Plots: Smiley Face Emojis in Science

The bivariate version of Equation (1) furnishes the simplest Jacobian term [i.e., det|Σ| in Equation (1), and det|V| in Equation (2)] plot demonstration (Figure 3c) for statistics. Figure 3a,b portray those drawings for usual spatial statistical auto-normal estimation problems. The surprise here is a visualization that resembles a smiley face emoji (e.g., https://depositphotos.com/308817798/stock-illustration-7cute-yellow-round-emoticon-smiling.html and https://www.pinterest.com/pin/300826450087576163/ (accessed on 4 October 2022)), an unexpected image that students of matrix/linear algebra most likely would find memorable. The ability to graph matrix determinants in such a parsimonious and unforgettable way seems remarkable, completely unexpected, and eye-opening to numerous quantitative scholars.
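For readers wanting to reproduce such a curve, the sketch below evaluates the log-Jacobian term for an illustrative rook-adjacency grid (not the satellite-image or census-tract cases of Figure 3): for the simultaneous autoregressive specification with spatial linear operator A = I − ρW, the term reduces, up to constants, to ln det(I − ρW) = Σᵢ ln(1 − ρλᵢ), computable from the eigenvalues of W across its feasible parameter interval.

```python
# A minimal sketch of the Jacobian term ln det(I - rho*W) over the feasible
# autocorrelation interval 1/lambda_min < rho < 1/lambda_max (illustrative grid).
import numpy as np

def rook_grid_W(P, Q):
    """Row-standardized rook-adjacency weights for a P-by-Q grid."""
    n = P * Q
    C = np.zeros((n, n))
    for i in range(P):
        for j in range(Q):
            k = i * Q + j
            if i + 1 < P: C[k, k + Q] = C[k + Q, k] = 1
            if j + 1 < Q: C[k, k + 1] = C[k + 1, k] = 1
    return C / C.sum(axis=1, keepdims=True)

W = rook_grid_W(10, 10)
lam = np.linalg.eigvals(W).real                # real eigenvalues; lambda_max = 1
rho_lo, rho_hi = 1.0 / lam.min(), 1.0 / lam.max()
for rho in np.linspace(rho_lo + 0.05, rho_hi - 0.05, 7):
    log_jac = np.log(1.0 - rho * lam).sum()    # ln det(I - rho W)
    print(f"rho = {rho:6.3f}   ln det(I - rho W) = {log_jac:9.3f}")
```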

5.2. Eigenfunction Visualizations

Although graphics portraying matrix determinants may be extraordinary, a truly astonishing visualization embraces eigenfunctions themselves; until recently, the only public example seems to be that appearing in Figure 2a (https://sites.math.washington.edu/~burdzy/hot_spots.php (accessed on 4 October 2022)). This outcome is beyond the long-standing pairwise eigenvector-based abstract plots well-known to data analytic researchers for multivariate statistical techniques such as principal components and factor analysis, ones that eventually evolved into critical search engine input. Rosmarin’s paintings (e.g., Figure 2b,c) prompted this more current awareness [36,37]. Griffith [19] derives the analytical eigenfunctions for regular square tessellations forming complete rectangular regions; these are the planar graphs whose eigenvectors convert to somewhat impressionistic choropleth maps (e.g., Figure 4a) in which distinct two-dimensional visual patterns materialize.
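The analytical forms behind such maps are compact enough to sketch: the rook-adjacency graph of a P-by-Q grid is the Kronecker sum of two path-graph adjacency matrices, so its eigenvectors are outer products of sine waves that, reshaped onto the grid, produce the two-dimensional patterns just described. The indices, grid size, and crude text rendering below are illustrative only, not the specific E2,2, E3,3, and E4,4 panels of Figure 4.

```python
# A minimal sketch of eigenvector-to-image conversion for a rook-adjacency grid:
# E_{p,q}(i, j) = sin(p*i*pi/(P+1)) * sin(q*j*pi/(Q+1)), reshaped to a P-by-Q image.
import numpy as np

P, Q = 20, 20
i = np.arange(1, P + 1)
j = np.arange(1, Q + 1)

def grid_eigenvector(p, q):
    """Analytical grid eigenvector returned as a P-by-Q array (an 'image')."""
    return np.outer(np.sin(p * i * np.pi / (P + 1)),
                    np.sin(q * j * np.pi / (Q + 1)))

E = grid_eigenvector(3, 3)
# Crude text rendering of the sign pattern; a choropleth map would shade these cells.
for row in np.sign(np.round(E, 12)).astype(int):
    print("".join("#" if v > 0 else ("." if v < 0 else " ") for v in row))
```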
The stunning similarities between selected theoretical eigenvector portrayals and Rosmarin’s paintings were a serendipitous first glimpse into simple eigenvector visualization [36,37], augmenting Kelly’s lone art specimen (Figure 2a) to that date. Quantitative scholars and students alike express amazement when confronted with such juxtapositions, mostly unaware that such extremely abstract entities as n-tuple matrix components can manifest themselves as optical paintings. This truly is an exploitable chance for matrix/linear algebra to better appeal to the masses, following in the footsteps of, for example, fractal digital art effectively invented by Benoit Mandelbrot [45].
Even more thought-provoking is that the eigenvalues accompanying these eigenvectors index the nature and degree of the spatial correlation—the correlation of nearby red-green-blue (RGB) spectral values on a canvas—latent in their respective eigenvector portrayals. The two eigenvectors depicted in Figure 4c account for, respectively, 33% of the red and 19% of the green canvas-wide geographic variation in the Figure 4b red-green painting [36]; their corresponding correlation measures are 0.998 and −0.906, revealing a modestly strong similarity, with a trace of contrasting, neighboring painting pixel RGB color values.

5.3. Network Dependencies in Pictures

Eigenfunction visualization does not culminate with planar graphs typifying spatial statistics situations. It also relates to the denser graph theoretic adjacency matrices coincident with contemporary social networks (e.g., Figure 5). Once more, the ability to scientifically visualize intangible n-tuples here is astounding, although not unprecedented in general [46], and certainly is reminiscent of how Rubik’s Cube aids in making mathematical group theory palpable [47]. Although the preceding maps may afford a more concrete exemplification of matrix/linear algebra ideas, social network realizations like those portrayed in Figure 5 also should join the repayment assets that should flow from correlated data statistics [18] to matrix/linear algebra.
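A minimal sketch of the classification behind such renditions follows: compute the principal eigenvector of a symmetric social network adjacency matrix and bin its elements into relatively high, medium, and low thirds, the red/yellow/green classes of Figure 5. The random network generated here is purely illustrative; the konect.cc datasets of Figure 5 are not reproduced.

```python
# A minimal sketch of principal-eigenvector (eigenvector centrality) classification
# for an illustrative random undirected network; data and cutpoints are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 30
A = np.triu((rng.random((n, n)) < 0.15).astype(float), k=1)
A = A + A.T                                    # symmetric 0/1 adjacency, no self-loops

vals, vecs = np.linalg.eigh(A)
principal = np.abs(vecs[:, -1])                # leading (Perron) eigenvector, sign-fixed
cutoffs = np.quantile(principal, [1 / 3, 2 / 3])
classes = np.digitize(principal, cutoffs)      # 0 = low, 1 = medium, 2 = high
for label, code in zip(["low", "medium", "high"], range(3)):
    print(label, np.flatnonzero(classes == code))
```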

6. Sampling Distributions for Customary Planar Graph Indices

An Atlas of Graphs [48] inventories and catalogues most, if not all, possible graphs involving relatively small numbers of nodes (e.g., ≤7). This is an invaluable resource for graph theorists and researchers in related fields. The internet avails other such sources (e.g., https://zenodo.org/record/4010122#.Yy0rLUzMKUk, https://houseofgraphs.org/ (accessed on 7 November 2022)). The undirected planar graph algorithm outlined in Griffith [21] created the database for this section, a tool capable of generating much larger graphs, ones with hundreds or thousands of nodes, although not exhaustively enumerating them because of their substantial numbers. These are distinct from the fashionable Erdős-Rényi random matrices. The goal of this section is to simulate selected sampling distributions for categories of adjacency matrices that can enlighten matrix/linear algebra, extending what already is known about them theoretically, such as, for example, the statistical distribution of the principal component (i.e., largest) eigenvalue of a correlation matrix [49], a rather limited knowledge base. One previously established outcome is that planar graphs tend to have a 40–60% split of positive-negative eigenvalues [21]. Other useful properties for non-negative matrices include: the number of complete K4 subgraphs (i.e., the graph structure responsible for the maximum number of negative eigenvalues reaching 75%; [50]); the adjacency matrix row sums (i.e., vertex degrees) variance; and, the statistical distribution of the minimum eigenvalue. No doubt other matrix traits [e.g., spectral and eigenvalue gaps, supplementing K4 subgraph counting with K3 subgraph counting (K5 and beyond are non-planar; Kuratowski’s Theorem)] are worthy of monitoring, as the literature echoes [51]. These are contributions from statistics to matrix/linear algebra that can be transformative in a descriptive attribute tabulating way, one that inputs accumulated facts into innovative classification scheme refinements.
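The indices just listed are straightforward to compute once a planar adjacency matrix is in hand. In the sketch below, Delaunay triangulations of random points stand in for the author's planar-graph generator of [21] (an assumption made purely for self-containment); the quantities printed are of the kind tabulated in Table 4.

```python
# A minimal sketch computing Table 4-style indices for one simulated planar graph;
# Delaunay triangulation is a stand-in generator, not the algorithm of [21].
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
n = 200
pts = rng.random((n, 2))
tri = Delaunay(pts)

C = np.zeros((n, n))
for simplex in tri.simplices:                  # each triangle contributes three edges
    for a in range(3):
        u, v = simplex[a], simplex[(a + 1) % 3]
        C[u, v] = C[v, u] = 1

lam_C = np.linalg.eigvalsh(C)                  # raw adjacency spectrum
W = C / C.sum(axis=1, keepdims=True)
lam_W = np.linalg.eigvals(W).real              # row-standardized spectrum

print("lambda_max(C)      :", round(lam_C.max(), 3))
print("lambda_min(W)      :", round(lam_W.min(), 3))
print("degree std. dev.   :", round(C.sum(axis=1).std(), 3))
print("% positive lambdas :", round(100 * (lam_C > 1e-10).mean(), 1))
```

Repeating this over many replicates, and over increasing n, yields simulated sampling distributions of the kind summarized in Table 4 and Figure 6.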
Noteworthy adjacency matrix facets include the principal eigenvalue, primarily because this quantity in its general role receives so much attention in the literature. Table 4 and Figure 6a imply that, unlike for conventional pairwise variate correlation matrices, this matrix quantity fails to converge upon a bell-shaped normal distribution for complete undirected random planar graph adjacency matrices (i.e., Cs). One conspicuous tendency disclosed by Table 4 is that dispersion for various quantitative matrix indices tends to decrease with increasing n.
One interesting aspect of the smallest row-standardized adjacency matrix (i.e., W) eigenvalue is its propensity to be nearer to −1 than to −0.5 with increasing n, perhaps beckoning a trade-off tension between square and hexagonal tessellation proclivities in irregular surface partitionings. Accompanying this possible trend is a matrix inertia (see §3) inclination toward an equal number of positive and negative eigenvalues, rather than the expected 40–60% division, which may resonate nothing more than the aforementioned hexagonal partitioning predilection of administrative polygons. In addition, the standard deviation of individual complete undirected planar graph adjacency matrix vertex degrees (i.e., elements of vector C1) appears to converge upon a bell-shaped curve. This is counter to the expectation, promoted by most administrative polygon surface partitions, that some mixture of four and six degrees often dominates an empirical adjacency matrix.
The K4 subgraph statistics are captivating because of their already noted relationship to the upper limit on an adjacency matrix’s number of negative eigenvalues (i.e., 75%; also see [52]). Roughly half of the simulated matrices have no K4 subgraphs, with this set tending to traverse the entire range of negative eigenvalue percentages, while rendering little evidence of a meaningful trendline describing the remaining percentages, regardless of n. Regardless, a graph embracing even a single K4 subgraph requires the maximum number of colors allowed by the four-color theorem for planar graph vertex/face coloring.
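Counting K4 subgraphs itself needs no special machinery for graphs of modest size; a brute-force check over vertex quadruples, as sketched below, suffices for illustration (the example graph, K5 minus one edge, is a maximal planar graph containing exactly two K4 subgraphs; the efficient counting used for the simulations is not reproduced here).

```python
# A minimal brute-force K4 subgraph counter for a small undirected adjacency matrix.
import numpy as np
from itertools import combinations

def count_k4(C):
    C = np.asarray(C, dtype=bool)
    return sum(
        1
        for quad in combinations(range(C.shape[0]), 4)
        if all(C[a, b] for a, b in combinations(quad, 2))
    )

C = np.ones((5, 5)) - np.eye(5)
C[3, 4] = C[4, 3] = 0                          # K5 minus one edge: maximal planar
print(count_k4(C))                             # 2: vertex sets {0,1,2,3} and {0,1,2,4}
```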
These and other descriptive statistics computed for simulated complete undirected planar graph adjacency matrices underscore a host of informative matrix characteristics the discipline of statistics can tabulate for and transmit to the subdiscipline of matrix/linear algebra.

7. Conclusions and Implications

Statistics has a tradition of borrowing and importing conceptualizations from kindred disciplines and subdisciplines, particularly ones in mathematics. Matrix/linear algebra is a case in point. However, this flow of intellectual property does not have to be one-way. The literature review encapsulated in this paper argues and epitomizes a few ways in which statistics can compensate matrix/linear algebra for its numerous contributions that helped statistics to flourish as a more rigorous mathematical sciences discipline. In doing so, proper adaptations could aid parts of matrix/linear algebra to become more understandable to a wider audience of scholars. This paper reviews the following five potential candidate topics, all of which build upon key strengths of statistics that clearly inhabit its interface with matrix/linear algebra: uncovering all sources of a matrix’s zero eigenvalues; approximating eigenvalues for massively large matrices; special matrix operations, such as Kronecker and Hadamard products; the calculus-based Jacobian term associated with variable transformations, a topic covered in virtually all introductory mathematical statistics courses; and, properties of planar graph grounded random matrices/vectors detectable with simulation experiments.
Of course, these are not the only interchange junctures; others, some to which the previous narrative alludes, remain implicit or unidentified here. Among others, the subfield of multilinear (vector) algebra, with many adoptions of it in multivariate statistics, constitutes an emerging potential common ground rife with payback possibilities. For example, matrix decomposition is both a fundamental and widely studied topic, although its extension to the tensor case still encounters some difficulties [53,54]. Furthermore, reminiscent of the preceding eigenvector discussion, Loperfido [55,56] highlights connections between tensor eigenvectors and skewness maximization, within the framework of projection pursuits. Expanding upon this eigenfunction theme, the fully real eigenpairs (i.e., values and their corresponding vectors) problem is well-known for matrices, while still remaining open for tensor eigenvectors [57]. All in all, future contributions from especially algebraic statistics to multilinear algebra should come to fruition.
The discipline of statistics should, and most likely will be compelled to, produce more reviews of this type as it begins engaging the emerging subject that is being labeled data science [58].

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The simulated, spatial weights matrix, and Rosmarin painting RGB spectral data are available from the author, by request. The social network data are available via http://konect.cc (accessed on 4 October 2022).

Acknowledgments

The author is an Ashbel Smith Professor of Geospatial Information Sciences and Geography. The author thanks an anonymous reviewer for bringing multilinear (vector) algebra potential payback possibility to his attention.

Conflicts of Interest

The author declares no conflict of interest.

References

1. David, H. First (?) occurrence of common terms in mathematical statistics. Am. Stat. 1995, 49, 121–133.
2. Harville, D. Matrix Algebra from a Statistician’s Perspective; Springer: New York, NY, USA, 1997.
3. Gruber, M. Matrix Algebra for Linear Models; Wiley: New York, NY, USA, 2014.
4. Adachi, K. Matrix-Based Introduction to Multivariate Data Analysis, 2nd ed.; Springer Nature: Singapore, 2020.
5. Healy, M. Matrices for Statistics; Oxford University Press: Oxford, UK, 2000.
6. Gentle, J. Matrix Algebra: Theory, Computations, and Applications in Statistics; Springer: New York, NY, USA, 2007.
7. Banerjee, S.; Roy, A. Linear Algebra and Matrix Analysis for Statistics; CRC Press: Boca Raton, FL, USA, 2014; Volume 181.
8. Searle, S.; Khuri, A. Matrix Algebra Useful for Statistics, 2nd ed.; Wiley: New York, NY, USA, 2017.
9. Seber, G. A Matrix Handbook for Statisticians; Wiley: New York, NY, USA, 2008.
10. Schott, J. Matrix Analysis for Statistics, 3rd ed.; Wiley: New York, NY, USA, 2017.
11. Wishart, J. The generalised product moment distribution in samples from a normal multivariate population. Biometrika 1928, 20A, 32–52.
12. Biroli, G.; Burda, Z.; Vivo, P. Random matrices: The first 90 years. J. Phys. A Math. Theor. (Spec. Issue) 2019, 51–52. Available online: https://iopscience.iop.org/journal/1751-8121/page/Random-Matrices (accessed on 4 October 2022).
13. Whitney, H. On the abstract properties of linear dependence. Am. J. Math. 1935, 57, 509–533.
14. Nishimura, H.; Kuroda, S. (Eds.) A Lost Mathematician, Takeo Nakasawa: The Forgotten Father of Matroid Theory; Birkhäuser Verlag: Basel, Switzerland, 2009.
15. Oxley, J. Matroid Theory, 2nd ed.; Oxford University Press: Oxford, UK, 2011.
16. Goodnight, J. A tutorial on the sweep operator. Am. Stat. 1979, 33, 149–158.
17. Abdi, A. The eigen-decomposition: Eigenvalues and eigenvectors. In Encyclopedia of Measurement and Statistics; Salkind, N., Ed.; Sage: Thousand Oaks, CA, USA, 2007; pp. 305–309.
18. Griffith, D. A family of correlated observations: From independent to strongly interrelated ones. Stats 2020, 3, 166–184.
19. Griffith, D. Eigenfunction properties and approximations of selected incidence matrices employed in spatial analyses. Linear Algebra Its Appl. 2000, 321, 95–112.
20. Griffith, D. Approximation of Gaussian spatial autoregressive models for massive regular square tessellation data. Int. J. Geogr. Inf. Sci. 2015, 29, 2143–2173.
21. Griffith, D. Generating random connected planar graphs. GeoInformatica 2018, 22, 767–782.
22. Griffith, D.; Luhanga, U. Approximating the inertia of the adjacency matrix of a connected planar graph that is the dual of a geographic surface partitioning. Geogr. Anal. 2011, 43, 383–402.
23. Perron, O. Zur Theorie der Matrices [translation: On the theory of matrices]. Math. Ann. 1907, 64, 248–263.
24. Frobenius, G. Ueber Matrizen aus nicht negativen Elementen [translation: On matrices of non-negative elements]. Sitz. Der Königlich Preuss. Akad. Der Wiss. 1912, 23, 456–477. Available online: https://archive.org/details/mobot31753002089602/page/6/mode/2up (accessed on 4 October 2022).
25. Diaconis, P.; Stroock, D. Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab. 1991, 1, 36–61.
26. Schulze Darup, M.; Mönnigmann, M. Improved automatic computation of Hessian matrix spectral bounds. SIAM J. Sci. Comput. 2016, 38, A2068–A2090.
27. Henderson, H.; Pukelsheim, F.; Searle, S. On the history of the Kronecker product. Linear Multilinear Algebra 1983, 14, 113–120.
28. Zhang, H.; Ding, F. On the Kronecker products and their applications. J. Appl. Math. 2013, 2013, 296185.
29. Graham, A. Kronecker Products and Matrix Calculus with Applications; Courier Dover Publications: Mineola, NY, USA, 2018.
30. Neudecker, H. The Kronecker matrix product and some of its applications in econometrics. Stat. Neerl. 1968, 22, 69–82.
31. Cao, J.; Genton, M.; Keyes, D.; Turkiyyah, G. Sum of Kronecker products representation and its Cholesky factorization for spatial covariance matrices from large grids. Comput. Stat. Data Anal. 2021, 157, 107165.
32. Pease, M. (Ed.) The direct product and Kronecker sum. In Methods of Matrix Algebra; Mathematics in Science and Engineering, Volume 16; Academic Press: New York, NY, USA, 1965; Chapter XIV; pp. 335–347.
33. Chun, Y.; Griffith, D. Modeling network autocorrelation in space–time migration flow data: An eigenvector spatial filtering approach. Ann. Assoc. Am. Geogr. 2011, 101, 523–536.
34. Blieberger, J.; Schöbel, A. Application of Kronecker algebra in railway operation. Teh. Vjesn. 2017, 24, 21–30.
35. Bernstein, D. Kronecker and Schur algebra. In Scalar, Vector, and Matrix Mathematics: Theory, Facts, and Formulas; Revised and Expanded Edition; Princeton University Press: Princeton, NJ, USA, 2018; Chapter 9; pp. 681–702.
36. Liu, S.; Trenkler, G. Hadamard, Khatri-Rao, Kronecker and other matrix products. Int. J. Inf. Syst. Sci. 2008, 4, 160–177.
37. Styan, G. Hadamard products and multivariate statistical analysis. Linear Algebra Its Appl. 1973, 6, 217–240.
38. Neudecker, H.; Polasek, W.; Liu, S. The heteroskedastic linear regression model and the Hadamard product: A note. J. Econom. 1995, 68, 361–366.
39. Griffith, D. Spatial-filtering-based contributions to a critique of geographically weighted regression (GWR). Environ. Plan. A 2008, 40, 2751–2769.
40. Barabási, A.-L. Linked: The New Science of Networks; Perseus: Cambridge, MA, USA, 2002.
41. Griffith, D. Spatial autocorrelation and art. Cybergeo Eur. J. Geogr. 2016. Available online: http://cybergeo.revues.org/27429 (accessed on 4 October 2022).
42. Griffith, D. A spatial analysis of selected art: A GIScience-humanities interface. Int. J. Humanit. Arts Comput. 2020, 14, 154–175.
43. Griffith, D. Eigenvector visualization and art. J. Math. Arts 2021, 15, 170–187.
44. Griffith, D. Art, geography/GIScience, and mathematics: A surprising interface. Ann. Am. Assoc. Geogr. 2022, 12.
45. Mandelbrot, B. The Fractal Geometry of Nature; WH Freeman: New York, NY, USA, 1982; Volume 1.
46. Albert, R.; Jeong, H.; Barabási, A.-L. Attack and error tolerance of complex networks. Nature 2000, 406, 378.
47. Joyner, D. Adventures in Group Theory: Rubik’s Cube, Merlin’s Machine, and Other Mathematical Toys; Johns Hopkins University Press: Baltimore, MD, USA, 2002.
48. Read, R.; Wilson, R. An Atlas of Graphs; Oxford University Press: Oxford, UK, 1998.
49. Johnson, R.; Wichern, D. Applied Multivariate Statistical Analysis, 6th ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2015.
50. Elphick, C.; Wocjan, P. An inertial lower bound for the chromatic number of a graph. Electron. J. Comb. 2016, 24, P1.58. Available online: https://www.combinatorics.org/ojs/index.php/eljc/article/view/v24i1p58 (accessed on 4 October 2022).
51. Hawkins, T. Nonnegative matrices. In The Mathematics of Frobenius in Context (Sources and Studies in the History of Mathematics and Physical Sciences); Springer: New York, NY, USA, 2013; Chapter 17; pp. 607–649.
52. Taliceo, N.; Griffith, D. The K4 graph and the inertia of the adjacency matrix for a connected planar graph. Studia KPZK PAN Publ. Pol. Acad. Sci. 2018, 183, 185–209.
53. Comon, P.; Golub, G.; Lim, L.-H.; Mourrain, B. Symmetric tensors and symmetric tensor rank. SIAM J. Matrix Anal. Appl. 2008, 30, 1254–1279.
54. Comon, P. Tensors: A brief introduction. IEEE Signal Process Mag. 2014, 31, 44–53.
55. Loperfido, N. Finite mixtures, projection pursuit and tensor rank: A triangulation. Adv. Data Anal. Classif. 2018, 31, 145–173.
56. Loperfido, N. Skewness-based projection pursuit: A computational approach. Comput. Stat. Data Anal. 2018, 120, 42–57.
57. Sturmfels, B. Tensors and their eigenvectors. Not. Am. Math. Soc. 2016, 63, 604–606.
58. Hassani, H.; Beneki, C.; Silva, E.; Vandeput, N.; Madsen, D. The science of statistics versus data science: What is the future? Technol. Forecast. Soc. Change 2021, 173, 121111.
Figure 1. Selected administrative polygon geographic surface partitioning planar graph adjacency matrix eigenvalue ranking scatterplots (in black) with superimposed approximations (in red). Top left (a): England (NUTS-3). Top middle (b): Chicago, IL (2000 census tracts). Top right (c): Edmonton, AB (2011 enumeration areas). Bottom left (d): North Carolina (2010 census tracts). Bottom middle (e): US hospital service areas (2017). Bottom right (f): Syracuse, NY (1990 blocks).
Figure 2. Scientific visualization of matrix/linear algebra notions. Left (a): second Neumann eigenvector; Ellsworth Kelly Two Panels: Green Orange (1970), oil on canvas, Carnegie Museum of Art, Pittsburgh, PA. Middle (b): acrylic painting on canvas by Susie Rosmarin. Right (c): red painting (2010) by Susie Rosmarin.
Figure 3. Jacobian term plots across their feasible correlation parameter spaces. Left (a): a regular square tessellation underlying a remotely sensed satellite image. Middle (b): a census tract irregular surface partitioning underlying socio-economic/demographic household data. Right (c): the traditional bivariate regression case [i.e., Equation (1) for p = 2].
Figure 4. Top (a): regular square tessellation (e.g., for remotely sensed images) eigenvectors E2,2, E3,3, and E4,4. Middle (b): a Susie Rosmarin untitled acrylic on canvas painting and its inherent red and green spectral bands (the blue band is vacant). Bottom (c): the predominant E5,5 eigenvector, the ESRI©’s geographic information systems (GIS) mapping software ArcMap reconstructed image, and the recessive E207,208 eigenvector.
Figure 5. Mathematica 13.0 renditions of selected publicly available social network principal eigenvector elements classified as being relatively high (red), medium (yellow), and low (green) values (from [18]). Left (a): Chicago jazz musicians. Right (b): an e-mail social network.
Figure 6. Simulated sampling distributions for n = 100 (top), n = 500 (middle), and n = 2000 (bottom). Left (a): raw adjacency matrix λmax. Middle (b): row-standardized adjacency matrix λmin. Right (c): individual row sum ni dispersion (standard deviation).
Table 1. Specimen illustrative example matrices.

Simple 5-by-5 matrix:
0 1 1 0 1
1 0 0 1 0
1 0 0 1 0
0 1 1 0 0
1 0 0 0 0
eigenvalues: 2.14, 0.66, 0, −0.66, −2.14

Simple 5-by-8 matrix:
3 0 3 9 1 2 7 6
1 1 1 2 2 3 4 4
2 2 2 4 4 6 8 8
4 7 4 8 1 9 3 6
5 9 5 7 8 2 3 5
linear combinations: 1 row pair & 1 column pair

NOTE: bold font (in the original table) denotes linearly dependent rows/columns; matrices retrieved from the web site https://www.mathworks.com/matlabcentral/answers/574543-algorithm-to-extract-linearly-dependent-columns-in-a-matrix (accessed on 8 November 2022).
Table 2. Regression sweep algorithm results for specimen matrices.

| Software Package | Screen Printed Message | Matrix Column(s)/Row(s) |
|---|---|---|
| Minitab 17; Table 1 5-by-5 matrix | the following terms cannot be estimated and were removed | C3 |
| SAS 9.4; Table 1 5-by-8 matrix | the following parameters have been set to 0, since the variables are a linear combination of other variables as shown | R3 = 2 * R2; C6 = 222 * intercept − 3 * C3 − 4 * C4 − 4 * C5; R5 = 111 * intercept − R2 − 2 * R3 − 2 * R4 |
| SAS 9.4; n = 10 complete undirected star graph adjacency matrix | | C3 = C2; C4 = C2; C5 = C2; C6 = C2; C7 = C2; C8 = C2; C9 = C2; C10 = C2 |

NOTE: prefix C denotes matrix column, prefix R denotes matrix row, and suffix number denotes the horizontal/vertical rank order location of the column/row.
Table 3. Summary statistics for a specimen set of empirical geographic adjacency matrix eigenvalue calibrations for administrative polygon surface partitionings.

ρ = 0.5

| Planar Graph | Eigenvalue (λ) | n | mean λ | σλ | Skewness | Excess Kurtosis | −LN[det|V|^(1/2)] |
|---|---|---|---|---|---|---|---|
| England | actual | 893 | 0 | 0.45977 | 0.32159 | −0.74953 | 0.02969 |
| | calibrated | | 0 | 0.45965 | 0.41221 | −0.79758 | 0.03007 |
| Chicago | actual | 2067 | 0 | 0.44276 | 0.52311 | −0.70361 | 0.02824 |
| | calibrated | | 0 | 0.44276 | 0.55394 | −0.68962 | 0.02837 |
| Edmonton | actual | 2098 | 0 | 0.43532 | 0.58034 | −0.70531 | 0.02743 |
| | calibrated | | 0 | 0.43532 | 0.61421 | −0.68708 | 0.02758 |
| North Carolina | actual | 2195 | 0 | 0.42308 | 0.71073 | −0.59824 | 0.02630 |
| | calibrated | | 0 | 0.42308 | 0.68182 | −0.54499 | 0.02622 |
| US hospital service areas | actual | 3408 | 0 | 0.42659 | 0.65378 | −0.62225 | 0.02657 |
| | calibrated | | 0 | 0.42659 | 0.52563 | −0.42271 | 0.02620 |
| Texas | actual | 5265 | 0 | 0.44578 | 0.43170 | −0.75537 | 0.02825 |
| | calibrated | | 0 | 0.44578 | 0.50926 | −0.73931 | 0.02859 |
Table 4. Attribute arithmetic means for 1000 random planar graph adjacency matrices.

| Number of Vertices | Raw Matrix λmax | Raw Matrix λmin | Stochastic Matrix λmin | ni Standard Deviation | % 1s\|Planar | % λ > 0 | K4 Sub-Graphs |
|---|---|---|---|---|---|---|---|
| 100 | 5.587 (0.625) | −3.533 (0.152) | −0.752 (0.109) | 1.929 (0.560) | 77.670 (6.807) | 44.742 (2.045) | 2.925 (3.299; 33.4 †) |
| 500 | 5.979 (0.476) | −4.095 (0.251) | −0.855 (0.086) | 1.640 (0.498) | 76.190 (6.122) | 46.087 (1.846) | 4.996 (6.124; 53.9) |
| 1000 | 6.122 (0.432) | −4.342 (0.322) | −0.912 (0.059) | 1.290 (0.348) | 72.896 (4.562) | 47.134 (1.510) | 4.335 (6.247; 52.9) |
| 1500 | 6.120 (0.463) | −4.375 (0.341) | −0.948 (0.038) | 1.031 (0.250) | 70.422 (3.141) | 47.956 (1.168) | 3.679 (4.785; 52.0) |
| 2000 | 6.075 (0.470) | −4.328 (0.301) | −0.976 (0.019) | 0.792 (0.169) | 68.086 (1.694) | 48.845 (0.729) | 5.225 (6.983; 51.1) |

† percentage of simulated planar graphs with at least one K4 subgraph (the second value in the K4 column parentheses). NOTE: standard deviations in parentheses.