# Statistical Modeling of Financial Data with Skew-Symmetric Error Distributions

## Abstract

## 1. Introduction

The analysis uses the **sn** package to handle families of skew-symmetric distributions in R.

## 2. Data Set

- **Name**: Firm name + Nikkei Firm Code (1137 firms)
- **YMD**: Closing date
- **Sector1**: Nikkei Industry Sector Code (Major) (`1`: manufacturing, `2`: non-manufacturing)
- **Sector2**: Nikkei Industry Sector Code (Middle)
- **Sector3**: Nikkei Industry Sector Code (Minor)
- **AC**: Accounting criterion (`1`: Japanese standard accounting, `2`: United States standard accounting, `3`: International Financial Reporting Standards (IFRS))
- **Sales**: Amount of sales (Unit: Million Yen)
- **Employee**: Number of employees (Unit: People)
- **Assets**: Total assets (Unit: Million Yen)

## 3. Data Visualization and Its Implications

#### 3.1. Data Visualization

#### 3.2. Implications of Visualization

The joint distribution of logarithmic assets (`log.assets`) and logarithmic sales (`log.sales`) appears slanted from the lower right to the upper left rather than elliptical. To model distributions with such a structure, we can use distributions belonging to the family of skew-symmetric distributions proposed by [20,21].
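As a concrete illustration of this family, the skew-normal density multiplies a symmetric normal density by a skewing factor built from the normal distribution function. The sketch below is illustrative only: it uses Python's standard library rather than the **sn** package, and the function names and the parameter names `xi`, `omega`, and `alpha` (location, scale, slant) are assumptions chosen for this example.

```python
import math

def skew_normal_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew-normal density (2 / omega) * phi(z) * Phi(alpha * z), z = (x - xi) / omega."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)    # standard normal pdf at z
    Phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))   # standard normal cdf at alpha * z
    return 2.0 * phi * Phi / omega

def integrate(f, lo, hi, n=20000):
    """Trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return total * h

# The skewing factor changes the shape but not the total probability mass.
total_mass = integrate(lambda x: skew_normal_pdf(x, alpha=3.0), -10.0, 10.0)

# With alpha = 0 the skewing factor is 1/2 and the ordinary normal density is recovered.
symmetric_peak = skew_normal_pdf(0.0)
```

Setting `alpha` to a nonzero value tilts the density to one side while keeping it a proper density, which is the property that makes the family useful for slanted point clouds.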

## 4. Regression Modeling of Cross-Sectional Data

#### 4.1. Fitting Log-Log Model with Normal Error

`JAPANEXCHANGEGROUP0075107-3` (Japan Exchange Group, Inc., Credit & Leasing, Tokyo, Japan) is the most influential observation. The second most influential is `JAPANSECURITIESFINANCE0070514-1` (Japan Securities Finance Co., Ltd., Credit & Leasing, Tokyo, Japan), followed by `JAPANPOSTHOLDINGS0038793-1` (Japan Post Co., Ltd., Services, Tokyo, Japan) and `TOMENDEVICES0030607-1` (Tomen Devices Co., Wholesale Trade, Tokyo, Japan), both of which are also highly influential. Note that these firms have very high or very low sales for their number of employees and total assets, compared with the other firms.
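Influence of this kind is commonly summarized by hat values (leverage), studentized residuals, and Cook's distance. The following is a minimal sketch of those three measures for a one-regressor least-squares fit, written in plain Python. The toy data and the function name are hypothetical and are not taken from the Nikkei data; the residuals here are internally studentized, which may differ from the convention used by the paper's software.

```python
import math

def influence_measures(x, y):
    """Leverage, internally studentized residuals and Cook's distance
    for the simple regression y = b0 + b1 * x fitted by least squares."""
    n = len(x)
    p = 2  # number of regression coefficients (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - p)              # residual variance estimate
    hat = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]  # diagonal of the hat matrix
    stud = [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, hat)]
    cook = [r ** 2 * h / (p * (1.0 - h)) for r, h in zip(stud, hat)]
    return hat, stud, cook

# Hypothetical toy data: the last point has unusually low y for its x.
log_assets = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
log_sales = [3.9, 4.6, 5.1, 5.4, 6.1, 6.4, 7.1, 5.0]

hat, stud, cook = influence_measures(log_assets, log_sales)
most_influential = max(range(len(cook)), key=cook.__getitem__)
```

Cook's distance combines the size of the residual with the leverage of the point, so an observation can be flagged either because it is far from the fitted line or because it sits at the edge of the regressor range, or both.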

#### 4.2. Fitting Log-Log Model with Skew-Normal Error

#### 4.3. Fitting Log-Log Model with Skew-t Error

#### 4.4. Model Selection for Log-Log Models

## 5. Fitting Log-Log Model with Dummy Variables

#### 5.1. Fitting Log-Log Model with Skew-t Error and Dummy Variables

#### 5.2. Economic Implications

#### 5.3. Grouping ‘Insignificant’ Sectors and Final Model Comparison

## 6. Conclusions and Discussion

## Supplementary Materials

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Definition |
|---|---|
| AIC | Akaike Information Criterion |
| BIC | Bayesian Information Criterion |
| CP | Centered Parameter |
| DP | Direct Parameter |
| FY | Fiscal Year |
| EDA | Exploratory Data Analysis |
| MLE | Maximum Likelihood Estimate |
| TFP | Total Factor Productivity |
| TSE | Tokyo Stock Exchange |
| SN | Skew-Normal |
| ST | Skew-t |

## References

**Figure 1.** Concept diagram of EDA [6].

**Figure 3.** Pairwise scatter plot (or scatter plot matrix) of sales, number of employees, and total assets of firms in TSE Prime for the fiscal year ending March 2022.

**Figure 4.** Three-dimensional scatter plot of sales, number of employees, and total assets of firms in TSE Prime for the fiscal year ending March 2022.

**Figure 7.** Plots of residuals based on the results of fitting log-log model with normal error to financial data for TSE Prime listed firms for the fiscal year ending March 2022. (**upper-left**) index plots of the residuals, (**lower-left**) plot of residuals against fitted values, (**upper-right**) normal Q-Q plot of residuals, and (**lower-right**) smoothed density function plot of the residuals.

**Figure 8.** Plots for regression diagnostics (sensitivity analysis) when fitting the log-log model with normal error.

**Figure 9.** Plots of residuals based on the results of fitting a log-log model with normal errors after removing influential data.

**Figure 10.** Histogram of CP residuals and statistical model (**left panel**); P-P plot of squared scaled DP residuals (**right panel**).

**Figure 11.** Logarithmic scale three-dimensional scatter plot and sample regression plane: log-log model with skew-t errors after removing the influential data.

**Figure 13.** Bubble chart of financial data for firms closing in March 2022: color-coded according to Nikkei middle classification codes.

**Figure 14.** Logarithmic scale three-dimensional scatter plot and sample regression planes: log-log model with skew-t errors and dummy variables.

**Table 1.** Data set of TSE Prime listed firms extracted from the Nikkei NEEDS financial database (the first ten of the 1137 records are shown).

| | Name | YMD | Sector1 | Sector2 | Sector3 | AC | Sales | Employees | Assets |
|---|---|---|---|---|---|---|---|---|---|
| 1 | KYOKUYO0000001 | 31 March 2022 | 2 | 35 | 341 | 1 | 253575 | 2208 | 130460 |
| 2 | NIPPONSUISAN0000003 | 31 March 2022 | 2 | 35 | 341 | 1 | 693682 | 9662 | 505731 |
| 3 | MARUHANICHIRO0000004 | 31 March 2022 | 2 | 35 | 341 | 1 | 866702 | 12352 | 548603 |
| 4 | NITTETSUMINING0000022 | 31 March 2022 | 2 | 37 | 362 | 1 | 149082 | 2019 | 197732 |
| 5 | MITSUIMATSUSHIMAHOLDINGS0000023 | 31 March 2022 | 2 | 37 | 361 | 1 | 46592 | 1305 | 67837 |
| 6 | FURUKAWA0000043 | 31 March 2022 | 1 | 19 | 181 | 1 | 199097 | 2804 | 229727 |
| 7 | MITSUIMINING&SMELTING0000045 | 31 March 2022 | 1 | 19 | 181 | 1 | 633346 | 11881 | 637878 |
| 8 | TOHOZINC0000046 | 31 March 2022 | 1 | 19 | 181 | 1 | 124279 | 1051 | 145796 |
| 9 | MITSUBISHIMATERIALS0000047 | 31 March 2022 | 1 | 19 | 181 | 1 | 1811759 | 23711 | 2125032 |
| 10 | SUMITOMOMETALMINING0000049 | 31 March 2022 | 1 | 19 | 181 | 3 | 1259091 | 7202 | 2268756 |

| | StudRes | Hat | CookD |
|---|---|---|---|
| TOMENDEVICES0030607-1 | 5.03 | 0.01 | 0.07 |
| JAPANPOSTHOLDINGS0038793-1 | −3.47 | 0.02 | 0.08 |
| JAPANSECURITIESFINANCE0070514-1 | −7.01 | 0.05 | 0.76 |
| JAPANEXCHANGEGROUP0075107-3 | −7.07 | 0.05 | 0.80 |

| | Estimate | Std.Err | z-Ratio | Pr{>\|z\|} |
|---|---|---|---|---|
| (Intercept.DP) | 1.0344 | 0.0980 | 10.56 | 0.0000 |
| log(employees) | 0.2985 | 0.0165 | 18.11 | 0.0000 |
| log(assets) | 0.6817 | 0.0150 | 45.40 | 0.0000 |
| $\omega$ | 0.3623 | 0.0213 | 17.03 | 0.0000 |
| $\alpha$ | 0.5435 | 0.1717 | 3.17 | 0.0016 |
| $\nu$ | 3.7783 | 0.4997 | 7.56 | 0.0000 |
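The intercept in the table above is reported in the direct parameterization (DP), so it is the location of the skew-t error rather than its mean. For a skew-t error with scale $\omega$, slant $\alpha$, and $\nu > 1$ degrees of freedom, the mean is offset from the location by $\omega \delta b_\nu$, where $\delta = \alpha/\sqrt{1+\alpha^2}$ and $b_\nu = \sqrt{\nu/\pi}\,\Gamma((\nu-1)/2)/\Gamma(\nu/2)$. The sketch below evaluates that offset for the tabulated estimates. The function name is hypothetical, and whether the result coincides with the centered-parameter (CP) values produced by the paper's software is an assumption not verified here.

```python
import math

def skew_t_mean_shift(omega, alpha, nu):
    """Mean of a skew-t error with location 0: omega * delta * b_nu (requires nu > 1)."""
    if nu <= 1:
        raise ValueError("the mean exists only for nu > 1")
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    # b_nu = sqrt(nu / pi) * Gamma((nu - 1) / 2) / Gamma(nu / 2), computed on the log scale
    log_b = (0.5 * math.log(nu / math.pi)
             + math.lgamma((nu - 1.0) / 2.0)
             - math.lgamma(nu / 2.0))
    return omega * delta * math.exp(log_b)

# DP estimates copied from the table above.
shift = skew_t_mean_shift(omega=0.3623, alpha=0.5435, nu=3.7783)

# Intercept adjusted so that the remaining error term has mean zero.
mean_adjusted_intercept = 1.0344 + shift
```

Because the estimated slant is positive in this fit, the offset is positive, so a prediction of mean log sales sits above the plane defined by the DP intercept.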

| | Dim | AIC | BIC |
|---|---|---|---|
| Normal | 4 | 1496.43 | 1516.56 |
| Skew-Normal | 5 | 1494.69 | 1519.85 |
| Skew-t | 6 | 1397.36 | 1427.56 |
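The two criteria in the table above differ only in how they penalize the number of parameters: with $k$ parameters and $n$ observations, $\mathrm{AIC} = -2\log L + 2k$ and $\mathrm{BIC} = -2\log L + k\log n$, so $\mathrm{BIC} - \mathrm{AIC} = k(\log n - 2)$. The sketch below checks this relation against the tabulated values. The sample size $n = 1133$ is an assumption (1137 firms minus the four influential firms); it is not stated next to the table.

```python
import math

N_OBS = 1133  # assumption: 1137 firms minus the four influential firms

# (Dim, AIC, BIC) copied from the table above.
models = {
    "Normal": (4, 1496.43, 1516.56),
    "Skew-Normal": (5, 1494.69, 1519.85),
    "Skew-t": (6, 1397.36, 1427.56),
}

def bic_from_aic(aic, dim, n_obs):
    """Swap the AIC penalty 2k for the BIC penalty k * log(n)."""
    return aic + dim * (math.log(n_obs) - 2.0)

# Absolute gap between the recomputed and the tabulated BIC for each model.
gaps = {
    name: abs(bic_from_aic(aic, dim, N_OBS) - bic)
    for name, (dim, aic, bic) in models.items()
}
```

Under the assumed sample size, the recomputed BIC values agree with the table up to rounding, which is a quick consistency check on the Dim column as well.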

| | Estimate | Std.Err | z-Ratio | Pr{>\|z\|} | TFP |
|---|---|---|---|---|---|
| log(employees) | 0.2938 | 0.0157 | 18.73 | 0.0000 | — |
| log(assets) | 0.7024 | 0.0150 | 46.88 | 0.0000 | — |
| $\omega$ | 0.2564 | 0.0158 | 16.18 | 0.0000 | — |
| $\alpha$ | −0.5675 | 0.1853 | −3.06 | 0.0022 | — |
| $\nu$ | 3.5863 | 0.4394 | 8.16 | 0.0000 | — |
| Petroleum | 0.6154 | 0.1416 | 4.35 | 0.0000 | 1.8594 |
| Wholesale Trade | 0.4497 | 0.0604 | 7.44 | 0.0000 | 1.6937 |
| Retail Trade | 0.3023 | 0.0718 | 4.21 | 0.0000 | 1.5463 |
| Fish and Marine Products | 0.2622 | 0.1422 | 1.84 | 0.0652 | 1.5062 |
| Shipbuilding and Repairing | 0.1085 | 0.2643 | 0.41 | 0.6813 | 1.3525 |
| Foods (Intercept.DP) | 1.3660 | 0.1057 | 12.92 | 0.0000 | 1.2440 |
| Construction | −0.0029 | 0.0588 | −0.05 | 0.9605 | 1.2411 |
| Iron and Steel | −0.1624 | 0.0730 | −2.22 | 0.0261 | 1.0816 |
| Warehousing and Harbor Transportation | −0.2067 | 0.1117 | −1.85 | 0.0644 | 1.0373 |
| Sea Transportation | −0.2110 | 0.1273 | −1.66 | 0.0973 | 1.0330 |
| Non-Ferrous Metal and Metal Products | −0.2252 | 0.0674 | −3.34 | 0.0008 | 1.0188 |
| Utilities—Gas | −0.2267 | 0.1132 | −2.00 | 0.0452 | 1.0173 |
| Real Estate | −0.2507 | 0.0924 | −2.71 | 0.0067 | 0.9933 |
| Mining | −0.2623 | 0.1424 | −1.84 | 0.0655 | 0.9817 |
| Pulp and Paper | −0.2710 | 0.0912 | −2.97 | 0.0030 | 0.9730 |
| Trucking | −0.2751 | 0.0840 | −3.28 | 0.0010 | 0.9689 |
| Motor Vehicles and Auto Parts | −0.2935 | 0.0640 | −4.59 | 0.0000 | 0.9505 |
| Other Manufacturing | −0.3027 | 0.0770 | −3.93 | 0.0001 | 0.9413 |
| Services | −0.3048 | 0.0555 | −5.49 | 0.0000 | 0.9392 |
| Transportation Equipment | −0.3107 | 0.1125 | −2.76 | 0.0058 | 0.9333 |
| Chemicals | −0.3284 | 0.0562 | −5.84 | 0.0000 | 0.9156 |
| Communication Services | −0.4065 | 0.0943 | −4.31 | 0.0000 | 0.8375 |
| Stone, Clay, and Glass Products | −0.4198 | 0.0756 | −5.55 | 0.0000 | 0.8242 |
| Electric and Electronic Equipment | −0.4450 | 0.0561 | −7.94 | 0.0000 | 0.7990 |
| Machinery | −0.4734 | 0.0567 | −8.36 | 0.0000 | 0.7706 |
| Rubber Products | −0.4754 | 0.1016 | −4.68 | 0.0000 | 0.7686 |
| Drugs | −0.4848 | 0.0706 | −6.86 | 0.0000 | 0.7592 |
| Precision Equipment | −0.5204 | 0.0716 | −7.27 | 0.0000 | 0.7236 |
| Textile Products | −0.5943 | 0.0868 | −6.85 | 0.0000 | 0.6497 |
| Utilities—Electric | −0.6013 | 0.0900 | −6.68 | 0.0000 | 0.6427 |
| Credit and Leasing | −0.8536 | 0.1031 | −8.28 | 0.0000 | 0.3904 |
| Railroad Transportation | −1.0878 | 0.0749 | −14.52 | 0.0000 | 0.1562 |
| Air Transportation | −1.1681 | 0.1623 | −7.20 | 0.0000 | 0.0759 |
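In the table above, the TFP entry for each sector appears to equal the TFP of the baseline sector (Foods, 1.2440) plus that sector's dummy coefficient. The sketch below checks this reading on a handful of rows. It is an observation about the tabulated numbers, not a statement of the authors' method, and it does not explain how the baseline value 1.2440 is obtained from the DP intercept 1.3660.

```python
BASELINE_TFP = 1.2440  # TFP reported for the baseline sector (Foods)

# Dummy coefficients and reported TFP values copied from the table above.
dummy_coef = {
    "Petroleum": 0.6154,
    "Wholesale Trade": 0.4497,
    "Construction": -0.0029,
    "Credit and Leasing": -0.8536,
    "Railroad Transportation": -1.0878,
    "Air Transportation": -1.1681,
}
reported_tfp = {
    "Petroleum": 1.8594,
    "Wholesale Trade": 1.6937,
    "Construction": 1.2411,
    "Credit and Leasing": 0.3904,
    "Railroad Transportation": 0.1562,
    "Air Transportation": 0.0759,
}

# Baseline plus sector shift, rounded to the table's four decimals.
reconstructed = {s: round(BASELINE_TFP + c, 4) for s, c in dummy_coef.items()}

# Largest discrepancy between the reconstruction and the table.
max_gap = max(abs(reconstructed[s] - reported_tfp[s]) for s in dummy_coef)
```

Read this way, the sign of a dummy coefficient says whether a sector sits above or below Foods, and the TFP column simply re-expresses the same ordering on a common level.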

| | Dim | AIC | BIC |
|---|---|---|---|
| Distinct Sector Dummies Only | 34 | 4013.01 | 4184.12 |
| Log-Log Normal/w DSDs | 36 | 840.75 | 1021.93 |
| Log-Log Skew-Normal/w DSDs | 37 | 820.13 | 1006.33 |
| Log-Log Skew-t/w DSDs | 38 | 701.23 | 892.47 |
| Log-Log Skew-t/w PGSDs | 31 | 704.14 | 860.15 |
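The two criteria in the table above do not rank the candidates identically, because BIC penalizes each extra parameter more heavily than AIC at this sample size. A minimal sketch of the comparison follows; the variable names are illustrative, and which model is finally preferred is not stated in this excerpt.

```python
# (AIC, BIC) copied from the table above; keys are the row labels.
models = {
    "Distinct Sector Dummies Only": (4013.01, 4184.12),
    "Log-Log Normal/w DSDs": (840.75, 1021.93),
    "Log-Log Skew-Normal/w DSDs": (820.13, 1006.33),
    "Log-Log Skew-t/w DSDs": (701.23, 892.47),
    "Log-Log Skew-t/w PGSDs": (704.14, 860.15),
}

# Smaller is better for both criteria.
best_aic = min(models, key=lambda name: models[name][0])
best_bic = min(models, key=lambda name: models[name][1])

# Differences from the best model under each criterion.
delta_aic = {name: round(v[0] - models[best_aic][0], 2) for name, v in models.items()}
delta_bic = {name: round(v[1] - models[best_bic][1], 2) for name, v in models.items()}

for name in models:
    print(f"{name:32s} dAIC={delta_aic[name]:8.2f} dBIC={delta_bic[name]:8.2f}")
```

On these numbers the AIC gap between the two skew-t models is small while the BIC gap is much larger, which is the usual pattern when one model reaches almost the same fit with fewer parameters.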

