## 1. Introduction

The need to know how much sediment reaches specific points of streams and river segments became evident in the early 20th century [1,2,3]. As a consequence, the investigation of sediment transport processes and mechanisms emerged as a highly significant research topic for hydrologists, physicists and engineers in the years that followed. Sediments constitute an integral part of river flows, relentlessly shaping fluvial systems and variously affecting everything in their path [4,5]. The effects of sediments include water-quality issues; changes in the wet cross-section, increased flooding risk and obstruction of navigation as a result of excessive deposition; effects on aquatic ecosystems, such as decline of macrophyte growth and clogging of spawning gravel; pressures inflicted on coastal zones; reduction of the effective storage volume of dams due to excessive sedimentation; and extreme erosion rates in the case of sediment-starved water (usually below storage dams, the theory of hungry water) [6,7,8,9,10,11]. These effects constitute the driving force behind the investigation of sediment transport processes, as well as modeling and quantification efforts. Moreover, knowledge of the interactions among water, biota and sediment in natural rivers is one of the central issues in today’s sustainable river management [12].

The total sediment load is the sum of the suspended load and the bed load, with the suspended load being the larger part. According to the literature, bed and bank erosion in rivers can be taken as 10–20% of the total load [13,14,15], although this largely depends on whether the rivers are sandy-bed or gravel-bed [16]. Naturally, the finer the bed material, the more easily it is entrained and transported downstream. Hence, the bed load ratio, as a fraction of the total load, increases as the bed material becomes finer.

Decades of intensive research on river sedimentology and sediment transport have produced a multitude of formulas, models, and theoretical concepts aiming at the estimation of sediment load in natural streams. Depending on their target, these models can be divided into three principal classes: (a) bed-load models [17,18], (b) suspended-load models [19,20], and (c) total-load models [21,22]. Although most of the above-cited models were developed half a century or more ago, their theoretical basis and fundamental equations are so powerful that they still dominate stream sediment transport research. The models for total sediment load can be further categorized as follows [23]: (a) stochastic and regression models [24,25,26], (b) energy models [22,27,28], and (c) shear stress models [20,29,30].

Yang first introduced his unit stream power theory for the determination of total sediment concentration in open channels in 1972 [27]. This new theory questioned the assumption, made by conventional sediment transport equations, that the sediment transport rate could be determined on the basis of water discharge, average flow velocity, energy slope, or shear stress [31]. Yang [22] first applied his unit stream power theory to sandy-bed open channels, and thus developed a formula applicable to bed material with particle size less than 2 mm. In 1984, Yang [32] extended his unit stream power equation from sand transport to gravel transport, for gravel beds with particle sizes between 2 mm and 10 mm. Yang’s unit stream power theory has been extensively applied in the literature and, with more than 2000 citations, constitutes one of the most esteemed formulas for the determination of total sediment yield.

Fuzzy logic has proved a particularly useful tool in the hands of engineers, and its use in recent decades has been widespread in hydrology, hydraulics and sediment transport [33,34,35]. Fuzzy linear regression provides a functional fuzzy relationship between dependent and independent variables [36], where uncertainty manifests itself in the coefficients of the independent variables.

Fuzzy logic has been utilized in a variety of cases to study sediment transport processes, as well as to estimate the total sediment concentration. As a recent example, Chachi et al. [36] introduced a fuzzy regression method based on the Multivariate Adaptive Regression Splines (MARS) technique to estimate suspended load from discharge and bed-load transport data, using fuzzy triangular numbers. Comparison of the model’s results with real data and with two other fuzzy regression models (fuzzy least-absolutes and fuzzy least-squares regressions) showed that the fuzzy regression model performs well in predicting the fuzzy suspended load by discharge, as well as the fuzzy bed load transport data. In 2018, Spiliotis et al. [37] transformed the threshold for incipient sediment motion, expressed by a dimensionless critical shear stress, into a fuzzy set, by means of Zanke’s formula [38] for the computation of the dimensionless critical shear stress, using fuzzy triangular numbers instead of crisp values. The fuzzy band produced included almost all the experimental data used, with a functional spread. The same group of researchers carried out similar studies with an adaptive fuzzy-based regression and data from several gravel-bed rivers in mountain basins of Idaho, USA [39], and with conventional fuzzy regression analysis and a goal programming-based fuzzy regression using experimental data [40]; the results were satisfactory in both cases. In 2015, Özger and Kabataş [41] successfully applied fuzzy logic and combined wavelet and fuzzy logic (WFL) techniques to predict suspended sediment load, and compared the predictions with monthly measured suspended sediment data from the Corukhi River and miscellaneous East Black Sea basins. Kişi, in 2009 [42], and Kişi et al. [43] efficiently elaborated evolutionary fuzzy models (EFMs) and triangular fuzzy membership functions for suspended sediment concentration estimation using data from the US Geological Survey (USGS). Lohani et al. [44] applied Zadeh’s [45] fuzzy rule-based approach to derive stage-discharge-sediment concentration relationships. Firat (2010) [46] used an Adaptive Neuro-Fuzzy Inference System (ANFIS) approach as a monthly total sediment forecasting system.

The present study aims to redefine the coefficients of the stream sediment transport formula of Yang [22] with a fuzzy regression, using the very same experimental data that Yang used for the original equation. In essence, the intention is to build a functional “fuzzy twin” of the original equation, which will provide a fuzzy band for the total sediment concentration in natural sandy-bed rivers. The study began with the collection, analysis and processing of the primary experimental data, which, by itself, was a painstaking process. In the end, 93.3% of the original experimental data could be collected. Based on these data, a fuzzy “duplicate” of Yang’s equation was built by means of the fuzzy regression model of Tanaka [47]. In addition, the original sediment transport equation was reconstructed by means of classic multiple linear regression, in order to validate the quality of the data by comparing the calculated coefficients with the original ones. Apart from the coefficients, an efficiency assessment was carried out by comparing the measured crisp total sediment concentrations with the calculated concentrations and their fuzzy band. All the elaborated methods produced successful results for both the classic and the fuzzy multiple regressions.

It is the authors’ belief that fuzzy logic efficiently deals with the uncertainties that naturally envelop the complex sediment transport processes, by providing a fuzzy band for the final result—whichever this might be.

## 2. Unit Stream Power Theory of Yang for Sediment Transport in Natural Rivers

In 1972, Yang [27], with the introduction of the unit stream power theory, fundamentally questioned the applicability of most sediment transport models, which until then held that the sediment transport rate could be determined on the basis of physical quantities such as discharge, flow velocity, energy slope or shear stress.

Yang defines the unit stream power as the velocity-slope product. The rate of energy per unit weight of water available for transporting water and sediment in an open channel of reach length x and total drop Y is [31]:

$$\frac{dY}{dt}=\frac{dx}{dt}\,\frac{dY}{dx}=VS \qquad (1)$$

where Y is the elevation above a datum, which also equals the potential energy per unit weight of water above a datum; x is the longitudinal distance; t is time; V is the mean flow velocity; S is the energy slope; and VS is the unit stream power.

To determine total sediment concentration, Yang considered a relation between several physical quantities of the following form:

$$\varphi\left(C_{t},\,VS,\,V_{*},\,\nu,\,\omega,\,d_{50}\right)=0 \qquad (2)$$

where C_{t} is the total sediment concentration (ppm), with wash load excluded; ${V}_{*}$ is the shear velocity (m/s); ν is the water kinematic viscosity (m^{2}/s); ω is the fall velocity (m/s); and d_{50} is the median particle diameter (m).

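By means of Buckingham’s π theorem, the total sediment concentration can be expressed as a function of dimensionless parameters, as follows:

$$C_{t}=\varphi'\left(\frac{VS}{\omega},\,\frac{V_{*}}{\omega},\,\frac{\omega d_{50}}{\nu}\right) \qquad (3)$$

Yang added a critical unit stream power to the formula, to account for incipient motion of sediment, and after dimensional analysis, he derived the following equation for the total sediment concentration:

$$\log C_{F}=5.435-0.286\log\frac{\omega d_{50}}{\nu}-0.457\log\frac{V_{*}}{\omega}+\left(1.799-0.409\log\frac{\omega d_{50}}{\nu}-0.314\log\frac{V_{*}}{\omega}\right)\log\left(\frac{VS}{\omega}-\frac{V_{cr}S}{\omega}\right) \qquad (4)$$

where C_{F} is the calculated total sediment concentration (ppm); and V_{cr}S is the critical unit stream power, derived as the product of mean critical flow velocity and energy slope.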

Equation (4) is the dimensionless unit stream power equation that can be used to calculate the total sediment concentration, in ppm by weight, in both laboratory flumes and natural sandy-bed rivers, with median particle size less than 2 mm. Knowing the discharge and the geometric characteristics of the channel, and with simple calculations, the aforementioned sediment concentration can easily be transformed into any form of sediment load, sediment yield, or sediment discharge.
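As a worked illustration of how Equation (4) might be evaluated and how the resulting concentration might be turned into a sediment discharge, a minimal Python sketch follows. The regression constants are the values commonly quoted for Yang's 1973 sand formula and are assumed here to correspond to Equation (4). The function names, the SI inputs and the low-concentration approximation in the conversion are illustrative assumptions, not part of the original study.

```python
import math

def yang_concentration_ppm(V, S, omega, d50, nu, v_star, V_cr):
    """Total sediment concentration (ppm by weight) for sand, d50 < 2 mm.

    V, V_cr: mean and critical mean flow velocity (m/s); S: energy slope;
    omega: fall velocity (m/s); d50: median diameter (m);
    nu: kinematic viscosity (m^2/s); v_star: shear velocity (m/s).
    Constants: commonly quoted values for Yang's sand formula (assumed).
    """
    excess = (V * S - V_cr * S) / omega  # dimensionless excess unit stream power
    if excess <= 0.0:
        return 0.0  # below incipient motion: no transport predicted
    x1 = math.log10(omega * d50 / nu)
    x2 = math.log10(v_star / omega)
    x3 = math.log10(excess)
    log_c = 5.435 - 0.286 * x1 - 0.457 * x2 + (1.799 - 0.409 * x1 - 0.314 * x2) * x3
    return 10.0 ** log_c

def sediment_discharge_kg_s(c_ppm, Q, rho_w=1000.0):
    """Approximate sediment mass rate (kg/s) from ppm by weight and discharge Q (m^3/s).

    Assumes a low concentration, so that the mass of the mixture is close to the mass of water.
    """
    return c_ppm * 1.0e-6 * rho_w * Q

# Hypothetical flume-like conditions, chosen only for illustration.
v_star = math.sqrt(9.81 * 0.5 * 0.001)
c = yang_concentration_ppm(1.0, 0.001, 0.03, 0.0003, 1.0e-6, v_star, 0.3)
print(round(c), "ppm;", sediment_discharge_kg_s(c, 2.0), "kg/s")
```

The early return for a non-positive excess unit stream power is a design choice of this sketch; the logarithm in Equation (4) is simply undefined in that range.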

Yang’s unit stream power theory has been applied in a plethora of cases in the literature, both continuously [48,49] and on an event basis [50,51]. Somewhat unusually, yet successfully, it has also been applied for estimating overland flow erosion capacity [52,53]. Because Yang’s equations for total load [22,54] were built with data in the sand-size range, their application should be limited to sandy-bed rivers. However, Moore and Burch [52] showed that Equation (4) can be applied equally well to predict the sediment transport rate in sheet and rill flows, when soil particles are in ballistic dispersion. It should be mentioned, however, that Moore and Burch used a constant value of 0.002 m/s for the critical unit stream power [31].

## 3. “Fuzzy Twin”—The Physical Meaning

As mentioned above, the ultimate goal of this research is to build a functional “fuzzy twin” of the unit stream power formula of Yang. In an effort to explain the physical meaning of the term “fuzzy twin”, it is considered meaningful to separately analyze “fuzzy” and “twin”.

While some of the parameters involved, such as the flow velocity, the flow depth, the bed slope and the water temperature, can be determined with fairly high precision in natural streams, the overall uncertainty that blankets stream sediment transport processes, let alone the determination of the in-stream sediment concentration, is appreciably high. This is associated not only with the bed morphology and the grain size distribution, but also with the constantly changing flow conditions that prevail in natural rivers. Yet, any uncertainty due to measurement errors seems to be small compared to the uncertainties in the computational part. This is due to the simplifications and assumptions made by sediment transport formulas, in which reality is usually poorly reflected. Hence, beyond measurement errors, the fuzzy band is meant above all to deal with uncertainties in the computational part, namely uncertainties that have to do with the representation of all the involved physical processes in a formula. For example, fall velocity can be measured with much greater accuracy than it can be computed by any existing formula. Obvious reasons for this are that all fall velocity formulas treat the particle as a sphere, usually represented by the median particle diameter, d_{50}, rather than by its actual diameter, and that they disregard turbulence. Going further, the uncertainty is increased by the subjectivity in the estimation of the incipient motion criterion [54] and by the impact of turbulence on sediment transport. To better identify the sources of uncertainty in Yang’s formula, the assumption of one-dimensional, uniform and steady flow (especially in the case of natural rivers), as well as the regression analysis between sediment transport rate and stream discharge, which partly neglects the physical mechanisms of the sediment transport phenomenon, must also be considered. The uncertainties would be significantly reduced in the case of an analytical, physically based model. The complex nature of sediment transport and the associated uncertainties have been very well documented in the literature [54,55,56,57,58,59]. In terms of uncertainty, Kleinhans (2005) [57] compares the notoriety of the sediment transport problem with that of the roughness problem and stresses the necessity of calibration. In such cases, fuzzy regression, in contrast to conventional solutions such as classic regression, offers an efficient and applicable solution by producing a fuzzy band within which the measured values are most likely included. Indeed, Azamathulla et al. [60] state that classic regression does not cope efficiently with the uncertainties that dominate both input and output data, and they instead use a Fuzzy Inference System (FIS) as a prediction model. Hence, “fuzzy” is justified by the fact that the computed sediment concentration is not a crisp value, as it would be if the classic formula of Yang, Equation (4), had been used, but a range of values that is expected to contain the observed data.

As already mentioned, and as presented in detail in Section 4, the construction of the fuzzy total sediment concentration formula is based on the exact same datasets that Yang used 47 years ago to derive his unit stream power formula. Hence, both formulas were built upon the same foundation, and this makes them “twins”.

## 4. Materials and Methods

#### 4.1. Experimental Data for the Derivation of Yang’s Formula

Yang determined the coefficients of Equation (4) by considering the logarithmic total sediment concentration, logC_{F}, as the dependent variable and the log(ω·d_{50}/ν), log(${V}_{*}$/ω), log(VS/ω−V_{cr}S/ω), log(ω·d_{50}/ν)·log(VS/ω−V_{cr}S/ω), log(${V}_{*}$/ω)·log(VS/ω−V_{cr}S/ω), as the independent variables, and applying a multiple regression analysis for 463 sets of data in laboratory flumes.

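These data were obtained from the following hydraulic and sediment transport surveys in laboratory flumes:

- Nomicos (1956) [61]
- Vanoni and Brooks (1957) [62]
- Kennedy (1961) [63]
- Stein (1965) [64]
- Guy, Simons and Richardson (1966) [65]
- Williams (1967) [66]
- Schneider (1971) [67]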

It should be mentioned that this research began with collecting and organizing the experimental data of the above surveys, which, by itself, was a very laborious task. Moreover, appreciable effort was put into dealing with inaccuracies and incorrect values found in the literature, in order to record the exact and correct data, as reported in the original surveys.

#### 4.1.1. Nomicos’ Data (1956)

Nomicos [61] investigated the friction characteristics of streams with sediment load. Velocity and sediment profiles were measured, and the friction factor and von Karman’s constant were calculated in a 40-foot long and 0.875-foot wide flume. Nomicos conducted seven sets of experiments with 43 runs under uniform flow conditions and with various bed configurations, using sands ranging from 0.1 mm to 0.16 mm. Yang [22] utilized 12 of these runs, with the same particle size (0.152 mm) and flow depth (0.24 foot), for his regression analysis.

#### 4.1.2. Vanoni and Brooks’ Data (1957)

Vanoni and Brooks [62] carried out a total of 94 experimental runs, in the context of four different experiments, in two laboratory flumes with fine sand of several size distributions under uniform flow conditions. Fine sand with a median particle diameter of 0.137 mm was used for the channel bed. During these experiments, the relationship between sediment transport rates and the hydraulic variables was investigated. Yang used 14 runs from the experiments conducted in a 60-foot long and 2.79-foot wide flume.

#### 4.1.3. Kennedy’s Data (1961)

In 1961, Kennedy [63] carried out a series of experiments with the objective of investigating the factors involved in the formation of antidunes, the characteristics of stationary waves, and the effect of these on the friction factor and sediment transport. Three experiments, in three flumes with different geometric characteristics, were executed for the needs of Kennedy’s survey. More specifically, fine sand of 0.233 mm and 0.549 mm was used in a 40-foot long and 0.875-foot wide flume, and 0.233 mm sand was used in a 60-foot long and 2.79-foot wide flume. These are the very same flumes utilized by Nomicos [61] and by Vanoni and Brooks [62] in their laboratory experiments. Yang considered 14, 13 and 14 sets of data, respectively, from Kennedy’s three experiments.

#### 4.1.4. Stein’s Data (1965)

Stein [64] executed experiments for the determination of total load and total bed materials by fractional sampling, static and dynamic dune properties, and head losses encountered by flow over an alluvial bed. Stein’s results showed that, in the presence of moving dunes, mean flow velocity appeared to be the decisive parameter for the determination of total load and total bed load. The experiments were conducted in a 100-foot long and 4-foot wide flume with a bed material of 0.4 mm. From the 73 runs of Stein, Yang [22] selected 42 sets of experimental data.

#### 4.1.5. Guy, Simons and Richardson’s Data (1966)

The primary purpose of Guy et al. [65] was to summarize and make available to the public the hydraulic and sediment data collected by Simons et al. [68] in a unique series of experiments at Colorado State University between 1956 and 1961. During these experiments, 339 equilibrium runs were executed in order to determine the effects of bed material size, water temperature, and fine sediment in the flow on the hydraulic and transport variables.

More than half (286 sets) of the 463 sets of data that Yang used for his unit stream power sediment transport formula were derived from the Guy et al. [65] survey. Ten sets of experiments with different conditions were conducted in two flumes, one 150-foot long and 8-foot wide and one 60-foot long and 2-foot wide, for fine sand beds with median particle diameters of the bed material ranging from 0.19 mm to 0.93 mm.

#### 4.1.6. Williams’ Data (1967)

Williams (1967) [66] used the coarsest sand (1.35 mm) of all the studies, in a 52-foot long and 1-foot wide laboratory flume, in order to study sediment transport in a series of 37 runs with bed forms ranging from an initial plane bed to antidunes. For the range of conditions examined, unique relationships were found between any two variables as long as depth was constant [66]. All 37 sets of data from Williams’ survey were included in Yang’s multiple regression.

#### 4.1.7. Schneider’s Data (1971)

Schneider’s data [67] constitute an exception compared to the rest of the surveys, as they are the only data that do not come from published research. Indeed, the authors observed that in several of Yang’s related publications from 1973 onwards, Schneider’s data are cited as “personal communication”. Yang [22] has published the values and ranges of the physical quantities of these data (i.e., 1.67–6.45 m/s for mean flow velocity, 18–17,152 ppm for total sediment concentration, etc.), yet the 31 sets of data of Schneider remain unknown. Despite appreciable efforts by the authors, it was not possible to obtain these data.

The data of the aforementioned surveys, which Yang used in 1973 to construct his well-known and widely used formula for the determination of total sediment concentration, are summarized in Table 1. Though all calculations in the mathematical part of this study were executed in International System (SI) units, the values in Table 1 are given in the original US customary units, so that they are more easily recognizable and comparable with previous literature. It must be highlighted that the values displayed in this table, whether they were obtained directly in the correct units or were first converted (i.e., lbs/ft/s into ppm for measured total sediment concentration, or °F into °C for temperature), are the exact values with no rounding whatsoever. This is noted because there are some slight or, in some cases, major differences from earlier publications.

Yang [22] used in his analysis only data in the sand size range 0.0625 mm < d < 2 mm. It is important to note that the particle size, d_{50}, is the median sieve diameter of the sediment, while Guy et al. [65] published their data in terms of fall diameter. According to Yang [22], the difference between these two measures of particle size is insignificant when either one is smaller than 0.4 mm. The fall diameter was converted by Yang into sieve diameter by means of Figure 7 of Report 12 of the Inter-Agency Committee on Water Resources (1957) [69]. The numbers shown in parentheses in Table 1 refer to the fall diameters for the coarse sand.

With the 31 sets of data of Schneider missing, 432 of the 463 sets of data were obtained, which corresponds to 93.3% of the total. As far as the authors are aware, this is the closest one can get to collecting the dataset upon which the unit stream power sediment transport equation of Yang was based. All the work presented in this study is based on this nearly complete dataset.
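The bookkeeping behind the 93.3% figure can be checked with a short Python tally of the per-survey counts quoted in Sections 4.1.1 to 4.1.7. The dictionary name and layout are illustrative assumptions; only the counts come from the text.

```python
# Sets of data used by Yang per survey, as quoted in the text above.
yang_sets = {
    "Nomicos (1956)": 12,
    "Vanoni and Brooks (1957)": 14,
    "Kennedy (1961)": 14 + 13 + 14,  # three experiments
    "Stein (1965)": 42,
    "Guy, Simons and Richardson (1966)": 286,
    "Williams (1967)": 37,
    "Schneider (1971)": 31,  # not available to the authors
}

total = sum(yang_sets.values())
available = total - yang_sets["Schneider (1971)"]
share = 100.0 * available / total

print(total, available, round(share, 1))  # prints "463 432 93.3"
```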

#### 4.2. Fuzzy Regression

A fuzzy set can be seen as a mapping from a general set X to the closed interval [0, 1]. A fuzzy set can be expressed by a membership function, which shows to what degree an element lies in the examined fuzzy set. A membership function is confined to the interval [0, 1], with a membership degree of 0 indicating that the element does not belong to the set and a membership degree of 1 indicating that the element fully belongs to the set. Accordingly, an object with a membership degree between 0 and 1 belongs to the set to some degree [37].

A fuzzy number is a fuzzy set which, furthermore, satisfies the properties of convexity and normality. It is defined on the axis of real numbers and its membership function is a piecewise continuous function [70].

The (soft) α-cut set of the fuzzy number A, with 0 < α ≤ 1, is defined as follows [71]:

$$A_{[\alpha]}=\left\{x\mid \mu_{A}(x)\ge \alpha,\; x\in R\right\} \qquad (7)$$

where μ_{A}(x) is the membership function of the fuzzy number A; and R is the set of real numbers.

An interesting point is that the crisp set including all the elements with non-zero membership function is the 0-strong cut, which can be defined as follows [72]:

$$A_{0^{+}}=\left\{x\mid \mu_{A}(x)>0,\; x\in R\right\} \qquad (8)$$

More analytically, according to Equation (8) above, the 0-strong cut is an open interval that does not contain its boundaries. For this reason, and in order to have a closed interval containing the boundaries, Hanss [73] suggested the term worst-case interval W, which is the union of the 0-strong cut and the boundaries [74].
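To make the notions of membership function, α-cut and worst-case interval concrete, the following Python sketch implements them for a symmetric triangular fuzzy number, the type of fuzzy number used for the regression coefficients later in this study. The representation as a (centre, spread) pair and the function names are assumptions made for illustration.

```python
def membership(x, centre, spread):
    """Membership degree of x in a symmetric triangular fuzzy number."""
    if spread <= 0.0:
        return 1.0 if x == centre else 0.0  # degenerate (crisp) number
    return max(0.0, 1.0 - abs(x - centre) / spread)

def alpha_cut(centre, spread, alpha):
    """Closed interval of all x with membership >= alpha, for 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    half = (1.0 - alpha) * spread
    return (centre - half, centre + half)

def worst_case_interval(centre, spread):
    """0-strong cut together with its two boundary points (closed interval)."""
    return (centre - spread, centre + spread)

print(membership(2.5, 2.0, 1.0))      # 0.5
print(alpha_cut(2.0, 1.0, 0.5))       # (1.5, 2.5)
print(worst_case_interval(2.0, 1.0))  # (1.0, 3.0)
```

For a triangular number the α-cuts shrink linearly from the worst-case interval (as α approaches 0) to the single central value (α = 1), which is why a centre and a semi-width suffice to describe it.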

Linear regression analysis is used to model the linear relationship between the independent variables and the dependent variable. Most of the data collected in the present study constitute independent variables, and the derived regression model should approximate the measurements of the dependent variable according to the criteria specified by the analyst. In the fuzzy linear regression model, the difference between the computed values and the actual values (measurements) is assumed to be due to the structure of the system. The proposed model carries this uncertainty back to its coefficients; in other words, our inability to construct a precise relationship is introduced directly into the model through the fuzzy parameters [75,76]. Based on the above reasoning, the coefficients of the independent variables are chosen to be fuzzy numbers. Moreover, this study deals with the case where both the input data (independent variables) and the output data (dependent variable) are crisp (classic) numbers. The problem of fuzzy linear regression is reduced to a linear programming problem according to the following steps [77]:
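The individual steps are not reproduced in this section. For orientation, a Tanaka-type fuzzy linear regression with symmetric triangular coefficients and crisp data is commonly written as the linear programme sketched below. The notation (centres a_j, semi-widths c_j, inclusion level h, m observations, n independent variables) is an assumption of this sketch and may differ from the notation of the original steps.

```latex
% Fuzzy linear model with symmetric triangular coefficients A_j = (a_j, c_j)
\tilde{Y}_i = \tilde{A}_0 + \tilde{A}_1 x_{i1} + \dots + \tilde{A}_n x_{in},
\qquad i = 1,\dots,m

% Objective: minimise the total semi-width of the fuzzy outputs
\min_{a_j,\,c_j}\; J = m\,c_0 + \sum_{j=1}^{n} c_j \sum_{i=1}^{m} \lvert x_{ij} \rvert

% Inclusion constraints at level h (0 \le h < 1), for every observation y_i
a_0 + \sum_{j=1}^{n} a_j x_{ij}
  - (1-h)\Bigl(c_0 + \sum_{j=1}^{n} c_j \lvert x_{ij} \rvert\Bigr) \le y_i

a_0 + \sum_{j=1}^{n} a_j x_{ij}
  + (1-h)\Bigl(c_0 + \sum_{j=1}^{n} c_j \lvert x_{ij} \rvert\Bigr) \ge y_i

% Non-negative semi-widths
c_j \ge 0, \qquad j = 0,\dots,n
```

Both the objective and the constraints are linear in the unknowns a_j and c_j, which is what allows the problem to be handed to a standard linear programming solver.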

#### 4.3. Implementation

For the formulation of the auxiliary variables X_{1}, X_{2}, X_{3}, X_{4} and X_{5}, several parameters had to be calculated. The fall (settling) velocity, ω, in the unit stream power equation of Yang, Equation (4), was determined by means of Zanke’s [78] formula:

$$\omega=\frac{11\,\nu}{d_{ch}}\left(\sqrt{1+0.01\,D_{*}^{3}}-1\right) \qquad (17)$$

where ν is the water kinematic viscosity (m^{2}/s); D* is the Bonnefille number; and d_{ch} is the characteristic grain diameter (m). The Bonnefille number, D*, is given by:

$$D_{*}=\left(\frac{\rho_{F}-\rho_{W}}{\rho_{W}}\,\frac{g}{\nu^{2}}\right)^{1/3} d_{ch} \qquad (18)$$

In the above relation, ${\rho}_{F}$ is the density of sediment (kg/m^{3}), ${\rho}_{W}$ is the density of water (kg/m^{3}) and g is the gravitational acceleration (m/s^{2}).

The kinematic viscosity, ν, of water is given by the equation:

where T (°C) is the temperature of the water.

The shear velocity, ${V}_{*}$, was determined by means of the following formula:

$${V}_{*}=\sqrt{g\,h\,S} \qquad (20)$$

where g is the gravitational acceleration (m/s^{2}); h (m) is the flow depth; and S is the energy slope (m/m). In Equation (20), the hydraulic radius is approximated by the flow depth. In the case of uniform flow, the energy slope equals the bed slope.
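The auxiliary hydraulic quantities described above can be computed with a few small helper functions, sketched here in Python. The fall velocity follows the commonly quoted form of Zanke's formula and the shear velocity uses the flow depth in place of the hydraulic radius, as stated in the text. The viscosity fit is a widely used empirical expression and is an assumption of this sketch, since the equation used in the study is not reproduced here. Function names and default densities are likewise illustrative.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def kinematic_viscosity(temp_c):
    """Kinematic viscosity of water (m^2/s); ASSUMED empirical fit in deg C."""
    return 1.78e-6 / (1.0 + 0.0337 * temp_c + 0.000221 * temp_c ** 2)

def bonnefille_number(d_ch, nu, rho_f=2650.0, rho_w=1000.0):
    """Dimensionless grain diameter D* for grain diameter d_ch (m)."""
    rel_density = (rho_f - rho_w) / rho_w
    return (rel_density * G / nu ** 2) ** (1.0 / 3.0) * d_ch

def zanke_fall_velocity(d_ch, nu, rho_f=2650.0, rho_w=1000.0):
    """Fall velocity (m/s) after the commonly quoted form of Zanke's formula."""
    d_star = bonnefille_number(d_ch, nu, rho_f, rho_w)
    return 11.0 * nu / d_ch * (math.sqrt(1.0 + 0.01 * d_star ** 3) - 1.0)

def shear_velocity(h, S):
    """Shear velocity (m/s), with the hydraulic radius approximated by depth h (m)."""
    return math.sqrt(G * h * S)

# Hypothetical run: 0.3 mm sand, 20 deg C water, 0.5 m depth, slope 0.001.
nu = kinematic_viscosity(20.0)
print(nu, zanke_fall_velocity(0.0003, nu), shear_velocity(0.5, 0.001))
```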

When the auxiliary variables X_{1}, X_{2}, X_{3}, X_{4} and X_{5}, Equation (21), are introduced into the fuzzified version of Yang’s equation, Equation (4), Equation (22) results:

The dependent variable is the logarithm of the concentration of the total load, C_{F}, which is also produced as a fuzzy symmetric triangular number. By introducing the above auxiliary variables X_{1} to X_{5} for numeric data, the problem of non-linear fuzzy regression is reduced to a linear fuzzy regression problem. In the fuzzy linear regression model, the coefficients of the independent variables are fuzzy numbers, which were determined using MATLAB.
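The linearisation described above can be illustrated with a short Python sketch: the five auxiliary variables are built from the logarithmic groups listed in Section 4.1, and a set of symmetric triangular coefficients then yields a centre and a semi-width for log C_{F}. In the example, the centres are set to the crisp constants commonly quoted for Yang's formula, and the spreads are hypothetical placeholders, not results of this study. Python is used here only for illustration; it is not the tool used by the authors.

```python
import math

def auxiliary_variables(V, S, omega, d50, nu, v_star, V_cr):
    """Return [X1, ..., X5] for one run (SI units); requires V > V_cr."""
    x1 = math.log10(omega * d50 / nu)
    x2 = math.log10(v_star / omega)
    x3 = math.log10((V * S - V_cr * S) / omega)
    return [x1, x2, x3, x1 * x3, x2 * x3]

def fuzzy_log_concentration(centres, spreads, x):
    """Centre and semi-width of log10(C_F); coefficient lists start with the intercept."""
    row = [1.0] + list(x)
    centre = sum(a * v for a, v in zip(centres, row))
    semi_width = sum(c * abs(v) for c, v in zip(spreads, row))
    return centre, semi_width

def band_ppm(centre, semi_width):
    """Lower bound, central value and upper bound of the concentration (ppm)."""
    return (10.0 ** (centre - semi_width), 10.0 ** centre, 10.0 ** (centre + semi_width))

def is_included(log_measured, centre, semi_width, h=0.0):
    """True if a measured log-concentration lies inside the band at level h."""
    return abs(log_measured - centre) <= (1.0 - h) * semi_width

# Centres: commonly quoted crisp constants (assumed); spreads: HYPOTHETICAL values.
centres = [5.435, -0.286, -0.457, 1.799, -0.409, -0.314]
spreads = [0.1, 0.0, 0.0, 0.0, 0.0, 0.0]

x = auxiliary_variables(1.0, 0.001, 0.03, 0.0003, 1.0e-6,
                        math.sqrt(9.81 * 0.5 * 0.001), 0.3)
centre, semi_width = fuzzy_log_concentration(centres, spreads, x)
print(band_ppm(centre, semi_width))
```

Because the band is symmetric in log space, it is asymmetric in ppm: the upper bound lies further from the central value than the lower bound.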

Furthermore, a simplified version of Yang’s Equation (4), which contains only the exerted unit stream power minus the critical unit stream power, is investigated:

A criterion for the success of this simplification is the uncertainty produced. More specifically, if the uncertainty increases significantly, this indicates an unreasonable simplification (underfitting behavior).

## 6. Conclusions

The objective of this research is to transform the arithmetic coefficients of the total sediment transport rate formula of Yang into fuzzy numbers, and thus create a fuzzy relationship that provides a fuzzy band of in-stream sediment concentration. A very large set of experimental data from flumes was used for the fuzzy regression analysis. The reason for selecting fuzzy regression is that it provides a fuzzy band not only for the coefficients of the independent variables, but also for the final result, which is the total sediment concentration. This means that the resulting sediment concentration is not a crisp value, but a range of values that extends by the semi-width on both sides of the central value. The results show that this range of values deals efficiently with the uncertainties and the ambiguous nature of sediment transport processes. Apart from the measurement errors, the computational part, and specifically the physical simplifications (i.e., one-dimensional, uniform and steady flow, grain size distribution, etc.), increases the uncertainty. An interesting perspective is that even if the validation data are observations from natural rivers, where significant uncertainty and simplifications are present, these are successfully captured by the proposed fuzzy band. A minimum advantage of the fuzzy band produced is that all the data are included. However, a main criterion is the width of the fuzzy band produced. Based on this criterion, the authors concluded that the simplification of using only one variable should be avoided, and furthermore that a determinant variable is the difference between the exerted unit stream power and the critical unit stream power (X_{3}). Nevertheless, a simplification based on X_{3} (i.e., only the variable X_{3} is taken into account) leads to a fuzzy linear curve that can be used to interpret the phenomenon.
The produced fuzzy band, compared with the central values, indicates the good performance of the proposed fuzzy curve. In terms of elaboration of the original data utilized by Yang for the establishment of the unit stream power theory, this research comes as close as possible to what could be called a “fuzzy twin” of Yang’s stream sediment transport formula.