Hotel Comment Emotion Classification Based on the MF-DFA and Partial Differential Equation Classifier

Duanzhu, Sangjie; Wang, Jian; Jia, Cairang

doi:10.3390/fractalfract7100744


by Sangjie Duanzhu^1,2,3,4, Jian Wang^5,* and Cairang Jia^1,2,3,4,*

¹ School of Computer, Qinghai Normal University, Xining 810016, China
² Qinghai Province Tibetan Information Processing Engineering Technology Research Center, Xining 810008, China
³ Key Laboratory of Tibetan Information Processing and Machine Translation of Qinghai Province, Xining 810008, China
⁴ State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008, China
⁵ School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, China
* Authors to whom correspondence should be addressed.

Fractal Fract. 2023, 7(10), 744; https://doi.org/10.3390/fractalfract7100744

Submission received: 14 July 2023 / Revised: 7 October 2023 / Accepted: 8 October 2023 / Published: 9 October 2023


Abstract:

Due to the significant value that hotel reviews hold for both consumers and businesses, the development of an accurate sentiment classification method is crucial. By effectively distinguishing the authenticity of reviews, consumers can make informed decisions, and businesses can gain insights into customer feedback to improve their services and enhance overall competitiveness. In this paper, we propose a partial differential equation model based on the phase field for sentiment analysis of hotel comment texts. The comment texts are converted into word vectors using the Word2Vec tool, and then we utilize the multifractal detrended fluctuation analysis (MF-DFA) model to extract the generalized Hurst exponents of the word vector time series to achieve dimensionality reduction of the word vector data. The dimensionality-reduced data are represented in a two-dimensional computational domain, and the modified Allen–Cahn (AC) equation is used to evolve the phase values of the data to obtain a stable nonlinear boundary, thereby achieving automatic classification of hotel comment texts. The experimental results show that the proposed method can effectively classify positive and negative samples and achieves excellent results on the classification indicators. We compared our proposed classifier with traditional machine learning models, and the results indicate that our method performs better.

Keywords:

emotion classification; MF-DFA; Hurst exponent; Allen–Cahn

1. Introduction

Natural Language Processing (NLP) is an important direction in computer science and artificial intelligence, aiming to explore theories and methods that enable effective communication between humans and computers in natural language [1,2]. NLP has been used by many scholars to uncover emotional bias in texts, especially company press releases and consumer online comments [3,4,5,6]. For the detailed development of NLP, please refer to the reviews [7,8]. Online reviews convey customers' feelings to companies and to other customers. In order to obtain valuable knowledge about customers, it is important to collect and analyze online reviews. Sann and Lai [9] adopted NLP to synthesize hotel complaints about specific service failure items, which can help hotels improve their service. Jiao and Qu [10] proposed a computerized method that can extract Kansei knowledge from online reviews. Sharma et al. [11] presented a hybrid three-layer NLP framework based on a genetic algorithm and ontology, which performed well in online opinion mining. Guerrero-Rodriguez et al. [12] showed that online travel reviews can help tourists judge which places are viewed positively and which negatively. Additionally, the double hierarchy linguistic representation model based on NLP has developed rapidly. However, the model cannot handle missing information. Therefore, the interval probabilistic double hierarchy linguistic EDAS method was proposed by Wang et al. [13]. Using NLP to uncover emotional biases in text can effectively provide management with authentic evaluations of hotel services, thereby guiding optimization efforts. Cai et al. [14] proposed a new mathematical model as a hybrid robust stochastic approach to maximize the expected profit in compressed air energy systems. Saeedi et al. [15] proposed a robust optimization method for modeling cooling demand uncertainty, aiming to obtain robust chiller loadings in the uncertain environment of providing cooling demand in a multi-chiller system. Liu et al. [16] adopted a novel approach to achieve optimal management of smart parking lots in uncertain environments and provide optimal bidding curves for participating in the electricity market. Mehrpooya et al. [17] evaluated the performance of a new combined energy system composed of different devices at various parameters, using multi-objective optimization to obtain the best performance of the developed hybrid system. Jiang et al. [18] proposed a stochastic optimization strategy that considers demand response for participation in energy market operations. Han and Ghadimi [19] employed a hybrid approach based on convolutional neural networks and extreme learning machine networks, with the model optimized by a novel meta-heuristic algorithm.

Multifractal theories have a strong ability to handle nonlinear problems, which has attracted much attention. They have therefore been widely adopted in the analysis of markets and prices in economics and finance [20,21]. The complexity and uncertainty of the market can be characterized by analyzing the multifractal characteristics of financial time series. Meanwhile, multifractal theories can be used to investigate the volatility of stock prices, which can help develop effective risk management strategies [22,23]. In addition, multifractal theory has excellent dimensionality reduction capability, which has been widely applied to feature extraction [24,25]. In general, the Hurst exponent is used to characterize the multifractal features. Additionally, the multifractal method is used in conjunction with other methods to improve classification accuracy. The Hurst exponent, Lempel–Ziv information and Shannon entropy were combined as feature vectors in a support vector machine to distinguish normal heart sounds from pathological murmurs [26]. Abbasi et al. [27] extracted Hurst exponent and autoregressive moving average features from EEG signals and fed the feature vector into a long short-term memory classifier, which achieved high accuracy. In this work, the MF-DFA model is used to handle the time series information within textual data. This model finds extensive applications in time series analysis and is capable of extracting long-range correlations and nonlinear features within sequences. By transforming textual data into word vector time series and applying MF-DFA, we can better capture crucial patterns and features within sentiment text, leading to more accurate classification.

The phase field model is a mathematical model from materials science and physics that describes the relationship between the microstructure and macroscopic behavior of a material, and it can be used to describe a two-phase interface. In 1979, Allen and Cahn [28] introduced the Allen–Cahn (AC) equation to describe the process of phase separation in a binary alloy. It has since become one of the most widely studied phase field models and finds extensive applications in image segmentation [29], shape transformation [30], denoising [31], etc. In the field of volume reconstruction, Li et al. [32] developed a computational method for weighted 3D volume reconstruction from a series of slice data based on the modified AC equation with a fidelity term. Liu et al. [33] proposed a novel variational model by integrating the AC term with a local binary fitting energy term, aiming to segment images with intensity inhomogeneity and noise. In terms of theoretical research, to ensure unconditional stability of the proposed AC equation with logarithmic free energy, Park et al. [34] employed interpolation and finite difference methods to solve the splitting terms. Choi and Kim [35] introduced a novel conservative AC equation with new Lagrange multipliers, and provided corresponding numerical methods that maintain the maximum principle and ensure unconditional stability. Zhang and Yang [36] presented a numerical method for solving the spatial fractional Allen–Cahn equation. Their approach used the Crank–Nicolson method for time discretization and the second-order weighted and shifted Grünwald difference formula for spatial discretization. Because of the phase separation it describes, the phase field model can form an interface that acts like a separating hyperplane and, notably, this interface is nonlinear. Therefore, based on the phase field model, novel classifiers were constructed by Wang et al. [37,38], which had an excellent classification performance. Because the phase field model generates stable nonlinear boundaries during its evolution, it is well suited to the complex emotional expressions and contexts that are often found in textual data in sentiment classification tasks. Textual data typically contain intricate emotional nuances that are challenging to capture with simple linear boundaries. The nonlinear characteristics of the phase field model make it more suitable for handling these intricate emotional boundaries, which can give a better classification performance than models with linear boundaries.

In this paper, we propose a classifier based on MF-DFA and phase field modeling for effective positive and negative sentiment classification of hotel review texts. By utilizing NLP to mine emotional biases in the text, we transform hotel reviews into word vectors. We then use the MF-DFA model to extract the generalized Hurst exponent of the word vector time series, reducing the data representation to a two-dimensional space. The phase field model, due to its characteristics of two-phase flow, can achieve stable nonlinear boundaries during model evolution. Therefore, compared to other classifiers that can only form linear boundaries, it exhibits a better classification performance, effectively classifying positive and negative samples to a certain extent. This makes it more suitable for a sentiment analysis of hotel review texts. In the body of the paper, we also compare our proposed model with other methods.

The remainder of the paper is structured as follows. We introduce the methodology in Section 2. Section 3 presents the data information. Section 4 contains the empirical results. Section 5 reports the conclusion.

2. Methodology

Based on the characteristics of hotel comment text, this study proposes the hotel comment sentiment analysis model shown in Figure 1. The proposed algorithm mainly includes the following modules: preprocessing, word vector acquisition, word vector feature extraction, and phase-field classification. Below, we describe the two main methods used in this article.

2.1. MF-DFA

MF-DFA is adopted to explore the fractal characteristic of the time series, which has been widely applied in finance [39,40], medicine [41,42], etc. The detailed calculation about MF-DFA is presented as follows.

Firstly, construct a mean-removed sum series $T(i)$ for the time series $\{t_k\}$, the length of which is $N$:

$$T(i) = \sum_{k=1}^{i} (t_k - \bar{t}), \quad i = 1, 2, \dots, N, \tag{1}$$

where $\bar{t}$ is the mean of the series $\{t_k\}$.

Next, divide the new series $T(i)$ into $N_s$ disjoint intervals of length $s$; this changes the time scale. Here, $N_s = \lfloor N/s \rfloor$ is $N/s$ rounded down to an integer. Because $N$ is generally not a multiple of $s$, a short part of $T(i)$ would be left over at one end. In order to reduce this loss of information, $T(i)$ is divided twice, once with $i$ running from small to large and once from large to small, so that a total of $2N_s$ intervals is obtained.

The following step is to fit a polynomial to the series in each interval by the least squares method, in order to remove the trend of each series segment:

$$y_v(i) = \alpha_1 i^m + \alpha_2 i^{m-1} + \dots + \alpha_m i + \alpha_{m+1}, \quad i = 1, 2, \dots, s; \; m = 1, 2, \dots, \tag{2}$$

where $v$ is the index of the interval, $y_v(i)$ is the polynomial fitted in the $v$-th interval, and $m$ is the order of the polynomial.

Then, calculate the mean square error $f^2(s, v)$ of each interval. When $v = 1, 2, \dots, N_s$,

$$f^2(s, v) = \frac{1}{s} \sum_{i=1}^{s} \{T[(v-1)s + i] - y_v(i)\}^2. \tag{3}$$

When $v = N_s + 1, N_s + 2, \dots, 2N_s$,

$$f^2(s, v) = \frac{1}{s} \sum_{i=1}^{s} \{T[N - (v - N_s)s + i] - y_v(i)\}^2. \tag{4}$$

Averaging the $f^2(s, v)$ obtained from all the segments gives the $q$-th order fluctuation function $F_q(s)$:

$$F_q(s) = \begin{cases} \left\{\dfrac{1}{2N_s} \sum_{v=1}^{2N_s} [f^2(s, v)]^{q/2}\right\}^{1/q}, & q \neq 0, \\ \exp\left\{\dfrac{1}{4N_s} \sum_{v=1}^{2N_s} \ln [f^2(s, v)]\right\}, & q = 0. \end{cases} \tag{5}$$

The $q = 0$ case is the limit of the first expression as $q \to 0$, which is where the factor $1/(4N_s)$ comes from. $F_q(s)$ is a function of the dividing length $s$ and the fractal order $q$. It follows the power law $F_q(s) \propto s^{H(q)}$, where $H(q)$ is called the generalized Hurst exponent and is estimated as the slope of $\log F_q(s)$ against $\log s$. When $q$ equals 2, $H(2)$ is the standard Hurst exponent and the procedure reduces to the standard DFA.
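The steps of Equations (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes linear detrending ($m = 1$), takes the $q = 0$ branch as the $q \to 0$ limit of the $q \neq 0$ branch, and estimates $H(q)$ as the least-squares slope of $\log F$ against $\log s$. The function names, the list of scales, and the synthetic Gaussian series standing in for a word vector sequence are all assumptions made for the example.

```python
import math
import random


def residual_variance(segment):
    """Mean squared residual after removing a least-squares line (m = 1)."""
    s = len(segment)
    x_mean = (s + 1) / 2.0
    y_mean = sum(segment) / s
    sxx = sum((i - x_mean) ** 2 for i in range(1, s + 1))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(segment, start=1))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return sum((y - (slope * i + intercept)) ** 2
               for i, y in enumerate(segment, start=1)) / s


def fluctuation(series, s, q):
    """q-th order fluctuation function for segment length s."""
    n = len(series)
    mean = sum(series) / n
    profile, total = [], 0.0
    for value in series:  # Equation (1): mean-removed cumulative sum
        total += value - mean
        profile.append(total)
    n_s = n // s
    # N_s segments from the start and N_s from the end: 2 * N_s in total
    starts = [v * s for v in range(n_s)] + [n - (v + 1) * s for v in range(n_s)]
    f2 = [residual_variance(profile[a:a + s]) for a in starts]
    if q == 0:
        return math.exp(sum(math.log(f) for f in f2) / (2.0 * len(f2)))
    return (sum(f ** (q / 2.0) for f in f2) / len(f2)) ** (1.0 / q)


def generalized_hurst(series, q, scales):
    """Slope of log F(s) against log s, used as the estimate of H(q)."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(series, s, q)) for s in scales]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic stand-in for one word vector time series (assumption).
rng = random.Random(0)
series = [rng.gauss(0.0, 1.0) for _ in range(2048)]
scales = [16, 32, 64, 128, 256]
features = [generalized_hurst(series, q, scales) for q in (-6, 0, 6)]
```

For uncorrelated noise such as this synthetic series, the estimate at $q = 2$ should lie near 0.5. The three values in `features` play the role of the three-dimensional feature vector used later in the paper.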

2.2. Phase-Field Model

Based on the energy minimization theory, to handle binary classification problems using the modified Allen–Cahn equation, we minimize the following energy functional:

$$E(\mathbf{s}(\mathbf{x}, t)) = \int_{\Omega} \sum_{k=1}^{2} \left[\frac{H(s_k(\mathbf{x}, t))}{\epsilon^2} + \frac{1}{2} |\nabla s_k(\mathbf{x}, t)|^2 + \frac{\lambda}{2} (s_k(\mathbf{x}, t) - h_k(\mathbf{x}))^2\right] d\mathbf{x}, \tag{6}$$

where $\mathbf{x} = (x, y)$ represents the spatial coordinates, $H(s_k) = 0.25 s_k^2 (s_k - 1)^2$ denotes the double-well potential energy function, $\epsilon$ represents the interface energy parameter, and $\lambda$ is a fidelity parameter. In this paper, $h_k(\mathbf{x})$ represents a fitting term that conforms to the given data while satisfying the constraint $\sum_{k=1}^{2} h_k(\mathbf{x}) = 1$. The vector $\mathbf{s} = (s_1, s_2)$ abides by the constraint $\sum_{k=1}^{2} s_k(\mathbf{x}, t) = 1$, and we write $H'(\mathbf{s}) = (H'(s_1), H'(s_2))$ and $\mathbf{h} = (h_1, h_2)$. Minimizing the free energy functional in Equation (6) by the gradient flow in the sense of $L^2$ gives the following modified Allen–Cahn equation, which can be used for binary classification, for $k = 1, 2$:

$$\frac{\partial s_k(\mathbf{x}, t)}{\partial t} = -\frac{H'(s_k(\mathbf{x}, t))}{\epsilon^2} + \Delta s_k(\mathbf{x}, t) - \lambda (s_k(\mathbf{x}, t) - h_k(\mathbf{x})) - \zeta(\mathbf{s}(\mathbf{x}, t)), \tag{7}$$

where $\zeta(\mathbf{s}(\mathbf{x}, t)) = -\sum_{k=1}^{2} H'(s_k(\mathbf{x}, t)) / (2\epsilon^2)$ enforces the constraint $s_1(\mathbf{x}, t) + s_2(\mathbf{x}, t) = 1$. We use the zero Neumann boundary condition $\mathbf{n} \cdot \nabla s_k(\mathbf{x}, t) = 0$ for $\mathbf{x} \in \partial\Omega$, $t > 0$, where $\mathbf{n}$ is the unit outward normal vector to $\partial\Omega$. We discretize the two-dimensional space $\Omega = (L_x, R_x) \times (L_y, R_y)$ to obtain the discrete computational domain $\Omega_h = \{(x_i, y_j) \mid x_i = L_x + (i - 0.5)h, \; y_j = L_y + (j - 0.5)h, \; 1 \leq i \leq N_x, \; 1 \leq j \leq N_y\}$, where $N_x$ and $N_y$ are positive integers and $h$ is the uniform mesh size. Let $s_{k(ij)}^{n}$ denote the numerical approximation of $s_k(x_i, y_j, n\Delta t)$ with a time step of $\Delta t$. We split the modified Allen–Cahn equation into the following two equations for $k = 1, 2$:

$$\frac{\partial s_k(\mathbf{x}, t)}{\partial t} = -\frac{H'(s_k(\mathbf{x}, t))}{\epsilon^2} + \Delta s_k(\mathbf{x}, t) - \zeta(\mathbf{s}(\mathbf{x}, t)), \tag{8}$$

$$\frac{\partial s_k(\mathbf{x}, t)}{\partial t} = \lambda [h_k(\mathbf{x}) - s_k(\mathbf{x}, t)]. \tag{9}$$
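As a short check, not taken from the paper, one can sum each of the two split equations over $k = 1, 2$ to see why the constraint on $\mathbf{s}$ survives the splitting. The first pair of lines below sums Equation (8) and uses the definition of $\zeta$; the last line sums Equation (9) and uses $h_1 + h_2 = 1$.

```latex
\begin{align*}
\frac{\partial (s_1 + s_2)}{\partial t}
  &= -\frac{1}{\epsilon^2} \sum_{k=1}^{2} H'(s_k) + \Delta (s_1 + s_2) - 2\,\zeta(\mathbf{s}) \\
  &= -\frac{1}{\epsilon^2} \sum_{k=1}^{2} H'(s_k) + \Delta (s_1 + s_2)
     + \frac{1}{\epsilon^2} \sum_{k=1}^{2} H'(s_k)
   = \Delta (s_1 + s_2), \\
\frac{\partial (s_1 + s_2)}{\partial t}
  &= \lambda \left[ (h_1 + h_2) - (s_1 + s_2) \right]
   = \lambda \left[ 1 - (s_1 + s_2) \right].
\end{align*}
```

If $s_1 + s_2 = 1$ holds initially, then $\Delta(s_1 + s_2) = 0$ in the first substep and the right-hand side of the second substep vanishes, so both substeps leave $s_1 + s_2 = 1$ unchanged.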

To solve Equation (8) numerically, we employ the explicit Euler method:

$$\frac{s_{k(ij)}^{n+\frac{1}{2}} - s_{k(ij)}^{n}}{\Delta t} = -\frac{H'(s_{k(ij)}^{n})}{\epsilon^2} + \Delta_h s_{k(ij)}^{n} + \sum_{k=1}^{2} \frac{H'(s_{k(ij)}^{n})}{2\epsilon^2}, \tag{10}$$

where $\Delta_h$ is the discrete Laplace operator. Next, starting from $s_{k(ij)}^{n+\frac{1}{2}}$, we solve the fidelity Equation (9) using the implicit Euler method, which can be rewritten as follows:

$$s_{k(ij)}^{n+1} = \frac{s_{k(ij)}^{n+\frac{1}{2}} + \lambda \Delta t \, h_{k(ij)}}{1 + \lambda \Delta t}. \tag{11}$$

Although Equation (11) comes from an implicit method, it is a pointwise closed-form update, so the overall scheme is explicit: we do not need to solve a system of discrete equations, which results in faster computations. However, to ensure the stability of the scheme, the time step must satisfy the constraint $\Delta t < 0.25 h^2$ [43].
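The two-step update of Equations (10) and (11) can be sketched as follows. This is a hedged illustration in plain Python, not the authors' code. Several choices are assumptions made for the example: the five-point stencil for the discrete Laplacian, the way the fitting term is built (1 in cells holding a positive sample, 0 in cells holding a negative sample, 0.5 elsewhere), the rule that a point is predicted positive when $s_1 > 0.5$ in its cell, the small 20 by 20 grid, and the four made-up training points.

```python
import math


def dH(s):
    """H'(s) for the double-well potential H(s) = 0.25 s^2 (s - 1)^2."""
    return s * (s - 1.0) * (s - 0.5)


def laplacian(u, i, j, h):
    """Five-point Laplacian; missing neighbours mirror the cell (zero Neumann)."""
    n_x, n_y = len(u), len(u[0])
    west = u[i - 1][j] if i > 0 else u[i][j]
    east = u[i + 1][j] if i < n_x - 1 else u[i][j]
    south = u[i][j - 1] if j > 0 else u[i][j]
    north = u[i][j + 1] if j < n_y - 1 else u[i][j]
    return (west + east + south + north - 4.0 * u[i][j]) / (h * h)


def step(s, fit, h, eps, lam, dt):
    """Advance both phases by one time step: Equation (10), then Equation (11)."""
    n_x, n_y = len(s[0]), len(s[0][0])
    new = [[[0.0] * n_y for _ in range(n_x)] for _ in range(2)]
    for i in range(n_x):
        for j in range(n_y):
            # -zeta(s) = sum_k H'(s_k) / (2 eps^2), shared by both phases
            minus_zeta = (dH(s[0][i][j]) + dH(s[1][i][j])) / (2.0 * eps ** 2)
            for k in range(2):
                half = s[k][i][j] + dt * (
                    -dH(s[k][i][j]) / eps ** 2
                    + laplacian(s[k], i, j, h)
                    + minus_zeta
                )
                # implicit Euler for the fidelity equation, in closed form
                new[k][i][j] = (half + lam * dt * fit[k][i][j]) / (1.0 + lam * dt)
    return new


def cell(x, n):
    """Index of the cell of width 1/n that contains coordinate x in (0, 1)."""
    return min(int(x * n), n - 1)


def predict(s, x, y):
    """Assumed decision rule: positive (1) where s_1 > 0.5, negative (0) otherwise."""
    return 1 if s[0][cell(x, len(s[0]))][cell(y, len(s[0][0]))] > 0.5 else 0


# Toy run on a coarse grid with made-up training points (x, y, label).
n = 20
h = 1.0 / n
eps = h / (2.0 * math.sqrt(2.0) * math.atanh(0.9))
lam = 4500.0
dt = 0.12 * h * h
train = [(0.2, 0.3, 1), (0.3, 0.7, 1), (0.8, 0.4, 0), (0.7, 0.8, 0)]

fit1 = [[0.5] * n for _ in range(n)]
for x, y, label in train:
    fit1[cell(x, n)][cell(y, n)] = float(label)
fit = [fit1, [[1.0 - v for v in row] for row in fit1]]
s = [[row[:] for row in fit[0]], [row[:] for row in fit[1]]]
for _ in range(50):
    s = step(s, fit, h, eps, lam, dt)
```

Because every cell is updated from values at the previous time level only, the loop needs no linear solver. A vectorized array implementation would be preferable for the 100 by 100 grid used in the experiments; nested lists are used here only to keep the sketch dependency-free.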

3. Data Collection

The hotel comment sentiment analysis corpus used in this article is the public corpus from Tan et al. [44]. It contains 1500 comments, of which 728 are positive and 772 are negative. We take 80% of these samples as the training set and the rest as the test set.
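The split sizes follow directly from the counts above: 80% of 1500 comments gives 1200 training samples and 300 test samples. The snippet below is a small sketch of one way to produce such a split. The paper does not say how the samples were assigned, so the random shuffle, the seed, and the variable names are assumptions.

```python
import random

# 728 positive (1) and 772 negative (0) comments, as reported for the corpus.
labels = [1] * 728 + [0] * 772

indices = list(range(len(labels)))
random.Random(0).shuffle(indices)  # assumed: a seeded random shuffle

n_train = (4 * len(indices)) // 5  # 80% of the samples
train_idx = indices[:n_train]
test_idx = indices[n_train:]
```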

Firstly, we preprocess the hotel text and segment it with the “lcut()” function of the jieba library in Python. We use jieba's precise segmentation mode, which separates the text accurately without redundant words. Then, the preprocessed hotel comments are used as the training set for the Word2Vec tool, which produces a word vector time series of a given length for each comment. Next, we use the MF-DFA method to extract the generalized Hurst exponents of the word vector time series, and finally use the generalized Hurst exponents in the phase-field classifier model for sentiment classification of hotel texts.

4. Experimental Results

The experimental environment is based on Python 3.8 and MATLAB R2020a with a computer with an Intel(R) Core(TM) i5-4430 CPU 3.00 GHz processor, using the Windows 10 operating system.

We first convert the 1500 hotel comment texts into 1500 word vectors using the Word2Vec model. Figure 2 shows the word vector time series of one positive hotel comment and of one negative hotel comment. Both are unstable fluctuating sequences with a large amount of noise. Viewed as raw sequences, the two types of texts show no significant difference. Therefore, we use the MF-DFA method to extract features from the time series to reduce data dimensions and improve classification accuracy.

We reduce the dimensionality of the data to three by extracting the generalized Hurst exponents $H(-6)$, $H(0)$, and $H(6)$. For ease of representation, we plot these three feature values in pairs on a two-dimensional plane and normalize the data to the computational domain $\Omega = (0, 1)^2$. Next, we use the phase-field model classifier to train on the training set and obtain a nonlinear dividing line suited to the distribution of the data. The parameter settings here are: $N_x = N_y = 100$, $h = 1/N_x$, $\epsilon = h / (2\sqrt{2}\,\operatorname{atanh}(0.9))$, $\lambda = 4500$, $\alpha = 0.12$, and $\Delta t = \alpha h^2$.
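To make these settings concrete, the short snippet below evaluates them numerically and checks the time step against the stability constraint $\Delta t < 0.25 h^2$ stated for the scheme. The variable names are chosen for the example; only the values come from the text.

```python
import math

n_x = n_y = 100
h = 1.0 / n_x                                        # mesh size
eps = h / (2.0 * math.sqrt(2.0) * math.atanh(0.9))   # interface parameter
lam = 4500.0                                         # fidelity parameter
alpha = 0.12
dt = alpha * h ** 2                                  # time step

# Stability constraint of the explicit scheme.
is_stable = dt < 0.25 * h ** 2

# Cell centres x_i = L_x + (i - 0.5) h on (0, 1), for i = 1, ..., N_x.
x_centres = [(i - 0.5) * h for i in range(1, n_x + 1)]
```

With $\alpha = 0.12$ the time step is a little under half of the stability limit, so the constraint holds with some margin.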

We first investigate the formation of the classification interface under the phase-field model when the Hurst exponent combination is $H(-6)$ and $H(0)$. As shown in Figure 3, the classifier model adapts well to the distribution of the initial data. Figure 3a shows the data distribution of positive and negative samples of hotel comments, and Figure 3b–d show the evolution of the phase of the data points in Figure 3a over time, which reaches a stable state in Figure 3d. Figure 3d is a filled two-dimensional contour plot of the phase values at the final stage of the phase-field evolution, generated using the contourf function in MATLAB R2020a. Figure 3e is a contour plot of the same matrix of phase values, created using the contour function in MATLAB, and its boundaries between the two phase values are consistent with those in Figure 3d. Using the blue classification boundary in Figure 3e, which was trained on the training set, we can evaluate the classification performance of the proposed classifier on the test set. Figure 3f shows the classification results for the test set. It can be seen from Figure 3e that the dividing line under the phase-field model is nonlinear and can effectively separate texts of different types. In contrast, the DNN and SVM models used for comparison have linear classification interfaces, which makes it easy for text data with unclear boundaries to be misclassified.

Next, we change the Hurst exponent combination to $H(0)$ and $H(6)$, and use the same parameters as before. As shown in Figure 4, the classifier model can still effectively classify the hotel comment data. Because the initial data distribution differs from that in Figure 3, the blue classification boundary formed in Figure 4e differs as well. However, the panels are generated in the same way as in Figure 3. In Figure 4f, we use the generated classification boundary to classify the test set.

Finally, we also observe the phase-field evolution process under the combination of $H(-6)$ and $H(6)$; the classification process is shown in Figure 5. After 350 iterations, the phase values are drawn with the contourf function in MATLAB as a filled two-dimensional contour plot in Figure 5d, where a stable classification boundary has formed between the different phase values. We plot the blue classification boundary in Figure 5e and classify the test set based on this boundary in Figure 5f.

Next, we calculate classification metrics to evaluate the classification effectiveness in the above three scenarios. Accuracy, Precision, Recall, and $F_1$ Score are used as the indicators to evaluate the classification results in this study, and the calculation formulas are as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}, \tag{12}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{13}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{14}$$

$$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{15}$$

For a binary classification problem, where instances are divided into positive or negative classes, the following four situations may occur in actual classification:

I. If an instance is positive and predicted to be positive, it is a true positive (TP);
II. If an instance is positive but predicted to be negative, it is a false negative (FN);
III. If an instance is negative but predicted to be positive, it is a false positive (FP);
IV. If an instance is negative and predicted to be negative, it is a true negative (TN).
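The four indicators of Equations (12) to (15) can be computed from these four counts as in the sketch below. The function name and the example counts are made up for illustration and are not results from the paper. The function also assumes that none of the denominators is zero.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, Precision, Recall and F1 from the four confusion counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a test set of 100 samples.
example = classification_metrics(tp=40, fn=10, fp=5, tn=45)
```

For these hypothetical counts, Accuracy is 85/100, Precision is 40/45, Recall is 40/50, and $F_1$ is 16/19.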

The classification metrics in Table 1 show that there is no significant difference in the classification results of our model under the three different combinations of Hurst exponents. This further indicates that the three Hurst exponents extracted by MF-DFA retain the main features of the original word vector, and that all three feature values are effective. Excellent results have been achieved on all the indicators of Accuracy, Precision, Recall, and $F_1$ Score.

To show the advantage of the nonlinear interface formed by our proposed model in hotel comment text classification, we compared the model with DNN and SVM classifiers that have linear classification interfaces. As shown in Table 2, the classification indicators of hotel comment texts under the phase-field classifier are superior to those of the two traditional classifier models. However, our model needs a large number of grid points because the space must be discretized. Therefore, it is suited to low-dimensional data and is prone to the curse of dimensionality for high-dimensional data.

5. Conclusions

This paper studied the sentiment classification of hotel comment texts. Firstly, the original text data were preprocessed, and the texts were segmented using the jieba tool. Then, each text was converted into a word vector time series using Word2Vec. For these unstable sequences with weak features, we adopted the MF-DFA method for feature extraction, reducing each word vector sequence from a length of 150 to a length of 3. We then combined the three feature values in pairs in the two-dimensional computational domain, used the phase-field model to evolve the phase values of the data, and finally obtained a nonlinear classification interface that adapts to the distribution of the data in the computational domain. Experimental verification on the test set showed that, in all three classification scenarios, the classifier forms a stable boundary and effectively classifies the positive and negative emotions of hotel comment texts. To demonstrate the advantages of the nonlinear classification interface, we calculated the classification indicators for the same data using DNN and SVM, and the results showed that our model performed better. Since the phase-field model needs to assign phase values on a discrete spatial grid, the dimension of the data used for calculation cannot be too large, and discretizing the equation in more than three dimensions is difficult. In future research, we will focus on improving the robustness of the classification algorithm on high-dimensional data, improving the classification accuracy, and extending the algorithm to other kinds of online comments.

Author Contributions

Conceptualization, S.D. and J.W.; methodology, J.W.; software, S.D.; validation, S.D., J.W. and C.J.; formal analysis, S.D.; investigation, S.D.; resources, C.J.; data curation, J.W.; writing—original draft preparation, S.D.; writing—review and editing, J.W.; visualization, J.W.; supervision, C.J.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

The author Jian Wang expresses thanks for the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 22KJB110020).

Data Availability Statement

Data will be made available on request.

Acknowledgments

We would like to extend our deepest gratitude to the reviewers for their invaluable comments and suggestions that greatly improved the quality of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Hirschberg, J.; Manning, C.D. Advances in natural language processing. Science 2015, 349, 261–266.
2. Lu, Y. Artificial intelligence: A survey on evolution, models, applications and future trends. J. Manag. Anal. 2019, 6, 1–29.
3. Hu, M.; Dang, C.; Chintagunta, P.K. Search and learning at a daily deals website. Mark. Sci. 2019, 38, 609–642.
4. Bansal, N.; Sharma, A.; Singh, R.K. Fuzzy AHP approach for legal judgement summarization. J. Manag. Anal. 2019, 6, 323–340.
5. He, W.; Tian, X.; Tao, R.; Zhang, W.; Yan, G.; Akula, V. Application of social media analytics: A case of analyzing online hotel reviews. Online Inf. Rev. 2017, 41, 921–935.
6. Gaur, L.; Afaq, A.; Solanki, A.; Singh, G.; Sharma, S.; Jhanjhi, N.Z.; My, H.T.; Le, D.N. Capitalizing on big data and revolutionary 5G technology: Extracting and visualizing ratings and reviews of global chain hotels. Comput. Electr. Eng. 2021, 95, 107374.
7. Kang, Y.; Cai, Z.; Tan, C.W.; Huang, Q.; Liu, H. Natural language processing (NLP) in management research: A literature review. J. Manag. Anal. 2020, 7, 139–172.
8. Alshemali, B.; Kalita, J. Improving the reliability of deep neural networks in NLP: A review. Knowl.-Based Syst. 2020, 191, 105210.
9. Sann, R.; Lai, P.C. Understanding homophily of service failure within the hotel guest cycle: Applying NLP-aspect-based sentiment analysis to the hospitality industry. Int. J. Hosp. Manag. 2020, 91, 102678.
10. Jiao, Y.; Qu, Q.X. A proposal for Kansei knowledge extraction method based on natural language processing technology and online product reviews. Comput. Ind. 2019, 108, 1–11.
11. Sharma, M.; Singh, G.; Singh, R. Design of GA and ontology based NLP frameworks for online opinion mining. Recent Patents Eng. 2019, 13, 159–165.
12. Guerrero-Rodriguez, R.; Álvarez-Carmona, M.Á.; Aranda, R.; López-Monroy, A.P. Studying online travel reviews related to tourist attractions using NLP methods: The case of Guanajuato, Mexico. Curr. Issues Tour. 2023, 26, 289–304.
13. Wang, X.; Xu, Z.; Gou, X. The Interval probabilistic double hierarchy linguistic EDAS method based on natural language processing basic techniques and its application to hotel online reviews. Int. J. Mach. Learn. Cybern. 2022, 13, 1517–1534.
14. Cai, W.; Mohammaditab, R.; Fathi, G.; Wakil, K.; Ebadi, A.G.; Ghadimi, N. Optimal bidding and offering strategies of compressed air energy storage: A hybrid robust-stochastic approach. Renew. Energy 2019, 143, 1–8.
15. Saeedi, M.; Moradi, M.; Hosseini, M.; Emamifar, A.; Ghadimi, N. Robust optimization based optimal chiller loading under cooling demand uncertainty. Appl. Therm. Eng. 2019, 148, 1081–1091.
16. Liu, J.; Chen, C.; Liu, Z.; Jermsittiparsert, K.; Ghadimi, N. An IGDT-based risk-involved optimal bidding strategy for hydrogen storage-based intelligent parking lot of electric vehicles. J. Energy Storage 2020, 27, 101057.
17. Mehrpooya, M.; Ghadimi, N.; Marefati, M.; Ghorbanian, S.A. Numerical investigation of a new combined energy system includes parabolic dish solar collector, Stirling engine and thermoelectric device. Int. J. Energy Res. 2021, 45, 16436–16455.
18. Jiang, W.; Wang, X.; Huang, H.; Zhang, D.; Ghadimi, N. Optimal economic scheduling of microgrids considering renewable energy sources based on energy hub model using demand response and improved water wave optimization algorithm. J. Energy Storage 2022, 55, 105311.
19. Han, E.; Ghadimi, N. Model identification of proton-exchange membrane fuel cells based on a hybrid convolutional neural network and extreme learning machine optimized by improved honey badger algorithm. Sustain. Energy Technol. Assess. 2022, 52, 102005.
20. Jiang, Z.Q.; Xie, W.J.; Zhou, W.X.; Sornette, D. Multifractal analysis of financial markets: A review. Rep. Prog. Phys. 2019, 82, 125901.
21. Fernandes, L.H.; Silva, J.W.; de Araujo, F.H.; Tabak, B.M. Multifractal cross-correlations between green bonds and financial assets. Financ. Res. Lett. 2023, 53, 103603.
22. Yao, C.Z.; Liu, C.; Ju, W.J. Multifractal analysis of the WTI crude oil market, US stock market and EPU. Phys. A Stat. Mech. Its Appl. 2020, 550, 124096.
23. Mensi, W.; Vo, X.V.; Kang, S.H. Upward/downward multifractality and efficiency in metals futures markets: The impacts of financial and oil crises. Resour. Policy 2022, 76, 102645.
24. Li, N.; Wu, S.B.; Yu, Z.H.; Gong, X.Y. Feature Extraction with Multi-fractal Spectrum for Coal and Gangue Recognition Based on Texture Energy Field. Nat. Resour. Res. 2023, 32, 2179–2195.
25. Joseph, A.J.; Pournami, P.N. Multifractal theory based breast tissue characterization for early detection of breast cancer. Chaos Solitons Fractals 2021, 152, 111301.
26. Lahmiri, S.; Bekiros, S. Complexity measures of high oscillations in phonocardiogram as biomarkers to distinguish between normal heart sound and pathological murmur. Chaos Solitons Fractals 2022, 154, 111610.
27. Abbasi, M.U.; Rashad, A.; Basalamah, A.; Tariq, M. Detection of epilepsy seizures in neo-natal EEG using LSTM architecture. IEEE Access 2019, 7, 179074–179085.
28. Allen, S.M.; Cahn, J.W. A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metall. 1979, 27, 1085–1095.
29. Liu, C.; Qiao, Z.; Zhang, Q. Multi-phase image segmentation by the Allen–Cahn Chan–Vese model. Comput. Math. Appl. 2023, 141, 207–220.
30. Han, Z.; Xu, H.; Wang, J. A simple shape transformation method based on phase-field model. Comput. Math. Appl. 2023, 147, 121–129.
31. Wang, J.; Han, Z.; Jiang, W.; Kim, J. A fast, efficient, and explicit phase-field model for 3D mesh denoising. Appl. Math. Comput. 2023, 458, 128239.
32. Li, Y.; Song, X.; Kwak, S.; Kim, J. Weighted 3D volume reconstruction from series of slice data using a modified Allen–Cahn equation. Pattern Recognit. 2022, 132, 108914.
33. Liu, C.; Qiao, Z.; Zhang, Q. Two-Phase Segmentation for Intensity Inhomogeneous Images by the Allen–Cahn Local Binary Fitting Model. SIAM J. Sci. Comput. 2022, 44, B177–B196.
34. Park, J.; Lee, C.; Choi, Y.; Lee, H.G.; Kwak, S.; Hwang, Y.; Kim, J. An unconditionally stable splitting method for the Allen–Cahn equation with logarithmic free energy. J. Eng. Math. 2022, 132, 18.
Choi, Y.; Kim, J. Maximum principle preserving and unconditionally stable scheme for a conservative Allen–Cahn equation. Eng. Anal. Bound. Elem. 2023, 150, 111–119. [Google Scholar] [CrossRef]
Zhang, B.; Yang, Y. A new linearized maximum principle preserving and energy stability scheme for the space fractional Allen–Cahn equation. Numer. Algorithms 2023, 93, 179–202. [Google Scholar] [CrossRef]
Wang, J.; Han, Z.; Jiang, W.; Kim, J. A novel classification method combining Phase-Field and DNN. Pattern Recognit. 2023, 142, 109723. [Google Scholar] [CrossRef]
Wang, J.; Xu, H.; Jiang, W.; Han, Z.; Kim, J. A novel MF-DFA-Phase-Field hybrid MRIs classification system. Expert Syst. Appl. 2023, 225, 120071. [Google Scholar] [CrossRef]
Miloş, L.R.; Haţiegan, C.; Miloş, M.C.; Barna, F.M.; Boțoc, C. Multifractal detrended fluctuation analysis (MF-DFA) of stock market indexes. Empirical evidence from seven central and eastern European markets. Sustainability 2020, 12, 535. [Google Scholar] [CrossRef]
Aslam, F.; Aziz, S.; Nguyen, D.K.; Mughal, K.S.; Khan, M. On the efficiency of foreign exchange markets in times of the COVID-19 pandemic. Technol. Forecast. Soc. Chang. 2020, 161, 120261. [Google Scholar] [CrossRef]
Wang, J.; Shao, W.; Kim, J. Ecg classification comparison between mf-dfa and mf-dxa. Fractals 2021, 29, 2150029. [Google Scholar] [CrossRef]
Wang, F.; Wang, H.; Zhou, X.; Fu, R. Study on the effect of judgment excitation mode to relieve driving fatigue based on MF-DFA. Brain Sci. 2022, 12, 1199. [Google Scholar] [CrossRef] [PubMed]
Thomas, J.W. Numerical Partial Differential Equations: Finite Difference Methods; Springer Science & Business Media: New York, NY, USA, 2013; Volume 22. [Google Scholar]
Tan, S.; Zhang, J. An empirical study of sentiment analysis for Chinese documents. Expert Syst. Appl. 2008, 34, 2622–2629. [Google Scholar] [CrossRef]

Figure 1. Flowchart of hotel comment emotional analysis model.

Figure 2. Word vector sequence of (a) positive hotel comment, and (b) negative hotel comment.

Figure 3. Classification process of Hurst exponent group $H(-6)$ and $H(0)$ of (a) initial training data, (b) phase evolution at $t = 0$, (c) phase evolution at $t = 50\Delta t$, (d) phase evolution at $t = 350\Delta t$, (e) classification interface under training set, and (f) classification results under the test set.

Figure 4. Classification process of Hurst exponent groups $H(0)$ and $H(6)$ of (a) initial training data, (b) phase evolution at $t = 0$, (c) phase evolution at $t = 50\Delta t$, (d) phase evolution at $t = 350\Delta t$, (e) classification interface under training set, and (f) classification results under the test set.

Figure 5. Classification process of Hurst exponent groups $H(-6)$ and $H(6)$ of (a) initial training data, (b) phase evolution at $t = 0$, (c) phase evolution at $t = 50\Delta t$, (d) phase evolution at $t = 350\Delta t$, (e) classification interface under training set, and (f) classification results under the test set.

Table 1. Classification metrics for the different Hurst exponent groups.

Group	TP	TN	FP	FN	Accuracy	Precision	Recall	$F_{1}$
$H(-6)$ and $H(0)$	115	149	10	26	0.88	0.92	0.82	0.87
$H(0)$ and $H(6)$	118	145	14	23	0.88	0.89	0.84	0.86
$H(-6)$ and $H(6)$	120	146	13	21	0.89	0.90	0.85	0.87
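
The metrics in Table 1 follow from the four confusion-matrix counts by the standard definitions of accuracy, precision, recall, and $F_{1}$. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `classification_metrics` and the dictionary `TABLE_1` are hypothetical names introduced here for illustration, and the sketch assumes no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    Hypothetical helper for illustration; assumes all denominators are nonzero.
    """
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts (TP, TN, FP, FN) copied from Table 1 for each Hurst exponent group.
TABLE_1 = {
    "H(-6) and H(0)": (115, 149, 10, 26),
    "H(0) and H(6)": (118, 145, 14, 23),
    "H(-6) and H(6)": (120, 146, 13, 21),
}

for group, counts in TABLE_1.items():
    metrics = classification_metrics(*counts)
    # Round to two decimals, the precision used in Table 1.
    print(group, {name: round(value, 2) for name, value in metrics.items()})
```

Computed this way, accuracy, precision, and recall match the tabulated values to two decimals. The $F_{1}$ obtained directly from the counts can differ from the table in the last digit for the first and third groups; this would be consistent with $F_{1}$ having been computed from already-rounded precision and recall, although the source does not state how it was calculated.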

Table 2. Comparison of the proposed classifier with other classifiers.

Method	Accuracy	Precision	Recall	$F_{1}$
DNN	0.80	0.79	0.77	0.78
SVM	0.82	0.83	0.79	0.81
Ours	0.89	0.90	0.85	0.87


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Duanzhu, S.; Wang, J.; Jia, C. Hotel Comment Emotion Classification Based on the MF-DFA and Partial Differential Equation Classifier. Fractal Fract. 2023, 7, 744. https://doi.org/10.3390/fractalfract7100744

