A Short-Patterning of the Texts Attributed to Al Ghazali: A “Twitter Look” at the Problem

Abstract: This article presents a novel approach, inspired by the modern exploration of short texts' patterning, to the works attributed to the outstanding Islamic jurist, theologian, and mystical thinker Abu Hamid Al Ghazali. We treat the task within the general authorship attribution problematics and employ a Convolutional Neural Network (CNN), combined with a balancing procedure, to recognize short, concise templates in manuscripts. The proposed approach makes it possible to investigate medieval Arabic documents from a novel computational perspective. An evaluation of the results on a previously tagged collection of books ascribed to Al Ghazali demonstrates the method's high reliability in recognizing the source authorship. Evaluations of two famous manuscripts, Mishakat al-Anwa and Tahafut al-Falasifa, questionably attributed to Al Ghazali or co-authored by him, exhibit a significant difference in their overall stylistic structure from the one inherently assigned to Al Ghazali. This fact can serve as a substantial formal argument in the long-standing dispute about these manuscripts' authorship. The proposed methodology suggests a new look at the inner structures and possible authorship of medieval documents from the short-patterning and signal processing perspectives.

Keywords: short-patterning; Al Ghazali authorship; signal processing model; word embedding


Introduction and Problem Formulation
This article presents an innovative approach inspired by short-patterning methodologies, aiming to analyze the literary compositions of the outstanding Islamic jurist, theologian, and mystical thinker Abu Hamid Al Ghazali (1058-1111).

Abu Hamid Al Ghazali
Al Ghazali is one of the most significant Muslim Sufis, whose ideas are prominent and persuasive not only in the Muslim world. As is well acknowledged (see, e.g., [1][2][3][4]), he was born at Tus in Persia, where he learned different fields of traditional Islamic religious disciplines. At the age of thirty-three, Al Ghazali was appointed by Nizâm al-Mulk, the Seljuq Empire's powerful vizier, to the Nizâmiyya Madrasa in Baghdad. Afterward, while going through a deep spiritual crisis, Al Ghazali abandoned his excellent career, and in November 1095, he left Baghdad with the excuse of going on a pilgrimage to Mecca. After some time spent in Damascus and Jerusalem, with a visit to Mecca in 1096, Al Ghazali settled in Tus. He spent the rest of his life writing, practicing Sufism, and teaching. In 1106 he went back to the Nizâmiyya Madrasa in Nishapur, where he had been a student, and continued teaching at least till 1110. Afterward, he returned to Ṭus and died the following year. Many modern researchers recognize the significant contributions of Al Ghazali to world theological and philosophical thought.
Al Ghazali had a substantial influence on the development of the Arab-Muslim culture. According to the Hadith predicting the arrival of Islam's renewer once every century, the Arab community perceived Al Ghazali as the renewer of Islam's fifth century. As an example, the Shafi'i jurist al-Subki claimed, "If there had been a prophet after Muhammad, Al-Ghazali would have been the man." Al Ghazali's most meaningful work is Iḥyāʾ ʿulūm al-dīn (The Revival of the Religious Sciences), primarily considered an outstanding work of Muslim spirituality. The Ihya turned out to be the most common Islamic text after the Holy Quran and the Hadith. This book, written in Arabic, is unquestionably essential to individual religious practice and comprehended as one of the greatest works and a timeless outline of the pious Muslim's way to God. Moreover, this extraordinary treatise's outstanding achievement is to unite orthodox Sunni theology and Sufi mysticism in a valuable, understandable fashion to guide every aspect of Muslim life and death (see, e.g., [1,5,6]).
Al Ghazali's creativity has been the subject of numerous studies and reviews, covering various practical aspects of Islam and humanity at large. Many works are attributed to Al Ghazali, occasionally appearing with different titles in different manuscripts (see, e.g., [1,5,6]). This topic is still being explored. The methods applied in these research studies are mainly based on stylistic and thematic analysis. They involve an in-depth evaluation of the religious and theological views expressed in the works and cross-citation breakdowns. In this regard, it is essential to mention a prominent Scottish Orientalist, historian, academic, and Anglican priest, William Montgomery Watt (1909-2006). His assessments of the authenticity of works attributed to Al Ghazali are considered the most important in this field.
We do not review the literature devoted to this issue in detail because our purpose is to approach the problem from the formal, mathematical standpoint of modern deep learning methods applied to individual writing style modeling. Hopefully, such a practice can be further merged with the traditional methodologies to combine the advantages of both approaches.

Authorship Attribution
An individual writing style outwardly expresses an author's perception of reality. It is a personification of the general writing process, composed of many inexact and interconnected phases commonly identified as pre-writing, drafting and writing, sharing and responding, revising and editing, and publishing [7]. Therefore, recognizing an individual writing style can be considered as uncovering the style's templates, expressed through authorship attributes.
Following this general perception, we consider the attribution of texts ascribed to Al Ghazali from the perspective of the general authorship attribution problem. This field aims to recognize the author of a particular document in question from an analysis of materials with known authorship. A survey of methods applied in this area is given, for instance, in [8]. Such approaches are mainly used in the literature to identify the authorship of novels, plays, or poems with controversial origins.
There are two main kinds of methods in the authorship verification problem: intrinsic and extrinsic. The intrinsic methods work merely with the provided texts (one with acknowledged authorship and one undergoing inspection) and form a one-class classification problem. Conversely, extrinsic verification techniques draw on a non-target set and create a group of external documents. Extrinsic methods thus adapt the verification task to a binary classification problem. The most recognized and feasible extrinsic verification approach is the Impostors' method [9].
Since we deal with medieval literature, we have to consider the peculiar properties inherent in this type of literary creativity. One of these features is built-in text inhomogeneity caused by multiple, frequently unspecified citations of other authors and sources. The manner of expression depends on the target audience and the topic of the text. It may contain many quotations and borrowings; thus, such texts' writing patterns are unstable and vary within the document. Simultaneously, the original style is kept in short patterns inherent to the author (i.e., "The devil is in the details."). Classical authorship determination procedures are less accurate when analyzing such short text fragments.
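The idea of working with short patterns can be made concrete: a document is cut into fixed-length word sequences, which then serve as the units of stylistic analysis. A minimal sketch follows; the pattern length of ten words is an illustrative assumption, not a parameter taken from this paper.

```python
# Split a document into short, non-overlapping word patterns.
# pattern_len is an assumed illustrative value, not the paper's setting.
def short_patterns(text, pattern_len=10):
    words = text.split()
    return [words[i:i + pattern_len]
            for i in range(0, len(words) - pattern_len + 1, pattern_len)]
```

Documents shorter than one pattern simply yield no patterns, which mirrors the difficulty classical procedures face with very short fragments.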
A similar situation appears in the modern age, where people interconnect through relatively short messages such as tweets. Twitter is a social networking site launched in 2006 to distribute short posts of a maximum of 140 characters, named tweets. The requirement to briefly convey messages as short tweets has spawned a new literary genre, attracting keen attention from various standpoints. Different applications for the analysis of short texts have recently been proposed to attribute and recognize malicious bots, chat conversations, short message service (SMS) messages, Twitter posts, and Facebook status updates. Such studies commonly consist of analyzing short word patterns and should give rise to new methodologies for the authorship attribution of very stylistically heterogeneous textual material. It seems very natural to adopt the analytical techniques applied in this area to investigate medieval Arabic texts.

Paper Contribution
Approaches designed to reveal short patterns in texts using deep learning techniques are known in the literature [10][11][12], primarily dealing with English language content while paying less attention to other, including ancient, languages. As an initial step, these methods involve modifications of word embeddings (see, e.g., [12][13][14][15][16][17]). Chinese, Persian, Arabic, and Hebrew differ significantly from English in their linguistic and semantic structures. In this paper, a preprocessing technique is combined with a word embedding technique constructed for the Arabic language.
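As an illustration of the kind of language-specific preprocessing such a pipeline needs, here is a minimal Arabic normalization sketch. The exact rules used in this work are not specified here; removing diacritics (tashkeel) and the tatweel (kashida) elongation character is only an assumed common baseline.

```python
import re

# Strip Arabic diacritics (tashkeel) and the tatweel (kashida)
# elongation character -- an assumed normalization baseline; the
# paper's actual preprocessing rules may differ.
_TASHKEEL = re.compile(r'[\u064B-\u065F\u0670]')
_TATWEEL = '\u0640'

def normalize_arabic(text):
    text = _TASHKEEL.sub('', text)      # remove vowel and diacritic marks
    return text.replace(_TATWEEL, '')   # remove the elongation character
```

Normalization of this kind reduces surface variation between manuscript copies before any embedding is computed.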
The proposed learning procedure follows the general idea of the Impostors' method [14]. This methodology operates with external documents (impostors) and constructs a set of resemblances. In our case, we apply a modified version of the approach. The first basic impostor is composed of the earlier mentioned manuscript, Iḥyāʾ ʿulūm al-dīn. The alternative set (the second impostor) includes several books recognized as Pseudo-Ghazali ones (composed in imitation of his writing style). Note that these imitations could be stylistically nonhomogeneous, being produced in different ways. Thus, a one-class classification problem is transformed into a binary classification task. A modification of the mentioned deep learning techniques serves as a classifier, trained on the impostors' collection to recognize questionable authorship.
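In this binary framing, short patterns drawn from the two impostor sets simply become labeled training examples for the classifier. A schematic sketch (the function and argument names are illustrative, not from the paper):

```python
# Build a labeled training set for binary authorship classification:
# class 1 = patterns from the basic impostor (the Ihya),
# class 0 = patterns from the Pseudo-Ghazali alternative set.
# The classifier itself (a CNN in this work) would be trained on (X, y).
def build_training_set(ihya_patterns, pseudo_patterns):
    X = list(ihya_patterns) + list(pseudo_patterns)
    y = [1] * len(ihya_patterns) + [0] * len(pseudo_patterns)
    return X, y
```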
Another difficulty is imbalance in the training material. The size of the fundamental Iḥyāʾ ʿulūm al-dīn is expected to be much larger (by a factor of nine) than the total length of the texts from the alternative class. An enrichment methodology producing balanced alternative groups is constructed and applied to overcome this obstacle in the proposed approach.
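One simple way to realize such balancing is to oversample the smaller class until the two classes match in size. This is a hedged sketch of the general idea only; the enrichment procedure actually used in this work is not detailed here and may differ.

```python
import random

# Randomly oversample the minority class until it matches the majority
# class in size. One possible balancing step, not necessarily the
# paper's exact enrichment methodology.
def balance_classes(majority, minority, seed=0):
    rng = random.Random(seed)
    enriched = list(minority)
    while len(enriched) < len(majority):
        enriched.append(rng.choice(minority))
    return list(majority), enriched
```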
The numerical experiments faithfully classify the tested material, previously tagged as written and not written by Al Ghazali, almost entirely corresponding to the accepted perspective. From our standpoint, these results look impressive since they are obtained using a technique utterly different from the accepted ones in this area and entirely based on a formal justification. This study leads to an innovative (signal processing) standpoint on the perusal of ancient documents' inner structures and possible authorship.
At the same time, two known literary compositions, Mishakat al-Anwa and Tahafut al-Falasifa, turn out to be so close, in the sense of their short text features, to the so-called Pseudo-Ghazali texts that they are recognized not as written by but only as attributed to Al Ghazali. The manuscript Tahafut al-Falasifa is commonly acknowledged as written by Al Ghazali, a student of the Asharite school of Islamic theology. The proposed method recognizes that a prominent part of the text (more than 80%) is written in a style substantially differing from that recognized as Al Ghazali's own. Regarding the second manuscript, Mishakat al-Anwa, the same conclusion is reached. Note that such an inference was previously made only regarding the final part of the text, by Watt [18]. In some ways, these outcomes both confirm and contradict commonly accepted judgments and, of course, have to be compared with future findings.
The main contributions of this paper are as follows:
• Adapting a Convolutional Neural Network (CNN) model intended to recognize anonymous authorship using short text patterns;
• Performing a detailed analysis of the authorship of works attributed to Al Ghazali, confirming the reliability of the suggested model;
• Discovering a short pattern structure in two famous works, Mishakat al-Anwa and Tahafut al-Falasifa, indicating that they very likely belong to the Pseudo-Ghazali category (merely attributed to Al Ghazali);
• Suggesting a new signal-like text representation to study stylistic text characteristics from the signal processing standpoint.
The rest of the paper is organized as follows. Section 2 states the formal model. In Section 3, the provided numerical experiments are described. Section 4 is devoted to the conclusion.

Arabic Word Embedding
One prevalent text mining method is the bag-of-words technique, presenting a text as a vector of terms' occurrences. This methodology does not preserve semantic information because it ignores the words' order and joint appearances. Moreover, representations constructed in this way are usually very sparse and require additional smoothing techniques. Deep learning embedding systems provide more exact procedures, producing real-valued vector representations of words such that adjacent patterns correspond to words with comparable sense.
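A toy bag-of-words construction shows both shortcomings at once: word order is discarded, and each vector is mostly zeros once the shared vocabulary grows. A minimal sketch:

```python
from collections import Counter

# Represent each text as a vector of term counts over a shared, sorted
# vocabulary. Word order and co-occurrence information are lost, and
# the vectors become sparse as the vocabulary grows.
def bag_of_words(texts):
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0] * len(vocab)
        for w, c in Counter(t.split()).items():
            v[index[w]] = c
        vectors.append(v)
    return vocab, vectors
```

Note that "a b a" and "a a b" receive identical vectors, which is precisely the semantic information loss described above.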
Embedding words into a linear space is a trendy modern approach that exhibits semantic and syntactic text properties, implementing the general Distributional Hypothesis (see, e.g., [19,20]), which asserts that terms appearing in comparable contexts have similar meanings. Based on this principle, the work in [21] suggests the famous Word2vec model of word embedding in a real Euclidean space in two fashions: the Continuous Bag-of-Words (CBOW) model and the Skip-gram model. The key idea is to attach a term's feature not to a sole coordinate but to an entire compressed vector. Thus, a particular text is translated into a prototype with semantically accomplished columns. Compared with the earlier mentioned bag-of-words procedure, this natural language processing method preserves semantic and syntactic information. These are the most popular embedding methods: Word2vec, GloVe, FastText, and ELMo [24].
CBOW strives to estimate the probability of a word occurrence, using a context such as a solitary word or a sequence of words. Conversely, Skip-gram aims to evaluate the context of a word. Both methods follow the same network topology, yet from opposing directions. Roughly speaking, the desired representation minimizes the distortion between the actual and the predicted content on a large text corpus. GloVe uses the total word-word co-occurrences estimated on a corpus to reveal a meaningful representation of the word vector space. In comparison with Word2vec, this representation is optimized so that the inner products of word vectors approximate the logarithm of the words' co-occurrence likelihood.
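The core of Skip-gram training is the generation of (target, context) pairs from a sliding window; CBOW uses the same pairs in the opposite direction. A minimal sketch, with an illustrative sentence and window size:

```python
# Generate Skip-gram training pairs: each word predicts the words in a
# symmetric context window around it.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))  # (center word, context word)
    return pairs

pairs = skipgram_pairs(["a", "b", "c", "d"], window=1)
# -> [('a','b'), ('b','a'), ('b','c'), ('c','b'), ('c','d'), ('d','c')]
```

A real Word2vec model then trains a shallow network to score such pairs; the hidden-layer weights become the word vectors.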
FastText is a modification of the Word2vec model in which each word is represented by its character n-grams. This is beneficial for capturing the sense of short terms, as well as suffixes and prefixes. ELMo is a deep word representation that exhibits higher-level word features such as syntax and semantics and their evolution across linguistic contexts. The embedding vectors are provided by a deep bidirectional network trained on a considerable corpus.
At least 400 million people in about 60 countries consider Arabic their native language, and about 250 million consider it their second; Arabic is the fifth most spoken language in the world. There are 28 letters in the Arabic alphabet, with no distinction between uppercase and lowercase. Letters sometimes join adjacent ones on both sides or only on the right, thereby creating the Arabic script form named a ligature. A character may appear in up to four different forms, contingent on its location in a word. Arabic has a more complicated morphology than many other languages such as English, French, German, or Russian, but is, to some extent, similar to Hebrew.
One of the most functional Arabic word embedding models is AraVec [25], an open-source platform offering efficient pre-trained word embedding models developed in the general framework of the Word2vec model.
In this way, each term is substituted with its non-sparse d-dimensional vector representation in the Euclidean space, provided by a model pre-trained on modern resources such as Wikipedia or Twitter collections. However, a term from a medieval document may not occur in such texts and thus may not be captured by the embedding source. For instance, these could be words not used in the modern language, proper names, or words borrowed from other languages such as Urdu and Persian. Such terms are omitted from our considerations. Considering that the analyzed texts are closer to Modern Literary Arabic than to the non-formal Twitter language, we employ in our experiments a 300-dimensional representation trained on the Wikipedia corpus.
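The embedding step with out-of-vocabulary filtering can be sketched as follows. The 3-dimensional table and the transliterated tokens are made-up stand-ins for the real 300-dimensional Wikipedia-trained AraVec model:

```python
# Replace each in-vocabulary term with its dense vector; terms absent from
# the modern corpus (archaic words, proper names, borrowings) are dropped.
pretrained = {                      # hypothetical toy entries
    "kitab": [0.2, 0.5, 0.1],
    "nur":   [0.7, 0.1, 0.3],
}

def embed_document(tokens, table):
    """Return the vectors of in-vocabulary tokens, omitting OOV ones."""
    return [table[t] for t in tokens if t in table]

vectors = embed_document(["kitab", "qoph1111", "nur"], pretrained)
# the unseen token is dropped -> two vectors remain
```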

Convolutional Neural Networks
As the tool is intended to discover short patterns in a text, the following neural network architecture is applied. The suggested structure is a CNN, created in the spirit of [10][11][12], having as input a sequence of matrices resulting from word embedding. Let us consider a document D = {w1, w2, ..., wn} composed of the words wi, i = 1,...,n, attained from a vocabulary of terms V.
The next ingredient is a convolutional component. First, the document is split into chunks of l words, and m matrices of order d × l are constructed whose columns, as before, correspond to individual words. Next, a convolution filter (convolution with a kernel of width h ∈ H) is applied to the pieces obtained by shifting a window of h words along a chunk with an increment equal to the stride size s. A max-pooling step follows; its outcome emphasizes the most relevant data across a window. The obtained results are concatenated, first over j and then over k, and passed through a fully connected layer with N0 components feeding a softmax output layer. As mentioned earlier, the model is designed in the spirit of [10][11][12]. The main differences are that the network works with words instead of characters' n-grams and uses the ReLU activation function.
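The forward pass for one chunk can be sketched in numpy. All sizes and the random weights below are illustrative; the real model trains many such filters and feeds the concatenated outputs to a dense softmax layer:

```python
# One filter of the described pipeline: a d x l embedding matrix is
# convolved with a kernel spanning h consecutive words (stride s), passed
# through ReLU, then max-pooled over the resulting windows.
import numpy as np

rng = np.random.default_rng(0)
d, l, h, s = 4, 10, 3, 1          # embedding dim, chunk length, kernel width, stride
chunk = rng.normal(size=(d, l))   # columns are word vectors
kernel = rng.normal(size=(d, h))

def conv_relu_maxpool(x, w, stride):
    """1-D convolution over word positions, ReLU, then global max-pooling."""
    outs = []
    for start in range(0, x.shape[1] - w.shape[1] + 1, stride):
        window = x[:, start:start + w.shape[1]]
        outs.append(np.maximum(np.sum(window * w), 0.0))  # ReLU
    return max(outs)                                      # max-pooling

feature = conv_relu_maxpool(chunk, kernel, s)   # one scalar per filter
```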

Handling of Imbalanced Training Data
The amounts of data located in the two groups are expected to be significantly different due to the immense volume of the basic Al Ghazali corpus. Al Ghazali's most meaningful work is Iḥyāʾ ʿulūm al-dīn (The Revival of the Religious Sciences), primarily considered an outstanding work of Muslim spirituality. The Ihya turned out to be the most common Islamic text after the Holy Quran and the Hadith. This book, written in Arabic, is unquestionably essential to individual religious practice and comprehended as one of the greatest works and a timeless outline of the pious Muslim's way to God. Moreover, this extraordinary treatise's outstanding achievement is to unite orthodox Sunni theology and Sufi mysticism in a valuable, understandable fashion to guide every aspect of Muslim life and death (see, e.g., [1,5,6]).
Al Ghazali's creativity has been the subject of numerous studies and reviews in various Islamic practical aspects and for humanity at large. Many works are attributed to Al Ghazali, occasionally appearing with different titles in different manuscripts (see, e.g., [1,5,6]). This topic is still being explored. The methods applied in these exciting research studies are mainly based on stylistic and thematic analysis. They involve an in-depth evaluation of the religious and theological views expressed in the works and cross-citation breakdowns. In this regard, it is essential to mention a prominent Scottish Orientalist, historian, academic, and Anglican priest, William Montgomery Watt (1909-2006). His assessments of the authenticity of works attributed to Al Ghazali are considered the most important in this field.
We do not review the explanations devoted to the issue because our purpose is to approach this problem from the formal, mathematical standpoint of modern deep learning methods applied to individual writing style modeling. Hopefully, such a practice can be further merged with the traditional methodologies to combine both attitudes' advantages.

Authorship Attribution
An individual writing style outwardly expresses an author's perception of reality. It is a personification of the general writing process, composed of many inexact and interconnecting phases commonly identified as pre-writing, drafting and writing, sharing and responding, revising and editing, and publishing [7]. Therefore, recognizing an individual writing style can be treated as uncovering the style's templates, expressed through authorship attributes.
Following this general perception, we consider the assignment of texts prescribed to Al Ghazali from the overall authorship attribution problematics perspective. This field aims to recognize the author of a particular document in question from an analysis of materials with known authorship. A survey of methods applied in this area is given, for instance, in [8]. Such approaches are mainly used in the literature to identify the authorship of novels, plays, or poems with controversial origins.
There are two main kinds of methods in the author verification problem: intrinsic and extrinsic. The intrinsic methods work merely with the provided texts (one with acknowledged authorship and one undergoing inspection) and form a one-class classification problem. Conversely, extrinsic verification techniques draw on a non-target set and create a group of external documents. In our setting, the authentic Al Ghazali manuscripts form the main class. Thus, we meet here a typical instance of imbalanced classification. Such a situation leads, as is well known, to a bias toward the majority group, possibly ignoring the minority class altogether. This paper builds the following simple procedure, intended to balance the data together with appropriate augmentation: undersampling of the majority class and oversampling of the minority class are combined before training, aiming to balance the classes involved in the learning procedure.
More precisely, let us suppose that we have two datasets, D1 and D2, with |D1| > |D2|.

Procedure:
• Undersample a sample S1 from D1 with the undersampling rate F1·|D1|;
• Replicate D2 F·F1 times to get S2;
• Return S1 and S2.
We suggest that F and F1 are such that F·F1·|D1| < |D1|. This procedure is anticipated to resolve two problems simultaneously: first, to equalize the original set sizes, and second, to expand them, attempting to stabilize the training process.
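The balancing procedure above can be sketched in a few lines. The rates and set sizes here are illustrative, not the paper's values:

```python
# Undersample the majority class D1 at rate f1 and replicate the minority
# class D2 f times, so the two training sets end up with comparable sizes.
import random

def balance(d1, d2, f1=0.3, f=6, seed=42):
    random.seed(seed)
    s1 = random.sample(d1, int(f1 * len(d1)))  # undersampled majority
    s2 = d2 * f                                # minority replicated f times
    return s1, s2

d1 = list(range(1000))   # majority-class chunks (toy stand-ins)
d2 = list(range(50))     # minority-class chunks
s1, s2 = balance(d1, d2)
# |S1| = 300, |S2| = 300 -> balanced training sets
```

With these toy rates the two sets come out equal in size, matching the procedure's goal of equalizing and expanding the training classes.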

Preprocessing
Preprocessing is a crucial procedure of each NLP (Natural Language Processing) task, especially for Arabic text handling. Such a phase significantly influences the results and has to be matched to the employed embedding method. We operate with the earlier-mentioned AraVec methodology of text preprocessing, which acts as follows:
• Remove all punctuation marks, English letters, special characters, and digits;
• Remove diacritics.
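A minimal sketch of this cleaning step is given below. The exact AraVec routine may differ in detail; the Unicode ranges here (base Arabic letters U+0621-U+064A, diacritics U+064B-U+0652) are a common simplification:

```python
# AraVec-style cleaning: drop punctuation, Latin letters, digits, and
# special characters, then strip Arabic diacritics (tashkeel).
import re

NON_ARABIC = re.compile(r"[^\u0621-\u0652\s]")   # keep Arabic letters + diacritics
DIACRITICS = re.compile(r"[\u064B-\u0652]")      # fathatan ... sukun

def preprocess(text):
    text = NON_ARABIC.sub(" ", text)    # drop punctuation, Latin letters, digits
    text = DIACRITICS.sub("", text)     # drop diacritics
    return " ".join(text.split())       # normalize whitespace

print(preprocess("كِتَابٌ 123, abc!"))   # -> كتاب
```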

Procedure
The method is applied to three collections Cl_i, i = 0, 1, 2, where Cl0 includes texts that are unquestionably recognized as written by Al Ghazali, Cl1 is a collection of books attributed to Al Ghazali but not written by him, and Cl2 is a tested collection including at least two anchor books: Al Ghazali definitely wrote the first, and the second is merely attributed to him. The presence of these anchor items allows interpreting the obtained clusters of the tested documents: a cluster containing the first anchor is understood as a collection of the authentic Al Ghazali texts, and the second cluster as fabricated ones. Algorithm 1 describes the overall training procedure with the subsequent recognition of the tested texts' authorship.
6. Consider the attained matrix M as a matrix of multivariate data: each row matches an observation, and each column matches a variable.
7. Perform a partition of the variables into 2 clusters using the K-Means algorithm: [labels, centers] = K-Means(M, 2), where:
• labels are the assignments of the variables to the clusters;
• centers are the centroids of the clusters.
8. Evaluate the partition using the silhouette method: S = silhouette(labels, M).
9. Decision step:
• If S < Silhouette_threshold or the anchors belong to the same cluster, then the documents D ∈ Cl2 are not classified (stop);
• Otherwise, the documents D ∈ Cl2 are classified according to the anchors' assignment.
A few remarks should be made about the algorithm. The process starting at step 5 is performed Niter times. Item 5a corresponds to the balancing procedure, providing well-adjusted datasets S0 and S1 from the underlying material D1, D2 to train the proposed neural network in 5b. The next step, 5c, checks whether the learned result achieves the desired accuracy; if the randomly chosen subset S1 does not provide the necessary separation, the procedure is repeated with another pair S0, S1. The loop located in 5e classifies all documents from Cl2 via the current iteration's network Net.
The authentic Al Ghazali class is labeled as 0, and the alternative one as 1. Thus, M(Iter, D) represents, for each document from D, the fraction of its parts attained in 5a and recognized as Pseudo-Ghazali. The columns of the matrix M are separated in step 7 into 2 clusters using the K-means method. Suppose the obtained partition is significant (the silhouette is above the Silhouette threshold). In that case, if the anchors' elements are positioned in different groups, then the documents D ∈ Cl2 are classified according to the anchors' location.
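The clustering and validation steps can be sketched with a plain Lloyd's 2-means and a simple silhouette computed on Euclidean distances. This is an illustrative stand-in for whatever library routines were used; the toy column vectors below imitate two well-separated groups of documents:

```python
# Steps 7-9 in miniature: cluster the columns of M (one per tested
# document) into 2 groups and score the partition with a silhouette.
import numpy as np

def kmeans2(points, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), 2, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([points[labels == k].mean(axis=0) for k in (0, 1)])
    return labels, centers

def silhouette(points, labels):
    d = np.linalg.norm(points[:, None] - points[None], axis=2)
    scores = []
    for i in range(len(points)):
        same = labels == labels[i]
        a = d[i, same & (np.arange(len(points)) != i)].mean()  # within-cluster
        b = d[i, ~same].mean()                                 # to other cluster
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# two well-separated column groups -> high silhouette, partition accepted
M_cols = np.array([[0.3, 0.3], [0.32, 0.28], [0.62, 0.6], [0.6, 0.58]])
labels, centers = kmeans2(M_cols)
s = silhouette(M_cols, labels)   # close to 1 for this toy partition
```

If `s` falls below the threshold, or both anchors land in one cluster, the decision step refuses to classify, exactly as in Algorithm 1.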
The balancing routine's outcomes strongly depend on the underlying features of the data drawn from the majority class. However, as mentioned earlier, the source class is sufficiently heterogeneous from the stylistic standpoint. Thus, the results are inherently biased toward the primary class due to reproducing its part in the learning process. The scheme utilized here embodies just one possible way to neutralize the bias, using the anchors to recognize appropriate groups.

Material
In a deep learning model framework, the training and tested materials consist of texts attributed to Al Ghazali.

Experiment Setup
The procedure is implemented in Python 3.7.6.

Results
Several numerical experiments are performed, in which the values of the critical parameters l and l0 vary over 64, 120, and 128, and 30, 40, and 50, respectively. In general, these trials provide similar results; however, the best outcomes are achieved for [l/l0] = 2. Given this article's limited scope, only the most representative results are reported, for l = 128 and l0 = 50. In this setting, Cl0 (the source collection) includes 7619 chunks, with an approximate size of 8.5 MB. Correspondingly, the alternative class contains 819 pieces occupying about 1.0 MB. As such, the imbalance ratio (IR) is about 8.5.
The training data consist of about 17,700 units, with about 10,000 training, 3300 validation, and 4400 testing samples in each iteration. Following the procedure described earlier, we obtain training sets that are almost identical in size at each step; i.e., the minor class is multiplied six times. Experiments have shown that this is probably the minimum suitable value: for smaller values, the learning process repeatedly fails to converge or to exceed 0.75 validation accuracy. Moreover, this characteristic increases significantly in the final stages of training. Table 1 exhibits the characteristics of the alternative collection; documents 6 and 7 dominate in this class. Based on the Parula color map in the "scaled rows" fashion, a heat map demonstrates the experiments in Figure 1. The document numbers are mapped on the horizontal axis, while the vertical axis represents the experiment number. As can be seen, the texts are divided by brightness and color into two groups: 1-6 and 7-10. The corresponding cluster procedure separates these sets with a silhouette value of 0.8836.
The same situation appears in the error bar charts given in Figure 2. Recall that this graphical representation displays the mean together with the variability, the bars specifying the uncertainty in a measurement; in our case, it embodies one standard deviation. The dotted lines represent the average cluster centroids (the clusters' averages), y = 0.3259 and y = 0.6090, respectively. The central line corresponds to the line separating the clusters, y0 = 0.4674. The partition mentioned above is likewise clearly comprehended here. Let us examine this result from the standpoint of the structure of the considered texts. First of all, note that the procedure unquestionably identifies the first six books, known to be authored by Al Ghazali. The procedure tags two books (numbers eight and nine) as Pseudo-Ghazali, perfectly matching the inherent labeling. The classification of the two remaining books is of the most significant interest and novelty. The first is Tahafut al-Falasifa (The Incoherence of the Philosophers, seventh in the list of tested manuscripts). According to the common standpoint, this milestone opus was created by Al Ghazali together with a student of the Asharite school of Islamic theology. The book criticizes some positions of Greek and other earlier Muslim theorists, mostly those of Ibn Sina (Avicenna) and Al-Farabi (Alpharabius). The manuscript is reputedly an exceptionally successful creation and a landmark in Islamic philosophy.
We explore this topic using additional text representations highlighted by our model. As mentioned before, the procedure divides texts into successive equal-length pieces of size l = 128. According to the predicted classification, each of them is split into batches of length l0 = 50, tagged as 0 or 1. In this way, each document D is embodied as a signal of length m = ⌈|D|/l⌉, taking [l/l0] + 1 possible values that signify the mean values of the pieces' tags. An example of such a signal representation is given in Figure 3. The X-axis represents a piece's sequential number, and the Y-axis shows the average scores of the pieces, which can be 0, 0.5, or 1 in the considered case. Here, m = 397; that is, the tested document is divided into 397 pieces of approximately the same size of 1.2 K. Thus, the numbering on the X-axis is from 1 to 397.
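The signal construction above can be sketched directly. With l = 128 and l0 = 50, a piece holds two full batches, so scores take the three values 0, 0.5, and 1; the tags below are toy values, not model predictions:

```python
# Turn a sequence of per-batch 0/1 tags into a per-piece score signal by
# averaging consecutive batch tags within each piece.
def piece_signal(batch_tags, batches_per_piece=2):
    """Average consecutive batch tags into one score per piece."""
    signal = []
    for i in range(0, len(batch_tags), batches_per_piece):
        group = batch_tags[i:i + batches_per_piece]
        signal.append(sum(group) / len(group))
    return signal

tags = [0, 0, 0, 1, 1, 1]           # three pieces, two batches each
print(piece_signal(tags))           # -> [0.0, 0.5, 1.0]
```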
Even here, it can be seen that the Pseudo-Ghazali style (the score is above the cluster separation line y0 = 0.4674) covers a significantly larger part of the manuscript.
The overall conclusion has to be based not just on a simple random sample but on the whole assembly of all 20 simulated samples. To do so, we average the curves (signals) obtained in these 20 iterations and consider the resulting sequence, as seen in Figure 4.
The values derived from the averaged series are marked in blue. The red line is the result of moving average smoothing with lag equal to 7. This outline characterizes the style's overall behavior, demonstrating that most segments tend to fit the "1" Pseudo-Ghazali style. The observation is also confirmed by histograms generated for the original signal and its smoothed version (see Figure 5).
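The smoothing and skew check can be sketched in numpy; the synthetic input signal below simply imitates a series dominated by high scores:

```python
# Moving average with window 7 (the "lag" in the text) and a sample
# skewness that is negative when the distribution has a long left tail.
import numpy as np

def moving_average(x, window=7):
    """Trailing moving average; output is shorter by window - 1 samples."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def skewness(x):
    x = np.asarray(x, dtype=float)
    m, s = x.mean(), x.std()
    return float(((x - m) ** 3).mean() / s ** 3)

signal = np.array([1.0] * 30 + [0.5] * 8 + [0.0] * 2)  # mostly high scores
smoothed = moving_average(signal)
# a long left tail -> negative skew, as with the histograms in Figure 5
assert skewness(signal) < 0
```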
Both distributions have a negative skew, specifying a long left tail, i.e., the left asymmetry of a distribution around its mean. About 20% of the data are smaller than 0.5 in the left panel and about 8% in the red one. Thus, it is possible to conclude that the dominant part of the considered manuscript, Tahafut al-Falasifa, is not written in the inherent Al Ghazali style.
Mishakat al-Anwar (The Niche of Lights, number 10 in the tested manuscripts' list). The prominent official Al Ghazali internet resource (https://www.ghazali.org) dedicates a subsite (https://www.ghazali.org/site/on-mishkat.htm) to the authorship problem of Mishakat al-Anwar. Additionally, for several manuscript versions, the site presents the background information and six crucial papers [5,18,26-28]. These articles can apparently be treated, with some limitations, as the core discussion material on the problem.
The ongoing debate surrounding Al Ghazali's authorship of this manuscript in numerous scientific forums is much more wide-ranging than this website. It refers to documents not mentioned in the current article. In this long-time dispute, the participants present compelling arguments for and against the alleged authorship, based mainly on linguistic, religious, and philosophical outlooks. An analysis and review of these essential issues are not the present paper's subjects because we focus on formal algorithmic methods designed to evaluate the manuscript's authorship.
As in the previous case in Figure 3, we start from an example of a digital signal representation of pieces, given in Figure 6.
This document is significantly shorter (just 78 pieces) than the one mentioned above; consequently, the graph appears more sparse. However, the dominance of scores larger than 0.5 is undoubtedly visible. A chart of the average mean score (blue line) over the trials demonstrates the same tendency in Figure 7. The red line, as previously, corresponds to the moving average smoothing line with lag equal to 7. The resultant histograms also exhibit a left-side tail, the left asymmetry of a distribution around its mean (Figure 8).
The quantities of the scores lying below 0.5 are 36% and 20%, respectively. The general conclusion is that most of the text of Mishakat al-Anwar is not composed in the inherent Al Ghazali writing style.
As remarked earlier, we strive to propose a new perspective on the discussed problem, and the suggested approach is fundamentally different from those commonly accepted. At the same time, one case study, in our opinion, deserves discussion.
As stated in a paper by Watt [18], "Most of the problems formulated by Gairdner are connected with the last section of the Mishkat, the detailed interpretation of the Tradition about the Seventy (or Seventy Thousand) Veils (which for convenience I shall call the "Veils-section")." The article [26] of Gairdner is referred to here. Watt continues, "If the above investigations have not overlooked some crucial point, there is no avoiding the conclusion that the Veils-section of Mishkat al-Anwar is a forgery." This statement agrees with the results obtained here, where the book's smoothed profile (marked in red in Figure 7) is mostly located above the line y = 0.5 in the last part of the chart. The same holds for most of the manuscript, leading us to conclude that it is not written in the inherent Al Ghazali style. On the one hand, this shows that the obtained results do not contradict the widely accepted opinions; on the other hand, our results generalize them, indicating that the considered book's overall style differs from the inherent one ascribed to Al Ghazali.

Conclusions and Discussion
This paper suggests a new approach to the problem of the authenticity of the manuscripts attributed to Al Ghazali. Consideration of the short patterns appearing in the text body makes it possible to unfold a new perspective on the faithfulness and forgery of the studied documents. Combined with a deep learning technique originally used to analyze tweets, the proposed methodology examines medieval Arabic texts possessing high inner inhomogeneity. Applied to the previously tagged text collection, the method exhibits reliable results and makes it possible to offer a novel text representation in the signal fashion.
The absence of ground truth is an essential but inherent research limitation, although one generally accepted attitude is confirmed by this study, as presented in the last part of the experimental section. Merging the newly proposed method with traditional means can lead to novel inferences for the discussed problem.
Funding: This research received no external funding.