Article

An Interpretable Fuzzy Framework for Data-to-Text Generation Using Linguistic Contexts and Computational Perceptions: A Case Study on Photovoltaic Stations

by Roberto G. Aragón 1, Fernando Chacón-Gómez 1, Jesús Medina 1 and Clemente Rubio-Manzano 2,*

1 Department of Mathematics, University of Cádiz, CP 11510 Puerto Real, Spain
2 Department of Information Systems, University of the Bío-Bío, Concepción CP 3800708, Chile
* Author to whom correspondence should be addressed.
AI 2026, 7(3), 103; https://doi.org/10.3390/ai7030103
Submission received: 1 January 2026 / Revised: 30 January 2026 / Accepted: 2 March 2026 / Published: 10 March 2026

Abstract

Textual and visual representations of data play a key role in data science and artificial intelligence by supporting effective and user-friendly communication. Among existing approaches, automatic data-to-text generation aims to produce natural language descriptions from structured data sources. This paper presents an interpretable fuzzy framework for data-to-text generation based on linguistic contexts and computational perception networks evaluated through formal concept analysis. The proposed framework is organized into four main stages: (i) transforming numerical data sets into linguistic contexts, (ii) generating computational perceptions from linguistic contexts, (iii) building computational perception networks to automatically generate natural language summaries, and (iv) validating the generated texts through comparison with summaries obtained using formal concept analysis–based baselines. To the best of our knowledge, this is the first work to address the generation of linguistic summaries through an interpretable process that transforms data into linguistic contexts and subsequently into computational perceptions. Another key difference from previous work lies in the verification, by means of a formal method, of the linguistic summaries generated through these computational perceptions. A software prototype was implemented and evaluated using real photovoltaic station data provided by a local energy operator in Puerto Real (Cádiz, Spain). Experimental results show that the proposed fuzzy framework improves the interpretability and consistency of the generated summaries when compared with other approaches, demonstrating its potential for explainable and user-centered data-to-text generation.

1. Introduction

Artificial Intelligence (AI) and the latest developments in Natural Language Processing (NLP) are enabling the creation of data analysis tools that can extract valuable and easily comprehensible information. This presents numerous challenges that cannot be overcome using traditional statistical methods and visual representations alone. A key aspect of NLP is Natural Language Generation (NLG), which enables the creation of text from a variety of sources, including numerical and textual data.
In NLG, data-to-text (D2T) systems [1,2] focus on the automatic transformation of non-linguistic structured data into coherent, understandable, and contextually appropriate text. These systems complement traditional data visualization by producing summaries, reports, and textual descriptions that facilitate the interpretation of large volumes of information in fields as diverse as weather forecasting, finance, healthcare, sports analysis, and e-commerce. Their operation is typically structured in two main stages: (i) the extraction and selection of relevant information during the data analysis phase, and (ii) the formulation of this information in natural language during the generation phase.
The literature establishes two main lines of research [2,3]. On one hand, there are rule-based methods, which primarily rely on expert knowledge and domain-specific rules to select relevant data. These methods are notable for their use of clustering and categorization techniques, as well as their robustness and transparency. However, their main limitation is scalability to new domains. On the other hand, there are trainable methods (statistical/machine learning) that do not require domain experts and are delivered as ready-made systems. These methods are characterized by a high dependence on data, requiring large datasets to achieve good results, but they offer greater flexibility in expanding to new domains.
Within the first line of work, several fuzzy logic-based approaches have been proposed to generate linguistic data descriptions (LDDs) and summaries based on linguistic terms. Zadeh introduced the paradigm known as Computing with Words and Perceptions [4,5], which involves performing calculations using linguistic terms represented by fuzzy sets. Building on this concept, numerous studies have successfully summarized numerical variables and their values using linguistic terms. For instance, the concept of computational perception (both first and second order) has been utilized to create networks for linguistically describing complex phenomena in diverse fields, known as computational perception networks (CPNs) [6]. Some works also utilize the general concept of protoform and the more specific notion of fuzzy quantified sentences with various structures (e.g., “At some points, the temperature is high”) [7].
While LDDs may not have the same depth of expression as real texts, they can still be valuable information components for NLG systems, especially for D2T systems [8,9]. There have been numerous proposals in this area, including an NLG system proposed in [10] that uses LDD to create energy-saving reports for home use. Another example is the D2T system described in [11] for meteorological applications, which is built upon traditional NLG approaches [12] and uses LDD to extract linguistic data for verbalization in two languages.
On the other hand, Formal Concept Analysis (FCA) is a mathematical approach used to represent knowledge and extract valuable information from data [13]. FCA has been extensively researched from both theoretical and practical perspectives. A significant advancement in FCA is the incorporation of fuzzy logic, which enables the handling of uncertainty, imprecise data, and incomplete information, making it a crucial area of study [14,15,16].
This paper introduces a novel fuzzy formal approach for data-to-text generation that utilizes linguistic contexts to integrate CPNs and FCA techniques. By combining these approaches, D2T systems can be designed, implemented, and evaluated in four stages: (i) converting data sets into linguistic contexts, (ii) generating computational perceptions from linguistic contexts, (iii) creating networks of computational perceptions to generate natural language summaries of the data, and (iv) comparing the summaries obtained in the previous phase with those generated using FCA to evaluate the proper functioning of the generation process carried out by the computational perception network. The key contributions of this work include:
  • We introduce the concept of a computational perceptions network based on formal linguistic contexts, providing a structured and interpretable representation for data-to-text generation.
  • We establish a formal correspondence between formal concept analysis and automatic linguistic descriptions of complex phenomena.
  • We propose a method to transform numerical datasets into linguistic contexts, enabling the modeling of first-order perceptions using linguistic variables.
  • We define novel mechanisms to derive second-order computational perceptions by aggregating first-order perceptions, and third-order computational perceptions by aggregating second-order ones, supporting hierarchical perception modeling.
  • We integrate the proposed model into a software prototype and demonstrate its applicability through a real-world case study for photovoltaic generation facilities, formally analyzing data and generating linguistic descriptions.
Although, as will be shown later, the developed proposal is applicable to any time-series-based problem, in this work, it is validated using data from photovoltaic power plants. This choice is due to both the availability of information derived from collaboration with a local energy provider and the global relevance of the sector. According to the latest report from the International Renewable Energy Agency (IRENA) [17], solar photovoltaic technology has established itself as one of the fastest-growing segments within the global energy system. In 2024, installed capacity exceeded 1.8 TW, enough to power hundreds of millions of homes, positioning it as a key pillar in the transition to low-carbon energy systems. Solar photovoltaics account for more than three-quarters of the new renewable capacity added globally and represent the majority of recent clean energy growth, underscoring their dominant role in the decarbonization of electricity generation. Furthermore, photovoltaic systems were responsible for a substantial increase in electricity production during 2024, reaching an estimated contribution of nearly 10% of global electricity generation, a significant milestone for any energy technology.
The rest of the paper is organized as follows: Section 2 presents the preliminary concepts necessary for understanding this work. In particular, it reviews the foundations of linguistic data description, fuzzy formal concept analysis, and a method for transforming data sets into linguistic contexts. Section 3 introduces a formal method that allows the transformation of precise data sets into linguistic fuzzy contexts using linguistic variables dynamically created from those data. Section 4 develops the computational perception network from the obtained fuzzy contexts. Then, Section 5 describes a granular model based on the linguistic description of complex phenomena, which utilizes linguistic contexts to represent first-, second-, and third-order computational perceptions. This model enables the construction of computational perception networks aimed at generating data summaries. The previous developments are complemented with the use of FCA in Section 6. This formal theory is used to develop more granular linguistic descriptions of the data sets. Finally, some conclusions and prospects for future work are given.

2. Preliminary Concepts

2.1. Automatic Linguistic Description of Complex Phenomena

The Computational Theory of Perceptions (CTP) [9] is founded on the observation that human reasoning relies heavily on perceptions and on the ability to organize information into meaningful granules. This cognitive mechanism allows people to carry out both physical and intellectual activities without resorting to precise numerical measurements or conventional computation. Building upon this theoretical framework, Automatic Linguistic Description of Complex Phenomena (LDCP) seeks to capture and convey knowledge through natural language. Its objective is to generate textual reports that emulate those produced by human specialists, emphasizing the most significant features of a phenomenon according to the needs of particular users and the context in which the description is required.

2.1.1. Computational Perception

A computational perception (CP) is a pair of elements $(A, W)$ described as follows:
  • $A = \{a_1, a_2, \ldots, a_n\}$ denotes the collection of linguistic statements that define the linguistic space of the complex phenomenon. Each element $a_i$ corresponds to a potential linguistic characterization of the phenomenon. These statements may range from concise expressions, such as $a_1$ = “the consumption is low” or $a_2$ = “the consumption is medium”, to more elaborate formulations, for instance $a_1$ = “I am concerned that your energy consumption efficiency has declined over the past semester” and $a_2$ = “Well done, your energy consumption efficiency has improved over the past semester”.
  • $W = \{w_1, w_2, \ldots, w_n\}$ represents the set of confidence values associated with the linguistic statements in $A$, where each $w_i$ lies in the interval $[0, 1]$. These values quantify the degree to which a given statement is considered appropriate, combining aspects of its contextual relevance and its degree of truth.
As an illustrative case, consider a cooking scenario in which the system designer specifies a collection of admissible linguistic statements $A$. Examples of such statements include $a_1$ = “Be careful, the oven temperature is too high” and $a_2$ = “The oven temperature remains low”. The interpretation of these expressions is inherently context-dependent, as their meaning may vary according to the particular recipe being followed. Suppose that the oven thermometer indicates a temperature of 100 degrees Celsius; in this situation, and assuming a bread-baking context, suitable validity degrees might be assigned as $w_1 = 0$ and $w_2 = 0.8$. In general, a complex phenomenon is represented through a set of linguistic labels that constitute a strong fuzzy partition, which implies that the associated validity degrees satisfy the normalization condition $\sum_i w_i = 1$.

2.1.2. Perception Mapping (PM)

We use PMs to create new CPs by aggregating existing CPs. A PM is a tuple $(U, y, g, T)$ where:
  • $U = (u_1, u_2, \ldots, u_n)$ is a set of input CPs, where $u_i = (A_i, W_i)$. In the special case of a first-order perception mapping (1PM), $U$ is a variable defined in the input data domain, e.g., the value provided by a thermometer.
  • $y$ is the output CP, $y = (A_y, W_y) = \{(a_1, w_1), (a_2, w_2), \ldots, (a_{n_y}, w_{n_y})\}$.
  • $g$ is the aggregation function $W_y = g(W_{u_1}, W_{u_2}, \ldots, W_{u_n})$, where $W_y$ is a vector $(w_1, w_2, \ldots, w_{n_y})$ of degrees of validity assigned to each element in $y$ and $W_{u_i}$ are the degrees of validity of the input perceptions. In fuzzy logic, many different types of aggregation functions have been developed. In the case of a 1PM, $g$ consists of applying a set of membership functions $\{\mu_{a_1}, \mu_{a_2}, \ldots, \mu_{a_{n_y}}\}$ to an input value $z$, obtaining the vector $W_y = (\mu_{a_1}(z), \mu_{a_2}(z), \ldots, \mu_{a_{n_y}}(z)) = (w_1, w_2, \ldots, w_{n_y})$. Hence, $W_y$ is the vector of degrees of validity assigned to each element of $A_y$ and $z$ is the input data.
  • $T$ is a text generation algorithm that allows generating the linguistic expressions in $A_y$. T has associated a figure and uses the input data to choose the most suitable clauses to describe the current state of the monitored phenomenon. In simple cases, $T$ can be implemented using a linguistic template, e.g., “The temperature in the room is [high|medium|low]”.
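As a minimal sketch of how a first-order perception mapping could be coded, the following Python fragment applies a set of membership functions to an input value $z$ (the role of $g$) and fills a template with the best-rated label (a very simple $T$). The names `CP`, `trapezoid`, `first_order_pm`, and `describe`, as well as the temperature breakpoints in `LABELS`, are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CP:
    """A computational perception: statements A with validity degrees W."""
    statements: List[str]
    weights: List[float]


def trapezoid(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(z: float) -> float:
        if z < a or z > d:
            return 0.0
        if b <= z <= c:
            return 1.0
        if z < b:
            return (z - a) / (b - a)
        return (d - z) / (d - c)
    return mu


# Hypothetical labels for a room temperature in degrees Celsius.
LABELS: Dict[str, Callable[[float], float]] = {
    "low": trapezoid(0, 0, 15, 20),
    "medium": trapezoid(15, 20, 24, 28),
    "high": trapezoid(24, 28, 40, 40),
}


def first_order_pm(z: float, labels: Dict[str, Callable[[float], float]],
                   template: str) -> CP:
    """1PM: g applies every membership function to the raw input z."""
    statements = [template.format(name) for name in labels]
    weights = [mu(z) for mu in labels.values()]
    return CP(statements, weights)


def describe(cp: CP) -> str:
    """Simplest possible T: return the statement with the highest validity."""
    best = max(range(len(cp.weights)), key=cp.weights.__getitem__)
    return cp.statements[best]


cp = first_order_pm(27.0, LABELS, "The temperature in the room is {}")
print(cp.weights)    # validity degrees for low, medium, high
print(describe(cp))  # statement chosen by the template-based T
```

Choosing the statement with the highest validity degree is only one possible policy for $T$; a fuller implementation could also report two labels when the input falls in a transition zone.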

2.1.3. Computational Perception Network

A Computational Perception Network can be understood as an interconnected structure composed of perception mappings (PMs). Each PM processes a collection of input computational perceptions (CPs) and produces a new CP that is propagated to higher levels of the network. In this sense, every output CP is interpreted as being generated—or explained—by its corresponding PM on the basis of its input CPs. Across the network, different CPs capture complementary facets of the underlying phenomenon, each at a particular level of abstraction or granularity.
Perception mappings that directly receive information from the environment are referred to as first-order perception mappings (1PMs), and their resulting outputs are termed first-order computational perceptions (1CPs). When a PM takes 1CPs as inputs, it is classified as a second-order perception mapping (2PM), and its outputs are correspondingly denoted as second-order computational perceptions (2CPs). This hierarchical organization is conceptually motivated by Popper’s three-world framework: world 1, comprising physical phenomena; world 2, consisting of perceived entities represented here by 1CPs; and world 3, which encompasses abstract mental constructs formed from world 2 objects and represented by 2CPs [18].
Figure 1 shows an example of a computational perception network that generates several 2CPs using data obtained from sensors. Also, we can see several examples of clauses that describe linguistically the current state of the phenomenon at different degrees of granularity. Using different aggregation functions and different linguistic expressions, the GLMP paradigm allows the designer to model computationally his/her perceptions of complex phenomena.

2.2. Formal Concept Analysis

In this article, we adopt a fuzzy framework for Formal Concept Analysis (FCA) grounded in a residuated lattice, known in the literature as the one-sided framework [19,20]. This approach provides a well-defined algebraic structure for modeling fuzzy relationships between objects and attributes, allowing for the representation of membership degrees through residuated operators that guarantee semantic coherence and desirable formal properties, such as the existence of implication and conjunction operators related by adjunction.
Although this framework was originally introduced as an independent alternative to the classical fuzzy FCA approach based on residuated implications [14], it has subsequently been shown to be a particular case of the multi-adjoint framework, which generalizes different residuated logical systems within a single unified formulation. This connection places our approach within a broader algebraic context and strengthens its theoretical rigor by allowing us to interpret fuzzy relationships in terms of well-established logical structures.
From a practical standpoint, this framework enables us to model fuzzy formal contexts in which the relationship between objects and attributes is not binary but rather graded. This is particularly suitable for domains with uncertainty, variability, or noise in the data (as occurs in visual representations or computational perceptions). Furthermore, the use of residuated operators facilitates the definition of fuzzy formal concepts and ensures that the set of all such concepts forms a complete lattice, thereby preserving the fundamental properties of classical FCA in a graded environment.
To simplify the notation and focus on the structural aspects of the proposed method, this work assumes that all attributes share the same residuated lattice. This assumption does not restrict the generality of the approach; rather, it allows us to present the definitions and results more clearly, avoiding the notational overload associated with the use of multiple adjunctions.
Definition 1
(Residuated Lattice). A residuated lattice is an algebraic structure $(L, \preceq, \otimes, \rightarrow, 0, 1)$ such that:
  • $(L, \preceq)$ is a complete lattice, where 0 and 1 denote the bottom and top elements, respectively.
  • $(L, \otimes, 1)$ is a commutative monoid.
  • The pair $(\otimes, \rightarrow)$ satisfies the adjointness (residuation) property; that is, for all $x, y, z \in L$,
    $x \otimes y \preceq z$ if and only if $x \preceq y \rightarrow z$.
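The adjointness property can be checked mechanically on a finite chain. The sketch below is an illustration rather than part of the paper: it tests the condition for two standard pairs on the unit interval, the Łukasiewicz pair and the Gödel pair, over the grid $\{0, 1/4, 1/2, 3/4, 1\}$. The function names and the choice of grid are assumptions, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import product

# Finite chain 0 < 1/4 < 1/2 < 3/4 < 1 used as the carrier L (an assumption).
GRID = [Fraction(k, 4) for k in range(5)]


def luk_conj(x, y):
    """Lukasiewicz conjunction: max(0, x + y - 1)."""
    return max(Fraction(0), x + y - 1)


def luk_impl(y, z):
    """Lukasiewicz residuated implication: min(1, 1 - y + z)."""
    return min(Fraction(1), 1 - y + z)


def godel_conj(x, y):
    """Goedel conjunction: min(x, y)."""
    return min(x, y)


def godel_impl(y, z):
    """Goedel residuated implication: 1 if y <= z, otherwise z."""
    return Fraction(1) if y <= z else z


def is_adjoint_pair(conj, impl, grid=GRID):
    """Check that conj(x, y) <= z holds exactly when x <= impl(y, z)."""
    return all(
        (conj(x, y) <= z) == (x <= impl(y, z))
        for x, y, z in product(grid, repeat=3)
    )


print(is_adjoint_pair(luk_conj, luk_impl))      # matching pair
print(is_adjoint_pair(godel_conj, godel_impl))  # matching pair
print(is_adjoint_pair(luk_conj, godel_impl))    # mismatched pair
```

Mixing the conjunction of one pair with the implication of another breaks the equivalence, which is why the two operators must be fixed together.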
On this structure the following fuzzy extension of FCA arises.
Definition 2
(Formal Context). A formal context is a tuple $(A, B, R)$ such that $A$ and $B$ are non-empty sets, and $R$ is an $L$-fuzzy relation $R : A \times B \to L$. Let $2^B$ be the powerset of $B$ and $L^A$ the set of fuzzy subsets of $A$. The concept-forming operators ${}^{\uparrow} : 2^B \to L^A$ and ${}^{\downarrow} : L^A \to 2^B$ are defined for each $X \in 2^B$, $f \in L^A$ and $a \in A$ as follows:
$$X^{\uparrow}(a) = \bigwedge_{x \in X} R(a, x)$$
$$f^{\downarrow} = \{x \in B \mid \text{for all } a \in A,\ f(a) \preceq R(a, x)\}$$
which form a Galois connection [20].
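Definition 3.
A one-sided formal concept is a pair $\langle X, f \rangle$ satisfying that $X \subseteq B$, $f \in L^A$ and that $X^{\uparrow} = f$ and $f^{\downarrow} = X$. A one-sided concept lattice is the set
$$\mathcal{C}(A, B, R) = \{\langle X, f \rangle \mid X \subseteq B,\ f \in L^A \text{ and } X^{\uparrow} = f,\ f^{\downarrow} = X\}$$
in which the ordering is defined by $\langle X_1, f_1 \rangle \preceq \langle X_2, f_2 \rangle$ if and only if $X_1 \preceq_2 X_2$ (equivalently $f_2 \preceq_1 f_1$), where $\preceq_2$ and $\preceq_1$ are the natural inclusion orders defined in $2^B$ and $L^A$, respectively.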
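To make the concept-forming operators concrete, the following sketch enumerates the one-sided concepts of a tiny, made-up context by brute force. It assumes $L = [0, 1]$ with the usual order, indexes the relation as (attribute, object) in line with $R : A \times B \to L$, and takes the infimum over an empty set of objects to be the top element 1. The context values and the function names `up`, `down`, and `concepts` are hypothetical.

```python
from itertools import combinations

ATTRIBUTES = ["a1", "a2"]
OBJECTS = ["b1", "b2", "b3"]

# Hypothetical fuzzy relation R(attribute, object) with values in [0, 1].
R = {
    ("a1", "b1"): 1.0, ("a1", "b2"): 0.5, ("a1", "b3"): 0.0,
    ("a2", "b1"): 0.5, ("a2", "b2"): 0.5, ("a2", "b3"): 1.0,
}


def up(X):
    """Intent of a crisp object set X: the infimum of R(a, x) over x in X."""
    return tuple(min((R[a, x] for x in X), default=1.0) for a in ATTRIBUTES)


def down(f):
    """Extent of a fuzzy attribute set f: objects x with f(a) <= R(a, x) for all a."""
    return frozenset(
        x for x in OBJECTS
        if all(fa <= R[a, x] for a, fa in zip(ATTRIBUTES, f))
    )


def concepts():
    """Map each closed extent to its intent by closing every subset of objects."""
    found = {}
    for size in range(len(OBJECTS) + 1):
        for X in combinations(OBJECTS, size):
            f = up(X)
            found[down(f)] = f
    return found


for extent, intent in sorted(concepts().items(), key=lambda kv: len(kv[0])):
    print(sorted(extent), intent)
```

Enumerating all subsets is exponential in the number of objects, so this only serves to illustrate the definitions on small contexts.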

2.3. Transforming Datasets into Residuated Linguistic Contexts

Let $(L, \preceq, \otimes, \rightarrow, 0, 1)$ be a residuated lattice. Throughout this paper, we fix a set of objects $B$ and a set of attributes $A$. The information contained in the dataset is interpreted as a relation between objects and attributes, given by $R : A \times B \to [0, 1]$. Furthermore, each attribute in $A$ is associated with a linguistic variable [21].
Definition 4
(Linguistic Variable). A linguistic variable is characterized by the tuple $(X, T(X), U, G, M)$, which formalizes both its syntactic and semantic components. The symbol $X$ designates the variable itself, while $T(X)$ represents the set of linguistic terms used to express its possible qualitative values. These terms are defined over a universe of discourse $U$, which specifies the domain in which the variable takes meaning. The formation of the linguistic terms in $T(X)$ is governed by a generative rule $G$, whereas their interpretation is provided by a semantic mapping $M$. This mapping assigns to each term a fuzzy subset of $U$, thereby capturing its meaning through an associated membership function.
For each attribute $a \in A$, we consider an associated linguistic variable represented by the tuple $(a, T(a), U, G, M)$. Instead of working directly with the raw attribute set $A$, we construct an expanded attribute space that captures these linguistic interpretations. Specifically, we define a transformed set of attributes $A^*$ consisting of all linguistic terms associated with every attribute in $A$. Formally, we have $A^* = \{t_a \in T(a) \mid a \in A\}$, where each element $t_a$ represents the linguistic term $t$ applied to the original attribute $a$. In this way, the attribute space is lifted from a purely descriptive level to a semantically structured one, enabling the formal context to represent graded, human-interpretable properties rather than only raw measurements. This transformation is essential for bridging numerical data and conceptual structures, as it provides the semantic layer upon which the subsequent formal concept analysis is built.
Next, a new relation is defined between the set of objects $B$ and the set of linguistic attributes $A^*$, given by $R^* : A^* \times B \to [0, 1]$, which is established as $R^*(t_a, b) = \mu_{t_a}(R(a, b))$ for all $t_a \in A^*$ (where $a \in A$ and $t_a \in T(a)$) and $b \in B$. Here, $\mu_{t_a}$ denotes the membership function associated with the linguistic term $t_a$ corresponding to the attribute $a \in A$. In this way, the fuzzy formal context $(A^*, B, R^*)$ is obtained, which we will refer to as the linguistic context. The following example illustrates the procedure described for transforming a fragment of a dataset into a linguistic context.
Example 1.
Consider a portion of a database containing records about individuals, such as their names, age measured in years, height expressed in meters, and body weight in kilograms. These data are organized in the relational table displayed on the left in Figure 2. The table shown on the right in Figure 2 focuses exclusively on the values associated with the first attribute of the original table, isolating that information from the remaining attributes.
Once the sets $A$ and $B$, together with the relation $R$, have been specified, we restrict the analysis (without loss of generality) to a single attribute, namely Age, which is treated as a linguistic variable. Its associated set of linguistic terms is assumed to be $T(\text{Age}) = \{\text{Young}, \text{Middle aged}, \text{Old}\}$. Under this restriction, since only one attribute is considered, the derived set of terms satisfies:
$$A^* = \{t \mid t \in T(a),\ a \in A\} = \{\text{Young}, \text{Middle aged}, \text{Old}\} = T(\text{Age})$$
Now, the relation $R^* : A^* \times B \to [0, 1]$ is defined as $R^*(t_a, b) = \mu_{t_a}(R(a, b))$, for all $t_a \in A^*$ and $b \in B$, which is displayed in Figure 3.
From the membership functions $\mu_{\text{young}}$, $\mu_{\text{middle}}$, and $\mu_{\text{old}}$ associated with the trapezoidal membership mappings $(20, 20, 30, 45)$, $(30, 45, 50, 70)$, $(50, 70, 80, 80)$, respectively (depicted on the left side of Figure 4), we obtain the numerical values of the relation $R^*$, which are shown on the right side of Figure 4.
An analogous procedure can be conducted with the attributes Height and Weight.
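The construction of $R^*$ for the attribute Age can be sketched in a few lines of Python. The trapezoid parameters below are those stated in Example 1; the ages in `AGES` are hypothetical placeholders, since the values of Figure 2 are not reproduced in the text, and the helper names are assumptions. Values outside the interval $[20, 80]$ receive degree 0 for every term under this particular trapezoid definition.

```python
def trapezoid(a, b, c, d):
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(z):
        if z < a or z > d:
            return 0.0
        if b <= z <= c:
            return 1.0
        if z < b:
            return (z - a) / (b - a)
        return (d - z) / (d - c)
    return mu


# Linguistic terms of the variable Age, with the parameters given in Example 1.
TERMS = {
    "Young": trapezoid(20, 20, 30, 45),
    "Middle aged": trapezoid(30, 45, 50, 70),
    "Old": trapezoid(50, 70, 80, 80),
}

# Hypothetical raw relation R(Age, b): object identifier -> age in years.
AGES = {"p1": 25, "p2": 36, "p3": 60}


def linguistic_context(raw, terms):
    """Build R*(t, b) = mu_t(R(a, b)) for a single attribute a."""
    return {
        (term, obj): mu(value)
        for obj, value in raw.items()
        for term, mu in terms.items()
    }


context = linguistic_context(AGES, TERMS)
for obj in AGES:
    print(obj, {term: round(context[term, obj], 3) for term in TERMS})
```

For these parameters the three degrees of each object add up to 1 whenever the age lies in $[20, 80]$, which reflects the strong fuzzy partition mentioned in Section 2.1.1.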

3. Transforming Data into Linguistic Contexts Automatically

This section presents a novel method for generating linguistic formal contexts from a solar energy production dataset in order to represent the correct behavior of a solar station. For illustrative purposes, consider a photovoltaic facility comprising one station with a set of inverters. To develop an algorithm for automatically generating linguistic variables, it is necessary to consider the following preconditions:
  • The station is constituted by a set of inverters $\{I_1, I_2, \ldots, I_N\}$. In this work, we will manage a station with $N = 9$.
  • Each month is formed by a set of days $\{D_1, \ldots, D_s\}$, where $s \in \{1, \ldots, 31\}$ depending on the month and year under consideration.
  • For each day $D_j$, with $j \in \{1, \ldots, s\}$, we have a list of hours $H_k$, with $k \in \{0, \ldots, 23\}$, for which we obtain the record of the energy production of the inverter $I_i$, with $i \in \{1, \ldots, N\}$, denoted by $P(I_i, D_j, H_k)$.
The aforementioned data regarding energy production is presented in tabular form in Table 1. This table can be transformed into a linguistic formal context, as previously described in Section 2.3. However, the energy production in a photovoltaic facility is dependent on both the hour of the day and the month. For example, there are significant differences in the energy generated in July and January, as well as at 9 a.m. and at 12 p.m. In general, ultraviolet radiation is at its strongest in the hours around noon and is significantly reduced in the early morning and late afternoon.
It is thus necessary to define a linguistic variable for each hour $H_k$. This variable will be determined by the production in the remaining days of the considered month for that precise hour. In order to define a set of linguistic variables associated with each inverter $I_i$, it is necessary to collate the energy production data for different days in the form of a production table (see Table 1). Then, from this table obtained for an inverter $I_i$, we calculate the average, maximum and minimum production for each hour across the days of a month; that is, for a fixed month we compute for the inverter $I_i$ the variables $\text{Average}(I_i, H_k)$, $\text{Max}(I_i, H_k)$, $\text{Min}(I_i, H_k)$, for all $k \in \{0, \ldots, 23\}$, as follows:
  • $\text{Average}(I_i, H_k) = \dfrac{\sum_{j=1}^{s} P(I_i, D_j, H_k)}{s}$
  • $\text{Max}(I_i, H_k) = \max\{P(I_i, D_j, H_k) \mid j \in \{1, \ldots, s\}\}$
  • $\text{Min}(I_i, H_k) = \min\{P(I_i, D_j, H_k) \mid j \in \{1, \ldots, s\}\}$
Then, we extend the information in Table 1 as shown in Table 2.
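The three hourly statistics can be computed with a few lines of Python. In the sketch below, `production[j][k]` stands for $P(I_i, D_j, H_k)$ for a single inverter; the small 3-day, 3-hour table and the function name `hourly_statistics` are made up for illustration and are not taken from the paper's data.

```python
from statistics import mean


def hourly_statistics(production):
    """Return average, max and min per hour for one inverter.

    production[j][k] is the energy produced on day j at hour k.
    """
    hours = range(len(production[0]))
    stats = []
    for k in hours:
        column = [day[k] for day in production]  # same hour across all days
        stats.append({
            "average": mean(column),
            "max": max(column),
            "min": min(column),
        })
    return stats


# Hypothetical production table: 3 days (rows) by 3 hours (columns).
PRODUCTION = [
    [0.0, 2.0, 4.0],
    [0.0, 3.0, 5.0],
    [0.0, 4.0, 3.0],
]

for k, row in enumerate(hourly_statistics(PRODUCTION)):
    print(k, row)
```

A real table would have 24 columns and one row per day of the month, but the column-wise computation is the same.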
Therefore, for each inverter we have a list of the energy generation for each hour of those days together with the variables previously computed. These variables are employed in the calculation of the linguistic variables associated with each hour in each inverter, which will represent the behavior of the inverter in that hour and will be denoted as $LV(I_i, H_k)$. Furthermore, each linguistic variable is constituted by a set of linguistic terms, represented by $\{LT^{1}, LT^{2}, \ldots, LT^{l}\}_{(I_i, H_k)}$. The same set of three different linguistic terms ($l = 3$) will be considered for all linguistic variables, thus yielding the following correspondences: $1 = \text{low}$, $2 = \text{medium}$ and $3 = \text{high}$.
The determination of the linguistic variable for an inverter $I_i$ operating at the hour $H_k$ is described in Algorithm 1. This is expressed in terms of the variables $\text{Average}(I_i, H_k)$, $\text{Max}(I_i, H_k)$ and $\text{Min}(I_i, H_k)$. The range of the interval has been divided into 8 parts, but the user can select another division. We have selected 8 parts in order to represent that the usual state is medium (like a normal distribution in statistics). Hence, the full low state is the first eighth, the full high state is the last eighth, the full medium state covers the two central eighths, and each transition between adjacent states spans two of the remaining eighths. This is represented by the membership functions given in Step 5 of the algorithm.
Hence, Algorithm 1 builds a linguistic table for an inverter $I_i$ in a single day $D_j$, denoted by $T_{I_i, D_j}$, as represented in Table 3. In fact, the linguistic table will comprise the degrees of membership of each energy value produced for each fuzzy set associated with the linguistic terms.
A linguistic context can be created for each inverter by combining the linguistic table (Table 3) obtained for an inverter $I_i$ with the original table (Table 1). This results in a linguistic context defined by $(A^*_{I_i}, B, R^*_{I_i})$, where each hour $H_k$ is associated with a linguistic variable $LV(I_i, H_k)$. In this linguistic formal context, the set of attributes $A^*_{I_i}$ will be constituted by the values low, medium, and high. The set of objects $B$ will consist of each day and the respective hours within that day. Ultimately, the relation $R^*_{I_i}$ will be calculated using the linguistic variable associated with each inverter and the hour. The following example illustrates this procedure.
Algorithm 1: Automatic computation of the linguistic variable $LV(I_i, H_k)$ from $\text{Average}(I_i, H_k)$, $\text{Max}(I_i, H_k)$ and $\text{Min}(I_i, H_k)$.
Algorithm 1 calculates the linguistic variables for each hour using a statistics table that includes the minimum, maximum, and average values for each hour. This statistics table (see Table 2) is derived from the hours-per-day table (see Table 1), wherein the hours are grouped by day. The table is computed using an input dataset that contains the hours and energy production for each day.
[Algorithm 1 listing: provided as an image in the original article]
Example 2.
Let us consider the inverter $I_1$ on 1 August 2022 at 15:00, whose energy production for that day and hour was $5.2$; that is, fixing the month of August, we have $P(I_1, D_1, H_{15}) = 5.2$ (see Table A1 in Appendix A for the remaining values of the energy generated by each inverter). Now, we can collate the information about energy production in the same month of the previous year (this dataset can be extended with data from earlier years, depending on the availability of and confidence in the data) and compute the variables $\text{Min}(I_1, H_{15})$, $\text{Max}(I_1, H_{15})$ and $\text{Average}(I_1, H_{15})$ in order to apply Algorithm 1 and obtain the linguistic variable $LV(I_1, H_{15})$. From the original data, we obtain the following values:
  • $\text{Min}(I_1, H_{15}) = 1.2$
  • $\text{Max}(I_1, H_{15}) = 5.5$
  • $\text{Average}(I_1, H_{15}) = 4.848$
Then, within the procedure of Algorithm 1, the membership functions are calculated from these values, that is, for the set of linguistic terms that is considered, $\{LT^{\text{low}}, LT^{\text{medium}}, LT^{\text{high}}\}_{(I_1, H_{15})}$, we define the following trapezoidal functions:
  • $\mu_{LT^{\text{low}}_{(I_1, H_{15})}} = (1.2, 1.2, 1.737, 2.812)$
  • $\mu_{LT^{\text{medium}}_{(I_1, H_{15})}} = (1.737, 2.812, 3.887, 4.962)$
  • $\mu_{LT^{\text{high}}_{(I_1, H_{15})}} = (3.887, 4.962, 5.5, 5.5)$
Thus, we obtain the linguistic variable as follows:
$$LV(I_1, H_{15}) = \{\mu_{LT^{\text{low}}_{(I_1, H_{15})}},\ \mu_{LT^{\text{medium}}_{(I_1, H_{15})}},\ \mu_{LT^{\text{high}}_{(I_1, H_{15})}}\}$$
The linguistic variable $LV(I_1, H_{15})$ is represented in Figure 5, where it is possible to ascertain the degrees of membership of the value $5.2$, which is considered to be high with degree $1.0$ and has degree $0.0$ for the remaining linguistic terms.
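The numbers of Example 2 can be reproduced with a short sketch. Since Algorithm 1 is only available as an image, the breakpoints below are inferred from the description of the eight-part division and from the trapezoids listed in the example: with step $(\text{Max} - \text{Min})/8$, the breakpoints fall at $\text{Min} + k \cdot \text{step}$ for $k \in \{1, 3, 5, 7\}$. Under this reading only Min and Max are needed, so Average is not used here; the original algorithm may use it differently. The function names are assumptions.

```python
def trapezoid(a, b, c, d):
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(z):
        if z < a or z > d:
            return 0.0
        if b <= z <= c:
            return 1.0
        if z < b:
            return (z - a) / (b - a)
        return (d - z) / (d - c)
    return mu


def linguistic_variable(minimum, maximum):
    """Trapezoid parameters for low/medium/high from an eight-part split."""
    step = (maximum - minimum) / 8
    # Breakpoints inferred from Example 2: min + k * step for k = 1, 3, 5, 7.
    q1, q3, q5, q7 = (minimum + k * step for k in (1, 3, 5, 7))
    return {
        "low": (minimum, minimum, q1, q3),
        "medium": (q1, q3, q5, q7),
        "high": (q5, q7, maximum, maximum),
    }


def degrees(value, variable):
    """Membership degree of a production value in each linguistic term."""
    return {term: trapezoid(*params)(value) for term, params in variable.items()}


lv = linguistic_variable(1.2, 5.5)  # Min and Max of Example 2
for term, params in lv.items():
    print(term, tuple(round(p, 4) for p in params))
print(degrees(5.2, lv))  # production of inverter I_1 at hour 15
```

The untruncated breakpoints are 1.7375, 2.8125, 3.8875 and 4.9625, which agree with the three-decimal values listed in the example. Hours with no production variation (Min equal to Max) collapse all three trapezoids to a single point and would need separate handling.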
This process is then repeated for each production value recorded at each hour of that date, using in each case the linguistic variable associated with the corresponding hour. The result is the linguistic table shown in Table 4.

4. Computational Perception Network Based on Linguistic Contexts

This section continues the previous developments, taking advantage of the notion of computational perception. First of all, given an inverter of a photovoltaic facility and a day, we will define a 1CP for each hour from a linguistic context as shown in Table 3. Then, for each day, we will aggregate the 1CPs of all inverters at the same hour in order to obtain two 2CPs for each hour. These 2CPs will provide a description of the global performance of the facility under consideration in terms of the performance of each inverter from different perspectives. At the end of this section, we will present a 3CP to summarize the performance of the facility over a period of time.
We begin by introducing the 1CPs. Given an inverter $I_i$ of a facility $F$, a day $D_j$ and an hour $H_k$, and denoting the production of this inverter at this time by $P(I_i, D_j, H_k)$, we define a first-order computational perception $1CP_{(I_i, D_j, H_k)} = (A_{(I_i, D_j, H_k)}, W_{(I_i, D_j, H_k)})$, where:
$$A_{(I_i, D_j, H_k)} = \{\text{low}, \text{medium}, \text{high}\}_{(I_i, D_j, H_k)}$$
$$W_{(I_i, D_j, H_k)} = \{w^{\text{low}}, w^{\text{medium}}, w^{\text{high}}\}_{(I_i, D_j, H_k)} = \left\{ \mu_{LT^{\text{low}}_{(I_i, H_k)}}(P(I_i, D_j, H_k)),\ \mu_{LT^{\text{medium}}_{(I_i, H_k)}}(P(I_i, D_j, H_k)),\ \mu_{LT^{\text{high}}_{(I_i, H_k)}}(P(I_i, D_j, H_k)) \right\}$$
These 1CPs are computed for each inverter of the facility $F$ under study and are used to determine the global performance of $F$ at that time. For this purpose, we define two 2CPs, $2CP^{1}_{(F, D_j, H_k)} = (A^{1}_{(F, D_j, H_k)}, W^{1}_{(F, D_j, H_k)})$ and $2CP^{2}_{(F, D_j, H_k)} = (A^{2}_{(F, D_j, H_k)}, W^{2}_{(F, D_j, H_k)})$, by means of an aggregation operator $@$ as
$$\left( (A^{1}_{(F, D_j, H_k)}, W^{1}_{(F, D_j, H_k)}),\ (A^{2}_{(F, D_j, H_k)}, W^{2}_{(F, D_j, H_k)}) \right) = @\left( 1CP_{(I_1, D_j, H_k)}, \dots, 1CP_{(I_N, D_j, H_k)} \right)$$
with:
$$A^{1}_{(F, D_j, H_k)} = \{\text{irregular}\}_{(F, D_j, H_k)}, \qquad W^{1}_{(F, D_j, H_k)} = \{w^{\text{irregular}}\}_{(F, D_j, H_k)}$$
$$A^{2}_{(F, D_j, H_k)} = \{\text{bad}, \text{normal}, \text{excellent}\}_{(F, D_j, H_k)}, \qquad W^{2}_{(F, D_j, H_k)} = \{w^{\text{bad}}, w^{\text{normal}}, w^{\text{excellent}}\}_{(F, D_j, H_k)}$$
The previous linguistic expressions are divided into two groups according to their meaning. On the one hand, the linguistic expression “irregular” analyzes the difference in performance among the inverters of the facility under study in order to show its stability. On the other hand, “bad”, “normal” and “excellent” are related to the amount of energy produced by the facility. The membership function $\mu^{a}_{H_k}$ of each linguistic expression $a$ in $A^{1}$ or $A^{2}$ determines the corresponding validity degree $w^{a}$ in $W^{1}$ or $W^{2}$; these membership functions are represented in Figure 6. Notice that we have considered the same mappings for each hour $H_k$ in order to simplify the notation and the examples, although the system allows different mappings per hour, depending on the behavior of the inverters and on the needs of the study.
These membership functions have been defined on the assumption that the usual performance of a facility is “normal”. As a result, a sufficiently high production is required for the performance to be considered “excellent”. Moreover, “irregular” performance is obtained when there is a considerable disparity between the performances of the inverters. The validity degree of “irregular” is computed from the standard deviation of the performances of the inverters in the facility. Since these performances are expressed by values in the unit interval, and the standard deviation of values in this interval is at most 0.5, the domain of the membership function of “irregular” is $[0, 0.5]$.
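The bound of 0.5 can be justified in one line. The following derivation is a sketch that assumes the population form of the standard deviation (divisor $N$) for values $x_1, \dots, x_N \in [0, 1]$ with mean $m$, and it uses the fact that $x_i^2 \le x_i$ on the unit interval:

```latex
\sigma^2 \;=\; \frac{1}{N}\sum_{i=1}^{N} x_i^2 - m^2
\;\le\; \frac{1}{N}\sum_{i=1}^{N} x_i - m^2
\;=\; m(1-m) \;\le\; \frac{1}{4}
\qquad\Longrightarrow\qquad \sigma \le \frac{1}{2}
```

The bound is attained when half of the values are 0 and the other half are 1, which corresponds to the most uneven facility possible.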
The process followed to obtain the 2CPs is detailed in Algorithms 2 and 3.
Now, we explain Algorithms 2 and 3 in detail. First of all, step 1 of Algorithm 2 defines the two linguistic variables corresponding to the two 2CPs to be returned. The first linguistic variable has only the term “irregular”, and it analyzes the difference in performance among the inverters of the facility under study. The second linguistic variable is composed of the terms “bad”, “normal” and “excellent”, and it is related to the amount of energy produced by the facility. Next, step 2 computes the performance of each inverter by means of a weighted average of the validity degrees computed from the membership functions defined in Algorithm 1. Notice that $w^{\text{low}}_{(I_i, D_j, H_k)}, w^{\text{medium}}_{(I_i, D_j, H_k)}, w^{\text{high}}_{(I_i, D_j, H_k)} \in [0, 1]$ are the degrees to which the performance of $I_i$ aligns with the linguistic expressions “low”, “medium” and “high”, respectively. Since $\mathrm{WAVERAGE}(I_i, D_j, H_k)$ represents the global performance of $I_i$, high values correspond to a good performance. Therefore, “low” performances are not taken into account in the computation of $\mathrm{WAVERAGE}(I_i, D_j, H_k)$, while “high” performances have twice the impact of “medium” performances. In step 3, the mean and standard deviation of the previous averages are computed in order to aggregate them and summarize the performance of the facility under consideration. Both measures are considered because, together, they describe the global level and the unevenness of the performance of the inverters in the facility. Step 4 translates the previous summary into linguistic terms by computing the validity degree of each linguistic expression of the two linguistic variables. The standard deviation is used to describe “irregular” performance, while the mean is used to describe “bad”, “normal” and “excellent” behaviors by applying Algorithm 3.
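Steps 2 and 3 can be sketched as follows. This is an illustrative reading of the prose description, not the listing of Algorithm 2. The function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption. The input uses the validity degrees reported for the nine inverters in Example 3.

```python
from statistics import fmean, pstdev

def waverage(w_low, w_medium, w_high):
    """Step 2: 'low' is ignored; 'high' weighs twice as much as 'medium'."""
    return (w_medium + 2.0 * w_high) / 2.0

def aggregate_hour(first_order_degrees):
    """Steps 2-3: per-inverter scores, then their mean and standard deviation.

    The population form of the standard deviation (divisor N) is an assumption.
    """
    scores = [waverage(*degrees) for degrees in first_order_degrees]
    return scores, fmean(scores), pstdev(scores)

# Validity degrees (low, medium, high) of inverters I1..I9 in Example 3
example = [(0, 0, 1), (0, 0.6, 0.4), (1, 0, 0)] + [(0, 0, 1)] * 6
scores, mean, deviation = aggregate_hour(example)
print(round(mean, 3), round(deviation, 3))
```

Under these assumptions the mean and deviation round to 0.856 and 0.317, in line with the values 0.855 and 0.316 reported in Example 3, which appear to be truncated rather than rounded.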
It should be remarked that if the degree of membership of the standard deviation to the linguistic expression “irregular” is lower than 0.3, then its validity degree is defined as 0. The reason is that the performance is not considered unstable enough to warn the user. Consequently, the membership function of the linguistic expression “irregular” given in Figure 6 is transformed into the one given in Figure 7.
Algorithm 2: Aggregation of first-order computational perceptions. This algorithm (which calls Algorithm 3) aggregates the first-order computational perceptions of all the inverters of the facility for each hour. Note that two types of 2CP are used: 2CP irregular on the one hand and 2CP (bad, normal, excellent) on the other.
[Algorithm 2 listing, image not reproduced]
Algorithm 3: Compute list of validity degrees from standard deviation. This algorithm is part of Algorithm 2 and is used to compute the second-order computational perceptions from the standard deviation and the mean.
[Algorithm 3 listing, image not reproduced]
As a result, it is necessary to have a standard deviation greater than or equal to 0.26 to report a significant level of irregularity in the performance of the facility. Finally, step 5 provides the two 2CPs whose validity degrees have been computed in Algorithm 3.
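The effect of the 0.3 cutoff can be illustrated with a short sketch. The exact shape of the “irregular” membership function is only given in Figure 6. The ramp used here, rising linearly from 0 at a deviation of 0.2 to 1 at 0.4, is an assumption chosen because it is consistent with the two values stated in the text (a degree of 0.3 at 0.26, and a degree of 0.58 at 0.316 in Example 3). The function names are hypothetical.

```python
def irregular_raw(deviation, start=0.2, full=0.4):
    """Assumed ramp: 0 up to `start`, 1 from `full` onwards, linear in between."""
    if deviation <= start:
        return 0.0
    if deviation >= full:
        return 1.0
    return (deviation - start) / (full - start)

def irregular_validity(deviation, cutoff=0.3):
    """Degrees below the cutoff are reported as 0, as described in the text."""
    degree = irregular_raw(deviation)
    return degree if degree >= cutoff else 0.0

for deviation in (0.25, 0.27, 0.316, 0.45):
    print(deviation, round(irregular_validity(deviation), 3))
```

With this assumed ramp, a deviation of 0.25 is silenced by the cutoff, whereas 0.316 keeps its degree of about 0.58.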
The overall performance of the facility $F$ throughout a day $D_j$ is determined by a 3CP. For this purpose, we take into account the longest range of hours $\{k_1, \dots, k_K\}$ in which $F$ has had a significant energy production. This allows us to discard hours with low production, which may have a negative impact on the results of the 3CP. For instance, during the summer months, we consider the time interval from 09:00 h to 21:00 h. We define the 3CP as
$$3CP_{(F, D_j)} = (A_{(F, D_j)}, W_{(F, D_j)}) = @\left( 2CP^{2}_{(F, D_j, H_{k_1})}, \dots, 2CP^{2}_{(F, D_j, H_{k_K})} \right)$$
with:
$$A_{(F, D_j)} = \{\text{not well}, \text{well}, \text{very well}\}_{(F, D_j)}, \qquad W_{(F, D_j)} = \{w^{\text{not well}}, w^{\text{well}}, w^{\text{very well}}\}_{(F, D_j)}$$
The aggregation operator is defined in terms of the “excellent” and “bad” performances given by each $2CP^{2}_{(F, D_j, H_{k_m})}$, with $m \in \{1, \dots, K\}$. Specifically, we consider the average of the validity degrees of the “excellent” and “bad” linguistic expressions over the corresponding day, that is:
$$p^{T}_{(F, D_j)} = \frac{\sum_{m=1}^{K} w^{T}_{(F, D_j, H_{k_m})}}{K}$$
with $T \in \{\text{bad}, \text{excellent}\}$. The global performance of the facility is based on the difference between these rates:
$$Q = p^{\text{excellent}}_{(F, D_j)} - p^{\text{bad}}_{(F, D_j)}$$
We consider that “normal” represents the standard performance of the facility. Therefore, the “excellent” and “bad” behaviors of the facility are compared in order to determine whether its performance is good.
Subsequently, $Q$ is evaluated in the membership functions $\mu_{\text{not well}}$, $\mu_{\text{well}}$ and $\mu_{\text{very well}}$ in order to obtain each validity degree of the 3CP. Notice that $Q \in [-1, 1]$, since $p^{\text{excellent}}_{(F, D_j)}, p^{\text{bad}}_{(F, D_j)} \in [0, 1]$. Hence, the domain of the membership functions $\mu_{\text{not well}}$, $\mu_{\text{well}}$ and $\mu_{\text{very well}}$ is $[-1, 1]$. Figure 8 shows the trapezoidal representation of these membership functions.
As a consequence, the performance is considered “very well” when excellent results are more frequent than bad ones. Similar conclusions can be drawn for the linguistic expressions “well” and “not well”. The implementation of the 3CP is depicted in Algorithm 4.
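The computation of $Q$ can be sketched as follows. The function name `daily_balance` and the dictionary layout are illustrative assumptions. In the sample data, only the first hour uses values reported in the paper (hour 21:00 of Example 3), while the other two hours are made up for the illustration. The final mapping of $Q$ to “not well”, “well” and “very well” is omitted because the breakpoints of Figure 8 are not stated in the text.

```python
def daily_balance(second_order_cps):
    """Average the 'excellent' and 'bad' degrees over the hours and subtract.

    second_order_cps: one dict per hour with 'bad', 'normal' and 'excellent'.
    Returns (p_excellent, p_bad, Q).
    """
    k = len(second_order_cps)
    p_excellent = sum(cp["excellent"] for cp in second_order_cps) / k
    p_bad = sum(cp["bad"] for cp in second_order_cps) / k
    return p_excellent, p_bad, p_excellent - p_bad

hours = [
    {"bad": 0.0, "normal": 0.148, "excellent": 0.851},  # 21:00 in Example 3
    {"bad": 1.0, "normal": 0.0, "excellent": 0.0},      # hypothetical hour
    {"bad": 0.0, "normal": 0.0, "excellent": 1.0},      # hypothetical hour
]
p_excellent, p_bad, q = daily_balance(hours)
print(round(p_excellent, 3), round(p_bad, 3), round(q, 3))
```

Because both rates lie in the unit interval, the returned difference always lies in $[-1, 1]$, which matches the domain of the membership functions of the 3CP.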
The following example illustrates Algorithms 2 and 3 to determine the performance of a photovoltaic facility at a specific hour according to the performance of each inverter. Then, the global performance of the facility in the same day is obtained from Algorithm 4.
Example 3.
The photovoltaic facility $F$ under study has nine inverters $I_i$ with $i \in \{1, \dots, 9\}$. We analyze the day 05-August-2022 and the hour $H_{21}$ = 21:00 h. As explained above, we define a 1CP for each of these inverters. Focusing on the inverter $I_1$, its production at that time was 0.7 kWh. Taking into account the linguistic variable $LV_{(I_1, H_{21})}$, which was obtained from Algorithm 1, the first-order computational perception $1CP_{(I_1, D_5, H_{21})} = (A_{(I_1, D_5, H_{21})}, W_{(I_1, D_5, H_{21})})$ is given as follows:
$$A_{(I_1, D_5, H_{21})} = \{\text{low}, \text{medium}, \text{high}\}_{(I_1, D_5, H_{21})}$$
$$W_{(I_1, D_5, H_{21})} = \{w^{\text{low}}, w^{\text{medium}}, w^{\text{high}}\}_{(I_1, D_5, H_{21})} = \left\{ \mu_{LT^{\text{low}}_{(I_1, H_{21})}}(P(I_1, D_5, H_{21})),\ \mu_{LT^{\text{medium}}_{(I_1, H_{21})}}(P(I_1, D_5, H_{21})),\ \mu_{LT^{\text{high}}_{(I_1, H_{21})}}(P(I_1, D_5, H_{21})) \right\}$$
$$= \left\{ \mu_{LT^{\text{low}}_{(I_1, H_{21})}}(0.7),\ \mu_{LT^{\text{medium}}_{(I_1, H_{21})}}(0.7),\ \mu_{LT^{\text{high}}_{(I_1, H_{21})}}(0.7) \right\} = \{0, 0, 1\}$$
Algorithm 4: Aggregation of second-order computational perceptions. This algorithm aggregates the second-order perceptions into a final computational perception that determines the facility’s performance for each day. The validity degrees of “excellent” and of “bad” over the hours of each day are averaged separately, and the difference between them is calculated. This difference ($Q$) determines the station’s performance.
[Algorithm 4 listing, image not reproduced]
Hence, the energy production of I 1 is high with respect to the production of this inverter in the given hour H 21 during the month of August with total certainty. The rest of 1CPs are computed analogously, and all the results are shown next:
$$1CP_{(I_i, D_5, H_{21})} = \left( \{\text{low}, \text{medium}, \text{high}\}_{(I_i, D_5, H_{21})},\ \{0, 0, 1\} \right) \quad \text{for all } i \in \{1, 4, 5, 6, 7, 8, 9\}$$
$$1CP_{(I_2, D_5, H_{21})} = \left( \{\text{low}, \text{medium}, \text{high}\}_{(I_2, D_5, H_{21})},\ \{0, 0.6, 0.4\} \right)$$
$$1CP_{(I_3, D_5, H_{21})} = \left( \{\text{low}, \text{medium}, \text{high}\}_{(I_3, D_5, H_{21})},\ \{1, 0, 0\} \right)$$
Then, most of the inverters work fine, although I 3 has very low performance. As a result, we can anticipate that the facility does not have considerable problems of production at that time, but it needs a revision due to the performance of I 3 . Next, the previous 1CPs are aggregated to obtain the two 2CPs which describe the performance of the facility F.
$$2CP^{1}_{(F, D_5, H_{21})} = @\left( 1CP_{(I_1, D_5, H_{21})}, \dots, 1CP_{(I_9, D_5, H_{21})} \right) = \left( \{\text{irregular}\}_{(F, D_5, H_{21})},\ \{w^{\text{irregular}}\}_{(F, D_5, H_{21})} \right)$$
$$2CP^{2}_{(F, D_5, H_{21})} = @\left( 1CP_{(I_1, D_5, H_{21})}, \dots, 1CP_{(I_9, D_5, H_{21})} \right) = \left( \{\text{bad}, \text{normal}, \text{excellent}\}_{(F, D_5, H_{21})},\ \{w^{\text{bad}}, w^{\text{normal}}, w^{\text{excellent}}\}_{(F, D_5, H_{21})} \right)$$
Now, we compute $\{w^{\text{irregular}}\}_{(F, D_5, H_{21})}$ and $\{w^{\text{bad}}, w^{\text{normal}}, w^{\text{excellent}}\}_{(F, D_5, H_{21})}$ by applying Algorithms 2 and 3. First of all, we compute $\mathrm{WAVERAGE}(I_i, D_5, H_{21})$ for all $i \in \{1, \dots, 9\}$ according to step 2 of Algorithm 2. For instance:
$$\mathrm{WAVERAGE}(I_1, D_5, H_{21}) = \frac{w^{\text{medium}}_{(I_1, D_5, H_{21})} + 2 \cdot w^{\text{high}}_{(I_1, D_5, H_{21})}}{2} = \frac{0 + 2 \cdot 1}{2} = 1$$
All the obtained results are shown next:
$$\mathrm{WAVERAGE}(I_i, D_5, H_{21}) = 1 \quad \text{for all } i \in \{1, 4, 5, 6, 7, 8, 9\}$$
$$\mathrm{WAVERAGE}(I_2, D_5, H_{21}) = 0.7, \qquad \mathrm{WAVERAGE}(I_3, D_5, H_{21}) = 0$$
Next, we compute MEAN and DEVIATION of $\mathrm{WAVERAGE}(F, D_5, H_{21}) = \{1, 0.7, 0, 1, 1, 1, 1, 1, 1\}$, obtaining that:
$$\mathrm{MEAN}(\mathrm{WAVERAGE}(F, D_5, H_{21})) = 0.855, \qquad \mathrm{DEVIATION}(\mathrm{WAVERAGE}(F, D_5, H_{21})) = 0.316$$
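For reference, the two statistics can be worked out by hand from the nine values. The derivation below assumes the population form of the standard deviation (divisor 9), which reproduces the reported deviation to the stated precision. The sample form (divisor 8) would instead give approximately 0.336.

```latex
\mathrm{MEAN} = \frac{7 \cdot 1 + 0.7 + 0}{9} = \frac{7.7}{9} \approx 0.8556
\\[4pt]
\sigma^2 = \frac{7 \cdot 1^2 + 0.7^2 + 0^2}{9} - \mathrm{MEAN}^2
         = \frac{7.49}{9} - 0.8556^2 \approx 0.8322 - 0.7320 = 0.1002
\\[4pt]
\mathrm{DEVIATION} = \sqrt{\sigma^2} \approx \sqrt{0.1002} \approx 0.3166
```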
Therefore, we have obtained a high mean and a relatively high standard deviation, so we can expect a great but unstable performance of F. Next, we compute the validity degree of each linguistic expression { irregular } ( F , D 5 , H 21 ) and { bad , normal , excellent } ( F , D 5 , H 21 ) by step 4 of Algorithm 2 in terms of the linguistic variables defined in step 1. We show the obtained validity degrees in Figure 9.
By evaluating these membership functions in the corresponding mean and deviation, we obtain the following results:
$$w^{\text{irregular}}_{(F, D_5, H_{21})} = \mu^{\text{irregular}}_{H_{21}}(0.316) = 0.58$$
$$w^{\text{bad}}_{(F, D_5, H_{21})} = \mu^{\text{bad}}_{H_{21}}(0.855) = 0$$
$$w^{\text{normal}}_{(F, D_5, H_{21})} = \mu^{\text{normal}}_{H_{21}}(0.855) = 0.148$$
$$w^{\text{excellent}}_{(F, D_5, H_{21})} = \mu^{\text{excellent}}_{H_{21}}(0.855) = 0.851$$
On the one hand, we deduce that the facility $F$ had an irregular performance at 21:00 on 05-August-2022 with medium confidence, since $w^{\text{irregular}}_{(F, D_5, H_{21})} = 0.58$. On the other hand, it had a suitable performance in terms of energy production: with $\mathrm{MEAN}(\mathrm{WAVERAGE}(F, D_5, H_{21})) = 0.8555$ and according to the linguistic variables defined in step 1 of Algorithm 2, the performance is classified as between “normal” and “excellent”. Figure 10 shows the results obtained for other hours of the same day, where columns 2–10 contain the first-order perceptions and columns 11 and 12 contain their aggregations, 2CP irregular and 2CP (bad, normal, excellent), respectively.
Finally, we want to draw a general conclusion about the performance of the facility $F$ on day $D_5$ by means of the 3CP. To this end, we consider the hours from 9:00 h to 21:00 h, so that $K = 13$ and $\{k_1, \dots, k_{13}\} = \{9, \dots, 21\}$ (see Table 5). Applying Algorithms 2 and 3 to each of these hours yields the corresponding validity degrees.
By averaging these quantities, we obtain that:
$$p^{\text{excellent}}_{(F, D_5)} = \frac{\sum_{m=1}^{13} w^{\text{excellent}}_{(F, D_5, H_{k_m})}}{13} = 0.688$$
$$p^{\text{bad}}_{(F, D_5)} = \frac{\sum_{m=1}^{13} w^{\text{bad}}_{(F, D_5, H_{k_m})}}{13} = 0.051$$
$$p^{\text{excellent}}_{(F, D_5)} - p^{\text{bad}}_{(F, D_5)} = 0.688 - 0.051 = 0.637$$
Consequently, the excellent performances clearly outweigh the bad ones, so we can anticipate that the facility behaved very well during $D_5$. This is confirmed by the validity degrees of the linguistic expressions “not well”, “well” and “very well”, which are:
$$w^{\text{not well}}_{(F, D_5)} = \mu_{\text{not well}}(0.637) = 0, \qquad w^{\text{well}}_{(F, D_5)} = \mu_{\text{well}}(0.637) = 0, \qquad w^{\text{very well}}_{(F, D_5)} = \mu_{\text{very well}}(0.637) = 1$$
Figure 11 shows the corresponding validity degrees.
Notice that, although the performance is irregular at some hours, the production of energy has mainly been excellent, and the information given by the 3CP reflects this feature. However, it is also essential to report this irregular performance when the linguistic descriptions are prepared. This aspect is analyzed in the next section.

5. Linguistic Descriptions

In order to provide a comprehensive overview of the entire system, we calculate summaries of the instant CPs by inverter and by facility. The process adds up the values associated with each instant CP over the data capture period, employing the concept of fuzzy cardinality. These are designated as Summary CPs (denoted by $\Sigma CP_{I_i}$). The term $\Sigma CP_{I_i}$ is employed to generate a set of linguistic descriptions for a specific inverter $I_i$. It should be noted that the quality of these descriptions could be enhanced by the use of linguistic summaries [7] to summarize the information obtained in the Summary CPs.
In this case, each sentence in the linguistic summary has the form “$q$ $s$ were $v$”, where $q$ is a linguistic label (none, few, several, about half, many, most of, all) of a quantifier $Q$ on $[0, 1]$, $s$ represents the timespan (hours or days), and $v$ is a linguistic expression for a given CP. As an illustration, the following summaries can be derived from the aforementioned CP: “Many hours were low; few hours were high.”
Therefore, we can generate linguistic descriptions by employing rates similar to those given in Equation (3) for each linguistic term defined in the 1CP, the 2CPs and the 3CP, respectively. Recall that the 1CP and the 2CPs are obtained for each hour. In order to allow flexible studies, we consider a total number of hours $R$ during the period of time $P$ under analysis. Specifically, we have $J$ different days $D_{j_n}$, with $n \in \{1, \dots, J\}$, and $K_n$ hours $H_{k_m}$, with $m \in \{1, \dots, K_n\}$, which depend on the day $D_{j_n}$ under consideration. As a consequence, the rates for the 1CP and the 2CPs are defined in the following way:
$$p^{T_1}_{(I_i, P)} = \frac{\sum_{n=1}^{J} \sum_{m=1}^{K_n} w^{T_1}_{(I_i, D_{j_n}, H_{k_m})}}{R} \quad \text{with } T_1 \in \{\text{low}, \text{medium}, \text{high}\}$$
$$p^{T_2}_{(F, P)} = \frac{\sum_{n=1}^{J} \sum_{m=1}^{K_n} w^{T_2}_{(F, D_{j_n}, H_{k_m})}}{R} \quad \text{with } T_2 \in \{\text{irregular}, \text{bad}, \text{normal}, \text{excellent}\}$$
With respect to the 3CP, since it is defined for each day, the period $P$ consists of a set of $J$ days $\{D_{j_1}, \dots, D_{j_J}\}$. Then, the rate is defined as:
$$p^{T_3}_{(F, P)} = \frac{\sum_{n=1}^{J} w^{T_3}_{(F, D_{j_n})}}{J} \quad \text{with } T_3 \in \{\text{not well}, \text{well}, \text{very well}\}$$
We propose mapping these rates onto linguistic labels representing quantity. In this paper, the following criteria are adopted for all previously defined rates p:
  • NONE when $p = 0$.
  • FEW when $p \in (0, 0.2]$.
  • SEVERAL when $p \in (0.2, 0.4]$.
  • ABOUT HALF when $p \in (0.4, 0.6]$.
  • MANY when $p \in (0.6, 0.8]$.
  • THE MOST OF when $p \in (0.8, 1)$.
  • ALL when $p = 1$.
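The criteria above translate directly into a small lookup. The following sketch is illustrative (the function name `quantifier` is not from the paper) and maps a rate in $[0, 1]$ to its quantity label, using the half-open intervals listed above.

```python
def quantifier(p):
    """Map a rate p in [0, 1] to the quantity label defined by the criteria."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    if p == 0.0:
        return "NONE"
    if p == 1.0:
        return "ALL"
    # Upper bounds are inclusive: (0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8]
    for upper, label in ((0.2, "FEW"), (0.4, "SEVERAL"),
                         (0.6, "ABOUT HALF"), (0.8, "MANY")):
        if p <= upper:
            return label
    return "THE MOST OF"

# Rates reported for the facility in August 2022 (2CPs over 462 hours)
for name, total in (("irregular", 57.359), ("bad", 87.090),
                    ("normal", 166.044), ("excellent", 209.154)):
    print(name, quantifier(total / 462))
```

Applied to the totals reported in Example 4, this lookup yields the same labels as the descriptions listed there.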
The next example analyzes the station under consideration in the period $P$ = August 2022 by applying the proposed criteria.
Example 4.
The total number of hours in this period is R = 462 , and the number of days is J = 31 . The following linguistic descriptions are generated for the 1CP and I 1 :
  • FEW hours were low,
  • SEVERAL hours were medium,
  • ABOUT HALF hours were high
The linguistic descriptions of the rest of inverters are obtained analogously (see Table 6).
Concerning 2CPs, the global description of the facility is provided next:
  • $p^{\text{irregular}}_{(F, P)} = 57.359 / 462 = 0.124$: FEW hours were irregular
  • $p^{\text{bad}}_{(F, P)} = 87.090 / 462 = 0.188$: FEW hours were bad
  • $p^{\text{normal}}_{(F, P)} = 166.044 / 462 = 0.359$: SEVERAL hours were normal
  • $p^{\text{excellent}}_{(F, P)} = 209.154 / 462 = 0.452$: ABOUT HALF hours were excellent
Finally, for the 3CP we have the following values and the associated descriptions:
  • $p^{\text{not well}}_{(F, P)} = 4.663 / 31 = 0.150$: FEW days were not well
  • $p^{\text{well}}_{(F, P)} = 3.083 / 31 = 0.099$: FEW days were well
  • $p^{\text{very well}}_{(F, P)} = 23.25 / 31 = 0.750$: MANY days were very well
As a result, we can state that this photovoltaic facility, although its performance is irregular at some hours, behaves very well in this period of time with regard to energy production. Figure 12 shows a complete computational perception network, with the 1CP, 2CP and 3CP and their corresponding summaries.
We have created two report templates, selected according to the value of $p^{\text{irregular}}_{(F, P)}$, in order to provide users with the most relevant details about the generation of the different solar inverters. Clearly, more reports can be designed to give further details that users may require; here, we consider only these two kinds of reports for the sake of brevity. In the proposed reports, each $v_{S_i}$ is an element of the array of strings $S_i$ listed below.
$$S_1 = \{\text{irregular}\}, \qquad S_2 = \{\text{bad}, \text{normal}, \text{excellent}\}, \qquad S_3 = \{\text{not well}, \text{well}, \text{very well}\}$$
Figure 13 shows a possible report template for the Analysis of Solar Photovoltaic Facility when an irregular behavior is detected. Note that $v^{1}_{S_2}$ is the linguistic term with the highest weight obtained from 2CP2, and $v^{2}_{S_2}$ is the linguistic term with the second-highest weight obtained from 2CP2.
Figure 14 presents a template and an example for the case in which the facility is regular. In this form, $v_{S_3}$ is the linguistic term with the highest weight obtained from the 3CP.

6. Linguistic Descriptions of Data via Fuzzy Formal Concepts Generated by Computational Perceptions

In this section, we apply FCA to extract information from the dataset and generate the corresponding computational perceptions. When the concept lattice is finite, it can be guaranteed that each concept is the infimum of the meet-irreducible concepts that are larger than it [22]. Consequently, these particular concepts constitute a generating system for the entire concept lattice, and from them, it is possible to obtain a base. Therefore, the information provided by a context can be synthesized using its meet-irreducible concepts. In this work, we focus on this type of concept, which is analyzed and used to construct linguistic descriptions. The study and calculation of irreducible elements within the framework of FCA have been addressed in several previous works [23]. Readers interested in exploring this notion further in the context of fuzzy concept networks may consult those references.
In this paper, we consider the one-sided approach [20,24], which assumes that the subsets of objects are crisp sets. This is natural in our example because the objects are the hours of a day, so each of them should be either present (with truth value 1) or absent (with truth value 0) in the extent of a concept, and it does not make sense to consider a degree between 0 and 1. Notice also that the context $(A, B, R)$ is finite; specifically, $A = \{a_1, \dots, a_l\}$ and $B = \{b_1, \dots, b_m\}$, where we consider a granularity of 10 for both the objects and the relation, that is, the truth values belong to the unit interval divided into 10 pieces. Moreover, we use the Gödel t-norm for all the underlying operations to compute concepts.
From this context, a new method for the automatic generation of linguistic descriptions from the set of irreducible concepts is presented. Specifically, for each meet-irreducible concept [22], $IC = \langle \{b_1/\beta_1, \dots, b_m/\beta_m\}, \{a_1/\alpha_1, \dots, a_l/\alpha_l\} \rangle$ with $\alpha_i \in [0, 1]$ and $\beta_j \in \{0, 1\}$, for all $i \in \{1, \dots, l\}$ and $j \in \{1, \dots, m\}$, two distinct linguistic descriptions are generated: one for the extent and one for the intent. The union of both descriptions provides a complete set of linguistic descriptions.
The meet-irreducible concept is generated from an attribute $a_i$ with a truth value $y_i \in [0, 1]$, with $i \in \{1, \dots, l\}$; that is, we start from the fuzzy subset $\{a_1/0, \dots, a_i/y_i, \dots, a_l/0\}$. Hence, the extent represents the objects satisfying the attribute $a_i$ with the truth value $y_i$, and the intent satisfies $y_i \le \alpha_i$, while other truth values may also be different from zero. The inclusion of these additional attributes is caused by the attribute $a_i$ with a truth value between $y_i$ and $\alpha_i$. Thus, every meet-irreducible concept provides significant information about each attribute.
We use the support and the rate to generate linguistic descriptions that reflect the extent of an irreducible concept. The support $s$ is the number of objects in the extent, and the rate is the average of the truth values of the objects in the extent, that is, $p = (\beta_1 + \dots + \beta_m)/m$. We propose transforming the rate $p$ into a linguistic label of quantity, using the same criteria defined in Section 5.
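The generation of a concept from a single attribute and a truth value can be sketched as follows. This sketch assumes the standard derivation operators of the one-sided setting with crisp objects: the extent collects the objects whose degree reaches the threshold, and the intent takes, for each attribute, the minimum degree over the extent. The function names and the four-hour toy context are hypothetical and are not data from the paper.

```python
def extent_of(context, attribute, threshold):
    """Objects whose degree for `attribute` is at least `threshold` (crisp extent)."""
    return [obj for obj, row in context.items() if row[attribute] >= threshold]

def intent_of(context, extent, attributes):
    """For each attribute, the largest degree shared by every object in the extent."""
    return {a: (min(context[obj][a] for obj in extent) if extent else 1.0)
            for a in attributes}

def support_and_rate(context, extent):
    """Support = number of objects in the extent; rate = support / all objects."""
    return len(extent), len(extent) / len(context)

# Hypothetical linguistic table for four hours of one inverter
attributes = ("low", "medium", "high")
context = {
    "h1": {"low": 0.0, "medium": 0.0, "high": 1.0},
    "h2": {"low": 0.0, "medium": 0.5, "high": 0.5},
    "h3": {"low": 1.0, "medium": 0.0, "high": 0.0},
    "h4": {"low": 0.0, "medium": 0.2, "high": 0.8},
}

extent = extent_of(context, "high", 0.5)
intent = intent_of(context, extent, attributes)
support, rate = support_and_rate(context, extent)
print(extent, intent, support, rate)
```

In this toy context, starting from high/0.5 gives an extent of three hours, an intent in which only "high" is non-zero, a support of 3 and a rate of 0.75.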
Given that, in the intent $\{a_1/\alpha_1, \dots, a_l/\alpha_l\}$, each $\alpha_i$ represents the degree of membership with which the attribute $a_i$ is satisfied, where $i \in \{1, \dots, l\}$, it is necessary to design a set of linguistic labels for its interpretation. In this context, and considering both the granularity of the context attributes and the semantics associated with each one, we propose interpreting the values $\alpha_i$ using the following linguistic labels:
  • …were NOT $a_i$, when $\alpha_i = 0$;
  • …were ALMOST NOT $a_i$, when $\alpha_i \in (0, 0.2]$;
  • …were MORE OR LESS $a_i$, when $\alpha_i \in (0.2, 0.6]$;
  • …were RATHER $a_i$, when $\alpha_i \in (0.6, 1.0)$;
  • …were $a_i$, when $\alpha_i = 1.0$.
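Combining the quantity labels of Section 5 with the labels above, a sentence for a concept can be assembled as in the following sketch. The function names are illustrative. Joining several attributes with "AND", and passing only the non-zero attributes of the intent, are assumptions modelled on the descriptions reported in Section 6.3.

```python
def quantity_label(p):
    """Quantity label for a rate p in [0, 1] (criteria of Section 5)."""
    if p == 0.0:
        return "NONE"
    if p == 1.0:
        return "ALL"
    for upper, label in ((0.2, "FEW"), (0.4, "SEVERAL"),
                         (0.6, "ABOUT HALF"), (0.8, "MANY")):
        if p <= upper:
            return label
    return "THE MOST OF"

def hedge(alpha):
    """Label for a truth value alpha in [0, 1] of an attribute in the intent."""
    if alpha == 0.0:
        return "NOT"
    if alpha <= 0.2:
        return "ALMOST NOT"
    if alpha <= 0.6:
        return "MORE OR LESS"
    if alpha < 1.0:
        return "RATHER"
    return ""  # truth value 1.0: the attribute is stated without a hedge

def describe(intent, rate, unit="hours"):
    """Build 'QUANTITY unit were HEDGE attribute [AND HEDGE attribute ...]'."""
    parts = [f"{hedge(alpha)} {attribute}".strip()
             for attribute, alpha in intent.items()]
    return f"{quantity_label(rate)} {unit} were " + " AND ".join(parts)

print(describe({"normal": 0.1, "irregular": 0.7}, 47 / 462))
print(describe({"very well": 0.3}, 24 / 31, unit="days"))
```

The two calls use intents and supports reported later in the paper (concepts $C_{183}$ of Section 6.3 and $C_8$ of Section 6.4) and reproduce the sentences listed for them.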
It is worth noting that a greater number of linguistic labels can be defined, along with different intervals for each. An automated procedure for the optimal selection of labels and intervals will be presented later. The proposed method for the automatic generation of linguistic descriptions using FCA will then be illustrated, applied to each level of the computational perceptions calculated in the previous sections. From the information provided by each computational perception, a set of irreducible concepts will be obtained.

6.1. FCA from 1CP: Performance of the Inverter by Hours

We will focus on a single inverter, as each inverter within the facility is assigned a unique 1CP. Applying FCA to the 1CP will provide information about the behavior of the inverter during the period considered. In this case, the inverter I 5 is considered as well as the context ( A , B , R ) where A = { low , medium , high } , B is the set of hours each day during the month of August in the year 2022, and R is the relation given by the linguistic table generated using the linguistic variables for the inverter I 5 (for example, see Table 4).
The concept lattice associated with this context can be computed, together with its meet-irreducible concepts, which are depicted in color in Figure 15. Linguistic descriptions can now be generated from these meet-irreducible concepts to determine the behavior of the inverter during the period covered by the set of objects. The following format is employed: the number of the concept along with its intent, support and rate, followed by the linguistic description on the next line. Thus, we obtain, among others, the following descriptions (the complete list is given in Appendix B):
  • C 15 , { high / 0.2 } , s = 230.0 , p = 0.5
    ABOUT HALF hours were ALMOST NOT high
  • C 14 , { high / 0.5 } , s = 127.0 , p = 0.27608695652173915
    SEVERAL hours were MORE OR LESS high
  • C 6 , { low / 1.0 } , s = 88.0 , p = 0.19130434782608696
    FEW hours were low
  • C 7 , { medium / 1.0 } , s = 128.0 , p = 0.2782608695652174
    SEVERAL hours were medium
  • C 3 , { high / 1.0 } , s = 111.0 , p = 0.24130434782608695
    SEVERAL hours were high
It is evident that this set of linguistic descriptions provides finer-grained information than the linguistic information previously obtained as the output of the 1CP (see Table 6) about how the inverter performs over the period considered. The process in Section 5 is based on a stricter defuzzification method. In fact, we obtain similar linguistic descriptions, such as the ones generated by $C_6$, $C_7$ and $C_3$. On the other hand, we can analyze specific behaviors of the inverter over the month; for instance, the number of hours with high production with a truth value of 1 (given by $C_3$) is not very different from the number of hours with high production with a truth value of 0.5 (given by $C_{14}$). However, when the truth value 0.2 is considered in the concept $C_{15}$, the difference is significant. Therefore, we find that if the production is high with a representative truth value (e.g., at least 0.5), it is usually high with a truth value of 1.0. However, the differences with medium are greater.
Note that only one attribute appears in each intent of the meet-irreducible concepts. This is due to the independence of the attributes that arises from the semantics and definition of the membership functions of the linguistic variables. Therefore, the attributes low and high will never appear simultaneously in an intent with non-null truth values.

6.2. FCA from the 2CP2: Performance of the Facility by Hour

In this case, we generate a formal context $(A, B, R)$ whose relation is obtained from the output of the 2CP2 and where the set of linguistic expressions $\{\text{bad}, \text{normal}, \text{excellent}\}$ plays the role of the set of attributes. The set of objects is composed of the (day, hour) pairs, as in the previous example. We compute the concept lattice associated with this context (see Figure 16) and its meet-irreducible concepts to provide a range of linguistic descriptions. On this occasion, we analyze the hourly energy generation of the whole facility instead of that of a single inverter. Appendix C contains the linguistic descriptions generated by each of the meet-irreducible concepts. For example, we obtain that:
  • C 56 , { bad / 0.3 } , s = 130.0 , p = 0.2813852813852814
    SEVERAL hours were MORE OR LESS bad
  • C 33 , { normal / 0.3 } , s = 229.0 , p = 0.49567099567099565
    ABOUT HALF hours were MORE OR LESS normal
  • C 77 , { excellent / 0.3 } , s = 253.0 , p = 0.5476190476190477
    ABOUT HALF hours were MORE OR LESS excellent
  • C 64 , { bad / 0.5 } , s = 87.0 , p = 0.18831168831168832
    FEW hours were MORE OR LESS bad
  • C 30 , { normal / 0.5 } , s = 160.0 , p = 0.3463203463203463
    SEVERAL hours were MORE OR LESS normal
  • C 114 , { excellent / 0.5 } , s = 234.0 , p = 0.5064935064935064
    ABOUT HALF hours were MORE OR LESS excellent
  • C 20 , { bad / 1.0 } , s = 40.0 , p = 0.08658008658008658
    FEW hours were bad
  • C 19 , { normal / 1.0 } , s = 49.0 , p = 0.10606060606060606
    FEW hours were normal
  • C 13 , { excellent / 1.0 } , s = 100.0 , p = 0.21645021645021645
    SEVERAL hours were excellent
In this case, we also have a proper granularity to provide more detailed information on the energy production behavior of the facility in each hour during the selected period. It is important to note that, as was the case when analyzing the performance of an inverter, the attributes used here give rise to meet-irreducible concepts whose intent is composed of a single attribute. However, if the output of both 2CPs is used to generate the context, a more complex lattice will be obtained, as the attributes are more frequently interrelated through the attribute irregular. We will show this fact in the following section.

6.3. FCA from 2CPs: Combining the Information in Both 2CPs

Now, we can add the output of the 2CP1 computed in Section 4, that is, considering the set of attributes { bad , normal , excellent , irregular } . In this case, we find that the attributes bad , normal and excellent can be related through the attribute irregular, since the truth value for this attribute is obtained independently of the others. This fact is evidenced by the concept lattice obtained in this context depicted in Appendix E Figure A1. The linguistic descriptions obtained from each meet-irreducible concept are listed in Appendix D. The following are some of them:
  • C 183 , { normal / 0.1 , irregular / 0.7 } , s = 47.0 , p = 0.10173160173160173
    FEW hours were ALMOST NOT normal AND RATHER irregular
  • C 180 , { normal / 0.4 , irregular / 0.8 } , s = 29.0 , p = 0.06277056277056277
    FEW hours were MORE OR LESS normal AND RATHER irregular
  • C 179 , { normal / 0.4 , irregular / 0.9 } , s = 24.0 , p = 0.05194805194805195
    FEW hours were MORE OR LESS normal AND RATHER irregular
  • C 367 , { bad / 1.0 } , s = 40.0 , p = 0.08658008658008658
    FEW hours were bad
  • C 405 , { normal / 1.0 } , s = 49.0 , p = 0.10606060606060606
    FEW hours were normal
  • C 128 , { excellent / 1.0 } , s = 100.0 , p = 0.21645021645021645
    SEVERAL hours were excellent
  • C 178 , { normal / 0.4 , irregular / 1.0 } , s = 17.0 , p = 0.0367965367965368
    FEW hours were MORE OR LESS normal AND irregular
These descriptions provide interesting information. For example, from the concepts $C_{178}$, $C_{179}$, $C_{180}$ and $C_{183}$, we see that a high irregularity comes with a normal behavior (with a low truth value). This is not a contradiction: a high irregularity reflects many changes in behavior among bad, normal and excellent, from which a value associated with the linguistic term normal is obtained.

6.4. FCA from 3CP: Performance of the Facility per Day

Finally, we consider the performance of the facility per day. To this end, we construct the formal context $(A, B, R)$ whose fuzzy relation is obtained from the 3CP output, whose set of attributes is the set of linguistic expressions $\{\text{not well}, \text{well}, \text{very well}\}$, and whose set of objects is the set of days of the selected period. In this paper, we have taken into account each whole day of August 2022. The concept lattice obtained is depicted in Figure 17, and the linguistic descriptions obtained from the meet-irreducible concepts are listed in Appendix E; some of them are given below.
  • C 8 , { very well / 0.3 } , s = 24.0 , p = 0.7741935483870968
    MANY days were MORE OR LESS very well
  • C 6 , { very well / 1.0 } , s = 23.0 , p = 0.7419354838709677
    MANY days were very well
  • C 5 , { not well / 1.0 } , s = 4.0 , p = 0.12903225806451613
    FEW days were not well
  • C 3 , { well / 1.0 } , s = 1.0 , p = 0.03225806451612903
    FEW days were well
We can see from the previous descriptions that the facility worked very well during August 2022. The attributes ‘not well’ and ‘well’ are barely representative: ‘not well’ has low support even for truth degrees as low as 0.4, and ‘well’ has very little support. The attribute ‘very well’ is the most representative, even with truth degree 1.0. Note the small difference between the support of the concept with truth value 0.3 (s = 24) and that of the concept with truth value 1.0 (s = 23). This indicates that all objects satisfying very well / 0.3 also satisfy very well / 1.0, with the exception of a single object.
Thus, the descriptions given by the FCA application provide extra information to the user, which is valuable for decision support and, therefore, for the design of automatic intelligent systems.
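The relation between support and proportion used in these descriptions can be illustrated with a minimal sketch. It assumes that the extent of an intent {attribute / t} collects the objects whose degree in the fuzzy relation is at least t, so that p = s / n. The toy relation below is hypothetical (it is not the 3CP output of August 2022), and the function names are illustrative.

```python
# Hypothetical toy relation: day -> degree of "very well" (not the real 3CP output).
relation = {"D1": 1.0, "D2": 1.0, "D3": 0.3, "D4": 0.0, "D5": 1.0}


def extent(relation: dict[str, float], threshold: float) -> set[str]:
    """Objects whose degree reaches the threshold (assumed reading of the extent)."""
    return {obj for obj, degree in relation.items() if degree >= threshold}


def support_and_proportion(relation: dict[str, float], threshold: float) -> tuple[int, float]:
    """Support s (size of the extent) and proportion p = s / n."""
    s = len(extent(relation, threshold))
    return s, s / len(relation)


low = extent(relation, 0.3)
high = extent(relation, 1.0)

print(support_and_proportion(relation, 0.3))  # support and proportion at degree 0.3
print(support_and_proportion(relation, 1.0))  # support and proportion at degree 1.0
print(low - high)  # objects that reach 0.3 but not 1.0
```

With the values reported above, the same arithmetic gives 24/31 ≈ 0.774 and 23/31 ≈ 0.742 for the 31 days of August 2022, and the two extents differ by a single day.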

7. Conclusions and Future Work

We present a new methodology for developing and evaluating data-to-text systems, which converts data into linguistic contexts and then into a three-level network of computational perceptions. This methodology enables the addition of information until it can be summarized into a set of linguistic descriptions. The use of three levels of perceptions helps the automatic decision support system to analyze the behavior of the facility at three levels: inverter per hour, facility per hour, and facility per day.
The presented approach shares several advantages with rule-based approaches, as it allows for explicit knowledge representation, conferring transparency and interpretability. However, it also inherits some limitations traditionally associated with this type of system, such as lower scalability and dependence on expert intervention during the design process. Nevertheless, the contributions introduced in this work represent progress aimed at mitigating these disadvantages. In particular, the proposal incorporates a mechanism for the automatic categorization of data in linguistic terms, which reduces the burden of manual definition and facilitates adaptation to new domains, especially in the initial design phases. Furthermore, the system integrates a verification module based on Formal Concept Analysis, providing a structured framework for validating the generated results.
The proposed design has been applied to a real facility located in Puerto Real, Cádiz, Spain, but it can be readily transferred to any other facility in the world or to other problems, such as wind turbine facilities, hospitals, factories, and so on. This is possible thanks to the nature of the proposal, which allows for the automatic categorization of datasets defined by their temporal evolution, with time being one of the key variables in the grouping process. This feature facilitates its application to different domains where time series are used. For example, the method can generate linguistic summaries across a wide range of domains and is already being used for the automatic production of summaries from meteorological data. The computational complexity of the current algorithms is linear. Although computational performance is not the primary focus of this work, we aim to open a new line of research in this direction by exploring parallel implementations and execution in GPU-based environments. In line with this research perspective, we also propose the development of a Python library designed to load time-series datasets and automatically generate linguistic summaries. This tool will allow users to configure different parameters to obtain results tailored to their specific requirements, thereby extending support for data-driven decision-making processes in organizations across diverse sectors.
As a future line of work, the integration of the proposal within GEN’s overall system is planned, aiming to support its operational and strategic decision-making processes. Furthermore, the scope of the study is expected to expand to include additional specific photovoltaic installations in Spain and various European countries, which will allow for the evaluation of the generalizability and robustness of the approach in diverse geographical and operational contexts. It should be noted that climate change is altering solar radiation patterns and, in certain regions, is contributing to an increase in photovoltaic generation potential. In this context, advanced analysis and decision support tools, such as the one proposed in this work, will be essential for efficiently managing the growing penetration of solar energy in electrical systems.

Author Contributions

Conceptualization, R.G.A., F.C.-G., J.M. and C.R.-M.; methodology, R.G.A., F.C.-G., J.M. and C.R.-M.; software, C.R.-M.; validation, R.G.A., F.C.-G., J.M. and C.R.-M.; formal analysis, R.G.A., F.C.-G., J.M. and C.R.-M.; investigation, R.G.A., F.C.-G., J.M. and C.R.-M.; resources, R.G.A., F.C.-G., J.M. and C.R.-M.; data curation, R.G.A., F.C.-G., J.M. and C.R.-M.; writing—original draft preparation, R.G.A., F.C.-G., J.M. and C.R.-M.; writing—review and editing, R.G.A., F.C.-G., J.M. and C.R.-M.; visualization, R.G.A., F.C.-G., J.M. and C.R.-M.; supervision, J.M.; project administration, R.G.A., F.C.-G., J.M. and C.R.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank the PhD Program of Economics and Information Management (Faculty of Business Science—University of Bío-Bío).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Dataset

The original dataset records, for each day, the solar energy produced by each inverter during each of the 24 h.
Table A1. The table presents an extract of the original data, illustrating solar energy generation for a single day. For each hour and inverter, the solar energy produced is recorded. Please note that there are hours with no production (NA). Production hours are highly dependent on the seasons.
Period                     $I_1$  $I_2$  $I_3$  $I_4$  $I_5$  $I_6$  $I_7$  $I_8$  $I_9$
01-August-2022, $H_0$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_1$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_2$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_3$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_4$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_5$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_6$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_7$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_8$      NA     NA     NA     NA     NA     NA     NA     NA     NA
01-August-2022, $H_9$      0.10   0.00   0.00   0.00   0.00   0.00   0.00   0.10   0.10
01-August-2022, $H_{10}$   0.40   0.30   0.00   0.00   0.00   0.00   1.00   0.40   0.40
01-August-2022, $H_{11}$   1.40   1.10   1.00   2.00   1.00   2.00   1.00   1.50   1.40
01-August-2022, $H_{12}$   2.90   2.30   2.00   3.00   2.00   2.00   3.00   2.80   2.60
01-August-2022, $H_{13}$   3.90   3.00   3.00   3.00   3.00   4.00   3.00   3.90   3.00
01-August-2022, $H_{14}$   4.70   3.50   3.00   5.00   3.00   5.00   5.00   4.60   4.20
01-August-2022, $H_{15}$   5.20   3.70   3.00   5.00   3.00   5.00   5.00   5.00   4.70
01-August-2022, $H_{16}$   5.20   3.70   4.00   5.00   3.00   5.00   4.00   5.10   4.30
01-August-2022, $H_{17}$   5.00   3.70   3.00   5.00   3.00   4.00   5.00   5.20   4.60
01-August-2022, $H_{18}$   4.70   3.40   3.00   5.00   4.00   5.00   4.00   4.40   4.20
01-August-2022, $H_{19}$   3.50   2.80   3.00   3.00   2.00   3.00   4.00   3.50   3.30
01-August-2022, $H_{20}$   2.20   1.70   2.00   2.00   2.00   2.00   2.00   2.20   2.20
01-August-2022, $H_{21}$   0.80   0.60   0.00   1.00   0.00   1.00   0.00   0.80   0.70
01-August-2022, $H_{22}$   0.10   0.10   0.00   0.00   0.00   0.00   0.00   0.10   0.00
01-August-2022, $H_{23}$   NA     NA     NA     NA     NA     NA     NA     NA     NA

Appendix B. Meet-Irreducible Concepts Given from 1CP (Section 6.1)

The following list contains the linguistic descriptions generated by each of the meet-irreducible concepts given from 1CP (Section 6.1).
  • C 10 , { low / 0.2 } , s = 102.0 , p = 0.2217391304347826
    SEVERAL hours were ALMOST NOT low
  • C 11 , { medium / 0.5 } , s = 261.0 , p = 0.5673913043478261
    ABOUT HALF hours were MORE OR LESS medium
  • C 15 , { high / 0.2 } , s = 230.0 , p = 0.5
    ABOUT HALF hours were ALMOST NOT high
  • C 9 , { low / 0.5 } , s = 89.0 , p = 0.1934782608695652
    FEW hours were MORE OR LESS low
  • C 14 , { high / 0.5 } , s = 127.0 , p = 0.27608695652173915
    SEVERAL hours were MORE OR LESS high
  • C 6 , { low / 1.0 } , s = 88.0 , p = 0.19130434782608696
    FEW hours were low
  • C 13 , { medium / 0.8 } , s = 244.0 , p = 0.5304347826086957
    ABOUT HALF hours were RATHER medium
  • C 3 , { high / 1.0 } , s = 111.0 , p = 0.24130434782608695
    SEVERAL hours were high
  • C 7 , { medium / 1.0 } , s = 128.0 , p = 0.2782608695652174
    SEVERAL hours were medium

Appendix C. Meet-Irreducible Concepts Given from 2CP (Section 6.2)

The following list contains the linguistic descriptions generated by each of the meet-irreducible concepts given from 2CP (Section 6.2).
  • C 40 , { bad / 0.1 } , s = 140.0 , p = 0.30303030303030304
    SEVERAL hours were ALMOST NOT bad
  • C 35 , { normal / 0.1 } , s = 322.0 , p = 0.696969696969697
    MANY hours were ALMOST NOT normal
  • C 80 , { excellent / 0.1 } , s = 273.0 , p = 0.5909090909090909
    ABOUT HALF hours were ALMOST NOT excellent
  • C 47 , { bad / 0.2 } , s = 131.0 , p = 0.28354978354978355
    SEVERAL hours were ALMOST NOT bad
  • C 34 , { normal / 0.2 } , s = 270.0 , p = 0.5844155844155844
    ABOUT HALF hours were ALMOST NOT normal
  • C 79 , { excellent / 0.2 } , s = 261.0 , p = 0.564935064935065
    ABOUT HALF hours were ALMOST NOT excellent
  • C 56 , { bad / 0.3 } , s = 130.0 , p = 0.2813852813852814
    SEVERAL hours were MORE OR LESS bad
  • C 33 , { normal / 0.3 } , s = 229.0 , p = 0.49567099567099565
    ABOUT HALF hours were MORE OR LESS normal
  • C 77 , { excellent / 0.3 } , s = 253.0 , p = 0.5476190476190477
    ABOUT HALF hours were MORE OR LESS excellent
  • C 60 , { bad / 0.4 } , s = 89.0 , p = 0.19264069264069264
    FEW hours were MORE OR LESS bad
  • C 32 , { normal / 0.4 } , s = 211.0 , p = 0.45670995670995673
    ABOUT HALF hours were MORE OR LESS normal
  • C 91 , { excellent / 0.4 } , s = 245.0 , p = 0.5303030303030303
    ABOUT HALF hours were MORE OR LESS excellent
  • C 64 , { bad / 0.5 } , s = 87.0 , p = 0.18831168831168832
    FEW hours were MORE OR LESS bad
  • C 30 , { normal / 0.5 } , s = 160.0 , p = 0.3463203463203463
    SEVERAL hours were MORE OR LESS normal
  • C 114 , { excellent / 0.5 } , s = 234.0 , p = 0.5064935064935064
    ABOUT HALF hours were MORE OR LESS excellent
  • C 98 , { bad / 0.6 } , s = 84.0 , p = 0.18181818181818182
    FEW hours were MORE OR LESS bad
  • C 29 , { normal / 0.6 } , s = 141.0 , p = 0.3051948051948052
    SEVERAL hours were MORE OR LESS normal
  • C 113 , { excellent / 0.6 } , s = 218.0 , p = 0.47186147186147187
    ABOUT HALF hours were MORE OR LESS excellent
  • C 96 , { bad / 0.7 } , s = 55.0 , p = 0.11904761904761904
    FEW hours were RATHER bad
  • C 27 , { normal / 0.7 } , s = 128.0 , p = 0.27705627705627706
    SEVERAL hours were RATHER normal
  • C 111 , { excellent / 0.7 } , s = 196.0 , p = 0.42424242424242425
    ABOUT HALF hours were RATHER excellent
  • C 106 , { bad / 0.8 } , s = 53.0 , p = 0.11471861471861472
    FEW hours were RATHER bad
  • C 25 , { normal / 0.8 } , s = 79.0 , p = 0.170995670995671
    FEW hours were RATHER normal
  • C 119 , { excellent / 0.8 } , s = 180.0 , p = 0.38961038961038963
    SEVERAL hours were RATHER excellent
  • C 105 , { bad / 0.9 } , s = 48.0 , p = 0.1038961038961039
    FEW hours were RATHER bad
  • C 100 , { normal / 0.9 } , s = 70.0 , p = 0.15151515151515152
    FEW hours were RATHER normal
  • C 118 , { excellent / 0.9 } , s = 144.0 , p = 0.3116883116883117
    SEVERAL hours were RATHER excellent
  • C 20 , { bad / 1.0 } , s = 40.0 , p = 0.08658008658008658
    FEW hours were bad
  • C 19 , { normal / 1.0 } , s = 49.0 , p = 0.10606060606060606
    FEW hours were normal
  • C 13 , { excellent / 1.0 } , s = 100.0 , p = 0.21645021645021645
    SEVERAL hours were excellent

Appendix D. Meet-Irreducible Concepts Given from 2CP (Section 6.3)

The following list contains the linguistic descriptions generated by each of the meet-irreducible concepts given from 2CP (Section 6.3).
  • C 244 , { bad / 0.1 } , s = 140.0 , p = 0.30303030303030304
    SEVERAL hours were ALMOST NOT bad
  • C 74 , { normal / 0.1 } , s = 322.0 , p = 0.696969696969697
    MANY hours were ALMOST NOT normal
  • C 46 , { excellent / 0.1 } , s = 273.0 , p = 0.5909090909090909
    ABOUT HALF hours were ALMOST NOT excellent
  • C 45 , { irregular / 0.3 } , s = 81.0 , p = 0.17532467532467533
    FEW hours were MORE OR LESS irregular
  • C 253 , { bad / 0.2 } , s = 131.0 , p = 0.28354978354978355
    SEVERAL hours were ALMOST NOT bad
  • C 101 , { normal / 0.2 } , s = 270.0 , p = 0.5844155844155844
    ABOUT HALF hours were ALMOST NOT normal
  • C 44 , { excellent / 0.2 } , s = 261.0 , p = 0.564935064935065
    ABOUT HALF hours were ALMOST NOT excellent
  • C 261 , { bad / 0.3 } , s = 130.0 , p = 0.2813852813852814
    SEVERAL hours were MORE OR LESS bad
  • C 120 , { normal / 0.3 } , s = 229.0 , p = 0.49567099567099565
    ABOUT HALF hours were MORE OR LESS normal
  • C 42 , { excellent / 0.3 } , s = 253.0 , p = 0.5476190476190477
    ABOUT HALF hours were MORE OR LESS excellent
  • C 269 , { bad / 0.4 } , s = 89.0 , p = 0.19264069264069264
    FEW hours were MORE OR LESS bad
  • C 160 , { normal / 0.4 } , s = 211.0 , p = 0.45670995670995673
    ABOUT HALF hours were MORE OR LESS normal
  • C 40 , { excellent / 0.4 } , s = 245.0 , p = 0.5303030303030303
    ABOUT HALF hours were MORE OR LESS excellent
  • C 51 , { irregular / 0.4 } , s = 79.0 , p = 0.170995670995671
    FEW hours were MORE OR LESS irregular
  • C 277 , { bad / 0.5 } , s = 87.0 , p = 0.18831168831168832
    FEW hours were MORE OR LESS bad
  • C 172 , { normal / 0.5 } , s = 160.0 , p = 0.3463203463203463
    SEVERAL hours were MORE OR LESS normal
  • C 38 , { excellent / 0.5 } , s = 234.0 , p = 0.5064935064935064
    ABOUT HALF hours were MORE OR LESS excellent
  • C 57 , { irregular / 0.5 } , s = 72.0 , p = 0.15584415584415584
    FEW hours were MORE OR LESS irregular
  • C 294 , { bad / 0.6 } , s = 84.0 , p = 0.18181818181818182
    FEW hours were MORE OR LESS bad
  • C 190 , { normal / 0.6 } , s = 141.0 , p = 0.3051948051948052
    SEVERAL hours were MORE OR LESS normal
  • C 34 , { excellent / 0.6 } , s = 218.0 , p = 0.47186147186147187
    ABOUT HALF hours were MORE OR LESS excellent
  • C 137 , { irregular / 0.6 } , s = 68.0 , p = 0.1471861471861472
    FEW hours were MORE OR LESS irregular
  • C 301 , { bad / 0.7 } , s = 55.0 , p = 0.11904761904761904
    FEW hours were RATHER bad
  • C 204 , { normal / 0.7 } , s = 128.0 , p = 0.27705627705627706
    SEVERAL hours were RATHER normal
  • C 32 , { excellent / 0.7 } , s = 196.0 , p = 0.42424242424242425
    ABOUT HALF hours were RATHER excellent
  • C 183 , { normal / 0.1 , irregular / 0.7 } , s = 47.0 , p = 0.10173160173160173
    FEW hours were ALMOST NOT normal AND RATHER irregular
  • C 371 , { bad / 0.8 } , s = 53.0 , p = 0.11471861471861472
    FEW hours were RATHER bad
  • C 213 , { normal / 0.8 } , s = 79.0 , p = 0.170995670995671
    FEW hours were RATHER normal
  • C 132 , { excellent / 0.8 } , s = 180.0 , p = 0.38961038961038963
    SEVERAL hours were RATHER excellent
  • C 180 , { normal / 0.4 , irregular / 0.8 } , s = 29.0 , p = 0.06277056277056277
    FEW hours were MORE OR LESS normal AND RATHER irregular
  • C 369 , { bad / 0.9 } , s = 48.0 , p = 0.1038961038961039
    FEW hours were RATHER bad
  • C 349 , { normal / 0.9 } , s = 70.0 , p = 0.15151515151515152
    FEW hours were RATHER normal
  • C 131 , { excellent / 0.9 } , s = 144.0 , p = 0.3116883116883117
    SEVERAL hours were RATHER excellent
  • C 179 , { normal / 0.4 , irregular / 0.9 } , s = 24.0 , p = 0.05194805194805195
    FEW hours were MORE OR LESS normal AND RATHER irregular
  • C 367 , { bad / 1.0 } , s = 40.0 , p = 0.08658008658008658
    FEW hours were bad
  • C 405 , { normal / 1.0 } , s = 49.0 , p = 0.10606060606060606
    FEW hours were normal
  • C 128 , { excellent / 1.0 } , s = 100.0 , p = 0.21645021645021645
    SEVERAL hours were excellent
  • C 178 , { normal / 0.4 , irregular / 1.0 } , s = 17.0 , p = 0.0367965367965368
    FEW hours were MORE OR LESS normal AND irregular

Appendix E. Meet-Irreducible Concepts Given from 3CP (Section 6.4)

Figure A1. Concept lattice for the facility with the 2CP (bad, normal, excellent, irregular).
The following list contains the linguistic descriptions generated by each of the meet-irreducible concepts given from 3CP (Section 6.4).
  • C 12 , { not well / 0.3 } , s = 6.0 , p = 0.1935483870967742
    FEW days were MORE OR LESS not well
  • C 9 , { well / 0.6 } , s = 4.0 , p = 0.12903225806451613
    FEW days were MORE OR LESS well
  • C 8 , { very well / 0.3 } , s = 24.0 , p = 0.7741935483870968
    MANY days were MORE OR LESS very well
  • C 13 , { not well / 0.4 } , s = 5.0 , p = 0.16129032258064516
    FEW days were MORE OR LESS not well
  • C 6 , { very well / 1.0 } , s = 23.0 , p = 0.7419354838709677
    MANY days were very well
  • C 5 , { not well / 1.0 } , s = 4.0 , p = 0.12903225806451613
    FEW days were not well
  • C 7 , { well / 0.7 } , s = 3.0 , p = 0.0967741935483871
    FEW days were RATHER well
  • C 3 , { well / 1.0 } , s = 1.0 , p = 0.03225806451612903
    FEW days were well

References

  1. Reiter, E. An architecture for data-to-text systems. In Proceedings of the Eleventh European Workshop on Natural Language Generation, Schloss Dagstuhl, Germany, 17–20 June 2007; ENLG ’07. pp. 97–104. [Google Scholar]
  2. Reiter, E. Natural Language Generation; Springer: Cham, Switzerland, 2025. [Google Scholar] [CrossRef]
  3. Gkatzia, D. Content selection in data-to-text systems: A survey. arXiv 2016, arXiv:1610.08375. [Google Scholar]
  4. Zadeh, L. Fuzzy logic = computing with words. IEEE Trans. Fuzzy Syst. 1996, 4, 103–111. [Google Scholar] [CrossRef]
  5. Zadeh, L. A New Direction in AI: Toward a Computational Theory of Perceptions. AI Mag. 2001, 22, 73. [Google Scholar] [CrossRef]
  6. Trivino, G.; Sugeno, M. Towards linguistic descriptions of phenomena. Int. J. Approx. Reason. 2013, 54, 22–34. [Google Scholar] [CrossRef]
  7. Kacprzyk, J.; Yager, R.R. Linguistic summaries of data using fuzzy logic. Int. J. Gen. Syst. 2001, 30, 133–154. [Google Scholar] [CrossRef]
  8. Yager, R.R. A new approach to the summarization of data. Inf. Sci. 1982, 28, 69–86. [Google Scholar] [CrossRef]
  9. Zadeh, L. From computing with numbers to computing with words. From manipulation of measurements to manipulation of perceptions. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 1999, 46, 105–119. [Google Scholar] [CrossRef]
  10. Conde-Clemente, P.; Alonso, J.; Trivino, G. Toward automatic generation of linguistic advice for saving energy at home. Soft Comput. 2018, 22, 345–359. [Google Scholar] [CrossRef]
  11. Ramos-Soto, A.; Bugarin, A.J.; Barro, S.; Taboada, J. Linguistic Descriptions for Automatic Generation of Textual Short-Term Weather Forecasts on Real Prediction Data. IEEE Trans. Fuzzy Syst. 2015, 23, 44–57. [Google Scholar] [CrossRef]
  12. Reiter, E.; Dale, R. Building Applied Natural Language Generation Systems. Nat. Lang. Eng. 2002, 3, 57–87. [Google Scholar] [CrossRef]
  13. Ganter, B.; Wille, R. Formal Concept Analysis: Mathematical Foundation; Springer: Berlin/Heidelberg, Germany, 1999. [Google Scholar]
  14. Bělohlávek, R. Fuzzy Galois Connections. Math. Log. Q. 1999, 45, 497–504. [Google Scholar] [CrossRef]
  15. Burusco Juandeaburre, A.; Fuentes-González, R. The study of the L-fuzzy concept lattice. Mathw. Soft Comput. 1994, 1, 209–218. [Google Scholar]
  16. Krajči, S. A generalized concept lattice. Log. J. IGPL 2005, 13, 543–550. [Google Scholar] [CrossRef]
  17. International Renewable Energy Agency (IRENA). Renewable Capacity Statistics 2025. 2025. Available online: https://www.irena.org/Publications/2025/Mar/Renewable-capacity-statistics-2025 (accessed on 10 January 2026).
  18. Popper, K.R.; Eccles, J.C. The Self and Its Brain; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1977. [Google Scholar]
  19. Butka, P.; Pócs, J. Generalization of One-Sided Concept Lattices. Comput. Inform. 2013, 32, 355–370. [Google Scholar]
  20. Butka, P.; Pócs, J.; Pósová, J. On equivalence of conceptual scaling and generalized one-sided concept lattices. Inf. Sci. 2014, 259, 57–70. [Google Scholar] [CrossRef]
  21. Zadeh, L. The concept of a linguistic variable and its application to approximate reasoning—I. Inf. Sci. 1975, 8, 199–249. [Google Scholar] [CrossRef]
  22. Davey, B.; Priestley, H. Introduction to Lattices and Order, 2nd ed.; Cambridge University Press: Cambridge, UK, 2002. [Google Scholar]
  23. Cornejo, M.E.; Medina, J.; Ramírez-Poussa, E. Characterizing reducts in multi-adjoint concept lattices. Inf. Sci. 2018, 422, 364–376. [Google Scholar] [CrossRef]
  24. Antoni, L.; Krajči, S.; Krídlo, O. On stability of fuzzy formal concepts over randomized one-sided formal context. Fuzzy Sets Syst. 2018, 333, 36–53. [Google Scholar] [CrossRef]
Figure 1. Flow through the computational perceptions and perception mappings.
Figure 2. Fragment of a database (left side) and relation $R$ (right side) of Example 1.
Figure 3. Relation $R^{*}$ of Example 1.
Figure 4. Membership functions and the relation $R^{*}$ of Example 1.
Figure 5. Linguistic variable $LV(I_1, H_{15})$ for 1 August 2022 at hour 15 (15:00).
Figure 6. Membership functions of linguistic expressions “irregular”, and “bad”, “normal” and “excellent”, respectively.
Figure 7. Membership function of linguistic expression “irregular” after applying Algorithm 3.
Figure 8. Membership functions of linguistic expressions “not well”, “well” and “very well”, respectively.
Figure 9. Obtained values of deviation and mean in the membership functions of linguistic expressions “irregular”, “bad”, “normal” and “excellent”, respectively.
Figure 10. Performance of $F$ during 05-August-2022: L is Low, M is Medium, H is High; I is Irregular; B is bad (red), N is Normal (orange) and E is Excellent (green).
Figure 11. Obtained value of performance in the membership functions of the linguistic expressions “not well”, “well” and “very well”, respectively. The figure shows the station performance values for the last computational performance perception (3CP). The value 0.637 is the difference between all excellent hours and all bad hours for 5 August 2022. It corresponds to a “very well” performance with truth degree 1.0, based on the linguistic variable defined as a function of the standard deviation with a limit of 0.3.
Figure 12. A computational perception network that models both data processing and the generation of computational perceptions, along with their corresponding linguistic summaries.
Figure 13. The report template (left) for the analysis of solar inverters and an automatically generated linguistic report (right) when $p(F, P)_{\text{irregular}} \neq 0$.
Figure 14. The report template (left) for the analysis of solar inverters and an automatically generated linguistic report (right) when $p(F, P)_{\text{irregular}} = 0$.
Figure 15. Concept lattice associated with the context of the 1CP obtained from $I_5$.
Figure 16. Concept lattice for the facility with the 2CP (bad, normal, excellent).
Figure 17. Concept lattice for the facility considering the 3CP.
Table 1. Records of energy production for an inverter $I_i$ during the hours $\{H_0, \dots, H_{23}\}$ of a day $D_j$.
Hour/Day    $D_1$                    $D_2$                    …    $D_s$
$H_0$       $P(I_i, D_1, H_0)$       $P(I_i, D_2, H_0)$       …    $P(I_i, D_s, H_0)$
$H_1$       $P(I_i, D_1, H_1)$       $P(I_i, D_2, H_1)$       …    $P(I_i, D_s, H_1)$
⋮
$H_{23}$    $P(I_i, D_1, H_{23})$    $P(I_i, D_2, H_{23})$    …    $P(I_i, D_s, H_{23})$
Table 2. Energy production for each hour $H_k$ of each day $D_j$ along with the average, maximum and minimum over a month.
Hour        $D_1$                    $D_2$                    …    $D_s$                    Average                          Max                          Min
$H_0$       $P(I_i, D_1, H_0)$       $P(I_i, D_2, H_0)$       …    $P(I_i, D_s, H_0)$       $\text{Average}(I_i, H_0)$       $\text{Max}(I_i, H_0)$       $\text{Min}(I_i, H_0)$
$H_1$       $P(I_i, D_1, H_1)$       $P(I_i, D_2, H_1)$       …    $P(I_i, D_s, H_1)$       $\text{Average}(I_i, H_1)$       $\text{Max}(I_i, H_1)$       $\text{Min}(I_i, H_1)$
⋮
$H_{23}$    $P(I_i, D_1, H_{23})$    $P(I_i, D_2, H_{23})$    …    $P(I_i, D_s, H_{23})$    $\text{Average}(I_i, H_{23})$    $\text{Max}(I_i, H_{23})$    $\text{Min}(I_i, H_{23})$
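The per-hour statistics of Table 2 can be sketched as follows. The records are hypothetical, `None` stands for NA, and skipping NA values before aggregating is an assumption, since the treatment of missing values is not described here. The function name `summarize` is illustrative.

```python
from statistics import mean

# Hypothetical production records of one inverter: hour -> values over days D1..D3.
# None stands for NA (no production recorded).
records = {
    "H0": [None, None, None],
    "H9": [0.1, 0.2, None],
    "H15": [5.2, 4.9, 5.0],
}


def summarize(values):
    """Average, maximum and minimum of the observed values (NA skipped by assumption)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return None  # an hour with no production on any day has no statistics
    return {"average": mean(observed), "max": max(observed), "min": min(observed)}


summary = {hour: summarize(values) for hour, values in records.items()}
for hour, stats in summary.items():
    print(hour, stats)
```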
Table 3. A linguistic table for the inverter $I_i$ in the day $D_j$ built from the linguistic variables associated with the hour $H_k$.
$R_{I_i}^{*}$    $LT(I_i, H_k)_1$    $LT(I_i, H_k)_2$    …    $LT(I_i, H_k)_l$
$D_j, H_0$       $\mu_{LT(I_i, H_0)_1}(P(I_i, D_j, H_0))$          $\mu_{LT(I_i, H_0)_2}(P(I_i, D_j, H_0))$          …    $\mu_{LT(I_i, H_0)_l}(P(I_i, D_j, H_0))$
$D_j, H_1$       $\mu_{LT(I_i, H_1)_1}(P(I_i, D_j, H_1))$          $\mu_{LT(I_i, H_1)_2}(P(I_i, D_j, H_1))$          …    $\mu_{LT(I_i, H_1)_l}(P(I_i, D_j, H_1))$
⋮
$D_j, H_{23}$    $\mu_{LT(I_i, H_{23})_1}(P(I_i, D_j, H_{23}))$    $\mu_{LT(I_i, H_{23})_2}(P(I_i, D_j, H_{23}))$    …    $\mu_{LT(I_i, H_{23})_l}(P(I_i, D_j, H_{23}))$
Table 4. Linguistic table generated using the linguistic variables for the inverter $I_1$ over a period of time, during which energy production was generated (see Table A1).
$R_{I_1}^{*}$              $\mu_{LT(I_1, H_k)_{\text{low}}}$    $\mu_{LT(I_1, H_k)_{\text{medium}}}$    $\mu_{LT(I_1, H_k)_{\text{high}}}$
01-August-2022, $H_9$      0.2    0.8    0.0
01-August-2022, $H_{10}$   0.0    1.0    0.0
01-August-2022, $H_{11}$   0.0    0.0    1.0
01-August-2022, $H_{12}$   0.0    0.0    1.0
01-August-2022, $H_{13}$   0.0    0.0    1.0
01-August-2022, $H_{14}$   0.0    0.0    1.0
01-August-2022, $H_{15}$   0.0    0.0    1.0
01-August-2022, $H_{16}$   0.0    0.0    1.0
01-August-2022, $H_{17}$   0.0    0.0    1.0
01-August-2022, $H_{18}$   0.0    0.0    1.0
01-August-2022, $H_{19}$   0.0    0.0    1.0
01-August-2022, $H_{20}$   0.0    0.0    1.0
01-August-2022, $H_{21}$   0.0    0.0    1.0
01-August-2022, $H_{22}$   0.0    0.0    1.0
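The rates reported in Table 6 (for example, 88.900/462 for the term low of inverter I1) are consistent with a relative sigma-count, that is, the sum of the membership degrees of a column divided by the number of production hours. The sketch below applies this reading to the 14 rows of Table 4 only. The interpretation is an assumption rather than a definition taken from this section, and the name `relative_sigma_count` is hypothetical.

```python
# Rows of Table 4 for inverter I1 on 01-August-2022 (hours H9..H22):
# each tuple holds the degrees (low, medium, high).
rows = [(0.2, 0.8, 0.0), (0.0, 1.0, 0.0)] + [(0.0, 0.0, 1.0)] * 12

TERMS = ("low", "medium", "high")


def relative_sigma_count(rows, column):
    """Sum of the membership degrees of one column divided by the number of rows."""
    return sum(row[column] for row in rows) / len(rows)


rates = {term: relative_sigma_count(rows, i) for i, term in enumerate(TERMS)}
for term, rate in rates.items():
    print(f"{term}: {rate:.3f}")
```

For this single day the sketch yields rates of about 0.014, 0.129 and 0.857. The monthly rates of Table 6 would be obtained in the same way over the 462 production hours of the period, under the same assumption.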
Table 5. Excellent and bad performances of the facility $F$ at day $D_5$ from 9:00 h to 21:00 h.
$k_m$    $w(F, D_5, H_{k_m})_{\text{excellent}}$    $w(F, D_5, H_{k_m})_{\text{bad}}$
9        0        0.667
10       0        0
11       0.501    0
12       0.963    0
13       1        0
14       0.778    0
15       0.982    0
16       0.482    0
17       1        0
18       0.76     0
19       0.963    0
20       0.667    0
21       0.851    0
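The value 0.637 quoted in the caption of Figure 11 can be reproduced from Table 5 under one reading: subtract the sum of the bad degrees from the sum of the excellent degrees and divide by the 13 hours considered. The normalization by the number of hours is an assumption that matches the reported value, and the variable names are hypothetical.

```python
# Degrees read from Table 5 for day D5, hours 9..21: hour -> (excellent, bad).
table5 = {
    9: (0.0, 0.667), 10: (0.0, 0.0), 11: (0.501, 0.0), 12: (0.963, 0.0),
    13: (1.0, 0.0), 14: (0.778, 0.0), 15: (0.982, 0.0), 16: (0.482, 0.0),
    17: (1.0, 0.0), 18: (0.76, 0.0), 19: (0.963, 0.0), 20: (0.667, 0.0),
    21: (0.851, 0.0),
}

excellent_total = sum(excellent for excellent, _ in table5.values())
bad_total = sum(bad for _, bad in table5.values())

# Assumed normalization: difference of the totals divided by the number of hours.
performance = (excellent_total - bad_total) / len(table5)
print(round(performance, 3))
```

The totals are 8.947 for excellent and 0.667 for bad, so the sketch evaluates (8.947 − 0.667)/13, which rounds to 0.637.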
Table 6. Linguistic descriptions generated for each inverter in August 2022.
$I_i$    Rate $p(I_i, P)_{\text{low}}$    Rate $p(I_i, P)_{\text{medium}}$    Rate $p(I_i, P)_{\text{high}}$    Linguistic descriptions
$I_1$    88.900/462 = 0.192    109.899/462 = 0.237    262.799/462 = 0.568    FEW hours were low, SEVERAL hours were medium, ABOUT HALF hours were high
$I_2$    72.500/462 = 0.156    136.799/462 = 0.296    252.699/462 = 0.546    FEW hours were low, SEVERAL hours were medium, ABOUT HALF hours were high
$I_3$    92.800/462 = 0.201    195.100/462 = 0.422    174.099/462 = 0.376    SEVERAL hours were low, ABOUT HALF hours were medium, SEVERAL hours were high
$I_4$    85.800/462 = 0.185    195.900/462 = 0.424    180.29/462 = 0.390     FEW hours were low, ABOUT HALF hours were medium, SEVERAL hours were high
$I_5$    91.100/462 = 0.197    231.300/462 = 0.500    139.599/462 = 0.302    FEW hours were low, ABOUT HALF hours were medium, SEVERAL hours were high
$I_6$    87.500/462 = 0.189    204.500/462 = 0.442    170.0/462 = 0.367      FEW hours were low, ABOUT HALF hours were medium, SEVERAL hours were high
$I_7$    88.300/462 = 0.191    187.100/462 = 0.404    186.599/462 = 0.403    FEW hours were low, ABOUT HALF hours were medium, ABOUT HALF hours were high
$I_8$    97.000/462 = 0.209    109.199/462 = 0.236    255.900/462 = 0.553    SEVERAL hours were low, SEVERAL hours were medium, ABOUT HALF hours were high
$I_9$    99.100/462 = 0.214    149.499/462 = 0.323    213.500/462 = 0.462    SEVERAL hours were low, SEVERAL hours were medium, ABOUT HALF hours were high
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Aragón, R.G.; Chacón-Gómez, F.; Medina, J.; Rubio-Manzano, C. An Interpretable Fuzzy Framework for Data-to-Text Generation Using Linguistic Contexts and Computational Perceptions: A Case Study on Photovoltaic Stations. AI 2026, 7, 103. https://doi.org/10.3390/ai7030103
