Phraseology and Culture in Terminological Knowledge Bases: The Case of Pollution and Environmental Law

Abstract: Despite its importance, environmental law has largely been ignored in environmental knowledge bases. This may be due to the fact that legal issues may not, strictly speaking, be considered scientific knowledge in environmental knowledge resources, which may in turn relate to the complexity of reflecting the cultural component (which includes different legal systems) in the description of terms and concepts. The terminological knowledge base EcoLexicon has recently begun to include information on environmental law. This paper takes the methodological perspective of frame-based terminology to analyze typical verb collocations in environmental law that will be added to the phraseology module of EcoLexicon. Corpus analysis was used to compare the behavior of verbs collocating with pollution in environmental science and environmental law. Verbs were classified based on lexical domains and semantic classes through definition factorization, as described in the Lexical Grammar Model. The differences were mostly based on the specificity of the arguments and the emphasis on the polluter in environmental law. This resulted in a proposal for the inclusion and configuration of environmental law phraseology in EcoLexicon, showing sociocultural differences across environmental subdomains.


Introduction
Culture is generally regarded as the characteristics and knowledge of a particular group of people, encompassing religion, food, traditions, music, arts, and general language. As such, it permeates all aspects of life and even influences the way that we perceive the world (Unsworth et al. 2005). Not surprisingly, culture is also reflected in specialized language and terminology. Recently, the cultural facet of terminology or culture-bound terminology (Diki-Kidiri 2008) has been highlighted by Temmerman and van Campenhoudt (2014), Faber and Medina-Rull (2017), Diki-Kidiri (2022), Reimerink et al. (2023) and León-Araúz and Faber (Forthcoming). In fact, today, terms are acknowledged to possess an expressive power of their own insofar as they are often steeped in the culture and ideology of the text sender and even encode metaphors that have an impact on the understanding of a specialized domain (Faber 2022, p. 1). Since terms and their meanings are culturally motivated, the issue is how to represent this cultural dimension in terminological knowledge bases.
Recently, the process of converting EcoLexicon (ecolexicon.ugr.es) into an inclusive resource sensitive to cultural variation has driven the inclusion of new content and data categories. EcoLexicon is a multilingual and multimodal terminological knowledge base (TKB) (Faber et al. 2016) that represents the conceptual structure of the specialized domain of the environment in the form of a dynamic visual resource. It combines conceptual, linguistic, and graphical information to help translators, technical writers, and environmental experts acquire an in-depth understanding of specialized environmental concepts and help them write or translate specialized or semi-specialized texts. It is the practical application of frame-based terminology (FBT) (Faber 2012, 2015, 2022), a cognitive approach to domain-specific language, which directly links specialized knowledge representation to cognitive linguistics and cognitive semantics. In FBT, knowledge acquisition begins at the term level, progresses to the phrase level, and finally results in the codification of an entire knowledge frame. The data are collected by means of corpus analysis.
To adapt EcoLexicon to cultural variation, a set of cultural profiles or frames must be specified that are linked to culture-dependent semantic categories, such as geographic landforms (e.g., creek), flora and fauna (e.g., cookie-cutter shark), meteorological phenomena (local wind), and even named entities (e.g., Mesoamerican Reef System). It also entails adding a cultural component to all modules (definitions, conceptual networks, terms, phraseology, and multimodal resources). Culture in EcoLexicon is a broad notion that encompasses not only the inclusion of culture-specific concepts but also the different phraseological structures that arise from subtle changes in perspective (i.e., environmental subdomains) at the linguistic level.
Cultural variation is usually reflected in multidimensional concepts, whose relational behavior changes based on contextual parameters. Accordingly, cultural recontextualization depends on a set of cultural parameters, based on geographic location, historical time period, sociocultural usage, etc., which restrict the conceptual behavior to a certain cultural context. To reflect the sociocultural representation of environmental concepts, the information in EcoLexicon can be recontextualized according to environmental subdomain (e.g., geology, coastal engineering, hydrology, etc.). For example, the concept WATER has an active role in geology (it causes erosion, reshapes the terrestrial landscape, etc.), while in the water treatment domain, it is a patient that receives actions (purification, filtering, etc.) (León-Araúz et al. 2013). An example of restrictions in conceptual networks for a concept that behaves differently according to its geographical location is WETLAND. In Figure 1, the network to the left shows the general network for WETLAND, whereas the network to the right is restricted to the Caribbean, with MARSH and SWAMP as prototypical wetlands for the area, and SEAGRASS BED, which is considered a wetland only there.
Some subdomains, such as biodiversity, are more prone to cultural variation than others because flora and fauna are directly related to the geographical location they inhabit. However, there is one domain with a very special relationship to culture: environmental law. Environmental law is an important transversal domain that combines law with environmental science. It is impossible to understand the environment without an in-depth knowledge of how international, national, and regional governments and administrative bodies regulate it. The law is a profoundly human construct that is directly related to culture and, therefore, different in every culture. Studying the behavior of environmental concepts within this subdomain as compared to the environment as a whole promises to provide insight into the impact of culture on scientific knowledge. For this reason, EcoLexicon has begun to include concepts and terms in different languages that pertain to environmental law (Faber and Reimerink 2019; Reimerink 2021).
In a previous study (Reimerink 2021), to expand and improve the information related to environmental law in EcoLexicon, comparative corpus analysis was used to identify missing concepts and explore how the multidimensional nature (León-Araúz 2009) of environmental science might affect the behavior of other concepts in the subdomain of environmental law. The study focused on the POLLUTION frame, and the results showed that a new participant (i.e., the POLLUTER) had to be added when contextualized for the subdomain of environmental law. Whereas, in environmental science, the main focus is generally on the polluting substance, in environmental law, it is on the person/institution/industry responsible (see examples 1 and 2, emphasis by the authors). We also discovered that some facets of the concept POLLUTION (i.e., time and origin) are more prominent in this subdomain compared to the environmental domain as a whole (see examples 3 and 4).

1. The pollutants disperse in a downward direction, causing substantial air pollution at ground level but cannot escape upwards because of the inversion.
2. ...the polluter-pays principle, the person responsible for the pollution cannot be identified or cannot be held liable under Community or national legislation...
3. Indeed, the phenomenon of historical pollution represents the result of the convergence and interaction of a number of different factors...
4. Historically the regulation of vessel-source pollution has engendered conflict between coastal States...

These results entailed changes in the conceptual networks and the definitions of EcoLexicon. Figure 2 shows the non-restricted conceptual network for POLLUTION without the generic-specific relations for more clarity.
Although the final result in the conceptual network does not convey all the conceptual nuances, the relationship between POLLUTER, POLLUTANT, and POLLUTION is made explicit. The present case is a very good example of the need for multimodality in terminological knowledge bases. They must be enhanced with multimodal representations, namely visual and linguistic representations that converge to facilitate knowledge acquisition.
The results in Reimerink (2021) led to the revision of the definition of POLLUTION in EcoLexicon. A flexible definition was created to recontextualize it for environmental law. New facets included the facts that the polluter causes damage to the environment and that a polluter can be held responsible and sanctioned. The definitional template for POLLUTION (Table 1) now shows two agents. Agent1 is the polluter, who is ultimately responsible for the pollution. Agent2 is the pollutant, which is the direct cause of the pollution. The primary result (result1) is the direct consequence of pollution on the environment, whereas the secondary result (result2) is the fact that the polluter can be held responsible and sanctioned. Whereas the conceptual network provides graphical access to the POLLUTION frame and all the related concepts, including POLLUTER and POLLUTANT, the linguistic expression of the definition provides the means to convey the nuances of the relationship between the participants of the frame.
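The two agents and two results of the definitional template can be pictured as a small data structure. The following is only an illustrative sketch: the class name, the `describe` helper, and the rule that the polluter-related roles are displayed only in the environmental law context are assumptions made for the example and do not describe how EcoLexicon actually stores or renders its templates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PollutionTemplate:
    """Hypothetical container for the roles named in the prose above."""
    agent1: str   # the polluter, ultimately responsible for the pollution
    agent2: str   # the pollutant, the direct cause of the pollution
    result1: str  # direct consequence of pollution on the environment
    result2: str  # the polluter can be held responsible and sanctioned


def describe(template: PollutionTemplate, subdomain: str) -> list[str]:
    """Return the roles displayed for a subdomain (assumed display rule)."""
    facets = [f"agent2: {template.agent2}", f"result1: {template.result1}"]
    if subdomain == "environmental law":
        # Assumption: agent1 and result2 are the facets added for this context.
        facets = [f"agent1: {template.agent1}", *facets,
                  f"result2: {template.result2}"]
    return facets


template = PollutionTemplate(
    agent1="polluter",
    agent2="pollutant",
    result1="damage to the environment",
    result2="liability and sanction",
)
for line in describe(template, "environmental law"):
    print(line)
```

Keeping the roles in one record and filtering them per subdomain mirrors the idea of a flexible definition: the underlying frame stays the same, and only the facets that are foregrounded change.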
In the present study, we analyzed how the differences between environmental science and its subdomain environmental law, at the conceptual level, are conveyed at the linguistic level. End users of EcoLexicon, such as translators and technical writers, need to know how to express the differences at the conceptual level in their texts. This is usually reflected in phraseological combinations. However, even though the phraseology of specialized discourse is attracting increasing interest (Aguado de Cea 2007; Buendía-Castro 2013; Cabezas-García and Faber 2018), studies focusing on specialized phraseology are much less numerous than those addressing general language phraseology.
Our hypothesis is that the subdomain of environmental law uses different linguistic expressions to describe the POLLUTION frame than the global environmental science domain. The research questions we tried to answer are as follows: (a) how are the linguistic expressions related to the POLLUTION frame different when comparing environmental law and environmental science, and (b) how can we represent this knowledge in a TKB on the environment? The present study analyzed verb collocations in environmental law to add to the phraseology module of EcoLexicon, which is currently under construction. In this pilot study, we focus on phraseology in English. Future research will also address the topic in Spanish, one of the other major languages of EcoLexicon.
The rest of this paper is organized as follows: Section 2 explains the phraseology extraction method; Section 3 presents the results; Section 4 discusses the results and provides a proposal for their representation in the phraseology module; and Section 5 summarizes the conclusions that can be derived from this research.

Materials and Methods
In all cultures, legal language is a sublanguage with very specific syntactic, semantic, and pragmatic features (Tiersma 1999, pp. 15-133). The documents in the field often use grammatical structures that are rarely found elsewhere, such as redundancy, formulaic expressions, foreign words and Latinisms, syntactic discontinuity, impersonal and passive constructions, nominalization, and complex sentences (Hiltunen 2012; Williams 2004, pp. 112-15; Buendía-Castro and Faber 2015). Although, to a certain extent, the relation between content and form is present in other specialized texts as well, it is even more prevalent for texts in the legal domain since legal language is the result of a social contract and can be regarded as system-bound (Mattila 2006, p. 9).
Accordingly, an entry in a legal TKB can only be regarded as adequate if there is as complete a description as possible of the macro- and micro-context in which the term appears. If the resource is aimed at translators, for example, this description must provide information on how the term is used and the degree to which it can be regarded as equivalent to a given term within another legal system. Possible equivalent terms in other languages should also appear with as much contextual information as possible, which will facilitate mapping relations between the source and target language systems and cultures (Buendía-Castro and Faber 2015, p. 164). However, few specialized resources actually contain word combinations (L'Homme and Leroyer 2009, p. 260), and those that do include them are often not consistent in their treatment of phraseological units (Montero-Martínez and Buendía-Castro 2012).
Legal phraseology has attracted increased interest in linguistics and translation studies. However, the same degree of interest has not been devoted to the issue of how phraseology can be managed and displayed in legal lexicographic and/or terminological resources (Peruzzo 2019, p. 149). In a questionnaire administered to final-year law students (Peruzzo 2019, p. 152), the students indicated that the enumeration of phraseological units in bi- or multilingual TKBs did not meet their needs because, firstly, these units were not accompanied by a definition and, secondly, in a bi- or multilingual terminological entry containing a separate phraseology field for each term, establishing equivalence relations between phraseological units is not always a straightforward task.
The phraseology module of EcoLexicon is based on a wide interpretation of the concept of collocation, and at its core are verb collocations. An analysis of verb collocations in specialized discourse is especially relevant because they convey specialized knowledge and are essential to communicating fluently (Kübler and Pecman 2012; Orenha-Ottaiano et al. 2021; Buendía Castro 2021). In FBT, verb collocations are frequent combinations of two or more lexical units composed of a noun + verb, verb + noun, or noun + verb + noun, where the meaning of the verb is limited by the meaning of the noun. However, at the same time, the verb restricts the type of noun with which it can combine (Buendía-Castro 2013, p. 115). For example, in the collocation "the fire burns", the verb only allows for arguments that can be on fire, whereas the argument "fire" needs a verb that refers to the process of combustion (Montero Martínez and Buendía-Castro 2017).
In the phraseology module, verbs will be classified based on their meaning in combination with the terms with which they collocate. This is in line with previous work (Rosario et al. 2002; Maguire et al. 2010; Gagné and Spalding 2013; Cabezas García 2020), which analyzes the relevance of semantics in the recurrent patterns of combination that occur in phraseological units and the usefulness of these patterns in meaning access.
Therefore, verbs will not have their own entries in EcoLexicon but will be included as additional information in the term entries. The inclusion of a phraseme in EcoLexicon is essentially based on frequency of occurrence in the corpus. However, as will be shown, frequency changes when comparing different subdomains. Therefore, different phrasemes and examples will be shown, depending on the context the end user is focusing on in EcoLexicon.
To compare the collocational behavior of POLLUTION in environmental science and the subdomain of environmental law, Sketch Engine (https://www.sketchengine.eu/, Kilgarriff et al. 2014) was used. As a reference corpus, we used the EcoLexicon Environmental Corpus (EEC, 23 million words; León-Araúz et al. 2018) available in the Open Corpora section of Sketch Engine, and we compared it to a corpus specifically created for this purpose: the Environmental Law corpus (enLaw, 9.7 million words), composed of EEC texts, tagged with the domain of environmental law, as well as additional texts from the same domain harvested from the Internet. Some texts of the enLaw corpus are also included in the complete corpus on environmental science. Environmental law is part of the overall domain of environmental science; therefore, environmental law texts should also be included in the overall corpus. However, the differences between the overall domain as compared to the subdomain come to light when we compare the overall corpus with a corpus of texts that are specifically about environmental law. The EEC and enLaw corpora were both compiled in Sketch Engine with the Penn Treebank tagset and the EcoLexicon Semantic Sketch Grammar (ESSG; León-Araúz et al. 2016).
The ESSG is a Corpus Query Language (CQL)-based grammar (Jakubíček et al. 2013), as is the default grammar used for word sketches in Sketch Engine. Whereas Sketch Engine's default grammar provides grammatical relations, such as verb-object, modifiers, and prepositional phrases, the ESSG was developed for the extraction of semantic word sketches based on some of the most common semantic relations in terminology: generic-specific, part-whole, location, cause, and function. This was especially useful for the previous study (Reimerink 2021), where we focused on the conceptual differences between the global domain and the subdomain. However, when selecting representative examples for the phraseology module, the semantic word sketches also provide easy access to sentences that convey conceptual knowledge (see Section 3, Figure 9). The Sketch Engine functions used to extract and compare the noun + verb collocations of pollution, as well as the related terms pollute/polluter, in both corpora were Word Sketch and Concordance.
After extraction, verbs were categorized according to the lexical domains in Faber and Mairal Usón (1999). The authors analyzed and categorized the semantic and syntactic structure of 12,000 general language English verbs through definition factorization, as described in the Lexical Grammar Model, and validated them via corpus analysis. This resulted in the following general lexical domains that can also be applied to verbs in specialized discourse: EXISTENCE (be, happen), CHANGE (become, change), POSSESSION (have), SPEECH (say, talk), EMOTION (feel), ACTION (do, make), MENTAL PERCEPTION (know, think), MOVEMENT (move, go, come), PHYSICAL PERCEPTION (see, hear, taste, smell, touch), MANIPULATION (use), CONTACT/IMPACT (hit, break), and POSITION (put, be). Other smaller classes included LIGHT, SOUND, BODY FUNCTIONS, WEATHER, etc.
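As a rough illustration of the categorization step, the lexical domains and their prototypical verbs listed above can be encoded as a lookup table. This is a minimal sketch rather than the procedure used in the study: the dictionary contains only the example verbs given in the text, and the `domains_of` function is a hypothetical helper. In practice, each verb is assigned to a domain through definition factorization, not through a fixed list.

```python
# Lexical domains with the prototypical verbs listed in the text.
LEXICAL_DOMAINS = {
    "EXISTENCE": {"be", "happen"},
    "CHANGE": {"become", "change"},
    "POSSESSION": {"have"},
    "SPEECH": {"say", "talk"},
    "EMOTION": {"feel"},
    "ACTION": {"do", "make"},
    "MENTAL PERCEPTION": {"know", "think"},
    "MOVEMENT": {"move", "go", "come"},
    "PHYSICAL PERCEPTION": {"see", "hear", "taste", "smell", "touch"},
    "MANIPULATION": {"use"},
    "CONTACT/IMPACT": {"hit", "break"},
    "POSITION": {"put", "be"},
}


def domains_of(verb):
    """Return every domain whose prototype list contains the verb (hypothetical helper)."""
    return sorted(
        domain for domain, verbs in LEXICAL_DOMAINS.items()
        if verb.lower() in verbs
    )


print(domains_of("be"))     # ['EXISTENCE', 'POSITION']
print(domains_of("abate"))  # [] : not a prototype, needs manual classification
```

The verb be appears under two domains, and a collocate such as abate is not covered at all, which shows why a list of prototypes alone cannot replace the analysis of the verb's definition in combination with the term.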

Results
The results are presented according to the two functions of Sketch Engine used for corpus analysis: Word Sketch and Concordance.

Word Sketch
The information in Tables 2-6 is presented as displayed in Sketch Engine. The first column shows the collocate, the second column the absolute frequency, and the third the logDice score. The logDice score is used for determining how typical the collocation is. A high score means that the collocate is often found together with the node, and at the same time, there are not very many other nodes that the collocate combines with.¹
Table 2 shows that the verbs that collocate with pollution as an object in both corpora mostly belong to the domain of CAUSATIVE EXISTENCE, more specifically to cause something to exist (cause), to cause something to cease to exist (eliminate), and to cause something to not happen (prevent, avoid). Other important lexical domains are CHANGE, more specifically, to cause something to change by decreasing it (abate, reduce, minimize, mitigate, decrease, limit) and MANIPULATION (control, monitor). Finally, the lexical domains of VISUAL PERCEPTION, COGNITION, and SPEECH are present with verbs such as consider, define, and regard.
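For readers unfamiliar with the measure, logDice is commonly defined as 14 + log2(2·f(x,y) / (f(x) + f(y))), where f(x,y) is the co-occurrence frequency of node and collocate and f(x) and f(y) are their individual frequencies. The sketch below computes it from raw counts. The function name and the frequencies in the usage lines are invented for illustration and are not taken from Tables 2-6.

```python
import math


def log_dice(f_xy, f_x, f_y):
    """logDice = 14 + log2(2 * f_xy / (f_x + f_y)); the theoretical maximum is 14."""
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Invented counts: the node and collocate always occur together.
print(log_dice(10, 10, 10))  # 14.0

# Invented counts: each item occurs twice, but they co-occur only once.
print(log_dice(1, 2, 2))     # 13.0
```

Because the score depends only on the frequencies of the node, the collocate, and their co-occurrence, and not on corpus size, values can be compared between corpora of different sizes, such as the EEC and enLaw.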
In the word sketch of verbs with pollution as the subject (see Table 3), there are fewer results for the EEC because the numbers of collocations with pollution did not exceed the "auto" threshold, a default parameter in Sketch Engine based on corpus size.² This makes sense because the EEC is a corpus on the overall domain of environmental science; pollution is, thus, only one of the aspects to be considered. In contrast, in the enLaw corpus, POLLUTION is a central concept, and that is why collocations with pollution are statistically more relevant. The lexical domain of the verbs that predominate in both corpora is EXISTENCE: originate, occur, arise, be, emanate, become, and include. Another lexical domain present in both corpora is CHANGE (reduce, increase), more specifically, to cause something to change by making it worse (destroy, damage, harm, threaten). More general causative verbs, such as cause, affect, derive, and result, are also present.
The verb flush in the EEC word sketch of pollution is the result of the term pollution flushing, which is a process through which pollution is removed from a water body through natural or artificial currents or tides. It can be classified as causing something to cease to exist (EXISTENCE) or as MOVEMENT (Faber and Reimerink 2019).
After analyzing pollution, we also analyzed the verb pollute and the noun polluter in Word Sketch. In the results for the word sketch object_of, there were no obvious differences between the verb's behavior in enLaw and EEC, apart from the difference in the number of results (see Table 4). However, quite a few tagging mistakes were found, as some of the results are clearly objects (air, environment, groundwater, river, beach, soil, surface, stream, etc.), whereas others seemed to be clearly subjects of the verb (industry, activity, discharge, emission, facility, behavior, etc.). An example of the tagging mistakes is shown in the concordances for polluting industries in Figure 4, where polluting is obviously in an adjectival position. This shows that, although Word Sketch provides very valuable information in an easily accessible format, the processing of the corpora is not infallible, and therefore, manual analysis of concordances is necessary (see Section 3.2).
Table 5 shows which verbs collocate with polluter in the object slot. Once again, the enLaw corpus provides more results, some of which are directly related to the legal domain: prosecute and sue. This is why the concept of POLLUTER is only shown in relation to POLLUTION in the conceptual network restricted for environmental law. Another important lexical domain is MANIPULATION: implement, regulate, oblige, force, compel, deter, require, etc.
Finally, the word sketch polluter subject_of showed the verb pay as the very first result for both corpora. This is, of course, because one of the most important principles of environmental law is the polluter-pays principle (see Table 6).

Concordance
Apart from the fact that there were more results for pollution in enLaw, the lexical domains of the verbs collocating with pollution were very similar in both corpora. The differences pertained to some of the arguments of the verbs, which can be deduced from the results of the Concordance function of Sketch Engine. To illustrate this, we analyzed the verbs (i) abate and (ii) minimize, both from the lexical domain CHANGE (to cause something to change by decreasing it), and (iii) control from the lexical domain MANIPULATION.
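Concordances of this kind are retrieved with CQL queries that match a verb lemma followed, within a short window, by the noun lemma. The exact queries used in the study are not reproduced in the text, so the following is only a plausible sketch: the helper name, the three-token gap, and the `V.*` tag pattern are assumptions, while `lemma`, `tag`, and the `[]{m,n}` gap notation are standard CQL.

```python
def verb_noun_cql(verb_lemma, noun_lemma, max_gap=3):
    """Build a CQL query for a verb followed by a noun within max_gap tokens.

    Hypothetical helper: the window size and tag pattern are assumptions,
    not the settings reported by the authors.
    """
    if max_gap < 0:
        raise ValueError("max_gap must be non-negative")
    return (
        f'[lemma="{verb_lemma}" & tag="V.*"] '
        f'[]{{0,{max_gap}}} '
        f'[lemma="{noun_lemma}"]'
    )


for verb in ("abate", "minimize", "control"):
    print(verb_noun_cql(verb, "pollution"))
# First line printed:
# [lemma="abate" & tag="V.*"] []{0,3} [lemma="pollution"]
```

Restricting the first token to verb tags keeps noun uses such as pollution control out of the results, although, as the tagging mistakes noted for polluting show, the output still has to be checked manually.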
Figure 5 shows an extract of the concordances of the CQL abate + pollution in enLaw.The second argument that collocates with this combination is an institutional body (state, UK), a company (industries, firms), a measure (measures), or a cost (expenditures, costs).
Languages 2024, 9, x FOR PEER REVIEW 12 of 18 Figure 5 shows an extract of the concordances of the CQL abate + pollution in enLaw.The second argument that collocates with this combination is an institutional body (state, UK), a company (industries, firms), a measure (measures), or a cost (expenditures, costs).Collocations of abate + pollution in the EEC corpus showed the same second arguments, which is not surprising, as all the occurrences were in texts tagged as pertaining to the environmental law domain or the water treatment domain.
The second argument for the CQL minimize + pollution (Figure 6) is mostly a measure (requirements, directive, measures) in enLaw.However, the concordances for minimize + pollution in the EEC showed different second arguments (Figure 7).Infrastructural elements, such as water supply systems and wastewater treatment systems (concordance 2), locating wells in areas of deep groundwater and impermeable soils (4), bioethanol blending to petrol (6), the best available techniques not entailing excessive costs (9), and recycling techniques allied with good design practices (10) all refer to specific technical procedures developed by experts that have shown to be the best options for minimizing pollution.Natural gas (7) and mangrove soils and plants ( 12), on the other hand, are natural entities that help minimize pollution.Collocations of abate + pollution in the EEC corpus showed the same second arguments, which is not surprising, as all the occurrences were in texts tagged as pertaining to the environmental law domain or the water treatment domain.
The second argument for the CQL minimize + pollution (Figure 6) is mostly a measure (requirements, directive, measures) in enLaw.
Figure 5 shows an extract of the concordances of the CQL query abate + pollution in enLaw. The second argument that collocates with this combination is an institutional body (state, UK), a company (industries, firms), a measure (measures), or a cost (expenditures, costs). Collocations of abate + pollution in the EEC corpus showed the same second arguments, which is not surprising, as all the occurrences were in texts tagged as pertaining to the environmental law domain or the water treatment domain.
The second argument for the CQL query minimize + pollution (Figure 6) is mostly a measure (requirements, directive, measures) in enLaw. However, the concordances for minimize + pollution in the EEC showed different second arguments (Figure 7). Infrastructural elements, such as water supply systems and wastewater treatment systems (concordance 2), locating wells in areas of deep groundwater and impermeable soils (4), bioethanol blending to petrol (6), the best available techniques not entailing excessive costs (9), and recycling techniques allied with good design practices (10) all refer to specific technical procedures developed by experts that have been shown to be the best options for minimizing pollution. Natural gas (7) and mangrove soils and plants (12), on the other hand, are natural entities that help minimize pollution.
The second argument for the CQL query control + pollution (Figure 8) includes an institutional body (state, administration, agencies) and a measure (strategies, measures, regulations, laws) in enLaw. In the EEC, the second arguments fall into the same categories, again because the texts pertained to the environmental law, water treatment, and air quality management domains.
One of the participants that is specific to the POLLUTION frame in environmental law is, evidently, the POLLUTER. Figure 9 shows an extract of the concordances of the CQL query pollution caused_by in enLaw. The cause is the polluting industry (ship, operational discharges, activities) or the person or entity responsible (polluters, manufacturers, persons, parties, corporation). When choosing the examples for the phraseology module under the term pollution, the prominence of the POLLUTER must be made explicit.
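The second-argument categories reported for abate, minimize, and control in enLaw can be restated as a small lookup. The following Python sketch is purely illustrative: the dictionary and function names are hypothetical, the mapping contains only the words quoted in the text, and it does not describe how the concordances were actually annotated.

```python
from collections import Counter

# Hypothetical lookup built only from the second arguments quoted in the text;
# a real annotation scheme would need a much larger, curated lexicon.
ARGUMENT_CATEGORIES = {
    "state": "INSTITUTIONAL BODY",
    "UK": "INSTITUTIONAL BODY",
    "administration": "INSTITUTIONAL BODY",
    "agencies": "INSTITUTIONAL BODY",
    "industries": "COMPANY",
    "firms": "COMPANY",
    "measures": "MEASURE",
    "requirements": "MEASURE",
    "directive": "MEASURE",
    "strategies": "MEASURE",
    "regulations": "MEASURE",
    "laws": "MEASURE",
    "expenditures": "COST",
    "costs": "COST",
}

# Second arguments reported for each verb + pollution in enLaw.
EN_LAW = {
    "abate": ["state", "UK", "industries", "firms",
              "measures", "expenditures", "costs"],
    "minimize": ["requirements", "directive", "measures"],
    "control": ["state", "administration", "agencies",
                "strategies", "measures", "regulations", "laws"],
}


def categorize(second_arguments):
    """Count the semantic categories of a list of second arguments."""
    return Counter(
        ARGUMENT_CATEGORIES.get(arg, "UNCLASSIFIED") for arg in second_arguments
    )


def category_profile(corpus):
    """Map each verb to the sorted list of categories of its second arguments."""
    return {verb: sorted(categorize(args)) for verb, args in corpus.items()}


if __name__ == "__main__":
    for verb, categories in category_profile(EN_LAW).items():
        print(verb, "+ pollution:", ", ".join(categories))
```

Arguments that are not in the lookup, such as the technical procedures and natural entities found for minimize in the EEC, fall into UNCLASSIFIED, which flags where the category inventory would have to be extended for environmental science.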

Discussion
From the results shown in Section 3, certain conclusions can be drawn. First of all, POLLUTION is a much more central concept in the environmental law subdomain than in the general domain of environmental science. This can be deduced from the fact that there were often fewer results for the EEC than for enLaw: the numbers of collocations with pollution in the EEC did not exceed the threshold, whereas in the enLaw corpus, the collocations with pollution were statistically more relevant.
Secondly, apart from the fact that there were more results for pollution in enLaw, the lexical domains of the verbs collocating with pollution were very similar in both corpora. The verbs that collocate with pollution as an object in both corpora mostly belonged to the domain of CAUSATIVE EXISTENCE, more specifically to cause something to exist, to cause something to cease to exist, and to cause something to not happen. Other important lexical domains were CHANGE, more specifically to cause something to change by decreasing it, and MANIPULATION. The word sketch of verbs with pollution as the subject showed that the lexical domain that predominates in both corpora is EXISTENCE. Another lexical domain present in both corpora is CHANGE.
When we analyzed the verb pollute with the word sketch object_of, there were no obvious differences between the verb's behavior in enLaw and the EEC. When we studied the noun polluter as the object of verbs, verbs directly related to the legal domain, such as prosecute and sue, came up, and the most important lexical domain was MANIPULATION.
A few different second arguments arose when we analyzed the concordances of the verbs abate, minimize, and control. In particular, the categories for the second argument of minimize were very different in enLaw (measure) as compared to the EEC (technical procedures and natural entities).
Regarding the phraseology module in EcoLexicon, the verbs abate, minimize, and control will be included under the term pollution in the following phrasemes for the environmental law subdomain:
• INSTITUTIONAL BODY/COMPANY/MEASURE/COST + CHANGE [decrease: abate, minimize] + POLLUTION
• INSTITUTIONAL BODY/MEASURE + MANIPULATION [control] + POLLUTION
The first phraseme will be different for the environmental science domain as a whole. The examples of collocations for the phraseology module will be chosen so as to highlight the differences between the arguments in environmental law and environmental science, showing different examples depending on the contextualization of the POLLUTION frame.
Table 7 shows the information that will be included in EcoLexicon's phraseology module. Under the term pollution within the subdomain of environmental law, the different lexical domains will be presented with the verbs identified by corpus analysis. Clicking on a verb will show its second argument categories (e.g., INDUSTRY, INSTITUTIONAL BODY, and PERSON/COMPANY in the first row [EXISTENCE, cause to exist]). Clicking on these will in turn display example sentences that illustrate the verbs and arguments. In the table, the example sentences are shown for INSTITUTIONAL BODY causes POLLUTION, PERSON/COMPANY causes POLLUTION, INSTITUTIONAL BODY abates POLLUTION, COMPANY abates POLLUTION, and MEASURE abates POLLUTION.
As the examples show, emphasis in the environmental law domain is on the polluter (e.g., "state B causes pollution", "The cost is borne by the company who causes pollution"), the liability of the polluter before the courts ("Courts have allowed a common law suit..."), and the facets of the POLLUTION frame that stand out in environmental law: time and origin ("...dumping from ships and aircraft"). However, if the phraseology for POLLUTION is contextualized for environmental science, the phraseme for CHANGE [decrease] will change: the categories TECHNICAL PROCEDURE and NATURAL ENTITY will be shown, and the example sentences will change their focus to the polluting substance.
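The phrasemes proposed for the phraseology module can be thought of as records that combine argument categories, a lexical domain, and verbs, selected according to the contextualization of the POLLUTION frame. The Python sketch below is one possible encoding under that assumption. The class, field, and function names are hypothetical and do not reflect EcoLexicon's actual implementation. The environmental science entry lists only minimize, which is also an assumption, since the TECHNICAL PROCEDURE and NATURAL ENTITY arguments are reported only for that verb.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Phraseme:
    """Hypothetical record for one phraseme of the term POLLUTION."""
    argument_categories: Tuple[str, ...]
    lexical_domain: str
    verbs: Tuple[str, ...]
    dimension: Optional[str] = None  # e.g. "decrease" within CHANGE
    term: str = "POLLUTION"

    def pattern(self) -> str:
        """Render the phraseme in the ARGUMENT + DOMAIN [verbs] + TERM notation."""
        args = "/".join(self.argument_categories)
        verbs = ", ".join(self.verbs)
        slot = f"{self.dimension}: {verbs}" if self.dimension else verbs
        return f"{args} + {self.lexical_domain} [{slot}] + {self.term}"


# Phrasemes keyed by the context in which the POLLUTION frame is viewed.
MODULE = {
    "environmental law": [
        Phraseme(("INSTITUTIONAL BODY", "COMPANY", "MEASURE", "COST"),
                 "CHANGE", ("abate", "minimize"), "decrease"),
        Phraseme(("INSTITUTIONAL BODY", "MEASURE"),
                 "MANIPULATION", ("control",)),
    ],
    # Assumption: only minimize is listed here (see the lead-in).
    "environmental science": [
        Phraseme(("TECHNICAL PROCEDURE", "NATURAL ENTITY"),
                 "CHANGE", ("minimize",), "decrease"),
    ],
}


def phrasemes_for(context: str) -> List[str]:
    """Return the rendered phrasemes for a given contextualization."""
    return [p.pattern() for p in MODULE.get(context, [])]


if __name__ == "__main__":
    for context in MODULE:
        print(context)
        for line in phrasemes_for(context):
            print("  •", line)
```

Keying the records by context mirrors the idea that the same term shows different argument categories and examples depending on whether the frame is viewed from environmental law or from environmental science.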

Conclusions
The results described in this paper show that frame-based terminology provides the methodological underpinnings to extract the subtle differences between environmental science and its subdomains at the linguistic level. Specifically, verbal collocations in the environmental law domain differ from those in the environmental science domain with regard to the specificity of the arguments or even the activation of certain verbs. These differences must be included in terminological knowledge bases in order to provide an accurate representation of environmental knowledge, as they reveal the nuanced ways in which language is used across different contexts to discuss similar issues. For example, in the broader environmental domain, verbs associated with POLLUTION might include general actions like reduce, prevent, and control, reflecting a wide range of activities impacting the environment. Conversely, within the subdomain of environmental law, the phraseology becomes more precise, incorporating legal-specific verbs such as regulate and sue. This shift in terminology not only underscores the importance of context-specific language for clarity and precision in discourse but also highlights a conceptual change of perspective. Differences at the conceptual level pervade the linguistic level through the choice of verbs and their arguments. In the same way, the differences observed at the linguistic level can contextualize the representation of specialized concepts in the conceptual networks.
The present study adds to the still scarce research in specialized phraseology, as well as to studies in legal phraseology, which, to our knowledge, have not touched upon legal phraseology in scientific domains. Furthermore, it provides a proposal as to how to represent this phraseology in a terminological resource. The representation proposal, where the verbs of the phraseme are classified according to the lexical domain and the arguments are classified in broader semantic categories, provides a direct link between the phraseme and its underlying semantics. It therefore provides the necessary knowledge for end users when they need to choose between different phraseological options.
Representing this phraseological knowledge for all the terms in EcoLexicon in English and in Spanish will be one of the challenges for the future development of EcoLexicon.

Figure 1. Non-restricted conceptual network (left) and network restricted for the Caribbean (right) for WETLAND.

Figure 2. Conceptual network of POLLUTION without generic-specific relations and with a definition (Reimerink 2021).

Figure 3 shows the conceptual network for POLLUTION when applying contextual restrictions for the domain of environmental law. It includes the concept HISTORICAL POLLUTION, the additional participant POLLUTER, and the conceptual relations between the POLLUTER, the POLLUTANT, and POLLUTION.

Figure 3. Conceptual network of POLLUTION recontextualized for the domain of environmental law (Reimerink 2021).

Figure 4. Concordance extract of pollute + industry available from word sketch object_of pollute in enLaw.

Figure 5. Concordance extract of abate + pollution in enLaw. Second arguments are highlighted with a rectangle.

Figure 6. Concordance extract of minimize + pollution in enLaw. Second arguments are highlighted with a rectangle.

Figure 7. Concordance extract of minimize + pollution in the EEC. Second arguments are highlighted with a rectangle.

Figure 8. Concordance extract of control + pollution in enLaw. Second arguments are highlighted with a rectangle.

Figure 9. Concordance extract of pollution caused_by in enLaw. Second arguments are highlighted with a rectangle.

Table 1. Recontextualization of the definition of POLLUTION in environmental law.
POLLUTION (Environmental Law): Physical, chemical, or biological alteration of the air, water, or soil by means of microorganisms, chemicals, toxic substances, waste, or wastewater in a concentration that makes the medium unfit for its next intended use and that is caused by a person or company who can be held responsible under civil and/or criminal law.

Table 2. Word Sketch: first 25 verbs with pollution as object in enLaw and EEC.

Table 3. Word Sketch: first 25 verbs with pollution as subject in enLaw and EEC.

Table 4. Word Sketch: first 25 objects of pollute in enLaw and EEC.

Table 5. Word Sketch: first 25 results for polluter object_of in enLaw and EEC.
• INSTITUTIONAL BODY/COMPANY/MEASURE/COST + CHANGE [decrease: abate, minimize] + POLLUTION
• INSTITUTIONAL BODY/MEASURE + MANIPULATION [control] + POLLUTION
The first phraseme for the environmental science domain as a whole will be different.
In the table, the example sentences are shown for INSTITUTIONAL BODY causes POLLUTION, PERSON/COMPANY causes POLLUTION, INSTITUTIONAL BODY abates POLLUTION, COMPANY abates POLLUTION, and MEASURE abates POLLUTION.

Table 7. Proposal for phraseology module related to the term pollution in EcoLexicon.