The Phraseology of Legal French and Legal Popularisation in France and Canada: A Corpus-Assisted Analysis

Abstract: The popularisation of legal knowledge is a critical issue for equal access to law and justice. Legal discourse has been justly criticised for its obscure terminology and convoluted phrasing, which notably led to the Plain Language Movement in English-speaking countries. In Canada, the concept of Plain Language has been applied to French since the 1980s due to the official policy of bilingualism, while the concept has only been recently discussed in France. In this paper, we examine the impact of Plain Language rewriting on legal phraseology in French popularisation contexts. The first aim of our study is to see if plain texts published in France contain more traces of legal phraseology than French Canadian texts. Our second objective is to determine if a ‘phraseology of plain language’ can be identified across genres and languages. To do this, we compare two corpora of expert-to-expert legal texts written in French (made up, respectively, of legislative texts published in France and judicial texts published by the Supreme Court of Canada) with two corpora of texts that are claimed to have been written in Plain French Language for a non-expert readership (texts that guide laypersons through legal and administrative processes in France and summaries of decisions by the Supreme Court of Canada). Using n-grams, we extract and discuss the patterns that emerge from the corpora. In particular, our analyses rely on the concept of ‘lexico–grammatical patterns’, defined as the minimal unit of meaningful text made up of recurrent sequences of lexical and grammatical items. We then identify a sample of recurring lexico–grammatical patterns and their discursive functions.


Introduction
Plain Language (PL) is an attempt to encourage official institutions and other organisations to communicate with laypeople using accessible, clear, user-friendly language. The PL movement first gained momentum in the 1970s and 1980s in English-speaking countries: notably, in the United Kingdom, Australia, and New Zealand and then the United States and Canada (Asprey 2004). The initial domain of application for PL was legal and judicial contexts, such as the drafting of contracts and statutes, but the concept has since spread to other areas, such as public administration and medicine. PL differs from more formal language schemes (such as Basic English, Controlled Language, and so on) in that it corresponds to a nebulous series of stylistic preferences rather than an explicitly defined set of re-writing rules or vocabulary. The guidelines for PL include negative advice (avoid the passive, do not use rare or specialised terms, avoid complex verbs and complex prepositions, etc.) as well as more positive recommendations (use shorter sentences, prefer direct expressions, address the reader as 'you', etc.) (Cutts 2008; Williams 2004). Notwithstanding a lack of formal definition, PL has attained a high degree of official recognition in various English-speaking countries and has been implemented in both expert-to-expert communication (as in statutes) (Williams 2004, 2015) and expert-to-non-expert communication.
Many studies have been devoted to discussing the implementation of PL (Williams 2015, 2022) or its reception in various legal settings in English-speaking countries, including the United Kingdom and New Zealand (Masson and Waldron 1994; Rossetti et al. 2020). There has also been research from the point of view of discourse analysis that looks at popularisation and the dissemination of legal knowledge. In this paper, we focus on the popularisation of legal knowledge, defined by Engberg et al. (2018) as the recontextualisation of legal knowledge from contexts that exhibit a form of power asymmetry to a new non-expert context, with the intent of adapting the presentation of knowledge to the audience (Engberg et al. 2018, our paraphrase). As can be seen in this volume, numerous authors have investigated popularisation of the law carried out by legal institutions in various languages such as English (Cacchiani et al. 2018; Turnbull 2018), German (Luttermann and Engberg 2018) and French (Preite 2016, 2018), while others have examined popularisation in non-institutional contexts through various media, including studies of YouTube videos used by expert lawyers (Cavalieri et al. 2018), children's books (Diani 2018) or teaching applications of TV shows in legal English classes (Dąbrowski 2017). Our paper follows on from research on popularisation produced by legal institutions, as it focuses on legal information texts published by the French government and Plain Language Summaries of judgements published by the Supreme Court of Canada. Both genres are intended for a non-specialised audience, and to the best of our knowledge, these genres have seldom been compared.
Generally speaking, the concept of PL is less well established outside the English-speaking world, and there have consequently been far fewer studies on PL or on the principles of clear language, especially in languages such as French. However, it is notable that the principles of PL have been widely adopted and implemented in Canada, and therefore also in the French-speaking parts of Canada, as part of bilingual language policy under the name Langue Claire et Simple (Simple and Clear Language). Various legal professions (barristers) and institutions (the Supreme Court of Canada) now claim to use PL in Canada (Asprey 2004). In contrast, the concept of PL is not as well-developed in France, and it has not achieved the same level of institutional recognition. One reason for this may be that relations between French citizens and the administration (Service Public) are notoriously difficult. This has been argued by both linguists (Collette et al. 2002) and independent observers, who have pointed out the complexity of legal and administrative procedures for French users of the law (des Droits 2019). Thus, although there have been attempts to implement the principles of PL in some contexts, it strikes us that there is generally still a considerable gap between the user-friendly discourse adopted by 'public-facing' organisations in many English-speaking contexts and the highly elaborate 'techno-heavy' style of French official discourse.
These general observations lead us to test two assumptions in this paper. In the first instance, we set out to test the hypothesis that French texts from France (FR FRA) are 'less simplified' than comparable texts from English-speaking countries (EN UK, EN NZ, etc.). This hypothesis was tested and partly confirmed in Bouyé (2022). In this paper, we test the related hypothesis that Canadian-French texts (FR CAN) are 'more simplified' than their European French counterparts (FR FRA). By 'more or less simplified', we do not mean a single quantifiable characteristic but rather two different configurations that can be identified systematically in two types of discourse (expert legal texts vs. plain legal texts). More specifically, abundant research has established that legal language is characterised by a 'highly nominal' style (Crystal and Davy 1969), one of the characteristics that legal French shares with English, amongst other languages, along with Latin and Latinate terms, set formulae, a formal register, complex syntax due to a high degree of subordination, and the use of complex prepositional phrases (Galonnier 1997). One of our objectives is to examine the impact of the reconfiguration of legal knowledge on certain syntactic and stylistic features, including nominal and prepositional forms, as PL guidelines often encourage the use of verbal rather than nominal forms (Plain English Campaign 2022).
In the remaining sections of this paper, we explore these questions quantitatively and through the prism of phraseology. Several analysts have examined the phraseology of legal language, with a number of studies looking at regular expressions and lexical bundles in English (Biel 2017; Breeze 2017; Goźdź-Roszkowski and Pontrandolfo 2013; Goźdź-Roszkowski and Pontrandolfo 2017). However, fewer studies have been conducted on the phraseology of PL in French. What we mean by 'the phraseology of plain language' is simply the typical wordings (routine formulae, extended collocations, lexico-grammatical patterns, etc.) that can be seen as statistically significant in one type of text (popularised texts, mediated knowledge, etc.) when compared to other types of text. More specifically, our concept of 'phraseology' corresponds to recurring sequences of lexico-grammatical items that operate as whole semantic units and serve a regular discourse function within a specific type of discourse or genre, a notion we have explored elsewhere (Bouyé and Gledhill 2019). Thus, in popularised legal texts, it is possible to identify recurrent sequences of this type (This is called ...; Dans un délai de) and to associate these sequences with specific discourse functions. These discourse functions include referential functions, i.e., sequences that refer to participants or elements of the legal process, as well as metatextual functions, such as sequences that define a term or direct the reader towards another part of the text or towards another text. In one of the first studies of this type (Bouyé and Gledhill 2019), we attempted to set out some characteristics of PL phraseology in English and French using n-grams. In this paper, we return to the concept of the 'lexico-grammatical pattern' (LxGr) in order to examine whether there is such a phenomenon as the 'phraseology of simplification'. In order to grasp the implications of this, it is important to provide a more formal account of LxGr patterns. We define LxGr patterns (Gledhill et al. 2017) as recurrent sequences of lexical items ('collocations') that correspond to regular grammatical structures¹ and that have a recognisable frame of reference or discourse function. Thus, each LxGr pattern corresponds to a 'minimal meaningful unit of text'. Unlike n-grams and other fixed sequences, LxGr patterns are productive and potentially discontinuous. The simplest forms of LxGr patterns are routine formulae or 'speech acts' (such as greetings, warnings, official pronouncements, etc.) (Gledhill et al. 2017).
In addition to our hypotheses regarding Plain Language in legal discourse, in this paper we also examine a number of more general research questions regarding LxGr patterns. In particular: what is the smallest possible sequence of items (n-gram) that corresponds to an LxGr pattern (i.e., a meaningful unit of text)? Furthermore, given a random selection of n-grams, is it possible to predict the discourse function for that LxGr (e.g., definition, procedure, explanation, evaluation, etc.)?
Returning more specifically to the topic of Plain Language in French, this paper considers the following research questions:

• Are there characteristic LxGr patterns in legal texts (i.e., in non-plain legal texts)?
• Are there traces of such patterns in PL texts?
• Similarly, are there characteristic LxGr patterns of PL in administrative discourse?
• More specifically, is it possible to establish a difference between generic phraseology (belonging to several 'genres') and specific phraseology (patterns that are 'unique' or at least more salient in one genre as opposed to all the others)?
We propose to answer these questions in the following sections. Section 2 introduces our data and corpus tools. Sections 3 and 4 present, analyse and discuss the results obtained from our data.

Data
The textual data used in this study are based on two French-language corpora: one consisting of popularisation texts destined for non-expert law users, entitled PLAIN, and the second, entitled LEX, made up of expert-to-expert written legal genres. Each of these corpora is subdivided into two subcorpora.
Concerning the PLAIN corpus, as mentioned above, this paper focuses on the popularisation of legal knowledge in French published by two government institutions. Turnbull (2018) distinguishes between 'popularisation', defined as the recontextualisation of information with the aim of broadening the reader's general knowledge, and 'knowledge mediation', in which information is transferred with the aim of allowing readers to take action performatively and thus to 'empower' themselves. The two popularisation genres represented here can be said to be instances of both mediation and popularisation as they relate directly to citizens' access to justice (accès au droit et à la justice) and public understanding of rights (connaître et faire valoir ses droits) and aim to help their readers make sense of the decisions taken by major judicial institutions or guide them through various legal and administrative processes. The first PLAIN subcorpus is composed of texts published by one of the most popular public service dissemination websites in France: Service Public. The texts are drafted and published by the governmental agency Direction de l'information légale et administrative (DILA), which is a department of the French Prime Minister's Office. It is one of the main public service legal mediators in the country. This subcorpus, entitled FR-Admin-DILA, is made up of 337 texts published between 2017 and 2019 and contains 466,472 word tokens. The second PLAIN subcorpus was collected from the Canadian Supreme Court website. It is made up of 66 summaries of decisions taken by the Canadian Supreme Court between 2017 and 2019. The small size of this corpus, which comprises 68,025 word tokens, can be explained by the specific type of text it is made of. The summaries, or 'Cases in Brief', are short accounts of the Court's judgements that recall the facts, explain the final decision reached by the Court and explain the positions of both the majority and dissenting opinions. This corpus is called FR-CA-Résumés.
As for the LEX corpus, it comprises FR-LAW, which contains articles and excerpts from statutes that were drafted between 1967 and 2018 and were still in force at the time the corpus was compiled. It is representative of the legislative register and contains 5,083,750 word tokens. The second specialised corpus, entitled CA-Judgements, is made up of judicial texts: namely, decisions written and delivered by the Justices from the Supreme Court of Canada published between 2017 and 2018. It contains a total of 682,294 word tokens.

Methodology
Although in this paper we are focusing on phraseology, the first step in our analysis is to establish candidate sequences in the form of n-grams. In this case, we are interested in n-grams that identify parts of speech (POS) rather than just lexical forms. To identify salient POS-grams, we use the n-gram function of the concordance software Sketch Engine. Part-of-speech tags (not words) are used as attributes in order to extract not only major lexical bundles but also possibly salient syntactic regularities that could characterise our corpora. POS-grams allow us to capture the most salient grammatical constructions that may be candidate forms for more 'recognisable' LxGr patterns (as mentioned below, not all n-grams are potential LxGr patterns). To extract the most salient POS-grams in these corpora, a general reference corpus was used. This was the French Web Corpus (Sketch Engine) 2017, which is part of the TenTen family, a set of corpora obtained through webcrawling (Jakubíček et al. 2013) that contains a variety of text types, including news texts. We consider the French TenTen to be a general reference corpus. It must be noted that although this reference corpus contains more than four billion word tokens, only the first million word tokens in the corpus are used when computing the frequency on Sketch Engine. Key POS-gram candidates were identified based on Sketch Engine's keyness score feature, which uses the 'simple math method' (Kilgarriff 2009) to identify key words or key n-grams based on the normalised frequency of the word or n-gram in the focus corpus in relation to its normalised frequency in the reference corpus and includes a smoothing parameter. For each subcorpus, POS-grams with a keyness score above 100 were marked as LxGr-candidates. This means they were found to be at least a hundred times more frequent in the subcorpora than in the French TenTen reference corpus. To ensure that the selected patterns were over-represented in the legal or plain legal corpora as compared to the reference corpus, chi-square tests were performed for each dataset using R. This point also applies to the other results we mention in Section 4 of this paper.
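The two statistics mentioned above can be illustrated with a minimal sketch. It assumes that the smoothing parameter of the 'simple math method' is 1 (the value is not stated in this paper) and it computes a plain Pearson chi-square statistic without continuity correction, so the result may differ from the default output of R's chisq.test for a 2 x 2 table. The function names and the counts passed to the chi-square function are hypothetical and serve only to show the calculation.

```python
def simple_maths_keyness(fpm_focus: float, fpm_ref: float, smoothing: float = 1.0) -> float:
    """Keyness = (frequency pmw in focus + n) / (frequency pmw in reference + n).

    The smoothing value n = 1 is an assumption made for this sketch.
    """
    return (fpm_focus + smoothing) / (fpm_ref + smoothing)


def chi_square_2x2(count_focus: int, size_focus: int, count_ref: int, size_ref: int) -> float:
    """Pearson chi-square statistic for one item in two corpora (no continuity correction)."""
    a, b = count_focus, size_focus - count_focus  # item vs. all other tokens, focus corpus
    c, d = count_ref, size_ref - count_ref        # item vs. all other tokens, reference corpus
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Relative frequencies reported in this paper for "dans un délai de":
# 568.1 pmw in the French administrative subcorpus, 0.9 pmw in the reference corpus.
score = simple_maths_keyness(fpm_focus=568.1, fpm_ref=0.9)
print(round(score, 1), score > 100)  # a candidate if the score exceeds the threshold of 100

# Invented counts, used only to show how the test statistic is obtained.
print(round(chi_square_2x2(10, 100, 20, 100), 3))
```

With a smoothing value of 1, very rare items in the reference corpus cannot produce arbitrarily large scores, which is the design motivation usually given for this method.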
The n-gram function with POS tags as attributes returned a list of POS tag sequences that had to be converted back into readable POS patterns or lexical bundles. The analysis of concordances for each candidate POS-gram thus had to be carried out to identify LxGr patterns and their functions. Many POS-gram sequences with very high keyness scores corresponded to noise, i.e., they referred to numbers, prices or abbreviations in the various corpora. Others corresponded to parts of larger LxGr patterns. It was therefore necessary to consider POS-grams using contextual analysis and concordances. In Section 3, we present some POS-grams and patterns that were both highly recurring and interesting in terms of their rhetorical functions. This explains why some of our corpora have more distinctive patterns than others. As mentioned above, we do not attempt to make a simple distinction between 'specialised' and 'popularised' phraseology, but we are attempting to identify the typical patterns that emerge in comparable corpora, which can be characterised as expert-oriented ('specialised') and non-expert oriented ('popularised'). In the following discussion, we see examples of salient phraseology that can be identified as typical (in a statistical sense) in one corpus and atypical (or absent) in the other or can be observed as occurring in both corpora. This does not mean that we are in a position to fully characterise the phraseology of simplified language; rather, we claim here simply that it is possible to identify a representative sample of the most salient (outstanding, archetypical) phraseological units in our corpora, thus paving the way for a more complete analysis using other methods (e.g., textometrics).
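The conversion from POS tag sequences back to readable lexical sequences can be sketched as follows. This is only an illustration of the general idea and not the Sketch Engine procedure itself: the function name is hypothetical, and the tag labels (PRP, DET, NOM, NUM) are assumed for the example rather than taken from the tagset used in our corpora.

```python
from collections import Counter, defaultdict


def pos_ngrams(tagged_tokens, n):
    """Count POS n-grams and keep the word sequences that realise each of them."""
    counts = Counter()
    realisations = defaultdict(list)
    for i in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[i:i + n]
        tags = tuple(tag for _, tag in window)
        counts[tags] += 1
        # Store the lexical sequence so that each POS-gram can be read in context.
        realisations[tags].append(" ".join(word for word, _ in window))
    return counts, realisations


# (word, tag) pairs; the tags are assumed labels for illustration.
tagged = [("dans", "PRP"), ("un", "DET"), ("délai", "NOM"),
          ("de", "PRP"), ("six", "NUM"), ("mois", "NOM")]

counts, realisations = pos_ngrams(tagged, 3)
for tags, frequency in counts.most_common():
    print(tags, frequency, realisations[tags])
```

Keeping the lexical realisations next to each POS-gram makes it easier to separate noise (numbers, prices, abbreviations) from sequences that deserve a closer look in the concordances.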
We then performed a qualitative analysis of salient sequences based on domain-specific discourse functions put forward by Goźdź-Roszkowski et al. (2012). These include the category 'legal reference bundles': in particular, Institutional bundles, which refer to institutions; or Terminological bundles, which refer to specialised terms. The second category that was used to classify the POS-grams is text-oriented bundles: in particular, Structuring bundles. The final category used by Goźdź-Roszkowski et al. (2012) is Stance-oriented bundles, which contains Attitudinal bundles and Epistemic Stance bundles. Not all of these categories are represented in the results below.
In the rest of this paper, we use the following typographic conventions to refer to LxGr patterns. Lexical items are shown in italics, and each LxGr pattern is presented between angle brackets < >. All frequencies given in Tables are relative frequencies per million words (pmw).
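For readers who wish to relate the pmw figures to raw counts, the normalisation can be sketched as below. The function name is hypothetical, and the raw count of 265 is not a figure reported in this paper: it is an assumed value chosen because it is consistent with the relative frequency given later for dans un délai de in the French administrative subcorpus (466,472 word tokens).

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Relative frequency per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 1_000_000


# 265 is an assumed raw count used for illustration only.
print(round(per_million(265, 466_472), 1))
```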

Analysis of Lexicogrammatical Patterns in Legal Texts
Table 1 shows some of the key POS-grams that were extracted from the FR-Law corpus from France. Some patterns found in the results are specific (unique) to the type of discourse, or register, as can be seen when comparing the relative frequencies of these POS-grams in the two focus corpora, where they occur several hundred or even thousand times per million words, whereas they only appear a few times or a few dozen times in the general reference corpus. For example, Table 2 shows two specific 3-grams from the CA-Judgements corpus: la Cour d'Appel, which means 'the Court of Appeal' and les juges majoritaires, 'the majority judges'. Both are fragments of phrases that belong to longer Institutional bundles. These phrases refer to central figures and institutions of the Common law system: namely, the judges from the lower court along with the Court of Appeal, whose decision is the basis for a case being brought before the Supreme Court. In the case of the FR-Law corpus, one recurring pattern is le cas échéant, which means 'if the [aforementioned] conditions are fulfilled'. As we will see, these are key terms and phrases in legal discourse that can also be found in PLAIN texts but with only marginal recurrence (20 pmw).
What can be noted from Tables 1 and 2, however, is that both genres share common LxGr patterns, although with some variation. Most of these patterns correspond to what Goźdź-Roszkowski et al. (2012) call Textual bundles and refer to other statutes or decisions or to other articles or sections in the same statute or judgement. These patterns are linked to cross-referencing: a broader characteristic of legal discourse (Tiersma 2000) that can be found in both genres (statute or Supreme Court opinion). In statutes, Bhatia (1994) calls these Referential provisions and explains that they point intertextually to other legislative texts or passages within the same text.
(CA-Judgements) Section 662 of the Criminal Code provides that where a person is charged with one offence, but only a part of that offence is proved, he or she may be convicted of a lesser, included offence. (Official Supreme Court of Canada translation)
(2) La durée de la période prévue <à l'article L. 434-9> est fixée à trois ans. (FR-Law)
The duration of the period is set to three years under art. L. 434-9. (Our translation)
Such patterns (using a prepositional phrase in French and a thematised phrase in English) are pervasive in this type of expert discourse (statutes, decisions), which requires constant references to previous decisions, statutes and legal instruments.
Another particularly interesting candidate is represented by the sequence <Prep + N + Prep + N + Prep>. This is a good LxGr candidate since it consists of a highly regular recurring grammatical structure but also allows for a large amount of lexical variation. A quick examination of some concordance results reveals that this pattern corresponds to a complex prepositional sequence, i.e., a prepositional phrase that introduces another noun phrase or prepositional phrase(s).
(3) Les présents pourvois portent <sur l'étendue de l'obligation de communication du ministère public en ce qui a trait aux> registres d'entretien des alcootests. (CA-Judgements)
These appeals deal with the scope of the Crown's disclosure obligations with respect to maintenance records of breathalyzer instruments. (Official translation by the Supreme Court of Canada)
The fact that this pattern is highly salient in our French legal corpora is significant. The use of complex noun chains and prepositional cascades is considered a typical feature of legal language, especially when it is in the form of complex prepositions in expressions such as for the purpose of² (Bhatia 1983; Biel 2017; Coulthard et al. 2016). Even more significantly, according to PL drafting guides in English and French, complex nominals and prepositional chains are among those features that should be avoided by legal writers because they purportedly contribute to the heavy nominal style (Crystal and Davy 1969) of legal language. In addition, the packing of information in nominal or prepositional cascades increases lexical density in texts (Halliday 1994). We discuss this pattern further in Section 3.3 since it is also found to some extent in PLAIN texts from our corpora.
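To make the notion of a prepositional cascade more concrete, the sketch below searches a POS-tagged sentence for the sequence <Prep + N + Prep + N + Prep>. Several choices here are assumptions made for illustration: the tag labels (PRP, DET, NOM, VER), the function name, and above all the decision to allow an optional determiner before each noun, which is a relaxation of the strict POS-gram and not part of the extraction reported in this paper.

```python
import re

# One letter per tag so that a pattern over tags can be written as a regular expression.
TAG_TO_LETTER = {"PRP": "P", "DET": "D", "NOM": "N"}

# Prep (Det) N Prep (Det) N Prep; the optional determiners are an assumption of this sketch.
CASCADE = re.compile(r"PD?NPD?NP")


def find_prepositional_cascades(tagged_tokens):
    """Return the word sequences whose tags match the cascade pattern."""
    letters = "".join(TAG_TO_LETTER.get(tag, "x") for _, tag in tagged_tokens)
    return [
        [word for word, _ in tagged_tokens[m.start():m.end()]]
        for m in CASCADE.finditer(letters)
    ]


# Fragment of Example 3, tagged by hand with assumed labels.
fragment = [("portent", "VER"), ("sur", "PRP"), ("l'", "DET"), ("étendue", "NOM"),
            ("de", "PRP"), ("l'", "DET"), ("obligation", "NOM"),
            ("de", "PRP"), ("communication", "NOM")]

print(find_prepositional_cascades(fragment))
```

Because each token maps to exactly one letter, match positions in the letter string can be used directly as token indices.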
It is also interesting to note at this point that many of the patterns we have identified above can be found within other patterns (either embedded or adjacent), as is evident in the example below, an excerpt from the French-Law corpus, in which two patterns from Table 1 can be identified:
The fact that patterns occur alongside or within other patterns (thus, we use the term 'chains of patterns' or cascades) corroborates the idea that phraseological patterns constitute the building blocks of specialised discourse and here, in particular, of legislative or judicial discourse. We now turn to the main LxGr patterns in the PLAIN corpus.

In the European French Admin Corpus
The key POS-grams in the French administrative corpus addressing law users correspond to very specific n-grams (or lexical bundles). Table 3 shows the most frequent patterns. Although many of these items are very short, we suggest that several of these sequences are fragments of longer LxGr patterns, which themselves constitute recognisably meaningful units of text (according to our definition above). Many of these sequences correspond to text-oriented patterns, framed as direct questions, which have a cohesive, organisational function. These LxGr patterns either introduce the definition of a just-mentioned term, as in Examples 5 and 6 below, or explain who or what institution the law users can contact next in the legal process they are involved in. In Example 7, the question 'Who should I contact?' is immediately followed by a sentence that gives the phone number of the Police.
(6) <De quoi s'agit-il ?> Le contrôle judiciaire est une mesure qui soumet la personne mise en cause dans une affaire pénale à une ou plusieurs obligations, dans l'attente de son procès. (FR-Admin-PL)
<What does this mean?> Bail is a measure that can be given, with one or more conditions, to a person accused of a criminal offence while they wait for trial.
[...] (2018) have called 'conversational turns'. As such, they represent significant structuring devices in the dissemination of legal knowledge. Drafters of legal knowledge mediation texts imagine and anticipate the questions law users might have and use interrogative sentences to structure their texts when explaining the steps of the legal procedure or situations related to a legal right. In the case of De quoi s'agit-il ? (Example 5), we have an explicit signal that the text is going to provide the definition of a term. As we see later on, this structure is comparable to <il y a + N>, in which the authors interrupt their exposition to provide an explicit metadiscoursal definition. The difference here is that <De quoi s'agit-il ?> and <Où s'adresser> both belong to oral discourse and are explicit markers of turn-taking, whereas <Il y a X quand / lorsque> belongs to elaborate, expository discourse.
Other patterns correspond to lexical phrases that are linked to the steps of the legal or administrative process itself, such as the complex prepositional phrase dans un délai de + N, which defines the time limit for legal action for law users. This highly frequent bundle, which appears only 0.9 pmw in the general French corpus, appears to be quite specific to the FR-Admin corpus. It is usually part of an extended LxGr pattern that contains an Actor (usually second-person You), a verb phrase referring to some type of legal action (such as reporting a crime or referring to an institution) and the complex prepositional phrase, which functions as a time adjunct.
(8) À savoir : en raison des règles de prescription, vous devez déposer votre plainte pour viol <dans un délai de 20 ans> à compter de la date des faits. (FR-PL-Admin)
Please be aware that because of the statute of limitations, you need to file a rape complaint within twenty years. (Our translation)
(9) Si votre demande est acceptée, vous en êtes informé par courrier <dans un délai de 4 mois>. (FR-PL-Admin)
If your request is approved, you will be notified within four months.
In Example 8, the time limit that is defined is twenty years, and the legal action is reporting a specific felony (rape). Interestingly, this type of pattern can also be found in the FR-Law corpus, as in the example below. Although the pattern <dans un délai de> appears in the legislative corpus, it is less than half as frequent in the specialised corpus (229.5 pmw vs. 568.1 pmw in the PLAIN FR-Admin corpus, i.e., a ratio of about 0.4). It thus seems to be relatively more specific to the discourse of administrative French.
(10) L'autorité administrative statue sur la demande <dans un délai de six mois> à compter du dépôt par l'étranger du dossier complet de cette demande. (FR-Law)
The authority reviews the application within six months after the application has been made. (Our translation)
Whereas the pattern is found in sequences wherein the reader is directly addressed in the PL administrative texts, the legislative excerpt (Example 10) is much more impersonal, as the subject of the verb is an abstract entity (the administrative authority).
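The time-limit pattern discussed above lends itself to a simple lexical search. The sketch below uses a regular expression to retrieve the quantity and unit that follow dans un délai de in Examples 8, 9 and 10. The function name and the closed list of time units are assumptions made for this illustration, so other units or more complex noun phrases would not be retrieved.

```python
import re

# The list of units (ans, mois, jours, semaines) is an assumption of this sketch.
TIME_LIMIT = re.compile(
    r"dans un délai de\s+(\w+)\s+(ans?|mois|jours?|semaines?)\b",
    re.IGNORECASE,
)


def extract_time_limits(text):
    """Return (quantity, unit) pairs introduced by 'dans un délai de'."""
    return [(m.group(1), m.group(2)) for m in TIME_LIMIT.finditer(text)]


examples = [
    "vous devez déposer votre plainte pour viol dans un délai de 20 ans à compter de la date des faits.",
    "Si votre demande est acceptée, vous en êtes informé par courrier dans un délai de 4 mois.",
    "L'autorité administrative statue sur la demande dans un délai de six mois à compter du dépôt.",
]
for sentence in examples:
    print(extract_time_limits(sentence))
```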
In the specialised texts (FR-LEX, CA-Judgements), we find a number of sequences in which the subject of the verb corresponds very often to an abstract legal or administrative concept. In the administrative corpus, the typical subject of certain patterns corresponds more often to the law user (thus representing a significant re-orientation of the discourse). This difference suggests what has been called a process of 'personalisation' (Turnbull 2018), by which expert legal knowledge is reformulated for a non-expert readership.
For example, one of the most common POS-grams we find in the corpus is <Vous devez/pouvez + V>, involving the second person vous and a modal verb expressing either obligation (devez = must) or possibility (pouvez = can/may). The lexical verb that is introduced in these contexts often expresses an administrative procedure (expressed as a Material or Behavioural process³).
(11) <Vous devez écrire> directement au procureur de la République.
You can collect evidence of your harassment yourself. (Our translation) (FR-Admin-PL)
Similarly to dans un délai de, this pattern is usually associated with various types of legal action, albeit not as fixed in terms of the syntactic frame. The modals of both obligation and possibility vary according to the way in which legal dissemination texts define the user's legal rights and obligations. As with previous examples of reorientation, the use of the second-person pronoun is part of the communication strategy called 'conversationalization of public discourse' (Turnbull 2018), which is performed through the use of direct questions (as seen above) as well as first- or second-person pronouns to reformulate highly abstract legal knowledge and to create a form of dialogue between the institution and the non-expert readership.
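A rough lexical approximation of the <Vous devez/pouvez + V> pattern can be written as a regular expression, as in the sketch below. The function name and the modality labels are hypothetical. The expression simply captures the word that follows the modal, on the assumption that it is the lexical verb; an intervening adverb (for instance, également) would be captured instead, so a POS-tagged search remains the more reliable option.

```python
import re

MODAL_PATTERN = re.compile(r"\bvous\s+(devez|pouvez)\s+(\w+)", re.IGNORECASE)

# Labels follow the glosses given in the text: devez = obligation, pouvez = possibility.
MODALITY = {"devez": "obligation", "pouvez": "possibility"}


def find_modal_patterns(text):
    """Return (modal, following word, modality) for each 'vous devez/pouvez + word'."""
    results = []
    for m in MODAL_PATTERN.finditer(text):
        modal = m.group(1).lower()
        results.append((modal, m.group(2), MODALITY[modal]))
    return results


print(find_modal_patterns("Vous devez écrire directement au procureur de la République."))
print(find_modal_patterns("vous devez déposer votre plainte pour viol dans un délai de 20 ans"))
```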

Summary of the Canadian Plain Language Corpus
The most salient POS-grams and lexical bundles in the summaries of judgements are presented in Table 4. The LxGr patterns in the CA-PL-Summaries are mostly Institutional bundles, referring to either institutions (the Court of Appeal), legal actors (judges) or to foundational legal texts: in particular, the Canadian Charter of Rights and Freedoms, which sets out and protects a number of rights and freedoms, as can be seen below. It can be noted that several of the shorter LxGr patterns presented in Table 4 and found in the CA-PL-Summaries are the same as those found in the CA-Judgements corpus: namely, la Cour d'appel ('the Court of Appeal') or an extended LxGr pattern of the type les juges majoritaires ont + V + que ('the majority judges + V + that'). These mostly correspond to key terms of judicial discourse in the Common Law system, which explains why they are used abundantly by the Supreme Court Justices in their judgements. As for the summaries, they have both an encapsulating and explanatory function. As such, they need to explain what the various courts involved in a case have decided and, in particular, what the majority opinion of the judges was; hence, there are frequent references to Courts of Appeal and to the majority judges.
What is interesting is that many of the sequences found through the analysis of POS-grams reveal extended LxGr patterns, often corresponding to projection (expression of engagement through indirect speech).
These examples appear to belong to an extended LxGr pattern for which we suggest the following formula: <Institutional subject + Communicative process (state, confirm) + que (that) + Reporting clause>. This pattern is linked to a key inherent function of the summary genre, which is to report the Supreme Court's decisions by explaining the opinions expressed not only by the Justices but also by the inferior courts. Some authors discuss this in terms of 'discursive heterogeneity' (Preite 2016), referring to the fact that legal popularisation discourse (like all popularisation discourse) is based on a 'primary' specialised discourse that is explicitly mentioned as a legitimising source. This is especially true in the summaries of decisions, as the figure of the judge is at the core of the explanation and elaboration strategies.

Focus on Phraseology: Patterns from Legal Texts Also Found in the PLAIN Corpora
We have seen a number of patterns that appear to be specific to certain genres or to a certain type of register (legal or popularised text). We now turn to some patterns that are of particular interest because they are highly representative of legalese and yet can also be found in the PLAIN legal texts from our corpus.

Il y a (There Is/There Are)
The first pattern we want to focus on is the existential or presentational structure Il y a, which is usually translated as There is/are. Although its keyness score is not extremely high, the pattern caught our attention during data analysis, as it is used with a specific syntax and is associated with an extended LxGr pattern that has a specific discourse function in the concordances of both legal and plain texts. It also exhibits a higher relative frequency in our corpora than in the reference corpus (FR-Law: 125.9 pmw; CA-Judgements: 109.3 pmw; FR-Admin-PL: 132.9 pmw vs. FrenchTenTen: 34.11 pmw). Some examples of this pattern and its use are set out in Figures 1 and 2. In the excerpts from FR-Law and CA-Judgements shown in Figure 1, this pattern is used to define a legal term or the conditions for a legal term to be established, especially a crime or felony. In the excerpts from the PLAIN texts shown in Figure 2, a similar pattern is likewise used to specify and define a legal term. In the figures, the existential structure il y a is shown in black, the term in red, and the conjunction introducing a conditional clause or definition in purple. Significantly, the term is always one that has just been mentioned in the immediate co-text. Grammatically speaking, the indefinite article is always elided before the term in this construction, which is unusual in French: hence our choice to focus on this structure. This particular feature allows us to draw up a formula for the LxGr pattern as a whole: <[term in preceding discourse] ... il y a + [Ø] + [name of the term] // conditional clause + definition>. Finally, a notable difference between the corpora is that in the PL texts, the conjunction that introduces the definition or the conditions for the crime or felony is quand (when) or si (if), while the legal texts use the more formal alternative lorsque.
The structure Il y a + [Ø article] + definition is also used in other popularisation/dissemination discourses, such as medical popularisation. The example below is an excerpt from the French National Health Service website explaining the signs of cardiac arrest; it introduces the term using the same <il y a + [Ø]> construction.
(17) <Il y a arrêt cardiaque si> : la victime perd connaissance, tombe et ne réagit pas quand on lui parle ou qu'on la stimule
It is a case of cardiac arrest if the victim is unconscious, falls or shows no reaction when talked to or stimulated (our translation).
Other types of dissemination discourse that use this pattern include promotional discourse or game rules.It appears to be specific to elaboration strategies and definitions of terms in French.
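As a rough illustration of how the formula <il y a + [Ø] + term + conjunction> could be turned into a corpus query, the following Python sketch approximates it with a regular expression over raw text. It is not the query used in this study: the determiner list, the limit of four words for the term, and the names DETERMINERS, IL_Y_A_DEF and find_definitions are assumptions made for the example. A search of this kind would both over- and under-match compared with a query on POS-tagged data.

```python
import re
from typing import List, Tuple

# Assumption: a short, non-exhaustive list of French determiners. If one of
# them follows "il y a", the zero-article condition [Ø] is not met.
DETERMINERS = r"(?:un|une|des|du|de|le|la|les|d['’]|l['’])"

# Assumption: the term is one to four words long and is directly followed by
# one of the conjunctions discussed in the text (lorsque, quand, si).
IL_Y_A_DEF = re.compile(
    r"\bil\s+y\s+a\s+"                              # existential structure
    r"(?!" + DETERMINERS + r"\b)"                   # no determiner before the term
    r"(?P<term>[\w'’-]+(?:\s+[\w'’-]+){0,3}?)\s+"   # the term being defined
    r"(?P<conj>lorsque|lorsqu['’]|quand|si)\b",     # conjunction opening the definition
    flags=re.IGNORECASE,
)


def find_definitions(text: str) -> List[Tuple[str, str]]:
    """Return (term, conjunction) pairs for every match of the pattern."""
    return [
        (m.group("term"), m.group("conj").lower())
        for m in IL_Y_A_DEF.finditer(text)
    ]


# Example (17), quoted in the text above.
example_17 = (
    "Il y a arrêt cardiaque si : la victime perd connaissance, "
    "tombe et ne réagit pas"
)
print(find_definitions(example_17))
```

Grouping the hits by the conjunction returned would be one way to compare the lorsque variant with the quand/si variant across subcorpora.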

Prepositional Cascades
The second pattern is often found in specialised legal texts and is especially frequent in the FR-Law corpus, although the CA-Judgements corpus also contains some examples as well as variations on the same type of structure. The pattern in question can be seen as an expanded version of the complex prepositional chains mentioned in Section 3.1, as it is composed of a chain of three prepositional phrases. What is surprising is that this highly complex structure can also be found routinely in the texts from the PLAIN corpus. Table 5 displays the frequency (pmw) of this POS-gram in the specialised corpora and their simplified versions as well as an example from each corpus. What stands out in this table is that the prepositional cascades <Prep + N + Prep + N + Prep> are particularly prominent not only in the FR-Law corpus, with a frequency of 7150 pmw, but also in the FR-Admin-PL corpus, with 5702 pmw. Although the pattern is half as frequent in the CA-PL-Summaries as in the CA-Judgements corpus, it is also statistically salient in the judicial dissemination corpus. This result suggests that the syntactic complexity that characterises legal language (Bhatia 1983) still percolates into the plain texts under study, even after simplification for a lay audience. We discuss our results in the following section.
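To make the notion of a POS-gram frequency expressed per million words (pmw) concrete, the following Python sketch counts the cascade <Prep + N + Prep + N + Prep> in a sequence of part-of-speech tags and normalises the count. It is a minimal illustration and not the extraction procedure used for Table 5: the tag names (PREP, N, DET), the function names and the hand-assigned tags for the toy phrase are assumptions. The phrase itself is taken from the FR-Admin-PL example 'en cas d'urgence concernant un accident de la route'.

```python
from typing import Sequence

# Assumption: hypothetical tag labels; a real tagger would use its own tagset.
CASCADE = ("PREP", "N", "PREP", "N", "PREP")


def count_pos_gram(tags: Sequence[str], pattern: Sequence[str] = CASCADE) -> int:
    """Count every window of tags equal to pattern (overlaps are counted)."""
    n = len(pattern)
    target = tuple(pattern)
    return sum(
        1 for i in range(len(tags) - n + 1) if tuple(tags[i:i + n]) == target
    )


def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw count to a frequency per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


# Toy input: tags assigned by hand for illustration only.
toy_tagged = [
    ("en", "PREP"), ("cas", "N"), ("d'", "PREP"), ("urgence", "N"),
    ("concernant", "PREP"), ("un", "DET"), ("accident", "N"),
    ("de", "PREP"), ("la", "DET"), ("route", "N"),
]
toy_tags = [tag for _, tag in toy_tagged]
toy_hits = count_pos_gram(toy_tags)
toy_pmw = per_million(toy_hits, len(toy_tags))
print(toy_hits, toy_pmw)
```

A pmw value computed on ten tokens is of course meaningless in itself; the normalisation only becomes informative on corpora of comparable and sufficiently large size.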

Discussion
The results of our phraseological exploration of legal French are consistent with other phraseological studies of legal language in other languages and contexts, as they underline the highly intertextual and syntactically complex nature of legal French in both subcorpora. Our results also suggest differences between the two genres represented here, as the use of complex prepositions and prepositional cascades appears to be more frequent in the legislative FR-Law corpus than in judicial discourse, represented by CA-Judgements, a result which is also consistent with previous research on legal phraseology in other languages, such as Spanish (Pontrandolfo 2021). Of course, the fact that the two legal subcorpora come from different countries, France and Canada, might also account for this difference in the frequency of these prepositional structures, as legal phrasemes are 'bound to a particular legal system' and the results should therefore be interpreted with caution (Pontrandolfo 2023). Future research on comparable judicial and legislative corpora in both European French and Canadian French would be necessary to examine whether the difference is genre-based or culture-based.
Concerning the PLAIN corpus, our preliminary results show that there are traces of legal phraseology in both dissemination corpora (that is to say, in the Canadian French corpus and the European French corpus). This is particularly clear in the case of projecting/reporting structures and complex prepositional chains. This leads us to suggest that the simplification of legal phraseology in French, though evident in both PLAIN corpora, is only achieved to a limited extent: an observation that is consistent with previous findings (Rossetti et al. 2020). There are, however, some LxGr patterns that appear to be characteristic of plain legal French, especially for certain genres. We note the very high frequency of lexical bundles like De quoi s'agit-il? ('What is it?') or Où s'adresser ('Where to get help') in the FR-Admin-PL corpus, as illustrated by the example below, in which both patterns create a dialogical structure in a text that explains what to do in case of harassment.
(18) (FR-Admin-PL) What is it? Harassment is defined as repeated words or behaviours that aim at or result in the deterioration of the victim's living conditions. (...) Where to get help: The victim can complain about the person(s) harassing them.
In our preceding analysis, we have seen lexical patterns that serve to 'structure' the text (Halliday's textual metafunction), to introduce specialised terms (the Il y a construction) or to draw law users' attention to specific points in the text. Thus, we can characterise the overall strategy in the FR-Admin-PL corpus as an attempt to provide readers with heuristic guidance, helping them to navigate administrative procedures rather than setting out a 'boiled down' or simplified version of these procedures.
In the PL summaries from the Supreme Court of Canada, the communicative strategy is quite different: these texts are structured by references to the central figure (the judge) and to the judicial institutions, which include the Supreme Court as well as the lower courts of appeal. Structurally, the core phraseology of this genre is oriented towards reporting clauses or expansions (explanations), both of which involve extensive use of subordinate and embedded clauses. This can be seen in the example below.
(19) The majority said this also burdened the justice system, which spent more trying to get poor people to pay the surcharge than it would ever get back. The majority noted that a sentence works best if it is made for the individual. (Official English version)
One of the notable differences between the two PLAIN corpora is their type of syntactic complexity. Our preliminary analysis suggests that the FR-Admin-PL corpus is characterised by more nominal complexity, i.e., by the use of prepositional cascades and complex noun groups, while the CA-PL-Summaries appear to contain more clausal complexity, i.e., more subordinate clauses, especially reported speech. In this sense, the French administrative texts are closer to the legislative texts they are based on and are more complex in terms of information packaging in noun and prepositional phrases. The French Canadian corpus, by contrast, is more 'elaborate' in the sense of clausal complexity, as can be seen in Example 19, in which we have emphasised relative pronouns as well as binding and linking conjunctions. Generally speaking, it might be expected that 'simplified discourse' will, in fact, turn out to be more elaborate (i.e., involve more structural expansions) than the highly codified but also much denser and more compact 'expert discourse'. Indeed, greater clausal and verbal complexity linked to elaboration strategies and unpacking of information has also been shown in other PL discourse: in particular, medical discourse (Gledhill et al. 2019). The results set out above may also reflect the influence of the Plain Language Movement in French-speaking Canada, whereas the concept has not been institutionalised as such in France. If the FR-Admin-PL corpus is closer to the FR-Law corpus, this may also be because some of the published texts are direct quotations of the original legal text, without any simplification, as suggested by the results of Bouyé (2022). Despite these differences in communicative strategies and complexity, our findings regarding 'plain French' appear to be consistent with recent research on legal popularisation; as is the case in other online lay-oriented discourse, drafters 'balance impersonal explanatory strategies with more interpersonal and communicative strategies' (Diani et al. 2023).
In the first section of this paper, we raised several questions about the nature and distribution of lexico-grammatical (LxGr) patterns. Our first question related to the relative 'size' of n-grams (What is the smallest sequence possible or useful to identify meaningful LxGr patterns?). In the data analysis above, we have demonstrated that it is often possible to analyse short n-grams as longer stretches of meaningful text that often correspond to specific discourse functions. The primary example here is the very short sequence Il y a, which on its own can only be seen as a short verbal group in French. Out of context, this sequence is so ubiquitous in French that it tells us very little, but once we begin to look at the corpus data, Il y a turns out to be a very distinctive definitional routine in both of the main FR corpora we analysed here.
Regarding our more general questions, it is useful to deal with each of them in turn:
(1) Is it possible to assign a discourse function to random n-grams, assuming that these sequences have been found to be salient in one of the subcorpora? Here, we have demonstrated this in principle, although we have not shown it across a wide range of data.
(2) Is it possible to identify characteristic LxGr patterns in legal texts (i.e., in non-plain legal texts)? This has been demonstrated using various examples, such as complex nominal groups/prepositional phrases (Section 3.1).
(3) Is it possible to identify characteristic LxGr patterns of PL in administrative discourse? This has also been demonstrated in relation to turn-taking sequences and procedural constructions, among other examples (Section 3.2).
(4) Is it possible to establish a difference between generic phraseology (belonging to several 'genres') and specific phraseology (patterns that are 'unique' or at least more salient in one genre as opposed to all the others)? We have shown that certain constructions (such as the projection structures of reported speech) are 'generic' and occur as significant LxGr patterns in all of the corpora we have analysed here. This is especially evident in the two specialised corpora, in which LxGr patterns related to cross-referencing pervade both judicial and legislative corpora. Despite obvious macro-textual differences, the FR-Law and CA-Judgements corpora can be identified as belonging to the legal register as a whole; this is, notably, based on the 'generic' patterns related to cross-referencing, one of the features that gives legal language its characteristic 'legal flavour' (Maley 1994). Regarding specific LxGr patterns, we have identified a significant sample of these, among the most recognisable being the multiple prepositional phrase pattern (associated with FR-Law and CA-Judgements) as well as the Il y a definitional pattern (specific to the LEX corpus by use of the conjunction lorsque or to the PLAIN corpus by use of the subordinators quand/si).

Conclusions
In this paper, we have tried to characterise the phraseology of two legal genres and of two types of legal popularisation texts in French from France and from Canada. Our results suggest that salient n-grams and LxGr patterns can be identified in both legal French and PL discourse. Some patterns appear to be generic, i.e., they are linked to the legal register as a whole, while others are specific to a genre or text type. In particular, our findings contribute to the characterisation of administrative and legal dissemination in French in terms of phraseological patterns linked to knowledge recontextualisation and elaboration. Traces of highly complex phraseological patterns from legal language can be found in both PLAIN corpora, although they are more salient in European French texts, suggesting that these are more complex (in the sense of 'lexico-grammatically elaborate') than Canadian French texts. There are probably deep-rooted cultural reasons for this (higher expectations placed on French-speaking users of the law/public services, the relatively recent emergence of the plain language movement in France, differences in legal culture as well as practice, etc.).
Further research is needed to confirm the preliminary observations set out in this article. Possible research perspectives include designing a survey to measure the comprehensibility of plain legal French by obtaining readability measures (self-paced reading or eye-tracking) from ad hoc tasks targeting specific features in legal texts: for example, complex prepositional phrases or passives.

(4) Tout travailleur de nuit bénéficie d'un suivi individuel régulier <de son état de santé> <dans les conditions fixées à l'article L. 4624-1>. (French-Law corpus)
Any night worker has the right to regular health check-ups pursuant to the conditions set out (Our translation)
<Où s'adresser ?> Police secours - 17 Par téléphone Composez le 17 en cas d'urgence concernant un accident de la route, un trouble à l'ordre public ou une infraction pénale.
Where to get help Police services - By phone - Call 17 in case of an emergency, traffic accident, disturbing the peace or criminal offence. (Our translation) (FR-Admin-PL)
Taken as a whole, these LxGr patterns are part of what discourse analysts such as Turnbull (

(13) <La Cour d'appel> a dit partager l'opinion du premier juge. (CA-PL-Summaries)
The Court of Appeal agreed with the trial judge. (Official English version)
(14) M. Chhina a fait valoir que son traitement était illégal au regard de <la Charte canadienne des droits et libertés>, qui fait partie de la Constitution du Canada.
Mr. Chhina said that his treatment was illegal under the Canadian Charter of Rights and Freedoms, part of Canada's constitution. (Official English version of the summary by the Supreme Court of Canada)

Figure 1. Concordances of presentational structure <Il y a + NG (term)> in French specialised legal texts.

Figure 2. Concordances of presentational structure <Il y a + NG (term)> in French PLAIN texts.

Table 2 presents the key POS-grams in the corpus of Supreme Court judgements written in French.

Table 2. Key POS-grams in the CA-Judgements corpus (Supreme Court judgements).

Table 4. Main LxGr patterns in French Canadian summaries.
1 Translation: <The Court of Appeal>. 2 Translation: <The Supreme Court + V + that>. 3 Translation: <The majority + V + that>. 4 Translation: <The Canadian Charter of Rights and Freedoms>.
(15) <Les juges majoritaires ont affirmé que> la police a porté atteinte aux droits que la Charte garantit à M. Reeves en prenant l'ordinateur sans son consentement et sans mandat. (CA-PL-Summaries)
The majority said that the police breached Mr. Reeves' Charter rights by taking the computer without his consent and without a warrant. (English version of the summary by the Supreme Court of Canada)
(16) <La Cour suprême a confirmé que>, dans une cause criminelle, le doute «raisonnable» doit être fondé sur la preuve et non sur des conjectures.
In a criminal case, 'reasonable' doubt should be based on evidence, not speculation, the Supreme Court has confirmed. (English version of the summary by the Supreme Court of Canada)