Improving Estuarine Flood Risk Knowledge through Documentary Data Using Multiple Correspondence Analysis

Abstract: Estuarine margins are usually heavily occupied areas that are commonly affected by compound flooding triggers originating from different sources (e.g., coastal, fluvial, and pluvial). Therefore, estuarine flood management remains a challenge due to the need to combine the distinct dimensions of flood triggers and damages. Past flood data are critical for improving our understanding of flood risks in these areas, while providing the basis for a preliminary flood risk assessment, as required by the European Floods Directive. This paper presents a spin-off database of estuarine flood events built upon previously existing databases and a framework for working with qualitative past flood information using multiple correspondence analysis. The methodology is presented, with steps ranging from a spin-off database building process to information extraction techniques, and the statistical method used was further explored through the study of information acquired from the categories and their relation to the dimensions. This work enabled the extraction of the most relevant estuarine flood risk indicators and demonstrates the transversal importance of triggers, since they are of utmost importance for the characterization of estuarine flood risks. The results showed a relation between sets of triggers and damages that are related to estuarine margin land use, demonstrating their ability to inform flood risk management options. This work provides a consistent and coherent approach to using qualitative information on past floods, as a useful contribution in the context of scarce data, where measured and documentary data are not simultaneously available.


Introduction
Estuaries are complex areas located at the confluence of rivers and seas, which are often low-lying and densely populated, and where multiple flood triggers can occur simultaneously, increasing the flood hazard risk [1]. Estuarine flood risk assessment and management are challenging tasks, since they depend on the simultaneous knowledge of both triggers and their related compounding effects [2][3][4], along with diverse impacts that are dependent on the increasingly inhabited and urbanized nature of estuarine margins [5][6][7].
Documentary sources play a significant role in the obtaining of information on past flood triggers and damages [8]. Information on triggers is usually used to inform flood frequency analyses and flood model validations, particularly to assess events that occurred before the instrumental period [9,10]. However, forms of documentary information, ranging from newspapers to reports, are particularly useful in efforts to collect information on damages, since these types of sources commonly provide information on a local scale, with detailed descriptions and details of damage accountability, improving natural hazard inventories [11,12] and flood hazard model validations [13].
Information extracted from documentary sources is frequently organized into databases to assure consistency and support queries and statistical analysis. Regarding natural hazards, a large number of databases exist, with different inclusion criteria, ranging from the multinational scale, such as EM-DAT (https://www.emdat.be/, accessed on 18 June 2022), to the national scale [14,15]. Flood damage databases are useful, since they provide postevent information, enabling analysis to support the flood response and management, spatial planning options and risk management strategies [16].
Flood risk management is mostly conducted by public authorities and involves various steps. In the European context, the European Floods Directive (2007/60/EC) established a framework for the assessment and management of flood risks, including a preliminary flood risk assessment, which is the first screening exercise used to identify flood risk areas based on historic records. Therefore, the existence of past flood records compiled and organized in databases is of the utmost importance. This importance is recognized by the European Environmental Agency, which launched, in collaboration with other European authorities, the project of creating a potential flood impact database [17].
The assessment of past impacts is challenging, since management authorities usually do not collect this information for long-term monitoring purposes in a systematic and coherent manner. Furthermore, past flood event information, especially that related to damages, is scattered across different sources, namely newspapers, media and diverse public authorities, ranging from environmental to civil protection agencies and from the national to local scales [18]. Other constraints are related to the effort required to collect and gather documentary sources, along with the lack of a consistent methodology that can be used to deal with qualitative information and different data sources [18]. However, some efforts were made to introduce quality and uncertainty assessments into the analysis of disaster loss data [19,20]. Despite the abovementioned constraints on our ability to deal with qualitative and diverse documentary information, case studies with scarce past flood data can benefit from the use of documentary sources and improved methodologies.
Qualitative data analysis can be performed using several methods, ranging from cluster analysis to principal component analysis, depending on the study goals and type of data. Multiple correspondence analysis (MCA) is a multivariate statistical analysis method appropriate for dealing with multiple variables of a nominal categorical type [21,22]. The method allows investigators to interpret underlying relations between multiple variables and has been applied in different scientific fields, ranging from environmental studies [23,24] to safety research [25]. The procedure involves transforming qualitative data through a process of quantification, generating optimal quantifications (also called optimal scaling) of each category and object [22].
This paper is based on a previously published database of estuarine flood occurrences [20,26] and a database of flood events [27]. The paper describes a spin-off database of estuarine flood events built on the abovementioned three databases and presents an innovative methodology that can be used to deal with qualitative information on flood event damages. This information was mainly retrieved from documentary sources and enables the collection of estuarine flood damages and triggers on a local scale. The objectives of the paper are to: (1) present a methodology that applies to qualitative estuarine flood information, taking advantage of multiple data source types; (2) obtain general indicators of estuarine flood damages and triggers based on local contexts; and (3) discuss the benefits and challenges related to the use of qualitative information. The structure of the paper is as follows: Section 1 presents the case study contexts, Section 2 describes the methods used, Section 3 presents the results, Section 4 offers an overall discussion of the results and Section 5 presents the main conclusions.

Case Studies: Geographic and Territorial Contexts
There is no single, unanimously adopted definition of an estuary [28][29][30]. The definition adopted in this study follows the one proposed by Davidson et al. (1991) [31], which states: "An estuary is a partially enclosed area of water and soft tidal shore and its surroundings, open to saline water from the sea and receiving fresh water from rivers, land run-off or seepage". Here, we chose three different estuaries located along the western European coast as case studies due to their contrasting characteristics, flood records and data availability. The three estuaries are the Tagus Estuary (Portugal), the Shannon Estuary (Ireland) and the Solent Estuary (United Kingdom), and these are described in more detail below.

The Tagus Estuary (Portugal)
The Tagus Estuary is located on the Portuguese western coast, shaped by an energetic wave regime and affected by storm surges that increase in the direction from south to north [32], with an extensive flood record [5]. The estuary has a narrow and deep inlet channel and an extensive shallow inner domain, which promotes tidal range amplification due to resonance [33] and simultaneously constrains the propagation of oceanic waves entering the estuary. Nevertheless, the unique inner domain geometry favors local wind wave generation [34]. Tides range between 0.55 and 3.86 m on the open coast (Cascais tide gauge) and are amplified in the estuary interior due to resonance [33].
The main fluvial discharge into the estuary comes from the Tagus River, and fluvial discharge may influence the water levels, particularly in the upstream area [35]. The extreme water levels that promote estuarine margin inundation have two main origins: (1) storm surge conditions (low atmospheric pressure and strong winds) combined with high spring tides; and (2) extreme Tagus and Sorraia River discharges. Nowadays, the estuarine margins are densely occupied, with an extensive urban fabric in both margins, along with industrial, commercial and transport units framing the major metropolitan area of Portugal, with a resident population of more than 3 million people [36]. Upstream, the south margin is formed of heterogeneous agricultural areas and pastures (Figure 1). The margins are connected by two bridges, and the daily commuting by road, train and boat is intense. Public and private businesses and critical infrastructures are located near the margins, which are served by a dense road network. There is a set of historical records of flood events with multiple types of impacts. Significant floods include those of 15 February 1941, 2 November 1997 and, more recently, February 2010 [5,37,38].

The Shannon Estuary (Ireland)
The Shannon Estuary (Figure 2) is located on the west coast of Ireland, shaped by an extremely energetic wave regime. Since Ireland's west coast is particularly exposed to atmospheric depression tracks, storm surge events are frequent [39], causing floods [40]. The estuarine area encompasses the lower sections of the River Shannon between the city of Limerick and the ocean, as well as the small estuary of the River Fergus, located south of Clarecastle. The estuarine inner section is located east and north of Foynes Island, while the outer section includes the area between Foynes Island and Loop Head [41]. The main fluvial discharge into the estuary comes from the River Shannon [42], followed by the Rivers Fergus, Maigue and Deel. The estuary is macrotidal and has the largest tidal range along the Irish coast [43]. Water depths vary between 37 m (estuary mouth) and 19 m (Fergus confluence), being less than 5 m in Limerick, located around the upstream part of the estuary [42]. The mean high water spring tide is ca. 5.44 m (OD, relative to ordnance datum) and the mean high water neap tide is ca. 4.04 m (OD) at the Limerick Docks [41].
Nowadays, the occupation of the estuarine margins consists mostly of agricultural areas (pastures), with ports (Foynes) and industrial facilities, along with dispersed small villages (Figure 2). The city of Limerick is the major urban center, with a population in 2016 of ca. 94 thousand people, followed by Ennis, with ca. 25 thousand [44]. The only bridges connecting the two margins are in the city of Limerick, and the small villages along the estuary margins are linked by a network of national and municipal roads. The region is served by the Shannon international airport, located in the estuarine margin.
Previous studies demonstrated that there is a set of historical records of flooding with multiple types of impacts. Significant floods include those of 24 and 25 December 1999 and, more recently, 1 February 2014 [38,40].

The Solent Estuary (United Kingdom)
The Solent on the south coast of the UK is a complex estuarine system composed of an estuarine strait (Southampton Water) comprising 12 separate small estuaries [45,46], located between the south coast of England and the Isle of Wight in the English Channel. Although sheltered by the Isle of Wight from south-westerly waves, the Solent is affected by storm surges of heights of up to 1 m, caused by atmospheric depression tracks moving from the Atlantic eastward, along with small surges coming from the North Sea region, which promote flood events [8,47].
Although the twelve small estuaries contribute to fluvial discharge, the Rivers Test and Itchen at the head of Southampton Water account for about 45% of the total inflow into the Solent [48]. The Solent is a mesotidal system, recognized as having complex tides, with a young flood stand and double high waters, which are especially noticeable during spring tides [47]. The mean spring tidal range varies from 2 m in the west to ca. 4 m in the east [46], and the water depths favor navigation and vary between 20 m in the more open water and 60 m near Hurst Castle (Hurst Spit) [48].
The Solent margins are densely populated, with up to 1.4 million inhabitants [49], creating a dense urban fabric along with industrial, commercial and transport units (Figure 3). Southampton and Portsmouth are the major cities, both of which have relevant industrial and port facilities linked with commercial, cruise and shipbuilding activities, along with passenger operations and recreational sailing [45,49]. Major cities and villages are connected by an extensive network of primary and secondary roads.

Methods
Documentary sources are analyzed through a systematic procedure of reviewing and the evaluation of the sources of information aimed at gaining an understanding and developing knowledge [50]. The procedure involves examination, reading and interpretation, being an interactive process of content analysis [50]. Due to their structured nature and requirements, databases provide the ideal instrument with which to store and organize information according to a previously defined structure, being widely used on the local and national scales.
For clarity, the concepts used throughout this study are presented here. The spin-off database refers to the common database built within the scope of this study for the three estuaries based on previously compiled sources. Unstructured sources refer to previously compiled stand-alone documents consulted in order to extract information about the estuarine flood events included in the spin-off database. Structured sources refer to previously built databases that were used to extract information for the spin-off database's creation.

From Documentary Sources to an Estuarine Flood Events Database
To achieve a comprehensive collection of information on the flood events involving the three estuarine systems, different sources were used for each estuary, and these are listed in Table 1. The construction process of one common flood event database (spin-off database) started with the identification and selection of sources (Figure 4). The identified sources are divided into unstructured (previously compiled stand-alone documents) and structured (other databases), following the approach of [51]. For the Tagus and Shannon Estuaries, a database of flood occurrences was previously built [20,26]. For clarity, the reader should note that an occurrence is defined as a geographically defined place described in the consulted sources as being affected by estuarine flooding, regardless of its severity [20]. Based on those occurrences, a merging process was carried out in order to obtain a set of events (Figure 4). Events form the basis of the analysis presented herein and are defined as a set of occurrences sharing the same date or defined in the sources as belonging to the same flood episode.
Unstructured data sources were available for the Tagus and Shannon Estuaries. In the case of the Tagus Estuary, most data sources were provided by the DISASTER project (http://riskam.ul.pt/disaster, accessed on 18 June 2022), which systematically gathered daily newspapers reporting on hydro-geomorphologic disasters in Portugal between 1865 and 2010. This vast collection of newspapers and, additionally, a small number of historic photographs provided by the Lisbon Port Authority (APL) were filtered according to the Tagus Estuary's location. The ANEPC database was filtered by the type of flood and location, and two records were retrieved, whose information was integrated into a common framework structure (Figure 5).
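As a minimal sketch of the merging step described above, the following Python snippet groups occurrences that share a date into a single event. The field names (`date`, `place`) and the sample records are hypothetical placeholders, not entries from the actual databases. The second merging rule (occurrences described in the sources as belonging to the same flood episode) is not covered here and would still require manual judgment.

```python
from collections import defaultdict

# Hypothetical occurrences: each one is a place reported as flooded on a date.
occurrences = [
    {"date": "2000-01-01", "place": "Place A"},
    {"date": "2000-01-01", "place": "Place B"},
    {"date": "2000-03-10", "place": "Place A"},
]


def merge_occurrences(records):
    """Group occurrences sharing the same date into one event per date."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["date"]].append(record["place"])
    # One event per date, with the distinct affected places listed in order.
    return [
        {"date": date, "places": sorted(set(places))}
        for date, places in sorted(grouped.items())
    ]


events = merge_occurrences(occurrences)
for event in events:
    print(event["date"], event["places"])
```

Here three occurrences collapse into two events, since the first two share a date.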
The Shannon Estuary documentary sources were transferred from the former Irish Flood Hazard Mapping Website, which migrated to the recent national flood information portal FloodInfo [52]. Herein, only past flood records extending up to the autumn of 2014 were used. The information available on the website was collected by the OPW (Office of Public Works), with the collectors ranging from local authorities and national organizations to members of the public. Users are advised that it does not represent a complete catalogue of all the events. The available sources only describe flood events of fluvial and tidal origins, and the newspaper sources only consider the most severe events of the past 120 years. According to the abovementioned terms, a total of 106 documents were gathered, which include reports, newspapers and photographs.
Structured data sources were available for the Solent estuarine system from the SurgeWatch database (http://www.surgewatch.org/, accessed on 18 June 2022), which constitutes the most comprehensive source of historical records of UK coastal flood events. The database spans from 1915 to 2016 and includes a total of 329 coastal flood events, covering the entire UK coastline [27]. We filtered the database by location to retrieve the events that took place within the Solent estuarine system. Geographically, we considered the coastline between Hurst Spit and Selsey Bill, including the Southampton Water and the Isle of Wight north coast (Figure 3). A total of 79 flood events were retrieved, covering the period between 1916 and 2016, and the information was adapted to fit a previously designed common structured framework (Figure 5).
Overall, the final spin-off event database comprises information on the three estuaries and has a total of 149 events between 1865 and 2016. For each event, the date, location and source, along with trigger and damage data, were registered (Figure 5). The triggers group includes six variables: low pressure, wind/waves, rainfall, fluvial discharge, high tide and deficient urban drainage. The damages group includes seven variables: physical damages, economic losses, human damages, circulation interruption, functions disruption, environmental degradation and institutional involvement. For each variable, at least the information regarding the presence or absence of a certain trigger or damage was retrieved, and the appropriate field was filled with YES if present and NULL if absent. When available, a detailed description was registered in the text field (Figure 5). A set of criteria (Table 2) was used to identify estuarine flood events reported in the sources (structured and unstructured) based on geographic constraints, along with other proxies.
Table 2. Criteria used to identify estuarine flood events in the documentary sources.

Tagus Estuary
Geographic constraint: area between Oeiras and Vila Franca de Xira (upstream limit of the salt intrusion) and between the highest astronomical tide line [53] (the upper limit of the intertidal domain) and 20 m above mean sea level [20].

Shannon Estuary
First step: we removed documents that were not related to estuarine floods. This extraction was performed using additional documentary proxies [54][55][56] and the OPW website (http://www.floodinfo.ie/map/floodmaps/, accessed on 18 June 2022). Geographic constraint: area between Loop Head and the city of Limerick (tidal limit), whose description is clearly connected with estuarine flooding.

Solent estuarine system
Geographic constraint: area between Hurst Spit and Selsey Bill, including Southampton Water and the north coast of the Isle of Wight.
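The event record structure described in this section (date, location and source, plus one YES/NULL field per trigger and damage variable) can be illustrated with a short Python sketch. The class name, field names and sample values below are assumptions made for illustration only. They do not reproduce the actual database schema shown in Figure 5, and the sixth trigger (deficient urban drainage) is taken from the variables discussed in the results.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Variable names are hypothetical identifiers for the 6 triggers and 7 damages.
TRIGGERS = [
    "low_pressure", "wind_waves", "rainfall",
    "fluvial_discharge", "high_tide", "deficient_urban_drainage",
]
DAMAGES = [
    "physical_damages", "economic_losses", "human_damages",
    "circulation_interruption", "functions_disruption",
    "environmental_degradation", "institutional_involvement",
]


@dataclass
class FloodEvent:
    date: str
    location: str
    source: str
    # "YES" when a variable is reported as present; missing or None means NULL.
    flags: Dict[str, Optional[str]] = field(default_factory=dict)
    # Optional free-text description per variable.
    notes: Dict[str, str] = field(default_factory=dict)

    def indicator_row(self) -> List[int]:
        """Return 1/0 presence values in a fixed variable order."""
        return [1 if self.flags.get(name) == "YES" else 0
                for name in TRIGGERS + DAMAGES]


# Invented example record, not taken from the spin-off database.
example = FloodEvent(
    date="1900-01-01",
    location="Example margin",
    source="newspaper",
    flags={"wind_waves": "YES", "physical_damages": "YES"},
    notes={"physical_damages": "Quay wall damaged (made-up description)."},
)
row = example.indicator_row()
print(row)
```

A fixed variable order makes it straightforward to stack the rows of all events into the presence/absence matrix needed for a categorical analysis.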

Information Extraction
Unstructured sources were collected and individually treated using content analysis techniques [62,63]. Content analysis comprises a set of techniques that use systematic and objective descriptive procedures applied to message content in order to obtain indicators (quantitative or not), enabling knowledge inference [62]. Among the various techniques, Bardin (2020) [62] explained that categorical analysis is the most generalized and simple, consisting of classifying and counting, following a set of criteria and noting the presence or absence of a certain item/element that constitutes the message of the data source. Following the above-mentioned definition, structured and unstructured sources were extensively analyzed using a common and predefined set of criteria (Table 3), enabling the extraction of relevant information into a common structure.
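To illustrate the presence/absence logic of categorical analysis, the sketch below codes a piece of text against a small keyword list. This is only a crude stand-in: the study relied on reading and interpreting the sources against the criteria in Table 3, and the keyword lists, variable names and sample sentence here are invented for illustration.

```python
# Hypothetical keyword lists; the actual criteria are those defined in Table 3.
KEYWORDS = {
    "rainfall": ["rain", "downpour"],
    "wind_waves": ["gale", "wave"],
    "circulation_interruption": ["road closed", "traffic"],
}


def code_text(text):
    """Mark each variable as YES when any of its keywords appears, else None (NULL)."""
    lowered = text.lower()
    return {
        variable: ("YES" if any(keyword in lowered for keyword in keywords) else None)
        for variable, keywords in KEYWORDS.items()
    }


# Made-up sentence standing in for a newspaper excerpt.
sample = "Heavy rain flooded the riverside and the main road closed for hours."
coded = code_text(sample)
print(coded)
```

Simple substring matching of this kind would produce false positives and negatives on real newspaper text, which is one reason why manual interpretation remains part of the procedure.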

Statistical Analysis
The final spin-off estuarine event database, with a total of 149 events (corresponding to the number of database entries), was analyzed using a non-linear multivariate analysis method, specifically multiple correspondence analysis (MCA), applying IBM® SPSS® Statistics software. MCA is particularly appropriate for analyzing nominal categorical information [21,22], enabling the detection of underlying data relations (described as dimensions in MCA) and possible associations between data entities and providing a method to support variable exclusion. For the sake of clarity, Table 4 presents the definitions and nomenclature used in the MCA. The first step in multiple correspondence analysis is the calculation of the maximum number of dimensions that should be retained for the analysis (Equation (1)). In this case, the number of objects (149) is larger than the number of categories (2 per variable), so the maximum is given by $p - \max(m_1, 1)$, where $p$ is the total number of categories and $m_1$ is the number of variables without non-answers. The database has 13 variables, without non-answers ($m_1 = 13$), with 2 categories each ($p = 26$). The maximum number of dimensions is therefore $p - m_1 = 26 - 13 = 13$; hence, 13 dimensions should be used as a first estimate.
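The arithmetic of this first step can be checked with a few lines of Python. The function name is an assumption, and the sketch only covers the case stated in the text, where the number of objects exceeds the number of categories.

```python
def max_dimensions(p, m1):
    """Maximum number of dimensions, p - max(m1, 1), for the case described in the text.

    p  : total number of categories across all variables
    m1 : number of variables without non-answers
    """
    return p - max(m1, 1)


n_variables = 13             # 6 triggers + 7 damages
categories_per_variable = 2  # presence (YES) or absence (NULL)
p = n_variables * categories_per_variable
m1 = n_variables             # no variable has non-answers

first_estimate = max_dimensions(p, m1)
print(first_estimate)
```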
The second step included the analysis of the 13 dimensions and revealed that the first 2 dimensions were the most representative and should be retained for the third step of the analysis. This evaluation is performed using the inertia values (ratio between the eigenvalue and total number of active variables), which vary between 0 and 1. Higher values therefore indicate that a dimension explains more of the variance.
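The relation between eigenvalue and inertia can be illustrated with a small NumPy sketch. It uses the textbook route to MCA (correspondence analysis of the indicator matrix via a singular value decomposition), which is an assumption on our part and not a reproduction of the SPSS procedure. The toy data are invented, and the eigenvalue scaling simply follows the definition given in the text (inertia = eigenvalue / number of active variables).

```python
import numpy as np

# Toy presence/absence data: 6 hypothetical events x 3 hypothetical variables
# (1 = YES, 0 = NULL). These values are invented for illustration only.
X = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
n_variables = X.shape[1]

# Indicator matrix: one column per category (YES and NULL) of each variable.
Z = np.hstack([X, 1 - X]).astype(float)

# Correspondence analysis of the indicator matrix.
P = Z / Z.sum()                       # correspondence matrix
row_mass = P.sum(axis=1)
col_mass = P.sum(axis=0)
expected = np.outer(row_mass, col_mass)
S = (P - expected) / np.sqrt(expected)  # standardized residuals

singular_values = np.linalg.svd(S, compute_uv=False)
inertia = singular_values ** 2          # one value per dimension, between 0 and 1
eigenvalue = n_variables * inertia      # so that inertia = eigenvalue / n_variables

for dim, (eig, ine) in enumerate(zip(eigenvalue[:2], inertia[:2]), start=1):
    print(f"dimension {dim}: eigenvalue={eig:.3f}, inertia={ine:.3f}")
```

With only two categories per variable, the inertias of all dimensions sum to 1, so each value can be read directly as the share of the total inertia carried by that dimension.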
The third step involves running the analysis again with only the two previously retained dimensions and assessing the results. The assessment is performed using the discrimination measure of each variable by dimension. Hence, the discrimination measures quantify the variance in each variable, accomplishing an optimal quantification [22] ranging between 0 and 1. Therefore, higher values that are close to one allow for the identification of the most discriminant variables in a certain dimension, enabling the exclusion of the ones that do not have the capacity to discriminate. After this step, a set of selected variables is obtained, and the analysis is run again to confirm whether there is another variable that can be excluded using the criteria mentioned above. Finally, a stable group of discriminant variables is attained, which enables the thematic interpretation of the dimensions.
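One common way to compute a discrimination measure, sketched below, is as the squared correlation ratio between the object scores on a dimension and the categories of a variable: the share of score variance explained by the category means. This is offered as an illustrative assumption about how such a measure behaves, not as the exact SPSS computation. The scores and categories are invented.

```python
import numpy as np


def discrimination_measure(scores, categories):
    """Share of object-score variance on one dimension explained by a variable's categories."""
    scores = np.asarray(scores, dtype=float)
    categories = np.asarray(categories)
    grand_mean = scores.mean()
    total = ((scores - grand_mean) ** 2).sum()
    between = sum(
        (categories == level).sum()
        * (scores[categories == level].mean() - grand_mean) ** 2
        for level in np.unique(categories)
    )
    return between / total


# Invented object scores of four events on one dimension.
scores = [-1.2, -0.8, 1.0, 1.0]

# A variable whose categories separate the scores well, and one that does not.
separating = ["NULL", "NULL", "YES", "YES"]
mixed = ["YES", "NULL", "YES", "NULL"]

d_high = discrimination_measure(scores, separating)
d_low = discrimination_measure(scores, mixed)
print(round(d_high, 3), round(d_low, 3))
```

A variable like `mixed`, with a value close to zero on every retained dimension, would be a candidate for exclusion under the criterion described above.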
A refined interpretation analysis can be performed using categories in order to better discriminate between the objects of analysis, retaining the ones with higher contributions (contribution of point to inertia of dimension) relative to a reference value given by 1/p, where p is the total number of categories.
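The 1/p reference value can be applied mechanically, as in the following sketch. The category labels and contribution values are invented, and the function assumes that the dictionary lists every category in the analysis, so that its length equals p.

```python
def retain_categories(contributions):
    """Keep categories whose contribution to a dimension exceeds the 1/p reference value."""
    p = len(contributions)   # total number of categories
    threshold = 1.0 / p
    return {name: value for name, value in contributions.items() if value > threshold}


# Invented contributions of four categories to the inertia of one dimension.
contributions = {
    "rainfall:YES": 0.40,
    "rainfall:NULL": 0.10,
    "wind_waves:YES": 0.30,
    "wind_waves:NULL": 0.20,
}

kept = retain_categories(contributions)
print(sorted(kept))
```

With four categories the reference value is 0.25, so only the two categories above that level are retained for interpretation.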
In summary, according to [22] an interpretative process using multiple correspondence analysis comprises three main steps, namely: (1) preserving the dimensions with higher inertia values; (2) retaining the variables with higher discrimination values in the previously selected dimensions; and (3) retaining the categories with higher contribution values in regard to the previously retained variables.
For the sake of clarity, we note that the nomenclature of "variables" adopted within the scope of multiple correspondence analysis is considered, in this study, as a synonym for indicators, in the sense that indicators are "a set of variables that characterise natural and anthropogenic systems after the hazardous process take place" [26]. Therefore, in Section 4, "indicators" is used as a synonym for "variables".

Spin-Off Flood Events Database
A detailed analysis of the spin-off flood event database including a total of 149 events encompassing the period 1865-2016 (151 years) was carried out. In the case of the Tagus Estuary, most of the consulted sources are newspapers, which account for 38 events (Figure 6a), while the other sources account for 6 out of 44 events. The Shannon Estuary sources are far more diverse, including photographs and newspapers, both accounting for 15 events out of 28, and maps, which account only for 1 event. In addition, a combination of sources described three events (Figure 6b). The case of the Solent is less diverse than that of the Shannon in terms of its sources. Scientific articles account for most of the sources, detailing 57 events out of 77, whereas 17 events are based on a combination of sources (Figure 6c). The temporal coverage of events according to the estuarine system is shown in Figure 7a and reveals an almost continuous record over time (from 1865 to 2013) in the case of the Tagus Estuary. In contrast, the Shannon records are the sparsest, with more coverage from recent decades, from 1980 onwards. The Solent records are relatively continuous over time from 1916 to 2016. The monthly distribution showed that most events occurred in the autumn and winter months, with the highest number of events in January in the three case studies (Figure 7b).
Figure 7. (a) Spin-off database temporal coverage by estuarine system (some years contain more than one event, which is not shown in the figure). For the temporal coverage and number of events by estuary, please refer to Table 1 in Section 3. (b) Characterization of the spin-off database distribution by month and by estuarine system.
In most cases, several triggers occurred simultaneously. Wind/waves was the most frequently recorded trigger (42% of events), followed by rainfall (40%) and low pressure (36%) (Figure 8a).
Concerning registered damages, circulation interruption occurred in 67% of events, followed by physical damages and function disruption, which were registered in 63% and 62% of events, respectively (Figure 8b).

Multiple Correspondence Analysis
The multiple correspondence analysis, applied to the spin-off database, obtained a Cronbach alpha value of 0.75 (Table 5). This value is a widely used measure to assess the internal consistency of a dataset, ranging between 0 and 1. Although the value thresholds necessary to consider alpha as acceptable are debated, some authors (e.g., [64,65]) showed that an acceptable value should be at least 0.70. Therefore, the statistical model was considered appropriate. The two final selected dimensions account for 61% of the model variance. Considering the 13 initial variables (see Tables 3 and 4), 3 variables were excluded (high tides, circulation interruption and environmental degradation) due to their low discrimination values, producing a final set of 10 significant variables (Figure 9). The retained dimensions were thematically interpreted using the two groups of variables that better explain each dimension (Figure 9). Therefore, dimension 1 is associated, in terms of triggers, with rainfall, fluvial discharge and deficient urban drainage conditions, and in terms of damages is related to human damages, function disruption and institutional involvement. This combination of variables is identified as a dimension associated with the influence of the hydrographic basin. In contrast, dimension 2 is related to wind/waves, low pressure, economic losses and physical damages, which are recognized as a combination of variables related to oceanographic influences.
The category analysis (Figure 10a) revealed different combinations of categories representing distinct event profiles. The relative distances between categories were used to identify the event profiles, defined as a set of events sharing similar characteristics. The graphical representation of the events by estuarine system and their relation to both dimensions is shown in Figure 10b, enabling the discovery of associations between event profiles and the identification of the estuarine system.
Hence, events involving the Tagus Estuary are associated with the presence of rainfall, fluvial discharge and deficient urban drainage, as triggers, and with the presence of human damages and institutional involvement, as damages (3rd quadrant). In contrast, most events involving the Solent are associated with the lack of rainfall (1st quadrant), whereas most events involving the Shannon Estuary are related to the presence of low pressure, wind/waves and economic losses (2nd quadrant). The fourth quadrant features a set of events, mainly related to the Solent and Shannon, that are characterized by the absence of wind/waves, physical damages, economic losses and function disruption.

Discussion
In this paper, a large and diverse set of documentary sources was used to extract flood trigger and damage information in three distinct estuarine systems located on the western European coast with past records of estuarine flooding. As indicated by other studies (e.g., [12,16,66]), documentary sources, also called "soft data" or "historical sources", proved their usefulness for extracting valuable information regarding flood triggers and damages.
Events involving the Tagus Estuary are largely based on newspaper sources, covering a wide temporal period, with almost all the sources being daily newspapers [20]. The Shannon Estuary is a contrasting case due to the low number of events, which are concentrated in recent years due to missing data. This fact is acknowledged by the source provider (OPW portal), which states that the available list of sources does not represent a complete catalogue of events, and that newspaper sources only reflect the most severe events occurring in the last 120 years. Despite this pitfall, it was possible to register 28 events. Events involving the Solent estuarine system were obtained from a structured source (SurgeWatch; [27]). This fact explains its consistency over time, since SurgeWatch originates from the merging of sea level observations from tide gauge records along the UK coastline with "soft data" sources, particularly a set of selected scientific articles [27]. The obtained spin-off database demonstrates the capacity of researchers to merge and adapt distinct source typologies (structured and unstructured) so as to obtain a regional damage assessment model record that is able to inform public policy options and directives from the local to transnational levels.
The seasonal distribution of events is in accordance with storm and flood occurrences along the western coast of Europe, with a higher frequency of events in the autumn and winter months [67], when atmospheric conditions are prone to storm surge occurrence, wind-generated waves and more abundant precipitation periods, generating a high fluvial discharge. As already observed in other studies (e.g., [18,66]), documentary data, in general, can be a valuable source of information on damages. In contrast, information on flood triggers is more scarce and often limited and unreliable, particularly when newspapers are used as data sources, since this specific type of source usually tends to focus on the description of the damages and accountability rather than on the description of triggers [20]. Although the spin-off database was based on a multiplicity of data sources, the overall results are in line with previous studies [20,38], showing the importance of physical damages, circulation interruption and function disruption as major estuarine flood damage typologies and the presence of rainfall, wind/waves and low pressure as potential indicators of estuarine flood triggers.
The multiple correspondence analysis enabled the exclusion of less significant variables that do not contribute to explaining the final model. Although reported in the sources, especially in the case of Shannon (please refer to Figure S1), high tides were excluded. This is interpreted as a "background" presence in the case of estuaries and therefore is not distinctive as an indicator of flooding.
Regarding damages, the model excluded environmental degradation and circulation interruption. The first influenced a relatively low number of events involving the Solent (please refer to Figure S1), where this type of damage is residual. Circulation interruption, on the other hand, was excluded due to its close relation to function disruption, since they often overlap. Thus, the model exclusion is explained by variable redundancy.
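The overlap argument above can be illustrated with a simple pairwise check. The sketch below is not the procedure used in this study, where the exclusion resulted from the multiple correspondence analysis itself. It only shows how the co-occurrence of two presence/absence indicators could be quantified with the Jaccard index and the phi coefficient. The function name `overlap_measures` and the toy event lists are hypothetical.

```python
from math import sqrt


def overlap_measures(a, b):
    """Return (jaccard, phi) for two equal-length lists of 0/1 indicators."""
    if len(a) != len(b):
        raise ValueError("indicator lists must have the same length")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    # Jaccard index: shared presences over events with at least one presence.
    union = n11 + n10 + n01
    jaccard = n11 / union if union else 0.0
    # Phi coefficient: Pearson correlation between two binary variables.
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    phi = (n11 * n00 - n10 * n01) / denom if denom else 0.0
    return jaccard, phi


# Hypothetical toy data (1 = damage type reported for the event), not study data.
circulation_interruption = [1, 1, 0, 1, 0, 1, 0, 0]
function_disruption = [1, 1, 0, 1, 0, 0, 0, 0]

jaccard, phi = overlap_measures(circulation_interruption, function_disruption)
print(f"Jaccard = {jaccard:.2f}, phi = {phi:.2f}")
```

Values of both measures close to 1 would indicate that the two damage typologies are reported for nearly the same events, which is the situation described here as variable redundancy.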
The results revealed two distinct dimensions, enabling a thematic interpretation of the model. Dimension 1 is characterized by a set of indicators closely related to the watershed draining into the estuarine systems, whereas dimension 2 is characterized by indicators related to oceanographic conditions. Notably, trigger indicators are the cornerstone of the thematic interpretation of both dimensions. These results are in accordance with previous work on the Tagus and Shannon Estuaries [26].
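To make the link between categories and dimensions concrete, the following is a minimal sketch of multiple correspondence analysis on a presence/absence indicator matrix. It rests on several assumptions: the software used in this study is not stated here, the variable names and toy events are hypothetical, and power iteration stands in for a full singular value decomposition, so only the first dimension is recovered.

```python
from math import sqrt


def indicator_matrix(events, variables):
    """Complete disjunctive table: one present and one absent column per variable."""
    labels = [f"{v}:{s}" for v in variables for s in ("present", "absent")]
    rows = []
    for ev in events:
        row = []
        for v in variables:
            row += [1.0, 0.0] if ev[v] else [0.0, 1.0]
        rows.append(row)
    return rows, labels


def mca_first_dimension(Z, iters=500):
    """Return (principal inertia, category contributions) for the first axis."""
    n, J = len(Z), len(Z[0])
    total = sum(sum(row) for row in Z)
    r = [sum(row) / total for row in Z]  # row masses
    c = [sum(Z[i][j] for i in range(n)) / total for j in range(J)]  # column masses
    if min(c) == 0:
        raise ValueError("every category must occur at least once")
    # Standardized residuals of the correspondence matrix.
    S = [[(Z[i][j] / total - r[i] * c[j]) / sqrt(r[i] * c[j]) for j in range(J)]
         for i in range(n)]
    # M = S^T S; its leading eigenvalue is the first principal inertia.
    M = [[sum(S[i][a] * S[i][b] for i in range(n)) for b in range(J)]
         for a in range(J)]
    v = [float(j + 1) for j in range(J)]  # arbitrary non-uniform starting vector
    for _ in range(iters):  # power iteration (assumption: replaces a full SVD)
        w = [sum(M[a][b] * v[b] for b in range(J)) for a in range(J)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("no variation in the data")
        v = [x / norm for x in w]
    Mv = [sum(M[a][b] * v[b] for b in range(J)) for a in range(J)]
    eigenvalue = sum(v[a] * Mv[a] for a in range(J))  # Rayleigh quotient
    contributions = [x * x for x in v]  # squared loadings, summing to 1
    return eigenvalue, contributions


# Hypothetical toy events (True = indicator reported), not data from the study.
variables = ["rainfall", "fluvial_discharge", "low_pressure", "function_disruption"]
events = [
    dict(zip(variables, (True, True, False, True))),
    dict(zip(variables, (True, True, False, True))),
    dict(zip(variables, (True, False, False, True))),
    dict(zip(variables, (False, False, True, False))),
    dict(zip(variables, (False, False, True, False))),
    dict(zip(variables, (False, True, True, False))),
]

Z, labels = indicator_matrix(events, variables)
eigenvalue, contributions = mca_first_dimension(Z)
for label, ctr in sorted(zip(labels, contributions), key=lambda t: -t[1]):
    print(f"{label:32s} contribution = {ctr:.3f}")
```

In such a sketch, categories with the largest contributions are those that give a dimension its thematic meaning, while categories with negligible contributions are the natural candidates for exclusion from the final model.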
The events involving the Tagus and Solent Estuaries exhibit a distinct pattern, forming different profiles. Events involving the Tagus Estuary are associated with the presence of rainfall, fluvial discharge and deficient urban drainage, as triggers, which have been observed previously by other authors [20,35] as determinants to fully understand the flooding of this estuarine system. The Tagus Estuary is a widely urbanized system, where rainfall patterns in the hydrographic basin, along with the coincidence of short-term rainfall episodes, high tide and a limited capacity for urban drainage, can induce estuarine flooding [68][69][70]. Moreover, fluvial discharge is a relevant flood trigger in the upstream area of the estuary [35]. The presence of human damages and function disruption are in line with the observations of [20].
Events involving the Solent Estuary are characterized mostly by the absence of rainfall, which is in accordance with previous work reporting on the region (e.g., [8,47]), and flooding is mostly related to extreme sea level events caused by low pressure conditions that trigger storm surges. Nevertheless, a recent analysis by [71] of the entire UK coast, assessing the characteristics and drivers of compound flooding events, confirmed the joint occurrence of a high storm surge and high river discharge on the southwestern coast of the UK. Not surprisingly, the spin-off database does not reflect this fact, as the Solent Estuary data are derived from the SurgeWatch database, which primarily seeks to describe coastal flood events linked with high sea levels [27,72]. The limited characterization of the Solent's profile (lack of rainfall), associated with the absence of any damage as a flood indicator, is explained by the source provider's intrinsic characteristics (SurgeWatch database) and the fact that it contains limited descriptions of damages in the case of the Solent.
Although affected by data sparsity, events involving the Shannon Estuary are mostly characterized by the presence of low pressure and wind/waves, along with economic losses. The Irish coast is commonly affected by storms and strong winds [73]. Recently, [40] presented a revised and updated catalogue of extreme wave events along the entire Irish coast, where storm surges also occur, highlighting coastal flooding events due to storm surges in the Shannon Estuary.
Although past flood information is of limited use for the forecasting of future scenarios involving climate change that might affect these areas, the presented profiles are a relevant outcome because they constitute a basis for the diagnosis and management of flood risks. This knowledge can inform tailored measures for application to each estuarine system so as to increase awareness of the potential risks; support resource allocation through early warning systems, emergency preparedness and response situations; and assist with spatial planning activities to reduce exposure to risks. Furthermore, information on past flood events has demonstrated its usefulness for the validation of flood forecasting tools [74].
A critical comment should be made regarding the capacity of the sources to fully explain the abovementioned results. In the case of the Solent, using a structured source, such as a previously built database, to assess the presence of triggers or damages in an estuarine flood event might be reductive and lead to incomplete outcomes. As already pointed out, the absence of information in the sources prevents us from drawing more extensive and detailed conclusions. Additionally, the previously built database's content (database fields), along with the primary objective of its creation, influences the type and amount of information that can be extracted.
As other authors have suggested (e.g., [8,75,76]), the most suitable approach to the identification of both triggers and damages, reducing bias and increasing reliability, should combine long-term measured records (e.g., tide gauge data, hydrologic datasets) with documentary sources (e.g., newspapers, reports). Nevertheless, the benefit of this combination is lacking in most cases due to the difficulty involved in simultaneously obtaining measured data for all types of estuarine flood triggers and documentary evidence that is available and consistent over time.
The abovementioned limitations can be summarized as follows: (i) past flood information is of limited use for the forecasting of future scenarios involving climate change; (ii) the intrinsic characteristics of data sources affect the results; and (iii) the most suitable approach should combine measured records with documentary sources. Despite these limitations, the presented methodology proved to be effective for extracting reliable flood event information regarding triggers and damages from a multiplicity of documentary data sources, combining them into a common database and extracting the best explanatory indicators of estuarine flooding. Moreover, the methodology was also able to provide estuarine profiles, offering a comprehensive synthesis of the most important indicators of each system. This outcome is of major importance to flood risk managers, providing a knowledge basis for the creation of flood risk management options, since it offers information on the types of damages and triggers that are critical for tailored and sustainable estuarine flood management.
It is also relevant that the trigger typology is related to a set of impact typologies that are closely associated with estuarine physical features and the land use of margins. Furthermore, the use of qualitative documentary sources and the multiple correspondence analysis enabled the comparison of contrasting estuarine systems. The approach brought to light a common set of indicators, regardless of the estuarine system, that are crucial in informing flood risk management and prevention through structural and non-structural actions, along with mitigation measures related to estuarine margin spatial planning.

Conclusions
This study presented a methodology that can be used to obtain qualitative information extracted from multiple documentary structured and unstructured sources. The approach started with a documentary data gathering exercise and spin-off database construction using content analysis techniques. A multiple correspondence analysis technique was applied to the variables stored in the database, namely flood damages and triggers, and enabled us to explore the information acquired from the categories and their relationships with the dimensions, allowing us to obtain distinct event profiles.
The statistical analysis enabled the extraction of the most relevant indicators regarding estuarine flood triggers and damages. The results revealed that the estuarine flood risk is driven by two distinct dimensions, namely the influence of the hydrographic basin and oceanographic influences, demonstrating the transversal effects of triggers in characterizing the estuarine flood risk. Another relevant outcome was the recognition of distinct estuarine profiles, which are valuable as a means to inform more tailored flood risk measures. A critical analysis was performed on the challenges of using documentary sources, showing that, despite the limitations, qualitative information is useful and valuable, especially in the context of data scarcity. The combined approach of using qualitative documentary sources and the multiple correspondence analysis enabled the comparison of contrasting estuarine systems and the identification of relationships between a set of triggers and a set of damages that are related to land use and estuarine margins' physical characteristics.
The methodology proved its effectiveness in extracting reliable flood event information from a diverse set of documentary data sources, combining them into a common database structure (spin-off database). Furthermore, it allowed us to obtain the best explanatory indicators of estuarine flooding, regardless of the estuarine system. Additionally, the estuarine profiles offered a comprehensive synthesis and a knowledge basis for estuarine flood risk characterization, and they are valuable as a means to inform tailored and integrated flood risk management options.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/w14193161/s1, Figure S1: Reported presence and absence of all considered variables before MCA application for each estuarine system. Tagus comprises 44 events, Shannon comprises 28 and Solent comprises 77. Overall, the total number of events is 149.
Data Availability Statement: The database described is available upon reasonable request.