Earth Scientists and Sustainable Development: Geocomputing, New Technologies, and the Humanities

This opinion paper discusses some of the challenges and opportunities that earth scientists face today in connection with environmental problems. It focuses on the role of geocomputational approaches and new technologies for geoenvironmental analysis in the context of sustainable development. The paper also points out a “data imbalance” effect, a key issue in the analysis of environmental evolution and of geosphere-anthroposphere interactions in the long term. In connection with this, it stresses the importance of geoenvironmental information that can be derived from the environmental humanities and related disciplines, such as history and archeology. In this context, the complexities and potentialities of a dialogue between earth sciences and the humanities are outlined.


Introduction
Sustainable human development, from the perspective of the maintenance of the Earth system in a resilient state (e.g., [1,2]), presents relevant challenges involving science, technology and socio-economic aspects. Global and local sustainability require the analysis and understanding of multiple interacting environmental processes, acting across multiple spatiotemporal scales. Moreover, the analysis and management of human perturbations on the Earth system require an interdisciplinary approach, capable of analyzing and modeling geosphere-anthroposphere interactions. Earth scientists play a key role in many aspects of the sustainability challenges, including the definition and implementation of policies related to human development. From this perspective, the Sustainable Development Goals (SDGs) defined by the United Nations [3], which influence the policies of many countries (e.g., the European "Green Deal"), are emblematic. Several of the SDGs are in fact related to the environment and, as such, more or less directly address the work of earth scientists and acknowledge the societal value of their research.
In this context, geocomputational approaches and new technologies play a pivotal role in earth sciences. On the one hand, the acquisition and the quantitative analysis of geoenvironmental data are fundamental for understanding the complex dynamics of the Earth system and its interactions with the anthroposphere. On the other hand, the gathering of geoenvironmental data, their correct comprehension and modeling, as well as their use have taken on an unprecedented political meaning, as they have become fundamental for responsible decision making. The validation of data and their interpretation have ostensibly reached out from the ivory towers of science to the public sphere as matters of shared concern in civil society. New technologies and computational methods are available to the earth scientist, but they need to be reassessed in order to be correctly, and critically, employed towards the achievement of global goals. Moreover, the epistemological question of how to integrate information, knowledge and approaches stemming from different disciplines is far from settled, at a time in which the urgency to address the connection between earth-system processes and cultural phenomena is evidenced by the deep anthropic transformation of our planet. In the field of the history and philosophy of science, there is growing interest in studying the interactions between humankind and the environment (e.g., [4]), as well as between methodologies descending from the natural sciences and the humanities [5].
Our considerations are particularly pertinent to soil science and geoenvironmental research, especially when focused on topics such as land degradation/management, water resources protection/management and multiple geoengineering-related issues. First, these issues directly involve the Critical Zone (e.g., [6]), the surficial earth layer characterized by a high intensity of interactions between the geosphere (understood in a broad sense, including the hydrosphere), the biosphere and the anthroposphere. Second, new technologies (e.g., remote sensing) for the collection, analysis and modeling of geoenvironmental data have a strong impact in this context. Third, valuable information on the complex interactions between humans and the environment can be extracted from humanities-related informative sources (e.g., [7]).
The themes covered in this essay are related to a wide range of topics, many of which are in continuous evolution owing to the fast developments that characterize technology and geocomputational approaches. Accordingly, the present discussion is inevitably partial and covers only those aspects that we consider worth highlighting in view of a conscious and critical use of geocomputational methodologies and available informative sources.
Section 2, "Geosphere-anthroposphere interlinked dynamics", discusses the difficult conceptualization of the geosphere-anthroposphere dynamics from an interdisciplinary perspective that brings the earth sciences and the humanities into dialogue. The role of technology and geocomputation for communicating environmental dynamics and human impacts is then introduced. Section 3, "Technological innovation and geoenvironmental data", focuses on the role of technology for geoenvironmental data retrieval and analysis, including the challenges related to the growing complexity and heterogeneity of informative sources. Section 4, "Geocomputing and the earth scientists", outlines the relevance of expert knowledge in geocomputation for explorative and predictive analysis; then, it discusses the impact of technology on the development and diversification of geocomputational tools, posing opportunities and challenges for earth scientists. Section 5, "Data imbalance at the crossroads of geocomputing, new technologies and historical information", discusses the "data imbalance" that frequently characterizes the analysis of geoenvironmental dynamics in the long term. The need to consider humanities-related informative sources, both for compensating the data imbalance and for studying geosphere-anthroposphere interactions, is introduced. Finally, the necessity to improve the dialogue between earth sciences and humanities is outlined.

Geosphere-Anthroposphere Interlinked Dynamics
The relevance of local and global sustainability challenges will likely increase in the coming years due to multiple factors, among which are global population dynamics [8]. Not only will the global population likely continue to grow, with an estimated population of more than 9 billion by 2050 [8], but it is also marked by a relevant imbalance among the regions of the globe, from both the geographical and the socio-economic viewpoints. The concentration of population in and around big cities and megacities (will this trend be changed by pandemic outbreaks?), with more than half of the world population living in cities [9,10], is another relevant factor. These basic demographic considerations suggest a future increase in interactions between humans and the geoenvironment. Such interactions have multiple well-known manifestations, from the perspective of both anthropic and natural impacts: land-use changes, natural hazards, pollution, ecological alteration, climate change, natural resources depletion, geoengineering issues, etc.
The increasing relevance of geosphere-anthroposphere interlinked dynamics for society and science is also marked by various research pathways focused on this issue. It is worth mentioning research areas that have become particularly visible in recent times: the "Anthropocene" related debates and the study of the "Critical Zone", that is, the superficial geological layer of maximum human-geological-biological interactions [6]. The Anthropocene issue [11] has come to cover a wide range of debates, ranging from those in the humanities to the artistic scene and environmental activism (as evidenced by approaches as different as [4,12,13,14]). The fact that the concept of the Anthropocene stems from geology [15] and geological observations is representative of the intense anthropic signature on our planet. The proposal to formally define a new geological epoch, involving strict stratigraphic requirements and currently based mainly on geochemical considerations, is still more emblematic of this concept. The formal definition of an Anthropocene epoch, in the stratigraphical sense, depends on the results of the ongoing work of the Anthropocene Working Group, which was created in 2009 as part of the Subcommission on Quaternary Stratigraphy of the International Commission on Stratigraphy [16]. However, even independently of the stratigraphical definition of the Anthropocene, the concept is valid in its essence. Both research areas (the Anthropocene and the Critical Zone) are explicitly focused on interactions between humans and the geoenvironment. For both, the collection and analysis of geoenvironmental data play a pivotal role. Moreover, it is important to stress that in these research pathways, historical and archeological information is extremely relevant, especially for the analysis of the evolution of geoenvironmental systems on long time scales.
In this context, policies for sustainable development, such as the SDGs and the European "Green Deal", are a first step toward maintaining the "stability" of the Earth system, as proposed for example by [2]. However, the definition and implementation of sustainable development policies imply a detailed and objective knowledge of the geoenvironmental system and a continuous monitoring of its dynamics, including the interactions with the anthroposphere. Unfortunately, the level of geoenvironmental knowledge required for sustainable development is not easily achievable. First, due to the complexity and heterogeneity of geoenvironmental systems, a full parameterization of the system, including knowledge of governing processes/factors and boundary conditions, requires a huge quantity of data, with high spatiotemporal sampling densities and coverage. Second, reasoning from the perspective of the implementation of sustainable policies, the potential reflexive dynamics (e.g., [17][18][19]) that characterize human-geoenvironmental systems should be considered. This reflexive behavior often develops according to circular, self-reinforcing and path-dependent patterns, through perception-related human actions. This implies that both the social sciences and history should contribute to untangling the complex interactions between humans and the geoenvironment. Third, the analysis and understanding of the geosphere-anthroposphere dynamics often require the study of geoenvironmental processes over extended periods of time, often spanning centuries or even millennia (e.g., for studying human forcing on the climate system). As discussed further in the paper, this represents a critical point due to the "data imbalance" effect: firstly, data progressively deteriorate in spatiotemporal density and accuracy going back in time; secondly, data properties are generally inhomogeneous (e.g., [20]).
The need to consider, in addition to traditional geoenvironmental proxies (e.g., based on dendrochronology, sedimentology, paleontology, geochemistry, etc.), humanities-related informative sources opens new challenges for earth scientists, who need to consider the specific characteristics of the human sciences and their investigative approaches. With reference to the human sciences, it should be stressed that their theory and knowledge (in the realms of sociology, anthropology, economics and, more generally, cultural studies) transform the subject matter they target due to a reflexive loop effect. According to a semiotic "principle of indeterminacy" (which we could more simply call the "observer effect"), all inquiry into human reality affects and transforms its object of inquiry [21] (pp. 28-29). This awareness has even led to the identification of social reality with its 'representation' in some of the most influential trends of sociological investigation [22,23]. To neglect the observer's positioning can lead to a sort of "ideological fallacy", that is, to assume the objective neutrality of the humanities and social sciences, as if the agendas and motivations that inform their specific form of knowledge could be separated from their content (this is like positing a form of pure "knowledge for the sake of knowledge" on the basis of which the knower does not want to undertake anything and wishes to leave reality untouched) [21]. As a consequence of these methodological premises, culture has been seen as a structured symbolic system (or "semiosphere") which results from a process of selective abstraction (which, in turn, is dependent on codification-and-interpretation codices) [24]. This semiotic abstraction should by no means be confused with reality itself, as inclusion and exclusion from the system is a processual matter [25,26].
Moreover, all abstraction, no matter how accurate, complex and systematic, emerges out of cognitive and historical processes and exerts societal functions towards specific goals ([27]). If cultural studies and, more specifically, the human sciences are to be included in a program of geo-anthropological inquiry, the objective-subjective tension that characterizes them ought to be taken into account as a constituent of the resulting interdisciplinary paradigm (in Kuhn's structuralist sense of paradigm, [28]), which cannot by any means be transcended or circumvented, not even by automated processes of data elaboration. In short, it is hard for earth scientists to extract unbiased environmental information from sources related to the human sciences without the contribution of scholars in the humanities and social sciences; conversely, it can be difficult for humanities-based scholars to analyze human-environmental interactions without contributions from earth and natural scientists.
New technologies and geocomputational approaches contribute strongly to the communicative approaches of earth scientists, permitting us to highlight human impacts on the geosphere at multiple scales, including the global one. For example, the remote sensing based "Black Marble" map of NASA [29], reporting light pollution on the globe (Figure 1), furnishes a sharp and self-evident picture of how humankind is overrunning planet Earth. Light pollution, apart from being an impact itself, is a proxy for urbanization and land-use changes, with all the inherent, direct and indirect, geoenvironmental and ecological implications, including the systemic disruption of multiple ecosystem services. From this viewpoint, the various geographical informative layers and maps reported in the "Atlas of the Human Planet" [9,10] are even more convincing. The atlas has been built by means of advanced geocomputational approaches based on machine learning, permitting the integrated use of different sources of information, including remote sensing technologies. It offers an updated and quantitative analysis of urbanization across the globe, with interesting outcomes. Ultimately, the communicative power of images and maps plays a key role from the perspective of social perception and serves as a key informative instrument in the hands of geoscientists. Quantitative maps of environmental variables (for example, reporting the pollution of air, water and soil) unequivocally display the human impacts on the geosphere. The air pollution maps of the globe derived via remote sensing technology, as with the ESA TROPOMI instrument mounted on Sentinel-5P (https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-5P/Nitrogen_dioxide_pollution_mapped, accessed on 7 February 2021), or the maps of cesium deposition in Europe after the Chernobyl accident (e.g., [30,31]) are emblematic.
As a further instance, one can mention cold-war atomic tests (over 500 detonations in the 1950s and 1960s), which left an even broader and lasting "bomb spike" that is currently under examination as a possible Anthropocene marker with disturbing ethical connotations [32,33]. Finally, recent public health and epidemiological studies are revealing that, almost surreptitiously (e.g., [34]), the pollution of air (e.g., [35]), soil (e.g., [36,37]) and water (e.g., [38]) is significantly affecting the health and well-being of humans; the potential societal and economic impact for the coming years could be worse than what is expected from climate change. The potential interactions between pandemic events, environmental pollution and socio-economic processes are another area to be further investigated (e.g., [39,40]). To be sure, many more examples could be presented, in connection with climate change, ecological impact, land-use changes and other geoenvironmental aspects. The point is that technology and geocomputational approaches are fundamental not only for researchers studying and managing the environment, but also for increasing awareness among the wider public and policy makers of environmental issues and of controversies over "consensus on consensus" (beginning with the paradigmatic case of human-caused global warming, as discussed by [41]). In these cases, statistical graphs, images and maps (e.g., [42]) capturing the environmental processes are a formidable communication tool. However, statistics, graphs and maps should always be accompanied by information on the informative sources and on their inherent limits (e.g., spatial resolution, uncertainty, underlying assumptions, etc.), so that their objective content can be correctly assessed. This allows us to avoid useless controversies that misleadingly embrace an opposition between data and models (which are, in fact, generally interdependent, as discussed by [43]).
It also helps us to counter widespread forms of anti-science skepticism, a growing problem in public opinion whose recent manifestations have their roots in the 'constructivist' sociology of science (in particular, the thesis that scientific truths are socially constructed) [44]. Criticism stemming from the sociology of scientific knowledge needs to be counterbalanced by a renewed trust in the validity and objectivity of knowledge content, albeit with an awareness of the function of such content and its methodological and technological limitations. Otherwise, mounting skepticism can become an instrument of irresponsible economic agendas and populist politics, deeply affect scientific work and discredit expertise, as has been evidenced by recent post-truth debates [45][46][47].

Technological Innovation and Geoenvironmental Data
Geoenvironmental data are fundamental in order to handle objectively the challenges that our planet and humanity are facing. They also play a pivotal role in communicating geoenvironmental issues to a non-expert audience. A fundamental characteristic of geoenvironmental data is that the spatial and temporal dimensions are an inherent property. In fact, the geographical position and temporal reference of environmental data (e.g., the concentration of a pollutant) are an integral part of the available information; this aspect has a decisive impact on all processes related to data collection, data analysis and dissemination of environmental information. Technological (hardware and software) and methodological developments, in both field and laboratory procedures, have a strong impact on geoenvironmental information retrieval, management and exploitation. The set of methodologies and tools that can be deployed to parameterize the environment is extremely wide and is characterized by continuous advancements. This contributes to the extreme heterogeneity in the characteristics of the geoenvironmental data available. In fact, geoenvironmental data can differ considerably in many aspects, such as (e.g., [48,49]): typology of information (e.g., continuous, categorical, compositional, hard, soft, etc.), spatiotemporal support of measurement, spatial coverage (fragmentary versus exhaustive information) and uncertainty.
Probably, at least concerning the analysis of earth-surface geoenvironmental processes from a global perspective, the most evident progress in geoenvironmental data retrieval is related to the remote sensing technologies (e.g., [50][51][52]) mounted on space platforms outside the atmosphere. Remote sensing imagery, representing spatial data with an exhaustive coverage of the domain studied, has the capability to capture the dynamics of environmental processes in action, for wide areas and with relatively high spatial and temporal resolution. Series of imagery reporting atmospheric or oceanic circulation are an example of this capability. Another relevant example is represented by the improvements in satellite gravimetry (e.g., [53,54]). National and international space agencies are making serious efforts to develop new sensors and platforms and to improve access to data. From this viewpoint, it is worth mentioning the efforts of the European Union and the European Space Agency (ESA), through the Copernicus programme, to develop new satellite sensors and to make satellite data publicly available via various web portals and software tools (https://earth.esa.int/eogateway/ accessed on 7 February 2021). Remote or, more generally, contactless sensing (e.g., proximal sensing) includes not only sensors mounted on satellite platforms but also those on all other platforms, manned or unmanned, whether terrestrial, marine or aerial (e.g., [55][56][57]). Moreover, active sensing technologies (e.g., [51]) such as Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) have revolutionized the way in which we can study earth processes. An instance of this is the possibility of deriving, by means of airborne LiDAR, high-resolution digital terrain/surface models that make the detection of fine-scale morphology and the study of multiple aspects of surface roughness feasible (e.g., [58,59]).
Moreover, these high-resolution terrain models, when collected on a multitemporal basis, are fundamental to monitor specific processes such as landslides, glaciers and coastal morphology (e.g., [57,60]). Concerning SAR technology, the possibility to monitor ground deformation for wide areas has a specific value for monitoring geoenvironmental processes such as land subsidence (e.g., [61]) or ground deformations after strong earthquakes (e.g., [62]).
Unfortunately, remote sensing technologies are useful for gathering information about the earth surface but furnish limited information on geoenvironmental processes and factors in the subsoil or below the water surface. In this context, geophysical methodologies, strongly connected with remote sensing technologies and coupled with geocomputational tools, are fundamental to improve our understanding of earth subsurface processes (e.g., [63][64][65]). Geophysical methodologies have seen relevant developments in recent years, with the main trend being toward easy-to-deploy and low-cost technologies that can be applied to a wide set of issues and in a wide range of settings, including urban contexts. For example, seismic, geoelectrical and georadar technologies are currently intensively applied to multiple geoenvironmental and geoengineering issues. The role of technological developments in this context is emblematically described by the case of passive seismic methodologies, where the development of flexible and easy-to-use tromographs fueled a veritable explosion of environmental seismology-related research (e.g., [66]), focused, for example, on seismic microzonation and bedrock-sediment transition mapping.
Technological development has improved the collection of geoenvironmental data useful for a wide range of earth science disciplines, in both field and laboratory equipment. Focusing on field sensors and the related data loggers, the improvements have been impressive in multiple fields: geochemical sensors for environmental monitoring (soil, water and atmosphere), physical sensors (pressure, temperature, strain, conductivity, etc.) for hydrological and hydrogeological monitoring, proximal sensing, tracers, etc. In general, the trend is toward the development of low-cost sensors that are rugged, customizable and require minimum maintenance. Moreover, modern sensors coupled with web technologies become smart, and "geosensor webs" become possible (e.g., [67]); within this framework, each sensor is capable of adapting to the registered signal and also of taking into account the feedback from the other sensors on the web, making the construction of self-adaptive monitoring networks feasible. This framework directly relates to the Internet of Things (IoT), a technology opening up new opportunities [68] but also implying potential cybersecurity threats [69] that can be critical when the environmental monitoring is devoted to important economic and strategic assets represented by natural resources.
Another typology of sensor that has seen strong improvement is the "human sensor" through Citizen Science approaches (e.g., [70,71]). In particular, developments in Information and Communication Technologies (ICT), on the side of both software (e.g., web) and hardware (e.g., smartphones, microcontrollers, etc.), facilitate the collection of environmental data by means of participative approaches and directly in the field by means of digital technologies (e.g., digital geological mapping and references). In regard to participative geoenvironmental data collection, many examples can be reported; some of these are related to civil protection activities, ecological monitoring and post-disaster mapping, such as the post-Fukushima radioactivity monitoring network (https://safecast.org/, accessed on 7 February 2021). Citizen Science approaches could play an active role in increasing transparency in environmental monitoring and improving awareness of environmental issues.
ICTs play a fundamental role in all segments of the productive chain of geoenvironmental information, including information retrieval, management and analysis (e.g., [72]). Cloud storage services are fundamental to manage the bewildering quantity of remote sensing data available from space agencies; for example, the private firm "Amazon", with the "Amazon Sustainability Data Initiative" (https://registry.opendata.aws/collab/asdi/, accessed on 7 February 2021), manages remote sensing data from multiple sources. Cloud services play a pivotal role even in the context of computing and participative programming; "Google Earth Engine" for remote sensing (https://earthengine.google.com/, accessed on 7 February 2021) or the "TensorFlow" platform (https://www.tensorflow.org/, accessed on 7 February 2021) for machine learning are examples of how computing power and the possibility of developing algorithms in collaboration with multiple researchers are widening the potentialities of environmental data analysis but are also raising concerns regarding free access to data. Corporate ownership and selling of data, especially those related to human activities, and their embedment in algorithmic systems used for the reorganization and automation of labor and policing raise legal, ethical and political concerns [73][74][75]. Even in the context of geographical information systems (GIS), there is a continuous push, from both proprietary and open-source solutions, toward WebGIS services and online GIS. Many environmental agencies, research institutions, associations and other entities collect and manage environmental data by means of cloud storage services, generally following various standards on data and metadata (e.g., INSPIRE). Moreover, big data related to human activity and consumption play a key role in studying the possible geosphere-anthroposphere interconnections.
The complexity and quantity of geoenvironmental information available are ever-growing; this is also accompanied by continuous improvements and diversification of data analysis tools and by increasing computing power. However, there is the feeling that these developments have grown much faster than our capability to fully, safely and robustly exploit available information. In order to "mine" the core information from multiple sources and huge quantities of data, which are not always qualitatively homogeneous, it is often necessary to adopt a big-data perspective and data-mining approaches. In this context, specific strategies for information validation become a key element. Relevant efforts should then be spent to formalize and make explicit the expert-based choices in the process of retrieving and analyzing data, given that user-based decisions impact many segments in the productive chain of environmental information. A final and perhaps obvious remark is that the dependency on online services for data storage and analysis could be risky, especially if based on infrastructures owned by private firms for which profit is the inherent target, since they can change their policies at any time or can also fail.

Geocomputing and Expert Knowledge
Geocomputational methodologies play a key role in the context of sustainability challenges. These are fundamental for the quantitative analysis of the main processes characterizing the Earth system and of their interactions (e.g., [1]). The analysis and modeling of geoenvironmental data are crucial for the detection of early-warning signs of geoenvironmental-system instabilities at the local or global scales. Moreover, the importance of geoenvironmental intelligence tasks for economic investments and policy making has significantly grown, adding a 'prescriptive dimension' to the collection and modeling of geodata. This emphasizes more than ever the need for a conscious and transparent use of geocomputational methodologies.
Geoenvironmental information is generally exploited by means of supervised or unsupervised learning approaches for achieving two main tasks: data exploration and prediction. In this discussion, the term "prediction" is used in a broad sense, i.e., the action of evaluating the value of an environmental property or the state of an environmental system in a specific location of the spatiotemporal domain of interest, where information is lacking or incomplete (e.g., [48,49,76,77]). Clearly, prediction and data exploration are two interlinked and complementary tasks, often marked by fuzzy boundaries.
In data exploration, the main aim is to find some "interesting" underlying structure which is potentially capable of shedding light on the studied phenomena (e.g., detection of forcing factors) and of guiding the predictive approaches subsequently adopted. The "interesting" structure can be related to multiple aspects, e.g., spatial and temporal auto- and cross-correlation, trends (in space and/or time), periodicities and multiscale analysis (Fourier, fractal, wavelets, etc.), causality relationships, clustering, tipping points, variable reduction, pattern analysis, etc.
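As a minimal sketch of one of the exploratory structures listed above, temporal autocorrelation can be estimated from a regularly sampled series as follows. The function name `autocorrelation` and the sample readings are hypothetical and are not taken from any of the cited works; the estimator is the common biased sample autocorrelation, which is only one of several possible choices.

```python
def autocorrelation(series, lag):
    """Biased sample autocorrelation of a regularly sampled 1-D series.

    Illustrative only: assumes equally spaced observations and no gaps.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance = sum(d * d for d in deviations)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of deviations separated by `lag` time steps
    covariance = sum(deviations[i] * deviations[i + lag] for i in range(n - lag))
    return covariance / variance


# Hypothetical readings with a period of four time steps
readings = [1.0, 2.0, 1.0, 0.0] * 6
in_phase = autocorrelation(readings, 4)      # strongly positive: same phase
out_of_phase = autocorrelation(readings, 2)  # strongly negative: opposite phase
```

A high value at a given lag suggests a periodicity at that time scale, which an expert would then need to interpret against the known forcing factors.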
In prediction, the main aim is to estimate the value (continuous) or the state (discrete) of an environmental property (or of an ensemble of environmental properties) in "locations" of the spatiotemporal domain of interest where measurements are missing or incomplete. Ultimately, from a practical perspective, one of the most important tasks is to obtain a spatiotemporally exhaustive "mapping", static or dynamic, of the environmental variables of interest in a given spatiotemporal domain. The reconstructed spatiotemporal mapping should be characterized by low uncertainty and should be "realistic", i.e., compatible with available data and with our expert knowledge.
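To make the notion of prediction at unsampled locations concrete, the sketch below uses inverse distance weighting (IDW), a simple deterministic spatial interpolator chosen here purely for illustration; it is not a method proposed in this paper. The function name `idw_predict`, the `power` parameter and the sample values are assumptions. Note that IDW returns no uncertainty estimate, so on its own it does not satisfy the low-uncertainty requirement stated above.

```python
import math


def idw_predict(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of ((x, y), value) measurements (hypothetical data)
    target:  (x, y) location where no measurement is available
    power:   distance exponent; larger values give more local estimates
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a measurement: honor the datum
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Two hypothetical pollutant measurements and one unsampled location
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_estimate = idw_predict(samples, (1.0, 0.0))
```

The choice of `power` is an example of a user-defined setting through which expert knowledge enters an otherwise data-driven prediction.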
Following this perspective, under the term "predictive" we can include not only explicitly predictive approaches (e.g., spatial interpolators, regression, etc.) but also numerical modeling approaches (e.g., groundwater models). In particular, the adoption of a specific approach depends on the quantity of data available, the complexity of the studied phenomena and the level of knowledge of the physicochemical processes involved. Accordingly, we could classify the predictive approaches in terms of the balance between available data and expert knowledge in influencing the analysis. When data are dominant with respect to expert knowledge, statistical predictive approaches such as geostatistics, Bayesian modeling and machine learning can be adopted [76]. In these approaches, expert knowledge influences the analysis in a semiquantitative way, for example, during exploratory data analysis and in the selection of critical user-defined settings (e.g., selecting the domain of analysis, selecting a specific anisotropy parameter, etc.). Then, in those settings characterized by a general balance between available data and expert knowledge, the latter codifiable only semantically, the prediction can be performed via a set of expert-based rules. In this typology, the approaches based on fuzzy logic [78] are emblematic. In other circumstances, the spatiotemporal density of the data can be too low to describe the true heterogeneity; however, at the same time, the physicochemical processes governing the studied phenomena are identified and can be modeled numerically: in this setting, typical for example of groundwater modeling (e.g., [79]), the data are used for calibrating the numerical model, and the spatial fields of the geoenvironmental property of interest can be derived by means of forward modeling (e.g., the dispersion of a pollutant) or by means of inversion (e.g., hydraulic conductivity).
Finally, a data assimilation approach can be adopted when a continuous flow of information is available and the physicochemical processes governing the studied phenomena are identified and can be modeled numerically. In this setting, typical of meteorology and oceanography, the numerical models are continuously updated as new data flow into the model.
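The logic of data assimilation can be sketched with a scalar Kalman-type update, in which a model forecast is corrected each time a new observation arrives. This is a deliberately simplified sketch, not an operational scheme: the linear model (`decay`), the error variances and the observation values are all hypothetical assumptions introduced for illustration.

```python
def assimilate(forecast, forecast_var, observation, obs_var):
    """Blend a model forecast with an observation, weighting each by its uncertainty."""
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


def run_cycle(state, state_var, observations, obs_var, decay=0.9, model_var=0.5):
    """Alternate forecast and update steps as observations flow in."""
    history = []
    for obs in observations:
        # Forecast step: hypothetical linear model x_next = decay * x,
        # with model error variance `model_var` added to the uncertainty.
        state = decay * state
        state_var = decay ** 2 * state_var + model_var
        # Update step: correct the forecast with the new observation.
        state, state_var = assimilate(state, state_var, obs, obs_var)
        history.append((state, state_var))
    return history


# Hypothetical stream of observations with error variance 1.0.
history = run_cycle(0.0, 4.0, [5.0, 5.2, 4.9], obs_var=1.0)
```

In this sketch the updated variance is always smaller than both the forecast and the observation variances, which is the sense in which each new datum reduces the uncertainty of the modeled state.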
The role of expert knowledge becomes particularly intricate in explorative analysis focused on finding interesting structures in data. In some way, this type of explorative analysis is related to the ancestral, inherent human tendency to seek an underlying meaning in the surrounding environment. For example, the inference of causation from data in Earth system sciences, including by means of machine learning, has received new attention [80] in the wake of novel trends toward the formalization of causal thought. Such trends are backed by claims that automated causal inferences constitute a breakthrough with respect to earlier vetoes against the derivation of causation from correlation, which led to a general ban on causal explanation from statistics [81]. The difficulty of shifting from an 'epistemology of correlation' to one 'of causation' is not unprecedented, as is evidenced by important historical developments in the natural sciences. In astronomy, for instance, it took a very long time before a proper celestial mechanics could emerge [82]. This emergence presupposed, first, the collection of 'big observational data' (from Babylonian times throughout the Middle Ages) and, second, a shift from predictions based on the recognition of the recurrences of planetary motions to a geometrical 'pattern recognition' (beginning in Greek antiquity, cf. [83,84]). Eventually, causality could enter the arena of mathematical astronomy in modern times, when Kepler and, more maturely, Newton introduced forces as the causes from which the geometrical regularities of celestial physics should be derived. But this causal leap looked like an epistemological break rather than a causal inference. Even today's keenest supporters of causal inference, Pearl and Mackenzie, acknowledge that causes depend on belief assumptions 'beyond the data'. Causal surmises can be confirmed and selected, yet they are not extracted from a hypothesis-free tabula rasa [85].
In the human sciences, the problem is tantalizing. The fathers of modern economics, especially after David Ricardo, already recognized that the source of the wealth of nations rests on labor, whose structuring and organization in specific societal formations depend on variable social and political factors [86]. Predictions based on the determination of historical causality are dubious since patterns can always be structurally altered by "black swans" that irreversibly change the paradigms to be modeled [87]. Renewed attention to the impact of ideas in the form of political and juridical theories that justify and produce societies' developments (that is, the importance of political and ideological factors over economic and technical ones) is at the basis of a rebirth of historical approaches in economics, in which the importance of natural language as a fundamental complement to mathematical and statistical language has taken center stage [88]. Such epistemological remarks, far from being destructive skeptical arguments, call for a sober recognition of the intrinsic limitations in the modeling of societal phenomena, which depend on the historically contingent character of the reality they map.

Geocomputation and Technology
The cited approaches have seen a growing applicability in recent years due to the increase in data availability, continuous software development and augmented computational power. In the context of software, the evolution of programming languages and of programming environments is a key ingredient for the applicability and development of computational approaches. Programming paradigms that go beyond old-fashioned procedural programming, currently based on object-oriented, generic and functional programming, make it feasible to write geocomputational software more easily and efficiently than in the past. Moreover, modern code can be easier to understand and is better suited for participative collaboration and development. In this regard, the availability of open-source solutions and common standards is fundamental. An example is represented by the Python language (https://www.python.org/, accessed on 7 February 2021), used both as a scripting language for automation and customization in domain-specific software (e.g., GIS packages) and as a standalone programming language with its own scientific libraries. Other examples are represented by mathematical or statistical programming environments where the development of new algorithms is straightforward. In this context, the open-source statistical programming environment R [89] is emblematic of the potentialities of open-source programming in science. Even domain-specific software (free and open or commercial) has seen astonishing developments, which can have an impact in the direction envisioned by the SDGs. The possibility to adopt free and open software solutions is fundamental to promote proper (perhaps more 'democratic') environmental management, as for groundwater resources (e.g., [90]).
In analogy to what has already been reported in regard to data-source availability, the available set of algorithms and related software seems to grow faster than the capability to select and use the right tools for specific tasks. For example, in the context of statistical predictive algorithms, such as geostatistics and machine learning approaches, the quantity of available software packages and algorithms, and of the related papers, is bewildering. It is extremely difficult for an earth scientist, especially at the beginning of her/his career, to find a clear pathway among the multiple options available today. Moreover, scientific literature may be of little help if the reader does not adopt a critical view; it is not rare to review or read scientifically unsound or at least inaccurate papers, for example those naively applying interpolation methodologies. Nevertheless, the wide set of available methodologies in data analysis and modeling, including technologies fostering participative approaches, represents an opportunity to develop collective intelligence approaches. These approaches, by improving transparency and fostering a plurality of perspectives, are necessary to shed light on the multiple aspects of the earth system and its interactions with the geosphere.
In order to move safely among the many options available today and the new ones of the future, there is the need to improve many aspects related to geocomputation. First, an intense and generalized demystification campaign should be conducted to clarify the key concepts, the assumptions (often hidden) and the limits of available methodologies. This is particularly true for approaches that seek to find 'interesting' structures in data such as tipping points, chaos and causal relationships. Moreover, complex formulas and formal mathematical expressions should always be accompanied by a clear explanation in plain language; specialist terminology should always be explained and jargon should be avoided. In the same Enlightenment spirit, there is the need to highlight connections and analogies between different approaches, especially when the differences are mainly related to the traditions and specificities of the different disciplines. There are multiple examples of this kind, e.g., the connections between kriging and objective analysis (e.g., [91]); the analogies between orthogonal regression and Principal Components (e.g., [92]); the analogies between autocorrelation analysis in time series and geostatistics. These first two targets are fundamental to attaining the third one: selecting the right tools for a specific task, preferring the simpler ones. Then, it is always worth highlighting the essential role of explorative data analysis and of expert knowledge in predictive approaches, even when adopting machine learning algorithms and other supposedly automatic/black-box methodologies. In this context, the main issue lies in documenting transparently how user-based decisions influence the results of the analysis.
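One of the analogies mentioned above, that between orthogonal regression and Principal Components, can be checked numerically in the two-variable case: the slope of the line minimizing perpendicular distances coincides with the direction of the first principal component of the covariance matrix. The sketch below is only illustrative; the function names and the data values are hypothetical, and the closed-form slope assumes a nonzero covariance between the two variables.

```python
import math


def covariance_terms(xs, ys):
    """Sample variances and covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, syy, sxy


def orthogonal_regression_slope(sxx, syy, sxy):
    """Closed-form slope of the line minimizing perpendicular distances (sxy != 0)."""
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)


def first_pc_slope(sxx, syy, sxy, iterations=200):
    """Slope of the first principal component, found by power iteration
    on the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]."""
    vx, vy = 1.0, 1.0
    for _ in range(iterations):
        wx = sxx * vx + sxy * vy
        wy = sxy * vx + syy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return vy / vx


# Hypothetical paired measurements, both affected by error.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

sxx, syy, sxy = covariance_terms(xs, ys)
slope_orthogonal = orthogonal_regression_slope(sxx, syy, sxy)
slope_pca = first_pc_slope(sxx, syy, sxy)
```

The two slopes are obtained by different routes, one from the regression tradition and one from multivariate analysis, yet they agree to numerical precision.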

Data Imbalance
Technological developments and growing awareness of geoenvironmental issues are fueling a continuous growth in geoenvironmental data collection. This is reflected in the astonishing imbalance in data quantity, spatiotemporal density and spatial coverage between datasets related to currently or recently monitored phenomena and datasets related to past dynamics. If we focus on the analysis of surface geoenvironmental processes, a sharp and exponential increase in data coverage and spatiotemporal density can be seen in the 1970s, corresponding to the beginning of the NASA Landsat program [50]. The more we dig into the past, the fewer environmental data are available, and the more evident the data imbalance becomes. The deterioration in information density, coverage and quality going back in time, which for simplicity is referred to as a "data imbalance", is clearly not new, and it is a recurring curse in many disciplines such as history, archeology and, of course, geology. What is really new is the bewildering explosion of new sources of environmental information and of data collection capabilities in the last decades, which are growing day after day. The "bloom" in geoenvironmental data is particularly extreme in the context of earth surface processes; consequently, another extreme data imbalance is also present when dealing with subsurface data, characterized by a sharp deterioration in information moving downward. The data imbalance becomes relevant when studying geoenvironmental dynamics over long time spans (or in 3D), for example in studying temporal trends of specific environmental variables (e.g., atmospheric temperature, sea level, subsidence, etc.). The imbalance is critical when the study of environmental dynamics is oriented to specific tasks, e.g., the analysis of variability, the analysis of extreme events or the detection of causal relationships (e.g., [80]).
From the perspective of sustainable development, and given the complex and reflexive relationships between humankind and the geosphere, the data imbalance is a critical issue, and is not easily surmountable. Earth scientists need to take the data imbalance into careful consideration in their analyses. A favorable condition is that the current capability to obtain a detailed and exhaustive spatial mapping of environmental variables of interest, even if limited to the earth surface, permits us to reasonably estimate the impact of severe undersampling and/or of a deterioration in data quality. Moreover, it is possible to derive an almost exhaustive and detailed overview of the complexity and dynamicity of many environmental processes, including the presence of abrupt transitions of system state. However, considering human-environment interactions, we have to reflect on how an oversimplified modeling of societal phenomena hinders their very comprehension, as modeling might lead us to forget the discontinuity of human history, which is marked by shifts and breaks that can elude the framework of a given societal-cultural formation and that therefore can escape the possibility of necessary deduction from preexisting conditions.
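The idea of using densely sampled present-day data to estimate the impact of severe undersampling can be sketched with a simple numerical experiment: a densely monitored series is decimated to mimic a sparse historical record, and the statistics of the two versions are compared. The series below is entirely synthetic and hypothetical (a slow cycle plus one brief extreme event); the sampling step is likewise an arbitrary assumption.

```python
import math


def synthetic_series(n=1000):
    """Hypothetical densely monitored variable: slow cycle plus one brief extreme event."""
    series = []
    for t in range(n):
        base = math.sin(2.0 * math.pi * t / 250.0)
        spike = 5.0 if 503 <= t <= 505 else 0.0
        series.append(base + spike)
    return series


def subsample(series, step):
    """Keep one value every `step` time units, mimicking a sparse record."""
    return series[::step]


dense = synthetic_series()
sparse = subsample(dense, 50)

# Compare how each record characterizes extremes and variability.
dense_max = max(dense)
sparse_max = max(sparse)
dense_range = dense_max - min(dense)
sparse_range = sparse_max - min(sparse)
```

In this constructed case the sparse record misses the extreme event altogether, so both the maximum and the range are underestimated; with real data, repeating such experiments over many sampling schemes would give a rough idea of how much information a sparse record is likely to lose.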

Earth Science and the Humanities
The data imbalance discussed above and the need to analyze interlinked geosphere-anthroposphere dynamics in time amplify the relevance of considering historical sources in addition to well-known geology-related methodologies. The derivation of geoenvironmental information from historical records (e.g., documents, paintings, architecture, etc.) is fundamental to reconstruct past geoenvironmental conditions (e.g., local sea levels), the occurrence of peculiar environmental events (e.g., exceptionally cold winters, earthquakes, eruptions) and the changing relationships between humankind and the environment (e.g., landscape engineering). For example, the use of historical records of earthquakes for seismic hazard evaluation is well-established (e.g., [93,94]) and relies on the possibility of directly appropriating historical records on the perception of seismic effects. Other examples can be found in the context of geomorphological changes (e.g., [95]), landslides (e.g., [96,97]), volcanic activity (e.g., [98]) and many other geoenvironmental phenomena. The derivation of quantitative proxy data from historical records is not straightforward, although some pioneering works already exist. Camuffo et al. [99] worked on the reconstruction of temperatures in the Mediterranean Sea over 500 years by combining recent data derived from instrumental observations with historical sources for the times that preceded modern scientific measurements. Camuffo et al. [100] could also derive evidence of extremely cold winters in the lagoon of Venice from local documentary sources, including not only archival documents but also the visual arts and early printed books.
The most daring proposal has been to derive a biological proxy for the past sea levels of the lagoon of Venice, from 1350 to 2014, from early-modern depictions of green algae in Venetian canals, and to integrate it with information about past sea levels inferred from the height of the stairs of historical palaces on the main city canal, the Canal Grande [7]. One of the main difficulties in using such sources lies in the correct historical evaluation of the material and cultural contexts of works of art and of the uses of architecture. Moreover, not only did scientific concepts change with time but so did the units of measurement that were once used and the meaning of their referents. To take just one example, it is not easy to translate past measurements of river flows if, even after the Renaissance mathematization of the principles of water flow thanks to the Galilean school, the quantity of measured water was referred to the volume of a geometrical construction rather than to the modern concept of flow rate [101,102]. Additionally, the study of historical and archeological records also sheds light on how science, politics and socio-economic factors interacted from the perspective of geoenvironmental policies and adaptation to ever-changing environmental conditions (e.g., [103]). To remain with the Venetian case, archival administrative, technical and political documents could provide a mine of geological and environmental data, provided the correct interpretative methods are employed. Such documents also offer historical cases that help us reflect on geoenvironmental politics. The proper methods, in this case, include archival competence, philological skills and historical training, which are rarely united with a sufficient preparation in the earth and natural sciences.
Moreover, the use of digital tools for the extraction of information in the humanities (e.g., for textual analysis or the comparison of corpora of texts) is less developed than in the earth sciences, although the digital humanities is a rapidly growing field of research (e.g., [104]).

New Professionalism?
The derivation of quantitative geoenvironmental information from historical and archeological records is not an easy task and requires a truly holistic approach, in which earth scientists (e.g., geologists, soil scientists, ecologists, geographers, etc.) work in teams with historians, archeologists, philologists, historians of architecture and philosophers of science. Hence, it is necessary to establish more interdisciplinary research networks and collaborations. A new hybrid profession is needed. Part of its work ought to be devoted to clarifying the historical meaning of scientific categories and their transformations, as well as to clarifying the varying goals that shaped the sciences of the past, which is typically a task for intellectual historians. Most importantly, this hybrid professional figure should bridge different academic cultures and disciplines, thus mediating between different outlooks and different uses of concepts, which sometimes look identical only superficially. For instance, it is not yet clear, from a geo-anthropological perspective, how historical time and geological 'deep' time intersect and co-determine each other from the viewpoint of the disciplines that deal with them.
Problems of geocomputing, linked to data imbalance and to the integration of historical data, not to mention the modeling of societal data, are highly relevant in politics and decision making. Quantitative abstractions of cultural practices and the tracing of natural processes and human actions have ostensible advantages in terms of management but run the risk of preparing new forms of exploitation and authoritarian politics. In fact, the inclusion of geoenvironmental models in economic computations, although pursued in the name of a "green economy", reconceptualizes and operationalizes natural and cultural phenomena in terms of resources, services and capital (e.g., [105-107]). Far from being mere metaphors, such linguistic uses connected with quantitative abstractions ensure the possibility of economic valorization in the framework of a sort of "hyperrealism" which, at the level of individual and collective psychology, creates the illusion that the profit economy constitutes the insurmountable naturalized horizon of history and human relations [108].

Conclusions
The increasing availability of data and geocomputational tools is continuously amplifying our capability to understand and model the environment. However, we are, maybe luckily, far from relying on purely automatic "meat mincer" approaches capable of searching and assimilating all available environmental data for the problem at hand and outputting the core information. Instead, data exploration, possibly according to a plurality of perspectives, along with expert knowledge, plays a key role in understanding and modeling earth system dynamics. Now more than ever, there is the need for earth scientists characterized by a balanced alchemy of geoenvironmental knowledge and geocomputational capabilities, with advanced field-related skills. Field interpretation of the geoenvironment, even with the contribution of digital technologies, will continue to play a pivotal role.
Present-day geological and environmental challenges require us to look closer at human history, in particular the history of economics, technology, and science. These studies can help dig out environmental data and offer cases of the complex and reflexive dynamics between humans and their environments. Moreover, by strengthening the awareness of the historicity of human society, culture and knowledge, science studies (the wide spectrum of disciplines which reflect on science at philosophical, historical and sociological levels) foster critical thought, which is particularly important in order to find a balance between the political and the techno-scientific components that are co-implicated in sustainable development policies.
Acknowledgments: The design of this paper is related to a keynote presentation (Trevisani, 2019) given during the international conference "TerraEnvision 2019: toward sustainable development goals" (https://terraenvision.eu/, accessed on 7 February 2021), held in Barcelona on 2-6 September 2019: "Geocomputing, New Technologies and Historical Analysis: Tools for a Changing Planet". Some technology-related considerations were inspired by the various meetings of the Geosciences & Information Technology group (Section of the Italian Geological Society). The authors are grateful to the Max Planck Society for the funding of the Max Planck Partner Group in Venice, The Water City, in order to further investigate the themes of this opinion paper. The authors would like to acknowledge the blind referees for their comments and suggestions and Jonathan Regier for his valuable support with the final revision.