LeafLive-DB: Classification and Data Storage of Botanical Studies

Abstract: The development of studies, projects, and technologies that contribute to the understanding and preservation of plant biodiversity is becoming increasingly necessary, as are tools and software platforms that enable the storage and classification of information resulting from biodiversity studies. This work presents LeafLive-DB, a software platform that helps map and characterize species of Brazilian plant biodiversity, with the possibility of worldwide distribution. Developed by Brazilian and Peruvian researchers, the platform, available in its first version, offers functions for consulting and registering plant species and their taxonomy, among other information, through intuitive interfaces and an environment that promotes collaboration and the sharing of data and research. The platform innovates in data processing, functionality, and development architecture. It holds ten thousand records and should begin to be distributed in partnership with schools and higher education institutions.


Introduction
There is currently a remarkable effort to preserve the Brazilian Atlantic Forest and other forests and plant areas of great bioeconomic importance. This effort requires an appropriate mapping of the existing biodiversity in all regions of Brazil [1][2][3]. This diversity motivates further studies on the organic and chemical structure of plant species that may be of great interest to industry [4], since they are an essential raw material in the manufacture of products for the perfumery, cosmetics, pharmaceuticals, hygiene and cleaning, and food and beverage sectors. It is therefore necessary to create tools and data storage and classification environments that contribute to mapping the species that are studied, analyzed, and cataloged. These software environments should use the most robust technologies available to analyze, classify, and store the generated data.
The rapid advancement of technologies for handling large volumes of data has allowed the increasing development of new platforms for the treatment, analysis, and representation of information resulting from research in the most diverse areas of knowledge. New paradigms, such as Big Data [5], propose new techniques based on data mining, machine learning, neural networks, and more robust databases to structure information in both the binary and the real world. Moreover, bioinformatics [6] is a science capable of changing knowledge about human and plant genomes. It allows greater efficiency in the classification of, and inference over, the large volume of data generated by the many studies and experiments performed. For instance, in the case of the human genome, the amount of research on genes, their functions, and their respective mutations has expanded, opening space for the pharmaceutical industry, laboratories, and similar industries that investigate treatments for the most aggressive diseases. In the case of other species, bioinformatics, combined with molecular biology and new technologies, is efficient and necessary for studying species classification and mapping.
The Integrated Taxonomic Information System (ITIS) [7] is one of the largest taxonomic data repositories. It gathers data on different species of North America, such as plants, animals, fungi, and microbes. Among its functions is the possibility of exporting the database for processing on other platforms. Some repositories also integrate tools for data treatment, assisting in academic research. An example is GenBank [8], from the National Center for Biotechnology Information (NCBI), developed as a collaborative repository for biological research. GenBank can be considered the most complete and probably the largest repository of shared data in the world, owing to the many related studies on different species (human and animal).
Although there is a particular interest in species mapping for preservation, few collaborative tools and software environments support the dedicated storage of plant species, taxonomies, and information useful to the researcher, especially for the Brazilian Atlantic Forest. The platform presented in this work focuses on that biome but also allows the addition of other Brazilian biomes, such as the Amazon rainforest and the Cerrado, which are important for study and preservation.
The present work is based on the premise that new platforms should be designed exclusively for the use and registration of plant biodiversity data. Such platforms should be able to store and classify large volumes of data on plant species and should improve usability to facilitate access to information and learning. LeafLive-DB aims to meet this demand, using intuitive interfaces and self-explanatory fields that allow students and researchers to learn about and classify species in a user-friendly way and to access information resulting from their studies on plant biodiversity.

Literature Survey
To facilitate the study of certain species, animals, or plants, it is necessary to structure and classify information about such species in environments that offer tools and interfaces for intuitive inference over the data. Table 1 shows the main tools currently in operation. These platforms are few but robust, and their datasets are referenced in several articles published worldwide. The platforms presented in Table 1 provide taxonomy repositories for different organisms [7,9], with a vast amount of information and related scientific works [8]. Some offer a plant image catalog [10], and others allow searching for information in several databases, presenting the results as an online encyclopedia [11]. Others exhibit a higher volume of data, functions, structures, and forms of classification by region [7,8,11,12]. Some are exclusive to the North American area [4,7], and access to some information is restricted to the domain of academic institutions [3]. Some platforms seek to group information on global biodiversity [21], gathering it through consultation of several restricted databases. Other platforms offer a more structured search and presentation of species information [20], although they are restrictive and complex in the technical language they use. Others present succinct information about a specific species, even adding statistical information [22].
In summary, although these platforms present solutions with respect to the volume of data and the tools available for treatment, inference, and data classification (taxonomies, classification, images), most of them are not specific to plant species. Additionally, in most cases, they present relatively complex environments: broad technical and scientific expertise is needed to operate them, owing to the diversity of data types and forms of representation. The few platforms that gather information about plant species are either restricted or exclusive to a region. Some initiatives have been developed (see Table 2) to meet the demand for platforms and tools that support the forms of data representation resulting from the classification of plant species and their characteristics, taxonomies, and images. However, they are presented only as possible techniques to be used, not as platforms or tools already available and in operation for collaboration with free access.
The survey (Tables 1 and 2) highlights the lack of software platforms that help spread knowledge of the plant species of Brazilian biomes, their characteristics, and their morphology, among other information.
The purpose of the platform developed and described in this work is to offer a software environment for sharing information about plant species from the Brazilian Atlantic Forest and to allow the inclusion of data from other Brazilian biomes. However, the platform will not be restricted to this environment, as it will be able to gather taxonomic details and images of the great diversity existing inside and outside the Brazilian plant ecosystem.

Overview of LeafLive-DB
LeafLive-DB is presented, in its first version, as a software platform for a web environment that allows registering and sharing information about plant species from the Brazilian Atlantic Forest (Figure 1). Its search environment, to which all users have full access, has self-explanatory interfaces. The results of the queries provide information that can be complemented after a quick registration of the user (student or researcher). Unlike other platforms, LeafLive-DB is designed as a collaborative environment for the exclusive storage and classification of information about plant species and their respective taxonomies, and it aims to be a reference platform in all teaching environments. The platform has a search/query interface and allows the inclusion of information, such as: (c) Images; (d) Geolocation information; (e) Scientific articles (scientific productivity and published data, resulting from studies on various species).
A map in the shape of a plant leaf (leaf map) is also included to assist with and complement the information on the plant being registered. The platform environment is flexible, scalable, and collaborative. Moreover, it has self-descriptive and easy-to-use interfaces, allows the user to cancel actions, and provides the administrator with a dashboard that facilitates the control and validation of users and content.
Two crucial aspects characterize LeafLive-DB. The first is that it is an environment for storing and classifying information exclusively about plant species and their taxonomies. The second is that it uses the Model-View-Controller (MVC) architectural model, customized in its three layers of implementation, focusing not only on the scalability of its functions but also on techniques for storage, classification, and inference over large volumes of data.
LeafLive-DB's functions are grouped into three functional modules, as described below:
(a) Administrative Module: the administrative panel, or control dashboard, allows full management of the system, the database of users and hierarchical access profiles, and the approval of organism registrations. This module's template was designed with an adequate level of abstraction, allowing the reuse of components common to all users (header, footer, sidebar, and slide bar, among others). In the business layer, which is responsible for the administrative panel, functions are loaded according to each user's access profile.
(b) Organism Registration Module: considered the main module of the system, since it provides the interfaces for feeding the system, it allows the insertion of data about a particular species and its related information, such as its taxonomy. The module is divided into five levels:
(i) Taxonomic Classification Level: allows the insertion of the entire taxonomic hierarchy of the organism. It was built to perform a dynamic search in the database and return results by similarity in an auto-complete form. Consequently, if the taxonomy entered by the user already exists, it is not necessary to make a new registration and duplicate the data; only a new relationship with the pre-existing taxonomy is created. The return of the database query is also handled so that, when no matching taxonomy exists, a new one can be registered.
(ii) Image Level: at this level, the function for uploading images of the registered plant is implemented, allowing the user to correlate the information visually. The upload is done asynchronously using the "fineuploader" plug-in. When registration is finalized, the images are shown as a slide show, together with a preview of the organism's registered information.
(iii) General Information Level: at this level, there are text fields for entering information about each plant species, and a "leaf map" is made available to assist the user in the precise identification of the species at the moment of registration.
(iv) Indexing Level: this level covers the registration of academic papers related to the recorded species. After an article is registered, relationships are created between all registered species that have been studied and described in the paper. For this purpose, the user needs to mention the plants that were considered, using keywords. All information provided is grouped, and references (link or article) are inserted on the main registration page of each organism. Articles uploaded to the platform must be by the authors themselves and open access. Otherwise, the user can share the official publication webpage.
(v) Geolocation Level: this level was built for entering the geographic location of the identified species (Google Maps API). This should eliminate errors about the origin (where the plant was located) and allow easy mapping of the Brazilian biome species.
In the registration module, the data is pre-validated during its insertion. In cases where information is already registered in the system, this information automatically complements the existing registration, avoiding data duplication.
(c) Consultation Module: the consultation module allows registered organisms and related works (articles) to be located through a combination of specific filters. Currently, this module has a base of 56 thousand records, with a response time per query on the order of milliseconds. Date and time functions on the insertion and manipulation of data registered on the platform are part of this module.
Each of the levels and modules that correspond to the registration of new plant species, characteristics, and taxonomy is displayed independently, thus ensuring information is managed fractionally and allowing for mass updates to the shared registrations.
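The reuse of a pre-existing taxonomy described for the Taxonomic Classification Level can be sketched as a "get or create" lookup. The sketch below is illustrative only and is written in Python rather than the platform's PHP; the class `TaxonomyStore`, its methods, and the prefix-based matching are assumptions made for the example, not the platform's actual code or matching rule.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyStore:
    """In-memory stand-in for a taxonomy table (hypothetical structure)."""

    _ids_by_key: dict = field(default_factory=dict)  # normalized name -> id
    _names: list = field(default_factory=list)       # id -> display name

    @staticmethod
    def _normalize(name: str) -> str:
        # Collapse whitespace and ignore case so near-identical entries match.
        return " ".join(name.split()).casefold()

    def suggest(self, prefix: str, limit: int = 5) -> list:
        """Return stored names that start with the typed text (auto-complete)."""
        key = self._normalize(prefix)
        matches = [n for n in self._names if self._normalize(n).startswith(key)]
        return sorted(matches)[:limit]

    def get_or_create(self, name: str) -> tuple:
        """Return (taxon_id, created); reuse the stored entry when one matches."""
        key = self._normalize(name)
        if key in self._ids_by_key:
            return self._ids_by_key[key], False
        taxon_id = len(self._names)
        self._names.append(" ".join(name.split()))
        self._ids_by_key[key] = taxon_id
        return taxon_id, True


store = TaxonomyStore()
first_id, created_first = store.get_or_create("Myrtaceae")
second_id, created_second = store.get_or_create("  myrtaceae ")
```

Here the second call returns the identifier of the first entry instead of inserting a duplicate, so a new organism would only gain a relationship to the existing taxonomy.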

Description of the Architecture
The architecture of LeafLive-DB (Figure 2) presents a high level of abstraction, considering the large volume of data expected for the system. Dividing the system into layers allows each component to be improved individually, facilitates data treatment in the lower layers, and permits a straightforward presentation of each plant's information, classification, structure, and taxonomy through a highly intuitive interface layer. The architecture follows a three-layer (3-tier) MVC model [27], customized at each of its three implementation levels. The result of this customization is the FelideoMVC framework (source code of FelideoMVC: https://github.com/felideo/FelideoMVC (accessed on 1 January 2021)) [28], which defines a structure for implementing and controlling services (data and access) tailored to the construction of Web platforms. The objective is to maximize access, management, and inference over large volumes of data using hierarchical models (FelideoTrine source code: https://github.com/felideo/FelideoTrine (accessed on 1 January 2021)) [29], while keeping the structure scalable in functions and performance. The three layers of this framework are described below.

(a) Interface layer (view): restructured with a new interface control class that couples the Dwoo PHP Template Engine. Incorporating the template engine allows efficient retrieval of structures encoded in HTML and their dynamic replacement with data received from the control layer. The result is an agile construction of interfaces that keeps the code organized and eliminates the mixing of languages. In this same interface layer, the routing class "Bootstrap" is called, which communicates directly with the control layer. This class performs all application execution control. All framework routing is based on the URL used, either by the user or by the application itself. Using the URL, the Bootstrap class identifies which controllers, models, and views should be instantiated and which function must be executed.
(b) Control layer (controllers, or business layer): restructured and segmented into three levels. At level 1, specific control is performed, reduced to the set of business rules specific to a given controller. At level 2, an abstraction class called ControllerCrud implements and automatically executes the basic CRUD functionalities (create, read, edit, and delete entries), without the developer needing to code them. At level 3, the most common control methods used by the lower levels are grouped, such as access to the functions of the interface layer (view) and the data layer (model).
The control layer restructuring allows the developer to create system modules more quickly and effectively, indicating only a few parameters at level 1. It is worth mentioning that all other functions of creating, retrieving, editing, deleting, viewing data, accessing the database, and returning the user interface are already done and executed automatically by the upper levels, which are common to all controllers.
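The combination of URL-based routing and a three-level controller hierarchy can be illustrated with a small sketch. It is written in Python rather than PHP, and it is not the FelideoMVC code: the class names `BaseController`, `CrudController`, and `SpeciesController`, the `/controller/action/arguments` URL convention, and the in-memory `storage` dictionary are all assumptions chosen for the example.

```python
class BaseController:
    """Level 3: helpers shared by every controller (view and data access)."""

    def __init__(self, storage: dict):
        self.storage = storage  # stands in for the data layer

    def render(self, template: str, **data) -> str:
        # Stand-in for a template engine: plain placeholder substitution.
        return template.format(**data)


class CrudController(BaseController):
    """Level 2: generic create/read/edit/delete, inherited instead of re-coded."""

    entity = "record"  # overridden at level 1

    def _table(self) -> dict:
        return self.storage.setdefault(self.entity, {})

    def create(self, key: str, value: str) -> str:
        self._table()[key] = value
        return self.render("{entity} {key} created", entity=self.entity, key=key)

    def read(self, key: str) -> str:
        return self._table()[key]

    def edit(self, key: str, value: str) -> str:
        if key not in self._table():
            raise KeyError(key)
        self._table()[key] = value
        return self.render("{entity} {key} updated", entity=self.entity, key=key)

    def delete(self, key: str) -> str:
        del self._table()[key]
        return self.render("{entity} {key} deleted", entity=self.entity, key=key)


class SpeciesController(CrudController):
    """Level 1: only the parameters specific to this module."""

    entity = "species"


ROUTES = {"species": SpeciesController}  # hypothetical route table


def dispatch(url: str, storage: dict) -> str:
    """Resolve '/controller/action/arg...' to a controller method and call it."""
    parts = [p for p in url.split("/") if p]
    if len(parts) < 2:
        raise LookupError(f"no route for {url!r}")
    controller_name, action, *args = parts
    controller_cls = ROUTES.get(controller_name)
    if controller_cls is None or action.startswith("_"):
        raise LookupError(f"no route for {url!r}")
    handler = getattr(controller_cls(storage), action, None)
    if not callable(handler):
        raise LookupError(f"no action {action!r} on {controller_name!r}")
    return handler(*args)
```

In this arrangement a new module needs only a level 1 class with a few parameters, for example `dispatch("/species/create/1/Eugenia_uniflora", {})`, while the CRUD behavior and the shared helpers come from the levels above it.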
(c) Data layer (model) is restructured and segmented into two levels. At level 1, there is the specific model, and at level 2, the general model. The abstractions performed by both levels follow the same line of operation as the control layer. At level 1, it is possible to encode specific and more complex queries, and level 2 automatically takes care of the most common queries to be used.
In this same layer (model), a library called FelideoTrine was developed to simplify the coding of database queries, replacing the native and verbose language of MySQL (Table 3) with methods that reduce the number of lines of code needed to build a query.
As presented in Table 3, when generating the query, the FelideoTrine library treats the result, building a hierarchical tree, removing repeated data, and structuring for a better representation. Therefore, it is useful when inferring about large volumes of data with several related variables.
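The idea of turning a flat query result into a hierarchical tree without repeated data can be sketched as follows. This is a generic illustration in Python, not the FelideoTrine implementation; the function `nest_rows`, its parameters, and the sample rows are assumptions made for the example.

```python
def nest_rows(rows: list, key: str, parent_fields: list, child_name: str) -> list:
    """Group flat join rows into one record per parent with de-duplicated children."""
    parents = {}  # parent id -> nested record (dicts preserve insertion order)
    for row in rows:
        parent = parents.setdefault(
            row[key],
            {key: row[key], **{f: row[f] for f in parent_fields}, child_name: []},
        )
        # Everything that is not a parent column belongs to the child record.
        child = {
            k: v for k, v in row.items() if k != key and k not in parent_fields
        }
        has_data = any(v is not None for v in child.values())
        if has_data and child not in parent[child_name]:
            parent[child_name].append(child)
    return list(parents.values())


# Illustrative rows, shaped like the result of a species/images LEFT JOIN.
flat_rows = [
    {"species_id": 1, "species": "Eugenia uniflora", "image": "leaf.jpg"},
    {"species_id": 1, "species": "Eugenia uniflora", "image": "fruit.jpg"},
    {"species_id": 1, "species": "Eugenia uniflora", "image": "leaf.jpg"},
    {"species_id": 2, "species": "Euterpe edulis", "image": None},
]

tree = nest_rows(flat_rows, key="species_id", parent_fields=["species"], child_name="images")
```

Each species then appears once, with its images attached as a list, which is easier to pass to a view than rows in which the parent columns repeat.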
Following a higher-level description, the LeafLive-DB architecture is built using languages, frameworks, and plug-ins to maximize the control of interfaces and access to platform data by users. It also simplifies the customization and addition of new functions, as described below.
(a) Interface layer: built following a hybrid programming model, using HTML5, PHP, JavaScript, and CSS3. CSS3, the language for styling content at the interface level, is part of the platform specification. JavaScript provides dynamic control of HTML and CSS3, changing the layout in real time according to user interaction and altering the behavior and information displayed on the page.
(b) Business layer: responsible for validating and processing server-side data, applying business rules, and communicating with the data layer. Its implementation uses the FelideoMVC framework, which is based on the MVC model. It was chosen for its simplicity in development and study routines, initially relying on a basic implementation of an MVC architecture, communication with the database, URL routing, and a user login system. It also provides an intuitive administrative panel. Improvements to the MVC framework allowed for better code organization, greater agility in coding the platform's functionalities, and, finally, greater stability of the system.
The business layer focuses on the implementation and control of all platform functions. In the treatment and access to data, two plug-ins and a library are used to improve some of the data access functions (data layer).
The DataTables plug-in is used to build dynamic tables, loading data from the database via Ajax and thus allowing the content of the table to be searched, filtered, and ordered without the need to rebuild it.
The SweetAlert plug-in is used to display alerts to the user, including confirmation alerts, forms embedded in alerts, and callbacks made via Ajax, among others.
The jQuery library is used for the interaction between JavaScript and HTML. It has numerous functions that simplify and accelerate development at the interface level (interface layer). These include running scripts on the client side without sending them to the server, validating data, communicating asynchronously with the business layer via Ajax, and requesting data processing and database queries without reloading the page.
(c) Data Layer: it follows a relational structure currently composed of 22 tables. Six tables are essential to the framework's functioning, and sixteen tables refer to the structure for storing data and images of organisms and scientific articles.

Accessing LeafLive-DB Services
Currently, LeafLive-DB has more than 10,000 plant records and their respective taxonomies. The search environment (Figure 3) is completely open and offers selection fields that are easy for users to access. Filling in all fields is not mandatory; however, each additional field refines the search criteria applied to the relational database. When the objective is not to search but to include plants (registration of living being), characteristics, and taxonomies, the user (student or researcher) needs to go to the "Access" option, which allows registration, preferably with an institutional email. Access is granted after the user validates the message sent to the registered email address. Once registered, the user has access to more functions, such as plant search and registration, an administrative panel, and the option to edit the user's profile (Figure 4).
After registering, users can access the option "registration of living being," where they can contribute to the LeafLive-DB platform. At the interface level, users find a set of fields that should be familiar to them and that can be completed quickly thanks to an auto-complete function. The taxonomy registration section (Figure 5) uses a plug-in called "select2", which searches the database (DB) and returns the results. If the taxon entered by the user already exists, it can be selected, and the system then fills in all related higher-level taxonomies. If the taxon does not exist, the user can manually fill in all the requested and related information. After entering the taxonomic classification, users can insert an image (photo) of the plant, adding graphic details on the registered plant and its visible characteristics.
When uploading images (photos), users start to share their intellectual property, allowing other users to use them in their academic research work.
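The automatic completion of the higher taxonomic levels once an existing taxon is selected can be sketched as a walk up a chain of parent references. The sketch is in Python and is illustrative only: the `TAXA` dictionary, the function `fill_upward`, and the reduced set of ranks are assumptions made for the example and do not reflect the platform's database schema.

```python
# Hypothetical taxonomy table: name -> (rank, parent name). Ranks are simplified.
TAXA = {
    "Plantae": ("kingdom", None),
    "Myrtales": ("order", "Plantae"),
    "Myrtaceae": ("family", "Myrtales"),
    "Eugenia": ("genus", "Myrtaceae"),
}


def fill_upward(taxon: str, taxa: dict) -> dict:
    """Return {rank: name} for a known taxon and all of its ancestors.

    An empty dict means the taxon is unknown, so nothing can be pre-filled
    and the user would complete the fields manually.
    """
    filled = {}
    seen = set()
    current = taxon
    while current is not None and current in taxa:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        rank, parent = taxa[current]
        filled[rank] = current
        current = parent
    return filled


prefilled = fill_upward("Eugenia", TAXA)
unknown = fill_upward("Unregistered", TAXA)
```

Selecting the genus is enough to pre-fill family, order, and kingdom in this sketch, while an unknown name yields an empty result and leaves every field for manual entry.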
Moreover, plants are generally better known by their popular name than by their scientific name, and popular names may vary depending on the region of the country where the plants are found. Users who are registering a species, including new users, can add features, descriptions, and additional information that allow other people to recognize the cataloged species easily, and the LeafLive-DB platform makes this much simpler. If desired, it is also possible to add a "leaf map" (Figure 6). The species is cataloged following a taxonomic hierarchy structure, and the information presented can be complemented with details and peculiarities of the plant. These details might be the number of petals and stamens, the ovary's position, whether it has fruit or not, whether it is covered or not, and a brief description of the plant. The leaf map contains complementary information and allows the student and the researcher to add information about the leaf shape of the registered plant and upload an image of the plant.
Finally, two options stand out within the "registration of living being" function. The first allows users to include scientific production related to the registered plant species, whether they are the authors or not. They can make the file available or include the reference link so that new users can reference the article (Figure 7). The second corresponds to the location information (georeferencing) of the plant, such as where it was found and collected. This function is integrated with the Google Maps API, which uses latitude and longitude information to insert a marker on the map. Moreover, multiple locations can be added to the same plant (Figure 7). In the final layout, each information section undergoes an initial client-side validation before being saved to await the administrator's validation. All requests for new records are temporarily saved and available on the administrator's dashboard (Figure 8) until approval. Only after approval is the information available to users of the platform.
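The flow of initial validation followed by administrator approval can be sketched as a small moderation queue. The sketch is in Python, whereas the platform performs its first validation on the client side; the names `Submission`, `ModerationQueue`, and `validate_location`, and the specific checks applied, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"


@dataclass
class Submission:
    species: str
    locations: list  # (latitude, longitude) pairs in decimal degrees
    status: Status = Status.PENDING


def validate_location(latitude: float, longitude: float) -> bool:
    """Check that a coordinate pair lies within valid geographic bounds."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


class ModerationQueue:
    """Holds new records until an administrator approves them."""

    def __init__(self):
        self._submissions = []

    def submit(self, species: str, locations: list) -> Submission:
        # Initial validation before the record is saved as pending.
        if not species.strip():
            raise ValueError("species name is required")
        if not all(validate_location(lat, lon) for lat, lon in locations):
            raise ValueError("latitude/longitude out of range")
        submission = Submission(species.strip(), list(locations))
        self._submissions.append(submission)
        return submission

    def approve(self, submission: Submission) -> None:
        submission.status = Status.APPROVED

    def pending(self) -> list:
        return [s for s in self._submissions if s.status is Status.PENDING]

    def visible(self) -> list:
        # Only approved records are shown to platform users.
        return [s for s in self._submissions if s.status is Status.APPROVED]


queue = ModerationQueue()
record = queue.submit("Euterpe edulis", [(-23.55, -46.63), (-25.43, -49.27)])
```

The record accepts several locations for the same plant and remains hidden from `visible()` until `approve` is called, mirroring the dashboard step described above.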

Conclusions and Future Works
LeafLive-DB is an initiative that aims not only to provide platforms and APIs to assist in the mapping and classification of plant data and their respective taxonomies, but also to promote knowledge about the biodiversity of Brazilian flora at various levels of primary and higher education. Currently, the data repository has 10,000 records, which can soon be complemented with more information, such as plant images, geolocation data, and scientific articles.
Beyond the Atlantic Forest, the Amazon rainforest is considered to hold the greatest biodiversity in Brazil and on the planet. It presents a megadiversity of plant species, among other species, yet it is little known by the population itself and is highly vulnerable, with some species at risk of extinction due to constant deforestation and forest burning and to the absence of an adequate preservation policy. We believe that the development of platforms that encourage the sharing and representation of information about biodiversity in a structured and self-explanatory way, also exploring the use of popular species names in their classification, could help in learning about the local biome, encourage its preservation, and support its dissemination worldwide. Furthermore, as the Amazon rainforest extends into other countries in South America, this platform could also be used in neighboring countries, which would contribute further data and allow LeafLive-DB to become a more robust platform.
The platform, in its first version, presents an environment with friendly and very intuitive interfaces. It provides options for searching, registering species (plants), taxonomies, images, and a leaf map, an option for linking scientific articles related to one or more species, and, finally, a geolocation function. For this last function, an API for mobile devices is expected to be developed shortly, allowing students and researchers to use their mobile devices as a collection tool. The idea is that they could operate offline and use the cellphone's GPS to record their position when mapping a species. Then, when users are in a zone with an internet connection, they could synchronize their API with the online platform.
A second point to be developed on the LeafLive-DB platform is gamification. The idea is to create a game that simulates the Atlantic Forest scenario, recreating situations and challenges encountered by students and researchers that can be solved using the platform. Moreover, to reach a younger target audience, such as primary school students, the game must address conditions related to preservation.
A third and final point is to develop the option of grouping and viewing species information in the form of catalogs (Figure 9). These catalogs would allow greater diffusion of the plant morphology and taxonomy tool in classrooms. Finally, LeafLive-DB is still under development, and we will update it as needed and as users share suggestions. The platform will therefore be continuously improved to meet the increasing demand for plant data and taxonomies, thus making knowledge available to everyone.