The Spider Anatomy Ontology (SPD)—A Versatile Tool to Link Anatomy with Cross-Disciplinary Data

Abstract: Spiders are a diverse group with a high eco-morphological diversity, which complicates anatomical descriptions, especially with regard to terminology. New terms are constantly proposed, and definitions and limits of anatomical concepts are regularly updated. Therefore, it is often challenging to find the correct terms, even for trained scientists, especially when the terminology has obstacles such as synonyms, disputed definitions, ambiguities, or homonyms. Here, we present the Spider Anatomy Ontology (SPD), which we developed by combining the functionality of a glossary (a controlled defined vocabulary) with a network of formalized relations between terms that can be used to compute inferences. The SPD follows the guidelines of the Open Biomedical Ontologies and is available through the NCBO BioPortal (ver. 1.1). It consists of 757 valid terms and definitions, is rooted with the Common Anatomy Reference Ontology (CARO), and has cross references to other ontologies, especially of arthropods. The SPD offers a wealth of anatomical knowledge that can be used as a resource for any scientific study, for example, to link images to phylogenetic datasets, compute structural complexity over phylogenies, and produce ancestral ontologies. By using a common reference in a standardized way, the SPD will help bridge diverse disciplines, such as genomics, taxonomy, systematics, evolution, ecology, and behavior.


Introduction
Spiders are an impressively diverse group that has specialized to very diverse habitats, lifestyles, and prey [1]. Such diversity is reflected in a myriad of morphological structures that enable different ways of walking, spinning, attacking, and sensing the world. The names of these structures may be widely used in animals (e.g., eye, retina), in other arthropods (e.g., leg, sclerite), or only in arachnids (e.g., sucking stomach, chelicera), while some are specific to spiders (e.g., spinneret, male copulatory bulb) or even to particular spider groups (e.g., cribellum, lyra, paracymbium, tibial crack). The definitions and limits of these concepts are constantly updated: functional studies discover constituent parts of organs and how they interact and work together, while taxonomic and phylogenetic studies often reveal new structures in different taxa, and how they vary in shape and composition across species. As a result, morphological terms have accumulated over time and it is challenging to choose the correct name for a structure of interest, even for trained scientists. Moreover, as in many other taxa [2], morphological terminology is plagued with complications such as synonyms (e.g., carapace vs dorsal shield of prosoma), subtle differences (e.g., direct vs indirect eye), disputed definitions (e.g., claw tuft vs scopula; spermatheca vs reservoir), ambiguities (e.g., onychium, bursa; spermathecae vs vulva), same name used for different structures (e.g., ejaculatory duct for spermophor), functional names for structurally different structures (e.g., conductor, receptacle), and misleading names (e.g., dictynoid pore). These terminological ambiguities lead to misunderstandings and imprecise morphological descriptions, and are an obstacle for cross-disciplinary and integrative studies.
A simple approach to overcome this dilemma is the glossary, an alphabetically ordered list of terms with their definitions. For example, one of the first spider glossaries contained 28 terms [3], Simon's Histoire naturelle des araignées [4] listed 152, and the glossary of an identification guide for North American spiders has 225 entries for anatomical parts [5].
Such glossaries are valuable for clarity and stability of anatomical terminology in a given work. Generally, names of morphological parts should be stable and intuitive; however, both purposes are sometimes in conflict, as for example with the name "carapace", which is used for crustaceans and spiders with completely different meanings. Therefore, "dorsal shield of prosoma" has been recommended for spiders instead. However, this term is not widely used, as stability seems to be favored over a clear term. Another example is the "dictynoid pore", a delimited patch of glandular pores on the spermathecae of some entelegyne spiders described by Bennett [6]. At that time, Bennett thought that the structure was characteristic for the superfamily Dictynoidea as delimited by Forster [7], resulting in such a taxon-specific term. Later, it was shown that the dictynoids are not monophyletic (see [8]), and since the gland was found in many entelegynes, but never in dictynids, a new name ("Bennett's gland") was proposed by Ramírez [9], arguing for clarity rather than stability.
Beyond clarity and stability, anatomical names reflect hypotheses of sameness or distinction that are still open to investigation. For example, the muscles M29 and M30 in the male palps insert in the copulatory bulb of many spiders (see [10]). Their name comes from similar muscles in the palp of females and immatures, as well as in the legs, which operate the pretarsus and claws (depressor and levator muscles, respectively; see [11]). Thus, a common name for these muscles conveys the identification of the male copulatory organ as a kind of modified palpal pretarsus, which is plausible but hypothetical (see [12,13]). Another conundrum is the terminology for silk glands and spigots. The spigots are named after the gland type they serve, but in many cases they are documented first from external anatomy, while the connection with the corresponding gland remains to be proven (e.g., the modified spigots and the flanking triad spigots in the posterior lateral spinnerets of araneomorphs, the paracribellar gland spigots of filistatids; see [14]).
It can be concluded that the naming of body parts and structures can be difficult and highly volatile, often creating obstacles in morphological descriptions, a drawback that was described as the "linguistic problem of morphology" by Vogt et al. [15]. Many users of morphological terms are inexpert in spider anatomy, yet they need accurate names to annotate observations and, more importantly, a common standard for morphological descriptions. Therefore, it is important to create a reference and a community space to discuss and maintain that knowledge (see also [16]). Here, we propose an anatomical ontology for spiders, which not only has a glossary functionality, but also allows powerful inferences using community standards for many fields in biology. The Spider Anatomy Ontology (SPD) provides an open repository for anatomical terms and a forum where researchers and users with different backgrounds can discuss and refine definitions, argue about preferred names, correct mistakes, or contribute new terms.

Anatomical Ontologies and What They are Used for
Anatomical ontologies are controlled defined vocabularies, which are formalized by rules and assertions [16]. The purpose of an anatomical ontology is two-fold. Firstly, it serves as a glossary, where each term or "entity" (eye, retina, carapace, sustentaculum) has a textual definition and its preferred and alternative names are listed. For example, identification guides (e.g., [5,17]) have a glossary for anatomical terms that helps the understanding of terms that may otherwise be confusing or academic.
Secondly, and no less importantly, the terms are grouped in a reasoned way using RDF triples (e.g., the retina is part of the eye, which is part of the prosoma), which results in a hierarchical structure as, for example, in the neuroanatomical glossary of Richter et al. [18]. There are many ways of grouping terms to express current knowledge (serial homology, structural parthood, functional groups), and all co-exist in the same ontology (e.g., adding that the eye is a sensory organ, as well as part of the prosoma). For example, a researcher who needs to annotate an image showing the innervation of a sensory organ related to the control of the dragline would use the term "sensillum of major ampullate field" (SPD:0000256) (see [19]). The ontology represents this organ as a type of strain sensillum of cuticle (SPD:0000511), part of the peripheral nervous system (SPD:0000622), and part of the anterior lateral spinneret (SPD:0000036). Another study reports that the proneural gene CsASH2 is expressed in the anterior spinneret [20], which according to the ontology is a synonym of the anterior lateral spinneret. This structure of relationships represents much of the definition of each anatomical entity, helps bridge cross-disciplinary studies, and allows for elaborate questions about groups of entities (e.g., are strain detectors differently distributed in web vs cursorial hunters?).
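The grouping described above can be illustrated with a minimal sketch in Python. It stores relations as (subject, relation, object) triples and follows one relation type transitively. The four SPD identifiers are those quoted in the text; the identifiers prefixed with "EX:" are placeholders invented for this illustration, and the function name is hypothetical rather than part of any ontology toolkit.

```python
from collections import defaultdict

# (subject, relation, object) triples.
# SPD ids are the ones quoted in the text; "EX:" ids are placeholders.
TRIPLES = [
    ("SPD:0000256", "is_a", "SPD:0000511"),     # sensillum of major ampullate field -> strain sensillum of cuticle
    ("SPD:0000256", "part_of", "SPD:0000622"),  # -> peripheral nervous system
    ("SPD:0000256", "part_of", "SPD:0000036"),  # -> anterior lateral spinneret
    ("EX:retina", "part_of", "EX:eye"),
    ("EX:eye", "part_of", "EX:prosoma"),
    ("EX:eye", "is_a", "EX:sensory_organ"),
]


def ancestors(term, relation, triples):
    """Return every term reachable from `term` by following `relation` only."""
    graph = defaultdict(set)
    for subject, rel, obj in triples:
        if rel == relation:
            graph[subject].add(obj)
    seen, stack = set(), [term]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# The retina is part of the eye, and therefore also part of the prosoma.
retina_wholes = ancestors("EX:retina", "part_of", TRIPLES)
# Several groupings co-exist: the eye is also a kind of sensory organ.
eye_kinds = ancestors("EX:eye", "is_a", TRIPLES)
```

Keeping the relation type as part of each triple is what lets parthood and classification hierarchies share the same set of terms without being confused with one another.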
Anatomical ontologies were originally designed for cross-disciplinary work over large communities, especially to link phenotypic and genetic data (where the genes are expressed, which organ is affected by a gene inactivation or mutation), with the objective of accumulating knowledge and mining the data for potential genotype-phenotype associations [21]. For this reason, only some parts of the ontology are friendly to the human eye (i.e., the definitions, many of the relations between entities), while other components are more technical or abstract, intended for automated computation and compatibility (e.g., the alphanumeric identifiers, some higher-level abstract entities). The formal relations of the ontology are used by algorithms called reasoners to derive inferences, such as candidate genes for a given phenotype [22]. Beyond the original impulse, anatomy ontologies are used today in multiple tasks that require manipulation and integration of large quantities of phenotypic data [23,24]. A recent application in systematics is the merging of phylogenetic datasets and the automatic inference of phenotypic traits [25]. An example for spiders would be the inference that the anterior lateral spinnerets (ALS) are present in a species that was scored as having a very short base of piriform spigots, because the piriform spigots are in a chain of part of relations with the ALS. These kinds of inferences will minimize the missing entries in the automatic merging of datasets of araneomorph and mygalomorph spiders.
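The presence inference in the spider example can be sketched as follows. This is a simplified illustration, not the reasoner used in [25]: it assumes each entity has a single direct whole, and the links in the chain (including the intermediate term names and the top-level "opisthosoma") are assumptions made for the example rather than SPD content.

```python
# Direct part_of links, part -> whole (illustrative chain, assumed for this sketch).
PART_OF = {
    "base of piriform gland spigot": "piriform gland spigot",
    "piriform gland spigot": "anterior lateral spinneret",
    "anterior lateral spinneret": "opisthosoma",
}


def infer_present(scored_present, part_of):
    """If a part was scored as present, every whole above it must be present too."""
    inferred = set(scored_present)
    for entity in scored_present:
        whole = part_of.get(entity)
        # Stop early when a whole is already known: its own chain is handled
        # either by an earlier pass or by its own turn in the outer loop.
        while whole is not None and whole not in inferred:
            inferred.add(whole)
            whole = part_of.get(whole)
    return inferred


# A dataset scored only the base of the piriform spigots for this species.
present = infer_present({"base of piriform gland spigot"}, PART_OF)
```

When two datasets are merged, a cell for "anterior lateral spinneret: present/absent" that is empty in one dataset can then be filled from `present` instead of being left as a missing entry.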

The Spider Anatomy Ontology (SPD)
We developed the Spider Anatomy Ontology (SPD) following the guidelines of the Open Biomedical Ontologies [26], a community that harbors most of the ontologies developed in biology. Every term in the SPD has some mandatory elements (see also Figure 1):
Identifier (ID): For example, "SPD:0000024". The identifier is unique (meaning that it is not occupied by any other term in this or any other ontology) and persistent (meaning that once created it cannot be recycled for a different term). Terms no longer in use are marked as "obsolete" but keep their identifier.
Name: For example, "leg 4". The name is unique within a given ontology. Note that the name of an entity can change (e.g., correcting misspellings, switching with a synonym), as well as its definition/concept (e.g., refining concepts or grammar), or its relationships to other terms. Because the identifiers are unique and persistent, databases or other software that refer to terms by ID can be updated automatically.
Definition/Concept: For example, "most posterior appendage of prosoma," followed by a bibliographic reference; sometimes the name of a curator replaces the reference.
One or more relationships: For example, "leg 4 is a leg", "leg part of prosoma".
There may be some optional elements such as synonyms (hind leg, leg IV), comments, whether it is an obsolete term, and in this last case, what other term ought to be used instead.
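The elements listed above map naturally onto a small record type. The following Python sketch is only an illustration of that structure: the class and field names are hypothetical, pairing the example identifier with the example name "leg 4" is an assumption, and the image file name is invented. It also shows why annotations stored by ID survive a change of name.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Term:
    # Mandatory elements
    id: str                      # unique and persistent
    name: str                    # unique within the ontology, may change
    definition: str
    relationships: List[Tuple[str, str]] = field(default_factory=list)
    # Optional elements
    synonyms: List[str] = field(default_factory=list)
    comment: str = ""
    obsolete: bool = False
    replaced_by: Optional[str] = None  # term to use instead when obsolete


leg4 = Term(
    id="SPD:0000024",  # assumed pairing of the two examples given in the text
    name="leg 4",
    definition="most posterior appendage of prosoma",
    relationships=[("is_a", "leg")],
    synonyms=["hind leg", "leg IV"],
)

ontology = {leg4.id: leg4}

# External data refer to the term by ID, never by name (file name is invented).
annotations = {"image_0001.jpg": "SPD:0000024"}

# Switching the name with a synonym does not invalidate the annotation.
leg4.synonyms.remove("leg IV")
leg4.synonyms.append(leg4.name)
leg4.name = "leg IV"

current_name = ontology[annotations["image_0001.jpg"]].name
```

An obsolete term would keep its `id`, set `obsolete=True`, and point to its successor through `replaced_by`, so that older annotations can still be resolved.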
The SPD was built and edited with OBO-Edit [27], and is available online in BioPortal (http://bioportal.bioontology.org/ontologies/SPD), where the successive versions are tracked. A prototype version was first developed for the agglomeration of phylogenetic characters of spiders and managing of images in phylogenetic datasets, as published in Ramírez et al. [28].
Coverage: This is an anatomical ontology, thus many phenotypic characteristics such as behavioral entities (e.g., describing courtship display, mating posture), web or burrow elements, or other spider constructions are not considered. However, we have included a group of entities to describe the main silk threads and fibers (e.g., cribellate silk band, dragline, attachment disk) under "portion of organism substance" from the Common Anatomy Reference Ontology (CARO; [29]); these terms may find a better place in the future in a specific ontology. We have undertaken a major effort to build the SPD as a multi-species ontology, trying to accommodate the anatomy of many spider species in the same scheme, as already successfully implemented for teleost fishes (TAO; [30]), hymenopterans (HAO; [2]), or circulatory system of arthropods (OArCs; [31]). In contrast, many ontologies are single-species, such as the Drosophila Anatomy Ontology (FB-BT, for Drosophila melanogaster) [32] or the Foundational Model of Anatomy for humans (FMA; [33]).
Relationships: We used the relationships 'is a' and 'part of', which are the most common in anatomical ontologies [34]. The architecture of the ontology is rather simple compared to the more mature ontologies mentioned above; some of these ontologies use more relations (e.g., 'attaches to', 'develops from', 'connected to', 'contained in', 'synapsed by'), reflecting their longer development time. The root terms of SPD were taken from CARO, mainly for compatibility with other ontologies. We inserted some cross-references to equivalent terms in other ontologies (e.g., cuticle, acrosomal vacuole, reproductive system).
Synonyms: We used exact synonyms for interchangeable names with the same definition (e.g., abdomen and opisthosoma), and related synonyms for terms that are often used with similar meaning but do not correspond exactly (maxillary gland and sieve plate).
References: We included some references as citations, especially when the definition can be traced to a particular source (e.g., SPDrf:Huber_1994). In most cases the terms are attributed to the curator that added the definition (Ramírez or Michalik in this initial version). The main references for terms and definitions are as follows: External morphology follows mostly Ramírez [9]. Male reproductive system and spermatozoa follow mainly Michalik & Ramírez [35]. Genital organs follow mainly Comstock [36], Sierwald [37] and Coddington [12]. Silk structures follow Eberhard and Pereira [38]. Main sources for general anatomy were Foelix [39], Barth [40], Griswold et al. [14] and Millot [41].
The ontology currently has 757 valid terms, of which 17 are from CARO higher classes. There are 89 cross references to other ontologies (FBbt, GO, UBERON, ZFA, HAO, TrOn, and CL; see NCBO BioPortal, https://bioportal.bioontology.org/). Additionally, there are 75 obsolete terms that are kept for reference to older ontology versions.
The coverage of terms with regard to external and internal structures reflects the common knowledge on spider anatomy and morphology. Therefore, the external morphology is substantially better covered than the internal organs. Nevertheless, the main anatomical systems are represented (e.g., nervous system, circulatory system, digestive system), although only at a gross level. Exceptions are the reproductive system and silk glands, which are more densely covered.

Application of the Spider Anatomy Ontology (SPD): A Few Use Cases
In the following we provide an overview of selected use cases for the spider ontology demonstrating the value of the SPD as a tool in morphological, taxonomic and systematic research.
Linking images to phylogenetic datasets. In this application the ontology was used to retrieve relevant images for phylogenetic datasets, and to document the characters and states with descriptions and exemplar images [28]. This is accomplished by the Silk package [42] of the software Mesquite [43]. Each image is annotated with one or more ontology entities, according to the structures displayed in the image (e.g., anterior lateral spinneret), and with the species name. Then each character (e.g., number of piriform gland spigots) is annotated with the relevant entity in the ontology. As one points to a cell (character by taxon), the program queries the database for relevant images (entity by species) and, if found, displays one or more images, which can be used as information to score the cell or review previous scorings. Together with the character and state documentation, the program shows the definition of the associated entity in the ontology.
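The lookup described above (cell to entity, then entity by species to images) can be sketched in a few lines of Python. This is not the Silk or Mesquite API; the record type, function name, file names, and species labels are all hypothetical, and the choice of "piriform gland spigot" as the entity for the example character is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Image:
    path: str
    species: str
    entities: FrozenSet[str]  # ontology entities shown in the image


# Invented image records, annotated by species and by entity.
IMAGES = [
    Image("als_01.jpg", "Species A", frozenset({"anterior lateral spinneret", "piriform gland spigot"})),
    Image("eye_01.jpg", "Species A", frozenset({"eye"})),
    Image("als_02.jpg", "Species B", frozenset({"anterior lateral spinneret", "piriform gland spigot"})),
]

# Each character is annotated with the relevant entity (assumed mapping).
CHARACTER_ENTITY = {
    "number of piriform gland spigots": "piriform gland spigot",
}


def images_for_cell(character, taxon, images, character_entity):
    """Return paths of images that show the character's entity in the given taxon."""
    entity = character_entity.get(character)
    if entity is None:
        return []
    return [img.path for img in images
            if img.species == taxon and entity in img.entities]


hits = images_for_cell("number of piriform gland spigots", "Species A",
                       IMAGES, CHARACTER_ENTITY)
```

Because both images and characters point to ontology entities rather than to each other, new images become available to every relevant character as soon as they are annotated.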
Evolution of complexity and ancestral ontologies. This use case aimed to calculate and trace the evolution of anatomical complexity over a phylogenetic tree [44]. For this, each character of a dataset of dionychan spiders [9] was annotated with anatomical entities from the SPD. One single-species ontology was derived for every species and hypothetical ancestor, as a subset of the multi-species ontology. This was accomplished by using the characters expressing presence or absence of structures (neomorphic characters, [45]) to remove entire sections of the ontology (e.g., the cribellum and all its constituent parts were removed for an ecribellate spider). Then a complexity value was derived for every species and hypothetical ancestor, thus tracing the evolution of complexity (e.g., transitions to overall simpler morphologies). The ontology was also used to filter relevant character systems, revealing, for example, a transition to a more complex setae system when spinning organs and silk were simplified.
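The pruning step can be sketched as below. The hierarchy is a toy example with illustrative term names, and the complexity measure (a plain count of retained terms) is a stand-in chosen for this sketch; the measure actually used in [44] is not described here and may differ.

```python
# Whole -> direct parts. Names and structure are illustrative, not SPD content.
PARTS = {
    "spinning apparatus": ["cribellum", "anterior lateral spinneret"],
    "cribellum": ["cribellar spigot"],
    "anterior lateral spinneret": ["piriform gland spigot"],
}


def all_terms(parts):
    """Every term mentioned in the hierarchy, as whole or as part."""
    terms = set(parts)
    for children in parts.values():
        terms.update(children)
    return terms


def with_parts(entity, parts):
    """The entity plus everything below it in the part_of hierarchy."""
    found, stack = set(), [entity]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parts.get(current, []))
    return found


def species_ontology(parts, absent):
    """Subset of the multi-species ontology after removing absent structures."""
    removed = set()
    for entity in absent:
        removed |= with_parts(entity, parts)
    return all_terms(parts) - removed


def complexity(terms):
    # Placeholder measure (assumption): number of retained terms.
    return len(terms)


cribellate = species_ontology(PARTS, absent=set())
ecribellate = species_ontology(PARTS, absent={"cribellum"})
```

Scoring a single neomorphic character as absent removes the whole section beneath it, so `complexity(ecribellate)` is lower than `complexity(cribellate)` without any of the constituent parts having been scored individually.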
Text mining for quantitative traits. In this application the purpose was to retrieve quantitative data from taxonomic descriptions using the Explorer of Taxon Concepts application [46]. The application processed the species descriptions from PDF files of taxonomic publications of amaurobioidine spiders (Anyphaenidae) and composed a data matrix of measurements (e.g., carapace length, tibia 1 length), organized as species by variable. The Spider Anatomy Ontology was used by the application to guide the natural language processing and identify the terms in the text.
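As a rough illustration of ontology-guided extraction, the sketch below uses entity names as anchors for a regular expression and builds one row of a species by variable matrix. It is a deliberately naive stand-in, not the method of the Explorer of Taxon Concepts application; the description text, the species label, and the function name are invented, and only "length" measurements are handled.

```python
import re

# Entity names that would come from the ontology (taken from the examples in the text).
ENTITIES = ["carapace", "tibia 1"]


def length_measurements(description, entities):
    """Map '<entity> length' to the first number that follows it in the text."""
    row = {}
    for entity in entities:
        # Entity name, the word "length", a few non-digit characters, then a number.
        pattern = r"\b" + re.escape(entity) + r"\s+length\D{0,15}?(\d+(?:\.\d+)?)"
        match = re.search(pattern, description, flags=re.IGNORECASE)
        if match:
            row[entity + " length"] = float(match.group(1))
    return row


# Invented description text in the style of a taxonomic publication.
description = "Carapace length 3.45, width 2.60. Tibia 1 length 4.10."

matrix = {"Species A": length_measurements(description, ENTITIES)}
```

Restricting the search to names known to the ontology is what keeps unrelated numbers, such as the carapace width above, out of the matrix.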
Alignment of terms between arthropod ontologies. In this study Bertone et al. [47] aligned comparable terms between the Hymenoptera Anatomy Ontology (HAO) and other arthropod anatomy ontologies (spiders, ticks, mosquitoes and Drosophila melanogaster). They found 43 terms in common between HAO and SPD, especially from leg segments, some features of the reproductive system and higher-level CARO classes. Besides the structural differences between hymenopterans and spiders, the ontologies differ in granularity for organ systems; for example, HAO covers many more muscles, while SPD has more classes for setae/sensory structures. Several terms are used with different meanings in SPD and HAO (homonyms; e.g., radix, serrula, pedicel, alveolus). As expected, HAO had more terms in common with the other insect ontologies (152 with mosquitoes, 132 with D. melanogaster), although the architecture of their design varies substantially.

How to Contribute
The ontology presented herein should be regarded as a starting point, and we hope that the community will use it as a resource and contribute better definitions, corrections, new terms and relations. Persons interested in contributing should email the contacts listed in BioPortal (http://bioportal.bioontology.org/ontologies/SPD). Initially, we have set up a Google forum for discussion and contributions (https://groups.google.com/forum/#!forum/spider-anatomy-ontology), although that may change in the future.

Limitations and Future Developments
The anatomical ontology presented here offers a dense coverage of anatomical terms in use for all spider taxa, and can be used as a reference and as a repository that can be refined and expanded. Users that are inexpert in anatomy can benefit from the browsing and searching tools already implemented in the BioPortal. Since the terms have definitions, and are grouped in several hierarchies (anatomical clusters, organ systems), it is intuitive to locate a correct term or find that a new term is needed.
It is important to emphasize some of the limitations of this enterprise. Anatomical ontologies, such as the SPD, are succinct references, imperfect and never exhaustive. The SPD will be expanded and refined with community input, but because complex cases need to be discussed until consensus is reached, it is expected to be a step behind the knowledge of the time. A more fundamental limitation is that any ontology uses simple and schematic logical expressions to construct a rough representation of our understanding of very complex biological phenomena.
One major challenge comes from the aim of fitting all spider species in a single multi-species ontology. Some anatomical groups are so rich in structures and variable that it is difficult to represent all variants in fine detail (e.g., genital and spinning organs in araneomorphs), and further to integrate these together with structures from much simpler morphologies (e.g., in mygalomorphs). A similar case, from fishes, is the representation of the Weberian apparatus of Otophysi together with the undifferentiated vertebrae of other fishes (see [30]).
A second challenge is the discrimination of structures by sex or stage. We have duplicated by sex certain structures that are of special interest; for example, male palp and female palp (with the corresponding male palpal tibia and female palpal tibia, etc.). Other structures, such as the leg 1, are represented only once. However, since many spider males have clasping structures on tibia 1, should we discriminate the leg 1 by sex as well? Should we do this for any body part affected by sex dimorphism in some species? This is a common complication of ontologies (e.g., of holometabolous insects [47]).
In summary, the Spider Anatomy Ontology offers a wealth of anatomical knowledge that can be used as a resource for any scientific study. By using a common reference in a standardized way, the SPD will help bridge diverse disciplines, such as genomics, taxonomy, systematics, evolution, ecology, and behavior. Moreover, the semantic tools that are constantly developed in many fields of biology can be used with the SPD to obtain inferences that go beyond the original purpose of the individual research.