<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s120912126</article-id>
<article-id pub-id-type="publisher-id">sensors-12-12126</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Ontological Representation of Light Wave Camera Data to Support Vision-Based AmI</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Serrano</surname><given-names>Miguel Ángel</given-names></name><xref ref-type="corresp" rid="c1-sensors-12-12126"><sup>*</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Gómez-Romero</surname><given-names>Juan</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Patricio</surname><given-names>Miguel Ángel</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>García</surname><given-names>Jesús</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Molina</surname><given-names>José Manuel</given-names></name></contrib>
<aff id="af1-sensors-12-12126">Applied Artificial Intelligence Group, Universidad Carlos III de Madrid, Avd. de la Universidad Carlos III, 22, Colmenarejo, Spain; E-Mails: <email>jgromero@inf.uc3m.es</email> (J.G.-R.); <email>miguelangel.patricio@uc3m.es</email> (M.A.P.); <email>jesus.garcia@uc3m.es</email> (J.G.); <email>molina@ia.uc3m.es</email> (J.M.M.)</aff></contrib-group>
<author-notes>
<corresp id="c1-sensors-12-12126">
<label>*</label>Author to whom correspondence should be addressed; E-Mail: <email>miguel.serrano@uc3m.es</email>; Tel.: +34-918-561-338.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2012</year></pub-date>
<pub-date pub-type="epub">
<day>05</day>
<month>09</month>
<year>2012</year></pub-date>
<volume>12</volume>
<issue>9</issue>
<fpage>12126</fpage>
<lpage>12152</lpage>
<history>
<date date-type="received">
<day>02</day>
<month>05</month>
<year>2012</year></date>
<date date-type="rev-recd">
<day>31</day>
<month>07</month>
<year>2012</year></date>
<date date-type="accepted">
<day>21</day>
<month>08</month>
<year>2012</year></date></history>
<permissions>
<copyright-statement>© 2012 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2012</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>Recent advances in technologies for capturing video data have opened a vast number of new application areas in visual sensor networks. Among them, the incorporation of light wave cameras in Ambient Intelligence (AmI) environments provides more accurate tracking capabilities for activity recognition. Although the performance of tracking algorithms has quickly improved, the symbolic models used to represent the resulting knowledge have not yet been adapted to smart environments. This lack of representation prevents systems from taking advantage of the semantic quality of the information provided by new sensors. This paper advocates the introduction of a part-based representational level in cognitive-based systems in order to accurately represent the knowledge provided by the novel sensors. The paper also reviews the theoretical and practical issues in part-whole relationships and proposes a specific taxonomy for computer vision approaches. General part-based patterns for the human body and transitive part-based representation and inference are incorporated into a previous ontology-based framework to enhance scene interpretation in the area of video-based AmI. The advantages and new features of the model are demonstrated in a Social Signal Processing (SSP) application for the elaboration of live market research.</p></abstract>
<kwd-group>
<kwd>visual sensor networks</kwd>
<kwd>light wave</kwd>
<kwd>structured light</kwd>
<kwd>time-of-flight</kwd>
<kwd>cognitive vision</kwd>
<kwd>ontology-based</kwd>
<kwd>ambient intelligence</kwd>
<kwd>social signal processing</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>AmI develops computational systems that apply Artificial Intelligence techniques to process information acquired from sensors embedded in the environment in order to provide helpful services to users in daily activities. AmI objectives are: (i) to <italic>recognize</italic> the presence of individuals in the sensed scene; (ii) to <italic>understand</italic> their actions and estimate their intentions; (iii) to <italic>act</italic> accordingly.</p>
<p>The use of visual sensors in AmI applications has received little attention [<xref ref-type="bibr" rid="b1-sensors-12-12126">1</xref>], even though they can obtain a large amount of interesting data. Some reasons are: the economic cost of visual sensor networks, the computational requirements of visual data processing, the difficulty of adapting to changing scenarios and the disadvantages with respect to other sensor technologies, such as legal and ethical issues.</p>
<p>In the last decade, new visual sensor technologies have updated the established concepts of computer vision approaches. Time-of-Flight (ToF) technology provides both intensity and distance information for each pixel of the image, thus offering 3-dimensional imaging [<xref ref-type="bibr" rid="b2-sensors-12-12126">2</xref>,<xref ref-type="bibr" rid="b3-sensors-12-12126">3</xref>]. Structured light imaging makes it possible to obtain an accurate depth surface for objects with an unprecedented resolution. Recently, the cost of these sensors has been dramatically reduced, which has led to a widespread adoption of these technologies, now even present in consumer electronics like the Kinect™ peripheral for the Microsoft XBox™ system.</p>
<p>New computer vision algorithms have been proposed to detect and track human movements from structured light and ToF sensors [<xref ref-type="bibr" rid="b4-sensors-12-12126">4</xref>]. These works are mostly based on the definition of a model and motion of the human body. To name some application areas, ToF-based systems have been used in tracking algorithms for the detection of moving people [<xref ref-type="bibr" rid="b5-sensors-12-12126">5</xref>], nose detection algorithms [<xref ref-type="bibr" rid="b6-sensors-12-12126">6</xref>], body gesture recognition [<xref ref-type="bibr" rid="b7-sensors-12-12126">7</xref>], hand tracking proposals [<xref ref-type="bibr" rid="b8-sensors-12-12126">8</xref>,<xref ref-type="bibr" rid="b9-sensors-12-12126">9</xref>], SSP to classify human postures [<xref ref-type="bibr" rid="b10-sensors-12-12126">10</xref>] and Ambient Assisted Living to detect falls [<xref ref-type="bibr" rid="b11-sensors-12-12126">11</xref>].</p>
<p>Unfortunately, current approaches do not provide a well-defined model to represent the semantic details of the data coming from new algorithms, such as relationships or constraints. The use of a conceptual model offers several advantages at a low cost. Formal models establish a common symbolic vocabulary to describe and communicate scene data while providing support for logic-based reasoning. Symbolic language is closer to human language, and therefore it is easy to interact with the system and to interpret its inputs and outputs. Reasoning, in turn, can be applied to check the consistency of the models and to infer additional knowledge from explicit information.</p>
<p>The formulation of models based on abstraction levels has led to the implementation of non-cohesive systems which are not able to communicate fluently among themselves. For this reason, it is necessary to provide new common and transverse knowledge layers among these levels, including new semantic relationships. The goal of this strategy is a close interaction among semantically similar layers for the automatic generation of new knowledge. With the advent of new sensors, we advocate the addition of a representation layer based on mereology and meronymy. Meronymy studies part-whole relations from a linguistics and cognitive science perspective. Mereology is a closely related concept, which concerns the formal ontological investigation of the part-whole relation and is formally expressed in terms of first-order logic. The idea of employing a part-based layer to support the statements of the scene object abstraction level in a cognitive architecture has been previously suggested by Pinz <italic>et al.</italic> [<xref ref-type="bibr" rid="b12-sensors-12-12126">12</xref>]. Our proposal goes further and seeks to provide a symbolic layer based on the formal definition, development patterns and implementation of part-whole relationships.</p>
<p>Symbolic data representations allow the development of cognitive models able to represent the complexity of the scene more accurately. These models can systematically analyze the knowledge of the scene to discover and describe data related to the activities carried out by a subject, fusing its representation with high-level context knowledge, that is, the set of circumstances surrounding a situation of interest that are potentially of relevance to its completion [<xref ref-type="bibr" rid="b13-sensors-12-12126">13</xref>]. A key part of such analysis is currently supported by the approaches that emerged from a cognitive view of traditional computer vision techniques. The ties between meronymy and the current qualitative approaches [<xref ref-type="bibr" rid="b14-sensors-12-12126">14</xref>,<xref ref-type="bibr" rid="b15-sensors-12-12126">15</xref>] in cognitive vision, which are mainly focused on a qualitative description of spatio-temporal aspects [<xref ref-type="bibr" rid="b16-sensors-12-12126">16</xref>], must be regarded as crucial to narrow the knowledge gap in activity recognition approaches.</p>
<p>This paper describes an ontology-based model for data acquired from recognition algorithms through light wave technology. This model is incorporated into a cognitivist [<xref ref-type="bibr" rid="b17-sensors-12-12126">17</xref>] (According to Vernon's definition “Cognitivism asserts that cognition involves computations defined over symbolic representations, in a process whereby information about the world is abstracted by perception, represented using some appropriate symbol set, reasoned about, and then used to plan and act in the world.”) framework for contextual fusion of 2D visual information previously proposed by our research group [<xref ref-type="bibr" rid="b18-sensors-12-12126">18</xref>–<xref ref-type="bibr" rid="b20-sensors-12-12126">20</xref>]. The cornerstone of the framework is an ontological model designed according to the Joint Directors of Laboratories (JDL) fusion model [<xref ref-type="bibr" rid="b21-sensors-12-12126">21</xref>] that represents sensor and context information arranged in several levels, from low-level tracking data to high-level situation knowledge. The ontological model has been designed to promote extensibility and modularity. Each ontology level provides a skeleton that includes general concepts and relations to describe very general computer vision entities and relations. A general taxonomy of part-whole relationships for computer vision is proposed. The relationships are distributed along the levels of the model according to their abstraction. Several general patterns based on transitive part-whole relationships are proposed to cover the representation of the data to the level of accuracy currently achieved and to improve the quality of the inference process.</p>
<p>To illustrate the functioning of the extended framework, a case study based on an SSP environment is presented. SSP aims at providing computers with the ability to sense and understand human social signals [<xref ref-type="bibr" rid="b22-sensors-12-12126">22</xref>]. The example depicts a novel application of structured light cameras for live market research. The goal is the formal representation of complex activity recognition and automatic reasoning through ontologies. The example incrementally describes the representation of activities through the presented model and the automatic structuring of event knowledge along the part-based level. Straightforward rules for a logic inference engine are attached to the example sections to demonstrate that the application is feasible.</p>
<p>The remainder of this article is organized as follows. Section 2 discusses theoretical issues in part-based representations. Section 3 includes an overall description of the new features added to our framework due to the use of novel sensors. Section 4 describes a symbolic layer which includes the proposal of a part-based taxonomy of properties for cognitive vision environments and a pattern which formalizes the representation of those which are transitive. The pattern is depicted using the human body structure extracted from novel sensors. Section 5 details the configuration of events of interest for data extraction and propagation. The implementation issues are revisited in Section 6. Section 7 depicts a live market research scenario to detect interesting situations in the SSP area. Section 8 summarizes the conclusions obtained and proposes some directions for future work.</p></sec>
<sec>
<label>2.</label>
<title>Theoretical Issues in Part-Based Representations</title>
<p>Meronymy has been a subject of research in linguistics, philosophy, and psychology.</p>
<p>From a philosophical point of view, parts have been characterized as single, universal and transitive relations used to model, among others, the spatio-temporal domain [<xref ref-type="bibr" rid="b23-sensors-12-12126">23</xref>]. This definition remains open to debate, since it was criticized using an axiomatic representation that treats part-of as a partial ordering relation [<xref ref-type="bibr" rid="b24-sensors-12-12126">24</xref>]. Afterwards, the representation was completed with the addition of new axioms [<xref ref-type="bibr" rid="b25-sensors-12-12126">25</xref>].</p>
<p>Representations of part-based relations are founded on the Ground Mereology theory. The Ground Mereology establishes three principles [<xref ref-type="bibr" rid="b26-sensors-12-12126">26</xref>]:
<list list-type="bullet">
<list-item>
<p>Reflexive: Everything is part of itself.</p>
<p>∀<italic>x</italic>(<italic>part_of</italic>(<italic>x, x</italic>))</p></list-item>
<list-item>
<p>Antisymmetric: Two distinct things cannot be part of each other.</p>
<p>∀<italic>x, y</italic>((<italic>part_of</italic>(<italic>x, y</italic>) ∧ <italic>part_of</italic>(<italic>y, x</italic>)) → <italic>x</italic> = <italic>y</italic>)</p></list-item>
<list-item>
<p>Transitive: Any part of any part of a thing is itself part of that thing.</p>
<p>∀<italic>x, y, z</italic>((<italic>part_of</italic>(<italic>x, y</italic>) ∧ <italic>part_of</italic>(<italic>y, z</italic>)) → <italic>part_of</italic>(<italic>x, z</italic>))</p></list-item></list></p>
<p>These principles have been a source of discussions in meronymy due to the need to consider different kinds of part-whole relations and because some of them must be intransitive. Some examples can be found in [<xref ref-type="bibr" rid="b27-sensors-12-12126">27</xref>].</p>
<p>The variety of semantic senses in part-whole relations drove researchers to look for a collection of part-whole relations. Winston <italic>et al.</italic> [<xref ref-type="bibr" rid="b28-sensors-12-12126">28</xref>] developed a taxonomy founded on three linguistic and logical characteristics: functional, homeomerous and separable. These characteristics define a set of six meronymic relations: component-integral object, member-collection, portion-mass, stuff-object, feature-activity and place-area.</p>
<p>Keet <italic>et al.</italic> [<xref ref-type="bibr" rid="b29-sensors-12-12126">29</xref>] proposed a formal taxonomy of part-whole relations (see <xref ref-type="fig" rid="f1-sensors-12-12126">Figure 1</xref>) which implements a compromise solution for the “ontologically-motivated relations useful for conceptual modeling up to the minimum level of distinctions”. This taxonomy is particularly relevant since the properties are defined using categories of the DOLCE [<xref ref-type="bibr" rid="b30-sensors-12-12126">30</xref>] upper ontology. The taxonomy by Keet <italic>et al.</italic> is extended in Section 4.1 to be applied in cognitive vision environments.</p>
<p>Interestingly enough, connectedness is a fundamental concept shared between the foundations of mereological and topological theories. As shown in mereotopological approaches [<xref ref-type="bibr" rid="b31-sensors-12-12126">31</xref>], topology can be defined as a domain-specific subtheory of mereology, or mereology can be defined as a subtheory with topology as the primary theory. An example of the latter is the theory developed by Randell <italic>et al.</italic> [<xref ref-type="bibr" rid="b14-sensors-12-12126">14</xref>], who propose the Region Connection Calculus (RCC). RCC defines the part-of relation in terms of the connection relation. RCC is an axiomatization of certain spatial concepts and relations in first-order logic. The basic theory assumes just one primitive dyadic relation: C(x, y), read as “x connects with y”. Individuals (x, y) can be interpreted as denoting spatial regions. The relation C(x, y) is reflexive and symmetric. The set of relations comprising Disconnected (DC), Externally Connected (EC), Partially Overlaps (PO), Equal (EQ), Tangential Proper Part (TPP), Non-Tangential Proper Part (NTPP), Tangential Proper Part Inverse (TPPi) and Non-Tangential Proper Part Inverse (NTPPi) (see <xref ref-type="fig" rid="f2-sensors-12-12126">Figure 2</xref>) has been proven to be jointly exhaustive and pairwise disjoint, and is known as RCC-8. Similar sets of one, two, three and five relations are known as RCC-1, RCC-2, RCC-3 and RCC-5.</p>
<p>Current capabilities in computer vision systems do not allow an easy recognition of mereological relationships from spatial inclusion assertions. Topological relationships between two entities, for example, TPP, NTPP, EQ or PO relations, are essential cues to detect part-whole patterns; however, it is also necessary to detect a connection relation between the content and the container. On the other hand, we advocate the combined use of spatial and mereological knowledge at different levels. A separate definition of theories can be used to classify and assert new knowledge. A clear example is the classification of subactivities. The spatial context of a subactivity can determine the relationship with the overall activity. Comparing products in the supermarket is part of shopping; however, comparing products can be part of cooking if the subject is in a kitchen. Sections 5.1 and 6.1 present a practical approach to the combination of topological and mereological relations and their implementation in our system.</p></sec>
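<p>To make the eight RCC-8 relations concrete, the following minimal sketch classifies the relation between two regions under a strong simplifying assumption: each region is a closed disk given as a (cx, cy, r) tuple. The disk model, the function name <monospace>rcc8</monospace> and the tolerance parameter are illustrative assumptions; RCC itself is defined axiomatically over arbitrary regions and does not prescribe this computation.</p>
<preformat>
```python
import math


def rcc8(a, b, tol=1e-9):
    """Return the RCC-8 relation of disk a with respect to disk b.

    Each disk is a (cx, cy, r) tuple with r > 0 (an assumed region model).
    """
    (ax, ay, ar), (bx, by, br) = a, b
    d = math.hypot(ax - bx, ay - by)  # distance between the centres

    def same(u, v):
        return math.isclose(u, v, abs_tol=tol)

    if same(d, ar + br):
        return "EC"    # boundaries touch from the outside
    if d > ar + br:
        return "DC"    # no shared point
    if same(d, 0.0) and same(ar, br):
        return "EQ"    # identical regions
    if same(d, abs(ar - br)):
        # One disk lies inside the other and touches its boundary.
        return "TPP" if br > ar else "TPPi"
    if abs(ar - br) > d:
        # One disk lies strictly inside the other.
        return "NTPP" if br > ar else "NTPPi"
    return "PO"        # interiors overlap, neither contains the other
```
</preformat>
<p>Exactly one branch fires for any pair of disks, which mirrors the jointly exhaustive and pairwise disjoint character of RCC-8. For instance, <monospace>rcc8((1, 0, 1), (0, 0, 2))</monospace> yields TPP, because the smaller disk lies inside the larger one and touches its boundary.</p>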
<sec>
<label>3.</label>
<title>Ontology-Based Computer Vision Model and Light Wave Technology Integration</title>
<p>The representation of the new sensor data has been integrated into the framework for computer vision representation presented in [<xref ref-type="bibr" rid="b20-sensors-12-12126">20</xref>]. This framework is based on an ontological model for the representation of context and scene entities. The ontological model is organized into several levels compliant with the Joint Directors of Laboratories (JDL) model for Information Fusion [<xref ref-type="bibr" rid="b21-sensors-12-12126">21</xref>]. Each layer includes general concepts and properties to describe computer vision entities and relations at a different abstraction level. Concepts that belong to a less abstract ontology are the building blocks of concepts corresponding to a more abstract ontology. The currently implemented levels are:
<list list-type="bullet">
<list-item>
<p>Tracking Entities level, to model input data coming from the tracking algorithms: track information (color, position, speed) and frames (to support the temporal consistency).</p></list-item>
<list-item>
<p>Scene Objects level, to model real-world entities, properties, and relations: moving and static objects, topological relations, <italic>etc.</italic></p></list-item>
<list-item>
<p>Activities level, to model behavior descriptions: grouping, approaching, picking an object, and so forth.</p></list-item></list></p>
<p>The model has been designed to promote extensibility and modularity. This means that the general structure can be refined to apply this model to a specific domain. Local adaptations should not cause cascade changes in the rest of the structure.</p>
<p>Ontologies may contain both perceptual and context data. Perceptual data is automatically extracted by tracking algorithms, while the context data is external knowledge used to complete the comprehension of the scene. For example, the description of a sensorised static object—size, position, type of object, and so on—is regarded as context data.</p>
<p>Some changes are needed to model tracking data coming from novel devices. The priority when making these changes is to maintain compatibility with the previous approach. The ontologies of the initial framework have been extended to include support for light wave data:
<list list-type="bullet">
<list-item>
<p>An additional Euclidean dimension for the depth position of recognized objects. This is easily achieved by relying on the qualia approach [<xref ref-type="bibr" rid="b30-sensors-12-12126">30</xref>] used in the original ontology model to represent properties and property values.</p></list-item>
<list-item>
<p>A new definition of the concepts that represent human entities in the scene. Essentially, the current description of a subject in the scene, represented by the 
<monospace>Person</monospace> concept, is now associated with a description of anatomical joints and limbs. This description has been formalized according to existing patterns to represent part-whole relations with ontologies and current ToF-based computer vision models for articulated bodies.</p></list-item></list></p>
<p>The introduction of new devices requires upgrading the capacity of spatial representation in the model from two to three dimensions. These changes concern both perceptual data captured by light wave cameras and context data representing physical objects. The previous model followed the qualia approach used in the upper ontology DOLCE [<xref ref-type="bibr" rid="b30-sensors-12-12126">30</xref>]. This modeling pattern distinguishes between properties themselves and the space in which they take values. The values of a quality—e.g., 
<monospace>Position</monospace>—are defined within a certain conceptual space—e.g., 
<monospace>2DPoint</monospace>. To adapt the ontology-based model to this new quality space, the 
<monospace>3DPoint</monospace> concept, which represents a position using three coordinates, is included as a subclass of 
<monospace>PositionValueSpace</monospace>, which represents the space of values of the physical positions.</p>
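<p>The separation between a quality and its value space can be sketched outside the ontology as follows. This is only an illustration under stated assumptions: the framework is ontology-based, whereas the sketch uses Python classes, and the names <monospace>Point2D</monospace> and <monospace>Point3D</monospace> stand in for the <monospace>2DPoint</monospace> and <monospace>3DPoint</monospace> concepts because Python identifiers cannot start with a digit. The <monospace>TrackedObject</monospace> class and the <monospace>depth_of</monospace> helper are hypothetical.</p>
<preformat>
```python
from dataclasses import dataclass
from typing import Optional


class PositionValueSpace:
    """Space of values that the Position quality can take."""


@dataclass(frozen=True)
class Point2D(PositionValueSpace):
    x: float
    y: float


@dataclass(frozen=True)
class Point3D(PositionValueSpace):
    x: float
    y: float
    z: float  # depth coordinate supplied by a light wave camera


@dataclass
class Position:
    """The quality itself, kept separate from the value it takes."""
    value: PositionValueSpace


@dataclass
class TrackedObject:
    """Hypothetical holder for an entity that has a Position quality."""
    name: str
    position: Position


def depth_of(obj: TrackedObject) -> Optional[float]:
    """Return the depth if the position lives in the 3D value space."""
    return getattr(obj.position.value, "z", None)


# Objects observed by a 2D camera and by a depth sensor share one model.
shelf = TrackedObject("shelf", Position(Point2D(4.0, 2.5)))
person = TrackedObject("person", Position(Point3D(1.0, 0.5, 3.2)))
```
</preformat>
<p>Because the new value space is added as a further subclass, code written against the 2D space keeps working, which is the backward-compatibility argument made in the text.</p>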
<p>Current Kinect™ algorithms are able to detect real-world entities, e.g., a person, including data related to the human limbs and joints. Our ontology-based model represents these kinds of real-world data at the scene object level. However, these data also include low-level information that should be represented as tracking entities to support the scene object assertions. The tracking entities level has been adapted to represent low-level data of human limbs and joints (position, size, kinematic state, and so on), and this information is associated with the 
<monospace>Person</monospace> concept declared in the scene object level. The inclusion of limbs and joints is compliant with the previous version of the tracking entities ontology. The applied part-whole pattern (see Section 4.2) preserves backward compatibility. In fact, this model can combine 2D monocular cameras and light wave devices using the same set of ontologies.</p></sec>
<sec>
<label>4.</label>
<title>Part-Based Symbolic Layer for Cognitive Vision Approaches</title>
<p>This section presents a part-based taxonomy of properties for cognitive vision environments based on some approaches discussed in Section 2. Afterwards a general ontology-based pattern to represent the transitive properties of the taxonomy is explained. To illustrate this pattern we have chosen the semantics of the human body and its parts. Thereby we fulfill the dual purpose of explaining the general pattern and its application to exploit the detection of human body structures using novel devices.</p>
<sec>
<label>4.1.</label>
<title>Part-Based Taxonomy of Properties for Cognitive Vision Environments</title>
<p>The identification of the underlying characteristics presented in Section 2 makes it possible to discriminate between several kinds of part properties. The characteristics by Winston <italic>et al.</italic> are appropriate for cognitive vision representation because they are mainly supported by spatio-temporal foundations. However, this set of characteristics is too small and does not allow a wide specialization of properties. Thus we have also taken into account the classification by Opdahl <italic>et al.</italic> [<xref ref-type="bibr" rid="b32-sensors-12-12126">32</xref>] (see <xref ref-type="table" rid="t1-sensors-12-12126">Table 1</xref>).</p>
<p>The resulting classification is focused on properties which can be projected as spatial and temporal concepts captured by visual devices. <xref ref-type="fig" rid="f3-sensors-12-12126">Figure 3</xref> shows the proposed taxonomy, taking into account the spatio-temporal aspects in vision-based systems. We carry out an analysis based on the characteristics of part properties. This analysis only considers the general characteristics of each property. We do not offer an exhaustive list of characteristics for each property because some of them do not characterize the property. The current classification can be reconsidered for a specific specialization according to a particular domain. All the properties are considered to meet the Ground Mereology principles, with the possible exception of transitivity.</p>
<p>Component/Integral object (
<monospace>componentOf</monospace>): This is a functional, separable, resultant and transitive property. The property is relevant for unidentified entities and scene objects. Thus it is mandatory to define a set of subactivities in which the part can intervene. There are two subtypes: (i) Essential/Integral object (
<monospace>essentialComponentOf</monospace>) parts are those that are critical to identifying a whole, for example, the chest of a body. Their characteristics, in addition to the inherited ones, are: mandatory, existential dependency and immutable; (ii) Dispensable/Integral object (
<monospace>dispensableComponentOf</monospace>) parts are those that are not crucial for recognition. Following the previous example, a hand can be regarded as a dispensable component for body recognition. Their corresponding characteristics are: optional and mutable.</p>
<p>Member/Collection (
<monospace>memberOf</monospace>): This property aims to redefine the identity of an entity through its assimilation to a group. The necessary characteristics of this property are separability, optionality, mutability and shareability. Generally this property is intransitive when it is used for abstract sets of membership, for example, when a person is part of an organization. The subproperties are specialized at the spatio-temporal level, where they can be detected according to proximity measures or similar kinematic features: (i) Physical member/Subgroup (
<monospace>physicalMemberOf</monospace>), which meets the mandatory characteristic because the parts can only be scene objects corresponding to context data or detected entities with physical features; (ii) Physical Subgroup/Group (
<monospace>physicalSubGroupOf</monospace>), which meets the transitivity, homeomerousity and mandatory characteristics because the parts can only be clusters of physical members.</p>
<p>Thing/Surroundings (
<monospace>settledIn</monospace>): This property defines a content relationship and an invariant connection between the part and the whole. It is only applicable between objects and entities with spatial or temporal representation. The general characteristics of this property are: homeomerousity, invariance, optional, immutability, shareability and intransitivity. The transitive, mandatory and existential dependency subproperties are: (i) Content/Volume (
<monospace>containedIn</monospace>) is exclusively used by spatial representations based on 3D points; (ii) Place/Area (
<monospace>locatedIn</monospace>) is exclusively used by spatial representations based on 2D points; (iii) Subinterval/Interval (
<monospace>intervalOf</monospace>) is used by temporal representations based on time intervals.</p>
<p>Object (Subject)/Subactivity (
<monospace>involvedIn</monospace>): This intransitive property defines the subjects that are involved in an activity. Its characteristics are functional, non-homeomerous, separable, optional and sharable. Objects and subjects with functional part properties in their definition are the main candidates to instantiate this property. The identified subproperties are not based on any characteristic but on our knowledge about activity recognition: (i) Active Object/Subactivity (
<monospace>activelyInvolvedIn</monospace>) is instantiated when the object performs the activity; (ii) Passive Object/Subactivity (
<monospace>passivelyInvolvedIn</monospace>) is instantiated when the object is passively involved in the activity.</p>
<p>Subactivity/Activity (
<monospace>participatesIn</monospace>): This property represents the relation between simple activities and the more complex activities in which they participate. The main characteristics of this property are: functional, separable, homeomerous, transitive and sharable. The property can be divided into: (i) Essential Subactivity/Activity (
<monospace>essentialSubActivityOf</monospace>) if the subactivity is mandatory for the recognition of a more complex activity. Its specific characteristics are: mandatory, existential dependency and immutability; (ii) Dispensable Subactivity/Activity (
<monospace>dispensableSubActivityOf</monospace>) if the subactivity is not crucial to recognize a more complex activity. Its specific characteristics are: optional and mutability.</p>
<p>Portion/Mass (
<monospace>portionOf</monospace>): Necessary characteristics of this property are: homeomerousity, separability and intransitivity. Two transitive subproperties have been identified: (i) Proportion/Measure (
<monospace>proportionOf</monospace>) if the property is countable with a spatio-temporal measure. For example, a second is the sixtieth part of a minute. The corresponding characteristics are: functional, mandatory and existential dependency; (ii) Subquantity/Quantity (
<monospace>quantityOf</monospace>) if there does not exist a visual proportion between the part and the whole. For instance, the part of the water spilled from a cup. The inherent characteristic of this subtype is mandatory.</p>
<p>Stuff/Object (
<monospace>madeOf</monospace>): The constituent material can help to identify an object, avoiding false positives in the entity detection process. This property is typically used in part-based taxonomies; however, it cannot be detected within the scope of vision systems.</p>
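<p>The taxonomy described above can be summarized as a small lookup table. The following is a minimal sketch, assuming a plain dictionary encoding; only the property names and the transitivity statements come from the text of this section, while the table layout and the helper functions are illustrative assumptions. A value of <monospace>None</monospace> means that this section does not state the characteristic explicitly for that property.</p>
<preformat>
```python
# name -> (parent property, transitive?)
# None in the second slot: not stated explicitly in this section.
TAXONOMY = {
    "componentOf": (None, True),
    "essentialComponentOf": ("componentOf", None),
    "dispensableComponentOf": ("componentOf", None),
    "memberOf": (None, False),  # "generally" intransitive per the text
    "physicalMemberOf": ("memberOf", False),
    "physicalSubGroupOf": ("memberOf", True),
    "settledIn": (None, False),
    "containedIn": ("settledIn", True),
    "locatedIn": ("settledIn", True),
    "intervalOf": ("settledIn", True),
    "involvedIn": (None, False),
    "activelyInvolvedIn": ("involvedIn", None),
    "passivelyInvolvedIn": ("involvedIn", None),
    "participatesIn": (None, True),
    "essentialSubActivityOf": ("participatesIn", None),
    "dispensableSubActivityOf": ("participatesIn", None),
    "portionOf": (None, False),
    "proportionOf": ("portionOf", True),
    "quantityOf": ("portionOf", True),
    "madeOf": (None, None),  # not detectable by vision systems
}


def transitive_properties(taxonomy=TAXONOMY):
    """Names of the properties explicitly described as transitive."""
    return sorted(
        name for name, (_, transitive) in taxonomy.items() if transitive is True
    )


def subproperties(name, taxonomy=TAXONOMY):
    """Direct subproperties of the given property."""
    return sorted(
        child for child, (parent, _) in taxonomy.items() if parent == name
    )
```
</preformat>
<p>The table makes one design point of the taxonomy visible: an intransitive property such as <monospace>settledIn</monospace> or <monospace>portionOf</monospace> can still have transitive subproperties.</p>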
<p>Some other characteristics from the classification by Opdahl <italic>et al.</italic> have not been mentioned because they are already defined in the Winston <italic>et al.</italic> set of properties (e.g., abstraction and homeomerousity), have the same name but a different meaning (e.g., separability) or are not general (e.g., shareability). It is interesting to note that shareability can be seen as a cardinality restriction for specific cases of some relationships. For example, a chest can only be part of one body. These kinds of situations become a problem if the relationship is transitive. In Section 4.2 we present a pattern to manage the semantics of these situations.</p>
<p>Some of the properties shown in the previous taxonomy are intransitive, for example, 
<monospace>involvedIn</monospace> and 
<monospace>physicalMemberOf</monospace>. Sometimes there are complementary transitive relations that can be used to propagate a property along another property. The corresponding properties of the previous examples would be 
<monospace>participatesIn</monospace> and 
<monospace>physicalSubGroupOf</monospace>. To illustrate this, let us suppose that a person is a physical member of a group and that the same group is part of a bigger group. This procedure only requires declaring the 
<monospace>physicalMemberOf</monospace> property along the 
<monospace>physicalSubGroupOf</monospace> property to automatically assert that a person is a physical member of the bigger group. A broader, closely related view of this issue is given by the table developed in [<xref ref-type="bibr" rid="b33-sensors-12-12126">33</xref>], which defines the conditions for the overall set of transitive interactions between different types of properties.</p></sec>
<sec>
<label>4.2.</label>
<title>General Model for Ontology-Based Human Skeleton Representation</title>
<p>There are several existing ontologies designed to share and reason with structured data representing human anatomy [<xref ref-type="bibr" rid="b34-sensors-12-12126">34</xref>]. Unfortunately, these ontologies have been developed in biomedical environments and define a complex conceptualization that does not suit our needs. There are also other ontologies that represent the human body in a more simplified way [<xref ref-type="bibr" rid="b35-sensors-12-12126">35</xref>]; however, these ontologies are not designed to deal with sensor data in a cognitive environment. A general pattern based on part-whole relationships is proposed to cover the semantic representation of data captured using light wave sensors. The designed ontology adapts the patterns presented in [<xref ref-type="bibr" rid="b36-sensors-12-12126">36</xref>] and follows the conceptualization of articulated bodies shown in [<xref ref-type="bibr" rid="b37-sensors-12-12126">37</xref>] while keeping compatibility with DOLCE. Our proposal can be broadly adapted to other fields.</p>
<p>Real-world knowledge achieves a more comprehensive representation when it is organized through mereological relationships. A clear example is how the human mind divides the structure of a body into subjective parts. The current capabilities of the Kinect™ skeletal view (see <xref ref-type="fig" rid="f4-sensors-12-12126">Figure 4</xref>, reproduced from <ext-link xlink:href="http://embodied.waag.org" ext-link-type="uri">http://embodied.waag.org</ext-link>) allow the description of a detected person in terms of two kinds of attributes: (i) body members (hands, feet, thighs, and so forth); (ii) joints (shoulders, elbows, wrists, knees, and so forth). A conceptualization of the attributes detected and the limbs composed of these attributes is represented in the tracking entities level. The resulting concepts represent the parts of the human body which are embodied in the 
<monospace>Person</monospace> concept.</p>
<p>The properties named below (
<monospace>partOf</monospace> and 
<monospace>partOf</monospace>_
<monospace>directly</monospace>) correspond to the 
<monospace>componentOf</monospace> subtype of properties. The names have been modified to present the pattern in a general way since it can be applied to the rest of properties defined in Section 4.1.</p>
<p>Two properties are used to represent the part-whole relationships: (i) 
<monospace>partOf</monospace>; (ii) 
<monospace>partOf</monospace>_
<monospace>directly</monospace>—a 
<monospace>partOf</monospace> subproperty. 
<monospace>partOf</monospace> is a transitive property whose goal is establishing the correspondences between the parts and all the entities containing them. 
<monospace>partOf</monospace>_
<monospace>directly</monospace> defines the subjective relation between a part and the next direct level of composed entities. These properties are necessary since cardinality restrictions over transitive properties, such as 
<monospace>partOf</monospace>, are not allowed by OWL-DL. Therefore, 
<monospace>partOf</monospace>_
<monospace>directly</monospace> is used to define restrictions to maintain cardinality consistency, and 
<monospace>partOf</monospace> is used to infer both direct and indirect parts by means of transitivity and 
<monospace>partOf</monospace>_
<monospace>directly</monospace> property instances.</p>
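<p>The division of labour between the two properties can be sketched outside OWL. The following Python fragment is only an illustration under stated assumptions: the individual names and the helper functions <monospace>parts_with_several_direct_wholes</monospace> and <monospace>infer_part_of</monospace> are hypothetical and do not belong to the implemented ontology or to any reasoner API. The cardinality check is applied to the direct assertions only, and the transitive relation is derived from them.</p>
<preformat>
```python
from collections import Counter

# Hypothetical direct part-whole assertions (part, whole), mirroring partOf_directly.
PART_OF_DIRECTLY = {
    ("hand", "upperLimb"),
    ("forearm", "upperLimb"),
    ("upperLimb", "person"),
    ("head", "person"),
}


def parts_with_several_direct_wholes(direct):
    """Return the parts asserted as direct parts of more than one whole."""
    counts = Counter(part for part, _ in direct)
    return sorted(part for part, n in counts.items() if n > 1)


def infer_part_of(direct):
    """Derive partOf as the transitive closure of the direct assertions."""
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        for part, whole in list(closure):
            for inner, outer in list(closure):
                if whole == inner and (part, outer) not in closure:
                    closure.add((part, outer))
                    changed = True
    return closure


PART_OF = infer_part_of(PART_OF_DIRECTLY)
```
</preformat>
<p>In this sketch the pair (hand, person) appears in the derived relation but not among the direct assertions, so a restriction such as "exactly 1" can be evaluated without ever counting indirect wholes.</p>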
<p>The previous ontology is extended with classes to represent direct parts—e.g., 
<monospace>PersonPartDirectly</monospace>—and the overall set of part-whole relationships—e.g., 
<monospace>PersonPart</monospace>. 
<monospace>PersonPartDirectly</monospace> subsumes direct parts of a 
<monospace>Person</monospace> such as 
<monospace>Head, UpperLimb</monospace> and 
<monospace>LowerLimb</monospace>. The classes hosting direct parts state existential range restrictions over 
<monospace>partOf</monospace>_
<monospace>directly</monospace> properties—e.g., 
<monospace>partOf</monospace>_
<monospace>directly some Person</monospace>. On the other hand 
<monospace>PersonPart</monospace> subsumes the set of parts of the 
<monospace>Person</monospace> concept. In this case, the direct parts of an 
<monospace>UpperLimb</monospace> concept, namely 
<monospace>Arm, Forearm, Hand, Shoulder, Elbow</monospace> and 
<monospace>Wrist</monospace>, are classified as subclasses of 
<monospace>PersonPart</monospace>; however they are not considered subclasses of 
<monospace>PersonPartDirectly</monospace>. The classes hosting direct and non-direct parts state existential range restrictions over 
<monospace>partOf</monospace> properties—e.g., 
<monospace>partOf some Person</monospace>.</p>
<p>To improve consistency, cardinality restrictions (exactly 1) are stated over 
<monospace>partOf</monospace>_
<monospace>directly</monospace> as necessary conditions in the concepts corresponding to body members and joints. This means “a part only belongs directly to the next level entity and just to that entity”.</p>
<p>The combined use of the part properties and the restricted classes leads reasoners to automatically infer new taxonomies based on part-whole relationships. <xref ref-type="fig" rid="f5-sensors-12-12126">Figure 5</xref> illustrates an example of a taxonomy inferred from an explicitly stated taxonomy. Unfortunately, adding cardinality restrictions on each concept could significantly affect the performance of the reasoner. Some other configurations for this pattern are possible and also valid. This implementation tries to reduce the classification time while complying with the semantics of the human body domain.</p>
<p>Considering the combination of the taxonomy presented in Section 4.1 and the pattern above, we obtain a taxonomy to tackle the spatio-temporal issues of a cognitive vision system. <xref ref-type="fig" rid="f6-sensors-12-12126">Figure 6</xref> shows the implemented taxonomy. Notice that some of the transitive properties do not include a direct property because it is implicit when the superproperty is transitive; for example, 
<monospace>dispensableComponentOf</monospace> and 
<monospace>essentialComponentOf</monospace> are regarded as direct properties because 
<monospace>componentOf</monospace> is transitive. Each subtaxonomy of properties is assigned to one or several levels forming a transverse layer through the model shown at the beginning of Section 3.</p>
<p>The classification of joints is inspired by the virtual model shown in [<xref ref-type="bibr" rid="b37-sensors-12-12126">37</xref>]. There are three types of joints (see <xref ref-type="fig" rid="f7-sensors-12-12126">Figure 7</xref>) depending on the degrees of freedom (DoF): (i) 
<monospace>UniversalJoint</monospace>, three DoF; (ii) 
<monospace>HingeJoint</monospace>, one DoF and two restricted DoF; (iii) 
<monospace>EllipticJoint</monospace>, three restricted DoF. Joint concepts store important data such as the articulated body members and the angle between them. These data are essential to maintain consistency and to improve the semantic capacity of the model.</p>
<p>The model is designed by taking into account future changes in the granularity of the obtained data. Data from new devices able to offer a more accurate definition of the body members (e.g., the fingers of a hand) can be easily accommodated. The larger the number of levels in the model, the greater the amount of data inferred. More details and additional information about data described in this section can be found on the authors' web page [<xref ref-type="bibr" rid="b42-sensors-12-12126">42</xref>].</p></sec></sec>
<sec sec-type="methods">
<label>5.</label>
<title>Part-Based Data Extraction and Propagation</title>
<p>A considerable amount of implicit knowledge surrounds part-based approaches. This knowledge should be extracted and used as a basis for cognitivist models, both to improve their semantic richness and to robustly justify the reasoning over the knowledge base.</p>
<sec>
<label>5.1.</label>
<title>Explicating Hidden Relationships Between Subclasses, Parts and Locations</title>
<p>The research by Winston <italic>et al.</italic> [<xref ref-type="bibr" rid="b28-sensors-12-12126">28</xref>] shows how implicit relationships can be found using deductive reasoning based on syllogisms. A syllogism is a kind of logical argument in which one proposition is inferred from two or more premises. The study concludes that there is a hierarchical ordering among class inclusion, mereological inclusion and spatial inclusion, which implies that “syllogisms are valid if and only if the conclusion expresses the lowest relation appearing in the premises”. A huge quantity of implicit relations can emerge from these inferences. The following example illustrates these assertions:
<list list-type="simple">
<list-item>
<p>
<monospace>(1a)</monospace> Peter is a physical member of a tourist group. (Mereological inclusion)</p></list-item>
<list-item>
<p>
<monospace>(1b)</monospace> The tourist group is in the shop. (Spatial inclusion)</p></list-item>
<list-item>
<p>
<monospace>(1c)</monospace> Peter is in the shop. (Spatial inclusion)</p></list-item></list></p>
<p>Ontologies offer several advantages for carrying out this kind of deductive reasoning: (i) the hierarchical structure of ontologies is strongly related to the idea of class inclusion, since terminological boxes represent concepts as general classes which host more specific or specialized classes; (ii) the mereological patterns to represent and reason with parts and the support of current reasoners for qualitative spatial approaches [<xref ref-type="bibr" rid="b38-sensors-12-12126">38</xref>] provide the semantic support to apply this kind of argument; (iii) the OWL 2 construct 
<monospace>ObjectPropertyChain</monospace> allows a property to be defined as the composition of several properties. Compositions make it possible to propagate a property (e.g., 
<monospace>placedIn</monospace>) along another property (e.g., 
<monospace>partOf</monospace>). The previously described syllogism is automatically handled by the following statement:
<list list-type="simple">
<list-item>
<p>
<monospace>SubObjectPropertyOf( ObjectPropertyChain(:partOf :placedIn) :placedIn)</monospace>
<list list-type="simple">
<list-item>
<p>(Composition feature in OWL 2. <ext-link xlink:href="http://www.w3.org/2007/OWL/wiki/New_Features_and_Rationale#F8:_Property_Chain_Inclusion" ext-link-type="uri">http://www.w3.org/2007/OWL/wiki/New_Features_and_Rationale#F8:_Property_Chain_Inclusion</ext-link> Last accessed 12 April 2012)</p></list-item></list></p></list-item></list></p>
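<p>The effect of this composition on the example syllogism can be reproduced with a small Python sketch. It is an illustration under assumptions rather than the reasoner's actual procedure: the individual names and the functions <monospace>compose</monospace> and <monospace>propagate_placed_in</monospace> are hypothetical, and the relations are modelled as plain sets of pairs.</p>
<preformat>
```python
def compose(first, second):
    """Return every pair (x, z) such that (x, y) is in first and (y, z) is in second."""
    return {(x, z) for x, y in first for y2, z in second if y == y2}


def propagate_placed_in(part_of, placed_in):
    """Apply the chain partOf followed by placedIn until no new placedIn pair appears."""
    inferred = set(placed_in)
    while True:
        new_pairs = compose(part_of, inferred) - inferred
        if not new_pairs:
            return inferred
        inferred |= new_pairs


# Premises (1a) and (1b); the individual names are illustrative only.
PART_OF = {("peter", "touristGroup")}
PLACED_IN = {("touristGroup", "shop")}

INFERRED_PLACED_IN = propagate_placed_in(PART_OF, PLACED_IN)
```
</preformat>
<p>With these premises, the inferred relation contains the pair (peter, shop), which corresponds to conclusion (1c). The loop repeats the composition so that longer chains of parts are also covered when the part relation has not been closed transitively beforehand.</p>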
<p><xref ref-type="table" rid="t2-sensors-12-12126">Table 2</xref> [<xref ref-type="bibr" rid="b39-sensors-12-12126">39</xref>] shows the hierarchical ordering of the syllogisms described through property composition. Notice that the compositions on the main diagonal of the table do not need to be declared since the properties are transitive.</p></sec>
<sec>
<label>5.2.</label>
<title>Automatic Data Propagation of Events of Interest</title>
<p>Sometimes the knowledge originating in an entity component should be represented as knowledge directly attributable to the overall entity. A pattern for data propagation along the parts and to the whole can be deployed based on the pattern explained in Section 5. Another pattern from [<xref ref-type="bibr" rid="b36-sensors-12-12126">36</xref>] is adapted to distribute the data concerning the events occurring in the human body members. This pattern requires: (i) the creation of the 
<monospace>hasEvent</monospace> property, which indicates that a subject is the source of an event (this property can also be specialized to address more specific events); (ii) new classes, e.g., 
<monospace>EventInBody</monospace> or 
<monospace>EventInUpperLimb</monospace>, to classify events, which comprise all the events carried out by the body and its parts; (iii) the characterization of the 
<monospace>partOf</monospace> property as reflexive. As shown in Section 2, reflexivity is one of the principles of Ground Mereology and dictates that “everything is part of itself”. This principle allows whole entities to be included in the taxonomy of parts. This causes the subsumption of the 
<monospace>Person</monospace> concept by the 
<monospace>PersonPart</monospace> class.</p>
<p>Classes which host instances of events state existential range restrictions over 
<monospace>hasEvent</monospace> properties, for example, 
<monospace>EventInBody</monospace> declares the restriction 
<monospace>hasEvent someValuesFrom PersonPart</monospace> (someValuesFrom restriction. <ext-link xlink:href="http://www.w3.org/TR/2004/REC-owl-features-20040210/#someValuesFrom" ext-link-type="uri">http://www.w3.org/TR/2004/REC-owl-features-20040210/#someValuesFrom</ext-link> Last accessed 08 May 2012) and 
<monospace>EventInUpperLimb</monospace> states 
<monospace>hasEvent <italic>someValuesFrom</italic> UpperLimbPart</monospace>. To illustrate this, let us suppose the detection of an event in a hand. After the instantiation of the event and the corresponding property 
<monospace>hasEvent</monospace>, the reasoner propagates the event to the 
<monospace>EventInBody</monospace> and 
<monospace>EventInUpperLimb</monospace> classes. Thereby, events are classified by following an organization refined by anatomical levels. In addition, this pattern represents the affirmation “an event carried out by a person is an event executed by the person or any of its parts”.</p>
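<p>The classification step described above can be mimicked with a minimal Python sketch. The subsumption table is an assumption made for illustration: it lists, for a few part classes, the part-taxonomy classes taken to subsume them, including the whole itself by reflexivity. The function <monospace>classify_event</monospace> is hypothetical and only imitates what a reasoner would derive from the existential restrictions.</p>
<preformat>
```python
# Assumed subsumers of some body part classes in the taxonomy of parts.
# By reflexivity, a whole is also classified under its own part class.
SUBSUMERS = {
    "Hand": {"UpperLimbPart", "PersonPart"},
    "UpperLimb": {"UpperLimbPart", "PersonPart"},
    "Head": {"PersonPart"},
    "Person": {"PersonPart"},
}

# Event classes and the filler of their hasEvent someValuesFrom restriction.
EVENT_CLASSES = {
    "EventInUpperLimb": "UpperLimbPart",
    "EventInBody": "PersonPart",
}


def classify_event(source_class):
    """Return the event classes whose restriction filler subsumes the source class."""
    subsumers = SUBSUMERS[source_class]
    return sorted(name for name, filler in EVENT_CLASSES.items() if filler in subsumers)
```
</preformat>
<p>Under these assumptions, an event whose source is a hand is placed in both event classes, whereas an event whose source is the head is placed only in the body-level class.</p>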
<p>This approach can be extended using a composition between the properties 
<monospace>componentOf</monospace> and 
<monospace>participatesIn</monospace>. Based on the relationship between an event and a body part, the relationships between the event and the higher-order parts that contain that body part are automatically inferred. The following example syllogism and <xref ref-type="fig" rid="f8-sensors-12-12126">Figure 8</xref> depict this extension:
<list list-type="simple">
<list-item>
<p>
<monospace>(2a)</monospace> Upper limb is component of Robert. (Explicit)</p></list-item>
<list-item>
<p>
<monospace>(2b)</monospace> Robert's upper limb participates in embraces a lamp. (Explicit)</p></list-item>
<list-item>
<p>
<monospace>(2c)</monospace> Robert participates in embraces a lamp. (Conclusion)</p></list-item></list></p></sec></sec>
<sec>
<label>6.</label>
<title>Implementation</title>
<p>The architecture presented in Section 3 has been implemented as a system prototype. The system has three basic inputs: a variable amount of a priori knowledge, sensor data coming from different information sources and data formalisms represented with ontologies. The ontologies include a set of terminological boxes (TBoxes), each of which contains sentences describing concept hierarchies. In turn, an assertional box (ABox) contains facts about individuals of the domain of discourse. These TBoxes make up the structure of the vision-based AmI symbolic representation. The ABoxes of these levels are filled with assertions from predefined context knowledge, previous inferences and sensor data.</p>
<p>The overall system is based on the RACER (Racer Systems GmbH &amp; Co. KG. <ext-link xlink:href="http://www.racer-systems.com/" ext-link-type="uri">http://www.racer-systems.com/</ext-link> Last accessed 05 April 2012) reasoner. The reasoner hosts the levels of the ontology-based computer vision model explained in Section 3; namely, tracking entities, scene object and activities [<xref ref-type="bibr" rid="b20-sensors-12-12126">20</xref>]. RACER has been chosen because it includes support for different kinds of inference rules through the new Racer Query Language (nRQL), such as deductive, abductive, spatial and temporal [<xref ref-type="bibr" rid="b19-sensors-12-12126">19</xref>].</p>
<p>Beyond the standard ontology reasoning mechanism based on subsumption, RACER also supports abductive and deductive rule-based inference. During the execution, abductive nRQL rules defined in a subontology create new instances that are asserted into the same level or into an upper level. Eventually, the creation of new instances as defined in the consequents of the rules yields instances corresponding to an interpretation of the scene in terms of the activity ontology. Deductive rules, in turn, are used to maintain the logical consistency of the scene. The consistency check verifies whether all concepts in the TBox admit at least one individual in the corresponding ABox.</p>
<p>The output of the system is a coherent and readable interpretation of the scene logically justified from the low-level data to the high-level interpretation.</p>
<sec>
<label>6.1.</label>
<title>Spatio-Temporal Support</title>
<p>RACER is the first inference engine able to manage the spatial knowledge through an implementation of the RCC [<xref ref-type="bibr" rid="b14-sensors-12-12126">14</xref>] (see Section 2 for definition) as an additional substrate layer. A substrate is a complementary representation layer associated to an ABox. The RCC substrate offers querying facilities, such as spatial queries and combined spatial and non-spatial queries. Although spatial instances from the ABox are not automatically connected with the RCC substrate, there is an identifying correspondence between them and the objects stored in the substrate.</p>
<p>A significant amount of knowledge of scene objects and activity levels is obtained by abductive rules that include spatial properties in their antecedent. <xref ref-type="fig" rid="f9-sensors-12-12126">Figure 9</xref> shows the integration of a geometric model in the system to dynamically calculate qualitative spatial relationships between scene objects. The geometric model receives spatial data from the scene object level. These data are instantiated into the Java Topology Suite (JTS) [<xref ref-type="bibr" rid="b43-sensors-12-12126">43</xref>]. The JTS is an open source Java software library of two-dimensional spatial predicates and functions compliant with the Simple Features Specification SQL published by the Open GIS Consortium. JTS represents spatial objects in a Euclidean plane and obtains spatial relationships between two-dimensional objects quickly. Although OpenGIS spatial predicates and RCC-8 are not directly compatible, the output from the geometric model can be easily mapped from the OpenGIS format; in some cases, it only involves translating the name of the relationships. A correlation table between OpenGIS spatial predicates and RCC-8 can be found in [<xref ref-type="bibr" rid="b40-sensors-12-12126">40</xref>].</p>
<p>Additional improvements could be implemented to increase the computation speed. It is interesting to highlight that checking object spatial relations, and particularly RCC relations, has O(n<sup>2</sup>) complexity, since the test must be performed between each pair of elements. Thus, it would be convenient to build a data structure able to maintain a hierarchical spatial partition on the Euclidean space. Currently, our framework does not support these improvements, which remain a promising line for future work [<xref ref-type="bibr" rid="b41-sensors-12-12126">41</xref>].</p>
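<p>A rough idea of such a name translation is sketched below in Python. The correspondences are assumptions drawn from the usual definitions of both vocabularies for two-dimensional regions; they are not copied from the table in [<xref ref-type="bibr" rid="b40-sensors-12-12126">40</xref>], which remains the authoritative source. The function <monospace>to_rcc8</monospace> and its <monospace>boundaries_touch</monospace> flag are hypothetical and do not belong to JTS or RACER.</p>
<preformat>
```python
# Assumed one-to-one correspondences between predicate names and RCC-8 relations.
RCC8_BY_PREDICATE = {
    "disjoint": "DC",
    "touches": "EC",
    "overlaps": "PO",
    "equals": "EQ",
}


def to_rcc8(predicate, boundaries_touch=False):
    """Translate a spatial predicate name into an RCC-8 relation name.

    Containment predicates do not say whether the boundaries are in contact,
    so that information has to be supplied separately by the caller.
    """
    name = predicate.lower()
    if name in RCC8_BY_PREDICATE:
        return RCC8_BY_PREDICATE[name]
    if name == "within":
        return "TPP" if boundaries_touch else "NTPP"
    if name == "contains":
        return "TPPi" if boundaries_touch else "NTPPi"
    raise ValueError(f"no RCC-8 counterpart assumed for predicate {predicate!r}")
```
</preformat>
<p>The sketch shows why only some cases reduce to renaming: the four symmetric relations translate directly, while containment needs one extra piece of geometric information to choose between the tangential and non-tangential variants.</p>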
<p>The temporal dimension can be represented as timestamps or time intervals. Timestamps are represented using snapshots of the captured data. Time interval representation is directly supported by the RCC substrate thanks to their proper relationships [<xref ref-type="bibr" rid="b18-sensors-12-12126">18</xref>]. Both temporal representations can be used in the antecedent of rules.</p></sec></sec>
<sec sec-type="methods">
<label>7.</label>
<title>Case Study: Live Market Research</title>
<p>Learning about the relationships between the customer and the product at the point of sale provides valuable knowledge in many economic fields, such as sales or marketing. Body gestures and spatial relationships contain useful knowledge about the sensations and intentions of shopping experiences. The model presented here can be used to automatically build live market research based on the reactions and interactions of customers with the products.</p>
<p>The next subsections describe our system instantiation procedure and the expressiveness of the ontology model by presenting an activity recognition representation and a data propagation example. These subsections are illustrated with rules to show the applicability of the model in real environments.</p>
<sec>
<label>7.1.</label>
<title>Gesture Instantiation Procedure</title>
<p>A data set containing the skeleton representation of 11 people was designed to test the new representation. These body structures were captured by using a Kinect™ sensor. For each person five types of upper limbs gestures were stored: down, open, up, diagonal and akimbo. A control system based on the OWL API [<xref ref-type="bibr" rid="b44-sensors-12-12126">44</xref>] functionalities automates the assertion of data in the form of axioms from the capture device to the ontology formalism. The control system manages the classification of the individuals received from the Kinect™ sensor, the explicit property instantiations such as 
<monospace>partOf</monospace>_
<monospace>directly</monospace> and the instantiation of properties that represent the articulation of body members through a joint. The control system also manages the automatic calculation of data values from the received data, such as the size of the body members and the angles formed between them.</p>
<p>A data instantiation example describing a left upper limb with the down gesture for the person in <xref ref-type="fig" rid="f10-sensors-12-12126">Figure 10</xref> would include: (i) classification of joint instances (see <xref ref-type="fig" rid="f7-sensors-12-12126">Figure 7</xref>); (ii) 
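<p>The kind of derived values mentioned above can be computed from joint coordinates with elementary geometry. The following Python sketch is an assumption about how such a calculation might look, not the control system's actual code: the function names are hypothetical and the joints are taken to be three-dimensional points expressed in a common coordinate system.</p>
<preformat>
```python
import math


def segment_length(start, end):
    """Euclidean length of the body member delimited by two joint positions."""
    return math.dist(start, end)


def joint_angle(proximal, joint, distal):
    """Angle in degrees at joint between the segments joint->proximal and joint->distal."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        raise ValueError("a segment of zero length has no defined angle")
    cosine = sum(x * y for x, y in zip(u, v)) / norm
    # Clamp to the valid domain of acos to absorb floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
```
</preformat>
<p>For instance, calling <monospace>joint_angle</monospace> with shoulder, elbow and wrist positions gives the elbow angle, and <monospace>segment_length</monospace> with elbow and wrist positions gives the forearm size; both values could then be asserted as data properties of the corresponding individuals.</p>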
<monospace>partOf</monospace>_
<monospace>directly</monospace> property instantiations (see <xref ref-type="fig" rid="f5-sensors-12-12126">Figure 5</xref>); (iii) joint positioning data.</p></sec>
<sec>
<label>7.2.</label>
<title>Activity Recognition Example: Touching a Product</title>
<p>Activity recognition usually requires the composition of simple activities over time. Therefore, temporal analysis is required in order to recognize complex activities [<xref ref-type="bibr" rid="b7-sensors-12-12126">7</xref>]. Our ontology model is expressive enough to represent the temporal dimension of the activities. The representation capabilities resulting from the combined use of Kinect™ and the ontology-based model offer simple but very expressive tools to detect activities of interest for market research.</p>
<p>Relevant activities for current market research may be: stand in front of, look at, point at and touch a product. Simple interactions between different body members and objects regarded as context data can be recognized by finding the spatial relationship between these elements. The process becomes more robust if the object includes sensors (e.g., RFID and accelerometer) able to provide different kinds of features, such as id, location and kinematic state.</p>
<p>In order to demonstrate the expressiveness of our representation, a syntactically relaxed rule in nRQL, the query language of the RACER reasoner, is presented in <xref ref-type="fig" rid="f11-sensors-12-12126">Figure 11</xref>. The variables of the rule are denoted with a question mark at the beginning of their names (?), variables belonging to the RCC substrate are labeled adding a star (?*), concept types start with a hash (#) and RCC-8 relationships are labeled with a colon (:). To the existing namespaces, tracking entities (#!tren:), scene objects (#!scob:) and activities (#!actv:), a new one is added to group all the specific information related to market research (#!mkrs:). The syntax of nRQL has been slightly simplified to make the rules more readable. The following rule detects touching activities between people and sensorized objects.</p>
<p>First, the variables used throughout the rule are declared (3–7). The rule checks if the object involved in the situation is currently moving (8). This statement can also be used as a trigger of the rule. Afterwards, the rule checks if there is a spatial relationship between the moving 
<monospace>Product</monospace> and a 
<monospace>Hand</monospace> (9). The place of the person is assessed in (10–11). Finally, to discriminate between clients and employees, the rule considers whether the person involved in the action is a member of the staff (12). Identification capability is left for future work. If the antecedent conditions are satisfied, the consequent is applied. The consequent creates a 
<monospace>Touching</monospace> activity (14) with a known beginning (15) and an unknown ending (16). The spatial location of the activity is bounded by the location of the person who performs the activity (17). 
<monospace>passivelyInvolvedIn</monospace> and 
<monospace>activelyInvolvedIn</monospace> relationships between the new activity and the passive object (18) and the active subject (19) are also stated in the consequent. The resulting activity has been defined according to spatio-temporal criteria and part-based relationships.</p>
<p>The 
<monospace>Touching</monospace> activity is a candidate to be classified as a subactivity of 
<monospace>Shopping</monospace>. To recognize the 
<monospace>Shopping</monospace> activity it is required to recognize a sequence of subactivities (e.g., touching the product, trying the product, interacting with the staff, paying for the product) where the same active subjects and passive objects are involved in the same place and time. For the sake of simplicity a rule which only recognizes the spatial dimension of a 
<monospace>Touching</monospace> and a 
<monospace>Paying</monospace> activity is shown in <xref ref-type="fig" rid="f12-sensors-12-12126">Figure 12</xref>.</p>
<p>At the beginning of the antecedent a set of variables is declared (3–6). Then, the same objects, subjects and places are identified in the subactivities (7–12). Finally, the starting and ending timestamps of the activity sequence are retrieved (13–14). The consequent creates a 
<monospace>Shopping</monospace> activity whose validity time interval is bounded by the starting point of the former activity and the ending point of the latter activity (16–18). The coincident place of the subactivities and the mereological properties between the subactivities and the overall activity are eventually asserted (19–21).</p>
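<p>The logic of this rule can be paraphrased in Python. The sketch below is an interpretation under assumptions and not a translation of the nRQL rule in the figure: the <monospace>Activity</monospace> record, its field names and the function <monospace>recognize_shopping</monospace> are hypothetical, each activity is assumed to carry the set of places it is known to be located in, and both timestamps are assumed to be known.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    kind: str
    subject: str
    obj: str
    places: frozenset
    start: float
    end: float


def recognize_shopping(touching, paying):
    """Create a Shopping activity when both subactivities share subject, object and a place."""
    common_places = touching.places.intersection(paying.places)
    same_participants = touching.subject == paying.subject and touching.obj == paying.obj
    if not (same_participants and common_places):
        return None
    # The validity interval runs from the start of the former activity
    # to the end of the latter one.
    return Activity("Shopping", touching.subject, touching.obj,
                    frozenset(common_places), touching.start, paying.end)
```
</preformat>
<p>Representing the location as a set of places is a design choice of this sketch: it lets the rule fire on any shared place, whichever way that place came to be associated with each subactivity.</p>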
<p>Crucial data are inferred from the former to the latter rule. Thanks to the interaction between the mereological and the geolocalized layers, the rules acquire more flexibility and the number of relationships between concepts grows, which improves the completeness of the model. Imagine that the subactivities are detected in different places.</p>
<list list-type="bullet">
<list-item>
<p>
<monospace>touchingAct placedIn GroundFloor</monospace></p></list-item>
<list-item>
<p>
<monospace>payingAct placedIn FirstFloor</monospace></p></list-item></list>
<p>The system can store mereological data stated to describe invariant context relationships such as:
<list list-type="bullet">
<list-item>
<p>
<monospace>GroundFloor containedIn Shop</monospace></p></list-item>
<list-item>
<p>
<monospace>FirstFloor containedIn Shop</monospace></p></list-item></list></p>
<p>In both cases, using the compositions described in <xref ref-type="table" rid="t2-sensors-12-12126">Table 2</xref>, new relationships are inferred.</p>
<list list-type="bullet">
<list-item>
<p>
<monospace>touchingAct placedIn Shop</monospace></p></list-item>
<list-item>
<p>
<monospace>payingAct placedIn Shop</monospace></p></list-item></list>
<p>Even though the activities have been detected in different places, the latter rule is fired because there is a common location for both activities (see <xref ref-type="fig" rid="f13-sensors-12-12126">Figure 13</xref>). Following the reasoning, an appropriate spatial environment (
<monospace>Shop</monospace>) is allocated to the overall activity (19).</p></sec>
<sec>
<label>7.3.</label>
<title>Data Propagation Example: Touching a Product</title>
<p>Many data relationships are automatically propagated from the consequent's assertions of the previous section. In the first rule (19) of the previous section, a 
<monospace>Hand</monospace> is declared as active subject of the 
<monospace>Touching</monospace> subactivity. However, in the latter rule (9–10) a previously unstated assertion includes a 
<monospace>Person</monospace> as active subject of this subactivity. The pattern explained in Section 5.2 justifies the propagation of activity relationships to all the parts which contain the part performing the activity. When the 
<monospace>Hand</monospace> was declared as an active subject, the objects containing it were also inferred as active subjects.</p>
<list list-type="bullet">
<list-item>
<p>
<monospace>upperlimb activelyInvolvedIn touchingAct</monospace></p></list-item>
<list-item>
<p>
<monospace>person activelyInvolvedIn touchingAct</monospace></p></list-item></list>
<p>Data propagation makes it possible to choose the level of granularity of the information retrieval tasks and to assess data from multiple perspectives. The following query would retrieve the interactions between people, their upper limbs and the products during a campaign (it is assumed that, during a campaign, the products are located in the same place).</p>
<p>The query in <xref ref-type="fig" rid="f14-sensors-12-12126">Figure 14</xref> retrieves different levels of active subjects (
<monospace>Person</monospace> and 
<monospace>UpperLimb</monospace>) of 
<monospace>Touching</monospace> activities for all the products on sale (1). Then query variables are declared (3–6). The 
<monospace>Product, Person</monospace> and 
<monospace>UpperLimb</monospace> of the same 
<monospace>Touching</monospace> activities are retrieved (7–9). From this set of activities, only those whose validity time interval is within the validity time interval of the campaign (10–12) are chosen.</p>
<p>The extracted information is helpful for answering abstract questions such as: “What is the visibility of this product?” A very rough answer would be the number of people who have interacted with it. The level of doubt involved in the purchase decision can also be measured by counting the number of interactions of each user with the product. An extended model able to distinguish between right and left limbs could be used to assess the quality of the product accessibility.</p>
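<p>The temporal filter applied by the query can be illustrated with a short Python sketch. It is only an approximation under assumptions: the <monospace>Touching</monospace> record, its fields and the function <monospace>touches_during_campaign</monospace> are hypothetical, and every activity is assumed to have a known start and end, which is not guaranteed for activities that are still open.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Touching:
    product: str
    person: str
    upper_limb: str
    start: float
    end: float


def within(activity, campaign_start, campaign_end):
    """True when the activity interval lies inside the campaign interval."""
    return activity.start >= campaign_start and campaign_end >= activity.end


def touches_during_campaign(activities, campaign_start, campaign_end):
    """Return (product, person, upper limb) for the activities inside the campaign."""
    return [
        (a.product, a.person, a.upper_limb)
        for a in activities
        if within(a, campaign_start, campaign_end)
    ]
```
</preformat>
<p>Each returned tuple exposes the same activity at two levels of granularity, the person and the upper limb, which is what the propagation of active subjects makes available to the query.</p>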
<p>Another example of propagation is the automatic assignment of subjects and objects in composed activities. The first rule of the previous section states a 
<monospace>Person</monospace> and a 
<monospace>Product</monospace> as the active subject and passive object of a 
<monospace>Touching</monospace> subactivity. The system automatically connects these individuals as active subject and passive object of the 
<monospace>shoppingAct</monospace> individual when the 
<monospace>touchingAct</monospace> subactivity is detected as participating in a 
<monospace>Shopping</monospace> activity individual (see <xref ref-type="fig" rid="f15-sensors-12-12126">Figure 15</xref>). This process is repeated, thanks to the composition explained in Section 4.1, each time a 
<monospace>participatesIn</monospace> property is instantiated.</p></sec></sec>
<sec sec-type="conclusions">
<label>8.</label>
<title>Conclusions and Future Work</title>
<p>This paper proposes an update of the cognitivist models towards part-based representations. To do so, the work presents a theoretical taxonomy of mereological relations from a computer vision perspective. Using the Component/Integral object relationship of the taxonomy, we developed a general ontology-based model for formal representation of the human body semantics using part-whole patterns and data propagation patterns. The model has been embedded into a previous computer vision framework by relying on part-whole patterns and DOLCE recommendations. The proposal includes Kinect™ skeletal view data representation with backward compatibility. To illustrate the functioning of the extended framework, a case study for live market research has been described by presenting a data instantiation procedure and some examples of activity recognition representation and data propagation. These examples are able to represent semantically complex relationships through the interpretation of the user interactions with the context. The main advantages of this model are the general representation for further domain extensions and the logical capabilities for automatic inference of high-level relationships. Both advantages provide support for more sophisticated activity analysis.</p>
<p>Future research will be based on specific knowledge about the features of the users of a service. An important feature is the identity of a subject, which allows individuals to be differentiated. Kinect Skeletal View™ provides significant data for recognizing individuals, such as the shoulder width, the head width, the body height, the length of the limbs, and so forth. Market research data will be organized through automatic recognition of the gender and the age of the study subjects. We expect that Kinect Skeletal View™ can distinguish at least age ranges, such as child, adult or elderly. Given the nature of the data, this research will probably be oriented towards fuzzy sets.</p>
<p>In addition, future work will address the completion of a full market research study and the application of the entire model to a real-life scenario combining monocular and light wave sensors. This application should include a probabilistic mechanism to reason with the real-world data asserted in the model, which may be imprecise or uncertain.</p></sec></body>
<back>
<ack>
<p>This work was supported in part by Projects CICYT TIN2011-28620-C02-01, CICYT TEC2011-28626-C02-02, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-sensors-12-12126"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Remagnino</surname><given-names>P.</given-names></name><name><surname>Foresti</surname><given-names>G.L.</given-names></name></person-group><article-title>Ambient intelligence: A new multidisciplinary paradigm</article-title><source>IEEE Trans. Syst. Man Cybern. A: Syst. Hum.</source><year>2005</year><volume>35</volume><fpage>1</fpage><lpage>6</lpage></citation></ref>
<ref id="b2-sensors-12-12126"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foix</surname><given-names>S.</given-names></name><name><surname>Alenya</surname><given-names>G.</given-names></name><name><surname>Torras</surname><given-names>C.</given-names></name></person-group><article-title>Lock-in Time-of-Flight (ToF) cameras: A survey</article-title><source>IEEE Sens. J.</source><year>2011</year><volume>11</volume><fpage>1917</fpage><lpage>1926</lpage><pub-id pub-id-type="doi">10.1109/JSEN.2010.2101060</pub-id></citation></ref>
<ref id="b3-sensors-12-12126"><label>3.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Kolb</surname><given-names>A.</given-names></name><name><surname>Barth</surname><given-names>E.</given-names></name><name><surname>Koch</surname><given-names>R.</given-names></name><name><surname>Larsen</surname><given-names>R.</given-names></name></person-group><article-title>Time-of-Flight Sensors in Computer Graphics</article-title><conf-name>Proceedings of the Eurographics 2009</conf-name><conf-loc>Munich, Germany</conf-loc><conf-date>30 March–3 April 2009</conf-date><fpage>119</fpage><lpage>134</lpage></citation></ref>
<ref id="b4-sensors-12-12126"><label>4.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Ganapathi</surname><given-names>V.</given-names></name><name><surname>Plagemann</surname><given-names>C.</given-names></name><name><surname>Koller</surname><given-names>D.</given-names></name><name><surname>Thrun</surname><given-names>S.</given-names></name></person-group><article-title>Real Time Motion Capture Using a Single Time-of-Flight Camera</article-title><conf-name>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name><conf-loc>San Francisco, CA, USA</conf-loc><conf-date>13–18 June 2010</conf-date><fpage>755</fpage><lpage>762</lpage></citation></ref>
<ref id="b5-sensors-12-12126"><label>5.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Kahlmann</surname><given-names>T.</given-names></name><name><surname>Remondino</surname><given-names>F.</given-names></name><name><surname>Guillaume</surname><given-names>S.</given-names></name></person-group><article-title>Range Imaging Technology: New Developments and Applications for People Identification and Tracking</article-title><conf-name>Proceedings of Videometrics IX–SPIE–IS&amp;T Electronic Imaging</conf-name><conf-loc>San Jose, CA, USA</conf-loc><year>2007</year><comment>Volume 6491</comment></citation></ref>
<ref id="b6-sensors-12-12126"><label>6.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Haker</surname><given-names>M.</given-names></name><name><surname>Bohme</surname><given-names>M.</given-names></name><name><surname>Martinetz</surname><given-names>T.</given-names></name><name><surname>Barth</surname><given-names>E.</given-names></name></person-group><article-title>Geometric Invariants for Facial Feature Tracking with 3D ToF Cameras</article-title><conf-name>Proceedings of the International Symposium on Signals, Circuits and Systems (ISSCS'07)</conf-name><conf-loc>Iasi, Romania</conf-loc><conf-date>13–14 July 2007</conf-date><comment>Volume 1</comment><fpage>1</fpage><lpage>4</lpage></citation></ref>
<ref id="b7-sensors-12-12126"><label>7.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Holte</surname><given-names>M.B.</given-names></name><name><surname>Moeslund</surname><given-names>T.B.</given-names></name><name><surname>Fihl</surname><given-names>P.</given-names></name></person-group><article-title>Fusion of Range and Intensity Information for View Invariant Gesture Recognition</article-title><conf-name>Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'08)</conf-name><conf-loc>Anchorage, AK, USA</conf-loc><conf-date>24–26 June 2008</conf-date><fpage>1</fpage><lpage>7</lpage></citation></ref>
<ref id="b8-sensors-12-12126"><label>8.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Breuer</surname><given-names>P.</given-names></name><name><surname>Eckes</surname><given-names>C.</given-names></name><name><surname>Müller</surname><given-names>S.</given-names></name></person-group><article-title>Hand Gesture Recognition with a Novel IR Time-of-Flight Range Camera—A Pilot Study</article-title><conf-name>Proceedings of the 3rd International Conference on Computer Vision/Computer Graphics Collaboration Techniques (Mirage'07)</conf-name><conf-loc>Rocquencourt, France</conf-loc><conf-date>28–30 March 2007</conf-date><comment>Volume 4418</comment><fpage>247</fpage><lpage>260</lpage></citation></ref>
<ref id="b9-sensors-12-12126"><label>9.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Soutschek</surname><given-names>S.</given-names></name><name><surname>Penne</surname><given-names>J.</given-names></name><name><surname>Hornegger</surname><given-names>J.</given-names></name><name><surname>Kornhuber</surname><given-names>J.</given-names></name></person-group><article-title>3-D Gesture-based Scene Navigation in Medical Imaging Applications Using Time-of-Flight Cameras</article-title><conf-name>Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'08)</conf-name><conf-loc>Anchorage, AK, USA</conf-loc><conf-date>24–26 June 2008</conf-date><comment>Volume 1</comment><fpage>1</fpage><lpage>6</lpage></citation></ref>
<ref id="b10-sensors-12-12126"><label>10.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Wientapper</surname><given-names>F.</given-names></name><name><surname>Ahrens</surname><given-names>K.</given-names></name><name><surname>Wuest</surname><given-names>H.</given-names></name><name><surname>Bockholt</surname><given-names>U.</given-names></name></person-group><article-title>Linear-Projection-Based Classification of Human Postures in Time-of-Flight Data</article-title><conf-name>Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC'09)</conf-name><conf-loc>San Antonio, TX, USA</conf-loc><conf-date>11–14 October 2009</conf-date><fpage>559</fpage><lpage>564</lpage></citation></ref>
<ref id="b11-sensors-12-12126"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leone</surname><given-names>A.</given-names></name><name><surname>Diraco</surname><given-names>G.</given-names></name><name><surname>Siciliano</surname><given-names>P.</given-names></name></person-group><article-title>Detecting falls with 3D range camera in ambient assisted living applications: A preliminary study</article-title><source>Med. Eng. Phys.</source><year>2011</year><volume>33</volume><fpage>770</fpage><lpage>781</lpage><pub-id pub-id-type="doi">10.1016/j.medengphy.2011.02.001</pub-id><pub-id pub-id-type="pmid">21382737</pub-id></citation></ref>
<ref id="b12-sensors-12-12126"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pinz</surname><given-names>A.</given-names></name><name><surname>Bischof</surname><given-names>H.</given-names></name><name><surname>Kropatsch</surname><given-names>W.</given-names></name><name><surname>Schweighofer</surname><given-names>G.</given-names></name><name><surname>Haxhimusa</surname><given-names>Y.</given-names></name><name><surname>Opelt</surname><given-names>A.</given-names></name><name><surname>Ion</surname><given-names>A.</given-names></name></person-group><article-title>Representations for cognitive vision: A review of appearance-based, spatio-temporal, and graph-based approaches</article-title><source>ELCVIA</source><year>2008</year><volume>7</volume><fpage>35</fpage><lpage>61</lpage></citation></ref>
<ref id="b13-sensors-12-12126"><label>13.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Henricksen</surname><given-names>K.</given-names></name><name><surname>Indulska</surname><given-names>J.</given-names></name></person-group><article-title>A Software Engineering Framework for Context-Aware Pervasive Computing</article-title><conf-name>Proceedings of the Second IEEE Annual Conference on Pervasive Computing and Communications (PerCom'04)</conf-name><conf-loc>Orlando, FL, USA</conf-loc><conf-date>14–17 March 2004</conf-date><fpage>77</fpage><lpage>86</lpage></citation></ref>
<ref id="b14-sensors-12-12126"><label>14.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Randell</surname><given-names>D.A.</given-names></name><name><surname>Cui</surname><given-names>Z.</given-names></name><name><surname>Cohn</surname><given-names>A.G.</given-names></name></person-group><article-title>A Spatial Logic Based on Regions and Connection</article-title><conf-name>Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning (KR'92)</conf-name><conf-loc>Cambridge, MA, USA</conf-loc><conf-date>25–29 October 1992</conf-date><fpage>165</fpage><lpage>176</lpage></citation></ref>
<ref id="b15-sensors-12-12126"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allen</surname><given-names>J.F.</given-names></name></person-group><article-title>Maintaining knowledge about temporal intervals</article-title><source>Commun. ACM</source><year>1983</year><volume>26</volume><fpage>832</fpage><lpage>843</lpage><pub-id pub-id-type="doi">10.1145/182.358434</pub-id></citation></ref>
<ref id="b16-sensors-12-12126"><label>16.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Renz</surname><given-names>J.</given-names></name></person-group><source>Qualitative Spatial Reasoning with Topological Information</source><publisher-name>Springer-Verlag</publisher-name><publisher-loc>Heidelberg, Germany</publisher-loc><year>2002</year></citation></ref>
<ref id="b17-sensors-12-12126"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vernon</surname><given-names>D.</given-names></name></person-group><article-title>Cognitive vision: The case for embodied perception</article-title><source>Image Vision Comput.</source><year>2008</year><volume>26</volume><fpage>127</fpage><lpage>140</lpage><pub-id pub-id-type="doi">10.1016/j.imavis.2005.08.009</pub-id></citation></ref>
<ref id="b18-sensors-12-12126"><label>18.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Gómez-Romero</surname><given-names>J.</given-names></name><name><surname>García</surname><given-names>J.</given-names></name><name><surname>Patricio</surname><given-names>M.A.</given-names></name><name><surname>Molina</surname><given-names>J.M.</given-names></name></person-group><article-title>Towards the implementation of an ontology-based reasoning system for visual information fusion</article-title><conf-name>3rd Annual Skövde Workshop on Information Fusion Topics (SWIFT 2009)</conf-name><conf-loc>Skövde, Sweden</conf-loc><conf-date>12–13 October 2009</conf-date><fpage>5</fpage><lpage>10</lpage></citation></ref>
<ref id="b19-sensors-12-12126"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gómez-Romero</surname><given-names>J.</given-names></name><name><surname>Patricio</surname><given-names>M.</given-names></name><name><surname>García</surname><given-names>J.</given-names></name><name><surname>Molina</surname><given-names>J.M.</given-names></name></person-group><article-title>Ontology-based context representation and reasoning for object tracking and scene interpretation in video</article-title><source>Expert Syst. Appl.</source><year>2011</year><volume>38</volume><fpage>7494</fpage><lpage>7510</lpage><pub-id pub-id-type="doi">10.1016/j.eswa.2010.12.118</pub-id></citation></ref>
<ref id="b20-sensors-12-12126"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gómez-Romero</surname><given-names>J.</given-names></name><name><surname>Serrano</surname><given-names>M.A.</given-names></name><name><surname>Patricio</surname><given-names>M.A.</given-names></name><name><surname>García</surname><given-names>J.</given-names></name><name><surname>Molina</surname><given-names>J.M.</given-names></name></person-group><article-title>Context-based scene recognition from visual data in smart homes: An Information Fusion approach</article-title><source>Pers. Ubiquitous Comput.</source><year>2011</year><pub-id pub-id-type="doi">10.1007/s00779-011-0450-9</pub-id></citation></ref>
<ref id="b21-sensors-12-12126"><label>21.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Llinas</surname><given-names>J.</given-names></name><name><surname>Bowman</surname><given-names>C.</given-names></name><name><surname>Rogova</surname><given-names>G.</given-names></name><name><surname>Steinberg</surname><given-names>A.</given-names></name><name><surname>Waltz</surname><given-names>E.</given-names></name><name><surname>White</surname><given-names>F.</given-names></name></person-group><article-title>Revisiting the JDL Data Fusion Model II</article-title><conf-name>Proceedings of the Seventh International Conference on Information Fusion</conf-name><conf-loc>Stockholm, Sweden</conf-loc><conf-date>28 June–1 July 2004</conf-date><fpage>1218</fpage><lpage>1230</lpage></citation></ref>
<ref id="b22-sensors-12-12126"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vinciarelli</surname><given-names>A.</given-names></name><name><surname>Pantic</surname><given-names>M.</given-names></name><name><surname>Bourlard</surname><given-names>H.</given-names></name></person-group><article-title>Social signal processing: Survey of an emerging domain</article-title><source>Image Vision Comput.</source><year>2009</year><volume>27</volume><fpage>1743</fpage><lpage>1759</lpage><pub-id pub-id-type="doi">10.1016/j.imavis.2008.11.007</pub-id></citation></ref>
<ref id="b23-sensors-12-12126"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Girju</surname><given-names>R.</given-names></name><name><surname>Badulescu</surname><given-names>A.</given-names></name><name><surname>Moldovan</surname><given-names>D.</given-names></name></person-group><article-title>Automatic discovery of part-whole relations</article-title><source>Comput. Linguist.</source><year>2006</year><volume>32</volume><fpage>83</fpage><lpage>135</lpage></citation></ref>
<ref id="b24-sensors-12-12126"><label>24.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Simons</surname><given-names>P.</given-names></name></person-group><source>Parts: A Study in Ontology</source><publisher-name>Clarendon Press</publisher-name><publisher-loc>Oxford, UK</publisher-loc><year>1987</year></citation></ref>
<ref id="b25-sensors-12-12126"><label>25.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Simons</surname><given-names>P.</given-names></name></person-group><article-title>Part/whole II: Mereology since 1900</article-title><source>Handbook of Metaphysics and Ontology</source><publisher-name>Philosophia Verlag</publisher-name><publisher-loc>Munich, Germany</publisher-loc><year>1991</year><fpage>672</fpage><lpage>675</lpage></citation></ref>
<ref id="b26-sensors-12-12126"><label>26.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Varzi</surname><given-names>A.</given-names></name></person-group><article-title>Mereology</article-title><source>The Stanford Encyclopedia of Philosophy</source><person-group person-group-type="editor"><name><surname>Zalta</surname><given-names>E.N.</given-names></name></person-group><edition>Spring 2011 Edition</edition><publisher-name>Metaphysics Research Lab, Stanford University</publisher-name><publisher-loc>Stanford, CA, USA</publisher-loc><year>2011</year></citation></ref>
<ref id="b27-sensors-12-12126"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Odell</surname><given-names>J.J.</given-names></name></person-group><article-title>Six different kinds of composition</article-title><source>J. Object-Oriented Progr.</source><year>1994</year><volume>5</volume><fpage>10</fpage><lpage>15</lpage></citation></ref>
<ref id="b28-sensors-12-12126"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Winston</surname><given-names>M.E.</given-names></name><name><surname>Chaffin</surname><given-names>R.</given-names></name><name><surname>Herrmann</surname><given-names>D.</given-names></name></person-group><article-title>A taxonomy of part-whole relations</article-title><source>Cogn. Sci.</source><year>1987</year><volume>11</volume><fpage>417</fpage><lpage>444</lpage><pub-id pub-id-type="doi">10.1207/s15516709cog1104_2</pub-id></citation></ref>
<ref id="b29-sensors-12-12126"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Keet</surname><given-names>C.M.</given-names></name><name><surname>Artale</surname><given-names>A.</given-names></name></person-group><article-title>Representing and reasoning over a taxonomy of part-whole relations</article-title><source>Appl. Ontol.</source><year>2008</year><volume>3</volume><fpage>91</fpage><lpage>110</lpage></citation></ref>
<ref id="b30-sensors-12-12126"><label>30.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Gangemi</surname><given-names>A.</given-names></name><name><surname>Guarino</surname><given-names>N.</given-names></name><name><surname>Masolo</surname><given-names>C.</given-names></name><name><surname>Oltramari</surname><given-names>A.</given-names></name><name><surname>Schneider</surname><given-names>L.</given-names></name></person-group><article-title>Sweetening Ontologies with DOLCE</article-title><conf-name>Proceedings of the Knowledge Engineering and Knowledge Management. 13th International Conference on Ontologies and the Semantic Web (EKAW'02)</conf-name><conf-loc>Sigüenza, Spain</conf-loc><conf-date>1–4 October 2002</conf-date><fpage>223</fpage><lpage>233</lpage></citation></ref>
<ref id="b31-sensors-12-12126"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Varzi</surname><given-names>A.C.</given-names></name></person-group><article-title>Parts, wholes, and part-whole relations: The prospects of mereotopology</article-title><source>Data Knowl. Eng.</source><year>1996</year><volume>20</volume><fpage>259</fpage><lpage>286</lpage><pub-id pub-id-type="doi">10.1016/S0169-023X(96)00017-1</pub-id></citation></ref>
<ref id="b32-sensors-12-12126"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Opdahl</surname><given-names>A.L.</given-names></name><name><surname>Henderson-Sellers</surname><given-names>B.</given-names></name><name><surname>Barbier</surname><given-names>F.</given-names></name></person-group><article-title>Ontological analysis of whole-part relationships in OO-models</article-title><source>Inf. Softw. Technol.</source><year>2001</year><volume>43</volume><fpage>387</fpage><lpage>399</lpage><pub-id pub-id-type="doi">10.1016/S0950-5849(00)00175-0</pub-id></citation></ref>
<ref id="b33-sensors-12-12126"><label>33.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Sattler</surname><given-names>U.</given-names></name></person-group><article-title>A Concept Language for an Engineering Application with Part-Whole Relations</article-title><conf-name>Proceedings of the International Workshop on Description Logics</conf-name><conf-loc>Rome, Italy</conf-loc><conf-date>2–3 June 1995</conf-date><fpage>119</fpage><lpage>123</lpage></citation></ref>
<ref id="b34-sensors-12-12126"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosse</surname><given-names>C.</given-names></name><name><surname>Mejino</surname><given-names>J.L.</given-names><suffix>Jr.</suffix></name></person-group><article-title>A reference ontology for biomedical informatics: The foundational model of anatomy</article-title><source>J. Biomed. Inf.</source><year>2003</year><volume>36</volume><fpage>478</fpage><lpage>500</lpage><pub-id pub-id-type="doi">10.1016/j.jbi.2003.11.007</pub-id></citation></ref>
<ref id="b35-sensors-12-12126"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gutiérrez</surname><given-names>M.</given-names></name><name><surname>Garcia-Rojas</surname><given-names>A.</given-names></name><name><surname>Thalmann</surname><given-names>D.</given-names></name><name><surname>Vexo</surname><given-names>F.</given-names></name><name><surname>Moccozet</surname><given-names>L.</given-names></name><name><surname>Magnenat-Thalmann</surname><given-names>N.</given-names></name><name><surname>Mortara</surname><given-names>M.</given-names></name><name><surname>Spagnuolo</surname><given-names>M.</given-names></name></person-group><article-title>An ontology of virtual humans</article-title><source>Visual Comput.</source><year>2007</year><volume>23</volume><fpage>207</fpage><lpage>218</lpage><pub-id pub-id-type="doi">10.1007/s00371-006-0093-4</pub-id></citation></ref>
<ref id="b36-sensors-12-12126"><label>36.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Rector</surname><given-names>A.</given-names></name><name><surname>Welty</surname><given-names>C.</given-names></name><name><surname>Noy</surname><given-names>N.</given-names></name><name><surname>Wallace</surname><given-names>E.</given-names></name></person-group><source>Simple Part-Whole Relations in OWL Ontologies. W3C Editor's Draft</source><publisher-name>W3C</publisher-name><year>2005</year></citation></ref>
<ref id="b37-sensors-12-12126"><label>37.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Knoop</surname><given-names>S.</given-names></name><name><surname>Vacek</surname><given-names>S.</given-names></name><name><surname>Dillmann</surname><given-names>R.</given-names></name></person-group><article-title>Modeling Joint Constraints for an Articulated 3D Human Body Model with Artificial Correspondences in ICP</article-title><conf-name>Proceedings of the 5th IEEE-RAS International Conference on Humanoid Robots</conf-name><conf-loc>Tsukuba, Japan</conf-loc><conf-date>5–7 December 2005</conf-date><fpage>74</fpage><lpage>79</lpage></citation></ref>
<ref id="b38-sensors-12-12126"><label>38.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Stocker</surname><given-names>M.</given-names></name><name><surname>Sirin</surname><given-names>E.</given-names></name></person-group><article-title>Pelletspatial: A hybrid RCC-8 and RDF/OWL Reasoning and Query Engine</article-title><conf-name>Proceedings of the CEUR Workshop</conf-name><conf-loc>Heraklion, Greece</conf-loc><conf-date>27–28 May 2009</conf-date><comment>Volume 529</comment><fpage>2</fpage><lpage>31</lpage></citation></ref>
<ref id="b39-sensors-12-12126"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hornsby</surname><given-names>K.S.</given-names></name><name><surname>Joshi</surname><given-names>K.</given-names></name></person-group><article-title>Combining ontologies to automatically generate temporal perspectives of geospatial domains</article-title><source>Geoinformatica</source><year>2010</year><volume>14</volume><fpage>481</fpage><lpage>505</lpage><pub-id pub-id-type="doi">10.1007/s10707-009-0088-1</pub-id></citation></ref>
<ref id="b40-sensors-12-12126"><label>40.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Schuele</surname><given-names>M.</given-names></name><name><surname>Karaenke</surname><given-names>P.</given-names></name></person-group><article-title>Qualitative Spatial Reasoning with Topological Information in BDI Agents</article-title><conf-name>Proceedings of the 2nd Workshop on Artificial Intelligence and Logistics (AILog)</conf-name><conf-loc>Lisbon, Portugal</conf-loc><conf-date>17 August 2010</conf-date><fpage>7</fpage><lpage>12</lpage></citation></ref>
<ref id="b41-sensors-12-12126"><label>41.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Serrano</surname><given-names>M.A.</given-names></name><name><surname>Gómez-Romero</surname><given-names>J.</given-names></name><name><surname>Patricio</surname><given-names>M.A.</given-names></name><name><surname>García</surname><given-names>J.</given-names></name><name><surname>Molina</surname><given-names>J.M.</given-names></name></person-group><article-title>Topological Properties in Ontology-Based Applications</article-title><conf-name>Proceedings of the 11th International Conference on Intelligent Systems Design and Applications (ISDA)</conf-name><conf-loc>Córdoba, Spain</conf-loc><conf-date>22–24 November 2011</conf-date><fpage>1329</fpage><lpage>1334</lpage></citation></ref>
<ref id="b42-sensors-12-12126"><label>42.</label><citation citation-type="web"><article-title>Additional Resources Web Page</article-title><comment>Available online: <ext-link xlink:href="http://www.giaa.inf.uc3m.es/miembros/jgomez/et/" ext-link-type="uri">http://www.giaa.inf.uc3m.es/miembros/jgomez/et/</ext-link> (accessed on 11 April 2012)</comment></citation></ref>
<ref id="b43-sensors-12-12126"><label>43.</label><citation citation-type="web"><article-title>Java Topology Suite Web Page</article-title><comment>Available online: <ext-link xlink:href="http://www.vividsolutions.com/jts/" ext-link-type="uri">http://www.vividsolutions.com/jts/</ext-link> (accessed on 11 April 2012)</comment></citation></ref>
<ref id="b44-sensors-12-12126"><label>44.</label><citation citation-type="web"><article-title>OWL API Web Page</article-title><comment>Available online: <ext-link xlink:href="http://owlapi.sourceforge.net/" ext-link-type="uri">http://owlapi.sourceforge.net/</ext-link> (accessed on 11 April 2012)</comment></citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures and Tables</title>
<fig id="f1-sensors-12-12126" position="float">
<label>Figure 1.</label>
<caption>
<p>Keet <italic>et al.</italic>'s taxonomy of basic mereological and meronymic part-of relations.</p></caption>
<graphic xlink:href="sensors-12-12126f1.gif"/></fig>
<fig id="f2-sensors-12-12126" position="float">
<label>Figure 2.</label>
<caption>
<p>RCC-8 relations.</p></caption>
<graphic xlink:href="sensors-12-12126f2.gif"/></fig>
<fig id="f3-sensors-12-12126" position="float">
<label>Figure 3.</label>
<caption>
<p>Proposed taxonomy of part properties for spatio-temporal aspects in vision-based systems.</p></caption>
<graphic xlink:href="sensors-12-12126f3.gif"/></fig>
<fig id="f4-sensors-12-12126" position="float">
<label>Figure 4.</label>
<caption>
<p>Joints captured by Kinect™ skeletal view.</p></caption>
<graphic xlink:href="sensors-12-12126f4.gif"/></fig>
<fig id="f5-sensors-12-12126" position="float">
<label>Figure 5.</label>
<caption>
<p>An example of explicit and inferred taxonomies.</p></caption>
<graphic xlink:href="sensors-12-12126f5.gif"/></fig>
<fig id="f6-sensors-12-12126" position="float">
<label>Figure 6.</label>
<caption>
<p>Spatio-temporal taxonomy with pattern representation.</p></caption>
<graphic xlink:href="sensors-12-12126f6.gif"/></fig>
<fig id="f7-sensors-12-12126" position="float">
<label>Figure 7.</label>
<caption>
<p>Explicit taxonomies for joints and body members.</p></caption>
<graphic xlink:href="sensors-12-12126f7.gif"/></fig>
<fig id="f8-sensors-12-12126" position="float">
<label>Figure 8.</label>
<caption>
<p>Inferred properties using composition between <monospace>hasEvent</monospace> and <monospace>partOf</monospace>.</p></caption>
<graphic xlink:href="sensors-12-12126f8.gif"/></fig>
<fig id="f9-sensors-12-12126" position="float">
<label>Figure 9.</label>
<caption>
<p>System implementation.</p></caption>
<graphic xlink:href="sensors-12-12126f9.gif"/></fig>
<fig id="f10-sensors-12-12126" position="float">
<label>Figure 10.</label>
<caption>
<p>Gesture instantiation and action example.</p></caption>
<graphic xlink:href="sensors-12-12126f10.gif"/></fig>
<fig id="f11-sensors-12-12126" position="float">
<label>Figure 11.</label>
<caption>
<p>Rule to exemplify expressiveness.</p></caption>
<graphic xlink:href="sensors-12-12126f11.gif"/></fig>
<fig id="f12-sensors-12-12126" position="float">
<label>Figure 12.</label>
<caption>
<p>Simplified rule to recognize shopping.</p></caption>
<graphic xlink:href="sensors-12-12126f12.gif"/></fig>
<fig id="f13-sensors-12-12126" position="float">
<label>Figure 13.</label>
<caption>
<p>Representation of the inferred <monospace>placedIn</monospace> relationships.</p></caption>
<graphic xlink:href="sensors-12-12126f13.gif"/></fig>
<fig id="f14-sensors-12-12126" position="float">
<label>Figure 14.</label>
<caption>
<p>Query for different interactions during a campaign.</p></caption>
<graphic xlink:href="sensors-12-12126f14.gif"/></fig>
<fig id="f15-sensors-12-12126" position="float">
<label>Figure 15.</label>
<caption>
<p>Representation of the inferred <monospace>involvedIn</monospace> relationships.</p></caption>
<graphic xlink:href="sensors-12-12126f15.gif"/></fig>
<table-wrap id="t1-sensors-12-12126" position="float">
<label>Table 1.</label>
<caption>
<p>Set of characteristics to classify part-whole relations.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top"><bold>Characteristic</bold></th>
<th align="left" valign="top"><bold>Definition</bold></th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top"><bold>Functional</bold></td>
<td align="left" valign="top">Parts occupy a specific spatial or temporal position relative to each other, which supports their functional role in the whole.</td></tr>
<tr>
<td align="left" valign="top"><bold>Homeomerous</bold></td>
<td align="left" valign="top">Parts are visually similar to each other and to the whole to which they belong. Parts and aggregates belong to the same class.</td></tr>
<tr>
<td align="left" valign="top"><bold>Separable</bold></td>
<td align="left" valign="top">Parts can be physically disconnected from the whole to which they are connected and can be detected without being part of a particular aggregate object. The opposite characteristic is <bold>Invariance</bold>.</td></tr>
<tr>
<td align="left" valign="top"><bold>Resultant</bold></td>
<td align="left" valign="top">A part provides at least one property that extends to the whole.</td></tr>
<tr>
<td align="left" valign="top"><bold>Mandatory</bold></td>
<td align="left" valign="top">An object of a particular class must be detected to declare the existence of an aggregate object. The opposite characteristic is <bold>Optional</bold>.</td></tr>
<tr>
<td align="left" valign="top"><bold>Existential dependency</bold></td>
<td align="left" valign="top">A single occurrence of an object, always the same one, is critical for the life of the aggregate.</td></tr>
<tr>
<td align="left" valign="top"><bold>Mutability</bold></td>
<td align="left" valign="top">A particular part object can be replaced in the aggregate object by another equivalent part without the aggregate losing its identity. The opposite characteristic is <bold>Immutability</bold>.</td></tr>
<tr>
<td align="left" valign="top"><bold>Shareability</bold></td>
<td align="left" valign="top">An object can be part of more than one aggregate object at the same time.</td></tr>
<tr>
<td align="left" valign="top"><bold>Transitivity</bold></td>
<td align="left" valign="top">If an object A is part of an aggregate B, and B is in turn part of another aggregate C, then A is also part of C. The opposite characteristic is <bold>Intransitivity</bold>.</td></tr></tbody></table></table-wrap>
<table-wrap id="t2-sensors-12-12126" position="float">
<label>Table 2.</label>
<caption>
<p>Composition of properties.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="top"><bold>⊗</bold></th>
<th align="center" valign="top"><bold>hasClass</bold></th>
<th align="center" valign="top"><bold>partOf</bold></th>
<th align="center" valign="top"><bold>placedIn</bold></th></tr></thead>
<tbody>
<tr>
<td align="center" valign="top"><bold>hasClass</bold></td>
<td align="center" valign="top">hasClass</td>
<td align="center" valign="top">partOf</td>
<td align="center" valign="top">placedIn</td></tr>
<tr>
<td align="center" valign="top"><bold>partOf</bold></td>
<td align="center" valign="top">partOf</td>
<td align="center" valign="top">partOf</td>
<td align="center" valign="top">placedIn</td></tr>
<tr>
<td align="center" valign="top"><bold>placedIn</bold></td>
<td align="center" valign="top">placedIn</td>
<td align="center" valign="top">placedIn</td>
<td align="center" valign="top">placedIn</td></tr></tbody></table></table-wrap></sec></back></article>
