Citizen Science and Topology of Mind: Complexity, Computation and Criticality in Data-Driven Exploration of Open Complex Systems

Recently emerging data-driven citizen sciences need to harness an increasing amount of massive data of varying quality. This paper develops essential theoretical frameworks, example models, and a general definition of a complexity measure, and examines its computational complexity for interactive data-driven citizen science within the context of guided self-organization. We first define a conceptual model that incorporates the quality of observation in terms of accuracy and reproducibility, ranging between subjectivity, inter-subjectivity, and objectivity. Next, we examine the database's algebraic and topological structure in relation to informational complexity measures, and evaluate its computational complexities with respect to exhaustive optimization. Conjectures on criticality are obtained for the self-organizing processes of observation and dynamical model development. An example analysis is demonstrated with a biodiversity assessment database, a process that inevitably involves human subjectivity for management within open complex systems.


Introduction
Recent innovation in information and communication technologies (ICT) embedded in real environments is drastically changing the way society interacts with computation. This has been described as the fourth industrial revolution [1]. In particular, ubiquitous sensors and mobile communication tools have led to an increasing capacity for distributed and interactive environmental sensing. These technological supports bring new effective methodologies to tackle complex self-organising behaviours in social-ecological systems that are difficult to understand with conventional modelling and simulation approaches (e.g., [2] [3]). Massive amounts of sparse and heterogeneous data based on internal observation from within various collective phenomena call for an extended analytical framework, ranging from objective measurements, such as those from sensors, to subjective data, such as human evaluations and feedback.
Redefining a standard formalization of computation and its complexity associated with self-organised citizen science can raise multiple criteria for the evaluation of critical phenomena, spread over the dynamical processes of observation, management, and knowledge formation in open complex systems [4] [5]. Self-organised criticality appears in various natural and social phenomena, often with scale-free statistical properties [6] [7]. These manifest as power laws, which can be reduced to a simple combination of inherent stochastic processes [8], and whose realizations provide proxies of emergent functionality (e.g., [9] [10] [11]). The large fluctuation of a power law distributes the statistical complexity over multiple scales that cannot be represented by a simple mean value for predictive purposes. Sampling a time series from a power-law distribution encounters intermittent shifts of the sample average due to the infinite variance of the distribution, even with the upper-bounded power laws of the real world, e.g., the magnitude distribution of earthquakes. This situation indicates a statistical limit on prediction solely by modelling and simulation of the phenomena, but also presents a positive reason to engage human elements as a practical solution in actual management, especially where semantic and cognitive judgements are involved [12] [13]. On the technology side, machine learning models have long attempted to optimize the prediction of unknown stochastic sources, implementing interactive estimation processes to exploit the hidden causal structure of temporal observation sequences (e.g., [14]). Modelling studies of guided self-organization have recently been explored with implementations in robotics, simulated neural networks, networks of agents, etc. [15]. Although most of these achievements are discussed within the predictability of a confined experimental setting, a hybrid system combining human and computational elements always underlies real-world situations, and this synergy has been little exploited, except for some prototypical interfaces for the Internet of Things (e.g., [16]). For cost-effective monitoring and control within restricted resources, guided criticality should be introduced on the user side of technology, in order to migrate and abstract the decision-making process from computation to human ability [3] [4] [17].
In particular, in solving global agendas such as the sustainability goals, a comprehensive approach is required that makes use of the full potential of self-organisation in coupled social-ecological systems [5] [18] [19]. These efforts practically take on the engagement of citizens and multi-disciplinary stakeholders as important actors in data acquisition and in the implementation of interactive management through guided self-organization, as a novel type of collective intelligence in the era of the fourth industrial revolution [3] [20] [21].
In facing the transition of data-driven citizen science towards the achievement of dynamical control in managing real-world open complex systems, this article presents fundamental theories and example models to support the discussion of complexity, computation, and criticality in the most general possible form. We formalize the basic objectives as follows, which are explored in the correspondingly numbered sections:
• Section 1: How can we formalize and treat databases of varying quality from both machine and human observations, ranging from subjective bias to objective fact? How can we set up scientific measures that assure compatibility with the principles of accuracy and reproducibility?
• Section 2: How can we generalize the concept of complexity measures in application to the human-computer hybrid systems of citizen science?
• Section 3: What is the nature of computational complexities in actual data processing?
• Section 4: What is the general condition to yield guided self-organization for cost-effective citizen science?
Although these questions are universal across multiple industries, a common basis for understanding the problems and the mutual development of ICT infrastructure are still isolated, proceeding independently in each sector. Through the exploration of these topics, this paper attempts to provide a common terminology and establish a theoretical basis for the realisation of cost-effective citizen science in open complex systems situations. This is becoming increasingly important for solving transdisciplinary problems through the participation of multiple stakeholders in the real world [5].

Inter-Subjective Objectivity Model
We first consider the expression of the quality of data ranging between human subjectivity and machine objectivity in the general form of a database X. As a premise, any information that can be represented in digital computing is compatible with natural number theory. At the infinite limit of computational memory, the representation of the database extends to general sets on a real data type with countably infinite precision, which accepts the definition of a σ-finite measure in a measure-theoretical formulation. We define the general form of an arbitrary database X as follows:

X ⊆ R^n × S^m,   S^m := S_1 × S_2 × · · · × S_m,

where R is a real data type, S^m is the product of m sets {S_i}_{i=1,2,...,m} of arbitrary symbolic sets S_i = {s_1, s_2, . . ., s_{l_i}}, with the dimensions n, m and l_i being natural numbers N including 0. Any variable in this article is assumed to be storable in X. For mathematical simplicity, we hereafter consider the real data type R as a real number. In practice, R^n describes the values of n real variables (such as time, spatial coordinates, probabilities, etc.), and S^m represents m discrete sets of symbols (such as the names of variables, occurrences of discrete variables, text data, etc.). Obviously, S^m ⊆ R^m holds in mathematical simplification, but we separate the notations to distinguish between quantitative and qualitative variable types.
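As a minimal sketch, the mixed real-symbolic form of X above can be represented as typed records; the field names and example values below are illustrative assumptions, not data from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Record:
    """One element of X ⊆ R^n × S^m: n real fields and m symbolic fields."""
    reals: Tuple[float, ...]     # R^n, e.g., (unix time, latitude, longitude)
    symbols: Tuple[str, ...]     # S^m, e.g., (species name, observer id)

# A toy database X with n = 3 real variables and m = 2 symbolic variables.
X = [
    Record((1654041600.0, 35.68, 139.69), ("Passer montanus", "observer_1")),
    Record((1654128000.0, 34.69, 135.50), ("Corvus corone", "observer_2")),
]

# S^m ⊆ R^m: symbols can always be embedded into the real data type,
# but keeping distinct types separates quantitative from qualitative data.
species = sorted({r.symbols[0] for r in X})
encoding = {s: float(i) for i, s in enumerate(species)}
```

The frozen dataclass keeps records hashable, so sets and dictionaries over X behave like the symbolic sets S_i.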
1.1. Formalization of Subjectivity, Inter-Subjectivity, Subjective-Objective Unity and Objectivity

Digital data X from citizen science vary from subjective human perception to objective sensor measurement, with different degrees of human-induced bias. Here, subjectivity and objectivity matter because they influence the accuracy and reproducibility of data, which are fundamental to establishing scientific analysis. We formalize the nature of observation variables between subjectivity, objectivity, and their interactions as follows:
• Subjectivity is the quality of observation that is based on human perception without the substantial support of a machine.
• Inter-Subjectivity is the degree of commonality between the subjectivities of multiple subjects.
• Objectivity is the quality of observation that is based on a machine measurement whose outcome does not depend on the operator's will.
• Subjective-Objective Unity is the degree of commonality between subjectivity and objectivity.
• Inter-Subjective Objectivity is the quality of observation that satisfies the coincidence of both inter-subjectivity and subjective-objective unity.
These definitions follow basic concepts in philosophy and social science, adapted to the situation of data analysis. The concept of subjectivity is commonly used in philosophy as the collection of the perceptions, experiences, expectations, and personal or cultural understandings and beliefs specific to a person, which influences, informs, and biases people's judgments and evaluations. In contrast, objectivity refers to a view of truth or reality which is free from any individual's influence [22]. The simplest form of inter-subjectivity in social science employs the term in the sense of having a shared definition of an object, or shared subjectivity [23].
The relations between these classifications are shown in Fig. 1(a). For example, text data written by humans are subjective data, whether or not the fact described is based on an objective phenomenon. Sensor logs are objective data, even when measured on a human body, such as a heart rate that could be influenced by subjective thought. When multiple subjects give the same subjective evaluation, such as a rating of web content, the commonality augments the degree of inter-subjectivity, which is often adopted for crowd-sourced data validation (e.g., [24] [25]). When a subjective evaluation coincides with an objective measurement, the commonality represents the degree of subjective-objective unity. A highly reproducible subjective-objective unity can provide on-site practical measurement in field science, typically in biodiversity assessment and soil texture analysis (e.g., [25] [26]). This is because these plausible subjective-objective unity measures also coincide with high inter-subjectivity after sufficient training, which guarantees the accuracy of on-site application without confirming the accordance with objective measurement each time. When the methodology is highly established with respect to accuracy and reproducibility, it belongs to inter-subjective objectivity, where each subjective and objective measurement converges to the same result. The developmental process of reproducible subjective evaluations that converge with objective measurements is depicted in Fig. 1(b). By training the subjective-objective unity of each human observer, their inter-subjectivity increases, and the commonality of measurement grows into a self-organizing loop between subjective-objective unity and inter-subjectivity, whose mutual feedback attains a higher degree of inter-subjective objectivity.
Note that in a philosophical generalization, such as phenomenology, all data are derivatives of subjectivity, because a machine observation is also constructed on human perception in the establishment of the measurement principle, the construction of sensing devices and data processing workflows, and the final interpretation. To avoid trivial arguments that do not affect the reproducibility of the results, we adopt the standpoint that separates subjectivity and objectivity by the degree of intervention in the observation outcome between human and machine. We call this conceptual model the inter-subjective objectivity model.

Representative Model: Buoy-Anchor-Raft Model
In order to apply the inter-subjective objectivity model to a quantitative framework of actual data processing, we develop a general example model with more familiar and analogical terminology that is intuitively easier to understand: the buoy-anchor-raft model, schematically expressed in Fig. 2. The definitions and correspondences to the inter-subjective objectivity model are as follows:
• Buoy refers to subjective data that fluctuates on the sea surface, representing subjectivity. A buoy can provide subjective estimates of an observation object lying on the objective sea floor, but the observation is biased by subjective fluctuations.
• Anchor refers to objective data that is fixed on the sea floor, representing objectivity, without influence from the subjective sea surface. Anchors can be connected to buoys, which provides the evaluation of subjective fluctuation with respect to objective machine measurements.
• Raft represents the relationships between buoys, and refers to the inter-subjectivity of data without reference to anchors. A buoy can evaluate another buoy using the relative difference of fluctuation on the subjective sea surface, and the overall commonality between buoys is represented as the raft. Nevertheless, it is based on internal observation between buoys without an objective system of units, and is therefore susceptible to a global drift of the collective standard.
• A Buoy-Anchor connection rope defines the degree of subjective-objective unity. The more a buoy's movement is controlled by its anchor, the higher the subjective-objective unity that is assured.
• Raft-Anchor connection ropes define the degree of inter-subjective objectivity. In addition to the commonality between buoys represented as a raft, the effect of global drift from the subjective sea surface can be controlled with anchors within a plausible range of error with respect to the objective sea floor.
Concrete examples of the buoy, anchor, and raft in various social systems and scientific domains are given in Tab. 1. While inter-subjective objectivity is a conceptual framework that classifies the quality of observation, the buoy, anchor, and raft refer to actual constructs of databases implemented with ICT. The terms arose from the developmental process of management systems in open systems science [5], sharing a perspective with the transversal question of the grand challenge of AI research regarding the effective extraction of scientific knowledge from heterogeneous data of varying quality [27]. Without properly positioning the subjective background of a study, it is often the case that knowledge established with large-scale experiments and statistical analyses proves false in high-throughput, discovery-oriented research, resulting in a null field with statistically prevailing bias [28]. As shown in Tab. 1, conceptual problematics for the implementation of ICT in various fields can be mutually characterized with the use of the buoy-anchor-raft model. This means the ICT infrastructure can be applied and shared in a synergistic way across domains, which is beneficial especially for the open-source development advocated in complex systems science [21]. Recent developments in Application Programming Interfaces for big data integration have increased the support for this challenge, which calls for a general theoretical framework of information processing that the buoy-anchor-raft model can provide (e.g., [29]).

We then consider a mathematical expression of the buoy-anchor-raft model, in view of providing a simplified idea of computation with respect to the evaluation of inter-subjective objectivity. Recently emerging contexts of citizen science make use of buoys as important information sources, in contrast to objective sciences such as traditional physics, which are usually self-contained with anchors. Buoys fluctuate with human subjectivity, which is scientifically called bias. Suppose we cannot directly measure observation objects as anchors. This constraint does not necessarily arise from the observation principle but rather from resource limitations: for example, a field evaluation of biodiversity mostly depends on human observation because massive DNA barcoding is too costly or even ineffective. The accuracy of buoy data should therefore be evaluated with other buoy-anchor connections compatible with the observation objects. By defining buoy data B ⊂ X and corresponding measurable anchor data A ⊂ X, a buoy-anchor connection C can be defined as an error function erf(·) between A and B:

C := erf(A, B).

In the case of n observation objects for one observer, a typical example of a buoy-anchor connection c ∈ R is given by the regularized mean squared error:

c = (1/n) Σ_{i=1}^{n} (b_i − a_i)² / σ²,

where b_i ∈ B and a_i ∈ A are the subjective and objective measurements of object i, and σ is a regularization factor such as the standard deviation of the errors. The regularization makes c accessible to canonical evaluations of confidence intervals such as the t-test.
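A minimal sketch of the buoy-anchor connection for one observer and n objects. Since the text names a "regularized mean squared error" without an explicit formula, the normalization by the error variance below is an assumption, chosen so that c is scale-free:

```python
def buoy_anchor_connection(buoy, anchor):
    """Buoy-anchor connection c = erf(A, B): mean squared error of the
    subjective estimates (buoy) against the objective measurements
    (anchor), regularized by the error variance (our assumption)."""
    n = len(buoy)
    errors = [b - a for b, a in zip(buoy, anchor)]
    mean_err = sum(errors) / n
    mse = sum(e * e for e in errors) / n
    var = sum((e - mean_err) ** 2 for e in errors) / n
    return mse / (var if var > 0 else 1.0)

# Hypothetical field data: subjective biodiversity scores (buoy)
# vs. objective measurements such as DNA-barcoding counts (anchor).
c = buoy_anchor_connection([3.1, 4.8, 2.2, 5.9], [3.0, 5.0, 2.0, 6.0])
```

A smaller c indicates higher subjective-objective unity; a perfectly anchored buoy gives c = 0.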
As a generalization to m observers, let us describe

C = (c_1, · · · , c_m),

where c_k is the buoy-anchor connection of observer k, given that each observer evaluates the same n observation objects.

Next, we consider the raft model. In most social systems, the case-wise precise measurement of anchors is impossible, and we call for the raft of common sense and other social feedbacks as a premise of plausible judgement. Consider m observers with somehow quantifiable opinions (buoys) on n observation objects. We define the raft matrix R as follows, as a generalization of buoy data to m observers and n observation objects:

R = (b_{ij}) ∈ R^{m×n},   i = 1, · · · , m,   j = 1, · · · , n,

where the raft by definition refers to the commonality contained between these buoys. In a completely equal society where every observer's opinion is equally respected, we obtain the mean inter-subjective evaluation E = (e_1, · · · , e_n) on the n objects as follows:

e_j = (1/m) Σ_{i=1}^{m} b_{ij}.

Decision making based on the evaluation of the raft can represent the community's mean quantifiable opinions, although it is not free from collective bias: it remains only within the framework of inter-subjectivity. With the buoy-anchor connections C, the evaluation can be weighted, for example, as

e_j = Σ_{i=1}^{m} w_i b_{ij} / Σ_{i=1}^{m} w_i,   w_i = −log c_i.

This means that the error function of the buoy-anchor connection is reflected as an entropy that represents the subjective-objective unity of each observer. The opinion of an observer with higher subjective-objective unity is weighted according to the informational scarcity of subjective errors. Such integrated evaluations incorporating a scoring system on observers' quality are one of the general solutions in web-based citizen science (e.g., [25]). Note that the n objects of observation can also coincide with the m observers themselves. As C can be obtained independently from R, the model also accepts subjective objects of observation for which direct anchors do not exist, such as psychological states or the quantification of qualia, as in QFD [30] and pain scales [31]. In such cases, traditional methods employ only the simple raft evaluation E without anchors, as formalized above. In contrast, with the buoy-anchor-raft model, it is possible to relate indirect anchors to other related, objectively quantifiable variables by expanding the database into a more comprehensive system. In either case, this model provides access to inter-subjective objective evaluation by properly defining the buoy, anchor, raft, and their connections.
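The raft evaluations above can be sketched as follows. The equal-weight mean follows the text directly; the entropy-style weight w_i = −log c_i (for buoy-anchor errors c_i ∈ (0, 1)) is our assumption for "informational scarcity", since the exact weighting formula is not spelled out:

```python
import math

def raft_mean(R):
    """Mean inter-subjective evaluation E = (e_1, ..., e_n) of the raft
    matrix R (m observers x n objects), every observer weighted equally."""
    m, n = len(R), len(R[0])
    return [sum(R[i][j] for i in range(m)) / m for j in range(n)]

def weighted_raft(R, C):
    """Anchor-informed evaluation: observer i weighted by -log(c_i),
    so a smaller buoy-anchor error (higher subjective-objective unity)
    means a larger weight. Assumes 0 < c_i < 1."""
    w = [-math.log(c) for c in C]
    Z = sum(w)
    m, n = len(R), len(R[0])
    return [sum(w[i] * R[i][j] for i in range(m)) / Z for j in range(n)]

R = [[4.0, 2.0], [2.0, 4.0]]        # raft matrix: 2 observers, 2 objects
E = raft_mean(R)                    # equal society: [3.0, 3.0]
Ew = weighted_raft(R, [0.1, 0.5])   # observer 1 has the smaller error
```

In the example, the two opinions cancel under equal weighting, while the anchor-informed weighting pulls the evaluation towards the observer with higher subjective-objective unity.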
The correspondence between the buoy-anchor-raft model and the computational variables developed in the following sections is listed in Tab. 2.

Complexity Measures
We consider the generalization of complexity measures with respect to the essential information processing in citizen science, based on the inter-subjective objectivity model with buoy-anchor-raft constructs. The concept and definition of complexity vary according to the field: algorithmic complexity, statistical complexity, biological complexity, etc. In this paper, we take a generalized definition of a complexity measure as a projection from a system's variables to a one-dimensional quantity, composed to express a distinctive characteristic of the system [32]. This includes classical indices mentioned in the context of complexity, as well as various forms of information expressed as numbers in ICT, such as the feature dimensions of machine learning. We consider general forms of complexity defined on the database X in relation to the search function. Complexity measures are widely studied in information theory, with the underlying principle of abstracting a low-dimensional representative index of useful features for the functional characterization of complex systems [32]. Usually, complexity measures defined on n real variables are epimorphisms onto the one-dimensional real number line, R^n → R. The general complexity measure for citizen science is therefore a projection of the database to a real-valued index, X → R, with the condition that this transformation provides some utility for management.
The importance of utility depends on the need for information retrieval in the citizen science process, or on the conditions that are practically used in a database search. Indeed, the search function is the retrieval of the data set corresponding to a given condition, such that

S_R = {x ∈ X | Q(x)},

where S_R stands for the search result on the database X with search query Q(·). For example, Q(·) is an if-then construct that can specify the value range of real variables, or the matching with a specific symbolic sequence, which returns the corresponding data sets in S_R.

In order to perform computations such as the evaluation of the buoy-anchor-raft model, the integral I of a σ-finite measure µ on X with respect to the condition Q(·) can be defined as follows, with the indicator function 1(·|Q(·)):

I = ∫_X 1(x|Q(x)) dµ(x),

where 1(x|Q(x)) = 1 if Q(x) holds and 0 otherwise. In the one-dimensional case, µ can represent either a buoy or an anchor. If we define µ : X → R through the occurrence probability p(·) of x ⊂ X, such as

µ(x) = −p(x) log p(x),

then I coincides with the entropy, one of the typical information-theoretical complexity measures. µ can also include joint distributions, such that with µ : X × X → R,

µ(x, y) = p(x, y) log [p(x, y) / (p(x) p(y))],

in which case the mutual information I_2 can incorporate raft, buoy-anchor, and raft-anchor connections. While a search query Q(x) provides a value of the complexity measure I, we can also inversely use a function G that generates all possible queries {Q(x)} which return the sets of x associated with the given value of the complexity measure I. For example, we can search for the data sets whose entropy is higher than a threshold I_c by setting

{Q(x)} = G(I > I_c).

Nevertheless, complexity measures that specifically define an arbitrary Q(x) are generally not given explicitly. In practice, we usually compare the performance of known complexity measures with respect to their ability to characterize the features on which we focus our analysis. The general task is to invent a novel complexity measure that can exclusively separate patterns in X, given implicitly as Q(x). For that purpose, the following theorem holds:

Theorem 1. For any search condition Q(x), we can construct an exclusively selective complexity measure I which can sort out effects from other variables, with a function G(·) : R → {Q(x)}, such that

G(I) = {Q(x)},   I = G^{-1}({Q(x)}).

The definition of the invertibility of G follows that of S_R.
Proofs of the theorems are given in Appendix.
The intuitive geometric meaning of the inverse function relationship between complexity measures and search function is shown in Fig. 3.
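A small sketch of these definitions: entropy as a complexity measure I on a data set, the search function S_R, and the inverse use of generating queries whose results exceed a threshold I_c. The toy data and the family of candidate queries are illustrative assumptions:

```python
import math
from collections import Counter

def entropy(data):
    """I = -sum p(x) log p(x): a typical complexity measure, data -> R."""
    counts, total = Counter(data), len(data)
    return -sum((k / total) * math.log(k / total) for k in counts.values())

def search(X, Q):
    """Search function: S_R = {x in X | Q(x)}."""
    return [x for x in X if Q(x)]

X = ["a", "a", "a", "b", "a", "b", "c", "a"]
I_c = 0.5

# Inverse direction: among candidate queries (here, excluding one
# symbol each), keep those whose search result has entropy above I_c.
candidates = {s: (lambda x, s=s: x != s) for s in set(X)}
high_entropy_queries = [s for s, Q in candidates.items()
                        if entropy(search(X, Q)) > I_c]
```

In general the query family must be enumerated or parameterized; the exhaustive scan here stands in for the inverse function G of Theorem 1 on a finite candidate set.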

Observation Commonality as Complexity
Inter-subjective objectivity is based on the commonality among subjectivity, inter-subjectivity, and objectivity. The essential computation is therefore the search for commonality between different observation data sets, whether from humans or machines. We consider the observation commonality to be a complexity measure that conforms to inter-subjective objectivity, and analyze its general mathematical structure.
We consider σ-finite probability measures µ_1, µ_2 on the measurable database space (X, B), where B stands for the Borel σ-algebra of X. The convolution * of µ_1 and µ_2 is then defined as follows:

(µ_1 * µ_2)(E) = ∫∫ 1_E(x + y) dµ_1(x) dµ_2(y),   E ∈ B(R),

where B(S) and B(R) represent the σ-algebras of S ⊂ X and R ⊂ X, respectively. Through an appropriate variable transformation, the convolution of probability measures with real-type variables can be expressed as follows, as the probability of the sum of the variables [33]:

(µ_1 * µ_2)(E) = ∫ µ_1(E − y) dµ_2(y).

By choosing finite sets of x, such as a time period, geographic range, or other real-type variable ranges, as well as symbols for {s_i} such as the name of an observation object, one can define the commonality of observations as part of the convolution of the probabilities from different observers. The observations µ_1 and µ_2 can be of any nature between subjectivity, inter-subjectivity, and objectivity.
We now consider the condition of valid observation with respect to the regularization of the probability measure as follows, for a general number of observers i ∈ {1, · · · , N}:

µ_i(R) = 1.

This means that, by expanding the scale of the real-type variable to infinity, one observes its occurrence with probability 1. The same formalization also applies to a σ-finite measure on (S, B(S)), which is integrated into the formalization with (R, B(R)).
Next, consider a confined variable range r ⊂ R with positive probability measure µ_i(r) > 0. This range can be of any complex form as long as it supports positive measure. In a real situation, it can correspond to intermittent observation time intervals, scattered geographical ranges, and other discrete ranges of the real-type variable. We define the rate of observation q_i by observer i within the variable range r as

q_i(r) := µ_i(r),

which converges to the valid-observation condition µ_i(R) = 1 as r → R.
The commonality of observation between two observers i, j based on r is expressed as the following convolution confined to r:

λ_2(r_2) := ∫∫ 1(x + y ∈ r_2 | x, y ∈ r) dµ_i(x) dµ_j(y) = q_i(r) q_j(r),

which also means taking the sum of the joint distributions µ_i · µ_j between all smallest measurable events in r. The additional condition x, y ∈ r in 1(·) limits the integral of each variable within r, which includes the formal condition x + y ∈ r_2, with r_2 := {x + y | x, y ∈ r}. The following generalization holds:

Theorem 2. For N independent and valid observations µ_i(r) > 0 (i = 1, · · · , N),

λ_N(r_N) = Λ ∏_{i=1}^{N} q_i(r),

where the coefficient Λ is a free parameter that remains invariant under the convolution. Then

(λ_N(r_N)/Λ)^{1/N} = (∏_{i=1}^{N} q_i(r))^{1/N}.

This means that the N^{-1}-th power of the multiple convolution λ_N(r_N) represents the geometric mean of the N independent valid observation rates. By choosing the regularization factor Λ, r_N corresponds to the x_i as a variable.
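Theorem 2 can be checked numerically for Λ = 1: with independent observers, the confined convolution factorizes into the product of the observation rates q_i(r), whose N-th root is their geometric mean and whose logarithm is additive. The rates below are hypothetical:

```python
import math

q = [0.8, 0.5, 0.2]          # hypothetical observation rates q_i(r), N = 3
N = len(q)

lam = 1.0                    # λ_N(r_N) = Λ · Π q_i(r) with Λ = 1
for qi in q:
    lam *= qi

geometric_mean = lam ** (1.0 / N)

# Logarithmic scale: the information of λ_N is the sum over observers.
log_lam = sum(math.log(qi) for qi in q)
```

The additive logarithmic form is what makes the commonality fast to accumulate over many observers in practice.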
With the use of the logarithmic scale, the information of λ_N(r_N) is the sum of that of the individual observations:

log λ_N(r_N) = log Λ + Σ_{i=1}^{N} log q_i(r).

As a similar property related to the geometric mean, note that the following Young's inequality also holds:

||µ_i * µ_j|| ≤ ||µ_i|| ||µ_j||,

where || · || denotes the total variation. This assures us that the variation of the commonality remains within the order of the product of each observation's variation. However, it is important to note that, as a general property of convolution, the equality only holds in the case r → R or for special forms of µ_i(r); the general case requires direct calculation without relevance to q_i(r). In order to obtain a fast computable form, the following asymptotical generalization holds:

Theorem 3. For N independent valid observations with means ν_i and variances σ_i², the density of the multiple convolution λ_N(r_N) with respect to the Lebesgue measure m(·) on R converges almost everywhere, as N → ∞, to the normal probability density distribution N(ν_N, σ_N²) with mean ν_N = Σ_{i=1}^{N} ν_i and variance σ_N² = Σ_{i=1}^{N} σ_i²:

N(ν_N, σ_N²)(x) = (1/√(2πσ_N²)) exp(−(x − ν_N)² / (2σ_N²)).

A numerical example of the convolution λ_N(r_N) is presented in Fig. 4. Theorems 2 and 3 can be directly generalized to R^n (n ∈ N), with r ⊂ R^n.

Topological Structure of Complexity 1: Total Order of Observations

We consider the topological structure of inter-subjective objectivity based on the complexity defined as the convolution between different observations. As the commonality within inter-subjective objectivity is defined between multiple different observations, a topological ordering based on these complexity measures is possible with N > 2 observations of any nature.
We consider the commonality space with respect to each observation data set as a point, and the commonality between them as the distance between each pair of points. This can be considered as the undirected complete graph with N vertices, with the pair-wise complexity measures as the lengths of its N C 2 edges. The general property of Euclidean space allows a complete graph of size N to be embedded in N − 1 dimensions (e.g., any line between 2 points is a 1-dimensional space, and any triangle with 3 points is a 2-dimensional surface, etc.), although additional quantitative restrictions such as the triangle inequality on each triplet of edges are required. In order to treat an arbitrary set of complexity measures and yield general characteristics of the commonality space, we focus not on the actual values of complexity but on the topological order between them.
Let us first consider the total order between complexity values with N > 2 observation data contained in N vertices V := {v_i}_{i=1,···,N}. One can determine the total order between the N C 2 edges by taking a mean order relationship between each pair of edges with the following algorithm (namely, the pair-wise order algorithm):
1. For each pair of edges {e_i, e_{j≠i} ∈ E}, calculate the order relation e_i ≤ e_j or e_i ≥ e_j with respect to the given complexity measure as an edge attribute such as length.
2. Score each edge e_i by mapping to an integer z : e_i → Z, adding +1 if e_i ≥ e_{j≠i} and adding −1 if e_i ≤ e_{j≠i}, with respect to all other edges e_{j≠i}.
3. Sorting with the scores {z(e_i)} provides the total order of E.
Note that the quantitative difference is completely lost in the case of antisymmetry, (e_i = e_j) ≡ (e_i ≤ e_j) ∧ (e_i ≥ e_j). We will consider the meaning of this information loss with respect to other compatible sets of observations in Section 2.4.
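The pair-wise order algorithm can be sketched directly from steps 1-3; the edge lengths in the example are hypothetical:

```python
def pairwise_order(lengths):
    """Total order over edges E from pair-wise comparisons of an edge
    attribute (e.g., length as a complexity measure). Each edge gains
    +1 per comparison it dominates and -1 per comparison it loses;
    sorting by the score z gives the total order. Ties (antisymmetry)
    collapse, losing the quantitative difference, as noted in the text."""
    E = list(lengths)
    z = [0] * len(E)
    for i in range(len(E)):
        for j in range(len(E)):
            if i == j:
                continue
            if E[i] >= E[j]:
                z[i] += 1
            if E[i] <= E[j]:
                z[i] -= 1
    return sorted(range(len(E)), key=lambda i: z[i])

order = pairwise_order([2.0, 0.5, 1.0])   # edge indices, ascending order
```

Only the resulting order, not the scores themselves, is carried forward into the topological analysis.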
Next, we consider the topological order of complexity for N > 2 observations according to the total order of these commonalities. We need here to translate the total order between the edges E to that of the observations V. This can be obtained by calculating the N C 3 triplets of N > 2 vertices and the associated total order of edges, with the following algorithm (namely, the triplet order algorithm, schematically represented in Fig. 5): for each triplet of observations V_{i,j,k} := {v_i, v_{j≠i}, v_{k≠i,j} ∈ V} and associated edges {e_i := {v_i, v_j}, e_j := {v_j, v_k}, e_k := {v_k, v_i}}, update the score of each observation by mapping to an integer z : V_{i,j,k} → Z with six rules determined by the total order of the triplet's edges. The commonality order of V represents the topological structure of collective intelligence in citizen science with respect to inter-subjective objectivity, which corresponds to the topological inclusion relation of the Venn diagram in Fig. 1.

Topological Structure of Complexity 2: Permutation between Total Orders of Observations
We expand the situation to two sets of N > 2 observations, namely observations I and II. For example, observers I and II observing N objects, or N observers observing two different objects I and II. It can also represent the application of two different complexity measures I and II to N observations. For simplicity, we limit the formalization to two sets of N > 2 observations, but generalization to a greater number of sets is possible.
In the general case, the total orders I and II do not necessarily coincide. The relationship between two total orders of N observations can be described by a permutation of N elements (Fig. 6(a)). In order to analyze the permutation between total orders, let G_N be the symmetric group of degree N. For g ∈ G_N, we define a linear transformation L_g : S^N → S^N by

L_g(x_1, · · · , x_N) = (x_{g(1)}, · · · , x_{g(N)}),

which describes the permutation between the commonality orders I and II.
We define a subspace S̄(g) of S^N by

S̄(g) := Im(L_g − Id),

which represents the subspace containing the compromise of the total order, while by defining its complementary subspace

S(g) := Ker(L_g − Id) = {x ∈ S^N | L_g x = x},

we obtain the subspace in which there is no compromise, i.e., the complete matching of the two commonality orders. The whole commonality space can be divided into S̄(g) and S(g):

S^N = S̄(g) ⊕ S(g).

As depicted in Fig. 6(a) and (b), the compromise between two commonality orders is expressed as a non-linear folding relationship between them. Under the assumption that the complexity measure is a continuous function, the integrated complexity measure that supports both commonality orders can be expressed as a folded structure, topologically speaking, in the shape of the letter "N" (also the capital letter of Non-identical), taking the commonality measures of I and II as affine coordinates. The example with a red dotted line in Fig. 6(b) shows that we can compose an integrated commonality measure by bending the commonality measure II into an "N" shape with respect to that of I kept straight (in an "I" shape, for Identical), which resolves the compromise. The "N"-shape transformation of the commonality measure changes the topology of the commonality order with respect to a permutation g ∈ G_N (g(i) > g(j), 1 ≤ i < j ≤ N), while the "I" shape represents the identical order (g(i) < g(j), 1 ≤ i < j ≤ N). The non-compromising part of the two commonality orders conserves its order under projection onto any linear combination of the two commonality measures, which topologically does not require "N"-shape folding but maintains "I"-shape matching.
For simplicity, we call the topological compromise between commonality orders the I-N compromise, and the topologically identical matching the I-I matching. The I-I matching subspace S̄(g) can then be obtained as the linear combination of commonality measures I and II, and the subspace required for the resolution of the I-N compromise corresponds to the complementary space S(g) (Fig. 6(b) and (c)).
We call S̄(g) an I-I space consisting of I-I dimensions, and S(g) an I-N resolution space consisting of I-N resolution dimensions. The mean commonality order of the two commonality orders projected onto the I-I space (red solid arrows in Fig. 6(b) and (c)) can be obtained with the pair-wise order algorithm of section 2.3, applied not to the commonality itself but to the commonality orders. We call this the I-N mean commonality order, since it adopts the mean total order of commonality orders I and II, resolving the I-N compromise. Note that the information lost through the antisymmetry of the pair-wise order algorithm does not affect the division into I-I and I-N resolution subspaces. Geometrical representations of the I-N compromise, the I-I matching, their corresponding dimensions and spaces, and the I-N mean commonality order are given in Fig. 6.
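The decomposition above can be sketched concretely. In this hypothetical encoding (not from the paper), the two total orders are ranked lists of the same N observations; the permutation g ∈ G_N maps ranks in order I to ranks in order II, pairs whose relative order is preserved form the I-I matching, and inverted pairs are the I-N compromises to be resolved:

```python
# Hypothetical sketch: two total orders of the same N observations as ranked
# lists. g[i] is the rank in order II of the observation ranked i in order I;
# concordant pairs form the I-I matching, inverted pairs the I-N compromise.
from itertools import combinations

def permutation_between(order_I, order_II):
    """g[i] = rank in order II of the observation ranked i in order I."""
    rank_II = {obs: k for k, obs in enumerate(order_II)}
    return [rank_II[obs] for obs in order_I]

def split_pairs(g):
    """Split all pairs i < j into concordant (I-I) and inverted (I-N)."""
    ii, in_compromise = [], []
    for i, j in combinations(range(len(g)), 2):
        (ii if g[i] < g[j] else in_compromise).append((i, j))
    return ii, in_compromise

order_I = ["A", "B", "C", "D"]
order_II = ["A", "C", "B", "D"]
g = permutation_between(order_I, order_II)
ii, in_compromise = split_pairs(g)
```

Here a single inverted pair corresponds to a single I-N resolution dimension in the sense of Fig. 6(c).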
We finally consider a statistical test on the degree of coincidence (TDC) between 2 commonality orders.

Theorem 4. Statistical test on the degree of coincidence (TDC) between 2 commonality orders:
Given that commonality orders I and II over N observations follow a uniformly random permutation in G_N under the null hypothesis, the degree of coincidence d_c between the two commonality orders follows a binomial distribution:

d_c ∼ B(M, p), with M = N(N − 1)/2 = #({(i, j) | 1 ≤ i < j ≤ N}) and p = 0.5,

where B(M, p) signifies the binomial distribution with parameters M and p, k_{I-I} represents the degree of coincidence as the number of I-I matchings, #(·) returns the size of a set, and P[·] the probability of the degree of coincidence d_c.
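A minimal sketch of the TDC test follows, under the stated null hypothesis that the number of I-I matching pairs among M = C(N, 2) pairs follows B(M, 0.5). The p-value is computed from the binomial pmf with the standard library only; the function name is illustrative, not from the paper:

```python
# Sketch of the TDC test of Theorem 4: two-sided binomial p-value for the
# number of concordant (I-I) pairs out of M = C(n_obs, 2) under B(M, 0.5).
from math import comb

def tdc_p_value(k_ii, n_obs):
    """Two-sided p-value for k_ii concordant pairs among C(n_obs, 2) pairs."""
    M = comb(n_obs, 2)
    pmf = [comb(M, k) * 0.5 ** M for k in range(M + 1)]
    # sum the probabilities of all outcomes at most as likely as the observed one
    return sum(p for p in pmf if p <= pmf[k_ii] + 1e-12)

p_match = tdc_p_value(21, 7)   # perfect coincidence of the two orders (N = 7)
p_mid = tdc_p_value(10, 7)     # near-even split of concordant pairs
```

With N = 7 observers (as in the biodiversity example of section 5), a perfect coincidence is highly significant, while a near-even split is not.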
With respect to the buoy-anchor-raft model in section 1.2, the following correspondences are possible.

Computational Complexity
The computation of complexity measures and commonality orders depends on the exhaustive calculation of combinatorics between observations. The computational complexity of this calculation should also be investigated in terms of topological complexity, in order to yield a general theoretical platform that does not depend on the particularities of the database.

Topological Complexity of Commonality
First, we investigate the topological order of commonality among N observations. Using convolution as commonality (27), we define the maximum commonality order O : X → ℕ. The general topological structure of O(X) is depicted in Fig. 7.
Regarding the cardinality of O(X), the following holds: for any elaborated inter-subjective objectivity, there is always the possibility of developing another set of observations that attains a higher inter-subjective objectivity by increasing the dataset. This structure assures the representation of a paradigm shift in science, when sufficient contradicting evidence gains a majority over an old model. For example, minority reports in biology that may lead to novel discoveries in the future can be properly stored and distinguished from erroneous reports as more evidence accumulates [27].
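The idea that added observations can always raise the attainable commonality order can be illustrated with a small sketch. The set representation of observations below is an assumption for illustration, not the paper's formal definition via convolution:

```python
# Illustrative sketch: each observation reports a set of items; the maximum
# commonality order of an item counts how many observations jointly support
# it, and new observations can always raise the attainable order.
from collections import Counter

def commonality_orders(observations):
    """Map each reported item to the number of observations supporting it."""
    return Counter(x for obs in observations for x in set(obs))

obs = [{"frog", "heron"}, {"frog"}, {"frog", "newt"}]
orders = commonality_orders(obs)
more = commonality_orders(obs + [{"frog", "newt"}])  # new data raises orders
```

A minority report ("newt") is retained at a low order and rises as corroborating observations accumulate, rather than being discarded as error.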

Algorithmic Complexity
Secondly, we evaluate the computational complexity with respect to the computing time scale. Since data-driven citizen science requires real-time computation in a highly interactive manner with the observation process, the algorithmic complexity of calculating complexity measures is an essential limiting factor of performance. As commonality is based on the intersection of multiple observations, its exhaustive computation confronts combinatorial explosion as datasets grow.
Although the computation of complexity itself, or the resolution of a search query as a mathematical theorem, is provable and an algorithmic solution can be found, the computational resource is another practical issue for real-world implementation, especially in distributed observation. The computational time scale required for sorting a database according to a given utility such as commonality is listed in Table 3. Under a general condition with the observation probability database X_N of size N, X_N := {μ_i(x) | x ∈ X, i = 1, ..., N}, the maximum complexity lies in the calculation of the commonality order based on the intersection of ⌊N/2⌋ or ⌈N/2⌉ elements, whose sorting time belongs to factorial order of N. The case with N = 5 is depicted in Fig. 7. This means that an algorithmic burden exists for the calculation of middle-scale commonality with respect to the data size. As inter-subjective objectivity successfully increases in citizen science, this peaking of algorithmic complexity at intermediate scale may hinder the effective feedback necessary for guided self-organization. However, in a practical situation, the actual computation time may remain of polynomial order if the effective data size shrinks as the maximum commonality order increases: Theorem 6. Define the diminution rate of data combination ∆ : ℕ → ℝ with respect to the maximum commonality order; then the order of its product is upper bounded by the d-th root of the maximum computational complexity at the maximal order ⌊N/2⌋, where dim(·) returns the size of the database, and d > 0 represents the polynomial order of the algorithm O(N^d) with respect to the data size N.
From this result, we can conjecture a condition that, for commonality orders up to the maximal one, assures exhaustive feedback with polynomial response time of degree c. Since the left-hand side is usually based on past calculations at lower maximum commonality orders, we can verify interactively whether the information processing can assure comprehensive feedback. This adds a criterion on the criticality of guided self-organization mediated by computation, which will be explored in section 4.
A methodology other than exhaustive computing is to implement a local gradient algorithm as local interaction that leads to a global heuristic solution without top-down control. This can also be achieved by limiting the maximum commonality order, for example O(X) = k < N, which keeps the computational time within polynomial order O(N^{dk}).

Table 3. Following (39), the exhaustive number of combinations with the observation probability database X_N of size N, and the time scale required for sorting the commonality measure. Sorting time is based on the worst-case performance of canonical algorithms such as bubble sort and quicksort (polynomial degree d = 2). O(·) denotes Landau's asymptotic notation. O(X) = ⌊N/2⌋ and ⌈N/2⌉ require the maximum calculation and sorting time. Note that the total computation time is upper bounded by the sorting process (d = 2) rather than by the combinatorics of commonality (d = 1), though the calculation time of each commonality, such as a convolution, should be further considered in actual implementation.

Maximum Commonality Order O(X) | Number of Combinations | Sorting Time (d = 2)
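The peak described in Table 3 can be checked numerically. The sketch below (N = 5, as in Fig. 7) computes the number of k-wise combinations C(N, k) per commonality order and the worst-case d = 2 sorting cost on the peak:

```python
# Numeric check of Table 3's peak: C(N, k) is maximal at k = floor(N/2) and
# ceil(N/2); sorting that many candidate commonalities with a degree d = 2
# algorithm (e.g. bubble sort) costs on the order of C(N, k)^2 steps.
from math import comb

N = 5
counts = [comb(N, k) for k in range(N + 1)]
peak_orders = [k for k, c in enumerate(counts) if c == max(counts)]
worst_sorting_steps = max(counts) ** 2   # bubble-sort scale on the peak order
```

For N = 5 the counts are [1, 5, 10, 10, 5, 1], so the burden peaks at the middle-scale orders k = 2, 3.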

Big Data Integration
Thirdly, we consider the computational complexity required for big data integration. As open data becomes increasingly available in citizen science, the integration of massive databases from different resources has become one of the most important data processing methods. Conversion between different databases through an application programming interface is the basic protocol when a database is distributed over multiple servers.
The computation required in big data integration is the extensive calculation of commonality in the direct product of multiple databases. For simplicity, we consider the integration of two databases X_N and X_M, of sizes N and M ∈ ℕ. A joint distribution between subsets of X_N and X_M needs to be determined with respect to common parameters, in order to obtain an integrated database including the calculation of up to the (N + M)-th order of commonality, such as order-wise correlations [32]. Exhaustive computing follows the argument of section 3.2, giving the extension of Theorem 6: Theorem 7. Given the diminution rate of data combination ∆ : ℕ² → ℝ with respect to maximum commonality orders 1 ≤ i ≤ N ∈ ℕ and 1 ≤ j ≤ M ∈ ℕ during the integration of the two databases X_N and X_M, respectively, the order of its product is upper bounded by the d-th root of the maximum computational complexity, where d > 0 represents the polynomial order of the algorithm with respect to the data sizes N and M.
In this formalization, the computational complexity of database integration also confronts combinatorial explosion with respect to data size. Similarly to (42), we then explore a practical condition under which the effective maximum commonality order can be treated in polynomial time of degree c > 0. For that purpose, we set a uniform sparseness u (0 < u < 1) of random databases, representing the density of combinations that support the existence of commonality at each order, which keeps the diminution rates of data combination ∆ (40) and ∆ (43) invariant under the definition. With respect to the total size of the database after integration, L = N + M, the following holds: Theorem 8. As L → ∞ in random data (46), the mean condition of (45) over all {N, M | N + M = L} converges to an inequality that represents polynomial-time constraints on the computational complexity for the exhaustive calculation of newly emerging commonality orders within data size L, where * signifies the discrete convolution (20). Numerical observation of the proof is given in Fig. 8. This signifies that the convolution of the power functions of each database's size serves as the complexity measure of big data integration with respect to computational complexity. It provides the condition on data sparseness u such that the exhaustive calculation of all newly generated commonality orders within size L can be treated in polynomial time of order c under the algorithmic constraint d. As the inequality indicates, the sparser the data, the easier the calculation of joint commonality.
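The convolution of power functions appearing in Theorem 8 can be sketched numerically. This is a hedged illustration of the quantity only; the function name and the particular values of L and d are ours, not the paper's:

```python
# Hedged sketch of the quantity in Theorem 8: the discrete convolution of the
# power functions n^d and (L - n)^d over all database splits N + M = L, which
# the theorem identifies as the complexity measure of big data integration.
def power_convolution(L, d):
    """Sum of n^d * (L - n)^d over all splits n + (L - n) = L."""
    return sum((n ** d) * ((L - n) ** d) for n in range(L + 1))

c4 = power_convolution(4, 1)                              # = 0+3+4+3+0 = 10
growth = [power_convolution(L, 2) for L in (8, 9, 10)]    # grows rapidly in L
```

The rapid growth of this quantity with L is what the sparseness condition on u must counteract for polynomial-time feasibility.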

Conjectures on Guided Self-Organization
With effective feedback by computation, citizen science dynamics is expected to converge to a critical state where the objective is collectively optimized through the mutual increase of inter-subjective objectivity. However, several aspects may intervene in the resulting self-organized state, which require theoretical interpretation. In this section, generally important aspects are exemplified in relation to self-organized criticality.

Criticality by Limitation
The accuracy and reproducibility of observation is a primary factor that defines the consequent resolution of information represented in a database. Computational complexity also constrains the speed of information processing for prediction. Several limiting factors may generically arise:

1. Limitation by principle: Deterministic chaos inherent in a natural system does not allow long-term prediction, because the tiniest observation error of the present state grows exponentially [35]. The short-term validity of meteorological prediction is a typical example.

2. Limitation by reproducibility: In real-world situations, we mostly encounter one-time-only events, which do not allow reproduction under the same conditions [5]. The available data is sparse with respect to latent variables, which causes a quantitative limitation of prediction [4].

3. Limitation by computational complexity: As explored in section 3.2, extensive feedback based on exhaustive computing is often impossible with the available computing resources. The resolution of feedback may suffer time delays or incomplete optimization. The spatio-temporal scale of the forecast also sets a constraint, as a general trade-off between prediction accuracy and computational resources: the coarser the forecast granularity, the less costly the calculation and the more feasible an accurate long-term prediction becomes.
These limitations fundamentally regulate the number of significant digits in the prediction process, at the edge of the resulting precision where the accuracy reaches criticality. The whole dynamics is also confined by the criticality of the observed phenomena themselves, by which the observers' behaviour is influenced.

Criticality by Successful Learning
The motivation of citizen science is not necessarily the construction of versatile artificial intelligence, but also the integration and augmentation of human capacity [4] [13] [12]. The success of citizen science can also be defined in terms of information transfer from machine to human, at which criticality is assumed to appear.
Let us consider the case when successful learning mediated by computation has transferred an effective prediction model into human cognitive capacity. We take as an example Bayesian estimation, which is also a general model of our brain function [36]. The general formulation of Bayesian estimation updates the parameter of a hypothesized prior probability P(A) with respect to the observed data P(B), and provides an estimation of the posterior probability P(A|B) given by Bayes' theorem:

P(A|B) = P(B|A) P(A) / P(B),

where P(B|A) is considered as the likelihood function, which updates P(A) to P(A|B).
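A minimal worked example of this update follows. The two-hypothesis setting (species present / absent) and the numbers are hypothetical, chosen only to show the mechanics:

```python
# Minimal worked example of the Bayes update: posterior = likelihood * prior
# normalized by the evidence P(B). Hypothetical species-occurrence setting.
def bayes_update(prior, likelihood):
    """Posterior P(A|B) = P(B|A) P(A) / P(B), with P(B) as the normalizer."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]

prior = [0.5, 0.5]        # P(A): species present / absent
likelihood = [0.9, 0.3]   # P(B|A): probability of the reported observation
posterior = bayes_update(prior, likelihood)
```

With these numbers the evidence is 0.6 and the posterior shifts to [0.75, 0.25], illustrating how an observation reweights the prior model.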
We now consider that the prior probability P(A), or the model of prediction, depends on the process of computation C and human decision D, as human decision is supported by computation. When the human has successfully acquired the model represented in the computational model, C and D behave as independent and identically distributed, or as independent and informationally homologous distributions. This criticality qualitatively corresponds to the saturation stage of the Markov chain Monte Carlo method (MCMC) in the optimization of a hierarchical model (52), where hyperparameter and parameter converge to independent stable distributions. Therefore, by monitoring the dependency of machine-human interaction with respect to actual predictability, one can suggest whether the computational model or the human observation should change, or whether the actual phenomenon is in transition:

• When the actual prediction accuracy is high and human-machine interaction is high, this indicates the successful modelling of the observed phenomenon with the use of computation.
• When the actual prediction accuracy is high and human-machine interaction is low, the human has achieved a successful understanding of the phenomenon with less dependency on the machine.
• When the actual prediction accuracy is low and human-machine interaction is high, this indicates the possibility that the computational capacity is not sufficient to treat the phenomenon effectively. Otherwise, the observed phenomenon might be in a dynamical transition such that the effective computational model needs to be changed.
• When the actual prediction accuracy is low and human-machine interaction is low, more human effort needs to be engaged, both in actual observation and in the utilization of the machine interface.

Criticality by Guided Optimization
The actual management task of citizen science is often firmly related to the sustainability of a social-ecological system, where the achievement of robustness and resilience is an important criterion of criticality [3] [5]. A universally robust model with respect to an arbitrarily variable cost function is canonically given by the uniform distribution, which is commonly adopted as a prior in Bayesian estimation and random search algorithms [14]. It is also widely prevalent in biological phenomena, as the survival rate depends on the geometric mean of evolutionary fitness, which is maximized by uniformity in space, time and statistical configuration [32] [37].
On the other hand, a short-term management goal is usually biased by a given objective. How to reconcile short-term local efficiency with long-term global sustainability is a crucial issue for the guided self-organization of management in citizen science.
In order to optimize the balance between different spatio-temporal scales, information geometry can provide a theoretical compromise in terms of informational complexity. Suppose the actual distribution of a variable X ⊂ X is given by P_a(X), a short-term management goal by P_s(X), and the idealized long-term robust distribution by P_l(X). In many natural systems, the uniformity of P_l(X) supporting robustness as the result of self-organization is expressed by the entropy maximization principle under parameter constraints such as resource availability and energy flux level [38].
For simplicity, take as an example Shannon's diversity index H defined on a discrete distribution P(X) over symbols X = {s_0, s_1, ..., s_n}, such as the frequencies of n species in biodiversity observation:

H(P) = − Σ_{i=0}^{n} P(s_i) log P(s_i),

where s_0 represents the non-occurrence of any species. P(X) and H could be either buoy or anchor. Note that H can be generalized to the mutual information H_2 to express raft, buoy-anchor and raft-anchor connections, where P_2(·,·) denotes the joint distribution on X × X.
By maximizing H, we can determine the most diverse distribution P_l as the uniform distribution P_l(s_i) = 1/(n + 1), which represents the most robust ecosystem under the assumption that every species, including the gap, is equally valuable in terms of ecosystem function in a randomly changing environment. With respect to the subset of X on which we focus for a short-term management goal, both H(P_a) < H(P_s) and H(P_a) > H(P_s) could occur. However, a general relationship between biodiversity and ecosystem function imposes H(P_a) < H(P_s), meaning a net positive impact on biodiversity and good management in terms of sustainability. H can be generalized to the complexity measure G^{-1} of section 2.1 with respect to the commonality λ of section 2.2, which will be detailed in section 6.
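The diversity index and its uniform maximizer can be sketched directly. The actual frequencies P_a below are hypothetical, chosen only to exhibit H(P_a) < H(P_l):

```python
# Sketch of the Shannon diversity index H over n + 1 symbols (n species plus
# the gap symbol s0); the uniform distribution P_l maximizes H at log(n + 1).
from math import log

def shannon_H(p):
    """H(P) = -sum p_i log p_i over symbols with positive probability."""
    return -sum(q * log(q) for q in p if q > 0.0)

n = 3
P_a = [0.4, 0.3, 0.2, 0.1]       # hypothetical actual distribution over s0..s3
P_l = [1.0 / (n + 1)] * (n + 1)  # uniform, most diverse distribution
```

Any skew away from uniformity lowers H, which is why P_l serves as the long-term robust reference distribution.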
Expressed as an exponential family, P(X) can be parameterized as a statistical manifold in the canonical setting of information geometry, with the dual-flat coordinates Θ = {θ_i | i = 1, ..., n} and H = {η_i | i = 1, ..., n}, with potential functions ψ and φ, respectively, based on the Fisher information metric g and the connection coefficients Γ^{(α)} [39] [40], under the standard correspondence for discrete distributions. The elements of the Fisher information metric g = (g_ij) are given with respect to the dual coordinates, where (g^ij) denotes the inverse matrix of (g_ij). This relation defines Θ and H as dual coordinate systems orthogonal to each other with respect to g. The α-connection coefficients with respect to a real number α are given via the Fisher information metric as

Γ^{(α)}_{ij;k} = E[(∂_i ∂_j ℓ + ((1 − α)/2) ∂_i ℓ ∂_j ℓ) ∂_k ℓ],

where E[·] is the expectation and ℓ = log P. The values α = 1 and −1 are essential in information geometry; they define the e- and m-flat connections, respectively, in terms of the invariance of the tangent space under the covariant derivative ∇^{(α)} on arbitrary coordinates {ξ_i}, with Γ^{(1)}_{ij;k} = 0 for ξ_i = θ_i, and Γ^{(−1)}_{ij;k} = 0 for ξ_i = η_i. For example, the model P(X; Θ) is e-flat with respect to the coordinates Θ, and m-flat with respect to the coordinates H.
∇^{(±1)} are called the dual-flat connections of the statistical manifold. The concept of flatness defined by these connections further extends to the concepts of geometric parallelism and geodesics. As autoparallel submanifolds with respect to the connections, the e- and m-flat geodesics Θ(w) and H(w) between two distributions P_1(X) and P_2(X) are defined with a one-dimensional parameter w. The unique ∇^{(α)}-divergence D^{(α)}(P_1(X) : P_2(X)) that satisfies D(P_1(X) : P_2(X)) ≥ 0 and D(P_1(X) : P_2(X)) = 0 ⇔ P_1(X) = P_2(X), and that remains invariant under admissible transformations of the dual-flat coordinates with the connections ∇^{(±α)}, has a dual divergence that coincides with the Kullback-Leibler divergence in the case α = 1. From the Pythagorean relation and the projection theorem of the Kullback-Leibler divergence on the dual-flat statistical manifold [39] (p. 63), the following holds: Theorem 9. Let (Θ_a, H_a), (Θ_s, H_s) and (Θ_l, H_l) be the dual-flat coordinates of P_a(X), P_s(X) and P_l(X), respectively, with the canonical definition of the e- and m-flat dual connections. We define the optimal distribution P_o(X) with coordinates (Θ_o, H_o) on the m-flat geodesic between P_a(X) and P_l(X) with parameter w ∈ ℝ, as Fig.
9 shows the geometrical structure of this theorem. In this case, supposing H(P_a) < H(P_s) < H(P_l) as the effectiveness of the complexity measure H for management, we want to find the optimal distribution of biodiversity P_o balancing between P_s and P_l with respect to the actual distribution P_a, based on statistical dependencies between variables that can be orthogonally separated by the Pythagorean relation. As a result, P_o provides the optimized distribution with minimum informational discrepancy from the short-term objective towards the ideal transition to the long-term most diverse state. The meaning of the major components of the Kullback-Leibler divergence to be used as a guide of self-organization is as follows:

• D_m(P_a : P_o): Discrepancy between the actual distribution and the optimal portfolio strategy, which orthogonally decomposes and attempts to achieve a balance between the short-term management objective and long-term sustainability.
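A hedged sketch of this construction follows. The m-flat geodesic between discrete distributions is the mixture family (1 − w) P_a + w P_l; here a simple grid search over w stands in for the closed-form m-projection, and all distributions are hypothetical:

```python
# Hedged sketch of Theorem 9: pick the point on the m-geodesic from P_a to
# P_l minimizing KL divergence from the short-term goal P_s (grid search in
# place of the closed-form projection; distributions are illustrative only).
from math import log

def kl(p, q):
    """Kullback-Leibler divergence D(p : q) for discrete distributions."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def m_geodesic(p_a, p_l, w):
    return [(1 - w) * a + w * l for a, l in zip(p_a, p_l)]

P_a = [0.6, 0.25, 0.1, 0.05]   # actual, low-diversity distribution
P_l = [0.25] * 4               # long-term ideal: uniform, most diverse
P_s = [0.4, 0.3, 0.2, 0.1]     # short-term management goal
w_opt = min((i / 100 for i in range(101)),
            key=lambda w: kl(P_s, m_geodesic(P_a, P_l, w)))
P_o = m_geodesic(P_a, P_l, w_opt)
```

The optimum lies strictly between the endpoints, reflecting the compromise between short-term objective and long-term diversity that the theorem formalizes.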

Results from Biodiversity Management
We demonstrate the application of the model developed in this article to actual citizen science observation data, taking as a typical example a biodiversity observation activity supported by an interactive database [17]. The sample data contain observations by 7 citizen participants on 48 subjective binary indices of species occurrence, as buoy data on biological diversity, resulting in 336 samples. A buoy-anchor connection was established separately by objective evaluation of each participant's ability to detect these species.
Commonality orders among the seven observers were obtained both for inter-subjectivity, based on the mutual information of buoy data, and for subjective-objective unity, by simple ranking with the buoy-anchor connection data. These orders are shown in Fig. 10. The binomial test defined in (38) was performed on the comparison between the inter-subjective and subjective-objective commonality orders. The random-order-distribution hypothesis was rejected at the 4% significance threshold. The matching was more consistent at higher orders of commonality, which implies the intervention of subjective bias at lower orders. With respect to the conjectures on criticality in section 4, the results can be interpreted as a significant self-organization process towards criticality with the increase of inter-subjective objectivity.

Discussion
We have tackled the general situation in data-driven citizen science where scientific accuracy and reproducibility can only be discussed at the intersection of subjectivity, inter-subjectivity, and objectivity. Based on the conceptual definition of inter-subjective objectivity, a general topological structure was characterized with respect to complexity measures, search functions, computational complexity and criticality conditions. The results provide theoretical criteria for the development of information and communication technology in view of the effective assistance and guidance of citizen science from a complex systems perspective.
The universality of the developed theory and models lies in the generality of the commonality concept formalized as convolution. In reality, a joint distribution of N variables can be represented as a function of a convolution of degree N, which allows for an extensive expression of informational complexities [32]. For example, by choosing a time range T ⊂ ℝ with positive Lebesgue measure m(T) > 0, the marginal distribution P(x|T) can be expressed as the time integral of the probability measure μ, according to (24).
On the other hand, the joint distribution P(x_1, x_2|T) is the time integral of the products between each variable's probability measures μ_1 and μ_2, within simultaneous time ranges {dT_i}_{i=1,...,n}, where m(·) is the Lebesgue measure on ℝ. As defined in (25), this derives the practical form for actual data processing. Taking n → ∞, we obtain the canonical definition of the joint distribution with real-valued resolution of time, which generalizes to N variables with (A5) as (78). Therefore, based on commonality as convolution, we derive all orders of the joint distribution necessary for the calculation of known complexity measures. In a general form, any complexity measure incorporating the information of a joint distribution can be described as a function of convolution, G^{-1}(Q(λ)), following the formalization of section 2.1.
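This passage admits a simple discretized sketch. The binary occurrence encoding over time bins below is an assumption for illustration: the marginal is a time average of one observer's measure, and the joint distribution of simultaneous occurrence is the time average of the product of the two measures:

```python
# Illustrative discretization (assumed binary encoding): with occurrence
# series mu1, mu2 over n time bins, the marginal is a time average and the
# joint distribution of simultaneous occurrence is the time average of the
# product mu1(t) * mu2(t), mirroring commonality as convolution.
def marginal(mu):
    return sum(mu) / len(mu)

def joint(mu1, mu2):
    return sum(a * b for a, b in zip(mu1, mu2)) / len(mu1)

mu1 = [1, 0, 1, 1, 0, 1]
mu2 = [1, 1, 0, 1, 1, 0]
p1, p2, p12 = marginal(mu1), marginal(mu2), joint(mu1, mu2)
# p12 deviating from p1 * p2 signals dependence between the two observers
```

Here p12 falls below p1 * p2, which is exactly the kind of second-order dependency that measures such as mutual information H_2 capture.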
Commonality order is also accessible to existing algorithms that extract the total order of system elements, such as Dulmage-Mendelsohn decomposition [41] and phylogenetic tree analyses [42]. Although the calculation of joint distributions of all orders out of matrix data generally confronts exponential computational time, a total order based on partial combinatorics, together with statistical testing against a known distribution of the p-value, can provide a quick evaluation of the matching between the results of different algorithms. The pair-wise and triplet order algorithms for N observations can be processed in O(N^2) and O(N^3), respectively, similar to the range of most other ranking algorithms based on low-order statistics. The comparison between two total orders of N observations requires only second-degree polynomial time O(N^2) (38). Taking such partial optimization and algorithm-wise comparison of performance into account, as an extensive Bayesian estimator including the human of section 4.2, a deep learning model using massively parallel machine learning can be structurally effective for an interactive recombination of an estimation model based on human feedback.

In order to effectively attain criticality in citizen science, where knowledge acquisition, transfer, and control are optimized through self-organization, we need to reach a collective intelligence that is distributed in a parallel way, both in our subjective mind and in objective reality. The cost of data-driven science sometimes depends on overly weighted objective measurement aiming at complete modelling, which can also hinder the agility of taking action and the opportunity for effective interaction through internal observation [3]. As explored in this article, if there exist natural laws extended in our collective intelligence, much like the physical laws in objective nature, we may count on such topological structure, and it may be possible to obtain effective guidance through partial and distributed observation. Such a way to
organize collective intelligence among independent and parallel activity producers could be considered a social-environmental expansion of "intelligence without representation", which is based on a direct interface to the world through perception and action, rather than on a comprehensive representation of knowledge isolated from the environment [43]. Data acquisition needs to generate potentially effective action strategies, or affordances under global management principles, instead of modelling the phenomena without essential intervention of actors [44]. This can be described as data-affordance science, in contrast to exhaustive data-driven science: we substantially depend on the emergent topological structure of inter-subjective objectivity to make decisions in real time, represented at the intersection of the human mind, computation and natural phenomena. The buoy-anchor-raft model developed as a mutual framework can provide a theoretical basis that expands the external observation of conventional science to the internal observation necessary for management and knowledge extraction in a data-affordance science [5] [27]. As a cumulative effect of synergistic efficiency, the cost of observation and data processing could diminish to within a computable time scale by implicitly augmenting the knowledge representation incorporated into actual action principles. With measurement-action unity as a process of affordance in both data and reality, a cost-effective interface and a human-dependable system could be realized within the framework of internal observation, as a crucial premise for sustainable solutions. The edge of criticality for a successful citizen science, in terms of its nature and resource restrictions, finds its limits neither in our internal mind nor in the external world, but in the topology of their interactions.
The author declares no conflict of interest.

Figure 1.
Figure 1. Schematic representation of the inter-subjective objectivity model. (a) Relations between two subjectivities, namely A and B, objectivity, inter-subjectivity between A and B, subjective-objective unity for A and B, and inter-subjective objectivity, depicted as inclusion relations between the sets. (b) Development of inter-subjective objectivity as effective measurement in citizen science. As inter-subjectivity increases along with the training of subjective-objective unity and inter-subjective feedback, the accuracy and reproducibility of measurement based on subjectivity can be assured by the convergence to inter-subjective objectivity.

Figure 3.
Figure 3. Schematic representation of complexity measures as a non-linear feature space, and of search functions as their inverse functions. (a) The utility characteristics of a complex system, or complexity measure in general terms, are expressed by a complex configuration in parameter space. Parameters can also represent other complexity measures. (b) Complexity measures transform the parameter space into a non-linear feature space, which allows easier interpretation by sorting the order of a given utility. The inverse functions of complexity measures therefore correspond to search functions with respect to a search condition on the utility.
ensemble of possible mean values (Λ = 1/N), integrated values (Λ = 1) and other weighted sums of N random samplings from r. The regularization parameter Λ can further be generalized to an arbitrary measurable function Λ(·) representing commonality characteristics.

Figure 5.
Figure 5. Schematic representation of the triplet order algorithm, which calculates the total order of three observations with respect to the complexity defined on the pair-wise commonality between them. Three observations A, B and C are expressed as vertices of a triangle on a 2-dimensional surface, whose edge lengths A-B, B-C, and A-C represent the commonality of each vertex pair. For simplicity, the triangles are drawn as regular triangles, but the actual edge lengths generally differ, which provides the total order of edges. The six case statements of the algorithm are shown separately. Given the total order between the edges (blue magnitude relation), the corresponding total order of observations is depicted with orange axes at the side of each triangle. Orange axes superimposed on triangles signify that, by orthogonally projecting the vertices onto them, the total order of vertices is obtained; the generalization is developed in section 2.4. This holds for arbitrary three positive edge lengths without the constraint of the triangle inequality, by considering an appropriate projection of the triangles onto a non-Euclidean surface.
Preprint posted 14 April 2017 (doi:10.20944/preprints201704.0086.v1); peer-reviewed version available at Entropy 2017, 19, 181 (doi:10.3390/e19040181).

• 2 observers observing N objects: Commonality orders I and II can correspond to either subjective (buoy) or objective (anchor) observation. The I-N resolution provides an integrated commonality measure, such as a buoy-anchor connection or raft evaluation, according to the nature of the observation. TDC provides connections between buoys and/or anchors.
• N observers observing two different objects: The commonalities of N observers, whether subjective (buoy) or objective (anchor), are ranked with respect to two different objects I and II. The I-N resolution provides a mean ranking of the N observers' commonality upon these observations. TDC provides the reproducibility of commonality among the N observers.
• Application of two different complexity measures to N observations: For example, the case of a raft-anchor connection, where N subjective observers (buoys) are ranked by inter-subjective commonality (raft evaluation) and weighted with two different anchors. The I-N resolution provides a mean ranking of the N observers' inter-subjective objectivity, integrating multiple criteria of inter-subjective and objective evaluation. TDC represents statistical dependencies between the two complexity measures in response to a given inter-subjective objective measurement. While a significant matching between two commonality orders assures reproducibility based on the coincidence of observations with these measures, non-significance can also be used to quantify the complementarity of different evaluations [32].

Figure 6. Integration of two commonality orders. (a): The correspondence between commonality orders I and II (orange arrows) can be described as the permutation between N observations (black circles), providing the topology of I-I matching (green dotted line) and I-N compromise (blue dotted line). (b): Affine space with commonality orders I and II as the coordinate system (orange arrows) for the resolution of I-N compromise. The I-N mean commonality order (red solid arrow) can be calculated from the pair-wise order algorithm (section 2.3) applied to commonality orders I and II, which makes the I-I matching identical to the I-I dimension (green arrow) and sets the mean order to the I-N compromise. One I-N resolution dimension is required to resolve one I-N compromise (blue arrow). The implicit structure of the integrated commonality order with the continuity assumption takes a complex form reflecting I-N compromises (red dotted arrow as an example), which corresponds to the complex utility configuration in Fig. 3(a). (c): The general case with an arbitrary number of I-N compromises. The total commonality space of N − 1 dimensions is divided between I-N resolution dimensions (blue arrows) and I-I dimensions (green arrows), between which the I-N mean commonality order can be defined (red arrow). k < N axes of I-N resolution dimensions are required to resolve k I-N compromises (blue arrows). Taking the I-I dimensions and I-N resolution dimensions as affine coordinates, the integrated commonality order is projected onto the I-N mean commonality order as the simplest sorted order of utility, which corresponds to Fig. 3(b).
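The integration of two commonality orders into an I-N mean order can be sketched as follows. This is a minimal illustration assuming mean rank position as the aggregation rule with lexicographic tie-breaking (both assumptions; the actual pair-wise order algorithm of section 2.3 may differ); `compromises` counts pairwise disagreements, one per I-N compromise to resolve.

```python
def mean_commonality_order(order1, order2):
    """Integrate commonality orders I and II into an I-N mean order by
    averaging rank positions (a simple stand-in for the pair-wise order
    algorithm of section 2.3; lexicographic tie-breaking is an assumption)."""
    rank1 = {x: i for i, x in enumerate(order1)}
    rank2 = {x: i for i, x in enumerate(order2)}
    return sorted(order1, key=lambda x: (rank1[x] + rank2[x], x))

def compromises(order1, order2):
    """Count pairwise disagreements between the two orders: each pair
    ranked oppositely by orders I and II is one I-N compromise."""
    rank2 = {x: i for i, x in enumerate(order2)}
    n = len(order1)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if rank2[order1[i]] > rank2[order1[j]])

order_I = ['A', 'B', 'C', 'D']
order_II = ['B', 'A', 'C', 'D']
mean_order = mean_commonality_order(order_I, order_II)
k = compromises(order_I, order_II)
```

In this example the single swapped pair (A, B) is the one I-N compromise, requiring one resolution dimension as in panel (b).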

Figure 7. Topological hierarchy of commonality between observations. As an example, five observations A, B, C, D, E are depicted with correspondence to the commonality order of each topological subset. The Venn diagram on the left represents the commonality structure within the observation probability database X^5 on variable X (N = 5 in section 3.2), where coincident observations are superimposed. The maximum commonality order is the projection from these topological subsets to the natural numbers N on the right axis, describing the number of matching observations. Venn diagram cited from [34].
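The commonality order of a topological subset can be illustrated concretely as the size of the intersection of observation sets, and the maximum commonality order as the largest observer subset with a non-empty intersection. The observation sets below are hypothetical toy data, not drawn from the paper's database.

```python
from itertools import combinations

# Hypothetical observation sets for five observers A..E; the integers
# stand for coincidently observable items (e.g. species records)
obs = {
    'A': {1, 2, 3},
    'B': {2, 3, 4},
    'C': {3, 4, 5},
    'D': {3, 6},
    'E': {7},
}

def commonality_order(subset, obs):
    """Number of observations coincident across every observer in the
    subset: the size of the intersection of their observation sets."""
    return len(set.intersection(*(obs[k] for k in subset)))

def max_commonality(obs):
    """Size of the largest observer subset with a non-empty intersection,
    i.e. the maximum number of matching observations."""
    return max(r for r in range(1, len(obs) + 1)
               for sub in combinations(obs, r)
               if commonality_order(sub, obs) > 0)
```

With these toy sets, observers A-D all share item 3 while E matches nobody, so the maximum commonality order is 4.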

Figure 8. Numerical observation of the proof of Theorem 8. (a): Chebyshev's inequality (A41) and asymptotic convergence to O(L log L) (A44) with respect to N, M ≥ 1 (N + M = L), for L = 10, 10^2, 10^3. The Y-axis is plotted on a log scale. Equality in (A41) is attained at N = M = L/2. (b): Behaviour of f(N)√L, f(M)√L and f(N)f(M)L with respect to L = 10, 10^2, 10^3. For visibility, the Y-axis scale is given as log_2(Y^{-1}), which places smaller Y values towards the bottom, and the Y-axis label shows the value of −log Y. The surface below the solid line f(N)f(M)L represents the convolution multiplied by L, f ∗ f(L)L. The mean value of the solid line f(N)f(M)L therefore corresponds to the upper limit of u that satisfies the polynomial constraint (45) with respect to a given L. c = d = 2 were used for the simulation.
This formalization corresponds to Bayesian hierarchical modelling, where computation C provides the hyperparameter of human decision D as a prior distribution:

P(D, C | B) := P(B|D) P(D, C) / P(B) = P(B|D) P(D|C) P(C) / P(B).
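A toy numerical instance of this hierarchical factorization can make the roles of the terms concrete. All distributions below are hypothetical illustrations; the marginal P(B) is obtained by summing the numerator over D and C.

```python
from itertools import product

# Hypothetical discrete toy model: computation C acts as a hyperparameter
# providing the prior over human decision D, which generates observation B.
P_C = {'c0': 0.5, 'c1': 0.5}
P_D_given_C = {('d0', 'c0'): 0.8, ('d1', 'c0'): 0.2,
               ('d0', 'c1'): 0.3, ('d1', 'c1'): 0.7}
P_B_given_D = {('b', 'd0'): 0.9, ('b', 'd1'): 0.1}

def posterior(b):
    """P(D, C | B) = P(B|D) P(D|C) P(C) / P(B), with the marginal
    P(B) obtained by summing the numerator over D and C."""
    joint = {(d, c): P_B_given_D[(b, d)] * P_D_given_C[(d, c)] * P_C[c]
             for d, c in product(('d0', 'd1'), ('c0', 'c1'))}
    p_b = sum(joint.values())
    return {dc: v / p_b for dc, v in joint.items()}, p_b

post, p_b = posterior('b')
```

The returned dictionary is a properly normalized joint posterior over (D, C) given the observation B.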

• D_m(P_a : P_s): Target risk of the short-term management objective.
• D_m(P_o : P_s) = D_e(P_s : P_o): Buffering element of the robustness trade-off between the short-term management objective and long-term sustainability.
• D_m(P_l : P_o): Potential risk of the optimum portfolio w.r.t. long-term sustainability.
• D_m(P_l : P_s): Potential risk of the short-term management objective w.r.t. long-term sustainability.
• D_m(P_l : P_a), D_m(P_a : P_l): Potential risk of the actual distribution w.r.t. long-term sustainability.
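These divergences can be computed directly for discrete distributions. The sketch below uses hypothetical 3-category distributions for P_a, P_s and P_l; it also makes explicit the identity used for the buffering element above, namely that the dual divergence satisfies D_e(P : Q) = D_m(Q : P).

```python
import math

def D_m(p, q):
    """Kullback-Leibler divergence D_m(P : Q) = sum_i p_i log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def D_e(p, q):
    """Dual divergence: D_e(P : Q) = D_m(Q : P)."""
    return D_m(q, p)

# Hypothetical 3-category distributions for the portfolio elements
P_a = [0.6, 0.3, 0.1]        # actual distribution
P_s = [0.4, 0.4, 0.2]        # short-term management objective
P_l = [1 / 3, 1 / 3, 1 / 3]  # long-term sustainability (uniform, an assumption)

target_risk = D_m(P_a, P_s)      # target risk of the short-term objective
potential_risk = D_m(P_l, P_s)   # potential risk w.r.t. long-term sustainability
```

Each listed portfolio element is then one such divergence evaluated between the corresponding pair of distributions.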

Figure 9. Information-geometrical optimization of the diversity strategy portfolio with respect to the actual distribution P_a, the short-term management objective P_s and long-term sustainability P_l. On a dually flat statistical manifold based on the Fisher information metric, each distribution is represented as a point (black circles). The m-geodesic is depicted with a blue line, while the e-geodesic is shown with a red line; they cross orthogonally at the optimized strategy P_o. Topological correspondence between the complexity measure H (aligned on the left orange arrow) and the diversity strategy portfolio (P_a, P_s, P_l and P_o) is shown with dotted lines with respect to the magnitude relation.

Figure 10. Results of inter-subjective and subjective-objective commonality orders in the citizen observation of biodiversity. Seven people represented with numerical IDs are aligned with commonality orders, (a): based on inter-subjectivity, and (b): based on subjective-objective unity, which showed a 3.92% residual error probability for the rejection of the random-order-distribution hypothesis with respect to the binomial test (38).
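The binomial test on matching commonality orders can be sketched as follows: under the random-order null hypothesis, each of the C(N, 2) observer pairs agrees between the two orders with probability 1/2, and the one-sided upper-tail probability of the observed number of agreements gives the residual error probability. The two orders below are hypothetical illustrations, not the actual data behind the reported 3.92% value.

```python
from math import comb

def binomial_match_test(order1, order2):
    """One-sided binomial test on the matching of two commonality orders.
    Under the random-order null hypothesis, each of the C(N, 2) observer
    pairs agrees between the two orders with probability 1/2; the p-value
    is the upper-tail probability of the observed number of agreements."""
    rank2 = {x: i for i, x in enumerate(order2)}
    n = len(order1)
    pairs = n * (n - 1) // 2
    k = sum(1 for i in range(n) for j in range(i + 1, n)
            if rank2[order1[i]] < rank2[order1[j]])  # pairwise agreements
    p = sum(comb(pairs, t) for t in range(k, pairs + 1)) / 2 ** pairs
    return k, p

# Hypothetical orders of seven observer IDs (illustration only,
# not the data behind the reported 3.92% value)
order_a = [1, 2, 3, 4, 5, 6, 7]   # e.g. inter-subjective order
order_b = [1, 2, 4, 3, 5, 7, 6]   # e.g. subjective-objective order
k, p = binomial_match_test(order_a, order_b)
```

With 19 of 21 pairs agreeing, the tail probability is far below 5%, so the random-order hypothesis would be rejected in this toy case.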

Table 1. Examples of buoy, raft and anchor in various social systems and scientific domains. The examples are not comprehensive but a partial list of typical data from the recently increasing public availability.

Table 2. Correspondence between the buoy-anchor-raft model and the computational variables in this article.
[Partially recoverable table body: data contained in vertices V — P_i(•); commonality orders I and II between N objects — observations P(•), P_a(•), P_s(•).]

without implication for the independence of observations. For the convolution on the general subset r_s ⊆ r_N, the exact definition is given by λ_N(r_s

Table 3. Algorithmic complexity for the calculation of commonality orders, with respect to the maximum commonality order in

By optimizing H_o with the orthogonal projection of the e-flat geodesic from Θ_s to Θ_o as

w* = arg min_w D_m(P_o : P_s) = arg min_w D_e(P_s : P_o),

we obtain the Pythagorean relations

D_m(P_a : P_s) = D_m(P_a : P_o) + D_m(P_o : P_s),
D_m(P_l : P_s) = D_m(P_l : P_o) + D_m(P_o : P_s), (69)

where D_m(• : •) and D_e(• : •) are the Kullback-Leibler divergence and its dual divergence, respectively.
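The projection and the Pythagorean relations of (69) can be checked numerically. The sketch below assumes, following Fig. 9, that P_o lies on the m-geodesic (mixture family) through P_a and P_l and is found by a grid search minimizing D_m(P_o : P_s); the three distributions are hypothetical.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D_m(P : Q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical 3-category distributions (assumption: P_o lies on the
# m-geodesic, i.e. the mixture family, spanned by P_a and P_l)
P_a = [0.7, 0.2, 0.1]   # actual distribution
P_l = [0.2, 0.3, 0.5]   # long-term sustainability
P_s = [0.4, 0.4, 0.2]   # short-term management objective

def mix(t):
    """Point on the m-geodesic between P_l (t = 0) and P_a (t = 1)."""
    return [t * a + (1 - t) * b for a, b in zip(P_a, P_l)]

# e-projection of P_s onto the m-geodesic: grid search minimizing D_m(P : P_s)
ts = [i / 10000 for i in range(10001)]
t_star = min(ts, key=lambda t: kl(mix(t), P_s))
P_o = mix(t_star)

# Pythagorean relation (69): D_m(P_a : P_s) = D_m(P_a : P_o) + D_m(P_o : P_s)
lhs = kl(P_a, P_s)
rhs = kl(P_a, P_o) + kl(P_o, P_s)
```

Because the mixture line is a linear family containing P_a, the decomposition holds exactly at the true minimizer; the grid search reproduces it up to the grid resolution.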