<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">algorithms</journal-id>
      <journal-title>Algorithms</journal-title>
      <abbrev-journal-title abbrev-type="publisher">Algorithms</abbrev-journal-title>
      <abbrev-journal-title abbrev-type="pubmed">algorithms</abbrev-journal-title>
      <issn pub-type="epub">1999-4893</issn>
      <publisher>
        <publisher-name>MDPI</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.3390/a5030379</article-id>
      <article-id pub-id-type="publisher-id">algorithms-05-00379</article-id>
      <article-categories>
        <subj-group>
          <subject>Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Monitoring Threshold Functions over Distributed Data Streams with Node Dependent Constraints</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Malinovsky</surname>
            <given-names>Yaakov</given-names>
          </name>
          <xref rid="c1-algorithms-05-00379" ref-type="corresp">*</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Kogan</surname>
            <given-names>Jacob</given-names>
          </name>
        </contrib>
      </contrib-group>
      <aff id="af1-algorithms-05-00379">Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA; Email: <email>kogan@umbc.edu</email></aff>
      <author-notes>
        <corresp id="c1-algorithms-05-00379"><label>*</label> Author to whom correspondence should be addressed; Email: <email>yaakovm@umbc.edu</email>; Tel.: +1-410-455-2968; Fax: +1-410-455-1066.</corresp>
      </author-notes>
      <pub-date pub-type="epub">
        <day>18</day>
        <month>09</month>
        <year>2012</year>
      </pub-date>
      <pub-date pub-type="collection">
        <month>09</month>
        <year>2012</year>
      </pub-date>
      <volume>5</volume>
      <issue>3</issue>
      <fpage>379</fpage>
      <lpage>397</lpage>
      <history>
        <date date-type="received">
          <day>19</day>
          <month>06</month>
          <year>2012</year>
        </date>
        <date date-type="rev-recd">
          <day>08</day>
          <month>09</month>
          <year>2012</year>
        </date>
        <date date-type="accepted">
          <day>11</day>
          <month>09</month>
          <year>2012</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2012 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
        <copyright-year>2012</copyright-year>
        <license xmlns:xlink="http://www.w3.org/1999/xlink" license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/">
          <p>This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p>
        </license>
      </permissions>
      <abstract>
        <p>Monitoring data streams in a distributed system has attracted considerable interest in recent years. The task of feature selection (e.g., by monitoring the information gain of various features) requires a very high communication overhead when addressed using straightforward centralized algorithms. While most of the existing algorithms deal with monitoring simple aggregated values such as frequency of occurrence of stream items, motivated by recent contributions based on geometric ideas we present an alternative approach. The proposed approach enables monitoring values of an arbitrary threshold function over distributed data streams through stream dependent constraints applied separately on each stream. We report numerical experiments on real-world data that detect instances where communication between nodes is required, and compare the approach and the results to those recently reported in the literature.</p>
      </abstract>
      <kwd-group>
        <kwd>data streams</kwd>
        <kwd>distributed system</kwd>
        <kwd>convex optimization</kwd>
        <kwd>feedback</kwd>
        <kwd>feature selection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec sec-type="intro">
      <title>1. Introduction</title>
      <p>In many emerging applications one needs to process a continuous stream of data in real time. Sensor networks [<xref ref-type="bibr" rid="B1-algorithms-05-00379">1</xref>], network monitoring [<xref ref-type="bibr" rid="B2-algorithms-05-00379">2</xref>], and real-time analysis of financial data [<xref ref-type="bibr" rid="B3-algorithms-05-00379">3</xref>,<xref ref-type="bibr" rid="B4-algorithms-05-00379">4</xref>] are examples of such applications. Monitoring queries are a particular class of queries in the context of data streams. Previous work in this area deals with monitoring simple aggregates [<xref ref-type="bibr" rid="B2-algorithms-05-00379">2</xref>], or term frequency occurrence in a set of distributed streams [<xref ref-type="bibr" rid="B5-algorithms-05-00379">5</xref>].</p>
      <p>A general framework for efficient local algorithms monitoring <italic>l</italic><sub>2</sub> norm of the data average of large networks of computers, wireless sensors, or mobile devices was introduced in [<xref ref-type="bibr" rid="B6-algorithms-05-00379">6</xref>], and further developed in [<xref ref-type="bibr" rid="B7-algorithms-05-00379">7</xref>]. The current contribution is motivated by results recently reported in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>,<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>] with focus on a special case of the general model considered in [<xref ref-type="bibr" rid="B7-algorithms-05-00379">7</xref>]. This special case can be briefly described as follows:</p>
      <p>Let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i002.tif"/> be a set of data streams collected at <italic>n</italic> nodes. Let v<sub>1</sub>(<italic>t</italic>),...,v<sub>n</sub>(<italic>t</italic>) be <italic>d</italic> dimensional real time varying vectors derived from the streams. For a function <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i004.tif"/> we would like to confirm the inequality </p>
      <p><disp-formula id="algorithms-05-00379-i005">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i005.tif"/>
          <label>(1)</label>
          </disp-formula></p>
      <p>while minimizing communication between the nodes. Monitoring inequality (1), or monitoring the geometric location of the mean, is a problem that can be addressed using a variety of mathematical tools. The specific choice of a monitoring tool is up to the user. We note that the problem as stated above does not prescribe any particular tool, the <italic>l</italic><sub>2</sub> norm, or any other norm to address it.</p>
      <p>The problem was recently addressed in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>], where the proposed approach imposes equal constraints on each node. In addition to the previously used <italic>l</italic><sub>2</sub> norm (see, e.g., [<xref ref-type="bibr" rid="B6-algorithms-05-00379">6</xref>,<xref ref-type="bibr" rid="B7-algorithms-05-00379">7</xref>,<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>,<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>,<xref ref-type="bibr" rid="B11-algorithms-05-00379">11</xref>]) that paper provides a theoretical framework for using a wide variety of convex functions, and, as an illustration, runs numerical experiments using the <italic>l</italic><sub>2</sub>, <italic>l</italic><sub>1</sub> and <italic>l</italic><sub>∞</sub> norms. In all numerical experiments reported in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>] the same algorithm with the <italic>l</italic><sub>1</sub> norm generates superior results. This paper extends the results in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>] in a machine learning direction: the constraint imposed on each node depends on the stream history at the node.</p>
      <p>As a simple illustration of the problem considered in the paper we focus on two scalar functions <italic>v</italic><sub>1</sub>(<italic>t</italic>) and <italic>v</italic><sub>2</sub>(<italic>t</italic>), and the identity function <italic>f</italic> (<italic>i.e</italic>., <italic>f</italic>(<italic>x</italic>) = <italic>x</italic>). We would like to guarantee the inequality </p>
      <p><disp-formula id="algorithms-05-00379-i011">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i011.tif"/>
          </disp-formula></p>
      <p>while keeping the nodes silent as much as possible. A possible strategy is to verify the initial inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i012.tif"/> and to keep both nodes silent while</p>
      <p><disp-formula id="algorithms-05-00379-i013">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i013.tif"/>
          </disp-formula></p>
      <p>The first time <italic>t</italic><sub>1</sub> when one of the functions, say <italic>v</italic><sub>1</sub>(<italic>t</italic>), crosses the boundary of the local constraint, <italic>i.e</italic>., <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i015.tif"/>, the nodes communicate, the mean <italic>v</italic>(<italic>t</italic><sub>1</sub>) is computed, the local constraint <italic>δ</italic> is updated and made available to the nodes, and the nodes are kept silent as long as the following inequalities hold:</p>
      <p><disp-formula id="algorithms-05-00379-i018">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i018.tif"/>
          </disp-formula></p>
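      <p>The strategy above can be sketched in a few lines of Python. Because the displayed formulas are images, the sketch rests on assumptions: the monitored inequality is taken to be positivity of the mean of the two streams, and the local constraint is taken to be that each stream stays within <italic>δ</italic> of its value at the last synchronization, with <italic>δ</italic> equal to the mean at that time. The function name <monospace>monitor</monospace> is hypothetical and is not taken from the paper.</p>
      <preformat><![CDATA[
```python
def monitor(stream1, stream2):
    """Count synchronizations needed to certify mean(t) > 0 for two scalar streams.

    Assumed local constraint: |v_i(t) - v_i(t_sync)| < delta, where delta is the
    mean at the last synchronization. If both nodes satisfy it, the mean has
    moved by less than delta, so it is still positive.
    Returns (number_of_synchronizations, inequality_still_holds).
    """
    ref1, ref2 = stream1[0], stream2[0]
    delta = (ref1 + ref2) / 2.0
    if delta <= 0:
        raise ValueError("initial mean must be positive")
    syncs = 0
    for v1, v2 in zip(stream1[1:], stream2[1:]):
        if abs(v1 - ref1) < delta and abs(v2 - ref2) < delta:
            continue  # both local constraints hold: nodes stay silent
        # A local constraint is violated: the nodes communicate.
        mean = (v1 + v2) / 2.0
        syncs += 1
        if mean <= 0:
            return syncs, False  # the monitored inequality fails
        ref1, ref2, delta = v1, v2, mean  # update reference values and delta
    return syncs, True
```
]]></preformat>
      <p>For the hypothetical streams [1.0, 1.5, 3.5, 3.6] and [3.0, 3.0, 3.0, 3.2] the initial mean is 2, the first stream leaves its interval at the third reading, and a single synchronization suffices.</p>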
      <p>The main contributions of this paper are listed next. We demonstrate that:</p>
      <list>
        <list-item>
          <p>1. This approach works for a non-linear monitoring function <italic>f</italic>.</p>
        </list-item>
        <list-item>
          <p>2. The results depend on the choice of a norm, and the numerical results reported show that <italic>l</italic><sub>2</sub> is probably not the best norm when one aims to minimize communication between nodes. In addition to the numerical results presented we also provide a simple illustrative example that highlights this point (see Remark 4.2).</p>
        </list-item>
        <list-item>
          <p>3. Selection of node dependent local constraints may decrease communication between the nodes.</p>
        </list-item>
        <list-item>
          <p>4. The approach suggested in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>] and adopted in this paper paves the way to achieve further communication savings by clustering nodes, and monitoring cluster coordinators. Although this research direction is beyond the scope of this paper we address it briefly in <xref ref-type="sec" rid="sec6-algorithms-05-00379">Section 6</xref>.</p>
        </list-item>
      </list>
      <p>In the next section we provide a text mining related example that leads to a non-linear threshold function <italic>f</italic>. </p>
    </sec>
    <sec>
      <title>2. Text Mining Application</title>
      <p>Let <bold>T</bold> be a finite text collection (for example a collection of mail or news items). We denote the size of the set <bold>T</bold> by |<bold>T</bold>|. We will be concerned with two subsets of <bold>T</bold>:</p>
      <list>
        <list-item>
          <p>1. <bold>R</bold>–the set of “relevant” texts (text not labeled as spam),</p>
        </list-item>
        <list-item>
          <p>2. <bold>F</bold>–the set of texts that contain a “feature” (word or term for example).</p>
        </list-item>
      </list>
      <p>We denote complements of the sets by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i020.tif"/> respectively (<italic>i.e</italic>., <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i021.tif"/>), and consider the relative size of the four sets <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i022.tif"/> as follows:</p>
      <p><disp-formula id="algorithms-05-00379-i023">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i023.tif"/>
          <label>(2)</label>
          </disp-formula></p>
      <p>Note that </p>
      <p><disp-formula id="algorithms-05-00379-i024">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i024.tif"/>
          </disp-formula></p>
      <p>The function <italic>f</italic> is defined on the simplex (<italic>i.e</italic>., <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i025.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i026.tif"/>), and given by </p>
      <p><disp-formula id="algorithms-05-00379-i027">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i027.tif"/>
          <label>(3)</label>
          </disp-formula></p>
      <p>where <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i028.tif"/> throughout the paper. We next relate the empirical version of information gain, Equation (3), to the information gain (see e.g., [<xref ref-type="bibr" rid="B12-algorithms-05-00379">12</xref>]).</p>
      <p>Let <italic>Y</italic> and <italic>X</italic> be random variables with known distributions</p>
      <p><disp-formula id="algorithms-05-00379-i032">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i032.tif"/>
          </disp-formula></p>
      <p>Entropy of <italic>Y</italic> is defined by </p>
      <p><disp-formula id="algorithms-05-00379-i033">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i033.tif"/>
          <label>(4)</label>
          </disp-formula></p>
      <p>Entropy of <italic>Y</italic> conditional on <italic>X</italic> = <italic>x</italic> denoted by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i035.tif"/> is defined by </p>
      <p><disp-formula id="algorithms-05-00379-i036">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i036.tif"/>
          <label>(5)</label>
          </disp-formula></p>
      <p> Conditional entropy <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i037.tif"/> and information gain <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i038.tif"/> are given by </p>
      <p><disp-formula id="algorithms-05-00379-i039">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i039.tif"/>
          <label>(6)</label>
          </disp-formula></p>
      <p>Information gain is symmetric, indeed </p>
      <p><disp-formula id="algorithms-05-00379-i040">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i040.tif"/>
          </disp-formula></p>
      <p>Due to convexity of <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i041.tif"/>, information gain is non-negative </p>
      <p><disp-formula id="algorithms-05-00379-i042">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i042.tif"/>
          </disp-formula></p>
      <p>It is easy to see that Equation (3) provides information gain for the “feature”.</p>
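      <p>The definitions of entropy, conditional entropy and information gain above can be illustrated with a short Python sketch that evaluates the information gain of a 2 × 2 table of relative sizes. The row and column layout of the table, the use of base 2 logarithms and the convention 0 log 0 = 0 are assumptions made for the illustration; the function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def entropy(probs):
    """Shannon entropy in bits, with the convention 0 * log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(table):
    """Information gain H(Y) - H(Y | X) computed from a 2x2 table.

    table[i][j] is the relative size of one of the four sets (assumed layout:
    rows index feature present / absent, columns index relevant / not
    relevant). Entries are non-negative and sum to 1.
    """
    row_totals = [sum(row) for row in table]          # distribution of X
    col_totals = [sum(col) for col in zip(*table)]    # distribution of Y
    h_y = entropy(col_totals)
    h_y_given_x = sum(
        total * entropy([x / total for x in row])
        for row, total in zip(table, row_totals)
        if total > 0
    )
    return h_y - h_y_given_x
```
]]></preformat>
      <p>A table with four equal entries describes independent variables and yields zero gain, while a table concentrated on the diagonal yields the maximal gain of one bit. Transposing the table leaves the value unchanged, in agreement with the symmetry noted above.</p>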
      <p>As an example, we consider <italic>n</italic> agents installed on <italic>n</italic> different servers and a stream of texts arriving at the servers. Let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i044.tif"/> be the last <italic>w</italic> texts received at the <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i046.tif"/> server, with <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i047.tif"/>. Note that </p>
      <p><disp-formula id="algorithms-05-00379-i048">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i048.tif"/>
          </disp-formula></p>
      <p><italic>i.e</italic>., entries of the global contingency table <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i049.tif"/> are the average of the local contingency tables <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i050.tif"/>.</p>
      <p>For the given “feature” and a predefined positive threshold <italic>r</italic> we would like to verify the inequality </p>
      <p><disp-formula id="algorithms-05-00379-i052">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i052.tif"/>
          </disp-formula></p>
      <p>while minimizing communication between the servers. Note that Equation (3) is a nonlinear function. The case of a nonlinear monitoring function is different from that of a linear one (in fact [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>] calls the nonlinear monitoring function case “fundamentally different”). In the next section we demonstrate the difference, and describe an efficient way to handle the nonlinear case. </p>
    </sec>
    <sec>
      <title>3. Non-Linear Threshold Function: An Example</title>
      <p>We start with a slight modification of a simple one dimensional example presented in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>].</p>
      <p>
        <bold>Example 3.1</bold>
        <italic>Let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i053.tif"/>, and let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i054.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i055.tif"/> be scalar values stored at two distinct nodes. Note that if <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i056.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i057.tif"/>, then</italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i058">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i058.tif"/>
          </disp-formula></p>
      <p><disp-formula id="algorithms-05-00379-i059">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i059.tif"/>
          </disp-formula></p>
      <p>
        <italic>If <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i060.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i061.tif"/>, then </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i062">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i062.tif"/>
          </disp-formula></p>
      <p><disp-formula id="algorithms-05-00379-i063">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i063.tif"/>
          </disp-formula></p>
      <p>
        <italic>Finally, when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i064.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i061.tif"/> one has </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i065">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i065.tif"/>
          <label>(7)</label>
          </disp-formula></p>
      <p>This simple illustrative example leads the authors of [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>] to conclude that it is impossible to determine from the values of <italic>f</italic> at the nodes whether its value at the average is above the threshold or not. The remedy proposed is to consider the vectors <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i066.tif"/> and to monitor the values of <italic>f</italic> on the convex hull conv <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i067.tif"/> instead of the value of <italic>f</italic> at the average in Equation (1). This strategy leads to sufficient conditions for Equation (1), and may be conservative.</p>
      <p>The monitoring techniques for values of <italic>f</italic> on conv <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i067.tif"/> without communication between the nodes are based on the following two observations: </p>
      <list>
        <list-item>
          <p>1. <italic>Convexity property</italic>. The mean v(<italic>t</italic>) is given by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i069.tif"/>, <italic>i.e</italic>., the mean v(<italic>t</italic>) is in the convex hull of <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i070.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i071.tif"/> is available to node <italic>j</italic> without much communication with other nodes.</p>
        </list-item>
        <list-item>
          <p>2. If <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i073.tif"/> is an <italic>l</italic><sub>2</sub> ball of radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i074.tif"/> centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i075.tif"/>, then</p>
        </list-item>
      </list>
      <p><disp-formula id="algorithms-05-00379-i076">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i076.tif"/>
          <label>(8)</label>
          </disp-formula></p>
      <p>(see <xref ref-type="fig" rid="algorithms-05-00379-f001">Figure 1</xref>). Since each ball</p>
      <p><disp-formula id="algorithms-05-00379-i077">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i077.tif"/>
          <label>(9)</label>
          </disp-formula></p>
      <p>can be monitored by node <italic>j</italic> with no communication with other nodes, Equation (8) allows one to split the monitoring of conv <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i078.tif"/> into <italic>n</italic> independent tasks executed by the <italic>n</italic> nodes separately and without communication.</p>
      <fig id="algorithms-05-00379-f001" position="anchor">
        <label>Figure 1</label>
        <caption>
          <p>Ball cover.</p>
        </caption>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-g001.tif"/>
      </fig>
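      <p>The ball cover can be checked numerically. The sketch below assumes that Equation (8) states that the convex hull of a reference vector and the current node vectors is contained in the union of the <italic>l</italic><sub>2</sub> balls whose diameters are the segments joining the reference vector to each node vector. It samples random convex combinations and tests membership; the function names and the sample vectors are hypothetical, and a random test of this kind illustrates the inclusion without proving it.</p>
      <preformat><![CDATA[
```python
import random


def in_ball(point, a, b, tol=1e-12):
    """True if point lies in the l2 ball whose diameter is the segment [a, b]."""
    center = [(x + y) / 2.0 for x, y in zip(a, b)]
    radius_sq = sum((x - y) ** 2 for x, y in zip(a, b)) / 4.0
    dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
    return dist_sq <= radius_sq + tol


def random_convex_combination(points, rng):
    """Random point of the convex hull of the given points."""
    weights = [rng.random() for _ in points]
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * pt[k] for w, pt in zip(weights, points)) / total
        for k in range(dim)
    ]


def check_cover(reference, vectors, trials=2000, seed=0):
    """Sample the hull of reference and vectors; test the ball cover."""
    rng = random.Random(seed)
    hull_points = [reference] + vectors
    for _ in range(trials):
        p = random_convex_combination(hull_points, rng)
        if not any(in_ball(p, reference, v) for v in vectors):
            return False  # a sampled hull point is outside every ball
    return True
```
]]></preformat>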
      <p>While the inclusion Equation (8) holds when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i081.tif"/> is substituted by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i082.tif"/> with <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i083.tif"/>, as we show later (see Remark 4.3), the inclusion fails when, for example, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i084.tif"/> (for experimental results obtained with different norms see <xref ref-type="sec" rid="sec5-algorithms-05-00379">Section 5</xref>).</p>
      <p>In this paper we propose an alternative strategy that will be briefly explained next using Example 3.1, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i053.tif"/>, and the assignment provided by Equation (7). Let <italic>δ</italic> be a positive number. Consider two intervals of radius <italic>δ</italic> centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i064.tif"/> and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i061.tif"/>, <italic>i.e</italic>., we are interested in the intervals </p>
      <p><disp-formula id="algorithms-05-00379-i085">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i085.tif"/>
          </disp-formula></p>
      <p>If <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i086.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i087.tif"/>, and <italic>δ</italic> is small, then the average <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i088.tif"/> is not far from <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i089.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i090.tif"/> is not far from 7 (hence positive). In fact the sum of the intervals is the interval <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i092.tif"/>, and </p>
      <p><disp-formula id="algorithms-05-00379-i093">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i093.tif"/>
          </disp-formula></p>
      <p>The “zero” points <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i094.tif"/> of <italic>f</italic> are −3 and 3, and as soon as <italic>δ</italic> is large enough so that the interval <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i097.tif"/> “hits” a point where <italic>f</italic> vanishes, communication between the nodes is required in order to verify Equation (1). In this particular example as long as <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i098.tif"/>, and, therefore, </p>
      <p><disp-formula id="algorithms-05-00379-i099">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i099.tif"/>
          <label>(10)</label>
          </disp-formula></p>
      <p> no communication is required between the nodes.</p>
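      <p>The interval argument can be reproduced numerically. The sketch below assumes a quadratic <italic>f</italic> with zeros at −3 and 3 and value 7 at 4, which is consistent with the numbers quoted in the text, and uses hypothetical node values 2 and 6 whose mean is 4. It computes the distance from the mean to the zero set and checks, on a grid of perturbations, that the value of <italic>f</italic> at the mean stays positive while each node moves by at most <italic>δ</italic>.</p>
      <preformat><![CDATA[
```python
def f(x):
    # Assumed form of the example function: zeros at -3 and 3, f(4) = 7.
    return x * x - 9.0


ZEROS = (-3.0, 3.0)


def safe_radius(mean):
    """Distance from the current mean to the zero set of f."""
    return min(abs(mean - z) for z in ZEROS)


def stays_positive(v1, v2, delta, steps=50):
    """Check f at the mean over a grid of perturbations bounded by delta."""
    for i in range(steps + 1):
        for j in range(steps + 1):
            d1 = -delta + 2.0 * delta * i / steps   # perturbation of node 1
            d2 = -delta + 2.0 * delta * j / steps   # perturbation of node 2
            if f((v1 + d1 + v2 + d2) / 2.0) <= 0:
                return False
    return True
```
]]></preformat>
      <p>With these assumed values the safe radius equals 1: perturbations of size 0.9 keep <italic>f</italic> positive at the mean, while perturbations of size 1 allow the mean to reach the zero point 3.</p>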
      <p>The condition presented above is a sufficient condition that guarantees Equation (1). Like any sufficient condition, it can be conservative. In fact, when the distance is provided by the <italic>l</italic><sub>2</sub> norm, this sufficient condition is more conservative than the one provided by the “ball monitoring” of Equation (9) suggested in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]. On the other hand, since only a scalar <italic>δ</italic> needs to be communicated to each node, the value of the updated mean <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i102.tif"/> need not be transmitted (hence communication savings are possible), and there is no need to compute the distance from the center of each ball <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i103.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i104.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i105.tif"/> to the zero set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i094.tif"/>. For a detailed comparison of results we refer the reader to [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>].</p>
      <p>We conclude the section by remarking that when inequality Equation (1) is reversed the same technique can be used to monitor the reversed inequality while minimizing communication between the nodes. We provide additional details in <xref ref-type="sec" rid="sec5-algorithms-05-00379">Section 5</xref>. In the next section we extend the above “monitoring with no communication” argument to the general vector setting. The approach suggested in the next section is motivated by earlier research on robust stability of control systems (see e.g., [<xref ref-type="bibr" rid="B13-algorithms-05-00379">13</xref>]). </p>
    </sec>
    <sec>
      <title>4. Convex Minimization Problem</title>
      <p>In this section we state the monitoring problem as a convex minimization problem. For an appropriate analysis background we refer the interested reader to the classical monograph [<xref ref-type="bibr" rid="B14-algorithms-05-00379">14</xref>]. For the relevant convex analysis material see [<xref ref-type="bibr" rid="B15-algorithms-05-00379">15</xref>].</p>
      <p>Consider the following optimization problem:</p>
      <p>
        <bold>Problem 4.1</bold>
        <italic>For a function <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i106.tif"/> concave with respect to the first <italic>d</italic> variables <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i108.tif"/> and convex with respect to the last <italic>nd</italic> variables <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i110.tif"/>, solve </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i111">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i111.tif"/>
          <label>(11)</label>
          </disp-formula></p>
      <p>A solution for Problem 4.1 with appropriately selected <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i112.tif"/> concludes the section.</p>
      <p>The connection between Problem 4.1 and the monitoring problem is explained next. Let <italic>B</italic> be a <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i114.tif"/> matrix made of <italic>n</italic> blocks, where each block is the <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i115.tif"/> identity matrix multiplied by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i116.tif"/>, so that for a set of <italic>n</italic> vectors <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i117.tif"/> in <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i118.tif"/> one has </p>
      <p><disp-formula id="algorithms-05-00379-i119">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i119.tif"/>
          <label>(12)</label>
          </disp-formula></p>
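      <p>To make Equation (12) concrete, the following sketch builds <italic>B</italic> explicitly and checks that multiplying the stacked vector w by <italic>B</italic> returns the mean of the node vectors. The scaling factor 1/<italic>n</italic>, the helper names and the sample data are assumptions made for illustration only.</p>
      <preformat>
```python
def build_B(n, d):
    """Return the d x (n*d) matrix [I/n, I/n, ..., I/n] as a list of rows.

    Assumption: each of the n blocks is the d x d identity scaled by 1/n,
    so that B maps the stacked node vectors to their mean.
    """
    return [[(1.0 / n) if col % d == row else 0.0 for col in range(n * d)]
            for row in range(d)]


def matvec(M, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


# Hypothetical data: n = 3 nodes holding vectors in R^2.
nodes = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
n, d = len(nodes), len(nodes[0])

w = [coord for v in nodes for coord in v]          # stacked vector in R^(n*d)
mean = [sum(v[k] for v in nodes) / n for k in range(d)]

Bw = matvec(build_B(n, d), w)                      # should agree with mean
```
      </preformat>
      <p>For this sample both <italic>B</italic>w and the mean are (3, 5), up to floating point rounding.</p>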
      <p>Assume that inequality Equation (1) holds for the vector w, <italic>i.e.</italic>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i120.tif"/>. We are looking for a vector x “nearest” to w so that <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i121.tif"/>, <italic>i.e.</italic>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i122.tif"/> for some <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i123.tif"/> (where <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i094.tif"/> is the zero set of <italic>f</italic>, <italic>i.e.</italic>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i124.tif"/>). We now fix z <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i125.tif"/> and denote the distance from w to the set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i126.tif"/>. Note that for each y inside the ball of radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i127.tif"/> centered at w, one has <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i128.tif"/>. If y belongs to a ball of radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i129.tif"/> centered at w, then the inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i130.tif"/> holds true.</p>
      <p>Let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i001.tif"/> be a “norm” on <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i131.tif"/> (the specific functions <italic>F</italic> used in the numerical experiments are described later). The nearest “bad” vector problem described above is the following.</p>
      <p>
        <bold>Problem 4.2</bold>
        <italic>For <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i125.tif"/> identify</italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i133">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i133.tif"/>
          <label>(13)</label>
          </disp-formula></p>
      <p>We note that Equation (13) is equivalent to <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i135.tif"/> The function </p>
      <p><disp-formula id="algorithms-05-00379-i136">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i136.tif"/>
          </disp-formula></p>
      <p>is concave (actually linear) in <italic>λ</italic>, and convex in x. Hence (see e.g., [<xref ref-type="bibr" rid="B15-algorithms-05-00379">15</xref>]) </p>
      <p><disp-formula id="algorithms-05-00379-i138">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i138.tif"/>
          </disp-formula></p>
      <p>The right-hand side of the above equality can be conveniently written as follows </p>
      <p><disp-formula id="algorithms-05-00379-i139">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i139.tif"/>
          </disp-formula></p>
      <p>The conjugate <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i140.tif"/> of a function <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i141.tif"/> is defined by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i142.tif"/> (see e.g., [<xref ref-type="bibr" rid="B15-algorithms-05-00379">15</xref>]). We note that </p>
      <p><disp-formula id="algorithms-05-00379-i143">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i143.tif"/>
          </disp-formula></p>
      <p>hence to compute </p>
      <p><disp-formula id="algorithms-05-00379-i144">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i144.tif"/>
          </disp-formula></p>
      <p>one has to deal with </p>
      <p><disp-formula id="algorithms-05-00379-i145">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i145.tif"/>
          </disp-formula></p>
      <p>For many functions <italic>g</italic> the conjugate <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i147.tif"/> can be easily computed. Next we list conjugate functions for the most popular norms:</p>
      <p>1. <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i148.tif"/></p>
      <p>2. <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i149.tif"/></p>
      <p>3. <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i150.tif"/></p>
      <p><disp-formula id="algorithms-05-00379-i151">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i151.tif"/>
          </disp-formula></p>
      <p>We note that some of the functions <italic>F</italic> we consider in this paper are different from <italic>l</italic><sub><italic>p</italic></sub> norms (see <xref ref-type="table" rid="algorithms-05-00379-t001">Table 1</xref> for the list of the functions). We first select <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i153.tif"/>, and show below that in this case </p>
      <p><disp-formula id="algorithms-05-00379-i154">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i154.tif"/>
          </disp-formula></p>
      <p>Note that with the choice <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i155.tif"/> the problem <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i156.tif"/> becomes </p>
      <p><disp-formula id="algorithms-05-00379-i157">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i157.tif"/>
          </disp-formula></p>
      <p>Since <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i158.tif"/> the problem reduces to </p>
      <p><disp-formula id="algorithms-05-00379-i159">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i159.tif"/>
          </disp-formula></p>
      <p>The solution to this maximization problem is <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i160.tif"/>. Analogously, when </p>
      <p><disp-formula id="algorithms-05-00379-i161">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i161.tif"/>
          </disp-formula></p>
      <p>one has <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i162.tif"/> Assuming <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i163.tif"/> one has to look at </p>
      <p><disp-formula id="algorithms-05-00379-i164">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i164.tif"/>
          </disp-formula></p>
      <p>Hence </p>
      <p><disp-formula id="algorithms-05-00379-i165">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i165.tif"/>
          </disp-formula></p>
      <p>and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i166.tif"/>. Finally, the value for <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i167.tif"/> is given by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i168.tif"/>. When <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i169.tif"/> one has <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i170.tif"/>. For clarity we collect the above results in <xref ref-type="table" rid="algorithms-05-00379-t001">Table 1</xref>.</p>
      <table-wrap id="algorithms-05-00379-t001" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t001_Table 1</object-id>
        <label>Table 1</label>
        <caption>
          <p>norm–ball radius correspondence for three different norms and fixed <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i171.tif"/>.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="left" valign="middle"><italic>F</italic>(x)</th>
              <th align="left" valign="middle"><italic>r</italic>(z)</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left" valign="middle"><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i172.tif"/></td>
              <td align="left" valign="middle">||<bold>z </bold>− <italic>B</italic><bold>w</bold>||<sub>1</sub></td>
            </tr>
            <tr>
              <td align="left" valign="middle"><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i173.tif"/></td>
              <td align="left" valign="middle">||<bold>z </bold>− <italic>B</italic><bold>w</bold>||<sub>2</sub></td>
            </tr>
            <tr>
              <td align="left" valign="middle"><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i174.tif"/></td>
              <td align="left" valign="middle">||<bold>z </bold>− <italic>B</italic><bold>w</bold>||<sub>∞</sub></td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
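      <p>The correspondence in Table 1 can be checked numerically. The sketch below assumes that <italic>F</italic>(x) is the largest norm among the <italic>n</italic> blocks of x, and that <italic>B</italic> maps x to the mean of its blocks; both the helper names and the random data are hypothetical. Moving every node by z − <italic>B</italic>w attains the radius, and randomly generated vectors x with <italic>B</italic>x = z are never closer to w than that radius.</p>
      <preformat>
```python
import random


def norm(v, p):
    """l_p norm of a list of floats; p may be 1, 2 or float('inf')."""
    if p == float("inf"):
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def F(blocks, p):
    """Assumed form of F: the largest l_p norm among the n node blocks."""
    return max(norm(b, p) for b in blocks)


def mean(blocks):
    """Coordinate-wise mean of the blocks (the action of B)."""
    n = len(blocks)
    return [sum(b[k] for b in blocks) / n for k in range(len(blocks[0]))]


def diff(xs, ws):
    """Block-wise difference x_j - w_j."""
    return [[a - b for a, b in zip(xj, wj)] for xj, wj in zip(xs, ws)]


random.seed(0)
n, d = 4, 3
w = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
z = [random.uniform(-1, 1) for _ in range(d)]
shift = [zk - mk for zk, mk in zip(z, mean(w))]            # z - Bw

results = {}
for p in (1, 2, float("inf")):
    radius = norm(shift, p)                                 # Table 1 entry
    # Feasible point attaining the radius: move every node by z - Bw.
    x_best = [[wk + sk for wk, sk in zip(wj, shift)] for wj in w]
    attained = F(diff(x_best, w), p)
    # Other feasible points: add zero-mean perturbations to x_best.
    worst_gap = float("inf")
    for _ in range(200):
        pert = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
        m = mean(pert)
        pert = [[c - mk for c, mk in zip(row, m)] for row in pert]
        x = [[a + c for a, c in zip(xj, pj)] for xj, pj in zip(x_best, pert)]
        worst_gap = min(worst_gap, F(diff(x, w), p) - radius)
    results[p] = (radius, attained, worst_gap)
```
      </preformat>
      <p>Each entry of <monospace>results</monospace> holds the radius, the value attained by the uniform shift, and the smallest observed gap, which should be nonnegative up to rounding.</p>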
      <p>In the algorithm described below the norm is denoted just by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i175.tif"/> (numerical experiments presented in <xref ref-type="sec" rid="sec5-algorithms-05-00379">Section 5</xref> are conducted with all three norms). The monitoring algorithm we propose is the following.</p>
      <p>
        <bold>Algorithm 4.1</bold>
        <italic>Threshold monitoring algorithm.</italic>
      </p>
	  <list>
	  <list-item>
      <p>
        <italic>1. Set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i176.tif"/>.</italic>
      </p>
	  </list-item>
	  <list-item>
      <p>
        <italic>2. Until end of stream.</italic>
      </p>
	  </list-item>
	  <list-item>
      <p><italic>3.    Set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i177.tif"/>, <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i104.tif"/> (</italic>i.e.<italic>, remember “initial” values for the vectors).</italic></p>
	  </list-item>
	  <list-item>
      <p>
        <italic>4.    Set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i178.tif"/> (for the definition of w see Equation (12)).</italic>
      </p>
	  </list-item>
	  <list-item>
      <p>
        <italic>5.    Set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i180.tif"/>.</italic>
      </p>
	  </list-item>
	  <list-item>
      <p>
        <italic>6.    If <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i181.tif"/> for each <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i104.tif"/></italic>
      </p>
      <p>
              <italic>go to step 5</italic>
      </p>
      <p>
           <italic>else</italic>
      </p>
      <p>
              <italic>go to step 3</italic>
      </p>
	  </list-item>
	  </list>
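      <p>The control flow of Algorithm 4.1 can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the callable <monospace>dist_to_zero_set</monospace>, the strict inequality in the local test, and the toy function <italic>f</italic>(v) = v[0] − 1 are all assumptions introduced here.</p>
      <preformat>
```python
import math


def l2(u, v):
    """Euclidean distance between two vectors given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def monitor(stream, dist_to_zero_set, dist=l2):
    """Count mean recomputations for a stream of node snapshots.

    stream            -- iterable of snapshots; each snapshot is a list of
                         the n node vectors at one time instance
    dist_to_zero_set  -- hypothetical callable returning the distance from
                         a mean vector to the zero set of f (Step 4)
    dist              -- distance used in the local constraint (Step 6)
    """
    stream = iter(stream)
    anchors = next(stream)                      # Step 3: remember initial vectors
    n = len(anchors)
    mean = [sum(c) / n for c in zip(*anchors)]
    delta = dist_to_zero_set(mean)              # Step 4
    mean_updates = 0
    for snapshot in stream:                     # Steps 2 and 5: next time instance
        # Step 6: every node must stay strictly within delta of its anchor.
        ok = all(delta > dist(a, v) for a, v in zip(anchors, snapshot))
        if not ok:                              # violation: back to Step 3
            anchors = snapshot
            mean = [sum(c) / n for c in zip(*anchors)]
            delta = dist_to_zero_set(mean)
            mean_updates += 1
    return mean_updates


# Toy usage: f(v) = v[0] - 1, whose zero set is the hyperplane v[0] = 1,
# so the l2 distance from v to the zero set is |v[0] - 1|.
snapshots = [
    [[3.0, 0.0], [3.0, 0.0]],    # initial vectors, mean (3, 0), delta = 2
    [[3.5, 0.0], [2.5, 0.5]],    # both nodes moved by less than 2
    [[5.5, 0.0], [3.0, 0.0]],    # first node moved by 2.5: mean is recomputed
    [[5.0, 1.0], [3.0, 1.0]],    # small moves relative to the new delta = 3.25
]
updates = monitor(snapshots, lambda v: abs(v[0] - 1.0))
```
      </preformat>
      <p>In this toy run only the third snapshot violates a local constraint, so the mean is recomputed once. Passing a different <monospace>dist</monospace> gives the same loop for another norm.</p>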
      <p>In what follows, we assume that transmission of a double precision real number amounts to broadcasting one message. The message count is based on the assumption that a new text arrives at every node simultaneously. When a mean update is required, a coordinator (root) requests and receives messages from the nodes.</p>
      <p>We next count the number of messages that must be broadcast per iteration when the local constraint <italic>δ</italic> is violated at one or more nodes. We shall denote the set of all nodes by N, the set of nodes complying with the constraint by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i182.tif"/>, and the set of nodes violating the constraint by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i183.tif"/> (so that <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i184.tif"/>). The cardinality of the sets is denoted by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i185.tif"/> respectively, so that <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i186.tif"/>. Assuming <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i187.tif"/> one has the following:</p>
	  <list>
	  <list-item>
      <p>1. the <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i188.tif"/> violating nodes transmit their scalar ID and new coordinates to the root (<inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i189.tif"/> messages).</p>
	  </list-item>
	  <list-item>
      <p>2. the root sends scalar requests for new coordinates to the complying <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i190.tif"/> nodes (<inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i191.tif"/> messages).</p>
	  </list-item>
	  <list-item>
      <p>3. the <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i191.tif"/> complying nodes transmit new coordinates to the root (<inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i193.tif"/> messages).</p>
	  </list-item>
	  <list-item>
      <p>4. the root updates itself, computes the new distance <italic>δ</italic> to the surface, and sends <italic>δ</italic> to each node (<inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i194.tif"/> messages).</p>
	  </list-item>
	  </list>
      <p>This leads to a total of</p>
      <p><disp-formula id="algorithms-05-00379-i195">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i195.tif"/>
          <label>(14)</label>
          </disp-formula></p>
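      <p>The four steps above can be tallied in a few lines. The sketch assumes that one real number, one scalar ID or one scalar request is one message, and that node vectors have <italic>d</italic> coordinates; the function name and the values <italic>n</italic> = 10, <italic>d</italic> = 4 are illustrative assumptions.</p>
      <preformat>
```python
def messages_per_update(n, d, violators):
    """Messages broadcast in one iteration with one or more violating nodes.

    Assumption: each real number, scalar ID or scalar request is one
    message, and every node vector has d coordinates.
    """
    if not (violators >= 1 and n >= violators):
        raise ValueError("need between 1 and n violating nodes")
    compliers = n - violators
    step1 = violators * (d + 1)   # ID plus d coordinates from each violator
    step2 = compliers             # one scalar request per complying node
    step3 = compliers * d         # d coordinates from each complying node
    step4 = n                     # the root sends delta to every node
    return step1 + step2 + step3 + step4


# The total does not depend on how the n nodes split into the two sets.
totals = {messages_per_update(10, 4, k) for k in range(1, 11)}
```
      </preformat>
      <p>Under this reading the total is (<italic>d</italic> + 2)<italic>n</italic> for any split of the nodes, which gives 60 messages for the illustrative values used here.</p>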
      <p>We conclude the section with three remarks. The first compares the conservatism of Algorithm 4.1 with that of the algorithm suggested in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]. The second compares the ball cover suggested in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>] with an application of Algorithm 4.1 using the <italic>l</italic><sub>1</sub> norm. The last shows by an example that Equation (8) fails when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i081.tif"/> is substituted by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i196.tif"/>. The significance of this negative result becomes clear in <xref ref-type="sec" rid="sec5-algorithms-05-00379">Section 5</xref>.</p>
      <p><bold>Remark 4.1</bold> <italic>Let <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i197.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i198.tif"/>. If the Step 6 inequality holds for each node, then each point of the ball centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i199.tif"/> with radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i200.tif"/> is contained in the <italic>l</italic><sub>2</sub> ball of radius <italic>δ</italic> centered at </italic>v<italic> (see </italic><xref ref-type="fig" rid="algorithms-05-00379-f002">Figure 2</xref><italic>). Hence the sufficient condition offered by Algorithm 4.1 is <bold>more</bold> conservative than the one suggested in </italic>[<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]<italic>. </italic></p>
      <fig id="algorithms-05-00379-f002" position="anchor">
        <label>Figure 2</label>
        <caption>
          <p>conservative cover by a single <italic>l</italic><sub>2</sub> ball.</p>
        </caption>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-g002.tif"/>
      </fig>
      <p>Algorithm 4.1 can be executed with a variety of different norms, and, as we show next, <italic>l</italic><sub>2</sub> might not be the best one when communication between the nodes should be minimized.</p>
      <p><bold>Remark 4.2</bold> <italic>Let </italic><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i202.tif"/>,</p>
      <p><disp-formula id="algorithms-05-00379-i203">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i203.tif"/>
          </disp-formula></p>
      <p>
        <italic>the</italic>
        <italic>distance is given by the <italic>l</italic><sub>1</sub> norm, and the aim is to monitor the inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i204.tif"/>. Let </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i205">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i205.tif"/>
          </disp-formula>
      </p>
      <p><italic>We first consider the “ball cover” construction suggested in </italic>[<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]<italic>. With this data <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i206.tif"/> with <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i207.tif"/>, and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i208.tif"/> with <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i209.tif"/>. At the same time <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i210.tif"/><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i211.tif"/>. It is easy to see that the <italic>l</italic><sub>2</sub> ball of radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i212.tif"/> centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i213.tif"/> intersects the <italic>l</italic><sub>1</sub> ball of radius 1 centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i215.tif"/> (see </italic><xref ref-type="fig" rid="algorithms-05-00379-f003">Figure 3</xref><italic>). Hence the algorithm suggested in </italic>[<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]<italic> requires nodes to communicate at time <italic>t</italic><sub>1</sub>.</italic></p>
      <p>
        <italic>On the other hand the <italic>l</italic><sub>1</sub> distance from <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i216.tif"/> to the set <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i217.tif"/> is 1, and since </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i218">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i218.tif"/>
          </disp-formula>
      </p>
      <p><italic>Algorithm 4.1 requires no communication between nodes at time <italic>t</italic><sub>1</sub>. In this particular case the sufficient condition offered by Algorithm 4.1 is <bold>less</bold> conservative than the one suggested in </italic>[<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]<italic>. </italic></p>
      <fig id="algorithms-05-00379-f003" position="anchor">
        <label>Figure 3</label>
        <caption>
          <p><italic>l</italic><sub>2</sub> ball cover requires communication.</p>
        </caption>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-g003.tif"/>
      </fig>
      <p>
        <bold>Remark 4.3</bold> 
        <italic>It is easy to see that inclusion Equation (8) fails when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i220.tif"/> is an <italic>l</italic><sub>1</sub> ball of radius <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i221.tif"/> centered at <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i222.tif"/>. Indeed, when, for example, </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i223">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i223.tif"/>
          </disp-formula></p>
      <p>
        <italic>(see </italic>
        <xref ref-type="fig" rid="algorithms-05-00379-f004">Figure 4</xref>
        <italic>) one has </italic>
      </p>
      <p><disp-formula id="algorithms-05-00379-i224">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i224.tif"/>
          </disp-formula></p>
      <p>
        <italic>In the next section we apply Algorithm 4.1 to real-life data and report the number of required mean computations.</italic>
      </p>
      <fig id="algorithms-05-00379-f004" position="anchor">
        <label>Figure 4</label>
        <caption>
          <p>failed cover by <italic>l</italic><sub>1</sub> balls.</p>
        </caption>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-g004.tif"/>
      </fig>
    </sec>
    <sec id="sec5-algorithms-05-00379">
      <title>5. Experimental Results</title>
      <p>We apply Algorithm 4.1 to data streams generated from the Reuters Corpus RCV1–V2. The data is available from [<xref ref-type="bibr" rid="B16-algorithms-05-00379">16</xref>] and consists of 781,265 tokenized documents with DID (document ID) ranging from 2651 to 810596.</p>
      <p>The methodology described below attempts to follow that presented in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>]. We simulate <italic>n</italic> streams by arranging the feature vectors in ascending order with respect to DID, and assigning feature vectors to the streams in round-robin fashion.</p>
      <p>In the Reuters Corpus RCV1–V2 each document is labeled as belonging to one or more categories. We label a vector as “relevant” if it belongs to the “CORPORATE/INDUSTRIAL” (“CCAT”) category, and “spam” otherwise. Following [<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>] we focus on three features: “bosnia”, “ipo”, and “febru”. Each experiment was performed with 10 nodes, where each node holds a sliding window containing the last 6700 documents it received.</p>
      <p>First, we use 67,000 documents to generate initial sliding windows. The remaining 714,265 documents are used to generate data streams, hence the selected feature information gain is computed 714,265 times. Based on all the documents contained in the sliding window at each one of the 714,266 time instances, we compute and graph 714,266 information gain values for the feature “bosnia” (see <xref ref-type="fig" rid="algorithms-05-00379-f005">Figure 5</xref>).</p>
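      <p>The stream construction and the document counts can be illustrated with a short sketch. The function name and the tiny 12-document example are hypothetical; only the figures 10, 6700 and 781,265 come from the text.</p>
      <preformat>
```python
from collections import deque


def round_robin_streams(doc_ids, n):
    """Split documents, sorted by ascending ID, into n streams."""
    streams = [[] for _ in range(n)]
    for k, did in enumerate(sorted(doc_ids)):
        streams[k % n].append(did)
    return streams


n_nodes, window_size, total_docs = 10, 6700, 781265

initial_docs = n_nodes * window_size             # documents filling the windows
remaining_docs = total_docs - initial_docs       # documents left for the streams

# Tiny hypothetical example: 3 nodes, windows of 2 documents, 12 document IDs.
streams = round_robin_streams(range(100, 112), 3)
windows = [deque(maxlen=2) for _ in streams]
for t in range(len(streams[0])):                 # one new document per node per step
    for window, stream in zip(windows, streams):
        window.append(stream[t])                 # the oldest document drops out
final_windows = [list(w) for w in windows]
```
      </preformat>
      <p>Here <monospace>initial_docs</monospace> is 67,000 and <monospace>remaining_docs</monospace> is 714,265, in agreement with the counts above.</p>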
      <p>For the experiments described below the threshold value <italic>r</italic> is predefined, and the goal is to monitor the inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i226.tif"/> while minimizing communication between the nodes. From now on we shall assume simultaneous arrival of a new text at each node.</p>
      <fig id="algorithms-05-00379-f005" position="anchor">
        <label>Figure 5</label>
        <caption>
          <p>information gain values for the feature “bosnia”.</p>
        </caption>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-g005.tif"/>
      </fig>
      <p>As new texts arrive, the local constraint (<italic>i.e.</italic>, inequalities <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i227.tif"/>) at each node is verified. If at least one node violates the local constraint, the average <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i228.tif"/> is updated. Our numerical experiment with the feature “bosnia”, the <italic>l</italic><sub>2</sub> norm, and the threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/> (reported in [<xref ref-type="bibr" rid="B8-algorithms-05-00379">8</xref>] as the threshold for feature “bosnia” incurring the highest communication cost) shows 4006 computations of the mean vector overall. An application of Equation (14) yields 240,360 messages. We repeat this experiment with the <italic>l</italic><sub>∞</sub> and <italic>l</italic><sub>1</sub> norms. The results, collected in <xref ref-type="table" rid="algorithms-05-00379-t002">Table 2</xref>, show that the <italic>l</italic><sub>1</sub> norm requires the smallest number of mean updates. </p>
      <table-wrap id="algorithms-05-00379-t002" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t002_Table 2</object-id>
        <label>Table 2</label>
        <caption>
          <p>number of mean computations, messages, and crossings per norm for feature “bosnia” with threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/>.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="center" valign="middle">Distance</th>
              <th align="center" valign="middle">Mean Comps</th>
              <th align="center" valign="middle">Messages</th>
              <th align="center" valign="middle">LL</th>
              <th align="center" valign="middle">LG</th>
              <th align="center" valign="middle">GL</th>
              <th align="center" valign="middle">GG</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>2</sub>
              </td>
              <td align="center" valign="middle">4006</td>
              <td align="center" valign="middle">240,360</td>
              <td align="center" valign="middle">959</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">3043</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>∞</sub>
              </td>
              <td align="center" valign="middle">3801</td>
              <td align="center" valign="middle">228,060</td>
              <td align="center" valign="middle">913</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2884</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>1</sub>
              </td>
              <td align="center" valign="middle">3053</td>
              <td align="center" valign="middle">183,180</td>
              <td align="center" valign="middle">805</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2244</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Throughout the iterations the mean <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i231.tif"/> goes through a sequence of updates, and the values <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i232.tif"/> may be larger than, equal to, or less than the threshold <italic>r</italic>. We monitor the case <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i233.tif"/> the same way as that of <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i234.tif"/>. In addition to the number of mean computations, we collect statistics concerning “crossings” (or lack thereof), <italic>i.e.</italic>, the number of instances when the location of the mean v and its update <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i235.tif"/> relative to the surface <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i236.tif"/> are either identical or different. Specifically, over the monitoring period we denote by:</p>
	  <list>
	  <list-item>
      <p>1. “LL” the number of instances when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i237.tif"/> and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i238.tif"/>, </p>
	  </list-item>
	  <list-item>
      <p>2. “LG” the number of instances when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i237.tif"/> and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i239.tif"/>, </p>
	  </list-item>
	  <list-item>
      <p>3. “GL” the number of instances when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i240.tif"/> and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i238.tif"/>, </p>
	  </list-item>
	  <list-item>
      <p>4. “GG” the number of instances when <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i240.tif"/> and <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i239.tif"/>. </p>
	  </list-item>
	  </list>
      <p>The number of “crossings” is reported in the last four columns of <xref ref-type="table" rid="algorithms-05-00379-t002">Table 2</xref>.</p>
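      <p>The entries of Table 2 can be cross-checked with a short script. The sketch below verifies that the four crossing counts add up to the number of mean computations in each row and computes the number of messages per mean computation. The helper <monospace>crossing_label</monospace> is a hypothetical illustration of how one update could be classified; its treatment of values equal to the threshold is an assumption.</p>
      <preformat>
```python
# Rows of Table 2: norm, mean computations, messages, LL, LG, GL, GG.
table2 = [
    ("l2",   4006, 240360, 959, 2, 2, 3043),
    ("linf", 3801, 228060, 913, 2, 2, 2884),
    ("l1",   3053, 183180, 805, 2, 2, 2244),
]


def crossing_label(f_old, f_new, r):
    """Label one mean update by the side of the threshold before and after.

    Assumption: "L" means the value of f is below r and "G" means it is
    not; the handling of values equal to r is a choice made for this sketch.
    """
    def side(value):
        return "G" if value >= r else "L"
    return side(f_old) + side(f_new)


checks = []
for name, mean_comps, messages, ll, lg, gl, gg in table2:
    checks.append((
        name,
        ll + lg + gl + gg == mean_comps,   # every update falls in one class
        messages // mean_comps,            # messages per mean computation
        messages % mean_comps,             # remainder, zero if the ratio is exact
    ))
```
      </preformat>
      <p>For all three rows the crossing counts sum to the number of mean computations, and the ratio of messages to mean computations is exactly 60.</p>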
      <p>Note that the variation of the vectors <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i003.tif"/> does not have to be uniform. Taking into account the distribution of signals at each node may lead to additional communication savings. We illustrate this statement by a simple example involving just two nodes. If, for example, there is a reason to believe that </p>
      <p><disp-formula id="algorithms-05-00379-i241">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i241.tif"/>
          <label>(15)</label>
          </disp-formula></p>
      <p>then the number of node violations may be reduced by imposing node dependent constraints </p>
      <p><disp-formula id="algorithms-05-00379-i242">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i242.tif"/>
          </disp-formula></p>
      <p>so that the faster varying signal at the second node enjoys larger “freedom” of change, while the inequality </p>
      <p><disp-formula id="algorithms-05-00379-i243">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i243.tif"/>
          </disp-formula></p>
      <p>holds true. Assignment of “weighted” local constraints requires the information provided by Equation (15). With no additional assumptions about the signal distribution, this information is not available. Unlike [<xref ref-type="bibr" rid="B11-algorithms-05-00379">11</xref>], we refrain from making assumptions regarding possible underlying data distributions; instead, we estimate the weights as follows: </p>
	  <list>
	  <list-item>
      <p>1. Start with the initial set of weights </p>
      <p><disp-formula id="algorithms-05-00379-i245">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i245.tif"/>
          <label>(16)</label>
          </disp-formula></p>
	  </list-item>
	  <list-item>
      <p>2. As texts arrive at the next time instance <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i246.tif"/> each node computes </p>
      <p><disp-formula id="algorithms-05-00379-i247">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i247.tif"/>
          </disp-formula></p>
      <p>If at time <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i248.tif"/> a local constraint is violated, then, in addition to <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i249.tif"/> messages (see Equation (14)), each node <italic>j</italic> broadcasts <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i250.tif"/> to the root, the root computes <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i251.tif"/>, and transmits the updated weights </p>
      <p><disp-formula id="algorithms-05-00379-i252">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i252.tif"/>
          </disp-formula></p>
      <p>back to node <italic>j</italic>. </p>
	  </list-item>
	  </list>
      <p>Broadcasting the weights increases the total number of messages per iteration to </p>
      <p><disp-formula id="algorithms-05-00379-i253">
          <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i253.tif"/>
          <label>(17)</label>
          </disp-formula></p>
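      <p>The feedback loop described in Steps 1 and 2 can be illustrated with a short Python sketch. The exact weight formulas are those of the displayed equations; the rule used below (each node accumulates the norm of its successive changes, and the root returns weights proportional to these totals, normalized to average one) is only an assumption of this sketch, as are the names <monospace>Node</monospace>, <monospace>observe</monospace>, and <monospace>update_weights</monospace>.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


class Node:
    """Hypothetical node that tracks how much its local vector has varied."""

    def __init__(self, v0):
        self.v = np.asarray(v0, dtype=float)
        self.total_variation = 0.0  # running sum of ||v(t) - v(t-1)||

    def observe(self, v_new, norm_ord=2):
        v_new = np.asarray(v_new, dtype=float)
        self.total_variation += np.linalg.norm(v_new - self.v, ord=norm_ord)
        self.v = v_new


def update_weights(nodes):
    """Root-side step: collect the totals and return one weight per node.

    Assumed rule: weight_j = n * total_j / sum(totals), so the weights
    average to one and faster-varying nodes receive larger weights.
    """
    totals = np.array([node.total_variation for node in nodes])
    grand_total = totals.sum()
    if grand_total == 0.0:
        return np.ones(len(nodes))  # no variation seen yet: uniform weights
    return len(nodes) * totals / grand_total


# Two-node toy run: the second node changes twice as fast as the first.
nodes = [Node([0.0, 0.0]), Node([0.0, 0.0])]
for t in range(1, 4):
    nodes[0].observe([1.0 * t, 0.0])  # moves by 1 per step
    nodes[1].observe([0.0, 2.0 * t])  # moves by 2 per step
weights = update_weights(nodes)
print(weights)
```
]]></preformat>
      <p>In this toy run the second node receives twice the weight of the first, which matches the intent of giving the faster varying signal more “freedom” of change.</p>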
      <p>With the inequalities in Step 6 of Algorithm 4.1 replaced by <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i254.tif"/>, the resulting number of mean computations is reported in <xref ref-type="table" rid="algorithms-05-00379-t003">Table 3</xref>.</p>
      <p>It is of interest to compare the results presented in <xref ref-type="table" rid="algorithms-05-00379-t003">Table 3</xref> with those reported, for example, in [<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>]. The comparison, however, is not an easy task. While [<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>] reports the threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/> as the threshold value that incurred the highest communication cost, the paper leaves the concept of “communication cost” undefined (we define transmission of a double precision real number as a single “message”). In addition, [<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>] provides a graph of “Messages <italic>vs</italic>. Threshold” only. It appears that the maximal value of the “bosnia Messages <italic>vs.</italic> Threshold” graph is somewhere between 100,000 and 200,000.</p>
      <table-wrap id="algorithms-05-00379-t003" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t003_Table 3</object-id>
        <label>Table 3</label>
        <caption>
          <p>Number of mean computations, messages, and crossings per norm for feature “bosnia” with threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/>, and stream dependent local constraint <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i255.tif"/>.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="center" valign="middle">Distance</th>
              <th align="center" valign="middle">Mean Comps</th>
              <th align="center" valign="middle">Messages</th>
              <th align="center" valign="middle">LL</th>
              <th align="center" valign="middle">LG</th>
              <th align="center" valign="middle">GL</th>
              <th align="center" valign="middle">GG</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>2</sub>
              </td>
              <td align="center" valign="middle">2388</td>
              <td align="center" valign="middle">191,040</td>
              <td align="center" valign="middle">726</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">1658</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>∞</sub>
              </td>
              <td align="center" valign="middle">2217</td>
              <td align="center" valign="middle">177,360</td>
              <td align="center" valign="middle">658</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">1555</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>1</sub>
              </td>
              <td align="center" valign="middle">1846</td>
              <td align="center" valign="middle">147,680</td>
              <td align="center" valign="middle">611</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">2</td>
              <td align="center" valign="middle">1231</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
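      <p>Two simple consistency checks on the rows of Table 3 can be run with the Python sketch below. The numbers are copied from the table; the helper <monospace>check_row</monospace> is a hypothetical name, and the constant number of messages per mean computation is read off the table rather than derived here from Equation (17).</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Rows of Table 3: norm -> (mean computations, messages, (LL, LG, GL, GG)).
table3 = {
    "l2":   (2388, 191040, (726, 2, 2, 1658)),
    "linf": (2217, 177360, (658, 2, 2, 1555)),
    "l1":   (1846, 147680, (611, 2, 2, 1231)),
}


def check_row(mean_comps, messages, crossings):
    """Return (messages per mean computation, crossings sum to mean_comps?)."""
    per_update, remainder = divmod(messages, mean_comps)
    if remainder != 0:
        raise ValueError("messages is not a multiple of mean computations")
    return per_update, sum(crossings) == mean_comps


results = {norm: check_row(*row) for norm, row in table3.items()}
for norm, (per_update, consistent) in results.items():
    print(norm, per_update, consistent)
```
]]></preformat>
      <p>For every norm the message count is an exact multiple of the number of mean computations, and the four crossing counters add up to the number of mean computations.</p>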
      <p>We repeat the experiments with “febru” and “ipo” and report the results in <xref ref-type="table" rid="algorithms-05-00379-t004">Table 4</xref> and <xref ref-type="table" rid="algorithms-05-00379-t005">Table 5</xref>, respectively. The results obtained with stream dependent local constraints are a significant improvement over those presented in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>]. Consistent with the results in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>], the <italic>l</italic><sub>1</sub> norm requires the smallest number of mean updates in all reported experiments.</p>
      <table-wrap id="algorithms-05-00379-t004" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t004_Table 4</object-id>
        <label>Table 4</label>
        <caption>
          <p>Number of mean computations and messages per norm for feature “febru” with threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/>, and stream dependent local constraint <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i255.tif"/>. </p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="center" valign="middle">Distance</th>
              <th align="center" valign="middle">Mean Comps</th>
              <th align="center" valign="middle">Messages</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>2</sub>
              </td>
              <td align="center" valign="middle">1491</td>
              <td align="center" valign="middle">119,280</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>∞</sub>
              </td>
              <td align="center" valign="middle">1388</td>
              <td align="center" valign="middle">111,040</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>1</sub>
              </td>
              <td align="center" valign="middle">1304</td>
              <td align="center" valign="middle">104,320</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <table-wrap id="algorithms-05-00379-t005" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t005_Table 5</object-id>
        <label>Table 5</label>
        <caption>
          <p>Number of mean computations and messages per norm for feature “ipo” with threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/>, and stream dependent local constraint <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i255.tif"/>.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="center" valign="middle">Distance</th>
              <th align="center" valign="middle">Mean Comps</th>
              <th align="center" valign="middle">Messages</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>2</sub>
              </td>
              <td align="center" valign="middle">7656</td>
              <td align="center" valign="middle">612,480</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>∞</sub>
              </td>
              <td align="center" valign="middle">7377</td>
              <td align="center" valign="middle">590,160</td>
            </tr>
            <tr>
              <td align="center" valign="middle">
                <italic>l</italic><sub>1</sub>
              </td>
              <td align="center" valign="middle">6309</td>
              <td align="center" valign="middle">504,720</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
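      <p>To quantify how much the choice of norm matters, the following Python sketch computes, for each feature, the fraction of mean computations saved by the <italic>l</italic><sub>1</sub> norm relative to the <italic>l</italic><sub>2</sub> norm. The counts are copied from Tables 3, 4, and 5, with feature names as in the table captions; the helper <monospace>relative_saving</monospace> is a hypothetical name introduced for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Mean computations per norm, copied from Tables 3, 4, and 5.
mean_comps = {
    "bosnia": {"l2": 2388, "linf": 2217, "l1": 1846},
    "febru":  {"l2": 1491, "linf": 1388, "l1": 1304},
    "ipo":    {"l2": 7656, "linf": 7377, "l1": 6309},
}


def relative_saving(baseline, value):
    """Fraction of the baseline count that is saved by `value`."""
    return 1.0 - value / baseline


# Norm with the fewest mean computations for each feature.
best_norm = {feature: min(row, key=row.get) for feature, row in mean_comps.items()}

# Saving of l1 relative to l2 for each feature.
savings = {
    feature: relative_saving(row["l2"], row["l1"])
    for feature, row in mean_comps.items()
}

for feature in mean_comps:
    print(feature, best_norm[feature], round(savings[feature], 3))
```
]]></preformat>
      <p>With these counts the <italic>l</italic><sub>1</sub> norm saves roughly 23%, 13%, and 18% of the mean computations required by the <italic>l</italic><sub>2</sub> norm for “bosnia”, “febru”, and “ipo”, respectively.</p>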
    </sec>
    <sec id="sec6-algorithms-05-00379">
      <title>6. Future Research Directions</title>
      <p>In what follows we briefly outline a number of immediate research directions we plan to pursue.</p>
      <p>The local constraints introduced in this paper depend on the history of the data stream at each node, and the variations <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i256.tif"/> over time contribute uniformly to the local constraints. Attaching more weight to recent changes than to older ones may further improve the monitoring process.</p>
      <p><xref ref-type="table" rid="algorithms-05-00379-t006">Table 6</xref> (borrowed from [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>]) shows that in about 75% of instances (3034 out of 4006) the mean <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i257.tif"/> is updated because of a single node violation. This observation naturally leads to the idea of clustering nodes and independently monitoring the node clusters, each equipped with a coordinator. The monitoring then becomes a two-step procedure. At the first step, node violations are checked in each node separately. If a node violates its local constraint, the corresponding cluster computes an updated cluster coordinator. At the second step, violations of local constraints by coordinators are checked, and if at least one violation is detected the root is updated. <xref ref-type="table" rid="algorithms-05-00379-t006">Table 6</xref> indicates that in most instances only one coordinator will be affected, and, since communication within a cluster requires fewer messages, the two-step procedure briefly described above has the potential to bring additional savings.</p>
      <table-wrap id="algorithms-05-00379-t006" position="anchor">
        <object-id pub-id-type="pii">algorithms-05-00379-t006_Table 6</object-id>
        <label>Table 6</label>
        <caption>
          <p>Number of nodes simultaneously violating local constraints for feature “bosnia” with threshold <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i229.tif"/>, and <italic>l</italic><sub>2</sub> norm.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th align="center" valign="middle">nodes</th>
              <th align="right" valign="middle">violations</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="center" valign="middle">1</td>
              <td align="right" valign="middle">3034</td>
            </tr>
            <tr>
              <td align="center" valign="middle">2</td>
              <td align="right" valign="middle">620</td>
            </tr>
            <tr>
              <td align="center" valign="middle">3</td>
              <td align="right" valign="middle">162</td>
            </tr>
            <tr>
              <td align="center" valign="middle">4</td>
              <td align="right" valign="middle">70</td>
            </tr>
            <tr>
              <td align="center" valign="middle">5</td>
              <td align="right" valign="middle">38</td>
            </tr>
            <tr>
              <td align="center" valign="middle">6</td>
              <td align="right" valign="middle">26</td>
            </tr>
            <tr>
              <td align="center" valign="middle">7</td>
              <td align="right" valign="middle">34</td>
            </tr>
            <tr>
              <td align="center" valign="middle">8</td>
              <td align="right" valign="middle">17</td>
            </tr>
            <tr>
              <td align="center" valign="middle">9</td>
              <td align="right" valign="middle">5</td>
            </tr>
            <tr>
              <td align="center" valign="middle">10</td>
              <td align="right" valign="middle">0</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
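      <p>The percentages quoted from Table 6 can be reproduced with the short Python sketch below. The counts are copied from the table; the variable names are introduced only for this example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Table 6: number of simultaneously violating nodes -> number of instances.
violations = {1: 3034, 2: 620, 3: 162, 4: 70, 5: 38,
              6: 26, 7: 34, 8: 17, 9: 5, 10: 0}

total = sum(violations.values())

# Share of mean updates triggered by exactly one violating node.
share_single = violations[1] / total

# Share of mean updates triggered by at most two violating nodes.
share_at_most_two = (violations[1] + violations[2]) / total

print(total, round(share_single, 3), round(share_at_most_two, 3))
```
]]></preformat>
      <p>The counts add up to the 4006 instances mentioned in the text; a single node accounts for about 76% of them, and one or two nodes together account for about 91%.</p>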
      <p>We note that a standard clustering problem is often described as “…finding and describing cohesive or homogeneous chunks in data, the clusters” (see e.g., [<xref ref-type="bibr" rid="B17-algorithms-05-00379">17</xref>]). The data stream monitoring problem requires assigning to the same cluster <italic>i</italic> nodes <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i259.tif"/> so that the total change within cluster <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i260.tif"/> is minimized, <italic>i.e.</italic>, nodes with <bold>different</bold> variations <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i261.tif"/> that cancel each other out as much as possible should be assigned to the same cluster. Hence, unlike classical clustering procedures, one needs to combine “dissimilar” nodes together. This is a new and challenging type of clustering problem.</p>
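      <p>A toy Python sketch may help to make the “dissimilar nodes together” objective concrete. It assumes that the total change within a cluster is measured by the norm of the sum of the node variations in that cluster, and it restricts attention to clusters of size two that are found by exhaustive search; both choices, as well as the names <monospace>cluster_cost</monospace> and <monospace>all_pairings</monospace>, are assumptions of the sketch and not a proposed algorithm.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def cluster_cost(variations, clusters, norm_ord=2):
    """Sum over clusters of the norm of the summed node variations."""
    return sum(
        np.linalg.norm(np.sum([variations[j] for j in cluster], axis=0), ord=norm_ord)
        for cluster in clusters
    )


def all_pairings(items):
    """Yield every way to split a list of even length into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in all_pairings(remaining):
            yield [(first, partner)] + tail


# Four hypothetical nodes: two drift to the right, two drift to the left.
variations = np.array([[1.0, 0.0], [1.1, 0.0], [-1.0, 0.0], [-1.1, 0.0]])

similar = cluster_cost(variations, [(0, 1), (2, 3)])     # like with like
dissimilar = cluster_cost(variations, [(0, 2), (1, 3)])  # opposites together
best = min(all_pairings([0, 1, 2, 3]), key=lambda c: cluster_cost(variations, c))
print(similar, dissimilar, best)
```
]]></preformat>
      <p>Grouping similar nodes leaves the variations intact, while pairing nodes with opposite variations cancels them. Exhaustive search is feasible only for a handful of nodes, which is one reason the general problem is hard.</p>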
      <p>Realistically, verification of the inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i262.tif"/> should be conducted with an error margin (<italic>i.e.</italic>, the inequality <inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="algorithms-05-00379-i263.tif"/> should be investigated, see [<xref ref-type="bibr" rid="B9-algorithms-05-00379">9</xref>]). The effect of an error margin on the required communication load is another direction of future research. </p>
    </sec>
    <sec sec-type="conclusions">
      <title>7. Conclusions</title>
      <p>Monitoring streams over distributed systems is an important and challenging problem with a wide range of applications. In this paper we build on the approach for monitoring arbitrary threshold functions suggested in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>], and introduce stream dependent local constraints that serve as a feedback monitoring mechanism. The preliminary results indicate substantial improvement over those reported in [<xref ref-type="bibr" rid="B10-algorithms-05-00379">10</xref>], and demonstrate that monitoring with the <italic>l</italic><sub>1</sub> norm requires fewer updates than monitoring with the <italic>l</italic><sub>∞</sub> or <italic>l</italic><sub>2</sub> norm.</p>
    </sec>
  </body>
  <back>
    <ack>
      <title>Acknowledgments</title>
      <p>The authors thank the anonymous reviewers whose valuable comments greatly enhanced the exposition of the results. The work of the first author was supported in part by a 2012 UMBC Summer Faculty Fellowship grant.</p>
    </ack>
    <ref-list>
      <title>References</title>
      <ref id="B1-algorithms-05-00379">
        <label>1.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Madden</surname>
              <given-names>S.</given-names>
            </name>
            <name>
              <surname>Franklin</surname>
              <given-names>M.J.</given-names>
            </name>
          </person-group>
          <article-title>An Architecture for Queries Over Streaming Sensor Data</article-title>
          <source>Proceedings of the ICDE 02</source>
          <conf-loc>San Jose, CA</conf-loc>
          <conf-date>26 February–1 March 2002</conf-date>
          <fpage>555</fpage>
          <lpage>556</lpage>
        </citation>
      </ref>
      <ref id="B2-algorithms-05-00379">
        <label>2.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Dilman</surname>
              <given-names>M.</given-names>
            </name>
            <name>
              <surname>Raz</surname>
              <given-names>D.</given-names>
            </name>
          </person-group>
          <article-title>Efficient Reactive Monitoring</article-title>
          <source>Proceedings of the Twentieth Annual Joint Conference of the IEEE Computer and Communication Societies</source>
          <conf-loc>Anchorage, Alaska</conf-loc>
          <conf-date>2001</conf-date>
          <fpage>1012</fpage>
          <lpage>1019</lpage>
        </citation>
      </ref>
      <ref id="B3-algorithms-05-00379">
        <label>3.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Zhu</surname>
              <given-names>Y.</given-names>
            </name>
            <name>
              <surname>Shasha</surname>
              <given-names>D.</given-names>
            </name>
          </person-group>
          <article-title>StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time</article-title>
          <source>Proceedings of the 28th International Conference on Very Large Data Bases (VLDB)</source>
          <conf-loc>Hong Kong, China</conf-loc>
          <conf-date>2002</conf-date>
          <fpage>358</fpage>
          <lpage>369</lpage>
        </citation>
      </ref>
      <ref id="B4-algorithms-05-00379">
        <label>4.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Yi</surname>
              <given-names>B.-K.</given-names>
            </name>
            <name>
              <surname>Sidiropoulos</surname>
              <given-names>N.</given-names>
            </name>
            <name>
              <surname>Johnson</surname>
              <given-names>T.</given-names>
            </name>
            <name>
              <surname>Jagadish</surname>
              <given-names>H.V.</given-names>
            </name>
            <name>
              <surname>Faloutsos</surname>
              <given-names>C.</given-names>
            </name>
            <name>
              <surname>Biliris</surname>
              <given-names>A.</given-names>
            </name>
          </person-group>
          <article-title>Online Datamining for Co–Evolving Time Sequences</article-title>
          <source>Proceedings of ICDE 00, IEEE Computer Society</source>
          <conf-loc>San Diego, CA</conf-loc>
          <conf-date>2000</conf-date>
          <fpage>13</fpage>
          <lpage>22</lpage>
        </citation>
      </ref>
      <ref id="B5-algorithms-05-00379">
        <label>5.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Manjhi</surname>
              <given-names>A.</given-names>
            </name>
            <name>
              <surname>Shkapenyuk</surname>
              <given-names>V.</given-names>
            </name>
            <name>
              <surname>Dhamdhere</surname>
              <given-names>K.</given-names>
            </name>
            <name>
              <surname>Olston</surname>
              <given-names>C.</given-names>
            </name>
          </person-group>
          <article-title>Finding (Recently) Frequent Items in Distributed Data Streams</article-title>
          <source>Proceedings of the 21st International Conference on Data Engineering (ICDE 05)</source>
          <conf-loc>Tokyo, Japan</conf-loc>
          <conf-date>2005</conf-date>
          <fpage>767</fpage>
          <lpage>778</lpage>
        </citation>
      </ref>
      <ref id="B6-algorithms-05-00379">
        <label>6.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Wolff</surname>
              <given-names>R.</given-names>
            </name>
            <name>
              <surname>Bhaduri</surname>
              <given-names>K.</given-names>
            </name>
            <name>
              <surname>Kargupta</surname>
              <given-names>H.</given-names>
            </name>
          </person-group>
          <article-title>Local L2-Thresholding Based Data Mining in Peer-to-Peer Systems</article-title>
          <source>Proceedings of the SIAM International Conference on Data Mining (SDM 06)</source>
          <conf-loc>Bethesda, MD, USA</conf-loc>
          <conf-date>2006</conf-date>
          <fpage>430</fpage>
          <lpage>441</lpage>
        </citation>
      </ref>
      <ref id="B7-algorithms-05-00379">
        <label>7.</label>
        <citation citation-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Wolff</surname>
              <given-names>R.</given-names>
            </name>
            <name>
              <surname>Bhaduri</surname>
              <given-names>K.</given-names>
            </name>
            <name>
              <surname>Kargupta</surname>
              <given-names>H.</given-names>
            </name>
          </person-group>
          <article-title>A generic local algorithm with applications for data mining in large distributed systems</article-title>
          <source>IEEE Trans. Knowl. Data Eng.</source>
          <year>2009</year>
          <volume>21</volume>
          <fpage>465</fpage>
          <lpage>478</lpage>
          <pub-id pub-id-type="doi">10.1109/TKDE.2008.169</pub-id>
        </citation>
      </ref>
      <ref id="B8-algorithms-05-00379">
        <label>8.</label>
        <citation citation-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Sharfman</surname>
              <given-names>I.</given-names>
            </name>
            <name>
              <surname>Schuster</surname>
              <given-names>A.</given-names>
            </name>
            <name>
              <surname>Keren</surname>
              <given-names>D.</given-names>
            </name>
          </person-group>
          <article-title>A geometric approach to monitoring threshold functions over distributed data streams</article-title>
          <source>ACM Trans. Database Syst.</source>
          <year>2007</year>
          <volume>23</volume>
          <fpage>23</fpage>
          <lpage>29</lpage>
        </citation>
      </ref>
      <ref id="B9-algorithms-05-00379">
        <label>9.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Sharfman</surname>
              <given-names>I.</given-names>
            </name>
            <name>
              <surname>Schuster</surname>
              <given-names>A.</given-names>
            </name>
            <name>
              <surname>Keren</surname>
              <given-names>D.</given-names>
            </name>
          </person-group>
          <article-title>A Geometric Approach to Monitoring Threshold Functions over Distributed Data Streams</article-title>
          <source>Ubiquitous Knowledge Discovery</source>
          <person-group person-group-type="editor">
            <name>
              <surname>May</surname>
              <given-names>M.</given-names>
            </name>
            <name>
              <surname>Saitta</surname>
              <given-names>L.</given-names>
            </name>
          </person-group>
          <publisher-name>Springer–Verlag</publisher-name>
          <publisher-loc>New York, NY, USA</publisher-loc>
          <year>2010</year>
          <fpage>163</fpage>
          <lpage>186</lpage>
        </citation>
      </ref>
      <ref id="B10-algorithms-05-00379">
        <label>10.</label>
        <citation citation-type="confproc">
          <person-group person-group-type="author">
            <name>
              <surname>Kogan</surname>
              <given-names>J.</given-names>
            </name>
          </person-group>
          <article-title>Feature Selection over Distributed Data Streams through Convex Optimization</article-title>
          <source>Proceedings of the Twelfth SIAM International Conference on Data Mining (SDM 2012)</source>
          <conf-loc>Anaheim, CA, USA</conf-loc>
          <conf-date>2012</conf-date>
          <fpage>475</fpage>
          <lpage>484</lpage>
        </citation>
      </ref>
      <ref id="B11-algorithms-05-00379">
        <label>11.</label>
        <citation citation-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Keren</surname>
              <given-names>D.</given-names>
            </name>
            <name>
              <surname>Sharfman</surname>
              <given-names>I.</given-names>
            </name>
            <name>
              <surname>Schuster</surname>
              <given-names>A.</given-names>
            </name>
            <name>
              <surname>Livne</surname>
              <given-names>A.</given-names>
            </name>
          </person-group>
          <article-title>Shape sensitive geometric monitoring</article-title>
          <source>IEEE Trans. Knowl. Data Eng.</source>
          <year>2012</year>
          <volume>24</volume>
          <fpage>1520</fpage>
          <lpage>1535</lpage>
          <pub-id pub-id-type="doi">10.1109/TKDE.2011.102</pub-id>
        </citation>
      </ref>
      <ref id="B12-algorithms-05-00379">
        <label>12.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Gray</surname>
              <given-names>R.M.</given-names>
            </name>
          </person-group>
          <source>Entropy and Information Theory</source>
          <publisher-name>Springer–Verlag</publisher-name>
          <publisher-loc>New York, NY, USA</publisher-loc>
          <year>1990</year>
          <fpage>119</fpage>
          <lpage>162</lpage>
        </citation>
      </ref>
      <ref id="B13-algorithms-05-00379">
        <label>13.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Hinrichsen</surname>
              <given-names>D.</given-names>
            </name>
            <name>
              <surname>Pritchard</surname>
              <given-names>A.J.</given-names>
            </name>
          </person-group>
          <article-title>Real and Complex Stability Radii: A Survey</article-title>
          <source>Control of Uncertain Systems</source>
          <person-group person-group-type="editor">
            <name>
              <surname>Hinrichsen</surname>
              <given-names>D.</given-names>
            </name>
            <name>
              <surname>Pritchard</surname>
              <given-names>A.J.</given-names>
            </name>
          </person-group>
          <publisher-name>Birkhäuser</publisher-name>
          <publisher-loc>Boston, MA, USA</publisher-loc>
          <year>1990</year>
          <fpage>119</fpage>
          <lpage>162</lpage>
        </citation>
      </ref>
      <ref id="B14-algorithms-05-00379">
        <label>14.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Rudin</surname>
              <given-names>W.</given-names>
            </name>
          </person-group>
          <source>Principles of Mathematical Analysis</source>
          <publisher-name>McGraw-Hill</publisher-name>
          <publisher-loc>New York, NY, USA</publisher-loc>
          <year>1976</year>
        </citation>
      </ref>
      <ref id="B15-algorithms-05-00379">
        <label>15.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Rockafellar</surname>
              <given-names>R.T.</given-names>
            </name>
          </person-group>
          <source>Convex Analysis</source>
          <publisher-name>Princeton University Press</publisher-name>
          <publisher-loc>Princeton, NJ, USA</publisher-loc>
          <year>1970</year>
        </citation>
      </ref>
      <ref id="B16-algorithms-05-00379">
        <label>16.</label>
        <citation citation-type="web">
          <person-group person-group-type="author">
            <name>
              <surname>Bottou</surname>
              <given-names>L.</given-names>
            </name>
          </person-group>
          <article-title>Home Page</article-title>
          <access-date>(accessed on 14 September 2012)</access-date>
          <comment>Available online: <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="leon.bottou.org/projects/sgd" ext-link-type="uri">leon.bottou.org/projects/sgd</ext-link></comment>
        </citation>
      </ref>
      <ref id="B17-algorithms-05-00379">
        <label>17.</label>
        <citation citation-type="book">
          <person-group person-group-type="author">
            <name>
              <surname>Mirkin</surname>
              <given-names>B.</given-names>
            </name>
          </person-group>
          <source>Clustering for Data Mining: A Data Recovery Approach</source>
          <publisher-name>Chapman &amp; Hall/CRC</publisher-name>
          <publisher-loc>Boca Raton, FL, USA</publisher-loc>
          <year>2005</year>
        </citation>
      </ref>
    </ref-list>
  </back>
</article>
