Using Timeliness in Tracking Infections

We consider real-time timely tracking of infection status (e.g., COVID-19) of individuals in a population. In this work, a health care provider wants to detect both infected people and people who have recovered from the disease as quickly as possible. In order to measure the timeliness of the tracking process, we use the long-term average difference between the actual infection status of the people and their real-time estimate by the health care provider based on the most recent test results. We first find an analytical expression for this average difference for given test rates, infection rates and recovery rates of people. Next, we propose an alternating minimization-based algorithm to find the test rates that minimize the average difference. We observe that if the total test rate is limited, instead of testing all members of the population equally, only a portion of the population may be tested in unequal rates calculated based on their infection and recovery rates. Next, we characterize the average difference when the test measurements are erroneous (i.e., noisy). Further, we consider the case where the infection status of individuals may be dependent, which occurs when an infected person spreads the disease to another person if they are not detected and isolated by the health care provider. In addition, we consider an age of incorrect information-based error metric where the staleness metric increases linearly over time as long as the health care provider does not detect the changes in the infection status of the people. Through extensive numerical results, we observe that increasing the total test rate helps track the infection status better. In addition, an increased population size increases diversity of people with different infection and recovery rates, which may be exploited to spend testing capacity more efficiently, thereby improving the system performance. 
Depending on the health care provider’s preferences, test rate allocation can be adjusted to detect either the infected people or the recovered people more quickly. In order to combat any errors in the test, it may be more advantageous for the health care provider to not test everyone, and instead, apply additional tests to a selected portion of the population. In the case of people with dependent infection status, as we increase the total test rate, the health care provider detects the infected people more quickly, and thus, the average time that a person stays infected decreases. Finally, the error metric needs to be chosen carefully to meet the priorities of the health care provider, as the error metric used greatly influences who will be tested and at what test rate.


Introduction
We consider the problem of timely tracking of an infectious disease, e.g., COVID-19, in a population of n people. In this problem, a health care provider wants to detect infected people as quickly as possible in order to take precautions such as isolating them from the rest of the population. The health care provider also wants to detect people who have recovered from the disease as soon as possible since these people need to return to work which is especially critical in sectors such as education, food retail, public transportation, etc. Ideally, the health care provider should test all people all the time. However, as the total test rate is limited, the question is how frequently the health care provider should apply tests on these people when their infection and recovery rates are known. In a broader sense, this problem is related to timely tracking of multiple processes in a resource-constrained setting where each process takes binary values of 0 and 1 with different change rates.
Recent studies have shown that people who have recovered from infectious diseases such as COVID-19 can be reinfected. Furthermore, the recovery times of individuals may vary significantly. For these reasons, in this problem, the $i$th person becomes infected with rate $\lambda_i$, independently of the others. Similarly, the $i$th person recovers from the disease with rate $\mu_i$. We note that the index $i$ may represent a specific individual or a group of individuals that share common features such as age, gender, and profession. Depending on the demographics, the coefficients $\lambda_i$ and $\mu_i$ may be statistically known by the health care provider. We denote the infection status of the $i$th person as $x_i(t)$ (shown with the black curves on the left in Figure 1), which takes the value 1 when the person is infected and the value 0 when the person is healthy. The health care provider applies tests to people marked as healthy with rate $s_i$ and to people marked as infected with rate $c_i$. Based on the test results, the health care provider forms an estimate for the infection status of the $i$th person, denoted by $\hat{x}_i(t)$ (shown with the blue curves on the right in Figure 1), which takes the value 1 when the most recent test result is positive and the value 0 when it is negative.

Figure 1. System model. There are $n$ people whose infection statuses are given by $x_i(t)$. The health care provider applies tests on these people. Based on the test results, estimates of the infection status, $\hat{x}_i(t)$, are generated. Infected people are shown in red and healthy people are shown in green.

We measure the timeliness of the tracking process by the difference between the actual infection status of people and the real-time estimate of the health care provider, which is based on the most recent test results. The difference can occur in two different cases: (i) when the person is sick ($x_i(t) = 1$) and the health care provider maps this person as healthy ($\hat{x}_i(t) = 0$), and (ii) when the person recovers from the disease ($x_i(t) = 0$) but the health care provider still considers this person as infected ($\hat{x}_i(t) = 1$). The former case represents the error due to late detection of infected people, while the latter case represents the error due to late detection of healed people. Depending on the health care provider's preferences, detecting infected people may be more important than detecting recovered people (controlling infection), or the other way around (returning people to the workforce).

In this paper, we consider the real-time timely tracking of the infection status of $n$ people. We first find an analytical expression for the long-term average difference between the actual infection status of people and the estimate of the health care provider based on test results. Then, we propose an alternating minimization-based algorithm to identify the test rates $s_i$ and $c_i$ for all people. We observe that if the total test rate is limited, we may not apply tests on all people equally. Next, we provide an alternative method to characterize the average difference, by finding the steady state of a Markov chain defined by $(x_i(t), \hat{x}_i(t))$. By using this alternative method, we determine the average estimation error when there are errors in the test measurements expressed by a false positive rate $p$ and a false negative rate $q$. Next, we consider the infection status of two people where an infected person may spread the disease to the other person if the infection has not been detected by the health care provider, and the infected person has therefore not been isolated. Finally, we consider an age of incorrect information-based error metric where the estimation error increases linearly over time when the health care provider has not detected the changes in the infection status of the people.
Through extensive numerical results, we observe that increasing the total test rate helps track the infection status of people better, and increasing the size of the population increases diversity which may be exploited to improve the performance. Depending on the health care provider's priorities, we can allocate additional tests to people marked as healthy to detect the infections faster or to people marked as infected to detect the recoveries more quickly. In order to combat the test errors, the health care provider may prefer to apply tests to only a selected portion of the population with higher test rates. When the infection status of a person depends on that of another person, the average time that a person remains infected can be reduced by increasing the total test rate as it helps to detect the infected people more quickly. Finally, we observe that depending on the error metric used, the test rate distribution among the population differs greatly, and thus, we should choose an error metric that aligns with the priorities of the health care provider.

System Model
We consider a population of $n$ people. We denote the infection status of the $i$th person at time $t$ as $x_i(t)$ (black curve in Figure 2a), which takes binary values 0 or 1 as follows,

$$x_i(t) = \begin{cases} 1, & \text{if the $i$th person is infected at time $t$}, \\ 0, & \text{otherwise}. \end{cases} \tag{1}$$

In this paper, we consider a model where each person can be infected multiple times after recovering from the disease. We denote the time interval that the $i$th person stays healthy for the $j$th time as $W_i(j)$, which is exponentially distributed with rate $\lambda_i$. We denote the recovery time for the $i$th person after being infected with the virus for the $j$th time as $R_i(j)$, which is exponentially distributed with rate $\mu_i$.

A health care provider wants to track the infection status of each person. Based on the test results at times $t_{i,\ell}$, the health care provider generates an estimate for the status of the $i$th person, denoted as $\hat{x}_i(t)$ (blue curve in Figure 2a), by

$$\hat{x}_i(t) = x_i(t_{i,\ell}), \quad t_{i,\ell} \le t < t_{i,\ell+1}. \tag{2}$$

When $\hat{x}_i(t)$ is 1, the health care provider applies the next test to the $i$th person after an exponentially distributed time with rate $c_i$. When $\hat{x}_i(t)$ is 0, the next test is applied to the $i$th person after an exponentially distributed time with rate $s_i$.

Figure 2. (a) A sample evolution of $x_i(t)$ and $\hat{x}_i(t)$. (b) The corresponding $\Delta_i(t)$ in (5). Green areas correspond to the error caused by $\Delta_{i1}(t)$ in (3). Orange areas correspond to the error caused by $\Delta_{i2}(t)$ in (4).

An estimation error happens when the actual infection status of the $i$th person, $x_i(t)$, is different from the estimate of the health care provider, $\hat{x}_i(t)$, at time $t$. This could happen in two ways: when $x_i(t) = 1$ and $\hat{x}_i(t) = 0$, i.e., when the $i$th person is sick but remains undetected by the health care provider, and when $x_i(t) = 0$ and $\hat{x}_i(t) = 1$, i.e., when the $i$th person has recovered but the health care provider is unaware of the recovery.
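We denote the error caused by the former case, i.e., when $x_i(t) = 1$ and $\hat{x}_i(t) = 0$, by $\Delta_{i1}(t)$ (green areas in Figure 2b), and we denote the error caused by the latter case, i.e., when $x_i(t) = 0$ and $\hat{x}_i(t) = 1$, by $\Delta_{i2}(t)$ (orange areas in Figure 2b),

$$\Delta_{i1}(t) = \max\{x_i(t) - \hat{x}_i(t), 0\}, \tag{3}$$
$$\Delta_{i2}(t) = \max\{\hat{x}_i(t) - x_i(t), 0\}. \tag{4}$$

Then, the total estimation error for the $i$th person $\Delta_i(t)$ is

$$\Delta_i(t) = \theta \Delta_{i1}(t) + (1-\theta) \Delta_{i2}(t), \tag{5}$$

where $\theta$ is the importance factor in $[0, 1]$. A large $\theta$ gives more importance to the detection of infected people, and a small $\theta$ gives more importance to the detection of recovered people. We define the long-term weighted average difference between $x_i(t)$ and $\hat{x}_i(t)$ as

$$\Delta_i = \lim_{T \to \infty} \frac{1}{T} \int_0^T \Delta_i(t)\, dt. \tag{6}$$

Then, the overall average difference of all people $\Delta$ is

$$\Delta = \frac{1}{n} \sum_{i=1}^{n} \Delta_i. \tag{7}$$

Our aim is to track the infection status of all people. Due to limited resources, there is a total test rate constraint $\sum_{i=1}^{n} s_i + \sum_{i=1}^{n} c_i \le C$. Thus, our aim is to find the optimal test rates $s_i$ and $c_i$ to minimize $\Delta$ in (7) while satisfying this total test rate constraint. We formulate the following problem,

$$\min_{\{s_i, c_i\}} \ \Delta \quad \text{s.t.} \quad \sum_{i=1}^{n} s_i + \sum_{i=1}^{n} c_i \le C, \quad s_i \ge 0, \ c_i \ge 0, \ i = 1, \dots, n. \tag{8}$$

We provide a summary of the variables used in this work in Table 1. In the next section, we find the total average difference $\Delta$.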
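The model above can be checked empirically with a short event-driven simulation of one person. The following is a minimal sketch, not the authors' code: the function name, the default horizon, and the seed are illustrative assumptions, and the error is taken to be $\theta$ while the person is infected but marked healthy and $1-\theta$ while the person is healthy but marked infected.

```python
import random

def simulate_avg_difference(lam, mu, s, c, theta, horizon=50_000.0, seed=0):
    """Time-average of the weighted estimation error for one person."""
    rng = random.Random(seed)
    x, x_hat = 0, 0                      # true status and provider's estimate
    t, weighted_error = 0.0, 0.0
    while t < horizon:
        change_rate = mu if x == 1 else lam    # recovery or infection
        test_rate = c if x_hat == 1 else s     # test rate depends on the estimate
        total_rate = change_rate + test_rate
        dt = min(rng.expovariate(total_rate), horizon - t)
        if x == 1 and x_hat == 0:              # undetected infection
            weighted_error += theta * dt
        elif x == 0 and x_hat == 1:            # undetected recovery
            weighted_error += (1 - theta) * dt
        t += dt
        # Competing exponentials: pick which event fired.
        if rng.random() < change_rate / total_rate:
            x = 1 - x                          # infection status flips
        else:
            x_hat = x                          # error-free test reveals the status
    return weighted_error / horizon

if __name__ == "__main__":
    print(simulate_avg_difference(lam=1.0, mu=2.0, s=3.0, c=4.0, theta=0.5))
```

Because the holding times are exponential, drawing one sojourn time with the total rate and then choosing the event type in proportion to the individual rates is equivalent to racing the two clocks separately.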

Average Difference Analysis
In this section, we provide a probabilistic analysis to characterize the average difference $\Delta$. In Section 5.1, we give an alternative method to find $\Delta$ by analyzing the steady-state distribution of the Markov chain induced by the states $(x_i(t), \hat{x}_i(t))$. Here, we first find analytical expressions for the long-term averages $\Delta_{i1}$ and $\Delta_{i2}$ of $\Delta_{i1}(t)$ in (3) and $\Delta_{i2}(t)$ in (4) when $s_i > 0$ and $c_i > 0$. We note that $\Delta_{i1}(t)$ can be equal to 1 when $\hat{x}_i(t) = 0$ and is always equal to 0 when $\hat{x}_i(t) = 1$. Assume that at time 0, both $x_i(0)$ and $\hat{x}_i(0)$ are 0. After an exponentially distributed time with rate $\lambda_i$, which is denoted by $W_i$, the $i$th person is infected, and thus $x_i(t)$ becomes 1. At that time, since $\hat{x}_i(t) = 0$, $\Delta_{i1}(t)$ becomes 1. Further, $\Delta_{i1}(t)$ will be equal to 0 again either when the $i$th person recovers from the disease, which happens after $R_i$, exponentially distributed with rate $\mu_i$, or when the health care provider performs a test on the $i$th person after $D_i$, which is exponentially distributed with rate $s_i$. We define $T_m(i)$ as the earliest time at which one of these two cases happens, i.e., $T_m(i) = \min\{R_i, D_i\}$ (which is shown by the green areas in Figure 3a). We note that $T_m(i)$ is also exponentially distributed with rate $\mu_i + s_i$, and we have $P(T_m(i) = R_i) = \frac{\mu_i}{\mu_i + s_i}$ and $P(T_m(i) = D_i) = \frac{s_i}{\mu_i + s_i}$. If the $i$th person recovers from the disease before testing, we return to the initial case where both $x_i(t)$ and $\hat{x}_i(t)$ are equal to 0 again. In this case, the cycle repeats itself, i.e., the $i$th person becomes sick again after $W_i$ and $\Delta_{i1}(t)$ remains 1 until either the person recovers or the health care provider performs a test, which takes another $T_m(i)$ duration. If the health care provider performs a test before the person recovers, then $\hat{x}_i(t)$ becomes 1. We denote the time interval for which $\hat{x}_i(t)$ stays at 0 as $I_{i1}$, which is given by

$$I_{i1} = \sum_{\ell=1}^{K_1} \left( W_i(\ell) + T_m(i, \ell) \right), \tag{9}$$

where $K_1$ is geometric with success probability $P(T_m(i) = D_i) = \frac{s_i}{\mu_i + s_i}$. Due to [103] (Prob. 9.4.1),

$$E[I_{i1}] = E[K_1] \left( E[W_i] + E[T_m(i)] \right) = \frac{s_i + \mu_i}{s_i \lambda_i} + \frac{1}{s_i}. \tag{10}$$

When $\hat{x}_i(t) = 1$, the health care provider marks the $i$th person as infected. The $i$th person recovers from the virus after $R_i$. After the $i$th person recovers, either the health care provider performs a test after $Z_i$, which is exponentially distributed with rate $c_i$, or the $i$th person is reinfected with the virus, which takes $W_i$ time. We define $T_u(i)$ as the earliest time at which one of these two cases happens, i.e., $T_u(i) = \min\{W_i, Z_i\}$ (which is shown by the orange areas in Figure 3b). Similarly, we note that $T_u(i)$ is exponentially distributed with rate $\lambda_i + c_i$, and we have $P(T_u(i) = W_i) = \frac{\lambda_i}{\lambda_i + c_i}$ and $P(T_u(i) = Z_i) = \frac{c_i}{\lambda_i + c_i}$. If the person is reinfected with the virus before a test is applied, this cycle repeats itself, i.e., the $i$th person recovers after another $R_i$, and then either a test is applied to the $i$th person, or the person is infected again, which takes another $T_u(i)$. If the health care provider performs a test on the $i$th person before the person is reinfected, the health care provider marks the $i$th person as healthy again, i.e., $\hat{x}_i(t)$ becomes 0. We denote the time interval that $\hat{x}_i(t)$ is equal to 1 as $I_{i2}$, which is given by

$$I_{i2} = \sum_{\ell=1}^{K_2} \left( R_i(\ell) + T_u(i, \ell) \right), \tag{11}$$

where $K_2$ is geometric with success probability $P(T_u(i) = Z_i) = \frac{c_i}{\lambda_i + c_i}$. Similarly,

$$E[I_{i2}] = E[K_2] \left( E[R_i] + E[T_u(i)] \right) = \frac{c_i + \lambda_i}{c_i \mu_i} + \frac{1}{c_i}. \tag{12}$$

We denote the time interval between the $j$th and $(j+1)$th times that $\hat{x}_i(t)$ changes from 1 to 0 as the $j$th cycle $I_i(j)$, where $I_i(j) = I_{i1}(j) + I_{i2}(j)$. We note that $\Delta_{i1}(t)$ is always equal to 0 during $I_{i2}(j)$, i.e., when $\hat{x}_i(t) = 1$, and $\Delta_{i1}(t)$ is equal to 1 when $x_i(t) = 1$ in $I_{i1}(j)$. We denote the total time duration when $\Delta_{i1}(t)$ is equal to 1 during the $j$th cycle as $T_{e,1}(i, j)$, where $T_{e,1}(i, j) = \sum_{\ell=1}^{K_1} T_m(i, \ell)$, so that $E[T_{e,1}(i)] = \frac{E[K_1]}{\mu_i + s_i} = \frac{1}{s_i}$. Then, using ergodicity, similar to [80], $\Delta_{i1}$ is equal to

$$\Delta_{i1} = \frac{E[T_{e,1}(i)]}{E[I_i]}. \tag{13}$$

Thus, we have

$$\Delta_{i1} = \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{c_i}{s_i \lambda_i + c_i \mu_i + s_i c_i}. \tag{14}$$

Next, we find $\Delta_{i2}$. We note that $\Delta_{i2}(t)$ is equal to 1 when $x_i(t) = 0$ in $I_{i2}(j)$ and is always equal to 0 during $I_{i1}(j)$. Similarly, we denote the total time duration where $\Delta_{i2}(t)$ is equal to 1 in the $j$th cycle as $T_{e,2}(i, j) = \sum_{\ell=1}^{K_2} T_u(i, \ell)$, with $E[T_{e,2}(i)] = \frac{1}{c_i}$, which gives

$$\Delta_{i2} = \frac{E[T_{e,2}(i)]}{E[I_i]} = \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{s_i}{s_i \lambda_i + c_i \mu_i + s_i c_i}. \tag{15}$$

By using (5), (14), and (15), we obtain $\Delta_i$ as

$$\Delta_i = \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{\theta c_i + (1-\theta) s_i}{s_i \lambda_i + c_i \mu_i + s_i c_i}. \tag{16}$$

Then, by inserting (16) in (7), we obtain $\Delta$.
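The cycle argument above can be sanity-checked numerically. The sketch below is an illustration under the stated model (exponential times, $s_i > 0$, $c_i > 0$); the function names are hypothetical. It computes the renewal-reward ratio from the expected cycle quantities and compares it with the simplified form $\frac{\mu_i\lambda_i}{\mu_i+\lambda_i}\cdot\frac{\theta c_i + (1-\theta)s_i}{s_i\lambda_i + c_i\mu_i + s_i c_i}$ that the ratio reduces to.

```python
def cycle_means(lam, mu, s, c):
    """Expected cycle quantities for one person, assuming s > 0 and c > 0."""
    e_k1 = (mu + s) / s                      # mean number of infections until one is detected
    e_k2 = (lam + c) / c                     # mean number of recoveries until one is detected
    e_i1 = e_k1 * (1 / lam + 1 / (mu + s))   # mean time the estimate stays at 0
    e_i2 = e_k2 * (1 / mu + 1 / (lam + c))   # mean time the estimate stays at 1
    e_te1 = e_k1 / (mu + s)                  # mean time infected but marked healthy
    e_te2 = e_k2 / (lam + c)                 # mean time healthy but marked infected
    return e_i1, e_i2, e_te1, e_te2

def avg_difference(lam, mu, s, c, theta):
    """Weighted long-term average difference as a renewal-reward ratio."""
    e_i1, e_i2, e_te1, e_te2 = cycle_means(lam, mu, s, c)
    return (theta * e_te1 + (1 - theta) * e_te2) / (e_i1 + e_i2)

def avg_difference_closed_form(lam, mu, s, c, theta):
    """Simplified form of the same ratio."""
    weight = theta * c + (1 - theta) * s
    return (lam * mu / (lam + mu)) * weight / (s * lam + c * mu + s * c)

if __name__ == "__main__":
    args = (1.0, 2.0, 3.0, 4.0, 0.5)
    print(avg_difference(*args), avg_difference_closed_form(*args))
```

Keeping the cycle means separate from the simplified expression makes it easy to see which part of the error comes from late detection of infections and which from late detection of recoveries.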
In the next section, we solve the optimization problem in (8).

Optimization of Average Difference
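In this section, we solve the optimization problem in (8). Using $\Delta_i$ in (16) in (7), we rewrite (8) as

$$\min_{\{s_i, c_i\}} \ \frac{1}{n} \sum_{i=1}^{n} \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{\theta c_i + (1-\theta) s_i}{s_i \lambda_i + c_i \mu_i + s_i c_i} \quad \text{s.t.} \quad \sum_{i=1}^{n} (s_i + c_i) \le C, \quad s_i \ge 0, \ c_i \ge 0. \tag{17}$$

We define the Lagrangian function [104] for (17) as

$$\mathcal{L} = \frac{1}{n} \sum_{i=1}^{n} \Delta_i + \beta \left( \sum_{i=1}^{n} (s_i + c_i) - C \right) - \sum_{i=1}^{n} \nu_i s_i - \sum_{i=1}^{n} \eta_i c_i, \tag{18}$$

where $\beta \ge 0$, $\nu_i \ge 0$, and $\eta_i \ge 0$. The KKT conditions are

$$\frac{\partial \mathcal{L}}{\partial s_i} = \frac{1}{n} \frac{\partial \Delta_i}{\partial s_i} + \beta - \nu_i = 0, \tag{19}$$
$$\frac{\partial \mathcal{L}}{\partial c_i} = \frac{1}{n} \frac{\partial \Delta_i}{\partial c_i} + \beta - \eta_i = 0, \tag{20}$$

for all $i$. The complementary slackness conditions are

$$\beta \left( \sum_{i=1}^{n} (s_i + c_i) - C \right) = 0, \quad \nu_i s_i = 0, \quad \eta_i c_i = 0. \tag{21}$$

First, we find $s_i$. From (19), we have

$$(s_i \lambda_i + c_i \mu_i + s_i c_i)^2 = \frac{\mu_i \lambda_i c_i}{n (\mu_i + \lambda_i)} \cdot \frac{\theta (c_i + \lambda_i) - (1-\theta) \mu_i}{\beta - \nu_i}. \tag{22}$$

When $\theta (c_i + \lambda_i) \ge (1-\theta) \mu_i$, we solve (22) for $s_i$ as

$$s_i = \frac{1}{c_i + \lambda_i} \left( \sqrt{\frac{\mu_i \lambda_i c_i \left( \theta (c_i + \lambda_i) - (1-\theta) \mu_i \right)}{n \beta (\mu_i + \lambda_i)}} - c_i \mu_i \right)^+, \tag{23}$$

where we used the fact that we either have $s_i > 0$ and $\nu_i = 0$, or $s_i = 0$ and $\nu_i \ge 0$, due to (21). Here, $(\cdot)^+ = \max(\cdot, 0)$. On the other hand, when $\theta (c_i + \lambda_i) < (1-\theta) \mu_i$, we have $\frac{\partial \Delta_i}{\partial s_i} > 0$, and thus it is optimal to choose $s_i = 0$ as our aim is to minimize $\Delta$ in (7). In this case, when $s_i = 0$, we have $\Delta_i = \frac{\theta \lambda_i}{\mu_i + \lambda_i}$, which is independent of the value of $c_i$. As we obtain the same $\Delta_i$ for all values of $c_i$, and the total test rate is limited, i.e., $\sum_{i=1}^{n} (s_i + c_i) \le C$, in this case, it is optimal to choose $c_i = 0$ as well (i.e., when $s_i = 0$). Thus, for a given $c_i$, the optimal test rate allocation policy for $s_i$ is a threshold policy where $s_i$'s with small $\frac{1}{c_i \mu_i} \cdot \frac{\lambda_i}{\mu_i + \lambda_i} \left( \theta (c_i + \lambda_i) - (1-\theta) \mu_i \right)$ are equal to zero.

Next, we find $c_i$. From (20), we have

$$(s_i \lambda_i + c_i \mu_i + s_i c_i)^2 = \frac{\mu_i \lambda_i s_i}{n (\mu_i + \lambda_i)} \cdot \frac{(1-\theta)(s_i + \mu_i) - \theta \lambda_i}{\beta - \eta_i}. \tag{24}$$

When $(1-\theta)(s_i + \mu_i) \ge \theta \lambda_i$, we solve (24) for $c_i$ as

$$c_i = \frac{1}{s_i + \mu_i} \left( \sqrt{\frac{\mu_i \lambda_i s_i \left( (1-\theta)(s_i + \mu_i) - \theta \lambda_i \right)}{n \beta (\mu_i + \lambda_i)}} - s_i \lambda_i \right)^+, \tag{25}$$

where we used the fact that we either have $c_i > 0$ and $\eta_i = 0$, or $c_i = 0$ and $\eta_i \ge 0$, due to (21). Similarly, when $(1-\theta)(s_i + \mu_i) < \theta \lambda_i$, we have $\frac{\partial \Delta_i}{\partial c_i} > 0$, and it is optimal to choose $c_i = 0$. Thus, for a given $s_i$, the optimal policy to determine $c_i$ is a threshold policy where $c_i$'s with small $\frac{1}{s_i \lambda_i} \cdot \frac{\mu_i}{\mu_i + \lambda_i} \left( (1-\theta)(s_i + \mu_i) - \theta \lambda_i \right)$ are equal to zero. Next, we show that in the optimal policy, if $s_i > 0$ and $c_i > 0$ for some $i$, then the total test rate constraint must be satisfied with equality, i.e., $\sum_{i=1}^{n} (s_i + c_i) = C$.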

Lemma 1.
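In the optimal policy, if $s_i > 0$ and $c_i > 0$ for some $i$, then we have $\sum_{i=1}^{n} (s_i + c_i) = C$.
Proof of Lemma 1. The derivatives of $\Delta_i$ with respect to $s_i$ and $c_i$ are

$$\frac{\partial \Delta_i}{\partial s_i} = -\frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{c_i \left( \theta (c_i + \lambda_i) - (1-\theta) \mu_i \right)}{(s_i \lambda_i + c_i \mu_i + s_i c_i)^2}, \tag{26}$$
$$\frac{\partial \Delta_i}{\partial c_i} = -\frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{s_i \left( (1-\theta)(s_i + \mu_i) - \theta \lambda_i \right)}{(s_i \lambda_i + c_i \mu_i + s_i c_i)^2}. \tag{27}$$

We note that $s_i > 0$ in (23) implies $\theta (c_i + \lambda_i) > (1-\theta) \mu_i$, and thus $\frac{\partial \Delta_i}{\partial s_i} < 0$. Similarly, $c_i > 0$ in (25) implies $(1-\theta)(s_i + \mu_i) > \theta \lambda_i$, and thus $\frac{\partial \Delta_i}{\partial c_i} < 0$. Therefore, in the optimal policy, if we have $s_i > 0$ and $c_i > 0$ for some $i$, then we must have $\sum_{i=1}^{n} (s_i + c_i) = C$. Otherwise, we can further decrease $\Delta$ in (7) by increasing $c_i$ or $s_i$.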
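Next, we propose an alternating minimization-based algorithm for finding $s_i$ and $c_i$. For this purpose, for given initial $(s_i, c_i)$ pairs, we define $\phi_i$ as

$$\phi_i = \begin{cases} \dfrac{\lambda_i \left( \theta (c_i + \lambda_i) - (1-\theta) \mu_i \right)}{n (\mu_i + \lambda_i) c_i \mu_i}, & i = 1, \dots, n, \\[2ex] \dfrac{\mu_{i-n} \left( (1-\theta)(s_{i-n} + \mu_{i-n}) - \theta \lambda_{i-n} \right)}{n (\mu_{i-n} + \lambda_{i-n}) s_{i-n} \lambda_{i-n}}, & i = n+1, \dots, 2n. \end{cases} \tag{28}$$

Then, we define $u_i$ as

$$u_i = \begin{cases} \dfrac{c_i \mu_i}{c_i + \lambda_i} \left( \sqrt{\dfrac{\phi_i}{\beta}} - 1 \right)^+, & i = 1, \dots, n, \\[2ex] \dfrac{s_{i-n} \lambda_{i-n}}{s_{i-n} + \mu_{i-n}} \left( \sqrt{\dfrac{\phi_i}{\beta}} - 1 \right)^+, & i = n+1, \dots, 2n. \end{cases} \tag{29}$$

From (23) and (25), $s_i = u_i$ and $c_i = u_{n+i}$, for $i = 1, \dots, n$.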
Next, we find $s_i$ and $c_i$ by determining $\beta$ in (29). First, assume that, in the optimal policy, there is an $i$ such that $s_i > 0$ and $c_i > 0$. Thus, by Lemma 1, we must have $\sum_{i=1}^{n} (s_i + c_i) = C$. Then, given the initial $(s_i, c_i)$ pairs, we immediately choose $u_i = 0$ for $\phi_i < 0$. For the remaining $u_i$ with $\phi_i \ge 0$, we apply a solution method similar to that in [80]. By assuming $\phi_i \ge \beta$, i.e., by disregarding $(\cdot)^+$ in (29), we solve $\sum_{i=1}^{2n} u_i = C$ for $\beta$. Then, we compare the smallest $\phi_i$ which is larger than zero in (28) with $\beta$. If this $\phi_i$ satisfies $\phi_i \ge \beta$, then $u_i \ge 0$ for all remaining $i$, and we have obtained the $u_i$ values for the given initial $(s_i, c_i)$ pairs. If the smallest $\phi_i$ which is larger than zero is smaller than $\beta$, then the corresponding $u_i$ is negative and we should choose $u_i = 0$ for the smallest non-negative $\phi_i$. Then, we repeat this procedure until the smallest non-negative $\phi_i$ is larger than $\beta$. After determining all $u_i$, we obtain $s_i = u_i$ and $c_i = u_{n+i}$ for $i = 1, \dots, n$. Then, with the updated values of the $(s_i, c_i)$ pairs, we keep finding $u_i$'s until the KKT conditions in (19) and (20) are satisfied.
We note that for indices (persons) $i$ for which $(s_i, c_i)$ are zero, the health care provider does not perform any tests, and maps these people as either always infected, i.e., $\hat{x}_i(t) = 1$ for all $t$, or always healthy, i.e., $\hat{x}_i(t) = 0$ for all $t$. Without performing any tests, the health care provider should choose $\hat{x}_i(t) = 0$ for all $t$ if $\frac{\theta \lambda_i}{\mu_i + \lambda_i} \le \frac{(1-\theta) \mu_i}{\mu_i + \lambda_i}$, and should choose $\hat{x}_i(t) = 1$ for all $t$, otherwise.
Finally, we note that the problem in (17) is not a convex optimization problem as the objective function is not jointly convex in $s_i$ and $c_i$. Therefore, the solutions obtained via the proposed method may not be globally optimal. For this reason, we select different initial starting points, apply the proposed alternating minimization-based algorithm, and choose the solution that achieves the smallest $\Delta$ in (7).
In the next section, we first provide an alternative method to find the average difference $\Delta$ in (6) and then characterize the average difference for the erroneous test measurements.
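One update of the procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the rate formulas are the stationarity conditions of the error $\frac{\mu_i\lambda_i}{\mu_i+\lambda_i}\cdot\frac{\theta c_i+(1-\theta)s_i}{s_i\lambda_i+c_i\mu_i+s_ic_i}$, each $s_i$ is updated from the current $c_i$ and each $c_i$ from the current $s_i$, and the multiplier $\beta$ is found by bisection instead of the sort-and-discard search described in the text. Any constant scaling of the objective (such as a $1/n$ factor) only rescales $\beta$ and is therefore omitted. The name `allocation_step` is hypothetical.

```python
import math

def allocation_step(lam, mu, s, c, theta, C, iters=200):
    """One update of all (s_i, c_i) that spends the total test rate C."""
    n = len(lam)
    a, g = [], []   # each new rate has the form (a / sqrt(beta) - g)^+
    for i in range(n):  # coefficients for s_i, computed from the current c_i
        slope = theta * (c[i] + lam[i]) - (1 - theta) * mu[i]
        gain = lam[i] * mu[i] * c[i] * slope / (lam[i] + mu[i])
        a.append(math.sqrt(gain) / (c[i] + lam[i]) if gain > 0 else 0.0)
        g.append(c[i] * mu[i] / (c[i] + lam[i]))
    for i in range(n):  # coefficients for c_i, computed from the current s_i
        slope = (1 - theta) * (s[i] + mu[i]) - theta * lam[i]
        gain = lam[i] * mu[i] * s[i] * slope / (lam[i] + mu[i])
        a.append(math.sqrt(gain) / (s[i] + mu[i]) if gain > 0 else 0.0)
        g.append(s[i] * lam[i] / (s[i] + mu[i]))

    def total(beta):
        return sum(max(ai / math.sqrt(beta) - gi, 0.0) for ai, gi in zip(a, g))

    lo, hi = 1e-12, 1.0
    while total(hi) > C:        # total(beta) is non-increasing in beta
        hi *= 2.0
    for _ in range(iters):      # bisection keeps total(hi) <= C
        mid = 0.5 * (lo + hi)
        if total(mid) > C:
            lo = mid
        else:
            hi = mid
    u = [max(ai / math.sqrt(hi) - gi, 0.0) for ai, gi in zip(a, g)]
    return u[:n], u[n:]

if __name__ == "__main__":
    s_new, c_new = allocation_step([1.0, 0.5], [0.5, 1.0], [2.0, 2.0], [2.0, 2.0], 0.5, 8.0)
    print(s_new, c_new)
```

In a full run, this step would be repeated from several random starting points until the rates stop changing, and the best of the resulting solutions would be kept, since the problem is not jointly convex. Convergence of the repeated step is not guaranteed by this sketch.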

Average Difference for the Case with Erroneous Test Measurements
We note that the infection status of the $i$th person and its estimate at the health care provider form a continuous time Markov chain (Section 7.5 of [105]) with the states $(x_i(t), \hat{x}_i(t)) \in \{(0, 0), (0, 1), (1, 0), (1, 1)\}$. In this section, by finding the steady-state distribution for $(x_i(t), \hat{x}_i(t))$, we provide an alternative method to find $\Delta$ in (6). Then, we consider the case with erroneous test measurements. For this case, we characterize the long-term average difference for the $i$th person, denoted by $\Delta^e_i$.

An Alternative Method to Characterize Average Difference
When there is no error in the tests, the state transition graph is shown in Figure 4a. Assuming that $s_i > 0$ and $c_i > 0$, every state is accessible from any other state, and thus, the Markov chain induced by the system is irreducible. Note that in Section 4, we see that the testing rates for some people can be equal to 0, i.e., $s_i = 0$ and $c_i = 0$. For these people, we choose $\hat{x}_i(t)$ to be either always 0 or always 1, i.e., we consider them as healthy or sick all the time. Depending on the choice of $\hat{x}_i(t)$, when $s_i = 0$ and $c_i = 0$, either the states $(0, 1)$ and $(1, 1)$, or the states $(0, 0)$ and $(1, 0)$ are never visited, and thus, have 0 probability in the steady state. By using a small time-step approximation to a discrete time Markov chain, one can show that the self transition probabilities are non-zero, and thus, the Markov chain induced by the system is also aperiodic (Section 7.5 of [105]). Therefore, the Markov chain shown in Figure 4a admits a unique stationary distribution given by $\pi = \{\pi_{00}, \pi_{01}, \pi_{10}, \pi_{11}\}$. We find the stationary distribution by writing the balance equations, which are given as

$$\pi_{00} \lambda_i = \pi_{10} \mu_i + \pi_{01} c_i, \tag{30}$$
$$\pi_{01} (\lambda_i + c_i) = \pi_{11} \mu_i, \tag{31}$$
$$\pi_{10} (\mu_i + s_i) = \pi_{00} \lambda_i, \tag{32}$$
$$\pi_{11} \mu_i = \pi_{10} s_i + \pi_{01} \lambda_i. \tag{33}$$

By using (30)-(33) and $\sum_{k=0}^{1} \sum_{\ell=0}^{1} \pi_{k\ell} = 1$, we find the steady-state distribution $\pi$ as

$$\pi_{01} = \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{s_i}{s_i \lambda_i + c_i \mu_i + s_i c_i}, \tag{34}$$
$$\pi_{10} = \frac{\mu_i \lambda_i}{\mu_i + \lambda_i} \cdot \frac{c_i}{s_i \lambda_i + c_i \mu_i + s_i c_i}, \tag{35}$$

and $\pi_{00} = \frac{\mu_i + s_i}{\lambda_i} \pi_{10}$, and $\pi_{11} = \frac{c_i + \lambda_i}{\mu_i} \pi_{01}$. We note that $\Delta_{i1}$ in (14) is also equal to $\pi_{10}$ in (35), i.e., we have $\Delta_{i1} = \pi_{10}$. Similarly, $\Delta_{i2}$ in (15) is equal to $\pi_{01}$ in (34). Thus, by observing that the states $(x_i(t), \hat{x}_i(t))$ form a continuous time Markov chain, we can find the average difference $\Delta$ in (6) by finding the steady-state distribution $\pi$. This method will be particularly useful in the following section where we consider the case with erroneous test measurements.
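The steady-state probabilities can also be obtained numerically, which is a convenient cross-check on the algebra. The sketch below is illustrative only: it assumes error-free tests, orders the states as (0,0), (0,1), (1,0), (1,1), and uses uniformization followed by repeated multiplication instead of solving the balance equations symbolically. The function name and iteration count are hypothetical choices.

```python
def stationary_distribution(lam, mu, s, c, iterations=5000):
    """Stationary probabilities of (x, x_hat) in the order 00, 01, 10, 11."""
    # Transition rates of the error-free chain, keyed by (source, destination).
    rates = {
        (0, 2): lam,  # (0,0) -> (1,0): infection, not yet detected
        (1, 3): lam,  # (0,1) -> (1,1): reinfection while marked infected
        (1, 0): c,    # (0,1) -> (0,0): test reveals the recovery
        (2, 0): mu,   # (1,0) -> (0,0): recovery before detection
        (2, 3): s,    # (1,0) -> (1,1): test reveals the infection
        (3, 1): mu,   # (1,1) -> (0,1): recovery, not yet detected
    }
    exit_rate = [0.0] * 4
    for (src, _), r in rates.items():
        exit_rate[src] += r
    big = max(exit_rate) + 1.0   # uniformization constant; keeps self-loops positive
    pi = [0.25] * 4
    for _ in range(iterations):
        nxt = [pi[k] * (1.0 - exit_rate[k] / big) for k in range(4)]
        for (src, dst), r in rates.items():
            nxt[dst] += pi[src] * r / big
        pi = nxt
    return pi

if __name__ == "__main__":
    print(stationary_distribution(1.0, 2.0, 3.0, 4.0))
```

The entries at positions 2 and 1 of the returned list play the roles of $\pi_{10}$ and $\pi_{01}$, i.e., the two error components.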

Average Difference with Erroneous Test Measurements
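In this section, we consider the case where the test measurements can be erroneous. When a test is applied to an infected person, i.e., when $x_i(t) = 1$, the test result will be 0 with probability $q$ and 1 with probability $1 - q$, where $0 \le q < \frac{1}{2}$. In other words, the false-negative probability is equal to $q$. Similarly, when a test is applied to a healthy person, i.e., when $x_i(t) = 0$, the test result will be 1 with probability $p$ and 0 with probability $1 - p$, where $0 \le p < \frac{1}{2}$. Thus, the false-positive probability is equal to $p$. The probability distribution for the test measurements is provided in Table 2.
In this section, we consider the case where the health care provider applies only one test rate $v_i$ to the $i$th person, whether the person is currently marked as healthy or infected. That is, we do not consider separate testing rates $s_i$ and $c_i$ for healthy and infected people as we did before; instead, here both $s_i$ and $c_i$ are equal to $v_i$. Since the health care provider applies the same test rate for the $i$th person, here we do not consider the importance factor $\theta$ either. Then, we define the long-term average difference for the $i$th person with the error on the test measurements as follows,

$$\Delta^e_i = \Delta^e_{i1} + \Delta^e_{i2}, \tag{36}$$

where the superscript $e$ stands for "erroneous", and the definitions of $\Delta^e_{i1}$ and $\Delta^e_{i2}$ follow similarly from (13). We note that with the test rates $v_i$ and errors on the test measurements, the states $(x_i(t), \hat{x}_i(t))$ form a continuous time Markov chain, and the corresponding state transition graph is shown in Figure 4b. Assuming that $v_i > 0$, one can show that there is a unique steady-state distribution $\pi^e = \{\pi^e_{00}, \pi^e_{01}, \pi^e_{10}, \pi^e_{11}\}$, which can be found by solving the balance equations, which are given as follows

$$\pi^e_{00} (\lambda_i + v_i p) = \pi^e_{10} \mu_i + \pi^e_{01} v_i (1-p), \tag{37}$$
$$\pi^e_{01} (\lambda_i + v_i (1-p)) = \pi^e_{00} v_i p + \pi^e_{11} \mu_i, \tag{38}$$
$$\pi^e_{10} (\mu_i + v_i (1-q)) = \pi^e_{00} \lambda_i + \pi^e_{11} v_i q, \tag{39}$$
$$\pi^e_{11} (\mu_i + v_i q) = \pi^e_{10} v_i (1-q) + \pi^e_{01} \lambda_i. \tag{40}$$

Then, by using (37)-(40) and $\sum_{k=0}^{1} \sum_{\ell=0}^{1} \pi^e_{k\ell} = 1$, we find the steady-state distribution $\pi^e$ as

$$\pi^e_{00} = \frac{\mu_i \left( (\mu_i + v_i)(1-p) + \lambda_i q \right)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}, \tag{41}$$
$$\pi^e_{01} = \frac{\mu_i \left( (\mu_i + v_i) p + \lambda_i (1-q) \right)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}, \tag{42}$$
$$\pi^e_{10} = \frac{\lambda_i \left( (\lambda_i + v_i) q + \mu_i (1-p) \right)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}, \tag{43}$$
$$\pi^e_{11} = \frac{\lambda_i \left( (\lambda_i + v_i)(1-q) + \mu_i p \right)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}. \tag{44}$$

We note that $\Delta^e_{i1}$ and $\Delta^e_{i2}$ are equal to $\pi^e_{10}$ in (43) and $\pi^e_{01}$ in (42), respectively. Thus, if $v_i > 0$, then $\Delta^e_i$ in (36) becomes

$$\Delta^e_i = \frac{\lambda_i q (\lambda_i + v_i) + \mu_i p (\mu_i + v_i) + \mu_i \lambda_i (2 - p - q)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}. \tag{45}$$

We immediately note that if the false-positive test probability $p$ and the false-negative test probability $q$ are equal to 0, $\Delta^e_i$ becomes $\Delta_{i1} + \Delta_{i2}$, with $\Delta_{i1}$ and $\Delta_{i2}$ given in (14) and (15), respectively, when $v_i = s_i = c_i$. We have $\frac{\partial \Delta^e_i}{\partial p} = \frac{\mu_i (v_i + \mu_i - \lambda_i)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}$ and $\frac{\partial \Delta^e_i}{\partial q} = \frac{\lambda_i (v_i + \lambda_i - \mu_i)}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)}$. Then, $\frac{\partial \Delta^e_i}{\partial p} \ge 0$ is equivalent to $v_i + \mu_i - \lambda_i \ge 0$, and $\frac{\partial \Delta^e_i}{\partial q} \ge 0$ is equivalent to $v_i + \lambda_i - \mu_i \ge 0$, which means that depending on the values of $v_i$, $\mu_i$, and $\lambda_i$, the long-term average difference $\Delta^e_i$ can be an increasing function of only $p$ or only $q$, or both $p$ and $q$, but $\Delta^e_i$ cannot be a decreasing function of both $p$ and $q$, since the two conditions cannot fail simultaneously when $v_i > 0$. This is expected as false-negative and false-positive tests negatively affect the estimation process.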
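The effect of test errors on the average difference can be explored with a few lines of code. The sketch below assumes the four-state chain described above with a single test rate $v_i > 0$, false-positive probability $p$, and false-negative probability $q$, and evaluates the sum of the two error-state probabilities obtained by solving its balance equations. The function names are hypothetical.

```python
def noisy_avg_difference(lam, mu, v, p, q):
    """Sum of the two error-state probabilities for test rate v > 0."""
    denom = (lam + mu) * (lam + mu + v)
    pi_10 = lam * (q * (lam + v) + mu * (1 - p)) / denom   # infected, marked healthy
    pi_01 = mu * (p * (mu + v) + lam * (1 - q)) / denom    # healthy, marked infected
    return pi_10 + pi_01

def no_test_difference(lam, mu):
    """Error when v = 0 and the estimate is fixed to the more likely status."""
    return min(lam, mu) / (lam + mu)

if __name__ == "__main__":
    # Error-free tests versus 10% false positives and false negatives.
    print(noisy_avg_difference(1.0, 2.0, 3.0, 0.0, 0.0))
    print(noisy_avg_difference(1.0, 2.0, 3.0, 0.1, 0.1))
    print(no_test_difference(1.0, 2.0))
```

Comparing the tested and untested values for a given person shows directly when noisy testing at a low rate is worse than not testing at all, which is the motivation for the greedy selection discussed next in the text.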
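One can also show that $\frac{\partial \Delta^e_i}{\partial v_i} = -\frac{2 (1 - p - q) \mu_i \lambda_i}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)^2} \le 0$ and $\frac{\partial^2 \Delta^e_i}{\partial v_i^2} \ge 0$, i.e., $\Delta^e_i$ in (45) is a decreasing and convex function of $v_i$. Next, we consider the case when $v_i = 0$. Note that when $v_i = 0$, the health care provider either maps these people as always sick or always healthy depending on their infection and recovery rates. Thus, when $v_i = 0$ and depending on the estimate $\hat{x}_i(t)$, two of the states in Figure 4b will never be visited and thus, these states will have 0 steady-state probabilities. For this case, the steady states are given by $\tilde{\pi}^e$: if $\lambda_i \ge \mu_i$, i.e., if people are infected more frequently, then the health care provider chooses its estimate as $\hat{x}_i(t) = 1$ and we have $\tilde{\pi}^e_{01} = \frac{\mu_i}{\mu_i + \lambda_i}$, $\tilde{\pi}^e_{11} = \frac{\lambda_i}{\mu_i + \lambda_i}$; if $\lambda_i < \mu_i$, i.e., if people stay healthy more often, then we have $\hat{x}_i(t) = 0$ and $\tilde{\pi}^e_{00} = \frac{\mu_i}{\mu_i + \lambda_i}$, $\tilde{\pi}^e_{10} = \frac{\lambda_i}{\mu_i + \lambda_i}$. Thus, when $v_i = 0$, we have

$$\Delta^e_i = \frac{\min\{\lambda_i, \mu_i\}}{\mu_i + \lambda_i}. \tag{46}$$

In order to find the optimal test rates $v_i$ in the case of errors on the test measurements, we formulate the following optimization problem

$$\min_{\{v_i\}} \ \sum_{i=1}^{n} \left( \Delta^e_i \big|_{(45)} \, 1\{v_i > 0\} + \Delta^e_i \big|_{(46)} \, 1\{v_i = 0\} \right) \quad \text{s.t.} \quad \sum_{i=1}^{n} v_i \le C, \quad v_i \ge 0, \tag{47}$$

where the objective function is given by the summation of $\Delta^e_i$ in (45) when $v_i > 0$ and $\Delta^e_i$ in (46) when $v_i = 0$ over all people, and $1\{\cdot\}$ is the indicator function taking value 1 when $\{\cdot\}$ is true and 0, otherwise. In (47), we have a constraint on the total test rate, i.e., $\sum_{i=1}^{n} v_i \le C$. We note that the optimization problem in (47) is in general not convex due to the indicator function in the objective function. However, for a given set of $1\{v_i = 0\}$, the optimization problem in (47) is convex and can be solved optimally. Thus, by solving the problem in (47) for all possible sets of $1\{v_i = 0\}$, we can determine the globally optimal solution, which requires solving $2^n$ different optimization problems and can be impractical for large $n$. For this reason, next, we provide a greedy algorithm to solve the optimization problem in (47).
In the greedy solution, initially, assuming that $1\{v_i > 0\} = 1$ for all $i$, we consider the following optimization problem

$$\min_{\{v_i\}} \ \sum_{i=1}^{n} \Delta^e_i \quad \text{s.t.} \quad \sum_{i=1}^{n} v_i \le C, \quad v_i \ge 0, \tag{48}$$

where each term of the objective function in (48) is equal to $\Delta^e_i$ in (45). For this optimization problem, we define the Lagrangian function for (48) as

$$\mathcal{L} = \sum_{i=1}^{n} \Delta^e_i + \bar{\beta} \left( \sum_{i=1}^{n} v_i - C \right) - \sum_{i=1}^{n} \bar{\nu}_i v_i, \tag{49}$$

where $\bar{\beta} \ge 0$, $\bar{\nu}_i \ge 0$. We note that the problem defined in (48) is a convex optimization problem, and thus we can find the optimal test rates $v_i$ by analyzing the KKT and the complementary slackness conditions. The KKT conditions are given by

$$\frac{\partial \mathcal{L}}{\partial v_i} = -\frac{2 (1 - p - q) \mu_i \lambda_i}{(\mu_i + \lambda_i)(\mu_i + \lambda_i + v_i)^2} + \bar{\beta} - \bar{\nu}_i = 0, \tag{50}$$

for all $i$. The complementary slackness conditions are

$$\bar{\beta} \left( \sum_{i=1}^{n} v_i - C \right) = 0, \quad \bar{\nu}_i v_i = 0. \tag{51}$$

By using (50) and (51), we find the optimal $v_i$ values for the problem in (48) as

$$v_i = \left( \sqrt{\frac{2 (1 - p - q)}{\bar{\beta}} \cdot \frac{\mu_i \lambda_i}{\mu_i + \lambda_i}} - (\mu_i + \lambda_i) \right)^+. \tag{52}$$

With the test rates $v_i$ in (52), we find the average differences $\Delta^e_i$ in (45) and then compare them with $\Delta^e_i$ in (46) when $v_i = 0$. Due to the errors in the tests, $\Delta^e_i$ in (46) with $v_i = 0$ can be smaller than $\Delta^e_i$ in (45) with the test rates $v_i$ found in (52). Among these people, we choose the index $i$ where the difference between $\Delta^e_i$ in (45) with the $v_i$ in (52) and $\Delta^e_i$ in (46) is the highest. Then, we take $v_i = 0$ as applying no test to this person can further decrease $\Delta^e_i$. For the remaining people, we solve the optimization problem in (48). After obtaining the test rates for the remaining people, we again compare the average differences $\Delta^e_i$ with the test rates in (52) and with no test, and we choose $v_i = 0$ for the person where $\Delta^e_i$ can be decreased the most. We repeat these steps until no $\Delta^e_i$ with $v_i > 0$ can be further decreased by choosing $v_i = 0$.
We note that the solution obtained in (52) has a threshold structure. As the false-positive and false-negative test probabilities increase, the term $\frac{2(1-p-q)}{\bar{\beta}}$ in (52) becomes smaller. As a result, some people with higher $\frac{(\mu_i + \lambda_i)^3}{\mu_i \lambda_i}$ may not be tested by the health care provider. Thus, when $p$ and $q$ are high, a smaller portion of the population is tested with higher test rates in order to combat the test errors.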
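The greedy procedure described above can be sketched in code. This is a hedged illustration, not the authors' implementation: it assumes a per-person error of the form $\frac{\lambda_i q(\lambda_i+v_i) + \mu_i p(\mu_i+v_i) + \mu_i\lambda_i(2-p-q)}{(\mu_i+\lambda_i)(\mu_i+\lambda_i+v_i)}$ when tested and $\frac{\min\{\lambda_i,\mu_i\}}{\mu_i+\lambda_i}$ when untested, and it finds the Lagrange multiplier by bisection. All function names are hypothetical.

```python
import math

def delta_tested(lam, mu, v, p, q):
    num = lam * q * (lam + v) + mu * p * (mu + v) + lam * mu * (2 - p - q)
    return num / ((lam + mu) * (lam + mu + v))

def delta_untested(lam, mu):
    return min(lam, mu) / (lam + mu)

def waterfill(lam, mu, active, p, q, C, iters=200):
    """Threshold rates (sqrt(k_i / beta) - (lam_i + mu_i))^+ that sum to C."""
    k = {i: 2 * (1 - p - q) * lam[i] * mu[i] / (lam[i] + mu[i]) for i in active}

    def total(beta):
        return sum(max(math.sqrt(k[i] / beta) - lam[i] - mu[i], 0.0) for i in active)

    lo, hi = 1e-12, 1.0
    while total(hi) > C:
        hi *= 2.0
    for _ in range(iters):          # bisection on the multiplier beta
        mid = 0.5 * (lo + hi)
        if total(mid) > C:
            lo = mid
        else:
            hi = mid
    return {i: max(math.sqrt(k[i] / hi) - lam[i] - mu[i], 0.0) for i in active}

def greedy_rates(lam, mu, p, q, C):
    """Drop, one at a time, the person who gains the most from not being tested."""
    active = set(range(len(lam)))
    while active:
        v = waterfill(lam, mu, active, p, q, C)
        gain = {i: delta_tested(lam[i], mu[i], v[i], p, q) - delta_untested(lam[i], mu[i])
                for i in active}
        worst = max(gain, key=gain.get)
        if gain[worst] <= 0:        # nobody improves by being dropped
            break
        active.remove(worst)
    else:
        v = {}                      # every person was dropped
    return [v.get(i, 0.0) for i in range(len(lam))]

if __name__ == "__main__":
    print(greedy_rates([3.0, 1.0, 0.2], [0.2, 1.0, 3.0], 0.1, 0.1, 3.0))
```

Each pass removes at most one person, so the loop runs at most $n$ times, compared with the $2^n$ subproblems of the exhaustive search.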

Average Estimation Error with Dependent Infection Rates
In this section, we consider the case where we have two people whose infection rates depend on each other. When these two people are healthy, they can be individually infected with the virus after an exponential time with rate $\lambda$. When one of these two people is infected and this has not been detected by the health care provider, this person can infect the other healthy person after an exponential time with rate $\lambda_{12}$, as illustrated in Figure 5. Thus, when both people are healthy, their individual infection rate is $\lambda$. However, when one of them is sick and this has not been detected by the health care provider, the healthy person's total infection rate is equal to $\lambda + \lambda_{12}$. On the other hand, if only one person is infected, i.e., $x_i(t) = 1$, and this has also been detected by the health care provider, i.e., $\hat{x}_i(t) = 1$, then we assume that the infected person is isolated from the healthy one, and thus, the healthy person's infection rate remains $\lambda$ instead of $\lambda + \lambda_{12}$. When the people are infected, they recover from the disease after an exponential time with rate $\mu$. When the health care provider believes that a person is healthy, i.e., $\hat{x}_i(t) = 0$, the next test is applied to this person after an exponential time with rate $s$. When the health care provider believes that a person is sick, i.e., $\hat{x}_i(t) = 1$, the next test is applied to this person after an exponential time with rate $c$. Here, we note that since the people are identical in terms of their infection and recovery rates, the health care provider applies the same test rates to both.

Age of Incorrect Information Based Error Metric
To date, we have considered an estimation error metric that takes the value 1 if the actual infection status of a person is different than the real-time estimation at the health care provider. Thus, the error metric takes values based on the information content. On the other hand, the traditional age metric introduced in [1] considers only the time passed since the most recently received status update packet is generated at the source. As a result, the traditional age metric does not consider the information content and age alone may not be a suitable performance metric for the problem considered in our work.
In the context of infection tracking, it is important to know how long the estimations at the health care provider have been different from the actual infection status of the people. However, the error metric that we have considered thus far does not have the time component, i.e., it only takes value 1 independent of the time duration that it has been off from the actual health status. Motivated by the AoII introduced in [51,102] which accounts for both the time and the information content, in this section, we consider the following error metric, where the superscript s stands for "synchronization" implied in AoII, where V i (t) is the last time instant where the health care provider makes an accurate estimation of the health status for the ith person, i.e., the last time instant when ∆ s i = 0. Similarly, we define where V i1 (t) and V i2 (t) are equal to the last time instants when ∆ s i1 and ∆ s i2 are equal to 0, respectively. A sample evolution of ∆ s i1 and ∆ s i2 is shown in Figure 6 and we note that Figure 6. A sample evolution of (a) ∆ s i1 (t), and (b) ∆ s i2 (t) in a typical update cycle.
Similar to Section 3, the infection and the recovery rates of the ith person are λ i and µ i , respectively. In this section, the health care provider applies only one test rate for each person denoted by w i . That is, we do not consider separate testing rates of s i and c i for healthy and infected people as we did previously, instead, here both s i and c i are equal o w i . We first consider the case where w i > 0. By following the steps in Section 3, one can show that E[ which can be obtained by substituting w i instead of s i and c i in (10) and (12), respectively. Next, we denote the total area when ∆ s i1 (t) > 0 as A e,1 (i, j) during the jth cycle where A e,1 (i, j) = ∑ T m (i, ) 2 2 and K 1 has a geometric distribution with success rate w i µ i +w i . Then, we have E[A e,1 (i)] = 1 w i (w i +µ i ) . Similarly, we denote the total area when ∆ s i2 (t) > 0 as A e,2 (i, j) during the jth cycle where A e,2 (i, j) = ∑ T u (i, ) 2 2 and K 2 has a geometric distribution with success rate w i when w i > 0. One can show that ∆ s i is a decreasing function of w i , i.e., ∂∆ s i ∂w i < 0, and ∆ s i is a convex function of w i , i.e., i.e., E[I i ] is equal to the expected time of a person's healthy and sick states. Since the health care provider applies no tests to test a person, it either estimates this person to be always sick (x i (t) = 1) or always healthy (x i (t) = 0). When w i = 0 andx i (t) = 1, then ∆ s i = 1 . If µ i < λ i , then the health care providerx i (t) = 1, andx i (t) = 0, otherwise. Thus, when w i = 0, we have ∆ s i = min 1 In order to find the optimal test rates, we formulate the following optimization problem where the objective function in (78) is equal to the summation of ∆ s i in (77) when w i > 0 and ∆ s i when w i = 0 over all people. In order to solve the problem in (78), we follow the same greedy solution approach in Section 5. 
First, by assuming that $w_i > 0$ for all people, so that the average difference $\Delta^s_i$ is given in (77), we solve the optimization problem in (79). Since the problem in (79) is a convex optimization problem, we can find the optimal $w_i$ values by defining the Lagrangian function and analyzing the KKT and complementary slackness conditions. In order to avoid being repetitive, we skip these optimization steps. Then, for each person, we compare $\Delta^s_i$ in (77), evaluated at the $w_i$ value found in (79), with the no-test value of $\Delta^s_i$, i.e., the minimum obtained for $w_i = 0$. If we can reduce $\Delta^s_i$ further by not testing, we choose $w_i = 0$ for the person with the highest improvement. Then, we solve the optimization problem in (79) for the remaining people. We repeat these steps until there is no improvement in $\Delta^s_i$ from choosing $w_i = 0$. In the next section, we provide extensive numerical results to evaluate the optimal test rates in the various settings considered in this paper.
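The greedy steps described above can be sketched as follows. This is only an illustration under stated assumptions: `aoii_tested` is a stand-in for (77), derived here from a four-state Markov chain over the pair (true status, estimate) with a single test rate, and it may differ from the expression in (77); `aoii_untested` assumes the no-test value $\min\{\frac{\mu_i}{\lambda_i(\lambda_i+\mu_i)}, \frac{\lambda_i}{\mu_i(\lambda_i+\mu_i)}\}$; the constraint is assumed to be $\sum_i w_i = C$; and the convex step is solved by bisection on the Lagrange multiplier rather than by the closed-form KKT analysis. All function names are hypothetical.

```python
def aoii_tested(w, lam, mu):
    """Stand-in for (77): long-run average AoII with test rate w (assumed model)."""
    s = lam + mu
    return lam * mu / (s * (w + s)) * (1.0 / (w + mu) + 1.0 / (w + lam))


def aoii_untested(lam, mu):
    """No-test value: the better of 'always sick' and 'always healthy'."""
    s = lam + mu
    return min(mu / (lam * s), lam / (mu * s))


def neg_slope(w, lam, mu):
    """Negative derivative of aoii_tested in w; positive and decreasing in w."""
    s = lam + mu
    a = 1.0 / (w + mu) + 1.0 / (w + lam)
    b = 1.0 / (w + mu) ** 2 + 1.0 / (w + lam) ** 2
    return lam * mu / s * (b / (w + s) + a / (w + s) ** 2)


def rate_for_price(nu, lam, mu):
    """KKT stationarity: the w >= 0 with neg_slope(w) = nu, or 0 if none exists."""
    if neg_slope(0.0, lam, mu) <= nu:
        return 0.0
    lo, hi = 0.0, 1.0
    while neg_slope(hi, lam, mu) > nu:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if neg_slope(mid, lam, mu) > nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def allocate(rates, total):
    """Convex step: split `total` among the listed people (bisection on nu)."""
    def used(nu):
        return sum(rate_for_price(nu, lam, mu) for lam, mu in rates)

    hi = max(neg_slope(0.0, lam, mu) for lam, mu in rates)  # used(hi) == 0
    lo = hi
    while used(lo) < total:      # lower the multiplier until the budget is met
        lo *= 0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if used(mid) > total:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return [rate_for_price(nu, lam, mu) for lam, mu in rates]


def greedy_test_rates(rates, total):
    """Drop, one at a time, the person who gains most from not being tested."""
    active = list(range(len(rates)))
    w = [0.0] * len(rates)
    while active:
        alloc = allocate([rates[i] for i in active], total)
        gains = [aoii_tested(a, *rates[i]) - aoii_untested(*rates[i])
                 for a, i in zip(alloc, active)]
        k = max(range(len(active)), key=lambda j: gains[j])
        if gains[k] <= 0.0:      # nobody improves by going untested: stop
            for i, a in zip(active, alloc):
                w[i] = a
            break
        active.pop(k)
    return w


def total_error(rates, w):
    """Sum of per-person errors for a given vector of test rates."""
    return sum(aoii_tested(x, lam, mu) if x > 0.0 else aoii_untested(lam, mu)
               for x, (lam, mu) in zip(w, rates))
```

For example, `greedy_test_rates([(0.9, 0.3), (0.5, 0.5), (0.3, 0.9)], 4.0)` returns one test rate per person, with zeros for the people left untested. Under these assumptions, the resulting total error is never larger than that of the uniform allocation with the same budget, since the first convex step already dominates it and each removal only lowers the sum.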

Numerical Results
In this section, we provide seven numerical results. For these examples, we take $\lambda_i$ as

$$\lambda_i = a r^i, \quad i = 1, \dots, n, \tag{80}$$

where $r = 0.9$ and $a$ is such that $\sum_{i=1}^{n} \lambda_i = 6$. Furthermore, we take $\mu_i$ as

$$\mu_i = b q^i, \quad i = 1, \dots, n, \tag{81}$$

where $q = 1.1$ and $b$ is such that $\sum_{i=1}^{n} \mu_i = 4$. Since $\lambda_i$ in (80) decreases with $i$, people with lower indices become infected more quickly than people with higher indices. Since $\mu_i$ in (81) increases with $i$, people with higher indices recover more quickly than people with lower indices. Thus, a person with a low index becomes infected quickly and recovers slowly.
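The rate profiles used in the examples can be generated with a few lines of code. The sketch below assumes the geometric forms $\lambda_i = a r^i$ and $\mu_i = b q^i$ for $i = 1, \dots, n$, with the constants $a$ and $b$ fixed by the stated sums; the helper name `geometric_rates` is illustrative.

```python
def geometric_rates(n, ratio, total):
    """Return [scale * ratio**i for i = 1..n], scaled so the rates sum to `total`.

    Assumption: the rates follow a geometric profile in the person index i.
    """
    weights = [ratio ** i for i in range(1, n + 1)]
    scale = total / sum(weights)  # plays the role of a (or b) in the text
    return [scale * weight for weight in weights]


if __name__ == "__main__":
    n = 10
    lam = geometric_rates(n, 0.9, 6.0)  # infection rates, decreasing in i
    mu = geometric_rates(n, 1.1, 4.0)   # recovery rates, increasing in i
    for i, (l, m) in enumerate(zip(lam, mu), start=1):
        print(i, round(l, 3), round(m, 3))
```

Under this assumption, person 1 has the largest infection rate and the smallest recovery rate, which matches the description of low-index people becoming infected quickly and recovering slowly.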
In the first example, we take the total number of people as $n = 10$, the total test rate as $C = 16$, and $\theta = 0.5$. We start with randomly chosen $s_i$ and $c_i$ such that $\sum_{i=1}^{n} (s_i + c_i) = 16$, and apply the alternating minimization-based method proposed in Section 4. We repeat this process for 30 different initial $(s_i, c_i)$ pairs and choose the solution that gives the smallest $\Delta$. In Figure 7a, we observe that the first three people are never tested by the health care provider. We note that $s_i$, which is the test rate when $\hat{x}_i(t) = 0$, initially increases with $i$ but then decreases with $i$. This means that people who rarely become infected are tested less frequently when they are marked as healthy. Similarly, we observe in Figure 7a that $c_i$, which is the test rate when $\hat{x}_i(t) = 1$, monotonically increases with $i$. In other words, people who recover from the virus quickly are tested more frequently when they are marked as infected.

In Figure 7b, we plot $\Delta_i$ resulting from the solution found by the proposed algorithm, $\Delta_i$ when the health care provider applies tests to everyone in the population uniformly, i.e., $s_i = c_i = \frac{C}{2n}$ for all $i$, and $\Delta_i$ when the health care provider applies no tests, i.e., $s_i = c_i = 0$ for all $i$. In the case of no tests, $\Delta_i$ is determined by $\lambda_i$, $\mu_i$, and $\theta$ alone. We observe in Figure 7b that the health care provider applies tests to people whose $\Delta_i$ can be reduced the most, as opposed to uniform testing where everyone is tested equally. Thus, the first three people, who have the smallest $\Delta_i$, are not tested by the health care provider. With the proposed solution, by not testing the first three people, $\Delta_i$ is further reduced for the remaining people compared to uniform testing. For the people who are not tested, the health care provider chooses $\hat{x}_i(t) = 1$ all the time, i.e., it always marks these people as sick. This is expected as these people have high $\lambda_i$ and low $\mu_i$, i.e., they are infected easily and they stay sick for a long time.
In the second example, we use the same set of variables except for the total test rate $C$, which we vary between 5 and 20. We plot $\Delta$ with respect to $C$ in Figure 8. We observe that $\Delta$ decreases with $C$. Thus, as expected, with higher total test rates, the health care provider can track the infection status of the population better.
In the third example, we use the same set of variables except for the total number of people $n$. In addition, we also use uniform infection and healing rates, i.e., $\lambda_i = \frac{6}{n}$ and $\mu_i = \frac{4}{n}$ for all $i$, for comparison with $\lambda_i$ in (80) and $\mu_i$ in (81), while keeping the total infection and healing rates the same, i.e., $\sum_{i=1}^{n} \lambda_i = 6$ and $\sum_{i=1}^{n} \mu_i = 4$, for both cases. We vary the number of people $n$ from 2 to 30. We observe in Figure 9 that when the infection and healing rates are uniform in the population, the health care provider can track the infection status with the same efficiency even as the population size increases (while keeping the total infection and healing rates fixed). For the case of $\lambda_i$ in (80) and $\mu_i$ in (81), when we increase the population size, we increase the number of people who rarely become sick, i.e., people with high indices $i$, and also the number of people who rarely heal from the disease, i.e., people with small indices $i$. Thus, it becomes easier for the health care provider to track the infection status of the people. This is why, when we use $\lambda_i$ in (80) and $\mu_i$ in (81), we observe in Figure 9 that the health care provider can track the infection status of the people better as the population size increases.

Figure 9. The average difference $\Delta$ with respect to the number of people $n$. We use uniform infection and healing rates, i.e., $\lambda_i = \frac{6}{n}$ and $\mu_i = \frac{4}{n}$ for all $i$, and also $\lambda_i$ in (80) and $\mu_i$ in (81) with $\sum_{i=1}^{n} \lambda_i = 6$ and $\sum_{i=1}^{n} \mu_i = 4$.
In the fourth example, we employ the same set of variables as in the first example except for the importance factor $\theta$. Here, we vary $\theta$ between 0.2 and 0.7. We plot $\Delta$ in (7), $\bar{\Delta}_1$, and $\bar{\Delta}_2$ in Figure 10a. Note that $\bar{\Delta}_1$ represents the average difference when people are infected, but have not been detected by the health care provider, and $\bar{\Delta}_2$ represents the average difference when people have recovered, but the health care provider still marks them as infected. Note that when $\theta$ is high, we assign importance to the minimization of $\bar{\Delta}_1$, i.e., the early detection of people with infection, and when $\theta$ is low, we give importance to the minimization of $\bar{\Delta}_2$, i.e., the early detection of people who have recovered from the disease. This is why we observe in Figure 10a that $\bar{\Delta}_1$ decreases with $\theta$ while $\bar{\Delta}_2$ increases with $\theta$.
We plot the total test rates $\sum_{i=1}^{n} s_i$ and $\sum_{i=1}^{n} c_i$ in Figure 10b. We observe in Figure 10b that if it is more important to detect the infected people, i.e., if $\theta$ is high, then the health care provider should apply higher test rates to people who are marked as healthy. In other words, $\sum_{i=1}^{n} s_i$ increases with $\theta$. Similarly, if it is more important to detect people who have recovered from the disease, then the health care provider should apply high test rates to people who are marked as infected. That is, $\sum_{i=1}^{n} c_i$ is high when $\theta$ is low. Therefore, depending on the priorities of the health care provider, a suitable $\theta$ needs to be chosen.
In the fifth numerical result, we consider the case where there are errors in the test measurements, i.e., the model in Section 5. We take the total test rate as $C = 20$, and vary the error rates in the test as $p = q \in \{0.1, 0.2, 0.4\}$. In Figure 11a, we provide the test rates $v_i$ found by our greedy policy in Section 5. When the error rates $p$ and $q$ are low, i.e., when $p = q = 0.1$, we see that the health care provider applies tests to everyone in the population, and the corresponding $\Delta^e_i$ is lower than with no tests, as shown in Figure 11b. As we increase the error rates, we observe that some people in the population are no longer tested by the health care provider; see Figure 11a for $p = q \in \{0.2, 0.4\}$. In this case, the health care provider applies more tests to the remaining people to combat the test errors. However, although it applies more tests to the remaining people, we observe in Figure 11b that the achieved average difference $\Delta^e_i$ becomes higher as the error rates increase.

In the sixth numerical result, we consider the case where the infection statuses of the people depend on each other. In other words, when one person is infected, they can infect the other person with rate $\lambda_{12}$ when they are not detected by the health care provider, i.e., the infection model in Section 6. For this example, first, we take $\mu = 5$, $\lambda = 2.5$, $s = c = \frac{C}{4}$, and vary $\lambda_{12} \in \{2, \dots, 200\}$ and $C \in \{20, 40, 60\}$. If $\lambda_{12} = 0$, i.e., if the infection statuses of the people are independent of each other, then the average time that person 1 or 2 is sick is equal to $\frac{\lambda}{\lambda + \mu} = \frac{1}{3}$. As we increase the infection rate $\lambda_{12}$ between persons 1 and 2, we see in Figure 12a that the average time that person 1 is sick increases. However, we note that as we increase the total test rate, the health care provider can detect a sick person more quickly, and this explains why the average infected time is low in Figure 12a when the test rate is high.
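The baseline value $\frac{\lambda}{\lambda+\mu} = \frac{1}{3}$ for independent infections can be checked with a short simulation. The sketch below assumes a single person who alternates between an exponential healthy period with rate $\lambda$ and an exponential sick period with rate $\mu$, with no spreading ($\lambda_{12} = 0$); it does not model the dependent case of Section 6, and the function name is illustrative.

```python
import random


def sick_fraction(lam, mu, cycles=100_000, seed=1):
    """Estimate the long-run fraction of time a single person is sick.

    Assumption: healthy periods are Exp(lam), sick periods are Exp(mu),
    and there is no infection from other people.
    """
    rng = random.Random(seed)
    healthy_time = 0.0
    sick_time = 0.0
    for _ in range(cycles):
        healthy_time += rng.expovariate(lam)  # time until the next infection
        sick_time += rng.expovariate(mu)      # time until recovery
    return sick_time / (healthy_time + sick_time)


if __name__ == "__main__":
    lam, mu = 2.5, 5.0  # values used in the sixth numerical result
    print(sick_fraction(lam, mu), lam / (lam + mu))
```

The two printed values should be close to each other, which gives a reference point for reading how far the curves in Figure 12a rise above the independent case.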
Then, we consider $\lambda_{12} \in \{5, 10, 15\}$ and vary the total test rate $C \in \{2, \dots, 200\}$. We plot the average time that both persons 1 and 2 stay sick in Figure 12b. As we increase the total test rate, the health care provider detects the infected person more quickly, and thus prevents the infection from spreading. As a result, we observe that the average time that both people are infected decreases with $C$ in Figure 12b. Since both people can be infected with the virus independently of each other with rate $\lambda$, the plots in Figure 12b do not drop to 0.

In the last numerical result, we consider the age of incorrect information-based error metric in Section 7. Here, the estimation error increases with the time that the health care provider does not detect the changes in the infection status of the people. As a result, the average difference expression given by $\Delta^s_i$ in (77) is different from $\Delta^e_i$ in (45) when $p = q = 0$. For this example, we consider the total test rate $C = 4$ and compare the normalized average differences under the two metrics, along with the corresponding test rates $w_i$ and $v_i$. In Figure 13b, depending on the error metric, people who are tested by the health care provider show considerable variation in their test rates. For example, with the error metric $\Delta^s_i$ in (77), we apply tests to every third person while the same person is not tested with the error metric $\Delta^e_i$ in (45). In Figure 13a, we provide the normalized average difference values. Here, the average normalized error for the tested people exhibits similar values, whereas the normalized difference may vary for the untested people. Thus, we should choose a suitable error metric that satisfies the priorities of the health care provider, as it greatly affects who is tested and at which test rates.

Conclusions and Discussion
We considered the timely tracking of the infection status of individuals in a population. For exponential infection and healing processes with given rates, we determined the rates of exponential testing processes. We considered errors in the test measurements and observed that, in order to combat the test errors, a limited portion of the population may be tested with higher test rates. Then, we studied a dependent infection spread model for two people, where one infected person can spread the virus to the other if they have not been detected by the health care provider. Finally, we studied an AoII-based error metric where the error function increases linearly over time as long as the changes in the infection status have not been detected by the health care provider. We observed in numerical results that the test rates depend on the individuals' infection and recovery rates, the individuals' last known state of being healthy or infected, as well as the health care provider's priorities of detecting infected people versus detecting recovered people more quickly.
In the literature, in order to model epidemics, the population is partitioned into groups called compartments. One such example is the SIR model used in [106] with the compartments susceptible (S), infected (I), and recovered (R), which was further developed by adding the states hospitalized (H) and death (D) in [107]. In these epidemic models, the transitions between the compartments are assumed to be Markovian. In [107], with epidemiological data, the delay distributions for infected (I) to hospitalized (H) and infected (I) to death (D) are well approximated by exponential and gamma distributions, respectively. However, due to limited data availability, the delay distribution for infected (I) to recovered (R) is modeled with a gamma distribution with higher tolerance. In our work, we modeled infection and recovery times, i.e., the delays from recovered (R) to infected (I) and from infected (I) to recovered (R), with exponential distributions. Therefore, more realistic infection tracking models can be developed by considering gamma distributions as observed in [107]. This more realistic model corresponds to the problem of real-time timely tracking of a binary Markov source in a serially connected network. The serially connected network model was studied in [8] with the traditional age of information metric. We note that considering the same networking model with the AoII-based error metric to track the information dissemination of a binary Markov source represents a promising research direction and has direct applications to the real-time tracking of epidemic spread models. One can also study the extension of the dependent infection spread model in Section 6 to n > 2 people as a future research direction.
Another interesting research direction could be to consider different kinds of tests with different false-positive and false-negative rates. For this problem, instead of having a total test rate capacity C, we may consider a total test budget K. Assuming that each test type bears a different cost, the goal might be to identify how many tests of each type the health care provider should obtain. Here, one can study a trade-off between applying fewer tests that each have a small probability of error and applying more tests that each have a high probability of error. Moreover, one can consider a scenario where the health care provider may prefer to apply different test types to individuals depending on their infection and recovery rates.

Conflicts of Interest:
The authors declare no conflict of interest.