Artificial Intelligence (AI) Provided Early Detection of the Coronavirus (COVID-19) in China and Will Influence Future Urban Health Policy Internationally

Predictive computing tools are increasingly being used and have demonstrated success in providing insights that can lead to better health policy and management. However, as these technologies are still in their infancy, their adoption for serious consideration at national and international policy levels has been slow. Nevertheless, a recent case shows that Artificial Intelligence (AI) driven algorithms are gaining in accuracy. AI modelling by companies such as BlueDot and Metabiota anticipated both the impact and the spread of the Coronavirus (COVID-19) in China before it caught the world by surprise in late 2019. From a survey of viral outbreaks over the last 20 years, this paper explores how the time needed for early viral detection will shrink as computing technology is enhanced and as more data communication and libraries are ensured between varying data information systems. For this enhanced data sharing to take place, efficient data protocols have to be enforced to ensure that data is shared across networks and systems while ensuring privacy and preventing oversight, especially in the case of medical data. This will yield enhanced AI predictive tools, which will influence future urban health policy internationally.


Introduction
The advent of advanced computer modelling technologies, and their adoption in many sectors internationally, has led to improved, higher-quality risk assessments of local and global economies. Among these sectors, the medical field has been particularly transformed: an increase in health data, accentuated by the availability of technologies such as Big Data, Machine Learning, the Internet of Things (IoT) and Artificial Intelligence (AI), has significantly aided predictive modelling capacities. The availability of predictive computing tools and their subsequent use in healthcare worldwide has brought notable transformations in medical operations, personal medicine and epidemiology, and these are expected to continue improving precision in this field, especially in accuracy of diagnosis [1]. Watson [2] further notes that these transformations are providing opportunities for predictive analysis through the novel evaluation of historical and current medical data. AI predictive tools are already in use in the recruitment and assessment of the healthcare workforce, and are being further reinforced to include concerns of inclusivity and meritocracy [2].
However, most of these computing tools are relatively new and still evolving with respect to the medical field, and there are numerous associated issues that still need to be appreciated and understood. Additionally, health professionals and related stakeholders have not yet fully embraced these technologies. Thus, life-threatening decisions are still heavily reliant upon human-based interpretations that are time consuming and lack comprehensive evaluation of the data. This is true even though such human-based decisions are much slower than those derived from the use of computing technologies. On this, it is noted that legal frameworks can help to guide issues like data collection and sharing, leading to more receptivity by the health community. However, the lack of standardisation of protocols also means that the scope of data to be analyzed is limited to specific areas. Therefore, results obtained in particular medical fields or geographical regions may not apply in others.
The technological progress in the health sector is well demonstrated by the speed of detection (here referred to as the process of identification of the disease) in the recent case of the novel coronavirus (COVID-19), which was identified relatively early. Human identification took just seven days [3], compared to past outbreaks like the Severe Acute Respiratory Syndrome (SARS), which took four months to be identified [4]. Interestingly, however, it is noted that an Artificial Intelligence (AI) driven algorithm provided an early detection and warning on 31 December 2019, seven days before the World Health Organisation (WHO) released an official notice of the outbreak [5]. In a similar scenario, an epidemic monitoring company called Metabiota was able, through the use of a predictive tool, to determine and warn that countries like Thailand, South Korea, Taiwan and Japan were immediately susceptible to the coronavirus outbreak a week before it was officially confirmed in these countries [6].
The use of AI-driven algorithms for the early detection of pandemics is still maturing and may be a potent route in the near future to aid better preparedness. It is expected that as the precision of these technologies continues to advance, they will have a more pronounced role in promoting the formulation of novel health policies. This paper surveys how data and AI processes aided in the early stages of the detection of the COVID-19 pandemic and provides preliminary supporting evidence that enhanced data sharing protocols will contribute to future urban health policy internationally.

The Early Detection of the Coronavirus (COVID-19)
The official warning issued by the WHO on the outbreak of the novel coronavirus (COVID-19) was made on 9 January 2020, after it had received reports from official sources in its China offices [7]. Nevertheless, the virus had been detected earlier (8 December 2019), when the first batch of six patients reported to a Wuhan hospital, where they were treated and discharged. From that treatment date until 31 December 2019, Chinese health officials tried without success to determine whether they were dealing with a new type of virus, hence the time lag in reporting the matter to the WHO China Country Office [3]. Amongst the information given to health authorities was that the patients who had the symptoms were all drawn from the Wuhan city catchment. This led to the lockdown of the entire city of Wuhan on 23 January 2020 [8].
Early reports revealed that the first group of patients had either worked at or visited the Huanan Seafood Wholesale Market in Wuhan, leading to the prompt identification of the virus' spatial origin. After its reporting, on 7 January 2020, the virus was identified as a new type of coronavirus and was named 'COVID-19', after Chinese officials had ruled out the possibility of it being a SARS virus on 5 January 2020. From its detection to the time the virus was identified and confirmed, tests were only performed in Chinese laboratories. However, after the virus was reported outside of China, it was confirmed that the Virus Identification Laboratory based at the Peter Doherty Institute for Infection and Immunity in Melbourne, Australia, had successfully grown the virus in cell culture on 25 January 2020 [9].
Besides the laboratory reports on the virus, it is reported that AI-driven algorithms outside of China were able to provide an early detection of the virus even before the WHO was informed, and subsequently managed to warn travelers who could be at high risk of being affected. Interestingly, the question arises as to why the trend was not detected by companies in China, leading to the assumption that data sharing restrictions may be a factor. Among the successful companies is BlueDot (https://bluedot.global/), which scoured data from news reports, airline ticketing and animal disease outbreaks to predict areas that would be prone to the outbreak, expanding from regions in China [10]. Another company with such AI capabilities, Metabiota (https://www.metabiota.com/), used the same approach, applying Big Data analytics to flight data to accurately anticipate that countries like Japan, Thailand, Taiwan and South Korea would be at risk of a coronavirus outbreak days before any case was reported in any of those countries [6].
The virus sequence was uploaded online to assist researchers around the world in finding a vaccine and in improving diagnosis. Following concerted collaborations between different agencies, especially in information and data sharing, some success in confronting the outbreak has already been achieved by a team of researchers led by Doherty Institute Deputy Director Dr. Mike Catton [9]. This laboratory is also reported to have placed the virus in an open database that laboratories accredited by the WHO can access to advance the search for future vaccines. In addition, the virus could be used to develop tests to help identify people who might be infected but are not presently showing any symptoms, thereby helping to curtail the spread of the disease [11].

A Brief Survey on Infectious Disease Outbreak in a 20-Year Period
From a historical perspective, the world has experienced a number of devastating outbreaks at different periods, some with unimaginable consequences. One of the worst cases was the Black Death that struck in the 14th century and left an estimated 100-200 million people dead [12]. In more recent times, there have been major Influenza (Flu) outbreaks. These include: the H1N1 influenza virus (or 'Spanish Flu'), reported in 1918 (approximately 500 million people infected), which originated in Étaples, France [13]; influenza A subtype H2N2 ('Asian Flu'), reported in 1957-1958 (1.1 million deaths), which originated in China; the influenza A (H3N2) virus of 1968 (1 million deaths), which originated in Hong Kong; and the H1N1 influenza virus (or 'Swine Flu') of 2009 (12,469 deaths), which originated in the USA [14]. Other outbreaks include SARS (Severe Acute Respiratory Syndrome), which broke out in 2003 (774 deaths) in China's Guangdong Province [15]; the Ebola (Zaire ebolavirus) virus in 2014 (11,315 deaths), which originated from Zaire (now the Democratic Republic of Congo); the Zika virus in 2015, spread by the Aedes aegypti mosquito, which originated in Brazil; and the latest, the coronavirus in 2020, which originated in Wuhan, China.
On this, though detection times may vary greatly with respect to virus type (influenza, Henipavirus (Nipah virus), Filoviruses like Ebola, Flaviviruses like Zika, et cetera) [16,17] and other characteristics, the constant is that technological advancement is clearly aiding in reducing detection time. This was evidenced in the recent case of COVID-19, which took only seven days to detect. In particular, advancements in computing capacity brought about by the emergence and widespread employment of technologies like AI [18], machine learning, Big Data [19,20] and Cloud Computing have allowed massive amounts of data from various sources to be captured and analyzed in real time, and insightful predictions are being made from them. Advancements in AI-based infectious disease surveillance algorithms, and their use to aid early infectious disease detection, are further noted in Figure 1, where a comparison of the detection times of infectious disease outbreaks in the past 20 years, as reported by the WHO, reveals a trend in which AI-based tools have gained in efficiency. This is of particular importance because previously endemic diseases are now being carried through human importation to non-endemic countries, thus causing pandemic outbreaks.
Although policies are now being revised to prevent the spread of a disease to non-endemic countries, much work is still needed to identify an outbreak before its actual occurrence so that healthcare strategies can be better managed and instigated. A PubMed query with the keywords "("artificial intelligence") OR ("machine learning") OR ("deep learning") AND ("disease surveillance") AND ("1999/01/01"[Date-Publication]: "2019/12/31"[Date-Publication])" further supports that there has been an increasing trend in research involving the development of AI-based algorithms to better predict the outcomes of current healthcare data, and thus to predict disease outbreaks in advance. Although we do observe that certain years do not evidence any research on AI technologies in disease surveillance, there has been significant research performed on the theme of AI-based personal healthcare. Figure 2 shows the increasing amount of AI-based research on healthcare, obtained through a search on PubMed using the keywords "("artificial intelligence") AND ("epidemic") OR ("endemic") AND ("1999/01/01"[Date-Publication]: "2019/12/31"[Date-Publication])"; the y-axis represents the number of results and the x-axis represents years. The PubMed search and the results thereof are consistent with similar database searches used in systematic literature reviews; such searches have been observed to be an adequate tool to identify emerging trends, especially in the field of medicine and healthcare [21], and are also used by the World Health Organisation [22]. While this paper only aims at surveying the emergence of AI-based literature in healthcare, a further in-depth study is warranted to better understand the specific AI processes that are being favoured and the most popular health dimensions. Nevertheless, the brief survey established that the approach taken in this study, querying the PubMed database using specific keywords, is consistent with conventional practices, and through it, the results in Figures 1 and 2 support that there is an emerging trend towards the use of AI tools in the health industry, and particularly in the detection of outbreaks.
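As a rough illustration of how such a year-by-year literature count could be reproduced, the sketch below builds request URLs for the NCBI E-utilities ESearch endpoint, one per publication year. It is not the procedure used for Figure 2: the explicit parenthesisation of the search term, the helper names (`yearly_count_url`, `survey_urls`) and the choice to filter by `mindate`/`maxdate` rather than a date tag inside the term are all assumptions made for this example. The code only constructs URLs; fetching them and reading the returned counts is left to the reader.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Explicitly parenthesised form of the disease-surveillance query.
# ASSUMPTION: the grouping of OR/AND is our reading; the query string
# quoted in the text leaves operator precedence ambiguous.
TERM = (
    '("artificial intelligence" OR "machine learning" OR "deep learning") '
    'AND "disease surveillance"'
)


def yearly_count_url(term: str, year: int) -> str:
    """Build an ESearch URL that requests only the hit count for one year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",           # filter on publication date
        "mindate": f"{year}/01/01",
        "maxdate": f"{year}/12/31",
        "rettype": "count",           # return the count, not the ID list
    }
    return f"{ESEARCH}?{urlencode(params)}"


def survey_urls(term: str, first_year: int = 1999, last_year: int = 2019) -> dict[int, str]:
    """Map each year in the survey window to its ESearch URL."""
    return {year: yearly_count_url(term, year) for year in range(first_year, last_year + 1)}


urls = survey_urls(TERM)
```

Splitting the window into one request per year yields the series plotted against the x-axis directly, and makes years with zero results visible rather than hidden inside an aggregate count.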
Following the increase in frequency of global pandemics over time, and the handling of outbreaks by different countries and agencies, where information is often withheld or delayed, it has become apparent that there is a need for a framework that requires data to be shared with the public in a timely manner. Such a decision was taken by the WHO in 2016 after the outbreak of the Zika virus. This decision was informed by the use of advanced technologies that made it possible for countries, agencies and other stakeholders to quickly analyze available data to determine the nature of outbreaks. In recent years, agencies in the health sector have been able to employ analytical tools, like AI algorithms, to scour data from different sources and in different languages to render valuable predictions [23]. In particular, with machine-learning technologies, these agencies now have the capacity to refine their investigative codes using specific datasets. Therefore, they are increasingly able to estimate, in real time, the probable spread pattern of a virus. This was evident with the novel coronavirus (COVID-19), where BlueDot and Metabiota were able to predict the countries that were at high risk of experiencing the outbreak days before any actual case was officially reported in any of those countries. Besides predicting outbreaks, it is believed that with different AI tools, like machine learning in personalised medicine and predictive patient outcome applications, it will become easier to quickly diagnose and find cures for global outbreaks [24], thus lessening their overall impact and spread through our global population. A description of the two companies is featured in the next section.

On the Underlying AI Capabilities of BlueDot and Metabiota
This section gives a brief overview of how BlueDot and Metabiota use technologies and AI tools to accurately predict outbreaks.

BlueDot
To predict outbreaks of infectious diseases, BlueDot relies heavily on AI and machine learning technologies. With these, and using various natural language processing algorithms, the company is able to gather data from a wide range of sources, including news outlets and global airline ticketing data [25]. On its website [26], this start-up states that it processes big data sourced from over 10,000 official and media sources each day, in over 60 languages. Its data set also includes information on population density, queried from sources like national censuses, the World Factbook and national statistics reports. Other sources of data that BlueDot relies on include the global Infectious Disease Alert, real-time climate conditions, and insect vectors and animal disease reservoirs. With information from all these datasets, the company then uses filtering tools to narrow down areas of interest [27] and employs powerful clustering tools to allow the quick discovery of areas that would be regarded as hot spots, cold spots and spatial outliers. With the data, the company then applies machine learning and natural language processing technologies to train the system, and it is thus able to send regular alerts to its clients, especially on cases of anomalous disease, the risk involved and the destinations most likely to experience outbreaks. The training of the system entails using a risk assessment model that utilises the large datasets sourced from various domains to detect, flag and show frequencies depicting potential dangers of the diseases, and also to anticipate the spread of outbreaks.
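To make the idea of flagging hot spots and cold spots more concrete, the following sketch labels regions by how far their report counts deviate from the average. This is a plain z-score threshold, chosen here for brevity; it is not BlueDot's method, and it is not a true spatial statistic, since it ignores which regions neighbour each other. The function name, the threshold of 1.5 and the sample counts are all hypothetical.

```python
from statistics import mean, pstdev


def classify_regions(counts: dict[str, float], threshold: float = 1.5) -> dict[str, str]:
    """Label each region 'hot', 'cold' or 'neutral' by the z-score of its count.

    ASSUMPTION: a simple non-spatial z-score rule stands in for the
    clustering tools mentioned in the text, purely for illustration.
    """
    mu = mean(counts.values())
    sigma = pstdev(counts.values())
    labels = {}
    for region, value in counts.items():
        # With zero spread every region is average, so z is defined as 0.
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        if z >= threshold:
            labels[region] = "hot"
        elif z <= -threshold:
            labels[region] = "cold"
        else:
            labels[region] = "neutral"
    return labels


# Hypothetical weekly counts of disease-related reports per region.
reports = {"Region A": 4, "Region B": 5, "Region C": 6, "Region D": 5, "Region E": 40}
labels = classify_regions(reports)
```

With these sample counts only Region E stands out as a hot spot; a production system would more plausibly weight each region by its neighbours and by population, which this sketch deliberately omits.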
The work of this web-based start-up was validated during the 2009 H1N1 Influenza pandemic, where it correctly used worldwide air travel data to anticipate the global pathway of the outbreak. The company was also instrumental during the 2014 Ebola outbreak where, using risk assessment models, it was near perfect in predicting the spread of Ebola in West Africa [26]. In 2016, BlueDot correctly predicted, six months in advance, that a Zika virus outbreak would be experienced in Florida [28]. As mentioned in this paper, in 2020, nine days before the official announcement of the outbreak of what is now known as COVID-19, BlueDot, relying on air travel data from Wuhan, was able to predict the outbreak, and subsequently correctly anticipated the cities that were at high risk of experiencing it [29].
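The intuition behind using air travel data to anticipate where an outbreak may appear next can be sketched as a simple ranking: destinations receiving a larger share of outbound passengers from the affected city are treated as more exposed. The example below is a deliberately minimal proxy under that single assumption; it is not BlueDot's or Metabiota's model, and the function name, city labels and passenger figures are hypothetical placeholders.

```python
def rank_destinations(passengers: dict[str, int]) -> list[tuple[str, float]]:
    """Rank destinations by their share of outbound passengers from the origin city.

    ASSUMPTION: relative importation risk is taken to be proportional to
    passenger volume alone; real models add many more signals.
    """
    total = sum(passengers.values())
    if total == 0:
        return []
    shares = {city: count / total for city, count in passengers.items()}
    # Highest share first.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical monthly outbound passenger counts from the affected city.
outbound = {"City A": 5000, "City B": 3000, "City C": 2000}
ranking = rank_destinations(outbound)
```

Here `ranking` lists City A first with a share of 0.5, followed by City B (0.3) and City C (0.2). The value of such a proxy is that it needs only ticketing data, which is available before any case is confirmed at the destination.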

Metabiota
Metabiota relies on technologies like AI, Machine Learning, Big Data and Natural Language Processing (NLP) algorithms to make predictions about infectious disease outbreaks, spreads, interventions and event severity [25]. Using NLP algorithms, this San Francisco based company gathers large amounts of data from both official and unofficial sources, such as those from biological, socioeconomic, political and environmental frontiers, and through advanced analysis and visualisation of the frequency, severity and duration of events, it is able to render accurate predictions [30]. Unlike BlueDot, which rarely depends on social media for data, Metabiota is observed to sometimes collect and utilise social media data to make its predictions, as in the recent case of the COVID-19 outbreak. By relying on the power of AI and machine learning, Metabiota then uses such data to make predictions on how outbreaks impact human behavior and the level of fear they cause. Such predictions are essential to this company because its main clients are insurance companies, which are particularly interested in such information, especially with respect to mitigating investment risks. Besides insurance companies, Metabiota's work also benefits governmental agencies, non-profit organisations, contractors and foundations, among other entities that rely on such information to make better informed decisions in the event of infectious disease outbreaks.
The story of Metabiota dates back to 2009, but it came to light in 2014 during the Ebola outbreak in West Africa. Before the outbreak of this virus, the company was involved in research geared towards finding the link between animal and human health, and it had set up camp in Africa. When Ebola broke out, it became involved, especially through the U.S. government, which was on the frontline in fighting the epidemic, but when the virus was contained, the funds advanced to the company were cut short [31]. It is for this reason that the company extended its scope to cater for insurance companies, and today it has amassed a comprehensive disease database. It is now utilizing modern predictive technologies like AI, Big Data and Machine Learning to make predictions. Thus, when COVID-19 was reported, the company used these technologies to be at the fore in predicting the next impacted jurisdictions beyond Wuhan, where the virus was initially experienced; this prediction came a week before the first cases were reported in those destinations [32]. This was performed through natural language processing, with which the company was able to utilise social media data from different sources to track the spread of the virus and hence render even more accurate predictions.

On AI-Driven Algorithms and Bioinformatics
Providing early detection in cases of pandemics is important in reducing human loss and disease severity. For instance, while the severe acute respiratory syndrome (SARS), caused by the SARS coronavirus (SARS-CoV), led to the deaths of 774 people across 17 countries, some of these deaths could have been avoided were it not for the information sharing delays by the Chinese authorities. The SARS outbreak, first reported in November 2002 as originating in wild animals being sold as food in a local market in Guangdong, China, took approximately four months to be discovered (February 2003). It is reported that, due to institutional structures and fragmented and disjointed bureaucracies in China at the time, the first reaction by Chinese officials was to deny that there was an outbreak, and this prompted inaction, leading to uncontrolled virus propagation. Subsequent research by Chinese scientists traced the virus, through the intermediary of Asian Palm Civets (Paradoxurus hermaphroditus), to cave-dwelling horseshoe bats in Yunnan Province. The Institute of Medicine (U.S.) [33] concludes that the slowness of information and of an official response arose because this information was considered a 'state secret', and anyone, including physicians or journalists, who attempted in any way to share or report about the outbreak risked persecution. Such a policy decision was ill advised, as by the time the information was finally shared the virus had already spread to almost 29 countries.
It appears that China has learnt from this 2002-2003 incident, and the central government's response to the COVID-19 coronavirus has been far quicker and more forthright.
Another case of delayed detection, which killed 858 people, was the 2012 outbreak of Middle East Respiratory Syndrome (MERS), also known as 'camel flu', a viral respiratory infection caused by the MERS-coronavirus (MERS-CoV). MERS-attributed deaths, mainly across the Arabian Peninsula with pockets in South Korea, Kenya, the Philippines and the United Kingdom, were reported to have taken place between 2012 and 2016, with 1841 cases confirmed via laboratory testing [34].
In contrast to these cases, the current outbreak of COVID-19 is observed to have been detected in a record time of only seven days, and this is seen to help in dispatching information globally on varying aspects of this outbreak.
The early detection of such cases in recent years is partly credited to the availability of Big Data technology, which is impacting the health sector significantly. The availability of data in this sector is being promoted by a myriad of wearable health devices with the capacity to collect data on health vitals. On this, Gaille [5] argues for the valuable role these devices play in advancing this sector, as big data is positioned as the new 'gold rush' of the 21st century. The benefits associated with big data are clearly influencing geopolitical standings, in both corporate and conventional governance realms, and there is increased competition between powerful economies to ensure that they have maximum control of big data. The 'push and pull' from the rollout of Smart Cities technologies, and in particular Huawei's 5G rollout, is testament to this [35][36][37][38][39]. On this, it has been noted that the control and handling of data by a few corporations must not be geared solely towards individual profit making or towards benefiting the territories they are registered in. Therefore, a geopolitical collaboration on the technological front between large data-rich corporations can positively influence both the economic landscape and health sectors [40][41][42].
Moreover, the market value of the sector is driving production of these devices, which is set to increase to 830 million devices in 2020 from the 325 million devices reportedly produced in 2019. The sheer number of these devices will mean a substantial increase in health data, which, when analyzed, has the potential to transform the health sector. Nevertheless, this analysis, especially through the use of AI and Machine Learning, is expected to encounter some obstacles, especially as these two technologies are still new and health data is highly guarded due to aspects of privacy, insurance, and civil and political sensitivities. In addition, the handling of these types of data attracts fears of security breaches and raises ethical and moral challenges associated with data storage and (mis)use [43,44], especially for data relating to personal genomes [45]. Allam and Jones [46] further support how this can be theoretically achieved across networks at an urban, regional and international scale.
On the above, there are some avenues provided by national laws that can be used to anonymise health data in a safe fashion. These include provisions to ensure that the privacy of a patient is not compromised, and instances where the data in question are only used for legitimate purposes, like research that does not lead to the disclosure of a patient's identity or cause stigmatisation. Anonymisation strategies such as k-anonymity (where every combination of identifying attributes in a dataset is shared by at least k individuals) [47] and unicity can lead to better AI processing due to the availability of larger datasets [48,49]. However, to ensure that de-identification (anonymisation) of data is truly achieved, there is a need for standardisation of datasets and protocols to allow the rich array of devices to communicate with each other and across systems. While discussion on this may be lengthy, given issues like proprietary pursuit by some device manufacturers and the forfeiture of profits by the corporations that may benefit from the handling of such data, the potential of processing data from unrestricted datasets is limitless [46]. This, as Ellahham et al. [50] hold, would greatly benefit the health sector in scaling issues like diagnosis, personalised medication and the discovery of cures for ailments that bedevil the global population.
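The k-anonymity property mentioned above can be checked mechanically. The sketch below groups records by their quasi-identifiers (attributes that could identify a person when combined, such as an age band and a partial postcode) and tests whether the smallest group contains at least k records. The function names, column names and sample records are hypothetical, and the check says nothing about other risks such as homogeneous sensitive values within a group.

```python
from collections import Counter


def smallest_group(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(record[q] for q in quasi_identifiers) for record in records)
    return min(groups.values()) if groups else 0


def is_k_anonymous(records: list[dict], quasi_identifiers: list[str], k: int) -> bool:
    """True if every quasi-identifier combination is shared by at least k records."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical, already generalised health records (age bands, masked postcodes).
records = [
    {"age_band": "30-39", "postcode": "60**", "diagnosis": "flu"},
    {"age_band": "30-39", "postcode": "60**", "diagnosis": "asthma"},
    {"age_band": "40-49", "postcode": "61**", "diagnosis": "flu"},
    {"age_band": "40-49", "postcode": "61**", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["age_band", "postcode"]

two_anonymous = is_k_anonymous(records, QUASI_IDENTIFIERS, 2)
three_anonymous = is_k_anonymous(records, QUASI_IDENTIFIERS, 3)
```

In this sample each combination of age band and postcode appears exactly twice, so the data is 2-anonymous but not 3-anonymous. Larger pooled datasets make higher values of k attainable with less generalisation, which is one way shared data can serve both privacy and analysis.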
Numerous authors share this belief and support that data processing through the use of AI will enable new insights that will not only aid in the early detection of pandemic outbreaks but will also be instrumental in safeguarding urban economies when such pandemics are experienced. China and Hong Kong are both very cognizant of the negative impacts of SARS upon their economies. Thus, it appears that China's rapid national-level response to the novel coronavirus (COVID-19), through the cessation of internal human movement, air flights, boats and ferries, etc., recognises the risk this virus poses to its economy generally, in the shadow of the major human movements that normally occur during Chinese New Year celebrations, but also the external risk to any corporation or government that has regular economic engagements with China and its citizens. Domestic and international tourism and transportation in particular had already, by the end of January 2020, been severely affected.
In addition, such insights will be instrumental in steering conversations that will eventually influence the formulation of global health policies with the potential to ensure outbreaks are addressed in a more efficient and comprehensive fashion, where information can be shared quicker with more process transparency.

Conclusions
The role of Artificial Intelligence (AI) in the early detection of the novel coronavirus (COVID-19) is documented in this paper through the work of two companies, BlueDot and Metabiota, which demonstrates how AI-driven algorithms can render more precise predictions and readings in the future through increased data sharing. The paper supports that increased data sharing must be enforced in the urban health sector while abiding by the requirements of privacy and security, due to the sensitive nature of information in this industry. On this, AI processes drawing from Smart data sources and from Smart Cities science and their associated technological concepts, coupled with wearable technologies, can and must be encouraged, as they will yield larger datasets and hence more accurate prediction and detection. For this to be realised, there is a need for the standardisation of protocols to encourage communication between devices and across systems without compromising data safety, while preventing data oversight. The technological revolution upon us will see an increasing use of computing processes, and as their accuracy increases, better management decisions may be rendered, as in the case of pandemics; this will lead to a prominent role for these processes in urban health policy.

Figure 2. Increase in AI-based research in healthcare.
