Prediction models of coronavirus disease utilizing machine learning algorithms range from forecasting future suspect cases, predicting mortality rates, to building a pattern for country-specific pandemic end date. To predict the future suspect infection and death cases, we categorized the approaches found in the literature into: first, a purely data-driven approach, whose goal is to build a mathematical model that relates the data variables including outputs with inputs to detect general patterns. The discovered patterns can then be used to predict the future infected cases without any expert input. The second approach is partially data-driven; it uses historical data, but allows expert input such as the SIR epidemic algorithm. This approach assumes that the epidemic will end according to medical reasoning. In this paper, we compare the purely data-driven and partially-data driven approaches by applying them to data from three countries having different past pattern behavior. The countries are the US, Jordan, and Italy. It is found that those two prediction approaches yield significantly different results. Purely data-driven approach depends totally on the past behavior and does not show any decline in the number of the infected cases if the country did not experience any decline in the number of cases. On the other hand, a partially data-driven approach guarantees a timely decline of the infected curve to reach zero. Using the two approaches highlights the importance of human intervention in pandemic prediction to guide the learning process as opposed to the purely data-driven approach that predicts future cases based on the pattern detected in the data.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited