Statistical Analytics and Regional Representation Learning for COVID-19 Pandemic Understanding
The rapid spread of the novel coronavirus (COVID-19) has severely impacted almost all countries around the world. It not only has caused a tremendous burden on health-care providers to bear, but it has also brought severe impacts on the economy and social life. The presence of reliable data and the results of in-depth statistical analyses provide researchers and policymakers with invaluable information to understand this pandemic and its growth pattern more clearly. This paper combines and processes an extensive collection of publicly available datasets to provide a unified information source for representing geographical regions with regards to their pandemic-related behavior. The features are grouped into various categories to account for their impact based on the higher-level concepts associated with them. This work uses several correlation analysis techniques to observe value and order relationships between features, feature groups, and COVID-19 occurrences. Dimensionality reduction techniques and projection methodologies are used to elaborate on individual and group importance of these representative features. A specific RNN-based inference pipeline called DoubleWindowLSTM-CP is proposed in this work for predictive event modeling. It utilizes sequential patterns and enables concise record representation while using but a minimal amount of historical data. The quantitative results of our statistical analytics indicated critical patterns reflecting on many of the expected collective behavior and their associated outcomes. Predictive modeling with DoubleWindowLSTM-CP instance exhibits efficient performance in quantitative and qualitative assessments while reducing the need for extended and reliable historical information on the pandemic.
READ FULL TEXT