RESEARCH ON CORONAVIRUS
CHAPTER ONE
INTRODUCTION
1.0. BACKGROUND OF THE STUDY
Coronaviruses are a large family of viruses that cause respiratory illness ranging from
the common cold to more severe diseases such as Middle East respiratory syndrome
(MERS) and severe acute respiratory syndrome (SARS). Coronaviruses are usually
transmitted between animals and humans. The common symptoms include cough, fever,
shortness of breath, and tiredness, while the less common symptoms are aches and pains,
sore throat, diarrhea, conjunctivitis, headache, loss of taste or smell, a rash on the skin, or
discolouration of fingers or toes, which may appear as few as two days after infection or
as long as 14 days after infection (Kandola, 2020).
COVID-19 is caused by SARS-CoV-2, a new strain of coronavirus that had not been
previously identified in humans. It was first identified in Wuhan, China. The World
Health Organization (WHO) and some other national health agencies confirm that the
coronavirus usually spreads from infected persons to others through:
1. The air, when an infected person coughs or sneezes.
2. Close personal contact, such as touching or shaking hands.
3. Touching surfaces with the virus on them, then touching one's eyes, nose, or mouth
before washing the hands.
According to the Nigeria Centre for Disease Control (NCDC), Nigeria recorded its first
case of coronavirus through an Italian national on February 27th, 2020, which sent a wave
of panic across the country due to the unpreparedness of her health sector to handle the
deadly situation. Upon detection of the index case, the NCDC instituted a multi-sectoral
National Emergency Operations Centre (EOC) to oversee the national response to
COVID-19. Subsequently, the Presidential Task Force (PTF), inaugurated on March 9,
2020, announced the guidelines, the risks, and a travel ban restricting entry from 13
COVID-19 high-risk countries. The Port Health Service and the NCDC monitored the
isolation of travellers returning from the listed affected nations; the isolation period for
each returnee lasted 14 days, which most of them did not comply with, leading to the
large outbreak in the country (Jimoh Amzat et al., 2020).
Furthermore, the NCDC disclosed that in the first 30 days most of the infected were
returnees, and 70% of the individuals who tested positive were males while 30% were
females, with Lagos State recording the highest number of cases.
Thus, in this study, we attempt to address the relationship between factors (such as WHO
region, cumulative total cases, cumulative total cases per 100,000 population, newly
reported cases in the last 7 days, newly reported cases in the last 7 days per 100,000
population, newly reported cases in the last 24 hours, cumulative total deaths per 100,000
population, newly reported deaths in the last 7 days, newly reported deaths in the last 7
days per 100,000 population, and newly reported deaths in the last 24 hours) that
contribute to the cumulative total deaths of COVID-19 patients in Africa.
1.1. STATEMENT OF THE PROBLEM
This study observed that the whole world was concerned about the increasing number of
deaths of coronavirus patients in Africa, yet little focus was placed on the factors that
might influence this increase. These factors might directly or inversely dictate the extent
of death among COVID-19 patients. This pertinent reason prompted this study using the
available data.
1.2. AIM AND OBJECTIVES
The aim of this study is to investigate some determinants that contribute to COVID-19
cumulative deaths. The specific objectives are to:
i. Estimate a model on the basis of the statistical variable(s) that contribute to the
cumulative deaths of COVID-19 patients.
ii. Evaluate the performance of the model using some criteria.
iii. Identify the variables that contribute significantly to cumulative deaths.
iv. Examine violations of the assumptions of multiple linear regression on the
COVID-19 data.
1.3. SIGNIFICANCE OF THE STUDY
Knowing the statistical variables that cause a rapid increase in the deaths of COVID-19
patients matters because countries and continents that have adopted methods to control
the factors driving these variables have significantly enhanced their strategic plans to
battle the virus humanely.
1.4. SCOPE OF THE STUDY
The scope of this study is limited to the relationship between some determinants that are
significantly associated with COVID-19 cumulative deaths.
1.5. SOURCE OF DATA
The data used for this project is secondary data extracted from the World Health
Organization (WHO) COVID-19 dashboard (https://covid19.who.int/table), released
May 25th, 2021.
1.6. ORGANIZATION OF THE STUDY
The project is organized as follows: Chapter One presents the Background of the Study,
which includes the Introduction of the Subject of Study, Statement of the Problem, Aim
and Objectives, Significance of the Study, and Scope of the Study. Chapter Two covers
the Literature Review, a detailed explanation of the coronavirus, and the Methodology
used in the analysis of data, while Chapter Three comprises Data Presentation, Data
Analysis, and Interpretation of Results. Chapter Four presents the Summary, Conclusion,
and Recommendations of the Study.
CHAPTER TWO
LITERATURE REVIEW AND METHODOLOGY
2.0. INTRODUCTION
This chapter comprises two sections: the first presents an empirical review of literature
related to COVID-19 in Nigeria, and the second describes the methodology used to meet
the aim and objectives of the study.
2.1. EMPIRICAL REVIEW OF RELATED LITERATURE
Since the advent of the study of corona virus, several works have been done by many
authors. Some of which are briefly summarized below:
Ajao et al. (2020), in their study on vector autoregressive (VAR) models for modelling
and forecasting COVID-19 variables, with special focus on Nigerian cases from 1st
March to 10th June 2020, used a time-series approach. At lag of order 2, the hypothesis
of non-stationarity (i.e., that the selected COVID-19 variables either increase or decrease)
is rejected at the 5% level for all the multivariate variables using the augmented
Dickey-Fuller and Phillips-Perron unit root tests. The Granger causality test results
indicate a bivariate causal relationship among the variables, rejecting the null hypothesis
of no Granger causality. The determinants of confirmed cases, new cases, and total deaths
from COVID-19 are generally significant at the 5% level, with a p-value of 0.0001 in
each of the three derived models. The AIC and log-likelihood criteria applied to the
models confirmed that the VAR model of order 2 gives a better model for predictions
and forecasts of COVID-19 cases in Nigeria. They also recommend a suitable model for
handling multivariate time series data and suggest a reliable approach for forecasting
future cases of COVID-19 variables in the country that will help health policy makers
find solutions to the unceasing upward trend in the cases of the pandemic.
2.2. REGRESSION ANALYSIS
Regression analysis is a statistical process for estimating the relationships among
variables. It includes many techniques for modeling and analyzing several variables,
when the focus is on the relationship between a dependent variable and one or more
independent variables (or 'predictors'). More specifically, regression analysis helps one
understand how the typical value of the dependent variable (or 'criterion variable')
changes when any one of the independent variables is varied, while the other independent
variables are held fixed. Most commonly, regression analysis estimates the conditional
expectation of the dependent variable given the independent variables – that is, the
average value of the dependent variable when the independent variables are fixed. Less
commonly, the focus is on a quantile, or other location parameter of the conditional
distribution of the dependent variable given the independent variables. In all cases, the
estimation target is a function of the independent variables called the regression
function.
Regression analysis is widely used for prediction and forecasting, where its use has
substantial overlap with the field of machine learning. Regression analysis is also used to
understand which among the independent variables are related to the dependent variable,
and to explore the forms of these relationships. In restricted circumstances, regression
analysis can be used to infer causal relationships between the independent and dependent
variables. However, this can lead to illusory or false relationships, so caution is advisable
[Wikipedia, 2020].
The performance of regression analysis methods in practice depends on the form of the
data generating process, and how it relates to the regression approach being used. Since
the true form of the data-generating process is generally not known, regression analysis
often depends to some extent on making assumptions about this process.
These assumptions are sometimes testable if a sufficient quantity of data is available.
Regression models for prediction are often useful even when the assumptions are
moderately violated, although they may not perform optimally. However, in many
applications, especially with small effects or questions of causality based on observational
data, regression methods can give misleading results.
Regression models involve the following:
i. The unknown parameters, denoted as β, which may represent a scalar or a vector.
ii. The independent variables, X.
iii. The dependent variable, Y.
A regression model relates Y to a function of X and β.
Y = β0 + Σ_{i=1}^{k} βiXi        (2.0)
The approximation is usually formalized as E(Y | X) = f(X, β). To carry out regression
analysis, the form of the function f must be specified. Sometimes the form of this function
is based on knowledge about the relationship between Y and X that does not rely on the
data. If no such knowledge is available, a flexible or convenient form for f is chosen.
Assume now that the vector of unknown parameters β is of length k. In order to perform
a regression analysis, the user must provide information about the dependent variable Y:
i. If N data points of the form (Y, X) are observed, where N < k, most classical
approaches to regression analysis cannot be performed, since the system of
equations defining the regression model is underdetermined and there are not
enough data to recover β.
ii. If exactly N = k data points are observed and the function f is linear, the equations
Y = f(X, β) can be solved exactly rather than approximately, provided the
observations are linearly independent.
iii. In the most common case, N > k data points are observed. In this case, there is
enough information in the data to estimate a unique value for β that best fits the
data in some sense, and the regression model, when applied to the data, can be
viewed as an overdetermined system in β.
In the last case, the regression analysis provides the tools for:
1. Finding a solution for unknown parameters β that will, for example, minimize the
distance between the measured and predicted values of the dependent variable Y
(also known as method of least squares).
2. Under certain statistical assumptions, the regression analysis uses the surplus of
information to provide statistical information about the unknown parameters β and
predicted values of the dependent variable Y.
In linear regression, the model specification is that the dependent variable is a linear
combination of the parameters (but need not be linear in the independent variables). For
example, in simple linear regression for modelling n data points there is one independent
variable, X1, and two parameters, β0 and β1.
The regression model is given by the straight-line equation

Y = β0 + β1X1 + ε        (2.1)
In multiple linear regression, there are several independent variables or functions of
independent variables. Adding a quadratic term to the preceding regression gives the
parabola:

Y = β0 + β1X1 + β2X2² + ε        (2.2)

This is still linear regression: although the expression on the right-hand side is quadratic
in the independent variable X2, it is linear in the parameters β0, β1, and β2.
Returning to the straight-line case: given a random sample from the population, we
estimate the population parameters and obtain the sample linear regression model. The
residual, ei = Yi − Ŷi, is the difference between the true value of the dependent variable,
Yi, and the value predicted by the model, Ŷi. One method of estimation is ordinary least
squares. This method obtains parameter estimates that minimize the sum of squared
residuals, SSE (also sometimes denoted RSS):

SSE = Σ_{i=1}^{n} ei²        (2.3)

Minimization of this function results in a set of normal equations, a set of simultaneous
linear equations in the parameters, which are solved to yield the parameter estimators
β̂0 and β̂1.
In the case of simple regression, the formulas for the least squares estimates are

β̂1 = [n ΣXiYi − (ΣXi)(ΣYi)] / [n ΣXi² − (ΣXi)²]        (2.4)

and β̂0 = Ȳ − β̂1X̄        (2.5)

where X̄ is the mean (average) of the X values and Ȳ is the mean of the Y values.
Under the assumption that the population error term has a constant variance, the estimate
of that variance is given by:

σ̂e² = SSE / (n − 2)        (2.6)

This is called the mean square error (MSE) of the regression. The denominator is the
sample size reduced by the number of model parameters estimated from the same data,
(n − p) for p regressors or (n − p − 1) if an intercept is used. In this case p = 1, so the
denominator is n − 2.

The standard errors of the parameter estimates are given by:

σ̂β0 = σ̂e √(1/n + X̄² / Σ(Xi − X̄)²)        (2.7)

σ̂β1 = σ̂e √(1 / Σ(Xi − X̄)²)        (2.8)
Under the further assumption that the population error term is normally distributed, the
researcher can use these estimated standard errors to create confidence intervals and
conduct hypothesis tests about the population parameters.
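As an illustration, the estimates in equations (2.4)–(2.8) can be computed directly in a few lines of Python; the data below are illustrative, not the study's, and all variable names are our own:

```python
import numpy as np

# Illustrative data (not from this study)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Slope and intercept, equations (2.4) and (2.5)
b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b0 = y.mean() - b1 * x.mean()

# Residuals, SSE (2.3) and MSE (2.6)
resid = y - (b0 + b1 * x)
sse = np.sum(resid**2)
mse = sse / (n - 2)

# Standard errors of the estimates, equations (2.7) and (2.8)
sxx = np.sum((x - x.mean())**2)
se_b0 = np.sqrt(mse) * np.sqrt(1.0 / n + x.mean()**2 / sxx)
se_b1 = np.sqrt(mse) / np.sqrt(sxx)
```

The same estimates can be cross-checked against a library routine such as `np.polyfit(x, y, 1)`.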
2.3 MULTIPLE REGRESSION
This is a statistical tool that examines how multiple independent variables are related to
a dependent variable. Once one has identified how these multiple variables relate to the
dependent variable, one can take the information about all of the independent variables
and use it to make much more powerful and accurate predictions about why things are
the way they are. This latter process is called "multiple regression".
2.4 ESTIMATION OF MODEL PARAMETERS
Recall the regression equation model

Y = β0 + β1X1 + β2X2 + … + βkXk + e        (2.9)

where:
Y = explained (response) variable,
X1, X2, …, Xk = (k) explanatory variables,
β0, β1, …, βk = unknown parameters to be estimated (known as regression coefficients),
and e is the error term.

The above model can be written in matrix notation as

Y = Xβ + e        (2.9.1)
where:
Y is an n × 1 vector of the response variable,
X is an n × k matrix of explanatory variables,
β is a k × 1 vector of unknown parameters, and
e is an n × 1 vector of error terms.

Making e the subject of equation (2.9.1), we have

e = Y − Xβ        (2.9.2)

We wish to find the vector of least squares estimators β̂ that minimizes

Q = e′e = (Y − Xβ)′(Y − Xβ)        (2.9.3)

Q = e′e = Y′Y − 2β′X′Y + β′X′Xβ        (2.9.4)

∂Q/∂β = −2X′Y + 2X′Xβ̂ = 0        (2.9.5)

This simplifies to

X′Xβ̂ = X′Y        (2.9.6)

This is the set of least squares normal equations in matrix form. To solve the equations,
multiply both sides by the inverse of X′X. Thus, the least squares estimator of β for OLS
is

β̂ = (X′X)⁻¹X′Y        (2.9.7)
2.5. PROPERTIES OF THE LEAST SQUARES ESTIMATOR
i. The least squares estimator is unbiased, E(β̂) = β, and has variance-covariance
matrix V(β̂) = σ²(X′X)⁻¹.

Proof:
Recall β̂ = (X′X)⁻¹X′Y. Thus,

E(β̂) = E((X′X)⁻¹X′Y)
= (X′X)⁻¹X′E(Y), where E(Y) = Xβ
= (X′X)⁻¹(X′X)β
= Iβ = β        (2.9.8)

Below is the proof for the variance-covariance matrix of the model parameters.
Substituting Y = Xβ + ε into β̂ = (X′X)⁻¹X′Y gives β̂ − β = (X′X)⁻¹X′ε, so

V(β̂) = E((β̂ − β)(β̂ − β)′)
= E((X′X)⁻¹X′εε′X(X′X)⁻¹)
= (X′X)⁻¹X′E(εε′)X(X′X)⁻¹, where E(εε′) = σ²I
= σ²(X′X)⁻¹        (2.9.9)
2.6. ASSUMPTIONS OF THE CLASSICAL LINEAR REGRESSION MODEL
Standard linear regression models with standard estimation techniques make a number of
assumptions about the predictor variables, the response variable and their relationship.
The following are the major assumptions made by standard linear regression models with
standard estimation techniques.
1. The regression model is linear in parameters.
2. The expected value of the residual given any value of the explanatory variable is
zero, i.e., E(ei | xi) = 0.
3. The variance of the residual terms given any value of the explanatory variable is
constant, i.e., Var(ei | xi) = σ²; this is known as homoscedasticity.
4. The values of the explanatory variables are fixed in repeated sampling.
5. There is no correlation between the residual terms given any values of the
explanatory variable, i.e., Cor(ei, ej | xi, xj) = 0 for i ≠ j.
6. There should be no specification error.
7. There is no linear relationship between the residual term and the explanatory
variable, i.e., Cor(ei, xj) = 0.
8. The explanatory variables must not be collinear.
9. The residual term is normally distributed with mean zero and variance σ², i.e.,
e ~ N(0, σ²) [Gujarati, 2004].
2.7. Q-Q PLOT FOR THE NORMALITY ASSUMPTION
A Q–Q plot is a plot of the quantiles of two distributions against each other, or a plot
based on estimates of the quantiles.
The main step in constructing a Q–Q plot is calculating or estimating the quantiles to be
plotted. If one or both of the axes in a Q–Q plot is based on a theoretical distribution
with a continuous cumulative distribution function (CDF), all quantiles are uniquely
defined and can be obtained by inverting the CDF. If a theoretical probability distribution
with a discontinuous CDF is one of the two distributions being compared, some of the
quantiles may not be defined, so an interpolated quantile may be plotted. If the Q–Q plot
is based on data, there are multiple quantile estimators in use. Rules for forming Q–Q
plots when quantiles must be estimated or interpolated are called plotting positions.
A simple case is where one has two data sets of the same size. In that case, to make the
Q–Q plot, one orders each set in increasing order, then pairs off and plots the
corresponding values. A more complicated construction is the case where two data sets
of different sizes are being compared. To construct the Q–Q plot in this case, it is
necessary to use an interpolated quantile estimate so that quantiles corresponding to the
same underlying probability can be constructed.
More abstractly, given two cumulative probability distribution functions F and G, with
associated quantile functions F⁻¹ and G⁻¹ (the inverse function of the CDF is the quantile
function), the Q–Q plot draws the q-th quantile of F against the q-th quantile of G for a
range of values of q [Gibbons et al., 2003]. Thus, the Q–Q plot is a parametric curve
indexed over [0, 1] with values in the real plane R².
Interpretation of Q-Q Plots
The use of Q–Q plots of interest in this study is to compare the distribution of a sample
to a theoretical distribution, such as the standard normal distribution N(0, 1), as in a
normal probability plot. As in the case of comparing two samples of data, one orders the
data (formally, computes the order statistics), then plots them against certain quantiles of
the theoretical distribution [Thode, 2002].
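As a sketch of how such a normal probability plot can be produced, `scipy.stats.probplot` pairs the sample order statistics with the corresponding theoretical normal quantiles; the residuals below are simulated stand-ins, not the study's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
residuals = rng.normal(size=200)   # stand-in for regression residuals

# probplot returns the theoretical quantiles (osm), the ordered sample
# values (osr), and a least-squares line fitted through the points
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
# For approximately normal data the points fall close to a straight
# line, so the correlation coefficient r of the fit is near 1
```

Passing `plot=plt` (with matplotlib) would draw the plot itself; here only the fitted-line diagnostics are inspected.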
2.8. SHAPIRO-WILK TEST FOR NORMALITY
Theory:
The Shapiro–Wilk test is a test of normality in frequentist statistics. It was published in
1965 by Samuel Sanford Shapiro and Martin Wilk. The test statistic is

W = (Σ_{i=1}^{n} ai x(i))² / Σ_{i=1}^{n} (xi − x̄)²        (2.9.9.1)

where:
i. x(i) (with parentheses enclosing the subscript index i, not to be confused with xi)
is the ith order statistic, i.e., the ith-smallest number in the sample;
ii. x̄ = (x1 + x2 + … + xn)/n is the sample mean;
iii. the constants ai are given by

(a1, …, an) = m′V⁻¹ / (m′V⁻¹V⁻¹m)^{1/2}        (2.9.9.2)

where

m = (m1, …, mn)′        (2.9.9.3)

and m1, …, mn are the expected values of the order statistics of independent and
identically distributed random variables sampled from the standard normal distribution,
and V is the covariance matrix of those order statistics [Shapiro et al., 1965].
Test of Hypothesis:
Under the following hypotheses:
H0: the error term is normally distributed
H1: the error term is not normally distributed
If the p-value is less than the chosen alpha level, then the null hypothesis is rejected and
there is evidence that the data tested are not from a normally distributed population; in
other words, the data are not normal. On the contrary, if the p-value is greater than the
chosen alpha level, then the null hypothesis that the data came from a normally distributed
population cannot be rejected (e.g., for an alpha level of 0.05, a data set with a p-value of
0.02 rejects the null hypothesis that the data are from a normally distributed population)
[JMP, 2004]. As with most statistical tests, in large samples the test may flag even trivial
departures from normality as statistically significant. Thus, a Q–Q plot is useful for
verification in addition to the test [Wikipedia, 2020].
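The test is available as `scipy.stats.shapiro`; the sketch below applies it to a simulated normal sample and a simulated skewed sample (both are illustrative, not the study's residuals):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
normal_sample = rng.normal(size=100)
skewed_sample = rng.exponential(size=100)   # clearly non-normal

# Each call returns the W statistic and the p-value for H0: normality
w1, p1 = stats.shapiro(normal_sample)
w2, p2 = stats.shapiro(skewed_sample)

# At alpha = 0.05, the exponential sample is expected to reject H0
# (small p-value, W well below 1), while the normal sample typically
# does not; W is always between 0 and 1
```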
2.9.0 VARIANCE INFLATION FACTOR
The variance inflation factor is the ratio of the variance in a model with multiple terms
to the variance of a model with one term. It quantifies the severity of multicollinearity
in an ordinary least squares regression analysis [Gujarati, 2004].
Procedure
1. Step one:
First we run an ordinary least squares regression that has Xi as a function of all the
other explanatory variables in the first equation. If i = 1, for example, the equation
would be

X1 = c0 + α2X2 + … + αkXk + e

where c0 is the constant and e is the error term.
2. Step two:
We calculate the VIF for β̂i with the following formula:

VIF = 1 / (1 − Ri²)

where Ri² is the coefficient of determination of the regression equation in step one, with
Xi on the left-hand side and all the other predictor variables on the right-hand side.
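The two-step procedure above can be sketched as follows, using simulated predictors; the function and data are illustrative, not the study's:

```python
import numpy as np

def vif(X):
    """VIF for each column of X via the auxiliary-regression definition."""
    n, k = X.shape
    out = []
    for i in range(k):
        xi = X[:, i]
        # Step one: regress X_i on a constant and all other columns
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xi, rcond=None)
        resid = xi - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((xi - xi.mean())**2)
        # Step two: VIF_i = 1 / (1 - R_i^2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)                 # independent of x1: VIF near 1
x3 = x1 + 0.05 * rng.normal(size=100)     # nearly collinear with x1: large VIF
X = np.column_stack([x1, x2, x3])
```

Calling `vif(X)` flags the collinear pair (first and third columns) with large values, while the independent column stays near 1.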
2.9.1 DURBIN-WATSON TEST
The Durbin–Watson statistic is a test statistic used to detect the presence of autocorrelation
at lag 1 in the residuals (prediction errors) from a regression analysis. [Durbin and Watson;
1950, 1951] applied this statistic to the residuals from least squares regressions, and
developed bounds tests for the null hypothesis that the errors are serially uncorrelated
against the alternative that they follow a first order autoregressive process. Later, John
Denis Sargan and Alok Bhargava developed several von Neumann–Durbin–Watson type
test statistics for the null hypothesis that the errors on a regression model follow a process
with a unit root against the alternative hypothesis that the errors follow a stationary first
order autoregression [Sargan and Bhargava, 1983]. Note that the distribution of this test
statistic does not depend on the estimated regression coefficients and the variance of the
errors.
Procedure and Interpretation:
If et is the residual associated with the observation at time t, then the test statistic is

d = Σ_{t=2}^{T} (et − et−1)² / Σ_{t=1}^{T} et²

where T is the number of observations. If one has a lengthy sample, then this can be
linearly mapped to the Pearson correlation of the time-series data with its lags. Since d
is approximately equal to 2(1 − r), where r is the sample autocorrelation of the residuals,
d = 2 indicates no autocorrelation. The value of d always lies between 0 and 4. If the
Durbin–Watson statistic is substantially less than 2, there is evidence of positive serial
correlation. As a rough rule of thumb, if Durbin–Watson is less than 1.0, there may be
cause for alarm. Small values of d indicate that successive error terms are positively
correlated. If d > 2, successive error terms are negatively correlated. In regression, this
can imply an underestimation of the level of statistical significance [Durbin and Watson;
1950, 1951].
To test for positive autocorrelation at significance α, the test statistic d is compared to
lower and upper critical values (dL,α and dU,α):
i. If d < dL,α, there is statistical evidence that the error terms are positively
autocorrelated.
ii. If d > dU,α, there is no statistical evidence that the error terms are positively
autocorrelated.
iii. If dL,α < d < dU,α, the test is inconclusive.
Positive serial correlation is serial correlation in which a positive error for one observation
increases the chances of a positive error for another observation.
To test for negative autocorrelation at significance α, the test statistic (4 − d) is compared
to lower and upper critical values (dL,α and dU,α):
i. If (4 − d) < dL,α, there is statistical evidence that the error terms are negatively
autocorrelated.
ii. If (4 − d) > dU,α, there is no statistical evidence that the error terms are negatively
autocorrelated.
iii. If dL,α < (4 − d) < dU,α, the test is inconclusive.
Negative serial correlation implies that a positive error for one observation increases the
chance of a negative error for another observation and a negative error for one observation
increases the chances of a positive error for another [Durbin and Watson; 1950, 1951].
The critical values, dL,α and dU,α, vary by level of significance (α), the number of
observations, and the number of predictors in the regression equation.
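The statistic d is straightforward to compute from a residual series; the sketch below contrasts uncorrelated residuals with positively autocorrelated ones (both simulated, not the study's):

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2."""
    return np.sum(np.diff(e)**2) / np.sum(e**2)

rng = np.random.default_rng(4)
white = rng.normal(size=500)        # uncorrelated residuals: d near 2

ar1 = np.zeros(500)                 # AR(1) residuals with rho = 0.8:
for t in range(1, 500):             # positive serial correlation, d well below 2
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()
```

Since d is approximately 2(1 − r), the AR(1) series with r near 0.8 yields d around 0.4, signalling positive autocorrelation.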
2.9.2 F-TEST AND T-TEST
These two measures are used to check for model adequacy. Under the hypotheses

H0: the model is not adequate
H1: the model is adequate

the test statistic

F = MS_Regression / MS_Error = [SS_Regression / (k − 1)] / [SS_Error / (n − k)] ~ F(k−1),(n−k)

is compared with the tabulated value F_TAB = F(k−1),(n−k). If the F-statistic is greater
than the F-value from the table, then the null hypothesis of model inadequacy is rejected;
otherwise we do not reject the null hypothesis [Abiodun, 2017].
In the case where we reject the null hypothesis, i.e., the model is adequate, we can then
use the t-test to check which of the variables contributes to the model, i.e., to the rejection
of the null hypothesis. Thus, we set up the individual hypotheses

H0: βi = 0 vs H1: βi ≠ 0

and the test statistic

t = β̂i / s.e.(β̂i) ~ t(n−k)

is compared with the tabulated value t_TAB = t(n−k),α/2. If |t| is greater than the t-value
from the table, then we reject the null hypothesis, i.e., the variable contributes to the
model; otherwise we do not reject the null hypothesis [Abiodun, 2017].
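The F and t statistics above can be computed from the sums of squares of a fitted model; the following sketch uses simulated data, and all names and figures are ours, not the study's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 60, 3                               # n observations, k parameters
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)   # OLS fit
resid = y - X @ beta
ss_err = resid @ resid                     # SS_Error
ss_reg = np.sum((X @ beta - y.mean())**2)  # SS_Regression

# F = [SS_Reg / (k-1)] / [SS_Err / (n-k)], compared with F_{k-1, n-k}
F = (ss_reg / (k - 1)) / (ss_err / (n - k))
F_crit = stats.f.ppf(0.95, k - 1, n - k)

# t_i = beta_i / s.e.(beta_i), with df = n - k, compared with t_{n-k, alpha/2}
mse = ss_err / (n - k)
se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
t = beta / se
t_crit = stats.t.ppf(0.975, n - k)
```

Because the simulated slope on the first regressor is large relative to the noise, both the overall F-test and that coefficient's t-test come out significant.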
CHAPTER THREE
3.0 DATA PRESENTATION AND DATA ANALYSIS
3.1 Introduction
This section comprises the presentation and analysis of data. The data used for this
project is secondary data extracted from the World Health Organization (WHO)
COVID-19 dashboard (https://covid19.who.int/table), released May 25th, 2021. The
variables present are used to carry out a multiple regression analysis to determine which
independent variables explain or have a significant impact on the dependent variable.
Table 3.0: Presentation of data

Name | Cases - cumulative total | Cases - cumulative total per 100000 population | Cases - newly reported in last 7 days | Cases - newly reported in last 7 days per 100000 population | Cases - newly reported in last 24 hours | Deaths - cumulative total | Deaths - cumulative total per 100000 population | Deaths - newly reported in last 7 days | Deaths - newly reported in last 24 hours
South Africa | - | 2757.55 | 21737 | 36.65 | 2893 | 55802 | 94.09 | 592 | 30
Tunisia | 335345 | 2837.43 | 8773 | 74.23 | 1246 | 12236 | 103.53 | 387 | 54
Ethiopia | 269194 | 234.16 | 2930 | 2.55 | 293 | 4076 | 3.55 | 80 | 8
Egypt | 253835 | 248.04 | 8114 | 7.93 | 1145 | 14721 | 14.39 | 394 | 51
Libya | 183311 | 2667.78 | 1901 | 27.67 | 412 | 3111 | 45.28 | 23 | 6
Kenya | 168432 | 313.24 | 2967 | 5.52 | 324 | 3059 | 5.69 | 56 | 10
Nigeria | 166019 | 80.54 | 310 | 0.15 | 40 | 2067 | 1 | 1 | 0
Algeria | 126860 | 289.3 | 1549 | 3.53 | 209 | 3418 | 7.79 | 44 | 7
Ghana | 93620 | 301.29 | 287 | 0.92 | 37 | 783 | 2.52 | 0 | 0
Zambia | 93201 | 506.97 | 765 | 4.16 | 95 | 1268 | 6.9 | 8 | 1
Cameroon | 76756 | 289.14 | 0 | 0 | 0 | 1230 | 4.63 | 0 | 0
Mozambique | 70590 | 225.85 | 148 | 0.47 | 22 | 831 | 2.66 | 5 | 0
Botswana | 54151 | 2302.7 | 1989 | 84.58 | 0 | 784 | 33.34 | 23 | 0
Namibia | 52946 | 2083.75 | 1728 | 68.01 | 234 | 765 | 30.11 | 47 | 2
Côte d’Ivoire | 46942 | 177.96 | 286 | 1.08 | 0 | 298 | 1.13 | 0 | 0
Uganda | 43734 | 95.61 | 955 | 2.09 | 227 | 356 | 0.78 | 9 | 6
Senegal | 41062 | 245.24 | 212 | 1.27 | 39 | 1130 | 6.75 | 5 | 1
Madagascar | 40876 | 147.61 | 735 | 2.65 | 96 | 800 | 2.89 | 37 | 7
Zimbabwe | 38682 | 260.26 | 122 | 0.82 | 3 | 1586 | 10.67 | 4 | 0
Sudan | 34889 | 79.57 | 0 | 0 | 0 | 2446 | 5.58 | 0 | 0
Malawi | 34284 | 179.22 | 70 | 0.37 | 10 | 1153 | 6.03 | 0 | 0
Angola | 32441 | 98.71 | 1804 | 5.49 | 292 | 725 | 2.21 | 66 | 10
Cabo Verde | 29334 | 5276.02 | 1166 | 209.72 | 136 | 256 | 46.04 | 7 | 0
Rwanda | 26688 | 206.05 | 712 | 5.5 | 264 | 349 | 2.69 | 5 | 1
Gabon | 24107 | 1083.1 | 308 | 13.84 | 0 | 147 | 6.6 | 4 | 0
Réunion | 23566 | 2632.16 | 922 | 102.98 | 0 | 176 | 19.66 | 7 | 0
Guinea | 22988 | 175.04 | 254 | 1.93 | 0 | 158 | 1.2 | 7 | 0
Mayotte | 20176 | 7395.49 | 0 | 0 | 0 | 171 | 62.68 | 0 | 0
Mauritania | 19149 | 411.84 | 321 | 6.9 | 35 | 458 | 9.85 | 1 | 0
Eswatini | 18551 | 1599 | 31 | 2.67 | 1 | 672 | 57.92 | 0 | 0
Mali | 14241 | 70.32 | 51 | 0.25 | 5 | 514 | 2.54 | 3 | 2
Burkina Faso | 13415 | 64.18 | 18 | 0.09 | 1 | 165 | 0.79 | 1 | 0
Togo | 13374 | 161.55 | 99 | 1.2 | 22 | 125 | 1.51 | 0 | 0
Congo | 11476 | 207.97 | 133 | 2.41 | 0 | 150 | 2.72 | 2 | 0
Lesotho | 10822 | 505.17 | 32 | 1.49 | 16 | 326 | 15.22 | 6 | 6
South Sudan | 10670 | 95.32 | 18 | 0.16 | 0 | 115 | 1.03 | 0 | 0
Seychelles | 10669 | - | 928 | 943.6 | 236 | 38 | 38.64 | 8 | 0
Equatorial Guinea | 8436 | 601.29 | 742 | 52.89 | 0 | 113 | 8.05 | 1 | 0
Benin | 8025 | 66.2 | 41 | 0.34 | 0 | 101 | 0.83 | 0 | 0
Central African Republic | 7079 | 146.57 | 213 | 4.41 | 69 | 97 | 2.01 | 2 | 1
Gambia | 5978 | 247.37 | 32 | 1.32 | 0 | 178 | 7.37 | 3 | 0
Niger | 5383 | 22.24 | 50 | 0.21 | 19 | 212 | 0.88 | 20 | 20
Chad | 4924 | 29.98 | 20 | 0.12 | 1 | 173 | 1.05 | 0 | 0
Burundi | 4546 | 38.23 | 199 | 1.67 | 52 | 6 | 0.05 | 0 | 0
Sierra Leone | 4121 | 51.66 | 16 | 0.2 | 4 | 79 | 0.99 | 0 | 0
Comoros | 3942 | 453.31 | 9 | 1.03 | 2 | 146 | 16.79 | 0 | 0
Eritrea | 3932 | 110.87 | 88 | 2.48 | 0 | 14 | 0.39 | 2 | 0
Guinea-Bissau | 3751 | 190.6 | 5 | 0.25 | 2 | 68 | 3.46 | 1 | 0
Sao Tome and Principe | 2336 | 1065.89 | 9 | 4.11 | 2 | 37 | 16.88 | 2 | 1
Liberia | 2142 | 42.35 | 13 | 0.26 | 0 | 85 | 1.68 | 0 | 0
Mauritius | 1322 | 103.95 | 34 | 2.67 | 0 | 17 | 1.34 | 0 | 0
United Republic of Tanzania | 509 | 0.85 | 0 | 0 | 0 | 21 | 0.04 | 0 | 0
Saint Helena | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
3.2 PRESENTATION AND INTERPRETATION OF RESULTS
This section comprises the analysis and presentation of the results of this study using the
methodology discussed in the previous chapter.
Model specification:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8 + β9X9 + ε

where:
Y = Deaths (cumulative total)
X1 = Cases (cumulative total)
X2 = Cases (cumulative total per 100000 population)
X3 = Cases (newly reported in last 7 days)
X4 = Cases (newly reported in last 7 days per 100000 population)
X5 = Cases (newly reported in last 24 hours)
X6 = Deaths (cumulative total per 100000 population)
X7 = Deaths (newly reported in last 7 days)
X8 = Deaths (newly reported in last 7 days per 100000 population)
X9 = Deaths (newly reported in last 24 hours)
ε = random error term
Fitting the regression model using OLS:

Table 3.1: Estimates of the regression coefficients

Variable | Unstandardized B | Std. Error | Standardized Beta
(Constant) | -168.008 | - |
Cases - cumulative total (X1) | 0.022 | - | 0.634
Cases - cumulative total per 100000 population (X2) | -0.205 | - | -0.050
Cases - newly reported in last 7 days (X3) | -0.718 | 0.426 | -0.301
Cases - newly reported in last 7 days per 100000 population (X4) | 12.950 | 5.001 | 0.216
Cases - newly reported in last 24 hours (X5) | 3.417 | 2.200 | 0.194
Deaths - cumulative total per 100000 population (X6) | 15.843 | 15.404 | -0.045
Deaths - newly reported in last 7 days (X7) | 49.998 | - | 0.682
Deaths - newly reported in last 7 days per 100000 population (X8) | - | - | -0.215
Deaths - newly reported in last 24 hours (X9) | -162.691 | 40.723 | -0.226

The fitted regression model using OLS:

Ŷ = −168.008 + 0.022X1 − 0.205X2 − 0.718X3 + 12.950X4 + 3.417X5 + 15.843X6 + 49.998X7 − β̂8X8 − 162.691X9        (3.2)
3.2.1 TEST FOR MODEL ADEQUACY
The following tests are used to check whether the ordinary least squares regression model
applied is adequate, and which variable(s) contribute to the adequacy of the model.
Table 3.2: ANOVA table

Model | Sum of Squares | df | Mean Square | F | Sig.
Regression | - | 9 | - | - | 0.000
Residual | - | 43 | - | |
Total | - | 52 | | |
Test for model significance:
Hypothesis:
H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = β8 = β9 = 0
H1: H0 is not true
Decision Rule: Reject the null hypothesis if p-value < α (0.05); otherwise do not reject
the null hypothesis.
Decision: Since the p-value = 0.000 < 0.05, we reject the null hypothesis.
Interpretation:
Based on the available data, we conclude that at least one of the regressors is significant, i.e., it has an impact on the dependent variable at the α = 0.05 level of significance. Since at least one independent variable contributes to the response variable, the regression model is said to be adequate.
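The overall F-test above can be sketched on synthetic data with the same degrees of freedom as Table 3.2 (9 regressors, 43 residual df, 52 total). The data and coefficients below are invented; only the mechanics of the test mirror the project's:

```python
import numpy as np
from scipy import stats

# Overall F-test sketch: n = 53 observations and k = 9 regressors, matching
# the ANOVA degrees of freedom (9, 43) in Table 3.2. Data are synthetic.
rng = np.random.default_rng(1)
n, k = 53, 9
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.zeros(k + 1)
beta[1] = 3.0                                  # one genuinely non-zero slope
y = X @ beta + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = ((y - X @ beta_hat) ** 2).sum()          # residual SS, df = n - k - 1 = 43
ssr = ((X @ beta_hat - y.mean()) ** 2).sum()   # regression SS, df = k = 9
F = (ssr / k) / (sse / (n - k - 1))
p_value = stats.f.sf(F, k, n - k - 1)          # upper-tail F probability
```

A p-value below 0.05 rejects H0 that all slopes are zero, exactly the decision reported in the table.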
Table 3.2.1: Summary table

R | R Square | Adjusted R Square
0.995 | 0.989 | 0.987

From the R² value above, 98.9% of the variation in the dependent variable (cumulative deaths) is explained by the independent variables. Thus, the model is adequate.
Now, to determine which variable(s) actually contribute to the dependent variable, we test the significance of each parameter individually.
Individual t-test for the parameters:
Hypothesis:
H0: βi = 0
H1: βi ≠ 0
Test statistic:
t = β̂i / s.e.(β̂i) ~ t(n−k)
Table 3.3: t-test for the individual parameters.

Variables | T | Sig.
Constant | -1.023 | 0.312
Cases - cumulative total (X1) | 7.118 | 0.000
Cases - cumulative total per 100000 population (X2) | -1.038 | 0.305
Cases - newly reported in last 7 days (X3) | -1.687 | 0.099
Cases - newly reported in last 7 days per 100000 population (X4) | 2.589 | 0.013
Cases - newly reported in last 24 hours (X5) | 1.553 | 0.128
Deaths - cumulative total per 100000 population (X6) | 1.028 | 0.309
Deaths - newly reported in last 7 days (X7) | 4.949 | 0.000
Deaths - newly reported in last 7 days per 100000 population (X8) | -2.985 | 0.005
Deaths - newly reported in last 24 hours (X9) | -3.995 | 0.000
Decision rule: Reject H0 if |tcal| > ttab, or if p-value < α.
Conclusion: Cases - cumulative total (X1), Cases - newly reported in last 7 days per 100000 population (X4), Deaths - newly reported in last 7 days (X7), Deaths - newly reported in last 7 days per 100000 population (X8) and Deaths - newly reported in last 24 hours (X9) are significant in the model, while Cases - cumulative total per 100000 population (X2), Cases - newly reported in last 7 days (X3), Cases - newly reported in last 24 hours (X5) and Deaths - cumulative total per 100000 population (X6) are not significant.
We therefore observe from the individual tests that only variables X1, X4, X7, X8 and X9 contribute significantly to the model.
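The individual t-statistics in Table 3.3 follow from the same OLS fit: each coefficient is divided by its standard error, which comes from the diagonal of σ̂²(XᵀX)⁻¹. A minimal sketch on synthetic data (one regressor is deliberately pure noise, so the numbers are illustrative only):

```python
import numpy as np
from scipy import stats

# Individual t-tests t_i = beta_hat_i / s.e.(beta_hat_i) ~ t(n - k), as in
# Table 3.3. Synthetic data: regressor 2's true coefficient is zero.
rng = np.random.default_rng(2)
n = 53
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 4.0, 0.0, -3.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
df = n - X.shape[1]                              # residual degrees of freedom
sigma2 = ((y - X @ beta_hat) ** 2).sum() / df    # residual variance estimate
se = np.sqrt(sigma2 * np.diag(XtX_inv))          # standard errors of beta_hat
t_stat = beta_hat / se
p_vals = 2 * stats.t.sf(np.abs(t_stat), df)      # two-sided p-values
```

Regressors with p-values below 0.05 would be kept, mirroring the decision rule applied above.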
3.4. Fitting and Testing for the adequacy of the reduced model
From the individual tests of the model parameters, it was discovered that some independent variables contribute significantly to the model while others do not. The significant independent variables are therefore used to fit a new model by the method of OLS.
Table 3.4: Estimates for the regression coefficients of the reduced model.

Model | B | Std. Error | Beta
(Constant) | -227.176 | – | –
Cases - cumulative total (X1) | -0.020 | – | –
Cases - newly reported in last 7 days per 100000 population (X4) | 8.954 | – | –
Deaths - newly reported in last 7 days (X7) | 43.029 | 6.469 | 0.587
Deaths - newly reported in last 7 days per 100000 population (X8) | -1126.215 | 387.247 | -0.173
Deaths - newly reported in last 24 hours (X9) | -138.650 | – | -0.193
Reduced model:

Ŷ = -227.176 - 0.020X1 + 8.954X4 + 43.029X7 - 1126.215X8 - 138.650X9    (3.3)
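The step from the full model to equation (3.3) is a simple selection-and-refit: drop every regressor whose individual p-value exceeded 0.05, then rerun OLS on the remaining columns. A sketch on synthetic data (the regressors and coefficients are invented; regressor 2 is pure noise and so should be dropped):

```python
import numpy as np
from scipy import stats

# Reduced-model sketch: fit the full model, keep the intercept plus every
# regressor with p < 0.05, and refit by OLS -- the procedure behind eq. (3.3).
rng = np.random.default_rng(2)
n = 53
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 4.0, 0.0, -3.0]) + rng.normal(size=n)

def ols_pvalues(X, y):
    """OLS coefficients and two-sided p-values for each parameter."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    df = len(y) - X.shape[1]
    s2 = ((y - X @ b) ** 2).sum() / df
    t = b / np.sqrt(s2 * np.diag(XtX_inv))
    return b, 2 * stats.t.sf(np.abs(t), df)

_, p = ols_pvalues(X, y)
keep = [0] + [j for j in range(1, X.shape[1]) if p[j] < 0.05]  # intercept stays
X_red = X[:, keep]
beta_red, p_red = ols_pvalues(X_red, y)
```

Note that one-shot selection by individual p-values is the project's own procedure; stepwise or criterion-based selection would be alternatives.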
3.4.1. Testing for Model adequacy of the reduced model
The following tests check whether the reduced ordinary least squares regression model is adequate and which variable(s) contribute to its adequacy.
Table 3.5: Model Summary

Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | 0.994 | 0.988 | 0.987 | – | 1.390
From the summary table, R = 0.994, showing a very strong positive relationship between the dependent variable and the independent variables. The adjusted R² of the model is 0.987, with R² = 0.988. This means the linear regression model explains 98.8% of the variance in the data, i.e., 98.8% of the variation in the dependent variable is explained by the independent variables.
Table 3.6: ANOVA table

Model | Sum of Squares | Df | Mean Square | F | p-value
Regression | – | 5 | – | 792.960 | 0.000
Residual | – | 47 | – | |
Total | – | 52 | | |
F-test for model significance:
Hypothesis:
H0: β1 = β4 = β7 = β8 = β9 = 0
H1: H0 is not true
Decision Rule:
Reject the null hypothesis if p-value < α (0.05); otherwise do not reject the null hypothesis.
Decision:
Since p-value = 0.000 < 0.05, we reject the null hypothesis.
Interpretation:
Based on the available data, we conclude that at least one of the regressors is significant, i.e., it has an impact on the dependent variable at the α = 0.05 level of significance. Since at least one independent variable contributes to the response variable, the regression model is said to be adequate.
Now, to determine which variable(s) actually contribute to the dependent variable, we test the significance of each parameter individually.
Individual t-test for the parameters:
Hypothesis:
H0: βi = 0
H1: βi ≠ 0
Test statistic:
t = β̂i / s.e.(β̂i) ~ t(n−k)
Table 3.7: t-test for the individual parameters of the reduced model.

Variables | T | Sig.
Constant | – | –
Cases - cumulative total (X1) | – | –
Cases - newly reported in last 7 days per 100000 population (X4) | – | –
Deaths - newly reported in last 7 days (X7) | – | –
Deaths - newly reported in last 7 days per 100000 population (X8) | -2.908 | –
Deaths - newly reported in last 24 hours (X9) | -3.241 | –
Decision rule: Reject H0 if |tcal| > ttab, or if p-value < α.
Conclusion: Cases - cumulative total (X1), Cases - newly reported in last 7 days per 100000 population (X4), Deaths - newly reported in last 7 days (X7), Deaths - newly reported in last 7 days per 100000 population (X8) and Deaths - newly reported in last 24 hours (X9) all contribute to the cumulative deaths of COVID-19 patients in Africa.
3.5. TEST FOR VALIDATION OF ASSUMPTIONS
3.5.1. Testing for Linearity Assumption
The following graphs depict the relationship between the dependent variable (cumulative deaths of COVID-19 patients in Africa) and each independent variable.
Fig 3.0: Plot of cumulative death against deaths newly reported in the last 7 days per 100000 population.
Fig 3.1: Plot of cumulative death against newly reported cases in the last 7 days.
Figure 3.2: Plot of cumulative death against cumulative total.
From the plots in figures 3.0, 3.1 and 3.2, it was observed that the variables cases - cumulative total, cases - newly reported in last 7 days per 100000 population, deaths - newly reported in last 7 days, deaths - newly reported in last 7 days per 100000 population and deaths - newly reported in last 24 hours are linearly related to the dependent variable.
3.5.2. Testing for Homoscedasticity (equality of variance) Assumption
Fig 3.3: Scatterplot of regression standardized residual against standardized predicted
value.
The plot above indicates non-constant error variance (heteroscedasticity): the residuals follow a pattern of increasing spread, and one point on the plot stands out as an outlier in the data.
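The visual check above can be complemented numerically with the Breusch-Pagan test cited in the references: regress the squared OLS residuals on the regressors; under homoscedasticity the statistic LM = n·R² of that auxiliary regression is chi-square distributed. A sketch on synthetic, deliberately heteroscedastic data (not the project's data):

```python
import numpy as np
from scipy import stats

# Breusch-Pagan sketch: regress squared OLS residuals on the regressors;
# LM = n * R^2 ~ chi-square(k) under homoscedasticity. The synthetic data
# here have error variance growing with x, so the test should flag it.
rng = np.random.default_rng(3)
n = 200
x = rng.uniform(1, 10, size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + rng.normal(size=n) * x     # error s.d. proportional to x

b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2                          # squared OLS residuals
g, *_ = np.linalg.lstsq(X, e2, rcond=None)     # auxiliary regression on X
r2_aux = 1 - ((e2 - X @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
lm = n * r2_aux
p_bp = stats.chi2.sf(lm, df=1)                 # one non-constant regressor
```

A small `p_bp` rejects homoscedasticity, matching the pattern-of-increasing-variance reading of Fig 3.3.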
3.5.3. Testing for Normality Assumption
1. QQ plot.
Fig 3.4: QQ plot.
As can be seen from the Q-Q plot, most of the studentized residuals fall close to the straight line, which shows that the assumption of normality is met.
2. Histogram
Figure 3.5: Plot of standardized residuals against the observed cumulative probability for normality check.
The histogram is bell-shaped, which shows that the normality assumption is met.
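The graphical checks above can be backed by the Shapiro-Wilk test (Shapiro & Wilk, 1965; Royston, 1992 - both cited in the references). The residuals below are synthetic normal draws, sized to match the 53 observations, purely to show the mechanics:

```python
import numpy as np
from scipy import stats

# Shapiro-Wilk normality test as a numeric companion to the Q-Q plot and
# histogram. W close to 1 and p > 0.05 are consistent with normality.
rng = np.random.default_rng(4)
residuals = rng.normal(loc=0.0, scale=1.0, size=53)   # synthetic residuals
W, p_sw = stats.shapiro(residuals)
normality_ok = p_sw > 0.05        # fail to reject H0: residuals are normal
```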
3.5.4. Autocorrelation of Error Term Assumption
Durbin-Watson statistic: d = 1.390
Durbin-Watson Test:
Hypothesis:
H0: There is positive autocorrelation
H1: H0 is not true
Test Statistic: d = 1.390
Decision Rule:
Reject H0 if the Durbin-Watson value is 2 or greater; otherwise do not reject H0.
Decision:
Since the Durbin-Watson value is less than 2, we do not reject H0.
Conclusion:
There is enough evidence to support the claim that there is positive autocorrelation at the α = 0.05 level of significance.
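The Durbin-Watson statistic itself is d = Σ(eₜ − eₜ₋₁)² / Σeₜ², which sits near 2 for uncorrelated residuals and sinks toward 0 under positive autocorrelation. A sketch contrasting white-noise residuals with a strongly autocorrelated AR(1) series (synthetic data, illustrating why the project's d = 1.390 < 2 points to positive autocorrelation):

```python
import numpy as np

# Durbin-Watson statistic d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
def durbin_watson(e):
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(5)
white = rng.normal(size=500)            # independent residuals -> d near 2
ar1 = np.empty(500)                     # AR(1) residuals with rho = 0.9
ar1[0] = white[0]
for t in range(1, 500):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal()

d_white = durbin_watson(white)          # close to 2
d_ar1 = durbin_watson(ar1)              # well below 2
```

For an AR(1) process with coefficient ρ, d ≈ 2(1 − ρ), so ρ = 0.9 pushes d toward 0.2.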
3.5.5. Multicollinearity Assumption:
This assumption is tested using the Variance Inflation Factor (VIF), given as:

VIF_j = 1 / (1 − R_j²)

where R_j² is the coefficient of determination from regressing the j-th independent variable on the remaining independent variables.
Table 3.9: VIF values for the independent variables

Independent Variables | Tolerance | VIF
Cases - cumulative total | 0.094 | 10.644
Cases - newly reported in last 7 days per 100000 population | 0.075 | 13.353
Deaths - newly reported in last 7 days | 0.032 | 31.205
Deaths - newly reported in last 7 days per 100000 population | 0.070 | 11.876
Deaths - newly reported in last 24 hours | 0.084 | 13.024
It can be seen from the VIF values that there is high multicollinearity among the independent variables, i.e., the majority of the values are greater than 10. This might be due to the fact that the data are drawn from a population rather than a sample.
The tolerance values, each the reciprocal of the corresponding VIF, tell the same story: tolerances below 0.10 indicate strong multicollinearity, and every tolerance in Table 3.9 falls below that threshold.
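The VIF computation is just one auxiliary regression per variable. A sketch on synthetic data in which the first two regressors are nearly collinear (mimicking the high VIFs in Table 3.9) while the third is independent:

```python
import numpy as np

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on all
# the other columns. VIF > 10 (tolerance 1/VIF < 0.1) flags strong
# multicollinearity, as observed in Table 3.9.
def vif(X):
    out = []
    for j in range(X.shape[1]):
        yj = X[:, j]
        Xj = np.column_stack([np.ones(X.shape[0]), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(Xj, yj, rcond=None)
        r2 = 1 - ((yj - Xj @ b) ** 2).sum() / ((yj - yj.mean()) ** 2).sum()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(6)
z = rng.normal(size=200)
X = np.column_stack([
    z + 0.1 * rng.normal(size=200),    # near-duplicate of the next column
    z + 0.1 * rng.normal(size=200),
    rng.normal(size=200),              # unrelated regressor -> VIF near 1
])
vifs = vif(X)
tolerances = 1.0 / vifs
```

Count-based case data naturally produce collinear regressors (a 7-day total largely determines its per-100000 version), which is one reason the project's VIFs are large.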
CHAPTER FOUR
SUMMARY, CONCLUSION AND RECOMMENDATION
4.0. SUMMARY
This project work is on regression modelling of some determinants of COVID-19 cumulative deaths in Africa. The statistical tool used in this project work is multiple regression. From the test for model adequacy, based on the available data, it was discovered that at least one of the regressors is significant, i.e., it has an impact on the dependent variable at the α = 0.05 level of significance. Fitting the model using all the independent variables, we found R = 0.995, showing a strong correlation between the dependent variable (cumulative deaths) and the independent variables, with R² = 0.989 and adjusted R² = 0.987. This means the linear regression explains 98.9% of the variance in the data, i.e., 98.9% of the variation in the dependent variable is explained by the independent variables. Individual tests for the independent variables revealed that only variables X1, X4, X7, X8 and X9 contribute significantly to the model; the other independent variables were removed and a new model was fitted using the method of OLS.
This means that Cases - cumulative total (X1), Cases - newly reported in last 7 days per 100000 population (X4), Deaths - newly reported in last 7 days (X7), Deaths - newly reported in last 7 days per 100000 population (X8) and Deaths - newly reported in last 24 hours (X9) contribute significantly to the cumulative deaths of COVID-19 patients, while Cases - cumulative total per 100000 population (X2), Cases - newly reported in last 7 days (X3), Cases - newly reported in last 24 hours (X5) and Deaths - cumulative total per 100000 population (X6) are not significantly related to the cumulative deaths of COVID-19 patients in Africa.
The regression equation using OLS is given by:

Ŷ = -168.008 + 0.022X1 - 0.205X2 - 0.718X3 + 12.950X4 + 3.417X5 + 15.843X6 + 49.998X7 - ( – )X8 - 162.691X9
The data met the linearity and normality assumptions, but the homoscedasticity, autocorrelation and multicollinearity checks indicated violations; a new model was then fitted using only the significant variables.
The reduced model:

Ŷ = -227.176 - 0.020X1 + 8.954X4 + 43.029X7 - 1126.215X8 - 138.650X9
Testing the extent to which the reduced model fits, we found R = 0.994, showing a very strong positive relationship between the dependent variable and the independent variables. The adjusted R² of the model is 0.987, with R² = 0.988; that is, the linear regression model explains 98.8% of the variance in the data, or 98.8% of the variation in the dependent variable is explained by the independent variables.
4.1. CONCLUSION
This study has illustrated that the available COVID-19 data satisfy the conditions for a multiple regression analysis, and that the variables Deaths - newly reported in last 7 days per 100000 population (X8), Deaths - newly reported in last 24 hours (X9), Cases - newly reported in last 7 days per 100000 population (X4), Deaths - newly reported in last 7 days (X7) and Cases - cumulative total (X1) can significantly predict the cumulative deaths of COVID-19 patients in Africa.
References

Aaron Kandola (2020). Coronavirus transmission: How it spreads and how to avoid it. Medically reviewed by Meredith Goodwin, MD, FAAFP. medicalnewstoday.com.

Abiodun A.A. "STA 333 – Introduction to Regression Analysis Notes". Department of Statistics, University of Ilorin.

Ajao I.O., Awogbemi C.A. and Ilugbusi A.O. (202). Vector and autoregressive models for multivariate time series analysis on COVID-19 pandemic in Nigeria.

Ajulo H.K. (2018). "A study of the effects of some clinical variables on prostate cancer".

Amzat J., Aminu K., Kolo V.I., Akinyele A.A., Ogundairo J.A., Danjibo C.M. (2020). Coronavirus Outbreak in Nigeria: Burden and Socio-Medical Response during the First 100 Days. International Journal of Infectious Diseases. doi: https://doi.org/10.1016/j.ijid-.

Breusch, T.S.; Pagan, A.R. (1979). "A Simple Test for Heteroscedasticity and Random Coefficient Variation". Econometrica. 47(5):-.

Cook, R. Dennis (February 1977). "Detection of Influential Observations in Linear Regression". Technometrics. American Statistical Association. 19(1): 15–18.

Cook, R. Dennis (March 1979). "Influential Observations in Linear Regression". Journal of the American Statistical Association.

Damodar N. Gujarati (2004). "Basic Econometrics, Fourth Edition".

Durbin, J.; Watson, G.S. (1950). "Testing for Serial Correlation in Least Squares Regression, I". Biometrika. 37(3-4): 409–428.

Gibbons, Jean Dickinson; Chakraborti, Subhabrata (2003). Nonparametric Statistical Inference (4th ed.). CRC Press.

ILO (2018). "Women and men in the informal economy: A statistical picture".

Jason W. Osborne and Amy Overbay (March 2004). "The Power of Outliers". Peer-reviewed Electronic Journal. Volume 9, p. 6.

NCDC (2020). https://covid19.ncdc.gov.ng/. Accessed 25th of May 2021.

Oranusi C.K. (2012). "Prostate cancer awareness and screening among male public servants in Anambra, Nigeria". https://doi.org/10.1016/j.afju-.

Royston, Patrick (1992). "Approximating the Shapiro-Wilk W-test for non-normality". Statistics and Computing. 2(3): 117–119.

Shapiro, S.S.; Wilk, M.B. (1965). "An analysis of variance test for normality (complete samples)". Biometrika.

Stamey, T.A., Kabalin, J.N., McNeal, J.E., Johnstone, I.M., et al. "Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate: II. Radical prostatectomy treated patients". Journal of Urology. 141(5).

The World Bank Group (2020). Nigeria in Times of COVID-19: Laying Foundations for a Strong Recovery. The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA.

United Nations (202). Policy brief: Impact of COVID-19 in Africa.

Wikipedia: The free encyclopedia. (2004, July 22). FL: Wikimedia Foundation, Inc.

World Health Organization (2020). Coronavirus disease 2019 (COVID-19) Situation Report – 37. https://www.who.int/docs/default-source/coronaviruse/situation-reports/--sitrep-37-covid-19.pdf?sfvrsn=-e_2 [Accessed March 15, 2020].
APPENDIX

Name | Cases - cumulative total | Cases - cumulative total per 100000 population | Cases - newly reported in last 7 days | Cases - newly reported in last 7 days per 100000 population | Cases - newly reported in last 24 hours | Deaths - cumulative total | Deaths - cumulative total per 100000 population | Deaths - newly reported in last 7 days | Deaths - newly reported in last 24 hours
South Africa | – | 2757.55 | 21737 | 36.65 | 2893 | 55802 | 94.09 | 592 | 30
Tunisia | 335345 | 2837.43 | 8773 | 74.23 | 1246 | 12236 | 103.53 | 387 | 54
Ethiopia | 269194 | 234.16 | 2930 | 2.55 | 293 | 4076 | 3.55 | 80 | 8
Egypt | 253835 | 248.04 | 8114 | 7.93 | 1145 | 14721 | 14.39 | 394 | 51
Libya | 183311 | 2667.78 | 1901 | 27.67 | 412 | 3111 | 45.28 | 23 | 6
Kenya | 168432 | 313.24 | 2967 | 5.52 | 324 | 3059 | 5.69 | 56 | 10
Nigeria | 166019 | 80.54 | 310 | 0.15 | 40 | 2067 | 1 | 1 | 0
Algeria | 126860 | 289.3 | 1549 | 3.53 | 209 | 3418 | 7.79 | 44 | 7
Ghana | 93620 | 301.29 | 287 | 0.92 | 37 | 783 | 2.52 | 0 | 0
Zambia | 93201 | 506.97 | 765 | 4.16 | 95 | 1268 | 6.9 | 8 | 1
Cameroon | 76756 | 289.14 | 0 | 0 | 0 | 1230 | 4.63 | 0 | 0
Mozambique | 70590 | 225.85 | 148 | 0.47 | 22 | 831 | 2.66 | 5 | 0
Botswana | 54151 | 2302.7 | 1989 | 84.58 | 0 | 784 | 33.34 | 23 | 0
Namibia | 52946 | 2083.75 | 1728 | 68.01 | 234 | 765 | 30.11 | 47 | 2
Côte d’Ivoire | 46942 | 177.96 | 286 | 1.08 | 0 | 298 | 1.13 | 0 | 0
Uganda | 43734 | 95.61 | 955 | 2.09 | 227 | 356 | 0.78 | 9 | 6
Senegal | 41062 | 245.24 | 212 | 1.27 | 39 | 1130 | 6.75 | 5 | 1
Madagascar | 40876 | 147.61 | 735 | 2.65 | 96 | 800 | 2.89 | 37 | 7
Zimbabwe | 38682 | 260.26 | 122 | 0.82 | 3 | 1586 | 10.67 | 4 | 0
Sudan | 34889 | 79.57 | 0 | 0 | 0 | 2446 | 5.58 | 0 | 0
Malawi | 34284 | 179.22 | 70 | 0.37 | 10 | 1153 | 6.03 | 0 | 0
Angola | 32441 | 98.71 | 1804 | 5.49 | 292 | 725 | 2.21 | 66 | 10
Cabo Verde | 29334 | 5276.02 | 1166 | 209.72 | 136 | 256 | 46.04 | 7 | 0
Rwanda | 26688 | 206.05 | 712 | 5.5 | 264 | 349 | 2.69 | 5 | 1
Gabon | 24107 | 1083.1 | 308 | 13.84 | 0 | 147 | 6.6 | 4 | 0
Réunion | 23566 | 2632.16 | 922 | 102.98 | 0 | 176 | 19.66 | 7 | 0
Guinea | 22988 | 175.04 | 254 | 1.93 | 0 | 158 | 1.2 | 7 | 0
Mayotte | 20176 | 7395.49 | 0 | 0 | 0 | 171 | 62.68 | 0 | 0
Mauritania | 19149 | 411.84 | 321 | 6.9 | 35 | 458 | 9.85 | 1 | 0
Eswatini | 18551 | 1599 | 31 | 2.67 | 1 | 672 | 57.92 | 0 | 0
Mali | 14241 | 70.32 | 51 | 0.25 | 5 | 514 | 2.54 | 3 | 2
Burkina Faso | 13415 | 64.18 | 18 | 0.09 | 1 | 165 | 0.79 | 1 | 0
Togo | 13374 | 161.55 | 99 | 1.2 | 22 | 125 | 1.51 | 0 | 0
Congo | 11476 | 207.97 | 133 | 2.41 | 0 | 150 | 2.72 | 2 | 0
Lesotho | 10822 | 505.17 | 32 | 1.49 | 16 | 326 | 15.22 | 6 | 6
South Sudan | 10670 | 95.32 | 18 | 0.16 | 0 | 115 | 1.03 | 0 | 0
Seychelles | 10669 | – | 928 | 943.6 | 236 | 38 | 38.64 | 8 | 0
Equatorial Guinea | 8436 | 601.29 | 742 | 52.89 | 0 | 113 | 8.05 | 1 | 0
Benin | 8025 | 66.2 | 41 | 0.34 | 0 | 101 | 0.83 | 0 | 0
Central African Republic | 7079 | 146.57 | 213 | 4.41 | 69 | 97 | 2.01 | 2 | 1
Gambia | 5978 | 247.37 | 32 | 1.32 | 0 | 178 | 7.37 | 3 | 0
Niger | 5383 | 22.24 | 50 | 0.21 | 19 | 212 | 0.88 | 20 | 20
Chad | 4924 | 29.98 | 20 | 0.12 | 1 | 173 | 1.05 | 0 | 0
Burundi | 4546 | 38.23 | 199 | 1.67 | 52 | 6 | 0.05 | 0 | 0
Sierra Leone | 4121 | 51.66 | 16 | 0.2 | 4 | 79 | 0.99 | 0 | 0
Comoros | 3942 | 453.31 | 9 | 1.03 | 2 | 146 | 16.79 | 0 | 0
Eritrea | 3932 | 110.87 | 88 | 2.48 | 0 | 14 | 0.39 | 2 | 0
Guinea-Bissau | 3751 | 190.6 | 5 | 0.25 | 2 | 68 | 3.46 | 1 | 0
Sao Tome and Principe | 2336 | 1065.89 | 9 | 4.11 | 2 | 37 | 16.88 | 2 | 1
Liberia | 2142 | 42.35 | 13 | 0.26 | 0 | 85 | 1.68 | 0 | 0
Mauritius | 1322 | 103.95 | 34 | 2.67 | 0 | 17 | 1.34 | 0 | 0
United Republic of Tanzania | 509 | 0.85 | 0 | 0 | 0 | 21 | 0.04 | 0 | 0
Saint Helena | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0