• Sourced daily state-wise COVID-19 case data for India (Jan ’20 to Jun ’21) from Kaggle
• Using scikit-learn in Python, trained several supervised regression models (e.g., KNN, SVM, Random Forest) to predict daily positive cases for each state
• Improved upon the above predictions using time-series forecasting models (VAR, ARIMA)
• Clustered states with K-Means, an unsupervised learning algorithm, using each state’s fitted ARIMA model parameters as features
• Interpreted the resulting clusters by mapping them to real-world scenarios observed during the first and second waves of COVID-19 in India
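
The idea of clustering states on fitted time-series parameters can be illustrated with a minimal sketch. It is not the project's code, and it makes several assumptions: the series are synthetic rather than the Kaggle data, the state names are hypothetical placeholders, a plain least-squares AR(2) fit stands in for full ARIMA estimation, and a small NumPy K-Means with a deterministic farthest-point start stands in for a library implementation.

```python
import numpy as np

def fit_ar_coeffs(series, p=2):
    """Least-squares AR(p) fit (a stand-in for ARIMA): [intercept, phi_1..phi_p]."""
    y = np.asarray(series, dtype=float)
    # Each row holds a constant 1 and the p previous values y[t-1], ..., y[t-p].
    lags = [y[p - k:len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y) - p)] + lags)
    coeffs, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coeffs

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)

    def assign(c):
        dists = np.linalg.norm(points[:, None, :] - c[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            if np.any(labels == j):  # leave a center in place if its cluster is empty
                centers[j] = points[labels == j].mean(axis=0)
    return assign(centers)

def simulate_ar1(phi, n, rng):
    """Synthetic AR(1) series, used only so the sketch is runnable."""
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.normal()
    return y

# Hypothetical states: three with persistent dynamics, three with oscillating dynamics.
rng = np.random.default_rng(0)
state_names = ["state_A1", "state_A2", "state_A3", "state_B1", "state_B2", "state_B3"]
true_phi = [0.8, 0.8, 0.8, -0.5, -0.5, -0.5]
series_by_state = [simulate_ar1(phi, 400, rng) for phi in true_phi]

# Feature vector per state: the fitted lag coefficients (intercept dropped).
features = np.array([fit_ar_coeffs(s, p=2)[1:] for s in series_by_state])
labels = kmeans(features, k=2)

for name, label in zip(state_names, labels):
    print(name, "-> cluster", label)
```

Using model parameters as features groups states by how their case counts evolve rather than by raw magnitude, which is what makes the resulting clusters comparable across states of very different sizes.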