Overview

BIGDATA Project - M1 Full-Stack Dev (PAR02, 2024)
School: EFREI

📝 Project Description

This project aimed to build a Lakehouse architecture for global COVID-19 vaccination data analysis. Using Bronze, Silver, and Gold layers, we implemented a robust pipeline for scalable data transformation and analytics.

🎯 Key Objectives

- Raw data ingestion (Bronze layer)
- Data cleaning and enrichment (Silver layer)
- Star-schema modeling for analytics (Gold layer)
- Trend visualization:
  - Vaccination progress by country
  - Comparison by WHO region
  - COVID-related death trends

⚙️ Tools & Technologies

- Databricks
- Apache Spark
- Lakehouse Architecture
- Python (PySpark)
- Visualization: Matplotlib, Pandas

📂 Project Links

- 📓 Databricks Notebook: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/3445251035974576/367390688357069/1714409181088165/latest.html
- 🗂️ GitHub Repository: https://github.com/julienESN/databricks-vaccination-analysis

🎯 Certifications

- https://credentials.databricks.com/b29c41cc-14ef-45d2-88e4-8fe968ee9016#acc.N53xLTpc
- https://credentials.databricks.com/a18e6dd7-bcc7-4012-891e-55e08328314f#acc.VCBGNXT6
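
To make the Bronze, Silver, and Gold layering from the objectives more concrete, here is a minimal sketch of the idea. It is not the code from the notebook: it uses plain Python (standard library only) instead of PySpark so that it runs without a cluster, and the sample rows, column names (`country`, `who_region`, `date`, `total_vaccinations`), and function names are all hypothetical and made up for illustration.

```python
import csv
import io

# Hypothetical toy sample: columns and values are invented for illustration,
# they are NOT taken from the real vaccination dataset.
RAW_CSV = """country,who_region,date,total_vaccinations
France,EURO,2021-01-01,100
France,EURO,2021-01-02,
France,EURO,2021-01-03,350
Kenya,AFRO,2021-01-01,20
Kenya,AFRO,2021-01-03,not_available
,EURO,2021-01-01,5
"""


def load_bronze(raw_text):
    """Bronze: ingest every raw row as-is (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def build_silver(bronze_rows):
    """Silver: drop rows without a country or a numeric total, cast types."""
    silver = []
    for row in bronze_rows:
        country = row["country"].strip()
        total = row["total_vaccinations"].strip()
        if not country or not total.isdigit():
            continue  # unusable row: skipped in Silver, still kept in Bronze
        silver.append({
            "country": country,
            "who_region": row["who_region"].strip(),
            "date": row["date"].strip(),
            "total_vaccinations": int(total),
        })
    return silver


def build_gold(silver_rows):
    """Gold: star schema with one country dimension and one fact table."""
    dim_country = {}       # country name -> dimension row
    fact_vaccination = []  # one row per (country_key, date)
    for row in silver_rows:
        if row["country"] not in dim_country:
            dim_country[row["country"]] = {
                "country_key": len(dim_country) + 1,  # surrogate key
                "country": row["country"],
                "who_region": row["who_region"],
            }
        fact_vaccination.append({
            "country_key": dim_country[row["country"]]["country_key"],
            "date": row["date"],
            "total_vaccinations": row["total_vaccinations"],
        })
    return dim_country, fact_vaccination


def latest_total_by_region(dim_country, fact_vaccination):
    """Join fact and dimension: sum each country's latest total per WHO region."""
    region_of = {d["country_key"]: d["who_region"] for d in dim_country.values()}
    latest = {}  # country_key -> fact row with the most recent date
    for fact in fact_vaccination:
        key = fact["country_key"]
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if key not in latest or fact["date"] > latest[key]["date"]:
            latest[key] = fact
    totals = {}
    for key, fact in latest.items():
        region = region_of[key]
        totals[region] = totals.get(region, 0) + fact["total_vaccinations"]
    return totals


if __name__ == "__main__":
    bronze = load_bronze(RAW_CSV)
    silver = build_silver(bronze)
    dim_country, fact_vaccination = build_gold(silver)
    print(latest_total_by_region(dim_country, fact_vaccination))
```

The point of the split is that each layer has a single job: Bronze keeps the raw input unchanged for traceability, Silver is where invalid rows are filtered and types are fixed, and Gold exposes a small dimension and fact table that region-level comparisons can join. In a Spark-based pipeline the same steps would typically be expressed as DataFrame transformations rather than Python loops.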