• Developed a modern data pipeline using PySpark and Delta Lake with a layered (Bronze–Silver–Gold) medallion architecture for efficient, scalable data processing.
• Automated dimension and fact table creation using SCD Type 1 logic with incremental updates and surrogate key generation, significantly reducing manual ETL maintenance (see the sketch below).
• Integrated DBT for modular transformations and validation, ensuring consistency and reliability across multiple data layers.
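
A minimal sketch of the kind of incremental SCD Type 1 merge with surrogate key generation described above, assuming a Spark session with the delta-spark package installed. The table paths (/lake/silver/customers_increment, /lake/gold/dim_customer) and column names (customer_id, customer_sk, email, segment) are hypothetical placeholders, not the project's actual schema.

    # Sketch only: paths and column names below are illustrative assumptions.
    from pyspark.sql import SparkSession, Window, functions as F
    from delta import configure_spark_with_delta_pip
    from delta.tables import DeltaTable

    builder = (
        SparkSession.builder.appName("scd1_merge_sketch")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    spark = configure_spark_with_delta_pip(builder).getOrCreate()

    # Incremental batch landed in the Silver layer (hypothetical path).
    updates = spark.read.format("delta").load("/lake/silver/customers_increment")

    # Gold-layer dimension table maintained with SCD Type 1 semantics.
    dim = DeltaTable.forPath(spark, "/lake/gold/dim_customer")

    # Generate surrogate keys for the incoming rows by offsetting row_number()
    # from the current maximum surrogate key in the dimension.
    max_sk = (
        dim.toDF()
        .agg(F.coalesce(F.max("customer_sk"), F.lit(0)).alias("max_sk"))
        .first()["max_sk"]
    )
    updates = updates.withColumn(
        "customer_sk",
        F.row_number().over(Window.orderBy("customer_id")) + F.lit(int(max_sk)),
    )

    # SCD Type 1 merge: overwrite attributes in place for existing business keys,
    # insert unseen keys with their freshly generated surrogate keys.
    (
        dim.alias("t")
        .merge(updates.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedUpdate(set={"email": "s.email", "segment": "s.segment"})
        .whenNotMatchedInsertAll()
        .execute()
    )

In a pipeline like the one described, the DBT models mentioned in the last bullet would sit on top of these Delta tables to handle the modular transformations and validation tests across layers.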