General Data Engineering Pipeline
Overview:
- Built an end-to-end data pipeline covering ingestion, transformation, warehousing, and visualization.
- Ingested data into Google Cloud Storage using Terraform, Docker and Apache Airflow for automation.
- Processed data with PySpark and orchestrated ETL workflows using Airflow DAGs.
- Stored cleansed and structured data in BigQuery for scalable warehousing.
- Applied data transformations and modeling using dbt to create analytics-ready tables.
- Visualized key metrics and business insights via Power BI dashboards.
- Managed infrastructure as code with Terraform for reproducible deployments across environments.
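
The stages listed above form a linear dependency chain, which can be sketched in plain Python. This is only an illustration of the ordering and is not taken from the repository: the stage names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module stands in for the scheduling that Airflow would actually perform.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names (not from the repository).
# Each stage maps to the set of stages it depends on.
PIPELINE = {
    "provision_infra_terraform": set(),
    "ingest_to_gcs": {"provision_infra_terraform"},
    "process_with_pyspark": {"ingest_to_gcs"},
    "load_to_bigquery": {"process_with_pyspark"},
    "transform_with_dbt": {"load_to_bigquery"},
    "refresh_power_bi": {"transform_with_dbt"},
}


def execution_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for step, stage in enumerate(execution_order(PIPELINE), start=1):
        print(f"{step}. {stage}")
```

Declaring dependencies rather than a fixed sequence mirrors how an orchestrator treats tasks: adding a stage only requires stating what it depends on, and a cycle raises `graphlib.CycleError` instead of silently producing a wrong order.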
GitHub repository: https://github.com/Abdou240/Dataeng