Data Engineering
ETL/ELT, pipelines, warehouses, batch/stream.
Overview
Move and transform data reliably in both batch and streaming modes. Build simple, observable pipelines end‑to‑end.
Understand orchestration, schema evolution and data quality monitoring.
Syllabus
- Ingestion patterns and idempotent design
- Batch vs streaming and windowing basics
- File formats: CSV/JSON/Parquet and trade‑offs
- Transformations: joins, dedup, late‑data handling
- Orchestration and retries with backoff
- Schema evolution and compatibility checks
- Observability: metrics, logs and lineage
- Validation and SLAs for critical datasets
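
As a rough illustration of two syllabus topics, idempotent design and retries with backoff, here is a minimal Python sketch. It is not tied to any particular tool: the function names `upsert_batch` and `retry_with_backoff`, the record fields `id` and `updated_at`, and the in-memory dict standing in for a warehouse table are all assumptions made for this example.

```python
import time
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]


def upsert_batch(store: Dict[str, Record], batch: Iterable[Record]) -> int:
    """Idempotent load: keyed upsert that keeps the newest version per id.

    `store` is a hypothetical stand-in for a target table. Replaying the
    same batch changes nothing, so a retried run cannot create duplicates.
    Returns the number of records that were inserted or replaced.
    """
    changed = 0
    for rec in batch:
        current = store.get(rec["id"])
        # Duplicates and late records with an older-or-equal timestamp are skipped.
        if current is None or rec["updated_at"] > current["updated_at"]:
            store[rec["id"]] = rec
            changed += 1
    return changed


def retry_with_backoff(
    task: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], Any] = time.sleep,
) -> Any:
    """Run `task`, retrying on any exception with exponentially growing delays.

    The delay doubles after each failure (base_delay, 2x, 4x, ...). The last
    failure is re-raised so the orchestrator can mark the run as failed.
    `sleep` is injectable so the behaviour can be checked without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    table: Dict[str, Record] = {}
    batch = [
        {"id": "a", "updated_at": 1, "value": 10},
        {"id": "a", "updated_at": 2, "value": 11},  # newer version of "a"
        {"id": "b", "updated_at": 1, "value": 20},
    ]
    # Loading the same batch twice leaves the table in the same state.
    first = retry_with_backoff(lambda: upsert_batch(table, batch))
    second = retry_with_backoff(lambda: upsert_batch(table, batch))
    print(first, second, sorted(table))
```

The two pieces are meant to work together: retries are only safe when the retried step is idempotent, which is why the load is written as a keyed upsert rather than a blind append. A production pipeline would typically also add jitter to the delay and retry only on errors known to be transient.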