Data Infrastructure Overview
🌐 Data Infrastructure & Streaming
Modern data engineering means operating a fleet of servers, message brokers, and cloud services. Running that fleet reliably and reproducibly is the work of Data Platform Engineering.
🔍 Section Overview
Explore the streaming infrastructure behind real-time systems and learn how to manage that infrastructure as code.
1. Apache Kafka Deep Dive
Learn Kafka's core concepts: topics, partitions, replication, and the consumer group model, and see how they work together for real-time data ingestion.
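To make partitions and consumer groups concrete, here is a minimal pure-Python sketch of two ideas: a keyed record always lands on the same partition, and each partition is read by exactly one member of a consumer group. It does not talk to a real broker. The function names, consumer names, and partition count are illustrative assumptions, and CRC32 stands in for the murmur2 hash that Kafka's Java client uses for keyed records.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition.

    Assumption: CRC32 is a stand-in here; Kafka's Java client hashes
    keys with murmur2, so real partition numbers will differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_round_robin(partitions, consumers):
    """Spread partitions across the members of one consumer group.

    Each partition goes to exactly one consumer, which is what lets a
    group process a topic in parallel without reading records twice.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


NUM_PARTITIONS = 6  # hypothetical topic with six partitions
group = ["consumer-a", "consumer-b", "consumer-c"]  # hypothetical group members

assignment = assign_round_robin(range(NUM_PARTITIONS), group)
for member, owned in assignment.items():
    print(member, "reads partitions", owned)

# Records with the same key always map to the same partition,
# which preserves per-key ordering.
print("user-42 ->", partition_for("user-42", NUM_PARTITIONS))
```

Adding a fourth consumer and calling `assign_round_robin` again gives a rough picture of a rebalance: the same six partitions are redistributed across more readers.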
2. Infrastructure as Code (IaC) with Terraform
Learn how to use an infrastructure-as-code tool such as Terraform or Pulumi to provision your data warehouses, storage buckets, and clusters in the cloud.
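The declarative idea behind these tools can be sketched in a few lines of Python: you describe the resources you want, the tool compares that description with what currently exists, and it proposes the changes needed to close the gap. This is a toy model only, not Terraform's or Pulumi's actual implementation or API. The `plan` function and all resource names and settings below are hypothetical.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Compare desired and actual resources and list the changes needed.

    Toy model of a declarative "plan" step: keys are resource names,
    values are their settings.
    """
    desired_names, actual_names = set(desired), set(actual)
    return {
        "create": sorted(desired_names - actual_names),
        "update": sorted(
            name
            for name in desired_names & actual_names
            if desired[name] != actual[name]
        ),
        "delete": sorted(actual_names - desired_names),
    }


# Hypothetical desired state, as it might be declared in code.
desired = {
    "raw-events-bucket": {"versioning": True},
    "analytics-warehouse": {"size": "medium"},
    "kafka-cluster": {"brokers": 3},
}

# Hypothetical state that currently exists in the cloud account.
actual = {
    "raw-events-bucket": {"versioning": False},
    "kafka-cluster": {"brokers": 3},
    "old-test-bucket": {"versioning": False},
}

changes = plan(desired, actual)
for action, names in changes.items():
    print(action, names)
```

Because the desired state lives in version-controlled code, the same comparison can be rerun at any time to detect drift between what was declared and what is deployed.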
🎯 Key Learning Goals
- Build a real-time ingestion pipeline using Kafka.
- Provision cloud data infrastructure using Terraform.
- Understand the trade-offs between different cloud providers (AWS, Azure, GCP).