The ML Lifecycle (CRISP-ML)
Traditional software development (DevOps) focuses on code. MLOps focuses on the interaction between code, data, and models. CRISP-ML(Q) is a widely used process model for this lifecycle.
1. Business & Data Understanding
Before writing a single line of Python, you must define the success criteria.
- Define KPIs: (e.g., "Reduce false positives in fraud detection by 10%").
- Data Audit: Is the data available? Is it labeled? What is its frequency?
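The two checks above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a prescribed method: it assumes the KPI means a relative reduction in false positive rate, and the names `meets_kpi`, `audit`, `label`, and `ts` are hypothetical.

```python
from datetime import datetime
from statistics import median


def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN)."""
    return fp / (fp + tn)


def meets_kpi(baseline_fpr: float, candidate_fpr: float,
              target_reduction: float = 0.10) -> bool:
    """True if the candidate cuts FPR by at least `target_reduction`,
    measured relative to the baseline (an assumed reading of the KPI)."""
    return (baseline_fpr - candidate_fpr) / baseline_fpr >= target_reduction


def audit(rows: list, label_key: str = "label", time_key: str = "ts") -> dict:
    """Answer the three audit questions: how much data, how much of it
    is labeled, and how often records arrive."""
    labeled = sum(1 for r in rows if r.get(label_key) is not None)
    times = sorted(r[time_key] for r in rows)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return {
        "rows": len(rows),
        "labeled_fraction": labeled / len(rows) if rows else 0.0,
        "median_gap_seconds": median(gaps) if gaps else None,
    }


if __name__ == "__main__":
    sample = [
        {"ts": datetime(2024, 1, 1, 0, 0), "label": 1},
        {"ts": datetime(2024, 1, 1, 1, 0), "label": None},
        {"ts": datetime(2024, 1, 1, 2, 0), "label": 0},
    ]
    print(audit(sample))
    print(meets_kpi(baseline_fpr=0.050, candidate_fpr=0.044))
```

Writing the KPI down as a function forces the team to agree on whether "10%" is relative or absolute before any modeling starts.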
2. Data Preparation (The DE Step)
This is where most of the work happens; a common rule of thumb puts it at around 80% of the effort.
- ETL/ELT: Extracting from lakes, transforming with Spark/DuckDB.
- Feature Engineering: Creating input variables (X) that have predictive power.
- Data Validation: Checking for nulls, types, and schema violations.
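As a rough illustration of the data validation step, the sketch below checks each record for nulls, wrong types, and schema violations using only the standard library. The schema and its column names (`transaction_id`, `amount`, `is_fraud`) are hypothetical, and a real pipeline would more likely rely on a dedicated validation library.

```python
# Hypothetical schema for a fraud dataset: column name -> expected Python type.
EXPECTED_SCHEMA = {"transaction_id": str, "amount": float, "is_fraud": int}


def validate_row(row: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the row is valid."""
    errors = []
    for col, expected_type in schema.items():
        if col not in row:
            errors.append(f"missing column: {col}")
        elif row[col] is None:
            errors.append(f"null value: {col}")
        elif not isinstance(row[col], expected_type):
            errors.append(
                f"bad type: {col} (expected {expected_type.__name__}, "
                f"got {type(row[col]).__name__})"
            )
    # Columns that are not in the schema are also treated as violations.
    for col in row:
        if col not in schema:
            errors.append(f"unexpected column: {col}")
    return errors


if __name__ == "__main__":
    good = {"transaction_id": "t1", "amount": 12.5, "is_fraud": 0}
    bad = {"transaction_id": "t2", "amount": None, "is_fraud": 0}
    print(validate_row(good, EXPECTED_SCHEMA))
    print(validate_row(bad, EXPECTED_SCHEMA))
```

Returning every problem at once, rather than raising on the first, makes it easier to report the overall state of a batch before deciding whether to reject it.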
3. Modeling & Experimentation
The "Science" phase.
- Algorithm Selection: XGBoost, Random Forest, or Neural Networks?
- Hyperparameter Tuning: Searching for the optimal hyperparameter values.
- Cross-Validation: Ensuring the model generalizes to unseen data.
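To make the tuning and cross-validation steps concrete, here is a minimal sketch of a grid search scored by k-fold cross-validation, written with the standard library only. The helpers `k_fold_indices` and `grid_search_cv` are illustrative names, the folds are contiguous and unshuffled for simplicity, and the `fit` and `score` callables are assumed to be supplied by the caller.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n: int, k: int):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, val
        start = stop


def grid_search_cv(X, y, fit, score, grid: dict, k: int = 5):
    """Try every combination in `grid`; return (best_params, best_mean_score).

    `fit(X_train, y_train, **params)` must return a model, and
    `score(model, X_val, y_val)` must return a number where higher is better.
    """
    best_params, best_score = None, float("-inf")
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, val in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(
                score(model, [X[i] for i in val], [y[i] for i in val])
            )
        avg = mean(fold_scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score
```

Each candidate is scored only on data it was not fitted on, which is what lets the mean fold score stand in for performance on unseen data. In practice, shuffled or stratified folds are usually preferable to the contiguous ones used here.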
4. Deployment (The "Ops" Step)
Moving from a .pkl file on a laptop to a production service.
- Packaging: Dockerizing the inference code.
- Serving: Exposing the model via FastAPI or BentoML.
- Infrastructure: Provisioning CPUs/GPUs on Kubernetes.
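The move from a .pkl file to a service can be sketched without committing to a framework. The example below is a simplified assumption, not a production design: the model is a plain dict of logistic-regression weights, and `handle_request` is a hypothetical function that a framework such as FastAPI or BentoML would wrap in an HTTP route.

```python
import json
import math
import pickle


def save_model(model: dict, path: str) -> None:
    """Persist the model artifact, as a training job would."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path: str) -> dict:
    """Load the artifact at service start-up. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model: dict, features: list) -> float:
    """Logistic regression score: sigmoid(bias + w . x)."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


def handle_request(model: dict, body: str) -> str:
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(body)
    features = payload["features"]
    if len(features) != len(model["weights"]):
        return json.dumps(
            {"error": f"expected {len(model['weights'])} features"}
        )
    return json.dumps({"score": predict(model, features)})
```

Keeping the handler free of framework code means the same function can be unit-tested locally and then packaged into a container image unchanged.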
5. Monitoring & Maintenance
The phase where many models fail, often silently, because nothing breaks visibly when predictions get worse.
- Performance Decay: Accuracy drops over time as the world changes.
- Retraining Loops: Triggering Phase 3 again with fresh data.
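One simple way to connect performance decay to a retraining trigger is a rolling accuracy monitor, sketched below. The class name `AccuracyMonitor`, the window size, and the allowed drop are all illustrative assumptions. The sketch also assumes ground-truth labels eventually arrive for each prediction, which is not always the case in production.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent `window` labeled predictions."""

    def __init__(self, baseline_accuracy: float, window: int = 100,
                 max_drop: float = 0.05):
        self.baseline_accuracy = baseline_accuracy
        self.max_drop = max_drop
        # deque(maxlen=...) discards the oldest outcome automatically.
        self.outcomes = deque(maxlen=window)

    def record(self, prediction, actual) -> None:
        """Store whether a single prediction matched its ground-truth label."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing was recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_retrain(self) -> bool:
        """True once a full window falls more than `max_drop` below baseline."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.rolling_accuracy() < self.baseline_accuracy - self.max_drop
```

When `should_retrain()` returns True, a scheduler or pipeline could kick off the modeling phase again with fresh data. Waiting for a full window before triggering avoids retraining on a handful of unlucky predictions.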