project · 2019-2021
Production ML system for ABN AMRO Bank
Led an ML team of 3 engineers building a content-classification and Named Entity Recognition system for ABN AMRO. Processed thousands of customer emails daily at 95%+ accuracy. End-to-end MLOps with Azure Data Factory, Databricks, and MLflow.
A production machine-learning system for ABN AMRO Bank that classified incoming customer emails by topic, extracted entities (account numbers, names, dates, transaction references), and routed them to the right operations queue. Processed thousands of emails daily at 95%+ accuracy.
What it did
- Content-based classification of inbound emails into operational categories (mortgage, retail, fraud-alert, account-services, complaints).
- Named Entity Recognition for the entities operations agents needed to act on the email: customer ID, account number, transaction reference, dates, monetary amounts.
- Routing: classified-and-tagged emails were dispatched to the appropriate downstream queue, replacing a manual triage step.
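The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production code: the queue names, the `route` function, the confidence threshold with its manual-triage fallback, and the regex patterns are all hypothetical. In particular, the regexes stand in for the trained NER model so that the example stays self-contained.

```python
import re
from dataclasses import dataclass, field

# Hypothetical queue names; the real identifiers are not part of this write-up.
QUEUE_BY_CATEGORY = {
    "mortgage": "queue-mortgage",
    "retail": "queue-retail",
    "fraud-alert": "queue-fraud",
    "account-services": "queue-account-services",
    "complaints": "queue-complaints",
}
MANUAL_TRIAGE_QUEUE = "queue-manual-triage"  # assumed fallback, not from the source

# Regex stand-ins for the trained NER model; patterns are illustrative only.
ENTITY_PATTERNS = {
    "account_number": re.compile(r"\bNL\d{2}[A-Z]{4}\d{10}\b"),  # Dutch IBAN shape
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "amount": re.compile(r"(?:EUR|€)\s?\d+(?:[.,]\d{2})?"),
}


@dataclass
class RoutedEmail:
    category: str
    queue: str
    entities: dict = field(default_factory=dict)


def extract_entities(text):
    """Return every pattern match found in the text, keyed by entity type."""
    found = {}
    for name, pattern in ENTITY_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


def route(text, category, confidence, min_confidence=0.8):
    """Pick a queue from the predicted category and attach extracted entities.

    Low-confidence or unknown categories go to manual triage (an assumption).
    """
    entities = extract_entities(text)
    if confidence < min_confidence or category not in QUEUE_BY_CATEGORY:
        queue = MANUAL_TRIAGE_QUEUE
    else:
        queue = QUEUE_BY_CATEGORY[category]
    return RoutedEmail(category, queue, entities)


email = "Please check the transfer of EUR 250.00 from NL91ABNA0417164300 on 2020-03-14."
print(route(email, category="fraud-alert", confidence=0.97))
```

Keeping the category-to-queue mapping as plain data is one way a new category could be added without touching the routing logic.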
How it was built
- Pipelines: Azure Data Factory for orchestration; Databricks notebooks for feature extraction and training.
- Model training and versioning: MLflow tracked experiments, registered models, and served the production version. Promotion from staging to production was a deliberate review-and-approve step, not an auto-deploy.
- Production serving: containerised model behind a REST endpoint; the email pipeline called it synchronously per message.
- Evaluation: per-class F1 over a held-out set updated weekly. Drift detection on input distributions flagged when model retraining was warranted.
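The evaluation bullet can be made concrete with a small, dependency-free sketch. Two assumptions are worth flagging: the write-up does not say which drift statistic was used, so the Population Stability Index (PSI) here is a stand-in, and the monitored feature (something like email length in tokens) is hypothetical. The function names are illustrative as well.

```python
import math
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample and a recent sample of one numeric feature.

    Bin edges come from the baseline; out-of-range values fall in the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps avoids log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


y_true = ["fraud-alert", "mortgage", "mortgage", "retail"]
y_pred = ["fraud-alert", "mortgage", "retail", "retail"]
print(per_class_f1(y_true, y_pred))

baseline = list(range(100))           # hypothetical feature values at training time
recent = [v + 50 for v in baseline]   # same feature, shifted upward
print(population_stability_index(baseline, recent))
```

A PSI above roughly 0.25 is a common rule of thumb for a shift worth investigating; the threshold that would actually trigger retraining is a judgment call per feature.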
My role
Led the engineering team of 3. Owned the model lifecycle from experimentation through production deployment, MLOps practices (versioning, training reproducibility, model promotion), and the technical design reviews that kept the system aligned with the bank’s compliance posture.
Why this earns a spot in projects
ML systems shipped to a regulated industry (banking) carry constraints that toy projects do not: auditability, reproducibility, change control, hand-off rituals between data science and ops. Building one early in my career taught me that the system around the model matters more than the model. By the time I moved on, adding a new email category took less than a sprint to train, evaluate, and roll out, because the framework around the model was solid. That bar still informs every ML system I touch.
Stack
Python · Azure Databricks · Azure Data Factory · MLflow · scikit-learn · Hugging Face Transformers · Docker.