project · 2019-2021

ADBConnectors, Databricks integration library

A PySpark-based Python package abstracting the I/O patterns between Azure Databricks and the half-dozen sinks an enterprise pipeline typically needs. Built during the ABN AMRO Bank engagement and open-sourced internally there.

A reusable PySpark integration library that abstracts the I/O patterns between Azure Databricks and the half-dozen sinks an enterprise data pipeline typically touches: Synapse, SQL Server, Cosmos DB, blob / Parquet storage, and JDBC. Built during the ABN AMRO Bank engagement and open-sourced internally to other teams there. It eliminates the boilerplate every team was reinventing per pipeline.

What it gives you

Adoption

Open-sourced inside ABN AMRO so other engineering teams could pick it up. A typical pipeline lost 100+ lines of plumbing per stage and gained consistent error semantics. Reviews stopped including “did you handle the connection close in the failure path?” because the package handled it.
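The connection-close guarantee mentioned above can be illustrated with a small sketch. This is not the ADBConnectors API; `FakeConnection`, `managed_connection`, and `write_rows` are hypothetical names invented for illustration, and the stand-in connection replaces a real sink so the example runs without a Spark cluster. It only shows the general pattern of wrapping a connection in a context manager so the failure path is handled once, in one place.

```python
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a real sink connection (illustration only)."""

    def __init__(self):
        self.closed = False
        self.rows = []

    def write(self, row):
        # Simulate a sink that rejects bad input partway through a write.
        if row is None:
            raise ValueError("cannot write an empty row")
        self.rows.append(row)

    def close(self):
        self.closed = True


@contextmanager
def managed_connection(factory):
    """Open a connection and guarantee it is closed, even if the body raises."""
    conn = factory()
    try:
        yield conn
    finally:
        # Runs on both the success path and the failure path.
        conn.close()


def write_rows(factory, rows):
    """Write rows through a managed connection and return it for inspection."""
    with managed_connection(factory) as conn:
        for row in rows:
            conn.write(row)
    return conn
```

With a wrapper like this, pipeline code calls `write_rows(...)` and never touches `close()` directly. An exception raised mid-write still propagates to the caller, but only after the connection has been released.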

Stack

Python · PySpark · Azure Databricks · Azure Key Vault · Synapse · Cosmos DB · JDBC.
