project · 2019-2021

ADBConnectors, Databricks integration library

A reusable PySpark integration library that abstracts the I/O patterns between Azure Databricks and the half-dozen sinks an enterprise data pipeline typically touches: Synapse, SQL Server, Cosmos DB, blob / Parquet, and JDBC. Built during the ABN AMRO Bank engagement and open-sourced internally to other teams there. It eliminates the boilerplate every team was reinventing per pipeline.
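To make the "one abstraction over many sinks" idea concrete, here is a minimal sketch of what such a layer could look like. It is not the package's actual API: `SinkSpec`, `jdbc_sink`, `parquet_sink`, and `write_frame` are hypothetical names chosen for illustration. The only real calls assumed are the standard PySpark `DataFrameWriter` methods `format`, `options`, `mode`, and `save`, plus the built-in `jdbc` and `parquet` format names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class SinkSpec:
    """Hypothetical description of one sink: a Spark format name plus its options."""
    format: str
    options: Dict[str, str] = field(default_factory=dict)


def jdbc_sink(url: str, table: str) -> SinkSpec:
    # "jdbc" with "url" and "dbtable" are the options of Spark's built-in JDBC source.
    return SinkSpec("jdbc", {"url": url, "dbtable": table})


def parquet_sink(path: str) -> SinkSpec:
    # File-based sources accept the target location as the "path" option.
    return SinkSpec("parquet", {"path": path})


def write_frame(df: Any, sink: SinkSpec, mode: str = "append") -> None:
    """Write a DataFrame to any sink through one code path.

    `df` is expected to be a pyspark.sql.DataFrame; it is typed as Any so the
    sketch can be read and exercised without a Spark installation.
    """
    df.write.format(sink.format).options(**sink.options).mode(mode).save()
```

Under these assumptions a pipeline stage would call something like `write_frame(df, jdbc_sink(url, "dbo.orders"))` and never touch format strings or option names directly, which is where the per-pipeline boilerplate tends to accumulate.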

What it gives you

Adoption

Open-sourced inside ABN AMRO so other engineering teams could pick it up. A typical pipeline lost 100+ lines of plumbing per stage and gained consistent error semantics. Reviews stopped including “did you handle the connection close in the failure path?” because the package handled it.
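The "connection close in the failure path" guarantee is the kind of thing a context manager handles well. The sketch below is an assumption about how such a guarantee can be implemented, not the package's code: `managed_connection` and `ConnectorError` are hypothetical names, and the standard-library `sqlite3` module stands in for a real JDBC or SQL Server connection so the example runs anywhere.

```python
import sqlite3
from contextlib import contextmanager
from typing import Callable, Iterator


class ConnectorError(Exception):
    """Hypothetical single error type surfaced to pipeline code."""


@contextmanager
def managed_connection(
    connect: Callable[[], sqlite3.Connection],
) -> Iterator[sqlite3.Connection]:
    """Yield a connection that is always closed, on success and on failure."""
    conn = connect()
    try:
        yield conn
        conn.commit()  # reached only if the caller's block did not raise
    except sqlite3.Error as exc:
        conn.rollback()
        # Re-raise driver errors as one consistent type, keeping the cause.
        raise ConnectorError(f"sink operation failed: {exc}") from exc
    finally:
        conn.close()  # runs on both the success and the failure path


# Usage: the caller never writes close() or rollback() itself.
with managed_connection(lambda: sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE events (id INTEGER)")
    conn.execute("INSERT INTO events VALUES (1)")
```

Because cleanup and error translation live in one place, callers see the same exception type from every sink, and a reviewer no longer has to check each pipeline for a missing `close()`.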

Stack

Python · PySpark · Azure Databricks · Azure Key Vault · Synapse · Cosmos DB · JDBC.
