project · 2019-2021
ADBConnectors, Databricks integration library
A reusable PySpark integration library that abstracts the I/O patterns between Azure Databricks and the half-dozen sinks an enterprise data pipeline typically touches: Synapse, SQL Server, Cosmos DB, blob / Parquet, and JDBC. Built during the ABN AMRO Bank engagement and open-sourced internally to other teams there. It eliminates the boilerplate every team was reinventing per pipeline.
What it gives you
- One unified API: read(source, ...) / write(target, ...) across all supported backends.
- Credential injection from Azure Key Vault baked in, no secrets in notebooks.
- Connection pooling at the cluster level, not the notebook level.
- Partition-aware writes: dynamic partition overwrite semantics for Synapse, schema-evolution policies handled at the package level.
Adoption
Open-sourced inside ABN AMRO so other engineering teams could pick it up. A typical pipeline lost 100+ lines of plumbing per stage and gained consistent error semantics. Reviews stopped including “did you handle the connection close in the failure path?” because the package handled it.
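One common way to get both properties, a guaranteed close on the failure path and a single error type, is a context manager. The sketch below illustrates that pattern only; ConnectorError, FakeConnection, and managed are hypothetical names, not the package's API.

```python
from contextlib import contextmanager
from typing import Iterator


class ConnectorError(Exception):
    """Hypothetical single error type raised for any backend failure."""


class FakeConnection:
    """Stand-in for a real connection; it only tracks whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


@contextmanager
def managed(connection: FakeConnection, stage: str) -> Iterator[FakeConnection]:
    try:
        yield connection
    except Exception as exc:
        # Re-raise every failure as one error type, keeping the original cause.
        raise ConnectorError(f"{stage} failed: {exc}") from exc
    finally:
        # Runs on success and on failure, so the close is never skipped.
        connection.close()


conn = FakeConnection()
message = ""
try:
    with managed(conn, stage="write"):
        raise RuntimeError("network dropped")
except ConnectorError as err:
    message = str(err)
```

Because the close lives in the finally clause, pipeline code that uses the wrapper cannot forget it, which is the kind of check that no longer needs to be made in review.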
Stack
Python · PySpark · Azure Databricks · Azure Key Vault · Synapse · Cosmos DB · JDBC.