Apache-Spark

Cloud arbitrage for spark pipelines
Spark-based data PaaS solutions are convenient. But they come with their own set of challenges such as a high vendor lock-in and obscured costs. We show how to use a dedicated orchestrator (dagster-pipes). It can not only make Databricks an implementation detail but also save cost. Also, it improves developer productivity. It allows you to take back control.
Cloud arbitrage for spark pipelines