Build a cloud-agnostic data stack with DuckDB, Dagster, and DuckLake: fast, portable, and free of vendor lock-in. Tired of cloud lock-in and surprise bills? This talk shows how to build a fast, portable analytics stack around DuckDB and Dagster. On our journey to sovereignty and scale, we touch on pg_duckdb, which combines the strengths of DuckDB with the proven operability of Postgres. We then move on to DuckLake and how it enables simple yet scalable collaboration on data. Expect live SQL, practical orchestration tips, and a blueprint you can run locally, on-prem, or in the cloud, without giving up control.
Oct 14, 2025
Data dependencies, tool silos, and development processes do not have to be obstacles; they can be deliberately turned into an integral part of a high-performing data platform. This post examines the fundamental principles of modern data platforms, outlines current challenges in data engineering, and presents approaches that enable a scalable architecture. The focus is on the orchestrator as the central element for reducing complexity and supporting sustainable growth.
Sep 15, 2025
Enterprise-grade artifact management for pixi
Jun 23, 2025
Magenta Telekom ingests many terabytes of new data every day, and every downstream consumer wants it immediately. The real bottleneck turned out not to be hardware but humans wrestling with hidden, hard-wired dependencies across hundreds of heterogeneous pipelines and, in some cases, tool silos.
Jun 16, 2025
Join the Vienna Data Engineering Meetup on the 9th of April. Scaling data pipelines @Telekom: Aleks and Georg share how Magenta Telekom handles its data challenges by turning data dependencies, tooling silos, and development from a problem into a helpful component of Magenta's data platform. They start by covering the basic principles underpinning the data platform, discuss the current challenges in data engineering, and share how these are solved at Magenta. They finish by showing how building around the orchestrator makes it easy to build for scale.
Apr 9, 2025
Sharing my experience migrating from conda-build to rattler-build for Python noarch packages.
Apr 1, 2025
20-minute impulse talk about open data and principles of handling data.
Slides: https://docs.google.com/presentation/d/1W7MiXO-6qrYADONrIiJdOtY_zFPB9QSghJ_JjMiVaAU/edit?slide=id.g33e86fc9cec_0_15#slide=id.g33e86fc9cec_0_15
Event: https://www.data.gv.at/2025/03/19/open-data-hackathon-im-metalab-wien-datenschaetze-entdecken-und-nutzbar-machen/
Some interesting learnings on high-value open data sets:
https://www.data.gv.at/en/2023/01/26/innovation-potential-through-public-data-eu-commission-obliges-member-states-to-release-high-value-data-sets/
https://www.data.gv.at/katalog/dataset/e91bd464-be86-453c-b693-2ab818e11df2
https://www.data.gv.at/wp-content/uploads/2023/01/Auflistung-HVDs-Details_datagvat_26012023.pdf
https://justizonline.gv.at/jop/web/iwg
https://www.offenerhaushalt.at/
Data how-to:
https://georgheiler.com/post/learning-data-engineering/
https://github.com/l-mds/local-data-stack
Mar 21, 2025
A comprehensive guide to modern data engineering with local-first development practices
Mar 14, 2025
Pixi is a tool that enables efficient dependency handling. It is created by prefix.dev, written in Rust, and very fast. For us at Magenta Telekom in Austria, Pixi is beneficial because we build our new data platform around metadata, strong governance, and an explicit graph of data dependencies. In this talk we share our experience with Pixi and how it empowers our data infrastructure in conjunction with Dagster.
Jan 31, 2025
Jumpstart your data processing with this local modern data stack template
Oct 25, 2024