Big-Data

Cloud arbitrage for spark pipelines

Spark-based data PaaS solutions are convenient, but they come with their own challenges, such as high vendor lock-in and obscured costs. We show how to use a dedicated orchestrator (dagster-pipes). It can not only make Databricks an implementation detail but also save cost and improve developer productivity. It allows you to take back control.

Jun 21, 2024 12:00 AM — 12:00 AM

Georg Heiler, Hernan Picatto

Introduction to Geostatistics

Delve into the world of geostatistics with our introductory talk! Explore how spatial data analysis and modeling techniques unlock …

Nov 10, 2023 12:00 AM Indonesia

Georg Heiler

Visual analytics of mobility network changes observed using mobile phone data during COVID-19 pandemic

The limited exchange between human communities is a key factor in preventing the spread of COVID-19. This paper introduces a digital …

Mohammad Forghani, Christophe Claramunt, Farid Karimipour, Georg Heiler

Efficient Temporal Graph Analytics

Using large-scale telecommunication data for mobility modeling and infrastructure maintenance

Oct 19, 2022 12:00 AM Indonesia

Georg Heiler

AI-based root cause analysis of CPD noise sources in DOCSIS networks (AI basierte Root Cause Analyse von CPD Störquellen in Docsis Netzen)

Good-quality network connectivity is ever more important. For hybrid fiber coaxial (HFC) networks, searching for upstream high noise was cumbersome and time-consuming in the past. Even with machine learning, the task remains challenging due to the heterogeneity of the network and its topological structure. We present the automation of a simple business rule (largest change of a specific value), compare its performance with state-of-the-art machine-learning methods, and conclude that precision@1 can be improved by a factor of 2.3. Because it is best when a fault does not occur in the first place, we also evaluate multiple approaches to forecasting network faults, which would enable predictive maintenance on the network.

May 10, 2022 12:00 AM — May 12, 2022 12:00 AM

Georg Heiler

Identifying the root cause of cable network problems with machine learning

Good quality network connectivity is ever more important. For hybrid fiber coaxial (HFC) networks, searching for upstream high noise in …

Georg Heiler, Thassilo Gadermaier, Thomas Haider, Allan Hanbury, Peter Filzmoser

Monitoring supply networks from mobile phone data for estimating the systemic risk of an economy

Remarkably little is known about the structure, formation, and dynamics of supply- and production networks that form one foundation of …

Tobias Reisch, Georg Heiler, Christian Diem, Stefan Thurner

Mobility analytics

Mobility monitoring in Austria during the COVID-19 non-pharmaceutical interventions (NPIs)

Nov 16, 2020 12:00 AM Vienna, Austria

Georg Heiler

Run the latest version of spark

Execute the latest version of Spark on HDP.

Georg Heiler

Aug 31, 2020 4 min read

Production grade pyspark jobs

Use additional Python packages with PySpark

Georg Heiler

Last updated on Nov 15, 2020 2 min read