
VDSG

Jun 1, 2014
Last updated on Nov 20, 2019
Society
Author: Georg Heiler, senior data expert
My research interests include large geo-spatial time and network data analytics.


© 2025 Georg Heiler. This work is licensed under CC BY NC ND 4.0
