Georg Heiler
Apache-Spark
Production grade pyspark jobs
Use additional python packages with pyspark
Georg Heiler
Last updated on Nov 15, 2020
2 min read
Deterministic scale-out for spark jobs under increased load
Make spark jobs scale reliably using iteration
Georg Heiler
Dec 13, 2019
2 min read
Spark and Hive 3
Get spark and Hive to play nice again on HDP 3.1
Georg Heiler
Dec 10, 2019
2 min read
Parallel aggregation of dataframes
Use the idempotency of RDDs to your advantage
Georg Heiler
Last updated on Jun 13, 2019
1 min read
Geospatial binning with hexagons on spark
Bring hexagons as efficient spatial operations to spark
Georg Heiler
Last updated on Jun 2, 2019
2 min read
data processing
A recent history of data processing.
Aug 4, 2019 9:00 AM — 11:00 AM
Yogyakarta, Indonesia
Georg Heiler
Spark descriptive name for cached dataframes
Display user-friendly names for cached tables in the Spark web UI
Georg Heiler
Jul 23, 2019
1 min read
Solve data skew issues for array columns in spark
Preventing data skew issues for Arrays.
Georg Heiler
Jun 13, 2019
4 min read
Ultimate open vector geoprocessing on spark
Combine the strengths of geomesa and geospark for ultimate geoprocessing capabilities on spark
Georg Heiler
Last updated on Jun 2, 2019
4 min read
Analyze OSM data in spark
Analyze the OSM community and extract geometries from the graph.
Georg Heiler
May 7, 2019
2 min read