Big-Data

Production grade pyspark jobs

Use additional Python packages with PySpark

Dr. Georg Heiler

Deterministic scale-out for spark jobs under increased load

Make Spark jobs scale reliably using iteration

Dr. Georg Heiler

Parallel aggregation of dataframes

Use the idempotency of RDDs to your advantage

Dr. Georg Heiler

Geospatial binning with hexagons on spark

Bring hexagons to Spark for efficient spatial operations

Dr. Georg Heiler

data processing

A recent history of data processing.

Dr. Georg Heiler

Spark descriptive name for cached dataframes

Display user-friendly names for cached tables in the Spark web UI

Dr. Georg Heiler

Solve data skew issues for array columns in spark

Prevent data skew issues for array columns.

Dr. Georg Heiler

Ultimate open vector geoprocessing on spark

Combine the strengths of GeoMesa and GeoSpark for ultimate geoprocessing capabilities on Spark

Dr. Georg Heiler

Analyze OSM data in spark

Analyze the OSM community and extract geometries from the graph.

Dr. Georg Heiler

OSM to Spark

Processing OSM in a scalable, Hadoop-native way.

Dr. Georg Heiler