Apache-Spark

Run the latest version of spark

Execute the latest version of Spark on HDP.

Dr. Georg Heiler

Production grade pyspark jobs

Use additional Python packages with PySpark

Dr. Georg Heiler

Deterministic scale-out for spark jobs under increased load

Make Spark jobs scale reliably using iteration

Dr. Georg Heiler

Spark and Hive 3

Get Spark and Hive to play nicely again on HDP 3.1

Dr. Georg Heiler

Parallel aggregation of dataframes

Use the idempotency of RDDs to your advantage

Dr. Georg Heiler

Geospatial binning with hexagons on spark

Bring hexagon-based, efficient spatial operations to Spark

Dr. Georg Heiler

data processing

A recent history of data processing.

Dr. Georg Heiler

Spark descriptive name for cached dataframes

Display user-friendly names for cached tables in the Spark web UI

Dr. Georg Heiler

Solve data skew issues for array columns in spark

Preventing data skew issues for arrays.

Dr. Georg Heiler

Ultimate open vector geoprocessing on spark

Combine the strengths of GeoMesa and GeoSpark for ultimate geoprocessing capabilities on Spark

Dr. Georg Heiler