ML project configuration management

Easy configuration handling for complex machine learning pipelines

Exact percentiles in Spark

Combining the power of Scala and Python to make the calculation of percentiles in Spark easy and fast

Arrow 2.0.0 - structs in pandas

Finally, nested types in Arrow.

Speed up conda and improve error messages

Efficient management of Python packages

Time-series visualization in Python

Interactive and scalable plots for time-series and periodicities

Reproducible geospatial visualization in

Reproducible, effortless, and great-looking visualization of geospatial data.

Scalable cohort sampler

Scaling cohort matching in Python using Dask.

Noise pollution data cleanup

Harmonization of GIS data for Austria using Python.