Data links KW 4

Databricks published some free training videos. I particularly like:

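// Assumes a SparkSession named `spark` is in scope (e.g. spark-shell or a notebook).
def cacheAs(df: org.apache.spark.sql.DataFrame, name: String, level: org.apache.spark.storage.StorageLevel): org.apache.spark.sql.DataFrame = {
  // Drop any stale cache entry with this name; ignore the error if the view does not exist yet.
  try spark.catalog.uncacheTable(name)
  catch { case _: org.apache.spark.sql.AnalysisException => () }
  df.createOrReplaceTempView(name)
  spark.catalog.cacheTable(name, level)
  df
}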

which gives cached RDDs nicer names and thus eases debugging.
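A minimal usage sketch of the helper, assuming Spark 2.3 or later (for the `cacheTable` overload that takes a storage level) and a local session. The DataFrame `events`, the view name `events_by_id` and the app name are made up for illustration. In spark-shell or a notebook, `spark` already exists and the builder line can be dropped.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.storage.StorageLevel

// Local session for the demo only; spark-shell and notebooks provide `spark` already.
val spark = SparkSession.builder().master("local[*]").appName("cacheAs-demo").getOrCreate()

def cacheAs(df: DataFrame, name: String, level: StorageLevel): DataFrame = {
  // Drop any stale cache entry with this name; ignore the error if the view does not exist yet.
  try spark.catalog.uncacheTable(name)
  catch { case _: org.apache.spark.sql.AnalysisException => () }
  df.createOrReplaceTempView(name)
  spark.catalog.cacheTable(name, level)
  df
}

// Hypothetical example data: a single-column DataFrame of ids.
val events = spark.range(1000).toDF("id")

// Register and cache it under a readable name.
cacheAs(events, "events_by_id", StorageLevel.MEMORY_AND_DISK)

// Caching is lazy: an action on the cached view materializes the data.
spark.table("events_by_id").count()

// The catalog reports the view as cached.
assert(spark.catalog.isCached("events_by_id"))
```

After the `count()`, the entry should show up in the Storage tab of the Spark UI under the chosen name instead of an anonymous query plan. Calling `cacheAs` again with the same name replaces the earlier view and its cache entry.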

5 Mistakes I Made When Doing Custom Data Visualization With D3.js
Digdag - a new workflow engine
Improve standups with a Slack bot
Production-grade Spark on k8s
Cascading failure of distributed systems
Zeppelin in the Enterprise
Impala - Hive inconsistencies
Higher order functions in Spark 2.4
Think carefully about dependencies!

Linklist

Authors

Dr. Georg Heiler

senior data expert

Georg is a senior data expert at Magenta and an ML-ops engineer at ASCII. He is solving challenges with data. His interests include geospatial graphs and time series. Georg transitions the data platform of Magenta to the cloud and is handling large-scale multi-modal ML-ops challenges at ASCII.
