Data links KW 4
Jan 27, 2019
Dr. Georg Heiler
- Databricks published some free training videos. I particularly like this snippet:
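def cacheAs(df: org.apache.spark.sql.DataFrame, name: String, level: org.apache.spark.storage.StorageLevel): org.apache.spark.sql.DataFrame = {
  // Drop any previous cache entry registered under this name; ignore the error if none exists.
  try spark.catalog.uncacheTable(name)
  catch { case _: org.apache.spark.sql.AnalysisException => () }
  // Register the DataFrame as a temp view so the cache entry carries a readable name.
  df.createOrReplaceTempView(name)
  spark.catalog.cacheTable(name, level)
  df
}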
which gives cached RDDs nicer names and thus eases debugging.
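To see the helper in action, here is a minimal sketch of how it might be called. It assumes Spark 2.3 or later (where `spark.catalog.cacheTable` accepts a storage level) and that `spark-sql` is on the classpath. The local session, the `events` DataFrame and the view name `events_cached` are made up for illustration; in a notebook or `spark-shell` the `spark` value already exists.

```scala
import org.apache.spark.sql.{AnalysisException, DataFrame, SparkSession}
import org.apache.spark.storage.StorageLevel

object CacheAsExample {
  // Assumption: a local session for illustration only.
  val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("cacheAs-example")
    .getOrCreate()

  // Same helper as in the post, with imports instead of fully qualified names.
  def cacheAs(df: DataFrame, name: String, level: StorageLevel): DataFrame = {
    try spark.catalog.uncacheTable(name)
    catch { case _: AnalysisException => () }
    df.createOrReplaceTempView(name)
    spark.catalog.cacheTable(name, level)
    df
  }

  def main(args: Array[String]): Unit = {
    import spark.implicits._

    // Hypothetical toy data.
    val events = Seq((1, "a"), (2, "b")).toDF("id", "label")

    cacheAs(events, "events_cached", StorageLevel.MEMORY_AND_DISK)

    // Catalog caching is lazy, so run an action against the view to materialize it.
    spark.table("events_cached").count()

    // Reports whether the named view is registered in the cache.
    println(spark.catalog.isCached("events_cached"))

    spark.stop()
  }
}
```

Once the action has run, the entry should be listed under the chosen name in the Storage tab of the Spark UI, rather than under a generated plan description.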

Authors
senior data expert
Georg is a senior data expert at Magenta and an ML-ops engineer at ASCII.
He solves challenges with data. His interests include geospatial graphs
and time series. Georg is moving Magenta's data platform to the cloud
and handles large-scale multi-modal ML-ops challenges at ASCII.