1.5 Years of Spark Knowledge in 8 Tips

Towards Data Science

0. Quick Review

Quickly, let's review what Spark does. Spark is a big data processing engine. At its lowest level, Spark creates tasks, which are parallelizable transformations on data partitions. Data partitions are subsets of the rows of our data. This is horizontal/vertical scaling. This is data skew.
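To make the task/partition idea concrete, here is a minimal sketch in plain Python. It is not Spark's API: the helper names (`make_partitions`, `task`, `run_tasks`) and the doubling transformation are hypothetical, and a thread pool stands in for Spark's executors. It only illustrates the shape of the idea: rows are split into partitions, and the same transformation runs on each partition independently.

```python
from concurrent.futures import ThreadPoolExecutor


def make_partitions(rows, num_partitions):
    """Split rows into contiguous, roughly equal subsets (the partitions)."""
    size, extra = divmod(len(rows), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions take one additional row each.
        end = start + size + (1 if i < extra else 0)
        partitions.append(rows[start:end])
        start = end
    return partitions


def task(partition):
    """One task: apply the same transformation to every row of one partition."""
    return [row * 2 for row in partition]


def run_tasks(rows, num_partitions):
    """Run one task per partition in parallel, then stitch the results together."""
    partitions = make_partitions(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        # pool.map preserves partition order, so the output order matches the input.
        results = list(pool.map(task, partitions))
    return [row for part in results for row in part]


rows = list(range(10))
partitions = make_partitions(rows, 3)
doubled = run_tasks(rows, 3)
print([len(p) for p in partitions])
print(doubled)
```

Because each task only touches its own partition, the tasks need no coordination with one another, which is what makes the work parallelizable. In this sketch the partitions are nearly equal in size; when one partition holds far more rows than the others, its task takes longer and the rest sit idle waiting for it.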
