1.5 Years of Spark Knowledge in 8 Tips
Towards Data Science
DECEMBER 24, 2023
0 - Quick Review
Quickly, let’s review what Spark does… Spark is a big data processing engine. At its lowest level, Spark creates tasks, which are parallelizable transformations on data partitions. Data partitions are subsets of rows of our data. Running those tasks across more machines is horizontal scaling, while giving each machine more resources is vertical scaling. When some partitions hold far more rows than others, that is data skew.
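To make partitions, tasks, and skew concrete, here is a minimal pure-Python sketch. It does not use Spark; the helpers `partition_rows` and `run_task`, the toy hash, and the sample rows are all hypothetical names and data made up for illustration. It only mimics the idea of splitting rows into partitions by key and running one independent task per partition.

```python
def partition_rows(rows, key, num_partitions):
    """Assign each row to a partition by its key (toy stand-in for hash partitioning)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Deterministic toy hash: sum of the character codes of the key value.
        idx = sum(ord(c) for c in str(row[key])) % num_partitions
        partitions[idx].append(row)
    return partitions


def run_task(partition):
    """A 'task': one transformation applied to one partition, independent of the others."""
    return [{**row, "amount": row["amount"] * 2} for row in partition]


# Hypothetical data: customer "a" dominates the rows.
rows = (
    [{"customer": "a", "amount": 1}] * 8
    + [{"customer": "b", "amount": 1}, {"customer": "c", "amount": 1}]
)

partitions = partition_rows(rows, key="customer", num_partitions=3)

# A real engine would run these tasks in parallel; here they run one after another.
results = [run_task(p) for p in partitions]

# Skew check: compare the largest partition with the average partition size.
sizes = [len(p) for p in partitions]
skew_ratio = max(sizes) / (sum(sizes) / len(sizes))

print("partition sizes:", sizes)
print("skew ratio:", round(skew_ratio, 2))
```

Because every row for customer "a" lands in the same partition, one task does most of the work while the other two finish almost immediately. That imbalance is the skew the review refers to.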