Data Analysis with Spark
Zalando Engineering
FEBRUARY 28, 2018
Problem
As data grows rapidly, we need a tool that can clean the data and train on it fast enough. With large datasets, a job sometimes takes days to finish, which results in some very frustrated data analysts.
Note: Spark RDDs are immutable, and Spark keeps data in memory where possible. It provides in-memory storage for cached RDDs.
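To make the note about immutability and caching concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the `ToyRDD` class, its `evaluations` counter, and the sample data are all hypothetical names introduced for illustration. It assumes only the two ideas from the note: a transformation returns a new dataset instead of changing the old one, and a cached dataset is held in memory after its first evaluation.

```python
from typing import Any, Callable, Iterable, List, Optional


class ToyRDD:
    """Toy stand-in for an immutable, cacheable dataset (hypothetical, not Spark)."""

    def __init__(self, compute: Callable[[], Iterable[Any]]) -> None:
        self._compute = compute              # recipe that produces the elements
        self._use_cache = False              # set by cache()
        self._cached: Optional[List[Any]] = None
        self.evaluations = 0                 # how many times the recipe actually ran

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # A transformation never mutates this dataset; it returns a new one.
        return ToyRDD(lambda: (fn(x) for x in self.collect()))

    def cache(self) -> "ToyRDD":
        # Mark the dataset so its next evaluation is kept in memory.
        self._use_cache = True
        return self

    def collect(self) -> List[Any]:
        # Serve from memory if a cached result already exists.
        if self._use_cache and self._cached is not None:
            return list(self._cached)
        self.evaluations += 1
        result = list(self._compute())
        if self._use_cache:
            self._cached = result
        return list(result)


base = ToyRDD(lambda: range(5))
squares = base.map(lambda x: x * x).cache()

first = squares.collect()    # runs the recipe and stores the result
second = squares.collect()   # served from the in-memory copy
```

In this sketch the second `collect()` does not recompute anything, and `base` is left unchanged by the `map` call. In real Spark the corresponding calls would be `rdd.map(...)`, `rdd.cache()`, and an action such as `rdd.collect()`, with the work distributed across a cluster rather than held in a single Python list.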