November, 2017


Data Serialization Formats with Doug Cutting and Julien Le Dem - Episode 8

Data Engineering Podcast

Summary: With the wealth of formats for sending and storing data, it can be difficult to determine which one to use. In this episode, Doug Cutting, creator of Avro, and Julien Le Dem, creator of Parquet, dig into the different classes of serialization formats, what their strengths are, and how to choose one for your workload. They also discuss the role of Arrow as a mechanism for in-memory data sharing and how hardware evolution will influence the state of the art for data formats.


Building a Big Data Culture

Cloudera

In an earlier VISION post, The Five Markers on Your Big Data Journey, Amy O’Connor shared some common traits of many of the most successful data-driven companies. In this blog, I’d like to explore what I believe is the most important of those traits: building and fostering a culture of data. The most important elements in establishing a data-driven culture are having a strong executive sponsor and consistent communication.


Running Kafka Streams applications in AWS

Zalando Engineering

Second in our series about the use of Apache Kafka’s Streams API by Zalando. This is the second in a series about the use of Apache Kafka’s Streams API by Zalando, Europe’s leading online fashion platform. See Ranking Websites in Real-time with Apache Kafka’s Streams API for the first post in the series. This piece was first published on confluent.io. Running Kafka Streams applications in AWS: At Zalando, Europe’s leading online fashion platform, we use Apache Kafka for a wide variety of use cases.


Buzzfeed Data Infrastructure with Walter Menendez - Episode 7

Data Engineering Podcast

Summary: Buzzfeed needs to be able to understand how its users are interacting with the myriad articles, videos, etc. that it posts. This lets the company produce new content that will continue to be well received. To surface the insights needed to grow the business, it needs a robust data infrastructure to reliably capture all of those interactions.


Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.


Cybersecurity On Call: Information War with Bill Gertz

Cloudera

What’s more terrifying: knowing that you just lost your identity, or unknowingly being manipulated? While they both seem awful, they are the reality of the digital world that we live in; just look at the news. Countless articles discuss everything from the recent Equifax hack, in which thousands of social security numbers were compromised, to organizations like Facebook, Google, and Twitter coming forward with Russian accounts that were buying ads to influence US elections.


Cybersecurity in our Connected Future

Cloudera

According to expert analysis, there will be more than 20 billion internet-connected devices by 2020. This profusion of connected devices, of course, is not limited to the private sector: from weapons systems and soldier uniforms to smart military bases and connected vehicles, the government has been an early adopter of the Internet of Things as a means to enhance national defense.




Cloudera in the Cloud (Part 2)

Cloudera

A noteworthy point is that Cloudera complements popular cloud services, such as Amazon Web Services (AWS) and Microsoft Azure. While cloud services do provide useful resources — such as compute instances and object storage on demand — Cloudera offers the unified platform to organize, process, analyze, and store data at large scale… anywhere.


Do We Really Need UI Tests?

Zalando Engineering

Two brothers examine the pros and cons of UI testing. Based on their different experiences in Partner Solutions and Zalando Media Solutions respectively, we speak to frontend developers Vadym Kukhtin and Oleksandr Kukhtin about their opposing opinions on UI testing. The Case Against UI Testing - Vadym. TL;DR: It depends on preference, but I believe that UI testing isn’t required in every instance. In my experience, it is a Sisyphean task to force developers to write even basic unit tests, never mind


Dedicated Ownership for Teams at Zalon

Zalando Engineering

Agile Lead and Software Engineer at Zalon, Jan Helwich, on how to work well. At the beginning of 2017, we at Zalon decided to enable our teams to work in what we believe is the most effective and efficient way. At the heart of this restructuring process, we assigned cross-functional teams to business goals or user needs only and let them take full responsibility for solving these problems.


Zalando Wins Big in Dublin

Zalando Engineering

Ana Peleteiro Ramallo takes ‘Data Scientist of the Year’ award at the DatSci’s. There was a great turnout at the Dublin DatSci Awards at Dublin’s Croke Park, with Data Scientists from across companies, universities, startups and the public sector attending. Zalando Dublin had finalists in two major award categories, backed up with two tables for support.


From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.


Agile Fails

Zalando Engineering

Learning from and overcoming the issues that slow our teams down. As Agile Coaches, we meet a lot of teams, engineers, product people, leads and managers throughout our daily work. Not all of them are agile experts and we understand that there are some misconceptions about “agile”. When we encounter a misconception several times, we see it as a pattern that we can overcome.


Real-time Ranking with Apache Kafka’s Streams API

Zalando Engineering

Using Apache and the Kafka Streams API with Scala on AWS for real-time fashion insights. This piece was originally published on confluent.io. The Fashion Web: Zalando, Europe’s leading online fashion platform, cares deeply about fashion. Our mission statement is to “Reimagine fashion for the good of all”. To reimagine something, first you need to understand it.


Why Event Driven?

Zalando Engineering

Zalando is using an event-driven approach for its new Fashion Platform. Conor Clifford examines why. In a recent post, I wrote about how we went about building the core Article services and applications of Zalando’s new Fashion Platform with a strong event-first focus. That new platform also has a strong overall event-driven focus, rather than a more “traditional” service-oriented approach.


Introducing Cloudera Altus Analytic DB (beta) for Cloud-based Data Warehousing

Cloudera

Today, we are thrilled to announce the upcoming beta release of Cloudera Altus Analytic DB. As the first data warehouse cloud service that brings the warehouse to the data, it delivers instant self-service BI and SQL analytics to anyone – easily, reliably, and securely. Business analysts get iterative and flexible analytics with no limits on the number of users or use cases, and IT can easily manage across all tenants with simplified security and governance.


Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.


The Future of Cloud-based Analytics (Part 3)

Cloudera

As the market moves toward cloud-based big data and analytics, three qualities emerge as vital for success. While many services will get some traction without meeting all three goals, they will also disappoint users and cause perpetual headaches for IT. At Cloudera, we see these indisputable attributes to be: Easy – Certainly no one goes out looking for a harder way to do their job.


Machine Learning, the DOCOMO Digital way: Two Core Use Cases

Cloudera

Pattern recognition. Anomaly detection. Event prediction. All of these capabilities are driven by machine learning (ML). And recently, ML has been a hot topic among our clients. We’ve seen a steep uptick in companies highlighting their successes in ML. Cloudera offers a unified platform for analytics and machine learning. Each customer that we work with has a unique story, executing numerous use cases, sometimes spanning multiple divisions, to gain insights that were impossible to achieve with


Introduction to Six Strategies for Advancing Customer Knowledge

Cloudera

It’s almost indefensible today to say that there is a single more important asset to a modern business than the health and happiness of its customers. We simply cannot grow as an enterprise without the support, voice, and participation of our users and constituents. However, rarely are our customers raising their hands to speak to us. That doesn’t mean that our customers aren’t talking to us.


Applying Data Science to Change Lives

Cloudera

My leadership experience has taught me many things so far: humility, patience, discipline, respect, and inclusion. All these qualities are applicable in martial arts, one of my many passions outside of work. I have also seen some of my early mentors integrate these values into the DNA of their teams and today, I aspire to do the same for the teams I lead at Cloudera.


Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.