Sat. May 23, 2020 - Fri. May 29, 2020

Tips on Data Science Masters in Germany

Team Data Science

Should you do a master's degree in data science in Germany? Why not, but keep the following in mind. In general, studying in Germany is very practical because it doesn't cost a lot of money, unlike, for example, in the USA. So if you are interested, you should first think about what the corresponding master's programme actually covers.

Data Engineering Project for Beginners - Batch edition

Start Data Engineering

Introduction Approach Project overview Engineering Design Airflow Primer: Setup Code and explanation Stage 1. pg -> file -> s3 Stage 2. file -> s3 -> EMR -> s3 Stage 3. movie_review_stage, user_purchase_stage -> Redshift table -> quality Check data Monitoring ETL Design Review Common Scenarios Next Steps Conclusion

Introduction: Starting out in data engineering can be a little intimidating, especially because data engineering involves a lot of moving parts.

Mapping The Customer Journey For B2B Companies At Dreamdata

Data Engineering Podcast

Summary: Gaining a complete view of the customer journey is especially difficult in B2B companies. This is due to the number of different individuals involved and the myriad ways that they interface with the business. Dreamdata integrates data from the multitude of platforms that are used by these organizations so that they can get a comprehensive view of their customer lifecycle.

Learning All About Wi-Fi Data with Apache Kafka and Friends

Confluent

Recently, I’ve been looking at what’s possible with streams of Wi-Fi packet capture (pcap) data. I was prompted after initially setting up my Raspberry Pi to capture pcap data and […].

Kafka 122

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Jupyter Notebooks or Standalone Scripts?

Team Data Science

Lots of people like notebooks, and so do I. Jupyter notebooks, for instance, are great for quickly exploring some data or trying something out. If you want to bring code into production, however, you should, or most likely have to, write standalone scripts. If you want to build something for production and then run it there, Jupyter notebooks are not ideal.
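
To make the contrast concrete, here is a minimal sketch of what a standalone script can look like, as opposed to notebook cells run by hand. It is not taken from the article: the `summarize` function, the `value` column, and the command-line options are all hypothetical names chosen for illustration. The point is only the layout: logic in importable functions, inputs passed as arguments, and a single entry point.

```python
"""Illustrative standalone-script layout. All names here are hypothetical."""
import argparse
import csv


def summarize(rows, column):
    """Return (count, mean) for a numeric column in an iterable of dict rows."""
    values = [float(row[column]) for row in rows]
    if not values:
        return 0, 0.0
    return len(values), sum(values) / len(values)


def parse_args(argv=None):
    """Parse command-line options; argv=None means use the real command line."""
    parser = argparse.ArgumentParser(description="Summarize one numeric CSV column.")
    parser.add_argument("path", nargs="?", help="CSV file to read")
    parser.add_argument("--column", default="value", help="numeric column name")
    return parser.parse_args(argv)


def main(argv=None):
    """Entry point: read the CSV file, print a one-line summary, return an exit code."""
    args = parse_args(argv)
    if args.path is None:
        print("no input file given; nothing to do")
        return 0
    with open(args.path, newline="") as handle:
        count, mean = summarize(csv.DictReader(handle), args.column)
    print(f"rows={count} mean={mean:.2f}")
    return 0


if __name__ == "__main__":
    main()
```

Saved under an assumed name such as `summarize.py`, it could be run as `python summarize.py data.csv --column value`. Because the logic sits in plain functions rather than in cell state, a scheduler can call it and a test can import `summarize` directly.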

Coding 130

How to Balance Efficiency and Risk in Your Supply Chain

Teradata

Supply Chain organizations need visibility now to leverage data for making decisions and taking action, both in times of crisis and in relative stability.

Data 111

More Trending

Building a Clickstream Dashboard Application with ksqlDB and Elasticsearch

Confluent

Using a powerful, event-driven application can help you unlock insights contained in the event streams of your business. Before we get into the technology, let’s go over some questions you […].

Building 117

How to develop Spark applications with Zeppelin notebooks

Team Data Science

I love working with Zeppelin notebooks. It's so simple, and you can just try something out. Working with DataFrames and Spark SQL in particular is a blast. What is Zeppelin? Zeppelin is a notebook tool, just like Jupyter. You can run it on a server, on your Hadoop cluster, or wherever. And it can run Spark jobs in the background.

Hadoop 130

Using Advanced Analytics to Predict the Onset of a Cytokine Storm

Teradata

A team of Teradata data scientists, clinicians & engineers set out to build a model that could track and predict the onset of a Cytokine Storm.

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

Netflix Tech

How Netflix is able to enrich VPC Flow Logs at Hyper Scale to provide Network Insight. By Hariharan Ananthakrishnan and Angela Ho. The Cloud Network Infrastructure that Netflix utilizes today is a large distributed ecosystem that consists of specialized functional tiers and services such as DirectConnect, VPC Peering, Transit Gateways, NAT Gateways, etc.

AWS 60

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Best Practices to Secure Your Apache Kafka Deployment

Confluent

For many organizations, Apache Kafka® is the backbone and source of truth for data systems across the enterprise. Protecting your event streaming platform is critical for data security and often […].

Kafka 111

Elastically Scaling Confluent Platform on Kubernetes

Confluent

This month, we kicked off Project Metamorphosis by introducing several Confluent features that make Apache Kafka® clusters more elastic—the first of eight foundational traits characterizing cloud-native data systems that map […].

Kafka 75

Integrating Teradata Vantage with AWS Glue

Teradata

Many Teradata customers are interested in integrating Teradata Vantage with AWS First Party Services. This Getting Started Guide can help. Read more.

AWS 52