September, 2019

Ship Faster With An Opinionated Data Pipeline Framework

Data Engineering Podcast

Summary: Building an end-to-end data pipeline for your machine learning projects is a complex task, made more difficult by the variety of ways that you can structure it. Kedro is a framework that provides an opinionated workflow that lets you focus on the parts that matter, so that you don’t waste time on gluing the steps together. In this episode Tom Goldenberg explains how it works, how it is being used at Quantum Black for customer projects, and how it can help you structure your own.

Which Data Science Skills are core and which are hot/emerging ones?

KDnuggets

We identify two main groups of Data Science skills: A: 13 core, stable skills that most respondents have and B: a group of hot, emerging skills that most do not have (yet) but want to add. See our detailed analysis.

How Artificial Intelligence & Deep Learning Change the Game

Teradata

AI & Deep Learning allow organizations to maximize player performance while minimizing player risk through better insights from performance and wellness data.

Grafana Time-Series Dashboards with the Rockset-Grafana Plugin

Rockset

What Is Grafana? Grafana is an open-source software platform for time series analytics and monitoring. You can connect Grafana to a large number of data sources, from PostgreSQL to Prometheus. Once your data source is connected, you can use a built-in query control or editor to fetch data, and build dashboards from your data source. Grafana is frequently deployed for a wide variety of use cases, including DevOps and AdTech.

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Story about AWS RDS upgrade to AWS Aurora and InnoDB adaptive hash index parameter

nodeSWAT

Story about unexpected slowdown during AWS RDS upgrade to AWS Aurora and InnoDB adaptive hash index parameter TL;DR at the end. The parameter. MySQL 5.7 documentation about InnoDB adaptive hash index. Turning this parameter ON enables the database engine to analyze index searches and to automatically adapt to the queries/searches you are running. It does so by making custom indexes for these specific cases, in return making your queries run faster because they can now use the automatically gener

Choosing a Reactive Programming Framework for Modern Android Development

Pandora Engineering

When embarking on the journey of developing a new application, a team must establish the foundational technologies upon which their…

10 Great Python Resources for Aspiring Data Scientists

KDnuggets

This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.

Teradata Certification Program Embraces Vantage

Teradata

The Teradata Certification program is celebrating its 20th anniversary! Find out how it can advance your career by making you a certified expert on Vantage.

Real-Time Analytics and Monitoring Dashboards with Apache Kafka and Rockset

Confluent

In the early days, many companies simply used Apache Kafka ® for data ingestion into Hadoop or another data lake. However, Apache Kafka is more than just messaging. The significant difference today is that companies use Apache Kafka as an event streaming platform for building mission-critical infrastructures and core operations platforms. Examples include microservice architectures, mainframe integration, instant payment, fraud detection, sensor analytics, real-time monitoring, and many more—dri

Evolving Regional Evacuation

Netflix Tech

Niosha Behnam | Demand Engineering @ Netflix At Netflix we prioritize innovation and velocity in pursuit of the best experience for our 150+ million global customers. This means that our microservices constantly evolve and change, but what doesn’t change is our responsibility to provide a highly available service that delivers 100+ million hours of daily streaming to our subscribers.

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

AsyncTask, Rx, and Coroutines… Oh My!

Pandora Engineering

Credit: Sally Anscombe. An Android Apprentice’s journey to understand Pandora’s migration from AsyncTask to newer APIs. During my second month as an Android Engineer Apprentice, I was tasked with migrating AsyncTask to newer APIs. Early on, I was asked, “Do you know why we are migrating from AsyncTask?” I wracked my brain and answered shyly, “It has something to do with memory leaks?

Navigating Boundless Data Streams With The Swim Kernel

Data Engineering Podcast

Summary: The conventional approach to analytics involves collecting large amounts of data that can be cleaned, followed by a separate step for analysis and interpretation. Unfortunately this strategy is not viable for handling real-time, real-world use cases such as traffic management or supply chain logistics. In this episode Simon Crosby, CTO of Swim Inc., explains how the SwimOS kernel and the enterprise data fabric built on top of it enable brand new use cases for instant insights.

12 Deep Learning Researchers and Leaders

KDnuggets

Our list features the deep learning researchers and industry leaders you should follow to stay current with this wildly expanding field in AI. From early practitioners and established academics to entrepreneurs and today’s top corporate influencers, this diverse group of individuals is leading the way into tomorrow’s deep learning landscape.

Vantage: A Cloud-First Integrated Data & Analytics Platform

Teradata

There are a lot of misperceptions about Teradata. Learn more about what Teradata Vantage really is: a cloud-first integrated data and analytics platform.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

How to Use Schema Registry and Avro in Spring Boot Applications

Confluent

TL;DR. Following on from How to Work with Apache Kafka in Your Spring Boot Application, which shows how to get started with Spring Boot and Apache Kafka®, here I will demonstrate how to enable usage of Confluent Schema Registry and the Avro serialization format in your Spring Boot applications. Using Avro schemas, you can establish a data contract between your microservices applications.

Scaling a Mature Data Pipeline - Managing Overhead

Airbnb Tech

There is often a hidden performance cost tied to the complexity of data pipelines: the overhead. In this post, we introduce the concept and examine the techniques we use to avoid it in our data pipelines. Author: Zachary Ennenga. Background: There is often a natural evolution in the tooling, organization, and technical underpinning of data pipelines.

Outside Lands, Airbnb Prices, and Rockset’s Geospatial Queries

Rockset

Airbnb Prices Around Major Events Operational analytics on real-time data streams requires being able to slice and dice it along all the axes that matter to people, including time and space. We can see how important it is to analyze data spatially by looking at an app that’s all about location: Airbnb. Major events in San Francisco cause huge influxes of people, and Airbnb prices increase accordingly.

Building A Community For Data Professionals at Data Council

Data Engineering Podcast

Summary: Data professionals are working in a domain that is rapidly evolving. In order to stay current we need access to deeply technical presentations that aren’t burdened by extraneous marketing. To fulfill that need Pete Soderling and his team have been running the Data Council series of conferences and meetups around the world. In this episode Pete discusses his motivation for starting these events, how they serve to bring the data community together, and the observations that he has ma

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

TensorFlow vs PyTorch vs Keras for NLP

KDnuggets

These three deep learning frameworks are your go-to tools for NLP, so which is the best? Check out this comparative analysis based on the needs of NLP, and find out where things are headed in the future.

How Reinforcement Learning is Changing Customer Engagement

Teradata

Companies are increasingly exploring opportunities to apply reinforcement learning to their most challenging problems. Learn what applications work the best.

How to Make the Most of Kafka Summit San Francisco 2019

Confluent

Kafka Summit San Francisco is just one week away. Conferences can be busy affairs, so here are some tips on getting the most out of your time there. Plan. Go and check out the schedule. Spend a bit of time familiarising yourself with what sessions you want to get to, and mark them on your calendar. How do you pick which sessions to attend? My advice: diversify!

Reimagining Experimentation Analysis at Netflix

Netflix Tech

By Toby Mao, Sri Sri Perangur, and Colin McFarland. Another day, another custom script to analyze an A/B test. Maybe you’ve done this before and have an old script lying around. If it’s new, it’s probably going to take some time to set up, right? Not at Netflix. [Figure: ABlaze, the standard view of analyses in the XP UI.] Suppose you’re running a new video encoding test and theorize that the two new encodes should reduce play delay, a metric describing how long it takes for a video to play after you press the s

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Real-Time Analytics in the World of Virtual Reality and Live Streaming

Rockset

"A fast-moving technology field where new tools, technologies and platforms are introduced very frequently and where it is very hard to keep up with new trends." I could be describing either the VR space or Data Engineering, but in fact this post is about the intersection of both. Virtual Reality – The Next Frontier in Media I work as a Data Engineer at a leading company in the VR space, with a mission to capture and transmit reality in perfect fidelity.

Building A Reliable And Performant Router For Observability Data

Data Engineering Podcast

Summary: The first stage in every data project is collecting information and routing it to a storage system for later analysis. For operational data this typically means collecting log messages and system metrics. Often a different tool is used for each class of data, increasing the overall complexity and number of moving parts. The engineers at Timber.io decided to build a new tool in the form of Vector that allows for processing both of these data types in a single framework that is reliable an

Advice on building a machine learning career and reading research papers by Prof. Andrew Ng

KDnuggets

This blog post summarizes the lecture on career advice and reading research papers from Stanford University's CS230 Deep Learning course on YouTube, including Andrew Ng's advice on how to read research papers.

Taking Analytics to the 4th Dimension

Teradata

4D analytics combines geospatial, temporal and time series data to do advanced analysis of time and space. Learn how to uncover new insights today.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Kafka Summit SF 2019: Day 1 Recap

Confluent

Day one of the event, summarized for your convenience. They say you never forget your first Kafka Summit. Mine was in New York City in 2017, and it had, what, 300 people? Today we welcomed nearly 2,000 to a giant ballroom in San Francisco. There were laser beams. There were two Tony Stark allusions in the first 60 seconds. And most importantly, there was a vibrant and ever-growing community present.

5 Famous Deep Learning Courses/Schools of 2019

KDnuggets

Deep Learning has become the hottest skill in Data Science at the moment. There is a plethora of articles, courses, technologies, influencers, and resources that we can leverage to gain Deep Learning skills.

Train sklearn 100x Faster

KDnuggets

As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.

BERT, RoBERTa, DistilBERT, XLNet: Which one to use?

KDnuggets

Lately, varying improvements over BERT have been shown — and here I will contrast the main similarities and differences so you can choose which one to use in your research or application.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.