August, 2019

Building Tools And Platforms For Data Analytics

Data Engineering Podcast

Summary: Data engineers are responsible for building tools and platforms to power the workflows of other members of the business. Each group of users has its own requirements for how it accesses and interacts with those platforms, depending on the insights it is trying to gather. Benn Stancil is the chief analyst at Mode Analytics, and in this episode he explains the set of considerations and requirements that data analysts need in their tools and.

Nothing but NumPy: Understanding & Creating Neural Networks with Computational Graphs from Scratch

KDnuggets

Entirely implemented with NumPy, this extensive tutorial provides a detailed review of neural networks followed by guided code for creating one from scratch with computational graphs.

Building the New Uber Freight App as Lists of Modular, Reusable Components

Uber Engineering

As Uber Freight marked its second anniversary, we went back to the drawing board to redesign its app. The original carrier app was successful for owner-operators with one or two drivers, but it wasn’t optimized for larger fleets, feedback we …

How Human Growth Defines the Future of Digital Disruption

Teradata

Contrary to popular belief, in today's technology-enabled, digitally-disrupted world, it's the human element that matters the most in business. Read more!

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet when different graph nodes represent the same entity, graphs get messy. Watch this video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

How We Reduced DynamoDB Costs by Using DynamoDB Streams and Scans More Efficiently

Rockset

Many of our users implement operational reporting and analytics on DynamoDB using Rockset as a SQL intelligence layer to serve live dashboards and applications. As an engineering team, we are constantly searching for opportunities to improve their SQL-on-DynamoDB experience. For the past few weeks, we have been hard at work tuning the performance of our DynamoDB ingestion process.

Using Graph Processing for Kafka Stream Visualizations

Confluent

We know that Apache Kafka® is great when you’re dealing with streams, allowing you to conveniently look at streams as tables. Stream processing engines like KSQL furthermore give you the ability to manipulate all of this fluently. But what about when the relationships between items dominate your application? For example, in a social network, understanding the network means we need to look at the friend relationships between people.

More Trending

How to Become More Marketable as a Data Scientist

KDnuggets

As a data scientist, you are in high demand. So, how can you increase your marketability even more? Check out these current trends in skills most desired by employers in 2019.

Less is More: Engineering Data Warehouse Efficiency with Minimalist Design

Uber Engineering

Maintaining Uber’s large-scale data warehouse comes with an operational cost in terms of ETL functions and storage. In our experience, optimizing for operational efficiency requires answering one key question: for which tables does the maintenance cost supersede utility? Once identified, …

The Power of Prioritization in Data Management

Teradata

Find out how the early architectural decisions surrounding the Teradata Database are still making a critical contribution to performance today. Read more!

Using Tableau with DynamoDB: How to Build a Real-Time SQL Dashboard on NoSQL Data

Rockset

In this blog, we examine DynamoDB reporting and analytics, which can be challenging given the lack of SQL and the difficulty running analytical queries in DynamoDB. We will demonstrate how you can build an interactive dashboard with Tableau, using SQL on data from DynamoDB, in a series of easy steps, with no ETL involved. DynamoDB is a widely popular transactional primary data store.

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

Top 10 Reasons to Attend Kafka Summit

Confluent

Yes, the other definition of event sourcing. 1. Keynotes from leading technologists. At Kafka Summit SF, you’ll get to hear incredible keynotes from leading technologists, including Jay Kreps and Neha Narkhede, original co-creators of Apache Kafka®. In the past, we’ve featured Chris D’Agostino, James Watters, Martin Kleppmann, and Martin Fowler. This time around, we’re delighted to have Devendra Tagare, Engineering Manager of Streaming Platforms from Lyft and Chris Kasten, VP of Walmart Clou

Digging Into Data Replication At Fivetran

Data Engineering Podcast

Summary: The extract and load pattern of data replication is the most commonly needed process in data engineering workflows. Because of the myriad sources and destinations that are available, it is also among the most difficult tasks that we encounter. Fivetran is a platform that does the hard work for you and replicates information from your source systems into whichever data warehouse you use.

Knowing Your Neighbours: Machine Learning on Graphs

KDnuggets

Graph Machine Learning uses the network structure of the underlying data to improve predictive outcomes. Learn how to use this modern machine learning method to solve challenges with connected data.

Migrating Functionality Between Large-scale Production Systems Seamlessly

Uber Engineering

A common axiom among Uber engineers states that building new features is like fixing a car’s engine while driving it. As we scaled up to our present level of support for 14 million trips per day, the car in that …

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Is Finance Holding Back Your Bank’s Digital Transformation?

Teradata

How can a Digital CFO break down the silos in the Bank and support the digital agenda in transforming the customer journey? Read more from our experts!

The Kafka Connect Plugin for Rockset and How It Works

Rockset

Rockset continuously ingests data streams from Kafka, without the need for a fixed schema, and serves fast SQL queries on that data. We created the Kafka Connect Plugin for Rockset to export data from Kafka and send it to a collection of documents in Rockset. Users can then build real-time dashboards or data APIs on top of the data in Rockset. This blog covers how we implemented the plugin.

Confluent Cloud Schema Registry is Now Generally Available

Confluent

We are excited to announce the release of Confluent Cloud Schema Registry in general availability (GA), available in Confluent Cloud, our fully managed event streaming service based on Apache Kafka®. Before we dive into Confluent Cloud Schema Registry, let’s recap what Confluent Schema Registry is and does. Confluent Schema Registry provides a serving layer for your metadata and a RESTful interface for storing and retrieving Avro schemas.

Solving Data Discovery At Lyft

Data Engineering Podcast

Summary: Data is only valuable if you use it for something, and the first step is knowing that it is available. As organizations grow and data sources proliferate, it becomes difficult to keep track of everything, particularly for analysts and data scientists who are not involved with the collection and management of that information. Lyft has built the Amundsen platform to address the problem of data discovery, and in this episode Tao Feng and Mark Grover explain how it works, why they built it, a

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Types of Bias in Machine Learning

KDnuggets

The sample data used for training has to be as close a representation of the real scenario as possible. Many factors can bias a sample from the beginning, and those factors differ from domain to domain (e.g., business, security, medical, education).

3 Factors to Consider When Evaluating Self-Service Analytics

Teradata

What is the value of self-service analytics in your organization? What personas provide the most value & where should a business focus its resources? Read more.

Data-Driven Decisions for Where to Park in SF

Rockset

Have you ever felt uncertain parking in a shady area? In particular, have you ever parked in San Francisco and wondered, if I measured the average inverse square distance to every vehicle incident recorded by the SFPD in the last year, at what percentile would my current location fall? If so, we built an app for that. In this post we’ll explain our methodology and its implementation.
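
The metric named in the teaser can be sketched roughly as follows. This is a minimal illustration, not the implementation from the Rockset post: the function names `mean_inverse_square_distance` and `percentile_rank` are hypothetical, coordinates are assumed to be planar (x, y) pairs rather than latitude/longitude, and the `eps` floor is an assumption added to avoid dividing by zero when a location coincides with an incident.

```python
from bisect import bisect_left


def mean_inverse_square_distance(point, incidents, eps=1e-9):
    """Average of 1 / d^2 from `point` to each incident (planar coordinates assumed)."""
    px, py = point
    total = 0.0
    for ix, iy in incidents:
        d2 = (px - ix) ** 2 + (py - iy) ** 2
        # Floor the squared distance so a coincident incident does not divide by zero.
        total += 1.0 / max(d2, eps)
    return total / len(incidents)


def percentile_rank(value, reference_values):
    """Percentage of reference values strictly below `value`."""
    ordered = sorted(reference_values)
    return 100.0 * bisect_left(ordered, value) / len(ordered)


# Hypothetical data: two incidents and a handful of candidate parking spots.
incidents = [(0.0, 0.0), (2.0, 0.0)]
candidates = [(1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
scores = [mean_inverse_square_distance(c, incidents) for c in candidates]
my_score = mean_inverse_square_distance((1.0, 0.0), incidents)
my_percentile = percentile_rank(my_score, scores)
```

A higher percentile here means the spot sits closer, on average, to recorded incidents than most of the reference locations; how the reference locations are chosen is left open in this sketch.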

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Applying Netflix DevOps Patterns to Windows

Netflix Tech

Baking Windows with Packer. By Justin Phelps and Manuel Correa. Customizing Windows images at Netflix was a manual, error-prone, and time-consuming process. In this blog post, we describe how we improved the methodology, which technologies we leveraged, and how this has improved service deployment and consistency. Artisan Crafted Images: In the Netflix full cycle DevOps culture the team responsible for building a service is also responsible for deploying, testing, infrastructure, and operation of t

Simple node.JS and Slack WebHook integration

nodeSWAT

This post walks you through turning this awesome chat tool into a handy monitoring and alerting tool for your application, all without any third-party modules and with minimal code to keep the footprint small. Note: This post uses a now-outmoded integration method. Slack has introduced new ways to manage and send messages via Apps.

Why Data Visualization Is The Most Important Skill in a Data Analyst Arsenal

KDnuggets

Visually displayed data is much more accessible, and it’s critical to promptly identify the weaknesses of an organization, accurately forecast trading volumes and sale prices, or make the right business choices.

Announcing Bottom Navigator

Pandora Engineering

An Android Multiple Backstack Bottom Navigation Library. Pandora’s latest mobile redesign brings the bottom navigation pattern to our apps. Bottom navigation has become a popular design choice for many apps due to its many advantages, including easy one-handed use and enhanced discoverability of top app destinations. When Pandora embarked on this project our designers had a clear vision of how navigation should work, a vision that in many ways is familiar to users of other popular apps like Instag

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Data is Not the New Oil. Data is Water!

Teradata

If you work in data analytics or a related field, you’ve probably heard the mantra that data is the new oil. But data is not oil, it's water. Find out why.

Tableau Operational Dashboards and Reporting on DynamoDB - Evaluating Redshift and Athena

Rockset

Organizations speak of operational reporting and analytics as the next technical challenge in improving business processes and efficiency. In a world where everyone is becoming an analyst, live dashboards surface up-to-date insights and operationalize real-time data to provide in-time decision-making support across multiple areas of an organization.

KSQL UDFs and UDAFs Made Easy

Confluent

One of KSQL’s most powerful features is allowing users to build their own KSQL functions for processing real-time streams of data. These functions can be invoked on individual messages (user-defined functions or UDFs) or used to perform aggregations on groups of messages (user-defined aggregate functions or UDAFs). The previous blog post How to Build a UDF and/or UDAF in KSQL 5.0 discussed some key steps for building and deploying a custom KSQL UDF/UDAF.

Deep Learning Next Step: Transformers and Attention Mechanism

KDnuggets

With the pervasive importance of NLP in so many of today's applications of deep learning, find out how advanced translation techniques can be further enhanced by transformers and attention mechanisms.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.