Sat. Jul 10, 2021 - Fri. Jul 16, 2021

Tyrannical Data and Its Antidotes in the Microservices World

Confluent

Data is the lifeblood of so much of what we build as software professionals, so it’s unsurprising that operations involving its transfer occupy the vast majority of developer time across […].

Delivering Modern Enterprise Data Engineering with Cloudera Data Engineering on Azure

Cloudera

After the launch of CDP Data Engineering (CDE) on AWS a few months ago, we are thrilled to announce that CDE, the only cloud-native service purpose built for enterprise data engineers, is now available on Microsoft Azure. CDP Data Engineering offers an all-inclusive toolset that enables data pipeline orchestration, automation, advanced monitoring, visual profiling, and a comprehensive management toolset for streamlining ETL processes and making complex data actionable across your analytic team

Exploring The Design And Benefits Of The Modern Data Stack

Data Engineering Podcast

Summary: We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of cloud services, a new pattern has emerged and been dubbed the "Modern Data Stack". In this episode members of the GoDataDriven team, Guillermo Sanchez, Bram Ochsendorf, and Juan Perafan, explain the combination

Customer Support Automation Platform at Uber

Uber Engineering

If you’ve used any online/digital service, chances are that you are familiar with what a typical customer service experience entails: you send a message (usually email aliased) to the company’s support staff, fill …

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Create a Data Analysis Pipeline with Apache Kafka and RStudio

Confluent

In Data Science projects, we distinguish between descriptive analytics and statistical models running in production. Overall, these can be seen as one process. You start with analyzing historical data to […].

Accelerate Offloading to Cloudera Data Warehouse (CDW) with Procedural SQL Support

Cloudera

Did you know Cloudera customers, such as SMG and Geisinger, offloaded their legacy DW environment to Cloudera Data Warehouse (CDW) to take advantage of CDW’s modern architecture and best-in-class performance? In addition to substantial cost savings upon moving to CDW, Geisinger is also able to search through hundreds of millions of patient note records in seconds, providing better treatment to their patients.

More Trending

Data Engineers of Netflix - Interview with Kevin Wylie

Netflix Tech

This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin Wylie is a Data Engineer on the Content Data Science and Engineering team. In this post, Kevin talks about his extensive experience in content analytics at Netflix since joining more than 10 years ago.

Real-Time Analytics with dbt + Rockset

Rockset

Rockset was founded to make it easy for developers and data teams to go from real-time data to actionable insights. We designed Rockset to remove many of the barriers teams face while building with real-time data, including data preparation, performance tuning, and infrastructure management. We also built it from the ground up to support full SQL (including joins and aggregations), the most common query language for analytics.

DIA Entries 2021: Judges’ Insight

Cloudera

The 2021 Data Impact Award (DIA) submissions are starting to stream in, and we know many of you are contemplating your entries – which we are excited to see. To help guide your award strategy, we thought it would be an excellent opportunity to ask our judges – a panel composed of leading analysts and journalists well-versed in the application of data and the wider benefits it can bring across industries – what it takes for a winning project.

The Post-Pandemic Supply Chain: Time to Go Back to Basics?

Teradata

Learn how complexities baked into the data analytics ecosystems of supply chains can be simplified to eliminate redundancy, increase time to value, and reduce cost.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

The Weekly ETL: Will Data Engineering Ever Be Sexy like Data Science?

Monte Carlo

In Monte Carlo’s Weekly ETL (Explanations Through Lior) series, Lior Gavish, Monte Carlo’s co-founder and CTO, answers a trending question on Reddit about some of data engineering’s hottest topics. Reddit user /SWE-Aaron asks if data engineering will ever get the same attention as data science and whether that would actually be a good thing.

20 Linear Regression Interview Questions and Answers 2023

ProjectPro

Linear Regression is probably one of the most well-known machine learning algorithms. It essentially involves modeling the relation between the given or derived parameters and the target to be learned. Therefore, any machine Learning job interview would be incomplete without a peppering of Linear Regression questions. These linear regression interview questions and answers will help you prepare for your machine learning interview.

Paving the way for women in Tech: Fostering young girls’ enthusiasm for STEM

Cloudera

In the late 90s, when I was pursuing my studies in engineering, only a few girls enrolled in any STEM-related courses. While it was our love for math & science and the prospect of future opportunities that brought us here, we sadly found many of them gave up halfway through the course, and those who graduated either quit or never entered the profession.

How to Get More ROI—Faster—From Machine Learning

Teradata

Find out how to harness machine learning and AI to contain costs, increase revenue, and grow your organization’s customer base.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Announcing Monte Carlo’s Incident IQ, a Root Cause Analysis Workflow for Data Teams

Monte Carlo

Incident IQ gives data engineers and analysts a centralized, all-in-one solution for conducting incident management and root cause analysis on their data pipelines. Today, we are excited to announce the release of Monte Carlo’s data incident management feature, Incident IQ, a new solution that allows data teams to collaboratively identify, alert on, and remediate the root cause of critical data issues before they impact downstream systems and end users.

Top 15 Cloud Computing Projects Ideas for Beginners in 2023

ProjectPro

The number of people searching for cloud computing jobs, per million, grew by approximately 50%. According to an Indeed Jobs report, the share of cloud computing jobs has increased by 42% per million from 2018 to 2021. The global cloud computing market is poised to grow by $287.03 billion during 2021-2025. Also, global spending on public cloud services will double by 2023.

A Reference Architecture for the Cloudera Private Cloud Base Data Platform

Cloudera

The release of Cloudera Data Platform (CDP) Private Cloud Base edition provides customers with a next generation hybrid cloud architecture. This blog post provides an overview of best practices for the design and deployment of clusters, incorporating hardware and operating system configuration, along with guidance for networking and security as well as integration with existing enterprise infrastructure.

How to build a successful cloud data architecture

DataKitchen

The post How to build a successful cloud data architecture first appeared on DataKitchen.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Apache Superset 1.2: Release Notes

Preset

We're excited to announce the release of Apache Superset 1.2! In this release post, we will focus on the biggest and most interesting tangible, end-user features.

How to Become an Artificial Intelligence Engineer in 2023

ProjectPro

The demand for data-related roles has increased massively in the past few years. Companies are actively seeking talent in these areas, and there is a huge market for individuals who can manipulate data, work with large databases and build machine learning algorithms. While data science is the most hyped-up career path in the data industry, it certainly isn't the only one.

Optimizing Risk and Exposure Management – Roundtable Highlights

Cloudera

We recently hosted a roundtable focused on optimizing risk and exposure management with data insights. For financial institutions and insurers, risk and exposure management has always been a fundamental tenet of the business. Now, risk management has become exponentially complicated in multiple dimensions. In this session we explored what firms are doing to approach the uncertainty with more predictability.

Keys to DataOps Transformation

DataKitchen

The post Keys to DataOps Transformation first appeared on DataKitchen.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies that are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground up, graduating from just getting started with data-driven development to operating

Why It’s Hard for Engineering to Support Marketing

RudderStack

Marketing teams get a bad rap from engineering, oftentimes for understandable reasons.

15 Time Series Projects Ideas for Beginners to Practice 2023

ProjectPro

Time series analysis and forecasting is a dark horse in the domain of Data Science. Time series is among the most applied Data Science techniques in various industrial and business operations, such as financial analysis, production planning, supply chain management, and many more. Machine learning for time series is often a neglected topic. More recent techniques, such as natural language processing, pattern recognition, and others usually gain better attention.

Courage and Curiosity: Valuable Attributes for Women in Big Data

Cloudera

Last week we held our third Women In Data Webinar, and what a session it was! We were honored to welcome Justyna Lebedyk, Senior Product Owner Big Data, Commerzbank AG, who posed the question “Does diversity win?” I had the pleasure of chatting with Justyna about the key themes from her talk and what advice she would give to others looking to pursue a career in data.

Low Code And High Quality Data Engineering For The Whole Organization With Prophecy

Data Engineering Podcast

Summary: There is a wealth of tools and systems available for processing data, but the user experience of integrating them and building workflows is still lacking. This is particularly important in large and complex organizations where domain knowledge and context are paramount and there may not be access to engineers for codifying that expertise. Raj Bains founded Prophecy to address this need by creating a UI-first platform for building and executing data engineering workflows that orchestrates

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Identifying Document Types at Scribd

Scribd Technology

User-uploaded documents have been a core component of Scribd’s business from the very beginning. Understanding what is actually in the document corpus unlocks exciting new opportunities for discovery and recommendation. With Scribd anybody can upload and share documents, analogous to YouTube and videos. Over the years, our document corpus has become larger and more diverse, which has made understanding it an ever-increasing challenge.

Top 20 Logistic Regression Interview Questions and Answers

ProjectPro

To become a successful data scientist in the industry, understanding the end-to-end workflow of the data science pipeline (understanding data, data pre-processing, model building, model evaluation, and model deployment) is essential. Assuming you do not want to overwhelm yourself with fancy machine learning algorithms, mastering the concepts of logistic regression should be your primary step to get familiar with the end-to-end data science pipeline.

Inclusive Leadership Minimises Negative Impact of Workplace Politics

Cloudera

Can an organization eradicate workplace politics completely? Defined by the Harvard Business Review as “a variety of activities associated with the use of influence tactics to improve personal or organizational interests”, politics at the workplace is inevitable. Undeniably, wielding influence to achieve positive outcomes is encouraged. However, the question leaders should be asking is: are fragmented individual agendas taking precedence over an organization’s mission?

Monte Carlo Launches Data Incident Management Feature, Incident IQ, to Help Organizations Achieve Data Trust

Monte Carlo

Monte Carlo, the data reliability company, today released its data incident management feature, Incident IQ, a new suite of capabilities that help data engineers better pinpoint, address, and resolve data downtime at scale through the Monte Carlo Data Observability Platform. Incident IQ automatically generates rich insights about critical data issues through root cause analysis, giving teams unprecedented visibility into the end-to-end health and trust of their data beyond the scope of traditional

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.