Sat. Jun 19, 2021 - Fri. Jun 25, 2021

Designing a Data Project to Impress Hiring Managers

Start Data Engineering

Contents: Introduction; Objective; Setup; Pre-requisites; Project; 1. ETL Code; 2. Test; 3. Scheduler; 4. Presentation; 4.1. Formatting, Linting, and Type checks; 4.2. Architecture Diagram; 4.3. README.md; 5. Adding Dashboard to your Profile; Future Work; Tear down infra; Conclusion; Further Reading; References.

Introduction. Building a data project for your portfolio is hard. Getting hiring managers to read through your GitHub code is even harder.

Project 130

Efficient and Reliable Compute Cluster Management at Scale

Uber Engineering

Introduction. Uber relies on a containerized microservice architecture. Our need for computational resources has grown significantly over the years as a consequence of the business’s growth. Increasing the efficiency of our computing resources is now an important goal. Broadly … The post Efficient and Reliable Compute Cluster Management at Scale appeared first on Uber Engineering Blog.

Saxo Bank’s Best Practices for a Distributed Domain-Driven Architecture Founded on the Data Mesh

Confluent

Al data til folket (all data to the people) is a compelling proposition in an enterprise context. Yet the ability to quickly address integration challenges and deliver data to those […].

Make Database Performance Optimization A Playful Experience With OtterTune

Data Engineering Podcast

Summary: The database is the core of any system because it holds the data that drives your entire experience. We spend countless hours designing the data model, updating engine versions, and tuning performance. But how confident are you that you have configured it to be as performant as possible, given the dozens of parameters and how they interact with each other?

Database 100

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Introducing Netflix Timed Text Authoring Lineage

Netflix Tech

A Script Authoring Specification. By: Bhanu Srikanth, Andy Swan, Casey Wilms, Patrick Pearson. The Art of Dubbing and Subtitling: Dubbing and subtitling are inherently creative processes. At Netflix, we strive to make shows as joyful to watch in every language as in the original language, whether a member watches with original or dubbed audio, closed captions, forced narratives, subtitles or any combination they prefer.

Migrate Hive data from CDH to CDP public cloud

Cloudera

Introduction. Many Cloudera customers are making the transition from being completely on-prem to cloud by either backing up their data in the cloud, or running multi-functional analytics on CDP Public cloud in AWS or Azure. The Replication Manager service facilitates both disaster recovery and data migration across different environments. Using easy-to-define policies, Replication Manager solves one of the biggest barriers for the customers in their cloud adoption journey by allowing them to mov

Cloud 68

More Trending

Look Out for Risks in Open Banking!

Teradata

Open Banking is re-shaping the landscape of financial services and introducing new types of risks extending beyond data security. Secure open banking is everyone’s responsibility.

Banking 59

Standing Up a DataOps Program for Practitioners

DataKitchen

In this five-module course, Mike Lampa & Chris Bergh teach data professionals to plan their organization's DataOps program for low errors & fast deployment. The post Standing Up a DataOps Program for Practitioners first appeared on DataKitchen.

Deploying applications on CDP Operational Database (COD)

Cloudera

CDP Operational Database Experience (COD) is a PaaS offering on the Cloudera Data Platform (CDP). COD enables you to create a new operational database with a few clicks and auto-scales based on your workload. Behind the scenes, COD automatically manages cluster deployment and configuration, reducing overheads related to setting up new database instances.

Confluent Presented the Databricks ISV Momentum Partner Award 2021

Confluent

I’m excited to announce that Confluent was presented with the Databricks ISV Momentum Award at the Databricks Partner Executive Summit last month. This award is given to the partner whose […].

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Monte Carlo Expands Leadership Team from Snowflake, Segment to Support Hypergrowth of Data Observability Category

Monte Carlo

Monte Carlo, the data reliability company, today announced two new strategic hires to its leadership team: Daniel Day, Head of Marketing, and Jordan Van Horn, Head of Revenue. With experience leading award-winning go-to-market teams at Snowflake and Segment, Day and Van Horn share a deep expertise in the data industry and will help Monte Carlo meet the growing demands as the industry leader in Data Observability.

The Role of DataOps in Data Modernization

DataKitchen

Cognizant's JP Thakur & DataKitchen's Chris Bergh discuss how DataOps sets the foundation for Data Modernization initiatives enabling continuous data & insight. The post The Role of DataOps in Data Modernization first appeared on DataKitchen.

Data 52

What Concept Are You Trying to Prove?

Teradata

When undertaking the expense & time to execute a proof of concept on your journey to the cloud, make sure the efforts are well defined & drive an actionable outcome.

Cloud 52

Building Real-Time Event Streams in the Cloud, On Premises, or Both with Confluent

Confluent

To the developer or architect seeking to provide their business with as much value as possible, what is the best way to start working with data in motion? Choosing Apache […].

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

RudderStack Product News Vol. #007 - New Security Features

RudderStack

The RudderStack product news update covers the latest security features: Mobile SDK Distribution, New Spreadsheet, Database Destinations, and RudderStack video content.

Visualizing ClickHouse Data - ClickHouse SQLAlchemy

Preset

Analyzing data that’s frequently updated? A comprehensive walkthrough of how to analyze and visualize ClickHouse data in Superset.

Data 40

Federated Development with Deployment at Scale

Teradata

The connected cloud data warehouse is fundamental to Data Mesh implementation in large and complex organisations. Find out why.

Lessons Learned From The Pipeline Data Engineering Academy

Data Engineering Podcast

Summary: Data Engineering is a broad and constantly evolving topic, which makes it difficult to teach in a concise and effective manner. Despite that, Daniel Molnar and Peter Fabian started the Pipeline Academy to do exactly that. In this episode they reflect on the lessons that they learned while teaching the first cohort of their bootcamp how to be effective data engineers.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Our Product Vision for a Developer-First CDP

RudderStack

This blog post details RudderStack's product vision: what you can expect from RudderStack in the coming quarters in terms of features and experience.

The A-Z Guide to Gradient Descent Algorithm and Its Variants

ProjectPro

If you have heard of Machine Learning and Deep Learning, you must have also heard about cost (error or loss) functions. But, even if you haven't, fret not! The cost function, in simple words, is a way of measuring the performance of a Machine Learning model by attributing a cost to every 'mistake' or wrong prediction that the model makes. But as we know from personal experience, one can gain little by simply knowing that a mistake has occurred or how many, for that matter, if you have no clue ho

DAX-JUNGLE: NORM.DIST

FreshBI

It’s a jungle out there. Back in the day, when I was stuck on a DAX problem, I used to toggle through the IntelliSense in Power BI one letter at a time. I’ve learned much since then, and in this blog I’d like to share my experience with using NORM.DIST in DAX. A: ABS ACOS ACOSH … B: BETA.DIST BETA.INV BLANK Etc…. Hours wasted. Mistakes were made. A MUCH better use of my time would have been reviewing quality solutions to real-world problems.

BI 52

Zalando Tech Radar - Scaling Contributions to Technology Selection

Zalando Engineering

Introduction. In our previous post about Technology Choices at Zalando, we spoke about a few problems with scaling technology selection in tech companies. Since then, we have focused on the remaining categories of the Tech Radar beyond languages and the Tech Radar contribution process. Now, we'd like to reflect on our lessons learned, which you can use when designing technology selection processes.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies that are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while remaining lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground up, graduating from just getting started with data-driven development to operating

Exploring Data @ Netflix

Netflix Tech

By Gim Mahasintunan on behalf of Data Platform Engineering. Supporting a rapidly growing base of engineers of varied backgrounds using different data stores can be challenging in any organization. Netflix’s internal teams strive to provide leverage by investing in easy-to-use tooling that streamlines the user experience and incorporates best practices.

Data 61

7 Types of Classification Algorithms in Machine Learning

ProjectPro

This blog will help you master the fundamentals of classification machine learning algorithms with their pros and cons. You will also explore some exciting machine learning project ideas that implement different types of classification algorithms. So, without much ado, let's dive in. Imagine that the pandemic is over and today is a weekday. All the schools, colleges, and offices are open, and you should reach your institution by 8 A.M.

Insurers – Be Aware of the Hidden Exposures in assessing the economic impact of Climate Risk

Cloudera

Climate change is a challenge for insurers in some obvious ways, such as stronger and more frequent natural disasters. Yet there are also more subtle risks to monitor, including changes to insured assets, risks, and exposures. Climate impacts the production quality and quantity of insured consumable goods, their location, and their supply chains. Climate change can also impact the insurance carrier as an enterprise itself—similarly to cyber risks, insurers underwrite cyber risks for their custom

Using DataOps to Drive Agility and Business Value

DataKitchen

In May 2021 at the CDO & Data Leaders Global Summit, DataKitchen sat down with the following data leaders to learn how to use DataOps to drive agility and business value. Kurt Zimmer, Head of Data Engineering for Data Enablement at AstraZeneca. Ryan Chapin, Former Executive Manager, Advanced Additive Design, Chief Product and Portfolio Manager, GE Aviation.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Top 20 Data Analytics Projects for Students to Practice in 2023

ProjectPro

According to Gartner, organizations can suffer a financial loss of up to 15 million dollars due to poor data quality. As per McKinsey, 47% of organizations believe that data analytics has impacted the market in their respective industries. According to Forbes, in 2012 only 12% of Fortune 1000 companies reported having a CDO (Chief Data Officer).