Sat. Jan 07, 2023 - Fri. Jan 13, 2023

Inside Pollen's Software Engineering Salaries

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one and a half out of eight topics in today’s subscriber-only issue, Inside Pollen's Transparent Compensation Data. If you’re not yet a subscriber, you also missed this week’s deep-dive on Becoming a Fractional CTO. To get this newsletter every week, subscribe here.

Simplify Delta Lake Complexity with mack.

Confessions of a Data Guy

Anyone who’s been roaming around the forest of Data Engineering has probably run into many of the newish tools that have been growing rapidly around the concepts of Data Warehouses, Data Lakes, and Lake Houses … the merging of the old relational database functionality with TB and PB level cloud-based file storage systems. Tools like […]

Data Lake 162

Data Pipeline Design Patterns - #2. Coding patterns in Python

Start Data Engineering

Outline: Introduction; Sample project; Code design patterns (1. Functional design, 2. Factory pattern, 3. Strategy pattern, 4. Singleton & Object pool patterns); Python helpers (1. Typing, 2. Dataclass, 3. Context Managers, 4. Testing with pytest, 5. Decorators); Misc; Conclusion; Further reading; References. Introduction: Using the appropriate code design pattern can make your code easy to read and extend, make existing logic straightforward to modify and debug, and help developers onboard quicker.
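As a rough illustration of the factory pattern named in the outline, here is a minimal Python sketch. It is not taken from the article: the transform names, the `Record` type, and `run_pipeline` are all hypothetical, chosen only to show how a factory can map a configuration string to a pipeline step.

```python
from typing import Callable, Iterable

# Hypothetical record and transform types for this sketch.
Record = dict[str, object]
Transform = Callable[[Iterable[Record]], list[Record]]


def drop_nulls(records: Iterable[Record]) -> list[Record]:
    """Keep only records in which every value is present."""
    return [r for r in records if all(v is not None for v in r.values())]


def lowercase_keys(records: Iterable[Record]) -> list[Record]:
    """Normalize column names to lower case."""
    return [{k.lower(): v for k, v in r.items()} for r in records]


# Registry the factory looks up; new steps are added here, not in callers.
_TRANSFORMS: dict[str, Transform] = {
    "drop_nulls": drop_nulls,
    "lowercase_keys": lowercase_keys,
}


def transform_factory(name: str) -> Transform:
    """Return the transform registered under `name`."""
    try:
        return _TRANSFORMS[name]
    except KeyError:
        raise ValueError(f"unknown transform: {name!r}") from None


def run_pipeline(records: Iterable[Record], steps: list[str]) -> list[Record]:
    """Apply the named transforms in order."""
    out = list(records)
    for step in steps:
        out = transform_factory(step)(out)
    return out
```

Callers depend only on step names, so adding a transform means registering one more function rather than editing every pipeline definition.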

Designing 147

Analysis of Confluent Buying Immerok

Jesse Anderson

If you haven’t heard, Confluent announced they’re buying Immerok. This purchase represents a significant shift in strategy for Confluent. I started a Twitter thread with some of my initial thoughts, but I want to write a post giving more analysis and opinions. In short, I still echo the sentiment from my original tweet: “This was always the way it should have been.”

Kafka 147

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Data Warehouse Consultants – What Do They Do And Why You Need One

Seattle Data Guy

A data warehouse consultant plays an important role in companies looking to become data-driven. They help companies design and deploy centralized data sets that are easy to use and reliable. But in order to understand why you need a data warehouse consultant, we should take a step back. In this article we will not only…

Using Rust to write a Data Pipeline. Thoughts. Musings.

Confessions of a Data Guy

Rust has been on my mind a lot lately, probably because of Data Engineering boredom, watching Spark clusters chug along like some medieval farm worker endlessly trudging through the muck and mire of life. Maybe Rust has breathed some life back into my stagnant soul, reminding me there is a big world out there, […]

More Trending

Data News — Week 23.01

Christophe Blefari

Happy new year 🎆 For those who were already subscribed at the start of last year: I set resolutions and objectives for the year that I did not manage to follow. The year was so different from what I expected. Maybe this is an excuse. Anyway, I did not reach my goals. What if we don't care this year?

Data 130

Improving Your Data Analytics Infrastructure In 2023 – Part 1

Seattle Data Guy

Data has been consistently demonstrated to be a valuable asset for businesses of all sizes. Consulting firms, like McKinsey, have found that companies using AI and analytics attribute 20% of their earnings to it. As a consultant, I have personally witnessed how data can uncover new sources of revenue and cost reduction opportunities for clients…

Where Collaboration Fails Around Data (And 4 Tips for Fixing It)

KDnuggets

Data-driven organizations require complex collaboration between data teams and business stakeholders. Here are 4 proactive tips for reducing information asymmetries and achieving better collaboration.

IT 159

Modern Data Stack: The Struggle of Enterprise Adoption

Simon Späti

In part I, The Open Data Stack Distilled into Four Core Tools, we discussed how to quickly set up a data stack, tackling end-to-end data analytics challenges. As a manager or developer working with data at a mid- to large-sized enterprise, you might ask why aren’t we using any of these tools. In this article, we dive into what mid-to-large-sized companies are using instead, the struggle of setting up a Modern Data Stack (MDS) for an enterprise size, and the opportunities of a free-of-charge and

Data 130

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Succeeding with Change Data Capture

Confluent

CDC is a software design pattern that identifies and captures changes made to data in a database. Learn how CDC works, the best solutions, and how to get started with various implementations.
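To make the idea of identifying and capturing changes concrete, here is a minimal Python sketch of one simple CDC variant: comparing two snapshots of a table keyed by primary key. This is an assumption-laden illustration, not the approach described in the Confluent article; production CDC is often log-based instead, and the event shape and the `capture_changes` name are hypothetical.

```python
from typing import Any

# A row is a mapping of column name to value; tables are keyed by primary key.
Row = dict[str, Any]


def capture_changes(before: dict[int, Row], after: dict[int, Row]) -> list[dict[str, Any]]:
    """Diff two snapshots and emit insert, update, and delete events."""
    events: list[dict[str, Any]] = []
    # Keys present only in the new snapshot are inserts.
    for key in sorted(after.keys() - before.keys()):
        events.append({"op": "insert", "key": key, "row": after[key]})
    # Keys present in both snapshots are updates when the row content differs.
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            events.append({"op": "update", "key": key, "row": after[key]})
    # Keys present only in the old snapshot are deletes.
    for key in sorted(before.keys() - after.keys()):
        events.append({"op": "delete", "key": key, "row": None})
    return events
```

Snapshot diffing is easy to reason about but misses intermediate states between snapshots, which is one reason log-based capture is commonly preferred for busy tables.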

Data 123

Product Discovery – Building the Right Things

Teradata

Product discovery is a process that cross-functional product teams follow to reduce the uncertainty about a problem worth solving and a solution worth developing. Learn more.

7 Best Platforms to Practice SQL

KDnuggets

Looking to level up your SQL skills? Here's a list of the best platforms to practice SQL, ace your SQL interviews, and land your dream data role.

SQL 134

Databricks Power BI Connector Now Supports Native Query

databricks

This is a collaborative post from Databricks and Microsoft. We thank Mahesh Prakriya (Director in Intelligence Platform, Microsoft) and Bob Zhang (Sr. Technical.

BI 95

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies that are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground up, graduating from just getting started with data-driven development to operating

Top Data Integrity Trends Fueling Confident Business Decisions in 2023

Precisely

With global data creation projected to grow to more than 180 zettabytes by 2025, it’s not surprising that more organizations than ever are looking to harness their ever-growing datasets to drive more confident business decisions. In fact, a recent study from 451 Research shows that nearly 79% of businesses report data will be more important to their organization’s strategic decision-making over the next 12 months.

Saving Lives, Saving Costs: Predicting Heart Failure with Teradata

Teradata

A team of Teradata data scientists & industry experts worked alongside a U.S. insurance company to develop a solution that would predict the onset of heart failure 6 months in advance. Find out more.

Top Posts January 2-8: Python Matplotlib Cheat Sheets

KDnuggets

Python Matplotlib Cheat Sheets • Free Data Management with Data Science Learning with CS639 • How to Select Rows and Columns in Pandas Using [ ], .loc, .iloc, .at and .iat • Creating a Web Application to Extract Topics from Audio with Python • More Data Science Cheatsheets.

Python 113

Supercharging H3 for Geospatial Analytics

databricks

On the heels of the initial release of H3 support in Databricks Runtime (DBR), we are happy to share ground-breaking performance improvements with.

83

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

5 Challenges of Ethical Data Stewardship

Precisely

The pressure is mounting. Data privacy regulations are constantly evolving, and customer preferences and expectations are high and on the move. That means businesses want to provide hyper-personalized experiences, but they also need to ensure they’re using, sharing, and protecting customer data with the utmost integrity. And with the rising focus on environmental, social, and governance (ESG), businesses can no longer rely on quality products alone to win and maintain the support of customers, e

What Are Node.js Frameworks?: How To Choose the Best Node.js Framework for 2023

Trio

Node.js powers many of the modern real-time web applications you’re likely familiar with. It’s a scalable JavaScript runtime environment widely used to build online games, messengers, video platforms, and more. Technology companies like Netflix, Uber, Trello, and others use Node to create both rich user interfaces (UIs) and server-side environments.

How to Perform Unit Testing in Python?

KDnuggets

Unit testing is an important part of the software development life cycle, as it helps to ensure that code is correct and working as intended. This article aims to introduce the concept of unit testing in Python and provide a basic tutorial on how to write and run unit tests using the unittest module.
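For readers who have not used the standard-library `unittest` module, a minimal sketch of writing and running tests might look like the following. The function under test, `normalize_name`, and the `run_tests` helper are hypothetical and are not taken from the KDnuggets tutorial.

```python
import unittest


def normalize_name(raw: str) -> str:
    """Collapse runs of whitespace and title-case a name (hypothetical helper)."""
    return " ".join(raw.split()).title()


class NormalizeNameTest(unittest.TestCase):
    """Each test_* method is discovered and run as a separate test."""

    def test_strips_and_title_cases(self):
        self.assertEqual(normalize_name("  ada   lovelace "), "Ada Lovelace")

    def test_empty_string_stays_empty(self):
        self.assertEqual(normalize_name(""), "")


def run_tests() -> unittest.TestResult:
    """Load the test case and run it, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeNameTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Saved as a file, the same tests can also be run from the command line with `python -m unittest`, which discovers `TestCase` subclasses automatically.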

Python 109

How to build a Snowflake API | Propel Data Analytics Blog

Propel Data

Create and query an API on top of your Snowflake data warehouse using Propel’s blazing-fast Serverless Analytics API Platform

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Better Data for Better Decisions in the Public Sector Through Entity Resolution - Part 1

databricks

One of the domains where better decisions mean a better society is the Public Sector. Each and every one of us has a.

Data 79

In the spotlight with Nick Cooper: ThoughtSpot’s Selfless Excellence champion

ThoughtSpot

This is part of our ongoing spotlight series which highlights ThoughtSpot’s quarterly Selfless Excellence champion. Culture and shared values are at the heart of every decision, innovation, and team member at ThoughtSpot. By creating a family-first mentality among a truly diverse and inclusive team, we’ve been able to build more authentic relationships with one another.

Overcome Your Data Quality Issues with Great Expectations

KDnuggets

Bad data costs organizations money, reputation, and time. Hence it is very important to monitor and validate data quality continuously.

Data 122

The Power of Collaboration in Product Development

Eventbrite Engineering

Product development at Eventbrite is a practice centered around understanding what our customers need, so we can enhance current features or build new products. In order to achieve this, our product team collaborates across multiple disciplines throughout the company to ensure we’re thinking about customer needs from all angles. Who is involved in product development?

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

Streaming in Production: Collected Best Practices, Part 2

databricks

In our two-part blog series titled "Streaming in Production: Collected Best Practices," this is the second article. Here we discuss the "After Deployment".

63

GET_DDL: Sql Script, JS and Python

Cloudyard

In this post we discuss multiple ways to extract the DDL from your Snowflake database. We can use the GET_DDL command to extract the metadata, but what if the database has a huge list of tables and running GET_DDL on each table by hand is not a feasible approach? There should be a programmatic approach that traverses the table list automatically and extracts the DDL.
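One way to sketch the traversal in Python, assuming a DB-API style cursor connected to Snowflake, is shown below. This is not the post's own script: `extract_table_ddl` and the identifier check are hypothetical, and the sketch assumes unquoted (upper-case) database, schema, and table names.

```python
import re
from typing import Any

# Only plain unquoted identifiers are accepted, since names are interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check(name: str) -> str:
    """Reject anything that is not a simple identifier."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def extract_table_ddl(cursor: Any, database: str, schema: str) -> dict[str, str]:
    """Return {table_name: DDL text} for every base table in the schema.

    `cursor` is assumed to expose execute(), fetchall(), and fetchone().
    """
    db, sch = _check(database), _check(schema)
    # Step 1: list the tables instead of naming each one by hand.
    cursor.execute(
        f"SELECT table_name FROM {db}.information_schema.tables "
        f"WHERE table_schema = '{sch}' AND table_type = 'BASE TABLE' "
        "ORDER BY table_name"
    )
    tables = [row[0] for row in cursor.fetchall()]
    # Step 2: call GET_DDL once per table and collect the results.
    ddl: dict[str, str] = {}
    for table in tables:
        cursor.execute(f"SELECT GET_DDL('TABLE', '{db}.{sch}.{_check(table)}')")
        ddl[table] = cursor.fetchone()[0]
    return ddl
```

The returned dictionary can then be written to one file per table or concatenated into a single script, depending on how the DDL will be versioned.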

SQL 52

3 Things I Wish I Knew When I Started Data Science

KDnuggets

Looking back and realizing how I was wrong about the data science career.

Introducing FreshBI’s Restoration Pro Pocket Dash

FreshBI

The North American restoration industry is a multi-billion dollar a year business. Increases in natural disasters, an aging population, and growing awareness of indoor air quality are among the many factors contributing to the projected industry growth rates over the years to come. And, as the demand for restoration services continues to grow, so does the competition.

BI 52

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.