Sat.Jan 07, 2023 - Fri.Jan 13, 2023

Inside Pollen's Software Engineering Salaries

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one and a half out of eight topics in today’s subscriber-only issue, Inside Pollen's Transparent Compensation Data. If you’re not yet a subscriber, you also missed this week’s deep-dive on Becoming a Fractional CTO. To get this newsletter every week, subscribe here.

Simplify Delta Lake Complexity with mack.

Confessions of a Data Guy

Anyone who’s been roaming around the forest of Data Engineering has probably run into many of the newish tools that have been growing rapidly around the concepts of Data Warehouses, Data Lakes, and Lake Houses … the merging of the old relational database functionality with TB and PB level cloud-based file storage systems. Tools like […]

Data Lake 162

Data Pipeline Design Patterns - #2. Coding patterns in Python

Start Data Engineering

Contents: Introduction; Sample project; Code design patterns (1. Functional design, 2. Factory pattern, 3. Strategy pattern, 4. Singleton & Object pool patterns); Python helpers (1. Typing, 2. Dataclass, 3. Context Managers, 4. Testing with pytest, 5. Decorators); Misc; Conclusion; Further reading; References. Introduction: Using the appropriate code design pattern can make your code easier to read, extend, modify, and debug, and can help developers onboard more quickly.

Designing 147

Analysis of Confluent Buying Immerok

Jesse Anderson

If you haven’t heard, Confluent announced they’re buying Immerok. This purchase represents a significant shift in strategy for Confluent. I started a Twitter thread with some of my initial thoughts, but I want to write a post giving more analysis and opinions. In short, I still echo the sentiment from my original tweet “This was always the way it should have been.

Kafka 147

How To Get Promoted In Product Management

Speaker: John Mansour

If you're looking to advance your career in product management, there are more options than just climbing the management ladder. Join our upcoming webinar to learn about highly rewarding career paths that don't involve management responsibilities. We'll cover both career tracks and provide tips on how to position yourself for success in the one that's right for you.

Data Warehouse Consultants – What Do They Do And Why You Need One

Seattle Data Guy

A data warehouse consultant plays an important role in companies looking to become data-driven. They help companies design and deploy centralized data sets that are easy to use and reliable. But in order to understand why you need a data warehouse consultant we should take a step back. In this article we will not only…

Using Rust to write a Data Pipeline. Thoughts. Musings.

Confessions of a Data Guy

Rust has been on my mind a lot lately, probably because of Data Engineering boredom, watching Spark clusters chug along like some medieval farm worker endlessly trudging through the muck and mire of life. Maybe Rust has breathed some life back into my stagnant soul, reminding me there is a big world out there, […]

More Trending

Data News — Week 23.01

Christophe Blefari

You and me celebrating 2023 (credits). Happy new year 🎆 For those who were already subscribed at the start of last year, I set resolutions and objectives for the year that I did not manage to follow. The year was so different from what I expected. Maybe this is an excuse. Anyway, I did not reach my goals. What if we don't care this year?

Data 130

Improving Your Data Analytics Infrastructure In 2023 – Part 1

Seattle Data Guy

Data has been consistently demonstrated to be a valuable asset for businesses of all sizes. Consulting firms, like McKinsey, have found that companies using AI and analytics attribute 20% of their earnings to it. As a consultant, I have personally witnessed how data can uncover new sources of revenue and cost reduction opportunities for clients…

Where Collaboration Fails Around Data (And 4 Tips for Fixing It)

KDnuggets

Data-driven organizations require complex collaboration between data teams and business stakeholders. Here are 4 proactive tips for reducing information asymmetries and achieving better collaboration.

IT 159

Modern Data Stack: The Struggle of Enterprise Adoption

Simon Späti

In part I, The Open Data Stack Distilled into Four Core Tools, we discussed how to quickly set up a data stack, tackling end-to-end data analytics challenges. As a manager or developer working with data at a mid- to large-sized enterprise, you might ask why aren’t we using any of these tools. In this article, we dive into what mid-to-large-sized companies are using instead, the struggle of setting up a Modern Data Stack (MDS) for an enterprise size, and the opportunities of a free-of-charge and

Navigating the Future: Generative AI, Application Analytics, and Data

Generative AI is upending the way product developers & end-users alike are interacting with data. Despite the potential of AI, many are left with questions about the future of product development: How will AI impact my business and contribute to its success? What can product managers and developers expect in the future with the widespread adoption of AI?

Succeeding with Change Data Capture

Confluent

CDC is a software design pattern that identifies and captures changes made to data in a database. Learn how CDC works, the best solutions, and how to get started with various implementations.
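
As a rough illustration of the idea, the sketch below detects changes by comparing two snapshots of a table keyed by primary key. This is only one possible approach and is not taken from the Confluent article; log-based CDC reads the database's transaction log instead of diffing snapshots. The function name `capture_changes` and the event tuple layout are assumptions made for this example.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Event = Tuple[str, int, Optional[Row]]


def capture_changes(before: Dict[int, Row], after: Dict[int, Row]) -> List[Event]:
    """Diff two snapshots keyed by primary key and emit change events.

    Hypothetical helper: the event shape (operation, key, new row) is an
    assumption for illustration, not a standard CDC format.
    """
    events: List[Event] = []
    # Rows present in the new snapshot are either inserts or updates.
    for key, row in after.items():
        if key not in before:
            events.append(("insert", key, row))
        elif before[key] != row:
            events.append(("update", key, row))
    # Rows that disappeared from the new snapshot are deletes.
    for key in before:
        if key not in after:
            events.append(("delete", key, None))
    return events
```

Snapshot diffing is easy to reason about but needs a full read of the table on every run, which is one reason production systems often prefer reading the change log.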

Data 121

Product Discovery – Building the Right Things

Teradata

Product discovery is a process that cross functional product teams follow to reduce the uncertainty about a problem worth solving and a solution worth developing. Learn more.

7 Best Platforms to Practice SQL

KDnuggets

Looking to level up your SQL skills? Here's a list of the best platforms to practice SQL, ace your SQL interviews, and land your dream data role.

SQL 136

Databricks Power BI Connector Now Supports Native Query

databricks

This is a collaborative post from Databricks and Microsoft. We thank Mahesh Prakriya (Director in Intelligence Platform, Microsoft) and Bob Zhang (Sr. Technical.

BI 94

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Top Data Integrity Trends Fueling Confident Business Decisions in 2023

Precisely

With global data creation projected to grow to more than 180 zettabytes by 2025, it’s not surprising that more organizations than ever are looking to harness their ever-growing datasets to drive more confident business decisions. In fact, a recent study from 451 Research shows that nearly 79% of businesses report data will be more important to their organization’s strategic decision-making over the next 12 months.

Saving Lives, Saving Costs: Predicting Heart Failure with Teradata

Teradata

A team of Teradata data scientists & industry experts worked alongside a U.S. insurance company to develop a solution that would predict the onset of heart failure 6 months in advance. Find out more.

Top Posts January 2-8: Python Matplotlib Cheat Sheets

KDnuggets

Python Matplotlib Cheat Sheets • Free Data Management with Data Science Learning with CS639 • How to Select Rows and Columns in Pandas Using [ ], .loc, .iloc, .at and .iat • Creating a Web Application to Extract Topics from Audio with Python • More Data Science Cheatsheets.

Python 113

Supercharging H3 for Geospatial Analytics

databricks

On the heels of the initial release of H3 support in Databricks Runtime (DBR), we are happy to share ground-breaking performance improvements with.

82

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

5 Challenges of Ethical Data Stewardship

Precisely

The pressure is mounting. Data privacy regulations are constantly evolving, and customer preferences and expectations are high and on the move. That means businesses want to provide hyper-personalized experiences, but they also need to ensure they’re using, sharing, and protecting customer data with the utmost integrity. And with the rising focus on environmental, social, and governance (ESG), businesses can no longer rely on quality products alone to win and maintain the support of customers, e

What Are Node.js Frameworks?: How To Choose the Best Node.js Framework for 2023

Trio

Node.js powers many of the modern real-time web applications you’re likely familiar with. It’s a scalable JavaScript runtime environment widely used to build online games, messengers, video platforms, and more. Technology companies like Netflix, Uber, Trello, and others use Node to create both rich user interfaces (UIs) and server-side environments.

How to Perform Unit Testing in Python?

KDnuggets

Unit testing is an important part of the software development life cycle as it helps to ensure that code is correct and working as intended. This article aims to introduce the concept of unit testing in Python and provide a basic tutorial on how to write and run unit tests using the unittest module.
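
For readers who have not used the standard-library `unittest` module, a minimal sketch of a test case might look like the following. The function under test, `normalize_email`, is a hypothetical example invented here and does not come from the KDnuggets article.

```python
import unittest


def normalize_email(raw: str) -> str:
    """Hypothetical function under test: trim whitespace and lowercase."""
    return raw.strip().lower()


class NormalizeEmailTest(unittest.TestCase):
    def test_strips_and_lowercases(self):
        # Surrounding whitespace is removed and the result is lowercased.
        self.assertEqual(normalize_email("  Ada@Example.COM "), "ada@example.com")

    def test_rejects_non_string(self):
        # None has no .strip(), so the call raises AttributeError.
        with self.assertRaises(AttributeError):
            normalize_email(None)
```

Saved in a file such as `test_emails.py` (the filename is an assumption), the tests can be run with `python -m unittest test_emails`.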

Python 109

Better Data for Better Decisions in the Public Sector Through Entity Resolution - Part 1

databricks

One of the domains where better decisions mean a better society is the Public Sector. Each and every one of us has a.

Data 78

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

How to build a Snowflake API | Propel Data Analytics Blog

Propel Data

Create and query an API on top of your Snowflake data warehouse using Propel’s blazing-fast Serverless Analytics API Platform

In the spotlight with Nick Cooper: ThoughtSpot’s Selfless Excellence champion

ThoughtSpot

This is part of our ongoing spotlight series which highlights ThoughtSpot’s quarterly Selfless Excellence champion. Culture and shared values are at the heart of every decision, innovation, and team member at ThoughtSpot. By creating a family-first mentality among a truly diverse and inclusive team, we’ve been able to build more authentic relationships with one another.

Overcome Your Data Quality Issues with Great Expectations

KDnuggets

Bad data costs organizations money, reputation, and time. Hence it is very important to monitor and validate data quality continuously.

Data 123

Streaming in Production: Collected Best Practices, Part 2

databricks

In our two-part blog series titled "Streaming in Production: Collected Best Practices," this is the second article. Here we discuss the "After Deployment".

62

How Embedded Analytics Gets You to Market Faster with a SAAS Offering

Start-ups & SMBs launching products quickly must bundle dashboards, reports, & self-service analytics into apps. Customers expect rapid value from your product (time-to-value), data security, and access to advanced capabilities. Traditional Business Intelligence (BI) tools can provide valuable data analysis capabilities, but they have a barrier to entry that can stop small and midsize businesses from capitalizing on them.

The Power of Collaboration in Product Development

Eventbrite Engineering

Product development at Eventbrite is a practice centered around understanding what our customers need, so we can enhance current features or build new products. In order to achieve this, our product team collaborates across multiple disciplines throughout the company to ensure we’re thinking about customer needs from all angles. Who is involved in product development?

GET_DDL: Sql Script, JS and Python

Cloudyard

In this post we discuss multiple ways to extract the DDL from your Snowflake database. We can use the GET_DDL command to extract the metadata, but what if the database has a huge list of tables and running GET_DDL on each table by hand is not feasible? A programmatic approach is needed that traverses the table list automatically and extracts the DDL.
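
One way to picture the programmatic approach is a small helper that turns a list of table names into one `GET_DDL` query per table. This is a sketch under stated assumptions rather than the Cloudyard implementation: the helper name `build_get_ddl_statements` and the table names are made up for illustration, and the code only builds SQL strings without connecting to Snowflake.

```python
from typing import Iterable, List


def build_get_ddl_statements(tables: Iterable[str], object_type: str = "TABLE") -> List[str]:
    """Build one GET_DDL query per fully qualified table name.

    Hypothetical helper: it only generates SQL text; executing the
    statements against Snowflake is left to the caller.
    """
    statements: List[str] = []
    for name in tables:
        # Double any single quotes so the name stays a valid SQL string literal.
        escaped = name.replace("'", "''")
        statements.append(f"SELECT GET_DDL('{object_type}', '{escaped}');")
    return statements


# Example table names are assumptions; in practice the list could come from
# a query against INFORMATION_SCHEMA.TABLES.
for statement in build_get_ddl_statements(["MY_DB.PUBLIC.ORDERS", "MY_DB.PUBLIC.CUSTOMERS"]):
    print(statement)
```

Generating the statements separately from running them keeps the traversal logic easy to test without a live warehouse.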

SQL 52

KDnuggets News, January 11: Python Matplotlib Cheatsheets • More Data Science Cheatsheets • Data Science & Machine Learning Developments of 2022

KDnuggets

Key Data Science, Machine Learning, AI and Analytics Developments of 2022 • Python Matplotlib Cheat Sheets • More Data Science Cheatsheets • Free Data Management with Data Science Learning with CS639 • Data-Driven Holiday Cheer: How Santa is Using Analytics to Make the Season Bright.

The Impact of Data and AI on a Modern Business

databricks

It is no secret that there has been an explosion of data in the past 10 years. As per Forbes, from 2010 to.

Data 75

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.