Sat. Oct 21, 2023 - Fri. Oct 27, 2023

Defining A Strategy For Your Data Products

Data Engineering Podcast

Summary: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products as the delivery mechanism for information. In this episode, Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.

BI 162

Code Review on Printed Paper: an Excerpt from the Twitoons Comic Book

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover two out of seven topics from today’s full issue on The Man Behind the Big Tech Comics. To get full issues twice a week, subscribe here.

Coding 176

6 Steps to Avoid Messy Data in Your Warehouse

Start Data Engineering

1. Introduction
2. Six Steps for a Clean Data Warehouse
2.1. Understand the business
2.2. Make data easy to use with the appropriate data model
2.3. Good input data is necessary for a good data warehouse
2.4. Define Source of Truth (SOT) and trace its usage
2.5. Keep stakeholders in the loop for a more significant impact
2.6. Watch out for org-level red flags

What's new in Apache Spark 3.5.0 - Structured Streaming

Waitingforcode

It's time to start the series covering Apache Spark 3.5.0 features. As the first topic I'm going to cover Structured Streaming which has got a lot of RocksDB improvements and some major API changes.

IT 130

Navigating the Future: Generative AI, Application Analytics, and Data

Generative AI is upending the way product developers & end-users alike are interacting with data. Despite the potential of AI, many are left with questions about the future of product development: How will AI impact my business and contribute to its success? What can product managers and developers expect in the future with the widespread adoption of AI?

Drag, Drop, Analyze: The Rise of No-Code Data Science

KDnuggets

No-code or low-code functionalities in data science have gained significant traction in recent years. These solutions are well-proven and matured, and they make data science more accessible to a wider range of people.

Snowflake To Acquire Ponder, Boosting Python Capabilities In the Data Cloud

Snowflake

Python’s popularity has more than doubled in the past decade¹, and it is quickly becoming the preferred language for development across machine learning, application development, pipelines, and more. One of our goals at Snowflake is to ensure we continue to deliver a best-in-class platform for Python developers. Snowflake customers are already harnessing the power of Python through Snowpark, a set of runtimes and libraries that securely deploy and process non-SQL code directly in Snowflake.

Python 141

More Trending

Automating dead code cleanup

Engineering at Meta

Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing dead code. SCARF combines static and dynamic analysis of programs to detect dead code from both a business and programming language perspective. SCARF automatically creates change requests that delete the dead code identified from the program analysis, minimizing developer costs.
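
As a rough illustration of the static-analysis half of that idea, the sketch below flags top-level Python functions that are never referenced by name within a single source string. It is not SCARF and says nothing about how Meta's system works; the function name `find_unreferenced_functions` and the `SAMPLE` input are hypothetical, and a real tool would also need cross-file analysis and runtime usage data.

```python
import ast
from typing import Set


def find_unreferenced_functions(source: str) -> Set[str]:
    """Return names of top-level functions that are never referenced in `source`.

    Hypothetical helper for illustration only: it looks at a single module
    and treats any name or attribute access as a reference.
    """
    tree = ast.parse(source)

    # Collect the names of functions defined at module level.
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    # Collect every identifier that is read, called, or accessed as an attribute.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return defined - referenced


# Hypothetical sample module: `unused` is defined but never called.
SAMPLE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

if __name__ == "__main__":
    print(find_unreferenced_functions(SAMPLE))
```

Because this check is purely syntactic, a function that only calls itself would still count as referenced, and code reached through reflection or another module would be misjudged. Gaps like these are one reason to pair static analysis with dynamic usage data.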

Coding 128

5 Free Books to Master Machine Learning

KDnuggets

Machine Learning is one of the most exciting fields in computer science today. In this article, we will take a look at the five best yet free books to learn machine learning in 2023.

Announcing Apache Flink 1.18

Confluent

Read updates and improvements in Apache Flink 1.18, including dynamic fine-grained rescaling via REST API, Java 17 support, and faster rescaling & batch performance improvements.

Java 124

High resolution data updates to Living Atlas World Elevation Layers and Tools (October 2023)

ArcGIS

In October 2023, the elevation layers were updated with high-resolution datasets for France, New Zealand, the USA, and Italy, along with global bathymetry.

Datasets 134

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

5 Things you didn’t know about Buck2

Engineering at Meta

Meta has a very large monorepo, with many different programming languages. To optimize build and performance, we developed our own build system called Buck, which was first open-sourced in 2013. Buck2 is the recently open-sourced successor. In our internal tests at Meta, we observed that Buck2 completed builds approximately 2x as fast as Buck1. Below are five interesting facts you might not have known about Buck2.

The Top 5 Cloud Machine Learning Platforms & Tools

KDnuggets

What are the top 5 cloud machine learning platforms in the market today? Our list will help provide some vital insights into which platform might best cater to your specific machine learning needs. See what KDnuggets recommends.

Introducing Predictive Optimization: Faster Queries, Cheaper Storage, No Sweat

databricks

Predictive Optimization intelligently optimizes your Lakehouse table data layouts for peak performance and cost-efficiency - without you needing to lift a finger.

Data 116

Top 10 Six Sigma Black Belt Project Examples & Ideas

Knowledge Hut

A certified Six Sigma Black Belt expert is a professional who knows and can explain and implement the Six Sigma principles and philosophies. These include tools and supportive systems. A Black Belt professional must have impeccable leadership skills and understand team dynamics. They work in a collaborative manner to assign team members and give them roles and responsibilities.

Project 98

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

Kubernetes And Kernel Panics

Netflix Tech

How Netflix’s Container Platform Connects Linux Kernel Panics to Kubernetes Pods. By Kyle Anderson. With a recent effort to reduce customer (engineers, not end users) pain on our container platform Titus, I started investigating “orphaned” pods. These are pods that never got to finish and had to be garbage collected with no real satisfactory final status.

KDnuggets News, October 27: 5 Free Books to Master Data Science • 7 Steps to Mastering LLMs

KDnuggets

This week on KDnuggets: Go from learning what large language models are to building and deploying LLM apps in 7 steps • Check this list of free books for learning Python, statistics, linear algebra, machine learning and deep learning • And much, much more!

Learn How to Build Airtight Data Pipelines for your AI Initiatives

databricks

"I can't think of anything that's been more powerful since the desktop computer." — Michael Carbin, Associate Professor, MIT, and Founding Advisor, MosaicML A.

Top 15 Software Engineer Projects 2023 [Source Code]

Knowledge Hut

In today's fast-paced technological environment, software engineers are continually seeking innovative projects to hone their skills and stay ahead of industry trends. Engaging in software engineering projects not only helps sharpen your programming abilities but also enhances your professional portfolio. To further amplify your skillset, consider enrolling in a programming training course to learn from expert trainers and grow with mentorship programs.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

A Complete Guide to Scale Your Data Pipelines and Data Products with Contract Testing and Dbt

Towards Data Science

All you need to know to start implementing contract tests with dbt. Let me tell you a story about data management systems and scale that will probably resonate with you if you are a data or analytics engineer trying to do your best work in 2023.

Greening AI: 7 Strategies to Make Applications More Sustainable

KDnuggets

The article delves into a comprehensive methodology that sheds light on how to accurately estimate the carbon footprint associated with AI applications. It explains the environmental impact of AI, a crucial consideration in today's world.

IT 115

How Providence Health Built a Model marketplace using Databricks?

databricks

Providence's MLOps Platform: Providence is a healthcare organization with 120,000 caregivers serving over 50 hospitals and 1,000 clinics across seven states.

Top 20+ Cyber Security Projects for 2023 [With Source Code]

Knowledge Hut

Cybersecurity has become an integral component of every industry as the world advances technologically. In recent years, an increasing number of young professionals have shown interest in this field. If you are pursuing a course in this field, you should complete a project on cybersecurity as your area of competence. Beginners with theoretical knowledge should not undertake an impossible endeavor.

Coding 98

How Embedded Analytics Gets You to Market Faster with a SAAS Offering

Start-ups & SMBs launching products quickly must bundle dashboards, reports, & self-service analytics into apps. Customers expect rapid value from your product (time-to-value), data security, and access to advanced capabilities. Traditional Business Intelligence (BI) tools can provide valuable data analysis capabilities, but they have a barrier to entry that can stop small and midsize businesses from capitalizing on them.

ThoughtSpot announces our 2023 Partner Award winners

ThoughtSpot

To our entire partner ecosystem, I want to personally thank each of you for your incredible contributions over the past year. Our partners play a vital role in driving ThoughtSpot’s mission of creating a more fact-driven world. Together, we help organizations leverage AI and natural language search to discover insights and make data-driven decisions for their businesses.

Generative AI: The First Draft, Not Final

KDnuggets

This article gives a high-level overview of how LLMs work and their attendant limitations with accessible explanations and anecdotes throughout the piece. We also present advice on how people can introduce them into their workflows.

Werner Gains Advanced Geospatial Capabilities with Snowflake and CARTO

Snowflake

Founded nearly 70 years ago, Werner Enterprises is a North American transportation and logistics leader that operates a fleet of almost 8,300 trucks and 30,000 trailers out of 16 terminals across the United States. The company generates a massive amount of data on the constantly changing, real-time location of each of its assets. Collecting and analyzing this geospatial data is vital for smart decision-making.

SAFe Scrum Master Roles and Responsibilities

Knowledge Hut

With the steep upward trend in the adoption of agile practices across the IT industry, various frameworks have gained momentum. This has led to an appetite for exploring new ways of working and setting performance benchmarks. More and more organizations are looking for people who can help them effectively run in a new environment comprising frameworks based on Agile and its variants.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

PyrOSM: working with Open Street Map data

Towards Data Science

Efficient geospatial manipulations for OSM map data. If you’ve worked with OSM data before, you know it’s not the easiest to extract. OSM data can be huge, and finding performant solutions for what you want to analyze is often a challenge. PyrOSM is a package that makes the process of reading in and working with OSM data much more efficient.

10 Basic Statistical Concepts in Plain English

KDnuggets

Explore 10 foundational statistical concepts made simple, from probability distributions to the central limit theorem, for better data understanding.

Data 111

Announcing GA of Predictive I/O for Updates: Faster DML Queries, Right Out of the Box

databricks

Announcing GA of Predictive I/O for Updates, which harnesses Photon and AI atop Deletion Vectors in order to significantly speed up MERGE, UPDATE and DELETE operations.

Top 14 Six Sigma Project Examples

Knowledge Hut

Six Sigma is a methodology that improves a process's quality and performance. It is an improvement program that Motorola developed in the 1980s. A Six Sigma project aims to reduce the number of defects in a company's products or services. Six Sigma certification training helps teams get the most out of such a project. The overall goal of a Six Sigma project is to improve customer satisfaction and increase revenue for the company.

Project 98

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.