Sat. Oct 21, 2023 - Fri. Oct 27, 2023

Defining A Strategy For Your Data Products

Data Engineering Podcast

Summary: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products as the delivery mechanism for information. In this episode, Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.

Data 162

Code Review on Printed Paper: an Excerpt from the Twitoons Comic Book

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover two out of seven topics from today’s full issue on The Man Behind the Big Tech Comics. To get full issues twice a week, subscribe here.

Coding 191

6 Steps to Avoid Messy Data in Your Warehouse

Start Data Engineering

1. Introduction
2. Six Steps for a Clean Data Warehouse
2.1. Understand the business
2.2. Make data easy to use with the appropriate data model
2.3. Good input data is necessary for a good data warehouse
2.4. Define Source of Truth (SOT) and trace its usage
2.5. Keep stakeholders in the loop for a more significant impact
2.6. Watch out for org-level red flags

What's new in Apache Spark 3.5.0 - Structured Streaming

Waitingforcode

It's time to start the series covering Apache Spark 3.5.0 features. As the first topic, I'm going to cover Structured Streaming, which received a lot of RocksDB improvements and some major API changes.

IT 130

LLMs in Production: Tooling, Process, and Team Structure

Speaker: Dr. Greg Loughnane and Chris Alexiuk

Technology professionals developing generative AI applications are finding that there are big leaps from POCs and MVPs to production-ready applications. They're often developing using prompting, Retrieval Augmented Generation (RAG), and fine-tuning (up to and including Reinforcement Learning with Human Feedback (RLHF)), typically in that order. However, during development – and even more so once deployed to production – best practices for operating and improving generative AI applications are le

Snowflake To Acquire Ponder, Boosting Python Capabilities In the Data Cloud

Snowflake

Python’s popularity has more than doubled in the past decade¹ and it is quickly becoming the preferred language for development across machine learning, application development, pipelines, and more. One of our goals at Snowflake is to ensure we continue to deliver a best-in-class platform for Python developers. Snowflake customers are already harnessing the power of Python through Snowpark, a set of runtimes and libraries that securely deploy and process non-SQL code directly in Snowflake.

Python 141

More Trending

High resolution data updates to Living Atlas World Elevation Layers and Tools (October 2023)

ArcGIS

In October 2023, the elevation layers were updated with high-resolution datasets for France, New Zealand, the USA, and Italy, along with global bathymetry.

Datasets 135

Announcing Apache Flink 1.18

Confluent

Read updates and improvements in Apache Flink 1.18, including dynamic fine-grained rescaling via REST API, Java 17 support, and faster rescaling & batch performance improvements.

Java 117

Automating dead code cleanup

Engineering at Meta

Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing dead code. SCARF combines static and dynamic analysis of programs to detect dead code from both a business and programming language perspective. SCARF automatically creates change requests that delete the dead code identified from the program analysis, minimizing developer costs.

Coding 119

5 Free Books to Master Machine Learning

KDnuggets

Machine learning is one of the most exciting fields in computer science today. In this article, we will take a look at five of the best free books for learning machine learning in 2023.

The Definitive Entity Resolution Buyer’s Guide

Are you thinking of adding enhanced data matching and relationship detection to your product or service? Do you need to know more about what to look for when assessing your options? The Senzing Entity Resolution Buyer’s Guide gives you step-by-step details about everything you should consider when evaluating entity resolution technologies. You’ll learn about use cases, technology and deployment options, top ten evaluation criteria and more.

Introducing Predictive Optimization: Faster Queries, Cheaper Storage, No Sweat

databricks

Predictive Optimization intelligently optimizes your Lakehouse table data layouts for peak performance and cost-efficiency - without you needing to lift a finger.

Data 113

Master Data Management: Common Misconceptions You Should Know

Precisely

When most people think of master data management, they first think of customers and products. This is logical, as the core mission of any company is to develop products and services, find the right customers, and consistently deliver excellence. But master data encompasses so much more than data about customers and products. It includes information about suppliers, employees, and target prospects.

Cloudera and AMD Spur Data Scientists to Take Climate Action

Cloudera

The world faces multiple environmental sustainability challenges — from the climate crisis and water scarcity to food production and urban resilience. Overcoming these hurdles offers opportunities for innovation through technology and artificial intelligence. That’s why Cloudera and AMD have partnered to host the Climate and Sustainability Hackathon.

Data 105

10 Biggest Cybersecurity Trends in 2023

Knowledge Hut

Cybersecurity is a method of safeguarding networks and devices from external attacks. The cybersecurity trend shows a growing emphasis on protection, leading to an increased need for Cyber Security specialists. They are hired by businesses to secure secret information, preserve staff productivity, and boost customer trust in products and services. Cyber security is governed by the industry standard of confidentiality, integrity, and availability, or CIA.

Cloud 105

The Top 5 Cloud Machine Learning Platforms & Tools

KDnuggets

What are the top 5 cloud machine learning platforms on the market today? Our list offers insight into which platform might best suit your specific machine learning needs. See what KDnuggets recommends.

nixtract 0.1.0

Tweag

Tweag is excited to announce the first release of nixtract 0.1.0! This is our first step towards a broader effort to make Nix the best tool to tackle tomorrow’s challenges of the Software Supply Chain. In order to understand why we need nixtract, let me tell you about the “secret” value of Nixpkgs. Is it a bird? A plane? It’s a graph! The Nix language allows you to define the “recipe” to build anything into a package, like the sources and the steps to make the package, but also the dependencie

Metadata 103

Learn How to Build Airtight Data Pipelines for your AI Initiatives

databricks

"I can't think of anything that's been more powerful since the desktop computer." — Michael Carbin, Associate Professor, MIT, and Founding Advisor, MosaicML A.

Top 10 Six Sigma Black Belt Project Examples & Ideas

Knowledge Hut

A certified Six Sigma Black Belt expert is a professional who knows and can explain and implement the Six Sigma principles and philosophies. These include tools and supportive systems. A Black Belt professional must have impeccable leadership skills and understand team dynamics. They work in a collaborative manner to assign team members and give them roles and responsibilities.

Project 98

Data Engineer ????????????????? Premium Service Delivery

Medium Data Engineering

First published on MFEC Viva Channel: 8 Aug 2023

5 Things you didn’t know about Buck2

Engineering at Meta

Meta has a very large monorepo, with many different programming languages. To optimize build and performance, we developed our own build system called Buck, which was first open-sourced in 2013. Buck2 is the recently open-sourced successor. In our internal tests at Meta, we observed that Buck2 completed builds approximately 2x as fast as Buck1. Below are five interesting facts you might not have known about Buck2.

KDnuggets News, October 27: 5 Free Books to Master Data Science • 7 Steps to Mastering LLMs

KDnuggets

This week on KDnuggets: Go from learning what large language models are to building and deploying LLM apps in 7 steps • Check this list of free books for learning Python, statistics, linear algebra, machine learning and deep learning • And much, much more!

Top 15 Software Engineer Projects 2023 [Source Code]

Knowledge Hut

In today's fast-paced technological environment, software engineers are continually seeking innovative projects to hone their skills and stay ahead of industry trends. Engaging in software engineering projects not only helps sharpen your programming abilities but also enhances your professional portfolio. To further build your skill set, consider enrolling in a programming training course to learn from expert trainers and grow through mentorship programs.

Werner Gains Advanced Geospatial Capabilities with Snowflake and CARTO

Snowflake

Founded nearly 70 years ago, Werner Enterprises is a North American transportation and logistics leader that operates a fleet of almost 8,300 trucks and 30,000 trailers out of 16 terminals across the United States. The company generates a massive amount of data on the constantly changing, real-time location of each of its assets. Collecting and analyzing this geospatial data is vital for smart decision-making.

Kubernetes And Kernel Panics

Netflix Tech

How Netflix’s Container Platform Connects Linux Kernel Panics to Kubernetes Pods. By Kyle Anderson. With a recent effort to reduce customer (engineers, not end users) pain on our container platform Titus, I started investigating “orphaned” pods. These are pods that never got to finish and had to be garbage collected with no real satisfactory final status.

Cloud 91

Greening AI: 7 Strategies to Make Applications More Sustainable

KDnuggets

The article delves into a comprehensive methodology that sheds light on how to accurately estimate the carbon footprint associated with AI applications. It explains the environmental impact of AI, a crucial consideration in today's world.

IT 98

Top 6 Six Sigma Yellow Belt Project Examples & Ideas

Knowledge Hut

Six Sigma knowledge is crucial to developing a strong understanding and improving one's expertise in this domain. Six Sigma is a mechanism for removing variances and flaws from a company's operations. Companies use Six Sigma to identify problem areas and build programmes to address them. This results in a more productive operation, and, as a result, a corporation saves money.

Project 98

3 Questions Marketers Should Ask When Evaluating AI Solutions

Snowflake

AI. It’s on everyone’s mind, and marketers are no exception. You’ve likely heard about it from co-workers, vendors and peers, and if you had a nickel for every AI mention you heard … well, you get the point. With the release of ChatGPT late last year, OpenAI supercharged the conversation around large language models (LLMs), marking 2023 as “the year of AI.”

Top 3 DataCamp Certifications for Data Analysts and Data Scientists in 2024

Medium Data Engineering

My favorite DataCamp certifications for starting a career in data analysis, data science, and data engineering in 2024.

Generative AI: The First Draft, Not Final

KDnuggets

This article gives a high-level overview of how LLMs work and their attendant limitations with accessible explanations and anecdotes throughout the piece. We also present advice on how people can introduce them into their workflows.

Top 14 Six Sigma Project Examples

Knowledge Hut

Six Sigma is a methodology that improves a process's quality and performance. It is an improvement program that Motorola developed in the 1980s. A Six Sigma project aims to reduce the number of defects in a company's products or services. Six Sigma certification training helps teams get the most out of such a project. The overall goal of a Six Sigma project is to improve customer satisfaction and increase revenue for the company.

Project 98

A Complete Guide to Scale Your Data Pipelines and Data Products with Contract Testing and dbt

Towards Data Science

All you need to know to start implementing contract tests with dbt. Let me tell you a story about data management systems and scale that will probably resonate with you if you are a data or analytics engineer trying to do your best work in 2023.

How Healthcare and Life Sciences Can Unlock the Potential of Generative AI

Snowflake

A patient interaction turned into clinician notes in seconds, increasing patient engagement and clinical efficiency. Novel compounds designed with desired properties, accelerating drug discovery. Realistic synthetic data created at scale, expediting research in rare under-addressed disease areas. These are just a few examples of how generative AI and large language models (LLMs) are transforming the healthcare and life sciences (HCLS) industry.

Windows on Snapdragon Brings Hybrid AI to Apps at the Edge

KDnuggets

Let’s take a closer look at Hybrid AI, how you can take advantage of it, and how Snapdragon brings hybrid AI to apps at the edge.

IT 106