Top Data Engineering Digest Certification Accessibility Content for October, 2023

October, 2023

Drag, Drop, Analyze: The Rise of No-Code Data Science

KDnuggets

OCTOBER 26, 2023

No-code or low-code functionalities in data science have gained significant traction in recent years. These solutions are well-proven and matured, and they make data science more accessible to a wider range of people.

Data Science

Data Science Coding Data Accessible

Building a Streaming Data Pipeline with Redshift Serverless and Kinesis

Towards Data Science

OCTOBER 6, 2023

An End-To-End Tutorial for Beginners

Data Pipeline

Data Pipeline Building Data Science Data

Webinars

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Trending Sources

Handling a Regional Outage: Comparing the Response From AWS, Azure and GCP

The Pragmatic Engineer

OCTOBER 31, 2023

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover three out of seven topics from today’s subscriber-only issue Three Cloud Providers, Three Outages: Three Different Responses.

AWS

AWS Google Cloud Cloud Food

dbt multi-project collaboration

Christophe Blefari

OCTOBER 19, 2023

Over the last few years, dbt has become a de facto standard enabling companies to collaborate easily on data transformations. With dbt, you can apply software engineering practices to SQL development. Managing your SQL patrimony has never been easier. So, yes, dbt is cool but there is a common pattern with it: you accumulate SQL queries.

Project

Project Finance SQL Government

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Database

Introduction of Microsoft Fabric

Analytics Vidhya

OCTOBER 6, 2023

In today’s rapidly evolving digital landscape, seamless integration of data, applications, and devices is more pressing than ever. Enter Microsoft Fabric, a cutting-edge solution designed to revolutionize how we interact with technology. This article will explore the key features and benefits, identify the ideal users for this solution, and guide you on when and how to […]

Designing

Designing Technology Data Lake BI

How to use the BranchPythonOperator

Marc Lamberti

OCTOBER 4, 2023

Are you looking for a way to choose one task or another? Do you want to execute a task based on a condition? Do you have multiple tasks, but only one should be executed if a criterion is valid? You’ve come to the right place! The BranchPythonOperator does precisely what you are looking for. It’s common to have DAGs with different execution flows, and you want to follow only one, depending on a value or a condition.
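The branching idea in this teaser can be illustrated with a minimal sketch in plain Python. It does not use Airflow itself: in Airflow, the BranchPythonOperator is given a Python callable that returns the task id (or ids) to follow, and the other branches are skipped. The names `choose_branch`, `run_branch`, `deploy_model`, `retrain_model`, and the accuracy threshold are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List


def choose_branch(accuracy: float, threshold: float = 0.8) -> str:
    """Return the id of the single downstream task that should run.

    Hypothetical condition: deploy when accuracy meets the threshold.
    """
    return "deploy_model" if accuracy >= threshold else "retrain_model"


def run_branch(
    chooser: Callable[[], str],
    tasks: Dict[str, Callable[[], str]],
) -> List[str]:
    """Run only the task selected by `chooser`; mark every other task as skipped."""
    selected = chooser()
    if selected not in tasks:
        raise KeyError(f"unknown task id: {selected}")
    log = []
    for task_id, task in tasks.items():
        if task_id == selected:
            log.append(task())
        else:
            log.append(f"{task_id}: skipped")
    return log


# Two mutually exclusive execution flows (illustrative stand-ins for real tasks).
TASKS = {
    "deploy_model": lambda: "deploy_model: ran",
    "retrain_model": lambda: "retrain_model: ran",
}

if __name__ == "__main__":
    print(run_branch(lambda: choose_branch(0.91), TASKS))
```

The design point is that the condition lives in one small function that only returns an identifier, so the decision can be tested separately from the tasks it selects.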

Python

Python Data Pipeline Machine Learning IT

Current 2023 Announcements

Jesse Anderson

OCTOBER 5, 2023

Confluent had their Current Conference (videos: day one and day two). There were many announcements that both technologists and investors need to know about. Confluent had two moats (replication and Confluent Cloud), and now they anticipate three moats (replication, Confluent Cloud, serverless Flink). As expected with a vendor conference, there is a lot of marketing from the stage.

Kafka

Kafka Finance Cloud Designing

More Trending

Surveying The Market Of Database Products

Data Engineering Podcast

OCTOBER 29, 2023

Summary: Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection.

Database

Database BI SQL Machine Learning

Going from Developer to CEO: Chronosphere

The Pragmatic Engineer

OCTOBER 10, 2023

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover three out of eight topics from today’s deep dive into tech scaleup Chronosphere.

Software Engineer

Software Engineer Software Engineering Architecture Media

5 Free Books to Master Machine Learning

KDnuggets

OCTOBER 25, 2023

Machine Learning is one of the most exciting fields in computer science today. In this article, we will take a look at the five best yet free books to learn machine learning in 2023.

Machine Learning

Machine Learning Computer Science

LLM Inference Performance Engineering: Best Practices

databricks

OCTOBER 12, 2023

In this blog post, the MosaicML engineering team shares best practices for how to capitalize on popular open source large language models (LLMs).

Engineering

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

Certification

Airflow Sensors: What you need to know

Marc Lamberti

OCTOBER 1, 2023

Airflow Sensors are one of the most common tasks in data pipelines. Why? Because a Sensor waits for a condition to be true before it completes. Do you need to wait for a file? Check if an SQL entry exists? Delay the execution of a DAG? Those are a few of the possibilities of Airflow Sensors. If you want to build complex and robust data pipelines, you have to genuinely understand how Sensors work.
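The "wait for a condition to be true" behaviour can be sketched in a few lines of plain Python. This is not Airflow code: `wait_for` and `SensorTimeout` are hypothetical names, and the `poke_interval` and `timeout` arguments only mirror the idea of checking periodically and giving up after a deadline.

```python
import time
from typing import Callable


class SensorTimeout(Exception):
    """Raised when the condition did not become true before the deadline."""


def wait_for(
    condition: Callable[[], bool],
    poke_interval: float = 1.0,
    timeout: float = 10.0,
) -> int:
    """Check `condition` every `poke_interval` seconds until it returns True.

    Returns the number of checks ("pokes") that were needed.
    Raises SensorTimeout once `timeout` seconds have elapsed without success.
    """
    started = time.monotonic()
    pokes = 0
    while True:
        pokes += 1
        if condition():
            return pokes
        if time.monotonic() - started >= timeout:
            raise SensorTimeout(f"condition still false after {pokes} pokes")
        time.sleep(poke_interval)


if __name__ == "__main__":
    # Illustrative condition: becomes true on the third check.
    state = {"checks": 0}

    def file_has_arrived() -> bool:
        state["checks"] += 1
        return state["checks"] >= 3

    print(wait_for(file_has_arrived, poke_interval=0.1, timeout=5.0))
```

A short interval reacts faster but checks more often; a timeout keeps a condition that never becomes true from blocking the pipeline indefinitely.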

Data Pipeline

Data Pipeline SQL Algorithm Coding

The State of WebAssembly 2023 by Colin Eberhardt

Scott Logic

OCTOBER 18, 2023

The State of WebAssembly 2023 survey has closed, the results are in … and they are fascinating! If you want the TL;DR, here are the highlights: Rust and JavaScript usage is continuing to increase, but some more notable changes are happening a little further down, with both Swift and Zig seeing a significant increase in adoption. When it comes to which languages developers ‘desire’, with Zig, Kotlin and C# we see that desirability exceeds current usage. WebAssembly is still most often used for we

Programming Language

Programming Language Coding Java Datasets

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

Data Engineering Podcast

OCTOBER 15, 2023

Summary: Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams.

Process

Process Building SQL BI

AMM Performance Testing Report

Ripple Engineering

OCTOBER 5, 2023

Overview: In the rippled 1.12.0 release, the AMM amendment stands out as a significant feature in both size and scope. Since September 2022, the RippleX performance team has collaborated closely with the engineering team responsible for the AMM feature implementation. This report presents a thorough overview of our testing approach, findings, and key takeaways.

AWS

AWS BI Designing Database

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Data Science

7 Steps to Mastering Large Language Models (LLMs)

KDnuggets

OCTOBER 18, 2023

Large Language Models (LLMs) have unlocked a new era in natural language processing. So why not learn more about them? Go from learning what large language models are to building and deploying LLM apps in 7 easy steps with this guide.

Building

Building Process

Snowflake To Acquire Ponder, Boosting Python Capabilities In the Data Cloud

Snowflake

OCTOBER 23, 2023

Python’s popularity has more than doubled in the past decade¹ and it is quickly becoming the preferred language for development across machine learning, application development, pipelines, and more. One of our goals at Snowflake is to ensure we continue to deliver a best-in-class platform for Python developers. Snowflake customers are already harnessing the power of Python through Snowpark, a set of runtimes and libraries that securely deploy and process non-SQL code directly in Snowflake.

Python

Python Cloud Data Science Data Architecture

How LinkedIn Is Using Embeddings to Up Its Match Game for Job Seekers

LinkedIn Engineering

OCTOBER 5, 2023

Think of how many times a day you use some type of search functionality across your devices and applications to discover information, find a contact, or a new job opportunity. The truth is we all depend on the ability to search for things online, and finding the right match to the information, organization, or to a job that maps to your skills and interests makes all the difference in our experiences and the knowledge we can gain.

IT Metadata Designing Algorithm

Automating dead code cleanup

Engineering at Meta

OCTOBER 24, 2023

Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing dead code. SCARF combines static and dynamic analysis of programs to detect dead code from both a business and programming language perspective. SCARF automatically creates change requests that delete the dead code identified from the program analysis, minimizing developer costs.

Coding

Coding Programming Language Python MySQL

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Building

Training LLMs at Scale with AMD MI250 GPUs

databricks

OCTOBER 30, 2023

Introduction: Four months ago, we shared how AMD had emerged as a capable platform for generative AI and demonstrated how to easily and.

Data Science

Data Science Engineering Data

Announcing Apache Flink 1.18

Confluent

OCTOBER 26, 2023

Read updates and improvements in Apache Flink 1.18, including dynamic fine-grained rescaling via REST API, Java 17 support, and faster rescaling & batch performance improvements.

Java

Why SQL is THE Language to Learn for Data Science

KDnuggets

OCTOBER 12, 2023

SQL is the essential data science language due to its universal database accessibility, efficient data cleaning capabilities, seamless integration with other languages, and requirement for most data science jobs.

Data Science

Data Science SQL Database Data

How DoorDash Standardized and Improved Microservices Caching

DoorDash Engineering

OCTOBER 19, 2023

As DoorDash’s microservices architecture has grown, so too has the volume of interservice traffic. Each team manages their own data and exposes access through gRPC services, an open-source remote procedure call framework used to build scalable APIs. Most business logic is I/O-bound because of calls to downstream services. Caching has long been a go-to strategy to improve performance and reduce costs.

Database

Database Coding Java Accessible

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Building

Revolutionizing Real-Time Streaming Processing: 4 Trillion Events Daily at LinkedIn

LinkedIn Engineering

OCTOBER 19, 2023

Authors: Bingfeng Xia and Xinyu Liu. Background: At LinkedIn, Apache Beam plays a pivotal role in stream processing infrastructures that process over 4 trillion events daily through more than 3,000 pipelines across multiple production data centers. This robust framework empowers near real-time data processing for critical services and platforms, ranging from machine learning and notifications to anti-abuse AI modeling.

Process

Process Lambda Architecture Kafka Machine Learning

High resolution data updates to Living Atlas World Elevation Layers and Tools (October 2023)

ArcGIS

OCTOBER 26, 2023

In October 2023, elevation layers were updated with high-resolution datasets of France, New Zealand, the USA, and Italy, along with global bathymetry.

Datasets

Datasets Data

Announcing MLflow 2.8 LLM-as-a-judge metrics and Best Practices for LLM Evaluation of RAG Applications, Part 2

databricks

OCTOBER 31, 2023

Today we're excited to announce MLflow 2.8 supports our LLM-as-a-judge metrics which can help save time and costs while providing an approximation of.

Data Science

Data Science Engineering Data

How Meta is creating custom silicon for AI

Engineering at Meta

OCTOBER 18, 2023

With the recent launches of MTIA v1, Meta’s first-generation AI inference accelerator, and Llama 2, the next generation of Meta’s publicly available large language model, it’s clear that Meta is focused on advancing AI for a more connected world. Fueling the success of these products are world-class infrastructure teams, including Meta’s custom AI silicon team, led by Olivia Wu, a leader in the silicon industry for 30 years.

Designing

Designing Deep Learning Media Architecture

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Engineering

5 Free Books to Master Data Science

KDnuggets

OCTOBER 16, 2023

Want to break into data science? Check this list of free books for learning Python, statistics, linear algebra, machine learning and deep learning.

Data Science

Data Science Deep Learning Machine Learning Python

Top 30+ Computer Science Project Topics of 2023 [Source Code]

Knowledge Hut

OCTOBER 29, 2023

Choosing the best computer science project topic is critical to the success of any computer science student or employee. After all, the more engaging and interesting the topic, the more likely it is that students or employees will be able to stay motivated and focused throughout the duration of the project. However, with so many options out there, it can be tough to decide which one is right for you.

Computer Science

Computer Science Coding Project Hospitality

nixtract 0.1.0

Tweag

OCTOBER 25, 2023

Tweag is excited to announce the first release of nixtract 0.1.0! This is our first step towards a broader effort to make Nix the best tool to tackle tomorrow’s challenges of the Software Supply Chain. In order to understand why we need nixtract, let me tell you about the “secret” value of Nixpkgs. Is it a bird? A plane? It’s a graph! The Nix language allows you to define the “recipe” to build anything into a package, like the sources and the steps to make the package, but also the dependencie

Metadata

Metadata Accessible Accessibility Python

Prepare your data for the National Spatial Reference System modernization of 2022 in the U.S.

ArcGIS

OCTOBER 17, 2023

The new U.S. datums of 2022 will soon be released. This article covers what is coming and how you should prepare your data.

Systems

Systems Data Data Management Government

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Certification
