Sat. Mar 16, 2024 - Fri. Mar 22, 2024

Is the “AI developer” a threat to jobs – or a marketing stunt?

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of three topics from last week’s subscriber-only The Pulse issue. Today, full subscribers got access to a comprehensive Senior-and-above tech compensation research.

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

Summary: A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.

Data News — Week 24.12

Christophe Blefari

Friday routine (credits). It's Friday and it's Data News. I don't go into too much detail about the magic of Data News, but every Friday is the same. At first, I'm: oh s**t, here we go again and 10 minutes later I'm lost in reading the content and picking too many articles to fit into a thousand word edition. Usually all the process takes me a whole Friday.

StreamingQueryListener, from states to questions

Waitingforcode

Apache Spark leverages the observer design pattern for the framework-to-code communication. One of the consumers' implementations is StreamingQueryListener.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Introducing Tableflow

Confluent

Seamlessly integrate Apache Kafka data into your lakehouse as Apache Iceberg tables, bridging the operational and analytical divide, with Tableflow. Read more in our blog post.

Lilac Joins Databricks to Simplify Unstructured Data Evaluation for Generative AI

Databricks

Today, we are thrilled to announce that Lilac is joining Databricks. Lilac is a scalable, user-friendly tool for data scientists to search, cluster.

More Trending

Threads has entered the fediverse

Engineering at Meta

Threads has entered the fediverse! As part of our beta experience, now available in a few countries, Threads users aged 18+ with public profiles can now choose to share their Threads posts to other ActivityPub-compliant servers. People on those servers can now follow federated Threads profiles and see, like, reply to, and repost posts from the fediverse.

Data Trends 2024: Strategies for an AI-Ready Data Foundation

Snowflake

A company’s data strategy is always in motion. Since the explosion of interest in generative AI and large language models (LLMs), that is more true than ever, with business leaders discussing how quickly they should adopt these technologies to stay competitive. Some emerging approaches may be seen in our newly released Snowflake Data Trends 2024, looking at how users in the Data Cloud are working with their data.

The Path To Senior Engineer

Confessions of a Data Guy

Want to know how to grow to the Senior Engineering position? Take a look.

Cloudera’s RHEL-volution: Powering the Cloud with Red Hat

Cloudera

As enterprise AI technologies rapidly reshape our digital environment, the foundation of your cloud infrastructure is more critical than ever. That’s why Cloudera and Red Hat, renowned for their open-source solutions, have teamed up to bring Red Hat Enterprise Linux (RHEL) to Cloudera on public cloud as the operating system for all of our public cloud platform images.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Best Practices for Confluent Schema Registry

Confluent

Learn the best practices for using Confluent Schema Registry, including using schema IDs, understanding subjects and versions, using data contracts, pre-registering schemas, and more.

Snowflake Brings Gen AI to Images, Video and More With Multimodal Language Models from Reka in Snowflake Cortex

Snowflake

Snowflake is committed to helping our customers unlock the power of artificial intelligence (AI) to drive better decisions, improve productivity and reach more customers using all types of data. Large Language Models (LLMs) are a critical component of generative AI applications, and multimodal models are an exciting category that allows users to go beyond text and incorporate images and video into their prompts to get a better understanding of the context and meaning of the data.

Top 8 AI Search Engine That You Should Replace With Google

KDnuggets

GenAI has enabled new search engine platforms with unique features and advantages, challenging Google's dominance.

Introducing the Databricks AI Security Framework (DASF)

Databricks

We are excited to announce the release of the Databricks AI Security Framework (DASF) version 1.0 whitepaper! The framework is designed to improve.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Fail Safe vs Fail Secure: Top Differences in Locking Systems

Knowledge Hut

When I worked in the hospitality industry, the electricity abruptly went out while we were establishing the network and door locks. To my surprise, the door lock opened like any other door. This was the first time in my life that I had ever seen a fail-safe door lock. I have comprehensively analyzed the area of physical security, particularly the ongoing discussion surrounding fail-safe vs fail-secure electric strike locking systems.

Navigating your way: Traffic Prediction with Machine Learning

WeCloudData

Machine learning is revolutionizing traffic prediction, enhancing route planning and reducing congestion in urban commuting. Explore advanced algorithms like Uni-LSTM and BiLSTM for accurate forecasts, along with Google Maps' integration of deep learning for improved ETA accuracy. Discover the practical utility of machine learning in everyday life.

5 Free Books to Master Statistics for Data Science

KDnuggets

Statistics is a must-have skill for data science. And here are 5 free books that’ll help you learn all the statistics you need as a data professional.

Turbocharged Training: Optimizing the Databricks Mosaic AI stack with FP8

Databricks

Benchmarking for training (dense) models at scale. We demonstrate great performance (very high MFU) and highlight our use of NVIDIA's Transformer Engine, along with PyTorch FSDP and DTensor.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Important Tips For Employees To Achieve Their Career Growth Goals

Knowledge Hut

Employees today will change occupations considerably more often than in the past. You should therefore be aware and proactive in managing your career. But does this mean you need to compromise on other parts of your life that are important to you? Our lives are busy enough balancing work and family without also finding time for significant career development.

Robinhood is now available to all customers in the United Kingdom

Robinhood

The largest UK brokers typically charge UK investors, with a £10,000 portfolio, an average of £240 per year to invest in US stocks*–Robinhood offers no commission fees and no foreign exchange (FX) fees on trades.** Today, we’ve rolled all eligible customers off our waitlist and Robinhood is now officially available throughout the United Kingdom. With Robinhood, customers simply get more for their money, with no commission fees and no foreign exchange (FX) fees on trades, access to more than 6,00

A Free Data Science Learning Roadmap: For All Levels with IBM

KDnuggets

Learn data science according to your expertise with these 4 different learning roadmaps.

Logarithm: A logging engine for AI training workflows and services

Engineering at Meta

Systems and application logs play a key role in operations, observability, and debugging workflows at Meta. Logarithm is a hosted, serverless, multitenant service, used only internally at Meta, that consumes and indexes these logs and provides an interactive query interface to retrieve and view logs. In this post, we present the design behind Logarithm, and show how it powers AI training debugging use cases.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

What Is Logical Thinking – Significance, Components, And Examples

Knowledge Hut

Logical thinking skills play a significant role in developing careers because they help you reason through vital decisions, generate creative ideas, set goals, and solve problems. You may encounter multiple challenges in your life when you enter the job industry or advance your career. Therefore, you need strong logical reasoning skills to solve your problems.

LinkSage: GNN-based Pinterest Off-site Content Understanding

Pinterest Engineering

Adopted by multiple Pinterest user-facing surfaces, Ads, and Board. Jianjin Dong | Staff Machine Learning Engineer, Content Quality; Qinglong Zeng | Senior Engineering Manager, Content Quality; Andrey Gusev | Director, Content Quality; Yangyi Lu | Machine Learning Engineer, Home Feed; Han Sun | Staff Machine Learning Engineer, Ads Conversion Modeling; William Zhao | Software Engineer, Boards Foundation; Jay Ma | Machine Learning Engineer, Ads Lightweight Ranking LinkSage: Graph Neural Network ba

Getting Started with LLMOps: The Secret Sauce Behind Seamless Interactions

KDnuggets

Check out this beginner’s guide to understanding the role of Large Language Model Operations for seamless user experiences.

Better video for mobile RTC with AV1 and HD

Engineering at Meta

At Meta, we support real-time communication (RTC) for billions of people through our apps, including Messenger, Instagram, and WhatsApp. We’ve seen significant benefits by adopting the AV1 codec for RTC. Here’s how we are improving the RTC video quality for our apps with tools like the AV1 codec, the challenges we face, and how we mitigate those challenges.

Right Certification at Right Time of the Career

Knowledge Hut

In today's increasingly dynamic environment, the only constant is change itself, whatever drives it, whether opportunities or threats from outside or inside your organization… Some of you have probably already realized that the market demands more and more talent, since most developed organizations are investing more and more in building up or upgrading their organizational capability to cater to this dynamic environment, that just is the capabi

Confluent Cloud for Apache Flink Is Now Generally Available

Confluent

Confluent Cloud's serverless Flink offering is now available on all major clouds, offering a unified, managed platform for real-time data processing.

Introducing MetaGPT’s Data Interpreter: SOTA Open Source LLM-based Data Solutions

KDnuggets

MetaGPT's newest agent addition makes running data interpretation and analysis tasks a breeze. Find out more and give it a try for yourself.

The Modern Data Streaming Pipeline: Streaming Reference Architectures and Use Cases Across 7 Industries 

Snowflake

Executives across various industries are under pressure to reach insights and make decisions quickly. This is driving the importance of streaming data and analytics, which play a crucial role in making better-informed decisions that likely lead to faster, better outcomes. While traditional systems store and process data in batches, streaming data refers to data that is continuously generated from a variety of sources.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.