Trending Articles

Is there a drop in software engineer job openings, globally?

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of five topics in today’s subscriber-only The Scoop issue. To get full issues twice a week, subscribe here.

Data News — Week 23.12

Christophe Blefari

The Earth can also generate great images (credits). Dear readers, I hope this new edition finds you well. It seems that you really liked the recent editions, which is perfect because they were fun to write.

Worth reading for data engineers - part 2

Waitingforcode

Welcome to the second part of the series, with summaries of great blog posts on streaming and project organization.

Hello Dolly: Democratizing the magic of ChatGPT with open models

databricks

Summary: We show that anyone can take a dated off-the-shelf open source large language model (LLM) and give it magical ChatGPT-like instruction following.

Future Proof Yourself Against AI.

Confessions of a Data Guy

Top 4 Cloud Platforms to Host or Run Docker Containers for Free

Analytics Vidhya

Introduction: Containerization has become more popular and widely used by developers in the software industry in recent years. Docker is still considered one of the top tools for creating containers by building images that run across containerization platforms and cloud platforms.

KDnuggets Top Posts for January 2023: SQL and Python Interview Questions for Data Analysts

KDnuggets

SQL and Python Interview Questions for Data Analysts • 5 SQL Visualization Tools for Data Engineers • 5 Free Tools For Detecting ChatGPT, GPT3, and GPT2 • Top Free Resources To Learn ChatGPT • Free TensorFlow 2.0

More Trending

Using CockroachDB to Reduce Feature Store Costs by 75%

DoorDash Engineering

While building a feature store to handle the massive growth of our machine-learning (“ML”) platform, we learned that using a mix of different databases can yield significant gains in efficiency and operational simplicity.

AWS Lambdas. Useful for Data Engineering?

Confessions of a Data Guy

Are lambdas one of those tools that everyone uses and no one talks about? I guess I’ve taken them for granted over the years, even though they are incredibly useful.

Top 11 Azure Data Services Interview Questions in 2023

Analytics Vidhya

Introduction: In today’s world, data is growing exponentially as digitalization advances. Organizations are using various cloud platforms like Azure, GCP, etc., to store and analyze this data and extract valuable business insights from it.

Top 30+ Project Management (PMP) Terms - Every Project Manager Should Know

Knowledge Hut

Project management is vital to the success of any company. It is responsible for keeping all project details organized, prioritized, and on track to meet deadlines and ensure quality. It also has a lot of influence over whether or not a project is completed successfully.

A Complete Collection of Data Science Free Courses – Part 1

KDnuggets

The first part lists free courses in Programming, Web scraping, Statistics & Probability, Data Analytics, SQL, and Business Intelligence.

Announcing General Availability of Databricks Unity Catalog on Google Cloud Platform

databricks

We are thrilled to announce that Databricks Unity Catalog is now generally available on Google Cloud Platform (GCP). Unity Catalog provides a unified...

Aligning Data Security With Business Productivity To Deploy Analytics Safely And At Speed

Data Engineering Podcast

Summary: As with all aspects of technology, security is a critical element of data applications, and the different controls can be at cross purposes with productivity.

lyft2vec — Embeddings at Lyft

Lyft Engineering

Co-authors: Javen Xu, Hakan Baba, and Adriana Deneault. Intro: Graph learning methods can reveal interesting insights that capture the underlying relational structures.

How Monte Carlo’s New Jira Integration Streamlines Ticketing Workflows

Monte Carlo

At Monte Carlo, our goal is always to build and support integrations that elevate our customers’ existing data engineering workflows. We don’t want to reinvent the wheel; we want to help your existing wheels spin faster.

Gaussian Naive Bayes, Explained

KDnuggets

Learn how Gaussian Naive Bayes works and implement it in Python.

Observe Everything

Cloudera

Over the past handful of years, systems architecture has evolved from monolithic approaches to applications and platforms that leverage containers, schedulers, lambda functions, and more across heterogeneous infrastructures.

Fine-Tuning Large Language Models with Hugging Face and DeepSpeed

databricks

Large language models (LLMs) are currently in the spotlight following the sensational release of ChatGPT. Many are wondering how to take advantage of...

Optimize Data Warehouse Storage with Views and Tables

Medium Data Engineering

The difference between tables and views, and how to use them.

Wake Up to the Importance of Sleep: Celebrating World Sleep Day!

Analytics Training

According to a recent survey, a shocking 59% of the population go to bed way past midnight, directly affecting their health – and they are blaming social media and digital devices for their distractions.

Machine Learning: What is Bootstrapping?

KDnuggets

Bootstrapping is an essential technique if you’re into machine learning. We’ll discuss it from theoretical and practical standpoints. The practical part involves two examples of bootstrapping in Python.

How Snowflake Delivers on the National Cybersecurity Strategy

Snowflake

Data is an asset, and it is imperative to keep it safe. This is why security is a core design principle of Snowflake: it’s always on, which removes the need for users to intervene.

Use SurrealDB to Persist Data with Rocket REST API

Workfall

Databases are essential in web development for organizing data in various forms and shapes (both structured and unstructured).

Barracuda Networks uses ML on Databricks Lakehouse to prevent email phishing attacks at scale

databricks

This blog is authored by Mohamed Afifi Ibrahim, Principal Machine Learning Engineer at Barracuda Networks. 74% of organizations globally have fallen victim to...

Linear Constraints: the problem with scopes

Tweag

This is the second of two companion blog posts to the paper Linearly Qualified Types, published at ICFP 2021 (there is also a long version, with appendices). These blog posts will dive into some subjects that were touched on, but not elaborated, in the paper.

Learn About Large Language Models

KDnuggets

An introduction to Large Language Models: what they are, how they work, and their use cases.

In the spotlight with Hayley Bird, ThoughtSpot’s Selfless Excellence champion

ThoughtSpot

This is part of our ongoing spotlight series, which highlights ThoughtSpot’s quarterly Selfless Excellence champion. At ThoughtSpot, Selfless Excellence is the guiding principle for our culture.

Unified Streaming And Batch Pipelines At LinkedIn: Reducing Processing time by 94% with Apache Beam

LinkedIn Engineering

Co-Authors: Yuhong Cheng, Shangjin Zhang, Xinyu Liu, and Yi Pan. Efficient data processing is crucial in reducing learning curves, simplifying maintenance efforts, and decreasing operational complexity.

Announcing the General Availability of Private Link and Customer Managed Keys for Azure Databricks

databricks

We are excited to announce that Private Link and customer-managed keys (CMK) for encryption are now Generally Available (GA) for Azure Databricks.

Beyond Web Mercator: Building basemaps in different projections

ArcGIS

Using ArcGIS Pro to build 'Human Geography' style vector basemaps in different projections, for use in ArcGIS Online.

Plotly Express for Data Visualization Cheat Sheet

KDnuggets

Our latest cheat sheet is a handy reference for Plotly Express, a high-level data visualization library in Python built on top of Plotly.

How to Sign Commits With GPG Key

Medium Data Engineering

Autosigning Git commits with GPG keys on GitHub.

Demand and ETR Forecasting at Airports

Uber Engineering

In this post we will dive into the algorithm, data modeling, and system design that go into estimating the length of time drivers would have to wait for a trip request at a given location, empowering them to strategically remain or reposition.

Using Real-Time Propensity Estimation to Drive Online Sales

databricks

Accelerated adoption of online services creates an opportunity for retail organizations to drive growth. While the sudden spike in online sales seen in...

Materialized Views in SQL Stream Builder

Cloudera

What is a materialized view? Cloudera SQL Stream Builder (SSB) gives the power of a unified stream processing engine to non-technical users so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface.

Data Quality Dimensions: Assuring Your Data Quality with Great Expectations

KDnuggets

This article highlights the significance of ensuring high-quality data and presents six key dimensions for measuring it: Completeness, Consistency, Integrity, Timeliness, Uniqueness, and Validity.