Mon.Apr 22, 2024

article thumbnail

Docker Fundamentals for Data Engineers

Start Data Engineering

1. Introduction 2. Docker concepts 2.1. Define the OS and its configurations with an image 2.2. Use the image to run containers 2.2.1. Communicate between containers and local OS 2.2.2. Start containers with docker CLI or compose 3. Conclusion 1. Introduction Docker can be overwhelming to start with. Most data projects use Docker to set up the data infra locally (and often in production).

article thumbnail

5 Free Stanford University Courses to Learn Data Science

KDnuggets

Are you an aspiring data scientist? If so, these free data science courses from Stanford will help you move forward in your data science journey!

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

How to test PySpark code with pytest

Start Data Engineering

1. Introduction 2. Ensure the code’s logic is working as expected with tests 2.1. Test types for data pipelines 2.2. pytest: A powerful Python library for testing 2.2.1. Set context, run code, check results & clean up 2.2.2. Tests are identified by their name 2.2.3. Use fixture to create fake data for testing 2.2.4. Define items to be shared among tests with conftest.

Coding 208
article thumbnail

Are we ready to put AI in the hands of business users? by Caitlin Salt

Scott Logic

Generative AI has been grabbing headlines, but many businesses are starting to feel left-behind. Large-model AI is becoming more and more influential in the market, and with the well-known tech giants starting to introduce easy-access AI stacks, a lot of businesses are left feeling that although there may be a use for AI in their business, they’re unable to see what use cases it might help them with.

BI 97
article thumbnail

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving everyday. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI pr

article thumbnail

Drawing a Blank? Understanding Drawing Alerts in ArcGIS Pro

ArcGIS

A drawing alert notification system was added in ArcGIS Pro 3.2 as a method for resolving drawing issues in your ArcGIS Pro projects.

Project 110
article thumbnail

Semantic Search with Vector Databases

KDnuggets

Leverage the latest technology to improve our search engine capabilities.

Database 101

More Trending

article thumbnail

How to Standout and Safeguard Your Job in the Generative AI Era

KDnuggets

The secret recipe to excel in your career in AI.

103
103
article thumbnail

How Striim Enhances Healthcare at Discovery Health with Real-Time Data

Striim

Discovery Health, originating in South Africa, has transcended borders to extend its services to over 40 million customers across more than 40 global markets, encompassing regions in Asia, EMEA, and the Americas. Since its inception in 1992, the company has remained steadfast in its core purpose: “to make people healthier and to enhance and protect their lives.” As a multifaceted financial services organization, Discovery Health operates in various sectors including healthcare, life

article thumbnail

How to Install Python 3 on Ubuntu [Step-by-Step Guide]

Knowledge Hut

Anyone aspiring to be a data scientist, machine learning engineer, or software developer must have thought about learning Python. The popularity of this programming language has grown exponentially in the past ten years. Even those unfamiliar with coding have probably heard about it. As per the developer survey by Stack Overflow in 2021 , approximately 68% of software developers or data scientists who have worked on developing software using Python have expressed that they will continue doing so

Python 98
article thumbnail

Enhancing Distributed System Load Shedding with TCP Congestion Control Algorithm

Zalando Engineering

Introduction Our team is responsible for sending out communications to all our customers at Zalando - e.g. confirming a placed order, informing about new content from a favourite brand or announcing sales campaigns. During the preparation of those messages as well during sending those out via different service providers we have to deal with limited resources.

article thumbnail

Leading the Development of Profitable and Sustainable Products

Speaker: Jason Tanner

While growth of software-enabled solutions generates momentum, growth alone is not enough to ensure sustainability. The probability of success dramatically improves with early planning for profitability. A sustainable business model contains a system of interrelated choices made not once but over time. Join this webinar for an iterative approach to ensuring solution, economic and relationship sustainability.

article thumbnail

Magnite’s Seamless Petabyte Scale Cross-Region Migration with Snowgrid

Snowflake

Magnite stands as the largest independent sell-side advertising platform, providing an essential bridge between publishers and advertisers. At its core, Magnite streamlines the advertising process, facilitating the buying and selling of advertising space across various channels, including connected TV (CTV), mobile, and desktop environments. By leveraging advanced technology and data analytics, Magnite offers a comprehensive suite of tools and services designed to maximize ad revenue for publish

AWS 70
article thumbnail

Web Performance Regression Detection (Part 1 of 3)

Pinterest Engineering

Michelle Vu | Web Performance Engineer Detecting, preventing, and resolving performance regressions has been a standard at Pinterest for many years. Over the years, we have seen many examples showing significant business metric movements resulting from performance optimizations and regressions. These concrete examples motivate us to optimize and maintain performance.

article thumbnail

Async APIs - don't confuse your events, commands and state by David Hope

Scott Logic

In my previous blog post I looked at various technologies for sending data asynchronously between services including RabbitMQ, Kafka, AWS EventBridge. This time round I’ll look at the messages themselves which over the last few years I’ve found to be a more complex and nuanced topic than expected. To set the scene see the diagram below of an imaginary financial trading application: There’s lots of data flying around varying from real time pricing data to instructions to execute trades.

AWS 52