Sat. Jan 14, 2023 - Fri. Jan 20, 2023

Replacing Pandas with Polars. A Practical Guide.

Confessions of a Data Guy

I remember those days, oh so long ago, it seems like another lifetime. I haven’t used Pandas in many a year, decades, or whatever. We’ve all been there, done that. Pandas I mean. I would dare say it’s a rite of passage for most data folk. For those using Python, it’s probably one of the […]

How To Hire Junior Data Engineers

Seattle Data Guy

With all the recent data events I have put together I inevitably run into new data engineers who are either finishing up college or looking to transition into a data engineer or data scientist position. In fact I have talked to several newly graduated engineers who are struggling to find work. A few told me…

What Big Tech layoffs suggest for the industry

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of five topics in today’s subscriber-only The Scoop issue. To get the full issues, twice a week: subscribe here. Update on 20 January: less than a day after publishing this article, Google announced historic layoffs that will impact ~12,000 positions.

Data News — Week 23.03

Christophe Blefari

Summer is coming (credits). Hey, new Friday, new Data News edition. I'm so happy to see new people coming every week. Thank you for every recommendation you make about the blog or the Data News. This kindness for my content gives me wings. This week I don't want to be late, so let's start the weekly wrap-up. I was less inspired this week, which means a shorter edition.

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving every day. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI prototypes into

ChatGPT as a Python Programming Assistant

KDnuggets

Is ChatGPT useful for Python programmers, specifically those of us who use Python for data processing, data cleaning, and building machine learning models? Let's give it a try and find out.

What Is The State Of Data Engineering And Infrastructure In 2023

Seattle Data Guy

2022 is coming to an end. What is the state of data infra? Are Snowflake and Databricks still fighting over total cost of ownership? Is everyone switching to DuckDB? Are data engineers all learning Rust? Let’s try to answer these questions. Our team is putting together an all day event focused on helping answer some…

More Trending

Data News — Week 23.02

Christophe Blefari

Abandoned Pandas (credits). Hey. I have had busy weeks; I'm sorry Data News is coming on Saturday again. It is a bit hard to travel by train, work and write at the same time. Plus I'm a fast context switcher, so it piles up. Also, a few of you have sent me messages recently and I have not yet answered; I see you and I did not forget you.

20 Questions (with Answers) to Detect Fake Data Scientists: ChatGPT Edition, Part 1

KDnuggets

Can ChatGPT provide answers to data science questions to the same standard as humans? Check out this attempt to do so, and compare the answers to those from experts.

Why You Should Simplify Your Data Infrastructure

Seattle Data Guy

“Good Design Is Easier to Change Than Bad Design” – The Pragmatic Programmer. Programming is just one aspect of the difficulties of tech work for data engineers. Creating simple yet robust systems that help manage your data infrastructure is equally important. This challenge of building a simple yet robust data infrastructure remains even with no-code/low-code solutions.

Devpod: Improving Developer Productivity at Uber with Remote Development

Uber Engineering

In this blog, we share how we improved the daily edit-build-run developer experience using DevPods, Uber’s remote development environment. We cover the challenges, pain points, our architecture, and lastly the future of remote development at Uber.

Leading the Development of Profitable and Sustainable Products

Speaker: Jason Tanner

While growth of software-enabled solutions generates momentum, growth alone is not enough to ensure sustainability. The probability of success dramatically improves with early planning for profitability. A sustainable business model contains a system of interrelated choices made not once but over time. Join this webinar for an iterative approach to ensuring solution, economic and relationship sustainability.

Data Integrity Trends for 2023

Precisely

For most enterprises, 2022 was a year of transition, as companies struggled to figure out how to accomplish more with fewer resources. Technology helped to bridge the gap, as AI, machine learning, and data analytics drove smarter decisions, and automation paved the way for greater efficiency. Data integrity trends for 2023 have agility topping the list of success factors for most firms, as business leaders focus on rapid time to value and an emphasis on responding quickly to emerging opportunitie

SQL and Data Integration: ETL and ELT

KDnuggets

In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.

DevTernity conference 2022 by Robat Williams

Scott Logic

Late last year I had the chance to attend DevTernity, an all-remote generalist software development conference. The first day was the main conference day, with the second (optional) day offering a choice of workshops by some of the speakers. It was a great conference. In this post I’ll cover off some points of interest from some of the talks I chose to attend, and reflect on the remote conference experience.

Reducing Logging Cost by Two Orders of Magnitude using CLP

Uber Engineering

Uber’s Data team discusses how they used CLP to scale log ingestion, retention, and analytics for Petabytes of Spark logs, reducing log storage and management costs by 169x.

Navigating the Future: Generative AI, Application Analytics, and Data

Generative AI is upending the way product developers & end-users alike are interacting with data. Despite the potential of AI, many are left with questions about the future of product development: How will AI impact my business and contribute to its success? What can product managers and developers expect in the future with the widespread adoption of AI?

What’s New With SQL User-Defined Functions

databricks

Since their initial release, SQL user-defined functions have become hugely popular among both Databricks Runtime and Databricks SQL customers. This simple yet powerful.

Fast-track your next move with in-demand data skills

KDnuggets

DataCamp offers over 400 interactive courses, projects, and career tracks in the most popular data technologies such as Python, SQL, R, Power BI, and Tableau. Start today and save up to 67% on career-advancing learning.

The Insurance Industry is Ready for a lot More Change

Teradata

The dwindling personal auto insurance market is a harbinger of a lot more change to come. Find out more.

Uber’s Next Gen Push Platform on gRPC

Uber Engineering

Uber’s API platform team talks about how they built their Next Generation Push Platform on gRPC which helped improve the reliability and latency of messages significantly.

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Easy Ingestion to Lakehouse With COPY INTO

databricks

A new data management architecture known as the data lakehouse emerged independently across many organizations and use cases to support AI and BI.

How to Use Python and Machine Learning to Predict Football Match Winners

KDnuggets

We will be learning web scraping and training supervised machine-learning algorithms to predict winning teams.

How to add an inner band of color to polygons in ArcGIS Pro

ArcGIS

Here's how you can add a ribbon of color to the inside of polygons, without rendering gaps or jaggies.

Deduping and Storing Images at Uber Eats

Uber Engineering

Our engineers discuss how we dedupe and store millions of product images at Uber Eats using a content-addressable caching layer, which saves millions of image downloads every hour and ensures that every image is only stored once.
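
The idea of a content-addressable layer can be illustrated with a small sketch. This is not Uber's implementation, which the teaser does not describe; it is a toy in-memory example that assumes images are keyed by the SHA-256 digest of their bytes, so identical images arriving under different URLs share one stored copy. The class and method names (`ContentAddressableStore`, `put`, `get`, `blob_count`) and the example URLs are hypothetical.

```python
import hashlib
from typing import Dict, Optional


class ContentAddressableStore:
    """Toy in-memory store keyed by the SHA-256 digest of the image bytes.

    Hypothetical illustration only; a real system would use durable
    storage and would also need eviction, validation, and so on.
    """

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}    # digest -> image bytes
        self._url_index: Dict[str, str] = {}  # source URL -> digest

    def put(self, url: str, data: bytes) -> str:
        """Store the bytes under their digest and remember the URL."""
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the existing blob if this content was seen before,
        # so identical bytes are stored only once.
        self._blobs.setdefault(digest, data)
        self._url_index[url] = digest
        return digest

    def get(self, url: str) -> Optional[bytes]:
        """Return the bytes for a known URL, or None if it was never stored."""
        digest = self._url_index.get(url)
        return None if digest is None else self._blobs[digest]

    def blob_count(self) -> int:
        """Number of distinct blobs actually stored."""
        return len(self._blobs)


store = ContentAddressableStore()
key_a = store.put("https://example.com/menu/burger.jpg", b"same-image-bytes")
key_b = store.put("https://example.com/promo/burger.jpg", b"same-image-bytes")
key_c = store.put("https://example.com/menu/salad.jpg", b"other-image-bytes")
```

In this sketch the two burger URLs map to the same key, so only two blobs exist for three URLs. A lookup by URL before downloading is what would let such a layer skip repeat downloads.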

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

New Built-in Functions for Databricks SQL

databricks

Built-in functions extend the power of SQL with specific transformations of values for common needs and use cases. For example, the LOG10 function.

Top Posts January 9-15: Python Matplotlib Cheat Sheets

KDnuggets

Python Matplotlib Cheat Sheets • How to Select Rows and Columns in Pandas • 7 Best Platforms to Practice SQL • How to Perform Unit Testing in Python? • Google Data Analytics Certification Review.

Why GxP is Vital for Cloud Control

Teradata

GxPs are a set of guidelines used to reduce risk when dealing with tech suppliers. But guidelines are not certification tests. Learn what to consider when assessing reliability in the cloud.

How Uber Optimizes the Timing of Push Notifications using ML and Linear Programming

Uber Engineering

The Uber Eats team shares how they built a novel system with machine learning and linear programming to send the right message at the right time to its users.

How To Get Promoted In Product Management

Speaker: John Mansour

If you're looking to advance your career in product management, there are more options than just climbing the management ladder. Join our upcoming webinar to learn about highly rewarding career paths that don't involve management responsibilities. We'll cover both career tracks and provide tips on how to position yourself for success in the one that's right for you.

Language Models, Explained: How GPT and Other Models Work

AltexSoft

In 2020, a remarkable AI took Silicon Valley by storm. Dubbed GPT-3 and developed by OpenAI in San Francisco, it was the latest and strongest of its kind — a “large language model” capable of producing fluent text after having ingested billions of words from books, articles, and websites. According to the paper “Language Models are Few-Shot Learners” by OpenAI, GPT-3 was so advanced that many individuals had difficulty distinguishing between news stories generated by the model and those written

Data Lakes and SQL: A Match Made in Data Heaven

KDnuggets

In this article, we will discuss the benefits of using SQL with a data lake and how it can help organizations unlock the full potential of their data.

How to make this 3D diorama of the Straits of Mackinac

ArcGIS

Here's one way to make these fun and intriguing micro-world cutaway sorts of things!

Introducing WorkflowGuard: The Workflow Governance and Observability System That Oversees over 120,000 Data Workflows

Uber Engineering

Our Data Workflow Platform team introduces WorkflowGuard: a new service to govern executions, prioritize resources, and manage life cycle for repetitive data jobs. Check out how it improved workflow reliability and cost efficiency while bringing more observability to users.

How Embedded Analytics Gets You to Market Faster with a SAAS Offering

Start-ups & SMBs launching products quickly must bundle dashboards, reports, & self-service analytics into apps. Customers expect rapid value from your product (time-to-value), data security, and access to advanced capabilities. Traditional Business Intelligence (BI) tools can provide valuable data analysis capabilities, but they have a barrier to entry that can stop small and midsize businesses from capitalizing on them.