August, 2023


7 Things You Should Do In Data Engineering

Medium Data Engineering

Data engineering is a crucial field that plays a pivotal role in modern data-driven businesses.


Top 5 questions Data Engineers should ask before joining a startup

Towards Data Science

Advice from a startup founder in the data space on how to find a startup that works for you. So you want to join a startup, huh? I’m not talking about a fancy Series E startup, funded by a16z, that’s about to IPO. I’m talking about a real startup, from seed to Series B, where every day can feel like you’re either about to soar or crash and burn, and there’s little in between.


Acing the Data Engineering Coding Interview

Medium Data Engineering

In my experience, the coding screen for data engineers can range wildly, from non-existent to algorithmic challenges in line with software…


Why Is Data Modeling So Challenging – How To Data Model For Analytics

Seattle Data Guy

Learning how to data model from basic star schemas on the internet is like learning data science using the Iris data set. It works great as a toy example. But it doesn’t match real life at all. Data modeling in real life requires that you fully understand the data sources and your business use cases.…


LLMs in Production: Tooling, Process, and Team Structure

Speaker: Dr. Greg Loughnane and Chris Alexiuk

Technology professionals developing generative AI applications are finding that there are big leaps from POCs and MVPs to production-ready applications. They're often developing using prompting, Retrieval Augmented Generation (RAG), and fine-tuning (up to and including Reinforcement Learning from Human Feedback (RLHF)), typically in that order. However, during development, and even more so once deployed to production, best practices for operating and improving generative AI applications are le


What is a Senior Software Engineer at Wise and Amazon?

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. To get full issues twice a week, subscribe here. Over the past month, we’ve done deep dives in the newsletter on what a senior software engineer is at Big Tech, and at scaleups.

More Trending


Snowflake and Instacart: The Facts

Snowflake

In the past few days, the scope and trajectory of Instacart’s use of Snowflake has been misrepresented by some on social media. Snowflake has partnered closely with Instacart to scale up to meet the company’s massive demand growth, and then to optimize for efficiency. Optimizations are undertaken on a workload-by-workload basis, and have been extremely successful.


KDnuggets News, August 30: 7 Projects Built with Generative AI • Beyond Numpy and Pandas: Lesser-Known Python Libraries

KDnuggets

7 Projects Built with Generative AI • Beyond Numpy and Pandas: Unlocking the Potential of Lesser-Known Python Libraries • 5 Ways You Can Use ChatGPT’s Code Interpreter For Data Science • GPT-4: 8 Models in One; The Secret is Out


Missing Data Demystified: The Absolute Primer for Data Scientists

Towards Data Science

Data Quality Chronicles: missing data, missing mechanisms, and missing data profiling. Missing data prevents data scientists from seeing the entire story the data has to tell. Sometimes, even the smallest pieces of information can provide a completely unique view of the world. Earlier this year, I started a piece on several data quality issues (or characteristics) that heavily compromise our machine learning models.


Activating Data from the Lakehouse: Databricks Ventures Invests in Hightouch

databricks

It’s no secret that modern organizations are doubling down on their investments in data - investments that uncover deep customer insights that provide a.


The Definitive Entity Resolution Buyer’s Guide

Are you thinking of adding enhanced data matching and relationship detection to your product or service? Do you need to know more about what to look for when assessing your options? The Senzing Entity Resolution Buyer’s Guide gives you step-by-step details about everything you should consider when evaluating entity resolution technologies. You’ll learn about use cases, technology and deployment options, top ten evaluation criteria and more.


How Games Typically Get Built

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of four topics from the past newsletter issue, Game Development Basics. To get the full issues, twice a week, subscribe here.


ELT vs ETL: Unveiling the Differences and Similarities

Analytics Vidhya

In today’s data-driven world, seamless data integration plays a crucial role in driving business decisions and innovation. Two prominent methodologies have emerged to facilitate this process: Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT). In this article, we will discuss ELT vs ETL, comparing their characteristics, benefits, and suitability for various use cases. […]
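
The difference in ordering between the two approaches named in the excerpt above can be illustrated with a small sketch. This is not taken from the linked article: the sample rows, table names, and helper functions (parse_amount, run_etl, run_elt) are all hypothetical, and an in-memory SQLite database merely stands in for a data warehouse.

```python
import sqlite3

# Hypothetical raw input: order IDs with amounts as messy text.
RAW_ORDERS = [("a1", " 19.99 "), ("a2", "5.00"), ("a3", "n/a")]


def parse_amount(text):
    """Return the amount as a float, or None if it is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def run_etl(conn, rows):
    """ETL: transform in application code, then load only the clean rows."""
    conn.execute("CREATE TABLE orders_etl (order_id TEXT, amount REAL)")
    cleaned = [(order_id, parse_amount(amount)) for order_id, amount in rows]
    cleaned = [row for row in cleaned if row[1] is not None]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?)", cleaned)


def run_elt(conn, rows):
    """ELT: load the raw rows untouched, then transform with SQL in the store."""
    conn.execute("CREATE TABLE orders_raw (order_id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?)", rows)
    conn.execute(
        """
        CREATE TABLE orders_elt AS
        SELECT order_id, CAST(TRIM(amount) AS REAL) AS amount
        FROM orders_raw
        WHERE TRIM(amount) GLOB '[0-9]*'  -- crude numeric check for this sketch
        """
    )


conn = sqlite3.connect(":memory:")
run_etl(conn, RAW_ORDERS)
run_elt(conn, RAW_ORDERS)

etl_rows = conn.execute("SELECT * FROM orders_etl ORDER BY order_id").fetchall()
elt_rows = conn.execute("SELECT * FROM orders_elt ORDER BY order_id").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM orders_raw").fetchone()[0]
```

Both paths end with the same cleaned table here. The practical difference in this sketch is that the ELT path also keeps the untransformed rows (orders_raw) in the store, so the transformation can be rerun or changed later without re-extracting the source.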


Robinhood Wallet Adds Support for Bitcoin and Dogecoin, and Enables Ethereum Swaps

Robinhood

Bitcoin and Dogecoin support is now available to all Robinhood Wallet users, and in-app Ethereum Swaps started rolling out today. Since launching to the general public nearly six months ago, Robinhood Wallet has seen significant adoption globally, with hundreds of thousands of users in more than 140 countries worldwide. We are always gathering feedback, and have heard loud and clear that people want access to more coins on more chains.


The Burtch Works 2023 Data Science & AI Professionals Salary Report is Here!

KDnuggets

The Burtch Works 2023 Data Science & AI Professionals salary report is here, and includes insightful data such as hiring and marketplace trends, compensation changes over time, and salary data. Get your copy here.


Data driven Snowflake optimisation at HelloFresh

Medium Data Engineering

“We saved 30% in our Snowflake warehouse costs by analysing the impact of different warehouse configurations, by building a custom…


16+ fascinating Big data examples

InData Labs

The world is generating an unprecedented amount of data every second. From online transactions and social media interactions to sensor readings and scientific research, the sheer volume, velocity, and variety of data have given rise to the concept of “Big data.” This vast ocean of information holds immense potential, capable of revolutionizing industries, driving innovation,


A senior engineer/EM job search story

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of five topics from today’s subscriber-only The Pulse issue. To get full issues twice a week, subscribe here.


Organizing Generative AI Teams: 5 Lessons Learned From Data Science

Monte Carlo

You did it! After executive leadership vaguely promised stakeholders that new Gen AI features would be incorporated across the organization, your tiger team sprinted to produce an MVP that checks the box. Integrating that OpenAI API into your application wasn’t that difficult, and it may even turn out to be useful. But now what happens? Tiger teams can’t sprint forever.


Dashboard Design That Dazzles Your CEO

FreshBI

Understanding the CEO’s Design Perspective: To design a dashboard suited to your CEO, you need to think like a CEO and get into the CEO’s mind. If anyone on the team understands the importance of good design, it's often the CEO. CEOs understand and prioritize good design so well that they often take it for granted that the products they build, and the products they surround themselves with, are designed well, for beauty and for function.


5 Skills All Marketing Analytics and Data Science Pros Need Today

KDnuggets

Join us at the MADS conference in Washington, D.C., from Sept. 26 to 28, 2023. Learn more here and register with code KDN100 for $100 off your conference pass.


Streamlit and MongoDB: Storing Your Data in the Cloud

Medium Data Engineering

Deploying your Streamlit app to the cloud means that any data you create with that app disappears when the app terminates, unless…


Using MLflow AI Gateway and Llama 2 to Build Generative AI Apps

databricks

To build customer support bots, internal knowledge graphs, or Q&A systems, customers often use Retrieval Augmented Generation (RAG) applications which leverage pre-trained models.


Are reports of StackOverflow’s fall greatly exaggerated?

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of five topics from today’s subscriber-only The Pulse issue. To get full issues twice a week, subscribe here.


What is Data Observability? 5 Key Pillars To Know

Monte Carlo

Editor’s Note: So much has happened since we first published this post and created the data observability category and Monte Carlo in 2019. We have updated this post to reflect this rapidly maturing space. You can read the original article linked at the bottom of this page. What is data observability? The five pillars. My data observability definition has not changed since I first coined it in 2019: Data observability refers to an organization’s comprehensive understanding of the health an


A Step-by-Step Guide to Building an Effective Data Quality Strategy from Scratch

Towards Data Science

How to build an interpretable data quality framework based on user expectations. As data engineers, we are (or should be) responsible for the quality of the data we provide. This is nothing new, but every time I join a data project I ask myself the same questions: When should I start working on data quality?


Top Posts August 14-20: How to Use ChatGPT to Convert Text into a PowerPoint Presentation

KDnuggets

How to Use ChatGPT to Convert Text into a PowerPoint Presentation • 5 Ways You Can Use ChatGPT’s Code Interpreter For Data Science • Forget ChatGPT, This New AI Assistant Is Leagues Ahead and Will Change the Way You Work Forever • Python Vector Databases and Vector Indexes: Architecting LLM Apps • 3 Ways to Access GPT-4 for Free


Empowering Insights: The Dynamic Duo of #DataAnalytics and #InformaticaDeveloper

Medium Data Engineering

In today’s digitally driven world, information is power, and harnessing that power requires the synergistic collaboration of two key roles…


Leveraging The Powers of Functional Code - Part 2

Booking.com Engineering

The Fully Functional Haskell Solution. Part one can be found here: [link]. The Solution: Regarding the Haskell code, don’t worry if you don’t understand everything. I am going to explain the main points of it by drawing a parallel to the Java implementation. If you are curious about FP, I cannot recommend this book enough, and the online version is free: [link]. It is a pleasant read with lots of humor (just the illustrations by themselves make me


Precisely Women in Technology: Meet Monica Di Martino

Precisely

With an increasing number of women joining IT, it’s becoming a more inclusive environment. Precisely is committed to building a more inclusive work environment, which is why there are ample opportunities for women in the organization. One of the company’s initiatives is the Precisely Women in Technology (PWIT) program, which was established to be a place for women to come together, support each other, offer guidance, and more.


Sunrise: Zalando's developer platform based on Backstage

Zalando Engineering

Since 2021, Zalando has invested in building a developer portal called Sunrise, which aims to become the starting point for Builders at Zalando. The portal is based on Spotify's Backstage platform with additional extensions built internally. Sunrise enables everyone at Zalando to view and discover information about teams, applications, APIs, events, CI/CD pipelines, infrastructure accounts and costs, and much more.


6 Essential Features for Enterprise Data Platforms: An Insight

Snowflake

In today’s digital age, the growth and success of an enterprise rely heavily on how it manages and leverages its data. There are multiple enterprise data platforms in the market, each offering its distinct capabilities. However, when it comes to enterprise-grade requirements, certain key features are indispensable. In this blog post, we will delve into six such capabilities – comprehensive cross-cloud replication, zero copy database and schema clone, collation support, stored procedures, mu


Who Will Make Money from the Generative AI Gold Rush?

KDnuggets

Buckle up for the Generative AI gold rush! Will BigTech rule with its picks and shovels? Which startups will strike it rich? Will “copilot for X” be the business strategy to hit pay dirt? How can startups dig moats to keep out other prospectors? And will the US once again have the richest gold seams?


Finops: Journey to improve and manage the cost of BigQuery

Medium Data Engineering

Since we started the FinOps project some time ago, we have been able to monitor BigQuery costs better, and it has also brought about transparency…