How to test PySpark code with pytest

Start Data Engineering

1. Introduction 2. Ensure the code’s logic is working as expected with tests 2.1. Test types for data pipelines 2.2. pytest: A powerful Python library for testing 2.2.1. Set context, run code, check results & clean up 2.2.2. Tests are identified by their name 2.2.3. Use fixture to create fake data for testing 2.2.4. […]
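
The pytest ideas named in this outline (tests identified by name, a fixture that creates fake data, and the set context / run code / check results / clean up cycle) can be sketched as follows. This is a minimal illustration, not code from the linked article: `add_total` and `fake_orders` are hypothetical names, and plain Python rows stand in for a PySpark DataFrame so the sketch runs without a Spark session.

```python
import pytest


def add_total(rows):
    """Hypothetical function under test: add a `total` field (price * quantity)."""
    return [{**row, "total": row["price"] * row["quantity"]} for row in rows]


@pytest.fixture
def fake_orders():
    # Set context: build a small, fake input dataset for the test.
    rows = [
        {"order_id": 1, "price": 2.5, "quantity": 4},
        {"order_id": 2, "price": 10.0, "quantity": 1},
    ]
    yield rows
    # Clean up: runs after the test finishes, even if it failed.
    rows.clear()


# pytest collects this function because its name starts with `test_`.
def test_add_total(fake_orders):
    # Run code: call the function under test on the fake data.
    result = add_total(fake_orders)
    # Check results: compare against values computed by hand.
    assert [row["total"] for row in result] == [10.0, 10.0]
```

Saved in a file whose name starts with `test_`, this would typically be run with the `pytest` command. In a real PySpark test, the fixture would more likely create a `SparkSession` and a small DataFrame, and stop the session in the clean-up step.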

Top 20 Data Engineering Project Ideas [With Source Code]

Analytics Vidhya

Aspiring data engineers often seek real-world projects to gain hands-on experience and showcase their expertise. This article presents the top 20 data engineering project ideas with their source code. Whether you’re […]

Automating dead code cleanup

Engineering at Meta

Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing dead code. SCARF combines static and dynamic analysis of programs to detect dead code from both a business and programming language perspective. The results of these analyses are combined to form an augmented dependency graph.
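
The general idea of merging static and dynamic dependency information into one graph, then treating anything unreachable from known entry points as a dead-code candidate, can be sketched in a few lines. This is only an illustration of the concept under simple assumptions, not SCARF's implementation: `find_dead_code`, the edge dictionaries, and the symbol names are all hypothetical.

```python
from collections import deque


def find_dead_code(static_deps, dynamic_deps, entry_points):
    """Return symbols unreachable from the entry points in the merged graph.

    static_deps / dynamic_deps map a symbol to the symbols it uses, as found
    by (hypothetical) static analysis and runtime observation respectively.
    """
    # Merge both edge sets into one augmented dependency graph.
    graph = {}
    for deps in (static_deps, dynamic_deps):
        for source, targets in deps.items():
            graph.setdefault(source, set()).update(targets)
            for target in targets:
                graph.setdefault(target, set())

    # Breadth-first search from the entry points marks everything still in use.
    reachable = {entry for entry in entry_points if entry in graph}
    queue = deque(reachable)
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)

    # Whatever was never reached is a candidate for removal.
    return set(graph) - reachable


static_edges = {"main": ["parse", "render"], "legacy_export": ["format_csv"]}
dynamic_edges = {"parse": ["plugin_loader"]}  # seen only at runtime
dead = find_dead_code(static_edges, dynamic_edges, ["main"])
print(sorted(dead))  # ['format_csv', 'legacy_export']
```

Here `plugin_loader` survives only because of the runtime edge, which is the kind of case where static analysis alone would flag live code as dead.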

How to Normalize Relational Databases With SQL Code?

Analytics Vidhya

If a corrupted, unorganized, or redundant database is used, the results of the analysis may become inconsistent and highly misleading. So, we are […]

Monetizing Analytics Features: Why Data Visualizations Will Never Be Enough

Think your customers will pay more for data visualizations in your application? Five years ago they may have. But today, dashboards and visualizations have become table stakes. Discover which features will differentiate your application and maximize the ROI of your embedded analytics. Brought to you by Logi Analytics.

How to Develop Serverless Code Using Azure Functions?

Analytics Vidhya

Azure Functions is a serverless computing service from Azure that lets users write code that runs in response to a variety of events, without having to provision or manage infrastructure. Azure Functions allow developers […]

Code Review on Printed Paper: an Excerpt from the Twitoons Comic Book

The Pragmatic Engineer

Today’s newsletter closes with a full chapter from the Twitoons book, visualizing when Elon Musk demanded that all Twitter software engineers print out their code on paper (!!) and report for code review. A year ago, the end of October 2022 was a very turbulent time at Twitter.

New Study: 2018 State of Embedded Analytics Report

Why do some embedded analytics projects succeed while others fail? We surveyed 500+ application teams embedding analytics to find out which analytics features actually move the needle. Read the 6th annual State of Embedded Analytics Report to discover new best practices. Brought to you by Logi Analytics.

3 Challenges of Building Complex Dashboards with Open Source Components

Speaker: Ryan MacCarrigan, Founding Principal, LeanStudio

Many product teams use charting components and open source code libraries to get dashboards and reporting functionality quickly. But what happens when you have a growing user base and additional feature requests?