Sat.Mar 06, 2021 - Fri.Mar 12, 2021

Building a Data Engineering Project in 20 Minutes

Simon Späti

This post focuses on practical data pipelines with examples from web-scraping real estate listings, uploading them to S3 with MinIO, Spark and Delta Lake, adding some Data Science magic with Jupyter Notebooks, ingesting into the data warehouse Apache Druid, visualising dashboards with Superset and managing everything with Dagster. The goal is to touch on common data engineering challenges using promising new technologies, tools or frameworks, most of which I wrote about in Business Intelligence

Under the Hood of Real-Time Analytics with Apache Kafka and Pinot

Confluent

Real-time analytics has become the need of the hour for modern internet companies. The ability to derive internal insights around business metrics, user growth and adoption as well as security […].

Kafka 144

Leave Your Data Where It Is And Automate Feature Extraction With Molecula

Data Engineering Podcast

Summary: A majority of the time spent in data engineering is copying data between systems to make the information available for different purposes. This introduces challenges such as keeping information synchronized, managing schema evolution, and building transformations to match the expectations of the destination systems. H.O. Maycotte was faced with these same challenges but at a massive scale, leading him to question if there is a better way.

IT 100

Enterprise Data Operating Systems in the Cloud: Necessary, But Not Sufficient

Teradata

Getting your Cloud data architecture right starts with understanding which data products you need, the roles they perform, & the functional & non-functional characteristics that those roles demand.

Cloud 110

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

How to Tune RocksDB for Your Kafka Streams Application

Confluent

Apache Kafka ships with Kafka Streams, a powerful yet lightweight client library for Java and Scala to implement highly scalable and elastic applications and microservices that process and analyze data […].

Kafka 130

More Trending

ConsoleMe: A Central Control Plane for AWS Permissions and Access

Netflix Tech

By Curtis Castrapel, Patrick Sanders, and Hee Won Kim. At AWS re:Invent 2020, we open sourced two new tools for managing multi-account AWS permissions and access. We’re very excited to bring you ConsoleMe (pronounced: kuhn-soul-mee), and its CLI utility, Weep (pun intended)! If you missed the talk, check it out here.

AWS 97

All That Glitters is Not Gold!

Teradata

All companies want a golden data analytics platform. But instead of looking at the real properties of the platform, they are often misled by its shine & look. Find out more.

Integrating Apache Kafka Clients with CNCF Jaeger at Funding Circle Using OpenTelemetry

Confluent

At Funding Circle, we rely heavily on Kafka as the main piece of infrastructure to enable our event-driven-based microservices architecture. There are numerous organizational benefits of microservices, however a key […].

Kafka 82

#ClouderaLife Spotlight: Karen Ji, Senior Manager, Customer Operations

Cloudera

Karen Ji is Cloudera’s Senior Manager, Customer Operations, ensuring the success of our global customers, with a regional focus on China and Korea. “Multi-tasking is my superpower.” Karen joined Cloudera as a Solutions Engineer before switching to lead customer support. On a daily basis Karen collaborates across the business with different functions, but works extremely closely with the field, sales and professional services teams to ensure that customers have the support and insights they need to

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Gartner – Top Trends in Data and Analytics for 2021: XOps

DataKitchen

Gartner identified XOps (DataOps, ModelOps, DevOps) as one of the top trends in data and analytics for 2021. Below we provide additional suggestions for further reading based on Gartner’s recommendations. What is XOps? Gartner: “The multiplication of Ops disciplines stemming out of DevOps best practices has caused significant confusion in the marketplace.

How to Get Your Cloud Analytic Architecture Right

Teradata

Getting your Cloud data architecture right starts with understanding which data products you need, the roles they perform, & the functional & non-functional characteristics that those roles demand.

CRM System Rate Limiting Overview

Grouparoo

Rate limiting is the method by which an API limits how many calls its clients can make. When creating a data sync implementation with an API, it's important to adapt to the approach that the remote system takes. Whether stated or not, all systems have a rate limit. Even if not addressed explicitly, there is still some finite number of parallel connections that a set of servers can handle.
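
One common way for a sync client to stay under a remote system's limit is a token bucket. The sketch below is illustrative only and is not taken from the Grouparoo post: the `TokenBucket` class, its `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for the example, and a real integration would use whatever limit the CRM actually documents.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical client-side limiter: about `rate` calls per second,
    with bursts of up to `capacity` calls."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum tokens held at once
        self.clock = clock            # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        # Spend one token if available; otherwise the caller should hold off.
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self) -> float:
        # Seconds until at least one token is available (0.0 if one already is).
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

A sync loop would call `try_acquire()` before each API request and sleep for `wait_time()` when it returns `False`. Keeping the limiter on the client side means the sync backs off before the remote system has to reject calls.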

Systems 52

RippleNet Engineering's Inclusive Language Initiative: Part 2

Ripple Engineering

Welcome back to the second post of this Inclusive Language blog series! Previously, we contextualized the importance of eliminating terms with problematic and racist origins from our codebase, such as “master” and “slave”, or “blacklist” and “whitelist”. We then suggested replacing them with equally clear and more agreeable words such as “primary” and “secondary”, “denylist” and “allowlist

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

6 security risks in software development and how to address them

DataKitchen

The post 6 security risks in software development and how to address them first appeared on DataKitchen.

Banco Bradesco

Teradata

Vantage scales in-database R/Python models on 70M clients. The customer analytics are transforming Bradesco to become the bank of the future, scaling insights and accelerating time-to-value.

Banking 52

Monte Carlo Launches Chief Data Officer Advisory Board

Monte Carlo

Today, I am proud to announce the formation of Monte Carlo’s Chief Data Officer (CDO) advisory board. The advisory board was launched to help Monte Carlo and the emerging data observability market better serve customers on their journeys to data trust, advise their product roadmap, and pioneer the data observability category. This announcement comes just weeks after our $25M Series B funding round this February, led by Redpoint Ventures, backers of Snowflake and Looker, and GGV Capital, in

Building Database Connectors for Superset Using SQLAlchemy

Preset

Superset can integrate with almost any SQL speaking database because of SQLAlchemy, Python DB-API 2, and some minimal custom logic.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Should You Build or Buy a DataOps Solution?

DataKitchen

The post Should You Build or Buy a DataOps Solution? first appeared on DataKitchen.

The Future: Seamless Journey to Invisible Payments

Teradata

The future of payments is rapidly evolving toward seamless omni-channel customer journeys and ultimately, payments becoming invisible. Find out more.

Towards a Data Mesh (part 1) : Data Domains and Teams Topologies.

François Nguyen

Just an illustration – not the truth and we will pivot if it does not work. I discovered Zhamak Dehghani’s first article about Data Mesh in August 2020. Thanks to YouTube, you have the live illustration in this video with even more context and explanations. And then, you have this second video that is an introduction to her second article (December 2020).

Remote Workstations for the Discerning Artists

Netflix Tech

By Michelle Brenner. Netflix is poised to become the world’s most prolific producer of visual effects and original animated content. To meet that demand, we need to attract the world’s best artistic talent. Artists like to work at places where they can create groundbreaking entertainment instead of worrying about getting access to the software or source files they need.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Top Three Requirements for Data Flows

Cloudera

Data flows are an integral part of every modern enterprise. No matter whether they move data from one operational system to another to power a business process or fuel central data warehouses with the latest data for near-real-time reporting, life without them would be full of manual, tedious and error-prone data modification and copying tasks. At Cloudera, we’re helping our customers implement data flows on-premises and in the public cloud using Apache NiFi, a core component of Cloudera DataFl

Cloud 119

Micro Frontends: from Fragments to Renderers (Part 1)

Zalando Engineering

In 2015, we wanted to improve how we delivered features to customers and move away from a monolithic shop system. Project Mosaic and its microservices approach for the frontend were vital to support this transition. Mosaic enabled a relatively large number of teams to work on the main Zalando website independently and without performance compromises.

7 Data Engineering Trends to Watch

Silectis

The importance of data engineering is on the rise, with organizations increasingly investing in talent and infrastructure. Here at Silectis, we are in the fortunate position of working with a wide range of enterprises across multiple industries. I caught up with a few members of the team to take note of some of the data engineering trends we anticipate seeing more of this year and beyond. 1.

Production Media Management: Transforming Media Workflows by leveraging the Cloud

Netflix Tech

Written by Anton Margoline, Avinash Dathathri, Devang Shah and Murthy Parthasarathi. Credit to Netflix Studio’s Product, Design, Content Hub Engineering teams along with all of the supporting partner and platform teams. In this post, we will share a behind-the-scenes look at how Netflix delivers technology and infrastructure to help production crews create and exchange media during production and post production stages.

Media 68

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

Cloudera celebrates International Women’s Day – Sharing experiences and our voices from around the globe

Cloudera

Cloudera is happy to be an official supporter of International Women’s Day 2021. We at Cloudera believe in the undeniable power of data to build a more equitable future, and we are humbled to be building the products that make it possible for data to change the world for the better. The theme of this year’s IWD is #ChooseToChallenge. As we celebrate the social, economic, cultural, and political achievements of women, we’re building a foundation for our future young women, raising awareness ab

5 Tips for Recruiting Top Engineering Talent in Startups

Rockset

“Two of the most important things as a CEO of a company are to make sure you have money in the bank and recruit amazing people.” - Venkat Venkataramani, CEO and Co-Founder of Rockset. We hosted a Clubhouse event with VPs of Engineering from Gusto and Robinhood, Nimrod Hoofien and Adam Wolff, on their tips for recruiting top engineering talent in startups.

The Future of Business Intelligence is Open Source

Maxime Beauchemin

While “software is [still actively] eating the world”, it’s also clear that open source is taking over software. Simply put, open source is a superior approach to building and distributing software because it provides important guarantees around how software can be discovered, tried, operated, collaborated on and packaged. For those reasons, it is not surprising that it has taken over most of the modern data stack: infrastructure, databases, orchestration, data processing, AI/ML and beyond.