Remove modernizing-data-pipelines-using-cloudera-data-platform-part-1
article thumbnail

Modernizing Data Pipelines using Cloudera Data Platform – Part 1

Cloudera

Data pipelines are in high demand in today’s data-driven organizations. As critical elements in supplying trusted, curated, and usable data for end-to-end analytic and machine learning workflows, the role of data pipelines is becoming indispensable.

article thumbnail

The Future Is Hybrid Data, Embrace It

Cloudera

We live in a hybrid data world. In the past decade, the amount of structured data created, captured, copied, and consumed globally has grown from less than 1 ZB in 2011 to nearly 14 ZB in 2020. Impressive, but dwarfed by the amount of unstructured data, cloud data, and machine data – another 50 ZB.

IT 110
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Introducing Self-Service, No-Code Airflow Authoring UI in Cloudera Data Engineering

Cloudera

Airflow has been adopted by many Cloudera Data Platform (CDP) customers in the public cloud as the next generation orchestration service to setup and operationalize complex data pipelines. Until now, the setup of such pipelines still required knowledge of Airflow and the associated python configurations.

Coding 117
article thumbnail

Next Stop – Predicting on Data with Cloudera Machine Learning

Cloudera

This is part 4 in this blog series. You can read part 1 here and part 2 here , and watch part 3 here. This blog series follows the manufacturing and operations data lifecycle stages of an electric car manufacturer – typically experienced in large, data-driven manufacturing companies.

article thumbnail

Supercharge your Airflow Pipelines with the Cloudera Provider Package

Cloudera

Many customers looking at modernizing their pipeline orchestration have turned to Apache Airflow, a flexible and scalable workflow manager for data engineers. A provider could be used to make HTTP requests, connect to a RDBMS, check file systems (such as S3 object storage), invoke cloud provider services, and much more.

Python 97
article thumbnail

Fraud Detection With Cloudera Stream Processing Part 2: Real-Time Streaming Analytics

Cloudera

In part 1 of this blog we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, can make it easy to acquire data from wherever it originates and move it efficiently to make it available to other applications in a streaming fashion.

Process 88
article thumbnail

Digital Transformation is a Data Journey From Edge to Insight

Cloudera

Most of what is written though has to do with the enabling technology platforms (cloud or edge or point solutions like data warehouses) or use cases that are driving these benefits (predictive analytics applied to preventive maintenance, financial institution’s fraud detection, or predictive health monitoring as examples) not the underlying data.