Blog, Data Lake, Metadata and Structured Data

Blog

Data Lake

Metadata

Structured Data

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

Snowflake

NOVEMBER 2, 2023

Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. These patterns include both centralized storage patterns like data warehouse , data lake and data lakehouse , and distributed patterns such as data mesh.

Data Lake

Data Lake Data Warehouse Cloud Unstructured Data

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData: Data Engineering

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lake

Data Lake Process Metadata Data Warehouse

Join 16,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Trending Sources

5 Reasons Data Discovery Platforms Are Best For Data Lakes

Monte Carlo

APRIL 1, 2021

Over the past few years, data lakes have emerged as a must-have for the modern data stack. But while the technologies powering our access and analysis of data have matured, the mechanics behind understanding this data in a distributed environment have lagged behind. Data discovery tools and platforms can help.

Data Lake

Data Lake Unstructured Data Data Warehouse Metadata

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Taking Charge of Tables: Introducing OpenHouse for Big Data Management

LinkedIn Engineering

JULY 19, 2023

Open source data lakehouse deployments are built on the foundations of compute engines (like Apache Spark, Trino, Apache Flink), distributed storage (HDFS, cloud blob stores), and metadata catalogs / table formats (like Apache Iceberg, Delta, Hudi, Apache Hive Metastore). Tables are governed as per agreed upon company standards.

Big Data

Big Data Data Management Management Metadata

Migrate Hive data from CDH to CDP public cloud

Cloudera

JUNE 25, 2021

Using easy-to-define policies, Replication Manager solves one of the biggest barriers for the customers in their cloud adoption journey by allowing them to move both tables/structured data and files/unstructured data to the CDP cloud of their choice easily. CDP Data Lake cluster versions – CM 7.4.0,

Cloud

Cloud Data Lake Cloud Storage Metadata

20 Latest AWS Glue Interview Questions and Answers for 2023

ProjectPro

JANUARY 24, 2023

If you are preparing for your ETL developer or data engineer interview , you must possess a solid fundamental knowledge of AWS Glue, as you’re likely to get asked questions that test your ability to handle complex big data ETL tasks. You can leverage AWS Glue to discover, transform, and prepare your data for analytics.

AWS

AWS Data Lake Scala ETL Tools

Data Engineering Zoomcamp – Data Ingestion (Week 2)

Hepta Analytics

FEBRUARY 14, 2022

DE Zoomcamp 2.2.1 – Introduction to Workflow Orchestration Following last weeks blog , we move to data ingestion. We already had a script that downloaded a csv file, processed the data and pushed the data to postgres database. This week, we got to think about our data ingestion design.

Data Ingestion

Data Ingestion Data Engineering Data Engineer Engineering

Data Vault on Snowflake: Feature Engineering and Business Vault

Snowflake

MARCH 30, 2023

Data Vault as a practice does not stipulate how you transform your data, only that you follow the same standards to populate business vault link and satellite tables as you would to populate raw vault link and satellite tables. Based on Tecton blog So is this similar to data engineering pipelines into a data lake/warehouse?

Engineering

Engineering Raw Data Data Science Scala

20 Best Open Source Big Data Projects to Contribute on GitHub

ProjectPro

NOVEMBER 15, 2021

Table of Contents 20 Open Source Big Data Projects To Contribute How to Contribute to Open Source Big Data Projects? 20 Open Source Big Data Projects To Contribute There are thousands of open-source projects in action today. This blog will walk through the most popular and fascinating open source big data projects.

Big Data

Big Data Project Metadata Programming Language

Snowflake Architecture and It's Fundamental Concepts

ProjectPro

JANUARY 31, 2022

Launched in 2014, Snowflake is one of the most popular cloud data solutions on the market. This blog walks you through what does Snowflake do , the various features it offers, the Snowflake architecture, and so much more. Table of Contents Snowflake Overview and Architecture What is Snowflake Data Warehouse?

Architecture

Architecture IT Data Warehouse Amazon Web Services

100+ Big Data Interview Questions and Answers 2023

ProjectPro

JANUARY 31, 2023

If you're looking to break into the exciting field of big data or advance your big data career, being well-prepared for big data interview questions is essential. Get ready to expand your knowledge and take your big data career to the next level! But the concern is - how do you become a big data professional?

Big Data

Big Data Hadoop AWS Relational Database

20+ Data Engineering Projects for Beginners with Source Code

ProjectPro

AUGUST 24, 2021

Data professionals who work with raw data like data engineers, data analysts, machine learning scientists , and machine learning engineers also play a crucial role in any data science project. And, out of these professions, this blog will discuss the data engineering job role.

Data Engineering

Data Engineering Data Engineer Coding Project

70+ Azure Interview Questions and Answers to Prepare in 2023

ProjectPro

DECEMBER 10, 2021

This blog covers the top 50 most frequently asked Azure interview questions and answers. Well, this Azure interview questions and answers blog will help you land your dream cloud computing job role! Azure Table Storage- Azure Tables is a NoSQL database for storing structured data without a schema.

BI Cloud Computing SQL Database

50 Artificial Intelligence Interview Questions and Answers [2023]

ProjectPro

OCTOBER 20, 2021

If you are unsure, be vocal about your thought process and the way you are thinking – take inspiration from the examples below and explain the answer to the interviewer through your learnings and experiences from data science and machine learning projects. It will explain what an instance of the best-in-class answers would sound like.

Machine Learning

Machine Learning Algorithm Government Data Science

Big Data Fabric Weaves Together Automation, Scalability, and Intelligence

Cloudera

JANUARY 22, 2019

Today’s data landscape is characterized by exponentially increasing volumes of data, comprising a variety of structured, unstructured, and semi-structured data types originating from an expanding number of disparate data sources located on-premises, in the cloud, and at the edge.

Big Data

Big Data NoSQL Data Lake Hadoop

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF) , the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP) , as a Data integration and Democratization fabric. Data and Metadata: Data inputs and data outputs produced based on the application logic.

Architecture

Architecture Metadata Government Kafka

Data Engineering Digest

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

Webinars

Trending Sources

5 Reasons Data Discovery Platforms Are Best For Data Lakes

Webinars

Taking Charge of Tables: Introducing OpenHouse for Big Data Management

Migrate Hive data from CDH to CDP public cloud

20 Latest AWS Glue Interview Questions and Answers for 2023

Data Engineering Zoomcamp – Data Ingestion (Week 2)

Data Vault on Snowflake: Feature Engineering and Business Vault

20 Best Open Source Big Data Projects to Contribute on GitHub

Snowflake Architecture and It's Fundamental Concepts

100+ Big Data Interview Questions and Answers 2023

20+ Data Engineering Projects for Beginners with Source Code

70+ Azure Interview Questions and Answers to Prepare in 2023

50 Artificial Intelligence Interview Questions and Answers [2023]

Big Data Fabric Weaves Together Automation, Scalability, and Intelligence

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Stay Connected