How to learn data engineering

Christophe Blefari

Hadoop initially led the way with Big Data and on-premise distributed computing, before the field finally landed on the Modern Data Stack, in the cloud, with a data warehouse at the center. To understand today's data engineering, I think it is important to know at least the Hadoop concepts and context, along with computer science basics.

Exploring The Insights And Impact Of Dan Delorey's Distinguished Career In Data

Data Engineering Podcast

How well did the Drill project capture the core principles of Dremel as outlined in the eponymous white paper? How has this approach evolved? What are some challenges with this approach?

Data Serialization Formats with Doug Cutting and Julien Le Dem - Episode 8

Data Engineering Podcast

Contact Information
Doug: cutting on GitHub, Blog, @cutting on Twitter
Julien: Email, @J_ on Twitter, Blog, julienledem on GitHub
Links: Apache Avro, Apache Parquet, Apache Arrow, Hadoop, Apache Pig, Xerox Parc, Excite, Nutch, Vertica, Dremel White Paper, Twitter Blog on Release of Parquet, CSV, XML, Hive, Impala, Presto, Spark SQL, Brotli, ZStandard, Apache Drill, Trevni, Apache (..)

Hadoop 100

Recap of Hadoop News for October

ProjectPro

News on Hadoop, October 2016. Microsoft upgrades Azure HDInsight, its Hadoop Big Data offering. SiliconAngle.com, October 2, 2016. (..) Azure HDInsight is a managed Hadoop service that lets users deploy and manage Hadoop clusters on the Azure Cloud. Microsoft and Hortonworks Inc. (..) Trends in Big Data Research.

Hadoop 40

Unlock Answers to the Top Questions- What is Big Data and what is Hadoop?

ProjectPro

Big data and Hadoop are catchphrases these days in the tech media for describing the storage and processing of huge amounts of data. Over the years, big data has been defined in various ways, and there is a lot of confusion surrounding the terms big data and Hadoop. (..) Big Deal Companies are striking with Big Data Analytics (..) What is Hadoop?

Hadoop 52

The Good and the Bad of Apache Kafka Streaming Platform

AltexSoft

(..) popular SQL and NoSQL database management systems, including Oracle, SQL Server, Postgres, MySQL, MongoDB, Cassandra, and more; cloud storage services such as Amazon S3, Azure Blob, and Google Cloud Storage; message brokers such as ActiveMQ, IBM MQ, and RabbitMQ; Big Data processing systems like Hadoop; and (..) Kafka vs Hadoop.

Kafka 93

What is the Learning Path to Become an AWS Certified Solutions Architect Associate?

Knowledge Hut

Preparing and presenting test plans, presentations, reports, analysis briefings, and white papers. This includes powerful and simple bucket storage like S3, relational database service, and Hadoop clusters. Now, you have to do some reading, go through FAQs and white papers. Also, study the best AWS practices.

AWS 52