Data Engineering Annotated Monthly – June 2022

Big Data Tools

It made me think that the era of on-premises free Hadoop installations had come to an end. I’m actually happy that this has happened – Hadoop was there for me at the very beginning of my career and I have very positive feelings associated with it. The State of Data Engineering 2022 – I like this kind of content.

Brief History of Data Engineering

Jesse Anderson

Doug Cutting took those papers and created Apache Hadoop in 2005. They were the first companies to commercialize open source big data technologies and pushed the marketing and commercialization of Hadoop. Hadoop was hard to program, and Apache Hive came along in 2010 to add SQL. We lacked a scalable pub/sub system.

Data Engineering Annotated Monthly – May 2022

Big Data Tools

On top of that, it’s a part of the Hadoop platform, which created additional work that we otherwise would not have had to do. Kafka: Mark KRaft as Production Ready – One of the most interesting changes to Kafka from recent years is that it now works without ZooKeeper. Of course, the main topic is data streaming.

Data Engineering Annotated Monthly – September 2022

Big Data Tools

One of the use cases from the product page that stood out to me in particular was the effort to mirror multiple Kafka clusters in one Brooklin cluster! This practice can be extremely helpful, and in fact, famous, industry-changing open-source tools like Hadoop have been born out of it. This is no doubt very interesting.
