Upgrade your Modern Data Stack

Christophe Blefari

We jumped from HDFS to cloud storage (S3, GCS) for storage, and from Hadoop and Spark to cloud warehouses (Redshift, BigQuery, Snowflake) for processing. The best approach is to define what good documentation is and then enforce those requirements before going to production. But it looks way more expensive than BigQuery.

Data News — Week 24.12

Christophe Blefari

Python codebase with best practices to support MLOps: this is a GitHub repository with a lot, I mean a lot, of tools and tips to create a production-grade repository. This is a nice way to mix SQL and Python code. pip install data-stack: this is a title I could have written myself.

Data News — Week 23.28

Christophe Blefari

Fast News ⚡️ How we cut BigQuery costs 80% by hunting down costly queries — Mixpanel team hugely reduced their BigQuery spending. Mainly their implementation is a Protobuf Schema Registry and interface at event production and consumption. They use Fivetran, dbt and Census. Interesting to see.

Data News — Week 24.02

Christophe Blefari

I actually cover data engineering and how to put data stuff into production. dbt meta tag: a list of the companies having product features that depend on the meta tag. How BigQuery stores semi-structured data? Transfer data from BigQuery to Fabric with Arrow and Rust. It's quite a broad subject.

Data News — Week 24.03

Christophe Blefari

The students were incredibly calm, obviously my course is a bit difficult at the beginning because it touches on concepts that they are not used to—cloud, data in production, data engineering, etc. It deeply shows how OpenAI products are—or might be—used in order to win races. Palantir CEO: U.S.

Data News — Week 23.37

Christophe Blefari

At the same time, GitHub Research quantified GitHub Copilot’s impact on developer productivity and happiness: developer productivity is a difficult measure to compute. Also, productivity ≠ speed, but speed is important. It will become a nice product in the Collibra data governance ecosystem.

Data News — Week 23.42

Christophe Blefari

Still, dbt Labs announcements were mainly aimed at dbt Cloud, with great features to drive adoption of the paid product. They announced dbt Mesh, a product enabling cross-project dependencies for teams with multiple dbt projects. It is awesome to see this directly integrated within BigQuery, as it obviously brings simplicity.