
Data News — December 2023

Christophe Blefari

A lot of things have been happening at the same time in my professional and personal life. To be honest, everything's been going well, but I've found it hard to find time to write, among other things. On January 31st, I'll speak at a Modern Data Stack conference in Paris, again about DuckDB, but this time in French.


How DoorDash Migrated from StatsD to Prometheus

DoorDash Engineering

Accurate and reliable observability is essential when supporting a large distributed service, but this is only possible if your tools are equally scalable. We’ll briefly introduce StatsD’s history before diving into those specific issues. We can take advantage of the open-source data formats and query languages.


PinCompute: A Kubernetes Backed General Purpose Compute Platform for Pinterest

Pinterest Engineering

Each zone can have multiple member clusters, which strictly aligns with the failure domain defined by the cloud provider, and clearly defines fault isolation and operation boundaries for the platform to ensure availability and control blast radius. PinScaler is an abstraction that supports application auto scaling at Pinterest.


Access control for Azure ADLS cloud object storage

Cloudera

introduces fine-grained authorization for access to Azure Data Lake Storage using Apache Ranger policies. The audit log for the above operations, with details like time, user, path, operation, client IP address, cluster name, and the Ranger policy that authorized the access, is interactively available in the Apache Ranger console.


15 Machine Learning Projects GitHub for Beginners in 2023

ProjectPro

They can use chatbots, NLP-based conversational agents that can reply to queries automatically if appropriately trained. The task for your machine would be to correctly identify the category of the query and respond to it appropriately. It’s a fun activity that people usually do in their spare time and laugh at.


How We Optimized Rockset's Hot Storage Tier to Improve Efficiency By More Than 200%

Rockset

This blog describes how we optimized Rockset’s hot storage tier to improve efficiency by more than 200%. GB-month, making real-time data more affordable than ever before. Rockset’s hot storage layer: Rockset’s storage solution is an SSD-based cache layered on top of Amazon S3, designed to deliver consistent low-latency query responses.


Using CockroachDB to Reduce Feature Store Costs by 75%

DoorDash Engineering

Maintenance overheads of large-scale Redis clusters: If you read the prior blog post on our feature store (a must-read), you might be asking, ‘Why add another database?’ We quickly learned that upscaling our large Redis clusters (>100 nodes) was an extremely time-consuming process that was prone to errors and not scalable.