Big Data

Streams Replication Manager Prefixless Replication

Posted in Technical | January 31, 2024 | 7 min read

Replication is a crucial capability in distributed systems to address challenges related to...

Python List extend() Explained Simply

Python’s strength lies in its simplicity, offering developers a versatile toolkit. The extend() method, nestled within the list data structure, stands out as...

Combine transactional, streaming, and third-party data on Amazon Redshift for financial services

Financial services customers are using data from different sources that originate at different frequencies, which...

Faster Lakeview dashboards with Materialized Views

In this blog post, we will share how you can use Databricks SQL Materialized Views with Lakeview dashboards to deliver fresh data and...

Mastering Day 2 Operations with Cloudera

Posted in Technical | February 01, 2024 | 5 min read

Delivering transformational innovation and accurate business decisions requires harnessing the full potential of...

Mastering Python’s Random Module

The Python random module is a built-in module that provides functionality for random number generation. It is often used in scenarios where...

Preprocess and fine-tune LLMs quickly and cost-effectively using Amazon EMR Serverless and Amazon SageMaker

Large language models (LLMs) are becoming increasingly popular, with new use cases constantly being explored. In general, you can build applications powered by...

OLMo is Here, Powered by Databricks

As Chief Scientist (Neural Networks) at Databricks, I lead our research team toward the goal of giving everyone the ability to build and...

Latest articles