Big Data

A Deep Dive into the Latest Performance Improvements of Stateful Pipelines in Apache Spark Structured Streaming

This post is the second part of our two-part series on the latest performance improvements of stateful pipelines. The first part of this...

Using Streams Replication Manager Prefixless Replication for Kafka Topic Aggregation

Posted in Technical | February 28, 2024 | 9 min read. Businesses often need to aggregate topics because it is essential for organizing, simplifying,...

Securing the Digital Frontier: Effective Threat Exposure Management

AI technology is radically changing the direction of the cybersecurity sector. Companies around the world are expected to spend $102.78 billion on AI...

Advancing Sparse LVLMs for Improved Efficiency

The ever-evolving landscape of artificial intelligence has presented an intersection of visual and linguistic data through large vision-language models (LVLMs). MoE-LLaVA is one...

Introducing Amazon MWAA support for Apache Airflow version 2.8.1

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache...

Performance Improvements for Stateful Apache Spark Structured Streaming pipelines

Apache Spark™ Structured Streaming is a popular open-source stream processing platform that provides scalability and fault tolerance, built on top of the Spark...

Enhance RAG Performance with CRAG

In this article we will learn to enhance RAG performance with CRAG. The word RAG has been floating around for a while and...

Building a Robust Data Governance Framework for Organizational Success

Rising cybercriminal activities targeting enterprise intelligence assets and misusing individuals' personally identifiable information (PII) highlight the need for improved corporate data governance standards....

Latest articles