Announcing Kafka Connect: Building large-scale low-latency data pipelines
For a long time, a substantial portion of data processing that companies did ran as big batch jobs — CSV files dumped out of databases, log files collected at the
For a long time, a substantial portion of data processing that companies did ran as big batch jobs — CSV files dumped out of databases, log files collected at the
When Kafka was originally created, it shipped with a Scala producer and consumer client. Over time we came to realize many of the limitations of these APIs. For example, we
When we released Apache Kafka 0.9.0.0, we talked about all of the big new features we added: the new consumer, Kafka Connect, security features, and much more. What we didn’t
Apache Kafka is a high-throughput distributed message system that is being adopted by hundreds of companies to manage their real-time data. Companies use Kafka for many applications (real time stream
Apache Kafka has a data structure called the “request purgatory”. The purgatory holds any request that hasn’t yet met its criteria to succeed but also hasn’t yet resulted in an
I ran into the schema-management problem while working with my second Hadoop customer. Until then, there was one true database and the database was responsible for managing schemas and pretty
This post was jointly written by Neha Narkhede, co-creator of Apache Kafka, and Flavio Junqueira, co-creator of Apache ZooKeeper. Many distributed systems that we build and use currently rely on dependencies like
One of the things I realised while doing research for my book is that contemporary software engineering still has a lot to learn from the 1970s. As we’re in such
This post has been written in collaboration with Derrick Harris from Mesosphere and Joe Stein, a Kafka committer. For an updated version of this article, please see Apache Mesos, Apache Kafka and
Building operational simplicity into distributed systems, especially for nuanced behaviors, is somewhat of an art and often best achieved after gathering production experience. Apache Kafka‘s popularity can be attributed in
Use CL60BLOG to get an additional $60 of free Confluent Cloud