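Apache Kafka is a community-developed, distributed streaming platform capable of handling trillions of events a day. Kafka is an abstraction of a distributed commit log and was initially conceived as a messaging queue. Since it was developed as open source at LinkedIn in 2011, it has quickly evolved from a messaging queue into a full-fledged streaming platform.
Founded by the team of developers who created Apache Kafka, Confluent offers the most complete distribution of Kafka in the form of Confluent Platform. Confluent Platform improves Kafka further by adding community and commercial features, enhancing the streaming experience of operators and developers in production, at scale.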
At its heart lies the humble, immutable commit log. You can subscribe to that log and publish its data to any number of systems or real-time applications. Unlike messaging queues, Kafka is a highly scalable, fault-tolerant distributed system, which has allowed it to be deployed for applications such as managing passenger and driver matching at Uber, providing real-time analytics and predictive maintenance for British Gas’ smart home, and performing numerous real-time services across all of LinkedIn. This performance lets it scale from a single app to company-wide use.
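The idea of an immutable log with independent subscribers can be sketched with a toy in-memory model. This is not Kafka's API: the `CommitLog` and `Subscriber` classes and the event names below are hypothetical, chosen only to illustrate that records are appended once, identified by offset, and read by each subscriber at its own pace.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Toy append-only log (hypothetical stand-in, not a Kafka class)."""
    _records: list = field(default_factory=list)

    def append(self, record):
        """Append a record and return its offset; earlier records never change."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


@dataclass
class Subscriber:
    """Hypothetical reader that tracks its own position in the log."""
    log: CommitLog
    offset: int = 0

    def poll(self, max_records=10):
        """Read the next batch and advance this subscriber's offset only."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = CommitLog()
for event in ["ride_requested", "driver_matched", "ride_started"]:
    log.append(event)

# Two subscribers read the same log without affecting each other.
analytics = Subscriber(log)
search_index = Subscriber(log)

first_batch = analytics.poll(max_records=2)
full_batch = search_index.poll()
```

Because reading does not remove anything from the log, a new subscriber can be attached later and start from offset 0, which is the property that makes the data replayable.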
As an abstraction of the distributed commit log commonly found in distributed databases, Apache Kafka provides durable storage. Kafka can act as a ‘source of truth’ because it can distribute data across multiple nodes for a highly available deployment within a single data center or across multiple availability zones.
A streaming platform would not be complete without the ability to manipulate data as it arrives. The Streams API within Apache Kafka is a powerful, lightweight library that allows for on-the-fly processing, letting you aggregate, define windows, perform joins of data within a stream, and more. Perhaps best of all, a Streams program is built as an ordinary Java application on top of Kafka, keeping your workflow intact with no extra clusters to maintain.
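To make windowed aggregation concrete, here is a minimal sketch of a tumbling-window count in plain Python. It does not use the Streams API, which is a Java library. The function name `windowed_counts`, the 60-second window, and the sample page-view events are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size, for illustration only


def windowed_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (key, window start).

    events: iterable of (timestamp_seconds, key) pairs, processed one at a
    time in arrival order, as a stream processor would see them.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of its fixed-size (tumbling) window.
        window_start = timestamp - (timestamp % window_seconds)
        counts[(key, window_start)] += 1
    return dict(counts)


page_views = [
    (5, "home"),
    (42, "home"),
    (61, "pricing"),
    (75, "home"),
    (119, "pricing"),
]
result = windowed_counts(page_views)
# result maps ("home", 0) to 2, ("pricing", 60) to 2, and ("home", 60) to 1
```

A real stream processor also has to deal with late or out-of-order events and with keeping this state fault-tolerant, which this sketch leaves out.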
Apache Kafka is a popular tool for developers because it is easy to pick up and provides a powerful streaming platform complete with four APIs: Producer, Consumer, Streams, and Connect.
Often, developers will begin with a single use case. This could be using Apache Kafka as a message buffer to protect a legacy database that can’t keep up with today’s workloads, using the Connect API to keep that database in sync with an accompanying search indexing engine, or processing data as it arrives with the Streams API to surface aggregations right back to your application.
In short, Apache Kafka and its APIs make building data-driven apps and managing complex back-end systems simple. Kafka gives you peace of mind, knowing your data is always fault-tolerant, replayable, and real-time. It helps you build quickly by providing a single streaming platform to process, store, and connect your apps and systems with real-time data.