This month the community has been focused on the upcoming release of Apache Kafka 0.10.1.0. Led by the fearless release manager, Jason Gustafson, we voted on a release plan, cut branches and started voting on the first release candidate. Please contribute to the community by downloading the release candidate, testing it out and letting everyone know how it went. If no serious bugs are found, we are hoping to finalize the release by mid-October.
In addition to the vote, we gave our website a quick facelift, a contribution from Derrick Or. We appreciate the feedback from the community, and the issues raised were quickly addressed.
And as usual, there are several very lively discussions in the community:
- KIP-74: Proposal to limit not just the amount of data a consumer fetch returns per partition, but also the total amount of data returned for each fetch request. This gives users better control over the memory usage of consumers. Even better, it allows consumers to make progress even if a partition contains messages larger than the maximum fetch size. This proposal has been merged and will be part of the 0.10.1.0 release.
- KIP-79: Proposal to add methods for searching by timestamp to the new consumer was accepted and merged. It will be included in the next release to everyone’s great joy.
- KIP-82: Proposal for adding headers to Kafka messages. This proposal is very popular because so many organizations are using headers internally. It is also controversial: the Kafka project has a long tradition of keeping the message completely unstructured and letting users and clients put whatever structure they need inside the message. Whatever the decision is, it will have a serious impact on the Apache Kafka ecosystem.
- KIP-83: A much-welcomed proposal that allows clients with different security configurations to be instantiated in the same JVM. Patches by Rajini Sivaram and Edoardo Comar are already available, and once integrated they will allow us to update MirrorMaker to support different security configurations on the source and target clusters.
- KIP-85: Allowing clients to take JAAS configurations dynamically rather than via a file. This will be huge for those of us implementing microservices in containers – adding files to containers has been very inconvenient.
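To make the two fetch limits from KIP-74 concrete, here is a minimal sketch of the consumer settings involved. It assumes the request-level limit is exposed as `fetch.max.bytes` alongside the existing per-partition `max.partition.fetch.bytes`. The class name, the helper method, the broker address, the group id and the byte values are all illustrative placeholders, not recommendations. The sketch only builds a plain `Properties` object and does not connect to a cluster.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer settings that cap fetch sizes
// both per partition and per fetch request (the KIP-74 idea).
final class FetchLimitConfigSketch {

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);

        // Upper bound on data returned for a single partition (illustrative: 1 MiB).
        props.setProperty("max.partition.fetch.bytes", Integer.toString(1024 * 1024));

        // Upper bound on data returned for the whole fetch request (illustrative: 50 MiB).
        props.setProperty("fetch.max.bytes", Integer.toString(50 * 1024 * 1024));

        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id, for illustration only.
        Properties props = consumerProps("localhost:9092", "example-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties would be passed to the consumer's constructor together with the deserializer settings, which are omitted here to keep the sketch free of external dependencies.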
In addition to the ongoing Kafka improvements, there is other interesting news, along with some blog posts worth reading:
- Google is talking about the use of Kafka in GCP and their new Kafka connectors.
- Good summary of the big announcements for the Streams community from Strata.
- Dean Wampler talks to O’Reilly about streams architecture.
- Tutorial at Strata showing how to build a customer 360 architecture using Apache Kafka, Spark Streaming and Kudu. One of the main takeaways is that modern data architectures no longer assume that all the data you need is found in one database. Instead, they solve the data integration problem.
- How to test Kafka Streams topologies – because testing is the most important part of development.
- From CapitalOne, a great StrangeLoop talk: Commander: Better Distributed Applications through CQRS, Event Sourcing, and Immutable Logs.
- We made recommendations on how to move to the cloud with Kafka and added enterprise features.
- Using MirrorMaker? Want to use the new Consumer? Here are some gotchas you want to be aware of.
- And for the theory-inclined: Fascinating paper on graph processing on streams.
If you are interested in learning all about streaming data platforms, Confluent has released a 6-part online talk series focusing on Apache Kafka. You can view the recordings for the first two talks in the series by Jay Kreps and Jun Rao, and register for the upcoming sessions at www.confluent.io/apache-kafka-talk-series.