I’m pleased to announce the release of Apache Kafka 3.0 on behalf of the Apache Kafka® community. Apache Kafka 3.0 is a major release in more ways than one. Apache Kafka 3.0 introduces a variety of new features, breaking API changes, and improvements to KRaft—Apache Kafka’s built-in consensus mechanism that will replace Apache ZooKeeper™.
While KRaft is not yet recommended for production (list of known gaps), we have made many improvements to the KRaft metadata and APIs. Exactly-once and partition reassignment support are worth highlighting. We encourage you to check out KRaft’s new features and to try it out in a development environment.
Starting with Apache Kafka 3.0, the producer enables the strongest delivery guarantees by default (
enable.idempotence=true). This means that users now get ordering and durability by default.
Also, don’t miss the Kafka Connect task restart enhancements, KStreams improvements in timestamp-based synchronization, and MirrorMaker2’s more flexible configuration options.
To review the full list of features and enhancements, be sure to read the release notes. You can also watch the release video for a summary of what’s new in Apache Kafka 3.0.0:
Support for Java 8 is deprecated across all components of the Apache Kafka project in 3.0. This will give users time to adapt before the next major release (4.0), when Java 8 support is planned to be removed.
Support for Scala 2.12 is also deprecated everywhere in Apache Kafka 3.0. As with Java 8, we’re giving users time to adapt because support for Scala 2.12 is planned to be removed in the next major release (4.0).
A major feature that we are introducing with 3.0 is the ability for KRaft controllers and KRaft brokers to generate, replicate, and load snapshots for the metadata topic partition named
__cluster_metadata. This topic is used by the Kafka Cluster to store and replicate metadata information about the cluster like broker configuration, topic partition assignment, leadership, etc. As this state grows, Kafka Raft Snapshot provides an efficient way to store, load, and replicate this information.
Experience and continuous development since the first version of the Kafka Raft controller have surfaced the need to revise a few of the metadata record types that are used when Kafka is configured to run without ZooKeeper (ZK).
With 3.0 and KIP-730 the Kafka Controller is now completely taking over the responsibility of generating a Kafka producer ID. The Controller is doing so both in ZK and KRaft modes. This takes us closer to the bridge release, which will allow users to transition from Kafka deployments that use ZK to new deployments that use KRaft.
Starting with 3.0, the Kafka producer turns on by default idempotency and the acknowledgement of delivery by all of the replicas. This makes record delivery guarantees stronger by default.
The default value of the Kafka Consumer’s configuration property
session.timeout.ms is increased from 10 seconds to 45 seconds. This will allow the consumer to adapt better by default to transient network failures and avoid consecutive rebalances when a consumer appears to leave the group only temporarily.
Requesting the current offsets of a Kafka consumer group has been possible for quite some time. But fetching the offsets of multiple consumer groups requires an individual request for each group. In 3.0 and with KIP-709, the fetch and AdminClient APIs are extended to support reading the offsets of multiple consumer groups at the same time within a single request/response.
Supporting operations that can be applied to multiple consumer groups at the same time in an efficient way heavily depends on the ability of the clients to discover the coordinators of these groups efficiently. This becomes possible with KIP-699, which adds support for discovering the coordinators for multiple groups with one request. Kafka clients have been updated to use this optimization when talking to new Kafka brokers that support this request.
Four years since its introduction in June 2017 with Kafka 0.11.0, message format v2 has been the default message format. Thus, with enough water (or streams if you may) having flowed under the bridge, the major release of 3.0 gives us a good opportunity to deprecate the older message formats—namely v0 and v1. These formats are rarely in use today. With 3.0, users will get a warning if they configure their brokers to use the message formats v0 or v1. This option will be removed in Kafka 4.0 (see KIP-724 for details and implications from the deprecation of v0 and v1 message formats).
KafkaFuture type was introduced to facilitate the implementation of the Kafka AdminClient, pre-Java 8 versions were still in widespread use and Java 7 was officially supported by Kafka. Fast forward to a few years later, and now Kafka runs on Java versions that support the
CompletableFuture class types. With KIP-707,
KafkaFuture adds a method to return a
CompletionStage object and in that way enhances the usability of
KafkaFuture in a backwards compatible way.
KIP-466 adds new classes and methods for the serialization and deserialization of generic lists—a feature useful to Kafka clients and Kafka Streams alike.
The users’ capabilities to list offsets of Kafka topic/partitions have been extended. With KIP-734, users can now ask the AdminClient to return the offset and timestamp of the record with the highest timestamp in a topic/partition. (This is not to be confused with what the AdminClient returns already as the latest offset—which is the offset of the next record to be written in the topic/partition.) This extension to the existing ListOffsets API allows users to probe the liveliness of a partition by asking which is the offset of the most recent record written and what its timestamp is.
In Kafka Connect a connector is represented during runtime as a group of a
Connector class instance and one or more
Task class instances, and most operations on connectors available through the Connect REST API can be applied to the group as a whole. A notable exception since the beginning has been the
restart endpoints for the
Task instances. To restart the connector as a whole, users had to make individual calls to restart the Connector instance and the Task instances. In 3.0, KIP-745 gives the ability to the users to restart either all or only the failed of a connector’s
Task instances with a single call. This feature is an add-on capability and the previous behavior of the
restart REST API remains unchanged.
Following their deprecation in the previous major release (Apache Kafka 2.0),
internal.value.converter are removed as configuration properties and prefixes in the Connect worker’s configuration. Moving forward, internal Connect topics will exclusively use the
JsonConverter to store records without embedded schemas. Any existing Connect clusters that used different converters will have to port their internal topics to the new format (see KIP-738 for details on the upgrade path).
Since Apache Kafka 2.3.0, a Connector worker can be configured to allow connector configurations to override the Kafka client properties used by the connector. This has been a widely used feature and now with the opportunity of a major release the ability to override connector client properties is enabled by default (
connector.client.config.override.policy is set to
All by default).
Another feature that was introduced back in 2.3.0 but hasn’t been enabled by default up to this point is connector log contexts. This is changing in 3.0 and the connector context is added by default in the pattern of
log4j logs of the Connect worker. An upgrade to 3.0 from a previous release will change the format of log lines exported by
log4j by adding the connector context, where appropriate.
KIP-695 enhances the semantics of how Streams tasks choose to fetch records, and extends the meaning and the available values of the configuration property
max.task.idle.ms. This change required a new method in the Kafka consumer API,
currentLag, that is able to return the consumer lag of a specific partition if it is known locally and without contacting the Kafka Broker.
Starting with 3.0, three new methods are added to the
timeCurrentIdlingStarted. These methods can allow Streams applications to keep track of the progress and health of its tasks.
KIP-740 represents a significant renovation of the
TaskId class. Several methods and all internal fields are deprecated, with new
partition() getters replacing the old
partition fields (see also KIP-744 for relevant changes and an amendment to KIP-740).
KIP-744 takes the changes proposed by KIP-740 one step further and separates the implementation from the public API of a number of classes. To accomplish this, the new interfaces
StreamsMetadata are introduced while the existing classes with the same names are deprecated.
The Interactive Queries API is extended with a new set of methods in the
SessionStore interfaces that accept arguments of the
Instant data type. This change will affect any custom read-only Interactive Query session store implementations that will need to implement the new methods.
ProcessorContext adds two new methods in 3.0,
currentStreamTimeMs. The new methods give users the ability to query the cached system time and the streams time respectively, and they can be used in a uniform way in production and test code.
Support for the legacy metrics structure for the built-in metrics in Streams is lifted in 3.0. KIP-743 is removing the value
0.10.0-2.4 from the configuration property
built.in.metrics.version. That leaves
latest as the only valid value of this property at the moment (has been the default value since 2.5).
The prior default value of the default SerDe properties is removed. Streams used to default to the
ByteArraySerde. Starting with 3.0, there is no default, and users are required to either set their SerDes as needed in the API or set a default via
DEFAULT_VALUE_SERDE_CLASS_CONFIG in their Streams configuration. The prior default was almost always not applicable to real applications and caused more confusion than convenience.
With the opportunity of a major release, the default value of the Streams configuration property
replication.factor changes from 1 to -1. This will allow new Streams applications to use the default replication factor defined at the Kafka broker and therefore won’t be required to set this configuration value when they move to production. Note that Kafka Brokers version 2.5 or above are required for the new default value.
Another Streams configuration value that is deprecated in 3.0 is
exactly_once as a value of the property
processing.guarantee. The value
exactly_once corresponds to the original implementation of Exactly Once Semantics (EOS), available to any Streams applications that connects to a Kafka cluster version 0.11.0 or newer. This first implementation of EOS has been superseded by the second implementation of EOS in Streams, which was represented by the value
exactly_once_beta in the
processing.guarantee property. Moving forward, the name
exactly_once_beta is also deprecated and replaced by the new name
exactly_once_v2. In the next major version (4.0), both
exactly_once_beta will be removed, leaving
exactly_once_v2 as the only option for EOS delivery guarantees.
The configuration properties
default.windowed.value.serde.inner are deprecated in favor of a single new property
windowed.inner.class.serde for use by the consumer client. Kafka Streams users are recommended to configure their windowed SerDe by passing this into the SerDe constructor instead and then supplying the SerDe wherever it’s used in topology.
In Kafka Streams, windowed operations are allowed to process records outside of their window according to a configuration property that is called the grace period. Previously, this configuration was optional and easy to miss, leading to the default of 24 hours. This was a frequent source of confusion for users of the
Suppression operator since it would buffer records until the grace period had elapsed and therefore add a 24 hour latency. In 3.0,
Windows classes are enhanced with factory methods that require them to be constructed with a custom grace period or no grace period at all. The old factory methods that applied a default grace period of 24 hours have been deprecated, along with the corresponding
grace() APIs which are incompatible with the new factory methods that already set this config.
The Streams use of the application reset tool
kafka-streams-application-reset becomes more flexible with the addition of a new command-line parameter:
--internal-topics. The new parameter accepts a list of comma-separated topic names that correspond to internal topics that can be scheduled for deletion with this application tool. Combining this new parameter with the existing parameter
--dry-run allows users to confirm which topics will be deleted and specify a subset of them if necessary before actually performing the deletion operation.
With 3.0, the first version of MirrorMaker is being deprecated. Going forward, development of new features and major improvements will focus on MirrorMaker 2 (MM2).
With 3.0, users can now configure where MirrorMaker2 creates and stores its internal topic that it uses to convert consumer group offsets. This will allow users of MirrorMaker2 to maintain the source Kafka cluster as a strictly read-only cluster and use a different Kafka cluster to store offset records (that being the target Kafka cluster or even a third cluster beyond the source and target clusters).
Apache Kafka 3.0 is a major step forward for the Apache Kafka project. To learn more:
This was a huge community effort, so thank you to everyone who contributed to this release, including all our users and our 141 authors and reviewers:
A. Sophie Blee-Goldman, Adil Houmadi, Akhilesh Dubey, Alec Thomas, Alexander Iskuskov, Almog Gavra, Alok Nikhil, Alok Thatikunta, Andrew Lee, Bill Bejeck, Boyang Chen, Bruno Cadonna, CHUN-HAO TANG, Cao Manh Dat, Cheng Tan, Chia-Ping Tsai, Chris Egerton, Colin P. McCabe, Cong Ding, Daniel Urban, Daniyar Yeralin, David Arthur, David Christle, David Jacot, David Mao, David Osvath, Davor Poldrugo, Dejan Stojadinović, Dhruvil Shah, Diego Erdody, Dong Lin, Dongjoon Hyun, Dániel Urbán, Edoardo Comar, Edwin Hobor, Eric Beaudet, Ewen Cheslack-Postava, Gardner Vickers, Gasparina Damien, Geordie, Greg Harris, Gunnar Morling, Guozhang Wang, Gwen (Chen) Shapira, Ignacio Acuña Frías, Igor Soarez, Ismael Juma, Israel Ekpo, Ivan Ponomarev, Ivan Yurchenko, Jason Gustafson, Jeff Kim, Jim Galasyn, Jim Hurne, JoelWee, John Gray, John Roesler, Jorge Esteban Quilcate Otoya, Josep Prat, José Armando García Sancio, Juan Gonzalez-Zurita, Jun Rao, Justin Mclean, Justine Olshan, Kahn Cheny, Kalpesh Patel, Kamal Chandraprakash, Konstantine Karantasis, Kowshik Prakasam, Leah Thomas, Lee Dongjin, Lev Zemlyanov, Liu Qiang, Lucas Bradstreet, Luke Chen, Manikumar Reddy, Marco Aurelio Lotz, Matthew de Detrich, Matthias J. Sax, Michael G. Noll, Michael Noll, Mickael Maison, Nathan Lincoln, Niket Goel, Nikhil Bhatia, Omnia G H Ibrahim, Peng Lei, Phil Hardwick, Rajini Sivaram, Randall Hauch, Rohan Desai, Rohit Deshpande, Rohit Sachan, Ron Dagostino, Ryan Dielhenn, Ryanne Dolan, Sanjana Kaundinya, Sarwar Bhuiyan, Satish Duggana, Scott Hendricks, Sergio Peña, Shao Yang Hong, Shay Elkin, Stanislav Vodetskyi, Sven Erik Knop, Tom Bentley, UnityLung, Uwe Eisele, Vahid Hashemian, Valery Kokorev, Victoria Xia, Viktor Somogyi-Vass, Viswanathan Ranganathan, Vito Jeng, Walker Carlson, Warren Zhu, Xavier Léauté, YiDing-Duke, Zara Lim, Zhao Haiyuan, bmaidics, cyc, dengziming, feyman2016, high.lee, iamgd67, iczellion, ketulgupta1995, lamberken, loboya~, nicolasguyomar, prince-mahajan, runom, shenwenbing, thomaskwscott, tinawenqiao, vamossagar12, wenbingshen, wycccccc, xjin-Confluent, zhaohaidao
This post was originally published by Konstantine Karantasis on The Apache Software Foundation blog.
Konstantine Karantasis is a software engineer at Confluent. He’s a main contributor to Apache Kafka and its Connect API, and an author of widely used software, such as Confluent’s S3 and Replicator connectors, class loading isolation in Kafka Connect, Incremental Cooperative Rebalancing in Kafka, the Confluent CLI and more. Previously, he built scalable open source web services at Yahoo! and researched high-performance computing at the University of Illinois at Urbana-Champaign. Konstantine holds a Ph.D. from the University of Patras.