This blog post is the fourth in a four-part series that discusses a few new Confluent Control Center features that are introduced with Confluent Platform 6.2.0. It focuses on removing residue data via a new cleanup script that helps you remove old Control Center instances easily. The series highlights the following new features that make managing Apache Kafka® clusters via Control Center an even smoother experience:
If you are not too familiar with Control Center, you can always refer to the Control Center overview first. Having a running Control Center instance at hand helps you explore the features discussed in this blog series better.
Now that you are ready, let’s delve into the fourth feature here in part 4: removing residue data with the Control Center Cleanup script.
With each version upgrade or ID update (explained more later), Control Center creates a new set of internal topics that correspond to the new Control Center instance. Consequently, after a Control Center upgrade/update, you may notice topics from old instances are left behind, cluttering the “Topics” overview page as shown below. These old topics are not used by the new Control Center instance, but they continue to take up disk space. Control Center does not automatically delete the old topics in order to avoid accidental removal of wanted data. Unfortunately, manual deletion of the old topics can make the Control Center upgrade/update process cumbersome and error prone.
Version 6.2.0 introduces a new cleanup script, bin/control-center-cleanup, that lets you interactively delete the old instances’ residue (topics and local directories) more easily and quickly when you upgrade/update Control Center. With this new script, you can delete the old instances’ residue while the current instance of Control Center is running.
The example above shows the topic residue from a Control Center upgrade from version 5.4.1 to 6.2.0, where the old set of internal topics prefixed with _confluent-controlcenter-5-4-1-1 is left behind and coexists with the new set created for the 6.2.0 instance.
The same issue occurs if you change the Control Center unique identifier using confluent.controlcenter.id in your properties file. Control Center unique identifiers are useful if you want multiple instances of Control Center to coexist on the same server. However, if you decide to keep only one instance after an identifier change, you will encounter the same data residue issues. For example, if you have Control Center version 6.2.0 and changed the unique identifier to 2, then the old set of internal topics prefixed with _confluent-controlcenter-6-2-0-1 is left behind and coexists with the new set created under the new identifier.
The cleanup script requires a Control Center properties file to establish the initial connection to the Kafka cluster and to decide what the current running Control Center instance is in order to avoid deleting its data. The cleanup script uses:
confluent.controlcenter.name to determine the name of the running instance
confluent.controlcenter.id to determine the unique identifier of the running instance
confluent.controlcenter.data.dir to determine the directory that contains local data of Control Center instances
For example, the Control Center properties file etc/confluent-control-center/control-center.properties contains the following:
############################# Server Basics #############################
bootstrap.servers=localhost:9092
zookeeper.connect=localhost:2181

######################### Control Center Settings #########################
confluent.controlcenter.data.dir=/tmp/control-center
confluent.controlcenter.id=1
# using default confluent.controlcenter.name, "_confluent-controlcenter"
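Therefore, when you run the cleanup script from package confluent-6.2.0 with this file, the script determines that the running instance is _confluent-controlcenter-6-2-0-1 (<running instance name>-<version>-<id>). It also determines that the local data of all instances resides in the directory set by confluent.controlcenter.data.dir (/tmp/control-center in this file).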
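The naming rule above can be sketched in a few lines of Python. This is only an illustration of the <running instance name>-<version>-<id> convention, not the cleanup script's actual implementation. The helper names parse_properties and running_instance are hypothetical, and the parser handles only plain key=value lines rather than the full Java properties syntax.

```python
DEFAULT_NAME = "_confluent-controlcenter"  # default name, as noted in the sample file


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def running_instance(props, package_version):
    """Build <name>-<version>-<id>, writing the version with dashes (6.2.0 -> 6-2-0)."""
    name = props.get("confluent.controlcenter.name", DEFAULT_NAME)
    instance_id = props["confluent.controlcenter.id"]
    return f"{name}-{package_version.replace('.', '-')}-{instance_id}"


SAMPLE = """\
bootstrap.servers=localhost:9092
zookeeper.connect=localhost:2181
confluent.controlcenter.data.dir=/tmp/control-center
confluent.controlcenter.id=1
"""

props = parse_properties(SAMPLE)
print(running_instance(props, "6.2.0"))            # _confluent-controlcenter-6-2-0-1
print(props["confluent.controlcenter.data.dir"])   # /tmp/control-center
```

Passing "5.4.1" as the package version with the same file would yield _confluent-controlcenter-5-4-1-1, which is why the same properties file can refer to different instances depending on the package it is used with.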
Assume that only the Control Center instance defined in the properties file, _confluent-controlcenter-6-2-0-1, is up and running.
From $CONFLUENT_HOME, run the script as ./bin/control-center-cleanup <props_file>, and you will get the following prompt:
$CONFLUENT_HOME is the environment variable for your Confluent Platform directory. You can set it with export CONFLUENT_HOME=<path-to-confluent>.
./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
============================================================================
The cleanup script found the following instance:
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.
Here are the instances discovered for cleanup:
_confluent-controlcenter-5-4-1-1
_confluent-controlcenter-5-4-1-2
Cleanup ALL of the instances above? [y/N]:
The script avoids cleaning the running instance, _confluent-controlcenter-6-2-0-1, and discovers that there are two old Control Center instances from version 5.4.1 available for cleanup: _confluent-controlcenter-5-4-1-1 and _confluent-controlcenter-5-4-1-2.
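The post does not describe how the script discovers old instances. One plausible way to picture it, purely as an assumption, is to group topic names by their <name>-<version>-<id> prefix and drop the running instance. The function names and the regular expression below are hypothetical and only illustrate that idea.

```python
import re


def discover_instances(topics, name="_confluent-controlcenter"):
    """Collect distinct <name>-<major>-<minor>-<patch>-<id> prefixes (assumed pattern)."""
    pattern = re.compile(rf"^({re.escape(name)}-\d+-\d+-\d+-\d+)-")
    found = set()
    for topic in topics:
        match = pattern.match(topic)
        if match:
            found.add(match.group(1))
    return sorted(found)


def cleanup_candidates(topics, running, name="_confluent-controlcenter"):
    """Every discovered instance except the one that is currently running."""
    return [inst for inst in discover_instances(topics, name) if inst != running]


topics = [
    "_confluent-controlcenter-5-4-1-1-cluster-rekey",
    "_confluent-controlcenter-5-4-1-1-AlertHistoryStore-changelog",
    "_confluent-controlcenter-5-4-1-2-cluster-rekey",
    "_confluent-controlcenter-6-2-0-1-MetricsAggregateStore-changelog",
    "_confluent-metrics",  # shared input topic, carries no instance prefix
]

for instance in cleanup_candidates(topics, "_confluent-controlcenter-6-2-0-1"):
    print(instance)
```

With the sample topic list, the two 5.4.1 instances are reported and the 6.2.0 instance is left out, mirroring the prompt shown above.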
You can type y to clean all of the old instances without intermissions or prompts.
You can type N to be prompted for each instance individually instead.
If you type N in the previous step, you will receive the following prompt:
Do you want to cleanup _confluent-controlcenter-5-4-1-1 ? [y/N/dryRun]:
For each Control Center instance, you can type y to clean the instance’s topics and the instance’s local directories.
The topics are the ones prefixed with the instance name, for example:
_confluent-controlcenter-5-4-1-1-AlertHistoryStore-changelog
_confluent-controlcenter-5-4-1-1-MetricsAggregateStore-changelog
_confluent-controlcenter-5-4-1-1-cluster-rekey
_confluent-controlcenter-5-4-1-1-expected-group-consumption-rekey
_confluent-controlcenter-5-4-1-1-actual-group-consumption-rekey
The local directories are <confluent.controlcenter.data.dir>/<instance id>/cp-command/<instance name> and <confluent.controlcenter.data.dir>/<instance id>/kafka-streams/<instance name>. Refer to the table below for an example:
You can type N to skip cleanup for the instance at hand.
You can type dryRun to see which topics and local directories would be deleted, without any actual impact. After dryRun, you will be prompted to clean up the same instance again with the options [y/N/dryRun] until you type either y or N, deciding to clean or skip the instance.
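The per-instance prompt described above behaves like a small loop: dryRun previews and asks again, while y or N ends the loop. The sketch below models that flow in Python under stated assumptions. The function name prompt_for_instance, the preview callback, and the handling of unexpected input (treated as N here) are all hypothetical and are not taken from the script.

```python
def prompt_for_instance(instance, answers, preview):
    """Return "clean" or "skip" for one instance, re-prompting after each dryRun.

    answers is any iterable of user inputs; preview is called on every dryRun.
    """
    for answer in answers:
        if answer == "y":
            return "clean"
        if answer == "dryRun":
            preview(instance)  # show what would be deleted, then ask again
            continue
        # "N" skips the instance. Treating any other input as N is an
        # assumption made for this sketch only.
        return "skip"
    return "skip"  # input ran out without a decision


previewed = []
decision = prompt_for_instance(
    "_confluent-controlcenter-5-4-1-1",
    ["dryRun", "y"],
    previewed.append,
)
print(decision, previewed)
```

Modeling the answers as an iterable keeps the sketch testable without reading from a terminal.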
If you would like to avoid being prompted to clean each instance, type y at the step 2 prompt, Cleanup ALL of the instances above? [y/N].
Most of the logs are omitted below; only the high-level lines are shown. Lines that start with # are comments added later and are not part of the original log.
./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
================================================================================
The cleanup script found the following instance:
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.
Here are the instances discovered for cleanup:
_confluent-controlcenter-5-4-1-1
_confluent-controlcenter-5-4-1-2
Cleanup ALL of the instances above? [y/N]: N
Do you want to cleanup _confluent-controlcenter-5-4-1-1 ? [y/N/dryRun]: dryRun
----Dry run displays the actions which will be performed when running Streams Reset Tool----
Reset-offsets for input topics [_confluent-monitoring, _confluent-command, _confluent-metrics]
Seek-to-end for intermediate topics [_confluent-controlcenter-5-4-1-1-cluster-rekey, _confluent-controlcenter-5-4-1-1-monitoring-message-rekey-store, _confluent-controlcenter-5-4-1-1-actual-group-consumption-rekey, _confluent-controlcenter-5-4-1-1-expected-group-consumption-rekey, _confluent-controlcenter-5-4-1-1-group-stream-extension-rekey, _confluent-controlcenter-5-4-1-1-monitoring-trigger-event-rekey, _confluent-controlcenter-5-4-1-1-MetricsAggregateStore-repartition, _confluent-controlcenter-5-4-1-1-metrics-trigger-measurement-rekey]
Following input topics offsets will be reset to (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Following intermediate topics offsets will be reset to end (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Deleting all internal/auto-created topics for application _confluent-controlcenter-5-4-1-1
(...)
Deleting intermediate topics (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Deleting local RocksDB data in /tmp/confluent/control-center/1
Deleting /tmp/confluent/control-center/1/cp-command/_confluent-controlcenter-5-4-1-1-command
Deleting /tmp/confluent/control-center/1/kafka-streams/_confluent-controlcenter-5-4-1-1
Done.
Finished dryRun for _confluent-controlcenter-5-4-1-1 . Do you want to clean it up? [y/N/dryRun]: y
# Logs omitted. Same steps as above:
# 1. For input topics, reset offsets to specified position (default EARLIEST) ← from Kafka Streams Reset Tool
# 2. For intermediate topics, seek offsets to the end, LATEST ← from Kafka Streams Reset Tool
# 3. Delete internal/auto-created topics ← from Kafka Streams Reset Tool
# 4. Delete intermediate topics
# 5. Delete local RocksDB data in directories
Do you want to cleanup _confluent-controlcenter-5-4-1-2 ? [y/N/dryRun]: y
# Logs omitted. Same 5 steps as above.
================================================================================
If you run the cleanup script again, you will see that _confluent-controlcenter-5-4-1-1 and _confluent-controlcenter-5-4-1-2 were cleaned up successfully, and you won’t be prompted for them again.
./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
================================================================================
The cleanup script found the following instance:
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.
The cleanup script found no instances for cleanup.
================================================================================
Historically, Control Center has had a reset script, bin/control-center-reset, which cleans up one instance at a time without any guidance prompts. The script only deletes the instance defined in the provided properties file and does not automatically discover other instances. Therefore, in order to maintain a clean Control Center environment, it is recommended that you run the reset script upon each version upgrade or unique identifier update.
Before we dive into the benefits of the cleanup script, the following provides a bit more detail about the reset script.
Just like the cleanup script, the reset script also requires a Control Center properties file. It is used to establish the initial connection to the Kafka cluster and to determine the Control Center instance to delete (the reset script only deletes the instance defined in your properties file). New in version 6.2.0, the reset script supports a dryRun flag:
bin/control-center-reset <props_file> [--dryRun]
With the dryRun flag, the script previews the topics and directories pertaining to the Control Center instance defined in your properties file, without actually deleting them.
It is important to note that prior to version 6.2.0, the reset script cleaned local directories more “drastically.” It found the unique identifier in the properties file, confluent.controlcenter.id, and deleted the entire ID directory, not just the directories of the target instance.
For example, if the unique identifier is 1, and you have two Control Center instances with ID 1, _confluent-controlcenter-5-4-1-1 (target instance to delete) and _confluent-controlcenter-6-2-0-1, then the entire ID directory /tmp/control-center/1 would be deleted, not just the directories of _confluent-controlcenter-5-4-1-1 (orange directories deleted):
This reset script issue is fixed in version 6.2.0, where only the target instance’s directories are deleted:
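The difference between the two behaviors can be sketched as the set of paths each one would remove. This is a simplified illustration, not the reset script's code. The function names are hypothetical, and the paths follow the <confluent.controlcenter.data.dir>/<instance id>/cp-command/<instance name> and .../kafka-streams/<instance name> pattern given earlier (note that the sample log shows the cp-command directory with an extra -command suffix, which this sketch does not model).

```python
from pathlib import PurePosixPath


def instance_dirs(data_dir, instance_id, instance_name):
    """Local directories that belong to a single instance."""
    base = PurePosixPath(data_dir) / str(instance_id)
    return [
        str(base / "cp-command" / instance_name),
        str(base / "kafka-streams" / instance_name),
    ]


def reset_targets(data_dir, instance_id, instance_name, pre_6_2_0):
    """Paths the reset script would delete, before and after the 6.2.0 fix."""
    if pre_6_2_0:
        # Old behavior: the whole ID directory, shared by every instance with this ID.
        return [str(PurePosixPath(data_dir) / str(instance_id))]
    # 6.2.0 behavior: only the target instance's own directories.
    return instance_dirs(data_dir, instance_id, instance_name)


target = "_confluent-controlcenter-5-4-1-1"
print(reset_targets("/tmp/control-center", 1, target, pre_6_2_0=True))
print(reset_targets("/tmp/control-center", 1, target, pre_6_2_0=False))
```

In the old behavior the single returned path, /tmp/control-center/1, also contains the directories of _confluent-controlcenter-6-2-0-1, which is exactly the collateral deletion described above.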
To summarize, although the two scripts look similar, the reset script and the cleanup script select their targets in opposite ways. The former can only delete the Control Center instance defined in your Control Center configuration file, while the latter can automatically discover and delete any instances except the one defined in your configuration file. To maintain a clean environment, the reset script needs to be run each time before you start a new instance, while the cleanup script can run anytime (before or after starting a new instance, or just periodically). The cleanup script also provides a handful of guidance prompts, giving you full control over which instance(s) to delete.
The cleanup script offers a few benefits that make your Control Center upgrade/update process less cumbersome and error prone:
Suppose you are an operator who strives to maintain a clean environment by keeping only the necessary Control Center instances. You can now use the cleanup script to periodically delete all the unused instances in one go. There is no need to manually hunt down each Kafka topic or local data from old Control Center instances anymore.
Imagine you are an operator and just performed a Control Center unique identifier update. With the reset script, you would need to modify the properties file to target the instance that you want to delete and repeat the process until all the old instances are deleted. Now with the cleanup script, you only need the latest properties file, which will single out the running instance and delete the old ones.
Let’s say you are an operator and just performed a Control Center version upgrade. With the reset script, in order to delete an old instance, you would need to run the script in the Confluent Platform package whose version matches the target instance. For example, to delete an instance of version 5.4.1, you need to run the script in Confluent Platform package 5.4.1; running the reset script in Confluent Platform package 5.4.2 would delete the 5.4.2 instance, not the target 5.4.1 instance. Now with the cleanup script, you can run the script in any Confluent Platform package that provides it, and it will automatically discover old instances to delete. No need to match the package version with the target instance!
For operators who want to make sure they do not delete the wrong Control Center instances, the cleanup script provides guidance prompts, including a dryRun preview, before anything is deleted.
In summary, removing residue data with the Control Center cleanup script allows you to maintain a clean environment by removing data from unused Control Center instances in one run, making the Control Center upgrade/update process more efficient and less error prone.
To learn about other new features of Control Center 6.2.0, check out the remaining blog posts in this series:
Rinka Yoshida joined Confluent in 2020 as a backend developer for Confluent Control Center and currently works on projects that identify and grow users’ journeys with Confluent Platform. She earned a bachelor’s degree in computer science from the University of California, San Diego.