
Getting Started with the Kafka Streams API using Confluent Docker Images


What’s great about the Kafka Streams API is not just how fast your application can process data with it, but also how fast you can get up and running with your application in the first place, regardless of whether you are implementing your applications in Java or other JVM-based languages such as Scala and Clojure. Unlike competing technologies, Apache Kafka® and its Streams API do not require installing a separate processing cluster, and they are equally viable for small, medium, large, and very large use cases.

In fact, it’s pretty common for our users to have their first application or proof-of-concept running in a matter of minutes. Some users, for example, opt to test-drive and develop their applications on their laptops against embedded, in-memory instances of Kafka and related services such as Confluent Schema Registry. They also use the same setup for automated integration testing in CI environments backed by Jenkins or Travis CI. Our own GitHub repo containing the Confluent demo applications uses exactly such a setup.

Docker and Kafka Streams API: A Perfect Match

Many developers love container technologies such as Docker and the Confluent Docker images to speed up the iterative development they’re doing on their laptops: for example, to quickly spin up a containerized Confluent Platform deployment consisting of multiple services such as Apache Kafka, Confluent Schema Registry, and Confluent REST Proxy for Kafka.

Docker is also a very popular choice among Kafka users for containerizing and deploying applications and microservices on platforms such as Kubernetes or in the cloud. And yes, unlike related technologies such as Apache® Spark™ or Apache Flink®, where you must install and run special processing clusters into which you then submit cluster-specific “processing jobs,” you actually can containerize applications that use the Kafka Streams API because these are standard Java applications. (As a side note, these applications are backwards and forwards compatible with Kafka cluster versions, which makes such deployments flexible enough to accommodate teams that work independently across a company.)

This also means you can use the same organizational processes and technical tooling for development, testing, packaging, deployment, and monitoring of Kafka Streams applications that you use everywhere else inside your company. For example, if you don’t like containers but prefer deploying to VMs with Puppet or Ansible, no problem. If you do like containers and enjoy deploying to Kubernetes or a cloud service like AWS EC2, no problem either. And speaking of containers and Docker, this brings us to the focus of this blog post.

To get started with the Kafka Streams API, most users begin with our Confluent demo applications or the Kafka Streams API chapter in the Confluent documentation. To make your getting-started experience even better, we recently added a new Docker-based demo setup. This Docker-based demo is the focus of this blog post and, because the demo is a one-click experience, the remainder of this post will be short and concise!

Creating the Kafka Music Demo

We will run the Confluent Kafka Music demo application in a containerized, multi-service deployment, using Docker. If you are reading this blog post for the first time, this will take you about five minutes. Afterward, this will take just a few seconds!

Our Kafka Music application demonstrates how to build a music charts application that continuously computes, in real time, the latest charts such as “Top 5 songs” per music genre. It exposes its latest Streams processing results, namely the latest music charts, through Kafka’s Interactive Queries feature (see our documentation on Interactive Queries) combined with a REST API. The application’s input data is in Avro format and comes from two sources: a stream of play events (think: “song X was just played”) and a stream of song metadata (“song X was written by artist Y”). The corresponding Avro schemas are registered with the Confluent Schema Registry instance because that’s how one creates production-ready data streams.
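To make the charts computation more concrete, here is a minimal, plain-Java sketch of the idea: combine play events with song metadata, count plays per song, and keep the most-played titles per genre. It is an in-memory illustration only and does not use the Kafka Streams API or Avro; the class name TopSongsSketch, the Song type, and the method topPerGenre are hypothetical names chosen for this example and are not taken from the demo's source code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the charts computation (not the demo's code).
class TopSongsSketch {

    // Illustrative song metadata: "song X has this title and belongs to this genre".
    static final class Song {
        final long id;
        final String title;
        final String genre;

        Song(long id, String title, String genre) {
            this.id = id;
            this.title = title;
            this.genre = genre;
        }
    }

    // Returns, per genre, up to n song titles ordered by play count (most played first).
    static Map<String, List<String>> topPerGenre(List<Song> songs, List<Long> playedSongIds, int n) {
        Map<Long, Song> songsById = new HashMap<>();
        for (Song s : songs) {
            songsById.put(s.id, s);
        }

        // "Join" each play event with the metadata and count plays per known song.
        Map<Long, Long> playCounts = new HashMap<>();
        for (Long songId : playedSongIds) {
            if (songsById.containsKey(songId)) {
                playCounts.merge(songId, 1L, Long::sum);
            }
        }

        // Group the songs that were played at least once by genre.
        Map<String, List<Song>> byGenre = new HashMap<>();
        for (Long songId : playCounts.keySet()) {
            Song s = songsById.get(songId);
            byGenre.computeIfAbsent(s.genre, g -> new ArrayList<>()).add(s);
        }

        // Rank each genre by play count (descending), break ties by title, keep n.
        Map<String, List<String>> charts = new HashMap<>();
        for (Map.Entry<String, List<Song>> entry : byGenre.entrySet()) {
            List<Song> ranked = entry.getValue();
            ranked.sort((a, b) -> {
                int byCount = Long.compare(playCounts.get(b.id), playCounts.get(a.id));
                return byCount != 0 ? byCount : a.title.compareTo(b.title);
            });
            List<String> titles = new ArrayList<>();
            for (Song s : ranked.subList(0, Math.min(n, ranked.size()))) {
                titles.add(s.title);
            }
            charts.put(entry.getKey(), titles);
        }
        return charts;
    }

    public static void main(String[] args) {
        List<Song> songs = List.of(
            new Song(1L, "Song A", "Rock"),
            new Song(2L, "Song B", "Rock"),
            new Song(3L, "Song C", "Jazz"));
        // Each element stands for one play event: "song X was just played".
        List<Long> plays = List.of(2L, 1L, 2L, 3L);
        System.out.println(topPerGenre(songs, plays, 5));
    }
}
```

In a streaming application the same result is maintained continuously as new play events arrive, rather than recomputed from a complete list as in this sketch.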

We will run the following containerized services:

If you first want to see a preview of what we will do in the subsequent sections, take a look at the following screencast:

Screencast: Running Confluent Kafka Music demo application (3 mins)


There is only one requirement to meet: you must install a recent version of Docker and Docker Compose on your host machine (e.g., your laptop running Mac OS, Linux, or Windows) if you haven’t done so already. If you are on a Mac, follow the instructions at Docker for Mac. The Confluent Docker images require Docker version 1.11 or greater.

For reference, I have run the instructions in this blog on a MacBook Pro with Mac OS Sierra and the following Docker versions:

Running the Kafka Music demo application

The first step is to clone the Confluent Docker Images repository:

Now we can launch the Kafka Music demo application including the services it depends on, such as Kafka:

After a few seconds, the application and the services are up and running. One of the started containers is continuously generating input data for the application by writing into the application’s input topics. This allows us to look at live, real-time data when using the Kafka Music application.

Now we can use our web browser or a CLI tool such as curl to interactively query the latest processing results of the Kafka Music application by accessing its REST API. In other words, we can play around now!

REST API example 1: list all running application instances of the Kafka Music application

REST API example 2: get the latest Top 5 songs across all music genres

The REST API exposed by the Kafka Music demo application supports further operations. See the top-level instructions in its source code for details (link points to the sources for Confluent 3.2).
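As an alternative to a browser or curl, the REST API can also be queried programmatically. The following is a minimal sketch using the HTTP client that ships with Java 11 and later. The base URL (http://localhost:7070) and the paths /kafka-music/instances and /kafka-music/charts/top-five are assumptions made for this example; check the instructions in the demo's source code for the actual host, port, and endpoints. The class and method names are likewise hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client sketch; endpoint locations below are assumptions, not confirmed values.
class KafkaMusicRestClientSketch {

    // Assumed base URL: adjust to wherever the demo's REST API actually listens.
    static final String ASSUMED_BASE_URL = "http://localhost:7070";

    // Builds a GET request for the given path relative to the base URL.
    static HttpRequest buildGet(String baseUrl, String path) {
        URI uri = URI.create(baseUrl).resolve(path);
        return HttpRequest.newBuilder(uri)
            .timeout(Duration.ofSeconds(5))
            .header("Accept", "application/json")
            .GET()
            .build();
    }

    // Sends the request and returns the response body, failing on non-200 responses.
    static String fetch(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode()
                + " for " + request.uri());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Assumed path for listing the running application instances.
        System.out.println(fetch(client, buildGet(ASSUMED_BASE_URL, "/kafka-music/instances")));
        // Assumed path for the latest Top 5 songs across all genres.
        System.out.println(fetch(client, buildGet(ASSUMED_BASE_URL, "/kafka-music/charts/top-five")));
    }
}
```

Running main only succeeds while the demo's containers are up; otherwise the client fails with a connection error.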

If you’d like to continue exploring, perhaps by creating new Kafka topics or launching additional demonstrations, take a closer look at our Docker tutorial for Confluent 3.2.1.

Once you’re done you can stop all the services and containers with:

Conclusion and Wrapping Up

What’s great about what we have just done is not the actual Kafka Music example. Rather, it’s that you can do the very same for your own applications! You can containerize your Kafka Streams application, similar to what we have done for the Kafka Music application above, and you can also deploy your application easily alongside other services such as an Apache Kafka cluster (with one or multiple brokers), Confluent Schema Registry, Confluent Control Center, and much more, including your own dockerized services. All you need is Docker and the Confluent Docker images for Apache Kafka and friends. If you need an example or template for containerizing your Kafka Streams application, take a look at the source code of the Docker image we used for this blog post.

Lastly, the image for running the Kafka Music demo application actually contains all of Confluent’s Kafka Streams demo applications. This means you can easily run any of these applications, too. I won’t cover that in this blog post, but we have instructions for how to do so.

Next Steps

If you have enjoyed this article, you might want to continue with the following resources to learn more about Apache Kafka’s Streams API:
