Dublin Apache Kafka Meetup by Confluent – Recording

Watch a recording from the Dublin Apache Kafka Meetup by Confluent, held on July 4, 2017.

Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, Streams vs. Databases

Speaker: Michael Noll, Product Manager, Confluent

Data is at the core of modern business, and this data is constantly changing. How can you harness this torrent of information in real time? The answer is stream processing, and the technology that has become the core platform for streaming data is Apache Kafka. Among the thousands of companies using Kafka to transform and reshape their industries are Netflix, Uber, PayPal, and Airbnb, but also established enterprises such as Goldman Sachs, Cisco, and Oracle.

Unfortunately, today’s common architectures for real-time data processing at scale suffer from complexity: many technologies must be stitched together and operated in concert, and each of them is often complex in its own right. This has led to a stark discrepancy between how we, as engineers, would like to work and how we actually end up working in practice.

In this session we talk about how Apache Kafka helps you radically simplify your data architectures. We cover how you can build normal applications to serve your real-time processing needs, rather than building clusters or similar special-purpose infrastructure, and still benefit from properties such as high scalability, distributed computing, and fault tolerance that are typically associated exclusively with cluster technologies. Drawing on common use cases, we show that stream processing in practice often requires database-like functionality, and how Kafka lets you bridge the worlds of streams and databases when implementing your own core business applications (inventory management for large retailers, patient monitoring in healthcare, fleet tracking in logistics, etc.), for example in the form of event-driven, containerized microservices.
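To make the "applications, not clusters" idea concrete, here is a minimal Kafka Streams sketch in Java. It counts inventory events per product key and writes the running totals back to Kafka; the topic names, application id, and broker address are illustrative assumptions, not details from the talk.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class InventoryCountApp {

    public static void main(final String[] args) {
        // Plain client-side configuration: the application is an ordinary JVM
        // process, not a cluster. Scaling out means starting more instances
        // with the same application.id; Kafka rebalances the work among them.
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "inventory-count-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        final StreamsBuilder builder = new StreamsBuilder();

        // A stream of inventory events, keyed by product id. Counting per key
        // turns the stream into a continuously updated table of totals: the
        // stream/database duality the talk refers to.
        final KStream<String, String> events = builder.stream("inventory-events");
        events.groupByKey()
              .count()
              .toStream()
              .to("inventory-counts", Produced.with(Serdes.String(), Serdes.Long()));

        final KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the application cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```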


Real-time Data Integration at Scale with Kafka Connect

Speaker: Robin Moffatt, Partner Technology Evangelist, Confluent

Apache Kafka is a streaming data platform. It enables integration of data across the enterprise, and ships with its own stream processing capabilities. But how do we get data in and out of Kafka in an easy, scalable, and standardised manner? Enter Kafka Connect. Part of Apache Kafka since version 0.9, Kafka Connect defines an API for integrating data from multiple sources, including MQTT, common NoSQL stores, and change data capture (CDC) from relational databases such as Oracle. By "turning the database inside out" we can enable an event-driven architecture in the business that reacts to changes made by applications writing to a database, without having to modify those applications themselves. As well as ingest, Kafka Connect offers sink connectors for numerous targets, including HDFS, S3, and Elasticsearch.
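Because connectors are configuration rather than code, a complete pipeline step can be declared in a few lines. As a hedged sketch (the connector name, topic, and Elasticsearch URL below are assumed values, not from the talk), JSON like this could be posted to the Kafka Connect REST API to stream a topic into Elasticsearch via Confluent's Elasticsearch sink connector:

```json
{
  "name": "orders-elasticsearch-sink",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "connection.url": "http://localhost:9200",
    "type.name": "kafka-connect",
    "key.ignore": "true",
    "schema.ignore": "true"
  }
}
```

Connect then takes care of offset tracking, parallelism via tasks, and fault tolerance for the pipeline.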

This presentation will briefly recap the purpose of Kafka, and then dive into Kafka Connect, with practical examples of data pipelines that can be built with it and that are already in production at companies around the world. We'll also look at the Single Message Transform (SMT) capabilities introduced in Kafka 0.10.2 and how they make Kafka Connect even more flexible and powerful.
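As an illustration of SMTs (the field and topic names here are assumptions), a fragment like the following could be added to a connector's configuration to stamp each record with an ingestion-time field and route it to a prefixed topic, using two transforms that ship with Apache Kafka:

```json
{
  "transforms": "addTimestamp,prefixTopic",
  "transforms.addTimestamp.type": "org.apache.kafka.connect.transforms.InsertField$Value",
  "transforms.addTimestamp.timestamp.field": "ingested_at",
  "transforms.prefixTopic.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.prefixTopic.regex": "(.*)",
  "transforms.prefixTopic.replacement": "dc1-$1"
}
```

Each transform is applied to every record as it passes through the connector, so simple reshaping needs no separate stream processing job.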