In private and public clouds, stream analytics commonly means stateless processing systems organized around Apache Kafka® or a similar distributed log service. GCP took a somewhat different tack, with Cloud Pub/Sub, Dataflow, and BigQuery, distributing the responsibility for processing among ingestion, processing and database technologies.
We compare the two approaches to data integration and show how Dataflow allows you to join and transform and deliver data streams among on-prem and cloud Apache Kafka clusters, Cloud Pub/Sub topics and a variety of databases. The session will have a mix of architectural discussions and practical code reviews of Dataflow-based pipelines.
Ricardo Ferreira, Developer Advocate, Confluent
Ricardo is a Developer Advocate at Confluent, the company founded by the creators of Apache Kafka. He has +21 years of experience working with Software Engineering, where he specialized in different types of Distributed Systems architectures such as Integration, SOA, NoSQL, Messaging, In-Memory Caching, and Cloud Computing. Prior to Confluent, he worked for other vendors such as Oracle, Red Hat and IONA Technologies, as well as several consulting firms. While at Oracle, he used to be part of the Alpha Team, otherwise known as “The A-Team” — a special unit from the Engineering organization that handles projects using the following philosophy: when all else fails, we don’t. Currently, he lives in Apex, North Carolina, with his wife, son and two dogs.
Karthi Thyagarajan, Solutions Architect, Google Cloud
Karthi is a Solutions Architect at Google Cloud focusing on Data and Analytics. In his role, he helps customers build large scale distributed applications using Google Cloud Platform.