StreamSets Guide:

Getting Started with Apache Kafka

Apache Kafka is a scalable, fault-tolerant messaging system commonly used in publish/subscribe (pub/sub) architectures. It supports a range of use cases, including message bus modernization, microservices architectures, and ETL over streaming data. The main benefits of Apache Kafka are:

  • High throughput -- Each server can handle hundreds of megabytes of data per second.
  • High availability -- Data can be stored redundantly across multiple servers, so it survives the failure of an individual server.
  • High scalability -- New servers can be added over time to scale out the system.
  • Easy integration with external data sources or data sinks.
  • Built-in real-time processing layer.

This guide provides instructions on:

  • Downloading and installing Apache Kafka
  • Producing and consuming messages
  • Working with Kafka Connect
  • Transformations in Kafka
  • Connect REST APIs

To learn more and get started quickly with Apache Kafka, download this guide.

