Franz Kafka (3 July 1883 – 3 June 1924) was a German-language writer of novels and short stories who is widely regarded as one of the major figures of 20th-century literature.
Apache Kafka is an open-source message broker project developed by the Apache Software Foundation and written in Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. The design is heavily influenced by transaction logs.
Every byte of data has a story to tell. The faster and easier we move it around, the more we can focus on the core business. Data pipelines are the epicenter of data-driven companies, and Apache Kafka is becoming the heart of them.
What is Kafka?
Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable, and durable.
- Kafka maintains feeds of messages in categories called topics.
- Producers write data to topics.
- Consumers read from topics.
- Kafka runs as a cluster of one or more servers, each of which is called a broker.
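To make the vocabulary above concrete, here is a minimal in-memory sketch of topics, producers, and consumers. It is a toy model, not Kafka's client API: the names ToyBroker, ToyProducer, ToyConsumer, and the "page-views" topic are all made up for illustration, and a real cluster adds partitions, replication, and persistence on top of this idea.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a single broker: one append-only log per topic."""

    def __init__(self):
        self._logs = defaultdict(list)  # topic name -> list of messages

    def append(self, topic, message):
        """Append a message to a topic and return its offset in the log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset):
        """Return all messages in a topic from the given offset onward."""
        return self._logs[topic][offset:]


class ToyProducer:
    """Writes messages to topics on a broker."""

    def __init__(self, broker):
        self._broker = broker

    def send(self, topic, message):
        return self._broker.append(topic, message)


class ToyConsumer:
    """Reads one topic and tracks its own position, independent of other consumers."""

    def __init__(self, broker, topic):
        self._broker = broker
        self._topic = topic
        self._offset = 0

    def poll(self):
        messages = self._broker.read(self._topic, self._offset)
        self._offset += len(messages)
        return messages


broker = ToyBroker()
producer = ToyProducer(broker)
producer.send("page-views", "user-1 viewed /home")
producer.send("page-views", "user-2 viewed /pricing")

analytics = ToyConsumer(broker, "page-views")
audit = ToyConsumer(broker, "page-views")

first_batch = analytics.poll()   # both messages written so far
producer.send("page-views", "user-1 viewed /docs")
second_batch = analytics.poll()  # only the message added since the last poll
audit_batch = audit.poll()       # a separate consumer still sees the full log
```

The point of the sketch is that messages stay in the topic's log after they are read, so each consumer can move through the same feed at its own pace.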
What is Kafka Connect?
The recent post from my co-founder Kostas Pardalis does a great job explaining it.
Kafka Connect was introduced recently as a feature of Apache Kafka 0.9+ with the narrow (although very important) scope of copying streaming data from and to a Kafka cluster.
Kafka Connect is about interacting with other data systems and moving data between them and a Kafka cluster. Many of the available connectors focus on systems that are managed by the owner of the Kafka cluster, e.g. relational databases that hold transactional data, and try to turn these systems into a stream of data.
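As a rough illustration of what "turning a database into a stream" can mean, the sketch below repeatedly polls a table for rows it has not seen yet and appends them to a log. It is not the Kafka Connect API: the orders table, the copy_to_topic helper, and the plain Python list standing in for a topic are all assumptions made for this example, and polling on an increasing id is just one simple way to detect new rows.

```python
import sqlite3


def poll_new_rows(conn, last_id):
    """Return rows with an id greater than last_id, oldest first."""
    cursor = conn.execute(
        "SELECT id, customer, amount FROM orders WHERE id > ? ORDER BY id",
        (last_id,),
    )
    return cursor.fetchall()


def copy_to_topic(conn, topic_log, last_id):
    """Append unseen rows to the topic log and return the updated position."""
    for row_id, customer, amount in poll_new_rows(conn, last_id):
        topic_log.append({"id": row_id, "customer": customer, "amount": amount})
        last_id = row_id
    return last_id


# Hypothetical source system: an in-memory SQLite database with an orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5)],
)

orders_topic = []  # stands in for a Kafka topic
position = copy_to_topic(conn, orders_topic, last_id=0)

# A new transaction lands in the database; the next poll picks up only that row.
conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("carol", 7.25))
position = copy_to_topic(conn, orders_topic, position)
```

Remembering the last copied id plays the same role as a consumer's offset: it lets the copy job resume where it left off instead of re-reading the whole table.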
So if you are exploring Kafka, check his detailed post, about a: