
sovanmukherjee/springboot-kafka-avro


Getting Started

Spring Boot 3 + Kafka + Schema Registry and Avro

Overview

As the usage of enterprise message producers, brokers, and consumers grows, ensuring that data or payloads are compliant with a known schema becomes crucial for several reasons:

Interoperability: Schemas provide a common structure for data, ensuring that different components and systems can understand and work with the data. This is especially important in a heterogeneous environment where various technologies and platforms are in use.

Data Consistency: Schemas help to maintain data consistency by defining the format and data types. This consistency is essential to prevent data errors or misinterpretations that could lead to system failures or incorrect business decisions.

Validation: Schemas enable data validation. When data is produced, it can be validated against the schema to ensure it meets the required standards. This reduces the likelihood of erroneous data entering the system.

Version Control: Enterprises often evolve and update their data structures over time. Schemas provide a way to manage version control, ensuring that different versions of data can be correctly processed by consumers.

Security: Schemas can include security constraints, ensuring that sensitive data is handled appropriately and that access controls are enforced.

Documentation: Schemas serve as documentation for the data structures used within the enterprise. This aids in onboarding new team members, understanding data flows, and troubleshooting issues.

To achieve schema compliance, organizations often use technologies and practices such as:

Schema Definition Languages: Using languages like JSON Schema, Avro Schema, Protocol Buffers, or XML Schema to formally define the structure of data.

Schema Registry: Implementing a central repository (schema registry) where schema definitions are stored and managed. This allows producers and consumers to reference and validate data against the latest schema versions.

Data Validation: Implementing data validation processes at the producer and consumer ends to ensure data adheres to the defined schema before it's transmitted or processed.

Schema Evolution: Establishing procedures for handling schema changes, including versioning and backward compatibility to ensure smooth transitions when schemas are updated.

Monitoring and Alerting: Implementing monitoring tools and alerting mechanisms to detect and notify stakeholders of any schema compliance violations.

Education and Training: Ensuring that teams are well-trained on schema usage and compliance practices to minimize errors and maintain data quality.

By focusing on schema compliance, enterprises can maintain data quality, reduce errors, improve interoperability, and make their systems more robust and reliable as they scale and evolve.

🔹 In this application, we use **Avro schemas** to establish a data contract between our microservice applications.

Guides

  1. Download and install Docker Desktop

  2. You can check the version of Docker you have installed:

    docker --version
  3. Start the Confluent Platform on Docker:

    Download the docker-compose.yml file and run the docker compose command with the -d option to run in detached mode:

    docker-compose up -d

    You should see all the containers come up; you can confirm with docker-compose ps.
  4. Create Kafka topics

    Navigate to Control Center at http://localhost:9021. It may take a minute or two for Control Center to start and load. Click on the cluster.

    In the navigation menu, click Topics to open the topics list, then click the Add topic button.

    In the Topic name field, enter the topic name and click Create with defaults. Topic names are case-sensitive.

    Repeat the same steps to create the retry topic and the DLT (dead-letter) topic.

    You should see all the new topics in the Topics list. If you prefer declaring the topics from code instead, see the sketch below.
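    As an alternative to clicking through Control Center, Spring Kafka can create topics at application startup via NewTopic beans picked up by KafkaAdmin. A minimal sketch, assuming hypothetical topic names students, students-retry, and students-dlt (the project's actual topic names may differ):

        import org.apache.kafka.clients.admin.NewTopic;
        import org.springframework.context.annotation.Bean;
        import org.springframework.context.annotation.Configuration;
        import org.springframework.kafka.config.TopicBuilder;

        @Configuration
        public class TopicConfig {

            // Main topic; the name is illustrative, not necessarily the project's.
            @Bean
            public NewTopic studentsTopic() {
                return TopicBuilder.name("students").partitions(3).replicas(1).build();
            }

            // Retry topic, consumed when the first delivery attempt fails.
            @Bean
            public NewTopic studentsRetryTopic() {
                return TopicBuilder.name("students-retry").partitions(3).replicas(1).build();
            }

            // Dead-letter topic for records that exhaust all retries.
            @Bean
            public NewTopic studentsDltTopic() {
                return TopicBuilder.name("students-dlt").partitions(3).replicas(1).build();
            }
        }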
  5. Verify registered schema types:

    http://localhost:8081/schemas/types

    Response:

    [
     "JSON",
     "PROTOBUF",
     "AVRO"
    ]
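    The same verification can be scripted. A minimal sketch using Java's built-in HTTP client (Java 11+), assuming the Schema Registry is reachable at localhost:8081:

        import java.net.URI;
        import java.net.http.HttpClient;
        import java.net.http.HttpRequest;
        import java.net.http.HttpResponse;

        public class SchemaTypesCheck {
            public static void main(String[] args) throws Exception {
                // GET /schemas/types lists the schema formats the registry supports.
                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create("http://localhost:8081/schemas/types"))
                        .GET()
                        .build();
                HttpResponse<String> response = HttpClient.newHttpClient()
                        .send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(response.body()); // ["JSON","PROTOBUF","AVRO"]
            }
        }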
    
    
  6. Create the Avro schema: student.avsc

    The davidmc24 Gradle Avro plugin generates the Student POJO in the org.poc.kafka.avro.model package defined in the schema. This POJO has id, firstName, lastName, and contact properties. A sketch of using the generated class follows below.
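    Classes generated by the Avro plugin expose a builder whose build() call verifies that all required fields are set. A minimal sketch of constructing a Student; the field types shown here are assumptions, and the real student.avsc may declare them differently:

        import org.poc.kafka.avro.model.Student;

        public class StudentFactory {

            // Builds an instance of the Avro-generated Student class using
            // its generated builder; build() fails if a required field is unset.
            public static Student sampleStudent() {
                return Student.newBuilder()
                        .setId(1)
                        .setFirstName("John")
                        .setLastName("Doe")
                        .setContact("john.doe@example.com")
                        .build();
            }
        }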

  7. Run the springboot-kafka-avro-producer service.
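    Inside the producer service, publishing comes down to a KafkaTemplate send. A minimal sketch, assuming a topic named students and that the producer is configured with KafkaAvroSerializer and schema.registry.url in its application properties (the project's actual class and topic names may differ):

        import org.poc.kafka.avro.model.Student;
        import org.springframework.kafka.core.KafkaTemplate;
        import org.springframework.stereotype.Service;

        @Service
        public class StudentProducer {

            private final KafkaTemplate<String, Student> kafkaTemplate;

            public StudentProducer(KafkaTemplate<String, Student> kafkaTemplate) {
                this.kafkaTemplate = kafkaTemplate;
            }

            // The configured KafkaAvroSerializer registers the schema with the
            // Schema Registry (if needed) and serializes the record before sending.
            public void send(Student student) {
                kafkaTemplate.send("students", String.valueOf(student.getId()), student);
            }
        }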
  8. Open Swagger UI.
  9. Run the springboot-kafka-avro-consumer service.
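    On the consumer side, a listener along these lines would receive Student records and route failures through the retry and dead-letter topics. A minimal sketch using Spring Kafka's @RetryableTopic; the topic name, group id, and retry settings are assumptions, and the consumer is assumed to be configured with KafkaAvroDeserializer and specific.avro.reader=true:

        import org.poc.kafka.avro.model.Student;
        import org.springframework.kafka.annotation.KafkaListener;
        import org.springframework.kafka.annotation.RetryableTopic;
        import org.springframework.retry.annotation.Backoff;
        import org.springframework.stereotype.Service;

        @Service
        public class StudentConsumer {

            // Failed deliveries are retried on a retry topic and, once the
            // attempts are exhausted, forwarded to the dead-letter topic.
            @RetryableTopic(attempts = "3", backoff = @Backoff(delay = 2000))
            @KafkaListener(topics = "students", groupId = "student-consumer-group")
            public void consume(Student student) {
                System.out.printf("Consumed student: %s %s%n",
                        student.getFirstName(), student.getLastName());
            }
        }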
  10. Execute the Students API.
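    The Students API exposed through Swagger presumably maps the request payload onto the generated Student and hands it to the producer. A hypothetical sketch of such a controller; the endpoint path, request shape, and field types are assumptions, and StudentProducer is the class sketched in step 7:

        import org.poc.kafka.avro.model.Student;
        import org.springframework.http.ResponseEntity;
        import org.springframework.web.bind.annotation.PostMapping;
        import org.springframework.web.bind.annotation.RequestBody;
        import org.springframework.web.bind.annotation.RequestMapping;
        import org.springframework.web.bind.annotation.RestController;

        @RestController
        @RequestMapping("/api/v1/students")
        public class StudentController {

            private final StudentProducer producer;

            public StudentController(StudentProducer producer) {
                this.producer = producer;
            }

            // Hypothetical request DTO; the real API's payload may differ.
            public record StudentRequest(int id, String firstName, String lastName, String contact) {}

            @PostMapping
            public ResponseEntity<String> publish(@RequestBody StudentRequest req) {
                Student student = Student.newBuilder()
                        .setId(req.id())
                        .setFirstName(req.firstName())
                        .setLastName(req.lastName())
                        .setContact(req.contact())
                        .build();
                producer.send(student);
                return ResponseEntity.accepted().body("Student published to Kafka");
            }
        }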
  11. Check the springboot-kafka-avro-producer console log and the Students API response.
  12. Check the springboot-kafka-avro-consumer console log.
  13. You can inspect messages in Control Center by selecting a date/time and a partition, for example a message in the retry topic.

  14. You can verify the registered student.avsc Avro schema in the cluster's Schema Registry section. You can also list the registered subjects at http://localhost:8081/subjects.

Additional Links

These additional references should also help you: