wikimedia-sampler

A toy example of a Kafka to OpenSearch / ElasticSearch pipeline for WikiMedia data written in pure FP Scala 3 with Cats Effect, fs2-kafka and opensearch-java client.

It consists of:

producer: a Kafka producer that reads from the WikiMedia event stream and writes to a kafka topic
consumer: a Kafka consumer that reads from a Kafka topic and writes to OpenSearch

Setup

Install scala-cli

Start VM (optional)

# MacOS only, not needed if you have Docker Desktop or similar
./colima.sh

Run local infra.

docker-compose -f ./docker-compose.yml up

Run the app

# start producer process
scala-cli ./sampler -- produce
# start consumer process
scala-cli ./sampler -- consume
# run the producer & consumer processes concurrently
scala-cli ./sampler -- produce-consume
# for more options
scala-cli ./sampler -- help

You can inspect processed data in:
- Kafka UI available at http://localhost:8080
- OpenSearch-Dashboards / Kibana console available at http://localhost:5601/app/dev_tools#/console

Potential improvements

parametrize more OpenSearch client options and move them to the CLI level
use Avro or other format instead of JSON
kafka consumer graceful shutdown
use explicit mapping instead of dynamic one in OpenSearch
create Grafana / Kibana dashboards
add more tests

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
sampler		sampler
.gitignore		.gitignore
.scalafmt.conf		.scalafmt.conf
README.md		README.md
colima.sh		colima.sh
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sampler

sampler

.gitignore

.gitignore

.scalafmt.conf

.scalafmt.conf

README.md

README.md

colima.sh

colima.sh

docker-compose.yml

docker-compose.yml

Repository files navigation

wikimedia-sampler

Setup

Potential improvements

About

Releases

Packages

Languages

grzegorz-bielski/wikimedia-sampler

Folders and files

Latest commit

History

Repository files navigation

wikimedia-sampler

Setup

Potential improvements

About

Topics

Resources

Stars

Watchers

Forks

Languages