Docker ZooKeeper image for the Confluent Open Source Platform using Oracle JDK
- 3.2.2 (3.2.2/Dockerfile)
- 3.3.0 (3.3.0/Dockerfile)
- 3.3.1, latest (3.3.1/Dockerfile)
All tag names follow the naming convention of the Confluent Open Source Platform
- Debian "slim" image variant
- Oracle JDK 8u152 added, without MissionControl, VisualVM, JavaFX, ReadMe files, source archives, etc.
- Oracle Java Cryptography Extension added
- Python 2.7.9-1 & pip 9.0.1 added
- SHA 256 sum checks for all downloads
- JAVA_HOME environment variable set up
- Utility scripts added:
  - Confluent utility belt script ('cub') - a Python CLI for a Confluent tool called docker-utils
  - Docker utility belt script ('dub')
- Apache ZooKeeper added:
  - version 3.4.9 in 3.2.2
  - version 3.4.10 in 3.3.0, 3.3.1 and latest
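The SHA 256 checks mentioned in the list above can be illustrated with a minimal sketch. The helper name `verify_sha256` is hypothetical and is not taken from the image's Dockerfile; it only shows the general pattern of comparing a downloaded file against an expected digest with `sha256sum`.

```shell
#!/bin/sh
# verify_sha256 is a hypothetical helper, not part of this image.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum -c reads "<digest>  <path>" lines from stdin and
  # exits non-zero when the digest does not match the file.
  echo "${expected}  ${file}" | sha256sum -c - >/dev/null 2>&1
}
```

In a build script, a failed check would typically abort the build, for example `verify_sha256 archive.tar.gz "$EXPECTED_SHA256" || exit 1`, where both the file name and the variable are placeholders.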
This image was created with the sole purpose of offering the Confluent Open Source Platform running on top of Oracle JDK. Therefore, it follows the same structure as the one from the original repository. More precisely:
Apart from the base image (mbe1224/confluent-base), it has Apache ZooKeeper added on top of it, installed using the following Confluent Debian package:
- confluent-kafka-2.11
Build the image
docker build -t mbe1224/confluent-zookeeper ./3.3.1/
Run the container
docker run -d \
  --net=host \
  --name=zookeeper \
  -e ZOOKEEPER_CLIENT_PORT=32181 \
  mbe1224/confluent-zookeeper
One can configure a ZooKeeper instance using environment variables. All configuration options from the official documentation can be used as long as the following naming rules are followed:
- upper case
- "." replaced with "_"
- snake case instead of camel case
- "ZOOKEEPER_" prefix
For example, in order to limit the number of concurrent connections a single client may make to a single member of the ZooKeeper ensemble, one has to modify the "maxClientCnxns" property, which translates to the "ZOOKEEPER_MAX_CLIENT_CNXNS" environment variable.
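The naming rules can be sketched as a small shell function. `to_env_name` is a hypothetical helper written for illustration; it only demonstrates the naming convention and says nothing about how the image itself reads the variables.

```shell
#!/bin/sh
# to_env_name is a hypothetical helper that applies the naming rules:
# camel case -> snake case, "." -> "_", upper case, "ZOOKEEPER_" prefix.
to_env_name() {
  name="$(printf '%s\n' "$1" \
    | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' \
    | tr '.' '_' \
    | tr '[:lower:]' '[:upper:]')"
  printf 'ZOOKEEPER_%s\n' "$name"
}

to_env_name maxClientCnxns   # prints ZOOKEEPER_MAX_CLIENT_CNXNS
to_env_name dataLogDir       # prints ZOOKEEPER_DATA_LOG_DIR
```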
The following default values are used:
# | Name | Default value | Meaning |
---|---|---|---|
1 | ZOOKEEPER_DATA_DIR | /var/lib/zookeeper/data | The location where ZooKeeper will store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database |
2 | ZOOKEEPER_DATA_LOG_DIR | /var/lib/zookeeper/log | The location of the transaction log. A separate directory allows a dedicated log device to be used and helps avoid competition between logging and snapshots |
3 | ZOOKEEPER_INIT_LIMIT | 10 | Timeout, in ticks, that limits how long the ZooKeeper servers in a quorum have to connect to a leader |
4 | ZOOKEEPER_LOG4J_ROOT_LOGLEVEL | INFO | - |
5 | ZOOKEEPER_SYNC_LIMIT | 5 | Timeout, in ticks, that limits how far out of date a server can be from a leader |
In addition to these, the following environment variables are added for replicated scenarios:
# | Name | Meaning |
---|---|---|
1 | ZOOKEEPER_GROUPS | Semicolon-separated list of indexed group identifiers used for forming hierarchical quorums |
2 | ZOOKEEPER_SERVER_ID | Node identifier |
3 | ZOOKEEPER_SERVERS | Semicolon-separated list of host:port1:port2 entries, where port1 is used for follower connections (if the current node is the leader) and port2 is used for server connections during the leader election phase |
4 | ZOOKEEPER_WEIGHTS | Semicolon-separated list of indexed weights used for forming quorums |
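As an illustration of the ZOOKEEPER_SERVERS format, the sketch below assembles the value for a three-node ensemble. `build_servers` is a hypothetical helper, the host names `zk-1` to `zk-3` are placeholders, and 2888/3888 are simply the ports conventionally used in ZooKeeper examples; none of these come from this image.

```shell
#!/bin/sh
# build_servers is a hypothetical helper that joins hosts into the
# "host:port1:port2;host:port1:port2" format described in the table.
# Usage: build_servers <port1> <port2> <host>...
build_servers() {
  follower_port="$1"
  election_port="$2"
  shift 2
  servers=""
  for host in "$@"; do
    # Prepend ";" only when the list already has an entry.
    servers="${servers:+${servers};}${host}:${follower_port}:${election_port}"
  done
  printf '%s\n' "$servers"
}

# Placeholder host names; each node would get its own ZOOKEEPER_SERVER_ID:
#   docker run -d --net=host \
#     -e ZOOKEEPER_SERVER_ID=1 \
#     -e ZOOKEEPER_CLIENT_PORT=32181 \
#     -e ZOOKEEPER_SERVERS="$(build_servers 2888 3888 zk-1 zk-2 zk-3)" \
#     mbe1224/confluent-zookeeper
build_servers 2888 3888 zk-1 zk-2 zk-3
```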
Furthermore, one can tweak the JVM heap size using the JVM_HEAP_SIZE environment variable, which has a default value of 2 GB.
For more information, see the official Apache ZooKeeper documentation.
For Kubernetes deployments using StatefulSets (or other replication objects), ZOOKEEPER_SERVER_ID can't be set up in advance. Therefore, one needs to set the IS_KUBERNETES environment variable to a non-null value. In this scenario, ZooKeeper's ID (myid) will be generated using the value of the HOSTNAME environment variable.
Nevertheless, if one uses Pods, then the usual setup can be used and the IS_KUBERNETES environment variable should be left unset.
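How the ID is derived from HOSTNAME is not described here, so the following is only a sketch of one common approach, under the assumption that the StatefulSet pod name ends in an ordinal (for example `zookeeper-0`). `myid_from_hostname` is a hypothetical helper and the image's actual logic may differ.

```shell
#!/bin/sh
# myid_from_hostname is a hypothetical helper; the image's real logic may differ.
# It assumes a StatefulSet-style hostname ending in "-<ordinal>".
myid_from_hostname() {
  ordinal="${1##*-}"   # text after the last "-", e.g. "2" in "zookeeper-2"
  case "$ordinal" in
    ''|*[!0-9]*)
      echo "no numeric ordinal in hostname: $1" >&2
      return 1
      ;;
  esac
  # StatefulSet ordinals start at 0, while ZooKeeper ids start at 1.
  echo $((ordinal + 1))
}

myid_from_hostname zookeeper-0   # prints 1
```

Inside a container this would be called as `myid_from_hostname "$HOSTNAME"`.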