Bring up a reference implementation Senzing stack on Kubernetes using minikube, kubectl, and helm.
A containerized Kafka and a PostgreSQL database are deployed as backing services for demonstration purposes.
These instructions illustrate a reference implementation of Senzing using PostgreSQL as the underlying database.
The instructions show how to set up a system that:
- Reads JSON lines from a file on the internet and sends each JSON line to a message queue using the Senzing stream-producer.
  - In this implementation, the queue is Kafka.
- Reads messages from the queue and inserts them into Senzing using the Senzing stream-loader.
  - In this implementation, Senzing keeps its data in a PostgreSQL database.
- Reads information from Senzing using the Senzing API server.
- Views resolved entities in a web app.
The following diagram shows the relationship of the Helm charts, docker containers, and code in this Kubernetes demonstration.
- Prerequisites
- Prerequisite software
- Clone repository
- Create demo directory
- Start minikube cluster
- View minikube cluster
- Set environment variables
- Identify Docker registry
- Create custom helm values files
- Create custom Kubernetes configuration files
- Save environment variables
- Create namespace
- Create persistent volume
- Add helm repositories
- Install PostgreSQL Helm chart
- Install pgAdmin Helm chart
- Initialize database
- Install Kafka Helm chart
- Install Kafka test client
- Demonstrate
- Cleanup
- Errors
- References
At Senzing, we strive to create GitHub documentation in a "don't make me think" style. For the most part, instructions are copy and paste. Whenever thinking is needed, it's marked with a "thinking" icon 🤔. Whenever customization is needed, it's marked with a "pencil" icon ✏️. If the instructions are not clear, please let us know by opening a new Documentation issue describing where we can improve. Now on with the show...
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
- Space: This repository and demonstration require 20 GB free disk space.
- Time: Budget 4 hours to get the demonstration up-and-running, depending on CPU and network speeds.
- Background knowledge: This repository assumes a working knowledge of Docker, Kubernetes, and Helm.
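Before starting, it can help to confirm that the command-line tools used throughout these instructions are installed. The following is a minimal sketch, not part of the repository; the tool list is an assumption drawn from the commands that appear in this demonstration, and `missing_tools` is a hypothetical helper name.

```shell
# Print the name of every listed command that is not found on the PATH.
# "missing_tools" is a hypothetical helper, not part of the Senzing repository.
missing_tools() {
  for missing_tool_name in "$@"; do
    if ! command -v "${missing_tool_name}" >/dev/null 2>&1; then
      echo "${missing_tool_name}"
    fi
  done
}

# Assumption: these are the tools invoked by the steps in this demonstration.
MISSING="$(missing_tools git curl minikube kubectl helm)"
if [ -n "${MISSING}" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Missing prerequisite software:" ${MISSING}
else
  echo "All prerequisite software found."
fi
```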
The Git repository has files that will be used in the helm install --values parameter.
-
Using these environment variable values:
export GIT_ACCOUNT=senzing
export GIT_REPOSITORY=kubernetes-demo
export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"
-
Follow steps in clone-repository to install the Git repository.
-
✏️ Create a unique prefix. This will be used in a local directory name as well as a prefix for Kubernetes object names.
⚠️ Because it's used in Kubernetes resource names, it must be all lowercase. Example:
export DEMO_PREFIX=my
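The lowercase requirement can be checked before continuing. The sketch below is an illustration only: `is_valid_prefix` is a hypothetical helper, and the rule it applies (lowercase letters, digits, and `-`, starting with a letter or digit) is a conservative reading of the Kubernetes DNS label naming convention rather than something this repository enforces.

```shell
# Return 0 if the value contains only lowercase letters, digits, and '-',
# and does not start with '-'; return 1 otherwise.
# Characters are listed explicitly to avoid locale-dependent ranges.
is_valid_prefix() {
  case "$1" in
    "") return 1 ;;
    -*) return 1 ;;
    *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Falls back to "my" (the example value above) when DEMO_PREFIX is not set.
if is_valid_prefix "${DEMO_PREFIX:-my}"; then
  echo "DEMO_PREFIX looks usable in Kubernetes resource names."
else
  echo "DEMO_PREFIX should be lowercase letters, digits, and '-' only."
fi
```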
-
Make a directory for the demo. Example:
export SENZING_DEMO_DIR=~/senzing-kafka-postgresql-demo-${DEMO_PREFIX}
mkdir -p ${SENZING_DEMO_DIR}
Using Get Started with Bitnami Charts using Minikube as a guide, start a minikube cluster.
-
Start cluster using minikube start. Example:
minikube start --cpus 4 --memory 8192 --disk-size=50g
🤔 Optional: View the minikube cluster using the dashboard.
-
Run command in a separate terminal using minikube dashboard. Example:
minikube dashboard
-
Set environment variables listed in "Clone repository".
-
Synthesize environment variables. Example:
export DEMO_NAMESPACE=${DEMO_PREFIX}-namespace
-
Retrieve docker image version numbers and set their environment variables. Example:
curl -X GET \
  --output ${SENZING_DEMO_DIR}/docker-versions-stable.sh \
  https://raw.githubusercontent.com/Senzing/knowledge-base/main/lists/docker-versions-stable.sh
source ${SENZING_DEMO_DIR}/docker-versions-stable.sh
-
Retrieve Helm Chart version numbers and set their environment variables. Example:
curl -X GET \
  --output ${SENZING_DEMO_DIR}/helm-versions-stable.sh \
  https://raw.githubusercontent.com/Senzing/knowledge-base/main/lists/helm-versions-stable.sh
source ${SENZING_DEMO_DIR}/helm-versions-stable.sh
-
🤔 Optional: To use a license other than the Senzing complimentary 100K record license, the SENZING_LICENSE_BASE64_ENCODED environment variable needs to be set. Note: Modify the path to a file containing the Senzing license in Base64 format. Example:
export SENZING_LICENSE_BASE64_ENCODED=$(cat /etc/opt/senzing/g2lic_base64.txt)
echo ${SENZING_LICENSE_BASE64_ENCODED}
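If only a raw license file is available, its Base64 form can be produced with the standard `base64` utility. This is a sketch under stated assumptions: `encode_base64_file` is a hypothetical helper, and the `LICENSE_FILE` path is a placeholder that must be replaced with the actual location of the license.

```shell
# Encode a file as a single line of Base64 (line wrapping removed).
# "encode_base64_file" is a hypothetical helper for illustration.
encode_base64_file() {
  base64 < "$1" | tr -d '\n'
}

# Assumption: placeholder path; replace with the real license file location.
LICENSE_FILE="${LICENSE_FILE:-/path/to/senzing-license-file}"
if [ -f "${LICENSE_FILE}" ]; then
  SENZING_LICENSE_BASE64_ENCODED="$(encode_base64_file "${LICENSE_FILE}")"
  export SENZING_LICENSE_BASE64_ENCODED
else
  echo "License file not found: ${LICENSE_FILE}"
fi
```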
🤔 There are 3 options when it comes to using a docker registry. Choose one:
Method #1: Pulls docker images from public internet registry.
-
Use the default public docker.io registry which pulls images from hub.docker.com. Example:
export DOCKER_REGISTRY_URL=docker.io
export DOCKER_REGISTRY_SECRET=${DOCKER_REGISTRY_URL}-secret
Method #2: Pulls docker images from a private registry.
-
✏️ Specify a private registry. Example:
export DOCKER_REGISTRY_URL=my.example.com:5000
export DOCKER_REGISTRY_SECRET=${DOCKER_REGISTRY_URL}-secret
export SENZING_SUDO=sudo
${GIT_REPOSITORY_DIR}/bin/docker-pull-tag-and-push.sh docker-images-for-helm-kafka-postgresql
Method #3: Pulls docker images from minikube's registry.
-
Use minikube's docker registry using minikube addons enable and minikube image load. Example:
minikube addons enable registry
export DOCKER_REGISTRY_URL=docker.io
export DOCKER_REGISTRY_SECRET=${DOCKER_REGISTRY_URL}-secret
${GIT_REPOSITORY_DIR}/bin/populate-minikube-registry.sh docker-images-for-helm-kafka-postgresql
For final customization of the Helm Charts, various files need to be created for use in the --values parameter of helm install.
🤔 In this step, Helm template files are populated with actual values. There are two methods of accomplishing this. Only one method needs to be performed.
-
Method #1: Helm template files are instantiated with actual values into the ${HELM_VALUES_DIR} directory by using make-helm-values-files.sh. Example:
export HELM_VALUES_DIR=${SENZING_DEMO_DIR}/helm-values
${GIT_REPOSITORY_DIR}/bin/make-helm-values-files.sh
-
Method #2: Copy and manually modify files method. Example:
export HELM_VALUES_DIR=${SENZING_DEMO_DIR}/helm-values
mkdir -p ${HELM_VALUES_DIR}
cp ${GIT_REPOSITORY_DIR}/helm-values-templates/* ${HELM_VALUES_DIR}
✏️ Edit files in ${HELM_VALUES_DIR} replacing the following variables with actual values.
${DEMO_PREFIX}
${DOCKER_REGISTRY_SECRET}
${DOCKER_REGISTRY_URL}
${SENZING_ACCEPT_EULA}
-
🤔 Optional: List newly generated files. Example:
ls ${HELM_VALUES_DIR}
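For the manual method (Method #2), the placeholder replacement can also be scripted rather than done by hand. The following is a minimal sketch using `sed`, not the repository's make-helm-values-files.sh; `render_template` is a hypothetical helper, and it assumes the variable values contain no `|`, `&`, or backslash characters.

```shell
# Replace the four ${...} placeholders in one template file and write the
# result to standard output. "render_template" is a hypothetical helper.
# Assumption: variable values contain no '|', '&', or '\' characters.
render_template() {
  sed \
    -e 's|\${DEMO_PREFIX}|'"${DEMO_PREFIX}"'|g' \
    -e 's|\${DOCKER_REGISTRY_SECRET}|'"${DOCKER_REGISTRY_SECRET}"'|g' \
    -e 's|\${DOCKER_REGISTRY_URL}|'"${DOCKER_REGISTRY_URL}"'|g' \
    -e 's|\${SENZING_ACCEPT_EULA}|'"${SENZING_ACCEPT_EULA}"'|g' \
    "$1"
}

# Render every template into HELM_VALUES_DIR, if the template directory exists.
TEMPLATE_DIR="${GIT_REPOSITORY_DIR}/helm-values-templates"
if [ -d "${TEMPLATE_DIR}" ] && [ -n "${HELM_VALUES_DIR}" ]; then
  mkdir -p "${HELM_VALUES_DIR}"
  for template_file in "${TEMPLATE_DIR}"/*; do
    render_template "${template_file}" > "${HELM_VALUES_DIR}/$(basename "${template_file}")"
  done
fi
```

Writing to standard output keeps the templates untouched, so the rendering can be repeated after changing a variable.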
Create Kubernetes manifest files for use with kubectl create.
🤔 In this step, Kubernetes template files are populated with actual values. There are two methods of accomplishing this. Only one method needs to be performed.
-
Method #1: Kubernetes manifest files are instantiated with actual values into the ${KUBERNETES_DIR} directory by using make-kubernetes-manifest-files.sh. Example:
export KUBERNETES_DIR=${SENZING_DEMO_DIR}/kubernetes
${GIT_REPOSITORY_DIR}/bin/make-kubernetes-manifest-files.sh
-
Method #2: Copy and manually modify files method. Example:
export KUBERNETES_DIR=${SENZING_DEMO_DIR}/kubernetes
mkdir -p ${KUBERNETES_DIR}
cp ${GIT_REPOSITORY_DIR}/kubernetes-templates/* ${KUBERNETES_DIR}
✏️ Edit files in ${KUBERNETES_DIR} replacing the following variables with actual values.
${DEMO_NAMESPACE}
The environment variables set so far will be needed again in new terminal windows. The save-environment-variables.sh script saves them to a file for that purpose.
-
Save environment variables into a file that can be sourced. Example:
${GIT_REPOSITORY_DIR}/bin/save-environment-variables.sh
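To illustrate the idea of a sourceable file, the sketch below writes `export` lines for a chosen set of variables. It is not the repository's save-environment-variables.sh and makes no claim about how that script works; `save_variables` and the output file name `environment-sketch.sh` are hypothetical, and values containing single quotes are not handled.

```shell
# Write one "export NAME='value'" line per named variable to the given file.
# "save_variables" is a hypothetical helper for illustration only.
# Assumption: values do not contain single quotes.
save_variables() {
  save_file="$1"
  shift
  : > "${save_file}"
  for save_name in "$@"; do
    # Look up the variable whose name is stored in save_name.
    eval "save_value=\${${save_name}-}"
    printf "export %s='%s'\n" "${save_name}" "${save_value}" >> "${save_file}"
  done
}

# Save a few of the variables defined in earlier steps, if the demo directory exists.
if [ -n "${SENZING_DEMO_DIR-}" ] && [ -d "${SENZING_DEMO_DIR}" ]; then
  save_variables "${SENZING_DEMO_DIR}/environment-sketch.sh" \
    DEMO_PREFIX DEMO_NAMESPACE HELM_VALUES_DIR KUBERNETES_DIR
fi
```

A file written this way can be loaded in a new terminal with `source`, in the same manner as the saved file used later in these instructions.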
A new Kubernetes namespace is created to isolate this demonstration from other applications running on Kubernetes.
-
Create Kubernetes namespace using kubectl create. Example:
kubectl create -f ${KUBERNETES_DIR}/namespace.yaml
-
🤔 Optional: Review namespaces using kubectl get. Example:
kubectl get namespaces
🤔 Optional: These steps for creating Persistent Volumes (PV) and Persistent Volume Claims (PVC) are for a demonstration environment. They are not sufficient for a production environment. If PVs and PVCs already exist, this step may be skipped.
Note: Senzing does not require Persistent Volumes. The volumes being created are for the PostgreSQL backing service.
-
Create persistent volumes using kubectl create. Example:
kubectl create -f ${KUBERNETES_DIR}/persistent-volume-postgresql.yaml
-
Create persistent volume claims using kubectl create. Example:
kubectl create -f ${KUBERNETES_DIR}/persistent-volume-claim-postgresql.yaml
-
🤔 Optional: Review persistent volumes and claims using kubectl get. Example:
kubectl get persistentvolumes \
  --namespace ${DEMO_NAMESPACE}
kubectl get persistentvolumeclaims \
  --namespace ${DEMO_NAMESPACE}
-
Add Helm repositories using helm repo add. Example:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add runix https://helm.runix.net
helm repo add senzing https://hub.senzing.com/charts/
-
Update repositories using helm repo update. Example:
helm repo update
-
🤔 Optional: Review repositories using helm repo list. Example:
helm repo list
🤔 This step installs a PostgreSQL database container. It is not a production-ready database and is only used for demonstration purposes. The choice of database is a limiting factor in the speed at which Senzing can operate. This database choice is at least an order of magnitude slower than a well-tuned production database.
In a production environment, a separate PostgreSQL database would be provisioned and maintained. The ${SENZING_DEMO_DIR}/helm-values/*.yaml files would then be updated to have the SENZING_DATABASE_URL point to the production database.
For this demonstration, the bitnami/postgresql Helm Chart provisions an instance of the bitnami/postgresql Docker image.
-
Create ConfigMap for pg_hba.conf using kubectl create. Example:
kubectl create configmap ${DEMO_PREFIX}-pg-hba \
  --namespace ${DEMO_NAMESPACE} \
  --from-file=${KUBERNETES_DIR}/pg_hba.conf
Note: pg_hba.conf will be stored in the PersistentVolumeClaim.
-
Install bitnami/postgresql chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-bitnami-postgresql \
  bitnami/postgresql \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/bitnami-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_BITNAMI_POSTGRESQL:-""}
-
Wait for pod to run using kubectl get. Example:
kubectl get pods \
  --namespace ${DEMO_NAMESPACE} \
  --watch
-
Example of pod running:
NAME                                     READY   STATUS    RESTARTS   AGE
my-bitnami-postgresql-6bf64cbbdf-25gtb   1/1     Running   0          10m
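As an alternative to watching the pod list interactively, a script can poll until a readiness check succeeds. The sketch below is a generic retry helper, not part of the repository; `wait_until` is a hypothetical name, and the `kubectl wait` usage shown in the comment is one possible check, which assumes every pod in the namespace is expected to become Ready.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  wait_attempts="$1"
  wait_delay="$2"
  shift 2
  while [ "${wait_attempts}" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    wait_attempts=$((wait_attempts - 1))
    sleep "${wait_delay}"
  done
  return 1
}

# Possible usage (assumption: all pods in the namespace should reach Ready):
#   wait_until 30 10 kubectl wait pods --all \
#     --for=condition=Ready \
#     --namespace "${DEMO_NAMESPACE}" \
#     --timeout=5s
```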
pgAdmin is a web-based user interface for viewing the PostgreSQL database.
-
Install runix/pgadmin4 chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-pgadmin \
  runix/pgadmin4 \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/pgadmin.yaml \
  --version ${SENZING_HELM_VERSION_RUNIX_PGADMIN4:-""}
-
To view PostgreSQL via pgAdmin, see View PostgreSQL.
senzing/init-postgresql is used to create Senzing tables in the database (i.e. the schema) and insert initial Senzing configuration.
-
Install senzing/senzing-init-postgresql chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-init-postgresql \
  senzing/senzing-init-postgresql \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-init-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_INIT_POSTGRESQL:-""}
The bitnami/kafka Helm Chart provisions an instance of the bitnami/kafka Docker image.
-
Install bitnami/kafka chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-bitnami-kafka \
  bitnami/kafka \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/bitnami-kafka.yaml \
  --version ${SENZING_HELM_VERSION_BITNAMI_KAFKA:-""}
-
Install Kafka test client app. Example:
helm install \
  ${DEMO_PREFIX}-confluentinc-cp-kafka \
  senzing/confluentinc-cp-kafka \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/confluentinc-cp-kafka.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_CONFLUENTINC_CP_KAFKA:-""}
-
Wait for pods to run using kubectl get. Note: Kafka will crash and restart until Zookeeper is up and running. Example:
kubectl get pods \
  --namespace ${DEMO_NAMESPACE} \
  --watch
-
To view Kafka, see View Kafka.
Now that all of the prerequisites are in place, it's time to bring up a system that uses Senzing.
The Senzing API server receives HTTP requests to read and modify Senzing data.
-
Install senzing/senzing-api-server chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-api-server \
  senzing/senzing-api-server \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-api-server-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_API_SERVER:-""}
-
To view Senzing API server, see View Senzing API Server.
The stream producer pulls JSON lines from a file and pushes them to a message queue.
-
Install senzing/senzing-stream-producer chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-stream-producer \
  senzing/senzing-stream-producer \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-stream-producer-kafka.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_STREAM_PRODUCER:-""}
The stream loader pulls messages from a message queue and sends them to Senzing.
-
Install senzing/senzing-stream-loader chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-stream-loader \
  senzing/senzing-stream-loader \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-stream-loader-kafka-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_STREAM_LOADER:-""}
The senzing-console will be used later to inspect mounted volumes, debug issues, or run command-line tools.
-
Install senzing/senzing-console chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-console \
  senzing/senzing-console \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-console-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_CONSOLE:-""}
-
To use senzing-console pod, see View Senzing Console pod.
The redoer pulls Senzing redo records from the Senzing database and reprocesses them.
-
Install senzing/senzing-redoer chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-redoer \
  senzing/senzing-redoer \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-redoer-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_REDOER:-""}
The Senzing Entity Search WebApp is a light-weight WebApp demonstrating Senzing search capabilities.
-
Install senzing/senzing-entity-search-web-app chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-entity-search-web-app \
  senzing/senzing-entity-search-web-app \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-entity-search-web-app.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_ENTITY_SEARCH_WEB_APP:-""}
-
Wait for pods to run using kubectl get. Example:
kubectl get pods \
  --namespace ${DEMO_NAMESPACE} \
  --watch
-
To view Senzing Entity Search WebApp, see View Senzing Entity Search WebApp.
These charts are not necessary for the demonstration, but may be valuable in a production environment.
The SwaggerUI is a micro-service for viewing the Senzing REST OpenAPI specification in a web browser.
-
Install senzing/swaggerapi-swagger-ui chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-swaggerapi-swagger-ui \
  senzing/swaggerapi-swagger-ui \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/swaggerapi-swagger-ui.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_SWAGGERAPI_SWAGGER_UI:-""}
-
To view SwaggerUI, see View SwaggerUI.
The Senzing Configurator is a micro-service for changing Senzing configuration.
-
Install senzing/senzing-configurator chart using helm install. Example:
helm install \
  ${DEMO_PREFIX}-senzing-configurator \
  senzing/senzing-configurator \
  --namespace ${DEMO_NAMESPACE} \
  --values ${HELM_VALUES_DIR}/senzing-configurator-postgresql.yaml \
  --version ${SENZING_HELM_VERSION_SENZING_CONFIGURATOR:-""}
-
To view Senzing Configurator, see View Senzing Configurator.
-
Because some of the Kubernetes Services use LoadBalancer, a minikube tunnel is needed for LoadBalancer access. Example:
minikube tunnel
-
✏️ When using a separate terminal window in each of the examples below, set environment variables. Note: Replace ${DEMO_PREFIX} with the actual DEMO_PREFIX value. Example:
source ~/senzing-kafka-postgresql-demo-${DEMO_PREFIX}/environment.sh
-
Username and password for the following sites are the values seen in the corresponding "values" YAML file located in the helm-values-templates directory.
-
In a separate terminal window, run the test client. Example:
export KAFKA_TEST_POD_NAME=$(kubectl get pods \
  --namespace ${DEMO_NAMESPACE} \
  --output jsonpath="{.items[0].metadata.name}" \
  --selector "app.kubernetes.io/name=confluentinc-cp-kafka, \
              app.kubernetes.io/instance=${DEMO_PREFIX}-confluentinc-cp-kafka" \
  )
kubectl exec \
  -it \
  --namespace ${DEMO_NAMESPACE} \
  ${KAFKA_TEST_POD_NAME} -- /usr/bin/kafka-console-consumer \
    --bootstrap-server ${DEMO_PREFIX}-bitnami-kafka:9092 \
    --topic senzing-kafka-topic \
    --from-beginning
pgAdmin is a web-based user interface for viewing the PostgreSQL database.
-
In a separate terminal window, port forward to local machine using kubectl port-forward. Example:
kubectl port-forward \
  --address 0.0.0.0 \
  --namespace ${DEMO_NAMESPACE} \
  svc/${DEMO_PREFIX}-pgadmin-pgadmin4 9171:80
-
PostgreSQL will be viewable at localhost:9171.
- Login
  - See ${SENZING_DEMO_DIR}/helm-values/pgadmin.yaml for pgadmin email and password (env.email and env.password)
  - Default: username: postgres password: postgres
- On left-hand navigation, select:
  - Servers > senzing > databases > G2 > schemas > public > tables
- The records received from the queue can be viewed in the following Senzing tables:
  - DSRC_RECORD
  - OBS_ENT
The Senzing API server receives HTTP requests to read and modify Senzing data.
-
In a separate terminal window, port forward to local machine using kubectl port-forward. Example:
kubectl port-forward \
  --address 0.0.0.0 \
  --namespace ${DEMO_NAMESPACE} \
  svc/${DEMO_PREFIX}-senzing-api-server 8250:80
-
Make HTTP calls using curl. Example:
export SENZING_API_SERVICE=http://localhost:8250
curl -X GET ${SENZING_API_SERVICE}/heartbeat
curl -X GET ${SENZING_API_SERVICE}/license
curl -X GET ${SENZING_API_SERVICE}/entities/1
The Senzing Entity Search WebApp is a light-weight WebApp demonstrating Senzing search capabilities.
-
In a separate terminal window, port forward to local machine using kubectl port-forward. Example:
kubectl port-forward \
  --address 0.0.0.0 \
  --namespace ${DEMO_NAMESPACE} \
  svc/${DEMO_PREFIX}-senzing-entity-search-web-app 8251:80
-
Senzing Entity Search WebApp will be viewable at localhost:8251. The demonstration instructions will give a tour of the Senzing web app.
The senzing-console is used to inspect mounted volumes, debug issues, or run command-line tools.
-
In a separate terminal window, log into Senzing Console pod using kubectl exec. Example:
export CONSOLE_POD_NAME=$(kubectl get pods \
  --namespace ${DEMO_NAMESPACE} \
  --output jsonpath="{.items[0].metadata.name}" \
  --selector "app.kubernetes.io/name=senzing-console, \
              app.kubernetes.io/instance=${DEMO_PREFIX}-senzing-console" \
  )
kubectl exec -it --namespace ${DEMO_NAMESPACE} ${CONSOLE_POD_NAME} -- /bin/bash
The SwaggerUI is a micro-service for viewing the Senzing REST OpenAPI specification in a web browser.
-
In a separate terminal window, port forward to local machine using kubectl port-forward. Example:
kubectl port-forward \
  --address 0.0.0.0 \
  --namespace ${DEMO_NAMESPACE} \
  svc/${DEMO_PREFIX}-swaggerapi-swagger-ui 9180:80
Then visit http://localhost:9180.
The Senzing Configurator is a micro-service for changing Senzing configuration.
-
If the Senzing configurator was deployed, in a separate terminal window port forward to local machine using kubectl port-forward. Example:
kubectl port-forward \
  --address 0.0.0.0 \
  --namespace ${DEMO_NAMESPACE} \
  svc/${DEMO_PREFIX}-senzing-configurator 8253:80
-
Make HTTP calls using curl. Example:
export SENZING_API_SERVICE=http://localhost:8253
curl -X GET ${SENZING_API_SERVICE}/datasources
The following commands remove the Senzing Demo application from Kubernetes.
Delete Kubernetes artifacts using helm uninstall, helm repo remove, and kubectl delete.
-
Example:
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-configurator
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-swaggerapi-swagger-ui
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-redoer
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-entity-search-web-app
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-api-server
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-stream-loader
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-init-postgresql
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-stream-producer
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-confluentinc-cp-kafka
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-bitnami-kafka
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-pgadmin
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-bitnami-postgresql
helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-senzing-console
helm repo remove senzing
helm repo remove runix
helm repo remove bitnami
kubectl delete -f ${KUBERNETES_DIR}/persistent-volume-claim-postgresql.yaml
kubectl delete -f ${KUBERNETES_DIR}/persistent-volume-postgresql.yaml
kubectl delete -f ${KUBERNETES_DIR}/namespace.yaml
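Because the uninstall commands differ only in the release name, they can be generated from a list and reviewed before anything is deleted. The sketch below only prints commands; `print_uninstall_commands` is a hypothetical helper, and the release names are the ones installed in the steps above.

```shell
# Print one "helm uninstall" command per release suffix, in the order given.
# Nothing is executed; the output is meant to be reviewed first.
# "print_uninstall_commands" is a hypothetical helper for illustration.
print_uninstall_commands() {
  for release_suffix in "$@"; do
    echo "helm uninstall --namespace ${DEMO_NAMESPACE} ${DEMO_PREFIX}-${release_suffix}"
  done
}

print_uninstall_commands \
  senzing-configurator \
  swaggerapi-swagger-ui \
  senzing-redoer \
  senzing-entity-search-web-app \
  senzing-api-server \
  senzing-stream-loader \
  senzing-init-postgresql \
  senzing-stream-producer \
  confluentinc-cp-kafka \
  bitnami-kafka \
  pgadmin \
  bitnami-postgresql \
  senzing-console
```

Once the printed commands look right, they can be copied into the terminal or the output piped to a shell.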
Delete minikube artifacts using minikube stop and minikube delete.
-
Example:
minikube stop
minikube delete
- See docs/errors.md.