Merge pull request #766 from Altinity/2.3.0
2.3.0
subkanthi authored Aug 26, 2024
2 parents 33bc1da + 9e11b61 commit eaba052
Showing 32 changed files with 4,584 additions and 107 deletions.
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -59,6 +59,7 @@ First two are good tutorials on MySQL and PostgreSQL respectively.
* [Logging](doc/logging.md)
* [Production Setup](doc/production_setup.md)
* [Adding new tables (Incremental Snapshot)](doc/incremental_snapshot.md)
* [Configuration](doc/configuration.md)

### Operations

3 changes: 3 additions & 0 deletions doc/Troubleshooting.md
@@ -55,3 +55,6 @@ https://stackoverflow.com/questions/63523998/multiple-debezium-connector-for-one

### PostgreSQL - ERROR - Error starting connector io.debezium.DebeziumException: Creation of replication slot failed; when setting up multiple connectors for the same database host, please make sure to use a distinct replication slot name for each.
Make sure to add `slot.name` to the configuration (`config.yml`) and set it to a name that is unique for each connector.
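
For example, in each connector's `config.yml` (the slot names below are just illustrations; pick any distinct names):

```yaml
# Connector 1 config.yml
slot.name: "db1_slot"
```

```yaml
# Connector 2 config.yml
slot.name: "db2_slot"
```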

### PostgreSQL (WAL size growing)
[Handling PostgreSQL WAL Growth with Debezium Connectors](postgres_wal_growth.md)
65 changes: 33 additions & 32 deletions doc/configuration.md

Large diffs are not rendered by default.

41 changes: 41 additions & 0 deletions doc/postgres_wal_growth.md
@@ -0,0 +1,41 @@
## Handling PostgreSQL WAL Growth with Debezium Connectors
Credits: https://medium.com/@pawanpg0963/postgres-replication-lag-using-debezium-connector-4ba50e330cd6

A common problem with PostgreSQL is unbounded WAL growth, which can occur when using Debezium connectors for change data capture.
The WAL grows when the connector sends no data through the replication slot: the slot's position never advances, so PostgreSQL cannot recycle the retained WAL segments.
You can observe this by checking the replication slot lag with the following query:
```sql
postgres=# SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS replicationSlotLag,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS confirmedLag, active FROM pg_replication_slots;
slot_name | replicationslotlag | confirmedlag | active
-----------+--------------------+--------------+--------
db1_slot | 20 GB | 16 GB | t
db2_slot | 62 MB | 42 MB | t
(2 rows)
```

This issue can be addressed with the `heartbeat.interval.ms` configuration.

### Solution
Create a heartbeat schema and table in PostgreSQL:
```sql
-- The schema must exist before the table can be created in it
CREATE SCHEMA IF NOT EXISTS heartbeat;

CREATE TABLE heartbeat.pg_heartbeat (
    random_text TEXT,
    last_update TIMESTAMP
);
```
Insert an initial row into the heartbeat table:
```sql
INSERT INTO heartbeat.pg_heartbeat (random_text, last_update) VALUES ('test_heartbeat', NOW());
```
Add the table to the existing publication(s) used by the connector:
```sql
ALTER PUBLICATION db1_pub ADD TABLE heartbeat.pg_heartbeat;
ALTER PUBLICATION db2_pub ADD TABLE heartbeat.pg_heartbeat;
```
Grant the replication user used by the connector privileges on the `heartbeat` schema and the `pg_heartbeat` table.
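For example, assuming the replication user is named `replicator` (substitute the user your connector is actually configured with):

```sql
-- The heartbeat action query runs an UPDATE, so the user needs
-- USAGE on the schema plus SELECT and UPDATE on the table
GRANT USAGE ON SCHEMA heartbeat TO replicator;
GRANT SELECT, UPDATE ON heartbeat.pg_heartbeat TO replicator;
```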
Add the following configuration to `config.yml`:
```yaml
heartbeat.interval.ms: "10000"
heartbeat.action.query: "UPDATE heartbeat.pg_heartbeat SET last_update=NOW();"
```
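
After restarting the connector with the heartbeat settings, you can verify (for example) that the heartbeat row advances and the slot lag stays bounded:

```sql
-- last_update should advance roughly every heartbeat.interval.ms
SELECT random_text, last_update FROM heartbeat.pg_heartbeat;

-- The confirmed lag should stop growing once heartbeats flow
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS confirmed_lag
FROM pg_replication_slots;
```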
18 changes: 15 additions & 3 deletions doc/production_setup.md
@@ -1,9 +1,12 @@
## Production setup
![](img/production_setup.jpg)


[Throughput & Memory Usage](#improving-throughput-andor-memory-usage) \
[Initial Load](#initial-load) \
[PostgreSQL Setup](#postgresql-production-setup)

### Improving throughput and/or Memory usage
As shown in the diagram above, several components buffer messages and
can be tuned to improve throughput and/or reduce memory usage.

@@ -43,7 +46,7 @@ in terms of number of elements the queue can hold and the maximum size of the qu
buffer.flush.time.ms: "1000"
```

## Snapshots (Out of Memory)
## Initial Load

The following parameters might be useful to reduce the memory usage of the connector during the snapshotting phase.

@@ -62,3 +65,12 @@ The maximum number of rows that the connector fetches and reads into memory when

**snapshot.max.threads**: Increase this number from 1 to a higher value to enable parallel snapshotting.

**Single Threaded (Low Memory/Slow replication)**:
Setting `single.threaded: true` in `config.yml` makes replication bypass the sink connector queue and thread pool
and insert batches directly from the Debezium queue.
This mode works on lower-memory setups but reduces replication speed.
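
For example, in `config.yml`:

```yaml
# Bypass the sink connector queue/threadpool; lower memory, slower replication
single.threaded: true
```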

## PostgreSQL Production Setup

A common problem with PostgreSQL is unbounded WAL growth; see
[Handling PostgreSQL WAL Growth with Debezium Connectors](postgres_wal_growth.md) for mitigation steps.
22 changes: 22 additions & 0 deletions release-notes/2.3.0.md
@@ -0,0 +1,22 @@
## What's Changed
* Update Monitoring.md to include metrics.port by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/735
* Added integration test for PostgreSQL keepermap by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/728
* Enable metrics by default for MySQL and postgres by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/750
* Added documentation to set the replication start position by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/753
* Added documentation for using JAR file for postgres replication by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/754
* Update quickstart.md by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/763
* Update quickstart.md update docker tag. by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/764
* Removed schema history configuration settings for postgres by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/755
* 725 we cant start grafana in sink connector we have cert issue by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/760
* Added JMX exporter to export JMX metrics from debezium. by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/757
* Disable validation of source database by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/716
* Added Integration test to validate truncate event replication in postgresql by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/759
* Fix set lsn to accept string and not long by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/752
* Added logic in retrying database in case of failure by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/761
* 630 postgres heartbeat setup documentation by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/765
* 628 add integration test for mariadb by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/673
* Added functionality to replicate in single threaded mode based on configuration without using a Queue by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/756
* Convert localDateTime to String in show_replica_status API call by @subkanthi in https://github.com/Altinity/clickhouse-sink-connector/pull/736


**Full Changelog**: https://github.com/Altinity/clickhouse-sink-connector/compare/2.2.1...2.3.0
7 changes: 4 additions & 3 deletions sink-connector-lightweight/dependency-reduced-pom.xml
@@ -42,6 +42,7 @@
</goals>
<configuration>
<transformers>
<transformer />
<transformer>
<mainClass>com.altinity.clickhouse.debezium.embedded.ClickHouseDebeziumEmbeddedApplication</mainClass>
</transformer>
@@ -118,7 +119,7 @@
<dependency>
<groupId>io.debezium</groupId>
<artifactId>debezium-connector-mongodb</artifactId>
<version>2.7.0.Alpha2</version>
<version>2.7.0.Beta2</version>
<scope>test</scope>
<exclusions>
<exclusion>
@@ -300,13 +301,13 @@
<version.testcontainers>1.19.1</version.testcontainers>
<surefire-plugin.version>3.0.0-M7</surefire-plugin.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<sink-connector-library-version>0.0.8</sink-connector-library-version>
<sink-connector-library-version>0.0.9</sink-connector-library-version>
<version.junit>5.9.1</version.junit>
<maven.compiler.source>17</maven.compiler.source>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<version.checkstyle.plugin>3.1.1</version.checkstyle.plugin>
<maven.compiler.target>17</maven.compiler.target>
<version.debezium>2.7.0.Alpha2</version.debezium>
<version.debezium>2.7.0.Beta2</version.debezium>
<quarkus.platform.group-id>io.quarkus.platform</quarkus.platform.group-id>
</properties>
</project>
14 changes: 14 additions & 0 deletions sink-connector-lightweight/docker/Dockerfile_grafana
@@ -0,0 +1,14 @@
# Start from the official Grafana image
FROM grafana/grafana:latest
USER root
# Add your CA certificate to the system's trusted certificates
# If you have multiple certificates, you can copy them all
COPY ca-certificates.crt /usr/local/share/ca-certificates/ca-certificates.crt
RUN apk add --no-cache ca-certificates && \
update-ca-certificates
# Install the Grafana plugin
# Replace 'your-plugin-id' with the actual plugin ID
#RUN grafana-cli --pluginUrl https://your-plugin-repository.com/plugins/your-plugin-id plugins install your-plugin-id

# Start Grafana
CMD ["/run.sh"]
