Merge pull request #915 from Altinity/add_feature_matrix_doc
Add feature matrix doc
Showing 4 changed files with 127 additions and 0 deletions.
@@ -0,0 +1,20 @@
| Feature                         | Altinity Sink Connector (Lightweight, Single Binary) | Airbyte                            | ClickHouse `mysql` Table Engine        | Custom Python Script with ClickHouse Connect  |
|---------------------------------|------------------------------------------------------|------------------------------------|----------------------------------------|-----------------------------------------------|
| **Replication Type**            | Real-time CDC                                        | Batch (scheduled)                  | Direct query                           | Batch or scheduled                            |
| **Data Freshness**              | Near real-time                                       | Configurable (e.g., hourly)        | Near real-time (with latency)          | Configurable                                  |
| **Schema Change Handling**      | Full support (MySQL), partial (PostgreSQL)           | Manual schema refresh required     | No automatic schema sync               | Manual intervention needed                    |
| **Complexity**                  | Low to medium (single binary setup)                  | Moderate                           | Low                                    | High (requires coding and scheduling)         |
| **Ease of Setup**               | Easy (standalone binary, no Kafka needed)            | Easy                               | Very easy                              | Complex (custom coding)                       |
| **Maintenance**                 | Low to moderate (single binary process)              | Low                                | Low                                    | High                                          |
| **Initial Sync Support**        | Yes                                                  | Yes                                | Not applicable (direct query)          | Yes                                           |
| **Transformation Capabilities** | Limited                                              | Basic (Airbyte transformations)    | No                                     | Full control (custom code)                    |
| **Cost**                        | Free or license-based                                | Free (open source)                 | Free (built into ClickHouse)           | Free (but may require custom infrastructure)  |
| **Suitability for High Volume** | High                                                 | Medium                             | Medium                                 | Medium to low                                 |
| **Additional Infrastructure**   | None                                                 | None                               | None                                   | Optional (scheduling tools like Airflow)      |
| **Data Accuracy**               | High (real-time CDC)                                 | Medium (depends on sync frequency) | Medium                                 | High                                          |
| **Ideal Use Case**              | Low-latency, real-time replication without Kafka     | Batch syncs, easy setup            | Simple queries without replication     | Custom, flexible ETL                          |

| Feature                         | Altinity Sink Connector (Lightweight, Single Binary) | Airbyte                        |
|---------------------------------|------------------------------------------------------|--------------------------------|
@@ -0,0 +1,76 @@
## MySQL Data Types
Refer to the [Debezium documentation](https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-supported-data-types) for details on the supported data types.

| MySQL              | Debezium                                             | ClickHouse                      |
|--------------------|------------------------------------------------------|---------------------------------|
| Bigint             | INT64\_SCHEMA                                        | Int64                           |
| Bigint Unsigned    | INT64\_SCHEMA                                        | UInt64                          |
| Blob               |                                                      | String + hex                    |
| Char               | String                                               | String / LowCardinality(String) |
| Date               | Schema: INT64<br>Name: debezium.Date                 | Date(6)                         |
| DateTime(0/1/2/3)  | Schema: INT64<br>Name: debezium.Timestamp            | DateTime64(0/1/2/3)             |
| DateTime(4/5/6)    | Schema: INT64<br>Name: debezium.MicroTimestamp       | DateTime64(4/5/6)               |
| Decimal(30,12)     | Schema: Bytes<br>Name: kafka.connect.data.Decimal    | Decimal(30,12)                  |
| Double             |                                                      | Float64                         |
| Int                | INT32                                                | Int32                           |
| Int Unsigned       | INT64                                                | UInt32                          |
| Longblob           |                                                      | String + hex                    |
| Mediumblob         |                                                      | String + hex                    |
| Mediumint          | INT32                                                | Int32                           |
| Mediumint Unsigned | INT32                                                | UInt32                          |
| Smallint           | INT16                                                | Int16                           |
| Smallint Unsigned  | INT32                                                | UInt16                          |
| Text               | String                                               | String                          |
| Time               |                                                      | String                          |
| Time(6)            |                                                      | String                          |
| Timestamp          |                                                      | DateTime64                      |
| Tinyint            | INT16                                                | Int8                            |
| Tinyint Unsigned   | INT16                                                | UInt8                           |
| varbinary(\*)      |                                                      | String + hex                    |
| varchar(\*)        |                                                      | String                          |
| JSON               |                                                      | String                          |
| BYTES              | BYTES, io.debezium.bits                              | String                          |
| YEAR               | INT32                                                | Int32                           |
| GEOMETRY           | Binary of WKB                                        | String                          |
| SET                |                                                      | Array(String)                   |
| ENUM               |                                                      | Array(String)                   |
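
The Debezium column of the table names logical types whose wire values need decoding before they fit the ClickHouse types on the right. The following is a minimal Python sketch of that decoding, not the connector's own code. It assumes the standard semantics of these logical types: `io.debezium.time.Timestamp` carries milliseconds since the Unix epoch, `io.debezium.time.MicroTimestamp` carries microseconds, and `org.apache.kafka.connect.data.Decimal` carries the unscaled value as big-endian two's-complement bytes with the scale stored in the schema. The function names are hypothetical, and the lowercase hex output for binary columns is an assumption about what "String + hex" means.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timestamp_millis(value: int) -> datetime:
    """Decode a Debezium Timestamp (DateTime(0..3)): milliseconds since the epoch."""
    return EPOCH + timedelta(milliseconds=value)


def decode_timestamp_micros(value: int) -> datetime:
    """Decode a Debezium MicroTimestamp (DateTime(4..6)): microseconds since the epoch."""
    return EPOCH + timedelta(microseconds=value)


def decode_connect_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a Kafka Connect Decimal: big-endian signed unscaled integer plus a scale."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Build the Decimal from its parts so that wide values such as
    # Decimal(30,12) are not rounded to the default context precision.
    return Decimal((sign, digits, -scale))


def binary_to_hex_string(raw: bytes) -> str:
    """Render a Blob/varbinary value as a hex string (assumed: lowercase, no prefix)."""
    return raw.hex()
```

For example, `decode_connect_decimal((123456).to_bytes(3, "big", signed=True), 2)` yields `Decimal("1234.56")`, which would then be written to a ClickHouse `Decimal` column of matching precision and scale.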

## PostgreSQL Data Types

| PostgreSQL Type                   | Notes                                                        |
|-----------------------------------|--------------------------------------------------------------|
| `SMALLINT`                        |                                                              |
| `INTEGER`                         | Supported                                                    |
| `BIGINT`                          | Supported                                                    |
| `NUMERIC`                         | Supported                                                    |
| `REAL`                            | Supported                                                    |
| `DOUBLE PRECISION`                | Supported                                                    |
| `BOOLEAN`                         | Supported                                                    |
| `CHAR(n)`                         | Supported                                                    |
| `VARCHAR(n)`                      | Supported                                                    |
| `TEXT`                            | Supported                                                    |
| `BYTEA`                           | Supported                                                    |
| `DATE`                            | Supported                                                    |
| `TIME [ WITHOUT TIME ZONE ]`      | Supported                                                    |
| `TIME WITH TIME ZONE`             | Supported                                                    |
| `TIMESTAMP [ WITHOUT TIME ZONE ]` | Supported                                                    |
| `TIMESTAMP WITH TIME ZONE`        | Supported                                                    |
| `INTERVAL`                        | Supported                                                    |
| `UUID`                            | Supported                                                    |
| `INET`                            | Supported                                                    |
| `MACADDR`                         | Supported                                                    |
| `JSON`                            | Supported                                                    |
| `JSONB`                           | Supported                                                    |
| `HSTORE`                          | Supported                                                    |
| `ENUM`                            | Supported                                                    |
| `ARRAY`                           | Supported, but arrays of unsupported types are not supported |
| `GEOMETRY` (PostGIS)              | Not supported                                                |
| `GEOGRAPHY` (PostGIS)             | Not supported                                                |
| `CITEXT`                          | Supported                                                    |
| `BIT`                             | Not supported                                                |
| `BIT VARYING`                     | Not supported                                                |
| `MONEY`                           | Not supported                                                |
| `XML`                             | Not supported                                                |
| `OID`                             | Not supported                                                |
| `UNSUPPORTED` | Types other than those listed are not supported | |
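
The rules in this table, including the array rule and the catch-all last row, can be expressed as a small pre-flight check over a schema's column types. The sketch below is illustrative only: `support_status` and the two sets are hypothetical names, not part of the connector. `SMALLINT` is reported as `"unspecified"` because its Notes cell is empty in the table. User-defined enum types carry their own names, so the check only recognizes the literal name `ENUM`.

```python
import re

# Types marked "Supported" in the table above.
SUPPORTED = {
    "INTEGER", "BIGINT", "NUMERIC", "REAL", "DOUBLE PRECISION", "BOOLEAN",
    "CHAR", "VARCHAR", "TEXT", "BYTEA", "DATE",
    "TIME", "TIME WITHOUT TIME ZONE", "TIME WITH TIME ZONE",
    "TIMESTAMP", "TIMESTAMP WITHOUT TIME ZONE", "TIMESTAMP WITH TIME ZONE",
    "INTERVAL", "UUID", "INET", "MACADDR", "JSON", "JSONB", "HSTORE",
    "ENUM", "CITEXT",
}
# Types listed in the table without a stated status.
UNSPECIFIED = {"SMALLINT"}


def support_status(pg_type: str) -> str:
    """Return 'supported', 'not supported', or 'unspecified' for a PostgreSQL type name."""
    name = pg_type.strip().upper()
    if name.endswith("[]"):
        # An array inherits the status of its element type.
        return support_status(name[:-2])
    name = re.sub(r"\s*\([^)]*\)", "", name)  # drop modifiers such as (n) or (p, s)
    name = re.sub(r"\s+", " ", name)          # normalize internal whitespace
    if name in SUPPORTED:
        return "supported"
    if name in UNSPECIFIED:
        return "unspecified"
    # Catch-all row: anything not listed as supported is not supported.
    return "not supported"
```

Running this over the column types of a source schema before starting replication gives an early list of columns that would need to be excluded or cast on the source side.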
@@ -0,0 +1,26 @@
## Features

| Feature | Description |
| ------- | --------- |
| Single Binary | No additional dependencies or infrastructure required |
| Exactly Once Processing | Offsets are committed to ClickHouse after the messages are written to ClickHouse |
| Supported Databases | MySQL, MariaDB, PostgreSQL, MongoDB (experimental) |
| Supported ClickHouse Versions | 24.8 and above |
| ClickHouse Table Types | ReplacingMergeTree, MergeTree, ReplicatedReplacingMergeTree |
| Replication Start Positioning | Use sink-connector-client to start replication from a specific position (MySQL binlog position or PostgreSQL LSN) |
| Supported Datatypes | Refer to [Datatypes](https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-supported-data-types) |
| Initial Data Load | Scripts to perform the initial data load (MySQL) |
| Fault Tolerance | Sink Connector Client continues replication from the last committed offset/LSN after a failure |
| Update, Delete | Supported with ReplacingMergeTree |
| Monitoring | Prometheus metrics, Grafana dashboard |
| Schema Evolution | DDL support for MySQL |
| Deployment Models | Docker Compose, Java JAR file, Kubernetes |
| Start, Stop, Pause, Resume Replication | Supported using sink-connector-client |
| Filter source databases, tables, columns | Supported using Debezium configuration |
| Map source databases to different ClickHouse databases | Database name overrides supported |
| Column name overrides | Planned |
| MySQL extensive DDL support | [Full list of supported DDL](sink-connector-lightweight/docs/mysql-ddl-support.md) |
| Replication Lag Monitoring | Grafana dashboard and a view to monitor lag |
| Batch inserts to ClickHouse | Configurable batch size and thread pool size to tune for high throughput or low latency |
| MySQL Generated/Alias/Materialized Columns | Supported |
| Auto create tables| Tables are automatically created in ClickHouse based on the source table structure. |
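
The "Exactly Once Processing", "Fault Tolerance", and "Update, Delete" rows describe one ordering rule: write the batch first, commit the offset second, and resume from the last committed offset after a failure. The simulation below is a rough sketch of why that ordering is safe. It is not the connector's implementation; `FakeClickHouse` and `replicate` are hypothetical, and the in-memory dictionary only imitates ReplacingMergeTree's behavior of keeping the row with the highest version per key (the real engine deduplicates by sorting key at merge time or when queried with `FINAL`). A crash between the write and the commit causes the batch to be replayed, and the version-based replacement makes the replay harmless.

```python
from dataclasses import dataclass, field


@dataclass
class FakeClickHouse:
    """In-memory stand-in for a ReplacingMergeTree table plus a stored offset."""
    rows: dict = field(default_factory=dict)  # key -> (version, payload)
    committed_offset: int = -1

    def write(self, events):
        for offset, key, payload in events:
            current = self.rows.get(key)
            # Keep the row with the highest version; the source offset is the version.
            if current is None or offset >= current[0]:
                self.rows[key] = (offset, payload)

    def commit(self, offset):
        self.committed_offset = offset


def replicate(source, sink, batch_size=2, crash_before_commit_at=None):
    """Copy events from source to sink, committing the offset only after each write."""
    start = sink.committed_offset + 1  # resume after the last committed offset
    for i in range(start, len(source), batch_size):
        end = min(i + batch_size, len(source))
        batch = [(off, *source[off]) for off in range(i, end)]
        sink.write(batch)
        last = batch[-1][0]
        if crash_before_commit_at == last:
            raise RuntimeError("simulated crash after write, before offset commit")
        sink.commit(last)


# Each source event is (key, payload); its list index serves as the offset.
source = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
sink = FakeClickHouse()
try:
    replicate(source, sink, crash_before_commit_at=3)
except RuntimeError:
    pass
offset_after_crash = sink.committed_offset  # second batch written but not committed

replicate(source, sink)  # restart: replays offsets 2 and 3, then commits
```

After the restart the sink holds one row per key and the committed offset has caught up with the source, even though the second batch was written twice.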