Skip to content

Latest commit

 

History

History
265 lines (246 loc) · 15.5 KB

08-Branch.md

File metadata and controls

265 lines (246 loc) · 15.5 KB

Background

The Branch feature is used to copy a database to another Wescale cluster or a local cluster. In a production environment, you can use Branch to create a database identical to the one in the development environment. This document will explain how to use the Branch function.Branch is an "umbrella" command. The action sub-command defines the operation on the workflow. Each action can have their own options.

General Syntax of Branch Command

The Branch operation is executed by the vtctld server, which maintains a gRPC connection with vttablets. Essentially, vtctld acts like a commander.

Note : If you want to use the Branch function, please ensure that the vtctld server on the target end is open.

Branch -- <options> <action>

action := [Prepare | Start | Stop | PrepareMergeBack | StartMergeBack | Cleanup | Schemadiff]
-- Each action has its own options that can be used to control its behavior.

Actions

Branch commands contain Prepare,Start,Stop,PrepareMergeBack,StartMergeBack and SchemaDiff command.

If your goal is simply to copy a database, please focus on the 'Prepare', 'Start', and 'Stop' commands. These commands can be used to complete the database copy workflow

The 'PrepareMergeBack' and 'StartMergeBack' commands are typically used to merge the target database schema from the target database into the source database.

Before the tutorial of Branch, We will use this environment as our primary example throughout this tutorial. Execute the following SQL commands to create the database and tables:

mysql -h127.0.0.1 -P15306 -e 'create database if not exists branch_source'

mysql -h127.0.0.1 -P15306 -e 'create table if not exists branch_source.product(
                                sku varchar(128),
                                description varchar(128),
                                price bigint,
                                primary key(sku)
                              ) ENGINE=InnoDB;
                              create table if not exists branch_source.customer(
                                customer_id bigint not null auto_increment,
                                email varchar(128),
                                primary key(customer_id)
                              ) ENGINE=InnoDB;
                              create table if not exists branch_source.corder(
                                order_id bigint not null auto_increment,
                                customer_id bigint,
                                sku varchar(128),
                                price bigint,
                                primary key(order_id)
                              ) ENGINE=InnoDB;
                              CREATE TABLE if not exists branch_source.user (
                                  id INT AUTO_INCREMENT PRIMARY KEY auto_increment,
                                  name VARCHAR(255) NOT NULL
                              ) ENGINE=InnoDB;'

Prepare

In the preparation phase, vtctld inserts the job and the corresponding table_rules into mysql.branch_jobs and mysql.branch_table_rules. You can customize a filter by modifying the information in mysql.branch_table_rules.

Branch -- 
--source_database=<source_database> 
--target_database=<target_database> 
--workflow_name=<workflow_name> 
[--source_topo_url=<source_topo_url>] 
[--tablet_types=<source_typelet_type>] 
[--cells=<cells>] 
[--include=<tables>]
[--exclude=<tables>]
[--skip_copy_phase =<true/false>]
[--stop_after_copy=<true/false>]
[--default_filter_rules=<filter_rule>]
Prepare
  • source_database : Specify the target database name.
  • target_database : Specify the target database name (it will be automatically created if it does not exist).
  • workflow_name : Specify Branch workflow (Use it to query branch status)
  • source_topo_url : The source cluster's topology server URL (default is the local topology server).
  • tablet_types : The data source tablet type (e.g., REPLICA, PRIMARY, etc.).
  • include : Specify tables to include from the source database.
  • exclude : Specify tables to exclude from the source database.
  • skip_copy_phase : Only copy the schema, not the data.
  • stop_after_copy : Continuously synchronize data between source and target if true.
  • default_filter_rules : Append conditions to the 'where' clause to filter specific data.

Usage

vtctlclient --server localhost:15999 Branch -- --source_database branch_source --target_database branch_target --skip_copy_phase=false  --stop_after_copy=false --workflow_name
 branch_test --default_filter_rules "RAND()<0.1" Prepare

successfully create branch workflow : branch_test sourceDatabase : branch_source targetDatabase : branch_target
rules : 
[source_table:"corder" target_table:"corder" filtering_rule:"select * from corder WHERE RAND()<0.1" create_ddl:"copy" merge_ddl:"copy" default_filter_rules:"RAND()<0.1"]
[source_table:"customer" target_table:"customer" filtering_rule:"select * from customer WHERE RAND()<0.1" create_ddl:"copy" merge_ddl:"copy" default_filter_rules:"RAND()<0.1"]
[source_table:"product" target_table:"product" filtering_rule:"select * from product WHERE RAND()<0.1" create_ddl:"copy" merge_ddl:"copy" default_filter_rules:"RAND()<0.1"]
[source_table:"user" target_table:"user" filtering_rule:"select * from `user` WHERE RAND()<0.1" create_ddl:"copy" merge_ddl:"copy" default_filter_rules:"RAND()<0.1"]

data transformer

Before the data Stream starts, you can perform data transformer by modifying the filtering_rule attribute in the mysql.branch_table_rules table. We are compatible with the content in gofakeit. You can use gofakeit_generate to generate a specific string or gofakeit_bytype to generate a specific type. The filtering rule will be executed on the source MySQL instance, and transformation and filtering will be carried out on the vttablet.

usage

update mysql.branch_table_rules set filtering_rule='select id, gofakeit_generate(\'{firstname}:###:???:{moviename}\') as name from user WHERE id<=100' where source_table_name = 'user';
update mysql.branch_table_rules set filtering_rule='select customer_id, gofakeit_bytype(\'regex\',\'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$\') as email from `customer` WHERE customer_id<=100' where source_table_name = 'customer';
update mysql.branch_table_rules set filtering_rule='select sku,description,gofakeit_bytype(\'intrange\',110,150) as price,gofakeit_bytype(\'floatrange\',23.5,23.9) as weight from `product`' where source_table_name = 'product';
update mysql.branch_table_rules set filtering_rule='SELECT order_id,gofakeit_bytype(\'bigint\') as customer_id,gofakeit_generate(\'{firstname}:###:???:{moviename}\') as sku,gofakeit_bytype(\'bigint\') as price FROM corder where customer_id<=100' where source_table_name = 'corder';

Start

Start will start a stream between source side and the target side. The target side will perform ETL operations based on table_rules. Before start a Branch

Branch -- --workflow_name=<workflow_name> Start

usage

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test Start

Start workflow:branch_test successfully.

Stop

Stop will stop the previous stream, and you can use Start to restart the stream again

Branch -- --workflow_name=<workflow_name> Stop

usage

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test Stop

Start workflow branch_test successfully

Note : If the branch copy date is complete and stop_after_copy is set to true, the Stop function will become ineffective.

SchemaDiff

SchemaDiff displays the differences between two database schemas. This tool aids in deciding whether to merge changes back into the source schema.

Branch -- --workflow_name=<workflow_name> SchemaDiff

usage

mysql -h127.0.0.1 -P15306 -e 'alter table branch_target.product add column v2 int;'
mysql -h127.0.0.1 -P15306 -e 'alter table branch_target.product add column v3 int;'

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test schemadiff

table product is diff
branch_target:name:"product" schema:"CREATE TABLE `product` (\n  `sku` varchar(128) NOT NULL,\n  `description` varchar(128) DEFAULT NULL,\n  `price` bigint DEFAULT NULL,\n  `v2` int DEFAULT NULL,\n  `v3` int DEFAULT NULL,\n  PRIMARY KEY (`sku`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci" primary_key_columns:"sku" type:"BASE TABLE" data_length:16384
branch_source:name:"product" schema:"CREATE TABLE `product` (\n  `sku` varchar(128) NOT NULL,\n  `description` varchar(128) DEFAULT NULL,\n  `price` bigint DEFAULT NULL,\n  PRIMARY KEY (`sku`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci" primary_key_columns:"sku" type:"BASE TABLE" data_length:1589248 row_count:5294

PrepareMergeBack

PrepareMergeBack will construct the differences between the source database and the target database, then generate executable DDL and add it to the branch_table_rules table.

Branch -- --workflow_name=<workflow_name> PrepareMergeBack

usage

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test PrepareMergeBack

PrepareMergeBack branch_test successfully 
table: product entry: ALTER TABLE `product` ADD COLUMN `v2` int, ADD COLUMN `v3` int

StartMergeBack

PrepareMergeBack will execute the DDLs that are generated in the PrepareMergeBack stage using the 'online' strategy.

Branch -- --workflow_name=<workflow_name> StartMergeBack

usage

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test StartMergeBack

Start mergeBack branch_test successfully. uuid list:
[c057f330_b1e0_11ee_b7b2_5ea977d56bb7]

After initiating StartMergeBack, we receive a UUID for OnlineDDL. This UUID can then be used with the command show vitess_migrations like [uuid] to query the status of the onlineDDL.

mysql> show vitess_migrations like 'c057f330_b1e0_11ee_b7b2_5ea977d56bb7' \G;
*************************** 1. row ***************************
                             id: 1
                 migration_uuid: c057f330_b1e0_11ee_b7b2_5ea977d56bb7
                       keyspace: branch_source
                          shard: 0
                   mysql_schema: branch_source
                    mysql_table: product
            migration_statement: alter table product add column v2 int, add column v3 int
                       strategy: online
                        options: 
                added_timestamp: 2024-01-13 14:55:35
            requested_timestamp: 2024-01-13 14:55:35
                ready_timestamp: NULL
              started_timestamp: 2024-01-13 14:55:36
             liveness_timestamp: 2024-01-13 14:55:42
            completed_timestamp: 2024-01-13 14:55:43.427318
              cleanup_timestamp: NULL
               migration_status: complete
           status_before_paused: NULL
                       log_path: 
                      artifacts: _vt_HOLD_c354897cb1e011ee9dd55ea977d56bb7_20240114065540,_c057f330_b1e0_11ee_b7b2_5ea977d56bb7_20240113145536_vrepl,
                        retries: 0
                         tablet: zone1-0000000100
                 tablet_failure: 0
                       progress: 100
              migration_context: vtctl:c05515b6-b1e0-11ee-b7b2-5ea977d56bb7
                     ddl_action: alter
                        message: 
                    eta_seconds: 0
                    rows_copied: 5000
                     table_rows: 5000
              added_unique_keys: 0
            removed_unique_keys: 0
                       log_file: 
       retain_artifacts_seconds: 86400
            postpone_completion: 0
       removed_unique_key_names: 
dropped_no_default_column_names: 
          expanded_column_names: 
               revertible_notes: 
               allow_concurrent: 0
                  reverted_uuid: 
                        is_view: 0
              ready_to_complete: 1
                 stowaway_table: 
      vitess_liveness_indicator: 1705128939
            user_throttle_ratio: 0
                   special_plan: 
       last_throttled_timestamp: NULL
            component_throttled: 
            cancelled_timestamp: NULL
                postpone_launch: 0
                          stage: re-enabling writes
               cutover_attempts: 1
         is_immediate_operation: 0
             reviewed_timestamp: 2024-01-13 14:55:36
1 row in set (0.00 sec)

Cleanup

The Cleanup function will delete items related to workflow_name from both branch_jobs table and branch_table_rules table.

Branch -- --workflow_name=<workflow_name> Cleanup

usage

vtctlclient --server localhost:15999 Branch -- --workflow_name branch_test cleanup

cleanup workflow:branch_test successfully

gofakeit fucntion

Provides gofakeit_generate and gofakeit_bytype functions that are fully compatible with Mysql's different types of data generation.

gofakeit_generate

The gofakeit_generate function follows the pattern of gofakeit.generate to generate random strings. You can find more information about it on the gofakeit GitHub page.

gofakeit_bytype

gofakeit_bytype supports the following functions:

Type Description Usage Example
tinyint Generates a random tiny integer. gofakeit_bytype('tinyint')
smallint Generates a random small integer. gofakeit_bytype('smallint')
mediumint Generates a random medium integer. gofakeit_bytype('mediumint')
int Generates a random integer. gofakeit_bytype('int')
integer Alias for int. gofakeit_bytype('integer')
bigint Generates a random big integer. gofakeit_bytype('bigint')
float Generates a random float. gofakeit_bytype('float')
double Generates a random double. gofakeit_bytype('double')
decimal Generates a random decimal. gofakeit_bytype('decimal', 5, 2)
date Generates a random date. gofakeit_bytype('date')
datetime Generates a random datetime. gofakeit_bytype('datetime')
timestamp Generates a random timestamp. gofakeit_bytype('timestamp')
time Generates a random time. gofakeit_bytype('time')
year Generates a random year. gofakeit_bytype('year')
floatrange Generates a random float within a specified range. gofakeit_bytype('floatrange', 1.5, 10.0)
intrange Generates a random int within a specified range. gofakeit_bytype('intrange', 110, 150)
name Generates a random name. gofakeit_bytype('name')
address Generates a random address. gofakeit_bytype('address')
uuid Generates a random UUID. gofakeit_bytype('uuid')
regex Generates a string matching a regex pattern. gofakeit_bytype('regex', '[a-zA-Z]{5}')