-
Notifications
You must be signed in to change notification settings - Fork 990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replication: Support GTID tag in PreviousGTIDsEvent #952
base: master
Are you sure you want to change the base?
Conversation
Issue: ref go-mysql-org#845 The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to work with tagged GTIDs. First the `uuidCount` has changed, it encodes the GTID format. Here format 1 is tagged and format 0 is untagged. Then each entry may have a tag. If there is a tag then the uuid itself isn't printed but the tag is appended to the last entry. Examples: `896e7882-18fe-11ef-ab88-22222d34d411:1-3` regular format, compatible with both formats `896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1` tagged format. Combination of - `896e7882-18fe-11ef-ab88-22222d34d411:1-4` - `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1` `896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2` Combination of: ``` 896e7882-18fe-11ef-ab88-22222d34d411:1-4 :aaaa:1 🔤1-3 :bbbbb:1 :bbbbbb:1 ❌1, 896e7882-18fe-11ef-ab88-22222d34d412:1-2 ``` Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this. See also: - https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure Example 3 in PR description has the same format as MySQL. In https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html , there's
GTIDs originating from the same server but having different tags are treated in a manner similar to those originating from different servers, like this:
3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21, 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_2:8-52
rather than 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:Domain_2:8-52
. I guess the reason is if Domain_2
is replaced by tag 23
, we will see 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:23:8-52
, which is ambiguous.
It looks like my example might be correct and the MySQL docs might not be in sync with the actual behavior. Note that From https://dev.mysql.com/doc/dev/mysql-server/latest/classmysql_1_1gtid_1_1Tag.html
|
And with the exact GTIDs from the docs:
|
I've filed https://bugs.mysql.com/bug.php?id=116789 for this |
I also filed https://bugs.mysql.com/bug.php?id=116747 some days ago |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a pure code-related review. I'll check GTID encoding and decoding soon
replication/event.go
Outdated
if format == GtidFormatTagged { | ||
tagLength := int(data[pos]) / 2 | ||
pos += 1 | ||
if tagLength > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the tag have zero length? please add comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mysql-9.1.0> set gtid_next = 'AUTOMATIC:abc';
Query OK, 0 rows affected (0.00 sec)
mysql-9.1.0> create table test.t9 (id int primary key);
Query OK, 0 rows affected (0.02 sec)
mysql-9.1.0> set gtid_next = 'AUTOMATIC:';
ERROR 1774 (HY000): Malformed GTID specification 'AUTOMATIC:'.
https://dev.mysql.com/doc/dev/mysql-server/latest/classmysql_1_1gtid_1_1Tag.html#details
Representation of the GTID tag.
Tag format: [a-z_]{a-z0-9_}{0,31} Tag may be created from a given text. Text may contain leading and trailing spaces which will be omitted during Tag creation. Text may also contain uppercase characters. Acceptable format for text is as follows: [:space:][a-zA-Z_][a-zA-Z0-9_]{0,31}[:space:]
So it is non-zero if it has a tag. Note the {0,31}
in the regexp, which is only for the second character as the first can't be a number.
@@ -254,9 +298,14 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error { | |||
} | |||
intervals[i] = interval | |||
} | |||
previousGTIDSets[i] = fmt.Sprintf("%s:%s", uuid, strings.Join(intervals, ":")) | |||
if isTag { | |||
previousGTIDSets[currentSetnr-1] += fmt.Sprintf(":%s:%s", tag, strings.Join(intervals, ":")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because string is immutable, string +=
will create a new string object each time. If there are too many tags the performance will drop. Maybe store every "tag+intervals" and join them once at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should try to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe try this
diff --git a/replication/event.go b/replication/event.go
index 20c7b0c..0786516 100644
--- a/replication/event.go
+++ b/replication/event.go
@@ -277,7 +277,9 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error {
pos += 8
previousGTIDSets := make([]string, uuidCount)
+
currentSetnr := 0
+ var buf strings.Builder
for range previousGTIDSets {
uuid := e.decodeUuid(data[pos : pos+16])
pos += 16
@@ -292,30 +294,38 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error {
pos += tagLength
}
}
+
+ if isTag {
+ buf.WriteString(":")
+ buf.WriteString(tag)
+ } else {
+ if currentSetnr != 0 {
+ buf.WriteString(",")
+ }
+ buf.WriteString(uuid)
+ currentSetnr += 1
+ }
+
sliceCount := binary.LittleEndian.Uint16(data[pos : pos+8])
pos += 8
- intervals := make([]string, sliceCount)
- for i := range intervals {
+ for range sliceCount {
+ buf.WriteString(":")
+
start := e.decodeInterval(data[pos : pos+8])
pos += 8
stop := e.decodeInterval(data[pos : pos+8])
pos += 8
- interval := ""
if stop == start+1 {
- interval = fmt.Sprintf("%d", start)
+ fmt.Fprintf(&buf, "%d", start)
} else {
- interval = fmt.Sprintf("%d-%d", start, stop-1)
+ fmt.Fprintf(&buf, "%d-%d", start, stop-1)
}
- intervals[i] = interval
}
- if isTag {
- previousGTIDSets[currentSetnr-1] += fmt.Sprintf(":%s:%s", tag, strings.Join(intervals, ":"))
- } else {
- previousGTIDSets[currentSetnr] = fmt.Sprintf("%s:%s", uuid, strings.Join(intervals, ":"))
+ if !isTag {
currentSetnr += 1
}
}
- e.GTIDSets = strings.Join(previousGTIDSets[:currentSetnr], ",")
+ e.GTIDSets = buf.String()
return nil
}
This could also be useful: https://databaseblog.myname.nl/2024/12/mysql-gtid-tags-and-binlog-events.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm
// https://github.com/mysql/mysql-server/blob/61a3a1d8ef15512396b4c2af46e922a19bf2b174/sql/rpl_gtid_set.cc#L1363-L1378 | ||
func decodeSid(data []byte) (format GtidFormat, sidnr uint64) { | ||
gtidinfo := make([]byte, 8) | ||
copy(gtidinfo, data[:8]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the modification to gtidinfo
happens only when it's GtidFormatTagged
. So we can postpone the data copy until checked it's GtidFormatTagged
...
sid_mask := []byte{0, 255, 255, 255, 255, 255, 255, 0} | ||
|
||
// Apply the mask | ||
for i, _ := range gtidinfo[:8] { | ||
gtidinfo[i] &= sid_mask[i] | ||
} | ||
gtidinfo = append(gtidinfo, 0) | ||
|
||
// sidnr | ||
sidnr = binary.LittleEndian.Uint64(gtidinfo[1:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and seems we only need to prepare a [8]byte and copy data[1:7] to it? Then binary.LittleEndian.Uint64
can work
isTag := false | ||
var tag string | ||
if format == GtidFormatTagged { | ||
tagLength := int(data[pos]) / 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how this ... / 2
happens in MySQL's source code. Seems it's https://github.com/mysql/mysql-server/blob/61a3a1d8ef15512396b4c2af46e922a19bf2b174/libs/mysql/gtid/tag.cpp#L84 and it further calls a complex read_varlen_bytes
.
Just to double check with you, did you simplify read_varlen_bytes
into int(data[pos]) / 2
? If not, I think I can help to provide more test data with different tag length and check our calculation is correct
uuid := e.decodeUuid(data[pos : pos+16]) | ||
pos += 16 | ||
isTag := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems isTag
can be replaced by len(tag) > 0
@@ -254,9 +298,14 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error { | |||
} | |||
intervals[i] = interval | |||
} | |||
previousGTIDSets[i] = fmt.Sprintf("%s:%s", uuid, strings.Join(intervals, ":")) | |||
if isTag { | |||
previousGTIDSets[currentSetnr-1] += fmt.Sprintf(":%s:%s", tag, strings.Join(intervals, ":")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe try this
diff --git a/replication/event.go b/replication/event.go
index 20c7b0c..0786516 100644
--- a/replication/event.go
+++ b/replication/event.go
@@ -277,7 +277,9 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error {
pos += 8
previousGTIDSets := make([]string, uuidCount)
+
currentSetnr := 0
+ var buf strings.Builder
for range previousGTIDSets {
uuid := e.decodeUuid(data[pos : pos+16])
pos += 16
@@ -292,30 +294,38 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error {
pos += tagLength
}
}
+
+ if isTag {
+ buf.WriteString(":")
+ buf.WriteString(tag)
+ } else {
+ if currentSetnr != 0 {
+ buf.WriteString(",")
+ }
+ buf.WriteString(uuid)
+ currentSetnr += 1
+ }
+
sliceCount := binary.LittleEndian.Uint16(data[pos : pos+8])
pos += 8
- intervals := make([]string, sliceCount)
- for i := range intervals {
+ for range sliceCount {
+ buf.WriteString(":")
+
start := e.decodeInterval(data[pos : pos+8])
pos += 8
stop := e.decodeInterval(data[pos : pos+8])
pos += 8
- interval := ""
if stop == start+1 {
- interval = fmt.Sprintf("%d", start)
+ fmt.Fprintf(&buf, "%d", start)
} else {
- interval = fmt.Sprintf("%d-%d", start, stop-1)
+ fmt.Fprintf(&buf, "%d-%d", start, stop-1)
}
- intervals[i] = interval
}
- if isTag {
- previousGTIDSets[currentSetnr-1] += fmt.Sprintf(":%s:%s", tag, strings.Join(intervals, ":"))
- } else {
- previousGTIDSets[currentSetnr] = fmt.Sprintf("%s:%s", uuid, strings.Join(intervals, ":"))
+ if !isTag {
currentSetnr += 1
}
}
- e.GTIDSets = strings.Join(previousGTIDSets[:currentSetnr], ",")
+ e.GTIDSets = buf.String()
return nil
}
Issue: ref #845
The
PreviousGTIDsEvent
/PREVIOUS_GTIDS_LOG_EVENT
has changed to work with tagged GTIDs.First the
uuidCount
has changed, it encodes the GTID format. Here format 1 is tagged and format 0 is untagged.Then each entry may have a tag. If there is a tag then the uuid itself isn't printed but the tag is appended to the last entry.
Example 1
896e7882-18fe-11ef-ab88-22222d34d411:1-3
regular format, compatible with both formats
Example 2
896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1
tagged format.
Combination of
896e7882-18fe-11ef-ab88-22222d34d411:1-4
896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1
Example 3
896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2
Combination of:
Please also see:
mysqlbinlog --read-from-remote-server --hexdump $binlogfile
to see how MySQL encodes/decodes this.See also: