PS-9148 feature: Dictionary caching for Masking Functions Component (8.4) #5541

Open
wants to merge 10 commits into
base: 8.4

Conversation

percona-ysorokin
Collaborator

percona-ysorokin and others added 10 commits January 7, 2025 01:50
The fix for PS-9453
"percona_telemetry causes a long wait on COND_thd_list due to the absence of the root user"
(commit 27468f8)
partially reverted as a preparation step for cherry-picking
Bug #34741098
"component::deinit() will block if calling any registry update APIs"
(commit mysql/mysql-server@d39330f).
update APIs

1. A no_lock version of registry and registry_registration service
   is implemented which provides the same functionality without
   taking any lock on the registry.
2. MySQL command service is updated to use either the lock or
   no_lock version based on the new flag no_lock_registry added in
   mcs_ext.
3. The no_lock_registry flag is set to true by health monitor query
   thread before calling MySQL command service APIs (close, connect).

Change-Id: I8ebf8f07cffb8ddc4de0f17c48c10cb15be7dad8
Re-applied the fix for PS-9453
"percona_telemetry causes a long wait on COND_thd_list due to the absence of the root user"
(commit e53363d)
partially reverted previously.
After changes in 'cssm_begin_connect()' instead of cherry-picking from 8.0 branch the modified patch from 8.4 was taken.

This is the finalization step for cherry-picking
Bug #34741098
"component::deinit() will block if calling any registry update APIs"
(commit mysql/mysql-server@d39330f).
https://perconadev.atlassian.net/browse/PS-9551

Fixed a problem in 'mysql_command_services_imp::set()'.
When the user sets the 'MYSQL_COMMAND_LOCAL_THD_HANDLE' option very early after
server startup (when 'srv_session_server_is_available()' still returns false),
'service->open(nullptr, nullptr)' may return nullptr, and it is unsafe to use it afterwards.

Fixed by checking for nullptr and returning early.
…ries in Command Services

https://perconadev.atlassian.net/browse/PS-9537

Fixed a problem with re-using the same connection created via
'mysql_command_factory->init()' / 'mysql_command_factory->connect()'
in multiple calls to 'mysql_command_query->query()'.

The problem seems to be a regression introduced by upstream in their fix for
Bug #34323788
"ASAN memory leaks reported by test_mysql_command_services_component.test"
(commit mysql/mysql-server@5dc1a14).

Inside 'csi_advanced_command()', 'mcs_extn->consumer_srv_data', when it is
not nullptr, cannot simply be re-used, as its `mcs_extn->data` member
has already been set to nullptr by `std::exchange()` inside `csi_read_rows()`.

Fixed by calling 'factory_srv->end()' and 'factory_srv->start()' in this case.
Removed some outdated Clang 5 warning suppressions.
…g_functions

https://perconadev.atlassian.net/browse/PS-9148

- Added caching of mysql.masking_dictionaries table content.
- Implemented masking_dictionaries_flush() UDF which flushes data
  from the masking dictionaries table to the memory cache.

PS-9148 feature: Add masking_functions.masking_database sys var support

https://perconadev.atlassian.net/browse/PS-9148

The masking_functions.masking_database system variable for the
masking_functions component specifies the database used for data
masking dictionaries.

PS-9148 feature: Implement dictionary flusher for masking_functions plugin

https://perconadev.atlassian.net/browse/PS-9148

- Added component_masking.dictionaries_flush_interval_seconds system
  variable.
- Added the actual flusher thread. It periodically rereads the content of
  the dictionary table and updates the in-memory cache.

PS-9148 feature: Implemented hierarchical storage for dictionaries and terms

https://perconadev.atlassian.net/browse/PS-9148

Introduced 'dictionary' and 'bookshelf' classes for storing terms on
per-dictionary level.
Reworked 'query_cache' to utilize these two new classes.
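
The two-level storage described above can be pictured roughly as follows. This is only a minimal sketch, not the component's actual code: the class names 'dictionary_sketch' and 'bookshelf_sketch' and all method signatures are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for a single dictionary: a set of unique terms.
class dictionary_sketch {
 public:
  // Returns true if the term was not present before.
  bool insert(const std::string &term) { return terms_.insert(term).second; }
  bool contains(const std::string &term) const {
    return terms_.count(term) != 0;
  }

 private:
  std::unordered_set<std::string> terms_;
};

// Hypothetical stand-in for the bookshelf: dictionaries keyed by name.
class bookshelf_sketch {
 public:
  // Creates the dictionary on demand, then adds the term to it.
  bool insert(const std::string &dictionary_name, const std::string &term) {
    return dictionaries_[dictionary_name].insert(term);
  }
  bool contains(const std::string &dictionary_name,
                const std::string &term) const {
    const auto it = dictionaries_.find(dictionary_name);
    return it != dictionaries_.end() && it->second.contains(term);
  }
  // Drops a whole dictionary; returns true if it existed.
  bool remove(const std::string &dictionary_name) {
    return dictionaries_.erase(dictionary_name) != 0;
  }

 private:
  std::unordered_map<std::string, dictionary_sketch> dictionaries_;
};
```

Keeping terms grouped per dictionary means that removing a dictionary is a single erase on the outer map rather than a scan over all terms.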

PS-9148 feature: Minor refactoring to break dependencies

https://perconadev.atlassian.net/browse/PS-9148

Introduced 'component_sys_variable_service_tuple' class for grouping component
system variable registration services (supposed to be used with
'primitive_singleton' class template).

'query_cache' now expects 'query_builder' and 'flusher_interval_seconds' as its
constructor's parameters.

Eliminated custom MySQL types (like 'ulonglong') and their includes (like
'my_inttypes.h') from the public-facing headers.

'query_cache' is now explicitly initialized / deinitialized in the component's
'init()' / 'deinit()' functions via 'primitive_singleton' interface.

'query_cache' helper thread-related methods made private.

PS-9148 feature: Refactored usage of std::string_view for c-interfaces

https://perconadev.atlassian.net/browse/PS-9148

As the buffer returned by 'std::string_view::data()' is not guaranteed to be
null-terminated, it is not safe to pass it to old C functions accepting 'const char *'.
Some constants were converted to arrays of char: 'const char buffer[]{"value"}'.
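
To make the concern concrete, here is a small sketch of the pitfall and of two safe alternatives. The names 'kDefaultDatabase', 'legacy_length()' and 'safe_length()' are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// A constant stored as a char array is always null-terminated.
constexpr char kDefaultDatabase[]{"mysql"};

// Hypothetical legacy C-style function that expects a null-terminated string.
inline std::size_t legacy_length(const char *text) {
  return std::strlen(text);
}

// Safe bridge: copy the view into a std::string, whose c_str() is guaranteed
// to be null-terminated, before calling the C-style function.
inline std::size_t safe_length(std::string_view view) {
  const std::string copy{view};
  return legacy_length(copy.c_str());
}
```

For a view such as 'std::string_view{"masking_dictionaries"}.substr(0, 7)', 'safe_length()' sees exactly 7 characters, whereas passing 'data()' directly to 'legacy_length()' would read on until the terminator of the underlying literal.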

PS-9148 feature: Implemented lazy query_cache initial population

https://perconadev.atlassian.net/browse/PS-9148

'command_service_tuple' struct extended with one more member - 'field_info'
service.

Reworked the 'query_cache' class: instead of loading terms from the database in
the constructor, this operation is now performed on the first attempt to access one
of the dictionary methods ('contains()' / 'get_random()' / 'remove()' /
'insert()'). This is done to overcome a limitation that does not
allow the 'mysql_command_query' service to be used from inside the component
initialization function.
Fixed a problem with the 'm_dict_cache' shared pointer being updated concurrently from
different threads.
Exceptions thrown from the cache loading function no longer escape the
flusher thread.
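
The lazy population idea can be sketched as below. This is an illustration under assumptions, not the real 'query_cache': the class name 'lazy_term_cache', the loader callback and the 'is_loaded()' helper are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for the cache: the loader is NOT called in the
// constructor, only on the first access to a dictionary method.
class lazy_term_cache {
 public:
  using term_container = std::unordered_set<std::string>;
  using loader_type = std::function<term_container()>;

  explicit lazy_term_cache(loader_type loader) : loader_{std::move(loader)} {}

  bool contains(const std::string &term) {
    std::lock_guard<std::mutex> guard{mutex_};
    ensure_loaded();
    return terms_.count(term) != 0;
  }

  bool is_loaded() {
    std::lock_guard<std::mutex> guard{mutex_};
    return loaded_;
  }

 private:
  // Must be called with mutex_ held.
  void ensure_loaded() {
    if (loaded_) return;
    terms_ = loader_();  // may throw; loaded_ then stays false for a retry
    loaded_ = true;
  }

  std::mutex mutex_;
  loader_type loader_;
  term_container terms_;
  bool loaded_{false};
};
```

Because construction does no I/O, an object like this can be created inside a component initialization function, with the first real query deferred until a caller actually needs the data.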

De-coupled 'sql_context' and 'bookshelf' classes: 'sql_context' now accepts a
generic insertion callback that can be used to populate any type of container.

'component_masking_functions.dictionary_operations' MTR test case extended with
additional checks for flushed / unflushed dictionary cache.

PS-9148 feature: Reworked dictionary / bookshelf thread-safety model

https://perconadev.atlassian.net/browse/PS-9148

Both 'dictionary' and 'bookshelf' classes no longer include their own
'std::shared_mutex' to protect data. Instead, we now have a single
'std::shared_mutex' at the 'query_cache' level.

The return value of the 'get_random()' method in both 'dictionary' and
'bookshelf' classes changed from 'optional_string' to 'std::string_view'. Empty
(default constructed) 'std::string_view' is used as an indicator of an
unsuccessful operation.
'get_random()' method in the 'query_cache' class still returns a string by
value to avoid race conditions.
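
A rough sketch of this locking model follows. It assumes hypothetical names ('dictionary_sketch', 'cache_sketch', 'get_any()') and uses a deterministic pick instead of a random one to keep the example short.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Unsynchronized container: returns a view into its own storage.
class dictionary_sketch {
 public:
  void insert(std::string term) { terms_.push_back(std::move(term)); }
  // An empty (default constructed) view signals "nothing to return".
  // For simplicity the sketch returns the first term instead of a random one.
  std::string_view get_any() const {
    return terms_.empty() ? std::string_view{}
                          : std::string_view{terms_.front()};
  }

 private:
  std::vector<std::string> terms_;
};

// The single std::shared_mutex lives at the cache level.
class cache_sketch {
 public:
  void insert(std::string term) {
    std::unique_lock<std::shared_mutex> lock{mutex_};
    dictionary_.insert(std::move(term));
  }
  // The copy is made while the shared lock is held, so the result stays
  // valid even if a writer modifies the dictionary afterwards.
  std::string get_any() const {
    std::shared_lock<std::shared_mutex> lock{mutex_};
    return std::string{dictionary_.get_any()};
  }

 private:
  mutable std::shared_mutex mutex_;
  dictionary_sketch dictionary_;
};
```

Returning a view from the outer class would hand the caller a pointer into storage that another thread may reallocate as soon as the lock is released, which is why the outer method copies.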

Changed the behaviour of the 'sql_context::execute_dml()' method - it now
throws when SQL errors (like "no table found", etc.) occur.

PS-9148 feature: Fix masking functions flusher thread initialization

https://perconadev.atlassian.net/browse/PS-9148

Added missing my_thread_attr_t initialization.

PS-9148 feature: Decoupled threading and caching functionality in query_cache

https://perconadev.atlassian.net/browse/PS-9148

Threading-related functionality extracted from the 'query_cache' class into
a separate 'dictionary_flusher_thread' class.

This new class now accepts an instance of the existing 'query_cache' class as
a parameter of its constructor in the form of a shared pointer.

Changed how these two objects are initialized / deinitialized in
'component.cpp' ('component_init()' / 'component_deinit()' functions).

PS-9148 feature: Refactored dictionary_flusher_thread

https://perconadev.atlassian.net/browse/PS-9148

'dictionary_flusher_thread' class interface ('dictionary_flusher_thread.hpp') now
includes no public / internal MySQL headers.

Introduced internal 'thread_handler_context' class, which is supposed to be
instantiated as the very first declaration of the MySQL thread handler function -
it performs proper 'THD' object initialization in constructor and
deinitialization in destructor.

Introduced internal 'thread_attributes' class - an RAII wrapper over
'my_thread_attr_t' with proper initialization in constructor and deinitialization
in destructor.

Introduced internal 'jthread' class that, similarly to 'std::jthread' from C++20,
spawns a joinable thread in its constructor and joins it in its destructor. It expects
only the meaningful logic in the form of 'std::function<void()>' from the user and
hides all the MySQL initialization / PSI registration boilerplate. It uses
an instance of the 'thread_attributes' class when spawning a thread. It also creates
an instance of the 'thread_handler_context' class inside the actual thread handler
function ('jthread::raw_handler()'). This class also makes sure that no
exception escapes the actual thread handler function.
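
The join-on-destruction and exception-containment parts of this design can be sketched with plain 'std::thread'. This is an assumption-laden illustration: the class name 'joining_thread' is hypothetical, and the MySQL-specific thread attributes, 'THD' setup and PSI registration are deliberately left out.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical simplified wrapper: spawns in the constructor, joins in the
// destructor, and never lets an exception leave the thread function.
class joining_thread {
 public:
  using handler_type = std::function<void()>;

  explicit joining_thread(handler_type handler)
      : thread_{&joining_thread::raw_handler, std::move(handler)} {}

  joining_thread(const joining_thread &) = delete;
  joining_thread &operator=(const joining_thread &) = delete;

  ~joining_thread() {
    if (thread_.joinable()) thread_.join();
  }

 private:
  // An exception escaping a thread function would call std::terminate(),
  // so everything is caught here. A real implementation would log the error.
  static void raw_handler(handler_type handler) noexcept {
    try {
      handler();
    } catch (...) {
    }
  }

  std::thread thread_;
};
```

With this shape, the caller only supplies the meaningful logic, and leaving the scope of the wrapper object is enough to wait for the thread to finish.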

Refactored error handling in 'component_init()'.

Fixed a problem with updating the 'stopped_' variable (which the condition variable
uses in its 'wait_for()' method) without properly locking the mutex.
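
The general rule behind this fix is that a flag used as a condition variable predicate must be modified while holding the same mutex the waiter uses. A minimal sketch, with the hypothetical class name 'stop_signal' and hypothetical method names:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

class stop_signal {
 public:
  void request_stop() {
    {
      // The flag is changed under the mutex; otherwise the waiter could
      // check the predicate, miss the update, and then sleep through the
      // notification.
      std::lock_guard<std::mutex> guard{mutex_};
      stopped_ = true;
    }
    condition_.notify_all();
  }

  // Returns true if a stop was requested before the timeout expired.
  bool wait_for_stop(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock{mutex_};
    return condition_.wait_for(lock, timeout, [this] { return stopped_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable condition_;
  bool stopped_{false};
};
```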

Fixed instabilities in 'component_masking_functions.rpl_dictionaries_flush_interval'
MTR test case.

PS-9148 feature: Improved diagnostics (MySQL API error messages) in sql_context

https://perconadev.atlassian.net/browse/PS-9148

'command_service_tuple' class extended with new 'error_info' member of type
'SERVICE_TYPE(mysql_command_error_info) *' that allows extracting MySQL
error codes and messages. It is expected to be used inside 'sql_context' class
methods.

Reworked how exceptions are thrown from the 'sql_context' class
methods - we now use the 'raise_with_error_message()' helper method that
throws an exception that incorporates the MySQL client API error message
extracted via the added 'error_info' member of the `command_service_tuple`
class.

More meaningful error messages, which include underlying MySQL error
descriptions, are now generated from inside the 'query_cache' class methods.

Added new DBUG keywords and DEBUG_SYNC actions that allow control
over Dictionary Flusher background thread actions.

Added new 'component_masking_functions.flusher_thread_suspend_resume'
MTR test case that checks for various race conditions between the current session
and the Dictionary Flusher background thread.

Improved 'component_masking_functions.rpl_dictionaries_flush_interval' MTR
test case - pre-recorded values replaced with 'assert.inc'.

PS-9148 feature: Refactored to avoid deadlock during UNINSTALL COMPONENT

https://perconadev.atlassian.net/browse/PS-9148

'sql_context' class constructor now takes one extra parameter that allows
specifying whether the user wants to set the
'MYSQL_NO_LOCK_REGISTRY' option or not.

Introduced new abstract class 'basic_sql_context_builder' that is used to
construct instances of the 'sql_context' class on demand.
Also added two concrete implementations of this interface:
'default_sql_context_builder' and 'static_sql_context_builder'.
The former just creates a new instance of the 'sql_context' class every time
the 'build()' method is invoked. This implementation is used from inside
the UDF function implementation methods.
The latter creates an instance of the 'sql_context' class during the first call
to the 'build()' method, saves it internally and returns this saved instance
for each subsequent call to the 'build()' method. This implementation is
used in the 'dictionary_flusher_thread'.
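
The difference between the two builder flavours can be sketched as follows. The type and class names here ('sql_context_sketch', 'abstract_builder', 'per_call_builder', 'cached_builder') are hypothetical placeholders; only 'build()' and 'do_build()' are method names mentioned in this pull request, and their signatures are assumed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a server connection object.
struct sql_context_sketch {};
using context_ptr = std::shared_ptr<sql_context_sketch>;

class abstract_builder {
 public:
  virtual ~abstract_builder() = default;
  context_ptr build() { return do_build(); }

 private:
  virtual context_ptr do_build() = 0;
};

// "New instance per request" flavour.
class per_call_builder : public abstract_builder {
  context_ptr do_build() override {
    return std::make_shared<sql_context_sketch>();
  }
};

// "Create once, then reuse" flavour. Not thread-safe in this sketch: it
// assumes a single owning thread, as with a dedicated background thread.
class cached_builder : public abstract_builder {
  context_ptr do_build() override {
    if (!cached_) cached_ = std::make_shared<sql_context_sketch>();
    return cached_;
  }
  context_ptr cached_;
};
```

Code that needs a connection depends only on the abstract interface, so the choice between short-lived and long-lived connections is made where the builder is constructed.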

Refactored 'query_cache' class: basic functionality that needs external
'basic_sql_context_builder' and 'query_builder' extracted into separate
class 'sql_cache_core'.
For convenience, added new 'sql_cache' class as a wrapper over existing
'sql_cache_core', 'basic_sql_context_builder' and 'query_builder'. The
same 'sql_cache_core' can be shared between multiple instances of
the `sql_cache`.

This made it possible to ensure that `dictionary_flusher_thread` uses a
dedicated long-living instance of the 'sql_context' class (with
'MYSQL_NO_LOCK_REGISTRY' enabled) and does not cause deadlocks
during 'UNINSTALL COMPONENT'.
See Bug #34741098 "component::deinit() will block if calling any registry"
(commit mysql/mysql-server@d39330f)
for more details.

As the 'sql_context' connection in the 'dictionary_flusher_thread' is now
a long-living one, it is now shown in the output of the 'SHOW PROCESSLIST'
statement.
Modified 'count_sessions.inc' / 'wait_until_count_sessions.inc' logic inside
'component_masking_functions.flusher_thread_suspend_resume' MTR
test case to reflect these changes.

PS-9148 feature: Refactored dictionary flusher thread startup / termination

https://perconadev.atlassian.net/browse/PS-9148

'command_service_tuple' extended with one more member 'thread' which is used
to initialize / deinitialize threads intended to be used as MySQL threads.

'sql_context' class constructor now takes one extra parameter that allows
specifying whether the user wants to associate a new session (including an instance
of the 'THD' class) with the calling thread. Internally it is done by setting the
'MYSQL_COMMAND_LOCAL_THD_HANDLE' option to nullptr.
Also 'sql_context' now tries to open connections on behalf of the internal
predefined 'mysql.session' MySQL user (instead of 'root').

Reworked 'static_sql_context_builder' class - it now creates a shared "static"
instance of the 'sql_context' class inside the class constructor and passes true
as its 'initialize_thread' parameter, meaning an intent to associate the calling
thread with this connection. Before this change, the construction was done inside
the very first call to 'do_build()'.

The regular ("new instance per request") implementation of the
'basic_sql_context_builder', 'default_sql_context_builder', now passes false as
the 'initialize_thread' parameter (meaning no association with the thread is needed).

Significantly reworked 'dictionary_flusher_thread':
- Instead of a composed 'query_cache' object, it now expects its components,
  'query_cache_core' and 'query_builder', as constructor arguments. This makes it
  possible to create instances of 'static_sql_context_builder' and 'query_cache'
  directly inside the thread function.
- Instead of the 'stopped_' boolean flag, we now have a state enumeration ('initial',
  'initialization_failure', 'operational', 'stopped').
- The implementation no longer uses 'std::condition_variable' for awaiting timer
  events / termination requests. Instead, it just wakes up periodically (once per
  second) and checks if it needs to reload the cache. This is necessary to be able
  to respond to graceful termination requests like 'KILL CONNECTION' or
  sessions being closed at shutdown.
- Added 'request_termination()' method used inside the component's 'deinit()' handler.
- The 'do_periodic_reload()' function now looks more like a state machine performing
  different actions and making transitions to different states.
- Added new logic to wait for Session Server availability inside the
  'do_periodic_reload()' function.
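
The polling state machine outlined in the list above might look roughly like this. It is a sketch under assumptions: the class name 'flusher_sketch', the 'tick()' method (standing in for one wake-up of the once-per-second loop) and the two callbacks are hypothetical; only the state names and 'request_termination()' come from the description.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

enum class flusher_state {
  initial,
  initialization_failure,
  operational,
  stopped
};

class flusher_sketch {
 public:
  flusher_sketch(unsigned reload_interval_ticks, std::function<bool()> connect,
                 std::function<void()> reload)
      : reload_interval_ticks_{reload_interval_ticks},
        connect_{std::move(connect)},
        reload_{std::move(reload)} {
    assert(reload_interval_ticks_ > 0);
  }

  // May be called from another thread, hence the atomic flag.
  void request_termination() noexcept { termination_requested_.store(true); }
  flusher_state state() const noexcept { return state_; }

  // One wake-up of the polling loop; called only from the flusher thread.
  void tick() {
    if (state_ == flusher_state::stopped ||
        state_ == flusher_state::initialization_failure)
      return;
    if (termination_requested_.load()) {
      state_ = flusher_state::stopped;
      return;
    }
    if (state_ == flusher_state::initial) {
      state_ = connect_() ? flusher_state::operational
                          : flusher_state::initialization_failure;
      return;
    }
    // operational: reload only when the configured interval has elapsed
    if (++ticks_since_reload_ >= reload_interval_ticks_) {
      reload_();
      ticks_since_reload_ = 0;
    }
  }

 private:
  unsigned reload_interval_ticks_;
  std::function<bool()> connect_;
  std::function<void()> reload_;
  std::atomic<bool> termination_requested_{false};
  flusher_state state_{flusher_state::initial};
  unsigned ticks_since_reload_{0};
};
```

Checking the termination flag on every short wake-up bounds the shutdown latency to roughly one tick, independent of how long the reload interval is.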

Reworked 'thread_handler_context' class - it now uses 'mysql_command_thread'
service to initialize / deinitialize the thread for MySQL.

Various MTR test cases that use dictionary functions updated to explicitly
grant the necessary privileges on the dictionary table to the
'mysql.session'@'localhost' internal MySQL user.

Added new 'component_masking_functions.flusher_thread_connection_reuse'
MTR test case that checks that the same MySQL internal connection (created
via 'mysql_command_xxx' services) can be used several times (without closing
and re-opening) by the background flusher thread.

Added new 'component_masking_functions.flusher_thread_immediate_restart'
MTR test case that checks for proper behavior during server shutdown
immediately after installing the component.

Added new 'wait_for_component_uninstall.inc' MTR include file which can be used
to perform several attempts to 'UNINSTALL COMPONENT' until it succeeds or
reaches the max number of attempts.

PS-9148 feature: basic_sql_context_builder renamed to abstract_sql_context_builder

https://perconadev.atlassian.net/browse/PS-9148

Added a comment describing class hierarchy.

PS-9148 feature: Extended class description comments

https://perconadev.atlassian.net/browse/PS-9148

PS-9148 feature: query_cache[_core] renamed to term_cache[_core]

https://perconadev.atlassian.net/browse/PS-9148

Added class descriptions for 'term_cache_core' and 'term_cache'.

PS-9148 feature: query_builder transformed into a singleton

https://perconadev.atlassian.net/browse/PS-9148

Removed all the boilerplate code connected with passing 'query_builder'
through the class hierarchy.
'query_builder_ptr' changed from 'std::shared_ptr' to 'std::unique_ptr'.
'term_cache_ptr' changed from 'std::shared_ptr' to 'std::unique_ptr'.

PS-9148 feature: Added missing debug_sync facility reset to the MTR test cases

https://perconadev.atlassian.net/browse/PS-9148

PS-9148 feature: Extended component_masking_functions.dictionary_operations

https://perconadev.atlassian.net/browse/PS-9148

'component_masking_functions.dictionary_operations' extended with more
checks for the case when 'mysql.session'@'localhost' system user does not
have enough privileges to access 'mysql.masking_dictionaries' table.

Also fixed checks for non-existing 'mysql.masking_dictionaries' table.

PS-9148 feature: Fixed expected American Express card number length

https://perconadev.atlassian.net/browse/PS-9148

PS-9148 feature: Added more comments about flusher thread termination

https://perconadev.atlassian.net/browse/PS-9148

PS-9148 feature: Reworked bookshelf class

'bookshelf' class reworked so that internally it now holds 'std::unordered_map'
of 'dictionary' objects instead of 'dictionary_ptr' objects.

PS-9148 feature: Fixed compilation problem with older STLs

https://perconadev.atlassian.net/browse/PS-9148

Although in recent versions of STL implementations it is OK to use
'std::unordered_map' with incomplete types, this is not true for the STL coming
with GCC 11 (default on Ubuntu Jammy).
The 'bookshelf.hpp' header now includes 'dictionary.hpp' instead of
'dictionary_fwd.hpp' to resolve this issue.

PS-9148 feature: Reworked flusher thread initialization

https://perconadev.atlassian.net/browse/PS-9148

Reworked Dictionary Flusher thread initialization logic.
We now establish the internal server connection under the
'LOCK_server_shutting_down' lock and only if the Server is not shutting
down ('server_shutting_down' is still false). This helps to ensure that the
component's 'deinit()' function (which is called with the service registry
locked) is not run concurrently with constructing an instance of the
'static_sql_context_builder' class (which attempts to acquire the same
service registry lock).

'sql_context' class constructor now accepts 2 parameters:
* initialization_registry_locking_mode (used for establishing connections)
* operation_registry_locking_mode (used for closing connections)
instead of a single one 'registry_locking_mode'.

Because of these changes, the component's 'deinit()' function can no longer fail.
As a result, there is no need for 'wait_for_component_uninstall.inc' anymore.
Changed to a simple 'UNINSTALL COMPONENT'.

Co-Authored-By: Oleksandr Kachan <[email protected]>
…ashed' into dev/PS-9148-8.4-masking_functions_background_thread_squashed
https://perconadev.atlassian.net/browse/PS-9148

Changed replication terminology in the
'component_masking_functions.rpl_dictionaries_flush_interval' MTR test case.
https://perconadev.atlassian.net/browse/PS-9148

Thanks to C++20 transparent key support for unordered containers, the
'dictionary' and 'bookshelf' classes can now accept 'std::string_view' instead
of 'const std::string &' in their interfaces, which eliminates unnecessary data
copying.
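
For reference, transparent (heterogeneous) lookup in an unordered container can be sketched like this. The names 'transparent_string_hash', 'term_set' and 'contains_term()' are hypothetical, and the preprocessor fallback is added only so the sketch also builds with pre-C++20 compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_set>

// The 'is_transparent' marker on both the hash and the equality functor
// enables lookups by std::string_view without building a std::string key.
struct transparent_string_hash {
  using is_transparent = void;
  std::size_t operator()(std::string_view value) const noexcept {
    return std::hash<std::string_view>{}(value);
  }
};

using term_set =
    std::unordered_set<std::string, transparent_string_hash, std::equal_to<>>;

inline bool contains_term(const term_set &terms, std::string_view term) {
#if defined(__cpp_lib_generic_unordered_lookup)
  return terms.find(term) != terms.end();  // C++20: no temporary string
#else
  return terms.find(std::string{term}) != terms.end();  // fallback: copies
#endif
}
```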