January 16, 2025

Why Querying ps.data_locks Causes MySQL to Hang? Find Out!

Memory allocation exception and deadlock problem caused by querying performance schema table when MySQL executes specific statements.

1 Problem Overview

When the batch processing is executed to the statement insert into t1 select * from t2, there is a scheduled task running the MySQL inspection script. When the inspection script executes select * from performance_schema.data_locks and select * from performance_schema.data_lock_waits, it will cause MySQL to hang. At first, only some SQL statements have no response during execution, and eventually, it becomes impossible to log in to MySQL.

Tips: MySQL Version 8.0.26

2 Insight into Analytical Process

2.1 Thread States When Hanging Starts

Among the two threads marked in the figure below:

  • The complete SQL of the first thread is "insert into t1 select * from t2".
  • The complete SQL of the second thread is "select * from performance_schema.data_lock_waits", which is the SQL in the inspection script. The previous SQL is "select * from performance_schema.data_locks".
  • All the other threads are stuck.

2.2 Analysis of the Stack

The situations of locks held and being waited for by threads related to locks and mutexes are as follows:

  • Thread 285 holds LOCK_status and is blocked by srv_innodb_monitor_mutex held by Thread 21.
  • Thread 21 holds srv_innodb_monitor_mutex and is blocked by trx_sys_mutex_enter(), that is, trx_sys->mutex, which is held by an unknown thread.
  • A large number of threads are blocked by LOCK_status held by Thread 285.
  • A large number of threads (including insert into..select and the query on ps.data_lock_wait) are blocked at trx_sys_mutex_enter().

The current problem is that the thread that holds the trx_sys->mutex mutex has not been found.

2.3 Local Reproduction

Invocation of the stored procedure, when it comes to insert into... select..., another session executes select * from performance_schema.data_locks.

Tested many times repeatedly, and later found the necessary conditions for reproduction:

Executing select * from performance_schema.data_locks reports a memory allocation exception: ERROR 3044 (HY000): Memory allocation error: while scanning data_locks table in function rnd_next.

And then it is possible to observe that insert into... select gets stuck, and the stack shows that this thread is waiting for trx_sys->mutex.

Screenshot of reproduction:

The thread stack of insert into... select is as follows, but after analyzing all thread stacks, still can't find out who holds the trx_sys->mutex.

2.4 Code Analysis

Since the holder of the trx_sys->mutex mutex cannot be found in the stack information, another method that comes to mind is to use gdb to print out the trx_sys->mutex structure to see if there is any thread ID information in it.

It is found that there is only thread ID information in debug mode, not in normal mode. Therefore, it is necessary to compile a debug version for reproduction, and then use gdb to print out the holder of the trx_sys->mutex mutex.

2.5 Reproduction in the debug version

After querying ps.data_locks triggers a memory allocation error, print the stack of the insert into thread through gdb, and it gets stuck at mutex_enter_inline:

Then print the thread stack of querying ps.data_locks. The stack is normal, but when printing trx_sys->mutex, it is found that the holder is itself:

Converting the value of _M_thread to hexadecimal can be used to check the thread number output by gdb info thread:

gdb) p/x 140316922181376
$2 = 0x7f9e144d7700

2.6 Reasonable speculation & looking for evidence

After reproduction of the debug version, it is found that when querying ps.data_locks, a memory allocation error is triggered, but the trx_sys->mutex mutex is not released.

So it is reasonably speculated that: there is a bug. After a memory allocation error is triggered when querying ps.data_locks, the trx_sys->mutex will not be released, resulting in an internal deadlock.

Found this bug: https://github.com/mysql/mysql-server/commit/d6be2f8d23b1fe41f10c7147957faf68b117abb2

2.7 Bug explanation

In the implementation of performance_schame.data_locks, the C++ try-catch mechanism is used to handle the exception of memory allocation failure when reading `data_locks` records.

In the table_data_locks::rnd_next function of the table_data_locks.cc file:

Code explanation: catch (const std::bad_alloc &) is used to catch exceptions of type std::bad_alloc. When a std::bad_alloc exception is thrown into the try block, my_error is called to print the error message and ER_STD_BAD_ALLOC_ERROR is returned to end the execution.

iterator_done = it->scan(&m_container, true); in the try module is the actual business logic code being executed. it->scan performs some kind of iterative scanning operation, and the result is stored in the iterator_done variable. The exception caught by the catch module is thrown from here.

The definition of it->scan is in the Innodb_data_lock_iterator::scan function of the p_s.cc source code file, and the execution logic is very clear:

  1. trx_sys_mutex_enter() adds the trx_sys->mutex mutex first.
  2. Calls scan_trx_list to scan the two transaction lists rw_trx_list and mysql_trx_list.
  3. trx_sys_mutex_exit() releases the trx_sys->mutex mutex.

Obviously, if a std::bad_alloc memory exception occurs during the scan by calling scan_trx_list, it will be directly caught by the catch module, the exception will be thrown and the execution will end, and it will not be able to execute to trx_sys_mutex_exit() to release the mutex, resulting in the residue of the trx_sys->mutex mutex

This can be confirmed in the trx_sys->mutex information printed when reproducing in the debug version: It was added at line 592 of the p_s.cc file.

3 Steps to reproduce

The simplified steps to reproduce are as follows:

  1. Prepare a virtual machine with a small memory, such as 2 - 4G, to easily trigger a memory allocation exception.
  2. Create a table t1 with 5 million rows.
  3. Execute begin; select * from t1 for update;
  4. Execute select * from performance_schema.data_locks; to trigger the error  ERROR 3044 (HY000): Memory allocation error: while scanning data_locks table in function rnd_next.
  5. Continuing to query performance_schema.data_lock_waits will be blocked.

4. Conclusion

The causes of this failure are as follows:

  1. During the batch processing, INSERT INTO t1 SELECT * FROM t2; will add an S Lock to all rows of table t2 (reason: RR isolation level and cannot use the index). Table t2 is very large with hundreds of millions of rows, which will result in hundreds of millions of lock records in performance_schema.data_locks;
  2. When querying ps.data_locks, due to the large number of records, it consumes a lot of memory and triggers a memory allocation exception. In the implementation of ps.data_locks, the memory allocation exception is handled by the C++ try-catch mechanism, but this exception handling occurs between the mutex lock and mutex unlock, resulting in the mutex not being released. In this failure, a memory allocation exception occurred during the iterative scan of lock records after executing trx_sys_mutex_enter(), and the operation of trx_sys_mutex_exit() to release the mutex was not executed and remained;
  3. Since the trx_sys->mutex mutex is widely used in InnoDB, the subsequent inspection script query for data_lock_waits is blocked, and the background purge thread, innodb monitor thread, metadata refresh process, and business SQL executed by user threads are all blocked, eventually leading to a deadlock.

5 Solution

  1. The official fixed this bug in 8.0.37, and it can be solved by upgrading to 8.0.37.

The way to fix it is to remove the try-catch code in the implementation of ps.data_locks and data_lock_waits to prevent handling exceptions between the mutex lock and mutex unlock, which may cause the mutex unlock not to be executed and leave the mutex remaining.

  1. Do not query ps.data_locks when there are a lot of row locks.

You will get best features of ChatDBA