1 Problem Overview
When the batch processing is executed to the statement insert into t1 select * from t2
, there is a scheduled task running the MySQL inspection script. When the inspection script executes select * from performance_schema.data_locks
and select * from performance_schema.data_lock_waits
, it will cause MySQL to hang. At first, only some SQL statements have no response during execution, and eventually, it becomes impossible to log in to MySQL.
Tips: MySQL Version 8.0.26
2 Insight into Analytical Process
2.1 Thread States When Hanging Starts
Among the two threads marked in the figure below:
- The complete SQL of the first thread is "insert into t1 select * from t2".
- The complete SQL of the second thread is "select * from performance_schema.data_lock_waits", which is the SQL in the inspection script. The previous SQL is "select * from performance_schema.data_locks".
- All the other threads are stuck.
2.2 Analysis of the Stack
The situations of locks held and being waited for by threads related to locks and mutexes are as follows:
- Thread 285 holds LOCK_status and is blocked by
srv_innodb_monitor_mutex
held by Thread 21. - Thread 21 holds
srv_innodb_monitor_mutex
and is blocked bytrx_sys_mutex_enter()
, that is,trx_sys->mutex
, which is held by an unknown thread. - A large number of threads are blocked by LOCK_status held by Thread 285.
- A large number of threads (including
insert into..select
and the query onps.data_lock_wait
) are blocked attrx_sys_mutex_enter()
.
The current problem is that the thread that holds the trx_sys->mutex
mutex has not been found.
2.3 Local Reproduction
Invocation of the stored procedure, when it comes to insert into... select...
, another session executes select * from performance_schema.data_locks
.
Tested many times repeatedly, and later found the necessary conditions for reproduction:
Executing select * from performance_schema.data_locks
reports a memory allocation exception: ERROR 3044 (HY000): Memory allocation error: while scanning data_locks table in function rnd_next.
And then it is possible to observe that insert into... select
gets stuck, and the stack shows that this thread is waiting for trx_sys->mutex
.
Screenshot of reproduction:
The thread stack of insert into... select
is as follows, but after analyzing all thread stacks, still can't find out who holds the trx_sys->mutex
.
2.4 Code Analysis
Since the holder of the trx_sys->mutex
mutex cannot be found in the stack information, another method that comes to mind is to use gdb to print out the trx_sys->mutex
structure to see if there is any thread ID information in it.
It is found that there is only thread ID information in debug mode, not in normal mode. Therefore, it is necessary to compile a debug version for reproduction, and then use gdb to print out the holder of the trx_sys->mutex
mutex.
2.5 Reproduction in the debug version
After querying ps.data_locks
triggers a memory allocation error, print the stack of the insert into
thread through gdb, and it gets stuck at mutex_enter_inline
:
Then print the thread stack of querying ps.data_locks
. The stack is normal, but when printing trx_sys->mutex
, it is found that the holder is itself:
Converting the value of _M_thread
to hexadecimal can be used to check the thread number output by gdb info thread
:
gdb) p/x 140316922181376
$2 = 0x7f9e144d7700
2.6 Reasonable speculation & looking for evidence
After reproduction of the debug version, it is found that when querying ps.data_locks
, a memory allocation error is triggered, but the trx_sys->mutex
mutex is not released.
So it is reasonably speculated that: there is a bug. After a memory allocation error is triggered when querying ps.data_locks
, the trx_sys->mutex
will not be released, resulting in an internal deadlock.
Found this bug: https://github.com/mysql/mysql-server/commit/d6be2f8d23b1fe41f10c7147957faf68b117abb2
2.7 Bug explanation
In the implementation of performance_schame.data_locks
, the C++ try-catch mechanism is used to handle the exception of memory allocation failure when reading `data_locks` records.
In the table_data_locks::rnd_next
function of the table_data_locks.cc
file:
Code explanation: catch (const std::bad_alloc &)
is used to catch exceptions of type std::bad_alloc
. When a std::bad_alloc
exception is thrown into the try
block, my_error
is called to print the error message and ER_STD_BAD_ALLOC_ERROR
is returned to end the execution.
iterator_done = it->scan(&m_container, true);
in the try
module is the actual business logic code being executed. it->scan
performs some kind of iterative scanning operation, and the result is stored in the iterator_done
variable. The exception caught by the catch
module is thrown from here.
The definition of it->scan
is in the Innodb_data_lock_iterator::scan
function of the p_s.cc
source code file, and the execution logic is very clear:
trx_sys_mutex_enter()
adds thetrx_sys->mutex
mutex first.- Calls
scan_trx_list
to scan the two transaction listsrw_trx_list
andmysql_trx_list
. trx_sys_mutex_exit()
releases thetrx_sys->mutex
mutex.
Obviously, if a std::bad_alloc
memory exception occurs during the scan by calling scan_trx_list
, it will be directly caught by the catch
module, the exception will be thrown and the execution will end, and it will not be able to execute to trx_sys_mutex_exit()
to release the mutex, resulting in the residue of the trx_sys->mutex
mutex
This can be confirmed in the trx_sys->mutex
information printed when reproducing in the debug version: It was added at line 592 of the p_s.cc
file.
3 Steps to reproduce
The simplified steps to reproduce are as follows:
- Prepare a virtual machine with a small memory, such as 2 - 4G, to easily trigger a memory allocation exception.
- Create a table
t1
with 5 million rows. - Execute
begin; select * from t1 for update;
- Execute
select * from performance_schema.data_locks;
to trigger the error ERROR 3044 (HY000): Memory allocation error: while scanning data_locks table in function rnd_next. - Continuing to query
performance_schema.data_lock_waits
will be blocked.
4. Conclusion
The causes of this failure are as follows:
- During the batch processing,
INSERT INTO t1 SELECT * FROM t2;
will add an S Lock to all rows of tablet2
(reason: RR isolation level and cannot use the index). Tablet2
is very large with hundreds of millions of rows, which will result in hundreds of millions of lock records inperformance_schema.data_locks
; - When querying
ps.data_locks
, due to the large number of records, it consumes a lot of memory and triggers a memory allocation exception. In the implementation ofps.data_locks
, the memory allocation exception is handled by the C++ try-catch mechanism, but this exception handling occurs between the mutex lock and mutex unlock, resulting in the mutex not being released. In this failure, a memory allocation exception occurred during the iterative scan of lock records after executingtrx_sys_mutex_enter()
, and the operation oftrx_sys_mutex_exit()
to release the mutex was not executed and remained; - Since the
trx_sys->mutex
mutex is widely used in InnoDB, the subsequent inspection script query fordata_lock_waits
is blocked, and the background purge thread, innodb monitor thread, metadata refresh process, and business SQL executed by user threads are all blocked, eventually leading to a deadlock.
5 Solution
- The official fixed this bug in 8.0.37, and it can be solved by upgrading to 8.0.37.
The way to fix it is to remove the try-catch code in the implementation of ps.data_locks
and data_lock_waits
to prevent handling exceptions between the mutex lock and mutex unlock, which may cause the mutex unlock not to be executed and leave the mutex remaining.
- Do not query
ps.data_locks
when there are a lot of row locks.