By Michal Sokolowski — Dec 4, 2022

Unexpected SELECT FOR UPDATE side effect in MySQL

A few months back, I encountered a strange behavior in an application I was working on. From time to time, a timeout error would pop up in the application logs. I started checking potential problems.

First, I looked into the code, but I didn't find anything. The error occurred at random moments.

The code section where the error came from was as simple as: “save the model to the MySQL database”. Then I checked what query was executed in the database, it was an insert similar to the one below:

INSERT INTO t (<columns>)VALUES (<values>);

After a few hours of growing confusion and frustration, I narrowed down the potential problems to one: It had to be caused by some mechanic inside the database. Eventually, I found it. It is called Insert Intention Locks.

TL;DR

Using SELECT FOR UPDATE has a hidden side effect.
When the SELECT FOR UPDATE returns no rows, it may be locking rows in a table. As a result, no other transaction can insert row(s) into a part of the table until your transaction is committed or rolled back.
The name of the locking mechanism is Insert Intention Locks.
You can avoid this issue by checking if the rows you are asking for exist before using SELECT FOR UPDATE
The Insert Intention Locks can be disabled in by using READ COMMITTED isolation level (the default isolation level is REPEATABLE READS)

Insert Intention Locks

From the MySQL documentation we can read:

An insert intention lock is a type of gap lock set by INSERT operations prior to row insertion. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap.

But what is a gap lock? As the name suggests gap lock is one of the locking mechanisms in MySQL. From the documentation as well:

A gap lock is a lock on a gap between index records, or a lock on the gap before the first or after the last index record. For example, SELECT c1 FROM t WHERE c1 BETWEEN 10 and 20 FOR UPDATE; prevents other transactions from inserting a value of 15 into column t.c1, whether or not there was already any such value in the column, because the gaps between all existing values in the range are locked.

Let’s look at an example:

Suppose we have a table with an indexed column ID. It contains values 3 and 9. The gaps for this index cover the following intervals (where a round bracket denotes exclusion of the interval endpoint and a square bracket means the inclusion of the endpoint):

(negative infinity, 3]
(3, 9]
(9, positive infinity)

So the insert couldn’t be executed because it was trying to lock an interval in the table. But what held the lock in the first place?

During my investigation, I discovered that a different part of the application was modifying rows in the same table. As you might have guessed, it used SELECT FOR UPDATE to ensure that only one transaction modifies a row simultaneously. But in some cases, the SELECT FOR UPDATE was executed for rows that didn’t exist (passed directly from the front-end app).

Example

Let’s try to reproduce the issue.

You can run MySQL in Docker with the command:

//create MySQL containerwithdatabase dbandpassword root
docker run--name db -e MYSQL_ROOT_PASSWORD=root -e MYSQL_DATABASE=db -p 3306:3306 -d mysql:latest123

To create a connection to the database you can use a terminal tool such as mycli:

mycli -u root -p root db

Create a table with the script:

USE db;
CREATETABLE t (ID INT,PRIMARYKEY (ID))ENGINE =InnoDB;
INSERTINTO t (ID)VALUES (3), (9);
123

Now create two separate connections to the database and execute the scripts in them (run connection_1 first):

# connection_1
STARTTRANSACTION;
SELECT *FROM tWHERE ID = 7FORUPDATE;
123

# connection_2
INSERTINTO t (ID)VALUES (10); # works
INSERTINTO t (ID)VALUES (2);  # works
INSERTINTO t (ID)VALUES (6);  # times out
INSERTINTO t (ID)VALUES (8);  # times out
12345

You can see what is happening in the database with the command:

SHOWENGINEINNODBSTATUS\\G
12

For me, it returns:

TRANSACTIONS
------------
Trx id counter 1827
Purge done for trx's n:o < 1826 undo n:o < 0 state: running but idle
History list length 2
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 562947874593200, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 562947874592392, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 1826, ACTIVE 18 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
MySQL thread id 11, OS thread handle 281472364683200, query id 32 172.17.0.1 root update
INSERT INTO t (ID) VALUES (6);  # times out
------- TRX HAS BEEN WAITING 18 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 2 page no 4 n bits 72 index PRIMARY of table `db`.`t` trx id 1826 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 4; hex 80000009; asc     ;;
 1: len 6; hex 000000000716; asc       ;;
 2: len 7; hex 8200000120011d; asc        ;;
123456789101112131415161718192021

This part is the most interesting:

INSERT INTO t (ID) VALUES (6); # times out------- TRX HAS BEEN WAITING 18 SEC FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 2 page no 4 n bits 72 index PRIMARY of table db.t trx id 1826 lock_mode X locks gap before rec insert intention waiting

It confirms that the insert of value 6 was waiting for the release of the insert intention lock.

When you comment inserting value 6 out, inserting 8 will also be locked.

Clean up

You can remove the created container with the command below:

docker rm db

Solution

To avoid accidentally locking rows in a table as explained above, you need to ensure that the row(s) you are querying with SELECT FOR UPDATE exists. You can do it in many ways, the simplest being executing the same select query but without FOR UPDATE first.

The transaction isolation level can control the mechanism described in this post. When you change it to the READ COMMITTED group, the issue will disappear altogether (but it can cause other problems). From documentation:

Gap locking can be disabled explicitly. This occurs if you change the transaction isolation level to READ COMMITTED. In this case, gap locking is disabled for searches and index scans and is used only for foreign-key constraint checking and duplicate-key checking.

There are also other effects of using the READ COMMITTED isolation level. Record locks for nonmatching rows are released after MySQL has evaluated the WHERE condition.

By default, MySQL sets the REPEATABLE READ isolation level.

Conclusion

I believe the issue described in this post is worth sharing. It lived long in the application I was working on. After reading this post, I hope you can stop it quickly in yours. :)