From the initial release, one of the biggest complaints I had about Group Replication is that it allowed “stale” reads and there was no way to prevent them or to even know that you read “stale” data. That was a huge limitation. Thankfully, Oracle released features to control the consistency levels , and it was exactly a year ago! I don’t know about you, but I personally was confused by naming it group_replication_consistency=’AFTER’ or ‘BEFORE’.
So now I want to try to make sense of it and share my understanding (even if it is one year later).
We will start with the default group_replication_consistency=’EVENTUAL’ and work from there. So let’s consider a very simple table:
CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `cnt` int(11) DEFAULT NULL, PRIMARY KEY (`id`) )
With over 10mln rows:
select count(*) from t1; +----------+ | count(*) | +----------+ | 10485760 | +----------+
And we will do a very simple action. On Node 1 we will update the table and on Node 2 we will read data.
UPDATE t1 SET cnt=11; Query OK, 10485760 rows affected (1 min 20.23 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 10 | +------+ 1 row in set (0.00 sec)
There are a few points to note:
How can we get a better outcome:
Let’s look into the consistency level ‘BEFORE’. It says that a transaction on Node 2 will wait until the previously committed transaction on the Node 1 transaction is committed also on Node 2. So let’s change the set session to group_replication_consistency=’BEFORE’; on Node 2 and execute a similar update again.
UPDATE t1 SET cnt=12; Query OK, 10485760 rows affected (1 min 18.66 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 12 | +------+ 1 row in set (1 min 11.45 sec)
So there are very notable changes:
This mode allowed us to do exactly what we wanted – prevent stale reads. Great outcome! So what about group_replication_consistency=’AFTER’ ?
This mode says that a transaction on Node 1 will not return OK until it makes sure that other nodes applied the transaction. To see it in action, we will put Node 1 into group_replication_consistency=’AFTER’ and Node 2 into group_replication_consistency=’EVENTUAL’;.
UPDATE t1 SET cnt=13; Query OK, 10485760 rows affected (3 min 0.46 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 13 | +------+ 1 row in set (0.00 sec)
Here the situation is different. Now the execution time on Node 1 doubled, as the transaction waits to be committed on all nodes and after that on Node 2 the execution is immediate.
This mode still avoids “stale” reads, but in this case, we shifted the wait time from Node 2 to Node 1, and this is how we can view the difference between ‘BEFORE’ and ‘AFTER’ consistency modes in Group Replication.
Both modes provide a consistent view, but:
So which mode to choose? Actually I think it is good to have an option here. You can choose, if you want, to put wait time on your readers or on your writers; the decision is up to how your application is designed.