The SQL Server 2017 error log may contain messages such as "Parallel redo is started for database 'xxx' with worker pool size [2]." and "Parallel redo is shutdown for database 'xxx' with worker pool size [2]." What do they mean? For example:
Date 2020/5/16 11:07:38
Log SQL Server (Current - 2020/5/16 11:08:00)
Source spid33s
Message
Parallel redo is started for database 'YourSQLDba' with worker pool size [2].
Date 2020/5/16 11:07:38
Log SQL Server (Current - 2020/5/16 11:08:00)
Source spid33s
Message
Parallel redo is shutdown for database 'YourSQLDba' with worker pool size [2].
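This involves the concept of parallel redo. The official documentation describes it in detail; an excerpt follows (see the references for the full text):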
When availability group was initially released with SQL Server 2012, the transaction log redo was handled by a single redo thread for each database in an AG secondary replica. This redo model is also called as serial redo. In SQL Server 2016, the redo model was enhanced with multiple parallel redo worker threads per database to share the redo workload. In addition, each database has a new helper worker thread for handling the dirty page disk flush IO. This new redo model is called parallel redo. With the new parallel redo model that is the default setting since SQL Server 2016, workloads with highly concurrent small transactions are expected to achieve better redo performance. When the transaction redo operation is CPU intensive, such as when data encryption and/or data compression are enabled, parallel redo has even higher redo throughput (Redone Bytes/sec) compared to serial redo. Moreover, indirect checkpoint allows parallel redo to offload more disk IO (and IO waits for slow disk) to its helper worker thread and frees main redo thread to enumerate more received log records in secondary replica. It further speeds up the redo performance. However parallel redo, which enables multi-threading model, has an associated cost.
The entries in the error log are informational messages, added in SQL Server 2017, that relate to parallel redo for availability groups. Our SQL Server instance is a standalone instance, not a node in an AG, so why do parallel redo messages appear at all? The database does not participate in an AG: when the database starts, its parallel redo threads are started; the database then checks and finds that there is no availability group, so the parallel redo threads are shut down again.
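To confirm that a database really is not part of an availability group, a check along the following lines may help. This is only a minimal sketch: it assumes SQL Server 2012 or later, where `sys.databases` exposes the `replica_id` and `group_database_id` columns, and `YourSQLDba` is simply the example database name used in this post.

```sql
-- Returns 1 if the Always On availability groups feature is enabled on the instance, 0 if not
SELECT SERVERPROPERTY('IsHadrEnabled') AS is_hadr_enabled;

-- replica_id and group_database_id are NULL for a database that does not belong to an AG
SELECT d.name AS database_name
      ,d.replica_id
      ,d.group_database_id
FROM sys.databases AS d
WHERE d.name = N'YourSQLDba';
```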
As a result, after the database instance restarts you will see "Parallel redo is started for database 'xxxx' with worker pool size [2]." in the error log, immediately followed by "Parallel redo is shutdown for database 'xxxx' with worker pool size [2]."
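One way to pull just these messages out of the current error log is the `sp_readerrorlog` procedure. Treat this as a sketch rather than a supported recipe: the procedure is undocumented, its arguments are positional, and the search string `Parallel redo` is chosen here only to match the messages quoted above.

```sql
-- Arguments: log file number (0 = current log), log type (1 = SQL Server error log),
-- and a search string used to filter the returned messages
EXEC sp_readerrorlog 0, 1, N'Parallel redo';
```

Each matching row comes back with its log date and source process, so a started/shutdown pair logged at the same time is easy to spot.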
There is one more case: a user database that has the AUTO_CLOSE option enabled. Below, I turn on AUTO_CLOSE for the YourSQLDba database.
USE [master]
GO
ALTER DATABASE [YourSQLDba] SET AUTO_CLOSE ON WITH NO_WAIT
GO
SELECT d.name AS database_name
,SUSER_SNAME(owner_sid) AS database_owner
,d.create_date AS create_date
,d.collation_name AS collation_name
,d.state_desc AS state_desc
,d.is_auto_close_on AS is_auto_close_on
FROM sys.databases d
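Whenever a session accesses this database, a large number of these messages appear in the error log. Turning the database's AUTO_CLOSE option off stops this flood of messages, but you will still see them when the SQL Server instance starts.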
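A minimal sketch of that cleanup, again using `YourSQLDba` as the example database: first find the databases that still have AUTO_CLOSE enabled, then switch the option off.

```sql
-- List every database that still has AUTO_CLOSE enabled
SELECT d.name AS database_name
      ,d.is_auto_close_on
FROM sys.databases AS d
WHERE d.is_auto_close_on = 1;

-- Turn AUTO_CLOSE off for the example database
ALTER DATABASE [YourSQLDba] SET AUTO_CLOSE OFF WITH NO_WAIT;
```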
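We can turn off parallel redo by enabling trace flag 3459. Note that this trace flag applies only to SQL Server 2016/2017 and later versions. It is recommended to enable it as a global trace flag at instance startup with the -T command-line option, which ensures the trace flag stays active after a server restart. Restart SQL Server for the trace flag to take effect.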
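After adding `-T3459` to the startup parameters and restarting the instance, the flag's state can be checked as sketched below. This only reports whether the trace flag is on; it does not by itself prove that redo is running serially.

```sql
-- Report whether trace flag 3459 is enabled globally (-1 = global scope)
DBCC TRACESTATUS (3459, -1);
```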
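Also note that parallel redo has a bug in certain versions: "FIX: Parallel redo does not work after you disable Trace Flag 3459 in an instance of SQL Server". Hopefully you do not hit this bug while testing, because it would skew your test results (for the affected versions, see the official link in the references).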
Assume that you use Always On Availability Groups in Microsoft SQL Server. After you switch to serial redo from parallel redo by enabling Trace Flag 3459, serial redo works as expected. However, when you switch back to parallel redo by disabling Trace Flag 3459, parallel redo does not work. If you restart the instance of SQL Server, parallel redo works as expected.