程序错误日志大量的报死锁错误,去数据库错误日志查看确实有很多死锁(应在数据库实例启动时执行dbcc traceon(1222,-1)开启死锁跟踪):
04/29/2016 14:07:51,spid33s,δ?,waiter id=process71da6bb88 mode=ix requesttype=wait04/29/2016 14:07:51,spid33s,δ?,waiter-list04/29/2016 14:07:51,spid33s,δ?,owner id=process5b534c8 mode=s04/29/2016 14:07:51,spid33s,δ?,owner-list04/29/2016 14:07:51,spid33s,δ?,pagelock fileid=1 pageid=22056721 dbid=5 objectname=adcdb.dbo.na_trafficmemberorderrelation id=lock1263c7180 mode=s associatedobjectid=7205759518198988804/29/2016 14:07:51,spid33s,δ?,waiter id=process5b534c8 mode=s requesttype=wait04/29/2016 14:07:51,spid33s,δ?,waiter-list04/29/2016 14:07:51,spid33s,δ?,owner id=process71da6bb88 mode=ix04/29/2016 14:07:51,spid33s,δ?,owner-list04/29/2016 14:07:51,spid33s,δ?,pagelock fileid=1 pageid=22075507 dbid=5 objectname=adcdb.dbo.na_trafficmemberorderrelation id=lock115c85a80 mode=ix associatedobjectid=7205759518198988804/29/2016 14:07:51,spid33s,δ?,resource-list04/29/2016 14:07:51,spid33s,δ?,(@p0 int@p1 int@p2 int@p3 int@p4 nvarchar(4000)@p5 nvarchar(4000)@p6 nvarchar(4000)@p7 nvarchar(4000)@p8 nvarchar(4000)@p9 nvarchar(4000)@p10 decimal(285)@p11 decimal(285)@p12 nvarchar(4000)@p13 nvarchar(4000)@p14 datetime@p15 datetime@p16 int@p17 datetime@p18 datetime@p19 datetime@p20 int)insert into na_trafficmemberorderrelation (applysource ordermode ordersource orderstate eccode ecprdcode usermobile memberordernum mainproductcode trafficpackagecode billprice flowvalue memberapplycode crmapplycode effecttime exprietime userid createtime ordertime modifytime discountrate) values (@p0 @p1 @p2 @p3 @p4 @p5 @p6 @p7 @p8 @p9 @p10 @p11 @p12 @p13 @p14 @p15 @p16 @p17 @p18 @p19 @p20); select scope_identity()04/29/2016 14:07:51,spid33s,δ?,inputbuf04/29/2016 14:07:51,spid33s,δ?,unknown04/29/2016 14:07:51,spid33s,δ?,frame procname=unknown line=1 sqlhandle=0x00000000000000000000000000000000000000000000000004/29/2016 14:07:51,spid33s,δ?,insert into na_trafficmemberorderrelation (applysource ordermode ordersource orderstate eccode ecprdcode usermobile memberordernum mainproductcode trafficpackagecode billprice flowvalue memberapplycode crmapplycode effecttime exprietime userid createtime ordertime modifytime discountrate) values (@p0 @p1 @p2 @p3 @p4 @p5 @p6 @p7 @p8 @p9 @p10 @p11 @p12 @p13 @p14 @p15 @p16 @p17 @p18 @p19 @p20);04/29/2016 14:07:51,spid33s,δ?,frame procname=adhoc line=1 stmtstart=626 stmtend=1494 sqlhandle=0x020000003e44d0396c586db1098e79558ceb28c112e75ea804/29/2016 14:07:51,spid33s,δ?,executionstack04/29/2016 14:07:51,spid33s,δ?,process id=process71da6bb88 taskpriority=0 logused=2964 waitresource=page: 5:1:22056721 waittime=3741 ownerid=43666495205 transactionname=user_transaction lasttranstarted=2016-04-29t14:07:45.920 xdes=0x3023d7950 lockmode=ix schedulerid=15 kpid=21872 status=suspended spid=518 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2016-04-29t14:07:47.827 lastbatchcompleted=2016-04-29t14:07:47.823 clientapp=.net sqlclient data provider hostname=adc-intfe hostpid=10192 loginname=daemon_adcuser isolationlevel=read committed (2) xactid=43666495205 currentdb=5 locktimeout=4294967295 clientoption1=671088672 clientoption2=12805604/29/2016 14:07:51,spid33s,δ?,(@p0 int@p1 nvarchar(4000)@p2 nvarchar(4000)@p3 int@p4 nvarchar(4000))select top (@p0) trafficmem0_.trafficmemberorderrelationid as trafficm1_433_ trafficmem0_.applysource as applysou2_433_ trafficmem0_.ordermode as ordermode433_ trafficmem0_.ordersource as ordersou4_433_ trafficmem0_.orderstate as orderstate433_ trafficmem0_.eccode as eccode433_ trafficmem0_.ecprdcode as ecprdcode433_ trafficmem0_.usermobile as usermobile433_ trafficmem0_.memberordernum as memberor9_433_ trafficmem0_.mainproductcode as mainpro10_433_ trafficmem0_.trafficpackagecode as traffic11_433_ trafficmem0_.billprice as billprice433_ trafficmem0_.flowvalue as flowvalue433_ trafficmem0_.memberapplycode as membera14_433_ trafficmem0_.crmapplycode as crmappl15_433_ trafficmem0_.effecttime as effecttime433_ trafficmem0_.exprietime as exprietime433_ trafficmem0_.userid as userid433_ trafficmem0_.createtime as createtime433_ trafficmem0_.ordertime as ordertime433_ trafficmem0_.modifytime as modifytime433_ trafficmem004/29/2016 14:07:51,spid33s,δ?,inputbuf04/29/2016 14:07:51,spid33s,δ?,unknown04/29/2016 14:07:51,spid33s,δ?,frame procname=unknown line=1 sqlhandle=0x00000000000000000000000000000000000000000000000004/29/2016 14:07:51,spid33s,δ?,select top (@p0) trafficmem0_.trafficmemberorderrelationid as trafficm1_433_ trafficmem0_.applysource as applysou2_433_ trafficmem0_.ordermode as ordermode433_ trafficmem0_.ordersource as ordersou4_433_ trafficmem0_.orderstate as orderstate433_ trafficmem0_.eccode as eccode433_ trafficmem0_.ecprdcode as ecprdcode433_ trafficmem0_.usermobile as usermobile433_ trafficmem0_.memberordernum as memberor9_433_ trafficmem0_.mainproductcode as mainpro10_433_ trafficmem0_.trafficpackagecode as traffic11_433_ trafficmem0_.billprice as billprice433_ trafficmem0_.flowvalue as flowvalue433_ trafficmem0_.memberapplycode as membera14_433_ trafficmem0_.crmapplycode as crmappl15_433_ trafficmem0_.effecttime as effecttime433_ trafficmem0_.exprietime as exprietime433_ trafficmem0_.userid as userid433_ trafficmem0_.createtime as createtime433_ trafficmem0_.ordertime as ordertime433_ trafficmem0_.modifytime as modifytime433_ trafficmem0_.discountrate as discoun22_433_ from na_trafficmemberorderrelation traffic04/29/2016 14:07:51,spid33s,δ?,frame procname=adhoc line=1 stmtstart=148 sqlhandle=0x0200000043e4d6202dd7a9505f76a2897ed4c6cada41102104/29/2016 14:07:51,spid33s,δ?,executionstack04/29/2016 14:07:51,spid33s,δ?,process id=process5b534c8 taskpriority=0 logused=0 waitresource=page: 5:1:22075507 waittime=3659 ownerid=43666495407 transactionname=select lasttranstarted=2016-04-29t14:07:46.007 xdes=0x58b16cc90 lockmode=s schedulerid=16 kpid=18860 status=suspended spid=653 sbid=0 ecid=0 priority=0 trancount=0 lastbatchstarted=2016-04-29t14:07:46.007 lastbatchcompleted=2016-04-29t14:07:46.007 clientapp=.net sqlclient data provider hostname=adc-intfl hostpid=8820 loginname=daemon_adcuser isolationlevel=read committed (2) xactid=43666495407 currentdb=5 locktimeout=4294967295 clientoption1=671088672 clientoption2=12805604/29/2016 14:07:51,spid33s,δ?,process-list04/29/2016 14:07:51,spid33s,δ?,deadlock victim=process5b534c804/29/2016 14:07:51,spid33s,δ?,deadlock-list
从死锁日志的process-list部份分析,这里有两个进程 :
process id=process5b534c8正在执行对表na_trafficmemberorderrelation的查询
process id=process71da6bb88正在执行对表na_trafficmemberorderrelation的插入
由于死锁日志只记录当前线程开启事务正在执行的最后一句sql,所以要想获取事务的完整sql语句列表,必须通过sql sever profiler跟踪或阅读程序代码。获得的答案是:
查询语句是单独的事务
select top (1) *from na_trafficmemberorderrelation trafficmem0_ where trafficmem0_.usermobile=@p1 and trafficmem0_.trafficpackagecode=@p2 and trafficmem0_.orderstate=@p3 and trafficmem0_.ecprdcode=@p4
插入操作的事务大概是
begin traninsert into na_trafficmemberorderrelation values(...)insert into na_trafficmemberorderrelation values(...)commit
接着往下看resource-list部份,死锁的过程大概是
1、插入线程process71da6bb88已经获得页面(page)22075507上的意向排他锁ix(第一个插入语句执行成功后,在页面上加的ix锁),
2、查询线程process5b534c8获得页面22056721上的共享锁
3、插入线程process71da6bb88需要获得页面22056721上的ix以执行第二条插入语句,但查询进程process5b534c8已经获得了页面22056721上的s锁,而ix和s锁是不兼容的,所以插入线程在执行第二句插入语句时,被查询进程阻塞
4、查询线程process5b534c8需要获得页面22075507上的s以继续执行,但插入线程已经获得了该页面上的ix锁,此时结合第3步,两个进程进入死锁等待
那么问题来了,表na_trafficmemberorderrelation在usermobile字段是有索引的,按正常的索引查找,只需要在索引键上加s锁,所在的页面加is锁就可以了,而is锁和ix锁是兼容的,就不会产生死锁。而这里查询语句确在page上加了锁,那只能说明这个查询使用了索引扫描而不是索引查找(select语句会申请哪些锁,请参考(点击打开链接).
于是通过dbcc page查看页面22056721的内容发现,些页面属于表na_trafficmemberorderrelation上ecprdcode字段的索引,因此断定这个查询语句的执行计划是通过扫描ecprdcode索引来获取结果,执行计划是根据统计信息来生成的,所以事情的经过大概是这样的:
1、表na_trafficmemberorderrelation插入了大量数据,但还没有达到自动更新统计信息的阙值(表na_trafficmemberorderrelation现在6500万条记录).
2、sql server查询优化器为查询语句选择了扫描ecprdcode索引的执行计划(此处sql server应有bug)
3、通过sql server profiler跟踪该语句锁的申请和释放过程发现,是先申请第一个页面的s锁,再申请第二个页面的s锁,再释放第一个页面的s锁(不知微软是出于何考虑),如此循环,导致查询语句持有一个页面的s锁,又去申请下一个页面(上一个页面的s锁还没有释放)的s锁,最终形成死锁
知道了来龙去脉,解决起来就简单了,为查询语句加上forceseek提示:
select top (1) *from na_trafficmemberorderrelation trafficmem0_ with(forceseek) where trafficmem0_.usermobile=@p1 and trafficmem0_.trafficpackagecode=@p2 and trafficmem0_.orderstate=@p3 and trafficmem0_.ecprdcode=@p4
