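The conclusion first: https://jira.terracotta.org/jira/browse/QTZ-45

The fix: https://issues.jboss.org/browse/JBAS-8598. In short, upgrading Quartz to 1.8.3 resolves it.

Where the problem came from:

A deadlock showed up in our project's code. The thread dump from jstack: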

Found one Java-level deadlock:

=============================

"resin-port-9007-1558":

waiting to lock monitor 0x00007f6e04001a38 (object 0x00000007890586c0, a java.lang.Object),

which is held by "scheduler_Worker-2"

"scheduler_Worker-2":

waiting to lock monitor 0x00007f6e34004a78 (object 0x0000000788fc1a68, a java.lang.Object),

which is held by "scheduler_QuartzSchedulerThread"

"scheduler_QuartzSchedulerThread":

waiting to lock monitor 0x00007f6e04001a38 (object 0x00000007890586c0, a java.lang.Object),

which is held by "scheduler_Worker-2"

"scheduler_Worker-2":

at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:204)

- waiting to lock <0x0000000788fc1a68> (a java.lang.Object)

at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:87)

at org.quartz.simpl.RAMJobStore.triggeredJobComplete(RAMJobStore.java:1408)

- locked <0x00000007890586c0> (a java.lang.Object)

at org.quartz.core.QuartzScheduler.notifyJobStoreJobComplete(QuartzScheduler.java:1767)

at org.quartz.core.JobRunShell.run(JobRunShell.java:270)

at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)

"scheduler_QuartzSchedulerThread":

at org.quartz.simpl.RAMJobStore.releaseAcquiredTrigger(RAMJobStore.java:1282)

- waiting to lock <0x00000007890586c0> (a java.lang.Object)

at org.quartz.core.QuartzSchedulerThread.releaseIfScheduleChangedSignificantly(QuartzSchedulerThread.java:432)

at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:288)

- locked <0x0000000788fc1a68> (a java.lang.Object)

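Reading the code, the logic is as follows:

RAMJobStore and QuartzSchedulerThread each have their own lock object.

Path 1: QuartzSchedulerThread -> RAMJobStore.

QuartzSchedulerThread is itself a thread. Its run method takes its own lock, checks whether the schedule has changed, and then calls a method on RAMJobStore (releaseAcquiredTrigger in the dump above), which needs the RAMJobStore lock.

Path 2: RAMJobStore -> QuartzSchedulerThread.

When a scheduled job finishes, the worker thread calls RAMJobStore (triggeredJobComplete in the dump above). RAMJobStore takes its own lock and then calls a QuartzSchedulerThread method (signalSchedulingChange) to report the completion, which needs the QuartzSchedulerThread lock.

Path 1 and path 2 acquire the two locks in opposite order, so a deadlock forms.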
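The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not the Quartz code: `LockOrderDemo`, `schedulerLock`, `jobStoreLock` and `provokeDeadlock` are hypothetical names standing in for the two lock objects and the two call paths. A latch forces both threads to hold their first lock before either requests the second, and `ThreadMXBean` is used to ask the JVM whether it sees a monitor deadlock, which is the same information jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-ins for illustration; these are not the real Quartz classes.
class LockOrderDemo {
    // Plays the role of the lock owned by QuartzSchedulerThread.
    private static final Object schedulerLock = new Object();
    // Plays the role of the lock owned by RAMJobStore.
    private static final Object jobStoreLock = new Object();

    /** Runs both paths with opposite lock order; returns true if the JVM reports a deadlock. */
    static boolean provokeDeadlock() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread schedulerThread = new Thread(() -> {
            synchronized (schedulerLock) {           // path 1: scheduler lock first
                awaitOther(bothHoldFirstLock);
                synchronized (jobStoreLock) { }      // then the job store lock
            }
        }, "demo-scheduler");

        Thread worker = new Thread(() -> {
            synchronized (jobStoreLock) {            // path 2: job store lock first
                awaitOther(bothHoldFirstLock);
                synchronized (schedulerLock) { }     // then the scheduler lock
            }
        }, "demo-worker");

        // Daemon threads so the JVM can still exit while they are stuck.
        schedulerThread.setDaemon(true);
        worker.setDaemon(true);
        schedulerThread.start();
        worker.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {              // poll for up to about 5 seconds
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null && ids.length >= 2) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    // Signals that this thread holds its first lock, then waits for the other thread.
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + provokeDeadlock());
    }
}
```

In the real application there is no latch, so the deadlock only appears when the two paths happen to interleave this way, which is why it can stay hidden for a long time.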

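How does 1.8.3 resolve the deadlock?

It changes path 1: when checking the state of the scheduled jobs, QuartzSchedulerThread no longer calls the RAMJobStore method to try to cancel the job.

0818b9ca8b590ca3270a3433284dd417.png In 1.8.0, the code at the first red line in the screenshot was the same as the code at the second red line. In 1.8.3, the code at the first red line no longer calls into RAMJobStore's locked method, so the deadlock is resolved.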
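The idea behind the fix can be sketched the same way. This is an illustration of the general principle only, not the actual Quartz 1.8.3 change: `NoNestedLockDemo`, `scheduleChanged` and `runBothPaths` are hypothetical names. The scheduler path keeps its own lock but no longer requests the job store lock while holding it, so only one path nests the locks and no cycle is possible.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-ins for illustration; these are not the real Quartz classes.
class NoNestedLockDemo {
    private static final Object schedulerLock = new Object();
    private static final Object jobStoreLock = new Object();
    private static boolean scheduleChanged = false;  // guarded by schedulerLock

    /** Runs both paths under the same forced interleaving; returns true if both threads finish. */
    static boolean runBothPaths() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread schedulerThread = new Thread(() -> {
            synchronized (schedulerLock) {
                awaitOther(bothHoldFirstLock);
                // Path 1 after the change: only scheduler-owned state is touched.
                // jobStoreLock is never requested while schedulerLock is held.
                scheduleChanged = false;
            }
        }, "demo-scheduler");

        Thread worker = new Thread(() -> {
            synchronized (jobStoreLock) {            // path 2 is unchanged
                awaitOther(bothHoldFirstLock);
                synchronized (schedulerLock) {       // granted once path 1 releases it
                    scheduleChanged = true;
                }
            }
        }, "demo-worker");

        schedulerThread.setDaemon(true);
        worker.setDaemon(true);
        schedulerThread.start();
        worker.start();
        try {
            schedulerThread.join(5000);
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !schedulerThread.isAlive() && !worker.isAlive();
    }

    // Signals that this thread holds its first lock, then waits for the other thread.
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("both paths finished: " + runBothPaths());
    }
}
```

Removing the nested acquisition from one path is usually simpler than imposing a global lock order on both, which matches the description of the fix: only path 1 had to change.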
