Uploaded image for project: 'XWiki Platform'
  1. XWiki Platform
  2. XWIKI-19727

ReplayJob blocking EM after getting stuck while waiting for an answer for deleting unused pages

    XMLWordPrintable

Details

    • Bug
    • Resolution: Solved By
    • Major
    • 15.5.2, 14.10.6, 15.7
    • 13.10.4
    • Extension
    • None
    • Unknown

    Description

      We have noticed a situation where EM becomes unusable (unable to install/uninstall extensions on the main wiki or the affected subwiki) after a replay plan was used on a subwiki to uninstall some XAR extensions.

      In the Extension History of the affected wiki, a replay plan is reported to be ongoing, but the job log shows a "Successfully applied [UNINSTALL] for extension [ludovicdubost:todo-macro] ..." last entry, while never finishing.

      Upon doing a thread dump, we can see that, in fact, an underlying job from that plan is stuck waiting for an answer to a delete pages question:

      "extension/wiki:subwiki job group daemon thread - org.xwiki.extension.job.history.internal.ReplayJob@f2193f3([extension, history, 1652692189767-830])" #10694
         java.lang.Thread.State: WAITING
              at jdk.internal.misc.Unsafe.park(Native Method)
              - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@23b59026
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
              at org.xwiki.job.AbstractJobStatus.ask(AbstractJobStatus.java:364)
              at org.xwiki.job.AbstractJobStatus.ask(AbstractJobStatus.java:331)
              at org.xwiki.extension.xar.internal.handler.XarExtensionJobFinishedListener.maybeDeletePages(XarExtensionJobFinishedListener.java:265)
              at org.xwiki.extension.xar.internal.handler.XarExtensionJobFinishedListener.onEvent(XarExtensionJobFinishedListener.java:155)
              at org.xwiki.observation.internal.DefaultObservationManager.notify(DefaultObservationManager.java:320)
              at org.xwiki.observation.internal.DefaultObservationManager.notify(DefaultObservationManager.java:285)
              at org.xwiki.job.AbstractJob.jobFinished(AbstractJob.java:294)
              at org.xwiki.extension.job.internal.AbstractExtensionJob.jobFinished(AbstractExtensionJob.java:134)
              at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:248)
              at org.xwiki.job.AbstractJob.run(AbstractJob.java:220)
              at org.xwiki.extension.job.history.internal.ReplayJob.replay(ReplayJob.java:137)
              at org.xwiki.extension.job.history.internal.ReplayJob.runInternal(ReplayJob.java:120)
              at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:243)
              at org.xwiki.job.AbstractJob.run(AbstractJob.java:233)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(Thread.java:829)
      

      Using the Active Jobs Application (https://extensions.xwiki.org/xwiki/bin/view/Extension/Active%20Jobs%20Application/) to look at the Jobs Module, and active job is reported as Running but with no question active:
      ( extra debugging added: )

        println indentation + "* " + jobStatus.getCurrentRecord()
        println indentation + "* " + jobStatus.getCurrentRecord().id;
        println indentation + "* " + jobStatus.getCurrentRecord().answers;
        println indentation + "* " + jobStatus.getCurrentRecord().answers.values()[0];
        println indentation + "* " + jobStatus.getCurrentRecord().answers.values()[0].class;
        println indentation + "* " + jobStatus.getCurrentRecord().answers.values()[0].answers;
        println indentation + "* " + jobStatus.getCurrentRecord().request;
        println indentation + "* " + jobStatus.getQuestion();
      

      outputs:

      [Group] extension/wiki:documentation
      
          executor: org.xwiki.job.internal.DefaultJobExecutor$JobGroupExecutor@36950378[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 25]
              [Job] extension/history/1652692189767-830 (org.xwiki.extension.job.history.internal.ReplayJob)
                  Started at: Mon May 16 11:09:49 CEST 2022
      *debug begin:*
                  org.xwiki.extension.job.history.ExtensionJobHistoryRecord@7d88cab3
                  1652688797164-uninstall-wiki:documentation
                  [org.xwiki.extension.xar.question.CleanPagesQuestion:org.xwiki.extension.xar.internal.question.CleanPagesQuestionRecorder@26bb2a34]
                  org.xwiki.extension.xar.internal.question.CleanPagesQuestionRecorder@26bb2a34
                  class org.xwiki.extension.xar.internal.question.CleanPagesQuestionRecorder
                  [documentation:XWiki.TodoMacroService():true, documentation:XWiki.TodoListMacro():true, documentation:XWiki.TodoMacroInstall():true, documentation:XWiki.TodoMacroGroovy():true, documentation:XWiki.TodoMacro():true, documentation:Todo.ListDemo():true, documentation:Todo.AdvancedDemo():true, documentation:Todo.SimpleDemo():true, documentation:Todo.WebHome():true]
                  org.xwiki.extension.job.UninstallRequest@9a083e74
                  null
      *debug end.*
                  Cancelable: false
              Queue size: 1
                  extension/plan/org.xwiki.contrib:application-task-ui/wiki:documentation (org.xwiki.extension.job.internal.InstallPlanJob)
      

      In the job status xml on the filesystem was only showing the success of the todo-macro uninstalled XAR extension, similar to the UI.

      It seems like the question corresponds to a different subtask/subjob than the actually reported one (which is finished).

      According to tmortagne, any question inside a subtask of the ReplayJob should have been bubbled up to the top job, to be displayed in the UI and even in the Active Jobs extension, but it was not happening in our case, which suggests a bug.

      If more details and reproduction steps become available, I will provide them in the comments. So far, that is all we have.

      Attachments

        Issue Links

          Activity

            People

              lucaa Anca Luca
              enygma Eduard Moraru
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: