After we upgraded SharePoint to 2013, we found that some workflows
failed intermittently. The failures included not only custom workflows
but also out-of-the-box (OoB) approval workflows. After debugging the failed
OoB workflows, we found that most of them had a due date or
duration configured. This points to an issue with the “Bulk workflow task processing”
timer job, which should be triggered by the “SharePoint Foundation Workflow Timer Service”.
If we look at the SharePoint workflow architecture, the “SharePoint
Foundation Workflow Timer Service” is the process that triggers
workflows with a due date or duration configured. This workflow timer service
needs access to all references, such as assemblies and workflow definitions,
in order to process the workflows successfully. In our case, previous
consultants had configured the “SharePoint Foundation Workflow Timer Service” on the
application servers, NOT the WFEs. We normally do not deploy solutions and workflows
to application servers.
This also explains why the Muhimbi workflow failed in
production. The Muhimbi installation installs many features in the 15 hive, one
workflow entry in web.config, and one workflow DLL in the GAC, as shown in the following three screenshots.
If the “SharePoint Foundation Workflow Timer Service” is running on application servers, the service cannot find the workflow definition, related DLLs, or other references, so the workflow fails, as other people have reported for different workflows. We might see similar issues with the following workflows:
- Third-party workflows like the Muhimbi workflow
- Custom Visual Studio workflow activities
- Designer workflows
- OoB workflows with due date or duration
The fix for intermittently failing workflows on SharePoint 2013 is to run the “SharePoint Foundation Workflow Timer Service” on all WFEs instead, as Microsoft recommends. The “Microsoft SharePoint Foundation Workflow Timer Service” should run on a server where the “Microsoft SharePoint Foundation Web Application” service is running, that is, a WFE server. Of course, you could run the “Microsoft SharePoint Foundation Workflow Timer Service” on a different server with the configuration described by Microsoft. We have had other SharePoint customers resolve similar workflow issues by reconfiguring the “SharePoint Foundation Workflow Timer Service” to run on the WFEs.
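The placement rule above can be sketched as a small check. This is only an illustration: the farm layout, server names, and function names are made up, and in practice the list of services per server would come from Central Administration or your own inventory.

```python
# Service display names as used in this post.
WORKFLOW_TIMER = "Microsoft SharePoint Foundation Workflow Timer Service"
WEB_APPLICATION = "Microsoft SharePoint Foundation Web Application"


def misplaced_workflow_timer_servers(farm):
    """Servers that run the workflow timer service without the web application service."""
    return sorted(
        server
        for server, services in farm.items()
        if WORKFLOW_TIMER in services and WEB_APPLICATION not in services
    )


def wfes_missing_workflow_timer(farm):
    """WFEs (servers with the web application service) that lack the workflow timer service."""
    return sorted(
        server
        for server, services in farm.items()
        if WEB_APPLICATION in services and WORKFLOW_TIMER not in services
    )


# Hypothetical farm that mirrors the misconfiguration described above.
example_farm = {
    "WFE01": {WEB_APPLICATION},
    "WFE02": {WEB_APPLICATION},
    "APP01": {WORKFLOW_TIMER},
}

print(misplaced_workflow_timer_servers(example_farm))
print(wfes_missing_workflow_timer(example_farm))
```

For the example farm, the first list names the application server that should stop running the service, and the second names the WFEs where it should be started.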
In addition, here are some steps you might try to debug the workflow issue.
- Set all workflow log levels to verbose: check General, Timer, and Workflow Infrastructure under the SharePoint Foundation category, and select Verbose as the least critical event to report to the trace log.
- Verify that the workflow service account has full access to all content databases and the config database
- Restart workflow timer jobs
| Timer job | Description | Schedule |
|---|---|---|
| Workflow | Processes workflow events that are in the scheduled items table, such as delays. | Every 5 minutes |
| Workflow auto cleanup | Deletes tasks and instances in the workflow instance table for workflows that were marked completed more than n days in the past, where n is specified in the workflow association. Crawls through tasks and the workflow instance table. | Daily |
| Workflow failover | Processes events for workflows that have failed and are marked to be retried. | Every 15 minutes |
| Bulk workflow task processing | Processes bulk workflow task completion. | Daily |
- Restart SharePoint timer job
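
Once verbose logging is enabled, the trace log grows quickly, so it helps to filter it down to the categories named above. The following is a minimal sketch, assuming the trace log is tab-separated with the column order shown in `ULS_COLUMNS`; verify that against the header line of your own log. The function name and the sample log lines are made up for illustration.

```python
# Assumed ULS column order; check the header line of your own trace log.
ULS_COLUMNS = ["Timestamp", "Process", "TID", "Area", "Category",
               "EventID", "Level", "Message", "Correlation"]

# Categories named in the debugging step above.
WORKFLOW_CATEGORIES = {"General", "Timer", "Workflow Infrastructure"}


def workflow_entries(log_text, area="SharePoint Foundation"):
    """Return log entries (as dicts) for the given area and the workflow-related categories."""
    entries = []
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) < len(ULS_COLUMNS):
            continue  # skip blank or truncated lines
        entry = {name: value.strip() for name, value in zip(ULS_COLUMNS, fields)}
        if entry["Timestamp"] == "Timestamp":
            continue  # skip the header line
        if entry["Area"] == area and entry["Category"] in WORKFLOW_CATEGORIES:
            entries.append(entry)
    return entries


# Made-up sample lines; event IDs and messages are placeholders, not real log output.
sample_log = "\n".join([
    "\t".join(ULS_COLUMNS),
    "\t".join(["01/01/2024 10:00:00.00", "OWSTIMER.EXE (0x0001)", "0x0002",
               "SharePoint Foundation", "Workflow Infrastructure",
               "zz00", "Verbose", "Placeholder workflow message", ""]),
    "\t".join(["01/01/2024 10:00:01.00", "w3wp.exe (0x0003)", "0x0004",
               "SharePoint Server Search", "Query",
               "zz01", "Medium", "Placeholder search message", ""]),
])

for entry in workflow_entries(sample_log):
    print(entry["Timestamp"], entry["Category"], entry["Message"])
```

Note that General is a broad category, so expect unrelated entries alongside the workflow ones.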