The pacemaker `Filesystem` resource kills unrelated processes when the resource is stopped

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux Server 8, 9 (with the High Availability Add-On)

Issue

  • The pacemaker Filesystem resource kills unrelated processes when the resource is stopped and nested filesystems are present.

  • The pacemaker Filesystem resource kills unrelated processes when the resource is stopped even when no filesystems are nested. Instead of signalling only the processes that are accessing the filesystem, it sends KILL to unrelated processes such as systemd and kernel threads:

       Mar 15 13:18:09 node42 Filesystem(data)[3045553]: INFO: Running stop for /dev/mapper/vgdata-lv1 on /data
       Mar 15 13:18:09 node42 Filesystem(data)[3045582]: INFO: Trying to unmount /data
       Mar 15 13:18:09 node42 Filesystem(data)[3045595]: INFO: No processes on /data were signalled. force_unmount is set to 'true'
       Mar 15 13:18:10 node42 systemd[1]: data.mount: Succeeded.
       Mar 15 13:18:12 node42 Filesystem(data)[3045615]: INFO: sending signal KILL to: root           1       0  0  2025 ?        Ss     9:31 /usr/lib/systemd/systemd --switched-root --system --deserialize 17
       Mar 15 13:18:12 node42 Filesystem(data)[3045624]: INFO: sending signal KILL to: root           2       0  0  2025 ?        S      0:01 [kthreadd]
       Mar 15 13:18:12 node42 Filesystem(data)[3045633]: INFO: sending signal KILL to: root           3       2  0  2025 ?        I<     0:00 [rcu_gp]
       Mar 15 13:18:12 node42 Filesystem(data)[3045642]: INFO: sending signal KILL to: root           4       2  0  2025 ?        I<     0:00 [rcu_par_gp]
       Mar 15 13:18:12 node42 Filesystem(data)[3045651]: INFO: sending signal KILL to: root           5       2  0  2025 ?        I<     0:00 [slub_flushwq]
       Mar 15 13:18:12 node42 Filesystem(data)[3045660]: INFO: sending signal KILL to: root           7       2  0  2025 ?        I<     0:00 [kworker/0:0H-events_highpri]
       Mar 15 13:18:12 node42 Filesystem(data)[3045669]: INFO: sending signal KILL to: root          10       2  0  2025 ?        I<     0:00 [mm_percpu_wq]
       [....]
       Mar 15 13:18:13 node42 Filesystem(data)[3046638]: INFO: sending signal KILL to: root         132       2  0  2025 ?        S      0:00 [cpuhp/20]
       Mar 15 13:18:13 node42 Filesystem(data)[3046647]: INFO: sending signal KILL to: root         133       2  0  2025 ?        S      0:02 [watchdog/20]
       Mar 15 13:18:13 node42 Filesystem(data)[3046656]: INFO: sending signal KILL to: root         134       2  0  2025 ?        S      0:00 [migration/20]
       Mar 15 13:18:13 node42 Filesystem(data)[3046665]: INFO: sending signal KILL to: root         135       2  0  2025 ?        S      0:07 [ksoftirqd/20]
    

Resolution

Red Hat Enterprise Linux 8

Red Hat Enterprise Linux 9

Root Cause

The issue is addressed by the following fix: Merge pull request #1945 from gianlucapiccolo/fix-1944 · ClusterLabs/resource-agents@5a120c1. The issue can cause processes running on a different filesystem to be killed when a filesystem managed by the pacemaker Filesystem resource agent is stopped.

The agent can try to kill processes running on the following filesystems, which may or may not be managed by pacemaker:

  • A filesystem nested below the pacemaker managed resource
  • The "/" filesystem.

Diagnostic Steps

Check whether any filesystem managed by pacemaker is nested with another pacemaker-managed or local filesystem. For example, consider the following layout:

  • /foo: Managed locally or managed by pacemaker with a Filesystem resource-agent.
  • /foo/bar: Managed by pacemaker with a Filesystem resource-agent.

When the Filesystem resource /foo/bar is stopped, it also kills processes using the Filesystem resource /foo.
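
The nesting check described above can be sketched in Python. This is only an illustrative helper, not part of pacemaker or the resource agent: the function names (`read_mount_points`, `is_nested`, `related_mounts`) are hypothetical, and it assumes the mount table is readable at /proc/self/mounts. It lists mount points nested below a given mount point and the mount points that contain it.

```python
import os
import sys


def read_mount_points(path="/proc/self/mounts"):
    """Return the mount points (second field) from a mounts table.

    Assumption: the table is in the usual /proc/self/mounts format.
    Escaped characters such as \\040 (space) are left as-is.
    """
    points = []
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:
                points.append(fields[1])
    return points


def is_nested(child, parent):
    """True if `child` is a path strictly below `parent`."""
    child = os.path.normpath(child)
    parent = os.path.normpath(parent)
    if child == parent:
        return False
    if parent == "/":
        return True
    # Compare on a path-component boundary so /foobar is not "below" /foo.
    return child.startswith(parent + "/")


def related_mounts(target, mount_points):
    """Split mount points into those below and those above `target`."""
    unique = sorted(set(os.path.normpath(p) for p in mount_points))
    return {
        "below": [p for p in unique if is_nested(p, target)],
        "above": [p for p in unique if is_nested(target, p)],
    }


if __name__ == "__main__":
    # Usage: python3 nested_mounts.py /foo/bar
    result = related_mounts(sys.argv[1], read_mount_points())
    print("mounted below:", result["below"])
    print("mounted above:", result["above"])
```

Every mount point except / has / "above" it, so the more interesting entries are any other mount points in either list, such as /foo for a resource mounted at /foo/bar.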

In addition, we have seen this occur with a filesystem mounted at /data, where stopping the resource tries to kill processes on the / filesystem, which is not managed by pacemaker.
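
To check whether a node has already hit this behaviour, the system log can be scanned for the agent's "sending signal" messages shown in the Issue section. The following is a minimal sketch, assuming the log lines have the same layout as the excerpt above and that the log file path is passed on the command line. The function names are hypothetical, and treating PID 1, PID 2, and children of PID 2 (kernel threads) as "unrelated" is a heuristic chosen for this example, not logic taken from the resource agent.

```python
import re
import sys

# Matches lines such as:
#   ... Filesystem(data)[3045615]: INFO: sending signal KILL to: root  1  0  0 ...
KILL_LINE = re.compile(
    r"Filesystem\((?P<resource>[^)]+)\)\[\d+\]: INFO: "
    r"sending signal (?P<signal>\w+) to: "
    r"(?P<user>\S+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s"
)


def parse_kill_line(line):
    """Return resource, signal, pid and ppid for a kill message, else None."""
    match = KILL_LINE.search(line)
    if match is None:
        return None
    return {
        "resource": match.group("resource"),
        "signal": match.group("signal"),
        "pid": int(match.group("pid")),
        "ppid": int(match.group("ppid")),
    }


def is_unrelated_target(pid, ppid):
    """Heuristic: init (PID 1), kthreadd (PID 2) and kernel threads (PPID 2)
    should never be signalled when a filesystem resource is stopped."""
    return pid in (1, 2) or ppid == 2


if __name__ == "__main__":
    # Usage: python3 scan_kills.py <path-to-log-file>
    with open(sys.argv[1], errors="replace") as log:
        for raw in log:
            entry = parse_kill_line(raw)
            if entry and is_unrelated_target(entry["pid"], entry["ppid"]):
                print(raw.rstrip())
```

Any line printed by this sketch indicates that a Filesystem resource stop tried to signal systemd or a kernel thread, which matches the symptom described in this solution.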


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.