I would like to request one or more alternate filesystem collection modes for do-agent that do not require the agent to walk and interpret every mounted filesystem on the host.
Some environments have unusual, duplicated, virtualized, or bind-mounted filesystem layouts. In those environments, the current filesystem collector can encounter duplicate or misleading mount data. A safer alternative would allow administrators to collect only the basic disk and inode metrics they actually need.
This request is not for full support of every unusual environment. Instead, it is a request for safer fallback collection modes that could help administrators avoid full mount-table discovery when it is not appropriate for their server.
The three possible approaches are:
- A df-based filesystem collector mode;
- An explicit path-based filesystem collector mode;
- A customer-provided filesystem metrics file.
Any one of these would help. Supporting more than one would give administrators flexibility.
Proposed option 1: df-based filesystem collector mode
Please consider adding a filesystem collector mode that gathers disk and inode metrics using the equivalent of:
df -P
df -Pi
Possible option names:
--collector.filesystem.mode=df
or:
--collector.filesystem.use-df
When enabled, do-agent would collect filesystem space and inode metrics using df-style output instead of walking and interpreting the full mount table through the current filesystem collector.
This would provide the same type of filesystem information administrators already trust from the terminal:
- total space;
- used space;
- available space;
- percent used;
- total inodes;
- used inodes;
- available inodes;
- inode percent used;
- filesystem/device;
- mountpoint.
Possible advanced options:
--collector.filesystem.df-path=/usr/bin/df
--collector.filesystem.df-args="-P"
--collector.filesystem.df-inode-args="-Pi"
If parsing fails, the agent could disable only the df filesystem collector and emit a single warning, rather than repeatedly logging the same failure.
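As a rough illustration of what the df-based mode would need to do, the sketch below parses POSIX-format df output into per-filesystem records. It is only a sketch under assumptions: the function names, the `df_path` default (mirroring the proposed --collector.filesystem.df-path option), and the record keys are hypothetical and are not part of do-agent. The numeric columns are in whatever unit df prints (1024-byte blocks for GNU `df -P`, or inode counts for `df -Pi`).

```python
import subprocess


def parse_df_posix(output):
    """Parse `df -P` or `df -Pi` text into one dict per filesystem row."""
    rows = []
    lines = output.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        # Split into at most 6 fields so a mountpoint containing spaces stays whole.
        parts = line.split(None, 5)
        if len(parts) != 6:
            raise ValueError(f"unexpected df row: {line!r}")
        device, total, used, avail, percent, mountpoint = parts
        rows.append({
            "device": device,
            "total": int(total),
            "used": int(used),
            "avail": int(avail),
            # df prints "-" when a percentage is not meaningful (e.g. no inodes).
            "percent": None if percent == "-" else int(percent.rstrip("%")),
            "mountpoint": mountpoint,
        })
    return rows


def run_df(args=("-P",), df_path="/usr/bin/df"):
    """Run df (hypothetical defaults) and return the parsed rows."""
    result = subprocess.run(
        [df_path, *args], capture_output=True, text=True, check=True
    )
    return parse_df_posix(result.stdout)
```

A parse error raises ValueError, which a caller could treat as the signal to disable only this collector and log one warning. Device names that contain whitespace would not parse correctly with this simple split.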
Why a df-based mode may be enough
On the affected server, standard df output provides a clean and practical filesystem view without exposing the large CageFS bind-mount layout that caused problems for the normal collector.
Example df -h output:
Filesystem Size Used Avail Use% Mounted on
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 1.8G 0 1.8G 0% /dev/shm
tmpfs 732M 74M 658M 11% /run
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
/dev/vda1 120G 30G 90G 26% /
/dev/vda3 507M 316M 191M 63% /boot
/dev/vda2 200M 7.5M 193M 4% /boot/efi
/dev/loop0 3.9G 204K 3.7G 1% /tmp
none 1.8G 4.0K 1.8G 1% /var/lve/dbgovernor-shm
Example df -ih output:
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 447K 344 447K 1% /dev
tmpfs 457K 1 457K 1% /dev/shm
tmpfs 800K 924 800K 1% /run
tmpfs 1.0K 18 1006 2% /sys/fs/cgroup
/dev/vda1 60M 663K 60M 2% /
/dev/vda3 256K 327 256K 1% /boot
/dev/vda2 0 0 0 - /boot/efi
/dev/loop0 256K 49 256K 1% /tmp
none 457K 2 457K 1% /var/lve/dbgovernor-shm
tmpfs 92K 22 92K 1% /run/user/1002
This suggests that the issue is not that filesystem usage cannot be reported on this host. The issue is that the current collector appears to inspect the mount layout in a way that encounters CageFS bind mounts and duplicate filesystem metrics.
A df-based fallback mode could collect the basic disk and inode information administrators already use from the terminal, while avoiding deeper mount-table discovery.
For many servers, this would be sufficient. In this example, the useful monitored filesystems would primarily be:
/
/boot
/boot/efi
/tmp
and possibly /var/lve/dbgovernor-shm, but only if the administrator chooses to include tmpfs-style filesystems.
The agent could optionally ignore common virtual filesystems by default, such as:
devtmpfs
tmpfs
cgroup
cgroup2
proc
sysfs
debugfs
tracefs
overlay
squashfs
This would give do-agent a safer fallback for unusual mount layouts without requiring full CloudLinux/CageFS support.
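The default ignore list could be applied as a simple filter over already-collected rows. The sketch below assumes each row carries an `fstype` key (for example from the Type column of GNU `df -PT`) and a `mountpoint` key. The names `DEFAULT_IGNORED_FSTYPES`, `filter_filesystems`, and `include_mountpoints` are hypothetical, chosen only for illustration.

```python
# Filesystem types skipped by default; mirrors the list in this request.
DEFAULT_IGNORED_FSTYPES = frozenset({
    "devtmpfs", "tmpfs", "cgroup", "cgroup2", "proc",
    "sysfs", "debugfs", "tracefs", "overlay", "squashfs",
})


def filter_filesystems(rows, ignored=DEFAULT_IGNORED_FSTYPES,
                       include_mountpoints=()):
    """Drop rows whose fstype is ignored, unless the mountpoint is opted in."""
    kept = []
    for row in rows:
        opted_in = row["mountpoint"] in include_mountpoints
        if opted_in or row["fstype"] not in ignored:
            kept.append(row)
    return kept
```

With this shape, an administrator who does want a tmpfs-style mount such as /var/lve/dbgovernor-shm reported could opt it in by mountpoint without changing the default list.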
Proposed option 2: explicit path-based filesystem checks
Please consider an option that collects filesystem metrics only for specific administrator-provided paths.
For example:
--collector.filesystem.paths=/,/boot,/boot/efi,/tmp
or:
--collector.filesystem.paths-file=/etc/do-agent/filesystem-paths.conf
Example paths file (one path per line):
/
/boot
/boot/efi
/tmp
When this option is used, do-agent would skip full mountpoint discovery and collect filesystem metrics only for the listed paths.
The behavior could be similar to running:
df -P /
df -P /boot
df -P /boot/efi
df -P /tmp
df -Pi /
df -Pi /boot
df -Pi /boot/efi
df -Pi /tmp
or using equivalent statfs / statvfs calls internally.
This would allow administrators to say: "Only report disk and inode usage for these important paths."
That is often all that is needed for practical alerting.
This would also avoid requiring administrators to craft complex mountpoint exclusion regular expressions for bind-mount-heavy systems.
Proposed option 3: customer-provided filesystem metrics file
Please also consider allowing do-agent to read filesystem metrics from a local file.
For example:
--collector.filesystem.file=/var/lib/do-agent/filesystem-metrics.txt
or:
--collector.filesystem.source=file
--collector.filesystem.file=/var/lib/do-agent/filesystem-metrics.txt
In this model, the customer could generate the file however they prefer:
- df;
- stat;
- a shell script;
- a cron job;
- a monitoring tool;
- a custom parser with environment-specific exclusions.
do-agent would remain the trusted process that submits metrics to DigitalOcean, but the customer would control how filesystem metrics are gathered.
A file-based approach may be safer than an exec-based plugin because do-agent would not need to run arbitrary customer commands. It would only read a documented local file format.
Example conceptual format:
mountpoint=/ size_bytes=128849018880 used_bytes=32212254720 avail_bytes=96636764160 used_percent=26 inode_total=62914560 inode_used=663000 inode_avail=62251560 inode_used_percent=2
mountpoint=/boot size_bytes=531628032 used_bytes=331350016 avail_bytes=200278016 used_percent=63 inode_total=262144 inode_used=327 inode_avail=261817 inode_used_percent=1
mountpoint=/tmp size_bytes=4187593113 used_bytes=208896 avail_bytes=3972844748 used_percent=1 inode_total=262144 inode_used=49 inode_avail=262095 inode_used_percent=1
Or, if preferred, the file could use a documented Prometheus-style text format.
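A reader for the conceptual key=value format could be quite small, which is part of the appeal. The sketch below is only an illustration of that format. The function names are hypothetical, skipping of blank lines and `#` comment lines is an added assumption, and mountpoints containing spaces would need a quoting rule that this request does not define.

```python
INT_FIELDS = (
    "size_bytes", "used_bytes", "avail_bytes", "used_percent",
    "inode_total", "inode_used", "inode_avail", "inode_used_percent",
)


def parse_metrics_line(line):
    """Parse one 'key=value key=value ...' line into a dict."""
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed token: {token!r}")
        record[key] = value
    if "mountpoint" not in record:
        raise ValueError("missing mountpoint field")
    for field in INT_FIELDS:
        if field in record:
            record[field] = int(record[field])  # raises ValueError on bad numbers
    return record


def parse_metrics_file(text):
    """Parse a whole file, ignoring blank lines and '#' comments (assumed)."""
    records = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        records.append(parse_metrics_line(stripped))
    return records
```

Because the agent would only read and validate text, a malformed file can be rejected as a whole without the agent ever executing customer-supplied commands.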
Why this is useful
Some environments have mount tables that are technically valid but difficult for a general-purpose filesystem collector to interpret safely.
Examples include:
- CloudLinux CageFS;
- cPanel/WHM systems;
- chroot-heavy systems;
- bind-mount-heavy systems;
- container-heavy hosts;
- Docker/LXC environments;
- systems with duplicated or virtualized mountpoints.
In these environments, the administrator may not need the agent to understand every mountpoint. They may only need reliable metrics for a few filesystems or paths, such as /, /boot, /boot/efi, and /tmp.
A df-style mode, explicit path mode, or customer-provided file mode would avoid unnecessary full mount discovery and reduce the risk of duplicate filesystem metrics.
Example use case: CloudLinux / CageFS / cPanel
I understand that CloudLinux/CageFS is not officially supported by do-agent. This feature request is not asking for full CloudLinux support.
However, this environment is a good example of why safer alternate filesystem collection modes would be useful.
Environment:
- DigitalOcean Droplet;
- CloudLinux + cPanel/WHM;
- CageFS enabled;
- do-agent upgraded automatically from 3.18.10-1 to 3.18.12-1;
- Upgrade occurred around 2026-04-24 03:49 UTC.
After the upgrade, the filesystem collector began repeatedly logging duplicate metric errors related to CageFS bind mounts.
The repeated mountpoints were under:
/usr/share/cagefs-skeleton/
The logs repeatedly contained errors similar to:
failed to gather metrics: collected metric "node_filesystem_size_bytes" ... was collected before with the same name and label values
The impact was significant:
- sustained high CPU usage, around 75%;
- approximately 55 GB /var/log/messages;
- approximately 40 GB rotated messages log;
- disk exhaustion;
- WHM/cPanel service interruption.
Disabling do-agent immediately stopped the log flood, and CPU usage returned to normal.
In this case, I did not need the agent to inspect CageFS mountpoints. I only needed basic disk and inode metrics for the main filesystems. Commands such as the following were sufficient to show the information I needed:
df -h
df -ih
or, for specific paths:
df -P /
df -P /boot
df -P /boot/efi
df -P /tmp
df -Pi /
df -Pi /boot
df -Pi /boot/efi
df -Pi /tmp
Current workaround
The only safe workaround I currently have is to disable the filesystem collector entirely:
/opt/digitalocean/bin/do-agent --syslog --no-collector.filesystem
That prevents the runaway filesystem collector behavior, but it also removes the DigitalOcean filesystem metrics I actually need for this Droplet.
This creates an unfortunate tradeoff:
- leave filesystem collection enabled and risk duplicate metric errors, runaway logging, high CPU usage, and disk exhaustion;
- disable filesystem collection and lose the disk/inode metrics that would help detect or prevent disk exhaustion.
A safer alternate collection mode would avoid this tradeoff by allowing do-agent to report basic filesystem usage without walking the full mount layout.
Why mountpoint exclusion rules are not always enough
Mountpoint exclusion rules are useful, but they still require the agent to discover and reason about the host’s mount layout.
In bind-mount-heavy or CageFS-style environments, that discovery process can be fragile. Administrators may also have to write complex regular expressions to exclude paths the agent did not need to inspect in the first place.
A df-based, path-based, or file-based mode would be simpler and more predictable:
- do not walk every mountpoint;
- do not inspect CageFS bind mounts unnecessarily;
- do not require complex mountpoint exclusion regular expressions;
- collect only the filesystems or paths the administrator explicitly cares about;
- allow administrators to generate clean filesystem metrics themselves when needed.
Requested features
Please consider adding one or more of the following options.
df-based mode
--collector.filesystem.mode=df
or:
--collector.filesystem.use-df
This would collect filesystem space and inode metrics using the equivalent of df -P and df -Pi.
Explicit path mode
--collector.filesystem.paths=/,/boot,/boot/efi,/tmp
or:
--collector.filesystem.paths-file=/etc/do-agent/filesystem-paths.conf
This would collect filesystem metrics only for explicitly configured paths.
Customer-provided file mode
--collector.filesystem.file=/var/lib/do-agent/filesystem-metrics.txt
or:
--collector.filesystem.source=file
--collector.filesystem.file=/var/lib/do-agent/filesystem-metrics.txt
This would allow customers to generate filesystem metrics themselves and let do-agent read and submit them.
Additional defensive behavior
Even when an environment is unsupported, it may also be helpful for the agent to handle repeated filesystem collector failures more defensively.
For example:
- rate-limit repeated duplicate metric errors;
- disable only the affected collector after repeated failures;
- emit one clear warning instead of repeatedly logging the same error;
- avoid filling system logs when the metrics collector is unhealthy.
A metrics issue should not be able to fill /var/log/messages, exhaust disk space, and contribute to a production service outage.
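One way to picture the "disable after repeated failures, warn once" behavior is a small wrapper around a collector function. This is a sketch of the general pattern only, not a description of how do-agent is structured. The class name, the `max_failures` threshold, and the log wording are all hypothetical.

```python
import logging


class CollectorGuard:
    """Disable a collector after repeated consecutive failures, warning once."""

    def __init__(self, name, max_failures=3, logger=None):
        self.name = name
        self.max_failures = max_failures
        self.failures = 0
        self.disabled = False
        self.log = logger or logging.getLogger("collector-guard")

    def run(self, collect):
        """Call `collect()` unless disabled; return its result or None."""
        if self.disabled:
            return None  # stay silent: the single warning was already emitted
        try:
            result = collect()
        except Exception as exc:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.disabled = True
                self.log.warning(
                    "collector %s disabled after %d consecutive failures: %s",
                    self.name, self.failures, exc,
                )
            return None
        self.failures = 0  # a success resets the consecutive-failure count
        return result
```

With a guard like this, a persistent duplicate-metric error would produce one warning line in total instead of one error per scrape, and the other collectors would keep running.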
Related issues
This request may also help with or relate to other reports involving CloudLinux support, duplicate metric collection, or high CPU from filesystem metric collection:
This feature request is more specific: provide safer alternate filesystem collection modes, such as df-based collection, explicit path collection, or customer-provided filesystem metrics, so that users do not have to choose between unsafe full mount discovery and disabling filesystem metrics entirely.
Trouble Ticket
I also opened a DigitalOcean Support ticket for this incident in April. Support confirmed that CloudLinux/CageFS is not officially supported by do-agent.
This request is not for full CloudLinux support, but for a safer alternative filesystem collection mode that could help unsupported or unusual mount layouts avoid full mount-table discovery.
[#12093409](https://cloudsupport.digitalocean.com/s/case-detail?recordId=500QP00001QvKFRYA3) do-agent 3.18.12 causes runaway logging and high CPU on CloudLinux CageFS Droplet