Story Details

  • Linux kernel cgroups writeback high CPU troubleshooting

    Posted: 2025-02-14 08:30:27

    The blog post details troubleshooting high CPU usage attributed to the writeback process in a Linux kernel. After initial investigations pointed towards cgroups and specifically the cpu.cfs_period_us parameter, the author traced the issue to a tight loop within the cgroup writeback mechanism. This loop was triggered by a large number of cgroups combined with a specific workload pattern. Ultimately, increasing the dirty_expire_centisecs kernel parameter, which controls how long dirty data stays in memory before being written to disk, provided the solution by significantly reducing the writeback activity and lowering CPU usage.

    Summary of Comments ( 15 )
    https://news.ycombinator.com/item?id=43046174

    Commenters on Hacker News largely discuss practical troubleshooting steps and potential causes of the high CPU usage related to cgroups writeback described in the linked blog post. Several suggest using tools like perf to profile the kernel and pinpoint the exact function causing the issue. Some discuss potential problems with the storage layer, like slow I/O or a misconfigured RAID, while others consider the possibility of a kernel bug or an interaction with specific hardware or drivers. One commenter shares a similar experience with NFS and high CPU usage related to writeback, suggesting a potential commonality in networked filesystems. Several users emphasize the importance of systematic debugging and isolation of the problem, starting with simpler checks before diving into complex kernel analysis.