记一次 Ubuntu持续io写入排查
2021年10月13日...大约 2 分钟
前言
今天发现树莓派的硬盘灯一直在闪,使用 iotop
查看后发现 jbd2/mmcblk0p2-
是罪魁祸首,而 jbd2
进程负责 ext4
这种日志文件系统的日志提交操作。
接下来主要针对sync操作频繁进行检查
排查
1.先开启内核trace检查
在 jbd2执行 flush 时输出日志
echo 1 > /sys/kernel/debug/tracing/events/jbd2/jbd2_commit_flushing/enable
在任意进程执行 sync 时输出日志
echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
然后观察日志输出
cat /sys/kernel/debug/tracing/trace_pipe
输出了下面的信息:
jbd2/mmcblk0p2--820 [001] ...1 260452.131488: jbd2_commit_flushing: dev 179,2 transaction 1516530 sync 0
snapd-357038 [003] .... 260452.145505: ext4_sync_file_enter: dev 179,2 ino 1087 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [001] ...1 260452.149615: jbd2_commit_flushing: dev 179,2 transaction 1516531 sync 0
snapd-357038 [003] .... 260452.152523: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [000] ...1 260452.152549: jbd2_commit_flushing: dev 179,2 transaction 1516532 sync 0
snapd-358581 [000] .... 260452.652817: ext4_sync_file_enter: dev 179,2 ino 1153 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [002] ...1 260452.658712: jbd2_commit_flushing: dev 179,2 transaction 1516533 sync 0
snapd-358581 [001] .... 260452.664249: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [002] ...1 260452.664272: jbd2_commit_flushing: dev 179,2 transaction 1516534 sync 0
snapd-358581 [002] .... 260452.675168: ext4_sync_file_enter: dev 179,2 ino 1087 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [001] ...1 260452.680429: jbd2_commit_flushing: dev 179,2 transaction 1516535 sync 0
snapd-358581 [002] .... 260452.683239: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [001] ...1 260452.683263: jbd2_commit_flushing: dev 179,2 transaction 1516536 sync 0
snapd-358581 [002] .... 260453.182930: ext4_sync_file_enter: dev 179,2 ino 1153 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [001] ...1 260453.205389: jbd2_commit_flushing: dev 179,2 transaction 1516537 sync 0
snapd-358581 [002] .... 260453.209767: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [001] ...1 260453.209809: jbd2_commit_flushing: dev 179,2 transaction 1516538 sync 0
snapd-358581 [003] .... 260453.224301: ext4_sync_file_enter: dev 179,2 ino 1087 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [002] ...1 260453.229586: jbd2_commit_flushing: dev 179,2 transaction 1516539 sync 0
snapd-358581 [003] .... 260453.232238: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [002] ...1 260453.232259: jbd2_commit_flushing: dev 179,2 transaction 1516540 sync 0
snapd-356108 [000] .... 260453.727280: ext4_sync_file_enter: dev 179,2 ino 1153 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [003] ...1 260453.731968: jbd2_commit_flushing: dev 179,2 transaction 1516541 sync 0
snapd-356108 [002] .... 260453.737160: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [003] ...1 260453.737204: jbd2_commit_flushing: dev 179,2 transaction 1516542 sync 0
snapd-356108 [002] .... 260453.749038: ext4_sync_file_enter: dev 179,2 ino 1087 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [003] ...1 260453.754401: jbd2_commit_flushing: dev 179,2 transaction 1516543 sync 0
snapd-356108 [002] .... 260453.760075: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
jbd2/mmcblk0p2--820 [003] ...1 260453.760100: jbd2_commit_flushing: dev 179,2 transaction 1516544 sync 0
snapd-356101 [001] .... 260453.872052: ext4_sync_file_enter: dev 179,2 ino 1153 parent 74600 datasync 0
jbd2/mmcblk0p2--820 [002] ...1 260453.877497: jbd2_commit_flushing: dev 179,2 transaction 1516545 sync 0
snapd-356101 [003] .... 260453.885346: ext4_sync_file_enter: dev 179,2 ino 74600 parent 71970 datasync 0
能看到大量的snapd datasync
解决
首先我尝试停止 snapd
服务,但是没有效果,于是我直接使用 kill
命令结束了相关进程,再看 iotop
磁盘 io 已经恢复正常
Powered by Waline v3.4.2