The above kernel causes previously working CIFS mounts to fail with an “interrupted system call” error. A Linux system which had been using a NAS share for years acquired the new kernel on 17th March 2020 and the mount became unstable thereafter.
Interrupted System Call
It was during regular system backups that the problem surfaced. While attempting to read files on the mount, the backup software failed with this message:
Cannot open timestamp file for backup pluto.190305_193221.full: Interrupted system call at /home/backups/scripts/Backup.pm line 106, line 2.
The file in question (“timestamp” in the example) varied with each occurrence of the error. An attempt to “cat” the file immediately afterwards would sometimes return the file contents, and sometimes hang the shell.
Kernel reactions to the error varied in severity. Sometimes there were no additional messages. Sometimes this appeared in /var/log/kern.log:
CIFS VFS: No task to wake, unknown frame received! NumMids 1
and on at least one occasion, the kernel jumped on a stool as if it had seen a mouse, dumping into kern.log nearly half a million of these:
Apr 6 09:55:57 cx61 kernel: [ 3951.546518] 00000000: 23000000 424d53ff 000008a2 c80180c0 …#.SMB……..
Apr 6 09:55:57 cx61 kernel: [ 3951.546519] 00000010: 00000000 00000000 00000000 39460001 …………..F9
Apr 6 09:55:57 cx61 kernel: [ 3951.546520] 00000020: 65 00 9d 0e 00 e….
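To put a number on a flood like this, the log can be tallied rather than scrolled through. The following is a minimal sketch, not the tool used at the time. The function name summarise and the two counter keys are made up for illustration, and the regular expression assumes the syslog layout shown in the excerpt above. Note that it counts any kernel hex dump line, not only those emitted by CIFS.

```python
import re
from collections import Counter

# Hex dump lines look like:
#   "... kernel: [ 3951.546518] 00000000: 23000000 424d53ff ..."
# Assumption: the kern.log layout matches the excerpt in this post.
HEXDUMP = re.compile(r"kernel: \[\s*\d+\.\d+\] [0-9a-f]{8}: ")
NO_TASK = "CIFS VFS: No task to wake"


def summarise(lines):
    """Count 'No task to wake' messages and hex dump lines in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if NO_TASK in line:
            counts["no_task_to_wake"] += 1
        elif HEXDUMP.search(line):
            counts["hexdump"] += 1
    return counts


# Typical use (reading the log usually needs root or membership of group adm):
#   with open("/var/log/kern.log", errors="replace") as fh:
#       print(dict(summarise(fh)))
```

Passing the open file handle straight to the function keeps memory use flat even when the log has grown by hundreds of thousands of lines.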
The NAS of interest is a rather ancient Linkstation. However, the share has been working flawlessly for many years and continues to work with 5 other Linux clients, also Debian-based. The difference seems to be that they are running kernel 4, whereas the problematic box, running Linux Mint 19.3, is on 5.3.0-45.
For the record, corresponding errors were left in the NAS’s log: “oplock” break failures referring to the same file. These are more likely a symptom than the cause. The same message was seen with earlier kernels:
smbd: Oplock break failed for file pluto.190305_193221.full/timestamp
It looks very similar to Gentoo bug 694780, although putting “vers=1.0” in fstab is not a workaround here. Options appear to be (a) go back to an earlier kernel, e.g. 4.19, or (b) wait for kernel 5 to be fixed.
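When testing a dialect option in fstab (mount.cifs spells it vers=), it helps to confirm what the kernel actually ended up with after the remount. The sketch below parses text in the format of /proc/mounts and reports the vers value for each CIFS mount. The function name cifs_mounts is made up, the share and mount point in the usage note are hypothetical, and it assumes the kernel lists vers= among the mount options for cifs entries.

```python
def cifs_mounts(mounts_text):
    """Return {mount_point: vers} for every cifs entry in /proc/mounts-style text.

    vers is None when no vers= option is listed for that mount.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cifs":
            continue
        vers = None
        for opt in fields[3].split(","):
            if opt.startswith("vers="):
                vers = opt[len("vers="):]
        result[fields[1]] = vers
    return result


# Typical use on the client:
#   with open("/proc/mounts") as fh:
#       print(cifs_mounts(fh.read()))
```

For a hypothetical entry such as "//nas/share /mnt/nas cifs rw,relatime,vers=1.0 0 0" the function maps /mnt/nas to "1.0". If the reported value differs from what fstab requests, the option was probably misspelled or the share was not remounted.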
The NAS unit in question, a rather ancient Buffalo Linkstation HS-DHGL, was updated to the latest firmware (v. 2.21-0.83, posted 9th Jan 2018). After updating, the issue is still there.
The client Linux Mint kernel was updated from 5.3.0-45 to 5.3.0-51. After updating, the issue is still there.
The client encountering the error was an MSI CX61 laptop, about 5 years old.