Tuesday, July 30, 2013

Don't trust your mount

Today I feared for my sanity. We have four servers: one acts as an NFS server, and the other three mount two filesystems from it. Today a developer told me that the content on the servers was out of sync. I said that wasn't possible, but took a look anyway. I was wrong. It was possible: two of the four servers were out of sync. The first two servers had the same content. And the other two...

We have approximately the following structure. All servers run Ubuntu Linux 12.04. On server1:
  • /home/user is bind-mounted to /export/user
  • /home/user/www/shared is bind-mounted to /export/shared
  • /export/user is exported over NFS and mounted on /home/user on the 2nd, 3rd and 4th servers
  • /export/shared is exported over NFS and mounted on /home/user/www/shared on the 2nd, 3rd and 4th servers
Running mount on servers 2, 3 and 4 showed:
...
/dev/mapper/volume on /home type ext4 (rw,noatime,nodiratime)
server1:/export/user on /home/user type nfs (rw,nfsvers=3,addr=x.x.x.x)
server1:/export/shared on /home/user/www/shared type nfs (rw,nfsvers=3,addr=x.x.x.x)
Yet only the first and second servers had the same content, while on the 3rd and 4th servers I saw different file sizes... When I ran this on the 4th server:
cd /home/user/www/shared
touch 1
I saw the file only on the 3rd and 4th servers, but not on the first and second... I thought about file caches, kernel bugs, my own insanity and so on. Finally, on the 3rd server I ran:
cat /proc/mounts

...
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
server1:/export/user/ /home/user nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,.....
server1:/export/shared /home/user/www_old/shared nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,.....
Wait... What? Yes, the NFS filesystem was mounted not on www/shared but on www_old/shared... Evidently, some developer had moved www to www_old and filled a new www with fresh content... That explained everything. The only thing I don't understand is why mount can't tell the truth. I understand that it reads mtab. But why should it lie in cases like this?
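
Here is a minimal sketch of how the two views could be compared automatically. It assumes a setup like the one above, where /etc/mtab is a regular file written by mount at mount time (so a later rename of a parent directory never reaches it) and /proc/mounts shows what the kernel actually has mounted. The function name stale_mtab_entries is hypothetical, made up for this illustration. It compares only mount points (the second field), because the device field can differ harmlessly, as with the trailing slash in server1:/export/user/ above.

```shell
# Print entries from an mtab-style file whose mount point is unknown to the
# kernel. Both arguments are optional and default to the usual system paths.
# (stale_mtab_entries is a hypothetical helper name, not a standard tool.)
stale_mtab_entries() {
    mtab_file=${1:-/etc/mtab}
    kernel_file=${2:-/proc/mounts}
    # awk reads the kernel's table first and remembers every mount point
    # (field 2); it then prints each mtab line whose mount point was not seen.
    awk 'FILENAME == ARGV[1] { seen[$2] = 1; next }
         !($2 in seen)       { print $1 " on " $2 }' "$kernel_file" "$mtab_file"
}
```

Called with no arguments on the 3rd server, this should have listed server1:/export/shared on /home/user/www/shared as an entry the kernel knows nothing about. On newer systems where /etc/mtab is just a symlink into /proc, both views are the same file and the check prints nothing.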