Friday, November 8, 2013

beadm destroy error

Today, while trying to destroy an old boot environment, I got a strange error:
# beadm destroy oi-hipster-2013-08-06
Are you sure you want to destroy oi-hipster-2013-08-06?
This action cannot be undone (y/[n]): y
be_destroy_callback: failed to destroy data/zones/build/ROOT/zbe: dataset is busy
be_destroy: failed to destroy BE data/zones/build/ROOT/zbe
be_destroy_zone_root_callback: failed to destroy zone root data/zones/build/ROOT/zbe
be_destroy_zone_roots: failed to destroy zone roots under zonepath dataset data/zones/build: dataset is busy
be_destroy_zones: failed to find and destroy zone roots for zone build
be_destroy: failed to destroy one or more zones for BE oi-hipster-2013-08-06
I didn't want to destroy the zone root filesystem accidentally, so this was a bit scary. After looking at it a bit longer, though, I found that the zone root filesystem had several manual ZFS snapshots. Once I destroyed those snapshots, I was able to destroy the BE.
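In shell terms, the cleanup looked roughly like this. A minimal sketch: `list_snapshots` and `destroy_snapshots` are just helper names I made up, and the dataset name is taken from the error message above; adjust it to your zone root.

```shell
# Dataset name from the "dataset is busy" error above
zbe=data/zones/build/ROOT/zbe

list_snapshots() {
    # -H: no header, -r: recurse, -t snapshot, -o name: names only
    zfs list -H -r -t snapshot -o name "$zbe"
}

destroy_snapshots() {
    # Destroy each snapshot found under the zone root dataset
    list_snapshots | while read -r snap; do
        zfs destroy "$snap"
    done
}
```

Since `zfs destroy` on a snapshot is irreversible, it is worth running `list_snapshots` alone first and checking what is about to be deleted.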

Tuesday, July 30, 2013

Don't trust your mount

Today I feared for my sanity. We have four servers: one serves as an NFS server, and the other three mount two filesystems from it. A developer told me the content on the servers was out of sync. I said that was impossible, but took a look. I was wrong: it was possible. Two of the four servers were out of sync. The first two servers had the same content; the other two did not. We have approximately the following structure (all servers run Ubuntu Linux 12.04). server1:
  • /home/user is bind-mounted to /export/user
  • /home/user/www/shared is bind-mounted to /export/shared
  • /export/user is shared over NFS and mounted on /home/user on the 2nd, 3rd, and 4th servers
  • /export/shared is mounted on /home/user/www/shared on the 2nd, 3rd, and 4th servers.
mount on the 2nd, 3rd, and 4th servers showed:
...
/dev/mapper/volume on /home type ext4 (rw,noatime,nodiratime)
server1:/export/user on /home/user type nfs (rw,nfsvers=3,addr=x.x.x.x)
server1:/export/shared on /home/user/www/shared type nfs (rw,nfsvers=3,addr=x.x.x.x)
But the contents on the first and second servers were the same, while on the third and fourth I saw differences in file sizes... When I ran on the fourth server
cd /home/user/www/shared
touch 1
I saw the file only on the third and fourth servers, but not on the first and second... I thought about file caches, kernel bugs, my own insanity, and so on. Finally, on the third server I ran
cat /proc/mounts

...
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
server1:/export/user/ /home/user nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,.....
server1:/export/shared /home/user/www_old/shared nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,.....
Wait... What? Yes, the NFS filesystem was mounted not on www/shared but on www_old/shared... Evidently some developer had moved www to www_old and filled www with new content. That explained everything. The only thing I don't understand is why mount can't tell the truth. I understand that it reads mtab, but why should it lie in cases like this?
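A quick way to catch this kind of drift is to cross-check the table mount(8) prints (from /etc/mtab) against the kernel's authoritative list in /proc/mounts. A minimal sketch, assuming the usual Linux paths; `compare_mounts` is a hypothetical helper name:

```shell
# Compare two mount-table files on their (source, mountpoint) pairs.
# Prints nothing when the two tables agree.
compare_mounts() {  # $1, $2 = mount-table files, e.g. /etc/mtab /proc/mounts
    diff <(awk '{print $1, $2}' "$1" | sort) \
         <(awk '{print $1, $2}' "$2" | sort)
}

# Usage: compare_mounts /etc/mtab /proc/mounts
```

On modern distributions /etc/mtab is a symlink to /proc/self/mounts, so the two views can no longer diverge like this; on Ubuntu 12.04 it was still a regular file.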

Thursday, May 23, 2013

elfdump -a in Solaris

Solaris userland is rather specific... A lot of utilities miss convenient options found in their GNU/BSD analogs. For example, there is no "elfdump -a" here... As always, a bit of scripting solves the problem:
elfdump -c /bin/ls |grep Header |awk ' { print $4; }'  |xargs -n 1 -I '{}' elfdump -N '{}' /bin/ls 
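The same pipeline can be wrapped into a reusable function. A sketch; `elfdump_a` is a name I made up. It relies on `elfdump -c` printing the section name as the fourth whitespace-separated field on its "Section Header" lines, then dumps each named section with `elfdump -N`:

```shell
# Poor man's "elfdump -a": dump every section of an ELF object by name
elfdump_a() {  # $1 = path to an ELF object
    elfdump -c "$1" | grep Header | awk '{ print $4; }' \
        | xargs -n 1 -I '{}' elfdump -N '{}' "$1"
}

# Usage: elfdump_a /bin/ls
```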

Thursday, March 21, 2013

Deleted files in directory

I was just refreshing my knowledge of filesystems... In FreeBSD it is definitely easier to investigate filesystem behavior: you can simply hexdump a directory. On Solaris and Linux this does not work. A hexdump of a ZFS directory is not very interesting, but in a UFS dump we can see that deleted files persist in the directory:
# zfs create -V 1G zpool/zvol1
# newfs /dev/zvol/zpool/zvol1
# mkdir test
# mount  /dev/zvol/zpool/zvol1 `pwd`/test
# cd test
# mkdir 99
# cd 99
# touch a b c d e
# hd . 
00000000  00 14 01 00 0c 00 04 01  2e 00 00 00 02 00 00 00  |................|
00000010  0c 00 04 02 2e 2e 00 00  01 14 01 00 0c 00 08 01  |................|
00000020  61 00 05 9e 02 14 01 00  0c 00 08 01 62 00 05 9e  |a...........b...|
00000030  03 14 01 00 0c 00 08 01  63 00 05 9e 04 14 01 00  |........c.......|
00000040  0c 00 08 01 64 00 05 9e  05 14 01 00 b8 01 08 01  |....d...........|
00000050  65 00 05 9e 00 00 00 00  00 00 00 00 00 00 00 00  |e...............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
Now, after removing some files, we can see that the corresponding records still exist in the directory; only the record length of the preceding live entry is extended to skip over them:
# rm e d a
# hd .
00000000  00 14 01 00 0c 00 04 01  2e 00 00 00 02 00 00 00  |................|
00000010  18 00 04 02 2e 2e 00 00  01 14 01 00 0c 00 08 01  |................|
00000020  61 00 05 9e 02 14 01 00  0c 00 08 01 62 00 05 9e  |a...........b...|
00000030  03 14 01 00 d0 01 08 01  63 00 05 9e 04 14 01 00  |........c.......|
00000040  c4 01 08 01 64 00 05 9e  05 14 01 00 b8 01 08 01  |....d...........|
00000050  65 00 05 9e 00 00 00 00  00 00 00 00 00 00 00 00  |e...............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
# ls 
b       c
If we add a new file, we can see that a deleted record's slot is reused:
# touch f 
# hd .
00000000  00 14 01 00 0c 00 04 01  2e 00 00 00 02 00 00 00  |................|
00000010  0c 00 04 02 2e 2e 00 00  01 14 01 00 0c 00 08 01  |................|
00000020  66 00 e9 05 02 14 01 00  0c 00 08 01 62 00 05 9e  |f...........b...|
00000030  03 14 01 00 d0 01 08 01  63 00 05 9e 04 14 01 00  |........c.......|
00000040  c4 01 08 01 64 00 05 9e  05 14 01 00 b8 01 08 01  |....d...........|
00000050  65 00 05 9e 00 00 00 00  00 00 00 00 00 00 00 00  |e...............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
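The layout behind these dumps is the classic UFS struct direct: a 4-byte inode number, a 2-byte record length, a 1-byte type, a 1-byte name length, then the name. It can be decoded with a short script. A sketch assuming a little-endian machine; `dump_entry` is a made-up helper, and on FreeBSD UFS you can point it straight at the directory, which is readable as a regular file:

```shell
# Decode one UFS directory entry from raw directory data
dump_entry() {  # $1 = file with raw directory data, $2 = entry byte offset
    ino=$(od -A n -t u4 -j "$2" -N 4 "$1" | tr -d ' ')            # inode number
    reclen=$(od -A n -t u2 -j $(( $2 + 4 )) -N 2 "$1" | tr -d ' ') # bytes to next entry
    type=$(od -A n -t u1 -j $(( $2 + 6 )) -N 1 "$1" | tr -d ' ')   # DT_DIR=4, DT_REG=8
    namlen=$(od -A n -t u1 -j $(( $2 + 7 )) -N 1 "$1" | tr -d ' ') # name length
    name=$(dd if="$1" bs=1 skip=$(( $2 + 8 )) count="$namlen" 2>/dev/null)
    echo "ino=$ino reclen=$reclen type=$type namlen=$namlen name=$name"
}

# Usage on FreeBSD: dump_entry /path/to/directory 0
```

Walking the block by adding each entry's reclen to the current offset shows exactly why "c" spans 0x1d0 bytes after the deletions: it jumps over the dead "d" and "e" records straight to the end of the 512-byte block.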

Wednesday, January 30, 2013

Debian Wheezy experience

We are deploying new Proxmox servers and decided to try Debian 7 (testing). We had some problems with the old platform; in particular, Squeeze's GRUB is not new enough to boot from a multipath device and had to be replaced with the one from testing. We ran into the following issues with Debian 7:
  • The installer gave a strange error during network configuration; it may be related to missing bnx firmware (an awful Debian policy: firmware lives in the non-free repository)
  • The QLogic FC adapter also tried to load missing firmware; however, the system did recognize the FC-attached device
  • The multipath-tools-boot package is partly broken: it installs /usr/share/initramfs-tools/scripts/init-top/multipath, which seems to do nothing yet fails to execute during update-initramfs (so I just deleted it)
  • And the last, most annoying issue: the perl-suid package is absent from the repositories, which prevents installing Proxmox from "deb http://download.proxmox.com/debian squeeze pve"
The last issue was frustrating enough for us to revert to Squeeze, with its old but well-known bugs and peculiarities.

Friday, January 18, 2013

Oracle Dispatcher startup after shutdown

There is an "alter system shutdown immediate 'D000'" command to shut down a dispatcher in Oracle. But how do you start the dispatcher again after shutting it down? "Obviously" with "alter system set dispatchers='...'" :), where the necessary configuration value can be fetched from v$parameter. It seems Oracle developers hate their users...
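The round trip looks roughly like this. A sketch: the dispatcher name D000 comes from the command above, and the dispatchers value shown is only an illustrative default; check v$parameter on your own instance first.

```sql
-- Save the current setting before shutting the dispatcher down
SELECT value FROM v$parameter WHERE name = 'dispatchers';

-- Stop dispatcher D000
ALTER SYSTEM SHUTDOWN IMMEDIATE 'D000';

-- "Restart" it by re-applying the saved dispatchers configuration
-- (the value below is just an example of a typical default)
ALTER SYSTEM SET dispatchers = '(PROTOCOL=TCP) (SERVICE=orclXDB)';
```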