вторник, 21 мая 2019 г.

One big Oracle ASM adventure

Well, this has happened long ago, but as I'm going to decomission http://dbseminar.r61.net, I think this information should be moved here...

Originally posted on on Tue, 04/07/2009

I had very unpleasant experience with RAC and ASM. I'd like to share it. Yesterday after VMware ESX Server upgrade (and maybe unsuccessfull live motion operations on one of two RAC nodes) I had the following records in asm instance alert log:
Mon Apr  6 15:48:52 2009
NOTE: cache registered group DATA number=1 incarn=0x3ad84292
NOTE: cache registered group FRA number=2 incarn=0x3b084293
NOTE: cache registered group LOGS number=3 incarn=0x3b084294
Mon Apr  6 15:48:52 2009
ERROR: no PST quorum in group 1: required 2, found 0
Mon Apr  6 15:48:52 2009
NOTE: cache dismounting group 1/0x3AD84292 (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted
Mon Apr  6 15:48:52 2009
ERROR: no PST quorum in group 2: required 2, found 0
Mon Apr  6 15:48:52 2009
NOTE: cache dismounting group 2/0x3B084293 (FRA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup FRA was not mounted
Mon Apr  6 15:48:52 2009
ERROR: no PST quorum in group 3: required 2, found 0
Mon Apr  6 15:48:52 2009
NOTE: cache dismounting group 3/0x3B084294 (LOGS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup LOGS was not mounted

Diskgroups disappeared, and asm didn't want to see them. However, I could see /dev/sd* disks, where diskgroups were placed, and permissions were right: disks were owned by oracle user. select * from v$asm_diskgroup ; gave nothing and select path,header_status from v$asm_disk ; said that all disks were in provisioned state. It was awful, but in the end I've found sollution. I've noticed that kfed sees diskgroup and other attribute on the affected disks, but parameter kfdhdb.acdb.ub2spare is not 0 (as I found in Internet, it was its usual state). So I've dumped header info:
$ kfed read /dev/sdf > /tmp/sdf.noop.mod
$ vi /tmp/sdf.noop.mod  #Changed kfdhdb.acdb.ub2spare to 0
$ kfed op=write dev=/dev/sdf text=/tmp/sdf.noop.mod CHKSUM=YES
After this I could do "alter diskgroup fra mount" in asm instance, repeated this procedure for DATA asm diskgroup, after that I could mount diskgroups and recover my database. In conclusion I wish say that we are going to move our fra to OCFS2 fs the next weekend.

суббота, 20 октября 2018 г.

What is my sftp server doing?

Well, I'm not familiar with DTrace, but sometimes want to find, what some application is doing. In this case I wanted to monitor my sftp server. Luckily, most illumos distributions provide dtrace patch (coming from Oracle Solaris) to find this out. Unluckily, I haven't found any documentation on it, just source code. After reading Translators chapter of DTrace Guide and looking at /usr/lib/dtrace/sftp.d I've come to this:
dtrace -n 'sftp*:::transfer-done { printf ("%d: %s %s %s %d", pid, xlate <sftpinfo_t *>((sftpproto_t*)arg0)->sfi_pathname, xlate <sftpinfo_t *>((sftpproto_t*)arg0)->sfi_user, xlate <sftpinfo_t *>((sftpproto_t*)arg0)->sfi_operation, xlate <sftpinfo_t *>((sftpproto_t*)arg0)->sfi_nbytes  ); }'

dtrace: description 'sftp*:::transfer-done ' matched 8 probes
CPU     ID                    FUNCTION:NAME
  1  80412      process_read:transfer-done 7409: /export/home/user/1.pp user read 1808
  1  80412      process_read:transfer-done 7409: /export/home/user/1.pp user read 0
  1  80411     process_write:transfer-done 7409: /export/home/user/1.pp user write 1808
  1  80412      process_read:transfer-done 7409: /export/home/user/dtrace/poll.d user read 53
  1  80412      process_read:transfer-done 7409: /export/home/user/dtrace/poll.d user read 53

Seems rather interesting to me.

четверг, 30 августа 2018 г.

Quest: creating one hundred zones

Well, I need to create about one hundred zones once again. You could probably use ansible for this, but an old-fashioned man will do everything in shell. So: we have one "golden image" and have to create 100 zones like it. We could clone it, but with clones you receive wonderful issue - beadm activate fails in zone. So we create zones and do send/receive manually. This looks like this:
set -e

for i in $(seq 1 100); do 

    #Creating interface for the zone
    dladm create-vnic -l e1000g1 hnet$i

    #Creating initial config   

    cat > $TEMPFILE <<EOF
create -b
set zonepath=/zones/h$i
set autoboot=true
set ip-type=exclusive
add net
set physical=hnet$i
add capped-memory
set physical=2G
add rctl
set name=zone.max-swap
add value (priv=privileged,limit=2147483648,action=deny)
add rctl
set name=zone.max-locked-memory
add value (priv=privileged,limit=536870912,action=deny)

    zonecfg -z h$i -f $TEMPFILE
    zfs send -R data/zones/h0@initial | zfs recv -F data/zones/h$i
    # Zone tools should know that zone is in installed state, not configured
    # Also during installation zoneadm assigns uuid to zone (last field). We do this manually.
    gsed -i  -e "/^h${i}:/ s/\$/${uuid}/" -e "/^h${i}:/ s/configured/installed/" /etc/zones/index
    zoneadm -z h$i mount

    # We known that golden image ip address  ends in 254 and change it
    sed -i -e "s:hnet0:hnet$i:g" -e "s:\.254:.$addr:g" /zones/h$i/root/etc/ipadm/ipadm.conf
    zoneadm -z h$i unmount
    zfs destroy data/zones/h$i@initial
    rm $TEMPFILE
    zoneadm -z h$i boot

суббота, 10 февраля 2018 г.

пятница, 13 октября 2017 г.

Does ip belong to network?

It's so easy to check if IP belong to network... Until you start doing this in shell. I've tried and finally got this. This version works in bash, dash and ksh... Good enough for me, but perhaps it could be optimized a bit to avoid cut usage. Our function gets two parameters - ip address and network in address/netmask format. In fact we compare IPaddress & netmask and IPnetwork & netmask.

belongs_network ()

   netaddr=`echo $network | cut -d / -f 1`
   netcdr=`echo $network | cut -d / -f 2`

   a1=$(echo "$addr" | cut -d . -f 1)
   a2=$(echo "$addr" | cut -d . -f 2)
   a3=$(echo "$addr" | cut -d . -f 3)
   a4=$(echo "$addr" | cut -d . -f 4)

   n1=$(echo "$netaddr" | cut -d . -f 1)
   n2=$(echo "$netaddr" | cut -d . -f 2)
   n3=$(echo "$netaddr" | cut -d . -f 3)
   n4=$(echo "$netaddr" | cut -d . -f 4)


   if [ $ares -eq $nres ] ; then
     return 0
     return 1

if belongs_network; then
  echo "belongs"
  echo "does not belong"

среда, 29 марта 2017 г.

Thank you, Oracle engineers

After 2010, when Oracle acquired Sun, most of us, who followed OpenSolaris, were depressed. In one year one of the most advantageous operating systems was closed under steel curtain. Luckily, due to enormous efforts of community, of companies, dependent on OpenSolaris, the system survived. Currently we have several more or less successful illumos distributions, targeting different users. But nowadays there's a (of course, deserved) common negative feeling towards Oracle in illumos community. But let's speak from another point of view. Let's look at things, which illumos community (and in particular, OpenIndiana) got directly or indirectly from Oracle in recent years.
  • Our userland build system, which constantly evolves, however, in different directions, under Oracle control and in our distribution. But still a lot of components can be easily migrated between build systems.
  • A lot of software build receipts and patches, as result, were borrowed with small modifications, from Oracle userland-gate. The process is still going on.
  • We still borrow patches from Solaris pkg-gate. Also differences in underlying kernels are currently rather significant, a lot of changesets from pkg-gate can be ported to OpenIndiana pkg5 repository.
  • Of course, I can not avoid thanking Alan for his constant help in supporting Xorg subsystem and GUI parts of our distribution. He was always helpful to me and Aurélien.
  • Evidently, recent KMS work, integrated into OpenIndiana, wouldn't be possible without Oracle's open drm port, which was ported from Solaris to illumos by Martin Bochnig, and later independently ported and enhanced by Gordon Ross.
- And of course, I cannot count patches, which were suggested to upstream projects by Oracle engineers. Just today when I tried to solve two issues related with IPS and apache 2.4 interaction, I've found two patches by Petr Sumbera, fixing Apache issues on Solaris. So, I want to use the chance and thank all Oracle Solaris engineers for their work on open source projects. I doubt that without them illumos could survive in large scale. Perhaps, we could be an excellent playground for ZFS development, but not an universal operating system...