alp's notes: 2012

среда, 21 ноября 2012 г.

PostgreSQL quals in rewritten query plans

I've just examined PostgreSQL query debug output from the next query :

SELECT * from low_paid_emps_r;

where low_paid_emps_r is defined as

# SELECT definition from pg_views  where viewname like '%emps%';
                           definition     
-----------------------------------------------------
SELECT employees.employee_id, employees.first_name, employees.last_name, employees.email, employees.phone_number, employees.hire_date, employees.job_id, employees.salary, employees.commission_pct, employees.manager_id, employees.department_id FROM employees WHERE (employees.salary < (5000)::numeric);

In particular, I wished to find where is my where expression. It looks like the following.

   :quals 
               {OPEXPR 
               :opno 1754 
               :opfuncid 0 
               :opresulttype 16 
               :opretset false 
               :opcollid 0 
               :inputcollid 0 
               :args (
                  {VAR 
                  :varno 3 
                  :varattno 8 
                  :vartype 1700 
                  :vartypmod 524294 
                  :varcollid 0 
                  :varlevelsup 0 
                  :varnoold 3 
                  :varoattno 8 
                  :location -1
                  }
                  {FUNCEXPR 
                  :funcid 1740 
                  :funcresulttype 1700 
                  :funcretset false 
                  :funcformat 2 
                  :funccollid 0 
                  :inputcollid 0 
                  :args (
                     {CONST 
                     :consttype 23 
                     :consttypmod -1 
                     :constcollid 0 
                     :constlen 4 
                     :constbyval true 
                     :constisnull false 
                     :location -1 
                     :constvalue 4 [ -120 19 0 0 ]
                     }
                  )
                  :location -1
                  }
               )
               :location -1
               }

So, we have here <:

# SELECT oprname from pg_operator where oid=1754;
 oprname 
---------
<
(1 row)

The first argument is 8th attribute of our relation (salary). The other piece is ::numeric conversion:

# select proname from pg_proc where oid=1740;
 proname 
---------
 numeric
(1 row)

Ok, we reconstructed salary < numeric (Const) part. Let's find our constant.

# select typname from pg_type where oid=23;
 typname 
---------
 int4
(1 row)

It's int4. And constvalue as we can see from source code is a Datum. It seems form postgres.h that int4 is int32 and represented as:

#define DatumGetInt32(X) ((int32) GET_4_BYTES(X))

Now we can reconstruct out integer from ":constvalue 4 [ -120 19 0 0 ]" as 19*256+(256-120)=5000... Not evident; I possibly couldn't do this analysis not knowing original statement...

понедельник, 12 ноября 2012 г.

We've made this!

Hurray! We've catched up with China! Finally we have our own Great Russian Firewall!!!

вторник, 18 сентября 2012 г.

Temperature monitoring with OpenNMS and brocade FC switch

We are expanding! And we have to monitor temperature level in our new data center. So, we have:

existing OpenNMS monitoring system,
Brocade 300 SAN FC switch (or something very similar).

Our task is the same: to monitor temperature and send SMS when it's too high. Later part was already discussed.
The above mentioned FC switch has three temperature sensors, which data is available by SNMP (oids .1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4.1 - .1.3.6.1.4.1.1588.2.1.1.1.1.22.1.3). The only problem is that OpenNMS doesn't know about them. Let's teach it!
At first, we have to modify datacollection-config.xml and add to "groups" section:

<group  name="brocade-temperature" ifType="ignore">
         <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="1" alias="temperature1" type="integer" />
         <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="2" alias="temperature2" type="integer" />
         <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="3" alias="temperature3" type="integer" />
      </group>

I know, "temperature[1-3]" names are ugly, but I didn't care at first and now when the whole thing is working I don't want to change anything :)
And modify "Brocade FC Switches" systemDef in systems section:

     <systemDef name="Brocade FC Switches">
        <sysoidMask>.1.3.6.1.4.1.1588.</sysoidMask>
        <collect>
          <includeGroup>brocade-temperature</includeGroup>
          <includeGroup>brocade-switch-fctable</includeGroup>
        </collect>
     </systemDef>

Now we are ready and can define thresholds. To be sure that requested parameters are gathered you can look at rrdRepository. Path to the repository is specfied in rrdRepository property of datacollection-config resource (look at top of datacollection-config.xml). If you have temperature[1-3].jrb files in ${rrdRepository}/${NodeID} and these files contains some data, everything is OK. NodeId is displayed in URL, when you look at node (/opennms/node.jsp?node=353) or you can query "node" table in OpenNMS database:

SELECT nodeid from node where nodelabel='your label';

To be sure that temperature[1-3].jrb contains correct data, you can dump it (your version of jrobin-*.jar file may be different):

# cd ${rrdRepository}/${NodeID}
# echo "dump temperature1.jrb" |  java -jar ${OPENNMS_HOME}/opennms/lib/jrobin-1.5.12.jar

However, OpenNMS still doesn't know how to display this data. Let's help it. To define new graph we will add the following parts to snmp-graph.properties. In the begining of file we have to add reports names to reports definition:

reports=mib2.HCbits, mib2.bits, mib2.percentdiscards, mib2.percenterrors, \
...
brocade.switch.temperature1, brocade.switch.temperature2, brocade.switch.temperature3, \
...

and define reports later (here only one report is shown, other two are essentially the same, just change [Tt]emperature1 to [Tt]emperature[23]):


report.brocade.switch.temperature1.name=Brocade switch temperature1
report.brocade.switch.temperature1.columns=temperature1
report.brocade.switch.temperature1.type=nodeSnmp
report.brocade.switch.temperature1.command=--title="Brocade switch temperature1" \
 --vertical-label="degrees celsius" \
 DEF:temperature1={rrd1}:temperature1:AVERAGE \
 AREA:temperature1#0000ff:"Temperature1" \
 GPRINT:temperature1:AVERAGE:" Avg \\: %8.2lf %s" \
 GPRINT:temperature1:MIN:"Min \\: %8.2lf %s" \
 GPRINT:temperature1:MAX:"Max \\: %8.2lf %s\\n" 
..

Now after "service opennms restart" we'll get pretty graphs if we build graph based on "Node-level Performance Data" resources for brocade node.

P.S. And lastly I just must give you a link to a good document concerning data collection configuration in OpenNMS and another one concerning SNMP configuration.

пятница, 31 августа 2012 г.

Amazing new world with pkgng

I really like it. It's a pure magic... It's pkgng. So, my task is to deploy about 20-30 servers (jails) for practical courses in PostgreSQL. So, every student will have its own unique DBMS to crack and crash. Two years ago I said students to do:

# cd /usr/ports/databases/postgresql84-server/ && make install

So, for half a lesson we monitored building of postgresql... Cool. Last year I've prebuilt necessary packages and put them in /var/packages/. It was faster, a student just had to do:

# cd /var/packages
# pkg_add postgresql90-server*tbz

or something similar. But later we had to update database in one of tasks. Dealing with packages (and scripts to copy them to every jail was not pleasant. I even started to think about OpenVZ, debian and apt... However, i dislike the way of treating postgresql in Debian. It is overcomplicated by pg_ctlcluster, separation of config files from $PGDATA... However, this time I looked at pkgng. And so far I happy. To create package server I just installed pkg from ports:

# cd /usr/ports/ports-mgmt/pkg && make install

Added WITH_PKGNG=yes to /etc/make.conf, run pkg2ng, created required packages by

cd /usr/ports/databases/postgresql91-server/ && make package-recursive

Now I could install nginx, copy /usr/ports/packages/* to /data, run

# pkg repo /data/

share /data with nginx with the following location in server section:

location / {
            root   /data;
            autoindex on;
        }

Repository server is ready. On client I downloaded pkg-1.0.r6.tbz from freebsd package server (old one, with pkg_install packages), installed it and could use repository:

# pkg_add pkg-1.0.r6.tbz

Converted packages db to new format with pkg2ng, changed PACKAGESITE in pkg.conf to my server:

# cp /usr/local/etc/pkg.conf.sample /usr/local/etc/pkg.conf
# vi /usr/local/etc/pkg.conf

Run:

 
# pkg update
# pkg rquery -e '%n ~ pos*' '%n-%v'
postgresql-client-9.1.5
postgresql-contrib-9.1.5
postgresql-plperl-9.1.5
postgresql-server-9.1.5

But when I tried to install postgresql-server, I was asked to update pkg:

# pkg install postgresql-server-9.1.5
Updating repository catalogue
Repository catalogue is up-to-date, no need to fetch fresh copy
New version of pkg detected; it needs to be installed first.
After this upgrade it is recommendedthat you do a full upgrade using: 'pkg upgrade'

The following packages will be installed:

 Upgrading pkg: 1.0.r6 -> 1.0

The installation will free 12 MB

1 MB to be downloaded

Proceed with installing packages [y/N]: n

# pkg install pkg
Updating repository catalogue
Repository catalogue is up-to-date, no need to fetch fresh copy
New version of pkg detected; it needs to be installed first.
After this upgrade it is recommendedthat you do a full upgrade using: 'pkg upgrade'

The following packages will be installed:

 Upgrading pkg: 1.0.r6 -> 1.0

The installation will free 12 MB

1 MB to be downloaded

Proceed with installing packages [y/N]: y
pkg-1.0.txz                                                                                                                                                                                      100% 1686KB   1.7MB/s   1.7MB/s   00:00    
Checking integrity... done
Upgrading pkg from 1.0.r6 to 1.0... done

After this I could quickly install postgresql:

# pkg install postgresql-server

The only inconvenience is a bootstrap problem. The bootstrap tool is necessary requirement to make work with pkg easier...

вторник, 31 июля 2012 г.

Notes on configuring two-nodes proxmox cluster with drbd-backed storage

We had a task to deploy two new visualization servers with possibility of live migration and high availability data. The second means that in case of physical server failure you don't want faulted VMs to be powered up automagically on another node, but just that you can do it by hand in five minutes.

We decided to use proxmox VE 2, because it's free, we have experience of maintaining proxmox 1.9 systems and because it supports live migration without shared storage.

So, we configured two nodes with 4 additional LVM volume groups each: for VZ data for each node (n1vz with one lvm mounted on first node on /var/lib/vz and n2vz with one volume mounted on /var/lib/vz/ on second, n1kvm and n2kvm as VM disk storage on each node, n1kvm is used by VMs running normally on first node, n2kvm - by VMs running on second node). 4 DRBD volumes with primary-primary configuration was created for each of 4 volume groups. Using separate pair of drbd devices for VM's disks makes split brain recovery easier, as explained here. And note, we can't use drbd-mirrored (quazy-shared) disk for VZ storage, because final step of VZ migration includes "rm -rf" after rsyncing container private area.

In such configuration we can do live migration of KVM VMs and VZ. Also we have copy of each VM and VZ for emergencies (falling of one node).

Some difficulties we met were related to LVM and DRBD startup ordering. First one was the following: LVM locked drbd backing storage and drbd couldn't use them. It was solved with correct filter in lvm.conf. The other one was more difficult. Physical volumes n1vz and n2vz available over DRBD couldn't be mounted normally - they should be mounted after initial system startup. Usually firstly starts lvm (and init script makes vgchange -ay, activating volume groups), then drbd, and now we have additional VG, but they are not active.

To solve this problem we are supposed to use hearthbeat. But I am too lazy to study it. So I adopted things more familiar to me - automounter (autofs) to mount /var/lib/vz and udev to make volume groups available on drbd* device appearance. I've added "/- /etc/auto.direct" line to /etc/auto.master and created /etc/auto.direct file, containing:

/var/lib/vz              -fstype=ext4            :/dev/mapper/n1vz-data

Configuration of udev consisted from creation of /etc/udev/rules.d/80-drbd-lvm.rules file, containing:

ACTION=="add|change", SUBSYSTEM=="block",KERNEL=="drbd*", RUN+="/bin/sh -c /sbin/lvm vgscan; /sbin/lvm vgchange -a y'"

I consider this more elegant then just including "vgchange -a y && mount ..." in rc.local.

пятница, 1 июня 2012 г.

php-fpm troubles

Life is boring without troubles. We were almost ready to push new servers in production, however, php-fpm started falling randomly with the following message:

Jun  1 04:14:36 srv2 kernel: [566115.463835] php5-fpm[29696]: segfault at 0 ip 00007f3b191e5558 sp 00007fff70f193c8 error 4 in libc-2.15.so[7f3b190b3000+1b3000]
Jun  1 04:14:36 srv2 kernel: [566115.463847] php5-fpm/29696: potentially unexpected fatal signal 11.
Jun  1 04:14:36 srv2 kernel: [566115.463850] 
Jun  1 04:14:36 srv2 kernel: [566115.463851] CPU 5 
Jun  1 04:14:36 srv2 kernel: [566115.463853] Modules linked in: vesafb psmouse i7core_edac edac_core ioatdma dca serio_raw joydev mac_hid lp parport usbhid hid e1000e megaraid_sas
Jun  1 04:14:36 srv2 kernel: [566115.463868] 
Jun  1 04:14:36 srv2 kernel: [566115.463871] Pid: 29696, comm: php5-fpm Not tainted 3.2.0-24-generic #39-Ubuntu Supermicro X8DTT-H/X8DTT-H
Jun  1 04:14:36 srv2 kernel: [566115.463876] RIP: 0033:[<00007f3b191e5558>]  [<00007f3b191e5558>] 0x7f3b191e5557
Jun  1 04:14:36 srv2 kernel: [566115.463882] RSP: 002b:00007fff70f193c8  EFLAGS: 00010206
Jun  1 04:14:36 srv2 kernel: [566115.463885] RAX: 0000000000000000 RBX: 00007f3b1b53f000 RCX: 0000000000000011
Jun  1 04:14:36 srv2 kernel: [566115.463888] RDX: 0000000000000066 RSI: 0000000000af8b15 RDI: 0000000000000000
Jun  1 04:14:36 srv2 kernel: [566115.463890] RBP: 0000000002705b08 R08: 0000000000000011 R09: 0000000000000000
Jun  1 04:14:36 srv2 kernel: [566115.463893] R10: eea633fc2a689ca0 R11: 00007f3b192344d0 R12: 0000000000000001
Jun  1 04:14:36 srv2 kernel: [566115.463896] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000002705bd0
Jun  1 04:14:36 srv2 kernel: [566115.463899] FS:  00007f3b1b531700(0000) GS:ffff8803332a0000(0000) knlGS:0000000000000000
Jun  1 04:14:36 srv2 kernel: [566115.463902] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun  1 04:14:36 srv2 kernel: [566115.463904] CR2: 0000000000000000 CR3: 0000000329cb2000 CR4: 00000000000006e0
Jun  1 04:14:36 srv2 kernel: [566115.463907] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun  1 04:14:36 srv2 kernel: [566115.463910] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun  1 04:14:36 srv2 kernel: [566115.463913] Process php5-fpm (pid: 29696, threadinfo ffff88025be20000, task ffff880326120000)
Jun  1 04:14:36 srv2 kernel: [566115.463916] 
Jun  1 04:14:36 srv2 kernel: [566115.463917] Call Trace:

After examining core dump I got the following backtrace:

(gdb) bt
#0  __strstr_sse42 (s1=0x0, s2=) at ../sysdeps/x86_64/multiarch/strstr.c:175
#1  0x0000000000736d13 in fpm_status_handle_request () at /home/alp/build/php5-5.3.10/sapi/fpm/fpm/fpm_status.c:128
#2  0x000000000042b4ab in main (argc=11237155, argv=0x0) at /home/alp/build/php5-5.3.10/sapi/fpm/fpm/fpm_main.c:1809

Firstly I started to blame libc. However, nothing else crashed. I rebuilt php from sources. The result was the same. So after getting magic kick from my chief I sighed deeply and looked at php5-5.3.10/sapi/fpm/fpm/fpm_status.c:

               /* full status ? */
                full = SG(request_info).request_uri && strstr(SG(request_info).query_string, "full");
                short_syntax = short_post = NULL;
                full_separator = full_pre = full_syntax = full_post = NULL;
                encode = 0;

It seems to be just a copy-paste or something like that. It was a piece of code from path processing php-fpm status. Firstly, I've disabled php-fpm monitoring and didn't have new segfaults since then. So, I patched the file with the following:

 
--- php5-5.3.10/sapi/fpm/fpm/fpm_status.c       2012-06-01 04:00:43.492744472 -0400
+++ php5-5.3.10/sapi/fpm/fpm/fpm_status.c       2012-06-01 04:03:59.233040497 -0400
@@ -125,7 +125,7 @@
                }
 
                /* full status ? */
-               full = SG(request_info).request_uri && strstr(SG(request_info).query_string, "full");
+               full = SG(request_info).query_string && strstr(SG(request_info).query_string, "full");
                short_syntax = short_post = NULL;
                full_separator = full_pre = full_syntax = full_post = NULL;
                encode = 0;

and recompiled php... It seems I caught one more bug. Interesting thing to note is that no one else hited it. I've just reported bug 62205 and PHP team reacted quickly.

P.S. The same error appeared in two more places in fpm_status.c. The above mentioned bug report is resolved now, fix was committed to php head.

суббота, 28 апреля 2012 г.

Qt fonts in Ubuntu 12.04

After updating from 10.04 to 12.04 and adapting a bit to Unity (Adaptation included creation of setxkbmap startup script) I found annoying problem: fonts in Skype looked ugly...

I've tried using qtconfig to set fonts, but it was useless. Settings were not used. However they worked if application was run with sudo.

I found that reason was the following: applications run with sudo couldn't communicate with DBUS and so used font settings in ~/.config/Trolltech.conf. Whe run normally they communicated with gconfd-2 and tryed to use default system font (Ubuntu11). But Qt doesn't know anything about this font (don't know why). So I just changed default system fonts to DejaVu font family and now Qt applications (including Skype) looks better.

пятница, 13 апреля 2012 г.

OmniTI announced OmniOS

I've just read about OmniOS. Cool, it seems in the nearest future we'll have as much Illumos-based distros as Linux-based ones :)
I really hope that OmniTI guys will collaborate with illumos and illumos-userland teams in productive way. By the way, I really like OmniTI group - they develop and support my favorite DBMS and one of my favorite server operating systems :) The only thing which is frightening me is possible fragmentation. IMHO, without big developers community it is vital to be as much united as possible and try to be friendly to beginner users and developers.

четверг, 5 апреля 2012 г.

How large root do you need?

If you have separate /var, /usr and /home partitions, what should be the size of / in Ubuntu 10.04 Desktop ?
I've always thought that 3 GB is more than enough. I was wrong, today my laptop said me that it has only 138 MB free in / partition.
I was rather shocked. The first victim for me seemed /tmp. Yes, it is not cleared on boot in Ubuntu, and this is unusual (I usually enable /tmp cleanup on my FreeBSD desktop systems or use mfs for it and previously on my OpenSolaris desktop it was by default in RAM/swap and didn't bother me). But that didn't help. Maybe I just didn't wait enough for fs counters update.
But that forced me to continue investigation: what can use 3 GB in root fs ?
The short answer is /boot. I had about 20 versions of kernel installed: from manually compiled 2.6.31 kernel (I suddenly realized that I must have recompiled it to get SunSPOT support several years ago and it was still booted by default) to 2.6.32.41 one. Kernel plus initrd image occupies just some more than 20 MB. But if you have about 20 versions of them, it becomes significant for root fs...
Of course, I don't want my old kernel version to disappear after system update. But I would appreciate if package manager said something like this: "Hey, guy, you have 10 versions of kernel installed, are you sure that you need them?" :)

P.S. Just found out - /tmp is cleared up by default if it is on separate FS (look at /etc/init/mounted-tmp.conf). I understand that having /tmp in root is a bad habit, but nonetheless I don't think that it is so uncommon setup to ignore this possibility in startup scripts.

вторник, 3 апреля 2012 г.

PostgreSQL mostly online REINDEX

What do you do, when you want to do REINDEX online? Build new index concurrently, drop old one and rename new. Only you can't do it for indexes supporting primary keys and other constraints without dropping constraint. So, in any case, you have to REINDEX them. And let everyone to be waiting...
And if you have several hundred indexes, it is not very convenient to do it manually. Also note that CREATE INDEX CONCURRENTLY can't be EXECUTE'd from plpgsql function. I wish we had concurrent reindex in PostgreSQL or at least autonomous transactions to deal with the latter problem... But, we must work with available functionality...
So, the decision for me was to write PL/pgSQL script which will generate necessary SQL commands and feed them to psql.
The script itself is the following:


DO 
$$ 
DECLARE 
ind record;
str text;
str_drop text;
str_rename text;
BEGIN 
 FOR ind IN  (SELECT i.oid,i.relname FROM 
     pg_class r,pg_class i , pg_index ir, pg_namespace ns 
     WHERE ir.indexrelid=i.oid AND ir.indrelid=r.oid AND ns.oid=r.relnamespace
             AND ns.nspname='public' AND NOT ir.indisprimary \
             AND i.oid NOT IN 
                 (SELECT conindid FROM pg_constraint)) LOOP
  str:=replace(
      pg_get_indexdef(ind.oid),
      'INDEX '||ind.relname|| ' ON ', 
      'INDEX CONCURRENTLY '||ind.relname|| '_X999 ON ');
  str_drop:='DROP INDEX ' || ind.relname;
  str_rename:='ALTER INDEX ' || ind.relname||'_X999 RENAME TO ' ||ind.relname ;
  RAISE NOTICE '%', str;
  RAISE NOTICE '%', str_drop;
  RAISE NOTICE '%', str_rename;
 END LOOP;
 FOR ind IN  (SELECT i.oid,i.relname FROM 
     pg_class r,pg_class i, pg_index ir, pg_namespace ns 
     WHERE ir.indexrelid=i.oid AND ir.indrelid=r.oid AND ns.oid=r.relnamespace 
             AND ns.nspname='public' AND (ir.indisprimary 
                               OR i.oid IN (SELECT conindid FROM pg_constraint)) LOOP
   str:='REINDEX INDEX ' || ind.relname;
   raise notice '%', str;
 end loop;
END;
$$ LANGUAGE PLPGSQL;

Now with something like this:


$ psql -d DBNAME  -f generate_reindex.plpgsql" 2>&1| awk -F 'NOTICE:' '{ if (NF==2) {print $2 ";" } ; }' | psql -d DBNAME

we can mostly concurrently rebuild all indexes in database DBNAME. And last thing to remember is to add

\set ON_ERROR_STOP

to your ~/.psqlrc file just to be safe...

четверг, 22 марта 2012 г.

Redmine/Webrick rc script

I've just installed redmine 1.3.2 on FreeBSD server and met several inconviniences.

It is not in ports

It depends on old versions of different ruby gems, which are already not in ports.
In particular, it didn't want to work with rack 1.4.1 and I had to do
```
# gem install rake -v=1.1.0
```

It doesn't have rc script to startup Webrick web server (I didn't want to install apache on this host)

So, I've taken mongrel_cluster rc script and made redmine rc script from it. It is here:


#!/bin/sh
# PROVIDE: redmine
# REQUIRE: DAEMON
# KEYWORD: shutdown
#
# This script is modified by placing the following variables inside
# /etc/rc.conf:
#
# redmine_enable (bool):
#       Set it to YES to enable this service.
#       Default: NO
# redmine_dir (path):
#       The directory containing redmine
#       Default: /usr/local/redmine/
# redmine_user (username):
#       The user to run redmine as
#       Default: redmine
# redmine_args (string):
#       Additional command flags for ruby
#       Default: "-e production -d"

. /etc/rc.subr

name=redmine
rcvar=redmine_enable

command="/usr/local/bin/ruby"

load_rc_config $name

: ${redmine_enable="NO"}
: ${redmine_dir="/usr/local/redmine/"}
: ${redmine_args="-e production -d"}
: ${redmine_user="redmine"}

start_cmd="redmine_cmd start"
stop_cmd="redmine_cmd stop"
restart_cmd="redmine_cmd restart"
status_cmd="redmine_cmd status"

redmine_cmd()
{

    if [ ! -d "${redmine_dir}/." ]; then
        warn "${redmine_dir} is not a directory."
        return 1
    fi
 
    case $1 in
        "start")
           su -l ${redmine_user} -c "cd ${redmine_dir} && pwd &&  eval  $command  script/server webrick $redmine_args"
        ;;
        "stop")
           if [ -f ${redmine_dir}/tmp/pids/server.pid ] ; then
             PID=$(/usr/bin/head -1 ${redmine_dir}/tmp/pids/server.pid)
             /bin/kill -s int $PID  2>/dev/null
             sleep 3 
             /bin/kill -s  0 $PID 2>/dev/null && /bin/kill -s kill $PID;
           fi
        ;;
        "restart")
           redmine_cmd stop
           redmine_cmd start
        ;;
        "status")
          if [ -f ${redmine_dir}/tmp/pids/server.pid ] ; then
                echo "PID file exists"
                PID=$(/usr/bin/head -1 ${redmine_dir}/tmp/pids/server.pid)
                /bin/kill -s  0 $PID 2>/dev/null && echo "Server is running with pid $PID" && exit 0;
                echo "But server is not running..." && exit 1;
           fi
           echo "Server is not running" && exit 0;
        ;;
    esac
}

run_rc_command "$1"

The interesting thing to note is that to terminate webrick correctly, you have to send it SIGINT, not SIGTERM.

четверг, 15 марта 2012 г.

Remember TRUNCATE peculiarities

Just remember the following about TRUNCATE:
1) It is DDL operation and so is not transactional in Oracle (issuing TRUNCATE will make an implicit commit)
2) It is not MVCC-safe in PostgreSQL.

What does it mean?


alp=> create table t199 as select * from pg_opclass ;
SELECT 117
alp=> BEGIN ;
BEGIN
alp=> TRUNCATE t199;
TRUNCATE TABLE
alp=> ROLLBACK ;
ROLLBACK                         ^
alp=> SELECT count(*) from t199;
 count 
-------
   117
(1 row)

So far, everything looks good: TRUNCATE was rolled back. Transactional DDL is a powerful PostgreSQL feature. But let's start parallel transaction. In first session I'll issue


alp=> \set PROMPT1 'T1> '

and in second:


alp=> \set PROMPT1 'T1> '

So, let's fun begin.

 
T1> BEGIN;
BEGIN
T1> SET transaction_isolation TO serializable;
SET
T1> UPDATE employees SET email=email;
UPDATE 103

T2> BEGIN ;
BEGIN
T2> TRUNCATE t199;
TRUNCATE TABLE
T2> COMMIT ;
COMMIT

T1> SELECT count(*) from t199;
 count 
-------
     0
(1 row)

Oops... What has happened? Let's repeat...


T2> INSERT INTO t199 SELECT * from pg_opclass;
INSERT 0 117

T1> BEGIN;
BEGIN
T1> SET transaction_isolation TO serializable;
SET
T1> UPDATE employees SET email=email;
UPDATE 103

T2> BEGIN ;
BEGIN
T2> DELETE FROM t199;
DELETE 117
T2> COMMIT ;
COMMIT

T1> SELECT count(*) from t199;
 count 
-------
   117
(1 row)

Do you see the difference? T2's DELETE didn't modified T1 snapshot... But TRUNCATE did. And what about "UPDATE employees"? It is just necessary to get real XID and snapshot for transaction.
So, the main idea: be attentive with "TRUNCATE". It is not just "fast DELETE", it has some not-so-evident consequences.

понедельник, 27 февраля 2012 г.

Minor PostgreSQL updates and different unpleasant surprises

Today I've updated two our DBMS servers to PostgreSQL 9.0.7. It was a minor update, everything went smoothly: I just updated ports and let pormaster to do its job...
But I got an interesting consequence - something was fixed and our server began to write error log in Russian. Oh, shit! It wasn't pleasant. Have you ever tried to find something about particular error using its localized description?
I've checked my environment: LC_* variables were not set, pgsql's user class was not "russian", nothing has changed. However, I got a nice log with localized reports... One more trouble - they were messed up in some way (don't know, whose fault was this - OS, PostgreSQL) and hardly readable. I really don't understand, why OS or any service should ever print any errors or logs in native language, this practice leads to difficulties in their interpretation.
In any way, it turned out that in postgresql.conf lc_messages were set to "ru_RU.UTF-8", maybe since initdb run, but before update this GUC didn't have such influence. Something was fixed, and behavior changed. So, keep in mind, even minor updates may lead to different surprises...

воскресенье, 26 февраля 2012 г.

Libreoffice is a slow piece of shit

Subj!!! I hate this piece of software. It is fucken slow, when
a) it works with PostgreSQL-based datasource (e.g. as a bibliography source),
b) it works awfully slow with document, in which we store editing history...
The bad news are that there is little alternatives. Maybe the next time I wish to write some big and complex document, I'll try to do it in Google Docs...

вторник, 14 февраля 2012 г.

French symbols without azerty

So, sometimes I have to type some text in French. And I dislike AZERTY layout, it's too unfamiliar. So, what should I do? Right, to use altgr-intl US layout. To do this you just need to set XKB Variant to "altgr-intl" (something like this in hal policy file - /usr/local/etc/hal/fdi/policy/x11-input.fdi on my FreeBSD system):


<?xml version="1.0" encoding="ISO-8859-1"?>
<deviceinfo version="0.2">
<device>
 <match key="info.capabilities" contains="input.keyboard">
      <merge key="input.x11_options.XkbModel" type="string"> pc105</merge>
      <merge key="input.x11_options.XkbLayout" type="string">us,ru</merge>
      <merge key="input.x11_options.XkbOptions" type="string">grp:rwin_toggle,grp_led:scroll</merge>
      <merge key="input.x11_options.XkbVariant" type="string">altgr-intl,winkeys</merge>
    </match>
</device>
</deviceinfo>

Now, to type something with ^ , like û, you just type RightAlt+6 (^ symbol is there) and press u. To type è - RightAlt+` and e... Big list of shortcuts is here. And if you don't touch right alt this layout behaves just like basic us layout...

понедельник, 6 февраля 2012 г.

Illumian is here...

Today I've tried the first Illumian version. This is illumos-based Nexenta OS. Just a few remarks.
What I liked:
1) VTs finally work in illumos installer, it's good to have a possibility of going to text console and checking installer's log.
2) It seems, the installer can install system on mirror pool. C'est beau.
3) APT... I finally shouldn't learn one more package system and apt is IMHO faster then IPS.
4) It's still Solaris: you have DTrace, ZFS, projects, fair share scheduler...
5) git change number in place of OS version looks cool :)


# uname -a 
SunOS oi-test 5.11 4cece89cac3e i86pc i386 i86pc Solaris

What I disliked:
1) Zones don't work:


root@oi-test:~# zoneadm -z  myzone install
sh[1]: exec: /usr/lib/brand/ipkg/pkgcreatezone: not found

And Solaris without zones is a very strange thing...
2) /usr/gnu/bin in front of path... How can you use GNU chmod with ZFS ACL?
3) Soft, soft, soft... Where are you?


root@oi-test:~# apt-cache pkgnames |grep postgres

4) apt-clone doesn't work

This is just a first glance on this OS. It is very raw and is still in development. I doubt that calling this release 1.0 was a good idea. It seems, we will have a production-ready OS not sooner then in half of year, but still... I wonder, what Illumos-based OS will be first to reach production quality: OI or Illumian? It seems OI is in better form nowadays, but the fact is that OI team has failed to create a stable release in about 1.5 years.

понедельник, 30 января 2012 г.

php-fpm status polling

Just a brief note...
If you wish to monitor php-fpm pool, it provides status page determined by status_path configuration parameter in pool config file.


pm.status_path = /statusfpm

Status page usually looks like this:


X-Powered-By: PHP/5.3.3-1ubuntu9.7
text/plain
Content-type: text/html

accepted conn:   253745
pool:             mhouse
process manager:  static
idle processes:   50
active processes: 0
total processes:  50

The problem is that it is not usually accessible by HTTP, and you should usually use web-server to communcate with php-fpm by FastCGI protocol. Now let's imagine that you have 10 php-fpm nodes and one nginx server. It seems justified to get status from each php-fpm node, you should talk directly to php-fpm. And you can use php FCGIClient (big thanks to its author, Pierrick Charron) for this task.
Using this library is straightforward. The only difficulty is to send necessary headers to php-fpm. These headers we can find out with tcpdump, monitoring nginx and php-fpm communications. So, the following script can be used to retrieve distinct php-fpm statistics:


<?php
require('fastcgi.php');

$client = new FCGIClient('192.168.0.1', '9000');
$answer=$client->request(
        array(       
                'GATEWAY_INTERFACE' => 'FastCGI/1.0',
                'REQUEST_METHOD' => 'GET',
                'SCRIPT_FILENAME' => '/usr/local/nginx/html/statusfpm', //Nginx document root
                'SCRIPT_NAME' => '/statusfpm', //our pm.status_path
        ),''
);
?>

Recieved file can be analyzed further and result may be fed to your monitoring system:


       $lines=split("\n",$answer);
       //Line 0 (4 - with headers) - Accepted connections           
       $cnum=strpos($lines[4],':');
       $accepted=trim(substr($lines[4],$cnum+1));

       //Line 3(7) - idle processes
       $cnum=strpos($lines[7],':',1);
       $idle=trim(substr($lines[7],$cnum+1));

       //Line 4(8) - active processes
       $cnum=strpos($lines[8],":");
       $active=trim(substr($lines[8],$cnum+1));

       //Line 5(9) - total processes
       $cnum=strpos($lines[9],":");
       $total=trim(substr($lines[9],$cnum+1));

The latest part depends on monitoring system you using. We have some service php functions to interact with zabbix (in particular, slightly modified zabbix_sender function).

среда, 21 ноября 2012 г.

понедельник, 12 ноября 2012 г.

вторник, 18 сентября 2012 г.

пятница, 31 августа 2012 г.

вторник, 31 июля 2012 г.

пятница, 1 июня 2012 г.

суббота, 28 апреля 2012 г.

пятница, 13 апреля 2012 г.

четверг, 5 апреля 2012 г.

вторник, 3 апреля 2012 г.

четверг, 22 марта 2012 г.

четверг, 15 марта 2012 г.

понедельник, 27 февраля 2012 г.

воскресенье, 26 февраля 2012 г.

вторник, 14 февраля 2012 г.

понедельник, 6 февраля 2012 г.

понедельник, 30 января 2012 г.