Friday, April 23, 2010

slapd and me

If you run a development shop with hundreds of nasty test boxes, your OpenLDAP
authentication servers can get swamped and die.  Yes, die.  If OpenLDAP is not 
shut down gracefully, your OpenLDAP database can and will get corrupted.

Here's something quick and dirty I do to bring systems back to life:

* Shut down Samba (because my iteration of Samba used LDAP as a backend auth db, and not silly Samba files)
* Zap the existing LDAP backup dir (it would be kind of old) & move the current LDAP db to a new backup dir
* Add a backup ldif I had sitting on another system (you do have two of everything right?)
* Index the db so as to make sure the backup is consistent
* Start up LDAP & Samba services.

/etc/init.d/samba stop ; svc-stop /service/slapd ; \
rm -rf /var/lib/ldap.back ; mv /var/lib/ldap /var/lib/ldap.back ; mkdir /var/lib/ldap ; \
slapadd -f /etc/ldap/slapd.conf -c -l /tmp/2010042301207.ldif ; \
slapindex -v -f /etc/ldap/slapd.conf ; \
svc-start /service/slapd ; /etc/init.d/samba start
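
Since that one-liner zaps the old backup before it loads anything, it can be worth a quick look at the ldif first. Here's a minimal sketch, assuming a dump is "sane enough" if it is readable and has at least one dn: line. ldif_looks_sane is a made-up helper name, not an OpenLDAP tool.

```shell
#!/bin/sh
# ldif_looks_sane FILE (hypothetical helper): succeed only if FILE is
# readable and contains at least one entry, i.e. a line starting with "dn:".
# This is a cheap smoke test, not a real LDIF validator.
ldif_looks_sane() {
    [ -r "$1" ] && [ "$(grep -c '^dn:' "$1")" -gt 0 ]
}

# Example use before running the recovery one-liner:
#   ldif_looks_sane /tmp/2010042301207.ldif || { echo "bad ldif, not touching the db" >&2; exit 1; }
```

An empty or missing file fails the check, which is exactly the case where you don't want to have moved the live db out of the way already.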

But wait, there's more!

So, how do you know that slapd is running?  Well, you can do this:

# lsof -i |grep slapd

slapd     13139        root    6u  IPv6  28760       TCP *:ldap (LISTEN)
slapd     13139        root    7u  IPv4  28761       TCP *:ldap (LISTEN)
slapd     13139        root   10u  IPv4  29580       TCP slapserver:ldap->ldapclient01:40117 (ESTABLISHED)
slapd     13139        root   12u  IPv4  29637       TCP slapserver:ldap->ldapclient02:41377 (ESTABLISHED)
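
The lines that matter in that output are the slapd ones ending in (LISTEN). If you'd rather script the check than eyeball it, here's a rough sketch that reads lsof -i output on stdin. slapd_is_listening is a name I made up, and it assumes lsof prints the port by name (ldap) as it does above.

```shell
#!/bin/sh
# slapd_is_listening (hypothetical helper): read "lsof -i" output on stdin
# and succeed if any slapd line shows a listening ldap socket.
# Assumes the port is printed by name ("ldap"), as in the output above.
slapd_is_listening() {
    awk '$1 == "slapd" && /:ldap \(LISTEN\)/ { found = 1 }
         END { exit !found }'
}

# Example use:
#   lsof -i | slapd_is_listening || echo "slapd is not listening"
```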

If *:ldap (LISTEN) is missing, chances are the old ldap daemon never stopped properly before
/service/slapd was started up again.  That's cool.

Do this:

# ps aux |grep slapd

You should see:

root     13058  0.0  0.0  1440  292 ?        S    12:12   0:00 supervise slapd
root     13129  0.0  0.0  1580  352 ?        S    12:12   0:00 multilog t /var/log/slapd
root     13139  0.2  0.0 23104 3164 ?        S    12:12   0:00 /usr/sbin/slapd -d 68
root     13171  0.0  0.0  1912  596 pts/0    S+   12:13   0:00 grep slapd
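
Same idea for the process list: supervise, multilog and slapd itself should all be there. Here's a small sketch that reads ps aux output on stdin and prints whichever of the three is missing. The helper name is made up, and the patterns are just the command lines from my box, so adjust them to yours.

```shell
#!/bin/sh
# missing_slapd_procs (hypothetical helper): read "ps aux" output on stdin
# and print one line for each expected process that is NOT present.
# The three patterns are taken from the listing above.
missing_slapd_procs() {
    awk '
        /supervise slapd/              { sup = 1 }
        /multilog t \/var\/log\/slapd/ { log_ok = 1 }
        /\/usr\/sbin\/slapd/           { srv = 1 }
        END {
            if (!sup)    print "supervise slapd"
            if (!log_ok) print "multilog t /var/log/slapd"
            if (!srv)    print "/usr/sbin/slapd"
        }'
}

# Example use:
#   ps aux | missing_slapd_procs
```

No output means all three were found. The "grep slapd" line you get from the manual check doesn't match any of the three patterns, so it can't cause a false positive.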

Sometimes the daemontools-provided "utility" respawns horribly, or just doesn't shut off properly.
The best way to figure out if something's gone awry is to check for zombies and 
then see if those zombies are related to any service errors.

# ps -ef|grep defun
# ps ax | grep readproctitle | grep 'service errors:'

If you see any output, kill the offending parent svscan - it'll be the PPID, the second of the two PID columns in the ps -ef output.
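
If there are a lot of zombies, picking parent PIDs out by eye gets old fast. Here's a quick sketch that reads ps -ef output on stdin and prints each zombie's parent PID once. It assumes the usual ps -ef layout (UID, PID, PPID, ...) and that zombies are tagged <defunct>; zombie_parents is a made-up name.

```shell
#!/bin/sh
# zombie_parents (hypothetical helper): read "ps -ef" output on stdin and
# print the unique parent PIDs of defunct processes, one per line.
# Assumes column 3 is the PPID and zombies are marked "<defunct>".
zombie_parents() {
    awk '/<defunct>/ { print $3 }' | sort -un
}

# Example use:
#   ps -ef | zombie_parents
```

Check what each PID actually is (ps -p PID) before killing anything.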

Or!  Here's a nice script to help you out if svscan continues to respawn faster
than your keyboard strokes.

...

#!/bin/sh
# killslapd
#### DEFINE APP AND DIRECTORIES HERE
APP=slapd
LOCALSERVICEDIR=/etc/
SERVICEDIR=/service

#### DOWN THE DJB SERVICE
cd $SERVICEDIR/$APP
rm -f $SERVICEDIR/$APP
svc -dx . log

#### IN CASE THE DJB DOWN DIDN'T WORK, MANUALLY KILL IF NECESSARY
while test "$input" != "c"; do
        echo
        echo
        ps ax | grep $APP
        echo
        echo In the preceding processes, if you see either supervise $APP
        echo or /usr/sbin/$APP
        echo or any other process running $APP
        echo 'you must kill it before continuing (open another terminal)'
        echo
        echo -n 'Press c then Enter to continue (after any necessary killing)==>'
        read input
done
echo '   Continuing...'


#### REMOVE THE supervise DIRECTORIES
rm -rf $LOCALSERVICEDIR/$APP/supervise
rm -rf $LOCALSERVICEDIR/$APP/log/supervise

#### SET THE run FILES TO 755 FOR PROPER REINSTALLATION
chmod 755 $LOCALSERVICEDIR/$APP/run
chmod 755 $LOCALSERVICEDIR/$APP/log/run

#### REINSTALL
ln -s $LOCALSERVICEDIR/$APP $SERVICEDIR/$APP
sleep 5

#### PRINT THE RESULTS
mycommand="svstat $SERVICEDIR/$APP"
echo
echo $mycommand
$mycommand 
echo
echo If the preceding svc and svstat commands give no error messages, 
echo your supervise directory is probably OK.

...
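
The sleep 5 in that script is a guess at how long supervise needs. Here's a sketch of polling svstat instead. It assumes the usual daemontools status line, something like "/service/slapd: up (pid 13139) 42 seconds". Both function names are mine, not part of daemontools.

```shell
#!/bin/sh
# svstat_is_up LINE (hypothetical helper): succeed if a line of svstat
# output reports the service as up. Assumes the daemontools format
# "/service/slapd: up (pid 13139) 42 seconds".
svstat_is_up() {
    case $1 in
        *": up (pid "*) return 0 ;;
        *)              return 1 ;;
    esac
}

# wait_until_up DIR [TRIES] (hypothetical helper): poll svstat once a
# second, up to TRIES times (default 10), until the service reports up.
wait_until_up() {
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        svstat_is_up "$(svstat "$1" 2>&1)" && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example use:
#   wait_until_up /service/slapd 15 || echo "slapd did not come up" >&2
```

A service that keeps respawning can also report up for a second or two, so it's still worth a look at the seconds figure in the svstat output.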

Or!  Here's a nice global script.  Just plug in slapd.

...

#!/bin/bash
# killsomething

echo -n "what do you wish to kill? "
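read -r var1
# An empty pattern would match every process, so bail out.
[ -n "$var1" ] || exit 1
# pgrep skips itself, unlike "ps -ef | grep", which also matches the grep.
pids=$(pgrep -f "$var1" | grep -vx "$$")
[ -n "$pids" ] && kill -9 $pids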

...
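
Remember the bit at the top about OpenLDAP corrupting its database when it isn't shut down gracefully? kill -9 is the opposite of graceful, so for slapd it's worth asking nicely first. Here's a minimal sketch that sends TERM, waits a while, and only then reaches for KILL. graceful_kill is a made-up name and the 10-second default is arbitrary.

```shell
#!/bin/sh
# graceful_kill PID [TIMEOUT] (hypothetical helper): send TERM, wait up to
# TIMEOUT seconds (default 10, an arbitrary choice) for the process to exit,
# and fall back to KILL only if it is still around.
# Returns 0 if TERM was enough, 1 if KILL had to be sent.
graceful_kill() {
    pid=$1
    timeout=${2:-10}
    # If TERM cannot be delivered, the process is already gone
    # (or is not ours to signal); nothing more to do here.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after TERM
        sleep 1
        timeout=$((timeout - 1))
    done
    echo "PID $pid ignored TERM, sending KILL" >&2
    kill -KILL "$pid" 2>/dev/null
    return 1
}

# Example use, with the slapd PID from the ps listing:
#   graceful_kill 13139 30
```

If slapd is running under supervise, bring the service down first (svc -d), otherwise it will just be restarted as soon as it exits.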

And as for those nasty zombies... find them...

...
#!/bin/bash
# hellozombie
ps -A -ostat,ppid,pid,cmd | grep -e '^[Zz]'

...

And now... kill them (well, HUP their parents, since a zombie is already dead and only its parent can reap it)...

...

#!/bin/bash
# goodbyezombie
# Column 2 is the PPID: the HUP goes to each zombie's parent, not the zombie.
ps -A -ostat,ppid,pid,cmd | grep -e '^[Zz]' | awk '{print $2}' | sort -u | xargs -r kill -HUP

...

