ataraxia: 2012

Friday, December 21, 2012

re-ip re-ip re-ip sol11.

solaris 11 re-ip process...

root@sunbox:# svcs svc:/network/physical:nwam

STATE          STIME    FMRI
disabled       Nov_03   svc:/network/physical:nwam

yes.  not using nwam.  i was smart.  for once.

make sure the automatic ncp is turned off and DefaultFixed is the active one.  you never need the automatic one.  really.

root@sunbox:# netadm enable -p ncp DefaultFixed

root@sunbox:# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           192.168.10.66/22
lo0/v6            static   ok           ::1/128
net0/v6           addrconf ok           fe80::9a4b:e1ff:fe7c:e268/10

we're changing the ip address and mask.  one line it:

root@sunbox:# ipadm delete-addr net0/v4 ; ipadm create-addr -T static -a 10.128.10.66/20 net0/v4 ; route -p add default 10.128.10.1
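
before firing that one-liner off over ssh, it's worth checking that the new default route actually sits inside the new subnet. a rough sketch in plain sh; the function names (ip_to_int, same_subnet) are mine, not part of any solaris tool:

```shell
# ip_to_int: dotted quad -> 32-bit integer (runs in a subshell so IFS stays untouched)
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# same_subnet ADDR_A ADDR_B PREFIXLEN: exit 0 if both addresses share the same network
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $a & $mask )) -eq $(( $b & $mask )) ]
}

# example: is the new gateway reachable from the new address?
# same_subnet 10.128.10.66 10.128.10.1 20 && echo "route is sane"
```

octets with leading zeros would be read as octal, so feed it clean addresses.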

update dns.  yissss:

svccfg -s network/dns/client
setprop config/search = astring: ("string.com" "anotherstring.com")
setprop config/nameserver = net_address: (10.128.10.10 10.128.10.15)
select network/dns/client:default
refresh
quit

adventures in exim or mail apocalypse

whoopsies!
i renumbered a range of systems, but failed to edit my exim conf file to allow relaying from the new subnet... the messages were first bounced when i failed to add the interface. and then frozen when i failed to add the new network.

hoo hum.

those relays are defined here:

dc_relay_nets='net/20;net/24'

another interface added:

dc_local_interfaces='127.0.0.1:ww.xx.yy.zz:ww.xx.yy.zz'

and now the good stuff...

unfreeze a single message:

exim -Mt messageid

unfreeze the entire queue and resend:

exim4 -qff

unfreeze the entire queue and force resend:

mailq | grep frozen | awk '{print $3}' | xargs exim -v -M

remove all frozen messages:

exiqgrep -z -i | xargs exim -Mrm
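
if you'd rather look before you leap, the grep/awk part of the pipeline above can be wrapped up and run on its own first. a sketch, assuming the usual exim queue listing (age, size, message id, sender, then the frozen marker on the same line); frozen_ids is a made-up name:

```shell
# frozen_ids: read mailq-style output on stdin, print the id of every frozen message
frozen_ids() {
    awk '/\*\*\* frozen \*\*\*/ { print $3 }'
}

# dry run first:
#   mailq | frozen_ids
# then, if the list looks right:
#   mailq | frozen_ids | xargs exim -v -M
```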

to watch the fun, open up another terminal and

tail -f -n 20 /var/log/exim4/mainlog

corrupt database:

/usr/sbin/exim_tidydb -t 1d /var/spool/exim retry > /dev/null 
/usr/sbin/exim_tidydb -t 1d /var/spool/exim reject > /dev/null 
/usr/sbin/exim_tidydb -t 1d /var/spool/exim wait-remote_smtp > /dev/null

Thursday, December 13, 2012

my answer to everything

# init 6

Wednesday, November 28, 2012

temptrak rrd create

just in case i forget...

rrdtool create temp.rrd --step 3600 \
DS:probe1:GAUGE:300:U:U \
DS:probe2:GAUGE:300:U:U \
RRA:AVERAGE:0.5:1:576

let's make it granular

rrdtool create temp.rrd \
--start N --step 300 \
DS:probe1:GAUGE:600:55:95 \
DS:probe2:GAUGE:600:55:95 \
RRA:MIN:0.5:12:1440 \
RRA:MAX:0.5:12:1440 \
RRA:AVERAGE:0.5:1:1440

let's do something really basic

rrdtool create temp.rrd \
--start N --step 60 \
DS:probe1:GAUGE:300:U:U \
DS:probe2:GAUGE:300:U:U \
DS:probe3:GAUGE:300:U:U \
DS:probe4:GAUGE:300:U:U \
RRA:AVERAGE:0.5:1:576 \
RRA:AVERAGE:0.5:6:576 \
RRA:AVERAGE:0.5:24:576 \
RRA:AVERAGE:0.5:144:576 \
RRA:AVERAGE:0.5:288:576
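
what i always forget is how much history each RRA line really holds. it's step x steps-per-row x rows. a throwaway helper (rra_span is my own name, nothing rrdtool ships) to do the sum:

```shell
# rra_span STEP STEPS_PER_ROW ROWS: seconds of history one RRA covers
rra_span() {
    echo $(( $1 * $2 * $3 ))
}

# rra_span 60 1 576     -> 34560 seconds   (9.6 hours at 1-minute resolution)
# rra_span 60 288 576   -> 9953280 seconds (115.2 days at 288-minute resolution)
# rra_span 300 12 1440  -> 5184000 seconds (60 days of hourly min/max)
```

the other number worth a second look is the heartbeat in each DS line (the 300 or 600): it's the longest gap between updates before a reading goes unknown, so it generally wants to be at least as big as the step.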

Tuesday, November 27, 2012

aix 6.1 odm fun

trying to ssh userwithlongname@aixhost fails. when i su - userwithlongname i get this on AIX 6.1:

3004-503 Cannot set process credentials

What?

# pam.conf
sshd auth   required    /usr/lib/security/pam_aix use_new_state use_first_pass 
sshd account      required    /usr/lib/security/pam_aix 
sshd password     required    /usr/lib/security/pam_aix 
sshd session      required    /usr/lib/security/pam_aix

# /etc/ssh/sshd_config
uncomment the UsePAM line and change UsePAM = no to UsePAM = yes.

# chsec -f /etc/nscontrol.conf -s authorizations -a secorder=files,LDAP

# lsattr -El sys0
shows system variables in the ODM database.

# chdev -l sys0 -a max_logname=30

did it work?*

# getconf LOGIN_NAME_MAX
30

yeah.

# nfso -p -o nfs_use_reserved_ports=1

* Why?

because sometimes you have users with groups and names longer than 8 characters.
if so, and their primary group is one of those groups, or their usernames are longer than 8 characters, no logon.
first hint... tried to su as a user, only first 8 characters shown.
did an lsgroup and the group did not exist.
did an lsgroup ALL and saw that the LDAP group had no content.

neat.

Friday, November 16, 2012

aix installed packages

What do I have installed. AIX 6.1, tell me...

# lslpp -L

# lslpp -l

Thursday, November 15, 2012

aix sshd install

after rpm (openssl installed, yes) hell, you decide to torture yourself more with sshd... quick & dirty:

# cd /tmp
# wget http://sourceforge.net/projects/openssh-aix/files/openssh-aix61/openssh_5.2p1_aix61.tar.Z/download
# mkdir openssl.0.9.8.1103 && cd openssl.0.9.8.1103 && uncompress -c < ../openssl.0.9.8.1103.tar.Z |tar -xvf - && installp -acXYgd . openssl

gen your keys:

# cd /etc/ssh
# ssh-keygen -t rsa

then edit /etc/ssh/sshd_config to suit, and issue:

# stopsrc -g ssh ; startsrc -g ssh

Wednesday, November 14, 2012

solaris 10 statd death

statd problems galore in /var/adm/messages:

Nov 11 06:06:66 localhost statd[262]: [ID 766906 daemon.warning] statd: cannot talk to statd at nastynfsserver, RPC: Timed out(5)

# ps -eaf | fgrep statd 
  daemon 16000 17000   0 13:13:13 ?           0:00 /usr/lib/nfs/statd
    root 16000 17500   0 14:14:14 pts/13      0:00 fgrep statd

# svcs -a | grep "nfs/status"
online          13:13:13 svc:/network/nfs/status:default

# svcadm -v disable nfs/status
svc:/network/nfs/status:default disabled.

# ls /var/statmon/sm.bak
nastynfsserver

# rm /var/statmon/sm.bak/nastynfsserver

# svcadm -v enable nfs/status
svc:/network/nfs/status:default enabled.

NB:
if fgrep is not your friend, grep'll do:

ps -ef |grep -v grep |grep statd

debugging solaris 10 ssh daemon

on solaris 10 i had a problem. it bugged me off and on for like a week.

it was like this:

ldap user on a solaris 10 box with a pubkey or without a pubkey was unable to ssh to other systems, be they solaris or otherwise. this was the case for all zillion solaris 10 sparc and x86 systems i have. not so for solaris 9. and nope for solaris 11.

first i thought there was something amiss with the user's ssh directory. maybe it was the perms on the mount. hell. maybe it was an issue then with the ldap record. the ssh daemons? time to debug...

localhost # /usr/lib/ssh/sshd -p 2222 -Dddd
localhost ~ ssh -vvv -l notme -p 2222 localhost

little did i know, it was not a problem with:

/etc/pam.conf
login auth sufficient         pam_ldap.so.1

nor an issue with:

/etc/ssh/ssh_config
Host *
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null

or even:

/etc/ssh/sshd_config
#ListenAddress 0.0.0.0
#ListenAddress ::

no no.

it was the existence of this wickedness:

localhost notme ~ .sunw

i don't care what that directory holds, it makes my systems puke:

localhost # cp -r /notme/.sunw /notme/.sunw.crap
localhost # rm -rf /notme/.sunw ; mkdir /notme/.sunw
localhost # chmod ugo-rwx /notme/.sunw
localhost # ls -al /notme/ | grep '\.sunw'
drwxrwxr-x   5 notme    notme          4096 Nov 13 13:31 .sunw.crap
d---------   2 notme    notme          4096 Nov 13 13:31 .sunw

Monday, November 12, 2012

solaris 11 ldap client kick start

There's nothing more depressing than when you've got a console going and you see this course by when you do a warm restart of your Solaris 11 box:

svc.startd[44]: libsldap: Status: 2  Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').

Say it ain't so. But it is.

Sadly, I've given up on trying to figure out what's wrong, because really, nothing's wrong at all. What I've done is throw in a kludge, sort of like what I used to have to do on Solaris 8, 9 and 10, to get my ldap clients running. Here's what I did:

Place a script in /etc/init.d and...

Place a symlink to said script in /etc/rc3.d.

First get those ldap services running:

#!/bin/sh

# set up ldap
svcadm enable network/ldap/client:default
svcadm enable network/nis/domain
svcadm enable dns/client
svcadm refresh name-service/switch
svcadm enable -r nfs/client

exit

Symlink it:

# ln -s /etc/init.d/svc-start-ldapclient.sh /etc/rc3.d/S99svc-start-ldapclient

That was easy.

solaris 11 client nfs gone missing

Solaris 11 is all new all the time. One thing that's sort of annoying or mystifying is why, after booting, my zones just decide to skip out on the whole mounting of nfs exports even though they are defined in /etc/vfstab. That's okay. I don't mind creating a cron job:

if [ $(mount| grep 'nfsserver' | grep -v grep | wc -l | tr -s "\n") -eq 0 ]; then mount -a ; fi 2>&1

Oh, and I'm okay with running it every five minutes in crontab.

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/scripts/script.sh
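
For what it's worth, the test inside that script boils down to "does the mount table mention my NFS server?" A stripped-down sketch of just that check, with a made-up function name (nfs_missing) and nothing Solaris-specific in it:

```shell
# nfs_missing SERVER: read `mount` output on stdin;
# exit 0 if no line mentions SERVER, non-zero if at least one does
nfs_missing() {
    ! grep -F -- "$1" >/dev/null
}

# in the cron script this would read:
#   mount | nfs_missing nfsserver && mount -a
```

It matches the server name anywhere on the line, so pick a string that won't also show up in a local path.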

Tuesday, November 6, 2012

solaris 10 forcefully shutdown a zone

In my notes this is marked: "killzonekill".

That being said...

Sometimes my zones on Solaris 10 refuse to shut down. This could be for a variety of reasons. A tell-tale sign is when, say after a day, you see this:

[root@bigsystem ~]# zoneadm -z soxvm218 shutdown

... 24 hours later ...

[root@bigsystem ~]# zoneadm list -civ 

  18 soxvm218       shutting_down /opt/zones/soxvm218          native   shared

Well hell. Maybe there be zombies.

[root@bigsystem ~]# ps -fz soxvm218
     UID   PID  PPID   C    STIME TTY         TIME CMD
    root  1619     1   0 21:56:00 ?           0:00 zsched
 0003088  4486     1   0        - ?           0:00 defunct

Yeah. defunct that's no fun.

You try the usual:

[root@bigsystem ~]# zoneadm -z zonename unmount -f
[root@bigsystem ~]# zoneadm -z zonename reboot -- -s 
[root@bigsystem ~]# pkill -9 -z zonename

Nada.

In that case, do some kill -9 action. Programmatically:

for i in `ps -lLef | grep defunct | grep -v grep | awk '{print $4}'`
do
    echo "Killing process, pid = $i" ; sleep 1
    kill -9 $i ; sleep 5
done

Yeah. That does it every time.

Wednesday, October 24, 2012

solaris 11 zone creation & cloning notes

this is for me and me alone. i'll prettify it eventually.

sparc?
prepare zfs. export is a good place to start.
zfs create rpool/export/zones

create the virtual NIC:

Create 1 vnic for each zone you want to run:
dladm create-vnic -l net0 vnic1

To see the VNIC you have just added:
dladm show-vnic

We're doing exclusive IP-type zones.

Create a profile for the system.
sysconfig create-profile -o /tmp/zone1.xml

Create Zone

zonecfg -z zone1
create
set zonepath=/export/zones/zone1
set ip-type=exclusive
set autoboot=true
add net
set physical=vnic1
end
add dedicated-cpu
set ncpus=1
end
add fs
set dir=/opt/SUNWspro
set special=/opt/SUNWspro
set type=lofs
end
add fs
set dir=/opt/csw
set special=/opt/csw
set type=lofs
end
verify
commit
exit

Now, install the zone with pre-populated settings:
zoneadm -z zone1 install -c /tmp/zone1.xml

Boot the zone:
zoneadm -z zone1 boot

; sol10
To finish the process login to the zone:
zlogin -C zone1

; sol11
zlogin zone1

create an xml file for system 0-state
sysconfig create-profile -o /tmp/zone1.xml

then import said xml file
sysconfig configure -g system -c /tmp/zone1.xml

exit
zoneadm -z zone1 halt ; zoneadm -z zone1 boot

Clone Zone
zonecfg -z zone1 export > zone1clone.cfg
zonecfg -z zone1clone -f zone1clone.cfg
zoneadm -z zone1clone clone -c /root/profiles/zone1clone.xml zone1

NB zone1clone.xml is an edited copy of zone1.xml.  i put it under /root/profiles.

Monday, October 15, 2012

i was cut today

by the way of our man in upper volta:

%WINDIR%\system32>sc config "SnazzyDemon" start= auto
[SC] ChangeServiceConfig SUCCESS

%WINDIR%\system32>sc config "SnazzyDemon" start=auto
[SC] Barf

DESCRIPTION:
         Modifies a service entry in the registry and Service Database.
USAGE:
         sc  config [service name]  ...
         REM remove that space and I cut you.

Thursday, October 11, 2012

i installed what version of sunstudio?

yes you did.

# pkginfo | grep SPRO
application SPROatd    Sun Studio 12 update 1 Advanced Tools Development Module
application SPROcc     Sun Studio 12 update 1 C Compiler
application SPROcmpl   Sun Studio 12 update 1 C++ Complex Library
application SPROcpl    Sun Studio 12 update 1 C++ Compiler
application SPROcplx   Sun Studio 12 update 1 C++ 64-bit Libraries

but what about CC?

/opt/SUNWspro/bin/CC -V

that'll tell you the patch level.

Monday, October 8, 2012

reverse ssh tunnel for tar over ssh

I have two systems. One is on a local LAN. The other is in a DMZ. I will call them: LAN and DMZ.

I need to copy an awful lot of data from DMZ to LAN. The data are so large that I can't just tar and gzip it up on DMZ and issue an scp from LAN. That would be too easy. Instead, since I need to preserve the permissions, symlinks, &c., I'll need to issue a tar over ssh; the best way for me to do this is to set up a reverse ssh tunnel.

I'm going to set it up on port 19999. This means that when DMZ connects to its own loopback port 19999, the connection is carried back through the tunnel to sshd (port 22) on LAN. Neat.

LAN has a pubkey on DMZ for passwordless logon. The account I'm doing the initial connection from on LAN is toor. The DMZ account is root.

setup initial connection via LAN:

~toor ssh -R 19999:localhost:22 root@DMZ

open a shell on DMZ, test it out:

# ssh -l toor -p 19999 localhost
# exit

It works, yay. Do it:

# tar cvf - /opt/stuff | ssh -l toor -p 19999 localhost "tar -xf - -C /tmp/DMZ.stuff"
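
If you want to convince yourself that the tar pipe really keeps permissions and symlinks before pointing it at the tunnel, the same idea can be tried on one box. A minimal sketch; copy_tree is a name I made up, and the ssh hop is deliberately left out:

```shell
# copy_tree SRC DEST: stream SRC through a tar pipe into DEST,
# keeping permissions (tar's p flag) and symlinks
copy_tree() {
    src=$1
    dest=$2
    mkdir -p "$dest" &&
    ( cd "$src" && tar cf - . ) | ( cd "$dest" && tar xpf - )
}

# local dry run:
#   copy_tree /opt/stuff /tmp/stuff.copy
# to go through the tunnel instead, the right-hand side of the pipe becomes:
#   ssh -l toor -p 19999 localhost "cd /tmp/DMZ.stuff && tar xpf -"
```

Running tar from inside the source directory also keeps the leading /opt/stuff out of the archive paths.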

Tuesday, October 2, 2012

sunstudio secrets

sun studio doesn't like to install. not always. but it usually does. this is really quick and dirty, and a fine way of getting the thing from a good distro to a bad one.

tar -cvf - /opt/SUNWspro | ssh -l root targetserver 'cd /opt/ ; tar xf -'

man that's lazy.

Monday, October 1, 2012

solaris 7 & 8 allow root telnet

What a boring post. But, what a tedious topic.

# chmod 644 /etc/default/login
# vi /etc/default/login


# If CONSOLE is set, root can only login on that device.
# Comment this line out to allow remote login by root.
#
CONSOLE=/dev/console

becomes...

# If CONSOLE is set, root can only login on that device.
# Comment this line out to allow remote login by root.
#
#CONSOLE=/dev/console

Monday, September 24, 2012

ulimits & confluence

I have a machine. I actually have many machines. This specific machine runs a daemon, let's call it Atlassian Confluence, just for fun. The daemon is run by a user, let's call it senhorcrap. This user is in a little jail, no ssh, no nothing.

I get a note from an enduser saying something to the effect of:
what the fark is going on with your farking website it is farking down.

I respond:
really?

Actually he said:
hey, i've gone a 500 error and then a few minutes ago i saw this: Service Temporarily Unavailable. The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

I responded with:
not again.

Not again. Before I'd lazily restart the service and the world would be good. Not this time.

And a sick stack trace later...

Looking at the logs (we always look at the logs) I found it was an open-file error. Too many of them were open. Interesting. Well. There are limits to these things to prevent system resource exhaustion.

# tail -f -n 30 /home/senhorcrap/senhorcraps-home/logs/atlassian-confluence.log

Then I tried to gracefully stop the service. Then I just killed it by sweeping it away with a script I have on this blog.

# killsomething
# ps aux |grep confluence

Not there. Nice.

# su - senhorcrap
# ulimit -aS | grep open
1024

# lsof |wc
2044

Uh.

As root... I edited /etc/security/limits.conf , /etc/pam.d/login , /etc/profile

/etc/security/limits.conf

senhorcrap      soft    nofile          1024
senhorcrap      hard    nofile          4096

/etc/pam.d/login

session    required   pam_limits.so

/etc/profile

if [ "$USER" = "senhorcrap" ]; then
        if [ "$SHELL" = "/bin/bash" ]; then
              ulimit -n 4096
        fi
fi

Once I su'd as senhorcrap I checked my limits, and all was well.
I started my daemon and the system was fine. Doing the "Windows refresh" wasn't required.

...

What I did not write was that it took me a goodly long time to figure out I needed both the soft and hard limits in limits.conf for this to work. And that those limits have to be divisible by 1024. And the new limits would only take effect on new processes (daemons) after the fact; thus I had to kill confluence. But, we don't talk about that. A note before you start to sneeze bs all over me. YES hard alone should work. In this instance, it did not. And I got mad. Well, only as mad as a sysadmin can be, which is not really mad at all.

Tuesday, September 18, 2012

vmware esx 5 excitement + ghettoVCB

I have to backup a vm, but I don't have the VMware extensions. What to do? Use ghettoVCB, of course. that's fine, but the deal with VMWare is that a lot of stuff is just plain ephemeral.

My environment is pretty simple. I have an ESX 5 box with two NICs. One is connected to the prod network, the other to a private storage network. The priv net has a server with an NFS export where I can drop stuff from the ESX box.

I've got the NFS export mounted on my ESX box as /vm-repo . Via shell, it is located here:
/vmfs/volumes/vm-repo/

I've decided to use NFS as opposed to iSCSI since I am able to access the data and not have the partition formatted as vmfs. There are drawbacks to both, but for my purposes here, NFS works best. In that directory I have placed ghettoVCB and a few more scripts.

Okay.

Luckily, /etc/rc.local survives between boots on an ESX 5 machine. I've added the following:

# boot vm
for i in $(vim-cmd vmsvc/getallvms|cut -f1 -d" "| grep -v Vmid); do vim-cmd vmsvc/power.on $i; sleep 10; done

# allow smtp through firewall
cp /vmfs/volumes/vm-repo/smtp.xml /etc/vmware/firewall/
esxcli network firewall refresh

# fix root cron
echo "0 0,6,18 * * * /vmfs/volumes/vm-repo/tools/ghettoVCB/ghettoVCB.sh -a" >> /var/spool/cron/crontabs/root

boot vm
This iterates through vms on my ESX box and starts them. This only happens at boot time. This is an issue because ESX no longer does an auto-start.

allow smtp
ESX does not have a nice clickable GUI where I can let SMTP go through. I want SMTP traffic to be sent by the system since I want to know what...

fix root cron
does. This calls the ghettoVCB script which creates a full backup of my VMs at midnight, 6am and 6pm.

Yay. Now my systems auto-start, I have backups and I get a report. Life is grand.
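
One thing to watch with the cron fix: rc.local appends with >> every time it runs, so if the crontab ever keeps an earlier copy (or you run rc.local by hand), the backup line doubles up. A small guard, sketched under the assumption that the shell on the box has a grep with -q, -x and -F; append_once is a hypothetical helper, not anything VMware ships:

```shell
# append_once LINE FILE: add LINE to FILE only if that exact line is not already there
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# in rc.local this would replace the plain echo >> line:
#   append_once "0 0,6,18 * * * /vmfs/volumes/vm-repo/tools/ghettoVCB/ghettoVCB.sh -a" /var/spool/cron/crontabs/root
```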

links
ghettoVCB http://communities.vmware.com/docs/DOC-8760
ghettoVCB-restore http://communities.vmware.com/docs/DOC-10595
smtp hint http://www.vladan.fr/how-to-change-default-ssh-port-on-esxi-5-and-make-the-change-persistent-after-reboot/
rc.local hint http://communities.vmware.com/thread/217704
vm restart http://blogs.vmware.com/vsphere/2012/03/free-esxi-hypervisor-auto-start-breaks-with-50-update-1.html

Wednesday, September 12, 2012

sol11 & gnu

note to self, they put gnu here:
/usr/gnu/bin

Tuesday, September 4, 2012

solaris 11, i weep

solaris11!

why have you cast aside the simplicity of solaris 10? what did i ever do to you? were you taunted as a child for boasting your sysv lineage? don't you just want to get back to your bsd roots? embrace unics, solaris 11. look what happened to your friends aix and hpux. no one really likes them, not really. all the kids look to debian derivatives for cool awesomeness. you had hope solaris 11, you really did. and debuting on armistice day, that was cool. i was quiet for two minutes. i was. forget this mean oracle branding. please?

Friday, August 31, 2012

my vendors don't listen or bulk ms-dns add script

Sigh. I specified that all my DHCP passed-out addresses need to have an A record and a PTR record. Apparently someone wasn't listening, or half-listened, as when I went to do whatever I do, my hosts were showing up sans-name. Oh man. Maybe they got tired typing. There is an easier way to create bulk DNS records.

Let's just say my hosts need this format:

dhcp-101.testorama.vendor.local  10.0.10.101
dhcp-102.testorama.vendor.local  10.0.10.102

Now, vendor.local is my forward lookup zone, and testorama is the domain.

First off, I need an input file with my particulars all separated by commas (csv files are fun).

HOSTNAME,ZONE,IP_ADDRESS

Within my DNS structure, a hostname is the host's name plus domain. Domains can be their own zones... but in my case, this is not so.

For the above a line in my input file called input.txt would look like:

dhcp-101.testorama,vendor.local,10.0.10.101

On the DNS server, or on a host on which you have permission to edit DNS entries and have DNS tools installed (for the lovely dnscmd command) issue:

for /f "tokens=1-3 delims=," %a in (input.txt) do dnscmd  /RecordAdd %b %a A %c

to create A records .

Issue:

for /f "tokens=1-3 delims=," %a in (input.txt) do for /f "tokens=1-4 delims=." %e in ("%c") do dnscmd  /RecordAdd %g.%f.%e.in-addr.arpa. %h PTR %a.%b

for PTR records.

For A & PTR record deletions, because you made a mistake, by say, having a digit flip...

for /f "tokens=1-3 delims=," %a in (input.txt) do dnscmd  /RecordDelete %b %a A /f

for /f "tokens=1-3 delims=," %a in (input.txt) do for /f "tokens=1-4 delims=." %e in ("%c") do dnscmd  /RecordDelete %g.%f.%e.in-addr.arpa. %h PTR /f

sometimes your dns admins will not have separate zones for various subnets. in the above example, 10.0 is it. to remedy that, just change the variables in your PTR script:

Issue:

for /f "tokens=1-3 delims=," %a in (input.txt) do for /f "tokens=1-4 delims=." %e in ("%c") do dnscmd  /RecordAdd %f.%e.in-addr.arpa. %h.%g PTR %a.%b
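
The octet shuffling is the part that is easy to get backwards: the zone is the leading octets reversed, and the node is whatever is left over, also reversed. Here is the same logic as a shell sketch for checking input before it goes anywhere near dnscmd. ptr_parts is a made-up name, and it only handles reverse zones that cover the first two or three octets:

```shell
# ptr_parts IP [ZONE_OCTETS]: print "zone node" for an IPv4 address.
# ZONE_OCTETS is how many leading octets the reverse zone covers (default 3).
ptr_parts() (
    IFS=.
    set -- $1 "${2:-3}"
    case $5 in
        3) echo "$3.$2.$1.in-addr.arpa. $4" ;;
        2) echo "$2.$1.in-addr.arpa. $4.$3" ;;
        *) echo "unsupported zone size: $5" >&2; return 1 ;;
    esac
)

# ptr_parts 10.0.10.101     prints: 10.0.10.in-addr.arpa. 101
# ptr_parts 10.0.10.101 2   prints: 0.10.in-addr.arpa. 101.10
```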

Wednesday, August 22, 2012

grepping ldap for clues.

Sometimes you need to do queries off Active Directory. AD is basically an LDAP database with some weirdness. That's okay.

On my lovely ubuntu box, I need to do queries to find bunches of users.

ldapsearch -x -D "Domain\uid" -W \
-h ad.server.com \
-b "DC=my,DC=ad,DC=server,DC=com" \
-LLL -v "(sAMAccountName=anotheruid)" cn

What is all this?
-x says we're doing a simple bind.
AD likes authenticated queries. -D is who you're binding as. -W prompts for a pass.
-h is the AD server I'm talking to.
-b is the search base; that is the AD tree where I'm doing my query.
-LLL is the output format: plain LDIF with the comments and version line stripped. What gets shown is whatever attributes you list at the end (here, cn).
-v is the verbose tag.
After all this is my search string. In this case, I'm looking for a uid and want to print its common name. I could plop sn which'd tell me the surname.
To be interesting, I could put in "(sn=clue)" cn and that'd display everyone with the surname "clue" and their common name. Fun.

Monday, August 20, 2012

lsof adventures on sol11.

Solaris 11, I heart you. But I h8 you. I do. You've skipped out on one of the most useful tools known to sysadmindom:
lsof

Why? Well... we can roll our own, can't we? Sure we can.

Solaris 11 does not have a /usr/local/bin or /usr/local/sbin .
Create skel directories:

# mkdir -p /usr/local/bin /usr/local/sbin
# mkdir -p /usr/local/man/man8

Then, with your downloaded lsof.tar.Z code from
ftp://sunsite.ualberta.ca/pub/Mirror/lsof/lsof_4.86.tar.Z

Read the READMES. Read them again..

# ./Configure solariscc
# make
# make install

And then you see...

Please write your own install rule. Lsof should be installed...

grumble. Thanks Vic for assuming I have half a brain... heh...

$ vi Makefile

DESTDIR= /usr/local
BIN=    ${DESTDIR}/sbin
DOC=    ${DESTDIR}/man/man8
GRP= sys  



 install -m 2755 -o root -g ${GRP} ${PROG} ${BIN}
 install -m 444 ${MAN} ${DOC}

If it still craps out...

# cp lsof /usr/local/sbin/.
# chmod 2755 /usr/local/sbin/lsof
# chown root:sys /usr/local/sbin/lsof
# cp lsof.8 /usr/local/man/man8/.
# chmod 444 /usr/local/man/man8/lsof.8

Wednesday, August 15, 2012

formatting a disk in a solaris10 system

After the drive's been placed in the system, solaris doesn't autofind the hardware a la kudzu. You need to do it yourself.

Run:
# devfsadm

To save yourself some pain, if you've mounted a disk used by an old system, redo the label or partition table. I've had VTOC warnings about not having backup labels when doing a simple partition table. So. Run:

# format -e
Choose your new disk.

You'll be presented with: SMI [0] or EFI [1].

format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]? n
format> quit

SMI will create a new disk slice with backup. backup is the slice logically containing the entire space available on the disk.

When redoing the partition tables on the disk, do not delete or rename backup.

Run format again...

format> partition

partition> print
Current partition table (original):
Total disk cylinders available: 1020 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0 unassigned    wm       0               0         (0/0/0)          0
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wu       0 - 1020        1.99GB    (1021/0/0) 4182016
  3 unassigned    wm       0               0         (0/0/0)          0
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wm       0               0         (0/0/0)          0
  8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
  9 unassigned    wm       0               0         (0/0/0)          0

In this case, I just want to create one large partition for some extra storage so I will allocate all I can to partition 0. Note that partition 2 is used to reference the entire drive and is not a usable partition. To modify a given partition, just enter the number of the partition at the partition prompt:

Choose the partition, re-name unassigned and make wm.
I like to do the last slice on up, skipping slice 2, taking note of its size... and then when I've made it to 0, give it the same amount of space as slice 2.

Then...

partition> label
Ready to label disk, continue? y

partition> quit
format> quit

Create a lovely UFS filesystem...

# newfs /dev/dsk/c0t1d0s0
newfs: construct a new file system /dev/rdsk/c0t1d0s0: (y/n)? y
/dev/dsk/c0t1d0s0: 4173824 sectors in 1019 cylinders of 128 tracks, 32 sectors
5000.0MB in 45 cyl groups (23 c/g, 46.00MB/g, 11264 i/g)
super-block backups (for fsck -F ufs -o b=#) at:

Fsck it.

# fsck -y /dev/dsk/c0t1d0s0
And then mount it however you wish.

Thursday, August 9, 2012

exchange small ufs drive for a large one

teeny ufs drive to larger ufs drive on solaris 10 a possibility? ya betcha.

c0t0d0 is the source. it is formatted as ufs. bummer.
c1t0d0 is the destination. it shall be formatted as ufs. bummer.

the bum deal is that the source disk has all of these volumes defined, and since the backup disk slice is being a punk, i can't resize any of the slices. that's okay. i really just want to plop everything on the same slice and go on with life. i could make this complicated - you know, re-create all the disk slices and ufsdump slice to slice, but i'm in a rush. if you're doing the latter, as opposed to a ufsdump of the root partition, just do the rdsk. it works.

first. format the destination disk.

# format

second. create a filesystem on the destination disk.

# newfs

third. mount the disk and initiate a ufsdump and restore. dd be damned.

i'm going to mount it under /mnt.

# mount -F ufs -o rw /dev/dsk/c1t0d0s0 /mnt
# ufsdump 0f - / | ( cd /mnt ; ufsrestore xvf - )
# umount /mnt

at the end of it all, be sure to enable the disk to actually be booted.

# /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s0

for fun:
# fsck /dev/rdsk/c1t0d0s0

and, clean things up in /mnt/etc/vfstab . we don't want to mount things that aren't there, like the not-copied-over swap partition.

http://utahsysadmin.com/2008/02/07/adding-a-hard-drive-to-solaris-10/
http://nixforums.org/about22408-Copy-entire-Solaris-disk--to-new-Hard-Disk-.html
http://fengnet.com/book/Solaris_admin/ch01lev1sec15.html is a lovely discussion of smc admin tool. yay illegal yanking of copyrighted material prc peeps.

oracle solaris 11 is all new all the time

it is.

after install, re-configure networking. this will remove all profiles and anything that may muck up correct connectivity later on.

[undo]
yep. you start out by unconfiguring the default. go figure, right? well, this gets rid of all the confusion created by ncp and loc and "network magic."

# sysconfig configure -s

system will shut down; upon system start logon as "alternate account".

[ssh]
allow root ssh login solaris 11.

/etc/ssh/sshd_config
PermitRootLogin = yes

/etc/default/login
#CONSOLE=/dev/console

# rolemod -K type=normal root

[ldap]
what's ldap up to?
svcs '*/ldap/*'

svcadm enable network/ldap/client:default
svcadm enable network/nis/domain
svcs -l network/ldap/client:default
/usr/lib/ldap/ldap_cachemgr -g

svcs -l network/ldap/client:default
make sure the deps are online.

ldapclient -v manual \
-a defaultServerList=xx.xx.xx.xx \
-a defaultSearchBase=dc=xx,dc=xx,dc=xx \
-a defaultSearchScope=sub \
-a bindTimeLimit=20 \
-a credentialLevel=proxy \
-a authenticationMethod=simple \
-a proxyDN=cn=admin,dc=xx,dc=xx,dc=xx \
-a proxyPassword=aStringValue \
-a serviceSearchDescriptor=passwd:ou=users,dc=xx,dc=xx,dc=xx \
-a serviceSearchDescriptor=shadow:ou=users,dc=xx,dc=xx,dc=xx \
-a serviceSearchDescriptor=group:ou=groups,dc=xx,dc=xx,dc=xx \
-a followReferrals=true

# ldapclient list
determine that all fields are thus:

NS_LDAP_FILE_VERSION= 2.0
NS_LDAP_BINDDN= cn=admin,dc=xx,dc=xx,dc=xx
NS_LDAP_BINDPASSWD= {NS1}poop
NS_LDAP_SERVERS= xx.xx.xx.xx
NS_LDAP_SEARCH_BASEDN= dc=xx,dc=xx,dc=xx
NS_LDAP_AUTH= simple
NS_LDAP_SEARCH_REF= TRUE
NS_LDAP_SEARCH_SCOPE= sub
NS_LDAP_CACHETTL= 0
NS_LDAP_CREDENTIAL_LEVEL= proxy
NS_LDAP_SERVICE_SEARCH_DESC= passwd:ou=users,dc=xx,dc=xx,dc=xx
NS_LDAP_SERVICE_SEARCH_DESC= shadow:ou=users,dc=xx,dc=xx,dc=xx
NS_LDAP_SERVICE_SEARCH_DESC= group:ou=groups,dc=xx,dc=xx,dc=xx
NS_LDAP_BIND_TIME= 20

in pam.conf have:

# login service (explicit because of pam_dial_auth)
#
login auth requisite pam_authtok_get.so.1
login auth required pam_dhkeys.so.1
login auth required pam_unix_cred.so.1
login auth required pam_dial_auth.so.1
login auth binding pam_unix_auth.so.1 server_policy
login auth required pam_ldap.so.1

http://docs.oracle.com/cd/E23823_01/html/816-5166/ldapclient-1m.html shows all the neat switches.

[nsswitch]

# svccfg
svc:> select name-service/switch
svc:/system/name-service/switch> setprop config/host = astring: "files dns"
svc:/system/name-service/switch> setprop config/ipnodes = astring: "files dns"
svc:/system/name-service/switch> select system/name-service/switch:default
svc:/system/name-service/switch:default> refresh
svc:/system/name-service/switch:default> validate
svc:/system/name-service/switch:default> exit
# svcadm enable dns/client
# svcadm refresh name-service/switch
# grep host /etc/nsswitch.conf
hosts:  files dns
# cat /etc/resolv.conf

Tuesday, July 31, 2012

expect a pubkey

i have a pubkey. i need to put it all over the place.
but, i have my pubkey on some systems.

sigh.

first, i cat my favorite pubkeys into authorized_keys2, then i strip my dns zone file and get all my ip addresses. then i feed that list into this script. if the systems blink, i attempt to scp to them. if i get a password prompt, expect will throw the "i already know it password" in and copy over my keys. yeah. you can get fancy and do other things, but this is a start.

#!/bin/bash

for ip_addr in $(cat strippedzonefile) ; do

ping -q -c 1 $ip_addr &&

expect -c "
spawn scp /my/authorized_keys2 account@$ip_addr:/that/account/.ssh/authorized_keys2
expect \"?assword:*\"
send -- \"securepassword\r\"
expect eof
 "
done

nfs barfs

i need to re-export an nfs mount because i need to. i do my usual /etc/exports editing. and then nfsd barfs...

root@server:~# exportfs -ra
exportfs: Warning: /my/export does not support NFS export.

why?

Of course...
/etc/default/nfs-kernel-server
needs this line...

REEXPORT_NFS="yes"
then re-start nfs services, statd junk and portmap.

root@server:~# exportfs -ra

no error. nice. or just install unfs3.

Monday, July 30, 2012

strip ips from zonefile

so i want to strip ips from a zone file. easy.
dump it. scp it. whatever.


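#!/bin/bash
# pull every ip out of a zone file; the sorted, unique list lands in ./stripped
echo "enter zone file"
read zonefile

if [ -f "$zonefile" ] ; then

# tag each ip with "ip" on its own line (gnu sed), keep only the tagged lines, drop the tag
sed -n 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/\nip&\n/gp' "$zonefile" | grep '^ip' | sed 's/^ip//' | sort | uniq > stripped

fi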
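
the same idea as a function, sketched with grep -o instead of the sed tagging trick. strip_ips is a made-up name for this example, and it assumes a grep that understands -o and -E (gnu and bsd grep both do).

```shell
# strip_ips: hypothetical helper; prints each unique dotted-quad found in a file.
# it matches anything shaped like an ip, so 999.999.999.999 would pass too.
strip_ips() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort -u
}

# usage: strip_ips my.zone.file > stripped
```

no range checking on the octets, same as the sed version above.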

no frills scp & execute command script

1000 machines need a file and a command run.
some machines are up. some are not.

first thing, pubkey them. done.

now, what to do about that file and the command?

my file is called, oh, file. it is in ~ . somewhere.
drop a file, say, computers in pwd.

first, check if the computers are alive. then drop the file. then run whatever's in the file.

#!/bin/bash
# for every host that answers a ping: copy the file over, then run it

for ip_addr in $(cat computers) ; do
  ping -q -c 1 "$ip_addr" &&
  scp -r ~/somewhere/file "toor@$ip_addr:/tmp" &&
  ssh -l toor "$ip_addr" "bash -c \"/tmp/file \""
done

if you work by the hour, then this script would make you useless. if you're salaried, go get some coffee.

Thursday, July 26, 2012

i don't care about keys

well. i do and sometimes i don't. let's just suspend all those "do you want to accept this rsa key" prompts, shall we?

[systemwide]
in /etc/ssh/ssh_config (global client conf file) add stanza:

Host 192.168.168.*
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null

* This may be done by subnet or host.

[per session]
$ ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no \
uid@192.168.168.192

Wednesday, June 27, 2012

change all those perms

So you need to change the uid and gid on all files owned by a user. Do the following as root:

# find / -uid 1500 -gid 100 -exec chown 15038:101 {} \;

A breakdown is as follows:

find / -uid 1500 -gid 100 -exec chown 15038:101 {} \;
^    ^      ^        ^       ^
|    |      |        |       |
|    |      |        |       | 
|    |      |        |       |
|    |      |        |       do this chown new userid:group {all files found}
|    |      |        |
|    |      |        user's primary group
|    |      |
|    |      userid
|    |
|    filesystem
|
command
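
Before letting chown loose on /, it may be worth listing what the find would actually match. A minimal sketch; owned_by is a hypothetical helper name, not part of the original one-liner, and it changes nothing on disk.

```shell
# owned_by: hypothetical helper; lists files under a directory that match
# both a numeric uid and a numeric gid. read-only, nothing is chowned.
owned_by() {
    # $1 = directory to search, $2 = uid, $3 = gid
    find "$1" -uid "$2" -gid "$3" -print
}

# usage: owned_by / 1500 100 | less
```

If the list looks right, swap -print for the -exec chown from the breakdown above.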

Wednesday, May 30, 2012

solaris 9 u4 & studio 11

are not compatible.

# ./installer

Exception in thread "Thread-28" java.lang.NoClassDefFoundError: com/sun/install/panels/ComponentSelectionListener
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
        at java.lang.Class.getDeclaredMethod(Class.java:1935)
        at java.awt.Component.isCoalesceEventsOverriden(Component.java:5723)
        at java.awt.Component.access$100(Component.java:162)
        at java.awt.Component$2.run(Component.java:5677)
        at java.awt.Component$2.run(Component.java:5675)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.awt.Component.checkCoalescing(Component.java:5674)
        at java.awt.Component.<init>(Component.java:5643)
        at java.awt.Container.<init>(Container.java:245)
        at javax.swing.JComponent.<init>(JComponent.java:576)
        at javax.swing.JPanel.<init>(JPanel.java:65)
        at javax.swing.JPanel.<init>(JPanel.java:92)
        at javax.swing.JPanel.<init>(JPanel.java:100)
        at com.sun.wizards.core.WizardComponent.<init>(WizardComponent.java:159)
        at com.sun.wizards.core.WizardComponent.<init>(WizardComponent.java:145)
        at com.sun.wizards.core.WizardLeaf.<init>(WizardLeaf.java:78)
        at com.sun.install.panels.ComponentPanel.<init>(ComponentPanel.java:144)
        at com.sun.install.products.CreateSimpleUninstaller.createSimpleUninstallerTree(CreateSimpleUninstaller.java:42)
        at com.sun.install.products.UninstallArchiveCreator.writeArchiveFile(UninstallArchiveCreator.java:537)
        at com.sun.install.products.UninstallArchiveCreator.writeArchive(UninstallArchiveCreator.java:317)
        at com.sun.install.products.UninstallUnit.install(UninstallUnit.java:740)
        at com.sun.install.products.Product.performInstallation(Product.java:649)
        at com.sun.install.tasks.ProductTask.perform(ProductTask.java:153)
        at com.sun.wizards.core.Sequence.perform(Sequence.java:343)
        at com.sun.wizards.core.SequenceManager.run(SequenceManager.java:226)
        at java.lang.Thread.run(Thread.java:619)

well. just take away my spoons.

download jdk-1_5_0_21-solaris-sparc.sh from here: http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase5-419410.html#jdk-1.5.0_21-oth-JPR , install it, and...

# mv /usr/java /usr/java1.4
# ln -s /usr/jdk1.5.0_21 /usr/java

Monday, April 30, 2012

oracle 11r1 & r2 centos install notes

i don't like reading long docs. just distill it down you say? okay.

echo redhat-4 >> /etc/redhat-release

in /etc/security/limits.conf

 # settings for oracle
 *               soft    nproc   2047
 *               hard    nproc   16384
 *               soft    nofile  1024
 *               hard    nofile  65536

in /etc/sysctl.conf

 kernel.shmmni = 4096

/sbin/sysctl -p

groupadd oinstall ; groupadd dba ; groupadd oper ; groupadd oracle 
useradd -g oinstall -G oracle -d /opt/oracle oracle
passwd oracle

install as user oracle...

11r1 & r2 install add'l packages

setarch
make
glibc
libaio
compat-libstdc++
gcc
libXp
openmotif
compat-db

11r2 yum install add'l packages:

elfutils-libelf-devel
gcc-c++
libaio-devel
libstdc++-devel
sysstat
unixODBC-2.2.11
unixODBC-devel-2.2.11
pdksh

Friday, April 27, 2012

dhcp3 combatting evil

After lunch yesterday I received a request for support from a fellow running several VMs and them not getting IP addresses from the DHCP server. That's weird. I've done nothing to my network and the ESX server looks just fine. There goes an afternoon...

After a look at the logs on the dhcp3 server, I found that an errant bank of devices was going haywire. Sure, pulling the power cord would've been a quicker fix, but I like puzzles.

Here's what I saw: first, a whole bunch of requests were coming in from a bunch of MACs prefixed with e8:39:35 . All these requests were taking dhcp addresses. So, I plug the prefix in here:
http://www.wireshark.org/tools/oui-lookup.html

to figure out what hardware is behind that MAC.

I find out that it is not a virtual machine gone bad. HP device. Great. So then I pull out the bigger brain and decide that I want to craft a dhcp pool that'll ban HP devices and allow everything else. To do this I create rules explicitly allowing and denying classes of devices. Easy?

Below you'll find a list of common MAC identifiers for Virtual machines, a dhcp3.conf and some pertinent logs.

MAC identifiers

Company and Products                        MAC unique identifier
VMware ESX 3/4 Server, Workstation, Player  00:50:56 00:0C:29 00:05:69
MS Hyper-V, Virtual Server, Virtual PC      00:03:ff
Parallels Desktop, Workstation, Server, Virtuozzo 00:1c:42
Virtual Iron 4                              00:0f:4b
RedHat Xen                                  00:16:3e
Oracle VM                                   00:16:3e
XenSource                                   00:16:3e
Novell Xen                                  00:16:3e
Sun xVM VirtualBox                          08:00:27

dhcp3.conf

ddns-update-style none;

default-lease-time 600;
max-lease-time 7200;

authoritative;
log-facility local7;

option subnet-mask 255.255.255.0;
option broadcast-address 10.10.10.255;
option routers 10.10.10.1;
option domain-name-servers 10.10.10.2, 10.10.10.3;
option domain-name "my.company.com";
option netbios-name-servers 10.10.10.2;

class "evil" {
        match if (binary-to-ascii (16,8,":",substring(hardware, 0, 4)) = "1:e8:39:35");
        log (info, (binary-to-ascii (16,8,":",substring(hardware, 0, 4))));
}

class "vmware-clients" {
        match if (binary-to-ascii (16,8,":",substring(hardware, 0, 4)) = "1:0:50:56")
        or (binary-to-ascii (16,8,":",substring(hardware, 0, 4)) = "1:0:c:29")
        or (binary-to-ascii (16,8,":",substring(hardware, 0, 4)) = "1:0:5:69");
        log (info, (binary-to-ascii (16,8,":",substring(hardware, 0, 4))));
} 

class "not-evil" {
        match if not (binary-to-ascii (16,8,":",substring(hardware, 0, 4)) = "1:e8:39:35");
        log (info, (binary-to-ascii (16,8,":",substring(hardware, 0, 4))));
}

subnet 10.10.10.0 netmask 255.255.255.0 {
        pool {
                range 10.10.10.100 10.10.10.200;
                range 10.10.10.204 10.10.10.220;
                allow members of "vmware-clients";
                allow members of "not-evil";
                deny members of "evil";
                }
}
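
The match strings in those classes look odd because binary-to-ascii prints the hardware type first and then each octet in hex with no leading zeros. A small sketch that builds the string from a normal-looking MAC; dhcpd_prefix is a made-up helper name, and the leading 1 assumes the hardware type is ethernet.

```shell
# dhcpd_prefix: hypothetical helper; turns a MAC like 00:0c:29:aa:bb:cc
# into the string the class rules compare against (1:0:c:29).
# assumes hardware type 1 (ethernet).
dhcpd_prefix() {
    # split the MAC on ":" and keep the first three octets
    IFS=: read -r a b c rest <<EOF
$1
EOF
    # %x drops leading zeros, the same way the log lines show them
    printf '1:%x:%x:%x\n' "0x$a" "0x$b" "0x$c"
}

# usage: dhcpd_prefix e8:39:35:1f:8a:6e
```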

Log snippet

Apr 26 16:03:50 dhcpd: Wrote 8 leases to leases file.
Apr 26 16:05:00 dhcpd: DHCPREQUEST for 10.10.10.175 from e8:39:35:1f:8a:6e via eth0: lease 10.10.10.175 unavailable.
Apr 26 16:05:00 dhcpd: DHCPNAK on 10.10.10.175 to e8:39:35:1f:8a:6e via eth0
Apr 26 16:05:01 dhcpd: 1:0:50:56
Apr 26 16:05:01 dhcpd: 1:0:50:56
Apr 26 16:05:01 dhcpd: DHCPDISCOVER from 00:50:56:80:1a:75 via eth0
Apr 26 16:05:02 dhcpd: DHCPOFFER on 10.10.10.159 to 00:50:56:80:1a:75 (vmware-client01) via eth0
Apr 26 16:05:06 dhcpd: 1:0:50:56
Apr 26 16:05:06 dhcpd: 1:0:50:56
Apr 26 16:05:06 dhcpd: DHCPREQUEST for 10.10.10.159 (10.10.10.2) from 00:50:56:80:1a:75 (vmware-client01) via eth0
Apr 26 16:05:06 dhcpd: DHCPACK on 10.10.10.159 to 00:50:56:80:1a:75 (vmware-client01) via eth0
Apr 26 16:05:42 dhcpd: DHCPREQUEST for 10.10.10.162 from e8:39:35:1f:0e:97 via eth0: lease 10.10.10.162 unavailable.
Apr 26 16:05:42 dhcpd: DHCPNAK on 10.10.10.162 to e8:39:35:1f:0e:97 via eth0
Apr 26 16:07:03 dhcpd: 1:34:40:b5
Apr 26 16:07:03 dhcpd: DHCPREQUEST for 10.10.10.172 from 34:40:b5:20:a8:01 via eth0
Apr 26 16:07:03 dhcpd: DHCPACK on 10.10.10.172 to 34:40:b5:20:a8:01 via eth0
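
To see which hardware prefix is collecting the NAKs, something like this sketch can chew through the log. nak_by_oui is a hypothetical name, and it assumes the log lines look like the snippet above (the MAC is the word right after "to").

```shell
# nak_by_oui: hypothetical helper; reads dhcpd log lines on stdin and
# prints "count prefix" for every DHCPNAK, busiest prefix first.
nak_by_oui() {
    awk '/DHCPNAK/ {
             for (i = 1; i <= NF; i++)
                 if ($i == "to") { print substr($(i + 1), 1, 8); break }
         }' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# usage: nak_by_oui < /var/log/syslog
```

The log path in the usage line is a guess; point it at wherever local7 lands on your box.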

Wednesday, April 18, 2012

fix arp caches

so yeah. your ipv4 forwarder may be all scrambled and you've flushed the arp cache per a previous post, but the switches still have the incorrect arp information and hilarity ensues. an easy way to fix this is to issue a network command from the machine affected by arp nastiness. here's a quick oneliner to use ssh to connect to somewhere else - in this case via an ip'd secondary nic:

ssh -b secondary.nic.ip.address -p port me@somewhere

and the arp cache up the switch stack's been updated. of course, you're connecting to another system that's hanging off another switch up and around the stack, right?

Monday, April 16, 2012

who's plugging my ldap server

come on now. stop it already.

netstat -an | grep :389 | awk '{print $5}' | awk -F : '{print $1}' | sort | uniq

netstat -an | grep :389 | awk '{print $5}' | cut -f 1 -d : | sort | uniq -c

or. who the heck is searching for that freaking uid?

ngrep -q -t "uid" \(port 389 or port 636 \)
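
a slightly pickier take on the netstat pipelines above, as a sketch. ldap_peers is a made-up name; it assumes the linux netstat column layout (proto, recv-q, send-q, local, foreign, state), only handles ipv4, and only counts connections whose local side is port 389.

```shell
# ldap_peers: hypothetical helper; reads `netstat -an` output on stdin and
# prints "count peer-ip" for connections to local port 389, busiest first.
ldap_peers() {
    awk '$4 ~ /:389$/ && $6 != "LISTEN" { sub(/:[0-9]+$/, "", $5); print $5 }' |
        sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# usage: netstat -an | ldap_peers
```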

Tuesday, April 3, 2012

sunstudio11 curses!

sigh i messed up a studio11 install. i did. delete the directory, sure? and i did.
in the process of reinstalling, the installer said studio was already installed.
oh... yeah... pkgadd... whoopsies!
i need to reinstall. what to do?

Fixing a Failed Installation or Uninstallation on Solaris Platforms

    Become superuser by typing:

    su
    Password: root-password

    Open the Solaris Product Registry tool by typing:

    /usr/bin/prodreg &

    In the left pane of the tool, expand the Unclassified Software node.
    Select all of the package names containing Oracle Solaris Studio 11 and click Uninstall. 
    Follow the instructions to remove the packages.
    Click Exit to exit the tool.
    Remove the .nbi directory from root's home directory (/ here) by typing:

    rm -r /.nbi

From the commandline:

# cd /var/sadm/prod/com.sun.studio_11/
# ./batch_uninstall_all

Tuesday, March 27, 2012

entry of 66048 (0x10200) when it should be 512 (0x200). eh?

So someone says:

"Oh so sorry, we've fascist controls on our AD-integrated site and you have: userAccountControl entry of 66048 (0x10200), when it should be 512 (0x200). No logon for you."

What does that mean, really?
Well. What it means is that according to UAC you've got the DONT_EXPIRE_PASSWORD property set. It has the hex and decimal values of 0x10000 and 65536. If we add that to mister normal user, NORMAL_ACCOUNT (0x0200, 512), we get 0x10200 (66048). That non-expiring password... that's not expected.

Of course...

Here's something from Microsoft:

When you open the properties for a user account, click the Account tab, and then either select or clear the check boxes in the Account options dialog box, numerical values are assigned to the UserAccountControl attribute. The value that is assigned to the attribute tells Windows which options have been enabled.

To view user accounts, click Start, point to Programs, point to Administrative Tools, and then click Active Directory Users and Computers.

You can view and edit these attributes by using either the Ldp.exe tool or the Adsiedit.msc snap-in.

The following table lists possible flags that you can assign. You cannot set some of the values on a user or computer object because these values can be set or reset only by the directory service. Note that Ldp.exe shows the values in hexadecimal. Adsiedit.msc displays the values in decimal. The flags are cumulative. To disable a user's account, set the UserAccountControl attribute to 0x0202 (0x002 + 0x0200). In decimal, this is 514 (2 + 512).

Note You can directly edit Active Directory in both Ldp.exe and Adsiedit.msc. Only experienced administrators should use these tools to edit Active Directory. Both tools are available after you install the Support tools from your original Windows installation media.

Property flag                   hexadecimal decimal
SCRIPT                          0x0001          1
ACCOUNTDISABLE                  0x0002          2
HOMEDIR_REQUIRED                0x0008          8
LOCKOUT                         0x0010          16
PASSWD_NOTREQD                  0x0020          32
PASSWD_CANT_CHANGE              0x0040          64      (MS says this can't be set by editing UAC directly.)
ENCRYPTED_TEXT_PWD_ALLOWED      0x0080          128
TEMP_DUPLICATE_ACCOUNT          0x0100          256
NORMAL_ACCOUNT                  0x0200          512
INTERDOMAIN_TRUST_ACCOUNT       0x0800          2048
WORKSTATION_TRUST_ACCOUNT       0x1000          4096
SERVER_TRUST_ACCOUNT            0x2000          8192
DONT_EXPIRE_PASSWORD            0x10000         65536
MNS_LOGON_ACCOUNT               0x20000         131072
SMARTCARD_REQUIRED              0x40000         262144
TRUSTED_FOR_DELEGATION          0x80000         524288
NOT_DELEGATED                   0x100000        1048576
USE_DES_KEY_ONLY                0x200000        2097152
DONT_REQ_PREAUTH                0x400000        4194304
PASSWORD_EXPIRED                0x800000        8388608
TRUSTED_TO_AUTH_FOR_DELEGATION  0x1000000       16777216
PARTIAL_SECRETS_ACCOUNT         0x04000000      67108864
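
Since the flags are cumulative, decoding a value is just a bitwise AND against each row of the table. A minimal sketch in shell; uac_flags is a hypothetical helper name, and the flag list is copied from the table above (with 64 for PASSWD_CANT_CHANGE).

```shell
# uac_flags: hypothetical helper; prints the names of the flags set in a
# userAccountControl value. accepts decimal (66048) or hex (0x10200).
uac_flags() {
    value=$(( $1 ))
    out=""
    for pair in \
        1:SCRIPT 2:ACCOUNTDISABLE 8:HOMEDIR_REQUIRED 16:LOCKOUT \
        32:PASSWD_NOTREQD 64:PASSWD_CANT_CHANGE \
        128:ENCRYPTED_TEXT_PWD_ALLOWED 256:TEMP_DUPLICATE_ACCOUNT \
        512:NORMAL_ACCOUNT 2048:INTERDOMAIN_TRUST_ACCOUNT \
        4096:WORKSTATION_TRUST_ACCOUNT 8192:SERVER_TRUST_ACCOUNT \
        65536:DONT_EXPIRE_PASSWORD 131072:MNS_LOGON_ACCOUNT \
        262144:SMARTCARD_REQUIRED 524288:TRUSTED_FOR_DELEGATION \
        1048576:NOT_DELEGATED 2097152:USE_DES_KEY_ONLY \
        4194304:DONT_REQ_PREAUTH 8388608:PASSWORD_EXPIRED \
        16777216:TRUSTED_TO_AUTH_FOR_DELEGATION \
        67108864:PARTIAL_SECRETS_ACCOUNT
    do
        bit=${pair%%:*}
        name=${pair#*:}
        # a flag is set when its bit survives the AND
        if [ $(( value & bit )) -ne 0 ]; then
            out="$out${out:+ }$name"
        fi
    done
    echo "$out"
}

# usage: uac_flags 66048
```

For the value in the complaint, that should print NORMAL_ACCOUNT DONT_EXPIRE_PASSWORD.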

Monday, March 12, 2012

rhel6 makes me bang my head on my cubicle wall sometimes

rhel6 is pesky in that if the netmask isn't standard, it'll make one up for you anyway and really mess up routes. come on redhat, learn something from debian already.

let's fix that. put the default route in:

/etc/sysconfig/network-scripts/route-ethX
default via dotted.router.ip dev ethX

and add the real netmask at the end of:

/etc/sysconfig/network-scripts/ifcfg-ethX
ifconfig ethX netmask 255.255.252.0

Tuesday, March 6, 2012

lock it, lock it up and lock it

~~I like to~~ I run backups and other scripts that require exclusive access to directories. For directory mirroring, rsync is a graceful candidate for the job - either locally or over the net to another host. A problem with some scripts that call rsync is that you can get into a race condition if one of your scheduled rsync jobs starts trying to "back up" the same thing that another scheduled rsync process is still working on. Bad joss all around. Of course, you could write something that says, if this script is running, please don't run. Or lockfile can do it for you. lockfile is part of the procmail package on various flavors of Ubuntu. To get it, issue:

# apt-get install procmail

Easy.

Here's a useful snippet of code using lockfile in a shell script:

#!/bin/sh

LOCKFILE="/tmp/processname.lock"

# Break the lock if the locking process has died
# (the pid stored in the lockfile no longer belongs to processname.sh)
RUNNING_PID=`cat $LOCKFILE 2>/dev/null`
if [ "x$RUNNING_PID" != "x" ] ; then
        RUNNING_NAME=`ps -p $RUNNING_PID -o comm= 2>/dev/null`
        if [ "x$RUNNING_NAME" != "xprocessname.sh" ] ; then
                rm -f $LOCKFILE
        fi
fi

# Acquire lock
lockfile $LOCKFILE
echo $$ > $LOCKFILE

echo whatever i am doing and plop in a log `date` >> /var/log/processname.log

...snip...

echo whatever i am doing is completed `date` >> /var/log/processname.log

# Release the lock
rm -f $LOCKFILE
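
If procmail isn't an option, a different technique gets a similar result: mkdir either creates the directory or fails, so it can double as a lock. This is a sketch of that swap, not of lockfile itself; acquire_lock and release_lock are made-up names, and unlike lockfile it does not wait and retry.

```shell
# acquire_lock / release_lock: hypothetical helpers built on mkdir,
# which is atomic: only one caller can create the lock directory.
acquire_lock() {
    # $1 = lock directory; returns 0 if we got the lock, 1 if someone holds it
    if mkdir "$1" 2>/dev/null; then
        echo $$ > "$1/pid"
        return 0
    fi
    return 1
}

release_lock() {
    # $1 = lock directory
    rm -rf "$1"
}

# usage:
#   acquire_lock /tmp/processname.lock.d || exit 0
#   ...do the work...
#   release_lock /tmp/processname.lock.d
```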

If you're doing a scad of stuff, rotate your logs by placing an appropriately named file in logrotate.d:

/var/log/processname.log /var/log/ohlookanotherprocessname.log {
    rotate 7
    daily
    missingok
    notifempty
    compress
}

Friday, March 2, 2012

sources.list for ubuntu 7.10

what an unoriginal title.

deb http://old-releases.ubuntu.com/ubuntu/ gutsy main restricted
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-updates main restricted
deb http://old-releases.ubuntu.com/ubuntu/ gutsy universe
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-updates universe
deb http://old-releases.ubuntu.com/ubuntu/ gutsy multiverse
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-updates multiverse
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-security main restricted
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-security universe
deb http://old-releases.ubuntu.com/ubuntu/ gutsy-security multiverse

Monday, February 27, 2012

installation of hpacucli on Ubuntu 10.04.4 LTS (Lucid) x86_64

it all began with a simple query:

do we have a write cache?

and down the rabbit hole I went...

# cat /proc/driver/cciss/cciss*

cciss0: HP Smart Array P410i Controller
Board ID: 0x3245103c
Firmware Version: 5.14
IRQ: 63
Logical drives: 1
Current Q depth: 0
Current # commands on controller: 3
Max Q depth since init: 9
Max # commands on controller since init: 318
Max SG entries since init: 31
Sequential access devices: 0

cciss/c0d0:     1799.79GB       RAID 5

yay. i guess. to administer this, i can either take the system offline and mess around on the controller. or! i can install the hp tool HP Array Configuration Utility CLI for Linux (hpacucli). it has the added bonus of being able to be called by nagios... but i'm getting ahead of myself. it works with the following controllers:

Smart Array 5312 Controller
Smart Array 5302 Controller
Smart Array 5304 Controller
Smart Array 532 Controller
Smart Array 5i Controller
Smart Array 641 Controller
Smart Array 642 Controller
Smart Array 6400 Controller
Smart Array 6400 EM Controller
Smart Array 6i Controller
Smart Array P600 Controller
Smart Array P400 Controller
Smart Array P400i Controller
Smart Array E200 Controller
Smart Array E200i Controller
Smart Array P800 Controller
Smart Array E500 Controller
Smart Array P700m Controller
Smart Array P410i Controller
Smart Array P411 Controller
Smart Array P212 Controller
Smart Array P712m Controller
Smart Array B110i SATA RAID
Smart Array P812 Controller
MSA500 Controller
MSA500 G2 Controller
MSA1000 Controller
MSA1500 CS Controller
MSA20 Controller

the tool is supplied on HP Support Pack CDs, if you've got them; but you can download a newer version from HP here; this links to hpacucli-8.50-6.0.noarch.rpm.
after downloading, we need to convert the rpm into a format that we can work with. alien does this for us in ubuntu... other tools are rpm2cpio & rpm2tgz. i like alien. apt-get it.

# alien --to-tgz hpacucli-8.50-6.0.noarch.rpm

alien will report some errors and warnings; in your source directory, you'll see hpacucli-8.50.tgz.

# tar -xzf hpacucli-8.50.tgz

Move the unpacked files to corresponding locations:

# mv opt/compaq /opt/
# mv usr/sbin/* /usr/sbin/

since i'm running an x86_64 box, i need to:

# apt-get install ia32-libs

hpacucli should run. it does.

# hpacucli
=> ctrl all show      

Smart Array P410i in Slot 0 (Embedded)    (sn: 5001438017EA3640)

we have a RAID controller in Slot 0. Good to know.

=> ctrl all show detail      

Smart Array P410i in Slot 0 (Embedded)
   Bus Interface: PCI
   Slot: 0
   Serial Number: xxxxxxxxxxxxxxxxx
   Cache Serial Number: xxxxxxxxxxxxxxxxx
   RAID 6 (ADG) Status: Disabled
   Controller Status: OK
   Chassis Slot: 
   Hardware Revision: Rev C
   Firmware Version: 5.14
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 3 secs
   Queue Depth: Automatic
   Monitor and Performance Delay: 60 min
   Elevator Sort: Enabled
   Degraded Performance Optimization: Disabled
   Inconsistency Repair Policy: Disabled
   Wait for Cache Room: Disabled
   Surface Analysis Inconsistency Notification: Disabled
   Post Prompt Timeout: 15 secs
   Cache Board Present: True
   Cache Status: Not Configured
   Accelerator Ratio: 100% Read / 0% Write
   Read Cache Size: 0 MB
   Write Cache Size: 0 MB
   Drive Write Cache: Disabled
   Total Cache Size: 912 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Capacitors
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: OK
   SATA NCQ Supported: True

   Array: A
      Interface Type: SAS
      Unused Space: 0 MB
      Status: OK

      Logical Drive: 1
         Size: 1.6 TB
         Fault Tolerance: RAID 5
         Heads: 255
         Sectors Per Track: 63
         Cylinders: 65535
         Stripe Size: 256 KB
         Status: OK
         Array Accelerator: Not Configured
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001C5D95C9C5A46D895F6036
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 243 MB
         OS Status: LOCKED
         Logical Drive Label: AE8582015001438017EA36402B33

      physicaldrive 1I:1:1
         Port: 1I
         Box: 1
         Bay: 1
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPD4
         Serial Number: xxxxxxxxxxxxxxxxx
         Model: HP      EG0300FBDSP     
         PHY Count: 2
         PHY Transfer Rate: 6.0GBPS, Unknown

well, it looks like write caching is not enabled. great.

=> ctrl slot=0 modify dwc=enable
=> ctrl slot=0 modify cacheratio=25/75
=> ctrl slot=0 logicaldrive 1 modify aa=enable
=> ctrl all show config detail

Smart Array P410i in Slot 0 (Embedded)
   Bus Interface: PCI
   Slot: 0
   Serial Number: xxxxxxxxxxxxxxxxx
   Cache Serial Number: xxxxxxxxxxxxxxxxx
   RAID 6 (ADG) Status: Disabled
   Controller Status: OK
   Chassis Slot: 
   Hardware Revision: Rev C
   Firmware Version: 5.14
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 3 secs
   Queue Depth: Automatic
   Monitor and Performance Delay: 60 min
   Elevator Sort: Enabled
   Degraded Performance Optimization: Disabled
   Inconsistency Repair Policy: Disabled
   Wait for Cache Room: Disabled
   Surface Analysis Inconsistency Notification: Disabled
   Post Prompt Timeout: 15 secs
   Cache Board Present: True
   Cache Status: Not Configured
   Accelerator Ratio: 100% Read / 0% Write
   Read Cache Size: 0 MB
   Write Cache Size: 0 MB
   Drive Write Cache: Enabled
   Total Cache Size: 912 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Capacitors
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: OK
   SATA NCQ Supported: True

   Array: A
      Interface Type: SAS
      Unused Space: 0 MB
      Status: OK

      Logical Drive: 1
         Size: 1.6 TB
         Fault Tolerance: RAID 5
         Heads: 255
         Sectors Per Track: 63
         Cylinders: 65535
         Stripe Size: 256 KB
         Status: OK
         Array Accelerator: Not Configured
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001C5D95C9C5A46D895F6036
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 243 MB
         OS Status: LOCKED
         Logical Drive Label: AE8582015001438017EA36402B33

      physicaldrive 1I:1:1
         Port: 1I
         Box: 1
         Bay: 1
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPD4
         Serial Number: xxxxxxxxxxxxxxxxx
         Model: HP      EG0300FBDSP     
         PHY Count: 2
         PHY Transfer Rate: 6.0GBPS, Unknown

crap. it didn't update. or did it?

=> exit

apparently this is a bug. or a feature. we need to exit the utility, and then start it up again for the changes to be reflected. of course.

# hpacucli
=> ctrl all show config detail

Smart Array P410i in Slot 0 (Embedded)
   Bus Interface: PCI
   Slot: 0
   Serial Number: xxxxxxxxxxxxxxxxx
   Cache Serial Number: xxxxxxxxxxxxxxxxx
   RAID 6 (ADG) Status: Disabled
   Controller Status: OK
   Chassis Slot: 
   Hardware Revision: Rev C
   Firmware Version: 5.14
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 3 secs
   Queue Depth: Automatic
   Monitor and Performance Delay: 60 min
   Elevator Sort: Enabled
   Degraded Performance Optimization: Disabled
   Inconsistency Repair Policy: Disabled
   Wait for Cache Room: Disabled
   Surface Analysis Inconsistency Notification: Disabled
   Post Prompt Timeout: 15 secs
   Cache Board Present: True
   Cache Status: OK
   Accelerator Ratio: 25% Read / 75% Write
   Drive Write Cache: Enabled
   Total Cache Size: 1024 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Capacitors
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: OK
   SATA NCQ Supported: True

   Array: A
      Interface Type: SAS
      Unused Space: 0 MB
      Status: OK

      Logical Drive: 1
         Size: 1.6 TB
         Fault Tolerance: RAID 5
         Heads: 255
         Sectors Per Track: 63
         Cylinders: 65535
         Stripe Size: 256 KB
         Status: OK
         Array Accelerator: Enabled
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001C5D95C9C5A46D895F6036
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 243 MB
         OS Status: LOCKED
         Logical Drive Label: AE8582015001438017EA36402B33
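
if you want to check the ratio from a script (nagios or otherwise) instead of reading the whole dump, here's a minimal sketch. write_cache_pct is a made-up name, and it assumes the "Accelerator Ratio" line looks exactly like the output above.

```shell
# write_cache_pct: hypothetical helper; reads the output of
# `hpacucli ctrl all show config detail` on stdin and prints the
# write share of the accelerator ratio (the 75 in "25% Read / 75% Write").
write_cache_pct() {
    sed -n 's|^ *Accelerator Ratio: *[0-9]*% Read / *\([0-9]*\)% Write.*|\1|p' | head -n 1
}

# usage: hpacucli ctrl all show config detail | write_cache_pct
```

it prints nothing at all if the line isn't there.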

and of course, we want to see that our once dismal performance is not so. download and run iozone. and then test throughput:

# iozone -t4 -I

note:
everyone says BBWC (battery-backed write cache) is a must have. but, as seen on this capacitor-backed cache, all is cool. we're running with FBWC (flash-backed write cache).
FBWC is a flash-based cache module that does not have the battery's limit on how long it can retain what is written to the module.

addendum:
i got ahead of myself with nagios. use the plugin here to monitor the state of the array.

and...
if someone else is doing something funny and manages to crash your new friend... you'll need to clean up...

Error: Another instance of ACU is already running (possibly a service). Please
terminate the ACU application before running the ACU CLI. Press ENTER to
exit.

delete the shared IPC that hpacucli left when it died.

# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      

------ Semaphore Arrays --------
key        semid      owner      perms      nsems     
0xffffffff 56890      root       0          1         

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    

then use ipcrm to remove the semaphore array with the semid you want:

# ipcrm -s 56890

and try to start hpacucli again.

postgresql and tmp

"Because PostgreSQL writes the write-ahead log to disk on every transaction commit using fsync(), and waits for that write to complete, users will see a huge performance boost if a write cache is used. Therefore, for performance and reliability, it is ideal if PostgreSQL can use a battery-backed write cache."

Moreover, Postgres recommends that if you are using RAID5, you should mount your /tmp dir on a spare drive if you have one.

But what do you do if you don't have a spare drive? And you're using everything for your RAID5 array?

mount /tmp to a 2G ram disk. of course.

Let's do it!

With any install, /tmp is usually always there. Usually. And since we're dealing with a DB, we want the data to be around, like just in case.

# mkdir /tmp <---- if it isn't there already.

Check and see if anyone is using /tmp ; if these are crucial daemons; I'd suggest stopping them.

Add this line to /etc/fstab in order to mount the drive at boot-time:

tmpfs           /tmp tmpfs      defaults,size=2048M 0 0

tmpfs, by virtue of being tmpfs, doesn't allocate all of that space in one go; only as needed. By default tmpfs will use up to half of your RAM; use free -m to see how much you have. It is also worth mentioning that you do not need to recreate tmpfs each time the system is rebooted; the fstab entry mounts a fresh, empty one at every boot.
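
A quick way to check the 2048M figure against that half-of-RAM ceiling, as a sketch. half_ram_mb is a hypothetical helper; it assumes a Linux /proc/meminfo, where MemTotal is reported in kB.

```shell
# half_ram_mb: hypothetical helper; reads /proc/meminfo-style text on stdin
# and prints half of MemTotal in MB (rounded down).
half_ram_mb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 2 }'
}

# usage: half_ram_mb < /proc/meminfo
```

If the number that comes back is smaller than 2048, pick a smaller size= for the fstab line.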

That being said, mount the new filesystem after adding its entry in /etc/fstab.

# mount /tmp

Check to see that it's mounted

# mount
# df -h

You should see the following in mount and df -h output:

tmpfs on /tmp type tmpfs (rw,relatime,size=2097152k)
tmpfs         2.0G  0.0G  2.0G   0% /tmp

Next we need to create a directory to store the backup copies of whatever we've got in /tmp. /var is as good a place as any.

# mkdir /var/tmp-bak

Create script /etc/init.d/tmp-bak:

#!/bin/sh
# /etc/init.d/tmp-bak
# restore /tmp from /var/tmp-bak at boot; sync it back on stop or on demand
#

case "$1" in
  start)
    echo "copying files from tmp-bak to tmp"
    rsync -av /var/tmp-bak/ /tmp/
    echo [`date +"%Y-%m-%d %H:%M"`] tmp restored from tmp-bak >> /var/log/tmp-bak_sync.log
    ;;
  sync)
    echo "synching files from tmp to tmp-bak"
    echo [`date +"%Y-%m-%d %H:%M"`] tmp synched to tmp-bak >> /var/log/tmp-bak_sync.log
    rsync -av --delete --recursive --force /tmp/ /var/tmp-bak/
    ;;
  stop)
    echo "synching files from tmp to tmp-bak"
    echo [`date +"%Y-%m-%d %H:%M"`] tmp synched to tmp-bak >> /var/log/tmp-bak_sync.log
    rsync -av --delete --recursive --force /tmp/ /var/tmp-bak/
    ;;
  *)
    echo "Usage: /etc/init.d/tmp-bak {start|stop|sync}"
    exit 1
    ;;
esac

exit 0

Now set tmp-bak to run at startup:

# update-rc.d tmp-bak defaults 00 99

As a good rule of thumb, place the sync process in /etc/crontab:

5 * * * * root        /etc/init.d/tmp-bak sync >> /dev/null 2>&1

Friday, February 24, 2012

cidr cheatsheet

sometimes you need to know a cidr mask. sometimes.

Netmask              Netmask (binary)                 CIDR     Notes
_____________________________________________________________________________
255.255.255.255  11111111.11111111.11111111.11111111  /32  Host (single addr)
255.255.255.254  11111111.11111111.11111111.11111110  /31  Point-to-point links only (RFC 3021)
255.255.255.252  11111111.11111111.11111111.11111100  /30    2  useable
255.255.255.248  11111111.11111111.11111111.11111000  /29    6  useable
255.255.255.240  11111111.11111111.11111111.11110000  /28   14  useable
255.255.255.224  11111111.11111111.11111111.11100000  /27   30  useable
255.255.255.192  11111111.11111111.11111111.11000000  /26   62  useable
255.255.255.128  11111111.11111111.11111111.10000000  /25  126  useable
255.255.255.0    11111111.11111111.11111111.00000000  /24 "Class C" 254 useable

255.255.254.0    11111111.11111111.11111110.00000000  /23    2  Class C's
255.255.253.0    (not a valid mask: the 1 bits have to be contiguous)
255.255.252.0    11111111.11111111.11111100.00000000  /22    4  Class C's
255.255.251.0    (not a valid mask)
255.255.250.0    (not a valid mask)
255.255.249.0    (not a valid mask)
255.255.248.0    11111111.11111111.11111000.00000000  /21    8  Class C's
255.255.240.0    11111111.11111111.11110000.00000000  /20   16  Class C's
255.255.224.0    11111111.11111111.11100000.00000000  /19   32  Class C's
255.255.192.0    11111111.11111111.11000000.00000000  /18   64  Class C's
255.255.128.0    11111111.11111111.10000000.00000000  /17  128  Class C's
255.255.0.0      11111111.11111111.00000000.00000000  /16  "Class B"

255.254.0.0      11111111.11111110.00000000.00000000  /15    2  Class B's
255.252.0.0      11111111.11111100.00000000.00000000  /14    4  Class B's
255.248.0.0      11111111.11111000.00000000.00000000  /13    8  Class B's
255.240.0.0      11111111.11110000.00000000.00000000  /12   16  Class B's
255.224.0.0      11111111.11100000.00000000.00000000  /11   32  Class B's
255.192.0.0      11111111.11000000.00000000.00000000  /10   64  Class B's
255.128.0.0      11111111.10000000.00000000.00000000  /9   128  Class B's
255.0.0.0        11111111.00000000.00000000.00000000  /8   "Class A"

254.0.0.0        11111110.00000000.00000000.00000000  /7
252.0.0.0        11111100.00000000.00000000.00000000  /6
248.0.0.0        11111000.00000000.00000000.00000000  /5
240.0.0.0        11110000.00000000.00000000.00000000  /4
224.0.0.0        11100000.00000000.00000000.00000000  /3
192.0.0.0        11000000.00000000.00000000.00000000  /2
128.0.0.0        10000000.00000000.00000000.00000000  /1
0.0.0.0          00000000.00000000.00000000.00000000  /0   IP space
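
if you'd rather compute a row than look it up, here's a minimal sketch in plain sh. the function names (prefix_to_mask, usable_hosts) are made up for this example, and the host counts follow the table's convention of subtracting network and broadcast addresses.

```shell
#!/bin/sh

# prefix_to_mask PREFIX: print the dotted-quad netmask for /PREFIX (0-32)
prefix_to_mask() {
    bits=$1
    mask=""
    for n in 1 2 3 4; do
        if [ "$bits" -ge 8 ]; then
            octet=255
            bits=$((bits - 8))
        else
            # set the top $bits bits of this octet, e.g. 4 -> 240
            octet=$((256 - (1 << (8 - bits))))
            bits=0
        fi
        mask="${mask}${mask:+.}${octet}"
    done
    echo "$mask"
}

# usable_hosts PREFIX: addresses left after removing network + broadcast
usable_hosts() {
    case $1 in
        32) echo 1 ;;   # single host
        31) echo 0 ;;   # as in the table; RFC 3021 allows 2 on point-to-point links
        *)  echo $(( (1 << (32 - $1)) - 2 )) ;;
    esac
}

prefix_to_mask 20
usable_hosts 20
```

run as-is, the last two lines print the mask and usable host count for a /20.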

                                   Net     Host    Total
Net      Addr                      Addr    Addr    Number
Class   Range      NetMask         Bits    Bits   of hosts
----------------------------------------------------------
A        0-127    255.0.0.0         8      24     16777216   (e.g. 114.0.0.0)
B      128-191    255.255.0.0      16      16        65536   (e.g. 150.0.0.0)
C      192-223    255.255.255.0    24       8          256   (e.g. 199.0.0.0)
D      224-239    (multicast)
E      240-255    (reserved)
F      208-215    255.255.255.240  28       4           16
G      216/8      ARIN - North America
G      217/8      RIPE NCC - Europe
G      218-219/8  APNIC
H      220-221    255.255.255.248  29       3            8   (reserved)
K      222-223    255.255.255.254  31       1            2   (reserved)

ref: RFC1375 & http://www.iana.org/assignments/ipv4-address-space
               http://www.iana.org/numbers.htm

----------------------------------------------------------

The current list of special use prefixes:
 0.0.0.0/8
 127.0.0.0/8
 192.0.2.0/24
 10.0.0.0/8
 172.16.0.0/12
 192.168.0.0/16
 169.254.0.0/16
 all D/E space

ref: RFC1918 http://www.rfc-editor.org/rfc/rfc1918.txt
       or     ftp://ftp.isi.edu/in-notes/rfc1918.txt
rfc search:   http://www.rfc-editor.org/rfcsearch.html
              http://www.ietf.org/ietf/1id-abstracts.txt
              http://www.ietf.org/shadow.html
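
a rough sketch for checking an address against the special-use list above, in plain sh. the helper names (ip_to_int, in_cidr, is_special_use) are invented for this example, "all D/E space" is written as 224.0.0.0/3, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh

# ip_to_int A.B.C.D: print the address as one integer
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NET/PREFIX: succeed if IP falls inside the prefix
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    # shift the host bits away so only the network bits are compared
    shift_by=$((32 - prefix))
    [ $((ip >> shift_by)) -eq $((net >> shift_by)) ]
}

# is_special_use IP: print the matching prefix, or fail if none matches
is_special_use() {
    for p in 0.0.0.0/8 127.0.0.0/8 192.0.2.0/24 10.0.0.0/8 \
             172.16.0.0/12 192.168.0.0/16 169.254.0.0/16 224.0.0.0/3; do
        if in_cidr "$1" "$p"; then
            echo "$p"
            return 0
        fi
    done
    return 1
}

is_special_use 10.128.10.66
```

the last line prints the prefix that matched; an address outside the list prints nothing and returns non-zero.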

Thursday, February 23, 2012

how's nfs' latency today?

let's check it out with tshark & iostat.

$ tshark -q -z rpc,rtt,100003,3,'nfs.nfsstat3!=70'

or... put something in a pcap file.

$ tshark -nlr nfs.pcap -R "rpc.time>0.5"

or... you can use iostat.

# iostat -x -n

Tuesday, February 7, 2012

my teeth chattr

need i say more?

#!/bin/sh

# changes ext2 or ext3 file attributes

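# usage: $0 [on|off]    (no argument just lists the current attributes)
for file in resolv.conf passwd shadow group motd hosts hostname
do
    # plain "=" is the portable test operator; "==" breaks under a strict /bin/sh
    if [ "$1" = "" ]    ; then lsattr    /etc/$file ; fi
    if [ "$1" = "on" ]  ; then chattr +i /etc/$file ; fi
    if [ "$1" = "off" ] ; then chattr -i /etc/$file ; fi
done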

Thursday, February 2, 2012

macos 10.7.2 dmg to iso

sigh. you need a disk made from a dmg that you can ferry around, say to a xen box. turn it into an iso and away you go.

# hdiutil convert your.dmg -format UDTO -o your.iso
# mv your.iso.cdr your.iso

you could do this through Disk Utility, but the command line is always better.

Wednesday, February 1, 2012

solaris 9 notes

you see:
snmpXdmid: Error in Adding Row for Subscription Table Entry

Disable it...

   cd /etc/rc3.d
   ./S76snmpdx stop
   ./S77dmi stop
   mv S76snmpdx s76snmpdx
   mv S77dmi s77dmi

Friday, January 27, 2012

annoying pkgadd dependency chains be gone

one of the bum deals about pkgadd - sun's answer, i guess, to rpm installs - is that an install can fail if you don't have all the right dependencies. however, some coolio folks wrote a util that downloads and checks dependencies if you're grabbing open software from opencsw.

here's how to get pkgutil and install, say libstdc++5

# pkgadd -d http://get.opencsw.org/now
# vi ~/.bash_profile 

add /opt/csw/bin to your path
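
something like this in ~/.bash_profile would do it. it's only a sketch; the case guard is my own addition so the entry doesn't pile up every time the profile gets sourced.

```shell
# append /opt/csw/bin to PATH only if it is not already there
case ":$PATH:" in
    *:/opt/csw/bin:*) ;;                 # already present, nothing to do
    *) PATH="$PATH:/opt/csw/bin" ;;
esac
export PATH
```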

# pkgutil -i libstdc++5

crap.  the library installs to /opt/csw/lib.

# ln -s /opt/csw/lib/libstdc++.so.5 /usr/lib
# ln -s /opt/csw/lib/libstdc++.so.5.0.5 /usr/lib

Monday, January 23, 2012

likewise, ms sfu + 2307 attributes & ldap

disgusting.

so, as you're probably aware, likewise-open is a nifty tool for getting authentication of linux and linuxesque boxes to active directory. likewise-open is placed in the ecosystem where admins simply need authentication and home directories mounted. it uses an internal hash mechanism to auto-generate uids and gids from user sids in active directory; so, in essence, all across an enterprise, the likewise-open uids and gids will be the same. okay. sure.

but what about mixed el-cheapo shops?

my problem was the following:
i have a windows active directory domain and i have a linux-based openldap system. i've invested heavily in both, so, i'm really not in the mood to retire or re-tool the linux side of the house. windows, sure. the end goal is to have a linux machine join active directory and be able to authenticate windows users preserving openldap uid and gids.

i do not want to use samba, i do not want to use winbind, i do not want to use likewise-open weird hash mechanisms. i do want to use RFC 2307 attributes.

microsoft ad's nice, as there's actually a schema extension that enables an admin to have unix uids and gids. this is accessible once idmu extensions are rendered visible and server for nis is installed. oh yes.

here's what i did:
1. on ms server 2003 ad controller, installed ms sfu 3.5 server for nis.
2. ditto, installed ms idmu extensions.
3. opened my ldap db and took note of my user uids and gids.
4. i now have something called, "services for unix authentication"
the domain is the short nt-name for my ad domain. nice.
5. my ad entries now have the nifty tab, "UNIX Attributes"
6. added the proper uid & gid information as gleaned from ldap to each of my ad records.
i don't have many users to think about, so doing this by hand is a piece of cake.
7. on a linux box, i did the usual likewise-open installation.
we really just want the kerberos ticket generation stuff, so we don't have to
go to an ad server and run kerberos ticket utilities and the like. turn-key is
the name of the game.
8. edited several key files... ldap.conf, nsswitch.conf, krb5.conf

ldap.conf: we're pointing to the ad controller. we have cool rfc 2307 attributes defined here, too.
nsswitch.conf: remove lsass entries, it'll only prove to confuse things.
krb5.conf: get the ad controller in there.

just for fun, do an ldap search against your ad controller with a bind account. you
know and i know that ad will not allow searches by anonymous users. having ntp have its
time source set by the ad controller would be awesome, too.

here's a nice search:

# ldapsearch -x -D "notme@not.there.com" -w badpassword -h 10.0.0.1

you should see:

# extended LDIF
#
# LDAPv3
# base <> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# search result
search: 2
result: 10 Referral
text: 0000202B: RefErr: DSID-031006E0, data 0, 1 access points
        ref 1: 'not.there.com'

ref: ldap://not.there.com/dc=not,dc=there,dc=com

# numResponses: 1

here's what my conf files look like:

ldap.conf

host 10.0.0.1
base dc=not,dc=there,dc=com
uri ldap://10.0.0.1/
binddn notme@not.there.com <--- ad doesn't like the whole cn dn deal all the time.
bindpw badpassword
scope sub
bind_timelimit 15
timelimit 15
ssl no
referrals no
nss_base_passwd cn=Users,dc=not,dc=there,dc=com?sub
nss_base_shadow cn=Users,dc=not,dc=there,dc=com?sub
nss_base_group cn=Users,dc=not,dc=there,dc=com?sub?&(objectCategory=group)(gidnumber=*)
nss_map_objectclass posixAccount user
nss_map_objectclass shadowAccount user
nss_map_objectclass posixGroup group
nss_map_attribute gecos cn
nss_map_attribute homeDirectory unixHomeDirectory
nss_map_attribute uniqueMember member
nss_initgroups_ignoreusers ldap

nsswitch.conf

passwd: compat ldap lsass <---- remove
group:  compat ldap lsass <---- remove

hosts:  files dns
networks:       files dns

services:       files ldap
protocols:      files
rpc:    files
ethers: files
netmasks:       files
netgroup:       files ldap
publickey:      files

bootparams:     files
automount:      files nis
aliases:        files ldap
#passwd_compat: ldap
#group_compat:  ldap

krb5.conf

[libdefaults]
        default_realm = NOT.THERE.COM 
        default_keytab_name = /etc/krb5.keytab 
        default_tgs_enctypes = RC4-HMAC DES-CBC-MD5 DES-CBC-CRC 
        default_tkt_enctypes = RC4-HMAC DES-CBC-MD5 DES-CBC-CRC 
        preferred_enctypes = RC4-HMAC DES-CBC-MD5 DES-CBC-CRC 
        dns_lookup_kdc = true 
        pkinit_kdc_hostname =  
        pkinit_anchors = DIR:/var/lib/likewise/trusted_certs 
        pkinit_cert_match = &&msScLogin 
        pkinit_eku_checking = kpServerAuth 
        pkinit_win2k_require_binding = false 
        pkinit_identities = PKCS11:/opt/likewise/lib/libpkcs11.so 

[realms]
        NOT.THERE.COM = {
                auth_to_local = RULE:[1:$0\$1](^NOT\.THERE\.COM\\.*)s/^NOT\.THERE\.COM/NOT/
                auth_to_local = DEFAULT
                kdc = adserver.not.there.com
                admin_server = adserver.not.there.com
        }

[logging]
    kdc = FILE:/var/log/krb5/krb5kdc.log
    admin_server = FILE:/var/log/krb5/kadmind.log
    default = SYSLOG:NOTICE:DAEMON
[domain_realm]
  .not.there.com = NOT.THERE.COM 
[appdefaults]
        pam = {
   mappings = NOT\\(.*) $1@NOT.THERE.COM 
   forwardable = true
   validate = true
        }
        httpd = {
   mappings = NOT\\(.*) $1@NOT.THERE.COM 
   reverse_mappings = (.*)@NOT\.THERE\.COM NOT\$1
        }
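
once the three files are in place, a quick sanity check is whether the uid that nss hands back matches the one you noted from openldap. a minimal sketch: check_uid is a made-up helper, and the user name and uid in the usage line are placeholders, not values from my setup.

```shell
#!/bin/sh

# check_uid USER EXPECTED_UID: compare what nss resolves against the openldap value
check_uid() {
    actual=$(id -u "$1" 2>/dev/null)
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 -> $actual"
    else
        echo "mismatch: $1 resolved to '${actual:-nothing}', expected $2"
        return 1
    fi
}

# hypothetical usage: check_uid someuser 1001
check_uid root 0
```

if an ad user comes back with a big hashed-looking number instead of the openldap uid, lsass is probably still answering in nsswitch.conf.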

Tuesday, January 17, 2012

osx 10.7.2 openldap authentication

MacOSX 10.7.2 LDAP authentication

0. Enable root.
* Go to a terminal prompt
* ~ sudo su - root
* type your password
* You're root!
* # passwd
* create a password for root.
* Log off
* Log on as root. Yes.

1. Add LDAPv3 Directory access
* Open Directory Access from /Applications/Utilities or under System Preferences > Users & Groups.

* Click the Lock on the bottom of the window.

* Click on LDAPv3 then click Configure

* Select Options then click Add
Enter a configuration name; e.g. myldap
Server Name: LDAP server canonical dns or IP address; e.g. myldap.my.com
Click on LDAP Mappings and select RFC 2307 (Unix)
For search base, put in your LDAP search base; e.g. dc=my,dc=com
Do not check SSL

* Click edit and make sure all settings are at either their default or match your environment
Under Search and Mappings, if you're using a stock OpenLDAP install, it is safe to have a "Search in"
all subtrees set. This is recommended.
Check all Record types and attributes.
When done, Save Template. Somewhere.
Click OK, and OK again.

* At the Directory Access window, go to "Search Policy" and click on Authentication.
You're now going to add a Directory domain.
Select Custom Path

2. Add the LDAPv3 server you just configured. Click the + and add /LDAPv3/Server Name
Keep /Local/Default at the top; otherwise you won't be able to log on with a local user account.
Once done, test your LDAP configuration by going to Directory Editor (also in Directory Access).
In the search box, search for a known account.

Did you mess up?

Check /var/log/system.log for -14002 errors.

1. Remove all contents of directory /Library/Preferences/DirectoryService
2. Open /Applications/Utilities/Netinfo Manager and remove contents of directory /config/mcx-mask

If not, time to allow logons.

There's a bug in OSX 10.7.2 not allowing LDAP users to logon. Nice. Let's fix that.

1. As root...
# ldapsearch -x -h myldap.my.com -b "" -s base "(objectclass=*)" supportedSASLMechanisms

You should see something akin to:
supportedSASLMechanisms: NTLM
supportedSASLMechanisms: GSSAPI
supportedSASLMechanisms: DIGEST-MD5
supportedSASLMechanisms: CRAM-MD5

This shows you the sort of authentication mechanisms your LDAP server supports.
Let's make OSX deny these SASL mechanisms - even if your LDAP server isn't using them.

2. Open the opendirectoryd plist for your LDAPv3 server in /Library/Preferences/OpenDirectory/Configurations/LDAPv3,
and add all of the advertised SASL mechanisms garnered from above to the Denied SASL Methods array in the plist file. Simply browse
to the file, double click and use xcode to edit.

Add the items here:
module options > ldap > Denied SASL Methods
add string items. Add the strings exactly as provided by your LDAP server.

3. Reboot the OSX machine and you'll then be able to log on using an LDAP-defined user.
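
To get the exact strings for the Denied SASL Methods array without copying them by hand, the ldapsearch output from step 1 can be filtered down to the mechanism names. This is only a sketch; sasl_mechs is a helper name invented here.

```shell
#!/bin/sh

# sasl_mechs: read ldapsearch output on stdin, print one mechanism name per line
sasl_mechs() {
    awk '$1 == "supportedSASLMechanisms:" { print $2 }'
}

# Against a live server (same query as step 1):
#   ldapsearch -x -h myldap.my.com -b "" -s base "(objectclass=*)" supportedSASLMechanisms | sasl_mechs

# Demo on canned output so the filter can be tried without a server:
printf 'dn:\nsupportedSASLMechanisms: GSSAPI\nsupportedSASLMechanisms: DIGEST-MD5\n' | sasl_mechs
```

Each printed line is one string item to add to the array.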

Friday, January 13, 2012

likewise & netapp lessons learned

just so that i remember, here're some unsanitized notes.

the environment:
a mess of linux boxes, a group of windows systems and a netapp. active directory is the backend authentication mechanism.

the end goal:
authenticate linux/macos users to active directory and access home directories on the netapp.

likewise...
install likewise however you'd like. then...
afterward:
/opt/likewise/bin/lwconfig --detail AssumeDefaultDomain 
/opt/likewise/bin/lwconfig AssumeDefaultDomain true 
/opt/likewise/bin/lwconfig --show AssumeDefaultDomain 
/opt/likewise/bin/lwconfig LoginShellTemplate /bin/bash

/opt/likewise/bin/lwconfig --show HomeDirTemplate 
/opt/likewise/bin/lwconfig HomeDirPrefix /home 
/opt/likewise/bin/lwconfig HomeDirTemplate %H/%U 
/opt/likewise/bin/lwconfig CreateHomeDir false

in /etc/group:
admin:x:115:DOMAIN\me

in /etc/sudoers:
DOMAIN\\domain^admins ALL=(ALL) ALL

netapp...
netapp must have following:
qtree security /vol/silly_home unix

options cifs.signing.enable off
options cifs.nfs_root_ignore_acl on

passwd must have the uid of the windows user per likewise; e.g.
me::1952501801:1952501801::/:
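
to avoid typos in those long hashed ids, the line can be built on a joined linux box. a sketch only: netapp_passwd_line is a made-up helper, and it assumes the netapp wants the same numeric uid/gid that likewise hands out on the client, as in the example line above.

```shell
#!/bin/sh

# netapp_passwd_line USER: print a passwd entry in the format shown above,
# using the uid/gid the local system resolves for USER
netapp_passwd_line() {
    printf '%s::%s:%s::/:\n' "$1" "$(id -u "$1")" "$(id -g "$1")"
}

# hypothetical usage on a joined client: netapp_passwd_line me
netapp_passwd_line root
```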

* check using wcc -a & wcc -u
if not set, then user will be mapped to pcuser and unable to use nfs share.
UNIX uid = 65534

in usermap.cfg have a domain admin mapped as unix root:
DOMAIN\me <= root

nfs export must be long, not truncated; e.g.:
/vol/silly_home  -sec=sys,rw

client machine must mount long nfs export:
netapp:/vol/silly_home /home      nfs         defaults        0 0

problems with cifs?  turn on logging; shows up on the console.
options cifs.trace_login on

OSX 10.7.2 addendum

Since /Users is probably in use by local accounts, it would be best to mount 
the export to the place specified above (in our case /home).
OSX 10.7.2 does not have fstab.  Here's what you do:

Become root.
~ sudo su - root
As root...
# touch /etc/fstab
# vi /etc/fstab
Add the following:
netapp:/vol/silly_home /home      nfs         auto        0 0
# mount -a
Voila.