June 2011 Archives

Solaris in virtual machines: init 3, please!

I got tired of trying to negotiate a default graphical environment with Solaris 10, so I went right to the source and stopped that stuff:

svcadm disable /application/graphical-login/cde-login

Blammo. X server dies, stays off, and machine reverts to text-mode console.

The text console is much easier to work with in an emulated or remote environment, where you're lucky to get five frames per second of video and the emulated mouse is seriously misaligned with the actual mouse cursor.

Linux, oh Linux, the bane of my existence

At $dayjob, we, like so many other companies, use Fibre Channel LUNs to provide stable, dependable storage to servers. From organization to organization, the use of LUNs varies. Some organizations simply place the file system on the raw device; others insist on a BSD disklabel or DOS partition table. In the Solaris world, the disk may be equipped with a disklabel and a label of several characters, which the format command displays with the disk description.

As part of a routine server deployment, I had several small LUNs 'attached' to several servers in a one-to-one configuration. However, after mkfs'ing the raw device, I discovered that the manager's preference was to install a DOS partition table and mkfs the partition. While updating the LUN to match the manager's expectations, I accidentally skipped updating /etc/fstab and issued the always fateful "reboot" command. The system came back up, failed to check /dev/sdc, and promptly stopped booting, waiting for the root password to enter maintenance mode. After entering the root password, I found I could not update /etc/fstab because the filesystem was mounted read-only. vim was very aware of this, as were command-line tools (e.g., touch), but mount reported that the root filesystem was mounted read-write. After a few minutes of Googling and searching the old neural filing cabinets, I arrived at a few parameters to place on the kernel line while booting from GRUB:

rescue
single
and so on.

But none of these delivered the result I was seeking. I needed the root to be read-write, and I was NOT going to search out a CD to boot from. Booting from the CD would have allowed me to use 'rescue' to mount the filesystem read-write and update /etc/fstab, but I adamantly knew in my brain that there was NO WAY that it was acceptable to *have* to have a Linux CD on hand to fix the problem.

Back to the Googles, I struck a bit of magic with some keywords. The magic phrase is this:

mount -t ext3 -o remount,rw /

Once completed, vim would edit the file, save the changes, and allow the system to be returned to service without performing some hack to boot the CD image on the server.
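A likely reason mount claimed the root was read-write: on systems of this vintage, mount with no arguments prints /etc/mtab, and that file cannot be updated while the root is read-only. The kernel's own view lives in /proc/mounts. Here is a minimal sketch for checking that view before and after the remount. The function name root_mount_mode is made up for illustration, and the optional file argument exists only so the function can be tried against a saved copy of the table.

```shell
# Print "ro" or "rw" for the root filesystem according to a mounts table.
# Defaults to /proc/mounts, which the kernel maintains itself.
# root_mount_mode is a hypothetical helper name, not a standard tool.
root_mount_mode() {
    awk '$2 == "/" {
             # Field 4 holds the comma-separated mount options.
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
         }
         # If "/" is listed more than once, the last entry wins.
         END { print mode }' "${1:-/proc/mounts}"
}

# Usage (as root, in maintenance mode):
#   root_mount_mode        # check the kernel's view
#   mount -t ext3 -o remount,rw /
#   root_mount_mode        # check again
```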

I picked up a few other things along the way:

fdisk /dev/sdc
     n
     p
     1
     <Enter>
     <Enter>
     w
mkfs -t ext3 -m 1 /dev/sdc1
tune2fs -i 10000 /dev/sdc1

(add LUN to /etc/fstab as /slice)
mkdir /slice
fsck -p /dev/sdc1
mount /slice

/slice would be the name in /etc/fstab for the LUN

mkfs -m 1 sets the percentage of blocks reserved for root to 1% of the disk space. I set this because our users usually fill the disk and keep it full.

tune2fs -i 10000 sets the fsck interval to 10,000 days. As a fan of the band Tool, I know that this is a period of time of some twenty-seven years; at the least this directive, while not recommended, will keep the system from pausing on startup for an unexpected fsck.
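For anyone who is not a Tool fan, the arithmetic is easy to check. A small sketch, assuming an average year of 365.25 days; days_to_years is just an illustrative name:

```shell
# Convert a number of days to years, using 365.25 days per year (an assumption).
days_to_years() {
    awk -v d="$1" 'BEGIN { printf "%.1f\n", d / 365.25 }'
}

# Usage:
#   days_to_years 10000
```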

Experience has shown that these fscks always happen at unanticipated times, especially when the system must be returned to service as soon as possible.
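Since a boot-time fsck on a stale fstab entry is exactly what stalled the server above, it may help to list which entries will be checked before typing "reboot". A rough sketch follows: it prints the device and mount point of every fstab line whose sixth field (the fsck pass number) is non-zero. The function name fstab_checked_entries is hypothetical, and the sample entry in the comment is an assumption about what the /slice line might look like.

```shell
# List "device mountpoint" for fstab entries that fsck will check at boot,
# i.e. entries whose sixth field (fs_passno) is non-zero.
# A /slice entry might look like this (options and pass number are assumptions):
#   /dev/sdc1  /slice  ext3  defaults  0  2
fstab_checked_entries() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 != 0 { print $1, $2 }' "${1:-/etc/fstab}"
}

# Usage:
#   fstab_checked_entries             # review the list before rebooting
#   fstab_checked_entries /tmp/fstab  # or inspect a draft copy
```

Eyeballing that list for a raw device such as /dev/sdc where a partition such as /dev/sdc1 is expected takes a few seconds and saves a trip to the console.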

Good luck, keep calm, and carry on.

The Worst Possible Idea Must Be Implemented

No, I don't mean this in a sarcastic sense. I mean that when you have the worst possible idea, you must implement it. In my case, it was while building a Jumpstart server after building a Kickstart server. I have a Dell Optiplex GX1p (450MHz) missing most of the plastic (so it's really more of a Dell T1000 Optiplex). While DBAN'ing any number of SCSI and IDE hard drives (of almost every interface variety imaginable), the Optiplex would routinely reconfigure its BIOS to change the boot order. After enough times of overriding it and seeing the PXE boot banners go by, I figured, what the hell?

Download pfSense and Debian Linux, set up a router and firewall in a virtual machine, and provision another DHCP and TFTP server for PXEBOOT. Download a copy of DBAN. Put in blender. Three days later (and all typos removed one by one), I have DBAN in the boot menu for the PXEBOOT server. And it's set for auto-timeout. Yes, the worst possible idea must be implemented. Plug into my network without asking me, and you had better hope that your box isn't configured to PXEBOOT. >=D

Original idea comes from Joako over at DSL Reports and this post: http://www.dslreports.com/forum/r24834879-How-To-PXE-Boot-DBAN.

However, I found I kept having issues with DBAN booting. Finally, I cut it down to as few menu items as possible:

label DBAN
        MENU LABEL DBAN
        kernel dban/dban.bzi
        append nuke="dwipe"


label DBANautonuke
        MENU LABEL DBAN Autonuke
        kernel dban/dban.bzi
        append nuke="dwipe --autonuke" silent


And thus it was dangerous. Once ONTIMEOUT is set to DBANautonuke and DEFAULT set to DBANautonuke, only the TIMEOUT value saves you. Timeout is in tenths of seconds, so a TIMEOUT of 200 is 20 seconds.
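Because the TIMEOUT value is the only safety margin, it is worth double-checking what it works out to in seconds. A small sketch, assuming the menu lives in a file such as pxelinux.cfg/default under the tftpd directory; pxe_timeout_seconds is an illustrative name. The match ignores case on the understanding that the keyword may be written in either case.

```shell
# Print the TIMEOUT value of a pxelinux config file in seconds.
# TIMEOUT is given in tenths of a second, so divide by 10.
pxe_timeout_seconds() {
    awk 'toupper($1) == "TIMEOUT" { printf "%.1f\n", $2 / 10 }' "$1"
}

# Usage (the path is an assumption; use wherever your tftpd serves from):
#   pxe_timeout_seconds /srv/tftp/pxelinux.cfg/default
```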

So the recap here is that you configure a PXEBOOT server in your favorite fashion (everyone uses different paths, pick a tutorial (RedHat, IBM, etc.) and go through it), mount the DBAN CD on a mountpoint (/media/cdrom or /mnt or /cdrom or whatever) and copy over the contents of the CD to your tftpd home directory. I tar'd the files up into a dban.tgz tarball and put them into a directory named "dban". dban/isolinux.cfg tells you everything you need to plug into the pxelinux.cfg/default file for menu.c32 to know about.

Good luck, and happy trails.

The other fun part of this was configuring pfSense to do some routing and firewall work. Where I usually work, DHCP servers are verboten, so one must take a few precautions to make sure that evil DHCP packets aren't forwarded. Also, if there is an Internet-facing port, one must take pains to ensure that packets go in the correct direction. Download the pfSense install CD image, fire up VMware, and install pfSense to the hard drive of the virtual machine. Give the virtual machine four interfaces: LAN, WAN, OPT1, and OPT2. LAN is 172.16.0.1/24, WAN is the internets feed, and OPT1 is a /30 (172.16.10.1/30) to the ethernet switch (172.16.10.2/30) for out-of-band management. pfSense handles the NAT and knows about routing.

In another virtual machine, a Debian Linux server sits with two ethernet cards. But I ran out of ethernet ports on the physical hardware, so the PXEBOOT server is a one-armed router. To accomplish this feat against ISC-DHCPD3's best wishes, I had to think a bit.

My first hurdle was getting interface aliases configured to start automatically in Debian. Not so difficult with The Googles:

/etc/network/interfaces:

iface eth0 inet static
    address 172.16.0.2
    netmask 255.255.255.128
    network 172.16.0.0
    broadcast 172.16.0.127
    gateway 172.16.0.1
    dns-nameservers 172.16.0.1
    dns-search lan

auto eth0:1
iface eth0:1 inet static
    address 172.16.0.129
    netmask 255.255.255.128

So 172.16.0.1 is the pfSense router (with DHCP turned off and DNS Forwarder on), 172.16.0.2 is the "outside" IP of the PXEBOOT server, and 172.16.0.129 is the "inside" IP of the PXEBOOT server.
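With a 255.255.255.128 netmask, the wire is split into two halves: 172.16.0.0 through 172.16.0.127 and 172.16.0.128 through 172.16.0.255, and the broadcast address of each half is its last address. A quick sketch for checking that by hand, using only shell arithmetic; broadcast_addr is a made-up helper name:

```shell
# Print the broadcast address for a dotted-quad address and netmask.
# The function body runs in a subshell so the IFS change does not leak out.
broadcast_addr() (
    set -f          # no filename expansion while splitting
    IFS=.
    set -- $1 $2    # $1..$4 = address octets, $5..$8 = netmask octets
    # broadcast octet = address octet OR inverted netmask octet
    printf '%d.%d.%d.%d\n' \
        $(( $1 | (255 - $5) )) $(( $2 | (255 - $6) )) \
        $(( $3 | (255 - $7) )) $(( $4 | (255 - $8) ))
)

# Usage:
#   broadcast_addr 172.16.0.2   255.255.255.128   # "outside" half
#   broadcast_addr 172.16.0.129 255.255.255.128   # "inside" half
```

The first call prints 172.16.0.127 and the second prints 172.16.0.255.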

Then DHCPD had to complain:

pxeboot:/etc/dhcp3# /usr/sbin/dhcpd3
Internet Systems Consortium DHCP Server V3.1.1
Copyright 2004-2008 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/
Wrote 3 leases to leases file.
Interface eth0 matches multiple shared networks
pxeboot:/etc/dhcp3#

What the deuce?

Googling was marginally useful. I've seen this problem before, however. I couldn't remember the exact reasoning behind why it behaves this way, but I remembered reading a message from Paul Vixie that explained the rationale or behavior. And I believe it was a compile-time option or some oddball flag that one had to set to get rid of the error message. After racking my brain for a bit, I settled on the idea that DHCPD had too much information. So I commented out the first subnet definition (172.16.0.0 netmask 255.255.255.128):

#subnet 172.16.0.0 netmask 255.255.255.128 { }
subnet 172.16.0.128 netmask 255.255.255.128 {
        range 172.16.0.130 172.16.0.253;
        default-lease-time 14400;
        max-lease-time 38400;
        option subnet-mask 255.255.255.128;
        option broadcast-address 172.16.0.255;
        option routers 172.16.0.129;
# comment below out if the machine's name will be something else.
        option domain-name "lan";
        filename "pxelinux.0";
        next-server 172.16.0.129;
}

Of course, the PXEBOOT server still has the default route set to 172.16.0.1, so if packets get there, pfSense should know what to do with them (and if it doesn't, I don't care because this server was designed for DEATH ;). In all seriousness, this server will have to coexist in the near future with a Jumpstart server, so the ability to alter DHCP capabilities is welcomed.
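If the first subnet ever needs to be declared again (for the Jumpstart server, say), ISC dhcpd has a shared-network declaration for describing two IP subnets that live on the same physical wire. The fragment below is only a sketch of that alternative, not what was done above and not tested on this setup; the name "pxe-wire" is an arbitrary label chosen for illustration.

```
# Hypothetical alternative: tell dhcpd that both subnets share one wire.
shared-network pxe-wire {
        subnet 172.16.0.0 netmask 255.255.255.128 { }
        subnet 172.16.0.128 netmask 255.255.255.128 {
                range 172.16.0.130 172.16.0.253;
                option routers 172.16.0.129;
                filename "pxelinux.0";
                next-server 172.16.0.129;
        }
}
```

Only the second subnet carries a range, so leases would still come from the "inside" half.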

So the PXEBOOT server is a one-armed router/DHCP server, serving out DHCP for a network that isn't routable while another network on the same wire is. And the pfSense firewall provides inward connectivity for SSH, while keeping off-LAN users out. Through the OPT1 interface, some NAT work and a firewall rule, telnet and the web interface for the HP Procurve 4000 are made accessible to the LAN, while denying remote users the ability to get to it.

Some of the complicated stuff I enjoy, other things I needlessly complicate. But for good reason. =D

About this Archive

This page is an archive of entries from June 2011 listed from newest to oldest.
