Solaris Live Upgrade on SunFire T1000/T2000 servers

The process of installing and configuring Live Upgrade on Solaris is pretty well covered at this stage but, as ever with Solaris, there is always something that is not covered in the many existing documents and blog postings on the subject. So, here is an overview of what I did, what worked, what did not work and how I overcame the bits that did not work.

Setup

I wanted to use Live Upgrade to ease the painful patching process on Solaris (not to upgrade the operating system). I tried this on a SunFire T1000 and a SunFire T2000, both running Solaris 10 U3 (11/06). In both cases, the systems contained two existing 10GB UFS file systems (/ on c0t0d0s0 and /opt on c0t0d0s3) as well as a swap partition (c0t0d0s1).

What Did Work

Following the Live Upgrade documentation from the Sun website, I narrowed the commands down to the following steps:

Create a new partition to accommodate the alternative boot environment
# format
Create a Live Upgrade configuration
# lucreate -c solaris_env_1 -n solaris_env_2 -m /:/dev/dsk/c0t0d0s4:ufs
Verify that it was created successfully
# lustatus
# lucurr

Download any available patches for this system
# smpatch download
Install the downloaded patches in the alternative boot environment
# smpatch update -b solaris_env_2
Reboot the system in the recommended manner
# init 6

What Did Not Work

No matter how many times I tried to reboot the system (in full accordance with the instructions in the documentation), it simply refused to boot from the new/alternative boot environment. Each time, the lustatus command showed that both boot environments were correctly configured and that the alternative environment (solaris_env_2) was scheduled to boot at the next reboot. However, it never did boot from the alternative environment.
Apparently, when the patch process completes successfully (or whenever you use the luactivate command), the system should automatically update the boot-device EEPROM environment variable (see the Solaris eeprom command). On my systems, that update evidently did not happen.

Workaround

I found that the only way around this problem was to manually adjust the boot-device environment variable from the system console each time you need to boot an alternative boot environment. This was done (from the system console) as follows:

Shut down the system
# init 0
Determine the appropriate setting for the boot-device variable
ok setenv auto-boot? false
ok reset-all
ok probe-scsi-all
ok show-disks

Using the output from the above commands, determine a new value for the boot-device variable. For example:
ok setenv boot-device /pci@7c0/pci@0/pci@8/scsi@2/disk@0,0:a
Verify that the system still boots from the old/existing environment
ok boot
If so, you should now be able to drop back to the boot prompt and set the boot-device variable to the value that matches the Live Upgrade partition you created earlier (in my case, c0t0d0s4 or disk@0,0:e)
ok setenv boot-device /pci@7c0/pci@0/pci@8/scsi@2/disk@0,0:e
Also, you might like to return auto-boot setting to its normal value before booting the system again
ok setenv auto-boot? true
ok boot
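
The link between a Solaris slice number and the letter at the end of the OpenBoot device path (slice 0 is :a, slice 4 is :e) is easy to get wrong at the ok prompt. Here is a minimal sketch of that mapping as a shell helper. The function names are my own invention for illustration, and the disk path is the one from my system, so yours will almost certainly differ.

```shell
# Hypothetical helper: map a Solaris slice number (0-7) to the
# OpenBoot partition letter (a-h), e.g. s0 -> a, s4 -> e.
slice_to_obp_letter() {
    case "$1" in
        0) echo a ;;
        1) echo b ;;
        2) echo c ;;
        3) echo d ;;
        4) echo e ;;
        5) echo f ;;
        6) echo g ;;
        7) echo h ;;
        *) echo "slice must be between 0 and 7" >&2; return 1 ;;
    esac
}

# Hypothetical helper: join a disk path (as reported by show-disks)
# and a slice number into a value suitable for boot-device.
obp_boot_device() {
    letter=$(slice_to_obp_letter "$2") || return 1
    echo "$1:$letter"
}

# Example using the disk path from this article and slice 4 (c0t0d0s4).
obp_boot_device "/pci@7c0/pci@0/pci@8/scsi@2/disk@0,0" 4
```

The value it prints is what you would type after "setenv boot-device" at the ok prompt.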

Best of luck!

Locking down your Solaris system

In preparation for a forthcoming public trial of a new web service powered by Solaris, we recently spent some time investigating different ways to lock the system down. Here is an overview of our findings.

Solaris Installation

If you have the option of (re)installing Solaris, then take it. In doing so, be sure to choose the Solaris Core installation cluster (SUNWCreq) as this is the most secure (mainly due to the reduced number of packages that it includes). Of course, this cluster will almost certainly not provide you with everything that you need (and you will have to manually install several packages thereafter) but it is generally worthwhile as you will know exactly what is and is not installed on your system.

Useful Tools and Utilities

I found the following utilities very useful:

  • netstat
  • nmap
  • lsof

The first is natively available on Solaris and the other two can be downloaded from the sunfreeware.com website. The combination of these utilities makes it very easy to determine which ports are open (and by which process) on a system. Refer to some of the articles below to see some good ways in which you can do this.
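
As a rough illustration of the netstat part, here is a small sketch that boils "netstat -an" output down to the list of listening ports. It assumes the Solaris layout where the local address looks like *.22 or 127.0.0.1.25 (port after the last dot) and the state is the last column. The function name and the sample lines are made up for the example.

```shell
# Hypothetical helper: read `netstat -an` output on stdin and print
# each port that is in the LISTEN state, once, in numeric order.
listening_ports() {
    awk '$NF == "LISTEN" { n = split($1, part, "."); print part[n] }' |
        sort -n | uniq
}

# Made-up sample lines in the Solaris netstat layout, for demonstration.
printf '%s\n' \
    '      *.22                 *.*                0      0 49152      0 LISTEN' \
    '127.0.0.1.25               *.*                0      0 49152      0 LISTEN' \
    '10.0.0.5.22          10.0.0.9.51234     49640      0 49640      0 ESTABLISHED' |
    listening_ports
```

On a real system you would feed it live data instead, e.g. "netstat -an | listening_ports", and then use lsof to find out which process owns each port.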

Solaris Security Toolkit

If you want to, you can manually lock down your system using the netstat, svcs and svcadm commands but you really need to know what you are doing. However, there is a far simpler way to do this and that is to use the Solaris Security Toolkit (SUNWjass). This is a very powerful (and extremely well documented) utility that is pretty easy to use, to very good effect. It is also free.

All you need to do is run one simple command (there are 16 variations to choose from depending on how paranoid you are) and the SST will do the rest, disabling all the appropriate services, setting file permissions, creating hosts.allow and hosts.deny files and even invalidating non-root user passwords in certain cases. So be sure that you have console access to your system before you run it.

You can even run it again in audit mode to check that the system is still locked down to the same degree as it was when you first ran it (although I have not tested this).

Useful Links
Here are some good postings on the Solaris Security Toolkit

Apache + Mongrel on Solaris

Apache is an excellent web server but does not handle Rails projects very well on its own. Mongrel is a Ruby-based HTTP server that can sit behind a front end such as Apache and take care of the dynamic content.

Credit / Reference Site

We owe a great deal of thanks to this article posted by codehale on the same topic. The article you are reading here is a summation of notes taken during a recent installation of Apache and Mongrel on the Solaris operating system, based on codehale’s article.

Prerequisites

Several software packages need to be pre-installed on Solaris in order for Mongrel to install and operate correctly. I have listed them below along with the versions I used. The versions of Apache and Ruby were both taken from the Solaris CoolStack software bundle.

  1. Apache 2.2
  2. Ruby 1.8.5
  3. Ruby Gems 0.9.2
  4. Ruby on Rails 1.2.3
  5. Sun Studio 11

To install the software, simply download (and unzip) the relevant file from the above site(s) and install it (as root) using the pkgadd -d command.

Installation Procedure
The system used for this installation was a SunFire T1000 running Solaris 10 (U3).

Step 1 – Download Mongrel

Commands

# gem install daemons gem_plugin mongrel mongrel_cluster --include-dependencies --no-rdoc --no-ri

Notes

  • When asked about which version of Mongrel and FastThread, we chose the newest version for ruby (not win32).
  • The above command will attempt to compile and install some native code and requires a C compiler. If you are using Sun Studio, you should also read my other post regarding Ruby and Sun Studio.
  • If you are using GCC, then you may also require ginstall. However, ginstall is not available for Solaris so we overcame this by installing the GNU coreutils package and then creating a symbolic link: /usr/local/bin/ginstall pointing at /usr/local/bin/install (after coreutils was installed).
  • The reference site above also recommends that the “sendfile” utility be removed from your system (if installed). We used the “pkginfo | grep sendfile” command to verify that it was not present on our system.

Step 2 – Configuring Mongrel Clusters

Commands

# cd /opt/myrubyapp
# mongrel_rails cluster::configure -e production -p 8000 -a 127.0.0.1 -N 3 -c /opt/myrubyapp/public
# vi /opt/myrubyapp/config/mongrel_cluster.yml (and change port setting to 8000 - see below)
# mongrel_rails cluster::start -c /opt/myrubyapp/config/mongrel_cluster.yml

Notes

  • The -N value above represents the number of Mongrel instances in the cluster. We followed the example and chose 3.
  • The second command above creates a mongrel_cluster.yml file. However, we noticed that the port setting in this file was incorrect after it was created. It was “ath” instead of 8000 (not sure why).
  • The final command should produce one line of output for each instance configured, indicating that it was correctly started.
  • You should also examine the logs/mongrel.log file and ensure that the instances started on the correct ports (8000, 8001, 8002 etc). This is how we noticed that the .yml file had the wrong port setting.
  • You can use the mongrel_rails cluster::stop command to stop the cluster again (the -c option is also required here).
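
Given that the generated file came out with a bogus port for us, a quick sanity check before starting the cluster may save some log reading. The sketch below assumes the file contains a top-level line of the form port: "8000" (quotes optional). That layout, the function name and the scratch file are assumptions for illustration, so check them against your own mongrel_cluster.yml.

```shell
# Hypothetical helper: print the port from a mongrel_cluster.yml-style
# file, or fail if the value is missing or not a plain number.
cluster_port() {
    port=$(sed -n 's/^port:[ ]*//p' "$1" | tr -d '" ')
    case "$port" in
        ''|*[!0-9]*) echo "suspicious port setting: '$port'" >&2; return 1 ;;
        *) echo "$port" ;;
    esac
}

# Demonstration against a scratch file rather than the real config.
sample=${TMPDIR:-/tmp}/mongrel_cluster.$$.yml
printf 'port: "8000"\nservers: 3\n' > "$sample"
cluster_port "$sample"
rm -f "$sample"
```

A value such as “ath” makes the function return a non-zero status, so it can be used as a guard in front of the cluster::start command.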

Step 3 – Configuring Apache

You will need to use some additional Loadable Modules for Apache. Refer to this post for details on which ones you need and how to build them. You can then follow the instructions in the reference post.

Once these files were configured and placed in the conf/extra directory, we simply had to make one change to the main Apache configuration file (conf/httpd.conf) to ensure that these new files were loaded by Apache. This consisted of the following directive:

Include conf/extra/myapp*.conf

You should now be ready to start Apache (/opt/coolstack/apache2/bin/apachectl start)
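
For readers without the reference post to hand, here is a rough sketch of what a balancer definition for the three Mongrel instances might look like, generated from the shell so the port range stays in step with the cluster settings. This is not the reference post's configuration: the balancer name mongrel_cluster and the function name are hypothetical, and the $(( )) arithmetic needs ksh or bash rather than the old /bin/sh.

```shell
# Hypothetical helper: print one BalancerMember line per Mongrel
# instance, starting at the given port.
balancer_members() {
    host=$1
    port=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "    BalancerMember http://$host:$((port + i))"
        i=$((i + 1))
    done
}

# Print a candidate fragment for a file such as conf/extra/myapp.conf.
echo "<Proxy balancer://mongrel_cluster>"
balancer_members 127.0.0.1 8000 3
echo "</Proxy>"
echo "ProxyPass / balancer://mongrel_cluster/"
echo "ProxyPassReverse / balancer://mongrel_cluster/"
```

Treat the output as a starting point only and compare it with the reference post before using it.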

Best of luck!

Solaris CoolStack, Sun Studio and Ruby

If you are using the Solaris CoolStack edition of Ruby (CSKruby) with Sun Studio 11 (SUNWspro) then you may need to make a small change to one of the Ruby configuration files to tell it where Sun Studio is installed. I would have expected the installer for the Ruby package to detect this but, alas, it does not.

In my case, this problem only manifested itself when I attempted to install some platform-specific Ruby Gems packages (hpricot, fastthread, mongrel). To remedy this problem:

  1. Edit the file /opt/coolstack/lib/ruby/1.8/sparc-solaris2.10/rbconfig.rb
  2. Search for all instances of SUNWspro and change to the correct location
  3. Save the file and repeat your “gem install” command

Manually adding a user to a Solaris system with Automount enabled

If you need to add a new user (and group) to a Solaris system but do not have a graphical display attached to it then you will have to add the user from the command-line. Since home directories for users on Solaris are traditionally stored in /home then you would expect to be able to add the new user as follows:

# groupadd somegroup
# useradd -g somegroup -d /home/someuser -m someuser
# chown someuser:somegroup /home/someuser
# passwd someuser ... and assign a valid password

However, if your Solaris system is using the Automounter, then the /home directory will actually be under the control of the automounter and you will not have the permissions to create the directory there. Instead, you must create it in /export/home and tell the automounter about it, as follows:

# groupadd somegroup
# useradd -g somegroup -d /export/home/someuser -m someuser
# chown someuser:somegroup /export/home/someuser
# passwd someuser ... and assign a valid password
# vi /etc/auto_home ... and add the following line (after the +auto_home line)
someuser 127.0.0.1:/export/home/someuser

You should now be able to log in as the new user.
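
If you add users this way regularly, the last step can be scripted. The sketch below appends the map entry only if the user does not already have one, and is shown against a scratch file rather than the live /etc/auto_home. The function name is hypothetical and the entry format is simply the one used above.

```shell
# Hypothetical helper: append an auto_home entry for a user unless
# one is already present in the given map file.
add_auto_home_entry() {
    map=$1
    user=$2
    # Do nothing if the user already has an entry in the map.
    if grep "^$user[[:space:]]" "$map" >/dev/null 2>&1; then
        return 0
    fi
    echo "$user 127.0.0.1:/export/home/$user" >> "$map"
}

# Demonstration against a scratch copy of the map.
scratch=${TMPDIR:-/tmp}/auto_home.$$
printf '+auto_home\n' > "$scratch"
add_auto_home_entry "$scratch" someuser
add_auto_home_entry "$scratch" someuser   # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```

Because the entry is appended to the end of the file, it lands after the +auto_home line as required.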

Installing a Solaris JumpStart server

We acquired a number of SunFire T1000 servers recently which came with Solaris 10 U2 (06/06) pre-installed. Before putting them into full-time use, we decided to take the opportunity to upgrade them to a later revision of Solaris, namely U3 (11/06). However, since these systems do not come with an internal CDROM drive, we had to reinstall the operating system over the network and to do this we needed a JumpStart server.

I used this site as the basis of my activities. It looks like it was written for Solaris 7 but seemed to work just as well (for me) on Solaris 10. So, here are my notes on installing and configuring a Solaris 10 JumpStart server. I used an existing SPARC system as my JumpStart server.

1. Preparation

Configure some NFS-mountable directories to store the contents of the Solaris installation media and some additional jumpstart configuration files:

# mkdir -p /export/install/5.10u3-sparc
# mkdir /export/jumpstart

Add the following lines to the /etc/dfs/dfstab file to make these directories shareable:

# vi /etc/dfs/dfstab
share -F nfs -o ro,anon=0 /export/install/5.10u3-sparc
share -F nfs -o ro,anon=0 /export/jumpstart

To share these directories, use the following command:

# shareall

Next, you will need to mount the CDROM/DVD containing the Solaris media. This should happen automatically as soon as you insert the disc in the drive (if your system has the Volume Management daemon, vold, running on it) but strangely, it did not for me. I actually had to restart Volume Management to get the DVD mounted:

# /etc/init.d/volmgt stop
# /etc/init.d/volmgt start

Now use the mount or df command(s) to verify that the CD/DVD has been mounted. For the record, here is a useful way to determine the device name for your CD/DVD drive:

# ls -al /dev/sr*
lrwxrwxrwx 1 root root 12 May 2 16:42 /dev/sr0 -> dsk/c2t0d0s2
# mount -F hsfs -o ro /dev/dsk/c2t0d0s2 /mnt/cdrom

However, the Solaris 10 DVD contains several UFS slices and cannot be mounted in this way.

2. Installing the JumpStart Server

Assuming that your CD/DVD has been mounted on /cdrom, here is how to install the core of the jumpstart server:

# cd /cdrom/cdrom0/s0/Solaris_10/Tools
# ./setup_install_server /export/install/5.10u3-sparc

This will copy the appropriate contents from the CD/DVD to the relevant directories within /export/install/5.10u3-sparc. This part of the process can take some time to complete so be patient.

You may also want to configure a Boot Server (I wasn’t sure if I needed this but did it anyway):

# cd /cdrom/cdrom0/s0/Solaris_10/Tools
# ./setup_install_server -b /export/install/5.10u3-sparc/sun4v

3. Copying up the JumpStart sample configuration files

There are several configuration files within a JumpStart server and the Solaris CD contains some samples to get you started. We created a directory for these earlier but must copy them from the CD now:

# cp -r /cdrom/cdrom0/s0/Solaris_10/Misc/jumpstart_sample/* /export/jumpstart

4. Setting up the JumpStart configuration files

As the reference site indicates, there are several files that need to be configured before you can attempt to initiate a network installation from a client, namely:

/export/jumpstart/rules

This file is mandatory and helps to define some rules to specify what (type of) clients are allowed to use this install server and what to do before and after the installation. I created a single rule as follows:

network XX.XX.XX.0 && arch sparc - myT1000 -

which says that only SPARC systems in the XX.XX.XX.0 network are allowed to use this server and when they do, the settings in the myT1000 file should be used to specify how those systems should be configured.

/export/jumpstart/myT1000

This file specifies how a given system should be configured by allowing you to predefine what type of installation you want, which software packages you want (and do not want) and how to lay out your file systems etc.

Here is the profile file that I used:

install_type initial_install
system_type server
partitioning explicit
filesys c0t0d0s0 10240 /
filesys c0t0d0s1 2048 swap
filesys c0t0d0s3 10240 /var
filesys c0t0d0s4 10240 /usr
filesys c0t0d0s5 10240 /opt
cluster SUNWCprog add
package SUNWauda delete
package SUNWaudh delete
package SUNWaudf delete
package SUNWxorg-devel-docs delete

As you can see, I chose the Developer Support software cluster but then requested that several software packages be excluded from it. I chose to remove many more than are shown here but I think you get the point…
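
With explicit partitioning it is worth checking that the slices actually fit on the target disk before you start. Here is a minimal sketch that adds up the numeric filesys sizes in a profile. It assumes the sizes are in MB (the default unit, as far as I know), and the function name and scratch file are made up for the example.

```shell
# Hypothetical helper: total the numeric sizes of all filesys lines
# in a JumpStart profile (keywords such as "free" are skipped).
profile_total_mb() {
    awk '$1 == "filesys" && $3 ~ /^[0-9]+$/ { total += $3 }
         END { print total + 0 }' "$1"
}

# Demonstration using the filesys lines from the profile above.
profile=${TMPDIR:-/tmp}/myT1000.$$
cat > "$profile" <<'EOF'
install_type initial_install
partitioning explicit
filesys c0t0d0s0 10240 /
filesys c0t0d0s1 2048 swap
filesys c0t0d0s3 10240 /var
filesys c0t0d0s4 10240 /usr
filesys c0t0d0s5 10240 /opt
EOF
profile_total_mb "$profile"
rm -f "$profile"
```

For the profile above this prints 43008, i.e. 42GB, which is the minimum the disk needs to provide.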

/export/jumpstart/sysidcfg

This is an optional file that essentially allows you to specify extra settings for your installation. In summary, the more settings you specify here, the fewer questions you are asked during the network installation and the more automated the process becomes. Here is the sysidcfg file that I used:

system_locale=en_IE.UTF-8
install_locale=en_IE.UTF-8
timezone=Eire
terminal=vt100
timeserver=localhost
name_service=DNS { domain_name=XXXX.YYY name_server=X.X.X.X,Y.Y.Y.Y search=XXX.YYY,XXX.YY }
network_interface=bge0 {netmask=255.255.255.192 default_route=X.X.X.X protocol_ipv6=no }
security_policy=NONE

Once you have all of these files in place, you need to verify that they are syntactically correct. This is done using the check tool as follows:

# cd /export/jumpstart
# ./check
Validating rules...
Validating profile myT1000...
The custom JumpStart configuration is ok.

5. Telling the server about a client

Before you can commence an installation from a client, you need to tell the install server about that client. Since I was not using DHCP and already had a DNS server with a valid entry for my client, this stage was a little easier for me. Of course you also need to ensure that your server is running a TFTP Boot Server.

To tell the server about a client, you need to know the MAC address of the primary network adapter of the client and the intended hostname of the client. Once you know this, use the following command:

# cd /export/install/5.10u3-sparc/Solaris_10/Tools
# ./add_install_client -e 11:22:33:44:55:66 -s fonda:/export/install/5.10u3-sparc -c fonda:/export/jumpstart -p fonda:/export/jumpstart shefflin sun4v

The name of my client was shefflin (it was a sun4v system) and the name of my server was fonda. Clearly, you will need to use your own values for the parameters as well as the correct MAC address. The result of this command is some new files in the TFTP Boot area as well as a new entry in the /etc/ethers file.

You are now ready to start the installation from the client.

6. Starting the Client Installation

This is actually the simplest part of the exercise and involves one command. However, you do need to ensure that the Network Management port of your client has been configured with a valid IP address. Anyway, to start the client installation, use the following command from the boot prompt of your client:

boot net - install

The system should then start installing the new version of Solaris. It will do things like request an IP address from the network, attempt to configure the network interfaces in the client and ultimately follow pretty much the same procedure as if you were installing from a CD or DVD. The more configuration files you provided on the server, the fewer questions you will be asked during the installation.

When the installation has completed, you will be dropped back to the (root) command prompt. You should now reboot the system after which you will be asked to provide a password for the root user. Upon completion of this task, you should finally be presented with the console login prompt and, hey presto, you’re done!

Reference Sites

Here are some other useful websites I discovered during this exercise:

And some links to others who have attempted the same task:

Best of luck!

Upgrading the firmware on a SunFire T2000

In preparation for installing a newer release of Solaris 10 on a SunFire T2000, we first decided to bring the system to the most recent version of firmware. Here is how we got on …

Personally, I have upgraded firmware on many different types of systems in the past and whilst the technique is invariably the same, there are always minor differences that catch you out. This is also true in this case (hence the reason for this article). We found that many of the articles covering this topic made certain assumptions (without realising) and thus, left out certain key information. This article attempts to address those omitted assumptions.

What is the latest version of firmware and where can I get it?

We found links to all versions of firmware for T2000 systems here. The latest version at the time this article was written was 6.4.4. The firmware is distributed in a ZIP file.

What do I do next?

The best source of information on installing the firmware you have just downloaded is contained in a file called Install.info which can be found inside the ZIP file you just downloaded.

What the Install.info does not tell you

Here is a list of things that I did not know before reading the Install.info but found out during …

  1. There are many references on how to use the System Controller Console (SC) to both determine whether you need an upgrade and also to carry out the upgrade. However, they don’t actually tell you how to get to it. I discovered that:
    • You do not need to shut down your system
    • You must type #. at the system console to get to the SC (that’s a hash followed by a dot).
  2. When you first access the SC, you will be asked to provide a password (for future sessions). However, later on when it comes to authenticating yourself, you are also asked for a username but nowhere do they tell you what this is. The default username is admin but this is not well documented.
  3. Once you have established your connection to the SC and determined whether or not you actually need a firmware upgrade, it will be time to use the flashupdate command. When I tried this, it kept telling me that it could not establish an FTP connection to the specified server. Once again, I soon discovered that:
    • The flash update process (via FTP) uses a different network port than the one for normal system operation. So I had to move my network connection on the T2000 to the network port labelled NET MGT. The network ports used for normal system operation are actually powered off at this time.
    • You need to assign an IP address to the NET MGT network port. This can be done using the setupsc command from the SC (click here for a separate article on this). Once this is done you then need to reset the SC (resetsc) after which you will require the username and password referred to earlier to regain access to the SC.
  4. Once you finally get to the stage where you have a proper outbound connection from the SC, there is still one last issue. It concerns the name of the file containing the firmware image that you are passing to the flashupdate command from SC (via the -f parameter). I had assumed that this path was relative to the home directory of the FTP user being used but it turns out it was not. I had to specify an absolute path (e.g. /tmp/abc.bin)

Summary

Essentially, the Install.info file is your friend but there are 3 fundamentals that it does not cover:

  1. You need to have a valid network connection to the NET MGT port
  2. Your NET MGT port must have a valid IP address, Mask and Gateway
  3. You must use an absolute path for the image file in the flashupdate command.
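
To make the third point concrete, here is a small sketch that builds the flashupdate command line to type at the sc> prompt and refuses a relative image path. The -f option is the one mentioned above. The -s option for the FTP server address is from my memory of the SC command set, so treat it as an assumption and check it against Install.info. The function name and the 192.0.2.10 address are placeholders.

```shell
# Hypothetical helper: print the flashupdate command to type at the
# sc> prompt, insisting on an absolute path for the firmware image.
flashupdate_cmd() {
    server=$1
    image=$2
    case "$image" in
        /*) echo "flashupdate -s $server -f $image" ;;
        *)  echo "image path must be absolute: $image" >&2; return 1 ;;
    esac
}

# Example with a placeholder FTP server address and the path used above.
flashupdate_cmd 192.0.2.10 /tmp/abc.bin
```

The script only prints the command. The update itself still has to be run from the SC, where you will be prompted for the FTP username and password.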

Now I fully accept that none of this is rocket science and much of it is indeed documented but if you have not done this before, you will waste time discovering these issues. So hopefully, you will benefit from knowing this and it will save you some time in the future.

Best of luck!

Building Loadable Modules for Apache on Solaris

We recently installed a version of Sun’s CoolStack software bundle on a SunFire T2000 server running Solaris 10 so that we could use its CoolThreads-optimised version of Apache instead of the regular Apache that came pre-installed on the box (highly recommended by Sun).

However, when we ran the new version of Apache against our configuration, we discovered that it does not include a number of key modules that we require (namely Proxy Balancer). Sun does not provide these in binary format so we had to build them by hand. Fortunately, Apache does provide a convenient tool (apxs) for building modules but unfortunately this requires you to install a Sun compiler, Sun Studio (now free though) which added some extra time to the process.

Anyway, I appreciate that this may not be rocket science to many of you but, despite the many articles already published on this topic, there were still some issues that we hit which were not documented. So, here is how we did it.

  1. Ensure that you have installed the latest version of the CoolStack software (installs to /opt/coolstack)
  2. Download, unpack and install Sun Studio 11 (installs to /opt/SUNWspro)
  3. Be patient, very patient … 600MB download, followed by a long unzip to 1.1GB followed by a long install …
  4. Download, unpack and install the CoolStack Source (installs to /opt/coolstack/src)
  5. Now, as a root user, follow the commands below

# export PATH=/opt/SUNWspro/bin:$PATH
# cd /opt/coolstack/src/httpd-2.2.3/modules/proxy
# /opt/coolstack/apache2/bin/apxs -i -a -c mod_proxy.c proxy_util.c

This will compile the module, copy it to the appropriate directory and update the Apache configuration file for you. If you leave out the proxy_util.c you will get "proxy_lb_worker: symbol not found" errors when you start Apache.

# /opt/coolstack/apache2/bin/apxs -i -a -c mod_proxy_balancer.c

Once again, this will compile, copy and deploy the module for you

# /opt/coolstack/apache2/bin/apxs -i -a -c mod_proxy_http.c

If you forget to install this module, Apache will start, but the site will be inaccessible (saying you do not have permissions to view this page). You will also see errors like "proxy: No protocol handler was valid for the URL /" in the Apache error log file for your product.
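
Since a forgotten module only shows up once the site is already broken, a quick check of httpd.conf after running apxs can help. The sketch below looks for the LoadModule lines that apxs -a should have added for the three modules built above and prints any that are missing. The function name is hypothetical, and the [[:space:]] class may require /usr/xpg4/bin/grep on Solaris.

```shell
# Hypothetical helper: print the name of each proxy module that has
# no active LoadModule line in the given Apache configuration file.
missing_proxy_modules() {
    conf=$1
    for mod in proxy_module proxy_balancer_module proxy_http_module; do
        grep "^[[:space:]]*LoadModule[[:space:]][[:space:]]*$mod[[:space:]]" "$conf" \
            >/dev/null 2>&1 || echo "$mod"
    done
}

# Demonstration against a scratch file that lacks mod_proxy_http.
conf=${TMPDIR:-/tmp}/httpd.conf.$$
cat > "$conf" <<'EOF'
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
EOF
missing_proxy_modules "$conf"
rm -f "$conf"
```

On a real installation you would point it at /opt/coolstack/apache2/conf/httpd.conf; no output means all three modules are configured to load.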

Best of luck!

Moving Times

I recently spent several hours hopping between my desk and the consoles of a number of computer systems running different operating systems (Solaris, Linux and Windows to be precise). I hadn’t yet configured remote access on these machines, hence the need to sit at the console.

Anyway, a rather curious observation was that the time on these systems was located in completely different areas. On Windows, it defaults to the bottom-right corner, Solaris defaults to the bottom-left corner and Ubuntu defaults to the upper-right corner. As I quite often don’t wear a watch, I found my eyes playing a game of Boggle every time I wanted to check the current time, a very strange experience.

Naturally, I presume I could have reconfigured the desktop(s) to show the time in the same place but it’s still interesting to wonder if this was a deliberate move on the part of each operating system vendor.