Recursive grep on Solaris

The grep utility that is included with Solaris 10 does not appear to support recursive behaviour, unlike many flavours of Linux which do so using the -r switch. Until recently I had been using something like:

$ grep 'somestring' `find . -name '*'`

in an attempt to emulate a recursive grep, but this doesn’t work too well when you run it from a high-level directory on your system: the list of files produced by find becomes too long to pass to grep as command-line arguments.

However, I recently found a much better alternative:

$ find . | xargs grep 'somestring'

This is shorter and much easier to remember. From initial observations it seems to take a little longer to run, but I can live with that since it works.
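One caveat with both commands above is that file names containing spaces can be split into separate arguments, and the pipeline also hands directories to grep. The following is a minimal sketch of one way around that, not a definitive replacement. It assumes your find supports the POSIX "-exec ... {} +" form (worth checking on your release), and the helper name rgrep is purely illustrative:

```shell
# rgrep PATTERN [DIR]: hypothetical helper that emulates "grep -r".
# Assumes find supports the POSIX "-exec ... {} +" form, which batches
# file names much as xargs does but keeps names with spaces intact.
rgrep() {
    pattern=$1
    dir=${2:-.}
    # -type f skips directories; the extra /dev/null argument makes grep
    # print the file name even when a batch contains a single file.
    find "$dir" -type f -exec grep "$pattern" /dev/null {} +
}
```

For example, rgrep 'somestring' /etc searches every regular file below /etc.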

Smaller C compiler package for Solaris

Over the course of recent months, I have deployed a number of Ruby applications onto Solaris, many of which use a handful of Gem packages. Most of the Gem packages are written in Ruby but some are not (e.g. Mongrel, FastThreads and Hpricot) and require native compilation on the host system as part of the installation process.

In order to complete the installation of these Gem packages, you need some form of C compiler. Ordinarily, I would choose GCC, but because the Ruby binaries (and most of the other supporting software packages) I use (i.e. Solaris Coolstack) were compiled using Sun Studio, using GCC is not really an option. So, instead I have to install Sun Studio.

Now, don’t get me wrong, I’ve nothing against the Sun Studio product per se (actually, it’s a superb product). My problem is rather that, in order to install a basic C compiler on my system, I have to download, unzip, untar and then install from a now enormous 1GB (zipped) file (as of Sun Studio 12). This is a real pain and takes a very, very long time (the unzip alone takes over 15 minutes due to the use of bzip2 compression). This is all the more painful (and wasteful) when you consider that the disk space consumed by the installation of just the C compiler is a mere 200MB from this 1GB monster.

What I would like (and have requested) is a reduced-size package that contains a basic C compiler only, one that would be used purely for native compilation of the likes of Ruby Gem packages.

Here’s hoping…

Syslog connections rejected using Syslog-NG on Solaris

The Problem

After an upgrade from Syslog-NG version 1.6.7 to version 2.0.5, our syslog server began reporting the following error each time an event was received from a remote host:

Syslog connection rejected by tcpd; from='AF_INET(127.0.0.1:XXXXX)'

The syslog server was a Solaris x86 system which had a number of reverse SSH tunnels to several SPARC-based syslog clients. The version of syslog-ng was obtained from sunfreeware.com in binary form on both systems.

The Solution

It turns out that Syslog-NG 2.x introduced support for TCP Wrappers (which tcpd is part of) and thus, the settings in my /etc/hosts.allow and /etc/hosts.deny files were actually preventing syslog-ng from accessing port 514 on the local host. Adding the following entry to hosts.allow seems to have fixed the problem:

syslog-ng: localhost

Of course, you will need to refresh or restart the inetd service after you do this (svcadm refresh inetd or svcadm restart inetd).

Installing Nagios on Solaris

If you are not familiar with it, Nagios is an open source system and network monitoring application that keeps a watchful eye on hosts and services that you specify, alerting you when things go bad and when they get better. It was designed to run under Linux but, according to the creators, should also run on other Unix variants. With this in mind, I decided to try it out on Solaris.

Fortunately, they provide a very comprehensive User Guide and trust me, you’re going to need it. Suffice it to say, there is a section early in this document entitled Advice for Beginners which is pretty blunt about how tricky Nagios is to set up and, believe me, they are not wrong. Having said that, they do also say that once you get it running you will never want to be without it, and I can definitely subscribe to this notion too.

Anyway, here are my notes from installing Nagios on Solaris:

My Setup

  • Solaris 10 (u3) for x86
  • Sun Studio 12
  • Nagios 2.10
  • Nagios Plugins 1.4.11

Building Nagios

I downloaded the Nagios source tarball and unpacked it as a non-root user. Then, following the User Guide, I ran the configure script with no arguments (implying I wanted all default settings) followed by make all, and this seemed to work fine.

Building Nagios Plugins

Nagios does most things by using other scripts/applications which it calls plugins. The Nagios website provides a collection of popular plugins for you to build. Once again, I did so using a configure command followed by a make all command. However, this did not go entirely smoothly:

  1. A number of the plugins failed to build citing an “undefined symbol: floor” error. This was resolved by adding -lm to the LIBS defined in line 328 of nagios-plugins-1.4.11/plugins/Makefile. This could probably also have been fixed by adding $(MATHLIBS) to the link statement of the affected plugins but that would have been more work.
  2. The check_dhcp module failed to compile citing several unknown data types (i.e. u_int8_t and u_int32_t). This was resolved by adding -D__solaris__ to the CPPFLAGS definition at line 161 of nagios-plugins-1.4.11/plugins-root/Makefile.
  3. The nagios-plugins-1.4.11/plugins-root/Makefile was also missing the same -lm link parameter as the plugins Makefile (line 221).

Once all of the above changes were made, all of the plugins seemed to build correctly.
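The first and third fixes both amount to appending -lm to the LIBS line of a Makefile. A minimal sketch of doing that from the shell follows. It assumes the line in question begins with "LIBS =" (check your generated Makefile first, since that exact text is an assumption), and the helper name add_libm is hypothetical:

```shell
# add_libm MAKEFILE: hypothetical helper that appends -lm to the LIBS line.
# Assumes the line starts with "LIBS =". Running it twice appends -lm twice.
add_libm() {
    makefile=$1
    # Write to a temporary file and move it back, since not every sed
    # offers an in-place editing option.
    sed '/^LIBS =/s/$/ -lm/' "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}
```

It would be called once per Makefile, for example add_libm plugins/Makefile, before re-running make.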

Problems Found During Nagios Configuration

1. check_ping plugin did not work

No matter how I configured the use of the check_ping plugin (in localhost.cfg, services section) it always reported:

CRITICAL – You need more args!!!
Could not open pipe:

A number of websites suggested that this was an IPv6 issue and that I should have used --with-ipv6=no in my original call to the configure script when building the plugins. However, this was not the solution for me. It turns out that the definition of PING_COMMAND in nagios-plugins-1.4.11/config.h was empty and thus the check_ping plugin was actually making no attempt whatsoever to ping the requested host. I suspect that the reason for this is that I built the software as a non-root user which, on Solaris, does not have the ping command in its path (since ping is located in /usr/sbin on Solaris). Hence, the original configure script was unable to produce a valid definition for PING_COMMAND.

The solution to this was to edit nagios-plugins-1.4.11/config.h and add the following definition for PING_COMMAND (line 796):

#define PING_COMMAND "/usr/sbin/ping -s %s 64 %u"

The above command is specific to Solaris and makes the Solaris version of ping behave like the Linux ping command. After this edit, I had to force a rebuild of the check_ping plugin (touch plugins/check_ping.c; make).
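A quick way to spot this problem before building is to check whether configure actually filled in PING_COMMAND. The sketch below assumes config.h writes the definition in the same single-space form as the line shown above; the helper name ping_command_defined is hypothetical:

```shell
# ping_command_defined CONFIG_H: succeeds only if PING_COMMAND is defined
# with a non-empty quoted value. Hypothetical helper name; assumes the
# "#define PING_COMMAND "..."" layout with single spaces.
ping_command_defined() {
    # Require at least one character after the opening double quote.
    grep '^#define PING_COMMAND "[^"]' "$1" > /dev/null
}
```

For example, run ping_command_defined config.h || echo "PING_COMMAND is empty" from the nagios-plugins-1.4.11 directory. If the missing PATH entry is indeed the cause, running configure with /usr/sbin added to PATH may avoid the manual edit altogether, although that is untested here.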

2. statusmap.cgi did not build

I only noticed this when I tried to view the Status Map section of Nagios. In short, the reason it had not built was that I was missing a GD library on my Solaris system. The solution was to download and install a version of the GD library (and each of its dependent packages). I got mine from sunfreeware.com. The statusmap.cgi utility then built correctly and, once I copied it to the libexec directory where Nagios was installed, it worked.

3. VRML Browser Plugin required

When I tried to view the 3-D Status Map options in Nagios, my browser kept launching a “Save As” dialog box. It turns out I needed to install a VRML plugin in my browser. I chose one called Cortona from Parallel Graphics. It seems to work fine in Firefox although, as yet, the 3-D Status Map view is more impressive than it is useful (for me anyway).

Conclusion

Nagios indeed took a long time to install, configure and set up. However, I can confirm that it was worth the effort and I am very pleased with it so far.

MySQL Proxy Gotcha on Solaris SPARC

The Problem

During a recent install of MySQL Proxy on a Solaris 10 system (sparc u3), we encountered the following error whenever we supplied the --pid-file command-line argument:

Conversion from character set '646' to 'UTF-8' is not supported

However, if we omitted the --pid-file argument, it worked just fine.

The Solution

Running ldd against the application revealed no missing libraries but when we ran it with truss, we discovered that it was indeed missing some library files:

open64("/export/home/mysqldev/mysql-proxy-32bit/inst/glib/lib/charset.alias", O_RDONLY) Err#2 ENOENT
access("/usr/lib/iconv/geniconvtbl/binarytables/646%UTF-8.bt", R_OK) Err#2 ENOENT
access("/usr/lib/iconv/646%UTF-8.so", R_OK) Err#2 ENOENT
open("/usr/lib/iconv/alias", O_RDONLY) Err#2 ENOENT

It turns out that the Solaris system we were using was originally installed using the Core System cluster (SUNWCreq) and did not have all the requisite Unicode packages installed. So, an installation of the SUNWuiu8 package (from the Solaris distribution media) duly resolved the matter.
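A quick way to test for this condition without truss is to ask the iconv utility to perform the same conversion. This is only a sketch: it assumes the iconv command is installed and that it consults the same conversion modules as the library calls made by MySQL Proxy. The helper name can_convert is hypothetical:

```shell
# can_convert FROM TO: succeeds if iconv can convert between the two
# code sets. Hypothetical helper; assumes the iconv utility is installed.
can_convert() {
    # The exit status of the pipeline is that of iconv itself.
    printf 'test' | iconv -f "$1" -t "$2" > /dev/null 2>&1
}
```

On the affected system, can_convert 646 UTF-8 || echo "646 to UTF-8 is not available" would be expected to print the message until the package is installed, and pkginfo SUNWuiu8 reports whether the package is present.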

Running multiple instances of MySQL with Solaris Coolstack 1.2

We recently had a requirement to run two instances of MySQL on two separate Solaris systems (for the purposes of dual-master replication). The systems in question were both SunFire T2000s running Solaris 10 (U3) and Coolstack 1.2 and already had a previous/single instance of MySQL running (SMF service csk-mysql).

In the end, it was a relatively straightforward exercise. All we really did was replicate the existing instance (control script, configuration file and manifest file) and change the necessary parts of the resulting files so that the new instance was sufficiently different from the old one. The result was two independent SMF services (csk-mysql1 and csk-mysql2). Specifically, the following files were created:

MySQL Control Scripts:
/opt/coolstack/lib/svc/method/svc-cskmysql1
/opt/coolstack/lib/svc/method/svc-cskmysql2

MySQL Configuration Files:
/etc/my1.cnf
/etc/my2.cnf

MySQL Manifest Files:
/var/svc/manifest/network/cskmysql1.xml
/var/svc/manifest/network/cskmysql2.xml

The manifest files were identical but for the name of the service and the control script they invoke. The control scripts were also identical but for the configuration file and DBDIR that they use. The configuration files differed mainly in the port numbers and socket files used.

Beware

The only thing that caught us out was in modifying the control scripts (to tell them which configuration file they should use). This was done by adding a --defaults-file argument to the invocation of mysqld_safe. However, you need to ensure that this argument is the first one you pass to mysqld_safe. Otherwise, MySQL will launch but will not load the correct configuration settings. This is not (well) documented so beware!

Also, when trying to access the new instances via the mysql client, you will need to add the -P (port) and -S (socket file) arguments so that the client connects to the right instance.
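Typing the port and socket each time gets tedious, so a small wrapper can help. The sketch below is illustrative only: the port numbers and socket paths are hypothetical placeholders (use the values from your own my1.cnf and my2.cnf), as are the helper name mysql_instance and the MYSQL_BIN override:

```shell
# mysql_instance N [mysql arguments...]: hypothetical wrapper that adds
# the -P and -S arguments for instance N. The ports and socket paths
# below are placeholders, not the values used on our systems.
mysql_instance() {
    instance=$1
    shift
    case "$instance" in
        1) port=3306; socket=/tmp/mysql1.sock ;;
        2) port=3307; socket=/tmp/mysql2.sock ;;
        *) echo "unknown instance: $instance" >&2; return 1 ;;
    esac
    # MYSQL_BIN lets you point at a specific client binary if needed.
    "${MYSQL_BIN:-mysql}" -P "$port" -S "$socket" "$@"
}
```

For example, mysql_instance 2 -u root -p connects to the second instance.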

Renaming a Solaris system

I’ve had to change the hostname of a number of Solaris systems over the past few weeks, some of which were (sparse) Solaris zones. Here are some notes for my/your future reference:

File                    Global Zones    Non-Global Zones
/etc/nodename           Yes             Yes
/etc/hostname.ifname    Yes             No
/etc/hosts              Yes             Yes
/etc/inet/hosts         Yes             Yes
/etc/inet/ipnodes       Yes             Yes
Notes

  1. You will need to replace the ifname above with the name of the appropriate interface on your system.
  2. Depending on your system configuration, I may of course be missing some files. However, the above worked for me!
  3. None of the systems in question were running IPv6.
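After editing, it is worth checking that no file still mentions the old name. The following is a minimal sketch that covers only the files listed above. The helper name files_mentioning and its optional root prefix argument (for example, a zone root path) are hypothetical:

```shell
# files_mentioning NAME [ROOT]: print which of the files listed above
# still contain NAME. ROOT is an optional, hypothetical path prefix and
# defaults to the running system. This is a plain substring match, so
# "web1" also matches "web10".
files_mentioning() {
    name=$1
    root=${2:-}
    for f in "$root"/etc/nodename "$root"/etc/hostname.* "$root"/etc/hosts \
             "$root"/etc/inet/hosts "$root"/etc/inet/ipnodes
    do
        # Skip files that do not exist (an unmatched glob stays literal).
        [ -f "$f" ] || continue
        if grep "$name" "$f" > /dev/null; then
            echo "$f"
        fi
    done
    return 0
}
```

Running files_mentioning oldhostname after the rename should print nothing if every file was updated.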

Uninstalling Sun Studio

I recently had reason to upgrade to a newer version of Sun Studio on a Solaris 10 system. I had Studio 11 installed but needed to go to version 12 for better support of natively compiled Ruby Gem packages (e.g. mongrel). However, uninstalling Sun Studio is not the most readily documented thing in the world so, for my own future reference, here is the easiest way to do it:

# cd /var/sadm/prod/com.sun.studio_11
# ./batch_uninstall_all

After this, just follow the on-screen instructions …

Native Ruby Gems require Sun Studio 12 with CoolStack 1.2

A substantial number of Ruby Gem packages are written in Ruby itself and install quite neatly using a simple gem install command. However, a number of Gem packages are partly written in C and require some compilation during installation (e.g. hpricot, fastthread and mongrel). This requires you to have a C compiler on the system where you are installing the packages, which is a real pain, but that’s a gripe for another day.

We recently upgraded to CoolStack 1.2 but when we tried to install the Gem packages above, we ran into trouble when the installer attempted to compile the native code for the package. The compiler (Sun Studio 11) complained about the definition of the NORETURN macro in /opt/coolstack/lib/ruby/1.8/i386-solaris2.10/config.h as follows:

syntax error before or at: __attribute__
warning: old-style declaration or incorrect type for:
__attribute__
warning: syntax error: empty declaration
...
cc: acomp failed for hpricot_scan.c

It turns out that the new version of Ruby (1.8.6) that is included with CoolStack 1.2 contains some GCC-oriented macro definitions that Sun Studio 11 does not support. The solution was to upgrade to Studio 12.

Huge thanks go to Basantk for helping to resolve this.

Corrupted Boot Archive after Solaris X86 patch update

I’ve installed a number of Solaris 10 X86 (U3) systems recently and have hit a very annoying issue on each one of them: the system does not boot after installing the latest applicable patches. Immediately after the GRUB boot menu times out and it attempts to boot Solaris, it returns with a “corrupted boot_archive. No boot device available” message. No other information is presented.

Here is how I recovered from this situation:

  1. Boot the system in Failsafe mode.
  2. The system will detect your Solaris boot partition and offer to mount it on /a. Select Yes when asked about this.
  3. Once the system completes its Failsafe boot, go to /a/platform/i86pc and remove the file called boot_archive.
  4. Reboot the system using the “reboot” command, whereupon the system appears to regenerate the file you just deleted.
  5. The system should then boot normally again.
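Steps 3 and 4 rely on the reboot regenerating the archive. A possible alternative, which is not the procedure I followed above, is to rebuild the archive explicitly with bootadm update-archive before rebooting. The sketch below assumes the root file system was mounted on /a by the Failsafe boot; the helper name rebuild_boot_archive is hypothetical:

```shell
# rebuild_boot_archive [ROOT]: hypothetical helper for the failsafe shell.
# ROOT is where the Failsafe boot mounted the Solaris root (default /a).
rebuild_boot_archive() {
    root=${1:-/a}
    # Step 3 above: remove the damaged archive.
    rm -f "$root/platform/i86pc/boot_archive"
    # Assumption: rebuild the archive now rather than relying on the
    # reboot to regenerate it.
    bootadm update-archive -R "$root"
}
```

After running rebuild_boot_archive, reboot as in step 4.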

After installation and registration of a fresh Solaris system, I usually run the smpatch update command at least once to bring the system to a reasonable patch level (before installing any other software on it). I realise that this may not be entirely advisable in a live environment but on a fresh install, I feel it should be a reasonable thing to do. After all, the man pages for the smpatch command state (for the update subcommand):

This subcommand analyzes the system, then downloads the appropriate updates from the Sun update server to your system. After the availability of the updates has been confirmed, the updates are applied based on the update policy. …If an update does not meet the policy for applying updates, the update is not applied.

I have used this technique several times on SPARC-based systems without issue. The problem only appears to occur on X86 installations.