I ran into a weird problem today while migrating a bunch of servers from an old HP SAN to a shiny new EMC VMAX.
The client has chosen to use Powerpath as the multipathing software on RHEL 5. The servers in question run multiple Oracle databases in a grid configuration. We had no problem with the old SAN, and we had no problem with the new EMC VMAX while using dm-multipath.
The problem started on the first reboot after installing Powerpath. All the devices were there and all the mount points worked fine, but Oracle refused to start, with the following message:
ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpsemsper
Hmm, interesting. After searching Oracle Metalink, we found that this message normally means there are not enough semaphores available. But all our other servers work fine, and we followed the Oracle recommendation when we initially set the number of semaphores.
Our systems are currently configured with 128 semaphore arrays, as per the Oracle recommendation. Using “ipcs -u”, we found that all 128 available arrays were already in use, even before trying to start Oracle. With the “ipcs -s” command, we saw that the root user owned a huge number of semaphore arrays, 125 to be exact. Why did these systems have 125 semaphore arrays owned by root when our other systems have around 25-30?
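To make the per-owner count less tedious than reading the “ipcs -s” listing by eye, here is a minimal Python sketch that tallies semaphore arrays by owner. It assumes the usual column layout of “ipcs -s” (key, semid, owner, perms, nsems); the function name and the sample text are made up for illustration and are not output from the servers described here.

```python
from collections import Counter


def count_arrays_by_owner(ipcs_output):
    """Count semaphore arrays per owner in the text printed by `ipcs -s`."""
    counts = Counter()
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key; header and separator rows do not.
        if len(fields) >= 5 and fields[0].startswith("0x"):
            counts[fields[2]] += 1  # third column is the owner
    return counts


# Made-up sample in the usual `ipcs -s` layout (an assumption, not real output).
SAMPLE = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 0          root       600        1
0x00000000 32769      root       600        1
0x0001f4a2 65538      oracle     640        152
"""

if __name__ == "__main__":
    for owner, n in count_arrays_by_owner(SAMPLE).most_common():
        print(owner, n)
```

On a real system you would feed it the actual command output instead of the sample, for example by capturing “ipcs -s” with the subprocess module.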
Here comes Powerpath, the semaphore eater! If you use Powerpath in combination with an EMC VMAX SAN, Powerpath uses one semaphore array per path to each LUN. So, if you have 4 paths to the EMC VMAX and 25 LUNs presented to the server, 100 semaphore arrays automatically go away at boot, leaving too few for everything else, Oracle included.
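The arithmetic is simple enough to sketch in a few lines of Python. The function names are hypothetical, and the baseline of 25 arrays is an assumption taken from the low end of the 25-30 we see for root on our other systems.

```python
def powerpath_arrays(paths, luns):
    """Semaphore arrays taken by Powerpath: one per path to each LUN."""
    return paths * luns


def arrays_left(semmni, paths, luns, baseline):
    """Arrays still free once Powerpath and the usual baseline are counted."""
    return semmni - powerpath_arrays(paths, luns) - baseline


if __name__ == "__main__":
    print(powerpath_arrays(4, 25))      # 100
    print(arrays_left(128, 4, 25, 25))  # 3
```

With a limit of 128, that leaves only a handful of arrays, which fits the 125 owned by root and explains why Oracle could not create its own.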
This problem is easily fixed by raising the value on the kernel.sem line in /etc/sysctl.conf. The semaphore array limit (SEMMNI) is the last of the four values on that line. You can view your current limits with the “ipcs -l” command. I plan to write a follow-up post shortly on using SystemTap to diagnose what consumes semaphore arrays during boot.
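As a rough illustration of which field to change, the Python sketch below parses the four kernel.sem values (in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI, as they appear in /proc/sys/kernel/sem) and builds the line for /etc/sysctl.conf with a new array limit. The helper names are hypothetical, the starting values 250 32000 100 128 are assumed as typical for an Oracle host, and 256 is only an example; size the limit from your own path and LUN counts.

```python
from collections import namedtuple

# Field order used by kernel.sem and /proc/sys/kernel/sem.
SemLimits = namedtuple("SemLimits", "semmsl semmns semopm semmni")


def parse_kernel_sem(text):
    """Parse the four whitespace-separated kernel.sem values."""
    values = [int(v) for v in text.split()]
    if len(values) != 4:
        raise ValueError("expected 4 values, got %d" % len(values))
    return SemLimits(*values)


def sysctl_line(limits, new_semmni):
    """Build the kernel.sem line with only the array limit changed."""
    updated = limits._replace(semmni=new_semmni)
    return "kernel.sem = " + " ".join(str(v) for v in updated)


if __name__ == "__main__":
    current = parse_kernel_sem("250 32000 100 128")  # assumed starting values
    print(sysctl_line(current, 256))                 # 256 is an example only
```

After editing /etc/sysctl.conf, “sysctl -p” reloads the file so the new limit applies without waiting for a reboot.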
For reference, see this Red Hat article if you have a valid Red Hat subscription: https://access.redhat.com/knowledge/solutions/23696
Or this article on Oracle Metalink, also with a valid subscription: 949468.1