Microsoft announces their support and contribution to the OpenCompute project

On the eve of the Open Compute Project Summit 2014 (OCP Summit V), which starts tomorrow morning, Microsoft today announced that it is contributing its cloud server designs to the Open Compute Project.

Microsoft OCP Keynote

Interestingly enough, Bill Laing is scheduled to present a keynote tomorrow at the summit. This surprised me, as Microsoft has traditionally been quiet about what kind of equipment it uses to power Azure. That equipment is now officially defined as the Microsoft Cloud Server Platform. This puts Microsoft alongside Facebook as the only cloud service providers to publicly release their server specifications.

Static drive mapping using Open Compute Windmill with CentOS 6.4 and the Open Compute Open Vault (Knox Unit) JBOD

In my previous post about installing CentOS on the Open Compute Windmill servers, all of the testing was done without the OCP Knox Unit attached. Once the Knox Unit was connected, it routinely caused drive mapping issues. For instance, /dev/sda would become /dev/sdb, /dev/sdo or /dev/sdp at reboot, causing the server to hang at boot because it could not find the appropriate filesystem.

The problem was that the megaraid_sas driver was being loaded before the libsas driver, causing the Knox Unit drives to come online before the internal drives. Unfortunately, the result was not consistent enough to simply use /dev/sdb, /dev/sdo or /dev/sdp as the boot drive, since the device name varied depending on which server I was connected to.

After extensive testing, the working solution I came up with is:

1. Blacklist megaraid_sas in the PXE menu as a kernel parameter using: rdblacklist=megaraid_sas

2. Blacklist megaraid_sas in the kickstart file as a kernel parameter using: rdblacklist=megaraid_sas

3. Blacklist megaraid_sas in /etc/modprobe.d/blacklist.conf with blacklist megaraid_sas

4. Load megaraid_sas in /etc/rc.modules with modprobe megaraid_sas

My updated kickstart:

#fix centos bug: comment out any line in grub.conf that contains "speed"
sed -i "s/\(.*\)\(speed\)/#\1\2/g" /boot/grub/grub.conf

#blacklist module from loading
echo "blacklist megaraid_sas" >> /etc/modprobe.d/blacklist.conf

#load module after all other modules
echo "modprobe megaraid_sas" >> /etc/rc.modules
chmod +x /etc/rc.modules
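
One caveat with the snippet above is that `>>` appends a fresh copy of each line every time the script runs. The following is a minimal sketch of an idempotent variant, assuming a POSIX shell; the function names `append_once` and `apply_megaraid_fix`, and the root-prefix argument, are hypothetical helpers introduced here for illustration and are not part of the original kickstart.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE only if an identical line is not already present.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# apply_megaraid_fix ROOT
# ROOT is a hypothetical path prefix; pass "" to act on the live system,
# or a scratch directory to try the function out safely.
apply_megaraid_fix() {
    root=$1
    mkdir -p "$root/etc/modprobe.d"
    # Keep the module from loading during normal module probing.
    append_once "$root/etc/modprobe.d/blacklist.conf" "blacklist megaraid_sas"
    # Load the module explicitly from rc.modules instead.
    append_once "$root/etc/rc.modules" "modprobe megaraid_sas"
    chmod +x "$root/etc/rc.modules"
}
```

In a kickstart %post section this would be called as `apply_megaraid_fix ""`. Running it a second time leaves both files unchanged.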

Open Compute Windmill + Open Compute Open Vault hangs at booting from local disk fix

While installing CentOS 6.4, ESXi and other operating systems, I started running into issues where the systems would run their PXE installations just fine, then hang at "booting from local disk" afterwards. As it turns out, the systems were trying to boot from /dev/sda when /dev/sda was not always where the OS had been installed; sometimes the local SSD would be /dev/sda, sometimes /dev/sdo, and so on. This is due to the mpt_sas driver getting loaded after the megaraid_sas driver.


Open Compute Open Vault Knox Unit Drive Identification

Trying to find a failed drive that does not have a red light on it can be quite challenging with the Open Vault Knox Unit(s) since the drives/slots are not labeled.

After looking around the web and reading the Open Compute Project Open Vault Storage Specification, I found the diagram indicating which drive is which.

Open Vault Knox Unit Drive Layout

That being said, you should also be able to use MegaCLI to locate a drive by blinking its LED. The following command does this for the drive in enclosure 64, slot 0:

./MegaCli -PdLocate -start -physdrv[64:0] -a0

You can also use MegaCLI to list the drives and their status:

./MegaCli pdlist -a0 | grep 'Firmware state'

And you can see the results here, after the drive was replaced.

[root@hadoopnode0 ~]# megacli pdlist -a0 | grep 'Firmware state'
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
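
With this many drives, scanning the list by eye for the odd one out gets tedious. Here is a rough sketch of a filter that prints only the drives whose firmware state is not "Online", together with their enclosure and slot. It assumes the unfiltered pdlist output contains "Enclosure Device ID:" and "Slot Number:" lines ahead of each "Firmware state:" line; the function name `not_online` is made up for this example.

```shell
#!/bin/sh
# not_online
# Read MegaCli pdlist output on stdin and print "enclosure:slot state"
# for every drive whose firmware state does not start with "Online".
not_online() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }    # remember the current enclosure
        /^Slot Number/         { slot = $2 }   # remember the current slot
        /^Firmware state/      {
            if ($2 !~ /^Online/) print enc ":" slot " " $2
        }
    '
}
```

It would be used as `./MegaCli pdlist -a0 | not_online`. The enclosure:slot pair it prints is in the same form as the -physdrv[64:0] argument used with PdLocate earlier.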