Using Puppet to configure LLDPAD on the Open Compute Winterfell

On our OCP Winterfell nodes running CentOS 6, the 10Gb Mellanox NICs show up as eth0 and eth1, while the 1Gb management interface shows up as eth2. We are also using Brocade 10Gb top-of-rack switches, so configuring LLDP was necessary for the servers to advertise themselves to the upstream switches. To do this, we use the LLDPAD package available in the @base CentOS repo.

The next step is to create a Puppet module/manifest to:

  1. Install the LLDPAD RPM from YUM.
  2. Start the LLDPAD service.
  3. Ensure that the LLDPAD service is set to autostart at boot.
  4. Configure eth0 and eth1 to broadcast their LLDP status to the upstream switches.
  5. Ensure that the configuration only runs once, not on every puppet agent run.
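A minimal manifest covering those steps might look like the sketch below. The class layout and the lldptool commands are my assumptions, not the final module; the `unless` guards are one way to satisfy step 5 by making the execs idempotent:

```puppet
# Sketch of a lldpad manifest (module layout and exec commands are assumptions)
class lldpad {
  package { 'lldpad':
    ensure => installed,
  }

  service { 'lldpad':
    ensure  => running,
    enable  => true,               # autostart at boot
    require => Package['lldpad'],
  }

  # Enable LLDP transmit/receive on the 10Gb interfaces; the 'unless' guard
  # keeps the exec from re-running on every puppet agent run.
  exec { 'enable-lldp-eth0':
    command => '/usr/sbin/lldptool set-lldp -i eth0 adminStatus=rxtx',
    unless  => '/usr/sbin/lldptool get-lldp -i eth0 adminStatus | grep -q rxtx',
    require => Service['lldpad'],
  }

  exec { 'enable-lldp-eth1':
    command => '/usr/sbin/lldptool set-lldp -i eth1 adminStatus=rxtx',
    unless  => '/usr/sbin/lldptool get-lldp -i eth1 adminStatus | grep -q rxtx',
    require => Service['lldpad'],
  }
}
```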

Continue reading “Using Puppet to configure LLDPAD on the Open Compute Winterfell”

Microsoft announces their support and contribution to the OpenCompute project

On the brink of Open Compute Project Summit 2014 (OCP Summit V) starting tomorrow morning, Microsoft today announced the contribution of their cloud server designs to the Open Compute Project.

Microsoft OCP Keynote

Interestingly enough, Bill Laing is scheduled to present a keynote tomorrow at the summit. This was surprising to me, as Microsoft has traditionally been quiet about what kind of equipment powers Azure, now officially named the Microsoft Cloud Server Platform. This puts Microsoft alongside Facebook as the only cloud service providers to publicly release their server specifications. Continue reading “Microsoft announces their support and contribution to the OpenCompute project”

Static drive mapping using Open Compute Windmill with CentOS 6.4 and the Open Compute Open Vault (Knox Unit) JBOD

In my previous post about Installing CentOS on the Open Compute Windmill servers, all of the testing was completed without using the OCP Knox Unit. Once connected, the Knox Unit routinely caused drive mapping issues: /dev/sda would become /dev/sdb, /dev/sdo, or /dev/sdp at reboot, causing the server to hang at boot since it could not find the appropriate filesystem.

The problem was that the megaraid_sas driver was loading before the libsas driver, causing the Knox Unit drives to come online ahead of the internal drives. Unfortunately, the ordering was not consistent enough to simply use /dev/sdb, /dev/sdo, or /dev/sdp as the boot drive, since it rotated depending on which server I was connected to.

After a plethora of testing, the working solution I was able to come up with is:

1. Blacklist megaraid_sas in the PXE menu as a kernel parameter using: rdblacklist=megaraid_sas

2. Blacklist megaraid_sas in the kickstart file as a kernel parameter using: rdblacklist=megaraid_sas

3. Blacklist megaraid_sas in /etc/modprobe.d/blacklist.conf with blacklist megaraid_sas

4. Load megaraid_sas in /etc/rc.modules with modprobe megaraid_sas
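For step 1, the kernel parameter goes on the APPEND line of the PXE menu entry. A sketch of what that might look like; the label, image paths, and kickstart URL here are placeholders, not our actual config:

```
# Hypothetical pxelinux.cfg entry; label, paths, and ks URL are placeholders
LABEL centos6-ks
  KERNEL images/centos6/vmlinuz
  APPEND initrd=images/centos6/initrd.img ks=http://pxeserver/ks.cfg rdblacklist=megaraid_sas
```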

My updated kickstart:

#fix centos bug
sed -i "s/\(.*\)\(speed\)/#\1\2/g" /boot/grub/grub.conf

#blacklist module from loading
echo "blacklist megaraid_sas" >> /etc/modprobe.d/blacklist.conf

#load module after all other modules
echo "modprobe megaraid_sas" >> /etc/rc.modules
chmod +x /etc/rc.modules



MegaCLI working examples cheat sheet

MegaCLI is a pain… a real pain. I’m not a fan of the syntax, it’s not easily readable, did I say it was a pain?

Well, our OCP Knox Units are connected via the LSI MEGARAID SAS 9286CV-8E (SGL) controller, so I’ve been using MegaCLI quite a bit lately. Here is a collection of my working examples:

List All Devices

./MegaCli -PDList -aALL | egrep 'Adapter|Enclosure|Slot|Inquiry'

Create raid 1 volume with disk 0 and 1

./MegaCli -CfgLdAdd -r1 '[38:0,38:1]' -a0

Create raid 5 volume with disk 0-14

./MegaCli -CfgLdAdd -r5 '[38:0,38:1,38:2,38:3,38:4,38:5,38:6,38:7,38:8,38:9,38:10,38:11,38:12,38:13,38:14]' -a0 -NoLog

Create individual raid 0 volumes on drives 2-14

for i in $(seq 2 14); do ./MegaCli -CfgLdAdd -r0 [38:${i}] WB RA Cached CachedBadBBU -strpsz512 -a0 -NoLog; done

List all volumes

./MegaCli -LDInfo -Lall -aALL

Create raid 50 volume

./MegaCli -CfgSpanAdd -r50 -Array0[38:0,38:1,38:2,38:3,38:4,38:5] -Array1[38:6,38:7,38:8,38:9,38:10,38:11] Direct RA WB -a0

Bring physical drive 14 online

./MegaCli -PDMakeGood -PhysDrv[38:14] -a0

Show physical drive status

./MegaCli -PDList -a0

Enable JBOD support – this is necessary for vSphere vSAN technology

./MegaCli -AdpSetProp -EnableJBOD 1 -aALL

Set global hot spare

./MegaCli -PDHSP -SET -PhysDrv [38:14] -a0

Delete all volumes

./MegaCli -CfgLdDel -Lall -aAll

Clear foreign configs

./MegaCli -CfgForeign -Clear -aALL

List drive state

./MegaCli -PDList -a0 | grep 'Firmware state'
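When checking many drives at once, it can help to summarize those states rather than eyeball the full listing. A small sketch, using a saved sample file in place of live controller output (the sample lines below are made up, not from our hardware):

```shell
# Summarize drive states from saved `MegaCli -PDList` output.
# sample.txt stands in for real controller output (contents are made up).
cat > sample.txt <<'EOF'
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Unconfigured(good), Spun Up
Firmware state: Failed
EOF

# Count drives per state, most common first
grep 'Firmware state' sample.txt | sort | uniq -c | sort -rn
```

On a live system you would pipe the MegaCli output straight into the same `sort | uniq -c` pipeline instead of going through a file.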

Enable drive location LED

./MegaCli -PdLocate -start -physdrv[64:0] -a0