Nutanix : Physical move ?


Background

Sometimes in the life of a datacenter, you need to move hardware from one location to another. Funnily enough, we do not always anticipate such a move, especially when everything works (very) well. In the case of a Nutanix cluster, the move requires some particular attention.

Moving Nutanix Cluster

The first thing to consider is the most important part of the environment : your workloads. Of course, you need to schedule a maintenance window and then power the VMs off. When the cluster restarts, all VMs will be powered off, so we need to capture their power state BEFORE shutting the cluster down so that we can restore the previous state once the maintenance is over.

I have created a script that generates ready-to-use acli scripts (to be executed from any CVM of the cluster) : it lists the running VMs, then produces one script that powers them off and a second one that powers them back on, restoring their original state. Of course, this only works with the AHV hypervisor.

<?php

include_once "nxCredentials.php";
include_once "nxFramework.php";

$VMs=nxGetVMs($clusterConnect);

$VMState=array();
$i=0;

foreach ($VMs->entities as $item)
{
    // print("VM : ".$item->name."\n");
    // print("\tPower State : ".$item->power_state."\n");
    $VMState[$i]['name']=$item->name;
    $VMState[$i]['state']=$item->power_state;
    $i++;
}

// Build power Off script

print("-> Save the below lines to poweroff.sh\n");
print("#!/bin/bash\n");
print("# Power off script\n");
print("# Run the below in one of the CVMs\n\n");
for ($i=0;$i<count($VMState);$i++)
{
    // Compare against the string 'on'; a bare on is an undefined constant in PHP
    if($VMState[$i]['state']=='on') print("acli vm.off '".$VMState[$i]['name']."'\n");
}
print("\n"); 

// Build power back on script

print("-> Save the below lines to poweron.sh\n");
print("#!/bin/bash\n");
print("# Power on script\n");
print("# Run the below in one of the CVMs\n\n");

$toPowerOn=0;
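for ($i=0;$i<count($VMState);$i++)
{
    // Compare against the string 'on'; a bare on is an undefined constant in PHP
    if($VMState[$i]['state']=='on')
    {
        print("acli vm.on '".$VMState[$i]['name']."'\n");
        $toPowerOn++;
        // Pause 10 seconds after every 10th power-on command
        if(!($toPowerOn % 10)) print("sleep 10\n");
    }
}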

?>

The nxFramework.php script contains only one function of interest, nxGetVMs :

// ------------------------------------------------------------
// Get List of Nutanix VMs
// ------------------------------------------------------------

function nxGetVMs($clusterConnect)
{
    $API_URL="/PrismGateway/services/rest/v2.0/vms/";

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_USERPWD,  $clusterConnect["username"].":".$clusterConnect["password"]);
    curl_setopt($curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_HTTPHEADER, array('Accept: application/json'));
    // Certificate checks are disabled (handy with a self-signed Prism certificate, unsafe on an untrusted network)
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($curl, CURLOPT_URL, "https://".$clusterConnect["ip"].":9440".$API_URL);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

    $result = curl_exec($curl);
    curl_close($curl);

    // curl_exec() returns false on failure, in which case json_decode() yields null
    return(json_decode($result));
}

The nxCredentials.php contains the credentials and IP of the cluster you are connecting to. It should be similar to this :

<?php
  $clusterConnect=array(
  "username" => "username",
  "password" => "password",
  "ip" => "192.168.1.1"
  );
?>

It will generate something like this :

-> Save the below lines to poweroff.sh
#!/bin/bash
# Power off script
# Run the below in one of the CVMs

acli vm.off 'My VM #1'
acli vm.off 'My VM #2'
acli vm.off 'My VM #3'
acli vm.off 'My VM #4'
acli vm.off 'My VM #5'
acli vm.off 'My VM #6'
acli vm.off 'My VM #7'
acli vm.off 'My VM #8'
acli vm.off 'My VM #9'
acli vm.off 'My VM #10'
acli vm.off 'My VM #11'

-> Save the below lines to poweron.sh
#!/bin/bash
# Power on script
# Run the below in one of the CVMs

acli vm.on 'My VM #1'
acli vm.on 'My VM #2'
acli vm.on 'My VM #3'
acli vm.on 'My VM #4'
acli vm.on 'My VM #5'
acli vm.on 'My VM #6'
acli vm.on 'My VM #7'
acli vm.on 'My VM #8'
acli vm.on 'My VM #9'
acli vm.on 'My VM #10'
sleep 10
acli vm.on 'My VM #11'

From a risk mitigation point of view (to reduce as much as possible the risk of power-on operations not being admitted), the script inserts a 10-second pause after every 10 power-on commands. This is on the ultra safe side, but better safe than sorry (Bruno Sousa from Nutanix said it… if you don't agree, feel free to have words with him!).
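The same batching idea can also be applied at run time instead of being baked into a generated script. The sketch below is only an illustration, not the script used above : it reads VM names from a text file (one per line) and pauses after every batch. The function name power_on_in_batches and the ACLI, BATCH_SIZE and PAUSE variables are assumptions introduced for this example.

```shell
#!/bin/bash
# Sketch: power on VMs listed in a file, pausing after every BATCH_SIZE commands.
# ACLI, BATCH_SIZE and PAUSE are illustrative names (assumptions), overridable from the environment.
ACLI="${ACLI:-acli}"
BATCH_SIZE="${BATCH_SIZE:-10}"
PAUSE="${PAUSE:-10}"

power_on_in_batches() {
    local list_file="$1" count=0 name
    while IFS= read -r name; do
        # Ignore empty lines in the list
        [ -z "$name" ] && continue
        # Redirect stdin so the command cannot swallow the rest of the list
        "$ACLI" vm.on "$name" < /dev/null
        count=$((count + 1))
        # Pause once a full batch has been submitted
        if [ $((count % BATCH_SIZE)) -eq 0 ]; then
            sleep "$PAUSE"
        fi
    done < "$list_file"
    echo "Requested power-on for $count VM(s)"
}
```

It would be called as power_on_in_batches running_vms.txt, where running_vms.txt is a hypothetical file listing the VMs that were running before the shutdown.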

Power Off

Once the power-off script has completed and you have validated (either in the Prism UI or with an acli script) that all VMs are down, you can stop the cluster using :

ncli cluster stop
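The check that every VM is down, to be done before the stop command, could be scripted along these lines. This is a sketch under assumptions : it relies on acli vm.list power_state=on printing one header line followed by one line per running VM, which should be verified on your AOS version. count_running_vms, wait_for_all_vms_off and the ACLI variable are names made up for this example.

```shell
#!/bin/bash
# Sketch: wait until no VM is reported as powered on.
# Assumption: the listing starts with a single header line, then one line per VM.
ACLI="${ACLI:-acli}"

count_running_vms() {
    # Skip the header line and blank lines, count what is left
    "$ACLI" vm.list power_state=on | awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

wait_for_all_vms_off() {
    # $1 = number of attempts, $2 = seconds between attempts
    local attempts="${1:-30}" delay="${2:-10}" running
    while [ "$attempts" -gt 0 ]; do
        running=$(count_running_vms)
        if [ "$running" -eq 0 ]; then
            return 0
        fi
        echo "$running VM(s) still powered on, waiting..." >&2
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

A possible use is wait_for_all_vms_off 30 10 && ncli cluster stop, so that the cluster is only stopped once the list of running VMs is empty.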

Next, run shutdown -h now on each CVM. As the final step, issue shutdown -h now on each physical node of the Nutanix cluster. It is then time to remove the power and network cables and move the heavy box(es).

Power On

When you are ready to power the nodes on again, give the cluster some time to bring all Prism services up and running. The CVMs start up automatically. Make sure all network services are available : NTP and DNS are the most critical, since the cluster's entire network stack depends on them.
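Rather than guessing how long to wait, the waiting can be scripted. The sketch below is a minimal illustration and rests on assumptions : retry_until_ok and prism_is_up are hypothetical helper names, and a reply on port 9440 (the Prism port used by the PHP script above) only shows that the Prism gateway answers, not that every cluster service is healthy.

```shell
#!/bin/bash
# Sketch: retry a command until it succeeds or the attempts run out.

retry_until_ok() {
    # $1 = number of attempts, $2 = seconds between attempts, rest = command to run
    local attempts="$1" delay="$2"
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

prism_is_up() {
    # $1 = cluster or CVM IP; succeeds as soon as an HTTPS reply comes back on port 9440
    # --insecure mirrors the disabled certificate checks of the PHP script
    curl --insecure --silent --max-time 5 --output /dev/null "https://$1:9440/"
}
```

For instance, retry_until_ok 60 10 prism_is_up 192.168.1.1 polls every 10 seconds for up to 10 minutes, using the example IP from nxCredentials.php.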

Last but not least, you can run the power-on script generated above and you should be good to go !

Don't forget to confirm that the network at the new location is properly configured to host the cluster in terms of VLANs, port security on the switches, and so on ...
