
Fun and crazy days here at Nutanix. I’ve been busy fielding a lot of calls around our new offering, CPS Standard on Nutanix. Now if you don’t know what CPS is, it stands for Cloud Platform System.
To continue from my last blog post on Exchange...
As I mentioned previously, I support SEs from all over the world. And again today, I got asked about the best practices for running Exchange on Nutanix. Funny enough, this question comes up quite often. Well, I am going to help resolve that. There’s a lot of great info out there, especially from my friend Josh Odgers, who has been leading the charge on this for a long time. Some of his posts can be controversial, but the truth is always there. He’s getting a point across.
On February 16, 2016, Nutanix announced the Acropolis NOS 4.6 release, which became available for download last week. Along with the many enhancements, I wanted to highlight several items, including some tech preview features.
If you missed other parts of my series, check out links below:
Part 1 – NPP Training series – Nutanix Terminology
Part 2 – NPP Training series – Nutanix Terminology
Cluster Architecture with Hyper-V
Data Structure on Nutanix with Hyper-V
I/O Path Overview
Drive Breakdown
To give credit, most of this content was taken from Steve Poitras’s “Nutanix Bible” blog, as his content is the most accurate, and then I put a Hyper-V lean on it. Also, he just rocks…other than being a Seahawks fan :).
As mentioned before (likely numerous times), the Nutanix platform is a software-based solution which ships as a bundled software-plus-hardware appliance. The Controller VM, or what we call the Nutanix CVM, is where the vast majority of the Nutanix software and logic sits, and it was designed from the beginning to be an extensible and pluggable architecture. A key benefit of being software-defined, and not relying upon any hardware offloads or constructs, is extensibility. As with any product life cycle, advancements and new features will always be introduced.
By not relying on any custom ASIC/FPGA or hardware capabilities, Nutanix can develop and deploy new features through a simple software update. This means that a new feature (e.g., deduplication) can be rolled out by upgrading the current version of the Nutanix software. It also allows newer-generation features to be deployed on legacy hardware models. For example, say you’re running a workload on an older version of Nutanix software on a prior-generation hardware platform (e.g., 2400), and that software version doesn’t provide the deduplication capabilities your workload could benefit greatly from. To get them, you perform a rolling upgrade of the Nutanix software version while the workload is running, and you now have deduplication. It’s really that easy.
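If you want to sanity-check the running software version yourself before and after a rolling upgrade, a quick REST call does the trick. Here’s a minimal Python sketch; the Prism v1 endpoint path, the “version” field name, the IP, and the credentials are my assumptions for illustration, not something from the release notes:

```python
# Hedged sketch: read the cluster's running software version through the
# Prism REST API. Endpoint, field names, and credentials are assumed.
import requests

PRISM = "https://10.0.0.50:9440"   # assumed cluster virtual IP
AUTH = ("admin", "password")       # assumed Prism credentials

def cluster_version():
    # The v1 cluster endpoint returns cluster-wide details, including
    # the running software version (field name assumed to be "version").
    resp = requests.get(f"{PRISM}/PrismGateway/services/rest/v1/cluster",
                        auth=AUTH, verify=False)  # lab only: skip TLS verify
    resp.raise_for_status()
    return resp.json().get("version")

print("Running version:", cluster_version())
```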
Similar to features, the ability to create new “adapters” or interfaces into the Distributed Storage Fabric is another key capability. When the product first shipped, it solely supported iSCSI for I/O from the hypervisor; this has since grown to include NFS (for ESXi) and SMB (for Hyper-V). In the future, new adapters can be created for various workloads and hypervisors (HDFS, etc.).
And again, all of this can be deployed via a software update. This is contrary to most legacy infrastructures, where a hardware upgrade or software purchase is normally required to get the “latest and greatest” features. With Nutanix, it’s different. Since all features are deployed in software, they can run on any hardware platform, any hypervisor, and be deployed through simple software upgrades.
The following figure shows a logical representation of what this software-defined controller framework (Nutanix CVM) looks like:

Next up, NPP Training Series – How does it all work – Disk Balancing
Until next time, Rob…
Table 1. Terminology Updates

| New Terminology | Formerly Known As |
| --- | --- |
| Acropolis base software | Nutanix operating system, NOS |
| Acropolis hypervisor, AHV | Nutanix KVM hypervisor |
| Acropolis API | Nutanix API and Acropolis API |
| Acropolis App Mobility Fabric | Acropolis virtualization management and administration |
| Acropolis Distributed Storage Fabric, DSF | Nutanix Distributed Filesystem (NDFS) |
| Prism Element | Web console (for cluster management); also known as the Prism web console; a cluster managed by Prism Central |
| Prism Central | Prism Central (for multicluster management) |
| Block fault tolerance | Block awareness |
Note: You can configure bandwidth throttling only while updating a remote site. This option is not available during the initial configuration of a remote site.
Note: You cannot use an NX-6035C cluster as a backup target with third-party backup software.
Note: Nutanix supports patch upgrades of ESXi hosts to minor versions that are greater than or released after the Nutanix-qualified version, but Nutanix might not have qualified those minor releases. Please see the Nutanix hypervisor support statement in our Support FAQ.
Note: Do not use tech preview features on production systems, on storage used by production systems, or with data stored on production systems.
Note: This feature should be used only after upgrading all nodes in the cluster to Acropolis base software 4.5.
Nutanix has introduced a Prism Central VM which is compatible with AHV to enable multicluster management in this environment. Prism Central now supports all three major hypervisors: AHV, Hyper-V, and ESXi.
The Prism Central VM requires these resources to support the clusters and VMs indicated in the table.
| Prism Central vCPU | Prism Central Memory (GB, default) | Total Storage Required for Prism Central VM (GB) | Clusters Supported | VMs Supported (across all clusters) | Virtual Disks per VM |
| --- | --- | --- | --- | --- | --- |
| 4 | 8 | 256 | 50 | 5000 | 2 |
You can learn more about the Nutanix Cluster Check (NCC) health checks on the Nutanix support portal. The portal includes a series of Knowledge Base articles describing most NCC health checks run by the ncc health_checks command.
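If you want to kick the tires yourself, the full suite is run from any CVM with the ncc health_checks command. Below is a minimal Python wrapper around that command; the “run_all” invocation is the documented way to run everything, but the output parsing here is my own illustration, so treat it as an assumption:

```python
# Minimal sketch: run the full NCC health-check suite from a CVM and surface
# anything that did not pass. Output format parsing is assumed for illustration.
import subprocess

def run_ncc():
    # "ncc health_checks run_all" runs every NCC health check; individual
    # checks can also be invoked by their module path.
    result = subprocess.run(["ncc", "health_checks", "run_all"],
                            capture_output=True, text=True)
    for line in result.stdout.splitlines():
        # Assumed: NCC prints a status keyword per check in its summary.
        if "FAIL" in line or "WARN" in line:
            print(line)

if __name__ == "__main__":
    run_ncc()
```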
NCC 2.1 includes support for the following checks, which are available as a Tech Preview in this release:
| Check Name | Description | KB Article |
| --- | --- | --- |
| check_disks | Check whether disks are discoverable by the host. Pass if the disks are discovered. | KB 2712 |
| check_pending_reboot | Check whether the host has pending reboots. Pass if the host does not have pending reboots. | KB 2713 |
| check_storage_heavy_node | Verify that nodes such as the storage-heavy NX-6025C are running a service VM and no guest VMs, and that such nodes are running the Acropolis hypervisor only. | KB 2726, KB 2727 |
| check_utc_clock | Check whether the UTC clock is enabled. | KB 2711 |
| cluster_version_check | Verify that the cluster is running a released version of NOS or the Acropolis base software. This check returns an INFO status and the version if the cluster is running a pre-release version. | KB 2720 |
| compression_disabled_check | Verify whether compression is enabled. | KB 2725 |
| data_locality_check | Check whether VMs that are part of a cluster with metro availability are in two different datastores (that is, fetching local data). | KB 2732 |
| dedup_and_compression_enabled_containers_check | Check whether any container has deduplication and compression enabled together. | KB 2721 |
| dimm_same_speed_check | Check that all DIMMs have the same speed. | KB 2723 |
| esxi_ivybridge_performance_degradation_check | Check for the Ivy Bridge performance degradation scenario on ESXi clusters. | KB 2729 |
| gpu_driver_installed_check | Check the version of the installed GPU driver. | KB 2714 |
| quad_nic_driver_version_check | Check the version of the installed quad-port NIC driver. | KB 2715 |
| vmknics_subnet_check | Check whether any vmknics share the same subnet (different subnets are not supported). | KB 2722 |
This release includes the following enhancements and changes:
Customers may create a cluster using the new Controller VM-based implementation in Foundation 3.0. Imaging bare metal nodes is still restricted to Nutanix sales engineers, support engineers, and partners.
Until next time, Rob…
Nutanix provides management packs that support using Microsoft System Center Operations Manager (SCOM) to monitor a Nutanix cluster. To facilitate such management packs, SCOM supports standard discovery and data collection mechanisms like SNMP, but also affords vendors the flexibility of native API-driven data collection.
The management packs collect information about software (cluster) elements through SNMP, and about hardware elements through ipmiutil (Intelligent Platform Management Interface utility) and REST API calls, and then package that information for SCOM to digest. Note: The Hardware Elements Management Pack leverages the ipmiutil program to gather fan, power supply, and temperature information from the Nutanix block.
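To give you a feel for the software-element side of that collection, here’s a rough Python sketch of the kind of SNMP poll involved. The OID below is only a placeholder under Nutanix’s enterprise number (41263), not a real NUTANIX-MIB object, and I’ve assumed v2c community auth for brevity; Nutanix clusters normally expose SNMP through v3 users:

```python
# Hedged sketch of an SNMP poll against a Nutanix cluster, roughly what the
# software-element management pack does. OID and community string are assumed.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

CLUSTER_IP = "10.0.0.50"             # assumed cluster virtual IP
OID = "1.3.6.1.4.1.41263.1.0"        # placeholder OID, not a real MIB object

error_indication, error_status, _, var_binds = next(getCmd(
    SnmpEngine(),
    CommunityData("public"),         # assumed v2c community; v3 is typical
    UdpTransportTarget((CLUSTER_IP, 161)),
    ContextData(),
    ObjectType(ObjectIdentity(OID))))

if error_indication:
    print("SNMP poll failed:", error_indication)
else:
    for var_bind in var_binds:
        # Each var_bind pairs the polled OID with its returned value.
        print(" = ".join(x.prettyPrint() for x in var_bind))
```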
After the management packs have been installed and configured, you can use SCOM to monitor a variety of Nutanix objects, including cluster, alert, and performance views, as shown in the examples below. Also, check out this great video produced by my pal @mcghem, who shows a great demo of the SCOM management pack. Kudos, Mike! Also, check out his blog.
Views and Objects Snapshots
Cluster Monitoring Snapshots
In the following diagram views, users can navigate to the components with failures.
The following provides a high-level overview of a Nutanix cluster and its components:
The following sections describe the Nutanix cluster objects monitored by this version of the management packs:
| Monitored Element | Description |
| --- | --- |
| Version | Current cluster version. This is the nutanix-core package version expected on all the Controller VMs. |
| Status | Current status of the cluster, usually either "started" or "stopped". |
| TotalStorageCapacity | Total storage capacity of the cluster. |
| UsedStorageCapacity | Number of bytes of storage used on the cluster. |
| Iops | For performance: cluster-wide average I/O operations per second. |
| Latency | For performance: cluster-wide average latency. |
| Monitored Element | Description |
| --- | --- |
| ControllerVMId | Nutanix Controller VM ID. |
| Memory | Total memory assigned to the CVM. |
| NumCpus | Total number of CPUs allocated to the CVM. |
A storage pool is a group of physical disks from the SSD and/or HDD tier.
| Monitored Element | Description |
| --- | --- |
| PoolId | Storage pool ID. |
| PoolName | Name of the storage pool. |
| TotalCapacity | Total capacity of the storage pool. Note: An alert on a drop in capacity may indicate a bad disk. |
| UsedCapacity | Number of bytes used in the storage pool. |
| Monitored Element | Description |
| --- | --- |
| IOPerSecond | Number of I/O operations served per second from this storage pool. |
| AvgLatencyUsecs | Average I/O latency for this storage pool, in microseconds. |
A container is a subset of available storage within a storage pool. Containers hold the virtual disks (vDisks) used by virtual machines. Selecting a storage pool for a new container defines the physical disks where the vDisks will be stored.
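For context, here’s an illustrative sketch of carving a container out of an existing storage pool with ncli on a CVM. The exact argument names (sp-name, etc.) are from memory and should be treated as assumptions; check ncli container help on your cluster:

```python
# Illustrative sketch only: create a container inside an existing storage pool
# via ncli. Argument names are assumed, not taken from this post.
import subprocess

def create_container(name: str, storage_pool: str):
    # The storage pool chosen here determines which physical disks will
    # back the vDisks stored in the new container.
    subprocess.run(
        ["ncli", "container", "create",
         f"name={name}", f"sp-name={storage_pool}"],
        check=True)

create_container("hyperv-ctr1", "default-sp")  # hypothetical names
```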
| Monitored Element | Description |
| --- | --- |
| ContainerId | Container ID. |
| ContainerName | Name of the container. |
| TotalCapacity | Total capacity of the container. |
| UsedCapacity | Number of bytes used in the container. |
| Monitored Element | Description |
| --- | --- |
| IOPerSecond | Number of I/O operations served per second from this container. |
| AvgLatencyUsecs | Average I/O latency for this container, in microseconds. |
| Monitored Element | Description |
| --- | --- |
| Discovery IP Address | IP address used for discovery of the cluster. |
| Cluster Incarnation ID | Unique ID of the cluster. |
| CPU Usage | CPU usage for all the nodes of the cluster. |
| Memory Usage | Memory usage for all the nodes of the cluster. |
| Node IP address | External IP address of the node. |
| System Temperature | System temperature. |
| Monitored Element | Description |
| --- | --- |
| Disk State/health | Node state as returned by Prism (REST /hosts "state" attribute). |
| Disk ID | ID assigned to the disk. |
| Disk Name | Name of the disk (full path where metadata is stored). |
| Disk Serial Number | Serial number of the disk. |
| Hypervisor IP | IP of the host OS where the disk is installed. |
| Tier Name | Disk tier. |
| CVM IP | IP of the Controller VM that controls the disk. |
| Total Capacity | Total disk capacity. |
| Used Capacity | Total disk capacity used. |
| Online | Whether the disk is online or offline. |
| Location | Disk location. |
| Cluster Name | Name of the cluster the disk belongs to. |
| Discovery IP address | IP address through which the disk was discovered. |
| Disk Status | Status of the disk. |
| Monitored Element | Description |
| --- | --- |
| Node State/health | Node state as returned by Prism (REST /hosts "state" attribute); see the sketch after this table. |
| Node IP address | External IP address of the node. |
| IPMI Address | IPMI IP address of the node. |
| Block Model | Hardware model of the block. |
| Block Serial Number | Serial number of the block. |
| CPU Usage % | CPU usage for the node. |
| Memory Usage % | Memory usage for the node. |
| Fan Count | Total number of fans. |
| Power Supply Count | Total number of power supplies. |
| System Temperature | System temperature. |
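Since both the disk and node tables key off that Prism REST /hosts "state" attribute, here’s a hedged Python sketch of pulling it directly. The v1 path, the "entities" wrapper, and the field names are assumptions on my part and may differ by release:

```python
# Hedged sketch: read the per-node "state" attribute the tables above refer to,
# via Prism's v1 REST API. Path and field names are assumed.
import requests

PRISM = "https://10.0.0.50:9440"   # assumed cluster virtual IP
AUTH = ("admin", "password")       # assumed Prism credentials

resp = requests.get(f"{PRISM}/PrismGateway/services/rest/v1/hosts",
                    auth=AUTH, verify=False)  # lab only: skip TLS verify
resp.raise_for_status()

for host in resp.json().get("entities", []):
    # "name" and "state" are the attributes assumed to drive the health rollup.
    print(host.get("name"), "->", host.get("state"))
```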
| Monitored Element | Description |
| --- | --- |
| Fan number | Fan number. |
| Fan speed | Fan speed in RPM. |
| Element | Description |
| --- | --- |
| Power supply number | Power supply number. |
| Power supply status | Whether the power supply is present or absent. |
If you would like to check out the Nutanix management pack on your SCOM instance, please go to our portal to download the management pack and documentation.
This management pack was developed by our awesome engineering team @ Nutanix. Kudos to Yogi and team for a job well done!!! 😉 I hope I gave you a good feel for Nutanix monitoring using SCOM. As always, if you have any questions or comments, please leave them below….
Until next time….Rob
He takes complex technology subjects and explains them extremely well, on many levels, so everyone understands. He believes in the community…all things that, as technologists, we can strive to achieve.
I recently had the lucky chance to interview him for the Nutanix .Next Community Podcast. It was a great honor to interview him with my colleague and buddy @NutanixTommy, as we both had different points of view.
Symon joined 5nine Software earlier this year as Vice President, Business Development & Marketing, which is how I came to meet him as part of my job in Technical Alliances at Nutanix.
For those of you who are not familiar with 5nine Software, 5nine has a great alternative management product for Hyper-V, with the benefits of simplified, vCenter-type management without the footprint of System Center. They are also the only vendor with an agentless security product that uses the Hyper-V extensible virtual switch. Think vShield for Hyper-V…Very cool… 😎
For those who are not familiar with Symon…a brief history…
With more than 12 years of experience in the high-tech industry, Symon is an internationally recognized expert in virtualization, high-availability, disaster recovery, data center management, and cloud technologies.
As Microsoft’s Senior Technical Evangelist and worldwide technical lead covering virtualization, infrastructure, management, and cloud, he trained millions of IT professionals, hosted the weekly “Edge Show” webcast, holds several patents and dozens of industry certifications, and in 2013 co-authored “Introduction to System Center 2012 R2 for IT Professionals” (Microsoft Press). He graduated from Duke University with degrees in Computer Science, Economics, and Film & Digital Studies.
Enjoy the show……
Until next time, Rob…
To continue the Windows Azure Pack series, here is my next topic: Installing and Configuring Windows Azure Pack.
If you missed other parts of the series, check out the links below:
Part 1 – Understanding Windows Azure Pack
Part 2 – Understanding Windows Azure Pack – Deployment Scenarios
Part 3 – Understanding Windows Azure Pack – How to guide with Express Edition on Nutanix – Environment Prep
Part 4 – Deploying Service Provider Framework on Nutanix
Again, to reiterate from my previous blog posts and set some context, Windows Azure Pack (WAP) includes the following capabilities: