Tools for performance monitoring
As already mentioned, VMware provides many powerful tools to monitor the performance of your vSphere infrastructure. These tools help you diagnose problems with your vSphere hosts and vCenter Server so that you can resolve them. Let's take a look at some of the tools.
Using esxtop/resxtop
The main tool for performance monitoring is esxtop, which collects data based on different metrics, for example, host memory usage, network usage, disk usage/IOPS, and CPU usage.
Esxtop works much like top in Linux. It has the same look and feel and provides the same kind of information. Esxtop is a well-known tool that almost every seasoned system administrator knows about. It can be used for real-time performance monitoring of vSphere hosts, and metrics can be monitored for system interrupts, CPU, network, disk device, disk adapter, and memory, each on a dedicated screen. Real-time monitoring can help you identify different problems, including latency, high utilization, and other errors.
data:image/s3,"s3://crabby-images/3616a/3616a80cb54d8271ea606a309ebf0b72b4ff91f8" alt="Using esxtop/resxtop"
Figure 2.1
There are other tools available from VMware Labs called Flings. Though not officially supported by VMware, these tools have been created by VMware engineers to help end users. If you are a fan of GUI, you can use VisualEsxtop, an enhanced version of esxtop/resxtop that can be used in Windows. You can download it from VMware Labs at https://labs.vmware.com/flings/visualesxtop.
VisualEsxtop provides all the statistics that can be collected using esxtop or resxtop. You can use it to connect to a vSphere host or directly to the vCenter Server. The following is a screenshot of VisualEsxtop:
data:image/s3,"s3://crabby-images/1b327/1b327c0016dd59a35fab70a10fd181ec25cb4af0" alt="Using esxtop/resxtop"
Figure 2.2
Esxtop offers three modes for performance monitoring: interactive, batch, and replay. We will see these modes one by one in detail later in the chapter. For now, all you need to know is that the interactive mode enables live monitoring of vSphere host performance; the batch mode can be used to export data to other tools for offline viewing; and the replay mode lets you replay the performance data gathered by vm-support.
As you will notice, esxtop is an extensive but simple tool, although it takes some time to get familiar with it. First, I will walk you through a step-by-step, hands-on guide to each mode in esxtop, and later I will explain the important metrics used to troubleshoot and tune the performance of vSphere hosts and virtual machines. Let's run esxtop in the interactive mode:
- Connect to a vSphere host using SSH and log in as root or an administrative user.
- At the command prompt, type esxtop without any flags.
- It will take you to a statistics console, as displayed in Figure 2.1.
- Let's examine the different screens presented by esxtop. By default, the first screen that appears displays CPU information. You can use the option c to see this screen.
- Press i to see the Interrupts screen. Press c to go back to the CPU information screen.
- Press m to view the Memory screen. This screen displays detailed information about memory usage. I will walk you through this in a while in this chapter. Press c again to go back to the previous screen.
- To examine network usage, press n. It will show you which ports are being used and will present different statistics about network traffic.
- Press d for detailed information about the disk adapter.
- For disk information, press u, and this will take you to the disk information screen. You can find information about all the storage available to your vSphere host (local, VMFS, iSCSI, and, as of 4.0 Update 2, NFS). It also presents usage, disk reads and writes, and some other information.
- Pressing v will take you to the disk VM screen, where you can find more information about the virtual machine's disk.
- Pressing p will display the CPU Power screen, where CPU power consumption and other power-related statistics can be monitored.
You can also use the batch mode to collect all the metrics and then save them in the .csv format. The captured metrics can be examined later for offline analysis using other tools; for example, they can be loaded into Microsoft's perfmon or esxplot. The default configuration file of esxtop is named .esxtop41rc. You can customize this file according to your preferred list of fields and how they should appear on the screen. The generic esxtop command is written with the following flags:
esxtop [-] [h] [v] [b] [s] [a] [c filename] [R directory path] [d delay] [n iter]
Follow these steps to capture metrics in the batch mode:
- Connect to a vSphere host using SSH and log in as root or an administrative user.
- Edit the /var/spool/cron/crontabs/root file so that you can add a new entry at the end of the current entries. Open it by typing the following in the console:
vi /var/spool/cron/crontabs/root
- Do not delete the existing entries in the file.
- In the file, type the following command:
30 3 * * * esxtop -b -a -d 2 -n 1000 > data.csv
- Save and exit by typing :wq! in vi.
- Once you quit, it will load the new configuration automatically.
- The preceding command will capture the statistics every day at 3:30 A.M. and write them to a file called data.csv. The -d flag sets the delay between samples in seconds, and -n sets the number of iterations esxtop should capture. The preceding command takes a sample every 2 seconds for up to 1,000 iterations, which generates about 33 minutes of data to examine.
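The arithmetic behind the 33-minute figure, and the shape of the crontab entry, can be sketched in a few lines of shell. The helper names capture_seconds and cron_entry are hypothetical (they are not part of esxtop), and the datastore path in the example is only an assumed place to write the file; an absolute output path is used so that the location of the CSV file does not depend on cron's working directory.

```shell
#!/bin/sh
# Hypothetical planning helpers; neither function is part of esxtop.

# capture_seconds DELAY ITERATIONS
# Approximate length of a batch capture: delay (-d) times iterations (-n).
capture_seconds() {
    echo $(( $1 * $2 ))
}

# cron_entry MINUTE HOUR DELAY ITERATIONS OUTFILE
# Print a crontab line that runs a daily esxtop batch capture.
cron_entry() {
    printf '%s %s * * * esxtop -b -a -d %s -n %s > %s\n' "$1" "$2" "$3" "$4" "$5"
}

secs=$(capture_seconds 2 1000)
echo "capture length: ${secs} s (about $(( secs / 60 )) min)"

# /vmfs/volumes/datastore1 is an assumed datastore name, used for illustration only.
cron_entry 30 3 2 1000 /vmfs/volumes/datastore1/data.csv
```

With a 2-second delay and 1,000 iterations, the capture runs for 2,000 seconds, which matches the roughly 33 minutes mentioned above.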
Tip
For vCenter Server 5.5 and later, you can also download ESXtopNGC as a plugin for the vSphere web interface from https://labs.vmware.com/flings/esxtopngc-plugin. This plugin integrates with the vSphere web interface, so you can monitor performance directly from the interface without having to log in to vSphere hosts. You can find further details on how to install the ESXtopNGC plugin for the vSphere web interface in Appendix A, Learning PowerGUI Basics.
Esxplot can be downloaded from https://labs.vmware.com/flings.
The last mode of esxtop is its replay mode. We will use the vm-support tool to capture performance data. We have already seen how to use this tool to collect different logs in vSphere. We will use the vm-support tool with -p to collect vSphere performance data. Collecting performance data using this tool is very similar to collecting performance data in the batch mode with esxtop. We need to set the interval and the length of the performance data collection. You can use the -d switch with the vm-support command to specify the collection duration and the -i switch to define the interval vm-support waits between data collections. To collect performance or diagnostic data using the vm-support command, use the following syntax:
vm-support -p -i 5 -d 10 -w /vmfs/volumes/NFSVol01
/var/log# vm-support -p -d 10 -i 5 -w /vmfs/volumes/NFSVol01
18:37:37: Creating /vmfs/volumes/NFSVol01/esx-crimv3esx002.linxsol.com-2015-03-26--18.37.tgz
18:41:22: Gathering output from /usr/sbin/localcli vm process list
18:41:55: Done.
Please attach this file when submitting an incident report.
To file a support incident, go to http://www.vmware.com/support/sr/sr_login.jsp
To see the files collected, run: tar -tzf '/vmfs/volumes/NFSVol01/esx-crimv3esx002.linxsol.com-2015-03-26--18.37.tgz'
data:image/s3,"s3://crabby-images/1bdee/1bdee194e2e6857cc6658e8d8ca1f2afd970857f" alt="Replaying performance metrics – replay mode"
Figure 2.3
The preceding vm-support command collects performance data every 5 seconds (-i 5) over a collection duration of 10 seconds (-d 10). The execution of vm-support will take a few minutes to complete. Once it is complete, the file is stored in /vmfs/volumes/NFSVol01. The collected metrics are compressed in a tar file to save disk space. Use tar to extract it so we can use it with esxtop:
tar -xzf '/vmfs/volumes/NFSVol01/esx-crimv3esx002.linxsol.com-2015-03-26--18.37.tgz'
Now go into the extracted directory and run the following command:
./reconstruct.sh
The reconstruct.sh script is provided by vm-support in the compressed file. Running it avoids the "all vm-support snapshots have been used" error. Now type esxtop with the -R flag to execute it in replay mode:
esxtop -R esx-crimv3esx002.linxsol.com-2015-03-26--18.37
The preceding command will display the metrics from the provided vm-support file.
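The replay steps can be summarized as a small dry-run script that only prints the commands instead of executing them. This is a sketch under two assumptions: that the bundle unpacks into a directory named after the file without its .tgz extension, and that esxtop -R is run from the parent of that directory. The function names bundle_dir and replay_plan are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that print the replay workflow without running it.

# bundle_dir BUNDLE
# Directory the bundle is assumed to unpack into (file name minus .tgz).
bundle_dir() {
    basename "$1" .tgz
}

# replay_plan BUNDLE
# Print the extract, reconstruct, and replay commands in order.
replay_plan() {
    dir=$(bundle_dir "$1")
    echo "tar -xzf $1"
    echo "cd $dir"
    echo "./reconstruct.sh"
    echo "cd .."
    echo "esxtop -R $dir"
}

replay_plan /vmfs/volumes/NFSVol01/esx-crimv3esx002.linxsol.com-2015-03-26--18.37.tgz
```

Printing the plan first makes it easy to confirm that the directory name passed to esxtop -R matches the bundle that vm-support actually created.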
Tip
In VCSA 6.0, VMware has introduced a new tool called vimtop. This is a powerful tool similar to esxtop/resxtop. You can use this tool to monitor and troubleshoot your VMware vCenter Server 6.0 appliance. You can log in to your VCSA 6.0 appliance using SSH. Then type shell.set --enabled True followed by shell in order to go to the bash shell. Once you are in the bash shell, simply type vimtop to get the tool started. If the shell is not the bash shell, you can change it with the chsh -s "/bin/bash" root command.
Using Windows Performance Monitor
Now we will use the Windows Performance Monitor tool to examine statistics we have gathered by implementing the preceding hands-on guide:
- Transfer the data.csv file to a Windows computer. You can use WinSCP, a free Windows SCP client, to transfer the file.
- Hold down the Windows key on your keyboard and press R.
- In the Run window, type perfmon and press OK or hit the Enter key. It will bring up the perfmon tool's window.
- In the left pane, click on the Performance Monitor option. Click on the second icon in the console pane toolbar. It will open the Source tab of the Properties window for Performance Monitor.
- Select the Log files option under Data source and click on the Add button.
- Select our data.csv file generated by the esxtop tool in the batch mode and click Open. You can also add multiple .csv files.
- If you want, you can narrow the time range of the data you would like to view.
- Then click on the next tab named Data. Click on the Add button; you will see the Add Counters window. Select Physical CPU and Memory counters to be displayed, and click on the Add button. You can choose other counters if you want.
Figure 2.4
- You can change the graph time by clicking on the third icon in the graph area. You can also generate a report of statistics collected by esxtop.
- Once you are done with selecting the file or multiple files, click OK to close the Properties window.
- Right-click on the Performance Monitor display and remove all the counters.
- Click Add and select the desired counters.
Figure 2.5
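Before loading a large capture into perfmon, it can help to check which counters the file actually contains. The following is a minimal sketch that reads only the header row of the CSV file; list_counters is a hypothetical helper, the sample counter names are made up for illustration, and the approach assumes that counter names contain no embedded commas.

```shell
#!/bin/sh
# list_counters FILE [PATTERN]
# Print the column names from the first (header) row of a CSV file, one per
# line, optionally filtered by a case-insensitive pattern.
# Assumption: counter names contain no embedded commas.
list_counters() {
    head -n 1 "$1" | tr ',' '\n' | tr -d '"' | grep -i -- "${2:-.}"
}

# Tiny synthetic file standing in for data.csv; the counter names are invented.
sample=$(mktemp)
printf '%s\n' \
    '"Time","Physical Cpu Util","Memory Free MBytes"' \
    '"03/26/2015 18:37:37","12.5","2048"' > "$sample"

list_counters "$sample" memory   # only columns whose name mentions memory
rm -f "$sample"
```

Running list_counters against the real data.csv with a pattern such as cpu or memory gives a quick idea of which counters to look for in the Add Counters window.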