VCAP6-NV (3V0-643) Study Guide – Part 19. Troubleshoot Common VMware NSX Installation/Configuration Issues

This is part 19 of a series of 20+ blogs I am writing covering the exam prep guide for the VMware Certified Advanced Professional 6 – Network Virtualisation Deployment (3V0-643) VCAP6-NV certification.

At the time of writing there is no VCAP Design exam stream, thus you’re automatically granted the new VMware Certified Implementation Expert 6 – Network Virtualisation (VCIX6-NV) certification by successfully passing the VCAP6-NV Deploy exam.

For previous blogs in this series please refer to the VCAP6-NV Reference Guide I created. It has all the links to VMware NSX content and lists each exam objective with its associated blog. Check it out here: Exam Objective Reference Guide.

This blog covers:

Section 7 – Perform Advanced VMware NSX Troubleshooting
Objective 7.1 – Troubleshoot Common VMware NSX Installation/Configuration Issues

  • Troubleshoot NSX Manager Services
  • Download Technical Support Logs from NSX Manager
  • Troubleshoot Host Preparation Issues
  • Troubleshoot NSX Controller Cluster Status, Roles and Connectivity
  • Troubleshoot Logical Switch Transport Zone and NSX Edge Mappings
  • Troubleshoot Logical Router Interface and Route Mappings
  • Troubleshoot Distributed and Edge Firewall Implementations

 

Another brief blog, getting to the dregs now!

Troubleshoot NSX Manager Services

First I would want to log into the NSX Manager and confirm what services are having issues, assuming you can still log into the appliance.

Log into the NSX Manager by hitting the webpage. For me this is https://labnsx01.lab.local

Click the View Summary tab.

Confirm the following services are started: vPostgres, RabbitMQ and the NSX Management Service, plus the NSX Universal Synchronization Service if you are running cross-vCenter NSX.

If any of these are not running, try starting them and watch the outcome.
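
As a rough checklist, the service check above can be sketched in Python. This is only an illustration: the function name and the "RUNNING" status string are my own assumptions, and you would fill in the statuses by hand from what the Summary tab shows.

```python
# Services named in this post; the last one only matters for cross-vCenter NSX.
CORE_SERVICES = ["vPostgres", "RabbitMQ", "NSX Management Service"]
UNIVERSAL_SYNC = "NSX Universal Synchronization Service"


def services_needing_attention(observed, cross_vcenter=False):
    """Return the required services that are not reported as RUNNING.

    `observed` maps a service name to the status read off the Summary tab.
    The "RUNNING" status string is an assumption made for illustration.
    """
    required = list(CORE_SERVICES)
    if cross_vcenter:
        required.append(UNIVERSAL_SYNC)
    # A service missing from `observed` is treated as not running.
    return [name for name in required
            if observed.get(name, "").upper() != "RUNNING"]
```

Anything the function returns is a service to try starting first.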

To access the NSX Manager logs you need to SSH into the NSX Manager appliance.

There are two logs to have a good look at: (1) the NSX Manager log and (2) the System log.

The commands are show log manager and show log system.

Both commands accept a modifier: follow to continuously update the output, reverse to show the log in reverse order, or last to show only the last n lines.
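
To make the combinations explicit, here is a small Python sketch that builds the command strings described above. The helper name is hypothetical, and I am assuming that last is followed directly by the number of lines; check the CLI help on your appliance for the exact syntax.

```python
LOGS = ("manager", "system")
MODIFIERS = ("follow", "reverse", "last")


def show_log_command(log, modifier=None, lines=None):
    """Build a 'show log ...' command string for the NSX Manager CLI."""
    if log not in LOGS:
        raise ValueError(f"unknown log: {log!r}")
    command = f"show log {log}"
    if modifier is None:
        return command
    if modifier not in MODIFIERS:
        raise ValueError(f"unknown modifier: {modifier!r}")
    if modifier == "last":
        # Assumption: 'last' is followed by the number of lines to show.
        if not isinstance(lines, int) or lines < 1:
            raise ValueError("'last' needs a positive line count")
        return f"{command} last {lines}"
    return f"{command} {modifier}"
```

For example, show_log_command("system", "follow") gives the command used in the second example in this section.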

Below is an example of show log manager:

[Screenshot: output of show log manager]

Below is an example of show log system follow:

[Screenshot: output of show log system follow]

You can also configure Syslog on the NSX Manager, which I have already done (see Specify a Syslog Server in blog 2).

You can see the Syslog settings on the NSX Manager appliance – under the Manage Appliance Settings tab.

Once Syslog is configured, the NSX Manager sends all its logs to that destination.

My syslog server is VMware vRealize Log Insight. If I log into the Log Insight web interface I can locate and search the NSX Manager logs.

Download Technical Support Logs from NSX Manager

When downloaded, the Technical Support Logs are compressed in .gz format. You download these log files and then upload them to VMware Support for analysis.
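
If you want a quick look inside a downloaded bundle before uploading it, something like the Python sketch below can help. It assumes the bundle is a gzip-compressed tar archive, which is my assumption rather than something confirmed here, and the function name is made up for the example.

```python
import tarfile


def list_bundle(path, limit=20):
    """Return up to `limit` member names from a gzip-compressed tar bundle."""
    names = []
    # "r:gz" makes tarfile raise ReadError if the file is not a gzipped tar.
    with tarfile.open(path, "r:gz") as bundle:
        for member in bundle:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

Call it with the path to the file you saved locally. A tarfile.ReadError means the download is not in the format assumed here, or is incomplete.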

Log into the NSX Manager appliance.

Click Download Tech Support Log.

The logs start to be collected.

Click Download to save the log files locally.

The files are downloaded.

You can also download Tech Support Logs for the Edge Services Gateway (ESG) and the NSX Controllers.

Troubleshoot Host Preparation Issues

Read more about host preparation in blog 3 under Prepare a cluster for NSX.

Host preparation installs NSX VIBs and drivers onto ESXi hosts. You want all your hosts to be prepared and ready for service. Sometimes there are issues and you will need to resolve them.

Log into the Web Client.

Click Networking and Security.

Click Installation, then Host Preparation. This will show all vCenter Server clusters and hosts.

Below is a healthy cluster; all hosts have a green tick.

[Screenshot: Host Preparation tab showing a healthy cluster]

If any of the hosts show an error or a Not Ready state, you will need to resolve this.

You might see Not Ready because the VIBs and drivers were uninstalled from the host(s) and a reboot is required.

Check the Communication Channel Health. Run this option at either the cluster or host level.

Check DNS records for hosts are correct, both forward and reverse.
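
The forward and reverse check can be scripted from any machine that uses the same DNS servers. Below is a minimal Python sketch using the standard socket module; the function name is my own, and the resolver functions are passed as parameters purely so the logic can be tried without a live DNS server.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Check that a hostname resolves forward and that the IP maps back to it.

    Returns (ip, reverse_name, ok).
    """
    try:
        ip = forward(hostname)
    except OSError:
        return None, None, False          # forward lookup failed
    try:
        reverse_name = reverse(ip)[0]     # primary name from the reverse lookup
    except OSError:
        return ip, None, False            # reverse lookup failed
    # Compare case-insensitively and ignore a trailing dot.
    ok = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ip, reverse_name, ok
```

Run it against each ESXi host's FQDN (for example a hypothetical esxi01.lab.local). Any result where ok is False points at a missing or mismatched record.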

Make sure Common Components are installed and running on the NSX Manager appliance. Check and restart the RabbitMQ service if more than one host is affected.

Check the ESX Agent Manager, which is responsible for automating vSphere agents. Check this from the Web Client under vCenter Server Extensions.

Troubleshoot NSX Controller Cluster Status, Roles and Connectivity

There are 3 items here: status, roles and connectivity.

Log into the Web Client.

Click Networking and Security.

Click Installation, then Management.

Controller Cluster Status

Make sure all the controllers are healthy and connected.

If you have an issue with a controller you can delete it and recreate it. This is pretty fast: it takes minutes to redeploy.

You can also check the status of the controllers by connecting to them over SSH.

Run the command: show control-cluster status

[Screenshot: output of show control-cluster status]

The above controller is healthy and joined to the cluster.

Here are all the show control-cluster commands:

[Screenshot: list of show control-cluster commands]

Controller Cluster Roles

In any cluster something has to be the boss, the primary or master. To check the roles assigned to an NSX Controller you need to SSH into the controller and run some commands.

Run the command: show control-cluster roles

The controller with YES in the MASTER column is the master for that role.

[Screenshot: output of show control-cluster roles]

The controller shown above is the master for all roles.

If I run the same command on another controller:

[Screenshot: output of show control-cluster roles on another controller]

Controller Cluster Connectivity

Make sure the controllers are connected.

Run the command: show control-cluster connections

This command shows cluster communication. You can see this controller is listening on multiple ports, with 6 connections currently open on port 443, among others.

[Screenshot: output of show control-cluster connections]
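
From outside the controller, a basic reachability test on a listening port such as 443 is a quick sanity check. The Python sketch below is only that: a plain TCP connect with a hypothetical helper name. It does not replace show control-cluster connections and tells you nothing about cluster membership.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, port_open("<controller IP>", 443) from a management workstation; a False result suggests checking firewalls and routing between you and the controller before digging into the cluster itself.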

Troubleshoot Distributed and Edge Firewall Implementations

Remember that the DFW is in the kernel of the hypervisor.

At a minimum, you want to check that the NSX Firewall is enabled on the hosts.

Make sure VMware Tools are installed on the VMs.

I am not going into detail on troubleshooting. That will come from experience: playing with the product, breaking it and learning from it.

And that’s it for this blog!

Blog 20 will cover:

Objective 7.2 – Troubleshoot VMware NSX Connectivity Issues

  • Monitor and Analyze Virtual Machine Traffic with Flow Monitoring
  • Troubleshoot Virtual Machine Connectivity
  • Troubleshoot Dynamic Routing Protocols

Follow me on Twitter or LinkedIn.
