Posted by
Ofer Regev
September 11, 2019

With modern datacenters becoming more and more virtualized, traditional methods of capturing east-west traffic in the datacenter have become increasingly limited. Connecting a TAP to your network or using a SPAN port to capture network traffic is in many cases no longer possible. There are many reasons to capture this traffic, and they are outside the scope of this post, but some of the more common ones are:

  1. Optimizing network performance
  2. Mapping dependencies between servers
  3. Security/intrusion detection
  4. Debugging applications

The Problem

First, in order to understand why the standard methods of collecting traffic are not viable in a virtual environment, we need an example:

In the above example, we have a simple network with 8 virtual machines in 2 different subnets connected to a virtual and physical network infrastructure. Let’s say that we want to get full network visibility into this environment.

At each point marked in the physical network we would only be able to see some of the traffic. For instance, at point E in the network, we would only see traffic crossing between our 2 subnets, because only traffic that has to cross the subnet boundary is sent to the router for layer 3 routing. We would not see any of the internal traffic inside the subnets (e.g. VM 1 communicating with VM 5), since that is simply passed through the switch using layer 2 switching. Also, if we used a TAP or a SPAN port on the physical switch to collect traffic from all connections A-D, we would still miss traffic going between virtual machines on the same host and in the same subnet (e.g. communication between VM 1 and VM 2), because that traffic never leaves the host. There is no way to monitor this traffic, short of installing an agent on each of the VMs. Or is there?
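The reasoning above can be modeled in a few lines of Python. This is only a sketch: the placement of VMs on hosts and subnets is an assumption (the text only implies that VM 1 and VM 2 share a host and a subnet, and that VM 1 and VM 5 share a subnet but not a host), and the `VM` class and `visible_points` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    host: str
    subnet: str


def visible_points(src: VM, dst: VM) -> set:
    """Return the capture points that would see traffic between two VMs."""
    # Every frame passes through a virtual switch on the source host.
    points = {"virtual switch"}
    crosses_subnet = src.subnet != dst.subnet
    # Traffic leaves the host if the peer is on another host, or if it has
    # to reach the physical router to cross the subnet boundary.
    if crosses_subnet or src.host != dst.host:
        points.add("physical switch (A-D)")
    # Only inter-subnet traffic is sent to the router.
    if crosses_subnet:
        points.add("router (E)")
    return points


# Assumed layout, not taken from the original diagram.
vm1 = VM("VM 1", "Host 1", "Subnet 1")
vm2 = VM("VM 2", "Host 1", "Subnet 1")
vm3 = VM("VM 3", "Host 1", "Subnet 2")
vm5 = VM("VM 5", "Host 2", "Subnet 1")

for peer in (vm2, vm5, vm3):
    print(vm1.name, "->", peer.name, sorted(visible_points(vm1, peer)))
```

In this model, the VM 1 to VM 2 flow is visible only at the virtual switch, which is exactly the traffic the rest of this post sets out to capture.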

Full traffic capture on VMware ESX

In this post, we will be looking at how to get the full traffic from all the VMs in our example using VMware ESX. There are two main ways to do this, depending on the types of virtual switches being used.

Virtual port mirroring

The easiest way to get traffic for specific VMs is to use the port mirroring feature built into vSphere Distributed Switches (vDS). If you are using standard switches instead, you can skip to the next part, which describes a method that works for them as well.

In the settings of the vDS, there is a port mirroring tab which allows the configuration of port mirroring sessions. The vDS supports multiple modes of port mirroring, but the simplest of them to use is Encapsulated Remote Mirroring. For this type of mirroring, you specify the source VM(s) for which you would like to get traffic and the destination IP address to send the traffic to. Make sure that you DO NOT add the vmkernel adapters to the sources, since you may cause a loop where all packets are continuously retransmitted. This can cause the entire host to lose network connectivity, after which its network settings will have to be restored.

Each captured packet will have a GRE header prepended to it and will be sent over the existing physical and virtual network infrastructure to the destination address you specified. You can also optionally specify a maximum length to truncate the packets to, and a sampling rate, in order to reduce the amount of traffic sent over the network.
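To get a feel for how much the truncation length and sampling rate reduce the mirrored traffic, a rough back-of-the-envelope estimate can be written as below. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the sampling rate is taken to mean "mirror one out of every N packets", and truncation is applied to an average packet size rather than per packet, so the result is only an approximation.

```python
def mirrored_bits_per_second(packets_per_second, avg_packet_bytes,
                             snap_bytes=None, sampling_rate=1,
                             encap_overhead_bytes=0):
    """Roughly estimate the bandwidth consumed by a mirroring session.

    snap_bytes: maximum mirrored packet length (None means no truncation).
    sampling_rate: assumed to mean "mirror 1 out of every N packets".
    encap_overhead_bytes: extra outer headers per mirrored packet
        (left at 0 here; set it to match your encapsulation).
    """
    if sampling_rate < 1:
        raise ValueError("sampling_rate must be at least 1")
    mirrored_packets = packets_per_second / sampling_rate
    payload = avg_packet_bytes
    if snap_bytes is not None:
        payload = min(avg_packet_bytes, snap_bytes)
    return mirrored_packets * (payload + encap_overhead_bytes) * 8


# Hypothetical workload: 10,000 packets/s averaging 1,000 bytes each.
full = mirrored_bits_per_second(10_000, 1_000)
reduced = mirrored_bits_per_second(10_000, 1_000, snap_bytes=128,
                                   sampling_rate=10)
print(f"full mirror: {full / 1e6:.1f} Mbit/s")
print(f"truncated and sampled: {reduced / 1e6:.3f} Mbit/s")
```

Truncating to headers only is often enough for dependency mapping, while sampling trades completeness for bandwidth, so it is a poor fit for debugging individual connections.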

Note that each packet that you request to mirror in this fashion is duplicated and sent over the network so this can cause significant overhead on your network infrastructure. Also, the receiver will need to strip the GRE header in order to get the original packet data so make sure that whatever you are using to analyze the packets supports GRE.
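If your analysis tool does not understand GRE, stripping the header yourself is not difficult. The following is a minimal sketch that parses the base GRE header as described in RFC 2784 and RFC 2890; the function name is hypothetical. It assumes you have already removed the outer Ethernet and IP headers, and it does not parse any additional mirroring header (such as ERSPAN) that may follow the GRE header depending on the encapsulation type in use.

```python
import struct
from typing import Tuple

GRE_FLAG_CHECKSUM = 0x8000  # C bit: checksum + reserved field present
GRE_FLAG_ROUTING = 0x4000   # R bit from RFC 1701, not handled here
GRE_FLAG_KEY = 0x2000       # K bit: key field present
GRE_FLAG_SEQUENCE = 0x1000  # S bit: sequence number field present
GRE_VERSION_MASK = 0x0007


def strip_gre(gre_packet: bytes) -> Tuple[int, bytes]:
    """Return (protocol_type, payload) for a GRE version 0 packet."""
    if len(gre_packet) < 4:
        raise ValueError("packet too short for a GRE header")
    flags_version, protocol_type = struct.unpack("!HH", gre_packet[:4])
    if flags_version & GRE_VERSION_MASK:
        raise ValueError("only GRE version 0 is handled in this sketch")
    if flags_version & GRE_FLAG_ROUTING:
        raise ValueError("GRE routing field is not handled in this sketch")
    offset = 4
    # Each optional field is 4 bytes long and appears in this order.
    for flag in (GRE_FLAG_CHECKSUM, GRE_FLAG_KEY, GRE_FLAG_SEQUENCE):
        if flags_version & flag:
            offset += 4
    if len(gre_packet) < offset:
        raise ValueError("packet shorter than its GRE header")
    return protocol_type, gre_packet[offset:]


# Example: a GRE header with a sequence number, followed by a dummy payload.
sample = struct.pack("!HHI", GRE_FLAG_SEQUENCE, 0x6558, 1) + b"inner frame"
protocol, payload = strip_gre(sample)
print(hex(protocol), payload)
```

The returned protocol type tells you how to interpret the payload, for example 0x6558 for a bridged Ethernet frame.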

Promiscuous mode network capture

The second way to capture network traffic on a VM without installing anything on the VM itself is by using the promiscuous mode feature on VMware port groups. Promiscuous mode for a port group allows a VM that is connected to that port group to see any traffic that is going through the switch, regardless of whether or not it was addressed to that VM.

To set this up in our sample environment, we will need to configure two things:

  1. Set up port groups with promiscuous mode enabled
  2. Set up VMs to receive the captured traffic

Setting up promiscuous mode port groups

First, we will need to set up our port groups. We will need to set up a single port group for each switch.

For vDS switches, we need to create a distributed port group and change the following:

  1. In the security tab, we will need to set promiscuous mode to Accept

  2. In the VLAN tab, we should set the type to VLAN trunking and set the range to 0-4094 in order to get traffic from all VLANs

For port groups on standard switches, the configuration is similar:

  1. In the Properties tab change the VLAN ID to All (4095)

  2. In the Security tab, for Promiscuous mode, enable override and set to Accept


Setting up VMs to receive the traffic

Once we have completed the configuration of port groups for all of our switches, we will need to create a capture VM on each one of the hosts we want to collect traffic from. Note that each capture VM will only be able to see the traffic on that specific host. Network traffic is never copied between physical hosts, even if promiscuous mode is enabled and the VMs are in the same switch/port group/cluster/etc.

For each capture VM, we will add multiple network interfaces – one for each virtual switch we want to monitor. We will assign one of our promiscuous mode port groups to each of the network interfaces on that VM.
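On the capture VM itself, any standard capture tool can read from these interfaces. As an illustration only, the sketch below reads raw frames with Python's standard `socket` module and prints the Ethernet header of each one. It assumes a Linux guest, root privileges, and that the guest interface has itself been put into promiscuous mode; the interface name `eth1` and the function names are hypothetical.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # Linux constant: receive frames of every protocol


def format_mac(raw: bytes) -> str:
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_ethernet(frame: bytes):
    """Return (destination MAC, source MAC, EtherType) of a raw frame."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), ethertype


def capture(interface: str, count: int = 10) -> None:
    """Print the Ethernet header of `count` frames seen on `interface`.

    Linux only (AF_PACKET) and requires root privileges.
    """
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.ntohs(ETH_P_ALL)) as sniffer:
        sniffer.bind((interface, 0))
        for _ in range(count):
            frame = sniffer.recv(65535)
            dst, src, ethertype = parse_ethernet(frame)
            print(f"{src} -> {dst} ethertype=0x{ethertype:04x}")


if __name__ == "__main__":
    # "eth1" is a placeholder for the NIC attached to a promiscuous
    # mode port group.
    capture("eth1")
```

If the capture works, you should see source and destination MAC addresses that belong to other VMs on the host, not just to the capture VM.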

For our previous example, assuming we were using standard switches, we would create 4 promiscuous mode port groups, 2 on each host (one for each switch). We would then create 2 capture VMs, each with 2 network interfaces, and assign them to the promiscuous mode port groups we configured.
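For larger environments, the same bookkeeping can be written down as a small planning helper. This is a sketch only: the host and switch names, the port group naming scheme, and the `plan_capture` function are all hypothetical, and nothing here talks to vSphere.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CaptureVM:
    name: str
    host: str
    port_groups: List[str]  # one promiscuous mode port group per NIC


def plan_capture(switches_by_host: Dict[str, List[str]]) -> List[CaptureVM]:
    """Plan one capture VM per host, with one NIC per virtual switch."""
    plan = []
    for index, (host, switches) in enumerate(switches_by_host.items(),
                                             start=1):
        port_groups = [f"{host}/{switch}/promiscuous" for switch in switches]
        plan.append(CaptureVM(f"Capture VM {index}", host, port_groups))
    return plan


# The example from this post: 2 hosts with 2 standard switches each.
example = {
    "Host 1": ["vSwitch0", "vSwitch1"],
    "Host 2": ["vSwitch0", "vSwitch1"],
}
for vm in plan_capture(example):
    print(vm.name, "on", vm.host, "with NICs on", vm.port_groups)
```

The number of capture VMs grows with the number of hosts, and the number of NICs per capture VM grows with the number of switches on that host.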

If this is all configured correctly, Capture VM 1 should now be able to see all the traffic from all the VMs on Host 1, and Capture VM 2 should be able to see all the traffic from all the VMs on Host 2. Notably:

  • This is all done without any overhead whatsoever on the physical network.
  • As there is no duplication of any packets, this process is much more efficient than any form of port mirroring.
  • This can all be done on any version of VMware ESX, even the free versions.

Conclusion

In this post, we saw how to gain full visibility into VM network traffic in a VMware environment without any dedicated hardware. In the next part of this series, we will look at how we can gain information on network dependencies between VMs in a much quicker and simpler way if we do not require full capture of the network traffic.