What are the vSAN Trace Files?


This article introduces the vSAN trace files and shows how to customize their maximum size and the location where they are stored.

So, first and foremost, what are the vSAN trace files?

vSAN traces are log files that are used for diagnosing and troubleshooting issues with vSAN. These files can be particularly useful for debugging critical performance issues and data-path-related problems.

By default, vSAN traces are saved to /var/log/vsantraces.
The default maximum file size is 180MB, and there is a rotation of 8 files. This means that when the 9th file is created, the oldest file is deleted, ensuring that only the most recent 8 files are kept.
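
As a rough illustration of what those two defaults imply, the sketch below multiplies the per-file limit by the rotation count. It assumes the 180 MB limit applies to each file as stored on disk and it counts only the main trace files; the variable names are purely illustrative.

```shell
# Defaults quoted above: 180 MB per file, 8 files kept in rotation.
max_file_mb=180
num_files=8

# Upper bound for the main trace files only (other trace types have their own limits).
worst_case_mb=$((max_file_mb * num_files))
echo "Worst-case main trace footprint: ${worst_case_mb} MB"
```

In practice the files may stay well below this bound, so treat the number as a ceiling rather than an expected size.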

vSAN also generates “urgent traces”, which provide details on potentially significant problems. By default, these urgent traces are redirected through the ESXi syslog system. If an external Syslog server is defined, the urgent traces are forwarded to the external collector.

Important
The generation of these trace files is important for maintaining the health and performance of a vSAN system. However, if not managed properly, these trace files can take up a lot of space, especially if the ESXi hosts are running from a RAM disk with a limited size. Therefore, it's important to monitor and manage these files to prevent them from filling up the storage space!

Can I disable the generation of these trace files?

Yes, it’s possible to control the generation of vSAN trace files. However, completely disabling them is not recommended as these files are crucial for diagnosing and troubleshooting vSAN issues. We can manage the generation of these trace files by changing the default settings using the command:

esxcli vsan trace set

We can see all available options by calling the help option:

[root@host01:~] esxcli vsan trace set --help
Usage: esxcli vsan trace set [cmd options]

Description:
  set                   Configure vSAN trace. Please note: This command is not thread safe.

Cmd options:
  -d|--domobjnumfiles=<long>
                        Log file rotation for vSAN trace DOM object files.
  --domobjsize=<long>   Maximum size of vSAN DOM object trace files in MB.
  -l|--logtosyslog=<bool>
                        Boolean value to enable or disable logging urgent traces to syslog.
  --lsom-num-files=<long>
                        Log file rotation for vSAN trace LSOM files.
  --lsom-size=<long>    Maximum size of vSAN LSOM trace files in MB.
  --lsom-verbose-num-files=<long>
                        Log file rotation for vSAN trace LSOM Verbose files.
  --lsom-verbose-size=<long>
                        Maximum size of vSAN LSOM trace verbose files in MB.
  -f|--numfiles=<long>  Log file rotation for vSAN trace files.
  -p|--path=<str>       Path to store vSAN trace files.
  -r|--reset=<bool>     When set to true, reset defaults for vSAN trace files.
  -s|--size=<long>      Maximum size of vSAN trace files in MB.
  -u|--urgentnumfiles=<long>
                        Log file rotation for vSAN trace urgent files.
  --urgentsize=<long>   Maximum size of vSAN urgent trace files in MB.

For example, to stop sending the urgent traces to syslog, the following command can be used:

esxcli vsan trace set -l false

How can I check the vSAN Traces configuration?

We can check the vSAN traces configuration in the file /etc/vmware/vsan/vsantraced.conf.
Access the desired ESXi host by SSH and type the following command:

cat /etc/vmware/vsan/vsantraced.conf | grep -v -E "#|^$"

For example (these are not the default values, because we had already changed them; the text after each # is an annotation added for this article, not part of the real output):

[root@host01:/vsantraces] cat /etc/vmware/vsan/vsantraced.conf | grep -v -E "#|^$"
VSANTRACED_LOG_URGENT_TO_SYSLOG=1                # Enable to send urgent logs to Syslog
VSANTRACED_ROTATE_MAX_FILES=10                   # Maximum number of trace files
VSANTRACED_ROTATE_FILE_SIZE=10                   # Maximum size of each trace file
VSANTRACED_URGENT_ROTATE_MAX_FILES=10            # Maximum number of urgent files
VSANTRACED_URGENT_ROTATE_FILE_SIZE=10            # Maximum size of each urgent file
VSANTRACED_LAST_SELECTED_VOLUME="/vsantraces"    # Directory to store trace files
VSANOBSERVER_MAX_MB_SIZE="10"                    # Maximum size of observer file
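
If we only need one setting rather than the whole file, a small helper can pull it out. The sketch below is an assumption-laden example: conf_value is a hypothetical helper name, and a temporary file stands in for /etc/vmware/vsan/vsantraced.conf so the snippet can run anywhere. It assumes simple KEY=VALUE lines whose values do not themselves contain a # character.

```shell
# Hypothetical helper: print the value of one KEY=VALUE setting from a conf file.
conf_value() {
    # $1 = key name, $2 = file path
    # Keep only the assignment line, then drop any inline comment, quotes and trailing spaces.
    sed -n "s/^$1=//p" "$2" | sed -e 's/#.*//' -e 's/"//g' -e 's/[[:space:]]*$//'
}

# Stand-in for /etc/vmware/vsan/vsantraced.conf, using two values from the example above.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
# comment line that should be ignored
VSANTRACED_ROTATE_MAX_FILES=10
VSANTRACED_LAST_SELECTED_VOLUME="/vsantraces"
EOF

max_files=$(conf_value VSANTRACED_ROTATE_MAX_FILES "$sample_conf")
trace_dir=$(conf_value VSANTRACED_LAST_SELECTED_VOLUME "$sample_conf")
echo "max files: $max_files, directory: $trace_dir"
rm -f "$sample_conf"
```

On a real host, the second argument would be the actual configuration file path instead of the temporary file.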

We can also get the vSAN trace settings from the command line with the following command:

esxcli vsan trace get

For example (the # comments are annotations, not part of the command output):

[root@host01:/vsantraces] esxcli vsan trace get
   VSAN Traces Directory: /vsantraces           # Directory to store trace files
   Number Of Files To Rotate: 10                # Maximum number of trace files
   Maximum Trace File Size: 10 MB               # Maximum size of trace files
   Log Urgent Traces To Syslog: true            # Enable to send urgent trace to Syslog
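
When these values are needed in a script, the "label: value" layout is easy to parse. The following is a minimal sketch, assuming the output format shown above; trace_field is a hypothetical helper, and a variable holding sample text stands in for the real command output.

```shell
# Stand-in for the output of `esxcli vsan trace get` (values from the example above).
sample_output='   VSAN Traces Directory: /vsantraces
   Number Of Files To Rotate: 10
   Maximum Trace File Size: 10 MB
   Log Urgent Traces To Syslog: true'

# Hypothetical helper: print the value that follows "<label>: " in the text on stdin.
trace_field() {
    awk -F': ' -v label="$1" '
        { key = $1; sub(/^[ \t]+/, "", key) }   # trim leading spaces from the label
        key == label { print $2 }
    '
}

num_files=$(printf '%s\n' "$sample_output" | trace_field "Number Of Files To Rotate")
size_mb=$(printf '%s\n' "$sample_output" | trace_field "Maximum Trace File Size" | awk '{ print $1 }')
echo "rotation: $num_files files of up to $size_mb MB"
```

On a host, we would pipe the real command into the helper, for example: esxcli vsan trace get | trace_field "Number Of Files To Rotate".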

How can I check the usage of those files?

We can use the “vdf -h” command to see the current usage of the vSAN traces. In this case, for instance, the current usage is 28% (considering that the maximum size of this ramdisk for vsantraces is 300M in this example):

[root@host01:/vsantraces] vdf -h | grep -i -E "Ramdisk|vsantraces"
Ramdisk                   Size      Used Available Use% Mounted on
vsantraces                300M       86M      213M  28% --
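
To turn this check into a simple alert, we can compare the Use% column with a threshold. This is only a sketch: the 80% threshold is an arbitrary assumption, and the sample line below stands in for the real vdf output.

```shell
# Stand-in for the vsantraces line printed by `vdf -h` (taken from the example above).
sample_line='vsantraces                300M       86M      213M  28% --'

# Hypothetical threshold; pick a value that suits your environment.
threshold=80

# The fifth column is the usage percentage; strip the "%" sign.
used_pct=$(printf '%s\n' "$sample_line" | awk '{ sub(/%/, "", $5); print $5 }')

if [ "$used_pct" -ge "$threshold" ]; then
    status="WARNING"
else
    status="OK"
fi
echo "$status: vsantraces ramdisk at ${used_pct}%"
```

On a host, the sample line could be replaced by the output of vdf -h | grep vsantraces.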

I want to continue using the default directory for the vSAN traces but I need to limit the maximum size of those files. How can I do that?

Suppose that your “partition” (ramdisk) for the vSAN traces is 300 MB in size and you would like to limit those files to a maximum of 200 MB. The following command can achieve it:

esxcli vsan trace set --urgentnumfiles=10 --urgentsize=10 --numfiles=10 --size=10

Explaining the command and its parameters:

  • esxcli vsan trace set = Main command to configure the behavior of vSAN trace files
  • urgentnumfiles = Maximum number of urgent files
  • urgentsize = Maximum size (MB) of urgent files
  • numfiles = Maximum number of trace files
  • size = Maximum size (MB) of trace files

After applying this command, the trace files are expected to use at most 200 MB of the vsantraces ramdisk: 10 trace files of 10 MB each plus 10 urgent files of 10 MB each. That is roughly two thirds of the 300 MB available, comfortably below 90%.
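
The arithmetic behind that 200 MB figure can be sketched as follows. The variable names are illustrative, and the calculation covers only the trace and urgent files configured by this command; the other trace types listed in the help output have their own limits.

```shell
# Values passed to `esxcli vsan trace set` in the example above.
numfiles=10;       size_mb=10
urgentnumfiles=10; urgentsize_mb=10

# Size of the vsantraces ramdisk in this example.
ramdisk_mb=300

budget_mb=$((numfiles * size_mb + urgentnumfiles * urgentsize_mb))
budget_pct=$((budget_mb * 100 / ramdisk_mb))   # integer division, rounds down
echo "Trace budget: ${budget_mb} MB (${budget_pct}% of ${ramdisk_mb} MB)"
```

Changing any of the four values shows right away whether a planned configuration still fits the ramdisk.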

If necessary, we can delete older trace files. We can change to the directory that holds the files and use a loop to select files that match a specific pattern and delete them.
In this case, for instance, we select all files whose names begin with “vsanObserver--2024” and delete them (you should adjust this command to match your scenario):

for f in ./vsanObserver--2024* ; do rm -f "$f" ; done
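
Before deleting anything, it is worth previewing what the pattern matches. The sketch below rehearses the idea in a scratch directory with made-up file names, so it is safe to run anywhere; on a host, the same two steps would be run inside the trace directory.

```shell
# Scratch directory standing in for the trace directory; the file names are made up.
workdir=$(mktemp -d)
touch "$workdir/vsanObserver--2024-02-29T14h22m01s.gz" \
      "$workdir/vsanObserver--2023-12-01T10h00m00s.gz" \
      "$workdir/vsantraces.index"

# Step 1: preview what the pattern matches before deleting anything.
ls "$workdir"/vsanObserver--2024*

# Step 2: delete only the matching files.
rm -f "$workdir"/vsanObserver--2024*

# Count what is left: the 2023 observer file and the index file.
remaining=$(ls "$workdir" | wc -l | tr -d ' ')
echo "files remaining: $remaining"
rm -rf "$workdir"
```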

I want to use an external place to store those files. How can I do that?

We can store the vSAN trace files somewhere other than /var/log/vsantraces, such as a local datastore or an NFS datastore.

In this case, for instance, we will use a local datastore to store these vSAN trace files:

1- The first step here is to create the new directory inside the datastore (you can use the command “df -h” to see all available mount points in your ESXi system):

cd /vmfs/volumes/local-datastore1
mkdir new-vsantraces

2- After creating the directory, set it as the new place to store the vSAN trace files:

esxcli vsan trace set -p /vmfs/volumes/local-datastore1/new-vsantraces/

3- Check that the previous command changed the directory for the vSAN trace files:

esxcli vsan trace get

Example:

[root@host01:/vmfs/volumes/local-datastore1] esxcli vsan trace get
   VSAN Traces Directory: /vmfs/volumes/local-datastore1/new-vsantraces/
   Number Of Files To Rotate: 10
   Maximum Trace File Size: 10 MB
   Log Urgent Traces To Syslog: true

4- This change is applied automatically and does not require restarting any service. We can access the new directory and list its contents; some files are already there:

[root@host01:/vmfs/volumes/local-datastore1] cd new-vsantraces

[root@host01:/vmfs/volumes/local-datastore1/new-vsantraces] pwd
/vmfs/volumes/local-datastore1/new-vsantraces

[root@host01:/vmfs/volumes/local-datastore1/new-vsantraces] ls -l
total 1216
-rw-r--r--    1 root     root         43130 Feb 29 14:40 vsanObserver--2024-02-29T14h22m01s.gz
-rw-r--r--    1 root     root        950256 Feb 29 14:41 vsantraces--2024-02-29T14h38m21s662.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantraces.index
-rw-r--r--    1 root     root            15 Feb 29 14:39 vsantracesDOMObj--2024-02-29T14h38m21s900.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantracesDOMObj.index
-rw-r--r--    1 root     root         38611 Feb 29 14:40 vsantracesIODiag--2024-02-29T14h38m21s775.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantracesIODiag.index
-rw-r--r--    1 root     root           398 Feb 29 14:40 vsantracesLSOM--2024-02-29T14h38m21s821.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantracesLSOM.index
-rw-r--r--    1 root     root            15 Feb 29 14:39 vsantracesLSOMVerbose--2024-02-29T14h38m21s864.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantracesLSOMVerbose.index
-rw-r--r--    1 root     root         17953 Feb 29 14:40 vsantracesUrgent--2024-02-29T14h38m21s725.gz
-rw-r--r--    1 root     root            24 Feb 29 14:38 vsantracesUrgent.index
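
When scripting the steps above, it can help to confirm that the target directory exists and is writable before pointing vSAN at it. This is a minimal sketch: check_trace_dir is a hypothetical helper, a scratch directory stands in for the datastore path, and the esxcli call is left as a comment so the example can run outside an ESXi host.

```shell
# Hypothetical helper: succeed only if the target is an existing, writable directory.
check_trace_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Scratch directory standing in for /vmfs/volumes/local-datastore1/new-vsantraces.
target=$(mktemp -d)

if check_trace_dir "$target"; then
    result="ready"
    # On a real host, this is where the command from step 2 would run:
    #   esxcli vsan trace set -p "$target"
else
    result="not usable"
fi
echo "$target: $result"

# A path that does not exist fails the check.
if check_trace_dir "$target/missing"; then missing="ready"; else missing="not usable"; fi
echo "$target/missing: $missing"
rmdir "$target"
```

The check does not look at free space; on a host, the output of df -h for the datastore is a reasonable thing to review as well.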

Wrapping This Up

Managing these trace files effectively is crucial to prevent issues like a full ramdisk, which could lead to a hung host or the host becoming non-responsive.

Remember, the actual analysis of vSAN trace files typically requires a deep understanding of vSAN internals and is usually performed by VMware Support.

Important:

Starting with vSAN 8 U1, vSAN can store more trace log data resiliently. While it still collects trace files under a local log location, it now periodically dumps the trace file information into a dedicated object. To read more about it, see the following links:
Enhanced Performance Diagnostics tools for vSAN 8 U1 | VMware
Native vSAN Trace Object – virtually Sensei