edited by lunarg on November 8th 2021, at 17:14

Coredumps are used for analysis and debugging when an ESX host "crashes" with a kernel panic. Such a crash is shown as a purple screen (often called a PSOD, by analogy with Windows' "Blue Screen Of Death").

VMware ESX 5.5 introduced the ability to write coredumps to a file instead of a partition, and newer versions support it as well.

To configure this, you need access to the ESX host's CLI: through vSphere Management Assistant (vMA), directly on the host through the console or SSH, or by some other method. You also need "root" access (or the equivalent of it through vMA).

  1. Once logged on, list the VMFS datastores to decide which datastore should hold the coredump files.
    ls -l /vmfs/volumes
    You will see a list of datastore UUIDs as well as symlinks pointing to those UUIDs. Each symlink carries a datastore's logical name, so use the symlinks to work out which UUID belongs to which datastore.
    For example:
    DataStore01 -> 551ae2a4-26d6a84f-9854-b8ac6f118ae6
    DataStore02 -> 551afc82-2baaddc4-57d0-b8ac6f118ae6
    In the example above, to place the coredump on DataStore01, you would use the UUID 551ae2a4-26d6a84f-9854-b8ac6f118ae6.
  2. Once you have the UUID, create the coredump file for the host. The coredump configuration is shared across the cluster, so it's recommended to name the file uniquely (e.g. the hostname of the ESX host).
    esxcli system coredump file add -d DATASTORE_UUID -f FILENAME
    Replace DATASTORE_UUID and FILENAME accordingly. For the filename, you do not need to specify an extension. The command will also automatically create a folder called vmkdump on the root of the specified datastore.
  3. Verify the creation of the file by running the following command:
    esxcli system coredump file list
    This will show a list of all coredump files (in the cluster). At the moment, the newly created coredump file will have both Active and Configured set to false. This is because the host is not yet configured to use the coredump file.
    Note that in a clustered environment, there will probably be more than one entry present. This is perfectly normal, and eventually, when finished configuring coredump files on every host, there should be one file for each host.
  4. Next, set the coredump file so the host actually uses it. Unlike the previous command, this one requires the full path to the coredump file. Copy the path exactly as it appears in the output of the previous command, including any extension that was added to the file name.
    esxcli system coredump file set -p /vmfs/volumes/DATASTORE_UUID/vmkdump/FILENAME
  5. Verify the activation by running esxcli system coredump file list again. Active and Configured should now both be set to true for the new file.
    In a clustered environment, you will see more than just the one coredump file. On any given host, the columns Active and Configured are always false for the coredump files that belong to other hosts.
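
The steps above can be sketched as a small shell script. This is only an illustration, not a tested tool: the function names and the VOLUMES and ESXCLI override variables are hypothetical and exist so the logic can be dry-run away from a real host. It also assumes that awk is available in the host's shell, that the path is the first column of the list output, and that the created file may carry a .dumpfile extension.

```shell
#!/bin/sh
# Hypothetical overrides (not part of esxcli); defaults match a real host.
VOLUMES="${VOLUMES:-/vmfs/volumes}"
ESXCLI="${ESXCLI:-esxcli}"

# Step 1: resolve a datastore's logical name to its UUID via the symlink.
datastore_uuid() {
    target=$(readlink "$VOLUMES/$1") || return 1
    basename "$target"
}

# Steps 2-5: create the dump file, look up its full path, activate it,
# and print the list so the Active/Configured columns can be checked.
configure_coredump_file() {
    uuid=$(datastore_uuid "$1") || { echo "unknown datastore: $1" >&2; return 1; }
    name=$2
    $ESXCLI system coredump file add -d "$uuid" -f "$name" || return 1
    # Assumption: column 1 of the list output is the full path, and the
    # file name may have a .dumpfile extension appended.
    path=$($ESXCLI system coredump file list | awk -v want="$name" -v uuid="$uuid" '
        { n = split($1, parts, "/"); base = parts[n]; sub(/\.dumpfile$/, "", base) }
        base == want && index($1, uuid) && !seen { print $1; seen = 1 }')
    [ -n "$path" ] || { echo "dump file not found in list output" >&2; return 1; }
    $ESXCLI system coredump file set -p "$path" || return 1
    $ESXCLI system coredump file list
}
```

On a host this might be called as configure_coredump_file DataStore01 "$(hostname)", which follows the advice of naming the file after the host. Reading the path back from the list output, rather than building it by hand, avoids guessing at the exact file name.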

To remove a coredump file, use a similar command:

esxcli system coredump file remove -f /vmfs/volumes/DATASTORE_UUID/vmkdump/FILENAME

Note that a specific file must be removed on the host that is actively using it; on any other host you will get a "Device/resource busy" error message. Also, if the file is actively in use, you may need to specify --force to forcefully remove the file (which will actually deactivate it before removing it).
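
As a rough sketch of that removal logic, the helper below checks the Active column before deciding whether --force is needed. The function name and the ESXCLI override variable are hypothetical, and the sketch assumes awk is available and that the list output has the path in the first column and Active in the second. A file shown as inactive may still belong to another host, in which case the plain removal is expected to fail with the busy error described above.

```shell
#!/bin/sh
# Hypothetical override (not part of esxcli), used only for dry runs.
ESXCLI="${ESXCLI:-esxcli}"

# Remove a dump file by full path, adding --force only when this host
# reports the file as active.
remove_coredump_file() {
    path=$1
    # Assumption: column 1 is the path, column 2 is the Active flag.
    active=$($ESXCLI system coredump file list |
        awk -v p="$path" '$1 == p && !seen { print $2; seen = 1 }')
    case "$active" in
        true)  $ESXCLI system coredump file remove --force -f "$path" ;;
        false) $ESXCLI system coredump file remove -f "$path" ;;
        *)     echo "not listed: $path" >&2; return 1 ;;
    esac
}
```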