Scratch location set to /scratch – vSAN node on SD card

Where I work, we kept running into recurring “The ramdisk ‘root’ is full” issues on our vSAN ESXi nodes.

The first thing we did was raise a support call and have VMware check what was filling up the ramdisk.

Support suggested limiting the size of vsantraces to 200MB and pointed us to the KB article below:

https://kb.vmware.com/kb/2150320

This puzzled me, as the vsantraces ramdisk was not full; it was root that sat at 100%:

Ramdisk      Size  Used  Available  Use%  Mounted on
root          32M   32M         0B  100%  --
etc           28M    5M        22M   18%  --
opt           32M  368K        31M    1%  --
var           48M  728K        47M    1%  --
tmp          256M  492K       255M    0%  --
iofilters     32M    0B        32M    0%  --
hostdstats  1553M   17M      1535M    1%  --
snmptraps      1M    0B         1M    0%  --
vsantraces   300M  167M       132M   55%  --
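
To spot the full ramdisk at a glance, a small filter over a table like the one above can help. This is only a sketch: the function name full_ramdisks is made up for illustration, and it assumes the table came from a command such as vdf -h, which the post does not state.

```shell
# full_ramdisks [THRESHOLD]
# Reads a ramdisk table on stdin and prints the name and Use% of every
# ramdisk at or above THRESHOLD percent (default 90).
full_ramdisks() {
    awk -v limit="${1:-90}" '
        $1 == "Ramdisk" { next }            # skip the header row
        NF >= 5 && $5 ~ /^[0-9]+%$/ {       # rows whose 5th column is a percentage
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $1, $5
        }
    '
}

# Possible use on a host (assumption: vdf -h produces the table):
#   vdf -h | full_ramdisks 90
```

Fed the table above, this would report only root, which is what made the vsantraces suggestion look like the wrong lead.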

I kept digging and found that /scratch on the hosts was not a symlink to /tmp/scratch (scratch -> /tmp/scratch); it was a plain directory on / instead, so anything written to scratch landed in the root ramdisk.

drwxr-xr-x 1 root root 512 Sep 30 11:00 scratch
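
The same check can be scripted so it is easy to repeat across hosts. A minimal sketch, assuming readlink is available in the ESXi shell; the function name scratch_target is hypothetical:

```shell
# scratch_target [PATH]
# Reports whether PATH (default /scratch) is a symlink, a plain
# directory, or missing.
scratch_target() {
    path="${1:-/scratch}"
    if [ -L "$path" ]; then
        echo "symlink -> $(readlink "$path")"
    elif [ -d "$path" ]; then
        echo "directory"
    else
        echo "missing"
    fi
}

# On an affected host this would print "directory"; on a healthy one,
# something like "symlink -> /tmp/scratch".
#   scratch_target /scratch
```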

I changed ScratchConfig.ConfiguredScratchLocation under Advanced Settings, but the change did not persist after a reboot.

We raised another call with support and, after escalation to a senior engineer, were pointed to a different KB article:

KB2151209

It seems that after upgrading the hosts with a custom HPE ESXi 6.5U1 image, we ran into the same issue the article describes for the Dell EMC custom image.

I checked for the Emulex (elx) drivers and they were there, even though the card is not in use.

esxcli software vib list | grep elx

elx-esx-libelxima.so  11.2.1238.0-03                    ELX     VMwareCertified  2017-09-04
elxiscsi              11.2.1238.0-1OEM.650.0.0.4598673  EMU     VMwareCertified  2017-09-04
elxnet                11.2.1149.0-1OEM.650.0.0.4240417  EMU     VMwareCertified  2017-09-04
emulex-esx-elxnetcli  11.1.28.0-1.26.5969303            VMware  VMwareCertified  2017-09-04
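
Since grep elx also matches VIBs that stay installed (elxnet and emulex-esx-elxnetcli), a narrower filter can list only the two VIBs that get removed later in this post. This is a sketch; find_suspect_vibs is a made-up name, and it assumes the VIB name is the first column of the esxcli output, as in the listing above.

```shell
# find_suspect_vibs
# Reads `esxcli software vib list` output on stdin and prints the names
# of the two VIBs this post removes, if they are present.
find_suspect_vibs() {
    awk '$1 == "elxiscsi" || $1 == "elx-esx-libelxima.so" { print $1 }'
}

# Possible use on a host:
#   esxcli software vib list | find_suspect_vibs
```

No output would suggest the host is not carrying the affected drivers.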

The solution to the issue is as follows:

Stop hostd (disconnects the host from vCenter)

/etc/init.d/hostd stop
watchdog-hostd: Terminating watchdog process with PID 70699
hostd stopped.
Remove the below drivers

esxcli software vib remove -n elxiscsi -n elx-esx-libelxima.so
Removal Result:
   Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
   Reboot Required: true
   VIBs Installed:
   VIBs Removed: ELX_bootbank_elx-esx-libelxima.so_11.2.1238.0-03, EMU_bootbank_elxiscsi_11.2.1238.0-1OEM.650.0.0.4598673
   VIBs Skipped:

Start hostd (the host reconnects to vCenter)

/etc/init.d/hostd start
hostd started.

Set ScratchConfig.ConfiguredScratchLocation to /tmp/scratch in Advanced System Settings

Reboot the host
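
The steps above could be collected into one script along these lines. Treat it as a sketch rather than a tested procedure: the run helper, the APPLY switch and the fix_scratch name are invented for illustration, and the vim-cmd call is my stand-in for the Advanced System Settings step, which the post performs in the UI. By default it only prints what it would do.

```shell
# run CMD...: print the command by default; execute it only when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

fix_scratch() {
    run /etc/init.d/hostd stop
    run esxcli software vib remove -n elxiscsi -n elx-esx-libelxima.so
    run /etc/init.d/hostd start
    # Assumption: vim-cmd replaces the UI step; hostd may need a moment
    # after starting before this call succeeds.
    run vim-cmd hostsvc/advopt/update ScratchConfig.ConfiguredScratchLocation string /tmp/scratch
    run reboot
}

# Dry run:        fix_scratch
# Real execution: APPLY=1 fix_scratch
```

Keeping the default as a dry run makes it safer to review the exact commands per host before rebooting a vSAN node.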

After the reboot, the issue is resolved and /scratch persistently points to /tmp/scratch:

lrwxrwxrwx    1 root     root            12 Oct  3 15:29 scratch -> /tmp/scratch

Of course, this wouldn’t be an issue if ESXi were installed on magnetic disks, or if scratch were redirected to a datastore on shared storage, but these ESXi nodes form a vSAN cluster where vSAN is the only datastore.

Dimos Gorogias
