vFlash Read Cache Performance Tuning: Block Size

The first step is to identify the average size of the I/O your VM is producing. The cache block size needs to either match or be smaller than the average size of the reads hitting your storage. Anything smaller than the configured block size still uses an entire block. For example, a 4 KB block of data stored in a cache that is set to 8 KB uses an entire 8 KB block, which wastes space. Going too far the other way also has a cost: if your block size is too small, the result is more I/O on the cache and more memory usage.
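To put a number on the wasted space, here is a minimal sketch of the arithmetic. The function name `cache_waste` is made up for illustration, and it assumes an I/O always occupies a whole number of cache blocks, which matches the 4 KB in 8 KB example above but is otherwise my assumption about how the cache behaves.

```shell
# Hypothetical helper: bytes of cache left unused by one I/O.
# Assumes each I/O is rounded up to a whole number of cache blocks.
cache_waste() {
    io=$1      # I/O size in bytes
    block=$2   # cache block size in bytes
    blocks=$(( (io + block - 1) / block ))   # round up to whole blocks
    echo $(( blocks * block - io ))
}

cache_waste 4096 8192   # prints 4096 (half of the 8 KB block is wasted)
cache_waste 8192 8192   # prints 0 (I/O size matches the block size)
```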

To get this we will SSH into the vSphere host and enter the following command:

vscsiStats -l

Capitalization matters here. The output is going to look like:

Virtual Machine worldGroupID: 37710, Virtual Machine Display Name: CSITSF03-VM, Virtual Machine Config File: /vmfs/volumes/5092b3e3-9d63ac88-f387-3c4a92b3a0e4/CSITSF03-VM/CSITSF03-VM.vmx, { Virtual SCSI Disk handleID: 8195 (scsi0:0)}
The important parts are:

worldGroupID: 37710

Virtual SCSI Disk handleID: 8195
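If the host runs a lot of VMs, picking the two IDs out by eye gets tedious. The sketch below pulls them out with awk. The function name `list_vscsi_ids` is hypothetical, and it assumes the output uses the `worldGroupID: N` and `handleID: N` labels shown above and that awk is available on the host.

```shell
# Hypothetical helper: print "worldGroupID handleID" for every virtual disk
# found in `vscsiStats -l` output read from stdin.
list_vscsi_ids() {
    awk '
    {
        # Remember the most recent worldGroupID seen.
        if (match($0, /worldGroupID: [0-9]+/))
            wgid = substr($0, RSTART + 14, RLENGTH - 14)
        # Each handleID is paired with that worldGroupID.
        if (match($0, /handleID: [0-9]+/))
            print wgid, substr($0, RSTART + 10, RLENGTH - 10)
    }'
}

# Sample line copied from the output above.
sample='Virtual Machine worldGroupID: 37710, Virtual Machine Display Name: CSITSF03-VM, Virtual Machine Config File: /vmfs/volumes/5092b3e3-9d63ac88-f387-3c4a92b3a0e4/CSITSF03-VM/CSITSF03-VM.vmx, { Virtual SCSI Disk handleID: 8195 (scsi0:0)}'
printf '%s\n' "$sample" | list_vscsi_ids   # prints: 37710 8195
```

On the host itself this would be used as `vscsiStats -l | list_vscsi_ids`.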

You will need them to start data collection on a virtual disk, like this:

vscsiStats -s -w 37710 -i 8195

After you have done this you can enter the following to get the ioLength data:

vscsiStats -p ioLength -c -w 37710 -i 8195

This will produce comma-delimited output that can be moved into a spreadsheet. But we are mostly here for the following bit of information:

Histogram: IO lengths of Read commands,virtual machine worldGroupID,37710,virtual disk handleID,8195 (scsi0:0) min,6656 max,262144 mean,69696 count,8
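If you would rather not scan the whole dump for that one number, something like the following can pick out the read mean. The function name `read_mean` is hypothetical, and the sketch assumes the `mean,N` field follows the "IO lengths of Read commands" header, either on the same line as shown above or on a later line.

```shell
# Hypothetical helper: print the mean read I/O length (bytes) from
# `vscsiStats -p ioLength -c` output read from stdin.
read_mean() {
    awk '
    # Track whether the current histogram is the read-length one.
    /^Histogram:/ { in_read = ($0 ~ /IO lengths of Read commands/) }
    # The first mean inside that histogram is the one we want.
    in_read && match($0, /mean,[0-9]+/) {
        print substr($0, RSTART + 5, RLENGTH - 5)
        exit
    }'
}

# Sample line copied from the output above.
hist='Histogram: IO lengths of Read commands,virtual machine worldGroupID,37710,virtual disk handleID,8195 (scsi0:0) min,6656 max,262144 mean,69696 count,8'
printf '%s\n' "$hist" | read_mean   # prints 69696
```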

Let the stats collection run long enough to get a decent sample of data. Ideally you want stats that cover a typical workload or work day. In my case, most of the cache hits on this specific server occur between 7:00 am and 9:00 am. Outside of these hours, read cache hits are minimal.

As per the VMware documentation:

Setting the Correct Cache Block Size
As already covered in the section “Performance Tunables,” the cache block size impacts vFRC performance. The
best way to choose the best cache block size is to match it according to the I/O size of the workload. VscsiStats
[9] may be used to find the I/O size in real-time when running the workload. This utility outputs an IOLength
histogram that can be used to find the most dominant I/O size of the workload. The cache block size of vFRC can
be configured to match this value. In general, vFRC performs better if the cache block size either matches or is
less than the I/O size of workloads. However, configuring cache block size to be less than the dominant I/O size
leads to increased memory consumption and more I/Os issued to the cache, possibly resulting in lower
performance.
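Turning an I/O size into a block size setting can be sketched like this. The function name `suggest_block_kb` is hypothetical. It rounds the I/O size down so the block size matches or is less than it, and it assumes the setting must be a power of two between 4 KB and 1024 KB, so check the allowed values for your vSphere version. I feed it the mean here, while the documentation talks about the dominant I/O size from the histogram, so treat the result as a starting point only.

```shell
# Hypothetical helper: largest power-of-two block size (in KB) that does not
# exceed the given I/O size in bytes.
# Assumption: valid block sizes are powers of two from 4 KB to 1024 KB.
suggest_block_kb() {
    bytes=$1
    kb=4   # assumed minimum block size
    while [ $(( kb * 2 * 1024 )) -le "$bytes" ] && [ "$kb" -lt 1024 ]; do
        kb=$(( kb * 2 ))
    done
    echo "$kb"
}

suggest_block_kb 69696   # prints 64 (the 69696-byte mean is about 68 KB)
```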

To stop the collection of stats (and you REALLY need to make sure you do this), enter:

vscsiStats -x -w 37710 -i 8195
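Since forgetting the stop is the easy mistake, one option is to wrap the whole sequence so the stop always runs. This is only a sketch: the function name `collect_iolength` is hypothetical, it uses only the `vscsiStats` options shown in this post, and it is meant to live in a script, because the interrupt handler exits the shell it runs in.

```shell
# Hypothetical helper: start collection, wait, print ioLength data, then stop.
# Intended to be run from a script, not typed into an interactive shell.
collect_iolength() {
    wgid=$1   # worldGroupID from `vscsiStats -l`
    hid=$2    # handleID from `vscsiStats -l`
    secs=$3   # how long to sample, in seconds

    vscsiStats -s -w "$wgid" -i "$hid" || return 1
    # If the script is interrupted while waiting, still stop the collection.
    trap 'vscsiStats -x -w "$wgid" -i "$hid"; exit 1' INT TERM
    sleep "$secs"
    vscsiStats -p ioLength -c -w "$wgid" -i "$hid"
    vscsiStats -x -w "$wgid" -i "$hid"
    trap - INT TERM
}

# Example (two-hour window): collect_iolength 37710 8195 7200 > iolength.csv
```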

Jeremy Tirrell
