The objective of this paper is to measure the performance gains that can be achieved on a Linux server by adding a host-based SSD cache managed by FlashSoft® software. Red Hat Enterprise Linux 6.5 was installed directly on server hardware in a “bare metal” (non-virtualized) configuration. Write-through and write-back cache performance was compared to an all-HDD baseline using a synthetic workload generated and measured by the fio benchmark utility.
The test server was configured as described below:
The benchmark test was conducted multiple times to measure and compare performance of the non-accelerated HDD storage backend and the same HDD backend accelerated using FlashSoft software in write-through and write-back modes of operation.
The testing procedure is described below:
Testing Cached Configurations
To measure the performance of cached configurations consistently, the cache was completely flushed after each individual test and “pre-warmed” immediately before the next benchmark test using the same warmup workload.
RAID Controller: The following settings were used for all storage in these tests:
OS HDD: A single 2TB SATA 7.2K RPM HDD contained the server operating system.
Target HDD: A virtual disk was created using direct attached storage for testing write-through and write-back caching performance. The virtual disk consisted of three 500GB SAS 7.2K RPM HDDs configured using RAID 5 and partitioned as a 100GB data volume for benchmark testing. Additional capacity was left unformatted and unused.
Cache SSD: A single 400GB SAS SSD was configured using RAID 1 and partitioned as a 30GB cache volume for benchmark testing. Additional capacity was left unformatted and unused.
Data to Cache Ratio: The partitioned sizes of data (100GB) and cache (30GB) were selected to emulate the 30% cache-to-dataset ratio commonly encountered in production environments.
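As a quick sanity check on the sizing above, the usable RAID 5 capacity and the cache-to-dataset ratio can be worked out directly. This is a minimal sketch: the standard RAID 5 usable-capacity rule (capacity of n−1 drives) is assumed, and the 100GB/30GB figures come from the configuration described above.

```python
# Sizing sanity check for the test configuration described above.

hdd_count = 3          # three 500GB SAS HDDs in the target virtual disk
hdd_size_gb = 500
raid5_usable_gb = (hdd_count - 1) * hdd_size_gb  # RAID 5 reserves one drive's worth of parity
data_volume_gb = 100   # partitioned data volume used for benchmark testing
cache_volume_gb = 30   # partitioned SSD cache volume

cache_ratio = cache_volume_gb / data_volume_gb

print(f"RAID 5 usable capacity: {raid5_usable_gb} GB")  # 1000 GB, of which 100 GB is partitioned
print(f"Cache-to-dataset ratio: {cache_ratio:.0%}")     # 30%
```

The remaining RAID 5 capacity beyond the 100GB data volume was, as noted above, left unformatted and unused.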
The benchmark tests were executed using the following commands:
Warmup: Prior to each measured test, the cache was warmed up with three sequential passes of the following script to apply a weighted hotspot distribution across two distinct regions of the workload.
[global]
rw=randrw
ioengine=libaio
direct=1
filename=/dev/sdc
randrepeat=0
norandommap
rwmixread=80
thread
runtime=1200
time_based=1
exitall

[job1]
size=10G
iodepth=32
bs=4k
flow=-1

[job2]
offset=10G
iodepth=32
bs=4k
flow=5

; End of Job file
Benchmark Test: After the preconditioning sequence, the performance test was run with the following parameters:
[global]
rw=randrw
ioengine=libaio
direct=1
filename=/dev/sdc
randrepeat=0
norandommap
rwmixread=80
thread
ramp_time=1200
runtime=1200
time_based=1
exitall

[job1]
size=10G
iodepth=32
bs=4k
flow=-1

[job2]
offset=10G
iodepth=32
bs=4k
flow=5

; End of Job file
The benchmark test was configured to run two jobs simultaneously, and the aggregate performance numbers were summed and used for comparison. Job1 was restricted to a 10GB region of the target storage (size=10G) with a higher density of IO traffic (flow=-1). Job2 was assigned to the remainder of the storage (offset=10G) with a lower density of IO traffic (flow=5). The objective was to create an 80/20 split of all IOs across a 10/90 split of the storage – driving 80% of all IO traffic within a 10GB region of disk space. This scenario intensifies the workload IO while at the same time ensuring the entire target storage is utilized during the test.
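The skew described above can be expressed numerically. The sketch below computes the per-gigabyte IO density each region receives under the 80/20-over-10/90 design, and approximates the split implied by the flow weights, assuming fio's token-based flow semantics in which jobs sharing a flow counter run in proportion to the magnitudes of their flow values.

```python
# IO-density arithmetic for the two fio jobs described above.

total_gb = 100               # size of the benchmark data volume
hot_gb = 10                  # job1 region (size=10G)
cold_gb = total_gb - hot_gb  # job2 region (offset=10G)

hot_share = 0.80             # design target: fraction of all IOs in the hot region
cold_share = 1 - hot_share

hot_density = hot_share / hot_gb    # fraction of all IO per GB, hot region
cold_density = cold_share / cold_gb # fraction of all IO per GB, cold region

print(f"Hot-region density:  {hot_density:.4f} of all IO per GB")
print(f"Cold-region density: {cold_density:.4f} of all IO per GB")
print(f"Density ratio (hot/cold): {hot_density / cold_density:.0f}x")  # 36x

# Assumption: flow=-1 vs flow=5 balances the shared flow counter at roughly
# 5 job1 IOs per job2 IO, i.e. about an 83/17 split -- close to the 80/20 target.
approx_hot_share = 5 / (5 + 1)
print(f"Approximate hot share implied by flow weights: {approx_hot_share:.1%}")
```

The 36x density ratio is what makes the 10GB region an effective hotspot for the cache to capture.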
Data measured using FlashSoft software show a noticeable performance improvement over the baseline across all measured parameters. In write-through cache mode, FlashSoft doubled IOPS and halved latency compared to the baseline. Write-back cache mode doubled the performance of write-through mode, providing nearly five times the baseline IOPS and reducing latency by nearly 80%. The corresponding increases in CPU utilization when caching was applied illustrate the ability of FlashSoft software to reduce I/O “bottlenecks” and allow the system to operate more efficiently.
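The reported IOPS and latency figures are consistent with each other. With a fixed number of outstanding IOs (two fio jobs at iodepth=32, 64 in total), Little's law ties throughput to latency as IOPS = outstanding IOs / latency, so a roughly 80% latency reduction implies roughly a 5x IOPS gain. The sketch below uses an illustrative baseline latency, not a measurement from this test.

```python
# Little's law relation between latency and IOPS at fixed queue depth.
# The baseline latency below is illustrative, NOT a measurement from this test.

outstanding_ios = 2 * 32    # two fio jobs, each at iodepth=32
baseline_latency_s = 0.010  # hypothetical 10 ms baseline latency

def iops(latency_s):
    """Little's law: throughput = concurrency / latency."""
    return outstanding_ios / latency_s

base = iops(baseline_latency_s)
writeback = iops(baseline_latency_s * (1 - 0.80))  # ~80% latency reduction

print(f"Baseline IOPS:   {base:,.0f}")
print(f"Write-back IOPS: {writeback:,.0f}")
print(f"Speedup: {writeback / base:.1f}x")  # 5.0x
```

This idealized relation holds at any fixed queue depth, which is why the ~80% latency reduction and the nearly-5x IOPS gain reported above describe the same underlying improvement.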
This test was conducted using synthetic data generated by the fio benchmark utility. The workload was constructed to incorporate a mixed ratio of random reads and writes, operate disparate workload jobs, and use a cache limited in size to 30% of the entire workload while forced to span the entire storage backend. In actual application, performance will vary depending upon the specifics of the workload and system configuration; however, as demonstrated by this “torture test,” FlashSoft software is capable of significantly enhancing application performance by reducing latency between the host and backend storage.
Whether you'd like to ask a few initial questions or are ready to discuss a SanDisk solution tailored to your organization's needs, the SanDisk sales team is standing by to help.