A python program I created is IO bounded. The majority of the time (over 90%) is spent in a single loop which repeats ~10,000 times. In this loop, ~100KB data is generated and written to a temporary file; it is then read back out by another program and statistics about that data collected. This is the only way to pass data into the second program.
Due to this being the main bottleneck, I thought that moving the location of the temporary file from my main HDD to a (~40MB) RAMdisk (inside of over 2GB of free RAM) would greatly increase the IO speed for this file and so reduce the run-time. However, I obtained the following results (each averaged over 20 runs):
- Test data 1: Without RAMdisk – 72.7s, With RAMdisk – 78.6s
- Test data 2: Without RAMdisk – 223.0s, With RAMdisk – 235.1s
It would appear that the RAMdisk is slower that my HDD.
What could be causing this?
Are there any other alternative to using a RAMdisk in order to get faster file IO?
Your operating system is almost certainly buffering/caching disk writes already. It’s not surprising the RAM disk is so close in performance.
Without knowing exactly what you’re writing or how, we can only offer general suggestions. Some ideas:
If you have 2 GB RAM you probably have a decent processor, so you could write this data to a filesystem that has compression. That would trade I/O operations for CPU time, assuming your data is amenable to that.
If you’re doing many small writes, combine them to write larger pieces at once. (Can we see the source code?)
Are you removing the 100 KB file after use? If you don’t need it, then delete it. Otherwise the OS may be forced to flush it to disk.
Answered By – Ken
Answer Checked By – David Marino (BugsFixing Volunteer)