Saturday, November 10, 2007

Testing ... Testing ... 1 2 3

How do you test whether a given defrag program has sped up your system or not? It's all very well to look at the drive layout and say "this looks neat", but how can you tell if the machine boots faster or programs load faster?
While I was analysing the "review" by the 3d Professor, I tried to think of a method of objectively testing the performance of the file system, as improved by a given defrag program. I tried gathering data from the readfile program, but the maths doesn't work correctly. I'm not sure why, and I don't understand the C source code enough to figure it out. Also, readfile only accepts a single wildcard, so you can do *.* or *.exe but not *.exe;*.dll;*.sys for example.
I did an exercise in measuring the time it took to read all the files in the c:\windows\system32 directory, and the results were almost what I expected, but it still wasn't an accurate enough reflection of how a defrag program affects the performance of the system as a whole. Clearly there are some files in the system32 folder that are seldom if ever opened or run, and if the system was optimised for frequently used files, these seldom-used (and therefore slower) files would skew the results.
So I gave up on readfile and decided to write my own program in Visual Basic 6. It's called "Prefetch.exe" or the "Prefetch File Processor" and I'm busy in the initial stages of running tests on an old ThinkPad R31 laptop. The program is freeware and you can examine all the source code as well to see in detail how it works. The idea is that the results should be repeatable on any given system, subject to the limitations of the package.
The program has three stages:
  • In the first stage the "layout.ini" file is interpreted and checked and the results copied to a "layout.txt" file, which can be edited.
  • In the second stage this file is opened, and each file it names is opened and read in turn; the time each read takes is measured in milliseconds.
  • In the third stage the results are saved in a CSV file for further analysis and checking.
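For readers who prefer pseudocode to VB6, the three stages could be sketched like this in Python. This is purely illustrative: the function name `time_reads` and the details are my shorthand for what the program does, not the actual source.

```python
import csv
import time

MAX_BYTES = 2_147_483_646  # VB6's cap: only the first ~2GB of a file can be read

def time_reads(list_path="layout.txt", out_path="timing.csv"):
    """Open each file named in list_path, read it, and record the elapsed
    time in milliseconds, then save the results as CSV (stage three)."""
    with open(list_path, encoding="latin-1") as listing:
        names = [line.strip() for line in listing if line.strip()]
    rows = []
    for name in names:
        start = time.perf_counter()      # timer starts just before the open
        try:
            with open(name, "rb") as f:
                f.read(MAX_BYTES)        # read up to the 2GB cap
        except OSError:
            continue                     # locked system file: can't be timed
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        rows.append((name, round(elapsed_ms, 3)))
    with open(out_path, "w", newline="") as out:
        csv.writer(out).writerows(rows)  # stage three: save for analysis
    return rows
```

The actual program does the same work with VB6 file I/O; the sketch just shows the shape of the measurement loop.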
There are some limitations to this process. The biggest is that Visual Basic 6 can only read the first 2,147,483,646 bytes (2GB) of any given file. That's usually sufficient, and I'm not interested in huge data files anyway. The second limitation is that certain system files are opened and locked by the system, so these can't be timed; this applies equally to all the programs tested, however. The third limitation is that it can only time the opening of a file, not of a folder.
Another problem is that the "layout.ini" file includes some junk files and it keeps changing. That's why I created the "layout.txt" file, and edited it to remove files in the temporary folders, references to cookies, log files, critical updates, and other miscellaneous junk. The test file I am using is here, and hopefully it will deliver a fair test of the system.
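The kind of hand-editing described above could equally be done with a simple filter. Here is a sketch in Python, where the `JUNK` patterns are only examples of the sort of thing I removed, not the exact list:

```python
# Hypothetical junk patterns; the actual entries removed by hand were more varied.
JUNK = ("\\temp\\", "\\cookies\\", ".log", "windowsupdate")

def keep(entry):
    """Return True if a layout.ini entry should survive into layout.txt."""
    return not any(pattern in entry.lower() for pattern in JUNK)

def clean_layout(src_path="layout.ini", dst_path="layout.txt"):
    """Copy src_path to dst_path, dropping the junk entries."""
    with open(src_path, encoding="latin-1") as src, \
         open(dst_path, "w") as dst:
        dst.writelines(line for line in src if keep(line))
```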
The timing starts just before the file is opened, and finishes when the file is closed. The timer works in milliseconds, and does not include the time it takes to read the name of the file from the "layout.txt" file, or the time to save the result in the "timing.csv" file. Later I may extend the program to do some read-write tests, but for now it's read-only.
My Test System
I have created a 5.86GB partition on Penny's old ThinkPad R31 laptop. With Windows XP Professional and IE7 and the .NET 1.1 and 2.0 frameworks and all service packs loaded, there is 1.86GB free space, i.e. 32% free. I then made a full sector-by-sector image backup using the Acronis True Image Home 11 Recovery CD, and this image is stored on another partition of the disk that is not included in the testing. This backup image is fragmented, exactly as it was created during the install process, with absolutely no attempt to defragment the drive in any way. Here are some facts about the system, as reported by JkDefrag in "analyse" mode:
Total disk space: 6,293,757,952 bytes (5.86 gigabytes), 1,536,562 clusters
Bytes per cluster: 4,096 bytes
Number of files: 20,110
Number of directories: 2,464
Total size of analyzed items: 4,268,003,328 bytes (3.97 gigabytes), 1,041,993 clusters
Number of fragmented items: 1,342; 5.94% of all items
Total size of fragmented items: 952,217,600 bytes, 232,475 clusters, 22.31% of all items, 15.13% of disk
Free disk space: 1,262,436,352 bytes, 308,212 clusters, 20.06% of disk
Number of gaps: 4,722
Number of small gaps: 3,978; 84.24% of all gaps
Size of small gaps: 78,368,768 bytes, 19,133 clusters, 6.21% of free disk space
Number of big gaps: 744 (15.76% of all gaps)
Size of big gaps: 1,184,067,584 bytes, 289,079 clusters, 93.79% of free disk space
Average gap size: 65.27 clusters
Biggest gap: 926,806,016 bytes, 226,271 clusters, 73.41% of free disk space
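As a quick sanity check, the percentages in the JkDefrag report can be reproduced from the raw cluster counts (a small Python snippet; the figures are copied straight from the analysis above):

```python
# Figures from the JkDefrag "analyse" report above
total_clusters     = 1_536_562   # whole partition
free_clusters      = 308_212     # free space
gaps               = 4_722       # number of gaps
small_gap_clusters = 19_133
big_gap_clusters   = 289_079

# Free space as a share of the disk: 308,212 / 1,536,562 clusters
free_pct = round(100 * free_clusters / total_clusters, 2)   # 20.06%

# Average gap size: all free clusters spread over all gaps
avg_gap = round((small_gap_clusters + big_gap_clusters) / gaps, 2)  # 65.27
```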
Not chaotic, but hardly optimal. Each program tested will be given a chance to defragment and optimise this data, and once it has done its best, the system will be rebooted and the Prefetch File Processor will read the list of files and time the process. I will also record screen shots of the drive image before and after, and use the JkDefrag analyse log file to note other aspects of the defrag process.
I would welcome any comments or criticisms of this process. Feel free to download the program, run your own tests, and inspect the Visual Basic 6 code. I will document it more fully in the next few days, so the code is easier to read.
Update: My TrueImage backup file is corrupt, and I have to reinstall everything. I'll update the numbers published above once this has been done. Using the WDD defragger improved the read times by 15%, but I will re-run the tests and publish the results in full.
Update: For technical reasons associated with a large bad spot on my drive, I can't do a sector-based backup, only a complete file backup. This is going to complicate matters slightly, but hopefully the results won't be too skewed. It's a lot of work reinstalling XP, not to mention tons of bandwidth during updates.


JSComputerTech said...

I think the best way to ensure that the disk is identical for each test would be to duplicate it using DD for Windows. It's used for forensic backups when the hard disk needs to be absolutely identical. You would need two identical disks to use this method though.

I also think that file access times, while they are interesting to look at for us tech people who want to know every little detail about our computers, are not really a good way to gauge the performance increase from defragging.

I would suggest using SYSmark 07, if you can get your hands on a copy. Since it uses actual applications it should allow I-FAAST to do its thing, provided you run it 4 times a day.

I'd also think the disk cache in Windows should be disabled when performing the tests, to help isolate actual disk performance.

***Other Thoughts***
I noticed in your review of Diskeeper 07 that it had trouble with the compressed database file. Did you use NTFS compression, by right-clicking the file and checking the "Compress contents to save disk space" option?

I have noticed in the past that Diskeeper doesn't seem to be able to defrag any large file that is compressed this way.

I have also noticed that VSS and I-FAAST don't work together when auto defrag is set not to trigger VSS to make new snapshots. The interface will not show that I-FAAST is "Unavailable" until you close and reopen it.

Do you have system restore enabled on the computers you used for the tests?

Thanks for writing your defrag review; it's the first one I have seen that actually provides real facts about how well these programs work.

Donn Edwards said...

Thanks for some great comments!

I don't have a spare disk, so DD isn't going to help, and the source disk is a bit flaky in patches.

How do you disable the disk cache? It seems like a great idea. I'll look into this. At present the caching is set to 32MB, which isn't a lot, and I do a reboot before testing, so every program starts from the same cache state.

The compressed files were NTFS ones as you described.

System restore is enabled, and I expect the defragger to know what to do with the stuff. None of the files tested are in the restore space.

JSComputerTech said...

In Vista and XP you can access the cache settings by going into Device Manager and then opening the properties page for the relevant disk drive; it's located under the Policies tab.

If you have the Intel Matrix Storage Manager installed, the setting is moved to their application. You have to change to the advanced view and then right-click the volume name of the drive you want to change.