
    Journaling-Filesystem Fragmentation Project

    Append Tests




    Description

    The following set of tests tries to measure a worst-case setup for every file system. On a given partition I chose to create a certain number of files in a certain directory structure (the -n and -c options of agesystem). A fixed or variable number of bytes (according to a given distribution) is appended to each file in turn, so that the files continuously grow in size. When a certain percentage of the partition's blocks is used, the partition is copied onto a freshly created partition of the same size. Then the known tests of the gauge section are run on both partitions. In that way I want to compare the performance of the first (fragmented) partition with that of the second (non-fragmented) one.
    In detail (a shell sketch of this loop is shown after the list):
    • Choose a set of file systems to test (e.g. reiser, reiser -o notail and ext2)
    • Choose the directory structure for agesystem (e.g. 1x1)
    • Choose the file size distribution for agesystem (e.g. the standard one for a 1 GB partition) or choose a fixed size to append (-f and -b options for the fixed size, -o option to append)
    • Loop over the file systems to test
      • Format partition one to append on
      • Loop over the -u option (usage) of 25, 50, 75 and 99 percent for agesystem
        • Run agesystem with the set of chosen options
        • Format partition two and copy partition one onto it (cp -R . /partition2)
        • Run fibmap to get the written block layout on both partitions
        • Run read to measure the read performance on both partitions
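
    Written down roughly, the loop looks like the sketch below. This is only an illustration of the procedure described above, not the actual apptest.sh from the download: device names, mount points, mkfs commands and the exact argument syntax of agesystem, fibmap and read are placeholders; only the option letters named above and the order of the steps follow the description.

        #!/bin/sh
        # Sketch of the append-test loop (see apptest.sh in the package for the real driver).
        P1=/dev/sdb1; M1=/mnt/aged     # placeholder: partition that is appended on
        P2=/dev/sdb2; M2=/mnt/fresh    # placeholder: fresh partition of the same size

        for FS in ext2 reiserfs; do
            mkfs -t $FS $P1 && mount -t $FS $P1 $M1     # format partition one
            for USAGE in 25 50 75 99; do
                # append fixed 4 KB chunks to 5000 files in a 1x1 directory tree
                # until USAGE percent of the blocks are in use (argument syntax is a placeholder)
                ./agesystem -n 5000 -c 1x1 -f -b 4096 -o -u $USAGE $M1
                # format partition two and copy partition one onto it
                mkfs -t $FS $P2 && mount -t $FS $P2 $M2
                (cd $M1 && cp -R . $M2)
                # block layout and read performance on both partitions
                ./fibmap $M1 && ./fibmap $M2
                ./read $M1 && ./read $M2
                umount $M2
            done
            umount $M1
        done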

    Download

    You may download the necessary files and try it yourself. After downloading the package, create a directory in which to unpack it. You probably need to adjust some paths; then type ''make agesystem; make fibmap; make read'' and look for the script ''apptest.sh'', which shows how to set up the options described above. If you would like to get some graphs out of the resulting output, send the output files to me (tar-gzipped, please!).
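
    The steps from the paragraph above, written out as a rough sketch (the archive name is only a placeholder for the actual download):

        mkdir appendtests && cd appendtests
        tar xzf ../package.tar.gz            # placeholder name for the downloaded package
        # adjust paths in the Makefile and in apptest.sh if necessary, then:
        make agesystem; make fibmap; make read
        sh apptest.sh                        # runs the loop with the options described above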



    Measurements on 2.4.8-ac1

    Test System

    The hardware and settings are the same as in the gauge section.

    • AMD Duron 650 MHz, 128 MB RAM, 40 GB EIDE hard disk for the system; the tests were taken on:
      • Adaptec AHA-2940U2/W host adapter
      • 9 GB IBM DNES-309170W SCSI hard disk
    Due to lack of simulation time, problems with the Linux 2.4 VM, and no time to patch another kernel every day, the file systems tested are ReiserFS with and without tails and Ext2 on Alan's 2.4.8-ac1 kernel.
    • Software: SuSE 7.2 (gcc 2.95-3), kernel 2.4.8-ac1, restricted to 32 MB memory without swap.



    Graphical Output Explanation

    The shown graphical output is divided into 6 parts and reads as follows:
    • The first part consists of the legend and some remarks describing the graph as a whole.
    • The second part shows the absolute write performance in KB per second from the agesystem output file for 25%, 50%, 75% and 100% usage. When the partition is copied --usually there is a remark ''-copy'' somewhere in the legend-- the shown performance is the total number of bytes copied divided by the total runtime of the ''cp -R . /partition2''. Keep that difference in mind when comparing the two numbers, if both are shown in the same graph.
    • The third part shows the read result for the different usage sizes. For each file system and usage you see two small bars. The first is the absolute read performance in KB per second, measured with the timer starting just before the read system call and stopping right after it while recursively walking through all files in all directories. The second small bar should be lower than the first. It includes the time needed to traverse all files in all directories plus reading them, i.e. the timer is started before all file reads and stopped after the last file is read. Ideally, for a read test the bars should not differ too much, with the second bar smaller than the first.
    • The fourth part measures the random read performance when the files are randomly chosen and then sequentially read. The first run of read stores the file names and paths in memory, so that this time no stat or readdir overhead is needed. Again there are two columns for the random read performance, which should now be ''more equal'', because there is a smaller difference between the total accumulated read time and the total running time of the random read test than when recursively walking through the directories stat'ing each file.
    • The fifth part shows some metadata performance. The first small bar is the number of files stat'ed (and read) during the recursive walk of read, divided by the time difference mentioned above, i.e. the total run time of read minus its accumulated read time, which is mainly the time needed to open and read the directories plus stat'ing the files in them. The second bar is the number of files found while fibmap recursively walks through all directories, stat'ing (and fibmap'ing) all files but not reading them, divided by its running time. Again, the two bars should be of the same order of magnitude, with the second lower than the first. At least I thought that before the measurements...
    • The sixth part shows the internal fragmentation (first small bar), the external fragmentation (second small bar) and the fragmentation path results (third small bar) reported by fibmap. If the maximum frag path shown is greater than one (its minimum value), its logarithm is plotted, which is marked on the left (it says ''PatL'' instead of ''Path'').
    The printed table values and units are understood as follows: In general the first value of a table entry corresponds to the fragmented partition, the second to the copied partition, and a third, if shown, to the ratio of the two (mostly times 100 to give the percentage).
    • Read: The units of the first two values are KB per second, the third is the percentage.
    • Random Read: The units of the first two values are KB per second, the third is the percentage.
    • Write: Units are KB per second.
    • Fragmentation: Absolute values for the first two rows, the third is the percentage, except for the frag path, where it is simply the ratio.
    • Runtime: Total running time of writing (appending or copying) and the three tests. Units are seconds.
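
    As a worked example with made-up numbers: a Read entry of 1200 KB/s on the fragmented partition and 4800 KB/s on the copied partition would be listed with a third value of 100*1200/4800 = 25%, e.g.:

        echo 1200 4800 | awk '{ printf "%.0f %%\n", 100*$1/$2 }'   # prints: 25 %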


    Fixed Append Size Results

    • Partition Size: 1 GB
    • File systems: Reiser, Reiser with -o notail and Ext2
    • Fixed append sizes: 4K, 12K, 64K
    • Number of total files: 1000, 5000, 10000, 25000
    • Directory structures: 1x1
    The result is quite clear: All tested systems suffer from high fragmentation (up to 90% performance loss). The fragmentation gets higher for fewer files with smaller appends, judging from the random read test. In all but one case Reiser or Reiser without tails shows smaller performance losses than Ext2 in the read tests (not regarding absolute values, though).

    The following links show results and graphs for the different levels of partition fullness:


    Low Statistics (but better than no statistics at all)

    Some statistical results for 5000 files taken over three samples (only):
    See the preallocation section below for an explanation of the FRead row of the graphs.


    Variable Append Size Results



    With Preallocation

    To improve the performance of the file system in which the files are appended one by one, in turn, by a fixed amount of bytes, I set up the same configuration as described above. The difference is that the files are first preallocated to their expected length and then truncated to the first block before the appending starts. Using this trick I hope to give the file system in question a hint about the block layout. The expectation is that the fragmentation gets lower and at least the read performance higher.
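
    A minimal shell sketch of the idea behind the trick (in the tests themselves agesystem performs the preallocation; the path, the expected size and the 4 KB block size here are placeholders):

        FILE=/mnt/aged/somefile      # placeholder path
        EXPECTED_KB=640              # placeholder: expected final size of the file in KB
        # preallocate the file to its expected length ...
        dd if=/dev/zero of=$FILE bs=1k count=$EXPECTED_KB 2>/dev/null
        # ... then cut it back to the first block (4096 bytes) before the appends start;
        # the placement of that first block is the layout hint mentioned above
        truncate -s 4096 $FILE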

    Graphical Output Explanation

    The shown output is the same as the output above. Only the fifth row differs.
    • The fifth row is the FRead performance. It is the read performance when the files are read in the order given by their first logical block numbers, sorted in ascending order. Files with a logical first block number of zero (e.g. tails) are omitted (see also the read section).

    Fixed Append Size Results (with prealloc)

    • Partition Size: 1 GB
    • File systems: Reiser, Reiser with -o notail and Ext2
    • Fixed append sizes: 4K, 12K, 64K
    • Number of total files: 5000
    • Directory structures: 1x1
    The result is as follows: Reiser and Reiser with -o notail gain a factor of two to three in their read performance compared to the non-preallocation case. Strangely, Ext2 does not get a significant speed-up from preallocation. As expected, the difference gets smaller for larger append sizes.


    Measurements on 2.4.10

    Test System

    The hardware and settings are the same as in the gauge section.

    • AMD Duron 650 MHz, 128 MB RAM, 40 GB EIDE hard disk for the system; the tests were taken on:
      • Adaptec AHA-2940U2/W host adapter
      • 9 GB IBM DNES-309170W SCSI hard disk
    The file systems tested are ReiserFS with and without tails and Ext2 on the standard 2.4.10 kernel.
    • Software: SuSE 7.2 (gcc 2.95-3), kernel 2.4.10, restricted to 32 MB memory without swap.


    Graphical Output Explanation

    The shown output is the same as the output above. Only the fifth row differs.
    • The fifth row is the FRead performance. It is the read performance when the files are read in the order given by their first logical block numbers, sorted in ascending order. Files with a logical first block number of zero (e.g. tails) are omitted (see also the read section).

    Fixed Append Size Results

    • Partition Size: 1 GB
    • File systems: Reiser, Reiser with -o notail and Ext2
    • Fixed append sizes: 4K
    • Number of total files: 1000, 5000, 10000
    • Directory structures: 1x1, 1x1000
    The result is quite clear: All tested cases suffer from high fragmentation (up to 90% performance loss). There is no significant difference between the different directory structures.

    The following links show the 1x1 results and graphs for the different levels of partition fullness:
    The following links show the 1x1000 results and graphs for the different levels of partition fullness:

    With Preallocation

    To improve the performance of the file system in which the files are appended one by one, in turn, by a fixed amount of bytes, I set up the same configuration as described above. The difference is that the files are first preallocated to their expected length and then truncated to the first block before the appending starts. Using this trick I hope to give the file system in question a hint about the block layout. The expectation is that the fragmentation gets lower and at least the read performance higher.

    Graphical Output Explanation

    The shown output is the same as
    the output above.

    Fixed Append Size Results (with prealloc)

    • Partition Size: 1 GB
    • File systems: Reiser, Reiser with -o notail and Ext2
    • Fixed append sizes: 4K
    • Number of total files: 1000, 5000, 10000
    • Directory structures: 1x1, 1x1000
    The result is as follows: Reiser and Reiser with -o notail gain a factor of up to two in their read performance compared to the non-preallocation case. Strangely, Ext2 does not get a significant speed-up from preallocation.
    The following links show the 1x1 results and graphs for the different levels of partition fullness:

    From the 1x1000 case one can deduce that putting only one file into each directory, which is then subsequently appended to, does not give a significant performance gain from the preallocation trick. With 5 or 10 files per directory, the factor-of-two performance gain from the preallocation trick is back.
    The following links show the 1x1000 results and graphs for the different levels of partition fullness: