Alvin Lucillo

Data width affects index size

/ 3 min read

I ran a small test to see how the width of indexed values affects index storage. The Go program creates two indexes on index_size_repro.events and inserts 300K documents, each with a unique SHA256 and a unique SHA512 value. MongoDB also creates a default index, _id_, on the unique identifier. For context, an ObjectID is 12 bytes, a SHA256 digest is 32 bytes raw (64 characters as a hex string), and a SHA512 digest is 64 bytes raw (128 characters as a hex string).
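To double-check those widths, here is a minimal Go sketch that hashes a sample payload and prints the raw and hex-encoded lengths. The helper name digestWidths and the payload "event-1" are made up for illustration and are not part of the test program. The post does not say whether the hashes were stored raw or as hex strings, so the sketch reports both.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// digestWidths is a hypothetical helper: it returns the raw byte length and
// the hex-string length of the SHA-256 and SHA-512 digests of payload.
func digestWidths(payload []byte) (raw256, hex256, raw512, hex512 int) {
	s256 := sha256.Sum256(payload) // [32]byte
	s512 := sha512.Sum512(payload) // [64]byte
	return len(s256), len(hex.EncodeToString(s256[:])),
		len(s512), len(hex.EncodeToString(s512[:]))
}

func main() {
	// "event-1" is an arbitrary sample payload; digest widths do not depend on it.
	r256, h256, r512, h512 := digestWidths([]byte("event-1"))
	fmt.Printf("sha256: %d bytes raw, %d hex characters\n", r256, h256)
	fmt.Printf("sha512: %d bytes raw, %d hex characters\n", r512, h512)
}
```

Hex encoding doubles the width of each value, so the choice of encoding likely matters as much as the choice of hash when the field is indexed.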

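	db := client.Database("index_size_repro")
	coll := db.Collection("events")

	// Create one ascending single-field index per hash field.
	// The default _id_ index is created automatically.
	_, err = coll.Indexes().CreateMany(ctx, []mongo.IndexModel{
		{
			Keys:    bson.D{{Key: "sha256", Value: 1}},
			Options: options.Index().SetName("sha256_1"),
		},
		{
			Keys:    bson.D{{Key: "sha512", Value: 1}},
			Options: options.Index().SetName("sha512_1"),
		},
	})
	if err != nil {
		log.Fatalf("create indexes: %v", err)
	}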

As the data grows, total index storage overtakes collection data storage and reaches about 180% of it, almost twice the size, at 300K documents. Other notes:

  1. The default _id_ index is relatively small: only 2.3% of the total index storage and 4.2% of the collection data storage (see the last row of the result table)
  2. The sha512 index takes up the majority of the index storage, followed by the sha256 index
  3. This shows that multiple single-field indexes on wide, high-cardinality values can make index storage larger than the data itself

      docs     dataMB    indexMB       idMB   sha256MB   sha512MB  idx/data%   id/data% s256/data% s512/data%
     10000       2.14       2.33       0.11       0.76       1.46      108.7        5.1       35.5       68.1
     20000       7.25       6.91       0.31       2.25       4.34       95.2        4.3       31.1       59.9
     30000       7.88      11.34       0.50       3.73       7.12      143.9        6.3       47.3       90.3
     40000      12.98      16.12       0.68       5.26      10.18      124.1        5.3       40.5       78.4
     50000      13.43      20.87       0.87       6.84      13.16      155.4        6.5       51.0       98.0
     60000      18.53      25.21       0.87       8.36      15.99      136.1        4.7       45.1       86.3
     70000      19.04      29.53       0.87       9.76      18.90      155.1        4.6       51.3       99.3
     80000      23.65      33.97       1.06      11.21      21.70      143.6        4.5       47.4       91.8
     90000      24.24      38.62       1.25      12.75      24.62      159.3        5.2       52.6      101.6
    100000      24.37      42.94       1.25      14.21      27.48      176.2        5.1       58.3      112.8
    110000      29.14      47.18       1.25      15.61      30.32      161.9        4.3       53.6      104.0
    120000      29.75      51.74       1.44      17.13      33.18      173.9        4.8       57.6      111.5
    130000      34.54      56.30       1.63      18.61      36.06      163.0        4.7       53.9      104.4
    140000      35.05      60.61       1.63      20.03      38.95      172.9        4.6       57.1      111.1
    150000      39.64      64.97       1.63      21.53      41.82      163.9        4.1       54.3      105.5
    160000      40.23      69.55       1.82      23.02      44.71      172.9        4.5       57.2      111.1
    170000      40.54      74.11       2.01      24.48      47.62      182.8        5.0       60.4      117.4
    180000      45.27      78.39       2.01      25.93      50.45      173.2        4.4       57.3      111.5
    190000      45.95      82.74       2.01      27.41      53.33      180.1        4.4       59.6      116.0
    200000      50.79      86.98       2.20      28.84      55.94      171.3        4.3       56.8      110.1
    210000      51.18      91.30       2.39      30.29      58.62      178.4        4.7       59.2      114.5
    220000      55.71      96.40       2.39      31.78      62.23      173.0        4.3       57.0      111.7
    230000      56.29     101.15       2.39      33.26      65.50      179.7        4.2       59.1      116.4
    240000      56.64     105.20       2.58      34.71      67.91      185.8        4.6       61.3      119.9
    250000      61.38     109.15       2.77      36.16      70.22      177.8        4.5       58.9      114.4
    260000      62.04     113.41       2.77      37.63      73.01      182.8        4.5       60.6      117.7
    270000      66.84     118.07       2.79      39.14      76.14      176.7        4.2       58.6      113.9
    280000      67.25     123.00       2.99      40.70      79.30      182.9        4.4       60.5      117.9
    290000      71.82     126.52       2.99      42.21      81.32      176.2        4.2       58.8      113.2
    300000      72.43     130.42       3.03      43.72      83.67      180.1        4.2       60.4      115.5

Legend:

docs: inserted document count
dataMB: collection data storage
indexMB: all index storage
idMB: automatic _id_ index storage
sha256MB: secondary sha256 index storage
sha512MB: secondary sha512 index storage
idx/data%: all index storage as a percentage of collection data storage
id/data%: _id_ index storage as a percentage of collection data storage
s256/data%: sha256 index storage as a percentage of collection data storage
s512/data%: sha512 index storage as a percentage of collection data storage