GROUPED ITEMS ------------- Normally, an index tuple points to one heap tuple. If you enable grouped items, an index tuple can point to multiple heap tuples that are on the same page. If there's a significant amount of clustering, this leads to a dramatically smaller index, which leads to I/O savings. Normal index tuples are transformed to grouped index tuples in an operation called "groupification". When a page gets full, the page is scanned for groups of pointers to heap tuples on same heap pages, and all such groups are collapsed to grouped index tuples. Hopefully that frees enough space on the page that it doesn't need to be split. Groupification can be controlled using the index option "groupthreshold". Groupthreshold is the minimum number of index tuples in the group before the group of index tuples is collapsed to a grouped index tuple. Valid range of values is 2-MaxHeapTuplesPerPage, or 0 which disables the feature. Groupification -------------- When a page gets full, tuples are collapsed to grouped index tuples in a groupify-operation. The page is scanned from beginning to end. Whenever there's a sequence of index tuples that point to the same heap block, those index tuples form a group. If the number of tuples in the group exceeds the effective groupthreshold, the group is collapsed to a single grouped index tuple. For example, imagine a groupthreshold of 4, and a page like this: key -> heap page (heap offsets) A -> 1 (10) B -> 2 (3) C -> 2 (1) D -> 2 (2) E -> 2 (4) F -> 1 (1) G -> 1 (2) H -> 1 (3) After the groupify the page would look like this: key -> heap page (heap offsets) A -> 1 (1) B* -> 2 (1,2,3,4) F -> 1 (1) G -> 1 (2) H -> 1 (3) The group B-E is collapsed to a single grouped index tuple. The group F-H isn't, because it contains only 3 tuples, which is below the threshold of 4. On-disk format -------------- Normal index tuples look like this, assuming a MAXALIGN of 8: Item pointer to heap | blk no | offset | flags | size of tuple | key -------+-------------+-------+---------------+-------- 1 | 1 | | 16 | aaa 1 | 2 | | 16 | bbb 1 | 3 | | 16 | ccc 2 | 2 | | 16 | ddd 2 | 4 | | 16 | eee 2 | 7 | | 16 | fff With grouped items enabled, multiple consecutive index tuples are collapsed to one grouped index tuple in the groupify-operation. A grouped index tuple stores just the first key of the collapsed entries, and a bitmap of heap offsets. The bitmap is stored at the end of the index tuple, and the byte offset of the bitmap is stored in the space normally used for the offset of the item pointer: Item pointer to heap | blk no | offset | flags | size of tuple | key | bitmap of offsets -------+-------------+-------+---------------+--------+------------------ 1 | 16 | GRP | 20 | aaa | 11100000 2 | 16 | GRP | 20 | ddd | 01010010 Search algorithm ---------------- The normal search algorithm needs to be sligthly changed to support grouped index entries. When locating the starting point of a forward scan, if no equal tuple is found, we need to step back to the previous index tuple. If that's a grouped item, it might contain matches, so we need to start the scan by scanning those heap tuples. Since we don't store the key of every heap tuple, the keys of the tuples fetched from a grouped index are only "candidate matches", unless the scan quals include the key range of the grouped item completely. Furthermore, because the order of the heap tuples that are collapsed to a grouped item is not preserved, an index scan needs to sort them to be able to return them in index order. In a bitmap scan, grouped items are returned as "lossy" pointers to the page, and the bitmap heap scan node will scan the whole page and recheck the scan quals. Grouped index tuples whose keys don't match non-boundary scan quals can't be skipped, because some heap tuples represented by those index tuples might still match. Insertion algorithm ------------------- The location of the new key in the index is located by descending the tree and performing a binary search in the leaf page as usual. After that, things get more complicated. Let's call the offset found by descend and binary search as 'insoffset', and the offset right before that as 'prevoffset'. if (key of the item in 'insoffset' equals the insertion key) { Insert a new index tuple at 'insoffset' (note: the new tuple must go before the old tuple at 'insoffset'. Otherwise, if the old tuple is a grouped item, inserting after it would truncate the range of the group) } else { if(item at 'prevoffset' is *not* a grouped item) { Insert a new index tuple at 'insoffset' } else { We'll have to split the old grouped tuple to two halves. All heap tuples with key < the new insertion key go the left half and the rest go to the right half. After splitting, the new entry is inserted as a new index tuple between the halves. } } Uniqueness checking requires a few minor modifications as well. After fetching a heap tuple for uniqueness check, it's key need to be checked for equality with the new key, if the heap tuple was part of a grouped item. Vacuum ------ Vacuuming group index tuples is similar to vacuuming normal items. Bits representing dead heap tuples are cleared from grouped items. However, note that a grouped item can't be converted to a normal index tuple even if there's only one bit left in the bitmap, because the key of the index tuple might not match the key of the heap tuple that's left in the bitmap.