Xia, Wen and Zou, Xiangyu and Jiang, Hong and Zhou, Yukun and Liu, Chuanyi and Feng, Dan and Hua, Yu and Hu, Yuchong and Zhang, Yucheng
year
2020
Gear hash is much faster than Rabin
Mask needs to be shifted left as using rightmost bits effectively reduces sliding window size
Enforcing min size helps with performance (as the first N bytes don’t need to be hashed and can be skipped) but reduces deduplication ratio (because a potential cut-point has been skipped).
Normalization = using more strict criteria for chunks < avg size, and more relaxed one for > avg size. So it’s easier to generate chunks once the size exceeds the average.