I was confused about the intended use case, but there's more information in the docs folder: <a href="https://github.com/microsoft/bistring/blob/master/docs/Introduction.rst" rel="nofollow">https://github.com/microsoft/bistring/blob/master/docs/Intro...</a><p>Apparently it's for machine learning, where you want to pick out a span/substring in the original text but your model can only accept normalized text (I'm guessing for stuff like transforming out-of-vocabulary words into UNK/unknown tokens). This solves that problem by keeping track of the index mapping between the original text and the transformed text.<p>(Picking out spans is a very common task in NLP; for example, see the SQuAD dataset: <a href="https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/Normans.html" rel="nofollow">https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/...</a>)
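<p>To make the index-mapping idea concrete, here's a rough sketch in plain Python. This is not bistring's API; `normalize_with_offsets` and `map_span_back` are hypothetical helpers I made up, and the normalization (lowercasing plus collapsing whitespace) is just an assumed stand-in for whatever a real model needs:

```python
def normalize_with_offsets(text):
    """Lowercase and collapse whitespace runs.

    Returns the normalized string plus a list where offsets[i] is the
    index in `text` of the character that produced normalized[i].
    (Hypothetical helper for illustration, not part of bistring.)
    """
    out_chars = []
    offsets = []
    prev_space = True  # treat the start as whitespace so leading spaces are dropped
    for i, ch in enumerate(text):
        if ch.isspace():
            if not prev_space:
                out_chars.append(" ")
                offsets.append(i)
            prev_space = True
        else:
            # lower() can yield more than one character for some code points,
            # so every produced character points back at the same source index.
            for c in ch.lower():
                out_chars.append(c)
                offsets.append(i)
            prev_space = False
    # Drop a single trailing space left over from trailing whitespace.
    if out_chars and out_chars[-1] == " ":
        out_chars.pop()
        offsets.pop()
    return "".join(out_chars), offsets


def map_span_back(offsets, start, end):
    """Map a non-empty half-open span [start, end) in the normalized text
    to a half-open span in the original text."""
    if not 0 <= start < end <= len(offsets):
        raise ValueError("span must be non-empty and inside the normalized text")
    return offsets[start], offsets[end - 1] + 1


text = "  The  Normans   settled in NORMANDY. "
normalized, offsets = normalize_with_offsets(text)

# Pretend the model picked this span out of the normalized text.
start = normalized.index("normandy")
end = start + len("normandy")

o_start, o_end = map_span_back(offsets, start, end)
print(text[o_start:o_end])  # NORMANDY
```

<p>The model only ever sees `normalized`, but the answer can still be highlighted in the original string with its original casing and spacing. As far as I can tell from the docs, that's the bookkeeping the library does for you across arbitrary chains of transformations.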