Hi, this is a post I worked on to better understand how tokenizers are used in recent deep learning models such as BERT, and what impact they have on how those models learn. It was a really interesting and challenging topic to research, so I'd love any thoughts or feedback on it.