Thanks for introducing a new (to me) idea. I didn't watch the video, but I felt the write-up could have been more cohesive; perhaps just a conclusion to tie all the ideas together would do it. I'm also left wondering why we would use this WME approach over other document embedding techniques (averaging word vectors, paragraph vectors, smooth inverse frequency weighting, etc.). Is it faster, does it give better similarity estimates, or is there some other advantage?