TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

Bistring – Bidirectionally Transformed Strings

72 points by varunagrawal almost 6 years ago

3 comments

zawerf almost 6 years ago
I was confused about the intended use case, but there's more information in the docs folder: https://github.com/microsoft/bistring/blob/master/docs/Introduction.rst

Apparently it's for machine learning where you want to pick out a span/substring in the original text but your model can only accept normalized text (I am guessing for stuff like transforming out-of-vocabulary words into UNK/unknown tokens). This solves that problem by keeping track of the index mapping between the original text and transformed text.

(Picking out spans is a very common task in NLP; for example, see the SQuAD dataset: https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/Normans.html)
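
To make the index-mapping idea concrete, here is a minimal Python sketch. It is not bistring's API: the function names `normalize_with_offsets` and `original_span` are hypothetical, and the normalization (lowercasing plus whitespace collapsing) is just an assumed stand-in for whatever a real model pipeline does. It only shows how a span found in the normalized text can be mapped back to the original.

```python
def normalize_with_offsets(text):
    """Lowercase `text` and collapse whitespace runs to single spaces.

    Returns (normalized, offsets), where offsets[i] is the index in `text`
    of the original character that produced normalized[i].
    Hypothetical helper for illustration; not part of bistring.
    """
    out_chars = []
    offsets = []
    prev_space = True  # start "after a space" so leading whitespace is dropped
    for i, ch in enumerate(text):
        if ch.isspace():
            if not prev_space:
                out_chars.append(" ")
                offsets.append(i)
            prev_space = True
        else:
            # lower() can yield more than one character for some code points,
            # so every produced character points back at the same source index.
            for c in ch.lower():
                out_chars.append(c)
                offsets.append(i)
            prev_space = False
    # Drop a single trailing space left over from trailing whitespace.
    if out_chars and out_chars[-1] == " ":
        out_chars.pop()
        offsets.pop()
    return "".join(out_chars), offsets


def original_span(offsets, start, end):
    """Map a half-open span [start, end) in the normalized text
    to a half-open span in the original text."""
    if not 0 <= start < end <= len(offsets):
        raise ValueError("span must be non-empty and within the normalized text")
    return offsets[start], offsets[end - 1] + 1


original = "  The  Norman   Conquest "
normalized, offsets = normalize_with_offsets(original)

# Pretend a model picked out "norman" in the normalized text.
start = normalized.index("norman")
end = start + len("norman")
o_start, o_end = original_span(offsets, start, end)
answer = original[o_start:o_end]  # the same span, with original casing
```

A flat per-character offset list like this is the simplest possible design, and it only handles character-level edits. As I understand it, the point of a library here is to keep such a mapping consistent across many chained transformations.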
andrewflnr almost 6 years ago
Somewhat related: Boomerang https://www.seas.upenn.edu/~harmony/ Discussed here at least once: https://news.ycombinator.com/item?id=565874

The title made me think of Boomerang, but this looks like it has rather different use cases in mind.
blt almost 6 years ago
This is interesting, but the readme doesn't say much about use cases. What is a big application that could benefit from this?