As a non-native English speaker, I find that the best way to check grammar is to google whole parts of sentences (in quotation marks, for an exact match). That's because there are many exceptions to language rules, and some wording can just feel "not right" despite being correct.<p>Is there a tool that does something like this automatically?<p>I thought about writing such a tool myself, but it seems there are no good-quality, free search engine APIs that allow many calls. Or maybe there are some open APIs to book dumps or something similar?
AFAIK, an ex-Googler had that very same itch and he founded <a href="http://www.linguee.com" rel="nofollow">http://www.linguee.com</a> to try to solve it.
There are quite a few n-gram datasets available: <a href="https://www.google.com/search?q=download+n-gram+dataset" rel="nofollow">https://www.google.com/search?q=download+n-gram+dataset</a><p>... these are almost certainly used in many spelling and grammar checkers (to help in cases where the same spelling is used in different contexts).<p><a href="http://www.aclweb.org/anthology/W12-0304" rel="nofollow">http://www.aclweb.org/anthology/W12-0304</a>
I wonder if there is a tool like this:<p>1. You enter a sentence.<p>2. It gives you 5 different ways to say the exact same thing.<p>Such a tool would help not only ESL people but also native speakers who want to find more relaxed or more formal versions of a sentence.
Check out <a href="http://foxtype.com" rel="nofollow">http://foxtype.com</a> - it does some of that, but leans more toward grammar-like heuristics such as conciseness and complexity.<p>On a side note, I'm part of a team working on <a href="http://emailfox.co" rel="nofollow">http://emailfox.co</a>, which will provide 'Smart Sentences' for you when composing an email, based on the recipient, allowing you to write personal, relevant emails faster.
Try <a href="http://www.netspeak.org/?locale=en" rel="nofollow">http://www.netspeak.org/?locale=en</a>; it seems to do some of the things you asked for.
It is implemented on top of n-gram corpora.
You could probably use some of the n-gram datasets to figure this out. Parse some books from <a href="https://www.gutenberg.org/" rel="nofollow">https://www.gutenberg.org/</a> or use the Google Ngrams corpus. Pay attention to the year(s) whose English you wish to model - grammar and form keep changing!
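To make the n-gram idea concrete, here is a minimal sketch of counting phrases in a plain-text corpus and looking up how often a candidate phrase occurs. The function names (`tokenize`, `ngram_counts`, `phrase_count`) and the three-sentence corpus are made up for illustration; in practice you would feed it text parsed from books, or replace the counting step with lookups into a published n-gram dataset.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (keeps inner apostrophes and hyphens)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def ngram_counts(tokens, max_n=5):
    """Count every n-gram of length 1..max_n in the token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def phrase_count(counts, phrase):
    """Return how many times the phrase was seen (0 if never, or if longer than max_n)."""
    return counts[tuple(tokenize(phrase))]

# Hypothetical toy corpus; a real one would be many books' worth of text.
corpus = (
    "I am looking forward to seeing you. "
    "She is looking forward to the trip. "
    "We look forward to hearing from you."
)
counts = ngram_counts(tokenize(corpus))

for candidate in ("looking forward to", "looking forward for"):
    print(candidate, "->", phrase_count(counts, candidate))
```

A phrase that never shows up in a large corpus is not necessarily wrong, but a big gap between two candidate phrasings is a useful hint. Note that this sketch counts n-grams across sentence boundaries, which a more careful version would avoid.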
I have been thinking of doing something like this (using n-grams for grammar checking for non-natives) for a while. I would be happy to fund development if you or somebody else is interested in working on it.
From XKCD themselves, an editor that only allows for common words: <a href="https://xkcd.com/simplewriter/" rel="nofollow">https://xkcd.com/simplewriter/</a>
www.grammarly.com (haven't tried it, though). In the demo they showed it turning a sentence into a more colloquial one.<p>I'm a native English speaker, and I'd like to know the appropriate punctuation for a given combination of words. I'd like to search through a list.
When I'm conflicted about different phrasings of things (for instance, whether or not a compound word takes a hyphen), I usually just run a Google search for each and go with whichever variant gets the most hits. That could be a suitable enough proxy for your use case, and perhaps you could just use the Google search service as an API...<p>Of course, the RIGHT way to do this would be to use the n-gram datasets that people here have suggested :-)
In FAQ: "Why does Google Books only provide feedback on 5 tokens or less?"<p>You mean "..feedback only for 5 tokens or FEWER?" Use your app! ;) //runs away
To improve the qualitative aspects of writing, in this case for job listings primarily, check out <a href="https://textio.com/" rel="nofollow">https://textio.com/</a>. There's no API, but I think it will help you think about what "popular" language means.
What you want is a language model. This will give you the probability of a sentence on a word-by-word basis.<p>Something like [1] is pretty much state-of-the-art. It's worth noting that the kind of writing you are doing changes the probabilities significantly. [2] shows this quite well.<p>[1] <a href="https://colinmorris.github.io/lm-sentences/#/billion_words" rel="nofollow">https://colinmorris.github.io/lm-sentences/#/billion_words</a><p>[2] <a href="https://colinmorris.github.io/lm-sentences/#/brown_romance" rel="nofollow">https://colinmorris.github.io/lm-sentences/#/brown_romance</a>
Bah, if you have good reason to be confident that your sentence is correct even if English speakers might feel it is wrong, then I say you should just write it anyway.<p>I like to read such things because they make me think about what is being said and how the language works. If we always use "popular" patterns, then our writing becomes cliched and boring, and people's eyes will glide right over it.
If you can read Chinese, there's an interesting tool:<p><a href="http://www.pigai.org/guest2016.html" rel="nofollow">http://www.pigai.org/guest2016.html</a><p>It extracts common phrases from your sentences and shows explanations, suggestions, and usage counts from a corpus.
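For a feel of what "probability on a word-by-word basis" means, here is a toy sketch: a bigram model with add-one smoothing, which is far simpler than the models behind the links in [1] and [2] and is only meant to show the idea. The names (`train_bigram`, `word_probabilities`) and the three training sentences are invented for this example.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams, padding each sentence with a <s> start marker."""
    context = Counter()   # how often each word appears as the previous word
    bigrams = Counter()   # how often each (previous, word) pair appears
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens[1:])
        for prev, word in zip(tokens, tokens[1:]):
            context[prev] += 1
            bigrams[(prev, word)] += 1
    return context, bigrams, vocab

def word_probabilities(model, sentence):
    """Return (word, P(word | previous word)) pairs using add-one smoothing."""
    context, bigrams, vocab = model
    v = len(vocab) + 1  # +1 reserves probability mass for any unseen word
    tokens = ["<s>"] + sentence.lower().split()
    return [
        (word, (bigrams[(prev, word)] + 1) / (context[prev] + v))
        for prev, word in zip(tokens, tokens[1:])
    ]

# Hypothetical training data; a real model needs a large corpus.
model = train_bigram([
    "i am interested in music",
    "she is interested in art",
    "he is interested in science",
])

for word, p in word_probabilities(model, "she is interested on art"):
    print(f"{word:12s} {p:.3f}")
```

The word with the unusually low probability ("on" after "interested") is the one worth a second look. Training on a different genre of text would shift these numbers, which is the point [2] makes.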
<a href="https://github.com/rickyhan/bodine" rel="nofollow">https://github.com/rickyhan/bodine</a><p>This is a tiny tool I wrote a long time ago. There's also writefullapp.com which is closed source.