Proselint

398 points by g1n016399 about 9 years ago

39 comments

IanCal about 9 years ago

This sounds interesting. As a bit of constructive criticism, please put some examples high up.

You tell me it does cool things. Great, show me. I've looked about on the various pages and can see only one example and I don't understand it:

    text.md:0:10: wallace.uncomparables Comparison of an uncomparable: 'unique' can not be compared.

What's the context of this, what's the error it would have caught in my writing?

The tool is in a perfect place to show this off as it's text.
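
For context, a check of this kind can be sketched in a few lines. This is only an illustration under stated assumptions, not proselint's actual implementation: the word lists and the `check_uncomparables` function are hypothetical, and the message text simply imitates the line quoted above.

```python
import re

# Hypothetical word lists, chosen for illustration only.
UNCOMPARABLES = ["unique", "perfect", "infinite", "impossible"]
COMPARATORS = ["very", "more", "most", "less", "least", "quite", "rather"]

# Check name copied from the output quoted in the comment above.
CHECK_NAME = "wallace.uncomparables"

PATTERN = re.compile(
    rf"\b(?:{'|'.join(COMPARATORS)})\s+({'|'.join(UNCOMPARABLES)})\b",
    re.IGNORECASE,
)


def check_uncomparables(text, filename="text.md"):
    """Return messages shaped like `file:line:col: check message`.

    Line and column numbers are zero-based, matching the quoted example.
    """
    messages = []
    for line_no, line in enumerate(text.splitlines()):
        for match in PATTERN.finditer(line):
            word = match.group(1).lower()
            messages.append(
                f"{filename}:{line_no}:{match.start()}: {CHECK_NAME} "
                f"Comparison of an uncomparable: '{word}' can not be compared."
            )
    return messages


for message in check_uncomparables("This is a very unique tool."):
    print(message)
```

Under these assumptions, the sentence "This is a very unique tool." yields a message at line 0, column 10, where the flagged phrase "very unique" begins. That is the kind of before-and-after context the quoted output lacks.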

jonstokes about 9 years ago

I'm a writer and editor, and I dislike the idea of this tool quite a bit.

1. Writing isn't coding. In coding, you can do various types of "cargo cult programming" and "copypasta" and what-have-you -- in other words, as long as the code runs you don't necessarily have to know why or how a programming idiom or convention works, or how/why expressing it one way in code is better than expressing it another way in code. This is definitionally untrue of writing. If you don't know the why/how of something, then it's better for you to botch it and let the reader attempt to parse it so at least they know what they're dealing with and how to interpret it ("oh, this guy's a non-native speaker, so I'll adjust my reception accordingly" or "ah, this person is kind of clueless about the whole sexist language thing, which is good info for me.").

2. 90% of writing style advice falls into one of two categories: a) hotly debated, and b) totally wrong. Most of it is in the latter category, and this includes Strunk & White (just use Google for numerous takedowns of that text). I looked through the PR queue and saw that it consists of eager coders finding style advice from various sources and trying to work that into the tool. That is terrible, terrible, terrible... This will guarantee that the tool will represent a collection of awful writing advice gleaned from dubious sources and wielded with unforgiving ignorance.

This tool may be a terrible idea, but the idea of automated prose linting is not terrible. Most beginner to intermediate writers have tics, and as an editor I often have a couple of writer-specific find/replace things I do when I get a new piece from a particular writer (e.g. "this person uses 'however' when she means 'but', and this person overuses these four business jargon terms," etc.). If editors were able to easily compose and execute writer-specific linters from within something like WordPress, that would probably be pretty great.

But this particular command line tool is destined to be either totally unused or massively abused.

I'm sorry, I hate to be mean... or, actually, there is a small part of me that enjoys playing Mr. Party Pooper when I see a mob of enthusiastic programmers trying to tie down some great cultural Gulliver with a thousand tiny little automated, black-and-white rules.
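
The writer-specific find/replace idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Rule` type, the `WRITER_RULES` table, the writer names, and the jargon list are all hypothetical, and nothing here comes from proselint or WordPress.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: str  # regular expression to search for
    advice: str   # note shown to the editor, who makes the final call


# Hypothetical per-writer rule sets that an editor might maintain.
WRITER_RULES = {
    "writer_a": [
        Rule(r"\bhowever\b", "Check whether 'but' is meant."),
    ],
    "writer_b": [
        Rule(r"\b(?:synergy|leverage|paradigm|bandwidth)\b",
             "Business jargon; consider plainer wording."),
    ],
}


def lint_for_writer(writer, text):
    """Return sorted (offset, matched_text, advice) tuples for one writer."""
    findings = []
    for rule in WRITER_RULES.get(writer, []):
        for match in re.finditer(rule.pattern, text, re.IGNORECASE):
            findings.append((match.start(), match.group(0), rule.advice))
    return sorted(findings)


for finding in lint_for_writer("writer_a", "It rained; however, we went."):
    print(finding)
```

The sketch only flags spans for a human editor to review and never rewrites anything, which keeps the judgment with the person who knows the writer.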

rosser about 9 years ago

I can see a lot of value for this sort of tool, and might even play with it myself, for the sake of evaluating whether or not to incorporate its suggestions into my writing. At the same time, however, I have some wariness that its widespread use could actually have a shaping, and specifically homogenizing, effect on language. For me, a large part of the beauty of language is how facile it is, how judiciously breaking its rules can create a more artful and compelling means of expression than linted (if you will, "prosaic") prose seems likely to offer.

dcw303 about 9 years ago

This sounds promising, but I think a lot of potential users would be deterred by the lack of examples.

This positively screams for an online interface to test drive.

pron about 9 years ago

Probably a stupid nitpick, but this bothers me:

> detecting grammatical errors is AI-complete, requiring human-level intelligence to get things right.

(emphasis mine)

First, there's a problem of usage. When in CS we say that a problem is class-complete (like NP-complete), we mean that the problem belongs to the class (which in this case is true, because human-level intelligence can check grammar), but also that it is class-hard, which informally means "at least as hard as the hardest problems in the class", and more formally means that any other problem in the class can be cheaply reduced to the problem, and so finding a suitable solution to the problem is identical to finding a suitable solution to all other problems in the class. So not only is checking grammar not known to be "AI-complete", we don't even know that human-level intelligence is necessary to solve it.

But the reason this bothers me, even though I fully understand the statement was made informally, is a little deeper than that: we don't even know what "human-level intelligence" (or intelligence in general) is, let alone what AI means. That people refer to AI as if it's a thing rather than a very vague notion clouds how people think of AI research as well as intelligence. I would have simply said "we don't know of good algorithms to dependably check grammar, and this appears to be a very hard problem that may require intelligence".
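
The usage point can be stated compactly. As a sketch, write $\mathcal{C}$ for the class and $Q \le P$ for "Q reduces cheaply to P" under whatever notion of reduction is in force (polynomial-time many-one reduction in the NP case); both symbols are notation assumed here, not taken from the quoted text:

```latex
\[
P \text{ is } \mathcal{C}\text{-hard} \iff \forall Q \in \mathcal{C}:\; Q \le P,
\qquad
P \text{ is } \mathcal{C}\text{-complete} \iff P \in \mathcal{C} \ \text{and}\ P \text{ is } \mathcal{C}\text{-hard}.
\]
```

For "AI-complete", neither $\mathcal{C}$ nor $\le$ has an agreed formal definition, so the hardness half of the claim cannot even be stated precisely, let alone shown.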

MichaelBurge about 9 years ago

If you're on Ubuntu, you want to run 'pip3 install proselint' rather than 'pip install proselint'.

I ran it on a couple of 800-word emails and it didn't catch anything except me using 2 spaces instead of 1 in one place. I also ran it on my city's sidewalk maintenance ordinance, and it didn't report anything.

czechdeveloper about 9 years ago

Does anyone know of a similar tool for scientific papers? Specifically, one that helps non-native English speakers write high-quality scientific papers?

MatthewWilkes about 9 years ago

While the idea is interesting, I do worry about the proliferation of linting to prose. Especially the hint about authoritative near the end of the article. Linters turn guidelines into steadfast rules in programming, removing all ability to use judgement if you want your PR merged. I personally want less of that, not more.

kbenson about 9 years ago

Ah, another part of my brain I can offload to an external source. It will be interesting when we get to "social-lint", so those of us that are no good at social interactions (through lack of ability or lack of willingness to spend the effort to combat that with ) or that feel they spend far too much brainpower on social interactions to make up for lack of natural ability can benefit.

yitchelle about 9 years ago

Can someone explain in layman's terms how this is any better than an app like the Hemingway Editor [0]? Both analyse the text and make suggestions to make it better.

[0] http://www.hemingwayapp.com/

squimmy about 9 years ago

I question how useful a tool like this is for a skilled writer.

Prose isn't code.

Many key elements of good writing are based around the idea of knowing the rules, and then carefully breaking them.

vpontis about 9 years ago

Can someone who has tried this share their experience?

It sounds really awesome but it's very hard to tell if it's going to be more annoying or more useful. Maybe it would be useful to have some example linting errors on the homepage.

Either way, I really love the idea!

stared about 9 years ago

Is it already in Atom or Sublime Text?

EDIT: I must be blind - they mention an ST plugin (although they don't link to it): https://packagecontrol.io/packages/SublimeLinter-contrib-proselint

synthmeat about 9 years ago

Here's a suggestion...

Have the copy on the web site be intentionally incorrect and red-underlined, with (small modals? tooltips?) that show what's been corrected/suggested by the tool.

gepoch about 9 years ago

See also write-good: https://github.com/btford/write-good

nmstoker about 9 years ago

Looks really interesting. I'd done some preliminary investigation into whether this kind of concept might work for the style guide at my company, but I never got time to take it further.

Is there any word on business model / the intentions of the developers? Is it something that's being open sourced and then integration assistance would be commercialised?

kmfrk about 9 years ago

This is very cool and needed, thank you.

Could you include a sample .proselintrc? rc files tend to have very different opinions on how to be formatted: dictionaries, JSON, bash-argument syntax, and so on. (EDIT: Ah, found one: https://github.com/amperser/proselint/blob/cd428bb0ecc5530c1e2b269e993f1a57d1e8ff21/.proselintrc . Can't quite get it to ignore butterick, though.)

I find it a little curious that you use a Markdown example and lint for curly quotes and unicode ellipses by default (butterick), since Markdown discourages such pre-formatting in its syntax, but that's just hairsplitting, and I can tell by your swelling Issues count that you have plenty of that as it is. :)

Looking forward to some formatting/syntax highlighting in the CLI output, but I know you have your hands full as it is.

joncp about 9 years ago

Tried it with "I'm better then you" and it didn't complain.

Nice idea, but you need to catch homophone errors.
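
A narrow version of this particular check fits in a regular expression. The sketch below is an assumption-laden illustration rather than anything proselint ships: the `COMPARATIVES` list and the `find_then_for_than` function are hypothetical, and the pattern only looks for a known comparative followed directly by "then".

```python
import re

# Hypothetical, deliberately short list of comparatives.
COMPARATIVES = [
    "better", "worse", "more", "less", "fewer", "rather",
    "bigger", "smaller", "faster", "slower", "older", "younger",
]

THEN_AFTER_COMPARATIVE = re.compile(
    rf"\b(?:{'|'.join(COMPARATIVES)})\s+then\b", re.IGNORECASE
)


def find_then_for_than(text):
    """Return (offset, phrase) pairs where 'then' may be a slip for 'than'."""
    return [
        (match.start(), match.group(0))
        for match in THEN_AFTER_COMPARATIVE.finditer(text)
    ]


print(find_then_for_than("I'm better then you"))
```

A pattern like this has no idea of sentence structure, so it will also flag legitimate uses such as "I felt better then." Telling those apart needs more than word matching.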

raphman_ about 9 years ago

Are there any plans to support rules for texts written in other languages (e.g., German)? Would a set of such rules fit within the scope of this project or is proselint purposely or inherently limited to English prose? (@suchow)

segphault about 9 years ago

The main problem with a tool like this is that it needs to understand sentence structure in order to find a lot of common anti-patterns. Without some natural language processing, it's just going to be able to scan for word usage and simple things that you can catch with a regex. You could probably build something a lot more sophisticated on top of something like Apple's NSLinguisticTagger and related APIs.

After testing this against a dozen of my blog posts, I'm not terribly impressed with the output. I get more immediate value out of MarkedApp's keyword drawer and word repetition visualization.

gansai about 9 years ago

Will this be used by automated content creators? For example, lots of articles on some news websites (and on Wikipedia) are written by bots. So would the bot write an article, invoke proselint, and correct it if required?

kaeluka about 9 years ago

Related: artbollocks-mode https://github.com/sachac/artbollocks-mode

vortico about 9 years ago

I was skeptical that it would only detect obvious issues, but the sheer number of built-in checks is surprising. I'll try this on the next large text I write.

jake-low about 9 years ago

I've been interested in linters and style checkers for English prose for a while, and I'm excited to try this out!

To the author(s): Your website, as far as I could tell, doesn't tell me how to install it; I had to go to GitHub to realize it was pip-installable. You should consider adding that to the main page.

kylemathews about 9 years ago

Nice idea.

Bug report: it told me I had too many exclamation marks in a Markdown file with a number of images in it.
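
The likely cause is that Markdown image syntax, `![alt](url)`, begins with an exclamation mark. One possible workaround, sketched here under the assumption that the check simply counts `!` characters, is to strip image markup before counting. The `IMAGE` pattern and `count_exclamations` are hypothetical names, and this is not a description of how proselint handles Markdown.

```python
import re

# Matches inline images ![alt](url) and reference-style images ![alt][ref].
IMAGE = re.compile(r"!\[[^\]]*\](?:\([^)]*\)|\[[^\]]*\])")


def count_exclamations(markdown_text):
    """Count '!' characters in prose, ignoring Markdown image markup."""
    prose = IMAGE.sub("", markdown_text)
    return prose.count("!")


sample = "Wow!\n\n![a cat](cat.png)\n![a dog][dog-ref]\n"
print(sample.count("!"), count_exclamations(sample))
```

This simple pattern does not handle URLs containing parentheses or image syntax quoted inside code blocks; a real fix would more plausibly parse the Markdown first.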

timlyo about 9 years ago

Going through the example, it comes up with:

> Get that off of me before I catch on fire!
> Needless variant. 'catch fire' is the preferred form

I don't think I've ever heard anyone say "catch fire" rather than "catch on fire".

From the UK, if that changes anything.

vram22 about 9 years ago

Ha ha, slightly related fun snippet I wrote: http://jugad2.blogspot.in/2015/07/cut-crap-absolutely-essential-tool-for.html

edwinyzh about 9 years ago

Very interesting, and I'm looking into integrating it into http://WritingOutliner.com (or as a separate Word add-in) :)

Dowwie about 9 years ago

Thank you for working on this project and sharing it.

One of the more challenging sections in the GMAT entails sentence correction. A proselint-enabled GMAT prep for sentence correction would be very valuable.

amelius about 9 years ago

What kind of NLP techniques does this system use?

Is it possible to specify new rules in a high-level way?

Can it learn from examples?

Does it work on a sentence-by-sentence basis only, or does it "grasp" complete paragraphs?

jcoffland about 9 years ago

It would be interesting to run this against campaign speeches as an unbiased way of judging the quality of prose. Surely content is more important, but still, it would be fun.

brudgers about 9 years ago

GitHub: https://github.com/amperser/proselint/

willvarfar about 9 years ago

It's a Python module? I'm looking forward to making a Pelican plugin so my mate can start checking his blog for glaring errors before he posts! :)

true_religion about 9 years ago

I'm curious: is this just a grammar checker? Or does it do spell checking too, like aspell?

zimpenfish about 9 years ago

Most important question - How many linguists are on the team developing this?

erubin about 9 years ago

Can I use this with LaTeX?

biturd about 9 years ago

FYI, seems to work perfectly fine in Safari on Mac OS X Desktop.

stared about 9 years ago

What is wrong with "very smart"? (line 86)

blt about 9 years ago

Microsoft Word had something like this round about 1999
