This sounds interesting. As a bit of constructive criticism, please put some examples high up.<p>You tell me it does cool things. Great, show me. I've looked about on the various pages and can see only one example and I don't understand it:<p><pre><code> text.md:0:10: wallace.uncomparables Comparison of an uncomparable: 'unique' can not be compared.
</code></pre>
What's the context of this, what's the error it would have caught in my writing?<p>The tool is in a perfect place to show this off as it's text.
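To make the question concrete, here is a minimal sketch of the kind of sentence such a message would point at. It is not proselint's implementation: the word lists, the function name `check_uncomparables`, and the zero-based line/column convention are all assumptions made for illustration.

```python
import re

# Hypothetical word lists for illustration; not proselint's actual data.
COMPARATORS = ["very", "more", "most", "less", "least", "extremely"]
UNCOMPARABLES = ["unique", "infinite", "impossible"]

PATTERN = re.compile(
    r"\b(?:%s)\s+(%s)\b" % ("|".join(COMPARATORS), "|".join(UNCOMPARABLES)),
    re.IGNORECASE,
)


def check_uncomparables(text):
    """Return (line, column, message) tuples; both indices are zero-based (an assumption)."""
    findings = []
    for line_no, line in enumerate(text.splitlines()):
        for match in PATTERN.finditer(line):
            word = match.group(1).lower()
            message = "Comparison of an uncomparable: '%s' can not be compared." % word
            findings.append((line_no, match.start(), message))
    return findings


sample = "This is a very unique approach to linting."
for line_no, col, message in check_uncomparables(sample):
    # Mimic the file:line:column layout of the message quoted above.
    print("text.md:%d:%d: %s" % (line_no, col, message))
```

Under these assumptions, the flagged span is "very unique": the complaint is about grading a word that is arguably absolute, not about the word "unique" on its own.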
I'm a writer and editor, and I dislike the idea of this tool quite a bit.<p>1. Writing isn't coding. In coding, you can do various types of "cargo cult programming" and "copypasta" and what-have-you -- in other words, as long as the code runs you don't necessarily have to know why or how a programming idiom or convention works, or how/why expressing it one way in code is better than expressing it another way in code. This is definitionally untrue of writing. If you don't know the why/how of something, then it's better for you to botch it and let the reader attempt to parse it so at least they know what they're dealing with and how to interpret it ("oh, this guy's a non-native speaker, so I'll adjust my reception accordingly" or "ah, this person is kind of clueless about the whole sexist language thing, which is good info for me.").<p>2. 90% of writing style advice falls into one of two categories: a) hotly debated, and b) totally wrong. Most of it is in the latter category, and this includes Strunk & White (just use Google for numerous takedowns of that text). I looked through the PR queue and saw that it consists of eager coders finding style advice from various sources and trying to work that into the tool. That is terrible, terrible, terrible... This will guarantee that the tool will represent a collection of awful writing advice gleaned from dubious sources and wielded with unforgiving ignorance.<p>This tool may be a terrible idea, but the idea of automated prose linting is not terrible. Most beginner to intermediate writers have tics, and as an editor I often have a couple of writer-specific find/replace things I do when I get a new piece from a particular writer (e.g. "this person uses 'however' when she means 'but', and this person overuses these four business jargon terms", etc.).
If editors were able to easily compose and execute writer-specific linters from within something like Wordpress, that would probably be pretty great.<p>But this particular command line tool is destined to be either totally unused or massively abused.<p>I'm sorry, I hate to be mean... or, actually, there is a small part of me that enjoys playing Mr. Party Pooper when I see a mob of enthusiastic programmers trying to tie down some great cultural Gulliver with a thousand tiny little automated, black-and-white rules.
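A writer-specific linter of the kind described above could be as small as a table of find patterns with a note for the editor. This is only a sketch: the writer names, the rules, and the `lint_for_writer` helper are hypothetical, and it flags matches for a human to judge instead of rewriting anything automatically.

```python
import re

# Hypothetical per-writer rules: (regex, note for the editor).
WRITER_RULES = {
    "alice": [
        (r"\bhowever\b", "often means 'but' for this writer; check the sense"),
    ],
    "bob": [
        (r"\b(?:synergy|leverage|paradigm|bandwidth)\b", "overused business jargon"),
    ],
}


def lint_for_writer(writer, text):
    """Return sorted (offset, matched text, note) tuples for one writer's known tics."""
    findings = []
    for pattern, note in WRITER_RULES.get(writer, []):
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((match.start(), match.group(0), note))
    return sorted(findings)


for offset, word, note in lint_for_writer("bob", "We will leverage synergy."):
    print("%d: %r: %s" % (offset, word, note))
```

Keeping the rules per writer, and leaving the decision to the editor, is the point: the same pattern can be a tic for one person and perfectly fine for another.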
I can see a lot of value for this sort of tool, and might even play with it myself, for sake of evaluating whether or not to incorporate its <i>suggestions</i> into my writing. At the same time, however, I have some wariness that its widespread use could actually have a shaping, and, specifically <i>homogenizing</i>, effect on language. For me, a large part of the beauty of language is how facile it is, how judiciously breaking its rules can create a more artful and compelling means of expression than linted — if you will, "prosaic" — prose seems likely to offer.
This sounds promising, but I think a lot of potential users would be deterred by the lack of examples.<p>This positively screams for an online interface to test drive.
Probably a stupid nitpick, but this bothers me:<p>> detecting grammatical errors is <i>AI-complete</i>, requiring human-level intelligence to get things right.<p>(emphasis mine)<p>First, there's a problem of usage. When in CS we say that a problem is <i>class</i>-complete (like NP-complete), we mean that the problem belongs to the class (which in this case is true, because human-level intelligence can check grammar), but also that it is <i>class</i>-hard, which informally means "at least as hard as the hardest problems in <i>class</i>", and more formally means that any other problem in <i>class</i> can be cheaply reduced to the problem, and so finding a suitable solution to the problem is identical to finding a suitable solution to all other problems in <i>class</i>. So not only is checking grammar not known to be "AI-complete", we don't even know that human-level intelligence is necessary to solve it.<p>But the reason this bothers me, even though I fully understand the statement was made informally, is a little deeper than that: we don't even know what "human-level intelligence" (or intelligence in general) is, let alone what AI means. That people refer to AI as if it's a thing rather than a very vague notion clouds how people think of AI research as well as intelligence. I would have simply said "we don't know of good algorithms to dependably check grammar, and this appears to be a very hard problem that may require intelligence".
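For reference, the usage described above can be written out for NP, taking "cheaply reduced" to mean a polynomial-time many-one reduction $\le_p$. The "AI" analogue has no agreed formal definition, which is the point of the complaint.

```latex
L \text{ is NP-hard} \iff \forall L' \in \mathrm{NP}:\ L' \le_p L
\qquad
L \text{ is NP-complete} \iff L \in \mathrm{NP} \ \text{and}\ L \text{ is NP-hard}
```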
If you're on Ubuntu, you want to run 'pip3 install proselint' rather than 'pip install proselint'.<p>I ran it on a couple of 800-word emails and it didn't catch anything except me using 2 spaces instead of 1 in one place. I also ran it on my city's sidewalk maintenance ordinance, and it didn't report anything.
While the idea is interesting, I do worry about the proliferation of linting to prose, especially given the hint about being authoritative near the end of the article. Linters turn guidelines into steadfast rules in programming, removing all ability to use judgement if you want your PR merged. I personally want less of that, not more.
Ah, another part of my brain I can offload to an external source. It will be interesting when we get to "social-lint", so those of us who are no good at social interactions (through lack of ability or lack of willingness to spend the effort to combat that) or who feel they spend far too much brainpower on social interactions to make up for lack of natural ability can benefit.
Can someone explain in layman's terms how this is any better than an app like the Hemingway Editor [0]? Both analyze the text and make suggestions to improve it.<p>[0]- <a href="http://www.hemingwayapp.com/" rel="nofollow">http://www.hemingwayapp.com/</a>
I question how useful a tool like this is for a skilled writer.<p>Prose isn't code.<p>Many key elements of good writing are based around the idea of knowing the rules, and then <i>carefully breaking them</i>.
Can someone who has tried this share their experience?<p>It sounds really awesome but it's very hard to tell if it's going to be more annoying or more useful. Maybe it would be useful to have some example linting errors on the homepage.<p>Either way, I really love the idea!
Is it already in Atom or Sublime Text?<p>EDIT: I must be blind - they do mention an ST plugin (although they don't link to it). <a href="https://packagecontrol.io/packages/SublimeLinter-contrib-proselint" rel="nofollow">https://packagecontrol.io/packages/SublimeLinter-contrib-pro...</a>
Here's a suggestion...<p>Have copy on web site be intentionally incorrect, red-underlined with (small modals? tooltips?) that show what's been corrected/suggested by the tool.
See also write-good: <a href="https://github.com/btford/write-good" rel="nofollow">https://github.com/btford/write-good</a>
Looks really interesting. I'd done some preliminary investigation into whether this kind of concept might work for the style guide at my company, but I never got time to take it further.<p>Is there any word on business model / the intentions of the developers? Is it something that's being open sourced and then integration assistance would be commercialised?
This is very cool and needed, thank you.<p>Could you include a sample .proselintrc? rc files tend to have very different opinions on how to be formatted: dictionaries, JSON, bash-argument syntax, and so on. (EDIT: Ah, found one: <a href="https://github.com/amperser/proselint/blob/cd428bb0ecc5530c1e2b269e993f1a57d1e8ff21/.proselintrc" rel="nofollow">https://github.com/amperser/proselint/blob/cd428bb0ecc5530c1...</a>. Can’t quite get it to ignore butterick, though.)<p>I find it a little curious that you use a Markdown example and lint for curly quotes and Unicode ellipses by default (butterick), since Markdown discourages such pre-formatting in its syntax, but that’s just hairsplitting, and I can tell by your swelling Issues count that you have plenty of that as it is. :)<p>Looking forward to some formatting/syntax highlighting in the CLI output, but I know you have your hands full as it is.
Are there any plans to support rules for texts written in other languages (e.g., German)? Would a set of such rules fit within the scope of this project or is proselint purposely or inherently limited to English prose? (@suchow)
The main problem with a tool like this is that it needs to understand sentence structure in order to find a lot of common anti-patterns. Without some natural language processing, it's just going to be able to scan for word usage and simple things that you can catch with a regex. You could probably build something a lot more sophisticated on top of something like Apple's NSLinguisticTagger and related APIs.<p>After testing this against a dozen of my blog posts, I'm not terribly impressed with the output. I get more immediate value out of MarkedApp's keyword drawer and word repetition visualization.
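The limitation described above is easy to demonstrate with a regex-only check. The sketch below is a hypothetical doubled-word detector, not anything taken from proselint; with no notion of sentence structure it flags a real slip and a grammatical construction alike.

```python
import re

# Matches a word immediately repeated, e.g. "the the"; purely lexical.
DOUBLED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def doubled_words(text):
    """Return the first word of every immediately repeated pair."""
    return [match.group(1) for match in DOUBLED_WORD.finditer(text)]


# "is is" is a typo, but "had had" is correct English (past perfect of "have").
print(doubled_words("It is is the case that he had had enough."))
```

Telling those two cases apart takes at least part-of-speech information, which is where a tagger would have to come in.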
Will this be used by automated content creators? For example, lots of articles on some news websites (including Wikipedia) are written by bots. So the bot would write an article, invoke proselint and correct, if required?
I was skeptical that it would only detect obvious issues, but the sheer number of built-in checks is surprising. I'll try this on the next large text I write.
I've been interested in linters and style checkers for English prose for a while, and I'm excited to try this out!<p>To the author(s): Your website, as far as I could tell, doesn't tell me how to install it; I had to go to GitHub to realize it was pip-installable. You should consider adding that to the main page.
Going through the example, it comes up with:<p>> Get that off of me before I catch on fire!
> Needless variant. 'catch fire' is the preferred form<p>I don't think I've ever heard anyone say "catch fire" rather than "catch on fire".<p>From the UK if that changes anything.
Ha ha, slightly related fun snippet I wrote:<p><a href="http://jugad2.blogspot.in/2015/07/cut-crap-absolutely-essential-tool-for.html" rel="nofollow">http://jugad2.blogspot.in/2015/07/cut-crap-absolutely-essent...</a>
Very interesting, and I'm looking into integrating it into <a href="http://WritingOutliner.com" rel="nofollow">http://WritingOutliner.com</a> (or as a separate Word add-in) :)
Thank you for working on this project and sharing it.<p>One of the more challenging sections in the GMAT entails sentence correction. A proselint-enabled GMAT prep for sentence correction would be very valuable.
What kinds of NLP techniques does this system use?<p>Is it possible to specify new rules in a high-level way?<p>Can it learn from examples?<p>Does it work on a sentence-by-sentence basis only, or does it "grasp" complete paragraphs?
It would be interesting to run this against campaign speeches as an unbiased way of judging the quality of prose. Surely content is more important, but it would still be fun.