Building a no-code toxicity classifier by talking to GitHub Copilot

212 points by AiFoGhost about 3 years ago

34 comments

abeppu about 3 years ago

We're all focusing on the weaknesses of co-pilot (the comments can be longer than the code produced; you need to understand code to know when to elaborate your comment, etc).

But also ... what do you need to know to recognize that the concept of a 'toxicity classifier' is likely broken? We can do _profanity_ detection pretty well, and without a huge amount of data. But with 1000 example comments, can you actually get at 'toxicity'? Can you judge toxicity purely from a comment in isolation, or does it need to be considered in the context in which that comment is made?

Maybe you don't need to know about python, but if you're building this, you should probably have spent some time thinking and grappling with ML problems in context, right? You want to know that, for example, the pipeline copilot is suggesting (word counts, TFIDF, naive Bayes) doesn't understand word order? Or to wonder whether it's tokenizing on just whitespace, and whether `'eat sh!t'` will fail to get flagged b/c `'shit'` and `'sh!t'` are literally orthogonal to the model?

More people should be able to create digital stuff that _does_ things, and maybe copilot is a tool to help us move in that direction. Great! But writing a bad "toxicity classifier" by not really engaging with the problem or thinking about how the solution works and where it fails seems potentially net harmful. More people should be able to make physical stuff too, but 3d-printed high-capacity magazines don't really get most of us where we want to go.
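
The two failure modes raised above (no word order, and obfuscated spellings being unrelated tokens) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, assuming plain whitespace tokenization; the helper names `bag_of_words` and `cosine` are made up for this sketch and are not what Copilot generated in the article.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens (word order is discarded)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Word order is invisible: both sentences produce the same count vector.
same = bag_of_words("dog bites man") == bag_of_words("man bites dog")

# An obfuscated spelling is a different token, so the vectors share nothing.
overlap = cosine(bag_of_words("shit"), bag_of_words("sh!t"))

print(same)     # True
print(overlap)  # 0.0
```

TF-IDF reweights these counts but keeps the same one-dimension-per-token representation, so neither limitation goes away; character n-grams or text normalization before tokenizing are the usual ways to soften the second one.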

tomerv about 3 years ago

The first comment asks Copilot to import all the libraries needed for a toxicity classifier, and it imports libraries such as re (regex engine) and nltk (natural language toolkit). But what if I wanted a classifier for toxic chemicals and not toxic speech? That was my first thought when I saw "toxicity" in the title.

I'm now imagining a very frustrated junior developer a few years from now trying to argue with Copilot to write code for a classifier for chemical compounds, but it just spits out code for classifying text.

StopHammoTime about 3 years ago

Just to clarify, it's not really no-code: pseudocode is the new bytecode it would seem and this is just compiling that into usable code.

You still need to be able to code and understand what you're doing. You can't just ask simple questions and get complex answers. You still have to be capable of asking complex questions.

A common scenario I can think of is where I struggle to remember the name or API of the exact thing I want to do but I know exactly how it works - typing that in and getting a result would improve my workflow, but it's just saving a trip to Google, we're not talking the difference between doing and not doing, just saving a minute.

I would rate the value of this more as interesting rather than useful, simply because as another commenter highlighted it's just easier to write code. It could be useful incrementally but not for everything.

chronolitus about 3 years ago

Reminds me of this post by Scott Aaronson: https://scottaaronson.blog/?p=6288

"Forget all that. Judged against where AI was 20-25 years ago, when I was a student, a dog is now holding meaningful conversations in English. And people are complaining that the dog isn’t a very eloquent orator, that it often makes grammatical errors and has to start again, that it took heroic effort to train it, and that it’s unclear how much the dog really understands."

bmitc about 3 years ago

I don't really understand this. You're not coding directly in the language, but now you're coding in an implicit language provided by Copilot. From what I've seen on Copilot, although it is an impressive piece of tech, all it really points out is that code documentation and discovery is terrible. But I'm not sure writing implicit code in comments is really a better approach than seeking ways to make language and library features more discoverable.

And I know it sounds silly and like "I had an idea like that once" (see Office Space), but I actually came up with the idea for Copilot, or at least a similar one, in an offhand comment to a coworker back in like 2014 or so. The idea was that as you wrote code, it would display on the side similar code that had been written by others doing the same or similar thing, and then it would allow you to automatically upload small processing functions to some sort of cloud library. Same thing for doing autoformatting, although that's less of a concern now that formatters are becoming popular. The context I was working in was visual languages though. I had even started writing a tool during an "innovation week" (that I never showed) that would start visually classifying whether code written in the visual language was "good" or "clean" or not. I never got anywhere with it and mainly just have some diagrams generated from that project that were buggy so that they kind of look like art.

kcorbitt about 3 years ago

What funny timing! Just this week I've actually been working on an open source VS Code extension that uses OpenAI's new code edit API[1] to let you write or edit code in your IDE by typing instructions.

And as a bonus related to the article title, it literally lets you talk to your editor (ie you can press the keyboard shortcut and then give edit commands by voice[2]). I've been leaning on it heavily for the last few days and the setup feels really productive!

If you want to try it out you can install it here: https://marketplace.visualstudio.com/items?itemName=clippy-ai.clippy-ai

You can also find the full source code here: https://github.com/corbt/clippy-ai/tree/main/vs-code-extension

I'd love feedback!

[1]: https://openai.com/blog/gpt-3-edit-insert/

[2]: I just wrote the voice command interface yesterday and it's still highly experimental. Relies on having ffmpeg installed on MacOS and doesn't work with all audio setups yet. But there's a clear path to making it more robust.

hartator about 3 years ago

Notice that the comments used to generate the code via GitHub Copilot are just another very inefficient programming language.

junon about 3 years ago

As with most harmful speech classifiers (even classic models) this most likely won't catch the more passive aggressive remarks: those that are worded innocently but imply something terrible. I've had a 100% success rate getting these sorts of models to tell me asking someone to "kindly end their own life" is not rude, toxic or harmful.
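
A toy sketch of why this happens with per-word models: if the score of a sentence is just the sum of independent word scores, a sentence made entirely of individually harmless words can never be flagged. The weights, threshold, and function names below are hypothetical values chosen for illustration, not taken from any real classifier or from the article's model.

```python
# Hypothetical per-word weights, roughly what a unigram model might learn.
# Positive values push toward "toxic"; words not listed contribute 0.
WORD_WEIGHTS = {
    "idiot": 2.0,
    "stupid": 1.5,
    "hate": 1.0,
    "kindly": -0.5,
    "please": -0.5,
}


def unigram_score(text):
    """Sum the weights of each word independently, ignoring context."""
    tokens = text.lower().split()
    return sum(WORD_WEIGHTS.get(t.strip(".,!?"), 0.0) for t in tokens)


def is_flagged(text, threshold=1.0):
    """Flag the text when its summed score reaches the (assumed) threshold."""
    return unigram_score(text) >= threshold


blunt = "you are a stupid idiot"
polite = "kindly end their own life"

print(is_flagged(blunt))   # True
print(is_flagged(polite))  # False
```

The polite sentence even scores below zero here because "kindly" counts as a politeness signal, which is the failure the comment describes.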

esjeon about 3 years ago

Not really no-code. Let's be honest. The OP is taking steps just like an experienced SW developer would. Copilot simply cut the need for reading through documentation. This doesn't really say that Copilot can replace programmers.

p.s. Does anyone know when Copilot will update the insecure example on their website? Or are they just trying to be honest with the possible quality issues with the generated code?

Ozzie_osman about 3 years ago

This is a game-changer, even if it doesn't work 100% of the time. I only infrequently need to use notebooks and dataframes, I'd say once every few months. Frequently enough that I have a vague idea of what I need to do but not frequently enough that I can remember syntax.

With this, I don't need to memorize the syntax OR be bottlenecked on looking at documentation or stack overflowing the commands I need.

pech0rin about 3 years ago

echoing a bunch of comments but this seems sort of like a nightmare. its like the classic “dont use comments that are exactly what the code is doing”. basically you are requiring writing this type of boilerplate comments which are completely useless but are now so the machines can write the code for you. i guess if you could have some tool that auto-removes these comments afterwards it wouldn’t be terrible but i just see this as a way to have people completely forget apis and then not actually be able to find more powerful tools in a language just living on the rails that copilot provides for you. overall seems like a step backwards, especially if newer devs use this as a crutch when jumping in. now we have a generation of devs who dont actually understand the way things work.

i guess stack overflow has a similar problem but at least there people provide documentation, explanation, and helpful links. this just force feeds you some code. i dont see this as a positive movement for our industry as a whole

arciini about 3 years ago

This is really pretty impressive. I think Copilot makes a lot more sense for these kinds of one-off analysis tasks, where the work is specific data manipulation rather than structuring abstractions. Structuring libraries or building UI requires a lot more understanding of potential users - in that case, writing the requirements is honestly the harder part.

softwarebeware about 3 years ago

I'm out almost immediately. The first comment is more text than the code that it produces.

wojcikstefan about 3 years ago

1. This is not “no-code”. You still have to read & understand the code Copilot generates.

2. I’m very skeptical of a small group of people reading a bunch of online comments and deciding what is “toxic” and “non-toxic”, even more so when it’s done with no clear definitions/guidelines. As their GitHub repo [0] says:

> Rather than operating under a strict definition of toxicity, we asked our team to identify comments that they personally found toxic.

[0]: https://github.com/surge-ai/toxicity

stitched2gethr about 3 years ago

This is actually pretty impressive. More so than I expected, and I sincerely hope this opens the door to simple solutions for those who are still learning or don't code often.

That said, this isn't the robot that replaces us, obviously. Making the process of getting to 80% faster is better for everyone, but the last 20% is tough and anything further needs real expertise. I like how promising this is for the masses.

vba616 about 3 years ago

I thought at first this was a classifier for the toxicity of no-code solutions.

For instance, Microsoft Power Automate should rank highly.

rogue7 about 3 years ago

This is impressive, Copilot knows scikit-learn better than the data scientist that I am.

Loeffelmann about 3 years ago

I've been using copilot for a bit now and it's honestly really impressive. I was skeptical at first and didn't really believe all the praise but it works so well. You still have to understand what the code is doing but more often than not copilot spits out an out-of-the-box working solution. It is phenomenal at writing tests. I can pretty much tell it "write tests for this function" and it will do it with surprising quality and maybe even go through cases I haven't thought about.

I think this technology will really shake up how we code.

TauNeutrino about 3 years ago

It's an AI writing another AI, the miracle of guided reproduction! As programmers we should appreciate the subtle meta in that.

It is also highly symbolic that the first AI (copilot) was created to save humans from repeating toil, while the second (classifier) is about controlling and limiting us.

I believe the author chose to apply his method to this particular example intentionally for the two above points, not because of the hype of toxicity.

holografix about 3 years ago

This would be awesome for crap I don’t want to learn like CSS

bradleybuda about 3 years ago

So, has anyone asked Copilot to write a better Copilot yet?

fsargent about 3 years ago

I seriously thought that GitHub CoPilot was suggesting how to find new kinds of sarin gas. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx How long until it does?

MrYellowP about 3 years ago

I'm not sure people understand how utterly dystopian and fascist this is. It's like people believe that this is a good thing, instead of understanding how totalitarianism is spreading literally everywhere.

"In the name of what's Good & Right, you have to behave how we want you to ... or else."

DeathArrow about 3 years ago

I would like to see a project combining the capabilities of both Copilot and IntelliCode. Copilot uses GPT-3 and GitHub projects for training and gives suggestions based on a few lines; IntelliCode reads the whole project and gives suggestions based on that.

boredumb about 3 years ago

The last thing this world needs is automation around people calling things toxic or problematic.

amelius about 3 years ago

This works because there's a lot of ML code out there, and it's all very much the same.

ah27182 about 3 years ago

The page is not working anymore, getting a 400 error

linkdd about 3 years ago

Can we ask Copilot to write a proof for the Collatz conjecture? Or P=NP?

Metacelsus about 3 years ago

From the title I thought it would be about chemical toxicity.

eric4smith about 3 years ago

Impressive BUT.

Who is defining toxic speech? Where is that data being taken from?

This is the definition of using AI to set what the edges of “speech” should be based on potentially flawed data.

This is a clown world.

hombre_fatal about 3 years ago

This is absolutely insane. I had no idea Copilot was this good.

The negativity here just seems like sour grapes or weird goal posts.

Sure, it makes mistakes and needs verification. But know what also makes mistakes and needs verification? All the code I already manually write as I tediously ratchet towards a solution. Removing some cycles from that process is a win.

Just stubbing out close-enough boilerplate is a win by itself, like setting up an NLP pipeline or figuring out which menagerie of classes need to be instantiated and hooked up together to do basic things in some verbose libs/langs.

xodjmk about 3 years ago

Please add "No-Code" and "Toxicity Classifier" to your toxicity dataset.

dusted about 3 years ago

This comment will of course be down voted, I'll attribute this to selection bias caused by the headline of the article.

You can't classify a comment as boolean toxic, toxicity does not exist in a vacuum. To extend the analogy from its biological counterpart, toxicity depends on the organism. You should never judge a piece of text in isolation and draw any conclusion about it. It must be understood in context, both that of the subject, the recipient and the sender.

nixpulvis about 3 years ago

Fuck people who think they can define speech patterns in datasets like this. Especially since I am required to request permission to view their "Elite" documents.

This is some dystopian shit right here. I don't care what fancy models you train on it, or even what funny jokes you make of it. I'm just so done with this.