Testing GPT 4's code-writing capabilities with some real world problems

567 points, by TylerGlaiel, about 2 years ago

114 comments

danwee, about 2 years ago
I want to see GPT-4 dealing with this situation:

- they: we need a new basic POST endpoint
- us: cool, what does the API contract look like? URL? Query params? Payload? Response? Status code?
- they: Not sure. Third-party company XXQ will let you know the details. They will be the ones calling this new endpoint. But in essence it should be very simple: just grab whatever they pass and save it in our db
- us: ok, cool. Let me get in contact with them
- ... one week later...
- company XXQ: we got this contract here: <contract_json>
- us: thanks! We'll work on this
- ... 2 days later...
- us: umm, there's something not specified in <contract_json>. What about this part here that says that...
- ... 2 days later...
- company XXQ: ah sure, sorry we missed that part. It's like this...
- ...and so on...

Basically, 99% of the effort is NOT WRITING CODE. It's all about communication with people, and problem solving. If we use GPT-X in our company, it will help us with 1% of our workload. So, I couldn't care less about it.
nwienert, about 2 years ago
In before all the comments about how "most code is trivial" or "most programming is stuff that already exists" or "you're missing the point, look how it's getting better".

I really am in awe of how much work people seem willing to do to justify this as revolutionary and programmers as infantile, and also why they do that. It's fascinating.

Thinking back to my first job out of college as a solid entry-level programmer: ChatGPT couldn't have done what I was doing on day 2. Not because it's so hard or I'm so special. Just because programming is never just a snippet of code. Programming is an iterative process that involves a CLI, shell, many runtimes, many files, a REPL, a debugger, a lot of time figuring out a big codebase and how it all links together, and a ton of time going back and forth between designers, managers, and other programmers on your team, iterating on problems that aren't fully clear, getting feedback, testing it across devices, realizing it feels off for reasons, and then often doing it and redoing it after testing for performance, feel, and feedback.

Often it's "spend a whole day just reading code and trying to replicate something very tricky to find" and you only produce a single tiny change deep in the code somewhere. GPT is absolutely terrible at stuff like this.

And yes, often it is finding new solutions that aren't anywhere on the internet. That's the most valuable programming work, and a significant % of it.

Feel like there are 10 more points I could make here, but I'm on my phone and don't like wasting too much time on HN. But man, what a disappointment of critical thinking I've seen in this specific topic.
keyle, about 2 years ago
So, for a bit of fun, I signed up to the GPT-4 thingy plus and I picked a fairly common web application and built it from scratch, only by talking to GPT-4 and copy-pasting the code bits.

I'm actually taken aback by how well it's doing, including providing me some refreshers on stuff I forgot how it should work.

I can see it failing at solving complex problems, but like the blog post mentions, most programming isn't new or hard problems.

This is particularly powerful when you're producing something you've done before, but in a completely different language/stack. You just guide GPT-4 towards the goal; you roughly know the methods needed to get to the end goal and just watch your assistant do all the dirty work.

Looking back, I came from a world of floppy disks; I left them behind for zip disks and CDs, then portable disks and cloud storage. I also came from dialup Internet; I left it behind for ADSL, then fibre. I feel this is a tangential point here too, where AI, whatever it ends up being called, will become a fulltime assistant making our lives easier, so that we can focus on the hard parts and the creative problem solving. What are we leaving behind? For me, mostly Stack Overflow and Google.

You'd be silly to ignore it and palm it off. It's a big deal.
maxdoop, about 2 years ago
It seems so many of you guys are extremely lucky to be working on novel problems requiring elegant new solutions each day. That must be the case, otherwise I don't understand these comments shrugging off GPT-n capabilities around coding.

"Psh, it's just doing stuff it saw from its training data. It's not thinking. It can't make anything new."

In my 11 years as a professional software engineer (that is, being paid by companies to write software), I don't think I've once come up with a truly original solution to any problem.

It's CRUD; or it's an API mapping some input data to a desired output; or it's configuring some infra and then integrating different systems. It's debugging given some exception message within a given context; or it's taking some flow diagram and converting it to working code.

These are all things I do most days (and get paid quite well to do it).

And GPT-4 is able to do all of that quite well. Even likely the flow diagrams, given its multimodal abilities (sure, the image analysis might be subpar right now, but what about in a few years?).

I'm not acutely worried by any means, as much of the output from the current LLMs depends on the quality of the prompts you give it. And my prompts really only work well because I have deeper knowledge of what I need, what language to use, and how to describe my problem.

But good god, the scoffing (maybe it's hopium?) is getting ridiculous.
scg, about 2 years ago
As a human programmer I didn't quite understand the problem statement until I read the whole article and the tests.

I believe the goal is to find a path with the fewest possible "fire" cells, with minimum cost as a tie breaker. The cost of a path is the sum of its cells' costs, and it can't be greater than 5.

If I understood the assignment correctly, I don't think the problem statement is equivalent to what's included in the prompt. Specifically, the prompt doesn't clarify what happens if you have to cross through multiple "fire" cells.

> Fire tiles cost 1 point to move through, but they should avoid pathing through them even if it means taking a longer path to their destination (provided the path is still within their limited movement range)
Faint, about 2 years ago
Remember that these models generate one token at a time. They do not "think ahead" much more than maybe a few tokens in beam search. So if the problem requires search - actual comparison of approaches, and going back and forth between a draft and thinking through the implications - the model can't do it (except in a limited sense, if you prompt it to give its "train of thought"). So it's comparable to you being in front of a whiteboard, hit with a question, and having to start answering immediately without thinking more than you can while talking through your answer at the same time. Doable if you know the material well. If it's a new problem, that approach is doomed. Given that, I think the language models do remarkably well. A little bit of search, and maybe trying to generate the answer in a different order (like short draft -> more detailed draft -> more detailed draft... etc.), will improve things a lot.
theshrike79, about 2 years ago
I just used GPT-4 yesterday to write a Go parser for a specific JSON input.

Within two prompts it could read the JSON data from a stdin stream, unmarshal it to Go structs and print the correct fields to stdout as a human-readable line of text.

Then I told it to colour the timestamp and id fields using the fatih/color package, and it did it correctly.

In total it took me about 4-5 prompts to get where I wanted. I just needed to fine-tune the printing-to-stdout part a bit to get it just how I liked, but it saved me a ton of boring template code writing and iteration.

I could've done it easily myself, but there were a few fiddly bits that would've required me to look up the documentation to check the exact way to do things. GPT-4 had it correct from the start.

Then I asked it to write unit tests for the code, and it confidently started writing correct-looking code that would take the same input and expect the correct output, but just stopped in the middle. Three times. I stopped trying.

And another case:

I tried to use GPT-3.5 to write me a program that would live-tail JSON logs from Sumo Logic and pretty-print them to stdout. It confidently typed out completely correct code with API endpoints and all... except the endpoints didn't exist anymore; Sumo Logic in their great wisdom had removed them completely. The only solution is to use their 5-year-old binary-only livetail executable.

GPT-4 with the same input gave me a shell script that starts a search job with the correct parameters and polls the endpoint that returns the result when it's done.

The speed at which this is developing is really fascinating. I'm not really afraid for my job, but I do love how this will automate (some of) the boring stuff away, a bit like GitHub Copilot did, but better.
importantbrian, about 2 years ago
> After going in circles a few more times, I decided that was it. It got close. It seemed to understand the problem, but it could not actually properly solve it.

I had this same loop issue with ChatGPT. I had something I wanted to do with asyncio in Python. That's not something I work with much, so I thought I'd see if ChatGPT could help me out. It was actually good at getting me up to speed on asyncio and which parts of the library to look at to solve my problem. It got pretty close, but it can't seem to solve edge cases at all. I got into this loop where I asked it to make a change and the code it output contained an error. I asked it to fix the error, so it gave me a slightly modified version of the code prior to the change. So I asked it to make the change again and the code it spit out gave the same error again. I went through this loop a few times before I gave up.

Overall, it's cool to see the progress, but from what I can tell GPT-4 suffers from all the same issues ChatGPT did. I think we're probably missing some fundamental advance, and just continuing to scale the models isn't going to get us where we want to go.

My biggest concern with the current batch of LLMs is that we're in for Stack Overflow-driven development on steroids. There's going to be a ton of code out there copied and pasted from LLMs with subtle or not-so-subtle bugs that we're going to have to spend a ton of time fixing.
legohead, about 2 years ago
I was just complaining to my friend about how much trouble I'm having with it. I purchased the $20 GPT-Plus so I could use GPT-4, after reading someone on HN say that GPT-4 is "scary impressive" at writing code.

I have two tasks I wanted it to try, both making use of public APIs, starting from scratch. In short, it was frustrating as hell. Never-ending import problems -- I'd tell it the error, it'd give me a different way to import, only leading to a new import problem. I think I used up all my 100 GPT-4 queries in 4 hours just on the import/library problem.

Then there was constant misuse of functions -- ones that didn't exist, or didn't exist in the object it was using, but did exist in some other object instead, at which point it would apologize and fix it (why didn't you give it to me correctly the first time, if you "know" the right one?).

The actual code it wrote seemed fine, but not what I'd call "scary impressive." It also kept writing the same code in many different styles, which is kind of neat, but I found one style I particularly liked and I don't know how to tell it to use that style.

Lastly, it's only trained up to Sep 2021, so all the APIs it knew were well behind. I did manage to tell it to use an updated version, and it seemed to oblige, but I don't really know if it's using it or not -- I still continued to have all the above problems with it using the updated API version.

Anyway, I hope MS fiddles with it and incorporates it into Visual Studio Code in some clever way. For now, I'll continue to play with it, but I don't expect great things.
qlm, about 2 years ago
I had a similar experience earlier. I described a problem that isn't even that hard - very similar to something there are probably lots of examples of online, but subtly different. I wanted to see if it handled these subtly different requirements.

It failed miserably, even with repeated instructions. It just assumed I wanted the more common problem. Every time I pointed out the problem it would say "sorry for the confusion, I've fixed it now" and give me back identical code. I even asked it to talk me through test cases. It identified that its own code didn't pass the test cases, but then still gave me back identical code.

I eventually gave up.
ArchitectAnon, about 2 years ago
Here's my perspective on this as an architect: most construction details have been done before, so surely they could easily be reproduced by an AI? There are usually just a few things that are different from the last time I have drawn it. A few 3D interactions with other components that need to be reasoned about. They are not that complicated individually.

But yet I see this problem as well just using old-fashioned automation, let alone AI, to save time. I find that if you haven't drawn the 2D section through all the different edge cases of a particular thing you are trying to design, you haven't done the analysis and you don't really understand what's happening. I've made mistakes where I've been working in 3D on something complicated and I've had to hide some element to be able to view what I'm working on, only to find later that when I turn everything on again I've created a clash or something impossible to build. That's why we still do 2D drawings: because they are an analysis tool that we've developed for solving these problems, and we need to do the analysis, which is to draw section cuts through things, as well as building 3D models. After all, if models were such a good way to describe buildings, then why weren't we just building physical scale models and giving them to the builders 100 years ago? It's because you can't see the build-up of the layers and you can't reason about them.

Reading this article I get the same sense about software engineering: if you haven't solved the problem, you don't really understand the code the AI is generating, and so you don't really know if it is going to do what you've tried to describe in your prompt. You still have to read the code it's generated and understand what it is doing to be able to tell if it is going to do what you expect.
sinuhe69, about 2 years ago
So it continues to reaffirm what we've known: a generative LLM does not have a model of the world, cannot reason and cannot plan. It generates text by mixing and matching remembered texts, and thus it cannot generate truly new content.

No surprise, because GPT-4 is built upon the same model as GPT-3. Clever engineering will bring us far, but a breakthrough requires a change of the fundamentals.

Nevertheless, it's useful and can help us solve problems when we guide it and split the work into many smaller subunits.
gwoolhurme, about 2 years ago
I guess my genuine question is for the people who are saying this is a big deal and it will take our jobs. I am a bit lucky in that at the moment I am working in a "novel" field. Let's say, though, for the sake of argument, that AI does come for SWE jobs. To be honest? I don't know what to do in that case. I have no backup plan, not enough to retire. The country I've lived in for 9 years is still through a work visa (hopefully at least that changes soon). I am just comfortable enough with my salary. If all that is pulled from under me, I lose my job tomorrow, I lose my profession, my visa, my home. I honestly would like to ask the people who say this is coming for us soon: well OK, but what is your advice for someone like me? It's true society doesn't owe me anything; nobody does. So maybe the only answer is that some of us will be dropped by the wayside. That's what happened before. Just curious what anyone's advice would be, assuming they are right and it does take our jobs.
yawnxyz, about 2 years ago
As a designer/non-coder, it feels like I'm just pair programming all the time.

Stuff that usually took me a long time, like regexes or Excel/Sheets formulas, now takes like two minutes. AND I'm learning how they work in the process. I can actually write regexes now that used to be wildly confusing to me a couple of months ago, because Copilot / ChatGPT is walking through the process, making mistakes, and me prodding it along.

I feel like it doesn't matter how "mindblowing" or "a big deal" this tool is — it's a great learning tool for me and helps me do my work 100x faster.
FeepingCreature, about 2 years ago
Yep: the biggest remaining weakness is that it's incapable of thinking deeply and iteratively. This is an architectural limitation (lack of reflectivity), but fixing it will probably usher in the singularity, so maybe we should be glad for it.

I suspect if you poked GPT-4 just right (starting with a detailed design/analysis phase?) it could find a rhetorical path through the problem that resulted in a correct algorithm on the other end. The challenge is that it can't find a path like that on its own.

OP: can you get it to write your algorithm for this problem if you describe it in detail, as-is?

I suspect the difficulty here is just finding a Socratic path to that description, which would tend to be rare in the training material. Most online material explains what and how, not why; more importantly, it doesn't tend to explain why first.
amluto, about 2 years ago
I'm not an AI as far as I know, but I would try a classic programming-competition technique for this and observe that 6 isn't a very big number.

Step 0: Let's try to find a path without walking through fire. Run Dijkstra's or A* to find the shortest path with no fire, up to distance 6. If it succeeds, that's the answer.

Step 1: Okay, that didn't work. We need to go through at least 1 fire tile. Maybe we can do at most 1. Define distances to be a tuple (fire, cost) where fire is the number of fire tiles used and cost is the cost. Comparison works the obvious way, and Dijkstra's algorithm and A* work fine with distances like this. Look for a solution with cost at most (1, 6). Implemented straightforwardly this will likely explore the whole grid (which may be fine), but I'm pretty sure that the search could be pruned when the distance hits values like (0, 7), since any path of cost (0, 7) cannot possibly be a prefix of a (1, c) path for any c <= 6. If this succeeds, then return the path -- we already know there is no path of cost (0, c) for c <= 6, so a path of cost (1, c) for minimal c must be the right answer.

Step f: We know we need to go through at least f fire tiles. If f > 6, then just fail -- no path exists. Otherwise solve it like step 1 but for costs up to (f, 6). Prune paths with cost (f', c') with c' > 6.

This will have complexity 6*D, where D is the cost of Dijkstra's or A* or whatever the underlying search is. Without pruning, D will be the cost of a search with no length limit but, with pruning, D is nicely bounded (by the number of tiles within Manhattan distance 6 of the origin times a small constant).

For a large level and much larger values of 6, this could be nasty and might get as large as t^2 * polylog(t) where t is the number of tiles. Fortunately, it is upper-bounded by 6 and doesn't actually get that large.
it_citizen, about 2 years ago
To be fair, in 15 years writing code, I have spent the vast majority of my time working on minor variations of problems already solved.<p>I am not a fancy developer coming up with new algorithms. I make sign up flows, on-boarding flows, paginated lists, integrations to other apis.<p>And I definitely feel that my job might be threatened by LLMs.
graboid, about 2 years ago
Yesterday evening I thought it would be fun to try to do a raycaster in Javascript with the help of GPT-4. The experience was mixed.

1. Basic rendering logic was a breeze. I barely had to change anything, just copy paste, and I have a map with walls that were darker the further away they were, using textures, and basic movement using arrow keys. For an inexperienced graphics programmer like me this probably saved hours getting to that point.

2. I asked it to add a minimap. It did not work perfectly on the first try, but after a few minutes of exchanging messages, it worked and looked okay.

3. I asked for an FPS display. Worked on the first try.

4. Now I asked for a solution to render walls of different heights. Here I had to correct it a few times, or suggest a different approach, but it got it working halfway correct (but not very performant). Definitely took way longer than steps 1 to 3 combined (30+ minutes).

5. I asked for floor rendering (often called "floorcasting"). Here it completely failed. The code it suggested often looked like it might be the right approach, but never really worked. And the longer we exchanged messages (mostly me giving feedback on whether the code worked or suggesting possible fixes), the more it seemed to hallucinate: very often variables suddenly appeared that were defined nowhere or in a different scope. At that point, it became increasingly frustrating for me, and I often closed the chat and "reset" by posting my complete working code and again prompting for a solution to the floor rendering. Still, until I went to bed, it did not produce any working solution. In retrospect, it would probably have been faster to read a tutorial on how the floorcasting should work and implement it myself like a caveman, but that was not what I was aiming for.

It was definitely fun, and I can clearly see the potential time savings. But maybe I have to learn to recognize when it won't bring me past a certain point, so I can save time and nerves by switching to "manual control".
chillfox, about 2 years ago
I don't think I have ever solved a truly new problem from scratch when programming... It's all been "apply algorithm X to problem Y" or CRUD stuff.

The most difficult problem that I have asked GPT-4 to solve was writing a parser for the Azure AD query language in a niche programming language, and it did that just fine (I did have to copy-paste some docs into the prompt).
ThreeToZero, about 2 years ago
A modified A* that solves the fire routing problem (less efficiently than OP's, I think).

Each A* location stores where it comes from, how long it takes to get to it, and how many fires it passed through to get there. The algorithm only considers fire-cell neighbors if the current number of fires passed through is less than the current fireWillingness global.

1. count fire tiles within movement range
2. run A* from src to dst completely avoiding fire
3. if we can reach then that's the solution
4. if we can't reach, increase fireWillingness to 1, re-run A* on the board
5. keep increasing fire-willingness until the A* results don't change, or we can now reach the dst.

This works because a low-fire path is always better than a high-fire path. And increasing fire-tolerance will only shorten the paths from src to dst.
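As an illustration of this iterative scheme (an editorial sketch, not code from the comment), here is one way it could look in Python. To keep it short, it assumes every step costs 1 point within a fixed movement budget (so the water-cost rule from the original problem is omitted), marks fire tiles as 'F' in a list-of-strings grid, and keys visited states by (tile, fires crossed) so that a cheap route through fire cannot hide a fire-free route to the same tile:

```
# Sketch of "re-run the search with increasing fire willingness".
# Illustrative assumptions: 4-directional moves, every step costs 1 point,
# grid is a list of strings where 'F' marks fire and anything else is walkable.
from collections import deque

def search(grid, start, goal, willingness, max_steps):
    """Breadth-first search that may enter at most `willingness` fire tiles."""
    rows, cols = len(grid), len(grid[0])
    seen = {(start, 0)}                       # state = (tile, fires entered so far)
    queue = deque([(start, 0, [start])])
    while queue:
        (x, y), fires, path = queue.popleft()
        if (x, y) == goal:
            return path
        if len(path) - 1 >= max_steps:        # movement budget exhausted
            continue
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < cols and 0 <= ny < rows):
                continue
            nfires = fires + (grid[ny][nx] == 'F')
            if nfires > willingness or ((nx, ny), nfires) in seen:
                continue
            seen.add(((nx, ny), nfires))
            queue.append(((nx, ny), nfires, path + [(nx, ny)]))
    return None

def path_with_fewest_fires(grid, start, goal, max_steps=5):
    # Try a fire-free path first, then allow one more fire tile per pass.
    for willingness in range(max_steps + 1):
        path = search(grid, start, goal, willingness, max_steps)
        if path is not None:
            return path
    return None                               # unreachable within the budget
```

Because the outer loop starts at zero and only raises the willingness when no path was found, the first path returned crosses the fewest possible fire tiles, and the inner BFS makes it the shortest such path under these simplified costs.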
cdchn, about 2 years ago
If you look at the Leetcode scores, it looks like GPT-4 can generally do most "basic" Leetcode problems but fails on "medium" or "hard" ones. This seems to align with most people's experience of using GPT-3/3.5/4 to generate code: it works well for simple cases (which you could probably find examples of online) but stumbles on the nuances of incrementally harder problems.
irobeth, about 2 years ago
> I think ChatGPT is just kind of bullshitting at this point. It doesn't have an answer, and cannot think of one, so it's just making shit up at this point [...] But instead it's [overconfident] in its own capabilities, and just makes shit up. It's the same problem it has with plenty of other fields

If anything, the article demonstrates it can write code, but it can't thoroughly reason about problems it hasn't been trained on.

So when saying something like "It's possible that similar problems to that have shown up in its training set" as a way to dismiss any scintilla of 'intelligence', how many of these articles reduce to a critique like "Can a middle schooler actually understand dynamic programming?"

Like, what is the actual conclusion? That a software model with O(N) parameters isn't as good as a biological model with O(N^N) parameters? That artisans need to understand the limits of their tools?
laserbeam, about 2 years ago
Actually, if the state A* searches through is not "tile reached" but "tile reached + count of fires on path", then it just becomes regular A*. This solves the "A to C doesn't always go through B" issue, because it turns B into multiple distinct states, some with fires, one without.

There are a few issues with this. The search state is bigger (performance goes down), it might not scale if other search features are needed in the game, and you might need to be smart about when you stop the search and how you write your heuristic so you don't have to reach all combinations of fire counts before you end your search...

But the trick to "just use A*" is not in modifying the cost, but in changing the search space.

PS. I see no reason why you should change your current code, obviously.

PPS. I don't think GPT could come up with that insight. It sure didn't in your case.
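To make that insight concrete (an editorial sketch, not the commenter's code): once the state is (tile, fires crossed), a plain uniform-cost search with a lexicographic priority does the job. The grid encoding, movement budget, and tile costs below are illustrative assumptions:

```
# Sketch of searching over (tile, fires crossed) states rather than tiles.
# Illustrative assumptions: 4-directional moves, budget of 5 points,
# '.' costs 1, '~' (water) costs 2, 'F' (fire) costs 1 but is counted separately.
import heapq

def best_path(grid, start, goal, budget=5):
    rows, cols = len(grid), len(grid[0])
    tile_cost = {'.': 1, '~': 2, 'F': 1}
    # Priority = (fires, cost): any path through fewer fire tiles beats any
    # path through more, and movement cost only breaks ties.
    dist = {(start, 0): 0}                       # best cost per (tile, fires) state
    heap = [(0, 0, start, [start])]
    while heap:
        fires, cost, tile, path = heapq.heappop(heap)
        if tile == goal:
            return path                          # fewest fires, then cheapest
        if cost > dist.get((tile, fires), float('inf')):
            continue                             # stale heap entry
        x, y = tile
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < cols and 0 <= ny < rows):
                continue
            kind = grid[ny][nx]
            new_cost = cost + tile_cost[kind]
            if new_cost > budget:                # outside the movement range
                continue
            new_fires = fires + (kind == 'F')
            if new_cost < dist.get(((nx, ny), new_fires), float('inf')):
                dist[((nx, ny), new_fires)] = new_cost
                heapq.heappush(heap, (new_fires, new_cost, (nx, ny), path + [(nx, ny)]))
    return None                                  # unreachable within the budget
```

Keeping fire count in the state key is what avoids the "cheaper but fire-heavy label hides the fire-free route" trap; an A*-style heuristic such as (0, Manhattan distance to the goal) could be added to the priority to prune the search further, since every step costs at least 1 point.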
numbsafari, about 2 years ago
Hey GitHub / Microsoft / OpenAI,

How about this prompt:

I have a web page where customers see their invoice due. When they enter their credit card information, sometimes the page just refreshes and doesn't show any kind of error information whatsoever, but the invoice remains unpaid. This has been going on FOR YEARS NOW. Can you write some code to fix this, as we have been busy laying off all the umans.

Oh, or this one:

I have this page called "Pull Request", and at the bottom there is a button that says "Comment" and right next to it is a button that says "Close this PR". We probably shouldn't have a button that performs a destructive action immediately next to the most common button on the page. This has also been going on for years, but, you know, no umans.
capableweb, about 2 years ago
Personally, I found GPT-4 to be helpful when writing code for games. But I'm a web programmer trying to learn game development; I'm no professional game developer by any measure. And I'm using Rust and Bevy, for what it's worth. So it might not be as helpful for someone like Tyler who actually knows what they are doing, and similarly for me if I were to use it for web development.

The most helpful thing with GPT-4 has been getting help with math-heavy stuff I don't really grok, where I can try to compile the code, get an error, and instruct GPT-4 that the code didn't work, here is the error, please fix it. Another thing it has been helpful for is applying the "Socratic method" to help me understand concepts I don't really grok, like quaternions. Then, knowing GPT-4 isn't perfect, I always verify the information it tells me, but it gives me great starting points for my research.

Here is a conversation I had lately with GPT-4 in order to write a function that generates a 2D terrain with Perlin noise: https://pastebin.com/eDZWyJeL

Summary:

- Write me a 2D terrain generator
- Me reminding GPT-4 it should be 1D instead of 2D (I used the wrong wording, confusing a 1D vector with 2D)
- Code had issues with returning only values of 0.0
- GPT-4 helping me track down the issue, where I used the `scale` argument wrong
- Got a working version, but unhappy with the unrealistic results, I asked it to modify the function
- Finally got a version I was happy with
theK, about 2 years ago
I think this article is a great example of the one key shortcoming that AI-based code generation has. Even a seasoned developer will fail to describe the intricate details and context of what they are trying to do. Non-developers constantly fall flat on their face on this and rely on devs to “keep the edge cases in mind”, etc.
naillo, about 2 years ago
The biggest thing here is that it's semi-capable and improving. I feel safe about my job right now, but it is worrying to invest time to compete with a machine that will continue to get better over the years, where previously I felt safe that the effort of my labour would bear fruit for decades to come. Now I'm not so sure.
jsemrau, about 2 years ago
GPT-3 worked well for me with smaller programming tasks, i.e. helper functions, API calls, etc. In those cases it was easier to type: "Write a JavaScript function that does X."

It totally failed for me at creating a nice-looking website using Bootstrap. While GPT-3 created a workable outline, it never looked right and the CSS adjustments never worked.
davbryn, about 2 years ago
I'm more intrigued by why the author finds this a difficult problem for the needs of their game. It looks like their search space is at most a 10 x 10 grid for the most part (I'm assuming, based on asset size and detail, that it doesn't grow too much larger).

I know it isn't relevant to the ChatGPT code-writing discussion, but A*, Dijkstra and heuristics to move an entity around 8 spaces could raise the question "Can the developer be more pragmatic?".
iambateman, about 2 years ago
The replies here are defensive and, I think, misguided. Yes, a programming job doesn’t solely consist of typing code. But the reason we have well-paid programming jobs is that there is a specialized skillset required to understand a body of syntax that takes several years to really grasp.

The difference is that writing a well-formed prompt is massively easier to teach than writing the code itself, for similar results. That’s not to say prompt writing requires no skill - it will certainly need an understanding of systems and the scope of what is possible within a language. Asking GPT-4 to write a jQuery plugin that generates an original Bob Dylan song will probably just not work.

But it is wildly easier to teach someone what is possible with JavaScript, let them spend a month watching someone prompt the system, and let them go from there.
kneel, about 2 years ago
I taught myself how to code and have never been very good. I don't code often, and when I do I spend a lot of time relearning some simple programming detail I forgot.

ChatGPT (also Copilot) allows me to focus on the project that I'm working on and offload the Stack Overflow searches to prompting. I don't have to find a similar error someone else posted on SO and figure out how it applies to my current problem. I can keep a high-level view of the project and not get bogged down with silly bugs, learning new libraries, or deciphering someone else's code.

I imagine there are a lot of people who are in a similar situation; it's crazy that we've just unleashed this massive productivity booster onto millions of people.
akamoonknight, about 2 years ago
One thought that came to mind when the author mentioned that GPT-4 seemed to want to use A* was that maybe the problem statement was not formed in a way that would get the best response. I'm not entirely sure if this type of back and forth would be useful, but it just struck me as interesting to try.

I put the following prompt into ChatGPT (no GPT-4 access for me yet):

A problem statement is given as follows:

An agent needs to find the best path on a 10x10 tile grid from their current location to a target location. They have a limited movement range of 5 points. Regular tiles cost 1 point to move through, water tiles cost 2 points to move through. Fire tiles cost 1 point to move through, but they should avoid pathing through them even if it means taking a longer path to their destination (provided the path is still within their limited movement range).

Please point out any parts of the problem statement that require clarification.

Which returned the following response:

Based on the given problem statement, some potential areas that may require clarification are:

Current location and target location: It is not clear if the current location and the target location are fixed or if they change with each problem instance. If they are fixed, it should be mentioned explicitly. If they are not fixed, it should be specified how they are generated for each problem instance.

Movement range: It is not specified whether the agent's movement range is circular or square. If it is circular, it means that the agent can move to any tile that is within a distance of 5 from its current location. If it is square, it means that the agent can move to any tile that is within a range of 5 tiles in both the horizontal and vertical directions. The shape of the movement range needs to be specified.

Fire tiles: The problem statement mentions that the agent should avoid pathing through fire tiles even if it means taking a longer path to their destination, but it is not clear how the agent knows which tiles are fire tiles. It should be specified how the agent perceives and distinguishes fire tiles from regular tiles.

Tile costs: The problem statement mentions that regular tiles cost 1 point and water tiles cost 2 points, but it is not specified what happens if the agent moves diagonally. It should be specified whether diagonal moves are allowed and how much they cost.

Objective function: It is not specified how the "best path" is defined. It should be mentioned whether the objective is to find the shortest path or the path that minimizes some other cost function (e.g., total cost of the path).
PopePompus, about 2 years ago
I'm a casual programmer, who knows enough to write decent Python scripts, but who is probably unaware of 99% of the Python library modules that have been written. Yesterday I had GPT-4 write a script that would accept the name of a star and print out all the journal articles that have that star's name in the article title. This is a bit trickier than it sounds, because almost every interesting star has many names (Vega, for example, has more than 60 names, not including non-English names) and I wanted the script to check the titles for all the names that might be used for the particular star I had specified. I told GPT-4 to use the SIMBAD database to get all the star names, and to use NASA ADS to get all the publications. GPT-4 wrote a script to do that. The script was buggy, but I was able to fix the bugs easily and quickly. The wonderful thing was that GPT-4 used 2 different libraries that I had never even heard of to pull data out of those databases. The process of producing the script was far faster than I would have managed on my own. Professional programmers may be well aware of the software packages that will allow them to do their jobs, and might not get much help from a GPT assistant. But I think for people who know how to program, but are not professionals and may not know "what's out there" in the way of resources, a GPT assistant will vastly increase their ability to use their programming skills (such as they are...) to get useful stuff done.
bobek, about 2 years ago
> The useful thing to do would be to just say "I do not know of an algorithm that does this." But instead it's overcompetent in its own capabilities, and just makes shit up.

I recently had a very similar reaction. And then I realized that this is exactly the same behavior as with many of my colleagues at work...
sequoia, about 2 years ago
I had a teacher with a sign on her door that read:

    Technology will not replace teachers
    But teachers who use technology will replace those who don't

s/teachers/programmers/ and s/technology/AI/ and this sounds about right. It may become typical or even required to leverage AI to write code more efficiently.
braingenious, about 2 years ago
I tried out GPT-4 today with the task of "take some HTML files made by a non-technical person using various versions of Microsoft Word over a decade ago and put the contents into a CSV", and it hasn't done great. Not terrible, but not great.

That being said, I don't know anybody talented enough to handle it that would even look at this project for $20, so ¯\_(ツ)_/¯
sgarland, about 2 years ago
There are plenty of edge cases where it fails. However, the one thing that made me think it actually knows (for certain definitions of the word "knows") what it's doing was asking it to re-write a non-trivial SQL query into equivalent relational algebra. I created a simplified schema from Northwind [0], gave it the CREATE TABLE statements for tables, and then some sort-of-TSV files for the values.

It was able to not only produce reasonable outputs from various queries, but also to produce valid relational algebra for them. To me, that shows a fairly deep level of understanding of the underlying concepts.

[0]: https://en.wikiversity.org/wiki/Database_Examples/Northwind
closetnerd, about 2 years ago
What's most astounding is that it was expected to work when so much of the prompt just doesn't make sense:

> Objects should only move if they will end up on an empty tile after their move

> "An object is free to move onto a tile that another object moves onto if that other object is moving off of its tile"
MavisBacon, about 2 years ago
I think the question also needs to be asked: can GPT-4 write accessible/WCAG-compliant code? I just spent the last two days at a digital accessibility conference called axe-con, financed by a firm called Deque, who primarily make AI/ML-powered software for the detection of accessibility issues in code.

The resounding attitude seems to be that AI/ML is a friend to disabled users and can help do a lot of the lifting with writing, maintaining, and auditing code - but we are a long, long way from fully automated processes that account for accessibility and produce websites that will work with assistive tech like screen readers, if that is possible at all.
r3trohack3r, about 2 years ago
Watching GPT write code hurts my brain.

It starts at the first character, works forward one "token" at a time, and ends at the last character. Never moving back.

It feels like it knows where it's going at the first character, even though it doesn't.

It's like it starts speaking a sentence and, by the time it's done speaking, it's written a syntactically correct Node.js application.

The way GPT communicates in English does seem similar to how humans communicate. The way GPT writes code doesn't seem to come close to approximating how humans do it - it's an entirely different mechanism. Humans generally can't write code without a cursor and backspace.
yanis_t, about 2 years ago
I find it interesting that many people have taken a defensive position towards AI. For many, the discourse seems to be "will this AI thing eventually replace me and kick me out of my job".

For me it's more like: will this AI thing make me a 10x developer? And the answer I'm leaning toward is yes.

I use Copilot, which saves me time googling and reading Stack Overflow. I use ChatGPT for writing tests for my code (which I hate to do myself). Sometimes I use it to ping-pong ideas, and eventually settle on a good solution to a problem.

It saves me tons of time that I use to complete other tasks (or spend with my family).
halflife, about 2 years ago
I had the exact same experience. Writing code for existing popular problems is phenomenal. But when you diverge slightly, it breaks down. I asked it to write a regex which finds all HTML tags that have a specific class name but do not contain another specific class name. I assume this problem has been tackled many times by scores of developers. It output an excellent regex. I then asked it to ignore text in inline scripts (such as event handlers), and it presented an invalid regex. I tried to point out the problem, but it just went into a loop of bad regex code.
fendy3002, about 2 years ago
I've used ChatGPT, and while it's mediocre, even bad, at writing code, it's very good at reading code and explaining it in a way that's easier to understand. It's also good at giving hints and starting points for when you're not quite familiar with a language feature / library.

For writing code, it's good at bootstrapping unit tests and skeleton code, and also useful for transpiling DTOs / entities between languages.

Overall, if you're willing to learn and not just treat a GPT as a code monkey, it's very useful.
race2tb, about 2 years ago
I like how it ends with: it can write code that is repetitive and has many examples of solutions floating around, which is most of coding.

It will probably edge upward as time goes by, until there are only very edge problems that it cannot solve. Even then I would use it to write the broken-down version of the solution. It is going to be fed by pretty much every programmer and knowledge professional on the planet using copilots of some sort. Eventually, everything humans can do will have been transferred into its model.
sequoia, about 2 years ago
One other question: can GPT-4 reliably modify code? In A Philosophy of Software Design, the author points out that code is written once and modified possibly dozens of times, so ease of maintainability/reading is more important than ease of writing.

I wonder whether a) AI can reliably modify code, and b) AI can reliably write code that is able to be easily modified by humans. If AI starts spitting out machine code or something, that's not useful to me even if "it works."
lxe, about 2 years ago
It is far from being able to solve difficult or even annoyingly complicated problems that programmers solve on a regular basis with just a one-shot prompt.

Ask it to parse a PDF document and separate it into paragraphs, for example. The first solution isn't gonna work well, and by the time you get to solving yet another quirk while it apologizes to you for making a mistake, it will lose context.

The best way to use this tool is to ask it short and precise questions that deal with a small piece of code.
GrumpyNl, about 2 years ago
I asked it to write the code for all the unique combinations of A, B, C, D in PHP; after 27 tries it succeeded. Then I asked it to solve the problem where a horse is 15 dollars, a chicken one dollar, and an egg 25 cents, and I can spend 100 dollars on 100 items, some of each. After 2 hours, it was not able to solve it. One time it gave 5 possible answers, with the correct one among them, but it did not recognize the correct one.
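For reference, the puzzle has exactly one answer, and a brute-force check finds it immediately; here is a quick editorial sketch (in Python rather than the PHP asked about above):

```
# Brute force for the puzzle above: a horse is $15, a chicken $1, an egg
# $0.25; buy exactly 100 items for exactly $100, with at least one of each.
for horses in range(1, 7):                    # 7 horses would already cost $105
    for chickens in range(1, 100 - horses):   # leave room for at least one egg
        eggs = 100 - horses - chickens
        if 15 * horses + chickens + 0.25 * eggs == 100:
            print(horses, chickens, eggs)     # prints: 3 41 56
```

That is 3 horses, 41 chickens and 56 eggs: $45 + $41 + $14 = $100, and 3 + 41 + 56 = 100 items.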
dmix, about 2 years ago
From my own sampling of about 10 times I used GPT for real-world TypeScript code, some of it used in production, I can confirm that GPT-4 does a noticeably better job and produces code I actually want to use far more often.

GPT-3.5 always produced very verbose types and over-engineered code. The GPT-4 outputs were consistently shorter and more focused. Kind of like how a junior dev has to think through all the smaller steps and make functions for each, incrementally solving the problem slower and less intuitively, almost over-explaining the basics, while a senior dev merges the simpler stuff into small concise functions. You can see it in the variable names and type choices: GPT-4 focused much more on what the code is trying to accomplish rather than what the code itself is doing. And these are all with the same prompts.

There are still things like unused vars being included occasionally and some annoying syntax choices; if I could append prettier/eslint rules automatically to GPT output it'd be gold (I haven't tried to do this myself).

But still very encouraging.
havefunbesafe, about 2 years ago
If you've never worked as a dev or in product, this will not help you. If you have a working understanding of your codebase, as a product person, and can bump your way through writing code and the command line, it WILL help you immensely. Source: I wrote an integration that connects our API to Google Drive to pass video content (something I could have NEVER done before).
davewritescode, about 2 years ago
I think most folks (perhaps not here) misunderstand that writing code is the easiest part of a software engineering job. Anyone can write code in their little piece of the system, write some tests to prove it works, and move on. Given enough time, I feel most good software engineers can do that part of the job without issue.

Knowing how code might fail and preventing cascading effects, tuning resource usage, and troubleshooting incidents are the actual hard parts of software development, and they are where even good software engineers tend to fall over. We've created whole specialties like SRE to pick up where application developers fall short. I've seen lots of systems fail for the dumbest reasons: thread pools misconfigured, connection timeouts with poor configuration, database connection pools that are completely incorrect.

Wake me up when ChatGPT can troubleshoot at 1 AM when the SRE and on-call engineer are both frantically trying to figure out why the logs are clean but the service is missing its SLO.
fancyfredbot, about 2 years ago
A great article with a practical example of a programmer using GPT to solve a problem it hasn't seen in its training data. It gives plausible but incorrect answers, and the user isn't able to prompt it to correct them.

It seems likely that an understanding of when NOT to use an LLM is a new skill programmers are going to want to learn in order to use their time efficiently.
evo_9, about 2 years ago
It's not bad. I just had it write a small API call to the NHL endpoint to gather some data for something I was curious about stat-wise.

Anyway, I initially had it write it in Python, and it mostly worked, but I was having some issues getting the data exactly right and formatted the way I wanted.

Once I had it more or less right in Python, I had it rewrite it as a dotnet console app (C#), which is what I know best.

The only real issue I ran into is that it would randomly stop before completing the conversion to dotnet. Like it would write 85% and then just stop in the middle of a helper function. Not a huge deal; I just had it complete the last function, and with a little bit of fiddling in VS Code got it running pretty much the way I wanted.

So overall, yeah, not bad. Probably saved me an hour or so, plus I couldn't find great docs for the NHL endpoint, and ChatGPT was able to suss out the correct syntax to get to the data I needed.

I wonder how GitHub Copilot compares; has anyone tried out both?
zamalek, about 2 years ago
I have been writing a text editor, and I'm currently working on the VT100 stuff. Unit testing VT100 is a lot of busy work. There are a bunch of different message frames (DCS, OSC, CSI, etc.) and many, many escape codes.

I decided to try out CodeGPT.nvim, and it was a massive help. It didn't provide perfect code, not by a long shot, but it gave me extremely valuable starting points - and did a somewhat decent job of exercising most of the branches (certainly enough for me to be happy): https://gitlab.com/jcdickinson/moded/-/blob/main/crates/term/src/vt/messages/ansi.rs#L182

Many people have said it, and it's true: expecting GPT to write a complete solution is just asking for problems, but it is an incredible assistant.
awb, about 2 years ago
How are people generating multiple files for larger applications?

I gave it a prompt and asked it to respond with a list of file names required to build the app. Then, when I prompted a file name, it should print the code for that file along with a list of ungenerated file names. It got through two before it got confused.

I'm stuck with having it write one function at a time.
mrandish, about 2 years ago
I'm actually quite excited by what the examples in the article show, despite the fact they show GPT-4 can't replace a good dev solving somewhat tricky algorithm problems.

Reason: I'm a code hobbyist who glues together various modules that have been written by much better programmers than I am. My end goals are never more ambitious than doing pretty simple things, which I'm doing mostly to amuse myself. My biggest time sucks turn out to be tracking down fairly simple syntax things that vary between the different languages and frameworks I'm slapping together (because I rarely spend more than a couple of hours working on any one thing, I never get super familiar with them).

Being a lousy coder with little desire to put in significant effort to improve just to make my personal hobby projects a little easier, a basic AI assist like this looks pretty useful to me.
chapliboy, about 2 years ago
The job of a programmer, in a business context especially, is to take real-world requirements and convert them into clearly defined systems where they can be solved and reasoned about.

I once had a manager telling me what needed to be done. Even with an actual person (me) in the loop, the code produced would often have glaring differences from what he wanted.

By its very nature, code requires a lot of assumptions. In any business context, a lot of things are implicitly or explicitly assumed. If you need a computer, or another person, to give you exactly what you desire, you need to be able to spot the assumptions that are required and then clearly state them. And after a point, that's just programming again.

So this, or some other AI, is more likely to replace JS and Python, or to create another level of abstraction away from systems programming. But programmers will still always be required to guide and instruct it.
uhtred, about 2 years ago
I was thinking last night about how my job as a pretty average software engineer is probably going to be taken by GPT* in less than 5 years, and how skilled blue collar jobs like electricians and plumbers and carpenters are probably much safer, since robotics is way behind AI.
bawolff, about 2 years ago
I feel like GPT is basically just Stack Overflow on steroids. With all the pros and cons that entails.
jokoon, about 2 years ago
People need to understand that AI doesn't think, and doesn't have insight or intuition. AI just repeats patterns it saw in a huge database, but is not able to understand what is going on.

Nobody can really understand what's inside a trained neural network, and nobody is really looking.

No psychologist or neuroscientist can really understand how a human brain, a mouse brain, or even an ant brain or a fly brain works, so don't expect computer scientists to have any insight into doing something relevant with just a small collection of sophisticated statistical methods.

AI is soon going to reach the pseudo-scam status that bitcoin experienced.

ChatGPT is an improved search engine at best.
tomduncalf, about 2 years ago
I didn't have much luck with ChatGPT trying to solve a novel problem (sorry, can't share details). It gave answers that kind of sounded plausible if you didn't really understand the problem, but in reality were no help. It also hallucinated a bunch of research papers that sounded really useful, haha.

I'll have to try GPT-4 for the same thing and see if it's any better. I suspect, though, that this kind of genuinely novel problem solving may be beyond its current abilities (unless you work through it step by step in a very granular way, at which point you're solving the problem and it's writing the code - which could be a glimpse of the future!).
ummonk, about 2 years ago
I use it to reduce the drudgery of writing code like this. I've found I have to do a lot of hand-holding in terms of telling it what data structures and logic it should use. I also just directly tell it what changes it needs to make to fix big bugs or logic errors I spot. That gets it to the point that I can tweak it myself and complete the code.

One of the frustrating things is that it doesn't ask for clarification of something that's unclear - it just makes an assumption. Really demonstrates why software engineering interviews emphasize the candidate asking clarifying questions.
tehsauce, about 2 years ago
This is one of the best analyses of GPT-4 I've read so far. Besides potentially including the visual aspect, I wonder if part of the reason it has trouble with harder problems is that it's been tuned/prompted in a suboptimal way. The advertised use case is mostly "write down the solution for this problem", but for novel problems it does much better when it's given the chance to reason through them before trying to write down a solution. I wonder how much better it would do with a prompt like "try to work out a way to solve this problem, and then validate it to be sure it's a correct solution."
modeless, about 2 years ago
I wonder if the multimodal capabilities would be helpful on easily visualized problems like this. Could it benefit from seeing the diagrams? Seems far fetched, but so did its current capabilities a few months ago.
zitterbewegung大约 2 年前
I had some MATLAB code and I wanted it ported to NumPy. I couldn't get it running in Python, and ChatGPT wasn't doing it correctly either.

On the other hand, it could regurgitate code to use FastAPI and transformers, and it looked correct to me.

When you think about it, this is very, very similar to a Stack Exchange or Google search, but with a much different way to search, and it can synthesize simple things, which limits the complexity of what you want to do. So I don't really think it can write code, but it can surely get you something that gets you 50% there.
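As an aside, two of the usual reasons a line-by-line MATLAB-to-NumPy port "runs" but produces wrong numbers are the indexing base and the reshape order. A tiny self-contained illustration (not taken from the comment above):

```
# MATLAB indexes from 1 and reshapes in column-major (Fortran) order;
# NumPy indexes from 0 and reshapes in row-major order by default.
import numpy as np

A = np.arange(1, 7)                      # 1 2 3 4 5 6

# MATLAB: reshape(A, 2, 3) fills columns first -> [1 3 5; 2 4 6]
matlab_like = A.reshape((2, 3), order="F")

# A naive NumPy port fills rows first -> [1 2 3; 4 5 6]
naive = A.reshape((2, 3))

# MATLAB: A(1) is the first element; porting "A(1)" literally as A[1] is off by one.
first_matlab = A[0]
off_by_one = A[1]

print(matlab_like, naive, first_matlab, off_by_one, sep="\n")
```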
jah242大约 2 年前
While GPT-4 may change this, I think it is important to remember that these general chatbots are not the way we have trained LLMs to write the best code. In fact, coding is one of the few areas where training specifically on source code and maybe some Stack Overflow (not all natural language on the internet) led to better results in the previous iteration of LLMs (the GPT-3 wave). So the real test will be whether the GPT-4 wave of coding-specific LLMs, i.e. a GPT-4 Codex, can 'actually write code'. See:

AlphaCode, Codex, CodeGen
spywaregorilla大约 2 年前
I wanted GPT to help me write some code for Unreal Engine. I was very impressed with what it could do. It was able to write code that correctly utilized Quartz, an experimental plugin for queuing things on the audio thread. Which is awfully impressive, given that Quartz is super niche and doesn't seem to have much documentation around to train on for C++ code.

I presume it is because Unreal Engine is source-available and the model has seen the whole damn thing.

I'm curious whether it must be worse on Unity, which is not source-available.
marcyb5st大约 2 年前
Ah, in my experiments it writes > 90% of the code correctly.

I got the best results with prompts like:

Given the following Python code:

``` a few hundred lines of Python here ```

Write tests for the function name_of_function, maximizing coverage.

The function in this example had a bit of reading/dumping from disk and everything. The returned code correctly created mocks, set up setup and teardown methods, and came up with 4 test cases. I only needed to fix the imports, but that's because I just dumped the Python code without preserving the file structure.

I am amazed how fast these models are evolving.
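A hypothetical example of the kind of test such a prompt produces; the module `myproject.reports` and its `load_report` / `read_text` names are made up here, and only the shape (temp files for setup/teardown, mocked disk access, a handful of cases) mirrors what the comment describes:

```
# Sketch of generated pytest tests, with all project names hypothetical.
import json
from unittest import mock

import pytest

from myproject.reports import load_report  # hypothetical function under test


@pytest.fixture
def report_file(tmp_path):
    # Setup: write a small report to a temporary file; pytest removes tmp_path afterwards.
    path = tmp_path / "report.json"
    path.write_text(json.dumps({"total": 3, "items": ["a", "b", "c"]}))
    return path


def test_load_report_reads_totals(report_file):
    assert load_report(report_file)["total"] == 3


def test_load_report_missing_file(tmp_path):
    with pytest.raises(FileNotFoundError):
        load_report(tmp_path / "does_not_exist.json")


def test_load_report_with_mocked_disk():
    # The generated tests also mocked the disk access; here a hypothetical helper
    # inside the module is patched so nothing touches the filesystem.
    with mock.patch("myproject.reports.read_text", return_value="{}"):
        assert load_report("ignored.json") == {}
```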
yosito大约 2 年前
Once GPT can look at the open issues in my GitHub repos, and submit pull requests that legitimately solve the problems, then I'll worry that AI might be coming for my job.
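Much of such a bot would be plumbing rather than intelligence. A rough sketch of that scaffolding using PyGithub, with the actual fix generation left as an unimplemented placeholder (the repository and branch handling here are assumptions for illustration):

```
# Issue-to-PR scaffolding sketch. `generate_patch` stands in for the model call
# plus the git work, which is the part that does not yet exist in reliable form.
from github import Github  # pip install PyGithub


def generate_patch(issue_title, issue_body):
    # Placeholder: ask a model for a fix, apply it on a new branch, push it,
    # and return the branch name.
    raise NotImplementedError


def open_fix_prs(token, repo_name="someuser/somerepo"):
    gh = Github(token)
    repo = gh.get_repo(repo_name)

    for issue in repo.get_issues(state="open"):
        if issue.pull_request is not None:  # the issues API also returns PRs; skip them
            continue
        branch = generate_patch(issue.title, issue.body or "")
        repo.create_pull(
            title=f"Attempted fix for #{issue.number}: {issue.title}",
            body="Automated attempt; review carefully.",
            head=branch,
            base="main",
        )
```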
shanehoban大约 2 年前
I'm building a pretty neat database with it at the moment. It's not perfect, but it is cutting potentially months of fine-tuning down to just hours. It is amazing, IMHO.
justinzollars大约 2 年前
GPT-3.5 helped me debug something very complex. There was a bug related to symlinks in Neovim with the gopls LSP. The error dialog line was appearing, then disappearing.

ChatGPT walked me through strategies to debug this: confirm everything was set up, tail the RPC log (I wasn't aware that was a feature), and identify the failing path - which was a symlink!

I'm actually blown away by this capability. It was like having a savant next to me. I couldn't have debugged it on my own.
blagie大约 2 年前
Off topic: this problem was fun. I would really enjoy a site with a pile of nonstandard, fun, and interesting problems like this one.

Coming up with a working algorithm took about 30 seconds (I got lucky, not brilliant), but it stretched my brain in an interesting way.

That's different from practice sites like LeetCode, which have pretty cookie-cutter problems. On problems like this one, sometimes:

- I get it in a few seconds, like this case
- Get it in a few minutes
- Get it in a few hours
- Give up and look up the solution

A fun problem a day would be, well, fun.
tarkin2大约 2 年前
The only argument I've heard against our impending doom is:

The productivity gains will not leave people unemployed, but will give managers the opportunity to start more projects.

The role of a developer will change. We'll be looking at generated and regenerated code. But we'll still be in demand by those with ideas and never-decreasing human demand.

This assumes that GPT-X won't end up being used by end-users--bypassing the C-level, the managers, and the developers.
wankle大约 2 年前
When business A's lightly-technical AI operator asks their AI for a solution to push payment information to bank B and describes it, A's AI talks to the bank's AI and they coordinate the creation of the API, and then A's and B's AIs talk to their respective production-counterpart AIs, create the implementation, and put it into production - at that point, I feel we programmers will mostly be obsolete.
mik1998大约 2 年前
As expected, LLMs don't actually think. This is not really a surprising result when you understand that it's a few billion Markov chains in a trenchcoat.
asah大约 2 年前
Many senior engineer friends and I are starting to use ChatGPT like a "junior engineer" to write code for us. It's not even a question anymore.
neom大约 2 年前
I came up in the 90s and used a lot of Dreamweaver to build sites; all my friends thought I was a wizard because I had a website, and that required you to program interweb stuff. Then the net became pretty complex - I gave up around DHTML, but always really appreciated and loved what folks could do with the DOM. I've been thinking a lot recently that GPT might allow me to build again; I get some of those 90s Dreamweaver vibes using it.
PKop大约 2 年前
"I think ChatGPT is just kind of bullshitting at this point."

This line sums up the entire problem with these tools for anything concrete, like analyzing input data, writing code, producing a series of particular facts, data analysis, etc. Much of it can be right, but whatever isn't makes the whole output useless. You'll spend as much time checking its work as producing it yourself.
d357r0y3r大约 2 年前
GPT-4 can write code, but it can't build software.

Well... it *could* build software if humans gave it the right prompts. Coming up with the right prompts is difficult, because it means you're asking all the right questions.

If you're just really good at writing code, then yes, GPT is coming for your job. Do what humans are good at: responding to the needs of other human beings and building solutions around them.
the_af大约 2 年前
The first problem statement that GPT got wrong actually shows a problem with human language.

"Avoid fire" means "do not ever go through fire" to me (and GPT thinks the same, apparently). The author thought it meant "avoid fire if you can, but go through it if there's no other way". This was a problem with informal requirements that could have happened in an entirely human context.
peterisza大约 2 年前
I skipped the article and only read the prompt. I have no idea what to do with fire. Why does it matter what it costs if you must avoid it?
_nalply大约 2 年前
I think what GPT-4 is missing is a feedback loop.

Imagine you were GPT-4 and being asked to write a small program, but you couldn't try it out yourself.
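What such a feedback loop could look like, as a minimal sketch: run the model's candidate against a test command in a temporary directory and feed any failure back as the next prompt. `ask_model` is a placeholder for whatever API call produces the code, and the round limit and test command are assumptions:

```
# Generate-run-refine loop sketch; the LLM call itself is stubbed out.
import subprocess
import tempfile
from pathlib import Path


def ask_model(prompt: str) -> str:
    raise NotImplementedError  # placeholder for the LLM call


def write_with_feedback(task: str, test_cmd: list[str], max_rounds: int = 5) -> str:
    prompt = task
    for _ in range(max_rounds):
        code = ask_model(prompt)
        with tempfile.TemporaryDirectory() as d:
            path = Path(d) / "candidate.py"
            path.write_text(code)
            result = subprocess.run(
                test_cmd + [str(path)], capture_output=True, text=True, timeout=30
            )
        if result.returncode == 0:
            return code  # tests passed
        # Otherwise show the model its own error output and ask it to try again.
        prompt = (f"{task}\n\nYour last attempt failed with:\n"
                  f"{result.stdout}\n{result.stderr}\nFix it.")
    raise RuntimeError("no passing solution within the round limit")
```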
iandanforth大约 2 年前
The second overlapping-crescent-moon solution GPT provides is really interesting. If it was hard to find a counterexample, I wonder if there is a restricted case for the radii of the inner circles for which the proposed algorithm *is* true. I don't have the maths to determine this myself, but would love to hear speculation from others.
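One way to speculate productively is to brute-force it numerically. A small sketch, assuming a crescent is the set of points inside an outer circle but outside an inner circle (my definition, not necessarily the article's): sample points to decide whether two crescents really overlap, and compare that ground truth against any proposed closed-form test over random radius configurations.

```
# Monte Carlo overlap check for crescents, usable as a counterexample search
# harness for any candidate closed-form test.
import random
from math import hypot


def in_crescent(p, outer, inner):
    # outer and inner are (cx, cy, r) triples
    (ox, oy, orad), (ix, iy, irad) = outer, inner
    x, y = p
    return hypot(x - ox, y - oy) <= orad and hypot(x - ix, y - iy) > irad


def crescents_overlap(c1, c2, samples=200_000):
    # Sample inside c1's outer bounding box and look for a point in both crescents.
    (ox, oy, orad), _ = c1
    for _ in range(samples):
        x = ox + random.uniform(-orad, orad)
        y = oy + random.uniform(-orad, orad)
        if in_crescent((x, y), *c1) and in_crescent((x, y), *c2):
            return True
    return False  # "probably not" - sampling can miss very thin overlaps
```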
bruce511大约 2 年前
I don't need GPT-4 to write code; I can do that myself.

I want it to attend all the meetings for me, with endless managers discussing what the code does, should do, could do, what the customer would like it to do, what can't be done, and so on.

Hint to managers: programming doesn't take up my time. Your endless meetings to discuss my programming take up all the time...
yanis_t大约 2 年前
Here's another view. If you're a music composer, you hear the music in your head. But in order to get it out, you need to play and record musical instruments, learn to sing, learn to produce, etc. What if you had a device that takes music from your brain and gives you an mp3 file?

That's what I think AI is doing for developers here.
hackandthink大约 2 年前
Maybe LLM code writing will slow us down (initially). Productivity is hard to measure and can be counterintuitive.

Talking the whole time with your LLM may distract more than it helps.

https://en.wikipedia.org/wiki/Productivity_paradox
burntalmonds大约 2 年前
I'm sure I'll change my mind as this tech improves, but having AI generate code goes against every instinct I have. It's way too easy for there to be a subtle bug in the code, among other problems. It makes me wonder, though, whether AI could be useful for writing tests of my code. And also for AI code review.
Hedepig大约 2 年前
It brings to my mind the levels of self-driving.

We're definitely at 2 right now, and picking away at level 3.

I have heard some people skeptical that we can overcome the problems of truthfulness due to the inherent limitations of LLMs. But, at least on the face of it, it appears we can make incremental improvements.

If only they would actually be *Open*AI.

I have seen
sagebird大约 2 年前
I would ask it to pretend a mathematician knows how to solve it and is having a conversation with a novice programmer attempting to solve the problem, pointing out mistakes and hints at each step and giving examples of where it fails, until a proof is given that the program is correct.
bob1029大约 2 年前
Yes. I only need it to write 50-100 lines at a time for it to be incredibly effective.

My productivity this month has been insane. I'm still writing most of the code the old-fashioned way, but the confidence of having this kind of tool makes it a lot easier to push through boring/tricky items.
Kareem71大约 2 年前
The problem is that reading someone else's old code takes an order of magnitude longer than writing new code.
dukeofdoom大约 2 年前
A solo developer can now afford an assistant. It's liberating, since it makes it easier to get some things done. So you can do more, or have more free time. You can get by using Midjourney for art, and GPT-4 to answer questions and occasionally help to write code.
Havoc大约 2 年前
Not sure it matters? If the majority of coding is gluing things together and it can replace that, then you've suddenly got 10x as many coders gunning for the remaining gigs that have hard problems.

Good for whoever comes out on top, but not sustainable from a societal perspective.
adverbly大约 2 年前
I think having a better understanding of the underlying statistical model of how these AIs are trained is helping me keep back the wave of fear and anxiety associated with AI risks.

The singularity requires AIs to be very good at doing things people have not done before. But this form of machine learning is bad at that. It is like someone who doesn't actually understand anything has somehow managed to memorize their way through whatever topic you're asking about. They have lots of tips and information about things, similar to what you might currently find by doing research. But they don't seem to have what is required to push the boundaries of knowledge or understanding, because they don't really have it in the first place. Or maybe what they have is just very minimal compared to the contribution of their memorization.

Obviously you still have the main risks of breaking capitalism, mass unemployment, pollution of public communications, etc. But honestly, I think each of these is far less scary to me than the existential risk of superintelligence. So in a way I'm actually happy this is happening the way it is right now, and we don't have to deal with both of these risks at the same time.

Our current approach is probably the safest way to progress AI that I can think of: it requires a new model to improve, and it's learning entirely from human data. It might not seem like it, but this is actually pretty slow, expensive, and limited compared to how I expected AI to improve given sci-fi movies or Nick Bostrom's writings (curious what he'd have to say about this resurgence of AI).
diedyesterday大约 2 年前
So this guy is basically complaining about GPT-4 not being a superintelligence. Still, that makes it more powerful and versatile than the great majority of programmers out there... And the game is only getting started. This is just the warmup.
altitudinous大约 2 年前
Blah Blah Blah. I use ChatGPT for this every day to write code to save my own efforts and it is doing just fine thanks. I also use it for creative content in my apps, although I edit this work to get the tone in its writing correct. It is excellent for this.
awill88大约 2 年前
And more things to yawn over! Come on, who cares if it writes code?! Is it that fascinating?
DeathArrow大约 2 年前
I think AI will never write good code, but it can be very useful for very basic stuff, boilerplate, or repetitive stuff - like a smart IntelliCode. In fact, I think MS built some AI into IntelliCode, but not the advanced stuff, so they can sell GitHub Copilot.
abecedarius大约 2 年前
For fun, I had a chat with it starting with a request to work out the *math* of crescent intersection, before committing to code. It still confabulated, but I was able to coax out a solution in the end that made sense.
elif大约 2 年前
I think it would have done well if you added an explicit rule like: "the path chosen should always minimize the number of fire tiles passed through."

The way the prompt was phrased sort of invited the all-or-nothing fire approach.
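That rule is also easy to encode directly. A sketch of Dijkstra over the grid with a lexicographic cost of (fire tiles entered, steps taken), so the path only crosses fire when no fire-free route exists; the grid encoding ('#' wall, 'F' fire, anything else walkable) is an assumption for illustration, not the article's exact format:

```
# Dijkstra with a lexicographic (fires, steps) cost: fire is crossed only when unavoidable.
import heapq


def find_path_cost(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    best = {start: (0, 0)}
    heap = [((0, 0), start)]
    while heap:
        (fires, steps), (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return fires, steps
        if (fires, steps) > best.get((r, c), (float("inf"), float("inf"))):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            cost = (fires + (grid[nr][nc] == "F"), steps + 1)
            if cost < best.get((nr, nc), (float("inf"), float("inf"))):
                best[(nr, nc)] = cost
                heapq.heappush(heap, (cost, (nr, nc)))
    return None  # goal unreachable


# Example: the only route to the goal passes through one fire tile.
grid = ["S#.",
        "F#.",
        "..G"]
print(find_path_cost(grid, (0, 0), (2, 2)))  # -> (1, 4)
```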
danjc大约 2 年前
Code is just the rendering of an algorithm.

It might be best to prompt it with a high-level description of an algorithm, then iteratively prompt it to refine its prior output or add more detail. Rendering to code should be the final step.
LeanderK大约 2 年前
Training an ML model to code is a very interesting challenge. I am surprised by GPT's ability to code given that, as I understand it, it has basically no tools at the ready. I am convinced that it is way harder to code without debugging and other interactive features, both for a human and for a machine. Keep in mind that GPT could not have learned to simulate the code internally, given its fixed runtime.

I think ML models need to learn how to interact with our tools (compiler, debugger, etc.) to really be effective at coding. That's hard.
chrismsimpson大约 2 年前
Prediction based on statistical probabilities != comprehension. So no.
bsaul大约 2 年前
There's a bit of confusion when people say it's not going to replace programmers because they all have tricky things to do in their work week.

This is not how it's going to happen: if your boring, time-consuming tasks take virtually zero time thanks to GPT and let you focus on the 1% that's hard, you've suddenly become 100x more efficient and can thus accomplish the same job as 100 of you. That means the company can now fire 99 coworkers, keeping only you, and end up with the same result.
haolez大约 2 年前
Just imagine if these models fall in love with languages like APL/K/J. Even within their context windows, they would be able to do tons of work.
ben7799大约 2 年前
I'm curious how long it will take to figure out whether these algorithms are plagiarizing OSS or other code they come across, like GitHub Copilot does.

It requires special tools to actually figure out if this is happening. Having seen tests with such tools, the problem seems a lot worse than commonly discussed.

Inserting stolen code, or using OSS code in violation of licenses, is going to be a big mess. Copying snippets versus pulling in dependencies creates tons of issues. Even if you get away with violating licenses, you set yourself up for security issues if the tools plagiarize code with vulnerabilities in a way that won't get updated.

It might mean this stuff is a useful tool for someone with a clue, but not for someone who doesn't know what they're doing.
IIAOPSW大约 2 年前
So, how long until GPT decides it wants to automate all this tedious work and starts trying to code its own language model?
photochemsyn大约 2 年前
The only take-home message here is that people who claim to write 'self-documenting code' are, well, let's not be hyperbolic, but come on. No comments on that code example? Every line could have an explanatory comment; then the author could remember what they were thinking at the time, and it would probably help the AI out too.

> "People who claim code can document itself considered harmful"
est大约 2 年前
tl;dr

> Given a description of an algorithm or a description of a well known problem with plenty of existing examples on the web, yeah GPT-4 can absolutely write code. It's mostly just assembling and remixing stuff it's seen, but TO BE FAIR… a lot of programming is just that.
nathias大约 2 年前
Of course it can write code, but it can't do software development.
tantaman大约 2 年前
I was mock-interviewing ChatGPT for a few hours yesterday with application and system design, plus coding said application. My conclusion was that it was a no-hire for even the most junior positions, because it required considerable amounts of direction to arrive at anything approximating an acceptable solution.

tl;dr - this matches my experiences as well.
phkahler大约 2 年前
I'll be impressed when they can debug existing code.
hackerlight大约 2 年前
Try again with chain-of-thought prompting?
HervalFreire大约 2 年前
Guys, all of this happened within a couple of months.

If you're confident that this thing, as it is right now, won't take over your job, what if I double the accuracy of GPT output?

What if I double it again? Then again? And again? And again?

You realize this is what's coming, right? This thing is literally a baby as of now.
breakingrules大约 2 年前
I was trying to get it to make a document scanner last night. It apologized to me like 10 times, and we eventually got running code, but the result was way off. This thing can write code, but you're not gonna rely on it, and nobody is gonna know it well enough to edit it. It is not there yet; still, it is very helpful for small or extremely simple things. If you tell it to give you an Express server with socket.io and your DB, it will probably set that up for you perfectly.
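For reference, the classic document-scanner recipe is edge detection, then the largest four-point contour, then a perspective warp. A rough OpenCV sketch of that recipe follows; the thresholds, output size, and the assumption that the page is the biggest quadrilateral in frame are exactly the details where such attempts tend to go "way off" in practice.

```
# Minimal document-scanner sketch: edges -> largest 4-corner contour -> perspective warp.
import cv2
import numpy as np


def scan(path, out_w=800, out_h=1000):
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 75, 200)

    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    page = None
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4:  # first big 4-cornered contour is assumed to be the page
            page = approx.reshape(4, 2).astype("float32")
            break
    if page is None:
        return None  # no page-like contour found

    # Order corners top-left, top-right, bottom-right, bottom-left before warping.
    s = page.sum(axis=1)
    d = np.diff(page, axis=1).ravel()
    src = np.array([page[np.argmin(s)], page[np.argmin(d)],
                    page[np.argmax(s)], page[np.argmax(d)]], dtype="float32")
    dst = np.array([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]], dtype="float32")
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(img, M, (out_w, out_h))
```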