TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

How I program with LLMs

919 points by stpn, 4 months ago

53 comments

dewitt, 4 months ago

One interesting bit of context is that the author of this post is a legit world-class software engineer already (though probably too modest to admit it). Former staff engineer at Google and co-founder / CTO of Tailscale. He doesn't *need* LLMs. That he says LLMs make him more productive at all as a hands-on developer, especially around first drafts on a new idea, means a lot to me personally.

His post reminds me of an old idea I had of a language where all you wrote was function signatures and high-level control flow, and maybe some conformance tests around them. The language was designed around filling in the implementations for you. 20 years ago that would have been from a live online database, with implementations vying for popularity on the basis of speed or correctness. Nowadays LLMs would generate most of it on the fly, presumably.

Most ideas are unoriginal, so I wouldn't be surprised if this has been tried already.
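The "signatures plus conformance tests" idea can be sketched in miniature. Everything below is illustrative: the human writes only the typed signature, docstring, and tests, and the body (hand-written here as a stand-in for a generated implementation) merely has to pass them. `rle_encode` and its contract are invented for the example.

```python
# Hypothetical sketch of the workflow: the human authors the signature,
# docstring, and conformance tests; the body is what a code generator (or,
# 20 years ago, a shared implementation database) would supply.

def rle_encode(s: str) -> list[tuple[str, int]]:
    """Run-length encode a string into (character, count) pairs."""
    # --- generated implementation (hand-written stand-in) ---
    out: list[tuple[str, int]] = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

# Conformance tests: the acceptance bar any candidate implementation must clear.
assert rle_encode("") == []
assert rle_encode("aaab") == [("a", 3), ("b", 1)]
assert rle_encode("mississippi") == [
    ("m", 1), ("i", 1), ("s", 2), ("i", 1),
    ("s", 2), ("i", 1), ("p", 2), ("i", 1),
]
```

In this framing, swapping implementations (human-written, database-sourced, or LLM-generated) is safe exactly to the extent the conformance tests pin down the behavior you care about.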
highfrequency, 4 months ago

> *A lot of the value I personally get out of chat-driven programming is I reach a point in the day when I know what needs to be written, I can describe it, but I don't have the energy to create a new file, start typing, then start looking up the libraries I need... LLMs perform that service for me in programming. They give me a first draft, with some good ideas, with several of the dependencies I need, and often some mistakes. Often, I find fixing those mistakes is a lot easier than starting from scratch.*

This to me is the biggest advantage of LLMs. They dramatically reduce the activation energy of *doing something you are unfamiliar with*. Much in the way that you're a lot more likely to try kitesurfing if you are at the beach standing next to a kitesurfing instructor.

While LLMs may not yet have human-level *depth*, it's clear that they already have vastly superhuman *breadth*. You can argue about the current level of expertise (does it have undergrad knowledge in every field? PhD-level knowledge in every field?) but you can't argue about the breadth of fields, nor that the level of expertise improves every year.

My guess is that the programmers who find LLMs useful are people who do a lot of different *kinds* of programming every week (and thus are constantly going from incompetent to competent in things that other people already know), rather than domain experts who do the same kind of narrow and specialized work every day.
mlepath, 4 months ago

The first rule of programming with LLMs is: don't use them for anything you don't know how to do. If you can look at the solution and immediately know what's wrong with it, they are a time saver; otherwise...

I find chat-for-search really helpful (as the article states).
wdutch, 4 months ago

I no longer work in tech, but I still write simple applications to make my work life easier.

I frequently use what OP refers to as chat-driven programming, and I find it incredibly useful. My process starts by explaining a minimum viable product to the chat, which then generates the code for me. Sometimes, the code requires a bit of manual tweaking, but it's usually a solid starting point. From there, I describe each new feature I want to add, often pasting in specific functions for the chat to modify or expand.

This approach significantly boosts what I can get done in one coding session. I can take an idea and turn it into something functional on the same day. It allows me to quickly test all my ideas, and if one doesn't help as expected, I haven't wasted much time or effort.

The biggest downside, however, is the rapid accumulation of technical debt. The code can get messy quickly. There's often a lot of redundancy, and after a few iterations it can be quite daunting to modify.
nemothekid, 4 months ago

I think "chat-driven programming" is the most hyped type of LLM-based programming I see on Twitter, and the one I just can't relate to. I've incorporated LLMs mainly as autocomplete and search: asking ChatGPT to write a quick script, or to scaffold some code for which the documentation is too esoteric to parse.

But when having the LLM do things for me, I frequently run into issues where it feels like I'm wasting my time with an intern. "*Chat-based LLMs do best with exam-style questions*" really speaks to me; however, I find that constructing my prompts in such a way that the LLM does what I want uses just as much brainpower as just programming the thing myself.

I do find ChatGPT (o1 especially) really good at optimizing existing code.
notjoemama, 4 months ago

Our company has a no-AI-use policy. The assumption is zero trust. We simply can't know whether a model or its framework could or would send proprietary code outside the network, so it's best to assume all LLMs/AI do or will send code or fragments of code. While I applaud the incredible work by their creators, I'm not sure how a responsible enterprise-class company could rely on "trust us, bro" EULAs or repo readmes.
Ozzie_osman, 4 months ago

One mode I felt was missed was "thought partner", especially while debugging (aka rubber-ducking).

We had an issue recently with a task queue seemingly randomly stalling. We arrived at the root cause much more quickly than we would have otherwise because of a back-and-forth brainstorming session with Claude, which involved describing the issue we were seeing, pasting in code from the library to ask questions, asking it to write some code to add some missing telemetry, and then probing it for ideas on what might be going wrong. An issue that might have taken days to debug took about an hour to identify.

Think of it as rubber-ducking with a very strong generalist engineer who knows about basically any technical concept.
nunez, 4 months ago

I definitely respect David's opinion given his caliber, but pieces like this make me feel strange that I just don't have a burning desire to use them.

Like, yesterday I made some light changes to a containerized VPN proxy that I maintain. My first thought wasn't "how would Claude do this?" Same thing with an API I made a few weeks ago that scrapes a flight-data website to summarize flights in JSON form.

I knew I would need to write some boilerplate and that I'd have to visit SO for some stuff, but asking Claude or o1 to write the tests or boilerplate for me wasn't something I wanted or needed to do. I guess it makes me slower, sure, but I actually enjoy the process of making the software end to end.

Then again, I do all of my programming in Vim and, technically, writing software isn't my day job (I'm in pre-sales, so, best case, I'm writing POC stuff). Perhaps I'd feel differently if I were doing this day in, day out. (Interestingly, I feel the same way about AI in this sense that I do about VSCode: I've used it; I know what it's capable of; I have no interest in it at all.)

The closest I got to "I'll use LLMs for something real" was using them in my backend app that tracks all of my expenses, to parse pictures of receipts. Theoretically, this would save me 30 seconds per scan, as I wouldn't need to add all of the transaction metadata myself. Realistically, this would (a) make my review process slower, as LLMs are not yet capable of saying "I'm not sure" and I'd have to manually check each transaction at review time, (b) make my submit API endpoint slower, since it takes relatively forever to analyze images (or at least it did when I experimented with this on GPT-4 Turbo last year), and (c) drive my costs way up (this service costs almost nothing to run, as I run it within Lambda's free-tier limit).
bangaladore, 4 months ago

The killer feature of LLMs for programming, in my opinion, is autocomplete (the simple Copilot feature). I can probably be 2-3x more productive, as I'm not typing (or thinking much). It does a fairly good job pulling in nearby context to help it, and that's even without a language server.

Using it to generate blocks of code in a chat-like manner, in my opinion, just never works well enough in the domains I use it on. I'll try to get it to generate something and then realize, when I get some functional result, that I could've done it faster and more effectively myself.

Funny enough, other commenters here hate autocomplete but love chat.
LouisSayers, 4 months ago

The use of LLMs reminds me a bit of how people use search engines.

Some years ago I gave a task to some of my younger (but intelligent) coworkers.

They spent about 50 minutes searching on Google and came back to me saying they couldn't find what they were looking for.

I then typed in a query, clicked one of the first search results and BAM! There was the information they were unable to find.

What was the difference? It was the keywords / phrases we were using.
Balgair, 4 months ago

I'm not a 'programmer'. At best, I'm a hacker, *at best*. I don't work in a team. All my code is mostly one-time usage to just get some little thing done, sometimes a bit of personal stuff too. I mostly use Excel anyway, and then Python, and even then, I hate Python because half the time I'm just dealing with library issues (not a joke, I measured it (and, no, I'm not learning another language, but thank you)). I'm in biotech, a very non-code-y section of it too.

LLMs are just a life saver. Literally.

They take my code time down from weeks to an afternoon, sometimes less. And they're *kind*.

I'm trying to write a baseball simulator on my own, as a stretch goal. I'm writing my own functions now, a step up for me. The code is to take in real stats, do Monte Carlo, get results. Basic stuff. Such a task was *impossible* for me before LLMs. I've tried it a few times. No go. Now with LLMs, I've got the skeleton working and should be good to go before opening day. I'm hoping that I can use it for some novels that I am writing, to get more realistic stats (don't ask).

I know a lot of HN is very dismissive of LLMs as code help. But to me, a non-programmer, they've opened it up. I can do things I never imagined that I could. Is it prod-ready? Hell no, please God no. But is it good enough for me to putz with and get *just* working? Absolutely.

I've downloaded a bunch of free ones from Hugging Face and Meta just to be sure they can't take them away from me. I'm *never* going back to that frustration, that 'Why can't I just be not so stupid?', that self-hating, that darkness. They have liberated me.
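The stats-in, Monte-Carlo, results-out skeleton described above can be quite small. This sketch uses made-up probabilities rather than real player stats, and every name in it is illustrative:

```python
# Minimal Monte Carlo plate-appearance simulator (illustrative probabilities,
# not real stats): draw one uniform number and bucket it into outcomes.
import random

def simulate_plate_appearance(rng, p_hit=0.270, p_walk=0.080):
    """Return 'hit', 'walk', or 'out' for one simulated plate appearance."""
    r = rng.random()
    if r < p_hit:
        return "hit"
    if r < p_hit + p_walk:
        return "walk"
    return "out"

def on_base_rate(trials=100_000, seed=42):
    """Estimate how often the batter reaches base, by simulation."""
    rng = random.Random(seed)
    reached = sum(
        simulate_plate_appearance(rng) != "out" for _ in range(trials)
    )
    return reached / trials
```

With the probabilities above, the estimate converges toward 0.35 (p_hit + p_walk) as trials grow; swapping in a real player's per-outcome rates is the only change needed to simulate that player.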
brabel, 4 months ago

What the author is asking about, a quick sketchpad where you can try out code quickly and chat with the AI, already exists in the JetBrains IDEs. It's called a scratch file [1].

As far as I know, the idea of a scratch "buffer" comes from Emacs. But in JetBrains IDEs, you have full IDE support, even with context from your current project (you can pick the "modules" you want to have in context). Given the good integration with LLMs, that's basically what the author seems to want. Perhaps give GoLand [2] a try.

Disclosure: no, I don't work for JetBrains :D just a very happy customer.

[1] https://www.jetbrains.com/help/idea/scratches.html

[2] https://www.jetbrains.com/go/
rafaelmn, 4 months ago

I disagree about search. While an LLM can give you an answer faster, good docs (e.g. the MDN article in the CSS example) will:

- be way more reliable

- probably be up to date on how you should solve it with the latest/recommended approach

- put you in a place where you can search for adjacent tech

LLM-with-search has potential, but I'd like it if current tools were more oriented toward source material rather than AI paraphrasing.
charlieyu1, 4 months ago

I'm a hobby programmer who never worked a programming job. Last week I was bored, so I asked o1 to help me write a Solitaire card game using React, because I'm very rusty with web development.

The first few steps were great. It guided me to install things and set up a project structure. The model even generated code for a few files.

Then something went wrong: the model kept telling me what to do in vague terms, but didn't output code anymore. So I asked for further help, and now it started contradicting itself: rewriting business logic that was implemented in the first response, producing 3-4 code snippets of the same file that weren't compatible with each other, etc., and it all fell apart.
justatdotin, 4 months ago

Lots of colleagues use Copilot or whatever for autocomplete; I just find that annoying.

Or writing tests: that's... not so helpful. Worst is when a lazy dev takes the generated tests and leaves it at that: usually just a few placeholders that test the happy path but ignore obvious corner cases. (I suppose for API tests that comes down to adding test-case parameters.)

But chatting about a large codebase, I've been amazed at how helpful it can be.

What software patterns can you see in this repo? How does the implementation compare to others in the organisation? What common features of the pattern are missing?

Also, like a linter on steroids, chat can help explore how my project might be refactored to better match the organisation's coding style.
hansvm, 4 months ago

That quartile reservoir sampler example is ... intriguing?

My experience with LLM code is that it can't come up with anything even remotely novel. If I say "make it run in amortized O(1)" then 99 times out of 100 I'll get a solution so wildly incorrect (but confidently asserting its own correctness) that it can't possibly be reshaped into something reasonable without a rewrite. The remaining 1/100 times aren't usually "good" either.

For the reservoir sampler, here it did do the job. David almost certainly knows enough to know the limits of that code and is happy with its limitations. I've solved that particular problem at $WORK though (reservoir sampling for percentile estimates), and for the life of me I can't find a single LLM prompt or sequence of prompts that comes anywhere close to optimality, unless the prompt also includes the sorts of insights which make an amortized O(1) algorithm possible (and, even then, you still have to re-run the query many times to get a useful response).

Picking on the article's solution a bit: why on earth is `sorted` appearing in the quantile estimation phase? That's fine if you're only using the data structure once (init -> finalize), but it's uselessly slow otherwise, even ignoring splay trees or anything else you could use to speed up the final inference further.

I personally find LLMs helpful for development when either (1) you can tolerate those sorts of mishaps (e.g., I just want to run a certain algorithm through Scala and don't really care how slow it is if I can run it once and hex-edit the output), or (2) you can supply all the auxiliary information so that the LLM has a decent chance of doing it right -- once you've solved the hard problems, the LLM can often get the boilerplate correct when framing and encapsulating your ideas.
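For readers who haven't met the technique: the baseline design under discussion, a reservoir sampler with the sort-at-query quantile step the comment objects to, looks roughly like this. It is a simple sketch of the naive version, not the amortized-O(1) design, and the class and parameter names are invented here:

```python
# Classic reservoir sampling (Algorithm R): keep a uniform random sample of
# size k from a stream, then estimate quantiles by sorting the sample at
# query time -- the `sorted` call is exactly the cost being criticized above.
import random

class ReservoirQuantile:
    def __init__(self, k, seed=0):
        self.k = k            # reservoir capacity
        self.n = 0            # items seen so far
        self.sample = []
        self.rng = random.Random(seed)

    def add(self, x):
        self.n += 1
        if len(self.sample) < self.k:
            self.sample.append(x)
        else:
            # Replace a random slot with probability k/n, keeping the
            # reservoir a uniform sample of the whole stream.
            j = self.rng.randrange(self.n)
            if j < self.k:
                self.sample[j] = x

    def quantile(self, q):
        # O(k log k) per query: fine for a single init -> finalize use,
        # wasteful if queried repeatedly.
        s = sorted(self.sample)
        return s[min(int(q * len(s)), len(s) - 1)]
```

Keeping the reservoir in an order-maintaining structure (the splay trees mentioned above, or a balanced BST with subtree counts) is one way to avoid re-sorting on every query.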
wrs, 4 months ago

I’ve been working with Cursor’s agent mode a lot this week and am seeing where we need a new kind of tool. Because it sees the whole codebase, the agent will quickly get into a state where it’s changed several files to implement some layering or refactor something. This requires a response from the developer that’s sort of like a code review, in that you need to see changes and make comments across multiple files, but unlike a code review, it’s not finished code. It probably doesn’t compile, big chunks of it are not quite what you want, it’s not structured into coherent changesets... it’s kind of like you gave the intern the problem and they submitted a bit of a mess. It would be a terrible PR, but it’s a useful intermediate state to take another step from.

It feels like the IDE needs a new mode to deal with this state, and that SCM needs to be involved somehow too. Somehow help the developer guide this somewhat flaky stream of edits and sculpt it into a good changeset.
choeger, 4 months ago

Essentially, an LLM is a compressed database with a universal translator.

So what we can get out of it is everything that has been written (and publicly released) before, translated to any language it knows about.

This has some consequences:

1. Programmers still need to know what algorithms or interfaces or models they want.

2. Programmers no longer have to know a language very well to write code, but they do to fix bugs. Consequently, the rift between garbage software and quality software will grow.

3. New programming languages will face a big economic hurdle to take off.
cratermoon, 4 months ago

But the question must be asked: at what cost?

Are the results a paradigm shift so much better that it's worth the hundreds of billions sunk into the hardware and data centers? Is spicy autocomplete worth the equivalent of flying from New York to London while guzzling thousands of liters of water?

It might work, for some definition of useful, but what happens when the AI companies try to claw back some of that half a trillion dollars they burnt?
owebmaster, 4 months ago

I thought his project, sketch.dev, is of very poor quality. I wouldn't ship something like this: the auth process is awful and broken, and I still can't log in. If, 14 hours after the post, the service is still hugged to death, it also means the scalability of the app is bad. If we are going to use LLMs to replace hours of programming, we should aim for quality too.
singpolyma3, 4 months ago

It seems like everything I see about success using LLMs for this kind of work is for greenfield projects. What about three weeks later, when the job changes to maintenance and iteration on something that's already working? Are people applying LLMs to that space?
ripped_britches, 4 months ago

I'll say that the payoff for investing the time to learn how to do this right is huge. Especially with Cursor, which allows me to easily chat around context (docs, library files, etc.).
simondotau, 4 months ago

I've recently started using Cursor because it means I can now write Python, where two weeks ago I couldn't write Python. It wrote the first pass of an API implementation after I fed it the PDF documentation. I've spent a few days testing and massaging it into a well-formed, well-structured library, pair-programming style.

Then I needed to write a simple command-line utility, so I wrote it in Go, even though I've never written Go before. Being able to make tiny standalone executables which do real work is incredible.

Now if I ever need to write something, I can choose the language most suited to the task, not the one I happen to have the most experience with.

That's a superpower.
golergka, 4 months ago

I wrote a small full-stack app over the holidays, mostly with LLMs, to see how far they would get me. Turns out, they can easily write 90% of the code, but you still need to review everything, make the main architectural decisions, and debug stuff when the AI can't solve the bug after 2-3 iterations. I get a huge productivity boost and at the same time am not afraid that they will replace me. At least not yet.

Can't recommend aider enough. I've tried many different coding tools, but they all seem like a leaky abstraction over LLMs' medium of sequential text generation. Aider, on the other hand, leans into it in the best possible way.
dxuh, 4 months ago

Currently a lot of my work consists of looking at large, (to me) unknown code bases and figuring out how certain things work. I think LLMs are currently very bad at this, and it is my understanding that there are problems with increasing context window sizes to multiple millions of tokens, so I wonder if LLMs will ever get good at this.
sublimefire, 4 months ago

I've been doing that for a while as well and mostly agree. One thing that I find useful, though, is building local infrastructure to collect useful prompts and to work with files and URLs. The web interface alone is limiting.

I like gptresearcher and all of the glue put in place to extend prompts and agents etc. Not to mention the ability to fetch resources from the web and do research-type summaries on them.

All in all, it reminds me of the work of security researchers, pentesters, and analysts. Throughout their careers they build up a set of tools and scripts to solve various problems. LLMs kind of force devs to create/select tools for themselves to ease the burden of their specific line of work as well. You could work without LLMs, but maybe it will be a bit more difficult to stand out in the future.
ianpurton, 4 months ago

I've been coding professionally for 30 years.

I'm probably in the same place as the author: using ChatGPT to create functions etc., then cutting and pasting that into VSCode.

I've started using Cline, which allows me to code using prompts inside VSCode.

i.e. "Create a new page so that users can add tasks to a tasks table."

I'm getting mixed results, but it is very promising. I created a clinerules file, which gets added to the system prompt so the AI is more aware of my architecture. I'm also looking at overriding the Cline system prompt, both to make it fit my architecture better and to remove stuff I don't need.

I jokingly imagine that in the future we won't get asked how long a new feature will take, but rather how many tokens it will take.
btbuildem, 4 months ago

The search part really resonates with me. I do a lot of odd/unusual/one-off things for my side projects, and I use LLMs extensively to help me find a path forward. It's like an infinitely patient, all-knowing expert that pulls together info from any and all domains. Sometimes it will have answers that I am unable to find another way (e.g., what's the difference between the "busy s..." and "busy p..." AT command responses on the ESP8285?). It saves me hours of struggle, and I would not want to go back to the old ways.
polotics, 4 months ago

My main usage is in helping me approach domains and tools I don't know well enough to confidently get started with.

One thing that doesn't get a mention in the article but is quite significant, I think, is the long lag of knowledge-cutoff dates: looking at even the latest and greatest models, there is a year or more of missing information.

I would love for someone more versed than me to tell us how best to use RAG or LoRA to get a model to answer with fully up-to-date knowledge on libraries, frameworks, ...
ryanobjc, 4 months ago

I have been getting more value out of LLMs recently, and the great irony is that it is because of a few different packages in Emacs and the wonderful CLI LLM chat programming tool aider.

My workflow puts LLM chat at my fingertips, and I can control the context. Pretty much any text in Emacs can be sent to an LLM of your choice via API.

Aider is even better: it does a bunch of tricks to improve performance, and is rapidly becoming a must-have benchmark for LLM coding. It integrates with git, so each chat modification becomes a new git commit. Easy to undo changes, redo changes, etc. It also has a bunch of hacks because, while o1 is good at reasoning, it (apparently) doesn't do code modification well. Aider will send different types of requests to different 'strengths' of LLMs, etc. Although if you can use Sonnet, you can just use that and be done with it.

It's pretty good, but ultimately it's still just a tool for transforming words into code. It won't help you think or understand.

I feel bad for new kids who won't develop the muscle and sight strength to read/write code. Because you still need to read/write code, and can't rely on the chat interface for everything.
Ygg2, 4 months ago

> Search. If I have a question about a complex environment, say "how do I make a button transparent in CSS" I will get a far better answer asking any consumer-based LLM, than I do using an old fashioned web search engine.

I don't think this is about LLMs getting better, but about search becoming worse, in no small part thanks to LLMs polluting the results. Do an image search for some terms and count how many results are AI-generated.

I can say I got better results from the Google of X years ago vs the Google of today.
bambax, 4 months ago

> *There are three ways I use LLMs in my day-to-day programming: 1/ Autocomplete 2/ Search 3/ Chat-driven programming*

I do mostly 2/ Search, which is like a personalized Stack Overflow and sometimes feels incredible. You can ask a general question about a specific problem and then dive into some specific point to make sure you understand every part clearly. This works best for things one doesn't know enough about, but has a general idea of how the solution should sound or what it should do. Or, copy-pasting error messages from tools like Docker and having the LLM debug them for you really feels like magic.

For some reason I have always disliked autocomplete anywhere, so I don't do that.

The third way, chat-driven programming, is more difficult, because the code generated by LLMs can be large, and can also be wrong. LLMs are too eager to help, and they will try to find a solution even if there isn't one, and will invent it if necessary. Telling them in the prompt to say "I don't know" or "it's impossible" if need be can help.

But, like the author says, it's very helpful for getting started on something.

> *That is why I still use an LLM via a web browser, because I want a blank slate on which to craft a well-contained request*

That's also what I do. I wouldn't like having something in the IDE trying to second-guess what I write, or suddenly absorbing everything into context and coming up with answers that it thinks make a lot of sense but actually don't.

But the main benefit is, like the author says, that it lets one start afresh with every new question or problem, and save focused threads on specific topics.
averus, 4 months ago

I think the author is really on the right path with his vision for LLMs as a tool for software development. Last week I tried probably all of them on something like a code challenge.

I have to say that I am impressed with sketch.dev: it got me a working example on the first try, and it looked cleaner than all the others; similar, but somehow cleaner in terms of styling.

The whole time I was using those tools, I was thinking that I want exactly this: an LLM trained specifically on the official Go documentation, or whatever your favourite language is, ideally fine-tuned by the maintainers of the language.

I want the LLM to show me an idiomatic way to write an API using the standard library. I don't necessarily want it to do it instead of me, or to be trained on all the data they could scrape. Show me a couple of examples, maybe explain a concept, give me step-by-step guidance.

I also share his frustrations with the chat-based approach. What annoys me personally the most is the anthropomorphization of the LLMs; yesterday Gemini was even patronizing me...
agentultra, 4 months ago

It seems nice for small projects, but I wouldn't use it for anything serious that I want to maintain long-term.

I would write the tests first and foremost: they are the specification. They're for future me and other maintainers to understand, and I wouldn't want them to be generated: write them with the intention of explaining the module or system to another person. If the code isn't that important, I'll write unit tests. If I need better assurances, I'll write property tests at a minimum.

If I'm working on concurrent or parallel code, or I'm designing a distributed system, it's gotta be a model checker. I've verified enough code to know that even a brilliant human cannot find 1-in-a-million programming errors that surface in systems processing millions of transactions a minute. We're not wired that way. Fortunately, we have formal methods. Maths is an excellent language for specifying problems and managing complexity. Induction, category theory, all awesome stuff.

Most importantly though... you have to write the stuff and read it and interact with it to be able to keep it in your head. Programming is theory-building, as Naur said.

Personally, I just don't care to read a bunch of code and play "spot the error", a game that's rigged for me to be bad at. It's much more my speed to write code that obviously has no errors in it because I've thought the problem through. Although I struggle with this at times. The struggle is an important part of the process for acquiring new knowledge.

Though I do look forward to algorithms that can find proofs of trivial theorems for me. That would be nice to hand off... although simp does a lot of work like that already. ;)
fassssst, 4 months ago

They’re pretty great for printf debugging. Yesterday I was confounded by a bug, so I rapidly added a ton of logging that the LLM wrote instantly, then I had the LLM analyze the state difference between the repro and non-repro logs. It found something instantly that would have taken me a few hours to find, which led me to a fix.
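The log-diffing step described above can also be done mechanically. This sketch assumes a made-up `STATE key=value` log format (all names are illustrative); the idea is just to extract the last-seen value per key in each capture and report which keys differ between the repro and non-repro runs:

```python
# Compare the final logged state between two log captures, assuming lines
# of the hypothetical form "STATE key=value".
import re

def parse_state(log_text):
    """Collect the last value logged for each key."""
    state = {}
    for m in re.finditer(r"STATE (\w+)=(\S+)", log_text):
        state[m.group(1)] = m.group(2)
    return state

def state_diff(good_log, bad_log):
    """Return {key: (good_value, bad_value)} for every key that differs."""
    good, bad = parse_state(good_log), parse_state(bad_log)
    return {
        k: (good.get(k), bad.get(k))
        for k in good.keys() | bad.keys()
        if good.get(k) != bad.get(k)
    }
```

An LLM is handy both for generating the blanket `STATE ...` logging lines in the first place and for eyeballing a diff like this when the state is too large to scan by hand.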
ghostbit, 4 months ago

> you're going to have days of tense back-and-forth about whether the cost of the work is worth the benefit. An LLM will do it in 60 seconds and not make you fight to get it done. Take advantage of the fact that redoing work is extremely cheap.

The fast iteration cycle of getting a baseline (even a less-than-ideal or completely wrong one) is a great point here. Redoing the work is fast and easy, but it still requires review and validation to know how to request the rework and obtain the optimal result.
aerhardt4 months ago
His experience mirrors mine. I&#x27;m happy he explicitly mentions search, given that people have been shouting &quot;this is not meant for search&quot; for a couple of years now. Of course it helps with search. I also love the tech for producing first drafts, and it greatly lowers the energy and cognitive load when attacking new tasks, as others have noted in this thread.<p>At the same time, while the author says this is the second most impressive technology he&#x27;s seen in his lifetime, it&#x27;s still a far cry from the bombastic claims being made by the titans of industry about its potential. It&#x27;s not uncommon to see claims here on HN of 10x improvements in productivity, or of teams of dozens of people being axed, but nothing in the article or in my experience lines up with that.
yawnxyz4 months ago
&gt; I could not go a week without getting frustrated by how much mundane typing I had to do before having a FIM model<p>For those not in-the-know, I just learned today that code autocomplete is actually called &quot;Fill-in-the-Middle&quot; tasks
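Concretely, a FIM model is prompted with the code before and after the cursor, wrapped in sentinel tokens, and asked to generate the missing middle. The exact sentinels vary by model (Code Llama uses `<PRE>`/`<SUF>`/`<MID>`; the StarCoder-style layout is sketched below):

```python
def fim_prompt(prefix: str, suffix: str) -> str:
    # Assemble a fill-in-the-middle prompt; the model generates text
    # after <fim_middle> until it emits a stop token.
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = fim_prompt(
    "def add(a, b):\n    return ",   # code before the cursor
    "\n\nprint(add(2, 3))",          # code after the cursor
)
# The completion fills the gap — here, presumably "a + b".
```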
jmull4 months ago
LLM auto-complete is good — it suggests more of what I was going to type, and correctly (or close enough) often enough that it’s useful. Especially in the boilerplate-y languages&#x2F;code I have to use for $dayjob.<p>Search has been neutral. For finding little facts it’s been about the same as regular search. When digging in, I want comprehensive, dense, reasonably well-written reference documentation. That’s not exactly wide-spread, but LLMs don’t provide this either.<p>Chat-driven generates too much buggy&#x2F;incomplete code to be useful, and the chat interface is seriously clunky.
e12e4 months ago
Interesting. I wonder what the equivalent of sketch.dev would look like if it targeted Smalltalk and was embedded in a Smalltalk image (preferably with a local LLM running in smalltalk)?<p>I&#x27;d love to be able to tell my (hypothetical smalltalk) tablet to create an app for me, and work interactively, interacting with the app as it gets built...<p>Ed: I suppose I should just try and see where cloud ai can take smalltalk today:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;rsbohn&#x2F;Cuis-Smalltalk-Dexter-LLM">https:&#x2F;&#x2F;github.com&#x2F;rsbohn&#x2F;Cuis-Smalltalk-Dexter-LLM</a>
stevage4 months ago
This is a great article with lots of useful insights.<p>But I&#x27;m completely unconvinced by the final claim that LLM interfaces should be separate from IDE&#x27;s, and should be their own websites. No thanks.
9999000009994 months ago
I still find most LLMs to be extremely poor programmers.<p>Claude will often generate tons and tons of useless code, quickly using up its limit. I often find myself yelling at it to stop.<p>I was just working with it last night.<p>&quot;Hi Claude, can you add tabs here?&quot;: &lt;div&gt;<p>&lt;MainContent&#x2F;&gt;<p>&lt;&#x2F;div&gt;<p>Claude will then start generating MainContent.<p>DeepSeek, despite being free, does a much better job than Claude. I don&#x27;t know if it&#x27;s smarter, but whatever internal logic it has is much more to the point.<p>Claude also has a very weird bias towards a handful of UI libraries, even when those wouldn&#x27;t be good for your project. I wasted hours on shadcn&#x2F;ui, which requires a very particular setup to work.<p>LLMs are generally great at common tasks in a top-5 (by popularity) language.<p>Ask one to do something in a Haxe UI library and it&#x27;ll make up functions that *look* correct.<p>Overall I like them; they definitely speed things up. I don&#x27;t think most experienced software engineers have much to worry about for now. But I am really worried about juniors. Why hire a junior engineer when you can just tell your seniors they need to use Copilot to crank out more code?
denvermullets4 months ago
This is almost exactly how I&#x27;ve been using LLMs. I don&#x27;t like the code completion in the IDE, personally, and prefer all LLM usage to target narrow, specific blocks of code. It helps as I bounce between a lot of side projects, projects at work, and freelance projects. Not to mention that with all the context switching, it really helps keep things moving, imo.
justinl334 months ago
I&#x27;ve maintained several SDKs, and the &#x27;cover everything&#x27; approach leads to nightmare dependency trees and documentation bloat. imo, the LLM paradigm shifts this even further - why maintain a massive SDK when users can generate precisely what they need? This could fundamentally change how we think about API distribution.
jimmydoe4 months ago
Does anyone have a good recommendation for a local LLM for autocompletion?<p>Most editors I use support online LLMs, but they&#x27;re sometimes too slow for me.
dboreham4 months ago
Interesting that he had the same thought initially as I did (after running a model myself on my own hardware) : this is like the first time I ran a traceroute across the planet.
lysecret4 months ago
Funny, he starts off by dismissing an AI IDE, only to end by building an AI IDE :D (Smells a little bit like not-invented-here syndrome.) Otherwise, fascinating article!
theptip4 months ago
This lines up well with my experience. I’ve tried coming at things from the IDE and chat side, and I think we need to merge tooling more to find the sweet spot. Claude is amazing at building small SPAs, and then you hit the context window cutoff and can’t do anything except copy your file out. I suspect IDEs will figure this out before Claude&#x2F;ChatGPT learn to be good enough at the things folks need from IDEs. But long-term, i suppose you don’t want to have to drop down to code at all and so the constraints of chat might force the exploration of the new paradigm more aggressively.<p>Hot take of the day, I think making tests and refactors easier is going to be revolutionary for code quality.
jordanmorgan104 months ago
The more experienced the engineer, the less CSS is on the page. This seems to be a universal truth, and I want to learn from these people. But my goodness, could we at least use margins to center the content?
assimpleaspossi4 months ago
Since all these AI products just assemble things they pull from elsewhere, I&#x27;m wondering whether there could eventually be legal issues involving software products built with them.
EGreg4 months ago
Can’t we just use test-driven development with AI Agents?<p>1) Idea<p>2) Tests<p>3) Code until all tests pass
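The three-step loop proposed above can be sketched with the model call and the test runner injected as functions — both are stand-ins here, not any real agent API:

```python
def tdd_loop(generate, run_tests, max_iters=5):
    """Idea -> tests -> loop: generate code, run the suite, feed failures back."""
    feedback = ""
    for _ in range(max_iters):
        code = generate(feedback)      # e.g. an LLM call given the spec + failures
        ok, report = run_tests(code)   # e.g. run pytest against the candidate
        if ok:
            return code                # all tests pass: done
        feedback = report              # failure output steers the next attempt
    return None                        # gave up; a human takes over

# Toy run with fakes standing in for the model and the test runner.
attempts = iter(["return a - b", "return a + b"])
result = tdd_loop(
    lambda feedback: next(attempts),
    lambda code: (code == "return a + b", "FAILED: add(2, 3) != 5"),
)
print(result)  # → return a + b  (the second attempt passes)
```

The catch, of course, is step 2: if the tests underspecify the problem, the agent will happily converge on code that passes them and still does the wrong thing.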
_boffin_4 months ago
Does anyone know of any good chat-based UI builders? (No, not a tool for building a chat app.)<p>Does Webflow have something?<p>My problem is being able to describe what I want in the style I want.
User234 months ago
LLMs are, at their core, search tools. Training is indexing and prompting is querying that index. The granularity being at the n-gram rather than the document level is a huge deal, though.<p>Properly using them requires understanding that. And just as not every query finds what we want, neither will every prompt. Iterative refinement is virtually required for nontrivial cases. Automating that process, as e.g. Cursor&#x27;s agent mode does, is very promising.