TechEcho

Ask HN: How can I learn about performance optimization?

335 points by fvrghl about 1 year ago
What are some good resources for learning about performance optimization? This is an area that is new to me, but a big part of my new job.

69 comments

Agentlien about 1 year ago
For several years I have worked primarily with performance optimizations in the context of video games (and previously in the context of surgical simulation). This differs subtly from optimization in certain other areas, so I figured I'd add my own perspective to this already excellent comment section.

1. First and foremost: measure early, measure often. It's been said so often and it still needs repeating. In fact, the more you know about performance the easier it can be to fall into the trap of not measuring enough. Measuring will show exactly where you need to focus your efforts. It will also tell you without question whether your work has actually led to an improvement, and to what degree.

2. The easiest way to make things go faster is to do less work. Use a more efficient algorithm, refactor code to eliminate unnecessary operations, move repeated work outside of loops. There are many flavours, but very often the biggest performance boosts are gained by simply solving the same problem through fewer instructions.

3. Understand the performance characteristics of your system. Is your application CPU bound, GPU compute bound, memory bound? If you don't know this you could make the code ten times as fast without gaining a single ms because the system is still stuck waiting for a memory transfer. On the flip side, if you know your system is busy waiting for memory, perhaps you can move computations to this spot to leverage this free work? This is particularly important in shader optimizations (latency hiding).

4. Solve a different problem! You can very often optimize your program by redefining your problem. Perhaps you are using the optimal algorithm for the problem as defined. But what does the end user really need? Often there are very similar but much easier problems which are equivalent for all practical purposes. Sometimes this is because the complexity lies in special cases which can be avoided, or because there's a cheap approximation which gives sufficient accuracy. This happens especially often in graphics programming, where the end goal is often to give an *impression* that you've calculated something.
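
Points 1 and 2 above can be illustrated with a small sketch. The thread names no language, so Python is an assumption here, and `normalize_slow` and `normalize_fast` are hypothetical names made up for the example. It moves a loop-invariant computation out of the loop, checks that the result is unchanged, and only then measures both versions with the standard `timeit` module.

```python
import math
import timeit


def normalize_slow(values):
    # Recomputes the same norm for every element: O(n^2) work overall.
    return [v / math.sqrt(sum(x * x for x in values)) for v in values]


def normalize_fast(values):
    # Same result, but the loop-invariant norm is computed once up front.
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]


if __name__ == "__main__":
    data = [float(i) for i in range(1, 501)]
    # Confirm the two versions agree before comparing their speed.
    assert normalize_slow(data) == normalize_fast(data)
    for fn in (normalize_slow, normalize_fast):
        seconds = timeit.timeit(lambda: fn(data), number=10)
        print(f"{fn.__name__}: {seconds:.4f} s for 10 runs")
```

The absolute numbers depend on the machine, which is exactly why the comment insists on measuring rather than guessing.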
pca006132 about 1 year ago
Performance optimization covers a lot of topics; it depends on what you are trying to optimize.

1. Latency vs throughput. Oftentimes they are the same, i.e. reduce the time it takes to do something. However, once you pass a certain threshold, techniques that optimize throughput will hurt latency, so it is important to know what you are looking for. There are also low level details if you have rather extreme latency requirements, e.g. pinning the cores, kernel settings etc.

2. Knowledge about the overall system *and your input distribution*. While this seems trivial, oftentimes you can get large performance improvements by avoiding redundant work, either by caching or lazy evaluation. Some computations exist only because they *may* be needed later, and these can be avoided by lazy evaluation.

3. Better algorithms. Again, this seems trivial but oftentimes people are using algorithms that are far from optimal. And even if the algorithm is asymptotically optimal, there may be algorithms that are faster for special cases or faster in practice. Optimizing special cases may be rewarding if they occur frequently. Do you really need optimal solutions? Can you allow randomization? Can you do optimization on the queries to make it faster *overall* without optimizing individual operations?

4. Parallelization. Can you do parallelization? Are your problem instances large enough, or individual stages slow enough, to benefit from parallelization? Do you have computations that are trivially parallelizable and can benefit from offloading to the GPU? If your code is waiting on some events, can you make them async? Can you avoid locks or atomic operations in your parallel code?

5. Data structure optimization. Can you reduce the number of allocations needed? Can you make the data structure more linear and predictable so the CPU can have better cache utilization? Can you compress certain data if they are sparse?

6. Low level CPU/GPU optimizations. There are a lot of great resources out there, but only do it when you are very sure it will be worth it, i.e. when that code is the bottleneck in your system.
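
Point 2 above (caching and lazy evaluation) could look roughly like this in Python. This is only a sketch: `shipping_cost`, `Order`, the pricing rule, and the `calls` counter are all invented for illustration, while `functools.lru_cache` and `functools.cached_property` are standard-library decorators (the latter needs Python 3.8 or newer).

```python
from functools import cached_property, lru_cache

# Counters that make the avoided work visible; illustration only.
calls = {"price": 0, "report": 0}


@lru_cache(maxsize=None)
def shipping_cost(region: str, weight_kg: int) -> float:
    # Stand-in for an expensive lookup (hypothetical pricing rule).
    calls["price"] += 1
    base = {"eu": 5.0, "us": 7.0}[region]
    return base + 0.5 * weight_kg


class Order:
    def __init__(self, items):
        self.items = items

    @cached_property
    def report(self) -> str:
        # Lazy: built only if someone reads it, and then only once.
        calls["report"] += 1
        return ", ".join(sorted(self.items))


if __name__ == "__main__":
    for _ in range(3):
        shipping_cost("eu", 2)      # computed once, then served from cache
    order = Order(["pear", "apple"])
    print(calls)                    # report has not been built yet
    print(order.report, calls)      # built on first access
```

Whether a cache pays off depends on the input distribution the comment mentions: it only helps when the same arguments actually recur.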
hliyan about 1 year ago
Former HFT dev here. Know fundamentals: sources of performance issues = things that eat/waste CPU cycles, things that reach too far down the memory hierarchy. Usually the latter. E.g. L2 cache to RAM - order of magnitude slower; RAM to disk: 4+ orders of magnitude slower.

Things that eat CPU: iterations, string operations. Things that waste CPU: lock contention in multi-threaded environments, wait states.

You can usually build a lot of the understanding from first principles starting there. Back in the day we had to do this because there wasn't much by way of readily available literature on the subject. Actual techniques will depend or evolve based on your choice of platform or version.

E.g. 20 years ago, we used to create object pools in C++ at load time to avoid Unix heap locks at runtime. This may no longer be necessary. 15(ish?) years ago, JNI was used when the JVM wasn't fast enough for certain stuff. This is no longer necessary. 10 years ago, immutable JS objects were thought to be faster because the JS runtimes at the time were slower to mutate existing objects than to create new ones. This too, may no longer be true (I haven't checked recently). Until very recently, re-rendering with virtual DOM diffing was considered more performant than direct, incremental DOM manipulation. This too, may no longer be true.
levodelellis about 1 year ago
Casey Muratori did many lectures. Check out the playlists on his YouTube page. Start with the one titled software quality: https://www.youtube.com/@MollyRocket/playlists

I heard good things about his course: https://www.computerenhance.com/

Agner Fog's manuals are good too: https://www.agner.org/optimize/#manuals

A site to look up instruction timings is https://uops.info/table.html
devheart about 1 year ago
https://www.computerenhance.com/
moggi about 1 year ago
If you want to learn how to understand the performance of the whole system I can recommend Brendan Gregg's Systems Performance: Enterprise and the Cloud (https://www.brendangregg.com/blog/2020-07-15/systems-performance-2nd-edition.html). It is a good book that teaches a lot of basics and techniques and gives a good understanding of the impact different system components can have on performance.
anymouse123456 about 1 year ago
Please try to remember that one of the most abused quotes in Software Engineering is the old Knuth chestnut about "Premature optimization is the root of all evil."

The full quote is as follows, "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

He specifically does not recommend writing code that is obviously inefficient. He is clearly referring to engineers optimizing routines by introducing new, additional complexity. He is not recommending that anyone write obviously, ruinously slow, bloated code.

Writing software is an art and a science.

Optimization is no different. One frequently missing part of our process is to keep a watchful eye on features that are obviously toxic to performance during development (e.g., multiply-nested for loops, many large external dependencies, introducing and frequently iterating over huge, bloated structs, etc.).

As everyone says, measure early and often, but also please don't just shout "LEEEEEEEROY JENKINS!" as you throw fireballs of slow, bloated code into the world.
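
As one concrete reading of the "multiply-nested for loops" warning above, here is a hedged Python sketch. The order/customer data and both function names are hypothetical. The two functions return the same pairs, but the first rescans the customer list for every order, while the second builds a dictionary once and is no harder to read.

```python
def match_orders_nested(orders, customers):
    # O(len(orders) * len(customers)): rescans every customer per order.
    result = []
    for order_id, customer_id in orders:
        for cid, name in customers:
            if cid == customer_id:
                result.append((order_id, name))
                break
    return result


def match_orders_indexed(orders, customers):
    # O(len(orders) + len(customers)): build the lookup table once.
    # setdefault keeps the first name per id, matching the `break` above.
    name_by_id = {}
    for cid, name in customers:
        name_by_id.setdefault(cid, name)
    return [(oid, name_by_id[cid]) for oid, cid in orders if cid in name_by_id]


if __name__ == "__main__":
    customers = [(f"c{i}", f"name{i}") for i in range(2000)]
    orders = [(i, f"c{i % 2000}") for i in range(2000)]
    assert match_orders_nested(orders, customers) == match_orders_indexed(orders, customers)
    print("both versions agree on", len(orders), "orders")
```

This is the kind of choice that costs nothing in complexity at writing time, which is the distinction the full Knuth quote draws.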
mvelbaum about 1 year ago
Denis Bakhvalov has some great resources for this:

1. His free course: https://products.easyperf.net/perf-ninja

2. His free book: https://book.easyperf.net/perf_book (the 2nd edition is being worked on right now and there's a draft on GitHub: https://github.com/dendibakh/perf-book)
reacharavindh about 1 year ago
Not a comprehensive set of resources as you asked, but I want to share one line of thought that had a profound impact on my way of working.

Think of a system as a chain of bottlenecks, visualized as a set of pipes. If you can measure the metric you care about (throughput, latency etc) at a component level, and put together the system’s control flow, you can spot where the bottleneck is. Optimize that component, and you will reveal the next bottleneck, now optimize that… and it goes on. To limit the fun of this exercise, it helps to do a back of the envelope calculation of what is a realistic estimate of the thing you measure in the system. Example - I want this service to do 100 emails/sec. Now, piece by piece, remove bottlenecks to get close to that value.
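
The "chain of pipes" idea above reduces to simple arithmetic when the stages run in series: the system cannot go faster than its slowest stage. A minimal Python sketch follows; the stage names and the measured rates are made-up assumptions, and only the 100 emails/sec target comes from the comment.

```python
def find_bottleneck(stage_rates):
    """Return (name, rate) of the slowest stage of a serial pipeline.

    The pipeline as a whole cannot run faster than this stage.
    """
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


if __name__ == "__main__":
    # Hypothetical per-stage measurements, in emails per second.
    stages = {"render_template": 400.0, "spam_check": 60.0, "smtp_send": 150.0}
    target = 100.0  # back-of-the-envelope goal from the comment

    name, rate = find_bottleneck(stages)
    print(f"bottleneck: {name} at {rate}/s (target {target}/s)")

    # Pretend spam_check was optimized; the next bottleneck shows up.
    stages["spam_check"] = 220.0
    name, rate = find_bottleneck(stages)
    print(f"next bottleneck: {name} at {rate}/s (target {target}/s)")
```

In this invented example the second bottleneck already exceeds the target, which is the signal to stop optimizing.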
SleepyMyroslav about 1 year ago
If you need to organize your thoughts on what measurements are and how at least some profiling tools work you can pick up a book or two. I would recommend for example [1]. It is a bit heavy on the C++ side but you can complement it with something relevant to your job's language.

If you want one bit of advice on optimization, I can try one: follow your app architecture closely. This is where data structures that hold all of the important data live and this is what limits what is possible to achieve on performance. A lot of learning is narrowly focused on specific micro optimization techniques, leaving the big picture as an exercise.

[1] Fedor Pikus, The Art of Writing Efficient Programs
keskadale about 1 year ago
https://en.algorithmica.org/hpc/

This is a good book. It covers most common concepts and techniques in a fairly accessible way. At the end it also builds up highly optimized versions of some algorithms and data structures and explains every optimization.
ohyes about 1 year ago
Performance optimization is very simple. Make computer do less to get same or similar result. The most performant application does very little and still gets you the result you need.

To do this you must find ways to “cheat”. This can be of various forms. Better algorithms, better data structures, precomputation, caching. At some point you will exhaust low hanging fruit and need to dig into lower level aspects of the code or its compilation.

Anyway, best way to learn is to do it, go depth first and always check your work thoroughly. (It is easy to optimize yourself into a solution that is not working properly).
tanelpoder about 1 year ago
Understand first, then fix. And you understand by measuring the right thing at the right time (scope). Systemwide resource utilization averages are not gonna tell you where your critical thread or database connection is spending its time - you need to measure (profile) precisely where your task of interest is spending its time.

I've learned a lot from Cary Millsap over the last 2 decades and he recently published a general performance optimization book "How to Make Things Faster" that I can recommend [1]. It's less about tools, more about the method and systematic approach for performance optimization:

[1] https://method-r.com/books/faster/
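
To make "profile precisely where your task of interest is spending its time" concrete, here is one possible Python sketch using the standard `cProfile` and `pstats` modules. The functions `parse`, `total`, and `handle_request` are hypothetical stand-ins for a real task; the point is that the profiler wraps only that one call rather than the whole process.

```python
import cProfile
import io
import pstats


def parse(lines):
    return [line.split(",") for line in lines]


def total(rows):
    return sum(int(row[1]) for row in rows)


def handle_request(lines):
    # The "task of interest": profile this call, not the whole program.
    return total(parse(lines))


def profile_call(fn, *args):
    """Run fn(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()


if __name__ == "__main__":
    lines = [f"key{i},{i}" for i in range(50_000)]
    result, report = profile_call(handle_request, lines)
    print(result)
    print(report)
```

A deterministic profiler like this adds overhead of its own, so its output is better read as a map of where time goes than as exact timings.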
joshxyz about 1 year ago
Work with Steve Jobs, or someone like him.

> One of the best, if possibly exaggerated, examples of the reality distortion field comes from Jobs's biographer Isaacson. During development of the Macintosh computer in 1984, Jobs asked Larry Kenyon, an engineer, to reduce the Mac boot time by 10 seconds. When Kenyon replied that it was not possible to reduce the time, Jobs asked him, "If it would save a person's life, could you find a way to shave 10 seconds off the boot time?" Kenyon said that he could. Jobs went to a white board and pointed out that if 5 million people wasted an additional 10 seconds booting the computer, the sum time of all users would be equivalent to 100 human lifetimes every year. A few weeks later Kenyon returned with rewritten code that booted 28 seconds faster than before.

https://en.m.wikipedia.org/wiki/Reality_distortion_field
slashroot about 1 year ago
Google publishes some of its data center optimization lessons and tips at http://abseil.io/fast. This includes topics like higher-level methodology and goal setting, which are often covered less by other resources.

Full disclosure: I'm the editor in chief for the series.
boberoni about 1 year ago
> This is an area that is new to me, but a big part of my new job.

Can you tell me more about what your new job is, without releasing anything sensitive?

If you are running applications on Linux containers in the cloud, then I would recommend Brendan Gregg's blog and books (https://www.brendangregg.com/overview.html). He does a lot of knowledge sharing from his experiences at Netflix.
hsaliak about 1 year ago
What you really want to learn about is observability, benchmarking and instrumentation. Once you are an expert in these topics for your domain, optimization will be about making obvious choices within localized constraints.
mtzet about 1 year ago
Most software in the industry is slow because it's doing a lot of stuff that it shouldn't. Oftentimes an additional "optimization" layer adds caching, but makes getting to the root of the issue harder. The biggest win is primarily getting rid of things you don't need and secondarily operating on things in batch.

My playbook for optimizing in the real world is something like this:

1. Understand what you're actually trying to compute end-to-end. The bigger the chunk you're trying to optimize, the greater the potential for performance.

2. Sketch out what an optimal process would look like. What data do you need to fetch, what computation do you need to do on this, how often does this need to happen. Don't try to be clever and micro-optimize or cache computations. Just focus on only doing the things you need to do in a simple way. Use arrays a lot.

3. Understand what the current code is actually doing. How close to the sketch above are you? Are you doing a lot of I/O in the middle of the computation? Do you keep coming back to the same data?

If you want to understand the limits of how fast computers are, and what optimal performance looks like, I'd recommend two talks that come with a very different perspective from what you usually hear:

1. Mike Acton's talk at CppCon 2014: https://www.youtube.com/watch?v=rX0ItVEVjHc

2. Casey Muratori's talk about optimizing a grass planting algorithm: https://www.youtube.com/watch?v=Ge3aKEmZcqY
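
The advice to operate on things in batch, and to avoid I/O in the middle of the computation, might be sketched like this in Python. `FakeStore` is an invented stand-in for a database or remote service; it only counts round trips so the difference is visible without any real I/O.

```python
class FakeStore:
    """Stand-in for a database or remote service; counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.rows[key]

    def get_many(self, keys):
        self.round_trips += 1
        return {key: self.rows[key] for key in keys}


def total_one_by_one(store, keys):
    # I/O interleaved with the computation: one round trip per key.
    return sum(store.get(key) for key in keys)


def total_batched(store, keys):
    # Fetch everything needed up front, then compute on plain data.
    values = store.get_many(keys)
    return sum(values[key] for key in keys)


if __name__ == "__main__":
    rows = {i: i * 2 for i in range(1000)}
    keys = list(range(1000))
    slow, fast = FakeStore(rows), FakeStore(rows)
    assert total_one_by_one(slow, keys) == total_batched(fast, keys)
    print("round trips:", slow.round_trips, "vs", fast.round_trips)
```

With a real network or disk behind `get`, each round trip carries fixed latency, so the per-key version pays that cost a thousand times here.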
edderly about 1 year ago
If you're new to this area, I would first start by understanding which profiling tools you can use depending on the OS, languages and systems involved.

Even if your system is not C++, I've always enjoyed this talk and the subsequent discussion which tackles some of the problems associated with some programming practices and the impact on performance.

CppCon 2014: Mike Acton 'Data-Oriented Design and C++' https://youtu.be/rX0ItVEVjHc
lokar about 1 year ago
Check out:

https://www.oreilly.com/library/view/understanding-software-dynamics/9780137589692/
arpafaucon about 1 year ago
I really liked listening to and working on the projects of the open MIT course about performance: https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/pages/syllabus/
mikhael28 about 1 year ago
Write a piece of software in a week - a full app, with discrete functionality that would challenge you to deliver on time. Do it, and burn through it.

Then optimize it - measure front end render performance/compilation times/code perf, and then do the same on the backend. Write a blog post about it.

No substitute for experience.
lenkite about 1 year ago
The book Understanding Software Dynamics by Richard Sites is all about performance optimization.

https://www.amazon.in/Understanding-Software-Addison-Wesley-Professional-Computing/dp/0137589735
whiterknight about 1 year ago
You’re unlikely to find a good answer because it’s a very specialized skill that is mostly from experience doing it, and HN tends to self select out of that pursuit.
TammyEverts about 1 year ago
I've been working in the performance industry (focusing on front-end performance and UX) for 15 years. Some resources you might find helpful:

• Steve Souders' books 'High Performance Web Sites' & 'Even Faster Web Sites' – https://stevesouders.com/ – Steve literally wrote the book(s) on performance and is widely considered the godfather of the industry.

• WPO stats – https://wpostats.com/ – Case studies and experiments demonstrating the impact of performance optimization on UX and business metrics.

• performance.now() talks – https://www.youtube.com/c/WebConferencesAmsterdam/videos – PerfNow is the only annual global conference dedicated to performance. You can watch past talks on their YouTube channel.

• Intro to Web Performance – https://support.speedcurve.com/docs/psychology-of-web-performance – This is a collection of introductory articles covering topics from the psychology of site speed to Core Web Vitals to how to create a performance culture in your organization.
newprint about 1 year ago
There is an MIT course on YouTube. Also, there is a pretty famous former M$ performance engineer who worked on Xbox and a bunch of other large projects; he has a webpage about how he tracks down bugs and performance issues, but I don't have it handy unfortunately. Another thing to look at - low level optimization. There is a cool book, two volumes written by a German guy - I don't have a link for it either. Maybe someone who has those links can post them here. EDIT: https://www.agner.org/optimize/
joshspankit about 1 year ago
Suggestion: Program some (slow) microcontrollers as a hobby.

Go multi-core because async is an important optimization skillset, but other than that just build some things.

I live and breathe optimizations (it feels almost as satisfying to me as driving fast) and as an example recently I created an 11-board (one for each channel) wifi-presence-detection system in a busy wifi area and there was literally no way it was going to work without optimization. From communication protocol to having to be strict about every byte of memory, it’s working with the first principles that built the entire industry.
amadio about 1 year ago
For me, one of the biggest leaps in how I think about performance was when I learned about the Top-Down Micro-Architecture Analysis Method, by Ahmad Yasin from Intel. You can learn the main ideas from himself in the video below:

https://youtu.be/kjufVhyuV_A

The idea to classify cycles into front-end bound, backend bound, bad speculation or memory bound is brilliant. Once you know which one your program suffers from, it's easy to know what can be done to improve things.
austin-cheney about 1 year ago
Measure everything and be extremely critical. Be ready to challenge common, popularly held assumptions.

Here is something I wrote about extreme performance in JavaScript that is discarded by most programmers because most people that program JavaScript professionally cannot really program.

https://github.com/prettydiff/wisdom/blob/master/performance_frontend.md
globular-toast about 1 year ago
One thing to keep in mind is there are three layers of optimisation:

1. The problem
2. The algorithms
3. Micro-optimisation

The potential gains shrink rapidly as you descend this list. A lot of people start thinking at level 3 straight away, but this is pointless if you've left performance on the table at the higher levels. For example, no amount of clever bit twiddling will compensate for the wrong algorithm, and even the best algorithm is pointless if you're solving the wrong problem.
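
A small Python sketch of the gap between levels 2 and 3, under the assumption that the data is already sorted. Both function names are made up for the example; `bisect.bisect_left` is the standard-library binary search. The linear version already includes a typical level 3 tweak (an early exit) and still does O(n) work, while the second changes the algorithm to O(log n).

```python
import timeit
from bisect import bisect_left


def contains_linear(sorted_values, target):
    # Level 3 tweaks such as this early exit cannot change the O(n) cost.
    for value in sorted_values:
        if value == target:
            return True
        if value > target:
            return False
    return False


def contains_bisect(sorted_values, target):
    # Level 2: a different algorithm, O(log n) on the same sorted input.
    i = bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target


if __name__ == "__main__":
    values = list(range(0, 200_000, 2))
    target = values[-1]  # worst case for the linear scan
    assert contains_linear(values, target) == contains_bisect(values, target)
    for fn in (contains_linear, contains_bisect):
        seconds = timeit.timeit(lambda: fn(values, target), number=50)
        print(f"{fn.__name__}: {seconds:.5f} s for 50 lookups")
```

Level 1 sits above both: if the caller only ever needs membership tests, the cheaper "problem" may be keeping a set in the first place.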
kranner about 1 year ago
Michael Abrash's stuff is still worth reading:

https://news.ycombinator.com/item?id=20883860
DeathArrow about 1 year ago
I enjoyed talks about optimization by Casey Muratori. Some examples:

https://m.youtube.com/watch?v=Ge3aKEmZcqY

https://m.youtube.com/watch?v=ffDXc6oup3Q

https://m.youtube.com/watch?v=pgoetgxecw8
saagarjha about 1 year ago
What kind of work are you doing? There are some shared ideas (measure, do less work, etc.) but the best advice would probably be tailored to what you’re working on.
maniatico about 1 year ago
I think the Optimization course by Prof. Jacco is a nice start: https://web.archive.org/web/20230924064410/https://www.cs.uu.nl/docs/vakken/mov/ (sadly the website seems to be currently down). Basically you need a mental framework to approach optimizing software (else you might just be spinning around wasting time). I recommend reading at least lecture 1 and looking at the references on the website.

As for specific optimizations, it requires context about what software you are trying to optimize and under what circumstances. A lot of the time you are going to see that the answer to asking if certain optimizations are worth the effort is going to be 'it depends'.
easyas124 about 1 year ago
Understand how the software works, and how computers work in general. You have to understand the system before you can a) understand how it's slow, and b) how to make it faster. If you can tell us what, specifically, you need to optimize, we can recommend more specific techniques.

Or get a job you can handle idk.
chainingsolid about 1 year ago
Here are 2 more links I didn't see already posted that are worth watching/reading. They should give a good intro within 3 hours combined (for CPU performance anyway).

A good talk, doesn't go deep and instead goes a bit wide: https://www.youtube.com/watch?v=6RlloT_6WxA

This one explains the black magic the CPU makers have been doing. If you're going to be optimizing code you should know your hardware: https://www.lighterra.com/papers/modernmicroprocessors/

Additional note: I've noticed C++ conventions have a habit of having performance related talks, YT is your friend.
txutxu about 1 year ago
As you mention "new job", and everyone is talking to you about computers... I will mention the other side:

1) Try to understand well the architecture of your company (who is who, who decides what changes are made to computers, how they decide that, what metrics they use, what tests and benchmarks are passed before changes, etc).

2) Try to understand your place in such architecture. Am I responsible for the overall performance, or only the performance of certain components? Am I responsible for the latency of the network or the latency of the database, or both, etc. Make a clear scope. This will help you to focus on which metrics you need to follow.

3) Try to understand the company procedures. Can I refuse a change that comes from the product or marketing team? Can I refuse a change that comes from developers? Can I refuse a change that comes from the platform team? How much time do I have to analyze such changes before they reach production, how can I request the rollback of an unsupervised change, where can I check the performance impact of each change made on production in the past? etc.

4) Try to understand what the CEO and CTO, your team and the rest of the teams should expect from you. Are there any SLA or SLO for your position related to the overall performance?

5) Make clear how you are informed of ongoing changes and roadmaps. Should I spend all week on the performance of a component that is going to be deprecated in the next sprint? etc.

6) Bring your doubts and questions to your teammates, or department head. They may teach you about the company workflows, past issues, past solutions, corner cases, blockers, resources, plans and guidelines.

In short... look for reading/watching material, but don't forget to look at your company too. You will find things to learn there too, and maybe things that need to change if they are important enough, or that need to be clarified as not important enough to change, so you can defend your work on future performance issues related to the things that weren't changed.
andai about 1 year ago
The most interesting thing I've learned in this regard (from Casey Muratori) is non-pessimization. Non-pessimization means don't make the computer do unnecessary work. Just write the simplest code that does the thing. Unfortunately almost no software is written like that.
atoav about 1 year ago
To be honest I think the best way to learn about it is to develop for resource constrained environments. E.g. when you use 99% of your embedded MCU's code memory and another static string for a label shown on screen stops your code from compiling, you *will* optimize code.
marcosdumay about 1 year ago
Well, start with literature on the specific domain of your new job. That way, you can learn what "performance" even means in your area, what the common problems are, what to measure, and what kind of knowledge you need.
firecall about 1 year ago
Out of curiosity, what are you optimising exactly?
vram22 about 1 year ago
The book "Writing Efficient Programs", by Jon Bentley, is still a valuable resource. See the short subthread starting with a comment I posted here some years ago:

https://news.ycombinator.com/item?id=13407192

A few people had replied, agreeing with my opinion, and giving some more details. One of them called the book "gold".

I have also posted about the book a few other times on HN, over the years.

Those comments can be found by searching hn.algolia.com for comments (not stories) matching the pattern "writing efficient programs vram22".
dgski about 1 year ago
Plugging the book that I just published: https://a.co/d/iTjaQzP

It's a beginner-friendly introduction to Low Latency Programming, which involves a lot of performance optimization. Could be a good way to start your learning on the subject.

You can read one of the chapters on my blog: https://tech.davidgorski.ca/introduction-to-low-latency-programming-minimize-branching-and-jumping/
hesdeadjim about 1 year ago
Huge topic, what are you trying to optimize? What language(s), hardware, etc.

Optimizing games sends you deep down a fun rabbit hole, but that will be very different than trying to optimize a Go backend server.
dboreham about 1 year ago
The Nike doctrine works: just do it. Besides that, I recommend always asking yourself the question: we told the computer to do X, and it took too long, so what was it doing for that time? The rough answer can come from surprisingly simple sources such as &quot;top&quot;. Fancy, intrusive tools such as traditional profilers are often not the best first place to look for answers. If X is some short one-off thing that makes it hard to see what&#x27;s happening, make it do 1M of X so bulk data can be observed.
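The "make it do 1M of X" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `time_repeated` and `parse_small_number` are hypothetical names, and the parsed string is an arbitrary stand-in for whatever short operation you actually care about.

```python
import time


def time_repeated(operation, repetitions=1_000_000):
    """Run `operation` many times; return (total_seconds, seconds_per_call)."""
    start = time.perf_counter()
    for _ in range(repetitions):
        operation()
    total = time.perf_counter() - start
    return total, total / repetitions


def parse_small_number():
    # Hypothetical stand-in for "X": a short one-off operation that is
    # too quick to observe when it runs only once.
    return int("12345")


if __name__ == "__main__":
    total, per_call = time_repeated(parse_small_number)
    print(f"total: {total:.3f} s, per call: {per_call * 1e9:.0f} ns")
```

While the loop runs, there is enough activity for tools like "top" to show something. The per-call figure includes loop and call overhead, so treat it as a rough upper bound rather than a precise cost.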
fsloth about 1 year ago
Agner Fog’s optimization manuals are pretty good <a href="https:&#x2F;&#x2F;www.agner.org&#x2F;optimize&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.agner.org&#x2F;optimize&#x2F;</a>
jkoudys about 1 year ago
Many have covered specific learning material, but my best advice is to find a mentor. It really is a skill that&#x27;s best learned by apprenticing. I knew a lot of concepts and could muddle my way through them, but it wasn&#x27;t until I had experienced people to work under directly that my skills really took off.<p>Just like the best advice on learning to write code is to write code, the best way to learn how to optimize performance is to optimize performance.
Archelaos about 1 year ago
For a specific introduction to database optimization, I can recommend Silvia Botros and Jeremy Tinley: &quot;High Performance MySQL&quot; -- <a href="https:&#x2F;&#x2F;www.oreilly.com&#x2F;library&#x2F;view&#x2F;high-performance-mysql&#x2F;9781492080503&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.oreilly.com&#x2F;library&#x2F;view&#x2F;high-performance-mysql&#x2F;...</a>
bewuethr about 1 year ago
There&#x27;s <i>The Every Computer Performance Book</i> <a href="http:&#x2F;&#x2F;www.treewhimsy.com&#x2F;TECPB&#x2F;Book.html" rel="nofollow">http:&#x2F;&#x2F;www.treewhimsy.com&#x2F;TECPB&#x2F;Book.html</a> and the blog it&#x27;s based on, <a href="https:&#x2F;&#x2F;rwwescott.wordpress.com&#x2F;" rel="nofollow">https:&#x2F;&#x2F;rwwescott.wordpress.com&#x2F;</a>
thetwentyone about 1 year ago
This has been very helpful to me: <a href="https:&#x2F;&#x2F;viralinstruction.com&#x2F;posts&#x2F;hardware&#x2F;#74a3ddb4-8af1-11eb-186e-4d80402adfcf" rel="nofollow">https:&#x2F;&#x2F;viralinstruction.com&#x2F;posts&#x2F;hardware&#x2F;#74a3ddb4-8af1-1...</a><p>It’s really not specific to Julia, though the language does let you drill down into the details nicely.
Qwertious about 1 year ago
&gt;What are some good resources for learning about performance optimization?<p>I swear, nobody actually read OP&#x27;s post. The top <i>five</i> comments are &quot;here&#x27;s some personal advice about general rules of thumb, without any links to actual resources!&quot;<p>Kudos to levodelellis, whose post is the #6 root comment and contains some links and drops some names.
101008 about 1 year ago
I think this thread is old enough to ask something like this, but instead of learning performance optimization, is there a way to be hired to work on this? I am fascinated by it, and I have always loved it when I had to optimize something (mostly code in my experience, algorithms, etc.), but that&#x27;s only a very small percentage of the work I do.
AtNightWeCode about 1 year ago
The difficult thing is to benchmark the software correctly and evaluate the impact of a change. Most of the examples on the Internet are useless micro-optimizations. I evaluated a program some time ago that used several speed tricks. But the reason it was slow was that it reread a file on each iteration of a loop.
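A minimal Python sketch of that kind of bug, assuming a hypothetical word-counting task (the program described above is not shown in the thread, so the function names and the task are invented for illustration):

```python
import os
import tempfile


def count_matches_slow(path, words):
    """Rereads the whole file for every word: one file read per iteration."""
    counts = {}
    for word in words:
        with open(path, encoding="utf-8") as f:
            text = f.read()  # repeated work hidden inside the loop
        counts[word] = text.count(word)
    return counts


def count_matches_fast(path, words):
    """Reads the file once, then reuses the contents for every word."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return {word: text.count(word) for word in words}


if __name__ == "__main__":
    # Build a small throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("cache miss cache hit cache")
        path = tmp.name
    try:
        print(count_matches_slow(path, ["cache", "hit"]))
        print(count_matches_fast(path, ["cache", "hit"]))
    finally:
        os.remove(path)
```

Both functions return the same result; the second simply does the I/O once. No micro-optimization inside the loop body would matter as long as the file read stays in it.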
torial about 1 year ago
If you are looking at the .Net ecosystem, I can&#x27;t recommend this book enough. The chapter on Garbage Collection itself was worth the price of the book to me: <a href="https:&#x2F;&#x2F;www.writinghighperf.net&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.writinghighperf.net&#x2F;</a>
charlysl about 1 year ago
MIT&#x27;s open course 6.172 Performance Engineering of Software Systems:<p><a href="https:&#x2F;&#x2F;ocw.mit.edu&#x2F;courses&#x2F;6-172-performance-engineering-of-software-systems-fall-2018&#x2F;" rel="nofollow">https:&#x2F;&#x2F;ocw.mit.edu&#x2F;courses&#x2F;6-172-performance-engineering-of...</a>
kamikaz1k about 1 year ago
1. Computer Enhance by Casey Muratori (check out some of his YouTube videos if you want a preview)<p>2. Read the blog posts people wrote about the 1BR challenge where they tried to figure out the fastest way to process 1 billion rows of data<p>3. Brendan Gregg’s blog<p>4. Google&#x2F;YouTube how to profile code in your desired language
DougN7 about 1 year ago
Look into the write ups of the One Billion Row Challenge - you’ll see lots of techniques.
konradha about 1 year ago
Read some of the performance-related work here: <a href="https:&#x2F;&#x2F;acl.inf.ethz.ch&#x2F;publications&#x2F;" rel="nofollow">https:&#x2F;&#x2F;acl.inf.ethz.ch&#x2F;publications&#x2F;</a>
bdangubic about 1 year ago
If you are in “javaland”, look at the One Billion Row Challenge; you will learn a lot: <a href="https:&#x2F;&#x2F;github.com&#x2F;gunnarmorling&#x2F;1brc">https:&#x2F;&#x2F;github.com&#x2F;gunnarmorling&#x2F;1brc</a>
pbronez about 1 year ago
Mature Optimization by Carlos Bueno <a href="https:&#x2F;&#x2F;carlos.bueno.org&#x2F;optimization&#x2F;" rel="nofollow">https:&#x2F;&#x2F;carlos.bueno.org&#x2F;optimization&#x2F;</a>
lallysingh about 1 year ago
It really depends on what level you&#x27;re working on improving. It&#x27;s effectively queues all the way down, but programmers hate reading statistics. E.g. a server process is a series of queues from your TCP socket to your process to your disk, the CPU&#x27;s reorder buffer, and the scheduler.<p>You have three areas to study:<p>1. Measurement - makes you define the performance you&#x27;re looking for and measure it. Until you do this it&#x27;s mostly a bullshit &quot;make people stop complaining about performance&quot; errand that&#x27;s too wishy-washy to do with more than a few stabs in the dark. With containers and decent capture of samples of your load, a benchmark is pretty straightforward to set up.<p>2. Modeling - these models are usually little more than measured rates and latencies applied to Little&#x27;s Law. Pocket-calculator math is often good enough. At worst, an M&#x2F;M&#x2F;1 queue.<p>3. Instrumentation - Figuring out how to attribute your computer&#x27;s resources (memory, CPU time, iops, etc.) to different parts of your code. Tracing libraries, Linux perf, and eBPF can be useful here.<p>There are a decent number of computer performance books. I like the ones by Jain (great, but AFAICT out of print) and Harchol-Balter. For work, you shouldn&#x27;t read them straight through but iterate through parts as you better understand the problem you&#x27;re trying to solve and start choosing strategies. For the tactical side, Brendan Gregg has some decent measurement tool books. Figure out what you want to improve and how to measure that. Then start attributing the existing performance to implementation choices that you can control. Then control those choices (e.g. change algorithm, load balance better, make design trade-offs) to improve performance.
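The "pocket-calculator math" in the modeling step can be sketched as below. Little's Law says the mean number of requests in the system equals arrival rate times mean time in the system (L = λW), and for an M/M/1 queue the mean time in the system is 1 / (μ − λ). The function names and the example rates are assumptions chosen for illustration, not measurements from any real system.

```python
def littles_law_in_flight(arrival_rate_per_s, mean_latency_s):
    """Little's Law: mean requests in the system L = lambda * W."""
    return arrival_rate_per_s * mean_latency_s


def mm1_utilization(arrival_rate_per_s, service_rate_per_s):
    """M/M/1 utilization: rho = lambda / mu."""
    return arrival_rate_per_s / service_rate_per_s


def mm1_mean_latency_s(arrival_rate_per_s, service_rate_per_s):
    """M/M/1 mean time in system (queueing + service): W = 1 / (mu - lambda)."""
    if arrival_rate_per_s >= service_rate_per_s:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate_per_s - arrival_rate_per_s)


if __name__ == "__main__":
    # Hypothetical numbers: 200 requests/s arriving, 50 ms measured mean latency.
    print(littles_law_in_flight(200, 0.050))  # about 10 requests in flight

    # Hypothetical server that can handle 250 requests/s.
    for arrival_rate in (100, 200, 240):
        rho = mm1_utilization(arrival_rate, 250)
        latency_ms = mm1_mean_latency_s(arrival_rate, 250) * 1000
        print(f"load {rho:.0%}: mean latency {latency_ms:.1f} ms")
```

Even this crude model shows the shape that matters in practice: latency grows slowly at moderate load and climbs steeply as utilization approaches 100%.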
Cloudef about 1 year ago
Watch this video <a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=4LiP39gJuqE" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=4LiP39gJuqE</a>
midzer about 1 year ago
Run Lighthouse from the developer tools of a Chromium-based browser to get first hints about potential optimizations for any website.
keeperofdakeys about 1 year ago
I&#x27;d recommend learning how to instrument and measure the performance of your code. I find most performance issues are (mostly) situations you didn&#x27;t and couldn&#x27;t anticipate. So instead of preventing them, learning to investigate and fix them is key. (Shout out to Brendan and his Linux Performance page <a href="https:&#x2F;&#x2F;www.brendangregg.com&#x2F;linuxperf.html" rel="nofollow">https:&#x2F;&#x2F;www.brendangregg.com&#x2F;linuxperf.html</a>).<p>Second, there is an important engineering lesson to learn. Often there are many performance issues, with only a few acting as serious bottlenecks. Additionally, sometimes the solutions to performance issues add complexity, but as an engineer you want to avoid complexity. Engineering effort is usually limited, so there is always a question of whether a performance issue needs to be fixed now or left till later.<p>Here is a quick example to illustrate my point. pgAdmin is a web UI program to interact with PostgreSQL databases, allowing you to remotely run queries. Part of its operation fetches information about columns in a result set; in one version this code ran one query per column sequentially. So c columns meant c synchronous queries to the server: almost instant on a local database with a small number of columns. However, with 400 columns and a 40 ms internet link, it ended up taking at least 400 × 40 ms = 16 seconds to complete. In 99% of cases this code works just fine, but in a few less obvious scenarios its runtime balloons.<p>Another example: what happens if all the daily scheduled jobs run at the same time? <a href="https:&#x2F;&#x2F;github.com&#x2F;go-acme&#x2F;lego&#x2F;issues&#x2F;1656">https:&#x2F;&#x2F;github.com&#x2F;go-acme&#x2F;lego&#x2F;issues&#x2F;1656</a>
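The arithmetic in the pgAdmin example can be written as a tiny latency model. This is a sketch under stated assumptions: it counts only network round trips, ignores query execution time on the server, and is not pgAdmin's actual code; `round_trip_seconds` is a hypothetical helper.

```python
def round_trip_seconds(num_round_trips, round_trip_ms):
    """Lower bound on wall-clock time spent waiting on sequential round trips."""
    return num_round_trips * round_trip_ms / 1000.0


if __name__ == "__main__":
    link_latency_ms = 40  # the internet link from the example above

    # One synchronous query per column, issued one after another.
    per_column = round_trip_seconds(400, link_latency_ms)

    # Assumed alternative: fetch metadata for all columns in a single query.
    batched = round_trip_seconds(1, link_latency_ms)

    print(f"one query per column: {per_column:.2f} s")  # 16.00 s
    print(f"single batched query: {batched:.2f} s")     # 0.04 s
```

On a local database the round trip is a fraction of a millisecond, which is why the same code looks instant in development and balloons over a slow link.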
jsenn about 1 year ago
As you can tell from the diversity of responses here it really depends on what you&#x27;re doing. In my work I use C++, and &quot;optimization&quot; typically involves making a heavy computation run faster (measured in wall clock time) or making a particular subsystem use less memory.<p>The number one most important thing you can do is dive in and start profiling real-world code. Find a part of your software that is too slow or uses too many resources, and use whatever the standard profiler is for your development environment to figure out why. Performance optimization is a very empirical discipline. Yes there are general principles, but if you don&#x27;t measure your baseline or your changes you won&#x27;t know how good your optimization was. In my experience, the first attempt at a fix is often flat-out wrong! Doing this first will also help motivate your reading.<p>Once you know how to measure the performance of your software, I recommend learning the basics of modern computer architecture. At a minimum, learn about CPU caches, how they work, and how to design your code to use them effectively. I find Algorithms for Modern Hardware to be a good resource for this [1], but there are many others. Relatedly, you should have a rough idea of how long it takes for your computer to do various basic things (fetch something from memory, fetch something from cache, etc.). There&#x27;s a table at [2] that gives a good idea. Don&#x27;t worry too much about the absolute values--the order of magnitude is what&#x27;s important.<p>You should also study fundamental data structures, but understand that for low-level programming 95% of the time the correct answer will be to shove everything into a simple flat array (e.g. std::vector in C++), maybe with some sort of index on top. 
Fancy data structures are more important in higher-level languages that are structurally unable to make effective use of modern hardware.<p>[1] <a href="https:&#x2F;&#x2F;en.algorithmica.org&#x2F;hpc&#x2F;" rel="nofollow">https:&#x2F;&#x2F;en.algorithmica.org&#x2F;hpc&#x2F;</a><p>[2] <a href="https:&#x2F;&#x2F;gist.github.com&#x2F;jboner&#x2F;2841832" rel="nofollow">https:&#x2F;&#x2F;gist.github.com&#x2F;jboner&#x2F;2841832</a>
benreesman about 1 year ago
It really depends on where you sit in the stack.<p>The generally useful rule is “measure before acting”.<p>There are some rules of thumb at every layer:<p>If you’re getting bad scrolling in a web application on a mobile phone, something is probably getting called over and over<p>If you’ve got an x86_64 server maxing out but the cores aren’t printing work? Zen4’s northbridge has some edge cases.<p>If you’re trying to melt aluminum so that exquisite optics can do extreme ultra-violet litho: weak hyper charge is very well determined empirically but there are some weird readings on muon spin.<p>I’m sort of kidding because this is an Endless Internet Feud, but really it’s measure and whack the hot spots.<p>I’ve done a bunch of this shit: if you’re not sure where to start feel free to email.
csours about 1 year ago
1. Don&#x27;t do remote calls in loops.<p>That&#x27;s it.
darksim905 about 1 year ago
performance optimization of -what- ?
tillulen about 1 year ago
I admire Daniel Lemire’s work on SIMD implementations. [Lemire]<p>[Lemire] <a href="https:&#x2F;&#x2F;lemire.me&#x2F;en&#x2F;#publications" rel="nofollow">https:&#x2F;&#x2F;lemire.me&#x2F;en&#x2F;#publications</a><p>I learn a lot by reading my compiler’s and profiler’s documentation.<p>For Rust, the <i>Rust Performance Book</i> by Nicholas Nethercote et al. [Nethercote] seems like a nice place to start after reading the Cargo and rustc books.<p>[Nethercote] <a href="https:&#x2F;&#x2F;nnethercote.github.io&#x2F;perf-book&#x2F;" rel="nofollow">https:&#x2F;&#x2F;nnethercote.github.io&#x2F;perf-book&#x2F;</a><p><i>Algorithms for Modern Hardware</i> by Sergey Slotin [Slotin] is a dense and approachable overview.<p>[Slotin] <a href="https:&#x2F;&#x2F;en.algorithmica.org&#x2F;hpc&#x2F;" rel="nofollow">https:&#x2F;&#x2F;en.algorithmica.org&#x2F;hpc&#x2F;</a><p>Quantitative understanding of the underlying implementations and computer architecture has been invaluable for me. <i>Computer architecture: a quantitative approach</i> by John L. Hennessy and David A. Patterson [H&amp;P] and <i>Computer organization and design: the hardware&#x2F;software interface</i> by Patterson and Hennessy [P&amp;H ARM, P&amp;H RISC] are two introductory books I like the best. 
There are three editions of the second book: the ARM, MIPS and RISC-V editions.<p>[H&amp;P] <a href="https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;cM8mDwAAQBAJ" rel="nofollow">https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;cM8mDwAAQBAJ</a><p>[P&amp;H ARM] <a href="https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;jxHajgEACAAJ" rel="nofollow">https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;jxHajgEACAAJ</a><p>[P&amp;H RISC] <a href="https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;e8DvDwAAQBAJ" rel="nofollow">https:&#x2F;&#x2F;www.google.com&#x2F;books&#x2F;edition&#x2F;_&#x2F;e8DvDwAAQBAJ</a><p>Compiler Explorer by Matt Godbolt [Godbolt] can help you better understand what code a compiler generates under different circumstances.<p>[Godbolt] <a href="https:&#x2F;&#x2F;godbolt.org" rel="nofollow">https:&#x2F;&#x2F;godbolt.org</a><p>The official CPU architecture manuals from CPU vendors are surprisingly readable and information-rich. I only read the fragments that I need or that I am interested in and move on. Here is Intel’s [Intel]. I use the Combined Volume Set, which is a huge PDF comprising all ten volumes. It is easier to search when it’s all in one file. I can open several copies on different pages to make navigation easier.<p>Intel also has a whole optimization reference manual [Intel] (scroll down, it’s all on the same page). The manual helps you understand what exactly the CPU is doing.<p>[Intel] <a href="https:&#x2F;&#x2F;www.intel.com&#x2F;content&#x2F;www&#x2F;us&#x2F;en&#x2F;developer&#x2F;articles&#x2F;technical&#x2F;intel-sdm.html" rel="nofollow">https:&#x2F;&#x2F;www.intel.com&#x2F;content&#x2F;www&#x2F;us&#x2F;en&#x2F;developer&#x2F;articles&#x2F;t...</a><p>Personally, I believe in automated benchmarks that measure end-to-end what is actually important and notify you when a change impacts performance for the worse.
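One way such an automated check might look, as a rough Python sketch. Everything here is an assumption made for illustration: the helper names, the 10% tolerance, the hard-coded baseline, and the stand-in workload. A real setup would store baselines per machine and run a workload that matches what users actually do.

```python
import time


def measure_best_seconds(workload, runs=5):
    """Run `workload` several times and return the fastest wall-clock time."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best


def is_regression(measured_s, baseline_s, tolerance=0.10):
    """True if the measurement exceeds the baseline by more than `tolerance`."""
    return measured_s > baseline_s * (1.0 + tolerance)


def example_workload():
    # Hypothetical stand-in for the end-to-end operation that matters.
    return sorted(range(200_000, 0, -1))


if __name__ == "__main__":
    baseline_s = 0.050  # assumed value recorded from an earlier known-good run
    measured_s = measure_best_seconds(example_workload)
    status = "REGRESSION" if is_regression(measured_s, baseline_s) else "ok"
    print(f"{status}: {measured_s * 1000:.1f} ms (baseline {baseline_s * 1000:.1f} ms)")
```

Taking the best of several runs reduces noise from other processes; the tolerance keeps small fluctuations from triggering false alarms.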