Ask HN: How do you deal with 2000 line functions?

26 点作者 pinkunicorn超过 9 年前

I'm working with an archaic code base and I'm stuck at trying to add new business logic into a 2000 line function that is absolutely crappily written with nothing but if() conditions in it. Given such a scenario, how do I go about adding a new functionality to this mess? Should I tell my manager that the function needs to be refactored and take time to refactor it?Edit: Removed offensive word

35 条评论

ndesaulniers超过 9 年前

1. Unit tests. Otherwise modifying old code is full of potentially unforeseen landmines.2. Read the function a few times to get a sense of what it's doing.3. Break the function into blocks of statements that are performing a related task.4. Factor those out into smaller functions.5. Recursively repeat 3-5 until you're satisfied with the new functions' line count.> Should I tell my manager that the function needs to be refactored and take time to refactor it?It's absolutely worth mentioning. Time estimations are one of the harder tasks of Software Engineering. Being up front with your manager can help better set expectations.

评论 #10932645 未加载

评论 #10932913 未加载

评论 #10932643 未加载

tinco超过 9 年前

Don't tell you manager anything, just do it. If someone tells you to add functionality to a 2000 line function, it takes extra time. If you want the company to benefit a little bit more from the time you spend on grokking it (by saving the next poor soul some time) by splitting it up.Go to the part that you suspect you need to modify, identify a block of code that together forms something you can give a name, i.e. "initialize_doodad_struct", "validate_widget_params", pull it out put it in a function. Now look at all identifiers that are referenced or declared in your block of code, do a ack or ctrl+f through the 2000-N line function to 100% verify that they're not referenced anywhere. If they aren't you can safely encapsulate them in your new function (if you're working in a static language you probably could just right-mouse-click-extract to do all this).Then immediately verify that the code still works. Run it, ideally in unit tests but most certainly also in real world(-like) scenarios. Do this for every small step. If your codebase takes 2 hours to compile, instead make a commit. Make a commit for every single 3-10 line extraction you do. (do it anyway, it'll look good and everyone will love you)And then repeat until either the part you wanted to modify is clear to you, or there's nothing more to extract ;)By the way, I'm a Ruby programmer mainly, and in Ruby any function over 10 lines is frowned upon. I know that in some programming language communities (looking at you C and C++!) a 200 line function doesn't raise any eyebrows, and they'll sometimes even scold you for extracting a function if it doesn't 'add' anything. I think this is because in C every function pollutes the global namespace, so there's a little more cost associated with it.

评论 #10932926 未加载

评论 #10935002 未加载

mod超过 9 年前

You test the shit out of it, then re-write it ensuring tests pass, then add your new code.It's probably easier to do that than to muck around in the 2000-line version hoping you don't screw something else up.

评论 #10932487 未加载

评论 #10932956 未加载

评论 #10932431 未加载

falcolas超过 9 年前

Slowly and carefully, like you're defusing a malfunctioning clockwork bomb.1) Understand at a very fundamental level what it does (there are probably multiple things).2) Understand at a very fundamental level what it does (this is really the hard part)3) Document your hard-earned understanding (tests, ironically, do not work well for this, since there are no fundamental units to test at, and there likely is a metric ton of shared state)4) Refactor it into something easier to understand and test.I've had to work with a similar 5,000 line C function at the heart of a DSL parser; literally the core of the entire business. Changes were made very slowly, and with a lot of consideration. Refactoring is still underway 6 years later.

kpil超过 9 年前

Pft, 2000 lines :-)What you do is that you:A) Realize that there might be years of hidden bugs, but also fixes to complex problems, speedups, kludges, weird changes to requirements, etc, hiding in the code, so even really creative unit-test might not cover real-world edge-cases.B) Take as small bit as you can, and just refactor out the smelly code into smaller parts. Often a good name around smelly code helps a long way.D) Iterate. Until good enough.But not more... Maybe you have more important things to do than rewriting ugly code that works.There is a big risk that since you probably won't understand everything that goes on, and why - you will break something important.

mh8h超过 9 年前

Michael Feathers' book, Working Effectively with Legacy Code, is highly recommended. In that book he explains that you can choose between two approaches: Change and pray, or Cover and modify. Then he provides dozens of different strategies to "cover" the existing legacy codes before you make the changes.

评论 #10932925 未加载

azeirah超过 9 年前

Something I found to be really useful is to remove all statements inside the code, so you'll be left with large skeleton structures. Look at it from a distance, (small font), see some patterns. Can you split it up?getting a feel for the structure of the function helps me understand it.Ex:function someFunction () {<pre><code> if () { for () { if () { } else { } } if () { for () { } } } }</code></pre>

评论 #10932795 未加载

评论 #10932723 未加载

yk超过 9 年前

Depends, but very likely. There may be cases were there is practically no way around a 2000 line function, one example would be a something like a very long switch-case statement, were the logic is very simple, but there are a lot of cases to cover. For example something equivalent to (python pseudo code)<pre><code> dict[ keyword](args) </code></pre> were dict is a large dictionary containing function objects. In that case the dictionary approach has the advantage of separating the logic from the clutter, but it is possible to work with 2000 lines of<pre><code> case something: foo(); break; </code></pre> However if it is not such a case, then you are in trouble. Thing is, management does not like to be told that the code is a steaming pile of shit, and management is especially unhappy if you tell them that you are going to spend a few month on accomplishing absolutely nothing. From their perspective refactoring is accomplishing nothing: the functionality is there, it works and they do not care about the gory details. So you need to convince your manager that refactoring is a good idea, which just goes against every basic assumption of his job. So think about the actual problems you have with the code, and what the company gains by refactoring it. (Easier extensibility, less risk of bugs and faster turn around if a bug occurs.) Then make sure that you convince your manager that refactoring is the right thing to do. ( Ideally you should enlist your coworkers for that, if they know the kinds of problems.)Speaking a bit more generally, this is part of why technical dept happens. Selling refactoring is hard, so everybody tries to tip-toe around the 2000 line gorilla until some unhappy soul (read someone else) can no longer avoid to wrestle with it.

lutorm超过 9 年前

I like this book, it has a lot of tips for situations like these:<a href="http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052" rel="nofollow">http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...</a>

noonespecial超过 9 年前

I've had worse. I was instructed to add to the > 2000 line function of nested if/than/else but the function was written by the manager who said "Don't change any of my code, its been working perfectly".This function was used as a container to hold all the business logic and flow control. It had dozens of parameters that all had to have some value or another passed to keep from throwing undefined errors. There was an integer called "wat_to_do" that would choose different blocks of if/than/else inside the function. Not like case, mind you, scattered everywhere.I've never really been the same since then. Can you get code PTSD?

joshka超过 9 年前

A few people have mentioned Michael Feathers' book Working Effectively with Legacy code already, but I'll add my piece on this. The book describes legacy code as any code without tests.It then goes on to describe the "legacy software change algorithm" (google that for more articles that will help you. Get targeted legacy code into test harness and cover with tests: 1. Identify change points for the target change or new code addition. a. Find test points where the behavior of the code can be sensed. b. Break dependencies (very carefully and often without sufficient tests) and get the targeted legacy code into a test harness. c. Cover targeted legacy code with (characterization) unit tests 2. Add new functionality with Test Driven Development (TDD) 3. Refactor to remove duplication, clean up, etc.It really is worth reading this book as it covers almost exactly your question: "22. I Need To Change a Monster Method and I Can’t Write Tests for It." and various other relevant chapters.Answering your manager question with a standard consultant answer "It depends". Refactoring code is about setting yourself up for future development. You should always be refactoring your code after you confirm it works. Asking your manager whether you do it should take on the same amount of importance as asking whether you should wash your hands after visiting the restroom. That said, developers can often feel a sense of ownership of code and not wish change for various reason (usually familiarity with the current crap situation).

评论 #10934483 未加载

davimack超过 9 年前

I would actually approach this differently. I'd go back to whomever the authority is with regards to what the function is supposed to do, or to the spec, and then rewrite it from scratch. This won't ensure that it behaves as it does now but with a little bit added - it will ensure that it behaves as per the specification / user requirement. If the function is that huge, with just if's, it's very likely that it's buggy and that those bugs haven't been addressed. They won't be addressed if you duplicate the functionality in a refactor unless you can get a handle on the purpose of the thing to begin with. Scrapping it and rewriting it is not a bad thing - frequently, you'll make something much tighter if you do that.(note: would be happy for you to have left the "offensive" word in - 2,000 lines of code for a single function screams for a refactor, and says that the original dev had no clue how to write good code. if the code makes you want to cuss, well....)

评论 #10934451 未加载

alkonaut超过 9 年前

I think making small opportunistic fixes is best. Do what you can do without massively delaying your work every time you need to work in that area of the code. That is, leave the code better than you found it, each time.If you add the new logic this time and also split it into four 500-line methods and maybe a few extra tests, you have done quite a lot and it is likely doable within the time frame of the work you are doing.ButIf there are zero tests and you aren't confident that you can create those tests then I'd bring it up with management. Maybe they'll say that changes in this functionality will be very rare after this small change and that it's very well tested manually and by customers - then that's it. Not all code has to be made concise and elegant, it's better to focus your effort on the code that regularly needs changing. However, in this case I'd ask management to consider not changing it at all, or make sure it will be very well tested manually after your changes).

wnevets超过 9 年前

I would add landmarks and do small refactoring within the function until I'm confident in my ability to start splitting it up into smaller functions. This should happen naturally as you make changes and add new features. However if you're not in the function enough to make this happen then it's probably not worth refactoring to the business.

ufmace超过 9 年前

Start out by studying and understanding the function, the way that it's used, and everything that it does. Does it mess with a lot of global state? Is it called on one place for one thing, or a whole bunch of places, or even for multiple different reasons? How hard would it be to set up some unit tests?I would recommend against any kind of rewrites or radical refactoring right away. You probably don't know what it does well enough yet or what kind of workarounds and bug fixes are in there, so you're likely to bring back bugs. And not many businesses have the time to spare to actually redo something like that correctly all at once.Aside: You could just give up and quit, but I consider it part of the job of a good developer to be able to take a mess of spaghetti code and gradually turn it into something easily understandable without causing huge delays in business processes or show-stopper bugs. I don't think a developer who is only willing to touch pristine code is very valuable in the real world.I'd make the first goal to add the required changes with as few changes to the function as possible. Focus your refactoring work at first on trying to get some tests around this thing. Don't try to test every use case right away, but do try to comprehensively test a few tasks, including capturing whatever it does to any state in any other systems. Most likely, a lot of the initial refactoring work will hardly touch the function at all, but instead focus on putting the things that it touches into modular, testable parts. Getting good tests around it will help you understand what it does better, what parts of it are important, and how state is handled around the system.Then as you go, start adding tests, and gradually refactor things that are tested. Write tests for any bugs that are reported back to you. Anything that looks weird, ask around and see if you can figure out what it's for and if you really need to test for it.

rbrogan超过 9 年前

You could start by writing comments. There is little chance of any bugs coming from that. If there are any tests available then you could work on understanding the tests and then step through the code in the debugger. Hope that helps.

tienthanh8490超过 9 年前

I know other people already suggested writing test, carefully documenting etc so I won't repeat that. Just want to add one more thing: try to prune the code first as such a large codebase is likely to have redundant code built up over time. By doing this you will have some time to understand the code (to know what you can kill and what you can't), and also save yourself some time as the codebase you have to refactor now is now much shorter and probably cleaner.

agentultra超过 9 年前

I've had to do this once. They don't teach you managing code like this! A friend gave me a copy of Working Effectively With Legacy Code[0] which helped me.The gist of it: a strong suite of integration and unit tests. Isolate small code paths into logical units and test for equivalency.[0] <a href="http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052" rel="nofollow">http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...</a>

timonoko超过 9 年前

I remember linear lists of 1000+ if-statements. These were "state machines", popular form of real-time multitasking 50 years ago. Nothing wrong with that. You might make an analyzer which recognizes longer series of events and impossible states, but why bother. This is a programming paradigm, best you can do is to go with the flow.

fallingfrog超过 9 年前

We have a saying where I'm from - "That's the smell of money!" Relax, I'm sure your boss knows that this is going to take a while. Make prudent changes, don't go overboard, take your time and test thoroughly.

frozenport超过 9 年前

Where a function ends and where a function starts is somewhat artificial. For example, if this were BASIC it would be just be a label.I would read the thing a few times and figure out exactly where to add the functionality. Refactor second, if time permits. If the thing is 2k long clearly, refactoring it isn't a priority.

there4超过 9 年前

This is a great presentation about refactoring that may be useful: "RailsConf 2014 - All the Little Things by Sandi Metz" <a href="https://www.youtube.com/watch?v=8bZh5LMaSmE" rel="nofollow">https://www.youtube.com/watch?v=8bZh5LMaSmE</a>

lectrick超过 9 年前

1) Cover code with tests 2) Refactor. Martin Fowler wrote a famous book on refactoring patterns that are guaranteed to merely transform the code and not break functionality. 3) Rerun tests, repeat till everything passes

tahssa超过 9 年前

Do a re-write. If your manager says no to the re-write then add another if condition to handle the current business logic and, at the same time, look for a new job (which is probably what the last guy did).

scarface74超过 9 年前

It depends on the tooling you have available. If I encountered a 2000 line C# method, I would use Resharper and start with the extract method command and then extract class.

kaizensoze超过 9 年前

<a href="https://www.youtube.com/watch?v=dWIHvuABaFU&t=01m05s" rel="nofollow">https://www.youtube.com/watch?v=dWIHvuABaFU&t=01m05s</a>

odonnellryan超过 9 年前

You should read Clean Code, even if it is just the bits on refactoring, so you can refactor properly! :)

quintes超过 9 年前

unit test first so you know if you've broken it. Then lots of test input variations best taken from production sources.Refactor thereafter and test after each iteration of changes. It's not a rewrite, so don't lose the plot just Refactor into smaller methods :)

panamafrank超过 9 年前

quit, get a job working in a newer more modern language or just a company working on 'green field' projects. I worked on legacy code bases for ~3 years as a junior grunt and it contributed to some kind of PTSD-lite disorder.or read the 'ol refactoring book...

zippy786超过 9 年前

Send it to me with requirements, I'll do it for 500$

jarsin超过 9 年前

Get a new job :)

评论 #10932719 未加载

sharemywin超过 9 年前

you probably need a ton of unit tests for this thing.

zerr超过 9 年前

Do not touch it.

davidw超过 9 年前

Indeed.com, Craigslist and maybe Linkedin.

PaulHoule超过 9 年前

Yes, just don't use the word "shit".The obvious refactoring is to split the function up into smaller pieces, even if you can split it into two 1500 line functions that is a win.

评论 #10932474 未加载

评论 #10932420 未加载