Hacker News Comments on
Working Effectively with Legacy Code
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this book.
Very carefully. But seriously, check out the book "Working Effectively With Legacy Code" by Feathers. It has helped me tremendously in multiple refactorings.
His thesis is that Legacy Code is code without tests. So to make it not Legacy you need to add tests then refactor safely. Then he explains the very complex patterns that can arise and how to deal with them.
This book is one of a few that changed my career.
⬐ muzani
Thanks for the recommendation! Exactly what I was looking for. Yup, the problem is that large chunks of the code are useless and must be removed, but I don't really want to write tests on those. Other parts have to be decoupled and removed and I end up needing to write end to end tests.
Working Effectively with Legacy Code by Michael Feathers is a good resource for how to introduce testing code into an existing system:
If you're looking for help fixing the mess you are dealing with, find the book Working Effectively with Legacy Code - https://www.amazon.com/Working-Effectively-Legacy-Michael-Fe....
You might also want to read Refactoring to Patterns, but the legacy book is more important to start with.
If the code has tests, I would start by looking at those tests.
If it has no tests, then I would slowly try to build tests to document the functionality that I need. In your case, with Angular, that might mean simple HTML pages containing the smallest module that you need.
How to find things? If you're on Windows, try AstroGrep http://astrogrep.sourceforge.net/ to quickly search and jump around in the code; on any system, I use VS Code for similar functionality. Also learn to use command line find/grep.
The book "Working Effectively with Legacy Code" also helped me be more comfortable navigating and changing large code bases, in a long term view I recommend this book to every developer https://www.amazon.co.uk/Working-Effectively-Legacy-Michael-...
Lastly, I would raise this because the company might not be aware they are buying a low quality framework that maybe ticks all the boxes in the contract but is in effect impossible to use by their current developers (you), it might be there's other people with more experience in said niche that might be able to help. In the private community maybe some people would be able to accept a short contract to help train you.
⬐ angrygoat
xadoc's last point is really important. The missing documentation is clearly impacting your productivity: you absolutely should raise this as an issue with more senior devs or management. There are a number of ways to respond, and they should be pleased that you have flagged the issue early.
⬐ mbzi
This is good advice. I would also extend this and write out an FAQ / stackexchange for the next engineer at your company who has to go through the same learning curve.
⬐ EliRivers
Xadoc's advice above is good; unit tests. I work with poorly documented protocols that have been implemented "around the theme of the protocol" by hardware from a variety of suppliers, and this is how we work out its quirks.
A battery of unit tests, starting with the simplest functions it offers, and thence upwards into more complicated tests (i.e. chained calls of the presented functions) where we track what internal state we think the system should have at that point in the tests and interrogate it to discover what internal state it really does have.
⬐ mytailorisrich
These are exploratory tests rather than unit tests, but your point stands.
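The battery of tests described above could look roughly like the sketch below. `FakeDevice` and its quirk are hypothetical stand-ins invented for illustration; in practice the object under test would be the real, poorly documented system. The idea is only to show the shape: start with the simplest call, then chain calls and compare the state you expect with the state the system reports.

```python
class FakeDevice:
    """Hypothetical stand-in for a poorly documented device or protocol."""

    def __init__(self):
        self._mode = "idle"
        self._buffer = []

    def start(self):
        self._mode = "running"

    def push(self, value):
        # Invented quirk: values pushed while idle are silently dropped.
        if self._mode == "running":
            self._buffer.append(value)

    def status(self):
        return {"mode": self._mode, "queued": len(self._buffer)}


def test_simplest_call_first():
    # Start with the most basic query the system offers.
    device = FakeDevice()
    assert device.status() == {"mode": "idle", "queued": 0}


def test_chained_calls_track_expected_state():
    device = FakeDevice()
    device.push(1)  # we might *think* this queues a value...
    assert device.status()["queued"] == 0  # ...the system says it did not
    device.start()
    device.push(2)
    assert device.status() == {"mode": "running", "queued": 1}
```

Each assertion that fails on first contact is a discovered quirk; once corrected to match reality, it stays in the suite as documentation.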
You can also skim chapters in this book. Best book I’ve read on the topic, highly recommended ( Working Effectively With Legacy Code ) https://www.amazon.com/Working-Effectively-Legacy-Michael-Fe...
Yes. By the way - „Working effectively with legacy code”.
If you start from scratch you may bump into the same edge cases that the original writers bumped into, and end up with code that is not much better than the original, even if the original is 2 years out of date.
I’m sure there were cases when writing from scratch was a good call, but I don’t remember hearing about it.
⬐ avgDev
I guess if I do rewrite, I shall write about it as I go. If I fail it will make for a good story.
⬐ adanto6840
I'd emphasize that failure on a project like this may not be what you'd traditionally have in mind when thinking 'that project failed', though it happens and it could be that bad in the absolute worst case.
The issue is primarily that of the reward versus cost -- especially the opportunity cost.
When the system is rewritten, will the business have increased revenue or decreased cost? Will it do so significantly, surpassing at least the cost to rewrite (salaries, etc) -- that's the absolute minimum bar, but then you have to consider the opportunity cost which is the real concern:
If you had instead spent the same amount of time adding new features, implementing an A/B test suite to increase conversions, improving marketing capabilities, retention mailers, or really any other activity that could positively impact the company business metrics -- would the impact be better than the impact of the rewrite?
In most cases the customers (internal or external) don't really know or care how good/bad the underlying code is, as long as the product serves their needs. When that's true, even partially, the value of 'rewrite' almost never exceeds the opportunity cost alone, let alone the absolute cost (and that's to say nothing of the risks).
I've heard the book Working with Legacy Code recommended for strategies to bring order to these kinds of projects. Haven't read it myself yet...
This book is pretty great:
I'd suggest the book [Working Effectively with Legacy Code](https://www.amazon.com/Working-Effectively-Legacy-Michael-Fe...).
I'll open by saying I've only ever had bad experiences with complete re-writes and these experiences have impacted my aversion to them.
"[Working Effectively with Legacy Code]" by Michael Feathers really helped me get through a situation like this.
My recommendation is not to try to understand the code per se, but understand the business that the code was being used in/by.
From there, over time, just start writing really high level end-to-end tests to represent what the business expects the codebase to do (i.e. starting at the top of the [test pyramid]). This ends up acting as your safety net (your "test harness").
Then it's less a matter of trying to understand what the code does, and becomes a question of what the code should do. You can iterate level by level into the test pyramid, documenting the code with tests and refactoring/improving the code as you go.
It's a long process (I'm about 4.5 years into it and still going strong), but it allowed us to move fast while developing new features with a by-product of continually improving the code base as we went.
[test pyramid]: https://martinfowler.com/bliki/TestPyramid.html
[Working Effectively with Legacy Code]: https://www.amazon.com/FEATHERS-WORK-EFFECT-LEG-CODE/dp/0131...
⬐ lostgame
>> My recommendation is not to try to understand the code per se, but understand the business that the code was being used in/by.
I strongly agree with this. I've done at least 4 or 5 successful complete rewrites of old code bases, and I have found, rather than even 'business' the word for this might be 'context'.
If you can contextualize a piece of software, its functionality and operations, you can have a much better understanding of an existing codebase.
⬐ potta_coffee ⬐ ConceptJunkie
What would you do if the codebase was actually 5 codebases absorbed from 5 different smaller companies? Assume that zero institutional knowledge about the code / business have been passed on.
⬐ lostgame
>> Assume that zero institutional knowledge about the code / business have been passed on.
Who is, in that case, using the software? They obviously understand the context by which the software is at least going to work, otherwise, why is the software being rewritten?
Who is requesting the rewrite? Do they know what it is supposed to do? Is there an executable build of it that exists somewhere?
⬐ potta_coffee ⬐ bap
These are ecommerce systems. It's astonishing because no-one in the company truly has a complete understanding of the business, as far as I can tell. The code is running in production and serving customers.
Rewrite is being pushed by certain parties because we're unable to meet feature requests quickly with the existing system, and it's being assumed that a rewrite will fix that problem. The team is barely functional though (from the top down). I've seen a few failed projects now and I don't think the rewrite will ever be accomplished. If we manage to rewrite, it's far from certain that we'll do a better job than the last guys did.
⬐ lostgame
Late reply, and I'm sure you're smart enough to know this already, and are hopefully already planning it - but get the hell out of there, fast.
⬐ potta_coffee
I...yeah. The picture wasn't completely clear until very recently and now the anxiety has kicked in. I'm trying to stick around a little because I've been through too many jobs in too short a time and I think I need to show some "commitment" on my resume.
You are now in the platform business.
I have to assume someone is using the software therefore there is some tribal knowledge of what it does? Otherwise this is maybe SAAS software that users use and some functionality is exposed that would allow you to begin decomposing backwards toward expected input/output. You're almost black-boxing at that point.
I will admit that I have, on very rare occasion, scream tested a piece of software running on a server that nobody would claim ownership or knowledge of either on the eng. team or within the org.
⬐ potta_coffee
Part of the problem is that the people who owned tribal knowledge were all fired / quit without documenting anything. Every member of the existing team has been there around a year or less.
⬐ potta_coffee
There's a surface level understanding of what it does but nobody really understands how many of the large features really work, or what the actual rules are that govern them. Yes, much of this is black box. Example: yesterday I had to try to figure out what branch of code was compiled and deployed to our server. Everyone had assumed it was the Master branch, but no...deploying that branch fubared everything. I finally found the "working" branch of code.
> My recommendation is not to try to understand the code per se, but understand the business that the code was being used in/by.
You're absolutely right, but the problem comes when the code itself is the only authoritative documentation of what the code does, and in a lot of cases, the only authoritative documentation (or even the only documentation, period) of what the code is supposed to do!
⬐ bunderbunder
I love that book. I can't recommend it highly enough.
Approval Tests (http://approvaltests.com) can be a huge timesaver when you're getting that initial black box characterization put together.
Besides being an important part of getting your bearings, talking to everyone who relies on the software to get a better understanding of how they interact with it can be a great time saver, too. It's amazing how quickly you can clean up legacy code with the delete key, provided you can confirm nobody's using it anymore.
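As a rough illustration of the black box characterization mentioned above, here is a hand-rolled golden-file sketch. It deliberately does not use the Approval Tests library's API; `legacy_price`, its rules, and the helper names are all hypothetical. The first run records whatever the legacy code currently does, and later runs flag any change in behaviour.

```python
import json
import tempfile
from pathlib import Path


def legacy_price(quantity, customer_type):
    """Hypothetical legacy function whose rules nobody fully remembers."""
    total = quantity * 9.99
    if customer_type == "wholesale" and quantity > 10:
        total *= 0.85
    if quantity == 13:
        total += 1  # unexplained special case: record it, don't "fix" it yet
    return round(total, 2)


def characterize(inputs):
    """Run the legacy code over many inputs and collect whatever it returns."""
    return {f"{q}-{c}": legacy_price(q, c) for q, c in inputs}


def verify_against_golden(actual, golden_path):
    """First run records the golden file; later runs compare against it."""
    if not golden_path.exists():
        golden_path.write_text(json.dumps(actual, indent=2, sort_keys=True))
        return True
    return json.loads(golden_path.read_text()) == actual


def demo():
    """Record a golden file in a temp directory, then verify against it."""
    inputs = [(q, c) for q in (1, 10, 11, 13) for c in ("retail", "wholesale")]
    with tempfile.TemporaryDirectory() as workdir:
        golden = Path(workdir) / "legacy_price.golden.json"
        first = verify_against_golden(characterize(inputs), golden)
        second = verify_against_golden(characterize(inputs), golden)
        return first, second
```

In a real project the golden file would be committed next to the tests, so that a reviewer sees every intentional behaviour change as a diff.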
The wholesale rewrite is a will-o-the-wisp. Very, very attractive, yes. But usually when people chase after it, they end up drowning in a quagmire. That isn't to say that you shouldn't strive to get rid of all the bad code, but do it as a long-term, component-wise, in-place rewrite.
FYI, the Legacy code book is: Working Effectively with Legacy Code by Michael Feathers. It's useful, and I also strongly recommend it when you're feeling overwhelmed by a large sprawling code base.
I would try and clean up the bits I was working on.
This is a good book on the topic of refactoring a large code base with no tests.
⬐ riwasabi
Thanks for the tip! I'm gonna have a look at it. :)
You're definitely right that unit tests are a part of the solution.
The book can be read in a few different registers (making a case for what unit tests should be in a greenfield system, why and how to backfit unit tests into a legacy system) but it makes that case pretty strongly. It can seem overwhelming to get unit tests into a legacy system but the reward is large.
I remember working on a system that was absolutely awful but was salvageable because it had unit tests!
Also generally getting control of the build procedure is key to the scheduling issue -- I have seen many new projects where a team of people work on something and think all of the parts are good to go, but you find there is another six months of integration work, installer engineering, and other things you need to do to ship a product. Automation, documentation, simplification are all bits of the puzzle, but if you want agility, you need to know how to go from source code to a product, and not every team does.
⬐ Jtsummers
That book is something I wish I'd read when I started my career. This is a holiday week so all the management types were out, but I had planned to have management buy a copy for the office library next week. I still need to finish it, but most of what I have read, I've been able to apply to projects in the past. I should probably actually reread it because it's been a while.
If you have to write mocks in the native language, mocks will probably drive you insane.
Tools like mockito can make a big difference.
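Mockito is a Java library; as a comparable illustration in Python, the standard library's `unittest.mock` plays a similar role. The sketch below is only an example of the idea that a mocking tool saves you from hand-writing fake classes. `InvoiceService` and its mailer dependency are hypothetical names invented here.

```python
from unittest.mock import Mock


class InvoiceService:
    """Hypothetical class that depends on a mail gateway we don't want to hit in tests."""

    def __init__(self, mailer):
        self.mailer = mailer

    def send_reminder(self, customer_email, amount_due):
        if amount_due <= 0:
            return False
        self.mailer.send(customer_email, f"You owe {amount_due:.2f}")
        return True


def test_reminder_is_sent_for_positive_balance():
    mailer = Mock()  # no hand-written fake mailer class needed
    service = InvoiceService(mailer)
    assert service.send_reminder("a@example.com", 12.5) is True
    mailer.send.assert_called_once_with("a@example.com", "You owe 12.50")


def test_no_reminder_for_zero_balance():
    mailer = Mock()
    assert InvoiceService(mailer).send_reminder("a@example.com", 0) is False
    mailer.send.assert_not_called()
```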
I worked on a project which was terribly conceived, specified, and implemented. My boss said that they shouldn't even have started it and shouldn't have hired the guy who wrote it! Because it had tests, however, it was salvageable, and I was able to get it into production.
The book makes the case that unit tests should always run quickly, not depend on external dependencies, etc.
I do think a fast test suite is important, but there are some kinds of slower tests that can have a transformative impact on development:
* I wrote a "super hammer" test that smokes out a concurrent system for race conditions. It took a minute to run, but after that, I always knew that a critical part of the system did not have races (or if they did, they were hard to find)
* I wrote a test suite for a lightweight ORM system in PHP that would do real database queries. When the app was broken by an upgrade to MySQL, I had it working again in 20 minutes. When I wanted to use the same framework with MS SQL Server, it took about as long to port it.
* For deployment it helps to have an automated "smoke test" that will make sure that the most common failure modes didn't happen.
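A "hammer" style test like the first item in the list above might be sketched as follows. This is an assumption about the general shape, not the commenter's actual test; `Counter` is a hypothetical shared component. As the comment itself notes, a passing run only shows that no race was found, not that none exists.

```python
import threading


class Counter:
    """Hypothetical shared component; the lock is what the test is guarding."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        return self._value


def hammer(counter, threads=8, iterations=10_000):
    """Hit the counter from many threads at once and return the final value."""

    def worker():
        for _ in range(iterations):
            counter.increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return counter.value


def test_no_lost_updates():
    # Every increment must be accounted for; a lost update means a race.
    assert hammer(Counter(), threads=8, iterations=10_000) == 80_000
```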
That said, TDD is most successful when you are in control of the system. In writing GUI code often the main uncertainty I've seen is mistrust of the underlying platform (today that could be, "Does it work in Safari?")
When it comes to servers and stuff, there is the issue of "can you make a test reproducible". For instance you might be able to make a "database" or "schema" inside a database with a random name and do all your stuff there. Or maybe you can spin one up in the cloud, or use Docker or something like that. It doesn't matter exactly how you do it, but you don't want to be the guy who nukes the production database (or another developer's or tester's database) because the build process has integration tests that use the same connection info as them.
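One way to get a randomly named, throwaway database for each test is sketched below. It uses SQLite from the Python standard library purely as a stand-in; with MySQL or SQL Server the same idea would apply to a randomly named schema. The helper name `throwaway_database` is made up for this example.

```python
import sqlite3
import tempfile
import uuid
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def throwaway_database():
    """Yield a connection to a uniquely named database that is deleted afterwards."""
    with tempfile.TemporaryDirectory() as workdir:
        # Random name: tests can never collide with each other or with a shared database.
        path = Path(workdir) / f"test_{uuid.uuid4().hex}.db"
        conn = sqlite3.connect(path)
        try:
            yield conn
        finally:
            conn.close()  # close before the temporary directory is removed


def test_insert_and_count():
    with throwaway_database() as conn:
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
        conn.execute("INSERT INTO orders (total) VALUES (?)", (19.99,))
        (count,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        assert count == 1
```

Because the connection details are generated inside the test, there is no shared connection string that could accidentally point at production.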
> I've also seen people mangle well-factored but untestable code in the process of writing tests, which can be a tragedy when dealing with a legacy codebase that was written with insufficient testing but is otherwise well-designed.
Have you read Michael Feathers' Working Effectively with Legacy Code? 
In his definition of legacy code it is any such code that has no test coverage. It's a black box. There are errors in it somewhere. It works for some inputs. However you cannot quantify either of those things just by "inhabiting the mind of the original developers." The only way to work effectively with that code base in order to extend it, maintain it, or modify it is to bring it under test.
This is far more difficult with legacy code than with greenfield TDD for the aforementioned reasons: there are unquantified errors and underspecified behaviours. You can't possibly do it in one sweeping effort and so the strategy is to accept that tests are useful and to add them with each change, first, before making that change and using the test to prove the change is correct.
Slowly, over time, your legacy code base surfaces little islands of well tested code.
You have to be deliberate and careful. You have to think about what you're doing.
This is a much different experience than writing greenfield code. TDD is effortless and drives you towards the answer in this case.
⬐ ericgunnerson
Original author here...
Your writeup mirrors my personal experience, which was why I was such an advocate for so long. But it doesn't mirror my team experience at all.
I bought Feathers' book when it first came out and liked it so much that I bought copies for a whole team out of my own pocket. And I talked about and taught the techniques to the team members.
I had high hopes, and a good team, but the results that we got were not what I hoped for. The result we got was a lot of small tested functions throughout the codebase ("sprouts" in Feathers' nomenclature) - which I expected - but I didn't see any evidence of coalescing of those tests into something better - and by themselves, the sprouts made the codebase worse.
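For readers unfamiliar with the term, a "sprout" is roughly this: new behaviour is written as a small, separately tested function, and the untested legacy code gains only a single call to it. The sketch below uses hypothetical names and is an illustration of the general shape, not code from the book or from the commenter's codebase.

```python
def legacy_post_entries(entries, ledger):
    """Hypothetical legacy function: hard to test because of what it does to `ledger`."""
    new_entries = unique_entries(entries, ledger)  # the only line added to legacy code
    for entry in new_entries:
        ledger.append(entry)  # imagine many untested side effects here
    return len(new_entries)


def unique_entries(entries, existing):
    """The sprout: new behaviour as a small, pure function that is easy to test."""
    seen = set(existing)
    result = []
    for entry in entries:
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


def test_unique_entries_skips_existing_and_duplicates():
    assert unique_entries(["a", "b", "a"], ["b"]) == ["a"]
```

The sprout itself is well tested, but the surrounding legacy function is not, which is the limitation described in the comment above.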
I ran into a number of cases where I would pair with developers who were looking to do something better than sprout - so they could fix a whole class or area to be better - and the best answer I could give them is, "give me a morning and I can probably come up with a good solution", which doesn't really help. Some code is just aggressively hard to test.
And I should mention that this was pretty much the perfect environment to try this; I had a team without shipping pressure and a mandate from above to write better unit tests and sufficient time to train and pair with developers.
The approach we settled on was different. We spent our time looking for "targets of opportunity"; the problematic classes or groups of classes that were leading to real issues - bug farms, hard to modify, that sort of thing - and we took a targeted approach to replace them. The replacements were generally written using TDD.
⬐ dkarl
Yeah, I've read it, I know the definition, and I understand the intention behind adding tests to legacy code. Unfortunately, TDD (meaning TDD as I've encountered it in print and in practice) encourages people to think that "no tests" is tantamount to "no value to preserve" and therefore "no risk of harm from refactoring." Maybe that's an exaggeration, but certainly some TDD practitioners think that they can't possibly harm a codebase by adding tests. Unfortunately, testing requires refactoring, refactoring is redesigning, and if you don't understand the design you're modifying, your changes can make the code less understandable, not more. Tests added after the fact impose the test-writer's understanding of how the system works, which results in chaos if their understanding isn't compatible with the understanding embedded in the existing design.
Also, in spite of that definition, there's a lot more to "legacy" than not having tests. Not having access to the original designer and not having access to the requirements or domain understanding that influenced development are important handicaps of legacy code that are entirely separate from tests. Certainly tests added during development can capture some of this knowledge, but adding tests after the fact does not automatically recreate it.
These are both examples of the danger of trying to elevate one aspect of software development to a primary and sufficient role. "Take care of X and everything else will work out" has no known solution for software development, and any methodology is harmful to the extent that it encourages people to think that way.
So as to be constructive, I'm going to reference a classic: Working Effectively With Legacy Code. Here's a nice clip from an SO answer paraphrasing it:
"To me, the most important concept brought in by Feathers is seams. A seam is a place in the code where you can change the behaviour of your program without modifying the code itself. Building seams into your code enables separating the piece of code under test, but it also enables you to sense the behaviour of the code under test even when it is difficult or impossible to do directly (e.g. because the call makes changes in another object or subsystem, whose state is not possible to query directly from within the test method).
This knowledge allows you to notice the seeds of testability in the nastiest heap of code, and find the minimal, least disruptive, safest changes to get there. In other words, to avoid making "obvious" refactorings which have a risk of breaking the code without you noticing - because you don't yet have the unit tests to detect that.".
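To make the idea of a seam in the quote above concrete, here is a minimal sketch of one kind of seam, where behaviour is changed by overriding a method rather than editing the code under test. `ReportGenerator` and its method names are hypothetical, and this is only one of several ways a seam can be exploited.

```python
import time


class ReportGenerator:
    """Hypothetical legacy class with a hard-wired dependency on the clock."""

    def build(self):
        return f"Report generated at {self._now()}"

    def _now(self):
        # The seam: behaviour can be changed here by overriding,
        # without modifying build() itself.
        return time.strftime("%Y-%m-%d %H:%M:%S")


class TestableReportGenerator(ReportGenerator):
    """Test subclass that exploits the seam to pin the timestamp."""

    def _now(self):
        return "2000-01-01 00:00:00"


def test_report_has_pinned_timestamp():
    report = TestableReportGenerator().build()
    assert report == "Report generated at 2000-01-01 00:00:00"
```

Extracting the clock call into `_now` is the kind of small, low-risk change the quote describes: it creates a place to sense and control behaviour before any larger refactoring is attempted.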
As you get more experience under your belt, you'll begin to see these situations again and again of code becoming large, difficult to reason about or test, and similarly having low direct business benefit for refactoring. But crucially, learning how to refactor as you go is a huge part of working effectively with legacy code and by virtue of that, maturing into a senior engineer -- to strain a leaky analogy, you don't accrue tech debt all at once, so why would it make sense to pay it off all at once? The only reason that would occur is if you didn't have a strong culture of periodically paying off tech debt as you went along.
I'm not going to insinuate that it was necessarily wrong that you decided to solve the problem as you did, and the desire to be proactive about it is certainly not something to be criticized. But it wasn't necessarily right, either. Your leadership should have prevented something like this from occurring, because in all likelihood, you wasted those extra hours and naively thought that extra hours equal extra productivity. They don't. You ought to aim for maximal results for minimal hours of work, so that you can spend as much time as you can delivering results. And, unless you're getting paid by the hour instead of salaried, you're actually getting less pay. So to recap: you're getting less pay, you're giving the company subpar results (by definition, because you're using more hours to achieve what a competent engineer could do with only 40 hour workweeks so you're 44% as efficient), and everyone's losing a little bit. Thankfully, you still managed to get the job done, and because you were able to gain authorship and ownership over the new part of the codebase, you were able to politically argue for better compensation. Good for you, you should always bargain for what you deserve. But, just because you got a more positive outcome doesn't mean you went about it the most efficient way.
The best engineers (and I would argue workers in general) are efficient. They approach every engineering problem they can with solutions so simple and effective that they seem boring, only reaching for the impressive stuff when it's really needed, and with chagrin. If you can combine that with self-advocacy, you'll really be cooking with gas as far as your career is concerned. And, it'll get you a lot further than this silly childish delusion that more hours equals more results, or more pay. Solid work, solid negotiation skills, solid marketing skills and solid communication skills earn you better pay. The rest is fluff.
There are even books about dealing with legacy code. I've found this one to be useful:
Working Effectively with Legacy Code, by Michael Feathers
Check out the book "Working Effectively with Legacy Code", by Michael Feathers.
I believe the basic approach is to write tests to capture the current behaviour at the system boundaries - for a web application, this might take the form of automated end-to-end tests (Selenium WebDriver) - then, progressively refactor and unit test components and code paths. By the end of the process, you'll end up with a comprehensive regression suite, giving developers the confidence to make changes with impunity - whether that's refactoring to eliminate more technical debt and speed up development, or adding features to fulfill business needs.
This way, you can take a gradual, iterative approach to cleaning up the system, which should boost morale (a little bit of progress made every iteration), and minimises risk (you're not replacing an entire system at once).
I've used this approach to rewrite a Node.js API that was tightly coupled to MongoDB, and migrated it to PostgreSQL.
I've had to do this once. They don't teach you managing code like this! A friend gave me a copy of Working Effectively With Legacy Code which helped me.
The gist of it: a strong suite of integration and unit tests. Isolate small code paths into logical units and test for equivalency.
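Testing an isolated code path for equivalency could look something like the sketch below: keep the old implementation as the reference, write the replacement beside it, and compare them over many inputs. Both slug functions and the helper names are hypothetical examples, not taken from any comment here.

```python
import random


def legacy_slug(title):
    """Hypothetical legacy code path, kept unchanged as the reference."""
    out = ""
    for ch in title.lower():
        if ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"
    return out.rstrip("-")


def new_slug(title):
    """Hypothetical rewrite that must behave identically."""
    words = "".join(ch if ch.isalnum() else " " for ch in title.lower()).split()
    return "-".join(words)


def random_titles(count, seed=0, alphabet="abcXYZ 09-_!"):
    """Generate reproducible pseudo-random inputs to exercise both versions."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))
        for _ in range(count)
    ]


def find_mismatch(samples):
    """Return the first input where the two implementations disagree, or None."""
    for sample in samples:
        if legacy_slug(sample) != new_slug(sample):
            return sample
    return None
```

A call such as `find_mismatch(random_titles(500))` returning `None` gives some confidence before switching callers over; inputs recorded from production would make a stronger sample than random strings.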
I like this book, it has a lot of tips for situations like these:
I am not sure the above idea is mentioned by Michael Feathers in his amazing book "Working Effectively with Legacy Code" but it is a great idea, and combined with the things that Michael does cover will do you a lot of good!
> > My own preference for the answer is Uncle Bob's description, which is this: technical debt is any production code that does not have (good) tests.
> That's certainly an example of technical debt.
Agreed, it is not the only example, but perhaps it is a good one, as that is a particularly important form of debt that makes the code harder to safely change. I.e. it is a form of technical debt that makes it more expensive to pay off other kinds of technical debt.
Curiously, Michael Feathers has a similar definition of legacy code:
> To me, legacy code is simply code without tests.
⬐ dragonwriter
I think "code without tests" is a fine example of technical debt or legacy code, but using it as the definition of either disregards the importance of everything other than tests that goes into making systems maintainable.
Tests are necessary but not sufficient. (Particularly, tests embody
> Perhaps you should try writing unit tests after you write your code and then you can come back and add value to the conversation.
Nice snark there. Feel free to keep it to yourself.
To add actual value to the conversation (as opposed to your contribution), I can very much recommend the book "Working Effectively with Legacy Code" for how to handle unit-testing in the scenario of existing "legacy" code-bases.
It's full of useful tips and methods to get testing in place "anywhere" and has a pragmatic (as opposed to religious) approach to getting it done.
To spark some interest: The book defines "legacy code" as any code not covered by unit-tests.
It may seem dated (from 2004 and all), but it's been the most useful book I've read on unit-testing by far.
⬐ GFischer
I've been meaning to buy that book for a long time, thanks for the refresher :)
I've found that there are way more books for "greenfield" software development and not so many books for what 60% of people actually do (maintaining other people's projects, legacy or otherwise).
⬐ sheepmullet
You came along and dismissed the parent's views without any real thought.
If you write the tests afterwards you can better gauge how well they catch bugs. It sounds like the parent has tried this and has found them pretty useless.
I've personally found that unit tests are not worthwhile for many components, and yet are critical for others.
Dealing effectively with legacy code:
I just finished Pragmatic Thinking and Learning: Refactor Your Wetware (http://www.amazon.com/gp/product/B00A32NYYE)
Next I'm picking up Working Effectively with Legacy Code (http://www.amazon.com/dp/0131177052). It's been in my reading list for years and I can finally get to it!
Working Effectively With Legacy Code by Michael Feathers http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
Debugging with GDB: The GNU Source-Level Debugger by Stallman, Pesch, and Shebs http://www.amazon.com/Debugging-GDB-GNU-Source-Level-Debugge...
The Art of Debugging with GDB, DDD, and Eclipse by Matloff & Salzman http://www.amazon.com/gp/product/1593271743
Also, read this book: http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
It helps a lot and teaches you how to use grep and other tools (a lot more others that I no longer remember) to search and find your way through legacy code.
See if you can talk to the people in your company who hired the contractor. They might at least be able to give you a high-level description of what the software is supposed to do and how it's supposed to work. They might even have specs that they prepared for the contractor or other design documentation.
If the contractor's software has no tests or is poorly written, it's going to be hard to add features to it or refactor it. You might want to read Working Effectively with Legacy Code by Michael Feathers, which describes how you can get a handle on large bodies of legacy software.
This reminds me of a post by Michael Feathers, titled "The carrying cost of code". Feathers wrote the book about legacy code. I think he makes approximately the same point:
> If you are making cars or widgets, you make them one by one. They proceed through the manufacturing process and you can gain very real efficiencies by paying attention to how the pieces go through the process. Lean Software Development has chosen to see tasks as pieces. We carry them through a process and end up with completed products on the other side.
> It's a nice view of the world, but it is a bit of a lie. In software development, we are essentially working on the same car or widget continuously, often for years. We are in the same soup, the same codebase. We can't expect a model based on independence of pieces in manufacturing to be accurate when we are working continuously on a single thing (a codebase) that shows wear over time and needs constant attention.
This is the standard recommended book:
I write new tests in any area I'm going to be working in.
One book that might give you some useful advice is "Working Effectively with Legacy Code" by Michael Feathers:
> What about putting in effort into writing extensive test suites
Easier said than done. Writing test-suites for a codebase which never had a test-suite is a million times harder than writing a test-suite for new, fresh code.
In fact it's probably easier to start over than re-factoring the code to be testable in the first place, but some people might argue that would be a wee bit drastic. So not saying it can't be done, just that it does take a very significant effort.
If anyone should still feel like doing something like this, I can very much recommend the following book for advice and morale boost:
(Disclaimer: Affiliate link)
At work I have set up a nightly Jenkins job to merge all verified, un-submitted patches from Gerrit and process the resulting history with Sonarqube.
This has revealed many bad habits and allowed us to correct them through review comments prior to changes being submitted. It's also shown me just how "legacy" our code-base is, but thanks to sonarqube (which runs PMD and FindBugs as part of its analysis) we're improving.
⬐ petermartin
Static code analysis is very nice to help you identify problems with legacy code. That is how I got into it when we had to do a mass migration of old code to a different application server.
Great article that is still relevant today (sadly, I remember reading it when it first came out). If you really, really feel the need to rewrite from scratch, I recommend instead picking up a copy of "Working Effectively with Legacy Code" by Michael Feathers. It will give you ways to improve those terrible code bases while not throwing out the existing code. Plus you'll still get that "new car smell" of working on your code.
If you do decide to add tests to an existing code base, I found "Working Effectively with Legacy Code" to be a good guide. Check out the table of contents.
"Working Effectively with Legacy Code" is by Michael Feathers. http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
> First read the Fowler's 'Refactoring' book; it was written just for you.
"Refactoring" is not the tool for the job, although it's a nice sidearm.
What OP needs is the big gun, Feathers's "working effectively with legacy code": http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
As the title hints, it was written specifically and expressly for the "I just got a huge amount of complete shit of a codebase shoved onto me, how do I survive" situation. Just check the TOC of part 2 (the meat of the book): http://my.safaribooksonline.com/book/software-engineering-an...
> And of course, make sure your client acknowledges that it's a giant clusterf...
That's hugely important. No promises of delivery, and that the client understands it's not a cakewalk.
Appropriately, it's a book: http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
Some automatic tools could help (although I doubt they'll work on DBase III): static analysis to see what's there, and version control so you can start at the top, log your way through, and roll back to a previous working version.
But it's at the very least weeks of pain.
As the link by lttlrck also advocates: throwing shit out can easily be a mistake. More usually, http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea... + http://www.amazon.com/Refactoring-Improving-Design-Existing-... can get you further, faster. Stuff keeps working while you incrementally improve it.
Working with legacy systems is a black art that I didn't learn about until I took a job supporting and extending one such system. The book I link to above was critical in helping me to understand the approach taken by the team I was working with. It takes a keen, detail-focused mind to do this kind of work.
The approach we took was to create a legacy interface layer. We did this by first wrapping the legacy code within a FFI. We built a test-suite that exercised the legacy application through this interface. Then we built an API on top of the interface and built integration tests that checked all the code paths into the legacy system. Once we had that we were able to build new features on to the system and replace each code path one by one.
Unsurprisingly we actually discovered bugs in the old system this way and were able to correct them. It didn't take long for the stakeholders to stop worrying and trust the team. However there was a lot of debate and argument along the way.
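A minimal sketch of the interface-layer idea described above, in Python. Everything here is hypothetical: `legacy_calc_tax` stands in for the legacy code that sat behind the FFI, and `new_calc_tax` stands in for one replaced code path. The point is only that all callers go through one layer, so the same test cases can be run against the old and new paths.

```python
# Hypothetical stand-in for the legacy system (behind an FFI in the comment above).
def legacy_calc_tax(amount_cents, region_code):
    # Quirk of the old code: unknown regions silently get a zero rate.
    rates = {"A": 7, "B": 12}
    return amount_cents * rates.get(region_code, 0) // 100


class LegacyTaxInterface:
    """Thin interface layer: the only place that talks to the legacy code."""

    def __init__(self, backend=legacy_calc_tax):
        # The backend can be swapped for a replacement, one code path at a time.
        self._backend = backend

    def tax_for(self, amount_cents, region_code):
        return self._backend(amount_cents, region_code)


# Hypothetical replacement for one code path; it must match the legacy results.
def new_calc_tax(amount_cents, region_code):
    rates = {"A": 7, "B": 12}
    return amount_cents * rates.get(region_code, 0) // 100


def find_mismatches(cases):
    """Run the same cases through the old and new paths; return any that differ."""
    old = LegacyTaxInterface()
    new = LegacyTaxInterface(backend=new_calc_tax)
    return [case for case in cases if old.tax_for(*case) != new.tax_for(*case)]
```

If `find_mismatches` returns an empty list for a broad set of cases, that code path is a candidate for switching over; any mismatch is either a bug in the replacement or a quirk of the old system worth investigating.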
The problem isn't technical. You can simultaneously maintain and extend legacy applications and avoid all of the risks stakeholders are worried about. One could actually improve these systems by doing so. The real problem is political and convincing these stakeholders that you can minimize the risk is a difficult task. It was the hardest part of working on that team -- even when we were demonstrating our results!
The hardest part about working with legacy systems is the huge bureaucracies that sit on top of them.
⬐ wmatI've read that book as well, and highly recommend it. It's excellent.
You'll want these:
Essentially, my understanding of best practice is to write high-level functional tests for the features that appear to work and then use them to ensure there are no regressions as a result of your changes. Some people even define legacy code as "code without tests".
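One way to sketch that regression-check idea in Python: record what a working feature outputs today, then compare against that record after each change. `legacy_report` is a hypothetical feature invented for illustration, and the baseline is kept in memory here; in practice it would more likely be saved to a file.

```python
import json


def legacy_report(orders):
    """Hypothetical legacy feature that appears to work today."""
    total = sum(qty * price for qty, price in orders)
    return {"lines": len(orders), "total": total}


def record_baseline(inputs):
    """Capture the current outputs once, before any change is made."""
    return json.dumps([legacy_report(i) for i in inputs], sort_keys=True)


def has_regressed(inputs, baseline):
    """Re-run the feature after a change and compare with the recorded baseline."""
    return record_baseline(inputs) != baseline
```

Typical use would be to call `record_baseline` before touching the code, make the change, and then check that `has_regressed` is still false for the same inputs.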
If you haven't worked with legacy systems, you may want to check out "Working Effectively With Legacy Code"
You'll want to make your boss's profitable projects your top priority. Then gradually go through and start fixing the legacy problems bit by bit. Don't rewrite the entire system in a year; start rewriting/refactoring/fixing individual pieces of it one piece at a time.
I had a municipal tax management application that was a complete mess full of spaghetti VB code, multiple instances of similar queries/calculations/interfaces sprinkled all across the app in different places with different implementations. I even had one screen with a 40,000 line event handler. Over the course of 2 years, the entire thing was whipped into shape, additional functionality was added, and now it's easy for the new maintainers to work with it.
⬐ anonymousgeekThanks! This is good advice.
This is what Michael Feathers calls 'seams' in his book, Working Effectively with Legacy Code. Often, you have to do exploratory testing: you don't really know the requirements, but you write tests that the current code passes. Then you can refactor it. That way, current code behavior won't be changed.
Very good read, if you need to deal with legacy code and you don't know where to start.
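Feathers calls tests of this kind characterization tests. A minimal sketch in Python, assuming a hypothetical `legacy_format_name` function whose requirements nobody wrote down: the expected values are whatever the code returns today, not what a spec says it should return.

```python
import unittest


def legacy_format_name(first, last):
    """Hypothetical legacy function with no documented requirements."""
    if not first:
        return last.upper()
    return last.upper() + ", " + first[0] + "."


class CharacterizationTest(unittest.TestCase):
    """Pins down what the code does today so a refactor cannot silently change it."""

    def test_typical_name(self):
        self.assertEqual(legacy_format_name("Ada", "Lovelace"), "LOVELACE, A.")

    def test_missing_first_name(self):
        # Whether or not this was intended, it is the current behaviour, so pin it.
        self.assertEqual(legacy_format_name("", "Lovelace"), "LOVELACE")
```

Saved as a module, this can be run with `python -m unittest`. Once the tests pass against the untouched code, they act as the safety net while the function is refactored.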
This is actually the type of system (especially if the code quality is very rough in many places) for which I think regression tests are very useful (tests to make sure the system's behavior doesn't change).
A book called "Working effectively with legacy code" by Feathers is great for instrumenting and regression testing old code bases then changing them without breaking them.
I have the book "Working Effectively with Legacy Code", and it's pretty much just "Put things under test, then change them." Still a useful read, though, if you find yourself working in that sort of thing often (I do).