The way to distinguish this from the #5 situation in the article is to ask if you're dropping features because they're hard or because nobody uses them. The former is a red flag; the latter is a green flag. Before you embark on a rebuild, you should have solid data (ideally backed up by logs) about which features your users are using, which ones they care about, which ones are "nice to haves", which ones were very necessary to get to the stage you're at now but have lost their importance in the current business environment, and which ones were outright mistakes. And you should be able to identify at least half a dozen features in the last 3 categories that you can commit to cutting. Otherwise it's likely that the rewrite will contain all the complexity of the original system, but without the institutional knowledge built up on how to manage that complexity.
This is so important. I've been on many a project where, 3 months in, we wish we had historical tracking data on user activity to back up our instincts to cut a particular feature that seems worthless. The worst part? Even if you add it immediately, you'll have to wait 2-4 weeks to get a sufficient amount of data.
I think this was the problem a product like Heap was designed to solve: just track all user actions, forever, and then assign pipelines after the fact based on what you want to check up on.
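To make the "which features are actually used" question concrete, here is a minimal sketch of turning raw usage events into a per-feature report. The event format, the feature names, and the `usage_report` helper are all made up for illustration; a real product would read these from its own logs or analytics export.

```python
from collections import Counter
from datetime import date

# Hypothetical event records: (day, anonymous session id, feature name).
EVENTS = [
    (date(2018, 5, 1), "s1", "export_csv"),
    (date(2018, 5, 1), "s2", "export_csv"),
    (date(2018, 5, 2), "s1", "bulk_edit"),
    (date(2018, 5, 3), "s3", "export_csv"),
]

# Hypothetical list of everything the product offers, used or not.
ALL_FEATURES = ["export_csv", "bulk_edit", "legacy_fax_gateway"]


def usage_report(events, all_features):
    """Return {feature: (event_count, distinct_sessions)} for every feature."""
    counts = Counter(feature for _, _, feature in events)
    sessions = {}
    for _, session, feature in events:
        sessions.setdefault(feature, set()).add(session)
    return {
        f: (counts.get(f, 0), len(sessions.get(f, set())))
        for f in all_features
    }


report = usage_report(EVENTS, ALL_FEATURES)
# Features with zero recorded events are only candidates to cut, not verdicts.
unused = [f for f, (n, _) in report.items() if n == 0]
print(report)
print("candidates to cut:", unused)
```

Counting distinct sessions next to raw events matters: a feature hit 1,000 times by one script looks very different from one hit once each by 1,000 people.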
Don't work at Heap or anything, just love the team and product.
I don't think "just track all user actions, forever" is going to be a legally defensible solution for much longer, even in the US.
Out of interest, what makes you think that an application won't legally be able to record the ways in which a user interacts with that application?
Obviously I'm not speaking for Heap; just curious.
Things like online stores using cookies to track a user's shopping cart across requests are completely fine, yet it seems like legal departments decided to be overly cautious and treat all cookies as potentially infringing. GDPR may be triggering similar reactions.
I wouldn't have a problem with that if marketing departments became equally cautious, but they seem to just slap on a banner and carry on as before :(
It's about data that can identify a user, not any data. A collection of actions with anonymized user IDs will not allow anyone to identify the user (in most cases), so it's fine to keep it.
Correct me if I'm wrong - seems like anonymizing the usage data complies with the GDPR, and thus the grandparent post still stands.
Regarding GDPR, I'm hoping that I don't have to bother my users with a "do you consent to" popup when the only thing I want to do is log API calls server-side so that I can see patterns in usage and such. If I were to show such a "do you consent to" popup, users might mistakenly think I'm one of those techcrunchers with hundreds of data partners that all get to see your PII. I do not want to affiliate myself with those types of actors.
Anonymously of course. Should be fine, yeah?
"The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes."
As long as it's not linked to a particular profile ("pseudonymous" doesn't count, it could still be linked), it's fine.
A good example is MS Office: there are a huge number of features that only 5% of users might ever use, but the majority of users are likely to use quite a few of these niche features individually, and if you remove all the low-use features, you piss off basically everyone.
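A quick back-of-the-envelope version of that point, assuming (unrealistically) that each niche feature is used by an independent 5% of users: the share of users who rely on at least one of n such features is 1 - 0.95^n. With 20 niche features that is already about 64% of the user base. The function name is just for illustration.

```python
def share_using_any(p, n):
    """Share of users who use at least one of n features, each used by a
    fraction p of users, assuming usage is independent across features."""
    return 1 - (1 - p) ** n


# How fast "only 5% use it" adds up as more niche features get cut.
for n in (1, 5, 20, 50):
    print(n, round(share_using_any(0.05, n), 3))
```

Real usage is correlated, so the true number will differ, but the direction holds: per-feature averages hide how many people a batch of cuts touches.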
I think the mistaken idea of an average user is why a lot of metrics driven software seems to get more and more useless with every update.
(I can't see the present/away status of contacts in the newest Skype, really guys? )
Ideally, you disable them in the old software, and observe how many people complain.
Too often, product management commits to cutting a feature, and then caves in when paying customers complain. It's best to know in advance which category a feature really falls in.
Sometimes you will want to fold features into a rewrite (e.g. stop prompting the user to confirm X twice). Sometimes this will ease development and be worth it, but other times it'll pay off to just retain the old functionality and add it to a list to be user tested later.
Once the tech is solidly over, then take a swing at updating the poor UI. Do it in an agile way so you can back out of changes that the user base rejects, since (at least within my more modest usage studies) not everything people depend on comes up or gets reported. I'd much rather roll back a design feature branch than have users get change fatigue when you're forced to roll back your new shiny rebuild and the whole project ends up being shelved.
The way I try to solve this is to ask "why?" as many times as it takes to get to a fundamental business problem. Then it becomes easier to have a user story (as opposed to a specific feature request) and come up with other solutions that can be measured against the story. It also helps to keep the product focused, as it's easier to tell when a story is not for your target market vs a feature request -- and then you can make a conscious decision to either stay away or deliberately expand to that market.
It’s difficult not to sound combative when they say they want a convertible but you have to wheedle out of them that they want to take a proverbial road trip through monsoon season. No, you get a Land Rover with a snorkel or you wait, pal.
So bossy and difficult. Why won’t you just give us what we asked for? These meetings would go so much faster.
I think this is most important. A lot of people want to rewrite because they don't understand the current system and don't want to bother learning. Before you rewrite you really should understand the current state deeply.
If you can build that plan, and make the case that it will be easier to do the full rewrite, go for it. But if you couldn't put together the fix-in-place plan, you might not understand everything the old system does well enough to actually estimate the size of a rewrite...
(This isn't solely for full-parity rewrites: if you're dropping features, what does that look like dropping from the old system?)
A year into the process one of the c-level leaders pulled me into a room and asked why I couldn't fix the legacy code, and I basically told him that he should have pushed back on it. I couldn't fix the legacy code because that would be months of refactoring that should have been done instead of the rewrite.
Context: the legacy code had some design flaws that required major refactoring, but the legacy code "worked" except for very large deployments. The only problem was that the legacy system wasn't modular, so it didn't have unit tests and wasn't cross platform. All of those problems are easier to tackle via refactoring instead of a full rewrite.
Hmm... there have been a number of times when I've banged my head against the wall trying to figure out how to make my own code do something, until I finally bit the bullet and decided to rewrite the entire chunk from scratch and suddenly it took a fraction of the time I had spent trying to fix it to get it written and working. Not sure how to reconcile this with the advice you gave.
It's a very different kettle of fish to rewrite from scratch strange code you've not properly explored and given a chance to - which is the usual situation.
In fact rewriting a chunk sounds rather like refactoring.
You can't blindly listen to the experts.
#4 also mixes a good deal with #5 in that any changes you make (even purely good ones in your view) will require retraining of users and cause a kerfuffle when rolled out to your user base, people _hate_ change.
With all respect, that means you should not be in a position to rewrite legacy code, or to commit others to such a rewrite.
If all the experts you have worked with have been, in your eyes, overly attached to the old way of doing things, you have one of two issues:
- You have not had enough experience in the field, and have not worked with experts that actually have perspective about when/how to rewrite, abandon, or rework their code.
- You have dogmatically condemned people who think that the latest-and-greatest tech may not be a good solution to the problems at hand to the "old fogey" bin.
Either issue means you're not ready to make decisions at this level. Learn more. Research more. Watch more. Listen more.
Weirdly, gaining this perspective has less to do (in my experience) with years on the job, and more with diversity of team/business environments worked in.
You need that previous knowledge to know the "why" of things & if that why is still valid.
IMHO it's more dangerous if you're working with experts who don't want to improve the system.
I’ve always been a big believer in rebuilding your product from the ground up. I think it’s something you should always have going on in the background. Just a couple of devs whose job it is to try and rebuild your thing from scratch. Maybe you’ll never use the new version. But I think it’s a great way to better understand your product and make sure there’s no dark corners that no one dare touch because they don’t understand what it does, how it does it, or why it does it the way it does.
And I’ve always believed that if you don’t want to rebuild your app from scratch, then don’t worry, a competitor will do it for you.
So I agree with every point raised in this article. And I think it does a great job of articulating the issues that often go unspoken. But I’d like to add one more. And for me, this is the biggest issue for any company wanting to rebuild its product.
If your sales team has more clout than your designers and developers, then you’re fucked. And in the enterprise software world, this is the norm. An unchecked sales team that gets whatever it wants has already killed your product and made it impossible to rebuild. Their demands are ad-hoc, nonsensical, and always urgent. So urgent that proper testing and documentation are not valid reasons to prevent a release. Their demands are driven by their sales targets, and the promises they make to clients are born out of ignorance of what your product does, and how it does it.
This is not true of all companies. Many companies find a reasonable balance between the insatiable demands of a sales force and the weary cautiousness of their engineers. But if your company submits to every wish and whim of your sales team, and you attempt to rebuild your product, then you’re screwed.
What's your learning process? If you don't do maintenance how do you know your rebuilds aren't creating the same problems that lead to the systems needing replacement?
I've got a very well-founded distrust of people who only work on greenfield projects; they're generally responsible for the systems that need rebuilding.
I don't appreciate the snark in your comment.
Incremental rebuilds are not sexy. Adding unit tests to legacy code (thereby making it not legacy code according to Michael Feathers) is not sexy. Sticking with the tried and true technology is not sexy. But they are typically the most successful approaches for those not compensated for changing things for change's sake.
Their time is much better spent working on improving the "legacy" codebase. Simple refactoring and splitting the codebase in a modular fashion mean you can work on limited parts of the system in isolation. This makes incremental improvements and switching to new tech much easier, and certainly less risky than a rewrite.
I mean, you can write a bunch of pinning tests, then try to prise out various bits and pieces, sure.
But what if all the stuff you're trying to prise out can now be accomplished with a few open source libraries that didn't exist way back, with a very simple rewrite of your business logic on the top?
That's a situation I've encountered quite a few times - a lot of legacy code that's largely boilerplate, with business logic drizzled over the lot, oozing into the little cracks.
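For anyone unfamiliar with pinning (characterization) tests, here is a rough sketch of the idea: record what the old code actually does for a set of inputs, then hold the replacement to exactly that. `legacy_price`, `new_price`, and the inputs are invented stand-ins, not anyone's real code.

```python
def legacy_price(quantity, customer_type):
    """Stand-in for an opaque legacy function (hypothetical)."""
    price = quantity * 9.99
    if customer_type == "wholesale" and quantity > 10:
        price *= 0.9
    if quantity == 13:  # odd special case nobody remembers adding
        price += 1
    return round(price, 2)


def new_price(quantity, customer_type):
    """Hypothetical clean rewrite that missed the special case."""
    price = quantity * 9.99
    if customer_type == "wholesale" and quantity > 10:
        price *= 0.9
    return round(price, 2)


CASES = [(1, "retail"), (11, "wholesale"), (13, "retail"), (13, "wholesale")]


def record(func, cases):
    """Pin current behaviour: map each input tuple to the observed output."""
    return {case: func(*case) for case in cases}


def check_pinned(func, pinned):
    """Return (case, expected, actual) for every case where func deviates."""
    return [
        (case, expected, func(*case))
        for case, expected in pinned.items()
        if func(*case) != expected
    ]


PINNED = record(legacy_price, CASES)  # in practice, saved to a file once
print(check_pinned(new_price, PINNED))  # lists the cases the rewrite broke
```

The pinned outputs are not "correct" in any business sense, they are just what the system does today. Whether the quantity-13 quirk is a bug or a feature is exactly the question the test forces you to ask.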
That may be good value for big established corporates, but for startups and smaller companies I don't think it is.
Well said. This is easily my #1 biggest pain point as a developer.
Hahaha. Just a couple of devs?
It’s just R&D. It’s not an exotic idea.
Corollary: this position needs to be at least two devs. Otherwise, you're rotating in people for redundant discoveries rather than mentorship.
It may be a great way to learn, but I think that is better achieved with something like Google's famous 20% program, not some vague rewrite attempt with no direction.
Obviously, other, not-pure-research devs should be given time to do some of that work as well, otherwise the research team becomes the "saviors that are always about to come back over the hill" for every other team while they kick their respective cans down the road.
If your goal moves from feature comparable but on a modern platform, to new features, to a complete reinventing of the product all without actually shipping ... you might be in trouble.
I had a rebuild go 6 months over. In the heated executive meeting at t+3 months I was called to defend my team and pointed out that the VP Product had just delivered “final” specs literally the day before. How could we be on track with development if PM is 3 months past “end of development” with design specifications? The fact that the specs were changing weekly because “we’re agile” is a whole other issue.
The article touches on that too; simplified it's stating that if you're not live within 6 months, you're doing waterfall.
Waterfall isn’t just a synonym for “the wrong way to do it” :-)
So you rebuild as a new system as a gamble, because even though it shows all the traits described, the new system is at least one that anyone is willing to develop, and one where features can be added, and to which people can be recruited.
We know big rebuilds have small chances of success. But that doesn’t mean you shouldn’t do big rewrites. You are in a bad place if you even consider one. Maybe the big rewrite means the company has an 80% risk of going under. It could still be the safe bet.
As a developer you're constantly fighting managers who want to rush things to get them out and who will eventually blame you for a bug/non-defined behavior once you hit a certain milestone.
To me it seems the author of the article doesn't understand tech debt. If you've ever worked in a startup you'd know that the requirements are ever-changing, so if a certain payment system is put in place, it might evolve to the point where you really need to refactor it, and to enable that refactor you have to refactor the whole business flow as well. If more than 2-3 features are affected by a new feature, a big refactor is definitely needed.
Only one solution is offered, which I don't think is adequate: why would I leave in something that was only meant to provide value in the short term, and then build on top of it until I kill the old system?
For example, “we spend X/year on AWS, but if we spend Y to rewrite in C++ we need fewer VMs and can cut that to Z/year” is a simple calculation. If your engineers can’t even do that, their motives are suspect.
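Spelled out, the X/Y/Z calculation is just a payback period: the rewrite cost divided by the yearly savings. A minimal sketch, with made-up numbers and a made-up function name:

```python
def payback_years(current_per_year, rewrite_cost, new_per_year):
    """Years until a rewrite costing `rewrite_cost` pays for itself.

    Returns None when the rewrite does not reduce the yearly spend at all.
    """
    savings = current_per_year - new_per_year
    if savings <= 0:
        return None
    return rewrite_cost / savings


# Hypothetical numbers: X = 500k/year now, Y = 600k rewrite, Z = 200k/year after.
print(payback_years(500_000, 600_000, 200_000))
```

This ignores the cost of maintaining two systems during the transition and the risk that Y overruns, so treat the result as a lower bound on the payback time.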
New grads and junior engineers can end up trapped in a career dead end if their first job is on seriously old legacy tech.
I almost fell in the same trap, but quit a similar job to go back to grad school and get my Master's in CS.
The reference to Martin Fowler’s strangler pattern (https://www.martinfowler.com/bliki/StranglerApplication.html) was mentioned in the article to grow the new system in the same codebase until the old system is strangled. In my case (Ionic 1 to 2) however, both the entire framework and the language are different. How should the strangler pattern work in this case?
Identify key components and subsystems and rewrite them one by one. From the outside you seem to be switching over one REST endpoint after the other. Internally it's a bit more difficult, of course, but applications often have enough parts that are not SO intertwined that you can do stuff like this. It's a bit related to how you break up a monolith: find bigger, less coupled parts, shave them off, and touch only the glue code.
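A toy sketch of the "one endpoint after the other" idea: a thin routing layer sits in front of both systems and sends each request to the new code only if its path has been migrated. The handlers, the prefixes, and the `route` function are all hypothetical; in practice this role is usually played by a reverse proxy or the app's own router.

```python
def legacy_app(path):
    """Stand-in for the old system (hypothetical)."""
    return f"legacy handled {path}"


def new_app(path):
    """Stand-in for the rewritten system (hypothetical)."""
    return f"new handled {path}"


# Grows one entry at a time as each subsystem is rewritten and verified.
MIGRATED_PREFIXES = ["/api/invoices", "/api/reports"]


def route(path, migrated=MIGRATED_PREFIXES):
    """Send migrated paths to the new system, everything else to the old one."""
    for prefix in migrated:
        # Match the prefix itself or anything below it, but not "/api/invoices-old".
        if path == prefix or path.startswith(prefix + "/"):
            return new_app(path)
    return legacy_app(path)


print(route("/api/invoices/42"))
print(route("/api/users/7"))
```

Rolling back a migrated endpoint is then a one-line change to the prefix list, which is most of the appeal compared with a big-bang cutover.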
Sorry for the abstract reference here, but it applies to almost any replatforming out there. In most cases it is a very expensive operation for a business and needs some major reasons in order to justify such a move.
We did something similar to this when we broke up our Ember application so that we could code new things in React. We still maintain our Ember codebase, but are rewriting parts of some routes in React, and adding all new things in the React app.
We deploy ours as separate pods in a Kubernetes cluster, but you could even host them on the same server with separate nginx routes.
The initial ramp-up of this is a little frustrating, as it seems you're adding extra overhead to everything, but the long-term goal is to have infrastructure and workflow that supports having part of your app in The Old Proven Thing, and part in The New Hotness. This is valuable whether you're switching to React, or upgrading from Ember 2 to 3, etc, as it lets you upgrade a smaller set of dependencies, and experiment with things.
My firm belief is that when you need a rebuild, you are already well into a fail state as a company. Not to say there can be no recovery, but it is an indication of some deep problems for the company, beyond anything the engineering department alone can resolve... and if the rebuild is not coming from the executive leadership, it is an even bigger issue as it will more likely lead to bigger problems than it will solve.
I've become a member of a team the company scrambled together to deal with a `legacy` Python/SQL-based ingestion/storage system in an effort to 'harden' it. Despite my best efforts, we are going for a full rewrite into java/spring/avro/mongo/es. We have internal users talking SQL and utilising the system at the moment, and a fair amount of the data is relational.
I have run out of ideas how to convince the team and stakeholders, will have a one-shot chance to talk to VP. Any ideas how to voice the concerns about the full re-design (perhaps I'm just being difficult)?
2. Consider what 'the point' is in the first place, because the entire world could be run on python/SQL and it would be 'hard'. I don't think anyone would consider 'Mongo' to be 'hard'; usually people use it because it's fast and easy, not hard. Consider maybe only replacing one part at a time, i.e. Java-SQL.
3. Consider a simple clean up or refactor. No need to learn new languages and tools when maybe you just need a house clean.
4. People seem to be going back to SQL because of its inherent standardization - so many reporting and analysis systems use SQL as an interface, to the point where even NoSQLs are starting to use SQL.
In fact, I thought I was. We split our app into 3 parts, rebuilt part 1, then part 2, but part 1 couldn't be released to customers until part 2 was done, and we kept our legacy system supporting the majority of our users until we are done with part 3, which is nearing completion now.
I thought that was "replacing one piece at a time", but it isn't: most users aren't touching it until part 3 is done, and at that point, they are experiencing a new system from scratch.
If users speak SQL, they will reject Mongo. The users of the system are the ones who will determine project success or failure.
Think about the data analysts, product owners, etc. who use the system. Interview them. Find out exactly how they use the system currently. Do they query in an ad hoc way? Do they rapidly iterate on their queries? Watch them interact with the system. If it's any way other than through dashboards that an engineer updates on request, you are in for rough seas.
Users must always determine the contours of a new system. There are big data solutions that speak SQL. Some are cloud-based, some are not. Some are faster than others. The team should be able to show you why they rejected those as solutions.
Using the normal sense of "rebuild" didn't make sense.
There are some legitimate cases where you really should be rebuilding.
You may not have seen such a case since they are rare, but they do exist.
A good rule of thumb is to try your absolute best to avoid a rebuild. If at the end of your hard work you still feel defeated and forced to go with the rebuild option, you probably should rebuild.
Sometimes a rebuild is just necessary, because you are on a tech stack that is no longer working for you, for whatever reason. How would you solve that kind of problem?
It could also function pretty much like a nosql db initially, to ease your transition, then you could migrate gradually to using it as a relational db. You need strong checks on data integrity before you start - you could consider double writing (to old orm using nosql + new orm using psql), and comparing data stored to be sure you don't miss anything at first, before you switch?
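A rough sketch of the double-writing idea, with plain dicts standing in for the old and new data stores. The `DualWriter` class is invented for illustration; a real version would wrap the two ORMs and log mismatches somewhere durable.

```python
class DualWriter:
    """Write every record to both stores and compare them on read."""

    def __init__(self, old_store, new_store):
        self.old = old_store
        self.new = new_store
        self.mismatches = []  # (key, old_value, new_value) tuples

    def write(self, key, record):
        # Copy so the two stores never share one mutable object.
        self.old[key] = dict(record)
        self.new[key] = dict(record)

    def read(self, key):
        old_value = self.old.get(key)
        new_value = self.new.get(key)
        if old_value != new_value:
            self.mismatches.append((key, old_value, new_value))
        return old_value  # the old store stays the source of truth for now


old_db, new_db = {}, {}  # hypothetical stand-ins for the two databases
store = DualWriter(old_db, new_db)
store.write("user:1", {"name": "Ada", "plan": "pro"})
store.read("user:1")
print(store.mismatches)  # stays empty while both stores agree
```

Only once the mismatch list has stayed empty under real traffic for a while would you flip reads over to the new store, and later stop writing to the old one.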
Here is a video with more detail: https://www.youtube.com/watch?v=dQw4w9WgXcQ
I would start by firing people that led to this situation.
You are one of those blessed people who can architect a system and the architecture holds up for decades. From my experience most systems will end up in a big mess over time if features get added. There is almost no way around it.
This is exactly why maintenance is needed. Proper maintenance that includes things like updating the architecture and gradually migrating the whole system to that architecture, rebuilding small unwieldy components, updating and migrating database schemas as the product evolves, removing unused features.
If a product is just getting bugs patched and nothing else then it isn't really being maintained, it's being deprecated. Unfortunately as an industry we still think that there are distinct build and maintenance phases and that the latter can be done with fewer resources.
Thereby fomenting Red Flag #4, not "working with people who were experts in the old system."
But also, wtf is it with people that the first instinct in any kind of situation is to fire everyone?
You can have the best developers and architects in the world, but clueless management will sabotage anything they do, whereas good management can accomplish plenty with teams that aren't the best possible.
Why, the sources of the DOS exes were long gone by second year, lost in the crash of that old Windows Millennium machine that used to sit in our dorm room and was uniquely configured to compile them using Turbo Pascal - we figured it was a safe option to use as a source repository. But that still didn't stop us - we implemented the remaining features by patching assembly.
For example an executive/management team that over-commits the organisation and creates a culture of rewarding technical debt and punishing maintainers.
Rather than fixing these issues they will continually search for a superhero employee who is going to come in on a white horse on Monday and fix it all up in two weeks.