



What's the best way to handle refactoring a big file?















37















I'm currently working on a bigger project which unfortunately has some files where software quality guidelines were not always followed. This includes big files (read: 2000-4000 lines) which clearly contain multiple distinct functionalities.

Now I want to refactor these big files into multiple small ones. The issue is, since they are so big, multiple people (me included) on different branches are working on these files. So I can't really branch from develop and refactor, since merging these refactorings with other people's changes will become difficult.

We could of course require everyone to merge back to develop, "freeze" the files (i.e. don't allow anyone to edit them anymore), refactor, and then "unfreeze". But this is not really good either, since this would require everyone to basically stop their work on these files until refactoring is done.

So is there a way to refactor that doesn't require anyone else to stop working (for too long) or to merge their feature branches back to develop?






























  • stackoverflow.com/questions/1897585/…

    – Robert Andrzejuk
    Mar 28 at 16:47






  • 6





    I think this also depends on the programming language used.

    – Robert Andrzejuk
    Mar 28 at 16:50






  • 8





    I like "small incremental" checkins. Unless someone isn't keeping their copy of the repo fresh, this practice will minimize merge conflicts for everyone.

    – Matt Raffel
    Mar 28 at 20:33






  • 4





    What do your tests look like? If you're going to refactor a big (and probably important!) piece of code, make sure your test suite is in really good condition before you refactor. This will make it a lot easier to make sure you got it right in the smaller files.

    – corsiKa
    Mar 29 at 2:47






  • 1





    I joined a project where the biggest file is 10k lines long, containing among others a class which itself is 6k lines long, and everybody is afraid to touch it. What I mean is that your question is great. We even invented a joke that this single class is a good reason to unlock the scroll wheel on our mice.

    – ElmoVanKielmo
    Mar 29 at 4:37























git refactoring code-quality














edited Mar 29 at 9:38









Glorfindel










asked Mar 28 at 16:12









Hoff
























6 Answers


















37














You have correctly understood that this is not so much a technical as a social problem: if you want to avoid excessive merge conflicts, the team needs to collaborate in a way that avoids these conflicts.



This is part of a larger issue with Git, in that branching is very easy but merging can still take a lot of effort. Development teams tend to launch a lot of branches and are then surprised that merging them is difficult, possibly because they are trying to emulate the Git Flow without understanding its context.



The general rule for fast and easy merges is to prevent big differences from accumulating; in particular, feature branches should be very short-lived (hours or days, not months). A development team that is able to rapidly integrate their changes will see fewer merge conflicts. If some code isn't yet production ready, it might be possible to integrate it but deactivate it through a feature flag. As soon as the code has been integrated into your master branch, it becomes accessible to the kind of refactoring you are trying to do.
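As a rough illustration of the feature-flag idea, here is a minimal Python sketch. Everything in it is assumed for the example: the environment variable naming scheme `FEATURE_<NAME>`, the function `export_report`, and both exporter implementations are hypothetical and not taken from the question.

```python
import os
from typing import List


def feature_enabled(name: str) -> bool:
    """Return True if the (hypothetical) flag FEATURE_<NAME> is set to "1"."""
    return os.environ.get("FEATURE_" + name.upper(), "0") == "1"


def _export_report_v1(rows: List[str]) -> str:
    # Existing behaviour that production currently relies on.
    return "\n".join(rows)


def _export_report_v2(rows: List[str]) -> str:
    # Unfinished replacement: already merged, but dormant unless the flag is on.
    return "\n".join("{}: {}".format(i, row) for i, row in enumerate(rows, start=1))


def export_report(rows: List[str]) -> str:
    """Dispatch to the new code path only when its flag is switched on."""
    if feature_enabled("new_exporter"):
        return _export_report_v2(rows)
    return _export_report_v1(rows)
```

With the flag unset, callers keep getting the old behaviour, so the half-finished code can live on master where others (and your refactoring) can see it.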



Short-lived branches might be too much to ask for your immediate problem. But it may be feasible to ask colleagues to merge their changes that impact this file by the end of the week so that you can perform the refactoring. If they wait longer, they'll have to deal with the merge conflicts themselves. That's not impossible, it's just avoidable work.



You may also want to prevent breaking large swaths of dependent code and only make API-compatible changes. For example, if you want to extract some functionality into a separate module:



  1. Extract the functionality into a separate module.

  2. Change the old functions to forward their calls to the new API.

  3. Over time, port dependent code to the new API.

  4. Finally, you can delete the old functions.

  5. (Repeat for the next bunch of functionality)

This multi-step process can avoid many merge conflicts. In particular, there will only be conflicts if someone else is also changing the functionality you extracted. The cost of this approach is that it's much slower than changing everything at once, and that you temporarily have two duplicate APIs. This isn't so bad until something urgent interrupts this refactoring, the duplication is forgotten or deprioritized, and you end up with a bunch of tech debt.
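Steps 1 and 2 above could look roughly like the following Python sketch. The names `compute_invoice_total` and `calc_total` are made up for illustration, and for brevity both functions sit in one block here; in a real refactoring the first would live in the new module and the second would stay behind in the big file.

```python
import warnings


def compute_invoice_total(prices, tax_rate):
    """New API. In practice this would live in the extracted module."""
    return sum(prices) * (1 + tax_rate)


def calc_total(prices, tax_rate):
    """Old entry point, kept in the big file. It only forwards now."""
    warnings.warn(
        "calc_total is deprecated; use compute_invoice_total instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return compute_invoice_total(prices, tax_rate)
```

Branches that still call `calc_total` keep working after they merge, and the deprecation warning gives a trackable list of callers to port before the old function is deleted in step 4.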



But in the end, any solution will require you to coordinate with your team.


























  • 1





    @Laiv Unfortunately that is all extremely general advice, but some ideas out of the agile-ish space like Continuous Integration clearly have their merits. Teams that work together (and integrate their work frequently) will have an easier time making large cross-cutting changes than teams that only work alongside each other. This isn't necessarily about the SDLC at large, more about the collaboration within the team. Some approaches make working alongside more feasible (think Open/Closed Principle, microservices) but OP's team isn't there yet.

    – amon
    Mar 28 at 17:13






  • 22





    I wouldn't go so far as to say a feature branch needs to have a short lifetime -- merely that it should not diverge from its parent branch for long periods of time. Regularly merging changes from the parent branch into the feature branch works in those cases where the feature branch needs to stick around longer. Still, it's a good idea to keep feature branches around no longer than necessary.

    – Dan Lyons
    Mar 28 at 18:20






  • 1





    @Laiv In my experience, it makes sense to discuss a post-refactoring design with the team beforehand, but it's usually easiest if a single person makes the changes to the code. Otherwise, you're back to the problem that you have to merge stuff. The 4k lines sounds like a lot, but it's really not for targeted refactorings like extract-class. (I'd shill Martin Fowler's Refactoring book so hard here if I had read it.) But 4k lines is a lot only for untargeted refactorings like “let's see how I can improve this”.

    – amon
    Mar 28 at 20:21






  • 1





    @DanLyons In principle you are right: that can spread out some of the merging effort. In practice, Git's merging depends a lot on the latest common ancestor commit of the branches being merged. Merging master→feature does not give us a new common ancestor on master, but merging feature→master does. With repeated master→feature merges, it can happen that we have to resolve the same conflicts again and again (but see git rerere to automate this). Rebasing is strictly superior here because the tip of master becomes the new common ancestor, but history rewriting has other issues.

    – amon
    Mar 29 at 12:30






  • 1





    The answer is OK for me except for the rant about git making it too easy to branch, and thus devs branch too often. I well remember the times of SVN and even CVS when branching was hard (or at least cumbersome) enough that people generally avoided it if possible, with all the related problems. In git, being a distributed system, having many branches is really nothing different than having many separated repositories at all (i.e., on each dev). The solution lies elsewhere, being easy to branch is not the problem. (And yes, I do see that that is just an aside... but still).

    – AnoE
    Mar 29 at 16:33


















28














Do the refactoring in smaller steps. Let's say your large file has the name Foo:



  1. Add a new empty file, Bar, and commit it to "trunk".


  2. Find a small portion of the code in Foo which can be moved over to Bar. Apply the move, update from trunk, build and test the code, and commit to "trunk".


  3. Repeat step 2 until Foo and Bar have equal size (or whatever size you prefer)


That way, next time your teammates update their branches from trunk, they get your changes in "small portions" and can merge them one-by-one, which is a lot easier than having to merge a full split in one step. The same holds when in step 2 you get a merge conflict because someone else updated trunk in between.



This won't eliminate merge conflicts or the need for resolving them manually, but it restricts each conflict to a small area of code, which is way more manageable.



And of course - communicate the refactoring in the team. Inform your mates what you are doing, so they know why they have to expect merge conflicts for the particular file.


























  • 2





    This is especially useful with git's rerere option enabled.

    – D. Ben Knoble
    Mar 29 at 4:09












  • @D.BenKnoble: thanks for that addition. I have to admit, I am not a git expert (but the problem described is not specific to git; it applies to any VCS which allows branching, and my answer should fit most of those systems).

    – Doc Brown
    Mar 29 at 9:24











  • I figured based on the terminology; in fact, with git, this kind of merge is still done only once (if one just pulls and merges). But one can always pull and cherry-pick, or merge individual commits, or rebase depending on the preference of the dev. It takes more time but is certainly doable if automatic merging seems likely to fail.

    – D. Ben Knoble
    Mar 29 at 12:48


















16














You are thinking of splitting the file as an atomic operation, but there are intermediate changes you can make. The file gradually became huge over time; it can gradually become small over time.



Pick a part that hasn't had to change in a long time (git blame can help with this), and split that off first. Get that change merged into everyone's branches, then pick the next easiest part to split. Maybe even splitting one part is too big a step and you should just do some rearranging within the large file first.
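One possible way to find such long-untouched parts is to post-process the output of `git blame --line-porcelain <file>`. The sketch below is only an assumption about how you might do it: the function names and the `cutoff` parameter are invented here, and it relies on the porcelain format in which each source line gets a header of the form `<sha> <orig-line> <final-line> ...`, an `author-time <epoch>` entry, and a tab-prefixed content line.

```python
import string
from typing import List, Tuple


def _is_sha(token: str) -> bool:
    # Commit ids are 40 hex digits (SHA-1) or 64 (SHA-256 repositories).
    return len(token) in (40, 64) and all(c in string.hexdigits for c in token)


def parse_line_times(porcelain: str) -> List[Tuple[int, int]]:
    """Return (final_line_number, author_time) pairs from blame output."""
    result = []
    current_line = None
    for raw in porcelain.split("\n"):
        if raw.startswith("\t"):
            current_line = None  # content line closes the current record
            continue
        parts = raw.split(" ")
        if current_line is None and len(parts) >= 3 and _is_sha(parts[0]):
            current_line = int(parts[2])
        elif parts[0] == "author-time" and current_line is not None:
            result.append((current_line, int(parts[1])))
    return result


def stable_ranges(line_times, cutoff, min_len=1):
    """Return (start, end) line ranges whose lines all predate `cutoff`."""
    ranges = []
    start = prev = None
    # The trailing sentinel flushes a run that reaches the end of the file.
    for line_no, ts in sorted(line_times) + [(None, cutoff)]:
        if ts < cutoff:
            if start is None:
                start = line_no
            prev = line_no
        else:
            if start is not None and prev - start + 1 >= min_len:
                ranges.append((start, prev))
            start = None
    return ranges
```

Feeding it the blame output of the big file and a cutoff of, say, one year ago (as a Unix timestamp) yields the line ranges nobody has touched since then, which are natural first candidates for splitting off.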



If people aren't frequently merging back to develop, you should encourage that, then after they merge, take that opportunity to split off the parts they just changed. Or ask them to do the splitting off as part of the pull request review.



The idea is to slowly move toward your goal. It will feel like progress is slow, but then suddenly you'll realize your code is a lot better. It takes a long time to turn an ocean liner.





























  • The file may have started large. Files that size can be created quickly. I know people who can write 1000's of LoC in a day or week. And OP did not mention automated tests, which indicates to me that they are lacking.

    – ChuckCottrill
    Mar 29 at 22:19


















7














I'm going to suggest a different-than-usual solution to this problem.



Use this as a team code event. Have everyone check-in their code who can, then help others who are still working with the file. Once everyone relevant has their code checked in, find a conference room with a projector and work together to start moving things around and into new files.



You may want to set a specific amount of time for this, so that it doesn't end up being a week's worth of arguments with no end in sight. Instead, this might even be a weekly 1-2 hour event until you all get things looking how they need to be. Maybe you only need 1-2 hours to refactor the file. You likely won't know until you try.



This has the benefit of everyone being on the same page (no pun intended) with the refactoring, but it can also help you avoid mistakes as well as get input from others about possible method groupings to maintain, if necessary.



Doing it this way can be considered to have a built-in code review, if you do that sort of thing. This allows the appropriate number of devs to sign off on your code as soon as you get it checked in and ready for their review. You might still want them to check the code for anything you missed, but it goes a long way toward making sure the review process is shorter.



This may not work in all situations, teams, or companies, as the work isn't always distributed in a way that makes this happen easily. It can also be (incorrectly) construed as a misuse of dev time. This group coding session needs buy-in from the manager, as does the refactor itself.



To help sell this idea to your manager, mention the code review bit as well as everyone knowing where things are from the beginning. Sparing devs the time lost searching a host of new files can be worthwhile. Also, preventing devs from getting POed about where things ended up or went "completely missing" is usually a good thing. (The fewer the meltdowns the better, IMO.)



Once you get one file refactored this way, you may be able to more easily get approval for more refactors, if it was successful and useful.



However you decide to do your refactor, good luck!


























  • This is a fantastic suggestion that captures a really good way to achieve the team coordination that is going to be critical to making it work. Additionally, if some of the branches can't be merged back to master first, you've at least got everybody in the room to help deal with the merges into those branches.

    – Colin Young
    Mar 29 at 15:49











  • +1 for suggesting the code mob

    – Jon Raynor
    Mar 29 at 19:27











  • This exactly addresses the social aspect of the problem.

    – ChuckCottrill
    Mar 29 at 22:20


















2














Fixing this problem requires buy-in from the other teams because you're trying to change a shared resource (the code itself). That being said, I think there's a way to "migrate away" from having huge monolithic files without disrupting people.



I would also recommend not targeting all the huge files at once unless the number of huge files is growing uncontrollably in addition to the sizes of individual files.



Refactoring large files like this frequently causes unexpected problems. The first step is to stop the big files from accumulating additional functionality beyond what's currently in master or in development branches.



I think the best way to do this is with commit hooks that block certain additions to the large files by default, but can be overruled with a magical comment in the commit message, like @bigfileok or something. It's important to be able to overrule the policy in a way that's painless but trackable. Ideally, you should be able to run the commit hook locally and it should tell you how to override this particular error in the error message itself. Also, this is just my preference, but unrecognized magical comments or magical comments suppressing errors that didn't actually fire in the commit message should be a commit-time warning or error so you don't inadvertently train people to suppress the hooks regardless of whether they need to or not.



The commit hook could check for new classes or do other static analysis (ad hoc or not). You can also just pick a line or character count that's 10% larger than the file currently is and say that the large file can't grow beyond the new limit. You can also reject individual commits that grow the large file by too many lines or too many characters, or whatever other measure suits you.
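The core decision of such a hook might be sketched in Python as follows. This is only an illustration under assumptions: the function names are invented, the 10% headroom and the `@bigfileok` tag are taken from the description above, and the wiring into git is left out so the logic stays testable on its own.

```python
from typing import Optional

OVERRIDE_TAG = "@bigfileok"  # magic comment suggested in the answer text


def growth_limit(current_lines: int) -> int:
    """Allow the file to grow to at most 110% of its current size."""
    return current_lines * 110 // 100


def check_big_file(path: str, old_lines: int, new_lines: int,
                   limit: int, commit_message: str) -> Optional[str]:
    """Return an error message if the commit should be rejected, else None."""
    if new_lines <= limit or new_lines <= old_lines:
        return None  # within budget, or the file is not growing
    if OVERRIDE_TAG in commit_message:
        return None  # explicit, trackable override
    return (
        "{} would grow to {} lines (limit {}). Move the new code to another "
        "file, or add {} to the commit message to override.".format(
            path, new_lines, limit, OVERRIDE_TAG
        )
    )
```

One way to wire this up, as an assumption rather than a recipe, is a `commit-msg` hook: git passes it the path of the commit message file, and the line counts could be taken from `git show HEAD:<path>` (committed version) and `git show :<path>` (staged version). Note that the error text itself tells the developer how to override, as recommended above.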



Once the large file stops accumulating new functionality, you can refactor things out of it one at a time (and reduce the thresholds enforced by the commit hooks at the same time to prevent it from growing again).



Eventually, the large files will be small enough that the commit hooks can be completely removed.






































    -2














    Wait until hometime. Split the file, commit and merge to master.



    Other people will have to pull the changes into their feature branches in the morning like any other change.
























    • 3





      Still would mean they would have to merge my refactorings with their changes though...

      – Hoff
      Mar 28 at 16:34











    • somewhat related: suggestion about uncluttering file structure

      – Nick Alexeev
      Mar 28 at 16:36












    • they are going to have to merge that big file with one another anyway. merging with your split version might actually reduce the total pain

      – Ewan
      Mar 28 at 16:38






    • 9





      This has the problem of "Surprise, I broke all your stuff." The OP needs to get buy-in and approval before doing this, and doing it at a scheduled time that no one else has the file "in progress" would help.

      – computercarguy
      Mar 29 at 0:08







    • 5





      For the love of cthulhu don't do this. It's about the worst way you can work in a team.

      – Lightness Races in Orbit
      Mar 29 at 14:10


















    6 Answers









    37














    You have correctly understood that this is not so much a technical as a social problem: if you want to avoid excessive merge conflicts, the team needs to collaborate in a way that avoids these conflicts.



    This is part of a larger issue with Git, in that branching is very easy but merging can still take a lot of effort. Development teams tend to launch a lot of branches and are then surprised that merging them is difficult, possibly because they are trying to emulate the Git Flow without understanding its context.



    The general rule for fast and easy merges is to prevent big differences from accumulating; in particular, feature branches should be very short-lived (hours or days, not months). A development team that is able to rapidly integrate their changes will see fewer merge conflicts. If some code isn't yet production-ready, it might be possible to integrate it but deactivate it through a feature flag. As soon as the code has been integrated into your master branch, it becomes accessible to the kind of refactoring you are trying to do.
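To make the feature-flag idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the `FEATURE_NEW_EXPORTER` environment variable, `flag_enabled`, and the two export functions are assumptions, not anything from the original question. The point is only that unfinished code can live on master while staying switched off.

```python
import os


def flag_enabled(name: str) -> bool:
    """Return True only if the (hypothetical) FEATURE_<NAME> env var is '1'."""
    return os.environ.get(f"FEATURE_{name.upper()}", "0") == "1"


def legacy_export(rows):
    # Existing behaviour that production still relies on.
    return ",".join(str(r) for r in rows)


def new_export(rows):
    # New code: already merged, but inactive until the flag is switched on.
    return ";".join(str(r) for r in rows)


def export_report(rows):
    # Single switch point; callers never need to know which path runs.
    if flag_enabled("new_exporter"):
        return new_export(rows)
    return legacy_export(rows)
```

A real project would more likely read flags from configuration than from the environment, but the shape is the same: one switch point, and both code paths already integrated.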



    That might be too much for your immediate problem. But it may be feasible to ask colleagues to merge their changes that impact this file by the end of the week so that you can then perform the refactoring. If they wait longer, they'll have to deal with the merge conflicts themselves. That's not impossible, it's just avoidable work.



    You may also want to prevent breaking large swaths of dependent code and only make API-compatible changes. For example, if you want to extract some functionality into a separate module:



    1. Extract the functionality into a separate module.

    2. Change the old functions to forward their calls to the new API.

    3. Over time, port dependent code to the new API.

    4. Finally, you can delete the old functions.

    5. (Repeat for the next bunch of functionality)

    This multi-step process can avoid many merge conflicts. In particular, there will only be conflicts if someone else is also changing the functionality you extracted. The cost of this approach is that it's much slower than changing everything at once, and that you temporarily have two duplicate APIs. This isn't so bad until something urgent interrupts this refactoring, the duplication is forgotten or deprioritized, and you end up with a bunch of tech debt.
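Steps 1 and 2 of the list above might look like the following Python sketch. The names `compute_total` (new API) and `calc_total` (old function) are made up for illustration, and both are shown in one block for brevity; in practice the new function would live in the extracted module and the forwarding stub would stay in the big file.

```python
import warnings


# Step 1: the functionality in its new home (hypothetical new module).
def compute_total(prices, tax_rate):
    """Sum the prices and apply a tax rate such as 0.2 for 20%."""
    return sum(prices) * (1 + tax_rate)


# Step 2: the old function stays in the big file but only forwards.
def calc_total(prices, tax_rate):
    """Deprecated wrapper, kept so existing callers keep working."""
    warnings.warn(
        "calc_total is deprecated; use compute_total instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return compute_total(prices, tax_rate)
```

Because the old name still exists and behaves the same, branches that call it merge cleanly. The deprecation warning is one way to keep the leftover duplication visible so that step 4 is not forgotten.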



    But in the end, any solution will require you to coordinate with your team.


























    • 1





      @Laiv Unfortunately that is all extremely general advice, but some ideas out of the agile-ish space like Continuous Integration clearly have their merits. Teams that work together (and integrate their work frequently) will have an easier time making large cross-cutting changes than teams that only work alongside each other. This isn't necessarily about the SDLC at large, more about the collaboration within the team. Some approaches make working alongside more feasible (think Open/Closed Principle, microservices) but OP's team isn't there yet.

      – amon
      Mar 28 at 17:13






    • 22





      I wouldn't go so far as to say a feature branch needs to have a short lifetime -- merely that it should not diverge from its parent branch for long periods of time. Regularly merging changes from the parent branch into the feature branch works in those cases where the feature branch needs to stick around longer. Still, it's a good idea to keep feature branches around no longer than necessary.

      – Dan Lyons
      Mar 28 at 18:20






    • 1





      @Laiv In my experience, it makes sense to discuss a post-refactoring design with the team beforehand, but it's usually easiest if a single person makes the changes to the code. Otherwise, you're back to the problem that you have to merge stuff. 4k lines sounds like a lot, but it's really not for targeted refactorings like extract-class. (I'd shill Martin Fowler's Refactoring book so hard here if I had read it.) 4k lines is only a lot for untargeted refactorings like “let's see how I can improve this”.

      – amon
      Mar 28 at 20:21






    • 1





      @DanLyons In principle you are right: that can spread out some of the merging effort. In practice, Git's merging depends a lot on the latest common ancestor commit of the branches being merged. Merging master→feature does not give us a new common ancestor on master, but merging feature→master does. With repeated master→feature merges, it can happen that we have to resolve the same conflicts again and again (but see git rerere to automate this). Rebasing is strictly superior here because the tip of master becomes the new common ancestor, but history rewriting has other issues.

      – amon
      Mar 29 at 12:30






    • 1





      The answer is OK for me except for the rant about Git making it too easy to branch, and devs thus branching too often. I well remember the times of SVN and even CVS, when branching was hard (or at least cumbersome) enough that people generally avoided it if possible, with all the related problems. In Git, being a distributed system, having many branches is really no different from having many separate repositories (i.e., one per dev). The solution lies elsewhere; being easy to branch is not the problem. (And yes, I do see that that is just an aside... but still.)

      – AnoE
      Mar 29 at 16:33















    edited Mar 29 at 12:20

























    answered Mar 28 at 16:48









    amon





















    28














    Do the refactoring in smaller steps. Let's say your large file has the name Foo:



    1. Add a new empty file, Bar, and commit it to "trunk".


    2. Find a small portion of the code in Foo which can be moved over to Bar. Apply the move, update from trunk, build and test the code, and commit to "trunk".


    3. Repeat step 2 until Foo and Bar have equal size (or whatever size you prefer)


    That way, next time your teammates update their branches from trunk, they get your changes in "small portions" and can merge them one-by-one, which is a lot easier than having to merge a full split in one step. The same holds when in step 2 you get a merge conflict because someone else updated trunk in between.



    This won't eliminate merge conflicts or the need for resolving them manually, but it restricts each conflict to a small area of code, which is way more manageable.
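As a rough illustration of what one iteration of step 2 could look like if Foo and Bar happened to be Python modules, the sketch below moves a single top-level function from one source text to the other. The helper `move_function` and the module name `bar` are hypothetical. It assumes Python 3.8+ (for `end_lineno`) and functions without decorators, and it leaves Foo importing the moved name so existing callers keep working.

```python
import ast
from typing import Tuple


def move_function(foo_source: str, bar_source: str, name: str) -> Tuple[str, str]:
    """Move one top-level function from Foo's source to Bar's source.

    Returns the new (foo_source, bar_source) pair.
    """
    tree = ast.parse(foo_source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            break
    else:
        raise ValueError(f"no top-level function named {name!r}")

    lines = foo_source.splitlines(keepends=True)
    start = node.lineno - 1  # ast line numbers are 1-based
    end = node.end_lineno    # inclusive line number, so usable as slice end
    moved = "".join(lines[start:end])
    remaining = "".join(lines[:start] + lines[end:])

    # Re-export the moved name from Foo so callers of foo.<name> still work.
    new_foo = f"from bar import {name}\n" + remaining
    separator = "\n\n" if bar_source.strip() else ""
    new_bar = bar_source + separator + moved
    return new_foo, new_bar
```

Each call produces a diff touching only one function in Foo plus an append to Bar, which is the kind of small, local change the steps above aim for. You would still build, test, and commit after every move.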



    And of course, communicate the refactoring to the team. Tell your teammates what you are doing, so they know why they should expect merge conflicts in that particular file.


























    • 2





      This is especially useful with Git's rerere option enabled.

      – D. Ben Knoble
      Mar 29 at 4:09












    • @D.BenKnoble: thanks for that addition. I have to admit, I am not a Git expert (but the problem described is not specific to Git: it applies to any VCS which allows branching, and my answer should fit most of those systems).

      – Doc Brown
      Mar 29 at 9:24











    • I figured based on the terminology; in fact, with git, this kind of merge is still done only once (if one just pulls and merges). But one can always pull and cherry-pick, or merge individual commits, or rebase depending on the preference of the dev. It takes more time but is certainly doable if automatic merging seems likely to fail.

      – D. Ben Knoble
      Mar 29 at 12:48


















    This won't eliminate merge conflicts or the need for resolving them manually, but it restricts each conflict to a small area of code, which is way more manageable.



    And of course - communicate the refactoring in the team. Inform your mates what you are doing, so they know why they have to expect merge conflicts for the particular file.






    share|improve this answer















    Do the refactoring in smaller steps. Let's say your large file has the name Foo:



    1. Add a new empty file, Bar, and commit it to "trunk".


    2. Find a small portion of the code in Foo which can be moved over to Bar. Apply the move, update from trunk, build and test the code, and commit to "trunk".


    3. Repeat step 2 until Foo and Bar have equal size (or whatever size you prefer)


    That way, next time your teammates update their branches from trunk, they get your changes in "small portions" and can merge them one-by-one, which is a lot easier than having to merge a full split in one step. The same holds when in step 2 you get a merge conflict because someone else updated trunk in between.



    This won't eliminate merge conflicts or the need for resolving them manually, but it restricts each conflict to a small area of code, which is way more manageable.



    And of course - communicate the refactoring in the team. Inform your mates what you are doing, so they know why they have to expect merge conflicts for the particular file.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Mar 30 at 5:44 by Peter Mortensen

    answered Mar 28 at 18:08 by Doc Brown







    • 2





      This is especially useful with Git's rerere option enabled

      – D. Ben Knoble
      Mar 29 at 4:09












    • @D.BenKnoble: thanks for that addition. I have to admit, I am not a git expert (but the problem described is not specific to git; it applies to any VCS which allows branching, and my answer should fit most of those systems).

      – Doc Brown
      Mar 29 at 9:24











    • I figured based on the terminology; in fact, with git, this kind of merge is still done only once (if one just pulls and merges). But one can always pull and cherry-pick, or merge individual commits, or rebase depending on the preference of the dev. It takes more time but is certainly doable if automatic merging seems likely to fail.

      – D. Ben Knoble
      Mar 29 at 12:48























    16














    You are thinking of splitting the file as an atomic operation, but there are intermediate changes you can make. The file gradually became huge over time; it can gradually become small over time, too.



    Pick a part that hasn't had to change in a long time (git blame can help with this), and split that off first. Get that change merged into everyone's branches, then pick the next easiest part to split. Maybe even splitting one part is too big a step and you should just do some rearranging within the large file first.
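
    As a rough illustration of how git blame can help here, the sketch below reads the output of `git blame --line-porcelain <file>` and lists line ranges whose lines were all last changed before a cutoff date. The function names `line_ages` and `stale_ranges` are made up for this example, and the cutoff is something you would choose yourself. Keep in mind that blame only shows when each surviving line was last touched, so treat the result as a hint, not a verdict.

```python
import re
from typing import List, Tuple

# First line of each entry in `git blame --line-porcelain` output:
# <commit hash> <original line> <final line> [<number of lines in group>]
HEADER = re.compile(r"^[0-9a-f]{40,64} \d+ (\d+)(?: \d+)?$")


def line_ages(porcelain: str) -> List[Tuple[int, int]]:
    """Return (final_line_number, author_time) pairs from the blame output."""
    result = []
    current_line = None
    for raw in porcelain.splitlines():
        if raw.startswith("\t"):
            continue  # the source line itself, prefixed with a tab
        match = HEADER.match(raw)
        if match:
            current_line = int(match.group(1))
        elif raw.startswith("author-time ") and current_line is not None:
            result.append((current_line, int(raw.split()[1])))
    return result


def stale_ranges(ages: List[Tuple[int, int]], cutoff: int,
                 min_length: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive lines last changed before `cutoff` (Unix time)
    into inclusive (first_line, last_line) ranges."""
    ranges = []
    start = None
    last = None
    for line_no, when in sorted(ages):
        if when < cutoff:
            if start is None:
                start = line_no
            last = line_no
        elif start is not None:
            ranges.append((start, last))
            start = None
    if start is not None:
        ranges.append((start, last))
    return [r for r in ranges if r[1] - r[0] + 1 >= min_length]
```

    The blame text itself could be obtained with, for example, `subprocess.run(["git", "blame", "--line-porcelain", "--", path], capture_output=True, text=True, check=True).stdout`. Long stale ranges are candidates for the first split, since nobody's open branch is likely to touch them.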



    If people aren't frequently merging back to develop, you should encourage that. Then, after they merge, take the opportunity to split off the parts they just changed. Or ask them to do the splitting off as part of the pull request review.



    The idea is to slowly move toward your goal. It will feel like progress is slow, but then suddenly you'll realize your code is a lot better. It takes a long time to turn an ocean liner.






    answered Mar 28 at 18:09 by Karl Bielefeldt












    • The file may have started large. Files that size can be created quickly. I know people who can write thousands of LoC in a day or a week. And the OP did not mention automated tests, which suggests to me that they are lacking.

      – ChuckCottrill
      Mar 29 at 22:19




























    7














    I'm going to suggest an unconventional solution to this problem.



    Use this as a team code event. Have everyone who can check in their code do so, then help the others who are still working with the file. Once everyone relevant has their code checked in, find a conference room with a projector and work together to start moving things around and into new files.



    You may want to set a time limit for this, so that it doesn't end up being a week's worth of arguments with no end in sight. It might even be a weekly 1-2 hour event until you all get things looking the way they need to. Or maybe you only need 1-2 hours to refactor the file. You likely won't know until you try.



    This has the benefit of everyone being on the same page (no pun intended) with the refactoring, and it can also help you avoid mistakes and get input from others about method groupings worth preserving.



    Doing it this way can be considered a built-in code review, if you do that sort of thing. The appropriate number of devs can sign off on your code as soon as you get it checked in and ready for their review. You might still want them to check the code for anything you missed, but it goes a long way toward making the review process shorter.



    This may not work in all situations, teams, or companies, as the work isn't always distributed in a way that makes this easy. It can also be (incorrectly) construed as a misuse of dev time. The group coding session needs buy-in from the manager, just as the refactor itself does.



    To help sell this idea to your manager, mention the code review bit as well as everyone knowing where things are from the beginning. Saving devs from losing time searching through a host of new files can be worthwhile. Also, keeping devs from getting POed about where things ended up, or about things going "completely missing", is usually a good thing. (The fewer the meltdowns the better, IMO.)



    Once you get one file refactored this way, you may be able to more easily get approval for more refactors, if it was successful and useful.



    However you decide to do your refactor, good luck!






    answered Mar 29 at 0:37 by computercarguy
















    • This is a fantastic suggestion that captures a really good way to achieve the team coordination that is going to be critical to making it work. Additionally, if some of the branches can't be merged back to master first, you've at least got everybody in the room to help deal with the merges into those branches.

      – Colin Young
      Mar 29 at 15:49











    • +1 for suggesting the code mob

      – Jon Raynor
      Mar 29 at 19:27











    • This exactly addresses the social aspect of the problem.

      – ChuckCottrill
      Mar 29 at 22:20




























    2














    Fixing this problem requires buy-in from the other teams because you're trying to change a shared resource (the code itself). That being said, I think there's a way to "migrate away" from having huge monolithic files without disrupting people.



    I would also recommend not targeting all the huge files at once unless the number of huge files is growing uncontrollably in addition to the sizes of individual files.



    Refactoring large files like this frequently causes unexpected problems. The first step is to stop the big files from accumulating additional functionality beyond what's currently in master or in development branches.



    I think the best way to do this is with commit hooks that block certain additions to the large files by default, but can be overridden with a magic comment in the commit message, like @bigfileok or something. It's important to be able to override the policy in a way that's painless but trackable. Ideally, you should be able to run the commit hook locally, and its error message should tell you how to override that particular error. Also, this is just my preference, but unrecognized magic comments, or magic comments that suppress errors that didn't actually fire, should produce a commit-time warning or error so you don't inadvertently train people to suppress the hooks regardless of whether they need to or not.



    The commit hook could check for new classes or do other static analysis (ad hoc or not). You can also just pick a line or character count that's 10% larger than the file's current size and say that the large file can't grow beyond that limit. You can also reject individual commits that grow the large file by too many lines or characters.
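
    The line-count variant of this check could look roughly like the sketch below. It covers only the decision logic; `check_big_file`, `limit_for` and `Verdict` are hypothetical names for this example, and how you obtain the staged line count and the commit message depends on your VCS and hook setup. The override token is the example one from above.

```python
from dataclasses import dataclass
from typing import Optional

OVERRIDE_TOKEN = "@bigfileok"  # example token; pick whatever suits your team


@dataclass
class Verdict:
    allowed: bool
    message: Optional[str] = None  # error or warning text, if any


def limit_for(current_line_count: int, headroom_percent: int = 10) -> int:
    """A size limit slightly above the file's current size (10% by default)."""
    return current_line_count + current_line_count * headroom_percent // 100


def check_big_file(path: str, new_line_count: int, limit: int,
                   commit_message: str) -> Verdict:
    """Decide whether a commit touching a known big file may go through."""
    overridden = OVERRIDE_TOKEN in commit_message
    too_big = new_line_count > limit

    if too_big and not overridden:
        # The error itself explains how to override, so nobody has to ask.
        return Verdict(False, (
            f"{path} has {new_line_count} lines (limit {limit}). "
            f"Move the new code elsewhere, or add {OVERRIDE_TOKEN} "
            f"to the commit message to override."))
    if overridden and not too_big:
        # Discourage habitual use of the override when nothing fired.
        return Verdict(True, (
            f"warning: {OVERRIDE_TOKEN} was given, "
            f"but no size check failed for {path}."))
    return Verdict(True)
```

    With Git, one place to run such a check is the `commit-msg` hook, which receives the path of the commit message file as its first argument and rejects the commit when it exits with a non-zero status. Because the override lives in the commit message, every use of it stays visible in the history.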



    Once the large file stops accumulating new functionality, you can refactor things out of it one at a time (and reduce the thresholds enforced by the commit hooks at the same time to prevent it from growing again).



    Eventually, the large files will be small enough that the commit hooks can be completely removed.






    share|improve this answer





























      2














      Fixing this problem requires buy-in from the other teams because you're trying to change a shared resource (the code itself). That being said, I think there's a way to "migrate away" from having huge monolithic files without disrupting people.



      I would also recommend not targeting all the huge files at once unless the number of huge files is growing uncontrollably in addition to the sizes of individual files.



      Refactoring large files like this frequently causes unexpected problems. The first step is to stop the big files from accumulating additional functionality beyond what's currently in master or in development branches.



      I think the best way to do this is with commit hooks that block certain additions to the large files by default, but can be overruled with a magical comment in the commit message, like @bigfileok or something. It's important to be able to overrule the policy in a way that's painless but trackable. Ideally, you should be able to run the commit hook locally and it should tell you how to override this particular error in the error message itself. Also, this is just my preference, but unrecognized magical comments or magical comments suppressing errors that didn't actually fire in the commit message should be a commit-time warning or error so you don't inadvertently train people to suppress the hooks regardless of whether they need to or not.



      The commit hook could check for new classes or do other static analysis (ad hoc or not). You can also just pick a line or character count that's 10% larger than the file currently is and say that the large file can't grow beyond the new limit. You can also reject individual commits that grow the large file by too many lines or too many characters or w/e.



      Once the large file stops accumulating new functionality, you can refactor things out of it one at a time (and reduce the tresholds enforced by the commit hooks at the same time to prevent it from growing again).



      Eventually, the large files will be small enough that the commit hooks can be completely removed.






      share|improve this answer



























        2












        2








        2







        Fixing this problem requires buy-in from the other teams because you're trying to change a shared resource (the code itself). That being said, I think there's a way to "migrate away" from having huge monolithic files without disrupting people.



        I would also recommend not targeting all the huge files at once unless the number of huge files is growing uncontrollably in addition to the sizes of individual files.



        Refactoring large files like this frequently causes unexpected problems. The first step is to stop the big files from accumulating additional functionality beyond what's currently in master or in development branches.



        I think the best way to do this is with commit hooks that block certain additions to the large files by default, but that can be overridden with a magic comment in the commit message, like @bigfileok or something. It's important to be able to override the policy in a way that's painless but trackable. Ideally, you should be able to run the commit hook locally, and the error message itself should tell you how to override that particular error. Also, this is just my preference, but an unrecognized magic comment, or a magic comment that suppresses an error that didn't actually fire, should be a commit-time warning or error, so you don't inadvertently train people to suppress the hooks whether or not they need to.
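As a rough illustration of the override logic only (not a complete hook), here is a minimal Python sketch. The function name `check_commit`, the token pattern, and the message wording are assumptions made for this example; only the `@bigfileok` token comes from the text above. How the violations are gathered and how the function is wired into an actual Git hook is left out.

```python
import re

# "@bigfileok" is the token suggested above; the set could hold more tokens.
KNOWN_OVERRIDES = {"@bigfileok"}
# Assumption: override tokens are standalone words shaped like "@<letters>ok".
TOKEN_RE = re.compile(r"(?<!\S)@[a-z]+ok\b")


def check_commit(message, violations):
    """Return (allowed, notes) for a commit message and a list of rule violations."""
    tokens = set(TOKEN_RE.findall(message))
    notes = []

    # Unrecognized magic comments are reported rather than silently ignored.
    for tok in sorted(tokens - KNOWN_OVERRIDES):
        notes.append(f"warning: unrecognized override {tok}")

    overridden = "@bigfileok" in tokens
    if violations and not overridden:
        # Each error says how to override it, so nobody has to go looking.
        for v in violations:
            notes.append(f"error: {v} (add @bigfileok to the commit message to override)")
        return False, notes

    if overridden and not violations:
        # An override that suppresses nothing is flagged as well.
        notes.append("warning: @bigfileok given but no rule fired")
    return True, notes
```

Whether an unused or unrecognized override should only warn (as here) or reject the commit outright is a policy choice; the sketch picks the warning.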



        The commit hook could check for new classes or do other static analysis (ad hoc or not). You can also just pick a line or character count that's 10% larger than the file currently is and say that the large file can't grow beyond that limit. You can also reject individual commits that grow the large file by too many lines or characters.
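The two size-based checks could look something like the following Python sketch. The function name `growth_violations` and the per-commit limit of 50 lines are hypothetical; the 10% headroom is the figure mentioned above. Line counts are passed in as plain integers, so the sketch does not depend on how they are obtained.

```python
def growth_violations(baseline_lines, old_lines, new_lines,
                      headroom_percent=10, max_growth=50):
    """List the size rules a commit breaks for one large file.

    baseline_lines: file size when the policy was introduced
    old_lines / new_lines: file size before and after this commit
    max_growth: hypothetical per-commit limit, chosen for illustration
    """
    # Absolute cap: the baseline plus a fixed percentage of headroom.
    cap = baseline_lines + baseline_lines * headroom_percent // 100
    violations = []
    if new_lines > cap:
        violations.append(f"file has {new_lines} lines, cap is {cap}")

    # Per-commit limit: reject a single commit that adds too much at once.
    growth = new_lines - old_lines
    if growth > max_growth:
        violations.append(f"commit adds {growth} lines, limit is {max_growth}")
    return violations
```

A commit that shrinks the file has negative growth and never trips the per-commit rule, though it can still be over the absolute cap if the file was already too big.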



        Once the large file stops accumulating new functionality, you can refactor things out of it one at a time (and lower the thresholds enforced by the commit hooks at the same time to prevent it from growing again).
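One way to lower the threshold as the file shrinks is to recompute it after each refactoring and never let it go back up. This is only a sketch of that idea; the name `ratchet_cap` and the 10% headroom default are assumptions, not something the hook has to use.

```python
def ratchet_cap(current_cap, current_lines, headroom_percent=10):
    """Return the new line cap after a refactoring; the cap only ever goes down."""
    proposed = current_lines + current_lines * headroom_percent // 100
    # min() keeps the old cap if the file has not actually shrunk enough.
    return min(current_cap, proposed)
```

For example, a file that drops from 1000 to 800 lines would see its cap go from 1100 to 880, and the cap would stay at 880 even if the file later crept back up to 900.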



        Eventually, the large files will be small enough that the commit hooks can be completely removed.














        edited Mar 29 at 17:28

























        answered Mar 28 at 20:24









        Gregory Nisbet






















            -2














            Wait until hometime. Split the file, commit and merge to master.



            Other people will have to pull the changes into their feature branches in the morning like any other change.
























            • 3





              Still would mean they would have to merge my refactorings with their changes though...

              – Hoff
              Mar 28 at 16:34











            • somewhat related: suggestion about uncluttering file structure

              – Nick Alexeev
              Mar 28 at 16:36












            • they are going to have to merge that big file with one another anyway. merging with your split version might actually reduce the total pain

              – Ewan
              Mar 28 at 16:38






            • 9





              This has the problem of "Surprise, I broke all your stuff." The OP needs to get buy-in and approval before doing this, and doing it at a scheduled time that no one else has the file "in progress" would help.

              – computercarguy
              Mar 29 at 0:08







            • 5





              For the love of cthulhu don't do this. It's about the worst way you can work in a team.

              – Lightness Races in Orbit
              Mar 29 at 14:10
















            answered Mar 28 at 16:16









            Ewan


















