Rob Williams is a probabilistic Lean coder of Java and Objective-C. Rob is a DZone MVB and is not an employee of DZone and has posted 170 posts at DZone. You can read more from them at their website.

First Experience With SVN Merge Tracking


Source control has really crawled forward at a slow pace. I am going to be a total contrarian (read: jerk) and say that the arguments I made about reactionary kerfuffles being the norm in open source go double where SCM is concerned. Consider: I started with SourceSafe. Going from that to CVS was like getting out of jail. Seriously, like crazy huge improvement. I would put that at a couple orders of magnitude improvement. When Eclipse first hit, its support for CVS was outstanding. I remember going to EclipseCon in 2004 and talking to the two guys who did the CVS support; the main thing I was looking for then was an easier way to do merges. Of course, like most of the rest of the world, I ended up dumping CVS not long thereafter for Subversion. Then some years passed, and the problems svn solved, mainly making it easy to move things, were bought at the cost of horrendously bad plugins that only stabilized a few years on (I use Subclipse now, which is pretty good). Then svn started talking about lightweight merging, which is what I had been after all those years ago.

The idea is kind of simple: people don't branch because they think they are going to end up with 2 messes instead of 1. Wouldn't it be nice if you could make a branch and keep it in sync while it was out? That's the dream. The reality, however, is that merge tracking doesn't really give you that. Although it could.

If you want to use merge tracking, you really have to get everyone on the team familiar with it before you go off and make your branches. We have been trying to release from branches, based on the logic that if we release from a branch, we can avoid the last-minute hell of people trying to do massive merges right before release. Instead, we can test a merge and then, if it doesn't go, release off the branch and have all the way until the next release to stabilize whatever didn't make it from trunk.

Recently, though, I decided to convert my iPhone app to a universal app and of course thought: let me do this on a branch. Sure enough, one of the other people on the team had some issue using my new branch and went back to work on the old one. Next thing we know, we are a few weeks offshore, knowing a really sucky merge lies between us and solid ground. Last night I tried to do it using svn at the command line and it was really not pretty.

In a lot of ways, you can't blame svn for not offering much to people who have not toed its line. The idea behind merge tracking is that once you have 2 branches, when someone makes a change, that change should go onto both branches. We probably would have had zero problems if each dev had followed this. But, wait, it's freaking 2010. Are we still saying 'sure, you get expected results, if you volunteer for all our behavior constraints'? [I call this the mickeysoft plan.]
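To make the bookkeeping concrete, here is a minimal Python sketch of what merge tracking has to remember: which revisions from the other line of development have already been merged, so that only the rest are still pending. This is a toy model, not Subversion's implementation; the function names and the revision numbers are made up for illustration.

```python
# Toy model of merge-tracking bookkeeping. Not Subversion's real implementation;
# all names and revision numbers below are hypothetical.

def eligible_revisions(source_revs, merged_revs):
    """Return the source revisions not yet recorded as merged, in order."""
    return [r for r in sorted(source_revs) if r not in merged_revs]


def record_merge(mergeinfo, source, revs):
    """Return a new mergeinfo dict with revs recorded as merged from source."""
    updated = {path: set(seen) for path, seen in mergeinfo.items()}
    updated.setdefault(source, set()).update(revs)
    return updated


trunk_revs = [101, 104, 107, 110]      # revisions committed on trunk
mergeinfo = {"/trunk": {101, 104}}     # what the branch has already absorbed

pending = eligible_revisions(trunk_revs, mergeinfo["/trunk"])
print(pending)  # [107, 110]

# After merging the pending revisions, nothing is left to bring over.
mergeinfo = record_merge(mergeinfo, "/trunk", pending)
print(eligible_revisions(trunk_revs, mergeinfo["/trunk"]))  # []
```

The catch is visible even in the toy: the record is only as good as the team's discipline in actually running the merges that update it.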

Let me drift off the reservation and imagine life in an alternate reality, one in which the head of the pantheon would be Bertrand Meyer. Imagine that there was a simple way in the repository to associate the two branches, then when a commit is done, it would be mirrored to the other branch automatically. Then you wouldn't even really need tool support. I would just say 'ok, commit my stuff,' and SVN would return its list of conflicts. Where they are is pretty much immaterial. I perform the merges and go on my way. This is one of the endemic problems of open source: most of it is based on NO model, or a model that represents only a tiny corner of some bubble reality. Procedural coding is not only bad because its weak capacity for abstraction results in it bleeding out under the lightest of loads, it sucks because it screams 'let me get about my business' and the 'my business' in reference here is always some little trick rather than a true accounting for what can and should go on in the environment. The first line from the SVN book about merge tracking is 'the implementation of merge tracking is insanely complex and you have almost no window into it.' Here, let me offer my translation services: it's a pile of crap that deems itself accountable to none (of course, given that we know it is written in C, this warning label was scarcely needed).
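For what it's worth, the mirrored-commit idea can be sketched in a few lines of Python. This is purely a thought experiment: no such feature exists in svn as far as this post is concerned, and the function, the branch names, and the file contents are all hypothetical. A change carries the content it expects to replace; a linked branch that still has that content takes the change, and any branch that has drifted gets reported as a conflict.

```python
# Thought experiment only: a toy "linked branches" commit. Every name is hypothetical.

def mirrored_commit(change, branches):
    """Apply change to every linked branch; return conflicting paths per branch.

    change maps path -> (expected_old_content, new_content).
    branches maps branch name -> {path: content}.
    """
    conflicts = {}
    for name, files in branches.items():
        for path, (old, new) in change.items():
            if files.get(path) == old:
                files[path] = new          # clean apply
            else:
                conflicts.setdefault(name, []).append(path)  # needs a manual merge
    return conflicts


trunk = {"App.m": "v1", "View.m": "v1"}
universal = {"App.m": "v1", "View.m": "v1-ipad"}   # this branch has drifted

change = {"App.m": ("v1", "v2"), "View.m": ("v1", "v2")}
conflicts = mirrored_commit(change, {"trunk": trunk, "universal": universal})
print(conflicts)  # {'universal': ['View.m']}
```

The developer commits once and gets back the list of conflicts, wherever they happen to live.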

I found an interesting post on Stack Overflow where some rube came in to ask what was better about Git and the main answer was 'it's great when you can't attach to your server' (file under: couldn't care less). But I am guessing that Git would be better able to deliver real merge tracking due to the fact that it has modeled reality a little bit better. Distributed models are not just for offline crap. They decrease complexity through a number of mechanisms: 1. they model the actual world more accurately so there is less fitting and trimming required, and 2. they, um, distribute it (the complexity). Does Git make it easier to branch? To recover from bad merges? To roll back offending commits? It would be interesting to see some real metrics on this. I believe that a study of these practices industry-wide would be shocking. I bet that most teams rarely branch, almost never do lightweight branching, and that, even though this was the only major feature from svn in the last 5 years, almost no one is using it (there is pretty much zero tool support).



Published at DZone with permission of Rob Williams, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Jilles Van Gurp replied on Wed, 2010/04/28 - 2:00am

I switched to git svn a few weeks ago (i.e. I use git client side with a plugin to deal with the central subversion repo). I can effortlessly create git branches locally and dealing with upstream svn commits is a non-issue. Just git svn rebase. Most conflicts resolve automatically. You only have to deal with the real ones. This is true even with extensive refactorings (including file and directory renames).

If you want to get that out of jail feeling again, take a serious look at it.

I'm sure svn merge tracking is a nice improvement. But fundamentally, git just has a different and much better way of representing changes. You can merge in git (i.e. create a diff and apply it to your branch) but most of the time you instead rebase. Rebasing branch B on branch A is like taking the changesets (i.e. the sets of individual commits) of both branches and reordering them such that the commits of branch B happen after those of branch A. Basically this is done by rewinding branch B to the point where you branched, applying all of branch A's commits and then applying all of branch B's commits one by one (with the option to modify or skip them in case of conflicts). After rebasing, you can create a conflict-free patch of B's changes to apply to A. Other nice features are cherry-picking commits from branches when rebasing, squashing them into bigger commits, reordering commits, etc.
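As a rough illustration of that rewind-and-replay description, here is a toy Python model in which a branch is just a list of commit labels. It is a sketch under that assumption only: it ignores conflicts and says nothing about how git actually stores history, and the function name and labels are made up.

```python
# Toy model of a rebase: a branch is a list of commit labels.
# Conflicts are ignored and nothing here reflects git's real storage.

def rebase(branch_a, branch_b):
    """Replay the commits unique to branch_b on top of branch_a."""
    # Find the fork point: the longest shared prefix of the two histories.
    fork = 0
    while (fork < min(len(branch_a), len(branch_b))
           and branch_a[fork] == branch_b[fork]):
        fork += 1
    own_commits = branch_b[fork:]
    # Rewind B to the fork, take all of A, then reapply B's commits one by one.
    # Replayed commits are new commits, marked here with a prime.
    return branch_a + [commit + "'" for commit in own_commits]


a = ["c1", "c2", "a1", "a2"]
b = ["c1", "c2", "b1", "b2"]
print(rebase(a, b))  # ['c1', 'c2', 'a1', 'a2', "b1'", "b2'"]
```

After the replay, B's own commits sit at the end of A's history, which is why the resulting patch of B's changes applies to A cleanly.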

Mikael Couzic replied on Wed, 2010/04/28 - 6:24am

Did you try the latest versions of Subclipse with native integration of the CollabNet Merge Client?

From my limited experience, I have come to the conclusion that anyone creating SVN branches without this tool is clearly a masochist. Unless you like it when the conflicts are resolved by arbitrarily overwriting your last commits...

However, last time I branched with SVN it worked great, and I realized this magic came from the merge client. The conflicts are resolved like you'd expect of any decent SCM. I never quite understood why there was such a gap between commit-level and merge-level conflict management, but at least we now have a way to handle that.

As for Git, I love the video of the talk Linus Torvalds gave at Google where he mocks SVN. But I don't believe basic merging needs are enough to migrate from SVN to Git.

BTW, I'm currently contributing to a SourceForge project using a Mercurial repository, and it works very well. You might want to check that out too if you're looking for a distributed SCM.

Michael Parmeley replied on Wed, 2010/04/28 - 12:09pm

No one is using it, you say? I use svn merge tracking at least on a weekly basis. We do our releases from tags created from branches. Changes on the branch get merged back to trunk. Sometimes I go the other way as well if I am working a branch like a feature branch (which I do sometimes if I think a change may really destabilize trunk for a while, or I am just experimenting). I do merge tracking from the command line as well. GUI tools just get in the way; merging is best done from the command line. Then I resolve conflicts (if any) from my IDE.

 Just like developers need to follow coding conventions for a project, they also need to follow source control conventions. If they don't, they can clean up the mess.

 I don't recall ever having much of a problem with subversion merge tracking. Works great.

John J. Franey replied on Wed, 2010/04/28 - 2:19pm

So, I've used svn for about 2 years but using 1.4, prior to facilities that allowed for merge tracking. I am not familiar with the source control conventions that are required to operate merge tracking effectively. Your language sounds like if I don't know them, I will have extra work. Sounds risky.

Sometimes (most times) it can be difficult to get developers on a team to converge on a set of coding conventions. I'd anticipate trouble in getting a team to converge on a set of source code management conventions as well.

I guess I like git because it doesn't depend on all participants following convention.

Rob Williams replied on Fri, 2010/06/11 - 1:08pm

Sorry I didn't see these before. The CollabNet tool sounds like it's worth a look! I have been trying to look for other tools and found only toys. Even trying to really see what is happening through Versions is impossible. I am leaning toward Git. As for my comments about merge tracking, @Michael, you were just making my argument. 100% of the time when I do posts, I try to write about the ideas. The central one here is that we are still producing tools that fit the pattern of 'works great, if everyone does exactly what they are supposed to, having divined that with little help from tools and documentation.' To me, that's one of the reasons that a walled garden (like Apple's) can end up looking really good (though Microsoft's never did). You are also just talking about the usage scenario. I am also interested in the forensic angle: how can we keep samples from getting stomped and polluted? And how can we easily see what has been going on in the code? I don't think any SCM system really has the answer to this. If I were starting from scratch, I would make that the focus of the tool: a super simple set of interfaces/APIs that allow me to not only follow everything that is going on, but act on it, with rules, analytics, etc.

Mikael Couzic replied on Thu, 2010/06/24 - 3:09am

After using both Mercurial and SVN for a few months, I have to revise my previous statement:

"I don't believe basic merging needs are enough to migrate from SVN to Git."

I now believe that even if you don't branch, migrating to a distributed SCM is always a good choice. Having a local repository gives you a lot more speed and power, especially in repository browsing. It's the future, just jump in!
