Request for Comments-40: Move to GitHub



  • Author: Victor Poughon
  • Submitted on 21.08.2017
  • Open for comments


What changes will be made and why they would make a better Orfeo ToolBox?

GitHub is a free git hosting provider and easily the most popular network for software collaboration. Since its launch 9 years ago, it has revolutionized the open-source ecosystem, and now hosts millions of repositories including many OSGeo projects (QGIS, GDAL, GeoServer, MapServer, OpenLayers).

ITK, our main dependency, has recently announced that they are in the process of migrating to GitHub. After discussion on their mailing list, the proposal received overwhelming support, and the entire thread is worth a read. It highlights the many advantages of migrating. I think Orfeo ToolBox could benefit from the same improvements to infrastructure and workflow for:

  • State of the art technical communication.
    This is somewhat lacking in our current mailing-list-only discussion, where long technical discussions about RFCs are discouraged because of the constraints of email. While this is not an inherent limitation of email (as massive projects like the Linux kernel demonstrate), not all OTB contributors have well-configured text-only email clients (myself included). In practice I feel that most RFCs receive less feedback and less code review than they would with better tools, and that technical discussion in general has a low signal-to-noise ratio. For example, we sometimes purposefully hold back ready-to-merge RFCs to reduce mailing list traffic. Other times PSC members forget to vote or cannot tell which RFCs are open or closed. Additionally, moving RFCs to GitHub would leave otb-developers for non-RFC discussion.
  • State of the art code review.
    Currently in OTB, code review is done by email. This has several downsides: there is no visual distinction between open and closed RFCs, code is copy-pasted, links are poorly supported, code is not highlighted and sometimes loses its original formatting, context is unavailable, discussion archives are difficult to browse, continuous integration is not automatic (feature_branches.txt in a different repo), and there is no automatic link between bugs and bugfixes. GitHub addresses all of these points.
  • Easier external contributions.
    Officially we already accept GitHub PRs. However, in practice we receive very few contributions through GitHub, and they often go stale because nobody looks at them. I think that migrating to the platform would simplify the workflow for non-OTB specialists and promote community contributions and feedback. GitHub is the platform that most developers are familiar with, and this is usually a major reason for migrating for us late adopters (as is the case for ITK).
  • Proximity to the OSGeo and open-source community.
    As mentioned in the introduction, GitHub is the number one platform for the open-source community. GitHub is also a social network, and centralization is a major advantage for collaboration and visibility. Besides the technical points above, migrating to GitHub would bring us closer to the open-source community and to our software dependencies and ecosystem.
  • Better use of resources.
    Time and money are limited, and using GitHub's free hosting would free up part of the effort we currently spend on infrastructure for something else (features, bugfixes, better docs, ...).
  • Closed-source platform.

While GitHub offers unlimited free hosting for open-source projects, it is still a proprietary platform. This is highly regrettable, and a bit of a paradox in the free software movement. However, it has no practical implication for everyday work, and I don't expect the OTB project to ever need to modify GitHub's source code. Future evolutions of the GitHub platform are completely outside the control of the OTB project, and this must be kept in mind both when discussing this proposal and after the migration. If the platform ever becomes unsuitable we must immediately migrate away, as happened with the events around SourceForge in 2015. There is no reason to believe that GitHub will last forever. I think the fact that it's the best available tool outweighs this (nonetheless important) consideration. Using closed-source platforms for practical reasons is something we already do for Windows and macOS support.

  • Managed hosting.

This is similar to the previous point but refers to the fact that hosting is managed by a commercial company under California law, outside of the OTB project's control. I think this is not a problem because of the decentralized nature of git, where the entire history of the code is stored in each clone. If we are very paranoid we could make backups of the extra data not stored in the repo (most importantly, issues). We can also do it with GitLab. For more information there is the GitHub terms of service.

I don't mean to downplay the risks of migrating to GitHub. It is not a perfect solution but a trade-off that must be carefully considered. If we have the resources and we decide that it's necessary we can take steps to mitigate those risks with backups. However I think that the risk is well understood and that the potential benefits for a better OTB project are worth it.
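As an illustration of what such a backup could look like, here is a minimal sketch in Python that walks the paginated issues endpoint of the GitHub REST API and collects the results. The function names and the `fetch_page` parameter are hypothetical, chosen for this example only; the owner and repository arguments are placeholders, and a real backup would also need authentication, rate-limit handling and a separate pass for issue comments.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, page, per_page=100):
    """Build the URL of one page of issues (open and closed)."""
    return (f"{API_ROOT}/repos/{owner}/{repo}/issues"
            f"?state=all&per_page={per_page}&page={page}")


def _http_get_json(url):
    """Fetch a URL and decode the JSON body (unauthenticated request)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_issues(owner, repo, fetch_page=None):
    """Collect issues page by page until an empty page is returned.

    `fetch_page` is a hypothetical hook (URL -> list of dicts) so the
    pagination logic can be exercised without network access.
    """
    if fetch_page is None:
        fetch_page = _http_get_json
    issues = []
    page = 1
    while True:
        batch = fetch_page(issues_url(owner, repo, page))
        if not batch:
            return issues
        issues.extend(batch)
        page += 1
```

Note that this endpoint returns pull requests alongside issues, so the dump would cover both. The resulting list could be written to disk with `json.dump` and stored next to a mirror clone of the repository.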

Migration plan

Here is a possible migration plan, should we decide to migrate:

  • Shut down the mirroring script
  • For developers with push access:
   git remote set-url origin
  • Update RFC procedure to use a PR instead of a wiki page. We can use PR templates to replace the current RFC template.
  • Update our gitflow to the more standard and simpler GitHub workflow. I think in our case core developers could push feature branches to the main repo directly and then make a PR to develop. Occasional contributors would use the typical fork, then PR model.
  • Update dashboard scripts to build GitHub PR branches instead of features_branches.txt
  • Freeze and manually migrate still relevant bugs to GitHub issues
  • Archive
  • Archive
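For the dashboard step, one possible approach (a sketch, not a description of the existing dashboard scripts) is to discover PR branches through the `refs/pull/<number>/head` references that GitHub exposes on every repository. The helper names below are hypothetical, and the remote name `origin` is an assumption.

```python
import re
import subprocess

# GitHub publishes the tip of each pull request as refs/pull/<number>/head.
PR_HEAD_REF = re.compile(r"^refs/pull/(\d+)/head$")


def parse_pr_heads(ls_remote_output):
    """Map PR number -> commit SHA from the text printed by `git ls-remote`."""
    heads = {}
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        sha, ref = parts
        match = PR_HEAD_REF.match(ref)
        if match:
            heads[int(match.group(1))] = sha
    return heads


def list_pr_heads(remote="origin"):
    """Ask the remote for its pull request head refs and parse them."""
    output = subprocess.run(
        ["git", "ls-remote", remote, "refs/pull/*/head"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_pr_heads(output)
```

A dashboard script could then check out each returned SHA and build it. The remote may also advertise refs for PRs that are already closed, so a real script would probably need to filter on PR state (for example through the GitHub API) before scheduling builds.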

When will those changes be available (target release or date)?

Who will be developing the proposed changes?




Corresponding Requests for Changes