This blog post is inspired by the conversation about adding background repository migration to Gogs.
Dave Cheney covered pretty much all of the “why and when” in his Talk, then code. I recommend reading it if you ever want to make a meaningful contribution to an open source project.
The more I work in the software engineering field, the more I appreciate the fact that most of my time is, and should be, spent on articulating and communicating my ideas, both to other people and to my future self. Always communicating up front might seem to add a bit of overhead (more writing, more talking), but think of it as a sort of “insurance”: for the little money you pay, you “guarantee” (quotes because of reasons) a large payback in case of accidents.
Time on both ends (the project maintainer and the contributor) is valuable, and no one likes seeing hours of work turned down simply because of miscommunication or a lack of communication.
Then why can’t I just say “I’m going to implement this”? Why and when do I need a proposal?
Unlike bug fixes, which are usually small and come with clearly defined outcomes, features are often complicated, and there is more than one way to design and implement them. New features usually mean writing a bunch of new code, and sometimes changing the code architecture. The project maintainer is supposed to be the person with the best knowledge of how the entire project works, and knows much more of the bloody truth than outside contributors. A proposal helps the project maintainer evaluate how well the new code will fit into the existing codebase, and helps the contributor articulate, communicate and examine the idea and the implementation better.
There is another important thing, and let’s be honest: not all features are welcome. So before you even make a proposal, be sure to ask whether the feature is wanted at all.
Communication plays a crucial role in collaboration (this line is not needed at all, but I think it’s cool to have it here).
Many organizations and companies adopt the idea of an RFC, but the actual definition of these “RFCs” differs in each of them. For an average open source project, I prefer to think of a “proposal” as a lightweight version of an RFC, but of course calling it an RFC is not prohibited, and the name is not important here.
At the bare minimum, a proposal should include three parts:
- Background, what is the historical context?
- Problems, what is wrong?
- Solution, how are the problems addressed? What are the technical details? What trade-offs are made?
I know it feels very abstract, so I’m mocking up an oversimplified example below.
When a user wants to migrate a Git repository from another code host to Gogs, the user can enter an HTTPS URL of the repository and submit the migration request. The repository migration then starts in the foreground, and the page does not finish loading until the migration is done. The user is then redirected to the home page of the migrated repository.
- There is no visibility into the progress of the repository migration, and the request often times out if the repository is large and takes a long time to finish.
- Visiting the home page of the migrating repository gives users 500 errors.
- If there is network jitter or a server restart, the migration fails silently and leaves the repository in a malformed state on disk. Manual intervention is required to remove the repository and start over.
- Create a new database table acting as a queue of repository migration tasks.
- When users visit the home page of the migrating repository, a status message is shown depending on whether the migration is “queued”, “cloning” or “failed” (with error).
- On restarts, Gogs needs to pick up unfinished migration tasks and continue.
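To make the solution part of the mocked proposal a bit more concrete, here is a minimal Go sketch of the task queue. It is not how Gogs implements anything: the `Task`, `Queue`, `SetStatus`, `Unfinished` and `StatusMessage` names are made up for illustration, an in-memory map stands in for the database table, and the extra “done” status is my own assumption since a finished migration needs to end up somewhere.

```go
package main

import "fmt"

// Status is the lifecycle state of a migration task.
type Status string

const (
	StatusQueued  Status = "queued"
	StatusCloning Status = "cloning"
	StatusFailed  Status = "failed"
	StatusDone    Status = "done" // assumption: not part of the mocked proposal
)

// Task is one row of the hypothetical migration task table.
type Task struct {
	ID       int64
	CloneURL string
	Status   Status
	Error    string // only meaningful when Status is StatusFailed
}

// Queue is an in-memory stand-in for the database table.
type Queue struct {
	nextID int64
	tasks  map[int64]*Task
}

// NewQueue returns an empty queue.
func NewQueue() *Queue {
	return &Queue{tasks: make(map[int64]*Task)}
}

// Enqueue records a new migration request in the "queued" state.
func (q *Queue) Enqueue(cloneURL string) *Task {
	q.nextID++
	t := &Task{ID: q.nextID, CloneURL: cloneURL, Status: StatusQueued}
	q.tasks[t.ID] = t
	return t
}

// SetStatus moves a task to a new state; errMsg is kept only for failures.
func (q *Queue) SetStatus(id int64, s Status, errMsg string) error {
	t, ok := q.tasks[id]
	if !ok {
		return fmt.Errorf("migration task %d not found", id)
	}
	t.Status = s
	t.Error = ""
	if s == StatusFailed {
		t.Error = errMsg
	}
	return nil
}

// Unfinished returns, in ID order, the tasks a restarted server should pick up.
func (q *Queue) Unfinished() []*Task {
	var out []*Task
	for id := int64(1); id <= q.nextID; id++ {
		t, ok := q.tasks[id]
		if ok && (t.Status == StatusQueued || t.Status == StatusCloning) {
			out = append(out, t)
		}
	}
	return out
}

// StatusMessage is what the repository home page would show instead of a 500.
func StatusMessage(t *Task) string {
	switch t.Status {
	case StatusQueued:
		return "Migration is queued."
	case StatusCloning:
		return "Migration is in progress (cloning)."
	case StatusFailed:
		return "Migration failed: " + t.Error
	default:
		return "" // done: the regular repository page is shown
	}
}

func main() {
	q := NewQueue()
	a := q.Enqueue("https://example.com/a.git")
	b := q.Enqueue("https://example.com/b.git")

	// Pretend the first migration failed and the server restarted
	// while the second one was still cloning.
	_ = q.SetStatus(a.ID, StatusFailed, "network timeout")
	_ = q.SetStatus(b.ID, StatusCloning, "")

	fmt.Println(StatusMessage(a))
	for _, t := range q.Unfinished() {
		fmt.Println("picking up task", t.ID, t.CloneURL)
	}
}
```

Even a sketch this small surfaces questions worth settling in the proposal before writing the real code, for example whether a task interrupted in the “cloning” state should be resumed or cleaned up and cloned again from scratch.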
Last but not least, DO NOT send everything in a single PR. It is better to make iterative changes: because we have the proposal, we know where we are and what’s left.