Missing commit messages and changelogs

amitai · June 14, 2021, 5:18pm

Hi all:)
Reviewing many commits to main I noticed lots of commits have either a messy changelog or no message at all.
The ElasticSearch repo had commit messages that really help developers understand what the change was about, yet the new commits leading to the RC1 give little to no info.
I think the flip side is along the lines of this post:

I think no message is really a problem since developers are not keen on skipping between github and the IDE to see what led to the change they are seeing.

Does anyone have proper best practices?
Where should we add those guidelines?

ralph · June 15, 2021, 7:05am

proper git commit messages are a pet-peeve of mine (people hate it if i review their code - i always criticize the commit-msg ).

here are some very good guidelines on writing commit messages:
my favourite:
https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

i would probably put these guidelines in a CONTRIBUTING.md file

EDIT: probably the best source of good examples for git commit messages is still the linux kernel (for those who don’t know: git was built for & by the linux kernel community): kernel/git/torvalds/linux.git - Linux kernel source tree
and of course also git itself: Commits · git/git · GitHub

searchymcsearchface · June 15, 2021, 1:31pm

First, I’m guessing that this is relevant mostly to OpenSearch or is it an issue in the other repos?

I agree with @ralph about this type of guideline being in CONTRIBUTING.md.

Actually specifying these in the file is not hard, getting folks on-board is going to be harder depending how prescriptive the guidelines are.

The other place for this the PR template but my experience is that check boxes are often just a formality that no one thinks about, especially if it involves messing with existing commits.

amitai · June 15, 2021, 1:33pm

+1 for this, the only issue is that this is a project-wide issue and shouldn’t be limited to just the OpenSearch repo imo. Should we have some general place for guidelines that involve the whole project?

Also prevalent in opensearch-dashboards and many of the plugins.

When a project has a one liner as the issue description, and a commit message that doesn’t even reference that issue that is resolves - it could be considered gatekeeping, since newcomers won’t know what is going on in the project.

searchymcsearchface · June 15, 2021, 1:46pm

So, the one CONTRIBUTING.md to rule them all is challenging. There are just so many differences between languages and repos to make it relevant across all the repos. Not that there shouldn’t be some harmony between the guidelines though.

I tend to agree that erring on the side of verbosity is a net-positive for getting other contributors up-to-speed on the project.

It sounds like there is some really specific occurrences that are being referenced. Has there been comments made on these occurrences?

amitai · June 15, 2021, 3:05pm

I made comments on a PR recently, but commenting on the commits after they are committed seems redundant. Also, I am pretty sure only a maintainer could prevent a commit from being merged with an improper message. Policing this is not fun, contributors should be aware that maintaining these messages is in their best interest as well.

I honestly did not even think to check this as it is a very basic and important practice. I just noticed this the other day, checked the last 10+ commits in several repos and 7 times out of 10 the commit message is either missing or a mess.

By “a mess” I mean messages that are comprised of all the commit messages of the PR appended one after the other. This is obvious since the sign-off is layered in.

I would not say this is at all specific, if anything, it is wide-spread. I just never thought developers would submit code without a proper message.

Having said that, if a commit message is redundant because the title is descriptive enough, then perhaps the change is too small for its own PR.

amitai · June 15, 2021, 3:09pm

Do you think it best then to add this to the different projects as a new issue, and then each repo’s maintainers would resolve the issue in their own CONTRIBUTING.md?

searchymcsearchface · June 15, 2021, 3:52pm

This seems like a great meta issue. E.g. find a sample of commit and/or PR messages which have insufficient detail and link these in the meta issue - describe what you would have like to have seen.

I’m personally a big believer in addressing behaviour from specific to broad instead of the other way around. The impact is only large enough to affect the greatest change.

searchymcsearchface · June 15, 2021, 3:53pm

Once there is a recommendation, yeah, I think it would have to be on each CONTRIBUTING.

amitai · June 15, 2021, 3:55pm

The reason I refrained from doing that so far is to avoid any form of singling out anyone.
I will do my best to frame it in a way that is constructive though:)

searchymcsearchface · June 15, 2021, 3:56pm

I get that’s a little hard, but it’s really no different than pointing out a bug in the code

kvngar · June 15, 2021, 10:34pm

I’m wondering if perhaps proposing a flow where all commits are squashed and rebased prior to merge would make sense. Something like this could be added to the contributing MD file: Always Squash and Rebase your Git Commits

And we could include the squash and rebase step as part of the checklist in the PR template (Along with the DCO check) - That would ensure PR’s have clean, changelog ready commit messages.

Edit: If there was a good commit message as part of the PR template, this could be used by the maintainers to do a “Squash and Merge” from Github, and omit some of the rebase shenanigans which can also be tricky

kavilla · June 16, 2021, 6:25am

100% agreed. We should do better.

In the past, I have been held to standard of every commit being reviewable on a PR. If the commit is invalid it should have been squashed prior to the PR being raised or if the changes were requested then the PR should have an interactive rebased. Even if the plan is to have it squashed, there shouldn’t be broken commits out for review.

Descriptions are a must and the commit message should be 50 characters. We should treat commits like a story and if a commit has a vague commit message and no description we have to open up the commit and figure it out. The context might be lost at the point and retracking it will be hard. 50 characters commit message because tools that depend on the 50 characters for best experience.

We can’t amend committed messages with no messages with rewriting history but we should do better to hold this standard going forward. Just like you said,

it could be considered gatekeeping , since newcomers won’t know what is going on in the project.

I think now that we are in the stabilization phases I think this standard will easier to keep track.

amitai · June 16, 2021, 9:55am

Opened Commit messages should be checked as part of the Pull Request · Issue #851 · opensearch-project/OpenSearch · GitHub to continue the discussion over there:)

ralph · June 17, 2021, 8:02am

i disagree on that one: i’m a big fan of small, simple, atomic commits. it can happen that if you’re working on something you see something else which is super small but unrelated to what you’re working on => this can become its own commit (obviously) and own PR (because it’s unrelated it shouldn’t be bundled together as otherwise it’ll take longer for it to get merged and it might be confusing as a causation or correlation is presumed where there is none).

i’m very much against squashing multiple commits into one. on the contrary: i’m a firm believer in atomic commits (first search result as a random link for those who’re not familiar with it (note: i didn’t read the whole page, but it looks like an ok description at first glance): Make Atomic Git Commits | Aleksandr Hovhannisyan).
i continuously do interactive rebases when i’m working on things, but not to squash everything into one commit but rather to amend existing commits (usually by using fixup or squash), re-ordering commits (so that they’re logically grouped and i can open separate PRs if i can already split something out ahead of the whole topic). i sometimes also create separate branches, cherry-pick commits there and then create a PR from there - and after that PR has been merged i can rebase my “real” working branch again.

obviously i only do the force-pushes when working on a feature branch alone.

in general i’m also a firm believer in keeping feature-branches short-lived (also avoids all the rebasing/merging in the master branch). thus it’s important to do atomic commits which are well documented: this makes it easier to git-revert them again if for some reason the feature needs to be removed again. and of course the proper commit message is needed to understand if this is actually part of the feature or something else which must be kept in.

and maybe i got a bit off-topic now… i seem to have a tendency to do that, sorry
but i do believe that the topics “atomic commits”, “proper commit messages”, “clean commit history”, git-revert, etc. are interlinked.

amitai · June 20, 2021, 8:38am

Although atomic commits can be great, this slows down the development velocity because you must constantly think about the context of your change - as you said:

From my experience, it is really hard for many developers to work in an orderly fashion with atomic commits, which may add friction to onboarding. If it were a widespread practice then I probably wouldn’t say so.

I agree with you that some commits warrant a none message, for instance, a spelling fix in a doc, or a rename etc. You are correct that bunching things up when they are unrelated issues can be bad and should be avoided.

Topic		Replies	Views
Redesign of the documentation site - we could use your feedback Request For Comments discuss	2	338	July 29, 2022
OpenSearch Documentation for opensearch.yml General Feedback	1	479	October 6, 2021
Where should we put plugin migration instructions? OpenDistro	14	899	April 17, 2021
Document code reviews best practices in MAINTAINERS Request For Comments discuss	0	138	November 8, 2023
Feb 5 Fork Update OpenDistro	10	2534	February 11, 2021

Missing commit messages and changelogs

Related Topics