It’s October. And that means that Hacktoberfest has started. If you can get four pull requests accepted on other people’s code repositories during October then you can win a t-shirt.
In many ways, I think it’s a great idea. It encourages people to get involved in open source software. But in other ways, it can be a bit of a pain in the arse. Some people go crazy for a free t-shirt and that means you’ll almost certainly get several pull requests that aren’t really of the quality you’d hope for.
I have a particular problem that probably isn’t very common. I’ve talked before about the “semi-static” sites I run on GitHub Pages. There’s some data in a GitHub repo, and every couple of hours the system wakes up and runs some code which generates a few HTML pages and commits them into the repo’s “/docs” directory. And – hey presto! – there’s a new version of the web site.
A good example is Planet Perl. The data is a YAML file which mostly consists of a list of web feeds. Every couple of hours we run perlanet to pull in those web feeds and build a new version of the web site containing the latest articles about Perl.
Can you see what the problem is?
The problem is that the most obvious file in the repo is the “index.html” which is the web site. So when people find that repo and want to make a small change to the web site they’ll change that “index.html” file. But that file is generated. Every few hours, any changes to that file are overwritten as a new version is created. You actually want to change “index.tt”. But that uses Template Toolkit syntax, so it’s easy enough to see why people with no Perl knowledge might want to avoid editing that.
The README file for the project explains which files you might want to change in order to make different types of changes. But people don’t read that. Or, if they do read it, they ignore the bits that they don’t like.
So I get pull requests that I have to reject because they change the wrong files.
Last year I got enough of these problematic pull requests that I decided to automate a solution. The result is a pretty simple GitHub Workflow. It runs whenever my repo receives a pull request and looks at the list of files that the pull request changes. If that list includes “docs/index.html” then the PR is automatically closed with a polite message explaining what the contributor has done wrong.
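The check at the heart of that workflow can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the workflow itself: the function name, the GENERATED_FILES set and the wording of the message are all made up for the example, and a real workflow would get the list of changed files from GitHub rather than from a list you pass in by hand.

```python
# Hypothetical set of generated files. The post only names docs/index.html;
# add more paths here if your site generates more than one page.
GENERATED_FILES = frozenset({"docs/index.html"})


def review_pull_request(changed_files, generated=GENERATED_FILES):
    """Decide whether a pull request should be closed automatically.

    changed_files is a list of repo-relative paths touched by the PR.
    Returns a dict with a "close" flag and a "message" to post on the PR.
    """
    # Keep only the changed paths that are generated files.
    hits = [path for path in changed_files if path in generated]

    if not hits:
        # Nothing generated was touched, so leave the PR for a human.
        return {"close": False, "message": ""}

    # Illustrative wording only; write whatever polite message suits you.
    message = (
        "Thanks for the pull request. Unfortunately it changes "
        + ", ".join(hits)
        + ", which is generated automatically and will be overwritten "
        "the next time the site is rebuilt. Please see the README for "
        "the files to edit instead (for page layout, that is index.tt)."
    )
    return {"close": True, "message": message}
```

Calling review_pull_request(["docs/index.html", "README.md"]) gives a result with "close" set to True and a message naming the offending file, while a pull request that only touches "index.tt" is left open. Keeping the decision in one small function like this makes it easy to test without going anywhere near GitHub.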
This makes my life easier. It’s possible it might make your life easier too.