Over the last few months, I’ve been dabbling in using AI to generate or improve code. I have a subscription to GitHub Copilot and I’m finding it a really useful tool for increasing my productivity. Copilot comes in several different flavours, and I’ve been making particular use of a couple of them.
- Copilot Autocomplete was the first Copilot tool that GitHub released. Once you’ve configured your editor to use it, the AI will read the file you’re working on and will monitor what you’re typing. When it thinks it knows what you’re doing and what comes next, it will display a suggestion for the next chunk of code and if you like what you see, you can just hit the tab key to accept it. I’ve been pleasantly surprised by how well it does. I’ve had cases where I’ve just typed the name of a method and it has autocompleted the code for me.
- Copilot Chat was the next version to be released. This is a chat box that sits alongside your code where you can talk to the AI about what you’re doing and ask it for suggestions. This is great for taking on larger projects. I’ve found it particularly useful for working on front-end code. I can usually make CSS and JavaScript do what I want, but asking Copilot for suggestions makes me an order of magnitude quicker.
Those two tools alone make me a more efficient programmer, and they’re well worth the $10 a month I pay for my Copilot subscription. But recently I was invited to the preview of Copilot Workspace, and that’s a whole new level. Copilot Workspace takes a GitHub issue as its input and returns a complete, multi-file pull request that implements the required change. I’ve been playing with it for small tweaks, but I decided the time was right to do something more substantial. I planned to write an entire Dancer app by defining issues and asking Copilot to implement the code. Here’s what happened. You can follow along at the GitHub repo.
I decided I would start from the standard, automatically generated Dancer2 app. So I ran dancer2 gen -a Example and committed the output from that. It was then time for the first issue. I decided to start by adding (empty) routes for user registration and login. I opened the issue in the Copilot Workspace and asked the AI for some suggested code. It didn’t really understand the idea of empty routes – but the pull request seemed pretty good. I merged the PR and moved on to the next issue – to add basic registration and login screens. Again, the pull request did a little more than I asked for – adding a bit more registration and login logic – but the code was good.
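For anyone who hasn’t used Dancer2, “empty” routes for registration and login might look something like the sketch below. This is not the code from the pull request. The response strings are placeholders invented for illustration, and the package name simply follows the Example app generated above.

```perl
package Example;
use Dancer2;

# Placeholder routes: each one only returns a short body so that the
# URL exists. The strings are illustrative, not taken from the repo.
get  '/register' => sub { return 'registration form goes here' };
post '/register' => sub { return 'registration is not implemented yet' };

get  '/login'    => sub { return 'login form goes here' };
post '/login'    => sub { return 'login is not implemented yet' };

true;
```

Keeping the first issue this small makes it easy to see what the AI added beyond what was asked for.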
As an aside, you’ll notice that the PRs are all linked to the correct issues and contain substantial information about the changes. This is all generated by the AI.
For the next step, we needed a database table to store the users. I asked Copilot to use SQLite and it gave me what I wanted – once again, going above and beyond. For the first time, its overenthusiasm was slightly annoying, because it added some database code to store new users and I hadn’t told it that we would be using DBIx::Class. So that was the next issue and the next pull request. Note that the pull request even includes adding DBIx::Class to the prerequisites in Makefile.PL.
Time for some unit tests (OK, maybe the best time was a few PRs ago!). The issue description was simple – “Write unit tests for everything we have so far”. Maybe it was too simple – as this was the first time the AI seemed to struggle a bit. I was merging the PRs without really checking them and the PR introduced a lot of useful tests – but many of them failed. Part of the problem here is that (as far as I can see) Copilot Workspace has no way to run the code it produces – so it was guessing how well it was doing. It took a few iterations to get that right – it basically boiled down to the database schema not being loaded into the database before the tests were run. At times while we were working through these problems, I was reminded of someone (I think it was Simon Willison) describing an AI programming assistant as “an overconfident, overenthusiastic intern”. Luckily, unlike an intern, Copilot never gets annoyed with you telling it to try again and providing more and more information to help it get to the bottom of a problem.
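The root cause, a schema that was not loaded before the tests ran, is easy to illustrate. The sketch below uses plain DBI with an in-memory SQLite database rather than the DBIx::Class setup in the repo. The table and column names are assumptions made for the example, not the real schema.

```perl
use strict;
use warnings;
use DBI;
use Test::More;

# Hypothetical schema; the real table definition lives in the repo.
my $schema_sql = <<'SQL';
CREATE TABLE users (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE,
    email    TEXT NOT NULL,
    password TEXT NOT NULL
)
SQL

# A fresh in-memory database, so every test run starts empty.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Load the schema *before* any test touches the database.
# Skipping this step gives "no such table: users" errors.
$dbh->do($schema_sql);

$dbh->do('INSERT INTO users (username, email, password) VALUES (?, ?, ?)',
    undef, 'alice', 'alice@example.com', 'not-a-real-hash');

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM users');
is $count, 1, 'users table exists and accepts rows';

done_testing;
```

With DBIx::Class, the equivalent step would be to deploy the schema into the test database before running the tests.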
After a while, we had a working test suite and were back on track.
So we were back at adding features to the application. I decided the next thing we needed was to display the logged-in user’s username and email address on the main page. That seemed simple enough and worked first time. About this time I was getting annoyed with the standard Dancer2 web page, so we removed most of that. Then I switched from Dancer’s default “simple” templating system to the Template Toolkit [issue / PR].
While we were tidying up the look and feel, we added login and logout buttons [issue / PR] and a register button on the logged out page [issue / PR]. This led to some more confusion for a while as logging out didn’t work. It turned out the AI had used outdated code to destroy the session and I had to get very specific before it would do the right thing [issue / PR].
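For reference, current Dancer2 destroys a session through the app object. The sketch below shows a minimal logout route under that assumption. It is not the code from the pull request, and the redirect target is a guess.

```perl
package Example;
use Dancer2;

# Minimal logout route: destroy the current session, then send the
# user back to the front page (the '/' target is an assumption).
get '/logout' => sub {
    app->destroy_session;
    redirect '/';
};

true;
```

Code written for the original Dancer framework handles sessions differently, which may be where the outdated suggestion came from.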
We then added some more tests [issue / PR], displayed registration and login errors [issue / PR] and ensured we were storing the passwords in encrypted form (to be honest, I’m slightly disappointed that the AI didn’t do that by default) [issue / PR].
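Storing passwords safely usually means keeping a salted hash rather than anything reversible. The sketch below shows one way to do that in Perl, assuming the Crypt::Passphrase module with its Bcrypt encoder is installed. The repo may well use a different module, and the sample password is made up.

```perl
use strict;
use warnings;
use Crypt::Passphrase;

# Assumption: Crypt::Passphrase and Crypt::Passphrase::Bcrypt are
# installed. This is an illustration, not the code from the repo.
my $authenticator = Crypt::Passphrase->new(encoder => 'Bcrypt');

# At registration: store only the salted hash, never the plain password.
my $stored_hash = $authenticator->hash_password('correct horse battery staple');

# At login: check the submitted password against the stored hash.
my $ok  = $authenticator->verify_password('correct horse battery staple', $stored_hash);
my $bad = $authenticator->verify_password('wrong guess', $stored_hash);

print "right password accepted\n" if $ok;
print "wrong password rejected\n" if !$bad;
```

Only the hash would be written to the users table, so a leaked database does not directly expose anyone’s password.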
At this point (and I don’t know why I didn’t do it sooner), we replaced the UI with something using Bootstrap [issue / PR]. That led to a bit more tweaking of the buttons [issue / PR].
By now, I had basically got to where I wanted to be. I had an app that didn’t do anything useful, but let you register, log in and log out. And I’d done it all pretty quickly and without writing very much code.
Then I decided to push it too far.
What I actually wanted to achieve next was to add social registration and login to the site. I created an issue – Allow users to register and login using a Google account – and Copilot gave me some code. But at this point, it’s not just about code. You also need to configure stuff at Google in order to get this working. And, while Copilot gave me some information about what I needed to do, I haven’t yet been able to get it working. This is a good example of the limitations of AI-powered programming. It’s great at generating code, but (so far, at least) not so good at keeping up to date with how to interface with external systems. Oh, and there’s the problem we saw earlier about it not actually running the tests.
So, how do I think the experiment went? I was impressed. There was a lot of code generated that was as good as or better than I would have written myself. There are certainly the problems that I mentioned above, but this stuff is improving at such an incredible rate that I really can’t see those problems still existing in a year.
I’ve started using Copilot Workspace for a lot more of my projects. And I’m happy with the results I’ve got.
What about you? Have you used any version of Copilot to help with your coding? How successful has it been?
I haven’t used Copilot or other similar tools for one very strong reason: I don’t know what they’ve been trained on and the legal situation is unclear. Namely, who owns the output? Is it me, or Copilot, or the author of the code it’s been trained on? What if it’s produced code that is very similar to (or a verbatim copy of) something released under the (A)GPL – can I use it in a proprietary product?