Over the last couple of days I’ve been involved in several discussions where it was clear that other people don’t understand how Perl deals with Unicode. The documentation is clear and detailed (there’s even a good tutorial), but for some reason people persist in misunderstanding it.

Here’s a quick quiz. Can you explain (in detail) what is going on in each of these four command-line programs? And for bonus points, which one should we be emulating in our code?

In all cases, assume that my locale is set to en_US.UTF-8.

I’ll post explanations in a few days’ time.

Update: Coincidentally, Miyagawa posted something very similar on his blog.


Blogging By Proxy

I’ve been too busy to write anything here for a while, but here’s the next best thing.

A few months ago I gave a talk on Unicode Best Practices to the Perl team at Net-A-Porter. And now Adam Taylor has written up that talk on their new technical blog.