Over the last few days I’ve been involved in a couple of discussions where it became clear that other people don’t understand how Perl deals with Unicode. The documentation is clear and detailed (there’s even a good tutorial) but for some reason people persist in misunderstanding it.
Here’s a quick quiz. Can you explain (in detail) what is going on in each of these four command-line programs? And for bonus points, which one should we be emulating in our code?
1 2 3 4 5 6 7 8 |
$ perl –E‘say “ยฃ”‘ ยฃ $ perl –Mutf8 –E‘say “ยฃ”‘ ๏ฟฝ $ perl –C –E‘say “ยฃ”‘ รยฃ $ perl –C –Mutf8 –E‘say “ยฃ”‘ ยฃ |
In all cases, assume that my locale is set to en_US.UTF-8.
I’ll post explanations in a few days’ time.
Update: Coincidentally, Miyagawa posted something very similar on his blog.