Perl case-insensitive string array sorting

Perl string array sorting FAQ: How can I sort a Perl string array in a case-insensitive manner?

I mentioned in my first Perl array sorting tutorial that the default Perl sorting algorithm compares strings character by character using their ASCII values. Because every string in my simple Perl array sorting example was lowercase, sorting the following strings worked just fine:

# a simple perl string array
@pizzas = qw(pepperoni cheese veggie sausage);

However, if I add a few uppercase strings to that string array, like this:

# a perl string array with uppercase and lowercase strings
@pizzas = qw(pepperoni cheese veggie Works Anchovie sausage);

and then run my earlier Perl script, I get the following "sorted" results:

Anchovie Works cheese pepperoni sausage veggie

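These results look okay at first, since "Anchovie" is printed first, but the sort order is clearly wrong once "Works" is printed ahead of "cheese". Every uppercase ASCII letter has a lower value than every lowercase letter, so all of the capitalized strings sort ahead of the lowercase ones.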
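To see why "Works" lands ahead of "cheese", it helps to look at the character codes directly. The following is a minimal sketch using Perl's built-in ord function and the cmp operator; the variable names are only illustrative and are not part of my earlier script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Uppercase ASCII letters (65-90) come before lowercase ones (97-122),
# so a plain string comparison puts "Works" ahead of "cheese".
my $upper_w = ord('W');   # 87
my $lower_c = ord('c');   # 99
print "W = $upper_w, c = $lower_c\n";

# cmp returns -1, 0, or 1; -1 here means "Works" sorts first.
my $result = 'Works' cmp 'cheese';
print "Works cmp cheese = $result\n";
```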

To fix this Perl string array sorting problem, we need to tell Perl how to sort a string array in a case-insensitive manner. Fortunately, case-insensitive string array sorting in Perl is a well-known idiom that requires very little code.

Case-insensitive Perl string array sorting

To properly sort our Perl string array in a case-insensitive manner, we have to supply our own comparison routine to Perl's sort function. This comparison routine is often referred to as a "helper" function. (This is very similar to what you have to do in Java for advanced sorting, except that in Perl it does not require much code.)

Skipping the theory for now and cutting right to the chase, we need to sort our Perl string array as follows to get the case-insensitive string sort order we want:

@sorted_pizzas = sort { "\L$a" cmp "\L$b" } @pizzas;

The block of code between the curly braces tells Perl how to order any two elements of the array. As it sorts, Perl repeatedly sets the special variables $a and $b to a pair of elements and runs the block, which uses the Perl cmp operator to compare them as strings. The "\L" in front of the $a and $b variables converts each interpolated string to lowercase. Because both strings are made lowercase before the comparison, the end result is the case-insensitive sort order we're looking for. The original array elements are not modified; only the temporary copies used in the comparison are lowercased.
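As a rough sketch of the "helper" function idea, the same comparison can also live in a named subroutine that is passed to sort by name. The name by_lowercase is my own invention for illustration, and lc() is Perl's built-in lowercase function, used here in place of "\L":

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @pizzas = qw(pepperoni cheese veggie Works Anchovie sausage);

# Named comparison routine (hypothetical name). sort sets $a and $b to
# the two elements being compared; lc() lowercases each one before cmp.
sub by_lowercase { lc($a) cmp lc($b) }

my @sorted_pizzas = sort by_lowercase @pizzas;
print "@sorted_pizzas\n";
```

A named routine like this is mostly a matter of taste; it can be handy when the same comparison is needed in several places.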

Note that I could just as easily have converted both strings to uppercase using this comparison block instead:

# perl string array case-insensitive sorting
@sorted_pizzas = sort { "\U$a" cmp "\U$b" } @pizzas;

Either way, the case-insensitive sorted array output looks like this:

Anchovie cheese pepperoni sausage veggie Works
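One detail the comparison block leaves open is what happens when two strings differ only by case, such as "Works" and "works": they compare as equal. If you want a predictable order for those ties, one possible approach (a sketch of my own, not part of the original script) is to fall back to a plain cmp when the lowercase comparison returns 0. The sample data below is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data containing strings that differ only by case.
my @pizzas = qw(works cheese Works Cheese anchovie);

# Compare case-insensitively first; if that is a tie (0), fall back to
# an ordinary cmp, which puts the capitalized form first.
my @sorted = sort { lc($a) cmp lc($b) or $a cmp $b } @pizzas;
print "@sorted\n";   # anchovie Cheese cheese Works works
```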

A complete case-insensitive Perl array sorting script

If you'd like to test this case-insensitive string array sorting code on your own system, here's the complete source code for the Perl script that I used in writing this article:

#!/usr/bin/perl
use strict;
use warnings;

# sort a mixed-case perl string array without regard to case
my @pizzas = qw(pepperoni cheese veggie Works Anchovie sausage);
my @sorted_pizzas = sort { "\L$a" cmp "\L$b" } @pizzas;
print "@sorted_pizzas\n";

Feel free to use this code for your own Perl string array sorting needs.
