In a previous tutorial I wrote about how to sort an array of Ruby objects by one field in the object. In today's tutorial I'd like to demonstrate how to sort an array of Ruby objects by multiple attributes (or fields) of the class contained by the array.
Sorting a Ruby array by two or more class attributes may sound difficult, but as a friend of mine likes to say, "It's not too hard once you know how to do it."
Some sample data
Before we get into the array-sorting code, the first thing we'll need is some good raw data for sorting. Here are the contents of a sample CSV file I've been working with for my sorting tests.
"Rubble", "Barney", "Bedrock" "Rubble", "Betty", "Bedrock" "Flinstone", "Wilma", "Bedrock" "Flinstone", "Fred", "Bedrock" "Simpson", "Homer", "Springfield" "Simpson", "Bart", "Springfield" "MadeUp", "Bart", "Springfield"
I added that last line so I could test my sorting algorithm by three fields, which I'll get to shortly. Now let's take a look at some Ruby code.
Assumptions
For all of the following examples, please assume that these records have been loaded into an array named people, and the elements in this array are in the order shown above. For completeness I'll share all of the code later, but for now it's easier to skip all those details.
Also, assume that we have a Ruby class named Person which is used to hold each of those records. My Person class specifically contains the following fields:
last_name
first_name
city
Given those assumptions, let's look at some examples of sorting a Ruby array of objects by multiple fields.
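If you'd like to follow along without the CSV file, the assumptions above can be approximated with a small sketch like this one. The plain Struct and the hard-coded rows are stand-ins for the file-loading code shown at the end of this tutorial, not the exact program.

```ruby
# Stand-in for the Person class described above (assumption: a plain Struct
# with the three fields is enough for the sorting examples).
Person = Struct.new(:last_name, :first_name, :city)

# The sample records, in the same order as the CSV file.
rows = [
  ["Rubble",    "Barney", "Bedrock"],
  ["Rubble",    "Betty",  "Bedrock"],
  ["Flinstone", "Wilma",  "Bedrock"],
  ["Flinstone", "Fred",   "Bedrock"],
  ["Simpson",   "Homer",  "Springfield"],
  ["Simpson",   "Bart",   "Springfield"],
  ["MadeUp",    "Bart",   "Springfield"]
]

# Build the people array that the examples below sort.
people = rows.map { |last, first, city| Person.new(last, first, city) }

people.each { |person| puts "#{person.last_name}, #{person.first_name} (#{person.city})" }
```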
Sort by last_name, then first_name
My first example shows how to sort this array by two attributes (fields) of the Person class: last_name, and then first_name.
If you've never sorted a Ruby array by multiple attributes before, you may be thinking that it's very hard, but thanks to the sort_by method of the Enumerable module, it's not hard at all.
Here's the code needed to sort this array of Person objects by last_name, and then by first_name:
people = people.sort_by { |a| [ a.last_name, a.first_name ] }
As you can see, all you have to do is pass the sort_by method a block that returns an array of the sort keys, in priority order. After performing this sort operation, when I print the people array, like this:
people.each { |p| p.print_csv_record }
I get the sorted output, which looks like this:
"Flinstone","Fred","Bedrock" "Flinstone","Wilma","Bedrock" "MadeUp","Bart","Springfield" "Rubble","Barney","Bedrock" "Rubble","Betty","Bedrock" "Simpson","Bart","Springfield" "Simpson","Homer","Springfield"
Again, that sort was by last name, and then first name.
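The reason this works is that sort_by compares the arrays returned by the block, and Ruby compares arrays element by element: the second element only matters when the first elements are equal. Here's a quick sketch using plain string arrays (no Person class needed) to show the comparison on its own:

```ruby
# Array#<=> compares element by element, left to right.
a = ["Rubble", "Barney"]
b = ["Rubble", "Betty"]
c = ["Flinstone", "Wilma"]

puts a <=> b   # -1: the last names tie, so "Barney" vs "Betty" decides
puts c <=> a   # -1: "Flinstone" sorts before "Rubble"; first names are never compared
puts a <=> a   # 0: identical keys

# Sorting the key arrays directly gives the same order sort_by would use.
sorted = [a, b, c].sort
sorted.each { |key| puts key.join(", ") }
```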
Sort by first_name, then last_name
As a second example, let's sort the same data, this time by first_name, and then by last_name. Here's the code for this multiple-attribute array sorting algorithm:
people = people.sort_by { |a| [ a.first_name, a.last_name ] }
And here's the sorted output from this algorithm:
"Rubble","Barney","Bedrock" "MadeUp","Bart","Springfield" "Simpson","Bart","Springfield" "Rubble","Betty","Bedrock" "Flinstone","Fred","Bedrock" "Simpson","Homer","Springfield" "Flinstone","Wilma","Bedrock"
Again, that data is sorted by first name, and then last name. Note specifically that "Bart MadeUp" appears in the sorted output before "Bart Simpson". They both have the same first name, but the MadeUp last name comes before Simpson, which is correct.
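For comparison, the same ordering can be written with sort and an explicit comparison of the two key arrays. This is only a sketch to show what sort_by is doing on our behalf; the Person struct and the three records here are illustrative stand-ins for the full data set.

```ruby
# Illustrative stand-in for the Person class used in this tutorial.
Person = Struct.new(:last_name, :first_name, :city)

people = [
  Person.new("Simpson", "Bart",   "Springfield"),
  Person.new("MadeUp",  "Bart",   "Springfield"),
  Person.new("Rubble",  "Barney", "Bedrock")
]

# sort_by builds one key array per element and compares the keys.
by_sort_by = people.sort_by { |a| [a.first_name, a.last_name] }

# sort with a block compares two elements at a time; here we build the keys by hand.
by_sort = people.sort { |a, b| [a.first_name, a.last_name] <=> [b.first_name, b.last_name] }

puts by_sort_by == by_sort
by_sort.each { |person| puts "#{person.first_name} #{person.last_name}" }
```

With only a handful of records either form is fine. sort_by is usually the more readable choice because each key array is written once.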
Three-attribute sorting: Sort by city, last_name, first_name
As a final example, here is the code to sort my array of Ruby Person objects by three attributes: city, then last_name, and then first_name:
people = people.sort_by { |a| [ a.city, a.last_name, a.first_name ] }
By now you're used to the syntax, and this is no big deal.
Here's the output from this sorting code, showing Bedrock before Springfield, then "Bart MadeUp" appearing before "Bart Simpson", which again is correct:
"Flinstone","Fred","Bedrock" "Flinstone","Wilma","Bedrock" "Rubble","Barney","Bedrock" "Rubble","Betty","Bedrock" "MadeUp","Bart","Springfield" "Simpson","Bart","Springfield" "Simpson","Homer","Springfield"
Comments
I have to tell you, I went into this research not knowing what to expect, but it turned out to be pretty easy, didn't it? I never cease to be amazed by the simplicity of the Ruby programming language.
The complete Ruby array-sorting source code
If you'd like to run some tests for yourself, here is the complete Ruby source code that I used to perform these tests. You can use this source code, and the sample data shown earlier, to run your own tests.
# Program: SortCsvMultipleFields.rb
# Usage:   ruby SortCsvMultipleFields.rb InputFilename > OutputFilename
# Author:  alvin alexander, devdaily.com

# define a "Person" class to represent the columns in the CSV file
class Person < Struct.new(:first_name, :last_name, :city)

  # a method to print out a csv record for the current Person.
  # note that you can easily re-arrange columns here, if desired.
  def print_csv_record
    last_name.length==0 ? printf(",") : printf("\"%s\",", last_name)
    first_name.length==0 ? printf(",") : printf("\"%s\",", first_name)
    city.length==0 ? printf("") : printf("\"%s\"", city)
    printf("\n")
  end
end

#------#
# MAIN #
#------#

# bail out unless we get the right number of command line arguments
unless ARGV.length == 1
  puts "Dude, not the right number of arguments."
  puts "Usage: ruby SortCsvMultipleFields.rb InputFile.csv > SortedOutputFile.csv\n"
  exit
end

# get the input filename from the command line
input_file = ARGV[0]

people = Array.new

# loop through each record in the csv file, adding
# each record to our array.
f = File.open(input_file, "r")
f.each_line { |line|
  words = line.split(',')
  p = Person.new
  p.last_name = words[0].tr_s('"', '').strip
  p.first_name = words[1].tr_s('"', '').strip
  p.city = words[2].tr_s('"', '').strip
  people.push(p)
}

# (1) sort by last_name, then first_name
people = people.sort_by { |a| [ a.last_name, a.first_name ] }

# (2) sort by first_name, then last_name
#people = people.sort_by { |a| [ a.first_name, a.last_name ] }

# (3a) sort by city, last_name, then first_name
#people = people.sort_by { |a| [ a.city, a.last_name, a.first_name ] }

# (3b) alternate syntax for sorting:
#people = people.sort_by { |a| [ a[:city], a[:last_name], a[:first_name]] }

# print out all the sorted records
people.each { |p| p.print_csv_record }