Ruby CSV - An example of how to split CSV row data into fields

Ruby CSV FAQ: Can you share some sample Ruby code to demonstrate how to read a CSV file in Ruby?

I just created a Ruby script that opens, reads, and parses a simple CSV file. If you're interested in the full class definition, the full Ruby source code is available here.

But if you're just interested in seeing how to parse the records and fields of a CSV file, keep reading here.

My CSV data file

Before looking at any source code, it may help to see the contents of the CSV file I'd like to read in my Ruby script. Here are the contents of my CSV data file:

"Rubble", "Barney", "Bedrock"
"Rubble", "Betty", "Bedrock"
"Flinstone", "Wilma", "Bedrock"
"Flinstone", "Fred", "Bedrock"
"Simpson", "Homer", "Springfield"
"Simpson", "Bart", "Springfield"

As you can see, this CSV file contains three fields (or columns) of data: last name, first name, and city.

My Person class

Because my CSV file contains those three columns of data, the first thing I do is create a Ruby class that mirrors this data. Here is the source code for my Person class:

# a Person has a first name, last name, and city
class Person < Struct.new(:first_name, :last_name, :city)
end

Note that the order of the fields in that definition isn't important, because the code below assigns each field by name rather than by position. The only thing that is important is that I read those fields in the correct order when I read through the CSV file records, which I'll demonstrate next.
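To make that point concrete, here is a small sketch of the two ways to fill in a Struct-based Person. It uses the Person = Struct.new(...) shorthand so it can run on its own; for this purpose it behaves the same as the class shown above. The variable names barney and betty are just for illustration.

```ruby
# Stand-in for the Person class above, repeated so this sketch runs on its own
Person = Struct.new(:first_name, :last_name, :city)

# Assigning by name: the order of these statements does not matter
barney = Person.new
barney.city = "Bedrock"
barney.last_name = "Rubble"
barney.first_name = "Barney"

# Passing values positionally: they are matched to the fields in the
# order the fields were listed in Struct.new
betty = Person.new("Betty", "Rubble", "Bedrock")

puts barney.first_name   # Barney
puts betty.last_name     # Rubble
```

So the field order in the definition only matters if you pass the values to Person.new positionally, which my script doesn't do.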

How to read the CSV file records

Here's the next snippet of Ruby source code, which opens the CSV file, reads each record from the file, creates a Person object to represent that row of data, and then places that Person object onto an array I've named people. Here's that code:

# define an array to hold the Person records
people = Array.new

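# open the csv file and loop through each record, adding
# each record to our array. File.foreach closes the file
# for us when the loop is done.
File.foreach("people.csv") { |line|

  # skip blank lines (for example, an empty line at the end of the file)
  next if line.strip.empty?

  # each line has fields separated by commas, so split those fields
  fields = line.split(',')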
  
  # create a new Person
  p = Person.new

  # do a little work here to get rid of double-quotes and blanks
  p.last_name = fields[0].tr_s('"', '').strip
  p.first_name = fields[1].tr_s('"', '').strip
  p.city = fields[2].tr_s('"', '').strip
  people.push(p)
}

Here are a few more notes about this segment of code:

  1. The line of code that contains the line.split(',') method call breaks each record in the CSV file into an array of fields.
  2. In that same line of code, the ',' argument tells the split method that the comma is the field separator. If your fields were separated by another character, such as a tab, you'd specify that character in this split method. (The tab character must be written as "\t" in double quotes; in single quotes, '\t' is a literal backslash followed by the letter t.)
  3. In the segment of code that looks like this fields[0].tr_s('"', '').strip I'm saying, "Get the first field of this record; delete all double-quotes from that string; then strip all whitespace from the beginning and end of the string." In my case I wanted to get rid of those quote characters and the blank spaces, but in your case you may not want to do this; if so, just delete those last two method calls.
  4. In the people.push(p) line of code, I'm simply adding the current Person object to the array people.

I hope that all makes sense. I just wanted to show an example of how to read a CSV file using Ruby, specifically how to split the fields on each record, but somewhere along the way I got a little carried away. In this case hopefully "too much" information is better than "too little".
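One limitation worth knowing about: split(',') breaks a record in the wrong place if a quoted field itself contains a comma. Ruby ships with a csv library that handles this case, so here is a rough sketch of the same idea using it. The sample rows are inlined as a string and the "Simpson, Jr." value is invented for illustration. I've also left out the spaces after the commas, because the csv parser is stricter than split(',') about spaces outside quoted fields, so the data file shown earlier may need adjusting before it parses this way.

```ruby
require 'csv'

# Stand-in for the Person class above, repeated so this sketch runs on its own
Person = Struct.new(:first_name, :last_name, :city)

# Hypothetical sample data; the second row has a comma inside a quoted field
data = <<~ROWS
  "Rubble","Barney","Bedrock"
  "Simpson, Jr.","Bart","Springfield"
ROWS

# CSV.parse returns an array of rows, each row an array of field strings
# with the surrounding quotes already removed
people = CSV.parse(data).map do |last_name, first_name, city|
  person = Person.new
  person.last_name = last_name
  person.first_name = first_name
  person.city = city
  person
end

puts people[1].last_name   # Simpson, Jr.
```

To read from a file instead of a string, CSV.foreach("people.csv") yields one array of fields per record in the same way.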
