Ruby CSV FAQ: Can you share some sample Ruby code to demonstrate how to read a CSV file in Ruby?
I just created a Ruby script that opens, reads, and parses a simple CSV file. If you're interested in the full class definition, the full Ruby source code is available here.
But if you're just interested in seeing how to parse the records and fields of a CSV file, keep reading.
My CSV data file
Before looking at any source code, it may help to see the contents of the CSV file I'd like to read in my Ruby script. Here are the contents of my CSV data file:

    "Rubble", "Barney", "Bedrock"
    "Rubble", "Betty", "Bedrock"
    "Flinstone", "Wilma", "Bedrock"
    "Flinstone", "Fred", "Bedrock"
    "Simpson", "Homer", "Springfield"
    "Simpson", "Bart", "Springfield"

As you can see, this CSV file contains three fields (or columns) of data: last name, first name, and city.
My Person class
Because my CSV file contains those three columns of data, the first thing I do is create a Ruby class that mirrors this data. Here is the source code for my Person class that mirrors the columns of information in my CSV file:

    # a Person has a first name, last name, and city
    class Person < Struct.new(:first_name, :last_name, :city)
    end

Note that the order of the fields in that definition isn't important, because I assign each field by name. The only thing that is important is that I read those fields in the correct order when I read through the CSV file records, which I'll demonstrate next.
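As a small illustration of that point, here is a minimal sketch of how a Struct-based class like this behaves. The sample values are taken from the data file above; the variable names are only for illustration. Assigning by name works in any order, while passing values positionally to Person.new follows the order of the definition.

```ruby
# Same Struct-based class as in the article.
class Person < Struct.new(:first_name, :last_name, :city)
end

# Assigning by name: the order of these statements does not have to
# match the order of the fields in the Struct definition.
person = Person.new
person.last_name = "Rubble"
person.first_name = "Barney"
person.city = "Bedrock"

puts person.first_name     # Barney
puts person.to_a.inspect   # ["Barney", "Rubble", "Bedrock"]

# Passing values positionally: here the definition order does matter
# (first_name, last_name, city).
other = Person.new("Betty", "Rubble", "Bedrock")
puts other.last_name       # Rubble
```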
How to read the CSV file records
Here's the next snippet of Ruby source code, which opens the CSV file, reads each record from the file, creates a Person object to represent that row of data, and then places that Person object onto an array I've named people. Here's that code:

    # define an array to hold the Person records
    people = Array.new

    # open the csv file
    f = File.open("people.csv", "r")

    # loop through each record in the csv file, adding
    # each record to our array.
    f.each_line { |line|

      # each line has fields separated by commas, so split those fields
      fields = line.split(',')

      # create a new Person
      p = Person.new

      # do a little work here to get rid of double-quotes and blanks
      p.last_name = fields[0].tr_s('"', '').strip
      p.first_name = fields[1].tr_s('"', '').strip
      p.city = fields[2].tr_s('"', '').strip
      people.push(p)
    }

    # close the file now that every record has been read
    f.close

Here are a few more notes about this segment of code:
- The line of code that contains the line.split(',') method call breaks each record in the CSV file into an array of fields.
- In that same line of code, the comma in the split(',') call is what tells the split method that the comma is the field separator. If your fields were separated by another character, such as a tab, you'd specify that character in this split call instead. (In Ruby the tab character has to be written in double quotes, as "\t"; the single-quoted '\t' is a literal backslash followed by the letter t.)
- In the segment of code that looks like this: fields[0].tr_s('"', '').strip I'm saying, "Get the first field of this record; delete all double-quotes from that string; then strip all blank spaces from the beginning and end of the string." In my case I wanted to get rid of those quote characters and the blank spaces, but in your case you may not want to do this; if so, just delete those last two method calls.
- In the people.push(p) line of code, I'm simply adding the current Person object to the array people.
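To tie those notes together, here is a rough, self-contained sketch of the same steps wrapped in a method. The parse_person helper and its separator parameter are hypothetical names introduced for this example, not part of the original script. The sketch also swaps in delete('"') for tr_s('"', '') and File.foreach for File.open with each_line, and it writes two sample records to a temporary file so that it can run on its own.

```ruby
require 'tempfile'

class Person < Struct.new(:first_name, :last_name, :city)
end

# Hypothetical helper (not part of the original script): builds a Person
# from one line of text, splitting on the given separator.
def parse_person(line, separator = ',')
  # delete('"') is used here in place of tr_s('"', ''); both remove the quotes
  fields = line.split(separator).map { |field| field.delete('"').strip }
  person = Person.new
  person.last_name  = fields[0]
  person.first_name = fields[1]
  person.city       = fields[2]
  person
end

# Write two sample records to a temporary file so the sketch is self-contained.
people = []
Tempfile.create('people') do |file|
  file.puts '"Rubble", "Barney", "Bedrock"'
  file.puts '"Simpson", "Homer", "Springfield"'
  file.flush
  # File.foreach opens the file, yields each line, and closes it afterwards
  File.foreach(file.path) { |line| people.push(parse_person(line)) }
end

puts people.map(&:first_name).inspect   # ["Barney", "Homer"]

# Tab-separated input: note the double quotes around "\t"
tab_line = "\"Flinstone\"\t\"Fred\"\t\"Bedrock\""
puts parse_person(tab_line, "\t").city  # Bedrock
```

Keep in mind that a plain split(',') treats every comma as a separator, so this approach assumes none of the field values contain a comma themselves.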
I hope that all makes sense. I just wanted to show an example of how to read a CSV file using Ruby, specifically how to split the fields on each record, but somewhere along the way I got a little carried away. In this case hopefully "too much" information is better than "too little".