By Alvin Alexander. Last updated: June 30, 2016
This is a little Perl script I wrote to parse a CSV file I periodically download from Google AdSense. It does the following things:
- Opens the CSV file
- Skips through a bunch of lines until it gets to the first line I’m interested in
- In the rest of the file, it skips other lines that I’m not interested in
- In both of those cases it uses pattern matching to compare the current line in the file to the desired pattern
- For the lines it keeps, it extracts the fields from each line using Perl's split function, then prints those fields as CSV fields
Given that brief introduction, here is a Perl script that I use to process Google AdSense CSV files:
    #!/usr/bin/perl
    # A program to parse a Google AdSense CSV file, and convert it to
    # a better format, with all of the undesirable lines removed
    #
    # Usage: ./parse.pl GoogleCsvFile.csv > BetterGoogleCsvFile.csv

    use strict;

    # require exactly one argument: the CSV file to read
    die "Usage: $0 GoogleCsvFile.csv\n" if (@ARGV != 1);
    my $file = $ARGV[0];

    open (my $fh, '<', $file) || die ("Could not open $file: $!");

    my $do_skip_test = 1;
    while (my $line = <$fh>) {
        if ($do_skip_test) {
            if ($line =~ /^Page/) {
                # got to the starting point, don't skip lines any more
                $do_skip_test = 0;
            } else {
                next;
            }
        }

        # skip these lines, i don't want/need them
        next if $line =~ /^#/;
        next if $line =~ /search/;
        next if $line =~ /\/node/;
        next if $line =~ /jwarehouse/;

        # break the line into its six fields; the last one still has the newline
        my ($uri, $rev, $clicked, $imps, $ctr, $ecpm) = split ',', $line;
        chomp($ecpm);

        # drop rows with too few clicks or impressions
        next if ($clicked < 2);
        next if ($imps < 50);
        print "$uri, $rev, $clicked, $imps, $ctr, $ecpm\n";
    }
    close ($fh);
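If you want to try the skip-and-keep logic without a real AdSense download, the same steps can be sketched as a subroutine that works on an in-memory list of lines. This is only a sketch under a few assumptions: the name `filter_adsense_lines` is hypothetical, the sample rows and header column names are made up, and the header row is skipped explicitly here rather than being left to the numeric tests.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): applies the same
# skip/keep rules to a list of lines and returns the reformatted rows.
sub filter_adsense_lines {
    my @lines = @_;
    my @out;
    my $seen_header = 0;

    for my $line (@lines) {
        chomp $line;

        if (!$seen_header) {
            # ignore everything up to and including the "Page" header row
            $seen_header = 1 if $line =~ /^Page/;
            next;
        }

        # same pattern-based skips as the script
        next if $line =~ /^#/;
        next if $line =~ /search/;
        next if $line =~ m{/node};
        next if $line =~ /jwarehouse/;

        my @fields = split /,/, $line;
        next if @fields < 6;    # assumption: a usable row has six fields
        my ($uri, $rev, $clicked, $imps, $ctr, $ecpm) = @fields;

        next if $clicked < 2;
        next if $imps < 50;
        push @out, join(', ', $uri, $rev, $clicked, $imps, $ctr, $ecpm);
    }
    return @out;
}

# Made-up sample data, only meant to exercise each rule once.
my @sample = (
    "Report generated by AdSense\n",
    "Page,Revenue,Clicks,Impressions,CTR,eCPM\n",
    "/perl/split-example,1.25,3,120,2.5%,10.42\n",
    "/search/results,0.50,4,200,2.0%,2.50\n",
    "/node/1234,0.75,5,300,1.7%,2.50\n",
    "/java/jwarehouse/Foo.java,0.10,2,90,2.2%,1.11\n",
    "/perl/few-clicks,0.05,1,400,0.3%,0.13\n",
    "/perl/few-imps,0.40,2,40,5.0%,10.00\n",
    "/scala/option-examples,2.10,6,75,8.0%,28.00\n",
);

print "$_\n" for filter_adsense_lines(@sample);
```

With this sample data, only the `/perl/split-example` and `/scala/option-examples` rows are printed; every other row trips one of the skip rules.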
Summary
In summary, if you are looking for a Perl script to process Google AdSense CSV files (or CSV files in general), or you want to see how to loop over every line in a file, skip lines using pattern matching, or split a line of text into CSV fields, I hope this example is helpful.
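One caveat about converting a line of text into CSV fields: a plain split on a comma also splits inside quoted fields. Whether an AdSense export ever quotes a field is an assumption here, not something I have verified. If it does come up, here is a minimal sketch using `parse_line` from the core Text::ParseWords module (the sample row is made up):

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Made-up row: the first field is quoted because it contains a comma.
my $line = '"/perl/a,b",1.25,3,120,2.5%,10.42';

# A plain split breaks the quoted field in two.
my @naive = split /,/, $line;

# parse_line(delimiter, keep, line): with keep = 0 the quotes are removed
# and the comma inside them is left alone.
my @quoted = parse_line(',', 0, $line);

print scalar(@naive),  " fields from split\n";
print scalar(@quoted), " fields from parse_line\n";
print "first field: $quoted[0]\n";
```

Note that `parse_line` is not a full CSV parser. It treats apostrophes as quote characters too, and it expects the trailing newline to be removed with chomp first. For messy CSV data, a dedicated CPAN module such as Text::CSV is the more usual choice.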