Pickling and CSV

Preservation through Serialization and Tabulation

Pickle
Module for (de)serialization: storing complete Python objects in files and
loading them back later.
● Supports almost all data types – good.
● Works only with Python – bad.
import pickle
pickle.dump(obj, openBinaryFile) # Save obj to a file opened in binary mode ("wb")
obj = pickle.load(openBinaryFile) # Restore an object from a file opened in binary mode ("rb")
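A minimal round-trip sketch of the two calls above. The sample dictionary and the file name `inventory.pickle` are hypothetical and chosen only for illustration; the file is placed in a temporary directory.

```python
import os
import pickle
import tempfile

# Hypothetical sample object; any picklable Python object would do
inventory = {"apples": 3, "pears": [1, 2, 5], "open": True}

# Hypothetical file name, placed in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "inventory.pickle")

# Pickle files must be opened in binary mode: "wb" to save...
with open(path, "wb") as outfile:
    pickle.dump(inventory, outfile)

# ...and "rb" to restore
with open(path, "rb") as infile:
    restored = pickle.load(infile)

print(restored == inventory)  # True: the content is the same
print(restored is inventory)  # False: loading builds a new object
```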

What Is CSV?
● “Comma Separated Values”
● Tabular file with rows and columns.
● All rows have the same number of fields.
● Fields separated by commas.
○ “Commas” do not have to be commas. Any other character can be used, such as TAB (TSV,
“tab separated values”), vertical bar, space...
● The first row often serves as headers.
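The point about alternative separators can be sketched with the standard `csv` module, whose reader accepts a `delimiter` argument. The three small tables below are hypothetical and hold the same data with a comma, a TAB, and a vertical bar as the separator.

```python
import csv
import io

# Hypothetical data: the same table with three different separators
comma_text = "name,level\nAnn,Junior\nBob,Senior\n"
tab_text = "name\tlevel\nAnn\tJunior\nBob\tSenior\n"
bar_text = "name|level\nAnn|Junior\nBob|Senior\n"

def parse(text, delimiter):
    """Parse delimited text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

comma_rows = parse(comma_text, ",")
tab_rows = parse(tab_text, "\t")
bar_rows = parse(bar_text, "|")

# All three parse to the same rows; the first row holds the headers
print(comma_rows == tab_rows == bar_rows)
print(tab_rows[0])
```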

CSV Example
Student, ID, E-mail Address, Phone Number, Class, Academic Level
"Almarar, Hassan A", 16897**, halmarar2@suffolk.edu, Junior, UG
"Arakelyan, Artur", 17577**, aarakelyan@suffolk.edu, Sophomore, UG
"Batista, Christopher A", 16357**, cbatista@suffolk.edu, Senior, UG
Complete file...
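In the example above, the quotes keep the comma inside each name from being read as a field separator. A small sketch of how the `csv` module handles such rows follows; the names and IDs are hypothetical, and `skipinitialspace=True` is used on the assumption that a space follows every separating comma, as in the example.

```python
import csv
import io

# Hypothetical rows in the same layout: a quoted name with an embedded comma,
# and a space after every separating comma
sample = (
    'Student, ID, Class\n'
    '"Doe, Jane A", 10001, Junior\n'
    '"Roe, Richard", 10002, Senior\n'
)

# skipinitialspace=True drops the blank that follows each delimiter
rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

for row in rows:
    print(row)
```

Each quoted name comes back as a single field, and every field is a string, including the IDs.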

Reading CSV
import csv
with open("path-words.csv", newline="") as csvfileIn:
    reader = csv.reader(csvfileIn, delimiter=',', quotechar='"')
    # If the file has a header row, read it first;
    # next() returns the next row parsed as a list of strings
    headers = next(reader)
    # Process the rest of the file
    for row in reader:
        do_something(row)
    # Or, instead of the loop (reader is an iterator and can be consumed only once):
    all_rows = list(reader)
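The loop and `list(reader)` are alternatives because a reader can be traversed only once. A minimal sketch, using a hypothetical in-memory file in place of a file on disk:

```python
import csv
import io

# Hypothetical in-memory file standing in for a CSV file on disk
csvfile = io.StringIO("word,count\nalpha,1\nbeta,2\n")

reader = csv.reader(csvfile)
headers = next(reader)        # consumes the header row
first_pass = list(reader)     # consumes all remaining rows
second_pass = list(reader)    # nothing is left to read

print(headers)      # ['word', 'count']
print(first_pass)   # [['alpha', '1'], ['beta', '2']]
print(second_pass)  # []
```

Note that the counts come back as strings; the reader does not convert them to numbers.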

Writing CSV
import csv
with open("path-words.csv", "w", newline="") as csvfileOut:
    writer = csv.writer(csvfileOut, delimiter=',', quotechar='"')
    writer.writerow([..., ..., ...]) # Write headers
    # Write the rest of the file; each row is a list of strings or numbers
    writer.writerows([row1, row2, row3, ...])
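A self-contained sketch of writing a file and reading it back. The headers, rows, and the file name `words.csv` are hypothetical; the file is created in a temporary directory.

```python
import csv
import os
import tempfile

# Hypothetical headers and rows; numbers are converted to text on writing
headers = ["word", "count"]
rows = [["alpha", 1], ["beta, gamma", 2]]

path = os.path.join(tempfile.mkdtemp(), "words.csv")  # hypothetical file name

# newline="" lets the csv module control line endings itself
with open(path, "w", newline="") as csvfile_out:
    writer = csv.writer(csvfile_out)
    writer.writerow(headers)
    writer.writerows(rows)

with open(path, newline="") as csvfile_in:
    restored = list(csv.reader(csvfile_in))

print(restored)
```

The writer quotes the field that contains a comma, so it survives the round trip as one field, while the numbers return as strings.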

Example: Who Are the Students? (students.py)
import csv, collections
with open("class-2017.csv", newline="") as mystudents:
    reader = csv.reader(mystudents)
    headers = next(reader)
    class_position = headers.index("Class") # Where is the Class column?
    class_levels = [row[class_position] for row in reader]
who_s_who = collections.Counter(class_levels) # Summary
with open("class-summary.csv", "w", newline="") as levels:
    writer = csv.writer(levels)
    writer.writerow(['Class', 'count']) # New headers
    writer.writerows(who_s_who.items()) # New content
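As a variation on the example above (not the approach used in students.py), the header lookup can be left to `csv.DictReader`, which uses the first row as dictionary keys. The roster below is hypothetical, and in-memory buffers stand in for the two files.

```python
import collections
import csv
import io

# Hypothetical class roster standing in for class-2017.csv
roster = io.StringIO(
    "Student,Class\n"
    '"Doe, Jane A",Junior\n'
    '"Roe, Richard",Senior\n'
    '"Poe, Edgar",Junior\n'
)

# DictReader uses the first row as keys, so no column index is needed
class_levels = [row["Class"] for row in csv.DictReader(roster)]
who_s_who = collections.Counter(class_levels)

# Write the summary to an in-memory buffer instead of class-summary.csv
summary = io.StringIO()
writer = csv.writer(summary)
writer.writerow(["Class", "count"])
writer.writerows(who_s_who.items())

print(summary.getvalue())
```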

Whenever possible, use Pandas
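For comparison, a rough sketch of the same class count with pandas, assuming the library is installed. The roster is hypothetical and read from an in-memory buffer; `read_csv` and `value_counts` are standard pandas calls.

```python
import io

import pandas as pd

# Hypothetical class roster; a file path could be passed to read_csv instead
roster = io.StringIO(
    "Student,Class\n"
    '"Doe, Jane A",Junior\n'
    '"Roe, Richard",Senior\n'
    '"Poe, Edgar",Junior\n'
)

students = pd.read_csv(roster)               # the header row becomes column names
summary = students["Class"].value_counts()   # number of students per class

print(summary)
```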
