CHAP 4
File Handling
• A file helps store information permanently.
• A file in itself is a bunch of bytes stored on some storage device such as a hard disk or thumb drive.
• Data files are the files that store data pertaining to a specific application for later use.
Data files are of two types: Text File and Binary File
A text file stores information in the form of a stream of ASCII or Unicode characters.
In text files, each line of text is terminated with a special character known as the EOL (end of line) character.
In Python, by default, the EOL character is the newline character (\n) or the carriage-return, newline combination (\r\n).
Regular Text Files
• These are the text files that store text in the same form as typed. Here the newline character ends a line and text translations take place. These files have the file extension .txt.
Delimited Text Files
• In these text files, a specific character is stored to separate the values, i.e., a tab or a comma after every value.
• When a tab character is used to separate the values stored, these are called TSV (tab separated values) files. These files can take the extension .txt or .csv.
• When a comma is used to separate the values stored, these are called CSV (comma separated values) files. These files take the extension .csv.
Regular Text File Content: I am a simple text.
TSV File Content: I -> am -> simple. (-> denotes a tab character)
CSV File Content: I, am, simple.
• Binary files store information in the form of a stream of bytes.
• A binary file contains information in the same format in which the information is held in memory.
• Binary files can take a variety of extensions.
Difference between binary file and text file
• A text file can be opened in any text editor and is in human-readable form, while binary files are not in human-readable form.
Open File -> Write / Read -> Close the File
• The basic file manipulation tasks include adding, modifying or deleting data in a file, which in turn involve any one or a combination of the following operations:
o Reading data from files
o Writing data to files
o Appending data to files
Syntax:
<file_objectname> = open(<filename>, <mode>)
where <file_objectname> is the identifier used to access the file, open is the built-in function, <filename> is the name of the file and <mode> is the access mode.
Examples:
f = open("myfile.txt", "w")
myobj = open("binfile.dat", "r")
# If you don't use a raw string, you have to escape every backslash:
fo = open("c:\\temp\\data.txt", "r")
# The prefix r in front of a string makes it a raw string:
fiob = open(r"c:\temp\data.txt", "r")
The with statement works with the open() function to open a file. Unlike a plain open() call, where you have to close the file yourself with the close() method, the with statement closes the file for you automatically when its block ends.
with open("hello.txt") as my_file:
    print(my_file.read())
• File objects are used to read and write data to a file on disk.
• A file object is a reference to a file on disk. It opens the file and makes it available for a number of different tasks.
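The two ways of opening a file described above can be compared in a short sketch. The scratch directory and the file name hello.txt are assumptions made only so that the example runs without touching real files.

```python
import os
import tempfile

# Hypothetical scratch location for this sketch.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

# Plain open(): the caller must remember to call close().
f = open(path, "w")
f.write("Hello, file!\n")
f.close()

# with statement: the file is closed automatically when the block ends.
with open(path, "r") as my_file:
    content = my_file.read()

print(content)
print("Closed after the with block:", my_file.closed)
```

The with form is usually preferred because the file is closed even if an error occurs inside the block.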
Text File Mode | Binary File Mode | Description
'r' | 'rb' | read only
'w' | 'wb' | write only
'a' | 'ab' | append
'r+' | 'r+b' or 'rb+' | read and write
'w+' | 'w+b' or 'wb+' | write and read
'a+' | 'a+b' or 'ab+' | write and read
• The close() function breaks the link between the file-object and the file on the disk.
• After close(), no tasks can be performed on that file through the file-object.
• Syntax:
<fileobject>.close()
• Example:
f.close()
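As a small illustration of what close() does, the sketch below writes to a file, closes it, and then shows that further use of the same file-object is rejected. The file name and scratch directory are assumptions for the example.

```python
import os
import tempfile

# Hypothetical scratch file for this sketch.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

f = open(path, "w")       # 'w' creates the file and opens it for writing
f.write("first line\n")
f.close()                 # breaks the link between f and the file on disk

try:
    f.write("more text")  # f is closed, so this operation is refused
    failed = False
except ValueError as err:
    failed = True
    print("Cannot use a closed file:", err)
```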
WORKING WITH TEXT FILES
Python provides many functions for reading and writing open files.
• Python provides three types of read functions to read from a data file.
• Before you read from a file, the file must be opened and linked via a file object.
Method | Syntax | Description
read() | <fileobject>.read([n]) | reads at most n bytes (n characters for a file opened in text mode); if n is not specified, reads the entire file.
readline() | <fileobject>.readline([n]) | reads a line of input; if n is specified, reads at most n bytes.
readlines() | <fileobject>.readlines([n]) | reads all lines and returns them in a list.
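The three read functions in the table can be tried out with a minimal sketch like the following. The sample file and its three lines are made up for the illustration.

```python
import os
import tempfile

# Hypothetical sample file with three lines.
path = os.path.join(tempfile.mkdtemp(), "poem.txt")
with open(path, "w") as f:
    f.write("line one\nline two\nline three\n")

with open(path, "r") as f:
    first_four = f.read(4)       # reads 4 characters
    rest_of_line = f.readline()  # reads up to and including the newline
    remaining = f.readlines()    # reads the remaining lines into a list

with open(path, "r") as f:
    whole = f.read()             # no n given: reads the entire file

print(repr(first_four), repr(rest_of_line), remaining)
```

Each call continues from where the previous one stopped, which is why readlines() returns only the last two lines here.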
You can also combine the open() and read() functions as follows:
open("filename", <mode>).read()
• The writing functions also work on open files, i.e., the files that are opened and linked via a file object.
Method | Syntax | Description
write() | <fileobject>.write(str) | writes string str to the file referenced by <fileobject>.
writelines() | <fileobject>.writelines(L) | writes all strings in list L to the file referenced by <fileobject>; newline characters are not added automatically.
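A minimal sketch of the two writing functions follows. The file name and the list contents are assumptions for the example. Note that each string carries its own newline, since neither function adds one.

```python
import os
import tempfile

# Hypothetical scratch file for this sketch.
path = os.path.join(tempfile.mkdtemp(), "names.txt")

lines = ["alpha\n", "beta\n", "gamma\n"]

with open(path, "w") as f:
    f.write("header\n")   # write() takes a single string
    f.writelines(lines)   # writelines() takes a list of strings

with open(path, "r") as f:
    result = f.readlines()

print(result)
```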
You can also use plus symbol (+) with file read
mode to facilitate reading as well as writing.
If you want to write into the file while retaining the
old data, then you should open the file in 'a' or
append mode.
When you open a file in 'w' or write mode, Python
overwrites an existing file or creates a non-
existing file, which means, for an existing file with
the same name, the earlier data gets lost.
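The difference between 'a' and 'w' can be seen in a short sketch. The file name and the text written are assumptions for the illustration.

```python
import os
import tempfile

# Hypothetical scratch file for this sketch.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as f:
    f.write("old data\n")

# Append mode keeps the earlier content and adds to the end.
with open(path, "a") as f:
    f.write("appended\n")
with open(path, "r") as f:
    after_append = f.read()

# Write mode truncates the existing file, so the earlier data is lost.
with open(path, "w") as f:
    f.write("new data\n")
with open(path, "r") as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```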
WRITING IN A FILE CAN BE IN THE FOLLOWING FORMS:
In an existing file, while retaining its content
• (a) if the file has been opened in append mode ("a") to retain the old content.
• (b) if the file has been opened in 'r+' or 'a+' mode to facilitate reading as well as writing.
To create a new file or to write on an existing file after truncating / overwriting its old content
• (a) if the file has been opened in write-only mode ("w")
• (b) if the file has been opened in 'w+' mode to facilitate writing as well as reading
Make sure to use the close() function on the file-object after you have finished writing.
• The flush() function forces any data still pending in the output buffer to be written to disk.
Syntax:
<fileobject>.flush()
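The effect of flush() can be sketched as follows: the data becomes visible to a second reader even though the writing file-object is still open. The file name is an assumption. Strictly speaking, flush() hands the buffered data to the operating system, which is sufficient for the purpose shown here.

```python
import os
import tempfile

# Hypothetical scratch file for this sketch.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

f = open(path, "w")
f.write("buffered text")  # may still be sitting in the output buffer
f.flush()                 # push the pending data out of the buffer

# A second file-object can now see the data, although f is still open.
with open(path, "r") as reader:
    seen = reader.read()

f.close()
print(seen)
```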
• All the read functions also read the leading and trailing whitespace, i.e., spaces, tabs or newline characters.
• If you want to remove any of this leading and trailing whitespace, you can use the strip() functions.
strip()
• Removes the given characters (whitespace by default) from both ends
rstrip()
• Removes the given characters from the trailing end, i.e., the right end
lstrip()
• Removes the given characters from the leading end, i.e., the left end
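A quick sketch of the three functions on a line as it might come back from readline(). The sample strings are made up for the illustration.

```python
# A line with leading and trailing whitespace, as a read function might return it.
raw = "  \tKRISH,67,75 \n"

both = raw.strip()    # whitespace removed from both ends
right = raw.rstrip()  # whitespace removed from the right end only
left = raw.lstrip()   # whitespace removed from the left end only

# With an argument, the given characters are removed instead of whitespace.
stars = "**note**".strip("*")

print(repr(both), repr(right), repr(left), repr(stars))
```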
 Every file maintains a file pointer which tells the current position in
the file where writing or reading will take place.
 Whenever you read something from a file or write onto a file,
then these two things happen involving file-pointer:
o This operation takes place at the position of file-pointer and
o File-pointer advances by the specified number of bytes.
fh = open("marks.txt", "r")
Will open the file and place the file-pointer at the beginning of the file:
0 1 , K R I S H , 6 7 , 7 5 \n , J A I , 8 5 , 6 9 …..
Position of the file-pointer when the file is opened in reading mode
ch = fh.read(1)
Will read 1 byte from the file from the position the file-pointer is currently at, and the file-pointer advances by one byte.
0 1 , K R I S H , 6 7 , 7 5 \n , J A I , 8 5 , 6 9 …..
The file-pointer has advanced by 1 byte
Krish Info Tech
File Modes | Opening Position of File-pointer
r, rb, r+, rb+, r+b | Beginning of the file
w, wb, w+, wb+, w+b | Beginning of the file (overwrites the file if the file exists)
a, ab, a+, ab+, a+b | At the end of the file if the file exists; otherwise creates a new file
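The opening positions in the table can be checked with tell(), which returns the current position of the file-pointer. This is a minimal sketch; the file name and its ten-character content are assumptions.

```python
import os
import tempfile

# Hypothetical scratch file holding ten characters.
path = os.path.join(tempfile.mkdtemp(), "marks.txt")
with open(path, "w") as f:
    f.write("0123456789")

with open(path, "r") as f:
    pos_r = f.tell()   # read mode: pointer at the beginning

with open(path, "a") as f:
    pos_a = f.tell()   # append mode: pointer at the end of the existing data

with open(path, "w") as f:
    pos_w = f.tell()   # write mode: pointer at the beginning, old data removed

size_after_w = os.path.getsize(path)
print(pos_r, pos_a, pos_w, size_after_w)
```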
WORKING WITH
BINARY FILES
Sometimes you may need to write and read non-simple objects such as dictionaries, tuples, lists or nested lists onto files. To maintain their structure, we have to serialize the objects.
• Pickling is the process whereby a Python object hierarchy is converted into a byte-stream.
• Unpickling is the inverse operation, whereby a byte-stream is converted back into an object hierarchy.
• The pickle module implements a fundamental but powerful algorithm for serializing and de-serializing a Python object structure.
• To work with the pickle module, you must first import it in your program using the import statement:
import pickle
• You may then use the dump() and load() functions of the pickle module to write to and read from an open binary file, respectively.
1. Import the pickle module.
2. Open the binary file in the required file mode.
3. Process the binary file by writing / reading objects using the pickle module's methods.
4. Once done, close the file.
A binary file is opened in the same way as you open any other file, but make sure to use "b" with the file mode to open the file in binary mode.
• Eg:
dfile = open("stu.dat", "wb+")
file1 = open("stu.dat", "rb+")
CLOSING THE FILE
• Eg:
dfile.close()
• Syntax of dump():
pickle.dump(<object-to-be-written>, <filehandle-of-open-file>)
• Eg:
pickle.dump(list1, file1)
APPENDING RECORDS IN BINARY FILE
• Appending records in binary files is similar to writing; the only thing you have to ensure is that you open the file in append mode ("ab").
• A file opened in append mode will retain the previous records and append the new records written to the file.
• The dump() function of the pickle module is used to append.
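Writing and then appending records with dump() might look like the sketch below. The record layout (a dictionary with roll, name and marks) and the scratch location are assumptions chosen for the illustration.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file and records.
path = os.path.join(tempfile.mkdtemp(), "stu.dat")
rec1 = {"roll": 1, "name": "KRISH", "marks": [67, 75]}
rec2 = {"roll": 2, "name": "JAI", "marks": [85, 69]}

# "wb" creates the binary file (or overwrites an existing one).
with open(path, "wb") as dfile:
    pickle.dump(rec1, dfile)
size_after_write = os.path.getsize(path)

# "ab" keeps the earlier record and adds the new one at the end.
with open(path, "ab") as dfile:
    pickle.dump(rec2, dfile)
size_after_append = os.path.getsize(path)

# Each load() call returns the next pickled object in the file.
with open(path, "rb") as dfile:
    first = pickle.load(dfile)
    second = pickle.load(dfile)

print(first, second)
```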
To read from the file, we use the load() function of the pickle module, as it unpickles the data coming from the file.
<object> = pickle.load(<filehandle>)
obj = pickle.load(f)
• It is important to know that the pickle.load() function raises EOFError when you reach end-of-file while reading from the file.
• To handle this, we use a try and except block that catches EOFError.
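One common way to read every record, sketched here under the assumption that the file holds an unknown number of pickled objects, is to keep calling load() until EOFError signals the end of the file. The file name and records are made up for the example.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file with three pickled records.
path = os.path.join(tempfile.mkdtemp(), "stu.dat")
with open(path, "wb") as f:
    for rec in (["KRISH", 67], ["JAI", 85], ["MIRA", 90]):
        pickle.dump(rec, f)

records = []
with open(path, "rb") as f:
    try:
        while True:                       # keep reading record by record
            records.append(pickle.load(f))
    except EOFError:
        pass                              # end of file reached: stop reading

print(records)
```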
SEARCHING IN A FILE
• Though there are multiple ways of searching for a value, the simplest is the sequential search, whereby you read the records from a file one by one and look for the search key in each record read.
1. Open the file in read mode.
2. Read the file contents record by record.
3. In every record read, look for the desired search-key.
4. If found, process as desired.
5. If not found, read the next record and look for the desired search-key.
6. If the search-key is not found in any of the records, report that no such value was found in the file.
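The search steps above can be sketched for a binary file of pickled records. The helper name search_record, the dictionary layout and the sample data are all hypothetical, introduced only for this illustration.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file with three pickled records.
path = os.path.join(tempfile.mkdtemp(), "stu.dat")
with open(path, "wb") as f:
    for rec in ({"roll": 1, "name": "KRISH"},
                {"roll": 2, "name": "JAI"},
                {"roll": 3, "name": "MIRA"}):
        pickle.dump(rec, f)


def search_record(filename, roll):
    """Sequential search: return the record with the given roll, or None."""
    with open(filename, "rb") as f:         # step 1: open in read mode
        try:
            while True:
                rec = pickle.load(f)        # step 2: read record by record
                if rec["roll"] == roll:     # step 3: look for the search-key
                    return rec              # step 4: found
        except EOFError:
            return None                     # step 6: not in any record


found = search_record(path, 2)
missing = search_record(path, 9)
print(found if found else "No such value found in the file")
print(missing if missing else "No such value found in the file")
```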
UPDATE IN A BINARY FILE
Updating an object means changing its value(s) and storing it again. Updating a record in a file is similar and is a three-step process:
1. Locate the record to be updated by searching for it.
2. Make changes in the loaded record in memory.
3. Write back onto the file at the exact location of the old record.
ACCESSING AND MANIPULATING LOCATION OF A FILE POINTER
• Python provides two functions that help you manipulate the position of the file-pointer, so that you can read and write from a desired position in the file.
• The two file-pointer location functions of Python are:
1. tell()
2. seek()
The seek() function changes the position of the file-pointer by placing it at the specified position in the open file.
<file-object>.seek(offset[, mode])
Here mode is 0 (from the beginning of the file, the default), 1 (from the current position) or 2 (from the end of the file).
The tell() function returns the current position of the file-pointer in the file.
<file-object>.tell()
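fh = open("Marks.txt", "r")
print("Initially file-pointer's position is at: ", fh.tell())
print("3 bytes read are: ", fh.read(3))
print("After previous read, current position of file-pointer: ", fh.tell())
fh.close()
# Open in binary mode: in Python 3, a file opened in text mode does not allow
# nonzero seeks relative to the current position or the end of the file.
fh = open("Marks.txt", "rb")
fh.seek(30)      # 30 bytes forward from the beginning
fh.seek(30, 1)   # 30 bytes forward from the current position
fh.seek(-30, 2)  # 30 bytes backward from the end
You can move the file-pointer in the forward direction as well as the backward direction.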
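The sketch below exercises tell() and all three seek() modes on a small file. The file name and its 20-byte content are assumptions. The file is opened in binary mode because Python 3 rejects nonzero seeks from the current position or from the end on text-mode files.

```python
import os
import tempfile

# Hypothetical scratch file holding 20 bytes.
path = os.path.join(tempfile.mkdtemp(), "Marks.dat")
with open(path, "wb") as f:
    f.write(b"0123456789ABCDEFGHIJ")

with open(path, "rb") as fh:
    start = fh.tell()        # position right after opening
    fh.seek(5)               # 5 bytes from the beginning
    three = fh.read(3)       # the read advances the pointer by 3
    after_read = fh.tell()
    fh.seek(2, 1)            # 2 bytes forward from the current position
    one = fh.read(1)
    fh.seek(-4, 2)           # 4 bytes backward from the end
    tail = fh.read()

print(start, three, after_read, one, tail)
```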
 To determine the exact location, the enhanced version of the updation method would be:
1. Open the file in read as well as write mode.
2. Locate the record:
a) First, store the position of the file-pointer (say, in rpos) before reading a record.
b) Read a record from the file and search for the key in it through an appropriate test condition.
c) If found, your desired record's start position is available in rpos.
3. Make changes in the record by changing its values in memory, as desired.
4. Write back onto the file at the exact location of the old record:
a) Place the file-pointer at the stored record position using seek(), that is at rpos, which was stored in step 2(a).
b) Write the modified record now.
Step 4(a) is important and necessary, as any read or write operation takes place at the current position of the file-pointer. So the file-pointer must be at the beginning of the record to be overwritten.
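These steps can be sketched as follows. The record layout and the names are hypothetical. One assumption deserves emphasis: overwriting in place is only safe when the modified record pickles to the same number of bytes as the old one, which holds here because one small integer replaces another.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file with three records: [roll, name, marks].
path = os.path.join(tempfile.mkdtemp(), "stu.dat")
with open(path, "wb") as f:
    for rec in ([1, "KRISH", 67], [2, "JAI", 85], [3, "MIRA", 90]):
        pickle.dump(rec, f)

found = False
with open(path, "rb+") as f:              # step 1: read as well as write mode
    try:
        while True:
            rpos = f.tell()               # step 2(a): remember the record start
            rec = pickle.load(f)          # step 2(b): read and test the record
            if rec[0] == 2:
                rec[2] = 95               # step 3: change the value in memory
                f.seek(rpos)              # step 4(a): go back to the record start
                pickle.dump(rec, f)       # step 4(b): overwrite the old record
                found = True
                break
    except EOFError:
        pass                              # key not present in any record

# Read everything back to confirm the update.
updated = []
with open(path, "rb") as f:
    try:
        while True:
            updated.append(pickle.load(f))
    except EOFError:
        pass

print(updated)
```

If the new record could be longer or shorter than the old one, a safer approach is to rewrite the whole file with the modified records.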
 pickle.PicklingError - raised when
an unpicklable object is encountered
while writing.
 pickle.UnpicklingError - raised during
unpickling of an object, if there is any
problem.
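As a small illustration of the second exception, the sketch below tries to unpickle a file that does not contain pickled data. The file name and its content are assumptions chosen so that unpickling fails.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file containing plain bytes rather than pickled data.
path = os.path.join(tempfile.mkdtemp(), "broken.dat")
with open(path, "wb") as f:
    f.write(b"not a pickle")

with open(path, "rb") as f:
    try:
        obj = pickle.load(f)
        outcome = "loaded"
    except pickle.UnpicklingError as err:
        outcome = "UnpicklingError"
        print("Could not unpickle:", err)
```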
 You know that CSV files are delimited
files that store tabular data (data stored in rows and columns, as we see in spreadsheets or databases), where a comma delimits every value, i.e., the values are separated with commas.
• Since CSV files are text files, you can apply text file procedures on them and then split the values using the split() function, but there is a better way of handling CSV files, which is using the csv module of Python.
• The csv module of Python provides functionality to read and write tabular data in CSV format.
• It provides two specific types of objects, the reader and writer objects, to read from and write into CSV files.
Using csv.writer() ensures that data is written correctly to CSV files, handling newlines and special characters as needed. It creates a writer object associated with the opened file, which writes data into the CSV file.
import csv
OPENING / CLOSING CSV FILES
A CSV file is opened in the same way as you open any other text file, but make sure to specify the .csv extension. Follow the same modes as text files to open CSV files.
An open CSV file is closed in the same manner as you close any other file.
newline='': this parameter of open() ensures that newlines are handled correctly across different platforms. Without it, we might encounter extra blank lines when writing CSV files on Windows.
 Writing into CSV files involves the conversion of the user data into
the writable delimited form and then storing it in the form of
CSV file.
Functions | Description
csv.writer() | Returns a writer object which writes data into CSV files.
<writerobject>.writerow() | Writes one row of data onto the writer object.
<writerobject>.writerows() | Writes multiple rows of data onto the writer object.
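The three functions in the table can be combined in a minimal sketch. The file name, the column headings and the rows are assumptions for the example. The file is opened with newline='' as described earlier.

```python
import csv
import os
import tempfile

# Hypothetical scratch CSV file.
path = os.path.join(tempfile.mkdtemp(), "student.csv")

with open(path, "w", newline="") as fh:
    stuwriter = csv.writer(fh)                         # writer object tied to fh
    stuwriter.writerow(["Rollno", "Name", "Marks"])    # one row
    stuwriter.writerows([[1, "KRISH", 67],
                         [2, "JAI", 85]])              # several rows at once

# Read the raw text back without newline translation to see what was stored.
with open(path, "r", newline="") as fh:
    text = fh.read()

print(repr(text))
```

By default the writer ends each row with a carriage-return, newline combination, which is why newline='' is needed to avoid doubled line endings on Windows.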
Reading from a CSV file involves loading the CSV file's data, parsing it (i.e., removing its delimitation), loading it into a Python iterable and then reading from this iterable.
csv.reader() returns a reader object which loads data from a CSV file into an iterable after parsing the delimited data.
STEPS TO READ FROM A CSV FILE
1. Import the csv module.
2. Open the CSV file in a file-handle in read mode.
3. Create the reader object by using the syntax given below:
<reader-object> = csv.reader(<file-handle>, [delimiter=<delimiter character>])
Eg. Stureader = csv.reader(fh)
4. The reader object stores the parsed data in the form of an iterable, and thus you can fetch from it row by row through a traditional for loop, one row at a time.
5. Process the fetched single row of data as required.
6. Once done, close the file.
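The reading steps can be sketched as below. The file name and its rows are assumptions, and the file is first created so that the example is self-contained.

```python
import csv                                             # step 1
import os
import tempfile

# Hypothetical scratch CSV file, created here so there is something to read.
path = os.path.join(tempfile.mkdtemp(), "student.csv")
with open(path, "w", newline="") as fh:
    csv.writer(fh).writerows([["Rollno", "Name", "Marks"],
                              [1, "KRISH", 67],
                              [2, "JAI", 85]])

rows = []
with open(path, "r", newline="") as fh:                # step 2
    stureader = csv.reader(fh)                         # step 3
    for row in stureader:                              # step 4: one row at a time
        rows.append(row)                               # step 5: process the row
# step 6: the with block has closed the file

print(rows)
```

Every value comes back as a string, so numeric fields such as marks need converting with int() or float() before any arithmetic.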

FILE HANDLING COMPUTER SCIENCE -FILES.pptx
