INTRODUCTION TO PANDAS DATA STRUCTURE
Nagendra
Asstt. Professor
B. N. College (University of Delhi)
Install Pandas Library in Jupyter
Series
A Series is a one-dimensional array-like object containing a sequence
of values and an associated array of data labels, called its index.
The array representation and index object of a Series are available
via its values and index attributes
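As a minimal sketch of the two attributes, assuming pandas is installed and using made-up sample values (the name obj is only illustrative):

```python
import pandas as pd

# Illustrative values; with no index given, pandas uses the integers 0..N-1.
obj = pd.Series([4, 7, -5, 3])

values = obj.values  # NumPy array holding the data
index = obj.index    # index object holding the labels

print(values)
print(index)
```

With no explicit index, the labels are simply the positions of the values.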
Creating a Series with an index identifying each data
point with a label
We can use labels in the index when selecting single values
or a set of values
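A short sketch of a labelled Series and label-based selection; the labels and values are made up for illustration:

```python
import pandas as pd

# Each value gets an explicit label through the index argument.
obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

single = obj2["a"]              # one value, selected by its label
subset = obj2[["c", "a", "d"]]  # a list of labels returns a new Series

print(single)
print(subset)
```

Passing a list of labels returns the values in the order the labels were requested.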
Using NumPy functions or NumPy-like operations, such as
filtering with a boolean array, scalar multiplication or
applying math functions, will preserve the index-value link
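The three kinds of operation mentioned above might look like this; the Series contents are illustrative:

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

positive = obj2[obj2 > 0]    # boolean filtering keeps the matching labels
doubled = obj2 * 2           # scalar multiplication, labels unchanged
exponent = np.exp(obj2)      # NumPy function returns a Series with the same index

print(positive)
print(doubled)
print(exponent)
```

In each case the result is still a Series, and every value keeps the label it started with.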
A Series can be thought of as a fixed-length, ordered dict, as it is a
mapping of index values to data values
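To illustrate the dict analogy, a Series can be built directly from a dict and tested for membership like one. The state figures below are sample data only:

```python
import pandas as pd

# Sample data; the dict keys become the index labels.
sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
obj3 = pd.Series(sdata)

has_ohio = "Ohio" in obj3              # membership is tested against the index
has_california = "California" in obj3

print(obj3)
print(has_ohio, has_california)
```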
To check for missing values, use the isnull and notnull functions
(also available as isna and notna)
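A sketch of detecting missing data, assuming a Series built from a dict with one requested label that has no value (sample data only):

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
states = ["California", "Ohio", "Oregon", "Texas"]

# "California" is not a key in sdata, so its value is NaN (missing).
obj4 = pd.Series(sdata, index=states)

missing = pd.isna(obj4)   # True where a value is missing
present = obj4.notna()    # method form of the inverse check

print(missing)
print(present)
```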
A Series automatically aligns by index label in arithmetic
operations
A Series index can be altered in place by assignment
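Alignment can be seen by adding two Series whose labels only partly overlap. This is a sketch with sample data:

```python
import pandas as pd

obj3 = pd.Series({"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000})
obj4 = pd.Series({"Ohio": 35000, "Oregon": 16000, "Texas": 71000, "California": 1000})

# Values are matched by label, not by position.
total = obj3 + obj4

print(total)
```

Labels found in only one of the operands ("Utah" and "California" here) produce missing values in the result.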
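For instance, assigning a new list of labels replaces the whole index; the names below are illustrative:

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# The new index must have the same length as the Series.
obj.index = ["Bob", "Steve", "Jeff", "Ryan"]

print(obj)
```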
DataFrame
A DataFrame represents a rectangular table of data
and contains an ordered collection of columns, each of
which can be a different value type.
A DataFrame can easily be constructed from a dict of
equal-length lists or NumPy arrays
The head method selects only the first five rows.
A sequence of columns can be specified to arrange the columns in that order.
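A minimal sketch of construction, head, and column ordering, using sample data (the column names and figures are illustrative):

```python
import pandas as pd

# Sample data: each dict key becomes a column; all lists have the same length.
data = {
    "state": ["Ohio", "Ohio", "Ohio", "Nevada", "Nevada", "Nevada"],
    "year": [2000, 2001, 2002, 2001, 2002, 2003],
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}
frame = pd.DataFrame(data)

first_rows = frame.head()  # first five rows by default

# The columns argument fixes the column order.
reordered = pd.DataFrame(data, columns=["year", "state", "pop"])

print(first_rows)
print(reordered)
```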
If we pass a column that isn’t in the dict, it will appear with
missing values in the result.
A column in a DataFrame can be retrieved as a Series
either by dict-like notation or by attribute.
Rows can also be retrieved by name or by position with the
special loc (label-based) and iloc (position-based) attributes.
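The two points above can be sketched as follows; "debt" is a hypothetical column name that is deliberately absent from the sample dict:

```python
import pandas as pd

data = {
    "state": ["Ohio", "Ohio", "Nevada"],
    "year": [2000, 2001, 2001],
}

# "debt" is not a key in data, so the column is filled with NaN.
frame2 = pd.DataFrame(data, columns=["year", "state", "debt"])

state_col = frame2["state"]  # dict-like notation
year_col = frame2.year       # attribute notation

print(frame2)
print(state_col)
print(year_col)
```

Attribute access only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method, so the dict-like form is the safer default.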
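A sketch of row retrieval by name and by position, using the loc and iloc indexers; the row labels are made up for illustration:

```python
import pandas as pd

frame2 = pd.DataFrame(
    {"state": ["Ohio", "Ohio", "Nevada"], "year": [2000, 2001, 2001]},
    index=["one", "two", "three"],
)

row_by_label = frame2.loc["three"]  # select the row named "three"
row_by_position = frame2.iloc[2]    # select the third row (positions start at 0)

print(row_by_label)
print(row_by_position)
```

Both calls return the same row as a Series whose index is the DataFrame's column names.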
Columns can be modified by assignment.
When assigning lists or arrays to a column, the value’s length must
match the length of the DataFrame.
If we assign a Series, its labels will be realigned exactly to the
DataFrame’s index, inserting missing values in any holes.
The del keyword can be used to remove a column.
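The assignment rules above can be sketched like this; "debt" and the row labels are hypothetical names used only for illustration:

```python
import numpy as np
import pandas as pd

frame2 = pd.DataFrame(
    {"state": ["Ohio", "Ohio", "Nevada"], "year": [2000, 2001, 2001]},
    index=["one", "two", "three"],
)

# An array of matching length is assigned position by position.
frame2["debt"] = np.arange(3.0)

# A list of the wrong length is rejected.
length_error = None
try:
    frame2["debt"] = [1.0, 2.0]
except ValueError as exc:
    length_error = str(exc)

# A Series is aligned on labels: "two" matches, "four" is ignored,
# and rows without a matching label receive NaN.
val = pd.Series([-1.2, -1.5], index=["two", "four"])
frame2["debt"] = val

print(length_error)
print(frame2)
```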
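As an illustration, a boolean column (hypothetically named "eastern") is added and then removed again:

```python
import pandas as pd

frame2 = pd.DataFrame(
    {"state": ["Ohio", "Ohio", "Nevada"], "year": [2000, 2001, 2001]}
)

# Assigning to a name that does not exist yet creates a new column.
frame2["eastern"] = frame2["state"] == "Ohio"
columns_before = list(frame2.columns)

# del removes the column, as it would remove a key from a dict.
del frame2["eastern"]
columns_after = list(frame2.columns)

print(columns_before)
print(columns_after)
```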
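Another common form of data is a nested dict of dicts: the outer
keys become the columns and the inner keys become the row index.
The DataFrame can be transposed with syntax similar to that of a
NumPy array
The keys in the inner dicts are combined to form the index in the
result (older pandas versions also sort them).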
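A sketch of building a DataFrame from a nested dict and transposing it; the population figures are sample data:

```python
import pandas as pd

# Outer keys become columns; inner keys become row labels.
populations = {
    "Ohio": {2000: 1.5, 2001: 1.7, 2002: 3.6},
    "Nevada": {2001: 2.4, 2002: 2.9},
}
frame3 = pd.DataFrame(populations)

# .T swaps rows and columns, as with a NumPy array.
transposed = frame3.T

print(frame3)
print(transposed)
```

The row index is the union of the inner keys, so the year 2000 appears with a missing value for "Nevada". The order of the combined keys can differ between pandas versions, so it is safer not to rely on it.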
As with Series, the values attribute returns the data contained in the
DataFrame as a two-dimensional ndarray.
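For example, with a small all-numeric DataFrame (sample figures only):

```python
import pandas as pd

frame3 = pd.DataFrame({"Ohio": [1.5, 1.7, 3.6], "Nevada": [2.4, 2.9, 3.2]})

arr = frame3.values  # two-dimensional NumPy array, one row per DataFrame row

print(arr)
print(arr.shape)
```

Newer pandas documentation recommends the equivalent to_numpy() method for the same purpose.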
Index objects are responsible for axis labels and other metadata.
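A brief sketch of working with an Index object directly. It also shows one property not stated above: Index objects are immutable, so individual labels cannot be reassigned. The labels are illustrative:

```python
import pandas as pd

obj = pd.Series(range(3), index=["a", "b", "c"])
index = obj.index

tail = index[1:]  # slicing an Index returns another Index

# Index objects are immutable: item assignment raises TypeError.
error_message = None
try:
    index[1] = "d"
except TypeError as exc:
    error_message = str(exc)

print(tail)
print(error_message)
```

Immutability makes it safe to share one Index object between several data structures.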