Computer Vision
Have you ever wondered how your phone unlocks just by looking at your face?
Or how self-driving cars can ‘see’ the road?
Definition
Computer Vision is a branch of Artificial Intelligence that teaches computers to "see"
and understand images and videos, much as the human eyes and brain do.
Everyday Examples
Face Unlock on Your Phone
"When you unlock your phone using Face ID — it’s using computer vision to detect and recognize your
face."
Self-Driving Cars
"Tesla cars use computer vision to detect other vehicles, traffic lights, people, and road signs to drive
safely."
Social Media Filters
"Ever used an Instagram or Snapchat filter? It’s computer vision detecting your face and applying cool
effects."
Medical Imaging
"Doctors use computer vision to help detect diseases in X-rays, MRIs, and CT scans, which can make
diagnosis faster and more accurate."
We humans see colors, faces, and objects. But a computer sees something very
different. Do you know what an image looks like to a computer?
Grayscale Images:
In a black & white (grayscale) image, each pixel has just one value between 0 and 255 (for a standard 8-bit image).
● 0 means black
● 255 means white
● Values in between are shades of gray.
A grayscale image is a 2D matrix:
img[height][width]
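As a minimal sketch of this idea, the snippet below builds a tiny grayscale image by hand with NumPy. The 3x4 size and the pixel values are made up purely for illustration.

```python
import numpy as np

# A hypothetical 3x4 grayscale image: 3 rows (height), 4 columns (width).
# dtype uint8 holds integers from 0 to 255, one value per pixel.
gray = np.array(
    [
        [0, 64, 128, 255],
        [0, 64, 128, 255],
        [0, 64, 128, 255],
    ],
    dtype=np.uint8,
)

print(gray.shape)  # (3, 4) -> (height, width)
print(gray[0, 0])  # 0   -> black, top-left pixel
print(gray[0, 3])  # 255 -> white, top-right pixel
```

Indexing is row first, then column, so gray[y, x] is the pixel at height y and width x.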
Color (RGB) Images:
Color images are more complex. Each pixel is made up of 3 values — Red, Green, and Blue —
called channels.
● A color image is a 3D matrix:
img[height][width][3] (for R, G, B channels)
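A color image can be sketched the same way, with one extra axis for the channels. The 2x2 size and the chosen colors below are illustrative assumptions.

```python
import numpy as np

# A hypothetical 2x2 color image: height 2, width 2, 3 channels per pixel.
color = np.zeros((2, 2, 3), dtype=np.uint8)  # starts all black

# Following the R, G, B order described above, index 0 is the red channel.
color[0, 0] = [255, 0, 0]      # top-left pixel: pure red
color[1, 1] = [255, 255, 255]  # bottom-right pixel: white

print(color.shape)  # (2, 2, 3) -> (height, width, channels)
print(color[0, 0])  # the three channel values of the top-left pixel
```

OpenCV itself stores the channels in B, G, R order, so the same array displayed with cv2.imshow() would show the top-left pixel as blue rather than red.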
Task | What it answers | Output Format
Classification | What is in the image? | Label
Detection | What & where is it? | Label + Bounding Box
Segmentation | What is every pixel? | Label per pixel (mask)
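To make the three output formats concrete, here is a rough sketch of how each result might be represented in Python. The label names, the box coordinates, the (x, y, width, height) box convention, and the class ids are all illustrative assumptions, not the output of any particular model.

```python
import numpy as np

# Classification: one label for the whole image.
classification = "cat"

# Detection: a label plus a bounding box for each object found.
# Box convention assumed here: (x, y, width, height) in pixels.
detections = [
    {"label": "cat", "box": (40, 30, 120, 90)},
    {"label": "dog", "box": (200, 50, 100, 110)},
]

# Segmentation: one class id per pixel, stored as a mask with the
# same height and width as the image (here a made-up 4x6 image).
# Assumed class ids: 0 = background, 1 = cat.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 2:5] = 1  # mark a small region as "cat"

print(classification)
print(detections[0]["box"])
print(mask)
```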
Library/Tool | Purpose | Use Case
OpenCV | Traditional CV (image processing, filters, feature detection) | Reading/writing images, face detection, edge detection
NumPy | Matrix & array operations | Image = array → NumPy is essential
Matplotlib | Visualization (plot images, graphs) | Show image outputs, histograms
scikit-image | Image processing (high-level functions) | Filtering, morphology, segmentation
TensorFlow / Keras | Deep learning frameworks | Train CNNs for image classification/detection
PyTorch | Another deep learning framework | Used in research and production
MediaPipe | Real-time vision solutions | Hand, pose, face tracking (Google's library)
What is OpenCV?
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and image
processing library written in C/C++ with Python bindings.
OpenCV is like a toolbox. It gives us ready-made tools to read, edit, and analyze images and videos.
Feature | Description
📸 Read & Write | Load and save images easily
✂ Modify | Crop, resize, rotate, flip images
🧠 Analyze | Detect faces, objects, features
🎥 Video | Work with real-time video streams
💻 AI Models | Integrates with Deep Learning (DNN, YOLO, etc.)
Where is OpenCV Used?
● ✅ Face Detection in phones
● ✅ Object Tracking in surveillance
● ✅ Lane detection in self-driving cars
● ✅ Medical image processing
● ✅ Augmented Reality (Snapchat filters)
What is an image to a computer?
A photo, or a collection of numbers?
“Images are NumPy arrays.”
Code Block:
import cv2
image = cv2.imread('some_image.png')  # Reads the image (returns None if the file cannot be read)
print(type(image))                    # Shows the data type
Output: <class 'numpy.ndarray'>
cv2.imread() → loads the image as a NumPy array.
type(image) → tells us that it is a <class 'numpy.ndarray'>.
print(image.shape)
gives us the shape of the image, which means:
(height, width, number of color channels)
A grayscale array has only two entries: (height, width).
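Because a grayscale array has no channel axis, code that unpacks the shape needs to handle both cases. Here is a small sketch; describe() is a hypothetical helper written for this example, not an OpenCV function, and the 480x640 arrays are stand-ins for real images.

```python
import numpy as np

def describe(image):
    """Return (height, width, channels) for a grayscale or color array."""
    height, width = image.shape[:2]
    # A 2D array has no channel axis, so treat it as a single channel.
    channels = 1 if image.ndim == 2 else image.shape[2]
    return height, width, channels

color = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a color photo
gray = np.zeros((480, 640), dtype=np.uint8)      # stand-in for a grayscale image

print(describe(color))  # (480, 640, 3)
print(describe(gray))   # (480, 640, 1)
```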
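import cv2
image = cv2.imread('myphoto.png')  # Load the image as a NumPy array
cv2.imshow("Image", image)         # Display it in a window titled "Image"
cv2.waitKey(0)                     # Wait until any key is pressed
cv2.destroyAllWindows()            # Close the window
print(type(image))
print(image.shape)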
Line | Purpose
cv2.imread() | Read the image
cv2.imshow() | Display image in a new window
cv2.waitKey(0) | Keep window open until a key is pressed
print(type(image)) | Check variable type
print(image.shape) | Check size (height, width, channels)
cv2.destroyAllWindows() | Close all OpenCV windows that were opened using cv2.imshow()
cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Converts a color image (image) from BGR format to grayscale.
● cv2.cvtColor: a function in OpenCV that converts the color space of an image.
● cv2.COLOR_BGR2GRAY: a flag telling OpenCV to convert from BGR (Blue-Green-Red) to grayscale.
Bitwise AND
● Bitwise operations compare each corresponding pixel in the two images, bit by bit.
● For binary (black and white) images, the result pixel is white (255) only if both input pixels are white (255) at the same position.
● Otherwise, the result pixel is black (0).
So, the result will only be white where both images have white pixels at the same location.
img1 Pixel | img2 Pixel | Result (AND)
0 (black) | 0 (black) | 0 (black)
255 (white) | 0 (black) | 0 (black)
0 (black) | 255 (white) | 0 (black)
255 (white) | 255 (white) | 255 (white)
Bitwise OR
The bitwise OR operation compares corresponding pixels from both images and:
● Sets the result pixel to white (255) if either of the input pixels is white (255).
● Only returns black (0) if both input pixels are black.
The result is a union of both shapes.
img1 Pixel | img2 Pixel | Result (OR)
0 (black) | 0 (black) | 0 (black)
255 (white) | 0 (black) | 255 (white)
0 (black) | 255 (white) | 255 (white)
255 (white) | 255 (white) | 255 (white)
Bitwise XOR
XOR (exclusive OR) means:
● The result pixel is white (255) if only one of the input pixels is white.
● If both pixels are the same (either both black or both white), the result is black (0).
So the overlapping area is removed, and you're left with the non-overlapping parts of the rectangle
and the circle.
img1 Pixel | img2 Pixel | Result (XOR)
0 (black) | 0 (black) | 0 (black)
255 (white) | 0 (black) | 255 (white)
0 (black) | 255 (white) | 255 (white)
255 (white) | 255 (white) | 0 (black)
Bitwise NOT
NOT means inverting every pixel:
● If a pixel is black (0) → it becomes white (255).
● If a pixel is white (255) → it becomes black (0).

Computer Vision Introduction and Basic OpenCV.pdf
