Digital Image Processing
CS3EA14
Swati Vaidya
Unit II
Imaging Geometry, Digital Geometry, Image Acquisition Systems, Different types of digital images, Introduction to
Fourier Transform and DFT, Properties of 2D Fourier Transform, FFT, Separable Image Transforms, Walsh –
Hadamard, Discrete Cosine Transform, Haar, Slant – Karhunen – Loeve transforms.
Imaging Geometry
• Imaging geometry in digital image fundamentals refers to the study of the relationship
between the image plane and the object being imaged. It involves understanding
the position, orientation, and size of the object in the image, as well as imaging
system parameters such as focal length and sensor size.
• Imaging geometry plays a crucial role in determining the quality of the digital
image. If the imaging geometry is not properly calibrated or aligned, it can lead to
distortion, blurring, or misalignment in the image. This can result in a loss of
detail, reduced accuracy in measurements, and overall degradation of image quality.
• The key parameters of imaging geometry include focal length, sensor size, image
resolution, field of view, and camera position. Imaging geometry is widely used
in various applications of digital image processing. Some common applications
include image rectification and registration, 3D reconstruction, object recognition
and tracking, augmented reality, and computer vision.
Imaging Geometry includes-
1. Imaging geometry
2. Geometrical operations
3. Geometrical coordinates
4. Geometrical operation- Translation
5. Geometrical operation- Scaling
6. Geometrical operation- Rotation
7. Image formation process
More specifically, this covers:
1. Angular arrangement
2. Position of camera, position of object
3. Plane of image capturing by camera
4. Line of object and camera alignment
5. Latitudinal and longitudinal arrangement in the xyz plane
6. Camera and imaging object coordinates
Geometrical Operations in Imaging
• Translation: Moving an image from one location to another.
• Rotation: Rotating an image around a point.
• Scaling: Changing the size of an image.
Geometrical Coordinates
• Geometrical (positional) coordinates come in two forms:
• Real World Coordinates-
- (X, Y, Z)
• Digital Coordinates-
- 2D: (x, y)
- 3D: (x, y, z)
Translation
• Translation displaces an image by a certain number of pixels along the x and y axes.
This operation shifts every pixel in the image to a new position while maintaining the
shape and size of the image.
• When the object (or the camera) moves from A to B, a displacement (Xo, Yo, Zo)
occurs, so a point (X, Y, Z) moves to the new position (X*, Y*, Z*):
X* = X + Xo,  Y* = Y + Yo,  Z* = Z + Zo
• In linear (matrix) form, using homogeneous coordinates:
[X*]   [1 0 0 Xo] [X]
[Y*] = [0 1 0 Yo] [Y]
[Z*]   [0 0 1 Zo] [Z]
[1 ]   [0 0 0 1 ] [1]
Scaling
Scaling makes an image larger or smaller. It enlarges or reduces the image proportionately to its
original size. Most scaling methods preserve the aspect ratio, but general scaling changes the
dimensions along different axes independently.
Rotation
Rotation turns an image about the origin or the image center by a given angle, changing the
orientation of the image according to the angle that has been set.
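To make these three operations concrete, here is a minimal NumPy sketch using the standard
homogeneous-coordinate matrices; the sample point and function names are illustrative, not from the slides.

```python
import numpy as np

def translation(tx, ty):
    # shift by (tx, ty)
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def scaling(sx, sy):
    # scale by sx along x and sy along y
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def rotation(theta):
    # rotate by theta radians about the origin
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

p = np.array([2.0, 3.0, 1.0])          # pixel (2, 3) in homogeneous form
print(translation(5, -1) @ p)          # -> [7. 2. 1.]
print(scaling(2, 2) @ p)               # -> [4. 6. 1.]
print(rotation(np.pi / 2) @ p)         # -> approx [-3. 2. 1.] (90 deg about origin)
```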
Image Formation Process
In the image formation process, the camera projects a 3-D world point (X, Y, Z) onto the 2-D
image plane. Under the standard pinhole-camera model, the image coordinates are x = fX/Z and
y = fY/Z, where f is the focal length.
Digital geometry
• Digital geometry is the study of digitized models and images of objects in 2D or
3D Euclidean space.
• Digital geometry in digital image processing involves the study of shapes,
structures, and spatial relationships within digital images.
• It’s about understanding how to represent and analyze geometric properties in a
digital format.
• It's a field that uses mathematical methods to extract geometric information from
digital pictures.
• Digital geometry is used in many applications, including image analysis and
computer graphics.
Digital Geometry
• In the context of digital image processing, "Digital Geometry" focuses on
the geometric properties and structures of images represented in a digital
form. This typically includes:
• 1. Pixel Representation
• Pixels: In digital image processing, images are represented as a grid of
pixels, each with its own color or intensity value. These pixels are the
fundamental discrete elements in digital geometry.
• Resolution: The resolution of an image, determined by the number of
pixels in the grid, directly influences the accuracy of geometric
representations.
• 2. Geometric Transformations
• Translation, Rotation, and Scaling: Digital geometry involves applying geometric
transformations to images, such as shifting (translation), rotating, or resizing (scaling) the
image.
• Affine and Projective Transformations: More complex transformations like affine
(preserving lines and parallelism) and projective (preserving straight lines) are used for
tasks such as image alignment and perspective correction.
• 3. Shape Analysis and Recognition
• Boundary Detection: Identifying the edges or boundaries of objects within an image is a
key task. Techniques like edge detection (using Sobel, Canny algorithms, etc.) are
employed to delineate the geometric structure of objects.
• Morphological Operations: Operations such as dilation, erosion, opening, and closing
modify the geometric structure of objects in binary images to enhance or suppress certain
features.
• Object Recognition: Involves identifying and classifying objects based on their
geometric features. For example, recognizing a circle or a rectangle based on its shape
and size. Recognizing and classifying objects based on their geometric properties (shape,
size, orientation) is a fundamental aspect of image processing.
Dilation- Dilation is a morphological operation in image processing that expands the
boundaries of objects within a binary image. By examining a neighborhood around each
pixel, if any pixel in this area is "on" (white), the central pixel is also turned "on," causing
objects to grow larger. This process is useful for filling in gaps, connecting disjointed
elements, and enhancing object features, making it an important step in preparing images
for further analysis.
Erosion- Erosion, on the other hand, shrinks the boundaries of objects in a binary image. It
works by turning a pixel "off" (black) if any of its neighbors are "off," thereby reducing the
size of objects. Erosion is particularly useful for removing small noise, separating close
objects, and simplifying object shapes. It often follows dilation in image processing tasks to
refine the results and prepare the image for further processing.
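A minimal sketch of these two operations using SciPy's binary_dilation and binary_erosion; the
tiny test image and the 3x3 structuring element are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

img = np.array([[0, 0, 0, 0, 0],
                [0, 1, 1, 0, 0],
                [0, 1, 1, 0, 0],
                [0, 0, 0, 0, 0]], dtype=bool)
se = np.ones((3, 3), dtype=bool)   # 3x3 structuring element

# Dilation: the 2x2 object grows by one pixel in every direction.
print(binary_dilation(img, structure=se).astype(int))
# Erosion: no pixel has a fully "on" 3x3 neighborhood, so the object vanishes.
print(binary_erosion(img, structure=se).astype(int))
```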
4. Spatial Relationships: Analyzing how different objects or features in an image are
positioned relative to each other. This can include calculating distances, angles, and other
spatial metrics.
Image Acquisition Systems
Image Acquisition using a single sensor:
Example of a single sensor is a photodiode.
Now to obtain a two-dimensional image
using a single sensor, the motion should be
in both x and y directions.
Rotation provides motion in one direction.
Linear motion provides motion in the
perpendicular direction.
Image Acquisition using a line sensor (sensor strips):
The sensor strip provides imaging in one direction.
Motion perpendicular to the strip provides imaging in other direction.
Image Acquisition using an array sensor:
In this, individual sensors are arranged in the form of a 2-D array. This type of arrangement is found in
digital cameras. e.g. CCD array
In this, the response of each sensor is proportional to the integral of the light energy projected onto the
surface of the sensor. Noise reduction is achieved by letting the sensor integrate the input light signal over
minutes or even hours.
Advantage: Since sensor array is 2D, a complete image can be obtained by focusing the energy pattern onto
the surface of the array.
Since the sensor array is coincident with the focal plane, it
produces an output proportional to the integral of the light
received at each sensor.
Digital and analog circuitry sweep these outputs and
convert them to a video signal which is then digitized
by another section of the imaging system. The output
is a digital image.
Different Types of Digital Images
•There are three types of images, as follows:
•1. Binary Images
•This is the simplest type of image. It takes only two values, i.e., black and white, or 0 and 1.
A binary image is a 1-bit image: only 1 binary digit is needed to represent each pixel. Binary
images are mostly used to capture general shape or outline information.
•For example: Optical Character Recognition (OCR).
•Binary images are generated using a threshold operation: pixels above the threshold value are
turned white ('1'), and pixels below it are turned black ('0'), as in the sketch below.
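A minimal sketch of the threshold operation in NumPy; the sample array and the threshold T = 128
are illustrative.

```python
import numpy as np

# a small stand-in for an 8-bit grayscale image
gray = np.array([[ 12, 200,  90],
                 [180,  30, 240],
                 [ 60, 130,  15]], dtype=np.uint8)

T = 128                                  # threshold value
binary = (gray > T).astype(np.uint8)     # 1 = white, 0 = black
print(binary)
# [[0 1 0]
#  [1 0 1]
#  [0 1 0]]
```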
2. GRAY-SCALE IMAGES
Grayscale images are monochrome images, meaning they have only one colour channel.
Grayscale images do not contain any colour information; each pixel holds a single
intensity value drawn from the available grey levels.
A typical grayscale image uses 8 bits/pixel, which gives 256 different grey
levels. In medical imaging and astronomy, 12- or 16-bit/pixel images are used.
3. COLOUR IMAGES
Colour images are three-band monochrome images in which each band corresponds to a different
colour; the actual information stored in the digital image is the grey-level information in
each spectral band.
The images are represented as red, green and blue (RGB) images. Each colour image has 24
bits/pixel, i.e., 8 bits for each of the three colour bands (RGB).
8-BIT COLOR FORMAT
8-bit color is used for storing image information in a computer's memory or in an image file.
In this format, each pixel is represented by one 8-bit byte, giving a 0-255 range of values, in
which 0 represents black, 255 white, and 127 gray. The 8-bit color format is also
known as a grayscale image. Initially, it was used by the UNIX operating system.
16-BIT COLOR FORMAT
The 16-bit color format is also known as high color format. It has 65,536 different color
shades. It is used in the system developed by Microsoft. The 16-bit color format is further
divided into three formats which are Red, Green, and Blue also known as RGB format.
In RGB format, there are 5 bits for R, 6 bits for G, and 5 bits for B. The additional bit is given
to green because the human eye is more sensitive to green than to red or blue.
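As a hedged illustration of this 5-6-5 layout, the sketch below packs 8-bit R, G, B values into
one 16-bit word; the function name is hypothetical.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B values into a 16-bit RGB565 word (5-6-5 layout)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

print(hex(pack_rgb565(255, 255, 255)))  # 0xffff (white)
print(hex(pack_rgb565(255, 0, 0)))      # 0xf800 (pure red)
```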
24-BIT COLOR FORMAT
The 24-bit color format is also known as the true color format. It is likewise distributed among
Red, Green, and Blue. Since 24 divides evenly into three 8-bit groups, the bits are distributed
equally: 8 bits for R, 8 bits for G, and 8 bits for B.
What is Transform? Why is it required?
• Informally, Transformation is any way of changing something.
• In Mathematics, transformations are often used to move an object
from a place where it is hard to work with it to a place where it is
simpler.
• For some people moving the object is equivalent to choosing a
new way to view the object.
• An image transform can be applied to an image to convert it
from one domain to another.
• Viewing an image in domains such as frequency domain enables
the identification of features that may not be easily detected in
the spatial domain.
• Transform is basically a mathematical tool, which allows us to
move from one domain to another.
• Image transforms are useful for computing convolution and
correlation.
• Not all transforms give frequency-domain information. Most
image transforms, such as the Fourier transform, DCT, and wavelet
transform, reveal the frequency content of an image and allow us
to extract more relevant information.
What is Fourier Transform?
• Converts a time-domain signal into a frequency-domain signal.
Introduction to Fourier Transform and DFT
• The Fourier Transform is an important image processing tool which is used
to decompose an image into its sine and cosine components. The output of
the transformation represents the image in the Fourier or
frequency domain, while the input image is the spatial domain equivalent.
• In the Fourier domain image, each point represents a particular frequency
contained in the spatial domain image.
• The Fourier Transform is used in a wide range of applications, such as
image analysis, image filtering, image reconstruction and image
compression.
Fourier Transform of an Image
• The Fourier Transform is an important image processing tool which
is used to decompose an image into its sine and cosine
components.
• As we are only concerned with digital images, we will restrict this
discussion to the Discrete Fourier Transform (DFT).
• For a square image of size NxN, the two-dimensional DFT is given
by:
F(k, l) = Σ (m=0 to N−1) Σ (n=0 to N−1) f(m, n) · e^(−j2π(km + ln)/N),  k, l = 0, 1, ..., N−1
where f(m, n) is the image in the spatial domain.
Inverse Fourier Transform
Converts a signal from the frequency domain back to the time (spatial) domain. Under the common
convention that places the 1/N² normalization on the inverse, the inverse 2D DFT is:
f(m, n) = (1/N²) Σ (k=0 to N−1) Σ (l=0 to N−1) F(k, l) · e^(j2π(km + ln)/N)
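A minimal NumPy sketch of the forward/inverse pair: the 2D DFT of an image followed by the
inverse DFT recovers the original (the random array stands in for a grayscale image).

```python
import numpy as np

img = np.random.rand(8, 8)        # stand-in for a grayscale image
F = np.fft.fft2(img)              # forward 2D DFT
img_back = np.fft.ifft2(F).real   # inverse 2D DFT recovers the image
assert np.allclose(img, img_back)
```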
What information does the Fourier Transform of an image give?
Each point of the Fourier-domain image tells how strongly a particular spatial frequency is
present: low frequencies (near the centre of a shifted spectrum) correspond to smooth regions,
while high frequencies correspond to edges and fine detail.
PROPERTIES OF FOURIER TRANSFORM
LINEARITY:
Linearity means that adding two functions corresponds to adding their frequency spectra. If we
multiply a function by a constant, the Fourier transform of the resulting function is multiplied by
the same constant, and the Fourier transform of the sum of two or more functions is the sum of the
Fourier transforms of the functions.
Case I.
If h(x) -> H(f) then ah(x) -> aH(f)
Case II.
If h(x) -> H(f) and g(x) -> G(f) then h(x)+g(x) -> H(f)+G(f)
SCALING:
Scaling changes the range of the independent variable. If we stretch a function by a factor a in the
time domain, its Fourier transform is squeezed by the same factor in the frequency domain.
If f(t) -> F(w) then f(at) -> (1/|a|)F(w/a)
DIFFERENTIATION:
Differentiating a function with respect to time multiplies its Fourier transform by jw.
If f(t) -> F(w) then f'(t) -> jwF(w)
CONVOLUTION:
The Fourier transform of the convolution of two functions is the point-wise product of their
respective Fourier transforms.
If f(t) -> F(w) and g(t) -> G(w)
then (f*g)(t) -> F(w)G(w), where * denotes convolution.
FREQUENCY SHIFT:
There is a duality between the time and frequency domains: shifting the spectrum in frequency
corresponds to multiplying the time-domain signal by a complex exponential.
If f(t) -> F(w) then f(t)exp[jw't] -> F(w-w')
TIME SHIFT:
A shift in the time variable also affects the frequency function. The time-shifting property states
that a linear displacement in time corresponds to a linear phase factor in the frequency domain.
If f(t) -> F(w) then f(t-t') -> F(w)exp[-jwt']
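A small NumPy check of the convolution property for the DFT (for which convolution is circular);
the signals are random test vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
f, g = rng.random(8), rng.random(8)

# DFT of a circular convolution equals the point-wise product of the DFTs
conv_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

# direct circular convolution for comparison
conv_direct = np.array([sum(f[m] * g[(n - m) % 8] for m in range(8))
                        for n in range(8)])
assert np.allclose(conv_fft, conv_direct)
```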
FAST FOURIER TRANSFORM
It is an algorithm which plays a very important role in the computation of the Discrete Fourier
Transform of a sequence. It converts a space- or time-domain signal to a frequency-domain signal.
The DFT decomposes a sequence of values into components at different frequencies.
Computing the DFT directly from its definition is too expensive, so the Fast Fourier Transform is
used: it computes the same result rapidly by factorizing the DFT matrix into a product of sparse factors.
As a result, it reduces the DFT computation complexity from O(N²) to O(N log N), which is a huge
difference when working on a large dataset. FFT algorithms are also more accurate than evaluating
the DFT definition directly in the presence of round-off error.
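A minimal sketch contrasting the O(N²) definition with NumPy's FFT; naive_dft is an illustrative
helper, not a library function.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of the DFT definition."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # N x N DFT matrix
    return W @ x

x = np.random.rand(256)
assert np.allclose(naive_dft(x), np.fft.fft(x))    # same result, far more work
```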
SEPARABLE IMAGE TRANSFORMS
What is a Separable Transform?
A transform is said to be separable if a two-dimensional transformation can be expressed as two successive one-
dimensional transformations—one along the rows and one along the columns of the image matrix. For example,
instead of directly applying a 2D operation on the entire image at once, you can apply the transform row-wise first
and then column-wise (or vice versa).
Benefits of Separable Transforms
Reduced Computational Cost:
A direct 2D operation generally requires O(N²) computations per output sample, where N is the image
size. If the operation is separable, it can be broken down into two 1D operations along the rows and
columns, each requiring O(N) computations, which reduces the cost to O(2N) per sample.
Memory Efficiency:
Instead of holding large intermediate results in memory during 2D transforms, separable transforms allow for
intermediate processing with smaller memory footprints.
COMMON SEPARABLE TRANSFORMS
FOURIER TRANSFORM (2D DFT):
The 2D Discrete Fourier Transform (DFT) can be computed using separability. You first apply a 1D
Fourier transform to each row of the image, followed by applying the 1D Fourier transform to each
column of the resulting matrix.
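A short NumPy check of this row-column decomposition.

```python
import numpy as np

img = np.random.rand(32, 32)
rows = np.fft.fft(img, axis=1)    # 1D DFT of each row
full = np.fft.fft(rows, axis=0)   # then 1D DFT of each column
assert np.allclose(full, np.fft.fft2(img))
```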
DISCRETE COSINE TRANSFORM (DCT):
The DCT is widely used in image compression (e.g., JPEG). The 2D DCT can be computed by performing a 1D DCT
along each row, followed by a 1D DCT along each column.
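A similar check for the DCT, assuming SciPy's scipy.fft.dct/dctn are available.

```python
import numpy as np
from scipy.fft import dct, dctn

img = np.random.rand(16, 16)
sep = dct(dct(img, axis=1, norm='ortho'), axis=0, norm='ortho')  # rows, then columns
assert np.allclose(sep, dctn(img, norm='ortho'))                 # matches the direct 2D DCT
```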
MATHEMATICAL EXPRESSION OF SEPARABLE FILTERS
A 2D filter or transform H(x, y) is separable if it can be written as:
H(x, y) = H1(x) · H2(y)
where H1(x) is the filter applied along the rows, and H2(y) is the filter applied along the
columns.
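A minimal sketch of a separable 2D filter built as an outer product of a 1D kernel; applying the
1D kernel along rows then columns matches the full 2D convolution (zero-padded boundaries assumed).

```python
import numpy as np
from scipy.signal import convolve2d

g = np.array([1.0, 2.0, 1.0]) / 4.0     # 1D smoothing kernel
K = np.outer(g, g)                      # separable 2D kernel: K = g * g^T
img = np.random.rand(64, 64)

full = convolve2d(img, K, mode='same')  # direct 2D convolution
rows = np.apply_along_axis(np.convolve, 1, img, g, mode='same')   # filter rows
sep = np.apply_along_axis(np.convolve, 0, rows, g, mode='same')   # then columns
assert np.allclose(full, sep)
```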
APPLICATIONS OF SEPARABLE TRANSFORMS
IMAGE COMPRESSION:
Efficient compression algorithms like JPEG leverage separable transforms like DCT.
IMAGE FILTERING:
Gaussian blurring and other image smoothing techniques use separable filters to speed up the process.
FEATURE EXTRACTION:
Edge detection using Sobel or Prewitt operators can benefit from separable implementations to extract image features
more quickly.
FOURIER ANALYSIS:
Frequency domain analysis and manipulation using the 2D DFT in separable form.
Classification of image transforms
Image transforms can be broadly classified into fixed-basis orthogonal transforms (Fourier, DCT,
Walsh–Hadamard, Haar, Slant) and data-dependent transforms such as the Karhunen–Loève transform.
Discrete Cosine Transform
• Image Compression: An image is stored or transmitted as pixel values. It can
be compressed by reducing the number of bits needed to represent each pixel. Image
compression is basically of two types:
• Lossless compression: In this type of compression, the recovered image is
exactly the same as it was before the compression technique was applied,
so its quality is not reduced.
• Lossy compression: In this type of compression, the recovered data is not exactly
the original, so some image quality is lost. But this type of compression achieves
very high compression of the image data and is very useful for transmitting images
over a network.
• The Discrete Cosine Transform is used in lossy image compression
because it has very strong energy compaction: a large amount of the
information is concentrated in the low-frequency components of the
signal, while the remaining frequencies carry very little data and
can be stored using very few bits (usually at most 2 or 3 bits).
• To perform the DCT on an image, we first fetch the image data
(pixel values as integers in the range 0 - 255), divide it into
blocks of 8 x 8, and then apply the discrete cosine transform to
each block of data.
• After applying the discrete cosine transform, more than 90% of the
data will be in the low-frequency components. For simplicity, take
an 8 x 8 matrix with every value equal to 255 (a completely white
image) and perform the 2-D discrete cosine transform on it to
observe the output, as in the sketch below.
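A sketch of that all-255 example, assuming SciPy's orthonormal dctn: only the DC
(lowest-frequency) coefficient is nonzero.

```python
import numpy as np
from scipy.fft import dctn

block = 255.0 * np.ones((8, 8))        # completely white 8x8 block
coeffs = dctn(block, norm='ortho')

print(round(coeffs[0, 0], 1))          # 2040.0 -- all energy in the DC term
print(np.abs(coeffs[1:, :]).max(), np.abs(coeffs[0, 1:]).max())   # ~0 everywhere else
```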
• DCT is used in the JPEG image compression algorithm.
• The input image is divided into 8-by-8 or 16-by-16 blocks, and
the two-dimensional DCT is computed for each block.
• The DCT coefficients are then quantized, coded, and
transmitted.
• The JPEG receiver (or JPEG file reader) decodes the quantized
DCT coefficients, computes the inverse two-dimensional DCT of
each block, and then puts the blocks back together into a single
image.
• For typical images, many of the DCT coefficients have values
close to zero.
• These coefficients can be discarded without seriously affecting
the quality of the reconstructed image.
• The discrete cosine transform (DCT) represents an image as a sum
of sinusoids of varying magnitudes and frequencies.
• The dct2 function (in MATLAB) computes the two-dimensional discrete cosine
transform (DCT) of an image.
• The DCT has the property that, for a typical image, most of the
visually significant information about the image is concentrated in
just a few coefficients of the DCT.
• For this reason, the DCT is often used in image compression
applications. For example, the DCT is at the heart of the
international standard lossy image compression algorithm known as
JPEG.
The two-dimensional DCT of an M-by-N matrix A is defined as follows:
B(p, q) = a(p) a(q) Σ (m=0 to M−1) Σ (n=0 to N−1) A(m, n) cos[π(2m+1)p / 2M] cos[π(2n+1)q / 2N],
for 0 ≤ p ≤ M−1 and 0 ≤ q ≤ N−1, where
a(p) = 1/√M for p = 0 and √(2/M) otherwise; a(q) = 1/√N for q = 0 and √(2/N) otherwise.
The values B(p, q) are called the DCT coefficients of A.
The DCT is an invertible transform, and its inverse is given by
A(m, n) = Σ (p=0 to M−1) Σ (q=0 to N−1) a(p) a(q) B(p, q) cos[π(2m+1)p / 2M] cos[π(2n+1)q / 2N].
Hadamard Transform
The Walsh–Hadamard transform is an orthogonal transform whose basis values are only +1 and −1, so
it needs no multiplications. Its matrix is built recursively: H(1) = [1], and H(2N) is formed by
stacking [H(N) H(N); H(N) −H(N)] (with an optional 1/√N normalization).
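A minimal sketch of an unnormalized fast Walsh–Hadamard transform; the function name and test
vector are illustrative.

```python
import numpy as np

def fwht(x):
    """In-place fast Walsh-Hadamard transform (length must be a power of 2)."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly: sum and difference
        h *= 2
    return x

print(fwht([1, 0, 1, 0]))   # [2. 2. 0. 0.]
```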
Mathematical Definition of Haar Transform:
The Haar transform recursively splits a signal into pairwise averages (the low-frequency
approximation) and pairwise half-differences (the high-frequency detail), as the steps below show.
Haar Transform Steps (1D Example):
•Input Data: Start with a 1D signal, such as [a1, a2, a3, a4].
For example:
Input: [4, 6, 10, 12]
•Averaging: For each pair of adjacent values, compute their average (low-frequency
component).
For example:
Average: [(4+6)/2, (10+12)/2] = [5, 11]
•Differencing: Compute the half-difference of each pair (high-frequency component).
For example:
Difference: [(4−6)/2, (10−12)/2] = [−1, −1]
•Decomposition: Repeat these steps recursively on the resulting averages to build a
hierarchical structure of the data.
For example:
Repeating on [5, 11] gives average [8] and difference [−3], so the full decomposition
is [8, −3, −1, −1].
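A minimal sketch of this averaging/differencing recursion in NumPy; the output ordering (overall
average first, then details from coarsest to finest) is one common convention.

```python
import numpy as np

def haar_1d(x):
    """Unnormalized 1D Haar decomposition via pairwise averages/differences."""
    x = np.asarray(x, dtype=float)
    details = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / 2.0    # low-frequency approximation
        diff = (x[0::2] - x[1::2]) / 2.0   # high-frequency detail
        details.insert(0, diff)            # coarser details go first
        x = avg
    return np.concatenate([x] + details)

print(haar_1d([4, 6, 10, 12]))   # [ 8. -3. -1. -1.]
```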
APPLICATIONS OF HAAR TRANSFORM
•Image Compression: The Haar transform can compress images by transforming
them into a wavelet domain where small wavelet coefficients (less important
details) can be discarded.
•Signal Processing: Used for edge detection and feature extraction in signals.
•Data Analysis: Provides multi-resolution analysis useful for time-frequency
analysis of non-stationary signals.
SLANT – KARHUNEN – LOEVE
TRANSFORMS
1. Slant Transform
• The Slant Transform is a simple and efficient orthogonal transform
primarily used for image processing. It is designed to approximate the
Karhunen–Loève Transform (KLT) for signals with linear trends or
slopes. The Slant Transform captures the overall linear structure of an
image or signal, making it useful for processing signals with sharp edges
or rapid changes.
KEY FEATURES OF THE SLANT TRANSFORM
•Fast computation: It is more computationally efficient compared to the KLT.
•Handles slants: Good for detecting linear structures, such as edges in images.
•Orthogonal Transform: Like many transforms, it is an orthogonal transform,
meaning it preserves energy and can be inverted.
MATHEMATICAL REPRESENTATION:
• The Slant Transform matrix S is designed with specific rows that capture
constant signals and others that capture linear or slanted patterns. The matrix is
of size N×N, where N is the number of data points in the signal.
For a 1D signal x, the transform is:
X = S · x
where S is the Slant Transform matrix, and x is the signal vector.
For 2D (images), the Slant Transform is applied in two dimensions:
B = S · A · S^T
where A is the image matrix and S^T is the transpose of S.
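A sketch using the standard order-4 Slant matrix from the literature; it verifies orthonormality
and the forward/inverse 2D transform on a toy matrix.

```python
import numpy as np

r5 = np.sqrt(5.0)
S4 = 0.5 * np.array([
    [1,      1,      1,      1    ],
    [3/r5,   1/r5,  -1/r5,  -3/r5 ],   # "slant" basis vector: a linear ramp
    [1,     -1,     -1,      1    ],
    [1/r5,  -3/r5,   3/r5,  -1/r5 ],
])
assert np.allclose(S4 @ S4.T, np.eye(4))   # orthonormal rows

A = np.arange(16.0).reshape(4, 4)          # toy 4x4 "image"
B = S4 @ A @ S4.T                          # forward 2D Slant transform
A_rec = S4.T @ B @ S4                      # inverse via the transpose
assert np.allclose(A, A_rec)
```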
APPLICATIONS OF SLANT
• Image compression: It is used in image compression algorithms due
to its efficiency in handling linear structures in images.
• Edge detection: Particularly useful for detecting and enhancing
linear features in an image.
2. KARHUNEN–LOÈVE TRANSFORM (KLT)
• The Karhunen–Loève Transform (KLT) is an optimal transform for data compression and
dimensionality reduction. It is also known as the Principal Component Analysis (PCA) in
the context of statistics. The KLT seeks to decorrelate data, meaning it transforms a set of
possibly correlated variables into a set of uncorrelated variables (principal components).
• Key Features:
• Optimal decorrelation: The KLT minimizes the mean-square error (MSE) when
reconstructing the data after compression, making it the most efficient transform for
decorrelating data.
• Data-dependent: The KLT matrix is computed from the eigenvectors of the covariance
matrix of the data, so the transform is adapted to the statistics of the data.
• Energy compaction: The KLT compacts most of the signal's energy into a few coefficients,
which is useful for data compression.
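A minimal sketch of the KLT computed from the covariance eigen-decomposition; the synthetic
correlated data is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2)) @ np.array([[2.0, 1.0],
                                              [0.0, 0.5]])   # correlated 2D data

Xc = X - X.mean(axis=0)                  # center the data
C = np.cov(Xc, rowvar=False)             # covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)     # eigen-decomposition (ascending order)
K = eigvecs[:, ::-1].T                   # KLT matrix: rows = eigenvectors, largest variance first

Y = Xc @ K.T                             # transform coefficients
print(np.round(np.cov(Y, rowvar=False), 6))   # (near-)diagonal: the data is decorrelated
```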
APPLICATIONS
• Data compression: KLT is used in applications like image
and video compression, where reducing the dimensionality
of data without significant loss of information is important.
• Feature extraction: In machine learning and signal
processing, KLT/PCA is used to reduce the dimensionality of
datasets while retaining the most important variance.
• Denoising: It can remove noise from signals by transforming
data into a domain where noise is less prominent and then
filtering out noisy components.
COMPARISON:
Feature         | Slant Transform                                      | Karhunen–Loève Transform (KLT)
Efficiency      | Computationally efficient                            | Computationally expensive (requires eigen-decomposition)
Data-Dependence | Fixed transform matrix (not data-dependent)          | Data-dependent (requires covariance matrix computation)
Optimality      | Sub-optimal compared to KLT                          | Optimal decorrelation and energy compaction
Applications    | Image processing, edge detection, simple compression | Data compression, dimensionality reduction, feature extraction
Complexity      | Simple to implement                                  | More complex to compute (especially for large datasets)
THANK YOU!
