Scatter plots are one of the most fundamental tools for visualizing relationships between two numerical variables. matplotlib.pyplot.scatter() plots points on a Cartesian plane defined by X and Y coordinates. Each point represents a data observation, allowing us to visually analyze how two variables correlate, cluster or distribute.
For example:
Python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([12, 45, 7, 32, 89, 54, 23, 67, 14, 91])
y = np.array([99, 31, 72, 56, 19, 88, 43, 61, 35, 77])

plt.scatter(x, y)
plt.title("Basic Scatter Plot")
plt.xlabel("X Values")
plt.ylabel("Y Values")
plt.show()
Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: plt.scatter(x, y) draws one point for each (x, y) pair on a 2D plane, which makes the relationship between the two variables visible. plt.title(), plt.xlabel() and plt.ylabel() add a title and axis labels for context.
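Syntax
matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, alpha=None, edgecolors=None, label=None)
This is an abbreviated signature. The function accepts further keyword arguments, such as linewidths, which is used in Example 2.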
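Parameters:
Parameter     Description
x, y          Sequences of data points to plot (must be the same length)
s             Marker size in points squared (scalar or array-like)
c             Marker color: a single color, a sequence of colors, or a sequence of numbers mapped to colors through cmap
marker        Shape of the marker (for example 'o', 's' or '^')
cmap          Colormap for mapping numeric values to colors (used only when c is numeric)
alpha         Transparency (0 = fully transparent, 1 = fully opaque)
edgecolors    Color of marker edges
label         Legend label for the dataset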
Returns: This function returns a PathCollection object representing the scatter plot points. This object can be used to further customize the plot or to update it dynamically.
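As a rough illustration of what the returned object allows, the sketch below keeps the PathCollection in a variable and changes the plot after it has been created. The variable name sc and the data values are made up for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, used only for illustration
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 1, 5, 3])

# scatter() returns a PathCollection; keep a reference to it
sc = plt.scatter(x, y, s=40, color='blue')

# Read back the point coordinates as an (N, 2) array
offsets = sc.get_offsets()

# Update the existing points instead of drawing a new plot
sc.set_sizes([120])                              # one size applied to every marker
sc.set_facecolor('orange')                       # new fill color
sc.set_offsets(np.column_stack((x, y[::-1])))    # same x values, y values reversed

plt.title("Updating a Scatter Plot through PathCollection")
plt.show()
```

The same setter methods can be called repeatedly, for instance inside a loop, to update a plot without recreating it.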
Examples
Example 1: In this example, we compare the height and weight of two different groups using different colors for each group.
Python
x1 = np.array([160, 165, 170, 175, 180, 185, 190, 195, 200, 205])
y1 = np.array([55, 58, 60, 62, 64, 66, 68, 70, 72, 74])
x2 = np.array([150, 155, 160, 165, 170, 175, 180, 195, 200, 205])
y2 = np.array([50, 52, 54, 56, 58, 64, 66, 68, 70, 72])

plt.scatter(x1, y1, color='blue', label='Group 1')
plt.scatter(x2, y2, color='red', label='Group 2')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title('Comparison of Height vs Weight between two groups')
plt.legend()
plt.show()
Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: The NumPy arrays x1, y1 and x2, y2 hold the height and weight data of the two groups. The first plt.scatter() call plots Group 1 in blue and the second plots Group 2 in red. The label argument of each call supplies the text that plt.legend() displays. The x-axis and y-axis are labeled "Height (cm)" and "Weight (kg)" for clarity.
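When there are more than two groups, repeating the plt.scatter() call by hand becomes tedious. The sketch below shows one possible way to loop over the groups instead. The dictionary layout (group name mapped to heights, weights and a color) is an assumption made for this illustration, and the plot is built through an Axes object rather than through pyplot functions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical layout: group name -> (heights, weights, color)
groups = {
    'Group 1': (np.array([160, 165, 170, 175, 180, 185, 190, 195, 200, 205]),
                np.array([55, 58, 60, 62, 64, 66, 68, 70, 72, 74]),
                'blue'),
    'Group 2': (np.array([150, 155, 160, 165, 170, 175, 180, 195, 200, 205]),
                np.array([50, 52, 54, 56, 58, 64, 66, 68, 70, 72]),
                'red'),
}

fig, ax = plt.subplots()

# One scatter() call per group; each call adds its own legend entry
for name, (heights, weights, color) in groups.items():
    ax.scatter(heights, weights, color=color, label=name)

ax.set_xlabel('Height (cm)')
ax.set_ylabel('Weight (kg)')
ax.set_title('Comparison of Height vs Weight between two groups')
ax.legend()
plt.show()
```

Adding a third group then only requires another dictionary entry.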
Example 2: This example demonstrates how to customize a scatter plot using different marker sizes and colors for each point. Transparency and edge colors are also adjusted.
Python
x = np.array([3, 12, 9, 20, 5, 18, 22, 11, 27, 16])
y = np.array([95, 55, 63, 77, 89, 50, 41, 70, 58, 83])
a = [20, 50, 100, 200, 500, 1000, 60, 90, 150, 300]  # marker sizes, one per point
b = ['red', 'green', 'blue', 'purple', 'orange', 'black', 'pink', 'brown', 'yellow', 'cyan']  # colors, one per point

plt.scatter(x, y, s=a, c=b, alpha=0.6, edgecolors='w', linewidths=1)
plt.title("Scatter Plot with Varying Colors and Sizes")
plt.show()
Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: The NumPy arrays x and y set the point coordinates, the list a gives each point its own marker size and the list b gives each point its own color. plt.scatter() draws the points with alpha=0.6 (60% opacity) and white edges of width 1. A title is added before the plot is displayed.
Example 3: This example shows how to create a bubble plot where the size of each point (bubble) represents a variable's magnitude. Edge color and alpha transparency are also used.
Python
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
sizes = [30, 80, 150, 200, 300]  # Bubble sizes

plt.scatter(x, y, s=sizes, alpha=0.5, edgecolors='blue', linewidths=2)
plt.title("Bubble Plot Example")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: The lists x and y define the point coordinates, while sizes sets the marker (bubble) sizes. plt.scatter() plots the bubbles with 50% transparency (alpha=0.5), blue edges and an edge width of 2. Axis labels and a title are added before the plot is displayed.
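In the example above the bubble sizes are written out by hand. In practice they are usually derived from a data column. The sketch below shows one simple way to do that: a linear rescaling of the values into a chosen range of marker sizes. The helper scale_to_sizes, its default range of 30 to 300 and the magnitude values are all hypothetical and chosen only for illustration. Other scalings may suit other data better.

```python
import matplotlib.pyplot as plt
import numpy as np

def scale_to_sizes(values, min_size=30, max_size=300):
    """Linearly map values to marker sizes between min_size and max_size."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: give every point the same mid-range size
        return np.full(values.shape, (min_size + max_size) / 2)
    return min_size + (values - values.min()) / span * (max_size - min_size)

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
magnitude = [10, 40, 25, 100, 55]   # made-up third variable

# The smallest value gets min_size, the largest gets max_size
sizes = scale_to_sizes(magnitude)

plt.scatter(x, y, s=sizes, alpha=0.5, edgecolors='blue', linewidths=2)
plt.title("Bubble Plot with Sizes Derived from Data")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
```

Because s is measured in points squared, this mapping makes the marker area, not the marker diameter, grow linearly with the data value.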
Example 4: In this example, we map data values to colors using a colormap and add a colorbar. This helps in visualizing a third variable via color intensity.
Python
x = np.random.randint(50, 150, 100)
y = np.random.randint(50, 150, 100)
colors = np.random.rand(100)  # Random float values for color mapping
sizes = 20 * np.random.randint(10, 100, 100)

plt.scatter(x, y, c=colors, s=sizes, cmap='viridis', alpha=0.7)
plt.colorbar(label='Color scale')
plt.title("Scatter Plot with Colormap and Colorbar")
plt.show()
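Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: x and y are arrays of 100 random integers between 50 and 149, so the plot differs on each run. The values in colors are mapped to colors through the 'viridis' colormap, and sizes gives each point a different marker size. plt.scatter() draws the points with alpha=0.7 (70% opacity) and plt.colorbar() adds a color scale that shows how values map to colors.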
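By default the color scale stretches from the smallest to the largest value in c, so the same value can get a different color in two different plots. The sketch below pins the scale to a fixed range with the vmin and vmax arguments of scatter(). The seed, the variable names and the range 0 to 1 are choices made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)      # seeded so the plot is reproducible
x = rng.integers(50, 150, 100)
y = rng.integers(50, 150, 100)
values = rng.random(100)            # third variable, floats in [0, 1)

# vmin/vmax fix the color scale to 0..1 instead of the data's own min and max
sc = plt.scatter(x, y, c=values, cmap='viridis', vmin=0.0, vmax=1.0, alpha=0.7)

# Passing the PathCollection makes explicit which plot the colorbar describes
plt.colorbar(sc, label='Value')

plt.title("Scatter Plot with a Fixed Color Scale")
plt.show()
```

With a fixed range, colors remain comparable across several plots that share the same colormap.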
Example 5: This final example illustrates how to change the marker style using the marker parameter. Here, triangle markers are used with magenta color.
Python
# x and y are the random arrays defined in Example 4
plt.scatter(x, y, marker='^', color='magenta', s=100, alpha=0.7)
plt.title("Scatter Plot with Triangle Markers")
plt.show()
Output
[Figure: Using matplotlib.pyplot.scatter()]
Explanation: This code plots the points as upward-pointing triangles ('^') in magenta, with size 100 and alpha=0.7 (70% opacity). A title is added before the plot is displayed.
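To compare marker styles side by side, the following sketch draws one row of points per marker. The selection of five markers and the row layout are arbitrary choices for this illustration. Matplotlib supports many more marker codes than the ones listed here.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)

# A small selection of marker codes with a short description of each
markers = {
    'o': 'circle',
    's': 'square',
    '^': 'triangle up',
    'D': 'diamond',
    '*': 'star',
}

fig, ax = plt.subplots()

# Draw each marker style on its own horizontal row
for row, symbol in enumerate(markers):
    ax.scatter(x, np.full_like(x, row), marker=symbol, s=100)

# Label each row with the marker description
ax.set_yticks(range(len(markers)))
ax.set_yticklabels(list(markers.values()))
ax.set_title("Common Marker Styles")
plt.show()
```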