Skip to content

Axes3D.plot_surface: Allow masked arrays and NaN values#20725

Merged
QuLogic merged 6 commits into
matplotlib:masterfrom
tomneep:polygon-nan-removal
Aug 12, 2021
Merged

Axes3D.plot_surface: Allow masked arrays and NaN values#20725
QuLogic merged 6 commits into
matplotlib:masterfrom
tomneep:polygon-nan-removal

Conversation

@tomneep

@tomneep tomneep commented Jul 23, 2021

Copy link
Copy Markdown
Contributor

PR Summary

This is my third attempt at solving the issues with using masked arrays in Axes3D.plot_surface, previous attempts are #18114 and #20646. Hopefully it is third time lucky.

I got the impression that the feedback on #18114 was reasonably positive but this approach solves more issues and also behaves like contour with corner_mask=True which was a preference expressed by @ianthomas23.

The suggestion from #20646 was that it would be better to deal with NaNs before we get to Poly3DCollection.do_3d_projection which is called every time we adjust an interactive figure.

I've gone "all in" on this PR and removed the warning about not supporting NaNs and added masked array support, if you prefer to take this step-by-step e.g. removing the masked array part or keeping the NaN warning, then let me know, and I'll happily change.

There are no tests yet, once I've had some feedback on this I'll add some based on the issues that this closes.

Examples of the issues that this PR solves (along with the code to make them) are below. I didn't add the example from #8222 which loads a large dataset but the before and after are visually the same as the examples shown in #20646.

Examples before this PR

plot_surface_master

Examples after this PR

plot_surface_fixes

Code for examples

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(
    nrows=2,
    ncols=2,
    figsize=(10, 10),
    subplot_kw={"projection": "3d"},
    constrained_layout=True,
)

# 487
x, y = np.mgrid[1:10:1, 1:10:1]
z = x ** 3 + y ** 3 - 500
z = np.ma.masked_array(z, z < 0)

plot_opts = dict(rstride=1, cstride=1, linewidth=0, cmap="inferno")
axes[0, 0].plot_surface(x, y, z, **plot_opts)
axes[0, 0].view_init(35, -90)
axes[0, 0].set(title="Issue #487")

# 4941
x, y = np.mgrid[-1:1:0.05, -1:1:0.05]
z = (x ** 2 + y ** 2) + 0.01
z[10:20, 10:20] = 0.0

masked_array = np.ma.masked_equal(z, 0.0)
palette = plt.get_cmap("gray").copy()
palette.set_over("g", 1.0)
palette.set_under("b", 1.0)
palette.set_bad("r", 1.0)

plot_opts = dict(rstride=1, cstride=1, cmap=palette, vmin=-1, vmax=1)
axes[0, 1].plot_surface(x, y, masked_array, **plot_opts)
axes[0, 1].view_init(35, -90)
axes[0, 1].set(title="Issue #4941")

# 12395
z = (1 - y / x).clip(min=-5, max=5)
# place NaNs at the discontinuity
pos = np.where(np.abs(np.diff(z, axis=0)) >= 5.0)
z[pos] = np.nan

plot_opts = dict(rstride=1, cstride=1, cmap="coolwarm", antialiased=False)
axes[1, 0].plot_surface(x, y, z, **plot_opts)
axes[1, 0].set(title="Issue #12395")

#16470
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y = [1, 2, 3, 4, 5, 6, 7, 8]

x, y = np.meshgrid(x, y)
nan = np.nan
matrix = np.array(
    [
        [nan, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [nan, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 3.0, 2.0, 1.0, 1.0],
        [nan, nan, 4.0, 5.0, 6.0, 8.0, 6.0, 5.0, 4.0, 3.0, nan],
        [nan, nan, 7.0, 8.0, 11.0, 12.0, 11.0, 8.0, 7.0, nan, nan],
        [nan, nan, 8.0, 9.0, 10.0, 16.0, 10.0, 9.0, 10.0, 7.0, nan],
        [nan, nan, nan, 12.0, 16.0, 20.0, 16.0, 12.0, 11.0, nan, nan],
        [nan, nan, nan, nan, 22.0, 24.0, 22.0, 20.0, 18.0, nan, nan],
        [nan, nan, nan, nan, nan, 28.0, 26.0, 25.0, nan, nan, nan],
    ]
)
z = np.ma.masked_invalid(matrix)
normC = matplotlib.colors.Normalize(vmax=z.max(), vmin=z.min())
colors = plt.get_cmap("plasma")(normC(z))
axes[1, 1].plot_surface(x, y, z, facecolors=colors)
axes[1, 1].view_init(30, -80)
axes[1, 1].set(title="Issue #16470")

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

@jklymak jklymak mentioned this pull request Jul 23, 2021
7 tasks

@timhoffm timhoffm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly worried that we are throwing away data and not storing the original inputs in the PolyCollection. Do we have precedent for that?

Nevertheless I'm tempted to accept the solution as it is much simpler and faster than any later data cleaning. Maybe we should add a note on the data cleaning in the docstring.

Comment thread lib/mpl_toolkits/mplot3d/axes3d.py
@tomneep

tomneep commented Jul 26, 2021

Copy link
Copy Markdown
Contributor Author

I'm slightly worried that we are throwing away data and not storing the original inputs in the PolyCollection. Do we have precedent for that?

I don't know, hopefully someone else does. The one thought I had was to see what is done in contour/contourf. Here it looks like the masked data are not saved in the collections (but perhaps I am missing something)? Would this be an equivalent case?

import matplotlib.pyplot as plt
import numpy as np

x, y = np.mgrid[-10:10.1:2, -10:10.1:2]
z = np.ma.masked_less(x ** 2 + y ** 2, 50)

fig, ax = plt.subplots(figsize=(6, 6))
contours = ax.contourf(x, y, z)

segs = np.concatenate([np.concatenate(s) for s in contours.allsegs if s]).T
paths = np.concatenate(
    [i.vertices for paths in contours.collections for i in paths.get_paths()]
).T

ax.plot(segs[0], segs[1], "rs")
ax.plot(paths[0], paths[1], ".b")

gives
contour_test

@jklymak

jklymak commented Jul 27, 2021

Copy link
Copy Markdown
Member

The user threw the data away when they NaN-ed or masked it. I don't think we are going to offer a way to unmask it after the surface is made, so I don't think there is harm in removing it. But maybe I'm missing a subtlety in the data->polygon pipeline. I assume if the user changes x, y, Z in the data then the polygons are regenerated anyway?

@tomneep

tomneep commented Jul 27, 2021

Copy link
Copy Markdown
Contributor Author

Sorry @jklymak I don't understand your question. What kind of changing of x, y and Z do you have in mind?

@jklymak

jklymak commented Jul 27, 2021

Copy link
Copy Markdown
Member

via set_data or whatever the 3-d equivalent is...

@tomneep

tomneep commented Jul 27, 2021

Copy link
Copy Markdown
Contributor Author

I'm not sure what the equivalent would be. You could get the paths with get_paths and perhaps modify those, but it would be tricky. Maybe there is some easy method that I am missing.

The original x, y and z don't get passed to the Poly3DCollection. I suppose I could imagine a new collection that computed the polygons from the x, y and z and would recalculate the polygons if set_data was called, but that isn't how it currently works. Is that what you had in mind?

@jklymak

jklymak commented Jul 27, 2021

Copy link
Copy Markdown
Member

Sorry if I'm taking you off on a tangent - many of our methods save a representation of the original data. If this has no such representation, then there really is no harm done in throwing out masked data because the user can't modify it post-facto anyways.

@tomneep

tomneep commented Jul 27, 2021

Copy link
Copy Markdown
Contributor Author

The Poly3DCollection is made directly from the vertices of the polygons and doesn't see the original data. It isn't just the masking because there are the stride/count options in plot_surface that (can) also throw data away before the polygons are made. So I think we are ok on that front?

@jklymak jklymak added this to the v3.5.0 milestone Jul 28, 2021
@jklymak jklymak added the API: consistency Consistency of the matplotlib API, including naming, behavior, defaults, … label Jul 28, 2021
@jklymak

jklymak commented Jul 28, 2021

Copy link
Copy Markdown
Member

@eric-wieser @WeatherGod @ianthomas23 do any of you have time to comment on this solution to the masking problem? It seems reasonable to me but I don't know if I've ever used plot_surface in matplotlib ;-) Thanks!

@eric-wieser

eric-wieser commented Jul 29, 2021

Copy link
Copy Markdown
Contributor

Can you add a test showing how this behaves with striding? For instance, if a facet with stride=4 has all the . entries masked in this pattern, does matplotlib draw a diamond or two triangles?

++...
+....
.....
....+
...++

@tomneep

tomneep commented Jul 29, 2021

Copy link
Copy Markdown
Contributor Author

@eric-wieser Here is the example you request (for before and after this PR). It is a bit hard to show here but from the side angle (third plot) it doesn't look right, neither before and after this PR. Is this what you wanted to see?

Example without PR

stride_tests_master

Example with PR

stride_tests_pr

Code for plots

Details
import matplotlib.pyplot as plt
import numpy as np

x, y = np.mgrid[-2:2.1:1, -2:2.1:1]
z = np.ma.masked_less(x * y, 2)

fig, (ax1, ax2, ax3) = plt.subplots(
    figsize=(9, 3), ncols=3, subplot_kw={"projection": "3d"}
)
ax1.plot_surface(x, y, z)
ax1.set_title("Default strides")
ax1.view_init(60, -45)

surf = ax2.plot_surface(x, y, z, rstride=4, cstride=4)
ax2.set_title("r/cstride = 4")
ax2.view_init(60, -45)

surf = ax3.plot_surface(x, y, z, rstride=4, cstride=4)
ax3.set_title("r/cstride = 4")
ax3.view_init(2, 30)

@eric-wieser

Copy link
Copy Markdown
Contributor

Thanks, that does show what I was expecting to see. I don't think the side view is interesting, but I think the middle plot might be worth adding a test for in case someone decides they want different behavior later. Using np.mgrid[-6:6.1:1, -6:6.1:1] but leaving the stride at 4 would probably produce a slightly more insightful plot for deciding whether the current behavior is sensible.

@tomneep

tomneep commented Jul 29, 2021

Copy link
Copy Markdown
Contributor Author

Thanks, I've added a test with the range you suggest. Indeed the plot looks a bit more interesting than my previous comment, with a double arrowhead/bowtie arrangement. Does this look ok to you?

strides test

@eric-wieser

Copy link
Copy Markdown
Contributor

Yes, that looks like a good test case to include, thanks! My question (possibly for a future PR) would be whether it makes sense to split that middle polygon in two, into two triangles - but it's not immediately clear to me how such an approach would generalize.

@tomneep

tomneep commented Jul 29, 2021

Copy link
Copy Markdown
Contributor Author

Yes I had the same thought after the first example (whether the polygon could be split into pieces to help the side view). I'm not sure how simple it is, so I think as you say it might be a question for a future PR.

@jklymak

jklymak commented Jul 29, 2021

Copy link
Copy Markdown
Member

I'm not sure we should be worried about striding effects - if you decimate your data you should expect aliasing, and if you decimate a mask it is luck of the draw where on the mask your stride will fall. If you need more control, it really isn't hard to stride your own data.

@eric-wieser

Copy link
Copy Markdown
Contributor

I'd dispute that the claim that it's easy to stride your own data - this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

That's not to say you're wrong in saying that the weirdness in the plot above is unimportant - I think it's fine to merge with the current behavior, and was only pushing for a test to ensure we were aware of the quirk.

@jklymak

jklymak commented Jul 29, 2021

Copy link
Copy Markdown
Member

this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

probably off topic, but that is confusing. The docs say rstride, cstride : Downsampling stride in each direction. Given that, I'm not sure what "full resolution along the gridlines and discarding data in the cells" means.

@ianthomas23 ianthomas23 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a couple of CI failures, but I've rerun them and they now pass.

As my opinion was requested, I've taken a look at this PR and am happy with the output. But I am not a frequent user of mplot3d.

@eric-wieser

eric-wieser commented Aug 4, 2021

Copy link
Copy Markdown
Contributor

this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

probably off topic, but that is confusing. The docs say rstride, cstride : Downsampling stride in each direction. Given that, I'm not sure what "full resolution along the gridlines and discarding data in the cells" means.

I mean that rstride=cstride=3 samples the input data as

#######
#  #  #
#  #  #
#######
#  #  #
#  #  #
#######

Not

#  #  #


#  #  #


#  #  #

(Where every character is an entry in the input data array, and every # is a point that gets plotted)

@jklymak

jklymak commented Aug 4, 2021

Copy link
Copy Markdown
Member

So is the problem that the docs are wrong or my understanding of the term "stride" is wrong?

@eric-wieser

Copy link
Copy Markdown
Contributor

The docs are using "stride" in a vague sense, so they're not wrong per se, but certainly easy to misread. The reading I have of them that is consistent with the implementation is "Split the surface into row and column gridlines spaced rstride and cstride entries apart, but draw the perimeters of the resulting grid squares without downsampling". I suppose this phrasing in the docs is actively misleading:

If the input data is larger, it will be downsampled (by slicing) to these numbers of points.

I think the current behavior of the striding (ignoring this PR) is the right behavior, but the docs don't describe it well.

@QuLogic

QuLogic commented Aug 12, 2021

Copy link
Copy Markdown
Member

It appears this is good to go, and the striding discussion is orthogonal, so I'm going to merge this.

@QuLogic QuLogic merged commit 3277b9d into matplotlib:master Aug 12, 2021
@tomneep tomneep deleted the polygon-nan-removal branch August 19, 2021 08:41
LemonBoy added a commit to LemonBoy/matplotlib that referenced this pull request Jul 4, 2023
Extend the logic introduced by matplotlib#20725 for NaN values to inf values,
avoiding the introduction of infinite vertices in the mesh that wreak
havoc when performing the z-sorting.

Sadly I don't have a minimal test case for this problem.
LemonBoy added a commit to LemonBoy/matplotlib that referenced this pull request Jul 31, 2023
Extend the logic introduced by matplotlib#20725 for NaN values to inf values,
avoiding the introduction of infinite vertices in the mesh that wreak
havoc when performing the z-sorting.

Sadly I don't have a minimal test case for this problem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API: consistency Consistency of the matplotlib API, including naming, behavior, defaults, … topic: mplot3d

Projects

None yet

6 participants