
Allow for nested subdirectories in DefaultDataset #570

Open
kasparas-k wants to merge 1 commit into Pointcept:main from kasparas-k:support_nested_dataset

@kasparas-k
Contributor

This PR adds support for nested subdirectory structures while retaining the original behavior for data splits stored in .json files.

This relies on the assumption that every point cloud must contain at least its coord asset (otherwise it is not a point cloud). Every point cloud's assets (normals, color, etc.) in a split directory (train, val, etc.) can therefore be located by finding the parent directory of each coord asset.
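The discovery rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `find_point_cloud_dirs` and the `coord_name` parameter are hypothetical, and the coord asset is assumed to be stored as `coord.npy`, as in the example tree in this description.

```python
from pathlib import Path


def find_point_cloud_dirs(split_dir, coord_name="coord.npy"):
    """Return the directories under split_dir that hold a coord asset.

    Hypothetical helper: every directory containing `coord_name`, at any
    nesting depth, is treated as one point cloud.
    """
    split_dir = Path(split_dir)
    # rglob walks the tree recursively; the set guards against duplicates.
    return sorted({coord.parent for coord in split_dir.rglob(coord_name)})
```

Directories that contain no coord asset themselves (such as a grouping folder) are never returned; only the directories that actually hold the asset are.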

Given the following train split directory structure:

train
|_pc1
|   |_coord.npy
|   |_color.npy
|_pc2
|   |_coord.npy
|   |_color.npy
|_extra
    |_pc3
        |_coord.npy
        |_color.npy

When the data split is specified as a subdirectory rather than as a .json file:

Old behavior: pc1 and pc2 are included in the training set; pc3 inside extra is skipped.
New behavior: pc1, pc2, and pc3 are all included in the training set.
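To make the difference concrete, here is a small self-contained sketch that rebuilds the example tree in a temporary directory and compares a one-level listing with a recursive one. It is an approximation under stated assumptions, not the actual `DefaultDataset` code: `build_example_split`, `list_flat`, and `list_nested` are hypothetical names, and the old behavior is modelled simply as "immediate children that contain `coord.npy`".

```python
import glob
import os
import tempfile


def build_example_split(root):
    """Create the example train split with empty placeholder asset files."""
    for rel in ("pc1", "pc2", os.path.join("extra", "pc3")):
        pc_dir = os.path.join(root, "train", rel)
        os.makedirs(pc_dir)
        for asset in ("coord.npy", "color.npy"):
            # Empty files stand in for real arrays; only the layout matters.
            open(os.path.join(pc_dir, asset), "wb").close()
    return os.path.join(root, "train")


def list_flat(split_dir):
    """Assumed model of the old behavior: look one level deep only."""
    hits = glob.glob(os.path.join(split_dir, "*", "coord.npy"))
    return sorted(os.path.relpath(os.path.dirname(p), split_dir) for p in hits)


def list_nested(split_dir):
    """Sketch of the new behavior: find coord assets at any depth."""
    pattern = os.path.join(split_dir, "**", "coord.npy")
    hits = glob.glob(pattern, recursive=True)
    return sorted(os.path.relpath(os.path.dirname(p), split_dir) for p in hits)


with tempfile.TemporaryDirectory() as root:
    split = build_example_split(root)
    flat = list_flat(split)
    nested = list_nested(split)
    print(flat)    # pc1 and pc2 only
    print(nested)  # pc1, pc2, and the nested extra/pc3
```

The recursive pattern relies on `glob.glob(..., recursive=True)`, where `**` matches zero or more directory levels.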

This is convenient for datasets that change frequently, since repeatedly updating a .json file is tedious and error-prone. I have used this modification for months without issues, so I'd like to contribute it to the community.

@kasparas-k kasparas-k force-pushed the support_nested_dataset branch from f4d8202 to 7dcb75f Compare April 2, 2026 09:11
@kasparas-k kasparas-k force-pushed the support_nested_dataset branch from 7dcb75f to b51216a Compare April 2, 2026 09:12