Rupiah Banknotes Dataset

Short description: A curated image dataset of Indonesian Rupiah banknotes (Rp1,000 – Rp100,000) in JPG format, photographed by M. Kaspul Anwar, Muhammad Lutfan, and Andri Rahmadani, for Machine Learning and Computer Vision research and experiments.

🔍 What is this?

This repository contains a labeled collection of images of Indonesian banknotes organized by denomination. All images are in JPG format and were photographed by our dataset team. The dataset was prepared for tasks such as image classification, object detection, recognition, and other computer vision experiments using deep learning or classical ML.

Denominations included: 1k, 2k, 5k, 10k, 20k, 50k, 100k, and a mix folder for miscellaneous or combined images.

Primary goals:

Provide a simple, well-structured dataset for academic research and prototyping.
Make replication and benchmarking of currency-recognition models easier.
Encourage responsible use and clarity about dataset provenance and license.

✅ Use cases

Image classification: recognize which denomination appears in an image.
Object detection: locate banknotes in complex scenes.
Segmentation / OCR: crop and extract text/serial numbers (if available and permitted).
Transfer learning & benchmarking: experiments with CNNs, MobileNet, ResNet, EfficientNet, etc.
Education: teaching CV pipelines end-to-end from data to model to deployment.

📁 Dataset structure (recommended)

rupiah-banknotes-dataset/
├── README.md
├── LICENSE
├── data/
│   ├── 1k/            # JPG images of Rp1,000
│   ├── 2k/            # JPG images of Rp2,000
│   ├── 5k/
│   ├── 10k/
│   ├── 20k/
│   ├── 50k/
│   ├── 100k/
│   └── mix/           # mixed/crowd JPG images or images containing multiple notes
├── scripts/
│   ├── split_dataset.py    # create train/val/test splits
│   └── stats.py            # compute per-class counts & basic stats
└── examples/
    ├── pytorch_example.ipynb
    └── tensorflow_example.ipynb

File formats & naming:

Images are stored in JPG format.
Filenames should be unique and, optionally, encode metadata (e.g. 20k_studio_001.jpg).
Avoid embedding personal data or location metadata in filenames.

⚙️ How to use this dataset

1) Clone the repository

git clone https://github.com/<your-username>/rupiah-banknotes-dataset.git
cd rupiah-banknotes-dataset

If the dataset contains large files, consider hosting the images on a release, Zenodo, or cloud storage (Google Drive, S3) and keeping the repo lightweight. Use Git LFS for versioning large binaries if you intend to push images to GitHub.

2) Install common Python dependencies

python -m venv venv && source venv/bin/activate    # linux/mac
# windows: python -m venv venv && venv\Scripts\activate
pip install --upgrade pip
pip install numpy pillow matplotlib torchvision tensorflow

3) Load data with PyTorch (ImageFolder)

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
])

data = datasets.ImageFolder('data/train', transform=transform)
dataloader = DataLoader(data, batch_size=32, shuffle=True)

# class_to_idx: {'1k':0, '2k':1, ...}
print(data.class_to_idx)

4) Load data with TensorFlow (tf.data)

import tensorflow as tf

dataset = tf.keras.utils.image_dataset_from_directory(
    'data/train',
    image_size=(224,224),
    batch_size=32,
)

for images, labels in dataset.take(1):
    print(images.shape, labels.shape)

5) Creating train / val / test splits

Use scripts/split_dataset.py (example):

# simple script idea (not full code):
# - iterate through each class folder
# - shuffle file list
# - move 70%->train, 15%->val, 15%->test

(See scripts/split_dataset.py for a ready-to-run implementation.)

🧰 Example: quick preprocessing tips

Resize to a fixed size (e.g., 224×224) or keep aspect ratio and pad.
Normalize using ImageNet mean/std if using pre-trained models.
Data augmentation: random crop, horizontal flip (careful: flipping may alter features), brightness/contrast jitter, rotation.
Class imbalance: use class weights, oversampling, or augmentation for under-represented denominations.

📊 Dataset statistics

Include a script (scripts/stats.py) that computes:

number of images per class
total image count
average resolution

Note: If you share the dataset publicly, include an up-to-date table of per-class counts here.

⚖️ License & Terms

This dataset is released under the MIT License. By using the dataset you agree to follow the terms in LICENSE.

If you prefer a different license for datasets (e.g., CC0 / CC BY 4.0), replace the LICENSE file accordingly. The README includes the MIT choice as the default for code and scripts in the repo.

🧾 Citation

If you use this dataset in academic work, please cite the repository. Suggested citation:

@misc{rupiah-banknotes-dataset,
  author = {M. Kaspul Anwar and Muhammad Lutfan and Andri Rahmadani},
  title = {Rupiah Banknotes Dataset},
  year = {2025},
  howpublished = {\url{https://github.com/<your-username>/rupiah-banknotes-dataset}},
}

Replace the URL with your repository link.

🛡️ Ethics & Responsible Use

Please follow these rules when using the dataset:

Legality: Ensure your use complies with local and national laws concerning currency images. Some jurisdictions limit reproduction of banknote images — check legal constraints before redistribution or commercial use.
No counterfeiting: This dataset must not be used for counterfeiting, fraud, or any illegal financial activity. Models trained on this dataset should not be used to produce realistic images or printed copies of currency.
Privacy: If any photo contains identifiable people, personal possessions, or private property, obtain consent or blur faces before sharing publicly. Do not include personal data or private information.
Attribution: If you publish results using this dataset, give proper credit to this repository (see Citation section above).
Sensitive uses: Avoid deploying models in high-stakes settings (e.g., automated financial systems) without thorough testing and safeguards.
Respect licenses: If you combine this dataset with other datasets, respect their licenses.

🤝 Contributing

Contributions are welcome! Please:

Open an issue to discuss major changes.
Fork the repo and submit a pull request for fixes or additions.
Add clear metadata when you contribute new images (e.g., acquisition conditions, device, date).

When contributing images, ensure you have the right to share them and they comply with the ethics above.

✉️ Contact

For questions, issues, or requests, please open an issue in this repository or contact the maintainer listed on the GitHub profile.

📌 Acknowledgements

Dataset images were photographed by M. Kaspul Anwar, Muhammad Lutfan, and Andri Rahmadani. Thank you to everyone who helps collect, clean, and validate the images. If you used third-party resources to host data (e.g., Zenodo, Google Drive, S3), mention them here.

Last updated: 2025-08-29

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
idr_1000		idr_1000
idr_10000		idr_10000
idr_100000		idr_100000
idr_2000		idr_2000
idr_20000		idr_20000
idr_5000		idr_5000
idr_50000		idr_50000
idr_mix		idr_mix
test		test
.gitattributes		.gitattributes
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Rupiah Banknotes Dataset

🔍 What is this?

✅ Use cases

📁 Dataset structure (recommended)

⚙️ How to use this dataset

1) Clone the repository

2) Install common Python dependencies

3) Load data with PyTorch (ImageFolder)

4) Load data with TensorFlow (tf.data)

5) Creating train / val / test splits

🧰 Example: quick preprocessing tips

📊 Dataset statistics

⚖️ License & Terms

🧾 Citation

🛡️ Ethics & Responsible Use

🤝 Contributing

✉️ Contact

📌 Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Rupiah Banknotes Dataset

🔍 What is this?

✅ Use cases

📁 Dataset structure (recommended)

⚙️ How to use this dataset

1) Clone the repository

2) Install common Python dependencies

3) Load data with PyTorch (ImageFolder)

4) Load data with TensorFlow (tf.data)

5) Creating train / val / test splits

🧰 Example: quick preprocessing tips

📊 Dataset statistics

⚖️ License & Terms

🧾 Citation

🛡️ Ethics & Responsible Use

🤝 Contributing

✉️ Contact

📌 Acknowledgements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages