Hi @tianyizhou,
Niels here from the open-source team at Hugging Face. I discovered your work through AK's daily papers: https://huggingface.co/papers/2410.13674. The paper page lets people discuss the paper and discover its artifacts (such as models, datasets, and demos in the form of a 🤗 Space).
Your recent work on Diffusion Curriculum (DisCL) is very interesting! It would be great to make the ImageNet-LT and iWildCam datasets, augmented with your DisCL synthetic data, available on the 🤗 Hub to improve their discoverability and visibility. We can add tags to help people find them when searching https://huggingface.co/datasets.
We've noticed that the ImageNet-LT meta-information is currently hosted on Google Drive and the iWildCam dataset is available via its official GitHub repository. Migrating these to the Hugging Face Hub would offer several advantages:
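Better discoverability: Researchers can easily find your augmented datasets using the Hugging Face search functionality and filters.
Improved reproducibility: Others can directly load your datasets using the datasets library: from datasets import load_dataset; dataset = load_dataset("your-hf-org-or-username/your-dataset")
Dataset viewer: Users can quickly explore the first few rows of your data in a browser using our dataset viewer.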
After uploading, we can link the datasets to your paper page (read more here: https://huggingface.co/docs/hub/en/datasets-viewer) to further increase visibility. This will also greatly increase the impact of your work.
Would you be interested in hosting these on Hugging Face? If so, here's a guide: https://huggingface.co/docs/datasets/loading. We also support WebDataset, which is especially useful for large image/video datasets: https://huggingface.co/docs/datasets/en/loading#webdataset.
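In case it helps with the WebDataset option, here is a minimal sketch of how image samples could be packed into a WebDataset-style shard using only the Python standard library. It relies on the WebDataset convention that a shard is a plain tar archive in which files sharing a basename form one sample. The helper names (write_shard, list_shard), the .jpg/.json pairing, and the metadata fields are assumptions for illustration only, not the layout of the DisCL data.

```python
import io
import json
import tarfile


def write_shard(samples, shard_path):
    """Pack (key, image_bytes, metadata) tuples into one tar shard.

    Hypothetical helper: files that share a basename (the key) are
    treated as a single sample under the WebDataset convention.
    """
    with tarfile.open(shard_path, "w") as tar:
        for key, image_bytes, metadata in samples:
            # Keep the files of one sample adjacent inside the archive.
            members = [
                (f"{key}.jpg", image_bytes),
                (f"{key}.json", json.dumps(metadata).encode("utf-8")),
            ]
            for name, payload in members:
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)  # tar needs the size up front
                tar.addfile(info, io.BytesIO(payload))


def list_shard(shard_path):
    """Return the member names of a shard, in archive order."""
    with tarfile.open(shard_path, "r") as tar:
        return tar.getnames()
```

A call such as write_shard([("000000", jpeg_bytes, {"label": 3, "synthetic": True})], "train-000000.tar") would produce a shard holding 000000.jpg and 000000.json. The resulting .tar files can then be uploaded to a dataset repository, and the WebDataset section of the guide linked above describes how they are loaded.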
Let me know if you're interested or need any help with this process!
Cheers,
Niels
ML Engineer @ HF 🤗