Release 1.0.2 #50

dylanuys · 2024-08-19T18:48:42Z

Patch 1.0.2

QOL Upgrades

Auto-update for both validators and miners.
Self-healing for both validators and miners, restarts neuron every 6 hours to help keep processes in a healthy state
Reduced validator log clutter
- Before: Lots of innocuous warnings and ugly progress bars from pytorch, huggingface, etc. that made logs difficult to parse
- Now: Silenced warnings and removed progress bars, keeping only information essential to understanding validator behavior/health

Validator Technical Improvements:

Improved validator image generation pipeline
Before

The validator prompt generation pipline produced annotations with inconsistent white spacing and punctuation between prompt responses.
Our image generation pipline attempted to generate images from annotations that exceeded the maximum allowed length of 77 tokens for CLIPTokenizer, the tokenizer for the Stable Diffusion XL, Mobius, and RealVisXL_V4.0 diffusion models used by validators. These long annotations previously triggered warnings.
Previously in the validator forward function, resampling was not performed if a NAN value was present in any field of a real image sample dict (e.g. Image id).

Now

Improved BLIP-2 outputs to be cleaner with spaces between each prompt and periods at the end of responses.
Synthetic image generation now first truncates all descriptive information into the maximum tokenizer token limit upfront, to ensure compliance with HuggingFace diffusion model tokenizers and avoid warnings.
Validator forward function now checks for NAN image metadata after each sampling, annotation, and image generation step. A retry with a resampled real image is performed if data is invalid.

…tion for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir.

…e mirror pipeline.

…o generate_image in SyntheticImageGenerato for future customization.

…or chunking, pm2 examples

…ared in SyntheticImageGenerator

…inherits from it. Specified generated image dimensions in diffuser call params.

…mage count logs

…ind-ai/bitmind-subnet into synthetic_image_gen_updates

…ged annotations 'index' field to 'id' for consistency, various data loading and parameter fixes

…line loading and image size customization to generation.

…ticate with huggingface-cli login

…art and end indices

…aset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal.

…ss calculation.

…tart_index and end_index args.

…es to disk.

Versioned requirements.txt

Validator Caption/Image Resampling

…essary imports. Simplified clear_gpu to not moving tensor to CPU.

…F logging level in dataset generation script to be consistent with synthetic generation classes.

Synthetic image gen updates

Fix Weight Setting Edge Case

dylanuys · 2024-08-19T18:55:20Z

bitmind/base/utils/weight_utils.py

@@ -157,6 +157,9 @@ def process_weights_for_netuid(

    # Find all non zero weights.
    non_zero_weight_idx = np.argwhere(weights > 0).squeeze()
+    if non_zero_weight_idx.ndim == 0:
+        non_zero_weight_idx = non_zero_weight_idx.reshape((1,))
+


this is a fix for an issue i raised a while back for the bittensor-subnet-template repo:
opentensor/bittensor-subnet-template#93

dylanuys · 2024-08-19T18:55:56Z

bitmind/constants.py

-        {"path": "bitmind/realvis-xl", "create_splits": False},
-        {"path": "bitmind/stable-diffusion-xl", "create_splits": False},
+        {"path": "bitmind/realvis-xl___parquet", "create_splits": False},
+        {"path": "bitmind/stable-diffusion-xl___parquet", "create_splits": False},


we are renaming in huggingface and reverting these to the original names right now

fixing download_data extension

kenobijon

Looks good, have reviewed many of the individual PRs into testnet branch. Small cleanups on documentation but overall good to release

kenobijon · 2024-08-19T18:59:55Z

install_system_deps.sh

+MINER_AXON_PORT=8091
+BLACKLIST_FORCE_VALIDATOR_PERMIT=True" > miner.env
+
+echo "NETUID=34
+SUBTENSOR_NETWORK=finney
+WALLET_NAME=default
+WALLET_HOTKEY=default
+VALIDATOR_AXON_PORT=8092" > validator.env


Do we have any docs about these being the default axon ports?

@kenobijon Not explicitly, but our updated MIner/Validator Guide's instruct users to modify the .env files according to their environment.

If you think it'd be helpful to mention explicitly that these are the defaults, i can add that too!

Update fake dataset paths

Autoupdate Python/Pip Binaries

…n (CAMO) Framework (#55) * Added DeepfakeBench submodule to base_miner dir * Added initial adaptation of pretrained UCF inference. Refactored NPR files into new dir. * Added setup readme and a sample image for inference. * Added loss functions and backbone network. Updated readme. * Enable loading model checkpoints from Hugging Face. * migrated training scripts from DeepfakeBench * Added package initialization, renaming configs to config. * Added fix to missing weights directory * Finished ucf_test on sample images. * Added dlib shape detector for face detection and alignment * Added face_recognition implementation of face alignment * Fixed variable names * Update dlib requirements * Implemented ucf_miner and created a class for the pretrained UCF model * Renamed files for clarity. Added unit test for pretrained UCF. * Migrated train utils from NPR base miner, modified train_ucf.py to use BitMind datasets * Fix image input type errors * Added xception training backbone, logging files * Detectors module path fix * bug fixes for live miner * BitMind data load and restructure for integrated DeepfakeBench train loop * Added DeepfakeBench training logs to .gitignore * Removed unused import, local data saving * Fixed prediction_class referenced before assignment * Fixed test metric using logits and not class labels * Train source labels for learning specific forgery, added separate test and validation loops * Corrected variable name typo * Refactored eval in training loop, renamed test stage to validation * Implemented source label mapping in UCF training splits * Added test stage, source labels to training data dict for learning dataset specific forgery features * Added gpu cache cleanup, now using configs for batch size and data loader workers. * Batch to cpu after train loader iteration, logging cleanup * Fixed test metrics not logging * Added logging for train and test time * Added image normalization for training data * Re-added check for data label during inference. * Adjusted UCF image normalization to be in line with config. Fixed processing of local images for UCF testing. * Adjusted image preprocessing for experiments. * Added face cropping and alignment to preprocess images for UCF detection. * Typo fixes, added readme to credit face shape predictor file. * Made face crop and align False by default. * New miner script for running UCF-BitMind * Added handling for the case when face_detector does not find any images. Reduced warning messages. * Removed duplicate function. * Adapted face detection and extraction functions to UCF class for modularity. Updated and refactored test, miner scripts. * First iteration of context_aware_miner.py * Fixed ucf_miner import error by simplifying path and import statement for UCF module. * Fixed dlib predictor path, explicitly define map_location for torch.load * Fixed imports in ucf_bitmind_miner and removed rounding or predictions. * Fixed imports for context aware miner. * Fix NPR model weight variable name. * Typo * Added Context-Aware v2 with UCF-BitMind for general images * Remove unused import * Moved UCF model loading to init function of Context Aware Miner v2 * Added free memory function to manage resources for multi-model miners. * Added script for miners to test their model loading and inference latency. * Moved model loading to init functions to avoid reloading. * Updated minimum miner requirements to require GPU * Fixed indents, load UCF-DFB model in init func * Update check for faces to be consistent with DFB preprocessing * Release 1.0.2 (#50) * Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> * TrainingDatasetProcessor class for loading, generating, and uploading preprocessing face only images into training datasets * Fixed transform dict var name * Removed normalize function in training dataset creation * Changed config dict to faces_only bool for clarity, changed hf repo type to dataset * Added splits instance variable, generalized function names, non face only processing * Script for interfacing with TrainingDatasetProcessor to create and upload preprocessed training datasets. * Added usage examples and explanation * Removed unused import * Simplified repo upload naming convention. * Added original image index column to training datasets. Consolidated transform datasets into HF subsets. * Added local save/load, upload repo destination options; Fixed dataset preprocessing to be performed in-place. * Created create_splits() helper function * Fixed not clearing dataset memory when loading pickle * Created Context-Aware Hierarchical Mixture-of-Agents (CAMO) miner. * Created modular helper function for loading detectors. * Rewording comments * Added YOLOv8 object detection for image classification * Renamed create_splits() to split_dataset() and added support for subset loading * Added HuggingFace subset download option * Restructured data loading for training. Support for face only subset loading with stratified splits. * Added YOLOv8 object detection experiments, renamed CAMO miner. * Updated paths for new CAMO model weights * Added object detection error checks * Switch debug to bt.logging, formatting * Set use_object_detection to False for current iteration of CAMO * Fixed assertion error on last batch of train epoch when incomplete batches present by dropping last in dataloaders. * Added data shuffling prior to splitting, check for disjoint stratified split indices. * Fixed faces not being used by face expert. * Improved clarity with face processing helper functions * Generalized to optionally include source label mapping * Consolidated real_fake_dataset.py into bitmind directory, updated references * Fixed docstring typo * Parameterized shuffling before splits, generalized params to adopt expert model terminology, made source label mapping optional * Renamed params appropriately, fixed generalist UCF not using source labels * Reformatted long lines, fixed split size printing num batches * Removed UCF-specific data utils, replaced with generalized utils at bitmind level * Standardized usage of bitmind.utils.train_data for data load/split across NPR and UCF base models * adding versions for new package * Update Mining.md * moving miner/validator specific dependencies to new requirements filse * setup scripts specific to miner/validator reqs * Relocated train/predict data processing scripts, updated imports and paths * Removed redundant face detection utils from UCF dir * Comment crediting original source of UCF training scripts * Removed DeepfakeBench submodule * Added auto download for backbone weights if not locally present * Removed redundant video metrics from validation logs * Updated README with script usage, removed deprecated manual weight download instructions * Cleaned up leftover DFB train configs, fixed training error when using only 1 real/fake train dataset * Deleted whitespace * Cleaned unused DFB config labels * adding subtensor.chain_endpoint to startup scripts * Added default value for specific task number for training * Standardized UCF paths with consts across UCF neurons and training files * Cleaned up experimental files. * Update Mining.md * Update Validating.md * Standardized neuron naming * Updated default miner to camo_miner.py * Fixed forgery dataset/method disentangling by setting specific_task_number value to num of fake datasets + 1 for real label * Added readme file for camo base miner * Fixed UCF constant weight path names * Removed non-validator dataset generation scripts, fixed camo readme. * Weights constant name typo * Fixed UCF miner imports * Added missing UCF weights import * Cleaned up utils and unnecessary files * adding testnet chain endpoint to docs * adding miner/vali dep installs to ci.yml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]> Co-authored-by: Dylan Uys <[email protected]>

* Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml * Base miner training improvements and Content-Aware Model Orchestration (CAMO) Framework (#55) * Added DeepfakeBench submodule to base_miner dir * Added initial adaptation of pretrained UCF inference. Refactored NPR files into new dir. * Added setup readme and a sample image for inference. * Added loss functions and backbone network. Updated readme. * Enable loading model checkpoints from Hugging Face. * migrated training scripts from DeepfakeBench * Added package initialization, renaming configs to config. * Added fix to missing weights directory * Finished ucf_test on sample images. * Added dlib shape detector for face detection and alignment * Added face_recognition implementation of face alignment * Fixed variable names * Update dlib requirements * Implemented ucf_miner and created a class for the pretrained UCF model * Renamed files for clarity. Added unit test for pretrained UCF. * Migrated train utils from NPR base miner, modified train_ucf.py to use BitMind datasets * Fix image input type errors * Added xception training backbone, logging files * Detectors module path fix * bug fixes for live miner * BitMind data load and restructure for integrated DeepfakeBench train loop * Added DeepfakeBench training logs to .gitignore * Removed unused import, local data saving * Fixed prediction_class referenced before assignment * Fixed test metric using logits and not class labels * Train source labels for learning specific forgery, added separate test and validation loops * Corrected variable name typo * Refactored eval in training loop, renamed test stage to validation * Implemented source label mapping in UCF training splits * Added test stage, source labels to training data dict for learning dataset specific forgery features * Added gpu cache cleanup, now using configs for batch size and data loader workers. * Batch to cpu after train loader iteration, logging cleanup * Fixed test metrics not logging * Added logging for train and test time * Added image normalization for training data * Re-added check for data label during inference. * Adjusted UCF image normalization to be in line with config. Fixed processing of local images for UCF testing. * Adjusted image preprocessing for experiments. * Added face cropping and alignment to preprocess images for UCF detection. * Typo fixes, added readme to credit face shape predictor file. * Made face crop and align False by default. * New miner script for running UCF-BitMind * Added handling for the case when face_detector does not find any images. Reduced warning messages. * Removed duplicate function. * Adapted face detection and extraction functions to UCF class for modularity. Updated and refactored test, miner scripts. * First iteration of context_aware_miner.py * Fixed ucf_miner import error by simplifying path and import statement for UCF module. * Fixed dlib predictor path, explicitly define map_location for torch.load * Fixed imports in ucf_bitmind_miner and removed rounding or predictions. * Fixed imports for context aware miner. * Fix NPR model weight variable name. * Typo * Added Context-Aware v2 with UCF-BitMind for general images * Remove unused import * Moved UCF model loading to init function of Context Aware Miner v2 * Added free memory function to manage resources for multi-model miners. * Added script for miners to test their model loading and inference latency. * Moved model loading to init functions to avoid reloading. * Updated minimum miner requirements to require GPU * Fixed indents, load UCF-DFB model in init func * Update check for faces to be consistent with DFB preprocessing * Release 1.0.2 (#50) * Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> * TrainingDatasetProcessor class for loading, generating, and uploading preprocessing face only images into training datasets * Fixed transform dict var name * Removed normalize function in training dataset creation * Changed config dict to faces_only bool for clarity, changed hf repo type to dataset * Added splits instance variable, generalized function names, non face only processing * Script for interfacing with TrainingDatasetProcessor to create and upload preprocessed training datasets. * Added usage examples and explanation * Removed unused import * Simplified repo upload naming convention. * Added original image index column to training datasets. Consolidated transform datasets into HF subsets. * Added local save/load, upload repo destination options; Fixed dataset preprocessing to be performed in-place. * Created create_splits() helper function * Fixed not clearing dataset memory when loading pickle * Created Context-Aware Hierarchical Mixture-of-Agents (CAMO) miner. * Created modular helper function for loading detectors. * Rewording comments * Added YOLOv8 object detection for image classification * Renamed create_splits() to split_dataset() and added support for subset loading * Added HuggingFace subset download option * Restructured data loading for training. Support for face only subset loading with stratified splits. * Added YOLOv8 object detection experiments, renamed CAMO miner. * Updated paths for new CAMO model weights * Added object detection error checks * Switch debug to bt.logging, formatting * Set use_object_detection to False for current iteration of CAMO * Fixed assertion error on last batch of train epoch when incomplete batches present by dropping last in dataloaders. * Added data shuffling prior to splitting, check for disjoint stratified split indices. * Fixed faces not being used by face expert. * Improved clarity with face processing helper functions * Generalized to optionally include source label mapping * Consolidated real_fake_dataset.py into bitmind directory, updated references * Fixed docstring typo * Parameterized shuffling before splits, generalized params to adopt expert model terminology, made source label mapping optional * Renamed params appropriately, fixed generalist UCF not using source labels * Reformatted long lines, fixed split size printing num batches * Removed UCF-specific data utils, replaced with generalized utils at bitmind level * Standardized usage of bitmind.utils.train_data for data load/split across NPR and UCF base models * adding versions for new package * Update Mining.md * moving miner/validator specific dependencies to new requirements filse * setup scripts specific to miner/validator reqs * Relocated train/predict data processing scripts, updated imports and paths * Removed redundant face detection utils from UCF dir * Comment crediting original source of UCF training scripts * Removed DeepfakeBench submodule * Added auto download for backbone weights if not locally present * Removed redundant video metrics from validation logs * Updated README with script usage, removed deprecated manual weight download instructions * Cleaned up leftover DFB train configs, fixed training error when using only 1 real/fake train dataset * Deleted whitespace * Cleaned unused DFB config labels * adding subtensor.chain_endpoint to startup scripts * Added default value for specific task number for training * Standardized UCF paths with consts across UCF neurons and training files * Cleaned up experimental files. * Update Mining.md * Update Validating.md * Standardized neuron naming * Updated default miner to camo_miner.py * Fixed forgery dataset/method disentangling by setting specific_task_number value to num of fake datasets + 1 for real label * Added readme file for camo base miner * Fixed UCF constant weight path names * Removed non-validator dataset generation scripts, fixed camo readme. * Weights constant name typo * Fixed UCF miner imports * Added missing UCF weights import * Cleaned up utils and unnecessary files * adding testnet chain endpoint to docs * adding miner/vali dep installs to ci.yml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]> Co-authored-by: Dylan Uys <[email protected]> * bump version 1.0.2 -> 1.1.0 * Removed sample images in UCF directory * parameterizing neuron filepath * Added base miner dir readme with CAMO information --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]>

* Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml * Base miner training improvements and Content-Aware Model Orchestration (CAMO) Framework (#55) * Added DeepfakeBench submodule to base_miner dir * Added initial adaptation of pretrained UCF inference. Refactored NPR files into new dir. * Added setup readme and a sample image for inference. * Added loss functions and backbone network. Updated readme. * Enable loading model checkpoints from Hugging Face. * migrated training scripts from DeepfakeBench * Added package initialization, renaming configs to config. * Added fix to missing weights directory * Finished ucf_test on sample images. * Added dlib shape detector for face detection and alignment * Added face_recognition implementation of face alignment * Fixed variable names * Update dlib requirements * Implemented ucf_miner and created a class for the pretrained UCF model * Renamed files for clarity. Added unit test for pretrained UCF. * Migrated train utils from NPR base miner, modified train_ucf.py to use BitMind datasets * Fix image input type errors * Added xception training backbone, logging files * Detectors module path fix * bug fixes for live miner * BitMind data load and restructure for integrated DeepfakeBench train loop * Added DeepfakeBench training logs to .gitignore * Removed unused import, local data saving * Fixed prediction_class referenced before assignment * Fixed test metric using logits and not class labels * Train source labels for learning specific forgery, added separate test and validation loops * Corrected variable name typo * Refactored eval in training loop, renamed test stage to validation * Implemented source label mapping in UCF training splits * Added test stage, source labels to training data dict for learning dataset specific forgery features * Added gpu cache cleanup, now using configs for batch size and data loader workers. * Batch to cpu after train loader iteration, logging cleanup * Fixed test metrics not logging * Added logging for train and test time * Added image normalization for training data * Re-added check for data label during inference. * Adjusted UCF image normalization to be in line with config. Fixed processing of local images for UCF testing. * Adjusted image preprocessing for experiments. * Added face cropping and alignment to preprocess images for UCF detection. * Typo fixes, added readme to credit face shape predictor file. * Made face crop and align False by default. * New miner script for running UCF-BitMind * Added handling for the case when face_detector does not find any images. Reduced warning messages. * Removed duplicate function. * Adapted face detection and extraction functions to UCF class for modularity. Updated and refactored test, miner scripts. * First iteration of context_aware_miner.py * Fixed ucf_miner import error by simplifying path and import statement for UCF module. * Fixed dlib predictor path, explicitly define map_location for torch.load * Fixed imports in ucf_bitmind_miner and removed rounding or predictions. * Fixed imports for context aware miner. * Fix NPR model weight variable name. * Typo * Added Context-Aware v2 with UCF-BitMind for general images * Remove unused import * Moved UCF model loading to init function of Context Aware Miner v2 * Added free memory function to manage resources for multi-model miners. * Added script for miners to test their model loading and inference latency. * Moved model loading to init functions to avoid reloading. * Updated minimum miner requirements to require GPU * Fixed indents, load UCF-DFB model in init func * Update check for faces to be consistent with DFB preprocessing * Release 1.0.2 (#50) * Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> * TrainingDatasetProcessor class for loading, generating, and uploading preprocessing face only images into training datasets * Fixed transform dict var name * Removed normalize function in training dataset creation * Changed config dict to faces_only bool for clarity, changed hf repo type to dataset * Added splits instance variable, generalized function names, non face only processing * Script for interfacing with TrainingDatasetProcessor to create and upload preprocessed training datasets. * Added usage examples and explanation * Removed unused import * Simplified repo upload naming convention. * Added original image index column to training datasets. Consolidated transform datasets into HF subsets. * Added local save/load, upload repo destination options; Fixed dataset preprocessing to be performed in-place. * Created create_splits() helper function * Fixed not clearing dataset memory when loading pickle * Created Context-Aware Hierarchical Mixture-of-Agents (CAMO) miner. * Created modular helper function for loading detectors. * Rewording comments * Added YOLOv8 object detection for image classification * Renamed create_splits() to split_dataset() and added support for subset loading * Added HuggingFace subset download option * Restructured data loading for training. Support for face only subset loading with stratified splits. * Added YOLOv8 object detection experiments, renamed CAMO miner. * Updated paths for new CAMO model weights * Added object detection error checks * Switch debug to bt.logging, formatting * Set use_object_detection to False for current iteration of CAMO * Fixed assertion error on last batch of train epoch when incomplete batches present by dropping last in dataloaders. * Added data shuffling prior to splitting, check for disjoint stratified split indices. * Fixed faces not being used by face expert. * Improved clarity with face processing helper functions * Generalized to optionally include source label mapping * Consolidated real_fake_dataset.py into bitmind directory, updated references * Fixed docstring typo * Parameterized shuffling before splits, generalized params to adopt expert model terminology, made source label mapping optional * Renamed params appropriately, fixed generalist UCF not using source labels * Reformatted long lines, fixed split size printing num batches * Removed UCF-specific data utils, replaced with generalized utils at bitmind level * Standardized usage of bitmind.utils.train_data for data load/split across NPR and UCF base models * adding versions for new package * Update Mining.md * moving miner/validator specific dependencies to new requirements filse * setup scripts specific to miner/validator reqs * Relocated train/predict data processing scripts, updated imports and paths * Removed redundant face detection utils from UCF dir * Comment crediting original source of UCF training scripts * Removed DeepfakeBench submodule * Added auto download for backbone weights if not locally present * Removed redundant video metrics from validation logs * Updated README with script usage, removed deprecated manual weight download instructions * Cleaned up leftover DFB train configs, fixed training error when using only 1 real/fake train dataset * Deleted whitespace * Cleaned unused DFB config labels * adding subtensor.chain_endpoint to startup scripts * Added default value for specific task number for training * Standardized UCF paths with consts across UCF neurons and training files * Cleaned up experimental files. * Update Mining.md * Update Validating.md * Standardized neuron naming * Updated default miner to camo_miner.py * Fixed forgery dataset/method disentangling by setting specific_task_number value to num of fake datasets + 1 for real label * Added readme file for camo base miner * Fixed UCF constant weight path names * Removed non-validator dataset generation scripts, fixed camo readme. * Weights constant name typo * Fixed UCF miner imports * Added missing UCF weights import * Cleaned up utils and unnecessary files * adding testnet chain endpoint to docs * adding miner/vali dep installs to ci.yml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]> Co-authored-by: Dylan Uys <[email protected]> * bump version 1.0.2 -> 1.1.0 * Removed sample images in UCF directory * parameterizing neuron filepath * Added base miner dir readme with CAMO information * Bittensor agnostic deepfake detector ABC, UCF and NPR subclass implementations * Moved preprocessed train dataset pipeline to bitmind-utils repo * Moved detectors and added naive unit tests in new dir * Fixed npr unit tests using correct resnet50 architecture * Added basic detector class registry implementation * Changed routing type logging to 'info' type (#60) * Logging Updates (#61) * Update Mining.md * logging augmented b64 images * removing deprecated get_metagraph endpoint from validator_proxy * removing metagraph fastapi endpoint * Moving tensor to PIL conversion to b64_encode * __init__.py * not logging b64 encoded images * shebang * Standardized detector call output to numpy ndarray * Registry contains magic method, unit tests after subclass imports * Module inits with DeepfakeDetector subclass imports * Removed model base var since subclasses may use more than one model * Removed redundant model load * Initial camo detector implementation using detector registry, unit test * Re-added YOLO objct detector * Commented auto registration for further testing * removing old notebooks, only supporting scripts moving forward * Added base_miner level init and adjusted deepfake_detectors file imports * Clarified comment * Changed unit tests to compare deepfake_detectors registry const and new registry * Added universal modular miner neuron using detector registry, and companion unit tests * Removed separate miner neurons for individual detector architectures * Removed legacy DFB preprocessing tools * Gate ABC and FaceGate subclass * Added gating registry and moved registries up a level, changed UCF to modular gates, fixed imports * Gating mechanism using gate registry for generalized CAMO * Fixed import typo * Removed legacy pretrained UCF files * Renamed ucf training script to match NPR convention * Cleaned logging to output to one directory per training run. Removed unused task target param. * Added local config yaml saving prior to training loop * Added epochs and num workers arg with default cpu count - 1 * Changed default epochs to 5 * UCF yaml config saving from training, HF config download for UCF detector init. Informative error message for specific task number mismatch. * Removed redundant path const * Loading detector attributes from YAML configs for easier model changes, readable params * Restored UCF face detector configs * Deleted unused UCF constants, updated references * Fixed import error due to sys path not being added yet, removed unused import * YAML comment cleanup, removed model_name key * Changed call() output to return npndarray consistent with npr, minor param rename * NPR detector using config for weight download * Parameterized detector device, refactored subclass inits to rely on base class for majority of setup * Miner now uses config args for dete ctor class, configs, device. * Updated ad hoc miner unit test to use neuron configs * Descriptive class docstrings * Pep8 import spacing, removed unused import * gitignore NPR weights dir * base miner directory documentation * formatting * Updated bittensor ver * Moved base miner CAMO description to top * Cleaned non-error detector and gate prints * Fixed incomplete docstrings * Defaulting CAMO and GatingMechanism to not use YOLO * Removed specific tasks param, changed default epoch args to config setting. Config now prints right before training loop. * Camo/Gate Refactor (#64) * typo * simplifying gating mechanism, removing object detection * removing gate from ucf, returning last model output * updating configs to match latest gate implementation * pep8 formatting * future proof handling image_url in image_dataset * face gate cleanup * face gate cleanup * moving face_utils.py to utils dir, updating imports * docstring update * init imports * testing benchmark datasets * fixing detector import * moving norm to image transform pipeline * ucf transfomrs + CLAHE * removing extra norm config from real_fake_dataset * adding many to one (group_by_source) option * fixing fake_source_label value * Moving subset loading for face expert datasets to constants.py * simplifying transformation arguments * train script cleanup * moving data_processing/ functionality to utils/data.py * standardizing order of dataloading args * reorganizing code for cleaner structure + addressing circular import * cleaning up dataset splitting * updating NPR training to use new data loading code * adding missing requests import * adding new detector args to miner start script * adding new fields to miner env setup script * removing old model_path arg * setting default datasets * simplifying pm2 start commands * Fixing import * new configs * adding usage of self.device in npr, printing config loading in ucf * adding device to start miner scripts * Added parameterization options in miner.env template * docs updates * Fixed torch device setting * docs udpates * Readme fixes * HuggingFace renamed face expert dataset path * CPU device support with default GPU. Renamed local_rank to gpu_id --------- Co-authored-by: Dylan Uys <[email protected]> Co-authored-by: benliang99 <[email protected]> Co-authored-by: Andrew <[email protected]> * Updated Validator min compute (#71) * Updated min compute in anticipation of FLUX and LLM moderation additions. * Fixed typo, organized .env format. * Small updates (#72) * updating config print * small doc update * Release readme (#70) * Added config context in mining readme * Readme formatting * moving deployment instructions to deploy section * docs update * miner.env and base miner dir context * Spacing --------- Co-authored-by: Dylan Uys <[email protected]> * allowing auto device configuration (#74) * allowing auto device configuration * allowing config to be specified as a rel path * docs updateg * UCF backbone overwrite (#75) * Backbone now loaded from path in UCF detector config rather than train config * Updated ad hoc detector unit tests, added one script that runs all tests * Removed deprecated mining model weights directory * UCFDetector config no longer provides backbone path, instead uses training config backbone path which defaults to xception-best.pth on HF * Updated backbone load check * Training now uses xception-best.pth backbone weights from bitmind/bm-ucf/ on HuggingFace * Weights renaming (#76) * removing deprecated train_data.py * updating weight and config filenames w version str --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]>

* Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml * Base miner training improvements and Content-Aware Model Orchestration (CAMO) Framework (#55) * Added DeepfakeBench submodule to base_miner dir * Added initial adaptation of pretrained UCF inference. Refactored NPR files into new dir. * Added setup readme and a sample image for inference. * Added loss functions and backbone network. Updated readme. * Enable loading model checkpoints from Hugging Face. * migrated training scripts from DeepfakeBench * Added package initialization, renaming configs to config. * Added fix to missing weights directory * Finished ucf_test on sample images. * Added dlib shape detector for face detection and alignment * Added face_recognition implementation of face alignment * Fixed variable names * Update dlib requirements * Implemented ucf_miner and created a class for the pretrained UCF model * Renamed files for clarity. Added unit test for pretrained UCF. * Migrated train utils from NPR base miner, modified train_ucf.py to use BitMind datasets * Fix image input type errors * Added xception training backbone, logging files * Detectors module path fix * bug fixes for live miner * BitMind data load and restructure for integrated DeepfakeBench train loop * Added DeepfakeBench training logs to .gitignore * Removed unused import, local data saving * Fixed prediction_class referenced before assignment * Fixed test metric using logits and not class labels * Train source labels for learning specific forgery, added separate test and validation loops * Corrected variable name typo * Refactored eval in training loop, renamed test stage to validation * Implemented source label mapping in UCF training splits * Added test stage, source labels to training data dict for learning dataset specific forgery features * Added gpu cache cleanup, now using configs for batch size and data loader workers. * Batch to cpu after train loader iteration, logging cleanup * Fixed test metrics not logging * Added logging for train and test time * Added image normalization for training data * Re-added check for data label during inference. * Adjusted UCF image normalization to be in line with config. Fixed processing of local images for UCF testing. * Adjusted image preprocessing for experiments. * Added face cropping and alignment to preprocess images for UCF detection. * Typo fixes, added readme to credit face shape predictor file. * Made face crop and align False by default. * New miner script for running UCF-BitMind * Added handling for the case when face_detector does not find any images. Reduced warning messages. * Removed duplicate function. * Adapted face detection and extraction functions to UCF class for modularity. Updated and refactored test, miner scripts. * First iteration of context_aware_miner.py * Fixed ucf_miner import error by simplifying path and import statement for UCF module. * Fixed dlib predictor path, explicitly define map_location for torch.load * Fixed imports in ucf_bitmind_miner and removed rounding or predictions. * Fixed imports for context aware miner. * Fix NPR model weight variable name. * Typo * Added Context-Aware v2 with UCF-BitMind for general images * Remove unused import * Moved UCF model loading to init function of Context Aware Miner v2 * Added free memory function to manage resources for multi-model miners. * Added script for miners to test their model loading and inference latency. * Moved model loading to init functions to avoid reloading. * Updated minimum miner requirements to require GPU * Fixed indents, load UCF-DFB model in init func * Update check for faces to be consistent with DFB preprocessing * Release 1.0.2 (#50) * Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> * TrainingDatasetProcessor class for loading, generating, and uploading preprocessing face only images into training datasets * Fixed transform dict var name * Removed normalize function in training dataset creation * Changed config dict to faces_only bool for clarity, changed hf repo type to dataset * Added splits instance variable, generalized function names, non face only processing * Script for interfacing with TrainingDatasetProcessor to create and upload preprocessed training datasets. * Added usage examples and explanation * Removed unused import * Simplified repo upload naming convention. * Added original image index column to training datasets. Consolidated transform datasets into HF subsets. * Added local save/load, upload repo destination options; Fixed dataset preprocessing to be performed in-place. * Created create_splits() helper function * Fixed not clearing dataset memory when loading pickle * Created Context-Aware Hierarchical Mixture-of-Agents (CAMO) miner. * Created modular helper function for loading detectors. * Rewording comments * Added YOLOv8 object detection for image classification * Renamed create_splits() to split_dataset() and added support for subset loading * Added HuggingFace subset download option * Restructured data loading for training. Support for face only subset loading with stratified splits. * Added YOLOv8 object detection experiments, renamed CAMO miner. * Updated paths for new CAMO model weights * Added object detection error checks * Switch debug to bt.logging, formatting * Set use_object_detection to False for current iteration of CAMO * Fixed assertion error on last batch of train epoch when incomplete batches present by dropping last in dataloaders. * Added data shuffling prior to splitting, check for disjoint stratified split indices. * Fixed faces not being used by face expert. * Improved clarity with face processing helper functions * Generalized to optionally include source label mapping * Consolidated real_fake_dataset.py into bitmind directory, updated references * Fixed docstring typo * Parameterized shuffling before splits, generalized params to adopt expert model terminology, made source label mapping optional * Renamed params appropriately, fixed generalist UCF not using source labels * Reformatted long lines, fixed split size printing num batches * Removed UCF-specific data utils, replaced with generalized utils at bitmind level * Standardized usage of bitmind.utils.train_data for data load/split across NPR and UCF base models * adding versions for new package * Update Mining.md * moving miner/validator specific dependencies to new requirements filse * setup scripts specific to miner/validator reqs * Relocated train/predict data processing scripts, updated imports and paths * Removed redundant face detection utils from UCF dir * Comment crediting original source of UCF training scripts * Removed DeepfakeBench submodule * Added auto download for backbone weights if not locally present * Removed redundant video metrics from validation logs * Updated README with script usage, removed deprecated manual weight download instructions * Cleaned up leftover DFB train configs, fixed training error when using only 1 real/fake train dataset * Deleted whitespace * Cleaned unused DFB config labels * adding subtensor.chain_endpoint to startup scripts * Added default value for specific task number for training * Standardized UCF paths with consts across UCF neurons and training files * Cleaned up experimental files. * Update Mining.md * Update Validating.md * Standardized neuron naming * Updated default miner to camo_miner.py * Fixed forgery dataset/method disentangling by setting specific_task_number value to num of fake datasets + 1 for real label * Added readme file for camo base miner * Fixed UCF constant weight path names * Removed non-validator dataset generation scripts, fixed camo readme. * Weights constant name typo * Fixed UCF miner imports * Added missing UCF weights import * Cleaned up utils and unnecessary files * adding testnet chain endpoint to docs * adding miner/vali dep installs to ci.yml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]> Co-authored-by: Dylan Uys <[email protected]> * bump version 1.0.2 -> 1.1.0 * Removed sample images in UCF directory * parameterizing neuron filepath * Added base miner dir readme with CAMO information * Bittensor agnostic deepfake detector ABC, UCF and NPR subclass implementations * Moved preprocessed train dataset pipeline to bitmind-utils repo * Moved detectors and added naive unit tests in new dir * Fixed npr unit tests using correct resnet50 architecture * Added basic detector class registry implementation * Changed routing type logging to 'info' type (#60) * Logging Updates (#61) * Update Mining.md * logging augmented b64 images * removing deprecated get_metagraph endpoint from validator_proxy * removing metagraph fastapi endpoint * Moving tensor to PIL conversion to b64_encode * __init__.py * not logging b64 encoded images * shebang * Standardized detector call output to numpy ndarray * Registry contains magic method, unit tests after subclass imports * Module inits with DeepfakeDetector subclass imports * Removed model base var since subclasses may use more than one model * Removed redundant model load * Initial camo detector implementation using detector registry, unit test * Re-added YOLO objct detector * Commented auto registration for further testing * removing old notebooks, only supporting scripts moving forward * Added base_miner level init and adjusted deepfake_detectors file imports * Clarified comment * Changed unit tests to compare deepfake_detectors registry const and new registry * Added universal modular miner neuron using detector registry, and companion unit tests * Removed separate miner neurons for individual detector architectures * Removed legacy DFB preprocessing tools * Gate ABC and FaceGate subclass * Added gating registry and moved registries up a level, changed UCF to modular gates, fixed imports * Gating mechanism using gate registry for generalized CAMO * Fixed import typo * Removed legacy pretrained UCF files * Renamed ucf training script to match NPR convention * Cleaned logging to output to one directory per training run. Removed unused task target param. * Added local config yaml saving prior to training loop * Added epochs and num workers arg with default cpu count - 1 * Changed default epochs to 5 * UCF yaml config saving from training, HF config download for UCF detector init. Informative error message for specific task number mismatch. * Removed redundant path const * Loading detector attributes from YAML configs for easier model changes, readable params * Restored UCF face detector configs * Deleted unused UCF constants, updated references * Fixed import error due to sys path not being added yet, removed unused import * YAML comment cleanup, removed model_name key * Changed call() output to return npndarray consistent with npr, minor param rename * NPR detector using config for weight download * Parameterized detector device, refactored subclass inits to rely on base class for majority of setup * Miner now uses config args for dete ctor class, configs, device. * Updated ad hoc miner unit test to use neuron configs * Descriptive class docstrings * Pep8 import spacing, removed unused import * gitignore NPR weights dir * base miner directory documentation * formatting * Updated bittensor ver * Moved base miner CAMO description to top * Cleaned non-error detector and gate prints * Fixed incomplete docstrings * Defaulting CAMO and GatingMechanism to not use YOLO * Removed specific tasks param, changed default epoch args to config setting. Config now prints right before training loop. * Camo/Gate Refactor (#64) * typo * simplifying gating mechanism, removing object detection * removing gate from ucf, returning last model output * updating configs to match latest gate implementation * pep8 formatting * future proof handling image_url in image_dataset * face gate cleanup * face gate cleanup * moving face_utils.py to utils dir, updating imports * docstring update * init imports * testing benchmark datasets * fixing detector import * moving norm to image transform pipeline * ucf transfomrs + CLAHE * removing extra norm config from real_fake_dataset * adding many to one (group_by_source) option * fixing fake_source_label value * Moving subset loading for face expert datasets to constants.py * simplifying transformation arguments * train script cleanup * moving data_processing/ functionality to utils/data.py * standardizing order of dataloading args * reorganizing code for cleaner structure + addressing circular import * cleaning up dataset splitting * updating NPR training to use new data loading code * adding missing requests import * adding new detector args to miner start script * adding new fields to miner env setup script * removing old model_path arg * setting default datasets * simplifying pm2 start commands * Fixing import * new configs * adding usage of self.device in npr, printing config loading in ucf * adding device to start miner scripts * Added parameterization options in miner.env template * docs updates * Fixed torch device setting * docs udpates * Readme fixes * HuggingFace renamed face expert dataset path * CPU device support with default GPU. Renamed local_rank to gpu_id --------- Co-authored-by: Dylan Uys <[email protected]> Co-authored-by: benliang99 <[email protected]> Co-authored-by: Andrew <[email protected]> * Updated Validator min compute (#71) * Updated min compute in anticipation of FLUX and LLM moderation additions. * Fixed typo, organized .env format. * Small updates (#72) * updating config print * small doc update * Release readme (#70) * Added config context in mining readme * Readme formatting * moving deployment instructions to deploy section * docs update * miner.env and base miner dir context * Spacing --------- Co-authored-by: Dylan Uys <[email protected]> * allowing auto device configuration (#74) * allowing auto device configuration * allowing config to be specified as a rel path * docs updateg * UCF backbone overwrite (#75) * Backbone now loaded from path in UCF detector config rather than train config * Updated ad hoc detector unit tests, added one script that runs all tests * Removed deprecated mining model weights directory * UCFDetector config no longer provides backbone path, instead uses training config backbone path which defaults to xception-best.pth on HF * Updated backbone load check * Training now uses xception-best.pth backbone weights from bitmind/bm-ucf/ on HuggingFace * Weights renaming (#76) * removing deprecated train_data.py * updating weight and config filenames w version str * Scoring Udpate (#80) * miner performance tracker * adding perf tracker to base validator save & load state * new reward function: squared AUC * EMA alpha 0.1 -> 0.01 * removing test that checked old reward function, will have a more sophisticated test in the future * explicitly setting window arg for get_metrics in get_rewards * logging metrics to wandb for leaderboard * bump version * adding perf tracker to mockvalidator * doc updates * removing extra / from path for consistency * Making wandb a general requirement for miners to check their performance --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]>

* Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * adding blackforest labs models * temporarily disabling real image challenges to maximize data production * adding cpu offload * adding arg for flux generation * calling flux generation with new args in constants.py * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml * constants update for flux mirror gen * adding cache dir to model loads * temp notebooks for flux tests * typo * updating guidance_scale * Added Meta's Llama-3.1-8B-Instruct model for annotation moderation. * Upgraded BLIP2 model to 6.7B param version * Replaced Llama-3.1-8B-Instruct model with Unsloth's ungated version. * Updated requirements with Unsloth model dependency. * Clear image generation annotation models from GPU after generating image dataset annotations. * Fixed usage of generation args, updated image generation helper function. * Base miner training improvements and Content-Aware Model Orchestration (CAMO) Framework (#55) * Added DeepfakeBench submodule to base_miner dir * Added initial adaptation of pretrained UCF inference. Refactored NPR files into new dir. * Added setup readme and a sample image for inference. * Added loss functions and backbone network. Updated readme. * Enable loading model checkpoints from Hugging Face. * migrated training scripts from DeepfakeBench * Added package initialization, renaming configs to config. * Added fix to missing weights directory * Finished ucf_test on sample images. * Added dlib shape detector for face detection and alignment * Added face_recognition implementation of face alignment * Fixed variable names * Update dlib requirements * Implemented ucf_miner and created a class for the pretrained UCF model * Renamed files for clarity. Added unit test for pretrained UCF. * Migrated train utils from NPR base miner, modified train_ucf.py to use BitMind datasets * Fix image input type errors * Added xception training backbone, logging files * Detectors module path fix * bug fixes for live miner * BitMind data load and restructure for integrated DeepfakeBench train loop * Added DeepfakeBench training logs to .gitignore * Removed unused import, local data saving * Fixed prediction_class referenced before assignment * Fixed test metric using logits and not class labels * Train source labels for learning specific forgery, added separate test and validation loops * Corrected variable name typo * Refactored eval in training loop, renamed test stage to validation * Implemented source label mapping in UCF training splits * Added test stage, source labels to training data dict for learning dataset specific forgery features * Added gpu cache cleanup, now using configs for batch size and data loader workers. * Batch to cpu after train loader iteration, logging cleanup * Fixed test metrics not logging * Added logging for train and test time * Added image normalization for training data * Re-added check for data label during inference. * Adjusted UCF image normalization to be in line with config. Fixed processing of local images for UCF testing. * Adjusted image preprocessing for experiments. * Added face cropping and alignment to preprocess images for UCF detection. * Typo fixes, added readme to credit face shape predictor file. * Made face crop and align False by default. * New miner script for running UCF-BitMind * Added handling for the case when face_detector does not find any images. Reduced warning messages. * Removed duplicate function. * Adapted face detection and extraction functions to UCF class for modularity. Updated and refactored test, miner scripts. * First iteration of context_aware_miner.py * Fixed ucf_miner import error by simplifying path and import statement for UCF module. * Fixed dlib predictor path, explicitly define map_location for torch.load * Fixed imports in ucf_bitmind_miner and removed rounding or predictions. * Fixed imports for context aware miner. * Fix NPR model weight variable name. * Typo * Added Context-Aware v2 with UCF-BitMind for general images * Remove unused import * Moved UCF model loading to init function of Context Aware Miner v2 * Added free memory function to manage resources for multi-model miners. * Added script for miners to test their model loading and inference latency. * Moved model loading to init functions to avoid reloading. * Updated minimum miner requirements to require GPU * Fixed indents, load UCF-DFB model in init func * Update check for faces to be consistent with DFB preprocessing * Release 1.0.2 (#50) * Updated synthetic image mirror generation script, created helper function for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir. * Restored ensure_save_path func back to annotation_utils.py * Added latency tracking, max images generated field for synthetic image mirror pipeline. * Added mean synthetic image gen latency print statement * Update example arg inputs * Fix imports * Fixed and reformatted args * Suppressed TensorFlow warnings, fixed image gen from annotation * Index and loop bugfixes * Index, looping, args logic fixes. * Add load diffuser function call * Clear gpu after using synthetic image generator * Always load from Hugging Face. * Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerato for future customization. * Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples * Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator * Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params. * convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs * Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes * Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation. * Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login * Fixed all annotations being used to generate mirrors regardless of start and end indices * Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal. * Removed extra disable progress bar call. Added ceil import for progress calculation. * Adjust Hugging Face annotations dataset name * Reverted annotations dataset name to have data range, now requiring start_index and end_index args. * Re-removed data range from annotations * Update 'index' to 'id' * Fixed loading annotations from Hugging Face and savng specified indices to disk. * Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset. * Replace hardcoded name * Fix fstring * Fix args * Fixed typos * Updated combine_datasets.py to match Hugging Face dataset nomenclature. * removing unused files * initial validator forward pytest * initial ci.yml * new mock classes for ci workflow * temporarily removing old version of generate_synthetic_data.py * rename get_mock_image() -> create_random_image() * adding test_mock.py * renaming build -> test step in ci.yml * test_rewards.py * parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd * forcing vali fwd through real and synth image flows * fake_prob -> _fake_prob * using dot operator to read config in mock vali until I replace namespace cfg with bt.config * allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance * removing unused circleci dir from template repo * image transforms tests * fixing setting of mock dentrite process_time * adding test_mock.py * reset mock chain state in between test cases * cleaning up state management for MockSubtensor * __init__.py * replacing hardcoded string with random image b64 * Fixed saving synthetic images after resizing. * new auto update implementation from sn19 * inital self heal script from sn19 * Flag for downloading annotations from HuggingFace * fixing reference to self.config * Enforcing no watermarking in all cases * self heal in autoupdate script * making autoupdate scripts executable * self heal restart 6 -> 6.5 * typo * allowing --no-auto-update and --no-self-heal for validators * combining run scripts into run_neuron.py * replacing neuron type with --validator and --miner * documentation updates for new run script * docs update * adding wandb to docs * Arg for skipping annotation generation * Prompt truncation for annotations longer than max token length * Suppress token max exceeded warning, cleaned up error logging * Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions. * Improved annotation cleanliness with inter-prompt spacing and stripped endings. * removing fixtures reference from mock.py * read btcli args from .env * docs update * Formatting * fixing fixtures import * adding .env file creation to install script * moving network (test/finney) into .env, reducing script redundancy * missing netuid arg for MockSubtensor/MockMetagraph inits in test * adding .env to .gitignore * AXON_PORT -> MINER_AXON_PORT env var rename * docs updates to reflect latest run_neuron.py updates * updating .env paths * small docs update * Fixed annotation json filenames not starting with start_idx arg * locking down version numbers * Added docstrings and comments * fixing image_index field for wanbd logging * try except for wandb init * adding retries for nan images * fixing image isnan check by adding np.any * rename wandb fields *image_id -> *image_name * Updated failure case for generating annotations. * Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not moving tensor to CPU. * Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes. * adding a sleep to reduce metagraph resync freq * fixing edge case that occurs when only 1 miner has nonzero weight * bump version to 1.0.2 * fixing download_data extension * Update fake dataset paths * replacing conda activate with /home/user/mambaforge/envs/tensorml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: aliang322 <[email protected]> * TrainingDatasetProcessor class for loading, generating, and uploading preprocessing face only images into training datasets * Fixed transform dict var name * Removed normalize function in training dataset creation * Changed config dict to faces_only bool for clarity, changed hf repo type to dataset * Added splits instance variable, generalized function names, non face only processing * Script for interfacing with TrainingDatasetProcessor to create and upload preprocessed training datasets. * Added usage examples and explanation * Removed unused import * Simplified repo upload naming convention. * Added original image index column to training datasets. Consolidated transform datasets into HF subsets. * Added local save/load, upload repo destination options; Fixed dataset preprocessing to be performed in-place. * Created create_splits() helper function * Fixed not clearing dataset memory when loading pickle * Created Context-Aware Hierarchical Mixture-of-Agents (CAMO) miner. * Created modular helper function for loading detectors. * Rewording comments * Added YOLOv8 object detection for image classification * Renamed create_splits() to split_dataset() and added support for subset loading * Added HuggingFace subset download option * Restructured data loading for training. Support for face only subset loading with stratified splits. * Added YOLOv8 object detection experiments, renamed CAMO miner. * Updated paths for new CAMO model weights * Added object detection error checks * Switch debug to bt.logging, formatting * Set use_object_detection to False for current iteration of CAMO * Fixed assertion error on last batch of train epoch when incomplete batches present by dropping last in dataloaders. * Added data shuffling prior to splitting, check for disjoint stratified split indices. * Fixed faces not being used by face expert. * Improved clarity with face processing helper functions * Generalized to optionally include source label mapping * Consolidated real_fake_dataset.py into bitmind directory, updated references * Fixed docstring typo * Parameterized shuffling before splits, generalized params to adopt expert model terminology, made source label mapping optional * Renamed params appropriately, fixed generalist UCF not using source labels * Reformatted long lines, fixed split size printing num batches * Removed UCF-specific data utils, replaced with generalized utils at bitmind level * Standardized usage of bitmind.utils.train_data for data load/split across NPR and UCF base models * adding versions for new package * Update Mining.md * moving miner/validator specific dependencies to new requirements filse * setup scripts specific to miner/validator reqs * Relocated train/predict data processing scripts, updated imports and paths * Removed redundant face detection utils from UCF dir * Comment crediting original source of UCF training scripts * Removed DeepfakeBench submodule * Added auto download for backbone weights if not locally present * Removed redundant video metrics from validation logs * Updated README with script usage, removed deprecated manual weight download instructions * Cleaned up leftover DFB train configs, fixed training error when using only 1 real/fake train dataset * Deleted whitespace * Cleaned unused DFB config labels * adding subtensor.chain_endpoint to startup scripts * Added default value for specific task number for training * Standardized UCF paths with consts across UCF neurons and training files * Cleaned up experimental files. * Update Mining.md * Update Validating.md * Standardized neuron naming * Updated default miner to camo_miner.py * Fixed forgery dataset/method disentangling by setting specific_task_number value to num of fake datasets + 1 for real label * Added readme file for camo base miner * Fixed UCF constant weight path names * Removed non-validator dataset generation scripts, fixed camo readme. * Weights constant name typo * Fixed UCF miner imports * Added missing UCF weights import * Cleaned up utils and unnecessary files * adding testnet chain endpoint to docs * adding miner/vali dep installs to ci.yml --------- Co-authored-by: Benjamin <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]> Co-authored-by: Dylan Uys <[email protected]> * bump version 1.0.2 -> 1.1.0 * Removed sample images in UCF directory * parameterizing neuron filepath * Added base miner dir readme with CAMO information * Added gpu-specific diffusion model loading for parallelization * Fixed docstring indent * Added gpu specification for synthetic image generation * Bittensor agnostic deepfake detector ABC, UCF and NPR subclass implementations * Moved preprocessed train dataset pipeline to bitmind-utils repo * Moved detectors and added naive unit tests in new dir * Fixed npr unit tests using correct resnet50 architecture * Added basic detector class registry implementation * Changed routing type logging to 'info' type (#60) * Logging Updates (#61) * Update Mining.md * logging augmented b64 images * removing deprecated get_metagraph endpoint from validator_proxy * removing metagraph fastapi endpoint * Moving tensor to PIL conversion to b64_encode * __init__.py * not logging b64 encoded images * shebang * Standardized detector call output to numpy ndarray * Registry contains magic method, unit tests after subclass imports * Module inits with DeepfakeDetector subclass imports * Removed model base var since subclasses may use more than one model * Removed redundant model load * Initial camo detector implementation using detector registry, unit test * Re-added YOLO objct detector * Commented auto registration for further testing * removing old notebooks, only supporting scripts moving forward * Added base_miner level init and adjusted deepfake_detectors file imports * Clarified comment * Changed unit tests to compare deepfake_detectors registry const and new registry * Added universal modular miner neuron using detector registry, and companion unit tests * Removed separate miner neurons for individual detector architectures * Removed legacy DFB preprocessing tools * Gate ABC and FaceGate subclass * Added gating registry and moved registries up a level, changed UCF to modular gates, fixed imports * Gating mechanism using gate registry for generalized CAMO * Fixed import typo * Removed legacy pretrained UCF files * Renamed ucf training script to match NPR convention * Added dataset gen GPU delegation shell script and SDXL to constants * Removed HF token * Cleaned logging to output to one directory per training run. Removed unused task target param. * Added local config yaml saving prior to training loop * Added epochs and num workers arg with default cpu count - 1 * Changed default epochs to 5 * UCF yaml config saving from training, HF config download for UCF detector init. Informative error message for specific task number mismatch. * Removed redundant path const * Fixed specification of torch.dtype for SDXL * Loading detector attributes from YAML configs for easier model changes, readable params * Restored UCF face detector configs * Deleted unused UCF constants, updated references * Fixed import error due to sys path not being added yet, removed unused import * YAML comment cleanup, removed model_name key * Changed call() output to return npndarray consistent with npr, minor param rename * NPR detector using config for weight download * Reformatting empty lines, added stress test script. * Parameterized detector device, refactored subclass inits to rely on base class for majority of setup * Miner now uses config args for dete ctor class, configs, device. * Added temporary stress testing prints * Updated ad hoc miner unit test to use neuron configs * Descriptive class docstrings * Pep8 import spacing, removed unused import * gitignore NPR weights dir * base miner directory documentation * formatting * Added all diffusion models and updated generate args * Added float16 tensor tflop specs metric * Updated bittensor ver * Moved base miner CAMO description to top * Cleaned non-error detector and gate prints * Fixed incomplete docstrings * Defaulting CAMO and GatingMechanism to not use YOLO * Removed specific tasks param, changed default epoch args to config setting. Config now prints right before training loop. * Set default GPU for diffusion models to cuda if no gpu_id specified. General cleanup for PR. * Reverted merge * Removed unnecessary imports * Added diffusion model verification/loading at the end of validator setup. Cleaned up imports and added comments to setup scripts. * Added validator model verification script for initial model download and loading checks. Added automated wandb and hugging face logins with new .env fields. Updated validator autoupdates. Cleaned up prints. * Fixed typo. * Added check for model presence in cache * Reinstated necessary pipeline imports * PEP8 class spacing * Notebook cleanup. * ImageAnnotationGenerator doc/class strings * SyntheticImageGenerator docstring updates * Camo/Gate Refactor (#64) * typo * simplifying gating mechanism, removing object detection * removing gate from ucf, returning last model output * updating configs to match latest gate implementation * pep8 formatting * future proof handling image_url in image_dataset * face gate cleanup * face gate cleanup * moving face_utils.py to utils dir, updating imports * docstring update * init imports * testing benchmark datasets * fixing detector import * moving norm to image transform pipeline * ucf transfomrs + CLAHE * removing extra norm config from real_fake_dataset * adding many to one (group_by_source) option * fixing fake_source_label value * Moving subset loading for face expert datasets to constants.py * simplifying transformation arguments * train script cleanup * moving data_processing/ functionality to utils/data.py * standardizing order of dataloading args * reorganizing code for cleaner structure + addressing circular import * cleaning up dataset splitting * updating NPR training to use new data loading code * adding missing requests import * adding new detector args to miner start script * adding new fields to miner env setup script * removing old model_path arg * setting default datasets * simplifying pm2 start commands * Fixing import * new configs * adding usage of self.device in npr, printing config loading in ucf * adding device to start miner scripts * Added parameterization options in miner.env template * docs updates * Fixed torch device setting * docs udpates * Readme fixes * HuggingFace renamed face expert dataset path * CPU device support with default GPU. Renamed local_rank to gpu_id --------- Co-authored-by: Dylan Uys <[email protected]> Co-authored-by: benliang99 <[email protected]> Co-authored-by: Andrew <[email protected]> * Updated Validator min compute (#71) * Updated min compute in anticipation of FLUX and LLM moderation additions. * Fixed typo, organized .env format. * Small updates (#72) * updating config print * small doc update * Release readme (#70) * Added config context in mining readme * Readme formatting * moving deployment instructions to deploy section * docs update * miner.env and base miner dir context * Spacing --------- Co-authored-by: Dylan Uys <[email protected]> * allowing auto device configuration (#74) * allowing auto device configuration * allowing config to be specified as a rel path * docs updateg * UCF backbone overwrite (#75) * Backbone now loaded from path in UCF detector config rather than train config * Updated ad hoc detector unit tests, added one script that runs all tests * Removed deprecated mining model weights directory * UCFDetector config no longer provides backbone path, instead uses training config backbone path which defaults to xception-best.pth on HF * Updated backbone load check * Training now uses xception-best.pth backbone weights from bitmind/bm-ucf/ on HuggingFace * Weights renaming (#76) * removing deprecated train_data.py * updating weight and config filenames w version str * Removed legacy (unused) directory * Renamed and moved validator model verification to bitmind/validator/, logging cleanup. * Added validator-specific unit testing * Added unit test for generating images. Added PEP8 docstrings. * Scoring Udpate (#80) * miner performance tracker * adding perf tracker to base validator save & load state * new reward function: squared AUC * EMA alpha 0.1 -> 0.01 * removing test that checked old reward function, will have a more sophisticated test in the future * explicitly setting window arg for get_metrics in get_rewards * logging metrics to wandb for leaderboard * bump version * adding perf tracker to mockvalidator * doc updates * removing extra / from path for consistency * Making wandb a general requirement for miners to check their performance * Correct WANDB_PROJECT constantexit * Update setup fields * Fixed axon undefined * Added conditional cuda usage --------- Co-authored-by: aliang322 <[email protected]> Co-authored-by: Dylan Uys <[email protected]> Co-authored-by: default <[email protected]> Co-authored-by: aliang322 <[email protected]> Co-authored-by: Ken Miyachi <[email protected]>

benliang99 and others added 30 commits July 31, 2024 00:09

Updated synthetic image mirror generation script, created helper func…

8e3169d

…tion for generating images iin SyntheticImageGenerator class, moved notebooks to new notebooks dir.

Restored ensure_save_path func back to annotation_utils.py

f939cbb

Added latency tracking, max images generated field for synthetic imag…

26f695e

…e mirror pipeline.

Added mean synthetic image gen latency print statement

810211a

Update example arg inputs

8aa47c8

Fix imports

37b1bdd

Fixed and reformatted args

1b9f657

Suppressed TensorFlow warnings, fixed image gen from annotation

8927abf

Index and loop bugfixes

48456ce

Index, looping, args logic fixes.

6fc7bba

Add load diffuser function call

c7e4045

Clear gpu after using synthetic image generator

23d467d

Always load from Hugging Face.

249cab8

Batch processing for memory optimization. Added optional name field t…

1d0565e

…o generate_image in SyntheticImageGenerato for future customization.

Memory optimizations (saving annotation .jsons to disk), added args f…

39bc7d4

…or chunking, pm2 examples

Fix save as json on disk, ensure no hanging reference when gpu is cle…

8f25209

…ared in SyntheticImageGenerator

Replaced generic DiffusionPipeline with StableDiffusionPipeline that …

8b09c07

…inherits from it. Specified generated image dimensions in diffuser call params.

convert diffuser to float32 before moving onto cpu, fixed duplicate i…

420513f

…mage count logs

Merge branch 'synthetic_image_gen_updates' of https://github.com/bitm…

4f2d671

…ind-ai/bitmind-subnet into synthetic_image_gen_updates

Added a testing function to save images from real image dataset, chan…

9e7e829

…ged annotations 'index' field to 'id' for consistency, various data loading and parameter fixes

Added pipeline for diffusion models to constants.py, and dynamic pipe…

9255df4

…line loading and image size customization to generation.

Fixed Hugging Face authentication errors. Added instruction to authen…

ebd3171

…ticate with huggingface-cli login

Fixed all annotations being used to generate mirrors regardless of st…

3f29b83

…art and end indices

Removed extra disable progress bar call. Added ceil import for progre…

1a435ef

…ss calculation.

Adjust Hugging Face annotations dataset name

d986842

Reverted annotations dataset name to have data range, now requiring s…

b59e513

…tart_index and end_index args.

Re-removed data range from annotations

2d0851c

Update 'index' to 'id'

6af5b61

Fixed loading annotations from Hugging Face and savng specified indic…

98da219

…es to disk.

dylanuys and others added 14 commits August 17, 2024 21:51

adding retries for nan images

46c572e

fixing image isnan check by adding np.any

3eb7ad6

Merged synthetic image gen updates with testnet

6429e39

rename wandb fields *image_id -> *image_name

981bb16

Merge pull request #46 from BitMind-AI/requirements-updates

3dcf25d

Versioned requirements.txt

Merge pull request #47 from BitMind-AI/vali-fwd-resampling

edf05fa

Validator Caption/Image Resampling

Updated failure case for generating annotations.

3a0fedf

Adjusted TF logging level to include error messages. Cleaned up unnec…

55a659f

…essary imports. Simplified clear_gpu to not moving tensor to CPU.

Reverted deletion of necessary diffusion pipeline imports. Adjusted T…

80c3f83

…F logging level in dataset generation script to be consistent with synthetic generation classes.

Merge pull request #45 from BitMind-AI/synthetic_image_gen_updates

c5ee3aa

Synthetic image gen updates

adding a sleep to reduce metagraph resync freq

e75c0b5

fixing edge case that occurs when only 1 miner has nonzero weight

445e822

Merge pull request #49 from BitMind-AI/fix-weight-setting-edgecase

dde6556

Fix Weight Setting Edge Case

bump version to 1.0.2

079e461

dylanuys commented Aug 19, 2024

View reviewed changes

dylanuys added 2 commits August 19, 2024 19:00

fixing download_data extension

deca46a

Merge pull request #51 from BitMind-AI/script-extension-fix

ae68510

fixing download_data extension

kenobijon approved these changes Aug 19, 2024

View reviewed changes

benliang99 and others added 5 commits August 19, 2024 15:02

Update fake dataset paths

27bf00b

Merge pull request #52 from BitMind-AI/update-dataset-paths

5d7669c

Update fake dataset paths

replacing conda activate with /home/user/mambaforge/envs/tensorml

bb60523

Merge pull request #53 from BitMind-AI/autoupdate-conda-prefix

e4b2f5b

Autoupdate Python/Pip Binaries

resolving merge conflicts

828e59f

dylanuys merged commit 97dd144 into main Aug 19, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release 1.0.2 #50

Release 1.0.2 #50

dylanuys commented Aug 19, 2024 •

edited

Loading

dylanuys Aug 19, 2024 •

edited

Loading

dylanuys Aug 19, 2024

kenobijon left a comment

kenobijon Aug 19, 2024

dylanuys Aug 19, 2024

Release 1.0.2 #50

Release 1.0.2 #50

Conversation

dylanuys commented Aug 19, 2024 • edited Loading

Patch 1.0.2

QOL Upgrades

Validator Technical Improvements:

dylanuys Aug 19, 2024 • edited Loading

Choose a reason for hiding this comment

dylanuys Aug 19, 2024

Choose a reason for hiding this comment

kenobijon left a comment

Choose a reason for hiding this comment

kenobijon Aug 19, 2024

Choose a reason for hiding this comment

dylanuys Aug 19, 2024

Choose a reason for hiding this comment

dylanuys commented Aug 19, 2024 •

edited

Loading

dylanuys Aug 19, 2024 •

edited

Loading