Commit a53cef2

Author: slin96

cleanup (#47)

* updated main readme
* typos readme
* move examples
* moved examples
* wip examples for probing
* added gitkeep to have empty directory in examples
* approach examples
* remove storage intensive tests
* better explanation for config file

1 parent 8b6c364 commit a53cef2

12 files changed: +67 -132 lines changed

README.md

+20 -5
@@ -1,7 +1,22 @@
 # mmlib

-- A library for model management and related tasks.
-
+mmlib is a library that implements different approaches to save and recover models. It was developed as part of my
+master's thesis ([link to thesis repo](https://github.com/slin96/master-thesis)).
+
+The approach names in the thesis match the following implementations:
+- baseline approach
+  - implemented by the `BaselineSaveService`
+- parameter update approach
+  - implemented by `WeightUpdateSaveService` (set `improved_version=False`)
+- improved parameter update approach
+  - implemented by `WeightUpdateSaveService` (set `improved_version=True`)
+- provenance approach
+  - implemented by `ProvenanceSaveService`
+
+In addition to the approaches to save and recover models, we also implemented a **probing tool**:
+- the corresponding code is in `probe.py`
+- examples are provided under [examples](examples)
+
 ## Installation

 ### Option 1: Docker
@@ -10,15 +25,15 @@
 - **Build Library**
   - clone this repo
   - run the script `generate-archives-docker.sh`
-    - it runs a docker container and builds the *mmlib* in it.
+    - it runs a docker container and builds the *mmlib* in it
     - the created `dist` directory is copied back to repository root
     - it contains the `.whl` file that can be used to install the library with pip (see below)
 - **Install**
   - to install mmlib run: `pip install <PATH>/dist/mmlib-0.0.1-py3-none-any.whl`

 ### Option 2: Local Build

-- **Requirements**: Python 3.8
+- **Requirements**: Python 3.8 and Python `venv`
 - **Build Library**
   - run the script `generate-archives.sh`
     - it creates a virtual environment, activates it, and installs all requirements
@@ -28,7 +43,7 @@

 ## Examples

-- For examples on how to use mmlib check out the [examples](mmlib/examples) directory.
+- For examples on how to use mmlib check out the [examples](examples) directory.

mmlib/examples/README.md → examples/README.md

+19 -3
@@ -2,12 +2,28 @@

 This directory contains examples of how to use the functionality offered by the *mmlib*.

+## Approaches to save and recover models
+- all examples use MongoDB, which is started using Docker
+  - if you don't have Docker installed, you have to either install it or slightly adjust the examples
+- in `baseline_save.py` we provide an example of how to save and recover a model using the baseline approach
+- for all other approaches we do not give explicit examples and refer to our [tests for the approaches](../tests/save)
+
+
+## Probing Tool
+
+We provide some basic examples to show the different use cases of the probing tool.
+
+### Create a probe summary for a given model
 - *probe_store.py* - Creates and stores a probe summary of the training process of a GoogLeNet.
-  - execution: `python probe_store.py --path <optional path to store probe summary>`
+  - execution: `python probe_store.py --path <optional path to store probe summary>`
+
+### Create new summary and compare to given one
 - *probe_load_compare.py* - Creates a probe summary of the training process of a GoogLeNet and compares it to a stored
   probe summary
-  - execution: `python probe_load_compare.py --path <path to the already stored probe summary>`
-  - note: to generate and store a probe summary to compare against, use the *probe_store.py* script.
+  - execution: `python probe_load_compare.py --path <path to the already stored probe summary>`
+  - note: to generate and store a probe summary to compare against, use the *probe_store.py* script.
+
+### Extensive example
 - *probe_example.py* - Shows extensively how the probe functionality offered by the *mmlib* can be used to make the
   PyTorch implementation of GoogLeNet reproducible. It runs the following steps:
   - simple summary
File renamed without changes.

mmlib/examples/baseline_save.py → examples/baseline_save.py

+4 -2
@@ -3,7 +3,8 @@
 from mmlib.equal import model_equal
 from mmlib.persistence import FileSystemPersistenceService, MongoDictPersistenceService
 from mmlib.save import BaselineSaveService
-from mmlib.schema import ModelSaveInfoBuilder
+from mmlib.schema.save_info_builder import ModelSaveInfoBuilder
+from mmlib.track_env import track_current_environment
 from mmlib.util.dummy_data import imagenet_input
 from tests.example_files.mynets.mobilenet import mobilenet_v2

@@ -24,8 +25,9 @@
 # initialize instance of mobilenet_v2
 model = mobilenet_v2(pretrained=True)
 # create the info to save the model
+env = track_current_environment()
 save_info_builder = ModelSaveInfoBuilder()
-save_info_builder.add_model_info(model=model)
+save_info_builder.add_model_info(model=model, env=env)
 save_info = save_info_builder.build()
 # given the save info we can store the model, and get a model id back
 model_id = save_service.save_model(save_info)

examples/filesystem-tmp/.gitkeep

Whitespace-only changes.
File renamed without changes.

mmlib/examples/probe_load_compare.py → examples/probe_load_compare.py

+7 -5
@@ -1,18 +1,20 @@
 import argparse

-from mmlib.examples.probe_store import _generate_probe_training_summary
+from examples.probe_store import _generate_probe_training_summary
 from mmlib.probe import ProbeSummary, ProbeInfo


 def main(args):
+    # we use the functionality from the probe_store script to generate a summary for the GoogLeNet
     summary = _generate_probe_training_summary()
-
+    # we load the summary from the path given in the args
     loaded_summary = ProbeSummary(summary_path=args.path)
-
-    common = [ProbeInfo.LAYER_NAME]
+    # we specify the fields both summaries have in common (they are excluded from comparing)
+    common = [ProbeInfo.LAYER_NAME, ProbeInfo.FORWARD_INDEX]
+    # we define the fields we want to compare; in this case different kinds of tensors for the forward and backward pass
     compare = [ProbeInfo.INPUT_TENSOR, ProbeInfo.OUTPUT_TENSOR, ProbeInfo.GRAD_INPUT_TENSOR,
                ProbeInfo.GRAD_OUTPUT_TENSOR]
-
+    # having created one summary and loaded another, we compare them and print the comparison to the console
     summary.compare_to(loaded_summary, common, compare)
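
The roles of the `common` and `compare` lists can be illustrated with a small standalone sketch. It does not use mmlib: `compare_summaries`, the record layout, and the field names are hypothetical stand-ins, and the real `ProbeSummary.compare_to` may work differently. The idea shown is only that common fields label each row while compare fields are checked for equality.

```python
from typing import Dict, List

Record = Dict[str, object]


def compare_summaries(first: List[Record], second: List[Record],
                      common: List[str], compare: List[str]) -> List[Record]:
    """Hypothetical helper: pair records by position and flag which compare fields match."""
    if len(first) != len(second):
        raise ValueError("summaries must contain the same number of layer records")
    rows = []
    for rec_a, rec_b in zip(first, second):
        # common fields only identify the layer; they are copied, not compared
        row = {field: rec_a[field] for field in common}
        # compare fields are checked for equality between both summaries
        for field in compare:
            row[field] = "same" if rec_a[field] == rec_b[field] else "diff"
        rows.append(row)
    return rows


# two toy summaries; the hashes stand in for stored tensor fingerprints
summary_a = [
    {"layer_name": "conv1", "forward_index": 0, "output_hash": "a1"},
    {"layer_name": "fc", "forward_index": 1, "output_hash": "b2"},
]
summary_b = [
    {"layer_name": "conv1", "forward_index": 0, "output_hash": "a1"},
    {"layer_name": "fc", "forward_index": 1, "output_hash": "zz"},
]

result = compare_summaries(summary_a, summary_b,
                           common=["layer_name", "forward_index"],
                           compare=["output_hash"])
for row in result:
    print(row)
```

In this toy run the `conv1` row reports a matching output and the `fc` row reports a difference, which is the kind of per-layer signal a comparison of two probe runs is meant to give.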

mmlib/examples/probe_store.py → examples/probe_store.py

+8 -2
@@ -11,19 +11,25 @@


 def main(args):
+    # we create a probe summary and get it back as an object
     summary = _generate_probe_training_summary()
-
+    # we can save the summary to the path given in the args
     output_path = os.path.join(args.path, 'summary')
     summary.save(output_path)


 def _generate_probe_training_summary():
+    # first, we force the implementation to be deterministic using mmlib's set_deterministic() function
     set_deterministic()
+    # as an example we want to probe the GoogLeNet architecture
+    model = models.googlenet(pretrained=True)
+    # to probe the forward and backward pass we have to create some dummy data
+    # we need: input, target, loss function and optimizer
     dummy_input = imagenet_input()
     dummy_target = imagenet_target(dummy_input)
     loss_func = nn.CrossEntropyLoss()
-    model = models.googlenet(pretrained=True)
     optimizer = torch.optim.SGD(model.parameters(), 1e-3)
+    # having created the model and all dummy data we can execute a probe run and return the summary
     summary = probe_training(model, dummy_input, optimizer, loss_func, dummy_target)
     return summary

mmlib/examples/provenance_save.py

-104
This file was deleted.

mmlib/schema/schema_obj.py

+1 -1
@@ -13,7 +13,7 @@

 class SchemaObj(metaclass=abc.ABCMeta):

-    def __init__(self, store_id: str = None, logging=True):
+    def __init__(self, store_id: str = None, logging=False):
         self.store_id = store_id
         self.logging = logging
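
The effect of flipping the `logging` default can be sketched with a small standalone class. This is not the mmlib `SchemaObj`: the class names, the `persist` method, and the `messages` list are assumptions made for illustration. It only shows that subclasses created without arguments stay quiet after the change, while callers can still opt in.

```python
import abc
from typing import List, Optional


class SchemaObjSketch(metaclass=abc.ABCMeta):
    """Hypothetical stand-in for an abstract schema object with opt-in logging."""

    def __init__(self, store_id: Optional[str] = None, logging: bool = False):
        self.store_id = store_id
        self.logging = logging
        self.messages: List[str] = []  # collected log lines (assumption, for easy inspection)

    def _log(self, message: str) -> None:
        # only record messages when logging was explicitly enabled
        if self.logging:
            self.messages.append(message)

    @abc.abstractmethod
    def persist(self) -> str:
        """Store the object and return its id."""


class DummyInfo(SchemaObjSketch):
    def persist(self) -> str:
        self._log("persisting DummyInfo")
        self.store_id = self.store_id or "id-1"
        return self.store_id


quiet = DummyInfo()                # new default: no log output
verbose = DummyInfo(logging=True)  # logging must now be requested explicitly
quiet.persist()
verbose.persist()
print(quiet.messages, verbose.messages)
```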

tests/model_equal/test_equal.py

+3 -9
@@ -5,8 +5,8 @@

 from mmlib.deterministic import set_deterministic
 from mmlib.equal import state_dict_equal, model_equal
-from mmlib.util import state_dict_hash, tensor_hash
 from mmlib.util.dummy_data import imagenet_input
+from mmlib.util.hash import state_dict_hash, tensor_hash


 class TestStateDictEqual(unittest.TestCase):
@@ -58,12 +58,6 @@ def test_resnet18_pretrained(self):

         self.assertTrue(model_equal(mod1, mod2, imagenet_input))

-    def test_resnet50_pretrained(self):
-        mod1 = models.resnet50(pretrained=True)
-        mod2 = models.resnet50(pretrained=True)
-
-        self.assertTrue(model_equal(mod1, mod2, imagenet_input))
-
     def test_googlenet_pretrained(self):
         mod1 = models.googlenet(pretrained=True)
         mod2 = models.googlenet(pretrained=True)
@@ -76,9 +70,9 @@ def test_mobilenet_v2_pretrained(self):

         self.assertTrue(model_equal(mod1, mod2, imagenet_input))

-    def test_resnet18_resnet152_pretrained(self):
+    def test_resnet18_mobilenet_pretrained(self):
         mod1 = models.resnet18(pretrained=True)
-        mod2 = models.resnet152(pretrained=True)
+        mod2 = models.mobilenet_v2(pretrained=True)

         self.assertFalse(model_equal(mod1, mod2, imagenet_input))
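
The pattern these tests follow (two identically built models compare equal, two different architectures do not) can be shown without downloading pretrained weights. The sketch below is a toy under stated assumptions: `ToyModel`, `toy_input`, and `toy_model_equal` are hypothetical, and mmlib's `model_equal` may perform different or additional checks. As in the tests, the input generator is passed as a callable.

```python
from typing import Callable, Dict, List


class ToyModel:
    """Stand-in for a network: named weight lists plus a deterministic forward pass."""

    def __init__(self, weights: Dict[str, List[float]]):
        self.weights = weights

    def state_dict(self) -> Dict[str, List[float]]:
        return self.weights

    def __call__(self, inputs: List[float]) -> float:
        # weighted sum over every parameter list
        return sum(w * x for values in self.weights.values() for w, x in zip(values, inputs))


def toy_input() -> List[float]:
    return [1.0, 2.0, 3.0]


def toy_model_equal(m1: ToyModel, m2: ToyModel, make_input: Callable[[], List[float]]) -> bool:
    """Hypothetical check: same parameters and same output on a generated input."""
    if m1.state_dict() != m2.state_dict():
        return False
    sample = make_input()
    return m1(sample) == m2(sample)


same_a = ToyModel({"linear": [0.5, -1.0, 2.0]})
same_b = ToyModel({"linear": [0.5, -1.0, 2.0]})
other = ToyModel({"conv": [0.1, 0.2], "linear": [0.5, -1.0, 2.0]})

print(toy_model_equal(same_a, same_b, toy_input))  # True
print(toy_model_equal(same_a, other, toy_input))   # False
```

Replacing `resnet152` with `mobilenet_v2` in the negative test keeps this "different architectures are not equal" check while avoiding a large download, in line with the commit note about removing storage intensive tests.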

tests/save/test_prov_save_servcie.py

+5 -1
@@ -32,7 +32,11 @@ class TestProvSaveService(TestBaselineSaveService):
     def setUp(self) -> None:
         super().setUp()
         assert os.path.isfile(CONFIG), \
-            'to run these tests define your own config file named \'local-config\' with respect to the template file'
+            'to run these tests define your own config file named \'local-config\'; ' \
+            'to do so copy the file under tests/example_files/config.ini, and place it in the same directory, ' \
+            'rename it to local-config.ini, create an empty directory and put the path to it in the newly created ' \
+            'config file, to define the current_data_root'
+
         os.environ[MMLIB_CONFIG] = CONFIG

     def init_save_service(self, dict_pers_service, file_pers_service):
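
The setup steps named in the assertion message (copy the template, rename it to `local-config.ini`, point `current_data_root` at an empty directory, export the config path) might be scripted roughly as follows. This is a sketch under assumptions: the section name `VALUES`, the template contents, and the environment variable name `MMLIB_CONFIG` are hypothetical, since the diff shows only the key `current_data_root` and the constant name. It runs against temporary directories rather than the repository.

```python
import configparser
import os
import tempfile

MMLIB_CONFIG = "MMLIB_CONFIG"  # assumed environment variable name, not confirmed by the diff


def create_local_config(template_path: str, data_root: str, section: str = "VALUES") -> str:
    """Copy the template next to itself as local-config.ini and set current_data_root."""
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(template_path):
        raise FileNotFoundError(template_path)
    if not config.has_section(section):
        config.add_section(section)
    config[section]["current_data_root"] = data_root
    local_path = os.path.join(os.path.dirname(template_path), "local-config.ini")
    with open(local_path, "w") as config_file:
        config.write(config_file)
    return local_path


# demo in a temporary directory; the template content is a hypothetical placeholder
work_dir = tempfile.mkdtemp()
template = os.path.join(work_dir, "config.ini")
with open(template, "w") as template_file:
    template_file.write("[VALUES]\ncurrent_data_root = <path to an empty directory>\n")

data_root = tempfile.mkdtemp()  # the empty directory used as data root
local_config = create_local_config(template, data_root)
os.environ[MMLIB_CONFIG] = local_config

print(os.path.isfile(local_config))
```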
