Basic review pass #3

cite-reader · 2022-11-12T01:27:25Z

Each commit resolves a separate issue, with explanations in their commit messages.

The most important thing this provides is a list of the package's dependencies. Version bounds are strongly recommended in the real world, but this is enough to get started.

Everything in the quickstart's loop is there for a reason. Most notably, the use of a `finally` block rather than a bare `except` block. Bare `except` handles every exception, including things you *really* don't want to catch like out-of-memory conditions. The quickstart wants to _finalize_ the observer-stop its thread and reclaim those resources-when exiting the scope it's supposed to be running, but if the scope is being exited for an abnormal reason we shouldn't continue on as though everything is fine. `finally` gives us those semantics, while `except:` empathically does not. Secondly, the observer thread can terminate for reasons other than the main thread requesting it stop. Should this occur the original code would loop infinitely; this new version will instead promptly exit. Detecting and reporting such a condition is left as an exercise for the reader. As a side note, the printout that the observer stopped is moved to after the thread join. In its original position it did not accurately reflect the state of the world: the observer had only been _requested_ to stop. We can only say that it _has_ stopped once we've observed control return from the thread join.

I'm honestly surprised that overriding an instance method with a static method works at all. It is at least surprising.

It's essentially universally accepted that global variables are not a good practice, so here we pass them down to where they are needed as parameters. In order to illustrate a motivation for the change, we extend the capabilities of `OnMyWatch` slightly, allowing any number of independent handlers to be registered. While we're in there adding parameters, we hoist the creation of the S3 client from deep in the guts of the Handler class to the top level. This is exactly the "inversion of control" or "dependency injection" pattern familiar from other languages and gives the same benefits: since client code determines the runtime type of the client, it is trivial to pass a mock implementation for testing, and so on. In addition, reusing the client allows the SDK to transparently reuse HTTPS connections as appropriate, saving connection overhead.

The body of every conditional branch in the event handler was essentially identical, so we can collapse the differences into the contidional itself. There's also a meaningful semantic change here, which I will insist is correct based on my experience with file watchers: instead of firing on modifications, this version fires when the file which was opened for modifications is closed. In my experience, watchers that fire on every call to `write(2)` are poorly behaved, because there are usually _many_ such calls involved in real-world edits and you end up in a race with the program making the edits and it's really just not a good time. _Also,_ at least on Linux with the inotify backend, a program which chooses to `mmap` the file to modify it will not generate the modified events at all. Waiting for the file to be closed does not _eliminate_ all race conditions, but it does minimize common ones. Finally, I will simply note that this version retains the original semantic bug regarding matching the file: because it uses a simple substring match, a request to watch a file `foo` will also match modifications to `foo_bar`. How this is best corrected depends on the exact semantics of the watchdog package's interface, and I don't know them.

This brings the advantage that `--help` works automatically. More elaborate (and more helpful) argparse setups are left as an exercise.

blueskyjunkie · 2022-11-12T15:52:48Z

main.py

-        self.observer.join()
-
-
+
 class Handler(FileSystemEventHandler):


You might want to add type hinting. The use of type-hinting seems to becoming a norm in Python through use of static checkers such as flake8, pyright and mypy. Besides that, using type hints definitely helps with debugging and maintenance to help readers of the code understand what data is flowing where. IDE's such as PyCharm (and VSCode, I think) will also warn you when it's obvious that the wrong data type is being applied to type hints - useful for your own development as well.

Just remembered that boto3 is using some kind of dynamic type stuff (it's been a couple of years) so usually I had to use # type: ignore to pacify mypy and move on.

On the up-side, using # type: ignore would indicate to your reviewers that you know about mypy and assume it will be used and (hopefully) are even using it yourself.

blueskyjunkie · 2022-11-12T15:55:59Z

main.py

 from watchdog.observers import Observer
 from watchdog.events import FileSystemEventHandler

-s3_bucket_name = ""
-filename = ""
-
 class OnMyWatch:
    # Set the directory on watch
    watchDirectory = "e:/Code/indellinent/test/"


Use pathlib.Path for easy path management?

blueskyjunkie · 2022-11-12T15:59:20Z

main.py

-            )
+        if event.event_type == 'created' or event.event_type == 'closed' and self.target_filename in event.src_path:
+            self.s3.upload_file(event.src_path, self.bucket_name, self.target_filename)
+            print("Watchdog received %s event - %s." % (event.event_type, event.src_path))


use logging instead of print statements? (unless, of course, you are specifically wanting console output)

I think for this case I wanted console output, but that is good to know for future.

blueskyjunkie · 2022-11-12T16:02:50Z

main.py

-            while True:
-                time.sleep(5)
-        except:
+            while self.observer.isAlive():


since this is a "running indefinitely" process add lots of logging to ensure useful diagnostic information?

loguru is an easy way to add really simple logging to your app, but it does diverge from using the standard logging library (it seems to be fairly popular though).

cite-reader added 6 commits November 11, 2022 15:27

Add a rudimentary pyproject.toml

87b736c

The most important thing this provides is a list of the package's dependencies. Version bounds are strongly recommended in the real world, but this is enough to get started.

Override superclass method at same level

a5ce436

I'm honestly surprised that overriding an instance method with a static method works at all. It is at least surprising.

Parse the cmdline with argparse

6eb5f2a

This brings the advantage that `--help` works automatically. More elaborate (and more helpful) argparse setups are left as an exercise.

blueskyjunkie reviewed Nov 12, 2022

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Basic review pass #3

Basic review pass #3

cite-reader commented Nov 12, 2022

blueskyjunkie Nov 12, 2022

blueskyjunkie Nov 12, 2022

blueskyjunkie Nov 12, 2022

blueskyjunkie Nov 12, 2022

canadiancreed Nov 12, 2022

blueskyjunkie Nov 12, 2022

Basic review pass #3

Are you sure you want to change the base?

Basic review pass #3

Conversation

cite-reader commented Nov 12, 2022

blueskyjunkie Nov 12, 2022

Choose a reason for hiding this comment

blueskyjunkie Nov 12, 2022

Choose a reason for hiding this comment

blueskyjunkie Nov 12, 2022

Choose a reason for hiding this comment

blueskyjunkie Nov 12, 2022

Choose a reason for hiding this comment

canadiancreed Nov 12, 2022

Choose a reason for hiding this comment

blueskyjunkie Nov 12, 2022

Choose a reason for hiding this comment