-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic review pass #3
base: review
Are you sure you want to change the base?
Conversation
The most important thing this provides is a list of the package's dependencies. Version bounds are strongly recommended in the real world, but this is enough to get started.
Everything in the quickstart's loop is there for a reason. Most notably, the use of a `finally` block rather than a bare `except` block. Bare `except` handles every exception, including things you *really* don't want to catch like out-of-memory conditions. The quickstart wants to _finalize_ the observer-stop its thread and reclaim those resources-when exiting the scope it's supposed to be running, but if the scope is being exited for an abnormal reason we shouldn't continue on as though everything is fine. `finally` gives us those semantics, while `except:` empathically does not. Secondly, the observer thread can terminate for reasons other than the main thread requesting it stop. Should this occur the original code would loop infinitely; this new version will instead promptly exit. Detecting and reporting such a condition is left as an exercise for the reader. As a side note, the printout that the observer stopped is moved to after the thread join. In its original position it did not accurately reflect the state of the world: the observer had only been _requested_ to stop. We can only say that it _has_ stopped once we've observed control return from the thread join.
I'm honestly surprised that overriding an instance method with a static method works at all. It is at least surprising.
It's essentially universally accepted that global variables are not a good practice, so here we pass them down to where they are needed as parameters. In order to illustrate a motivation for the change, we extend the capabilities of `OnMyWatch` slightly, allowing any number of independent handlers to be registered. While we're in there adding parameters, we hoist the creation of the S3 client from deep in the guts of the Handler class to the top level. This is exactly the "inversion of control" or "dependency injection" pattern familiar from other languages and gives the same benefits: since client code determines the runtime type of the client, it is trivial to pass a mock implementation for testing, and so on. In addition, reusing the client allows the SDK to transparently reuse HTTPS connections as appropriate, saving connection overhead.
The body of every conditional branch in the event handler was essentially identical, so we can collapse the differences into the contidional itself. There's also a meaningful semantic change here, which I will insist is correct based on my experience with file watchers: instead of firing on modifications, this version fires when the file which was opened for modifications is closed. In my experience, watchers that fire on every call to `write(2)` are poorly behaved, because there are usually _many_ such calls involved in real-world edits and you end up in a race with the program making the edits and it's really just not a good time. _Also,_ at least on Linux with the inotify backend, a program which chooses to `mmap` the file to modify it will not generate the modified events at all. Waiting for the file to be closed does not _eliminate_ all race conditions, but it does minimize common ones. Finally, I will simply note that this version retains the original semantic bug regarding matching the file: because it uses a simple substring match, a request to watch a file `foo` will also match modifications to `foo_bar`. How this is best corrected depends on the exact semantics of the watchdog package's interface, and I don't know them.
This brings the advantage that `--help` works automatically. More elaborate (and more helpful) argparse setups are left as an exercise.
self.observer.join() | ||
|
||
|
||
|
||
class Handler(FileSystemEventHandler): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to add type hinting. The use of type-hinting seems to becoming a norm in Python through use of static checkers such as flake8, pyright and mypy. Besides that, using type hints definitely helps with debugging and maintenance to help readers of the code understand what data is flowing where. IDE's such as PyCharm (and VSCode, I think) will also warn you when it's obvious that the wrong data type is being applied to type hints - useful for your own development as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just remembered that boto3 is using some kind of dynamic type stuff (it's been a couple of years) so usually I had to use # type: ignore
to pacify mypy and move on.
On the up-side, using # type: ignore
would indicate to your reviewers that you know about mypy and assume it will be used and (hopefully) are even using it yourself.
from watchdog.observers import Observer | ||
from watchdog.events import FileSystemEventHandler | ||
|
||
s3_bucket_name = "" | ||
filename = "" | ||
|
||
class OnMyWatch: | ||
# Set the directory on watch | ||
watchDirectory = "e:/Code/indellinent/test/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use pathlib.Path
for easy path management?
) | ||
if event.event_type == 'created' or event.event_type == 'closed' and self.target_filename in event.src_path: | ||
self.s3.upload_file(event.src_path, self.bucket_name, self.target_filename) | ||
print("Watchdog received %s event - %s." % (event.event_type, event.src_path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use logging instead of print statements? (unless, of course, you are specifically wanting console output)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this case I wanted console output, but that is good to know for future.
while True: | ||
time.sleep(5) | ||
except: | ||
while self.observer.isAlive(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is a "running indefinitely" process add lots of logging to ensure useful diagnostic information?
loguru is an easy way to add really simple logging to your app, but it does diverge from using the standard logging library (it seems to be fairly popular though).
Each commit resolves a separate issue, with explanations in their commit messages.