forked from invoice-x/invoice2data
Newr #10
bosd force-pushed the newr branch 6 times, most recently from ca1a5de to d707369 (December 16, 2024 21:33)
Update cookiecutter template
This commit includes a comprehensive refactoring of the codebase to improve code style, add type hints, and enhance documentation. The following key changes were made:

- **Type Hints:** Added type hints to all function parameters, return values, and variables throughout the codebase. This improves code readability and helps with static analysis using tools like `mypy`.
- **Docstrings:** Updated docstrings to conform to the Google style guide, including clear descriptions, parameter and return type documentation, and example usage.
- **Code Style:** Fixed indentation, removed unnecessary comments, and simplified code logic for better readability and maintainability.
- **Dependency Updates:** Updated dependencies and removed unnecessary imports to improve efficiency and reduce reliance on external packages.
- **Test Refactoring:** Refactored tests to be more robust and independent of specific environment setups.
- **Bug Fixes:** Fixed various bugs and issues identified by linters and tests, improving the overall stability and correctness of the code.

Specific changes include:

- Refactored the `to_xml`, `to_json`, and `to_csv` output modules to improve code style, add type hints, and enhance documentation.
- Updated the Google Vision input module to use the latest API version and improve code style.
- Refactored the `loader.py` module to optimize file loading and improve documentation.
- Refactored test files to remove the dependency on `setuptools` and improve test coverage.
- Fixed various linting errors and warnings reported by `ruff` and `pydoclint`.
- Updated the `noxfile.py` to streamline the testing process.
- Updated the `pyproject.toml` file to manage dependencies and configure linters.
- Updated the lockfile to reflect the updated dependencies.

These changes significantly improve the codebase's quality, readability, and maintainability while ensuring type safety and adherence to best practices.
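As an illustration of the type-hint and Google-style docstring conventions this commit describes, here is a minimal sketch. The function `parse_amount` and its parameters are hypothetical and are not taken from the invoice2data codebase.

```python
def parse_amount(raw: str, decimal_separator: str = ".") -> float:
    """Convert an amount string from an invoice into a float.

    Hypothetical helper, shown only to illustrate the docstring layout.

    Args:
        raw: The amount as it appears in the invoice text.
        decimal_separator: The character used as the decimal separator,
            either ``"."`` or ``","``.

    Returns:
        The parsed amount.

    Raises:
        ValueError: If ``raw`` does not contain a valid number.

    Example:
        >>> parse_amount("1.234,56", decimal_separator=",")
        1234.56
    """
    # The thousands separator is assumed to be the "other" character.
    thousands_separator = "," if decimal_separator == "." else "."
    normalized = raw.replace(thousands_separator, "").replace(decimal_separator, ".")
    return float(normalized)
```

A signature annotated this way can be checked by `mypy`, and the `Args`/`Returns`/`Raises` sections are the parts that `pydoclint` compares against the signature.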
- Extracted template loading: The code for loading templates has been moved to a separate function `_load_templates`.
- Extracted file processing and copy/move: The code for processing the extracted data and copying/moving the file has been moved to a separate function `_process_and_move_copy`.
- Simplified main function: The main function is now more concise and focused on the main control flow, as the specific logic for template loading and file processing has been delegated to helper functions.

This refactoring reduces the complexity of the main function by breaking it down into smaller, more manageable chunks. It also improves readability and maintainability by separating different concerns into dedicated functions.
To reduce the complexity of the `extract` function, several logical blocks are extracted into separate helper functions:

- `_initialize_output_and_log`: Handles the initialization of the output dictionary and the logging of debug information.
- `_handle_area`: Handles the logic for extracting text from a specific area of the invoice.
- `_handle_parser`: Handles the logic for parsing fields using different parsers.
- `_handle_legacy_syntax`: Handles the legacy syntax for backward compatibility.
- `_check_required_fields`: Checks if all required fields are present in the output.

By extracting these blocks into separate functions, the `extract` function becomes more concise and easier to understand. Each helper function focuses on a specific task, improving the overall organization and readability of the code. This refactoring addresses the "C901 extract is too complex" warning by reducing the function's complexity and improving the code's structure.
Improve Documentation
Improve the documentation by auto-documenting from the source code with Sphinx autodoc, plus several smaller documentation improvements.
Documentation: Add mermaid support
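For readers unfamiliar with the setup, a Sphinx `conf.py` enabling these features typically contains something like the fragment below. This is a sketch under assumptions: the commit message only mentions autodoc and mermaid, so the use of `sphinx.ext.napoleon` (for Google-style docstrings) and of the `sphinxcontrib.mermaid` extension is a guess at a common configuration, not the project's confirmed one.

```python
# docs/conf.py (illustrative fragment, not the project's actual file)
project = "invoice2data"

extensions = [
    "sphinx.ext.autodoc",     # pull API documentation from docstrings
    "sphinx.ext.napoleon",    # assumption: parse Google-style docstrings
    "sphinxcontrib.mermaid",  # assumption: render mermaid diagrams
]
```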
This PR introduces a big refactor.
A hypermodern cookiecutter template is used.
Read more about it here: https://cookiecutter-uv-hypermodern-python.readthedocs.io/en/latest/index.html
It brings a lot of features that will make maintaining this project a lot easier.
As this is a huge commit, I don't expect anyone to review it.
There are no breaking changes (yet).
Those are planned for future PRs.