kedro-telemetry: Spike how to enable telemetry for "kedro new" and other commands executed outside of kedro project folder #729

Closed
Tracked by #4025
DimedS opened this issue Jun 14, 2024 · 1 comment · Fixed by #775

DimedS commented Jun 14, 2024

Description

Enable telemetry for kedro new and other commands executed outside of a Kedro project folder (e.g. before project creation).

Context

Currently, kedro-telemetry doesn't work with the kedro new command. I believe this is due to the following reasons:

  1. Telemetry requires the kedro-telemetry plugin to be installed, but the plugin is normally installed only after the project has been created with kedro new and pip install -r requirements.txt has been run. Even if the plugin is pre-installed, kedro new is not covered by telemetry.
  2. It was not possible to provide consent: false before the project's folder was created.
  3. In packaged mode, writing a file in a read-only environment caused issues.

With the shift to opt-out consent in #715, and the introduction of new ways to disable telemetry in #728, it would be reasonable to send telemetry in the pre-command hook of any command, including kedro new. With opt-out consent, we will stop asking for consent and will avoid problems with writing an answer.

Important:
However, for the first iteration of the opt-out flow we should still not send telemetry in packaged mode. This makes it easier to determine the impact of changing to an opt-out model. Immediately collecting data for packaged projects would muddle the effect.

So we need to change how we determine the packaged mode. Currently, we use:

            if not project_metadata:  # in package mode
                return

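This check works for packaged mode, but it also disables telemetry for kedro new and for any other command run outside a project folder, because project_metadata is empty in both cases.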
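The effect of that guard can be illustrated with a minimal, self-contained sketch. It does not import Kedro: FakeProjectMetadata and telemetry_is_sent are hypothetical stand-ins invented for this illustration, and the only thing taken from the plugin is the `if not project_metadata` check quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeProjectMetadata:
    """Hypothetical stand-in for Kedro's ProjectMetadata (illustration only)."""

    project_name: str


def telemetry_is_sent(project_metadata: Optional[FakeProjectMetadata]) -> bool:
    """Mirror the quoted guard: missing metadata causes an early return."""
    if not project_metadata:  # labelled "in package mode" in the plugin
        return False
    return True


# Inside a project folder, metadata is available, so telemetry would proceed.
inside_project = telemetry_is_sent(FakeProjectMetadata(project_name="demo"))

# `kedro new` runs before any project exists, so metadata is None
# and the same guard skips telemetry for it as well.
outside_project = telemetry_is_sent(None)

print(inside_project, outside_project)
```

The guard cannot tell "packaged mode" apart from "no project yet", which is why a different way of detecting packaged mode is needed.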

@merelcht merelcht changed the title kedro-telemetry: enable telemetry for "kedro new" and other commands executed outside of kedro project folder kedro-telemetry: Spike how to enable telemetry for "kedro new" and other commands executed outside of kedro project folder Jul 1, 2024
@merelcht merelcht added this to the Telemetry opt-out milestone Jul 1, 2024
@merelcht merelcht moved this to To Do in Kedro Framework Jul 1, 2024
@DimedS DimedS moved this from To Do to In Progress in Kedro Framework Jul 10, 2024

DimedS commented Jul 10, 2024

The current comment about package mode in plugin.py is likely incorrect:

class KedroTelemetryCLIHooks:
    """Hook to send CLI command data to Heap"""

    @cli_hook_impl
    def before_command_run(
        self, project_metadata: ProjectMetadata, command_args: list[str]
    ):
        """Hook implementation to send command run data to Heap"""
        try:
            if not project_metadata:  # in package mode
                return

This hook fires only when a Kedro CLI command is executed. If the Kedro project is packaged and executed with the package_name command, the run command is called directly, without going through the CLI or its hooks. Therefore, in the current design, we would not collect telemetry data in package mode even without the above code.
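A toy model of the two entry points may make this easier to see. Everything here is hypothetical (cli_entry_point, packaged_entry_point and the hook_calls list are invented for illustration and are not Kedro APIs); it only models the behaviour described in this comment.

```python
from typing import List, Optional

# Records every invocation of the (toy) CLI hook.
hook_calls: List[List[str]] = []


def before_command_run(project_metadata: Optional[object], command_args: List[str]) -> None:
    """Toy CLI hook: only records that it was invoked."""
    hook_calls.append(command_args)


def run_pipeline() -> str:
    """Toy stand-in for the actual run command."""
    return "pipeline ran"


def cli_entry_point(command_args: List[str]) -> str:
    """Hypothetical model of the CLI path: hooks fire before the command."""
    before_command_run(None, command_args)
    return run_pipeline()


def packaged_entry_point() -> str:
    """Hypothetical model of the packaged path: run is called directly."""
    return run_pipeline()


cli_result = cli_entry_point(["run"])
packaged_result = packaged_entry_point()

# Only the CLI path reached the hook.
print(hook_calls)
```

Under this model, the early return in the hook is irrelevant to packaged runs, because the hook is never reached on that path.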

To allow telemetry collection with the kedro new command, we need to remove these lines:

            if not project_metadata:  # in package mode
                return

and restructure the code below them. When kedro new is executed, no project has been created yet, so no information about the Kedro project is available and the remaining code must handle an empty project_metadata.
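One possible shape for that restructuring is sketched below, as an assumption rather than the actual change: project-specific fields are added only when metadata exists, and the event is still built otherwise. FakeProjectMetadata, build_event_properties and the property names are all hypothetical and are not taken from the plugin.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeProjectMetadata:
    """Hypothetical stand-in for Kedro's ProjectMetadata (illustration only)."""

    project_version: str


def build_event_properties(
    project_metadata: Optional[FakeProjectMetadata], command_args: List[str]
) -> Dict[str, Any]:
    """Hypothetical helper: build telemetry properties with or without a project."""
    properties: Dict[str, Any] = {
        "command": command_args[0] if command_args else "",
        "inside_project": project_metadata is not None,
    }
    # Project details exist only when the command runs inside a project folder.
    if project_metadata is not None:
        properties["project_version"] = project_metadata.project_version
    return properties


# `kedro new`: no project yet, so only command-level properties are present.
new_props = build_event_properties(None, ["new"])

# `kedro run` inside a project: project-level properties are added.
run_props = build_event_properties(FakeProjectMetadata(project_version="0.1"), ["run"])

print(new_props)
print(run_props)
```

The point of the sketch is that the early return is replaced by a conditional branch, so commands without a project still produce an event.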

For consistency, this change should be made after merging PR #760 and addressing Issue #728.

@noklam noklam moved this from In Progress to In Review in Kedro Framework Jul 15, 2024
@noklam noklam moved this from In Review to In Progress in Kedro Framework Jul 15, 2024
@DimedS DimedS linked a pull request Jul 22, 2024 that will close this issue
4 tasks
@DimedS DimedS moved this from In Progress to In Review in Kedro Framework Jul 22, 2024
@github-project-automation github-project-automation bot moved this from In Review to Done in Kedro Framework Aug 1, 2024
3 participants