diff --git a/docs/source/02_get_started/01_prerequisites.md b/docs/source/02_get_started/01_prerequisites.md index e5707d1e21..e7413ab603 100644 --- a/docs/source/02_get_started/01_prerequisites.md +++ b/docs/source/02_get_started/01_prerequisites.md @@ -2,7 +2,7 @@ - Kedro supports macOS, Linux and Windows (7 / 8 / 10 and Windows Server 2016+). If you encounter any problems on these platforms, please check the [frequently asked questions](../11_faq/01_faq.md), and / or the Kedro community support on [Stack Overflow](https://stackoverflow.com/questions/tagged/kedro). -- To work with Kedro, we highly recommend that you [download and install Anaconda](https://www.anaconda.com/download/#macos) (Python 3.x version). +- To work with Kedro, we highly recommend that you [download and install Anaconda](https://www.anaconda.com/download/) (Python 3.x version). - If you are using PySpark, you will also need to [install Java](https://www.oracle.com/technetwork/java/javase/downloads/index.html). If you are a Windows user, you will need admin rights to complete the installation. diff --git a/docs/source/02_get_started/03_hello_kedro.md b/docs/source/02_get_started/03_hello_kedro.md index dd948dfe3e..e71226c589 100644 --- a/docs/source/02_get_started/03_hello_kedro.md +++ b/docs/source/02_get_started/03_hello_kedro.md @@ -15,8 +15,9 @@ Here, the `return_greeting` function is wrapped by a node called `return_greetin ```python from kedro.pipeline import node + +# Prepare first node def return_greeting(): - # Prepare first node return "Hello" @@ -28,8 +29,8 @@ return_greeting_node = node( The `join_statements` function is wrapped by a node called `join_statements_node`, which names a single input (`my_salutation`) and a single output (`my_message`): ```python +# Prepare second node def join_statements(greeting): - # Prepare second node return f"{greeting} Kedro!" @@ -83,7 +84,7 @@ The Runner is an object that runs the pipeline. Kedro resolves the order in whic It's now time to stitch the code together. Here is the full example: ```python -"""Content of hello_kedro.py""" +"""Contents of hello_kedro.py""" from kedro.io import DataCatalog, MemoryDataSet from kedro.pipeline import node, Pipeline from kedro.runner import SequentialRunner @@ -91,9 +92,8 @@ from kedro.runner import SequentialRunner # Prepare a data catalog data_catalog = DataCatalog({"example_data": MemoryDataSet()}) - +# Prepare first node def return_greeting(): - # Prepare first node return "Hello" @@ -101,9 +101,8 @@ return_greeting_node = node( return_greeting, inputs=None, outputs="my_salutation" ) - +# Prepare second node def join_statements(greeting): - # Prepare second node return f"{greeting} Kedro!" diff --git a/docs/source/02_get_started/05_example_project.md b/docs/source/02_get_started/05_example_project.md index c5f430f73b..87b40acfb9 100644 --- a/docs/source/02_get_started/05_example_project.md +++ b/docs/source/02_get_started/05_example_project.md @@ -55,7 +55,7 @@ For project-specific settings to share across different installations (for examp The folder contains three files for the example, but you can add others as you require: -- `catalog.yml` - [Configures the Data Catalog](../04_data_catalog/04_data_catalog#using-the-data-catalog-within-kedro-configuration) with the file paths and load/save configuration required for different datasets +- `catalog.yml` - [Configures the Data Catalog](../05_data/01_data_catalog#using-the-data-catalog-within-kedro-configuration) with the file paths and load/save configuration required for different datasets - `logging.yml` - Uses Python's default [`logging`](https://docs.python.org/3/library/logging.html) library to set up logging - `parameters.yml` - Allows you to define parameters for machine learning experiments e.g. train / test split and number of iterations diff --git a/docs/source/02_get_started/06_starters.md b/docs/source/02_get_started/06_starters.md index f831db37ce..a1ceaccb09 100644 --- a/docs/source/02_get_started/06_starters.md +++ b/docs/source/02_get_started/06_starters.md @@ -1,6 +1,5 @@ # Kedro starters - Kedro starters are used to create projects that contain code to run as-is, or to adapt and extend. They provide pre-defined example code and configuration that can be reused, for example: * As example code for a typical Kedro project @@ -65,7 +64,7 @@ Under the hood, the value will be passed to the [`--checkout` flag in Cookiecutt ## Use a starter in interactive mode -By default, when you create a new project using a starter, `kedro new` launches in [interactive mode](./04_new_project.md). You will be prompted to provide the following variables: +By default, when you create a new project using a starter, `kedro new` launches [by asking a few questions](./04_new_project.md#create-a-new-project-interactively). You will be prompted to provide the following variables: * `project_name` - A human readable name for your new project * `repo_name` - A name for the directory that holds your project repository diff --git a/docs/source/03_tutorial/02_tutorial_template.md b/docs/source/03_tutorial/02_tutorial_template.md index 52acf6327c..4e230ea96c 100644 --- a/docs/source/03_tutorial/02_tutorial_template.md +++ b/docs/source/03_tutorial/02_tutorial_template.md @@ -10,7 +10,7 @@ In this section, we discuss the project set-up phase, which is the first part of ## Create a new project -Navigate to your chosen working directory and run the following to [create a new empty Kedro project](../02_get_started/04_new_project.md) using the default interactive prompts: +Navigate to your chosen working directory and run the following to [create a new empty Kedro project](../02_get_started/04_new_project.md#create-a-new-project-interactively) using the default interactive prompts: ```bash kedro new @@ -34,7 +34,7 @@ isort>=4.3.21, <5.0 # Used for linting code with `kedro lint` jupyter>=1.0.0, <2.0 # Used to open a Kedro-session in Jupyter Notebook & Lab jupyter_client>=5.1.0, <7.0 # Used to open a Kedro-session in Jupyter Notebook & Lab jupyterlab==0.31.1 # Used to open a Kedro-session in Jupyter Lab -kedro==0.16.3 +kedro==0.16.5 nbstripout==0.3.3 # Strips the output of a Jupyter Notebook and writes the outputless version to the original file pytest-cov>=2.5, <3.0 # Produces test coverage reports pytest-mock>=1.7.1,<2.0 # Wrapper around the mock package for easier use with pytest diff --git a/docs/source/03_tutorial/04_create_pipelines.md b/docs/source/03_tutorial/04_create_pipelines.md index fbd41412f4..bd388354bb 100644 --- a/docs/source/03_tutorial/04_create_pipelines.md +++ b/docs/source/03_tutorial/04_create_pipelines.md @@ -421,7 +421,7 @@ test_size: 0.2 random_state: 3 ``` -These are the parameters fed into the `DataCatalog` when the pipeline is executed. More information about [parameters](../04_kedro_project_setup/01_configuration.md#parameters) is available in later documentation for advanced usage. +These are the parameters fed into the `DataCatalog` when the pipeline is executed. More information about [parameters](../04_kedro_project_setup/02_configuration.md#Parameters) is available in later documentation for advanced usage. ### Register the dataset The next step is to register the dataset that will save the trained model, by adding the following definition to `conf/base/catalog.yml`: @@ -433,7 +433,9 @@ regressor: versioned: true ``` -> *Note:* Versioning is enabled for `regressor`, which means that the pickled output of the `regressor` will be versioned and saved every time the pipeline is run. This allows us to keep the history of the models built using this pipeline. Further details can be found in the [Versioning](../05_data/02_kedro_io.md#versioning). +> *Note:* Versioning is enabled for `regressor`, which means that the pickled output of the `regressor` will be +> versioned and saved every time the pipeline is run. This allows us to keep the history of the models built using +> this pipeline. Further details can be found in the [Versioning](../05_data/02_kedro_io.md#versioning) section. ### Assemble the data science pipeline To create a pipeline for the price prediction model, add the following to the top of `src/kedro_tutorial/pipelines/data_science/pipeline.py`: diff --git a/docs/source/06_nodes_and_pipelines/02_pipelines.md b/docs/source/06_nodes_and_pipelines/02_pipelines.md index 0b9b009049..f8653288ac 100644 --- a/docs/source/06_nodes_and_pipelines/02_pipelines.md +++ b/docs/source/06_nodes_and_pipelines/02_pipelines.md @@ -161,7 +161,7 @@ Modular pipelines serve the following main purposes: ### How do I create modular pipelines? -For projects created using Kedro version 0.16.0 or later, Kedro ships a [project-specific CLI command](../07_extend_kedro/05_plugins.md#global-and-project-commands) `kedro pipeline create `, which does the following for you: +For projects created using Kedro version 0.16.0 or later, Kedro ships a [project-specific CLI command](../09_development/03_commands_reference.md) `kedro pipeline create `, which does the following for you: 1. Adds a new modular pipeline in a `src//pipelines//` directory 2. Creates boilerplate configuration files, `catalog.yml` and `parameters.yml`, in `conf//pipelines//`, where `` defaults to `base` 3. Makes a placeholder for the pipeline unit tests in `src/tests/pipelines//` @@ -184,7 +184,7 @@ You can manually delete all the files that belong to a modular pipeline. However * All the modular pipeline code in `src//pipelines//` * Configuration files in `conf//pipelines//`, where `` defaults to `base`. If the files are located in a different config environment, run `kedro pipeline delete --env `. -* Pipeline unit tests in `src/tests/pipelines//` +* Pipeline unit tests in `src/tests/pipelines//` ### Modular pipeline structure @@ -207,7 +207,8 @@ pipeline = my_modular_pipeline_1.create_pipeline() Here is a list of recommendations for developing a modular pipeline: * A modular pipeline should include a `README.md`, with all the information regarding the execution of the pipeline for the end users -* A modular pipeline _may_ have external dependencies specified in `requirements.txt`. These dependencies are _not_ currently installed by the [`kedro install`](../04_kedro_project_setup/01_dependencies.md#kedro-install) command, so the users of your pipeline would have to run `pip install -r src//pipelines//requirements.txt` +* A modular pipeline _may_ have external dependencies specified in `requirements.txt`. These dependencies are _not_ + currently installed by the [`kedro install`](../09_development/03_commands_reference.md#Install-all-package-dependencies) command, so the users of your pipeline would have to run `pip install -r src//pipelines//requirements.txt` * To ensure portability, modular pipelines should use relative imports when accessing their own objects and absolute imports otherwise. Look at an example from `src/new_kedro_project/pipelines/modular_pipeline_1/pipeline.py` below:
@@ -265,7 +266,7 @@ project_hooks = ProjectHooks() ### How do I share a modular pipeline? #### Packaging a modular pipeline -Since Kedro 0.16.4 you can package a modular pipeline by executing `kedro pipeline package ` command, which will generate a new [wheel file](https://pythonwheels.com/) for it. By default, the wheel file will be saved into `src/dist` directory inside your project, however this can be changed using `--destination` (`-d`) option. +Since Kedro 0.16.4 you can package a modular pipeline by executing `kedro pipeline package ` command, which will generate a new [wheel file](https://pythonwheels.com/) for it. By default, the wheel file will be saved into `src/dist` directory inside your project, however this can be changed using the `--destination` (`-d`) option. When packaging your modular pipeline, Kedro will also automatically package files from 3 locations : diff --git a/docs/source/07_extend_kedro/01_custom_datasets.md b/docs/source/07_extend_kedro/01_custom_datasets.md index d474fd120a..233dd627da 100644 --- a/docs/source/07_extend_kedro/01_custom_datasets.md +++ b/docs/source/07_extend_kedro/01_custom_datasets.md @@ -90,7 +90,7 @@ src/kedro_pokemon/extras ## Implement the `_load` method with `fsspec` -Many of the built-in Kedro datasets rely on [fsspec](https://filesystem-spec.readthedocs.io/en/latest/) as a consistent interface to different data sources, as described earlier in the section about the [Data Catalog](./04_data_catalog.html#specifying-the-location-of-the-dataset). In this example, it's particularly convenient to use `fsspec` in conjunction with `Pillow` to read image data, since it allows the dataset to work flexibly with different image locations and formats. +Many of the built-in Kedro datasets rely on [fsspec](https://filesystem-spec.readthedocs.io/en/latest/) as a consistent interface to different data sources, as described earlier in the section about the [Data Catalog](../05_data/01_data_catalog.md#specifying-the-location-of-the-dataset). In this example, it's particularly convenient to use `fsspec` in conjunction with `Pillow` to read image data, since it allows the dataset to work flexibly with different image locations and formats. Here is the implementation of the `_load` method using `fsspec` and `Pillow` to read the data of a single image into a `numpy` array: @@ -193,7 +193,7 @@ You can open the file to verify that the data was written back correctly. ## Implement the `_describe` method -The `_describe` method is used for printing purposes. The convention in Kedro is for the method to return a dictionary describing the attributes of the dataset . +The `_describe` method is used for printing purposes. The convention in Kedro is for the method to return a dictionary describing the attributes of the dataset. ```python from kedro.io import AbstractDataSet @@ -284,7 +284,7 @@ class ImageDataSet(AbstractDataSet): Currently, the `ImageDataSet` only works with a single image, but this example needs to load all Pokemon images from the raw data directory for further processing. -Kedro's [`PartitionedDataSet`](./07_kedro_io/01_advanced_io.html#partitioned-dataset) is a convenient way to load multiple separate data files of the same underlying dataset type into a directory. +Kedro's [`PartitionedDataSet`](../05_data/02_kedro_io.md#partitioned-dataset) is a convenient way to load multiple separate data files of the same underlying dataset type into a directory. To use `PartitionedDataSet` with `ImageDataSet` to load all Pokemon PNG images, add this to the data catalog YAML so that `PartitionedDataSet` loads all PNG files from the data directory using `ImageDataSet`: @@ -315,7 +315,8 @@ $ ls -la data/01_raw/pokemon-images-and-types/images/images/*.png | wc -l > *Note*: Versioning doesn't work with PartitionedDataSet. You can't use both of them at the same time. -To add [Versioning](./05_data/02_kedro_io.md#versioning) support to the new dataset we need to extend the [AbstractVersionedDataSet](/kedro.io.AbstractVersionedDataSet) to: +To add [Versioning](../05_data/02_kedro_io.md#versioning) support to the new dataset we need to extend the + [AbstractVersionedDataSet](/kedro.io.AbstractVersionedDataSet) to: * Accept a `version` keyword argument as part of the constructor * Adapt the `_save` and `_load` method to use the versioned data path obtained from `_get_save_path` and `_get_load_path` respectively @@ -397,7 +398,7 @@ class ImageDataSet(AbstractVersionedDataSet): The graphic shows the differences between the original `ImageDataSet` and the versioned `ImageDataSet`: -![Visual code diff graphic](../meta/images/diffs-graphic.png) +![](../meta/images/diffs-graphic.png) To test the code, you need to enable versioning support in the data catalog: @@ -439,7 +440,7 @@ In [2]: context.catalog.save('pikachu', data=img) Inspect the content of the data directory to find a new version of the data, written by `save`. -You may also want to consult the [in-depth documentation about the Versioning API](./05_data/kedro#versioning). +You may also want to consult the [in-depth documentation about the Versioning API](../05_data/02_kedro_io.md#versioning). ## Thread-safety @@ -505,7 +506,8 @@ We provide additional examples of [how to use parameters through the data catalo ## How to contribute a custom dataset implementation -One of the easiest ways to contribute back to Kedro is to share a custom dataset. Kedro has a `kedro.extras.datasets` sub-package where you can add a new custom dataset implementation to share it with others. You can find out more in the [Kedro contribution guide](https://github.com/quantumblacklabs/kedro/blob/develop/CONTRIBUTING.md) on Github. +One of the easiest ways to contribute back to Kedro is to share a custom dataset. Kedro has a `kedro.extras.datasets` sub-package where you can add a new custom dataset implementation to share it with others. You can find out more in + the [Kedro contribution guide](https://github.com/quantumblacklabs/kedro/blob/master/CONTRIBUTING.md) on Github. To contribute your custom dataset: diff --git a/docs/source/09_development/01_set_up_vscode.md b/docs/source/09_development/01_set_up_vscode.md index 17de4b700d..3ef3e31a93 100644 --- a/docs/source/09_development/01_set_up_vscode.md +++ b/docs/source/09_development/01_set_up_vscode.md @@ -5,13 +5,13 @@ Start by opening a new project directory in VS Code and installing the Python plugin under **Tools and languages**: -![Tools and languages graphic](../meta/images/vscode_startup.png) +![](../meta/images/vscode_startup.png) Python is an interpreted language; to run Python code you must tell VS Code which interpreter to use. From within VS Code, select a Python 3 interpreter by opening the **Command Palette** (`Cmd + Shift + P` for macOS), start typing the **Python: Select Interpreter** command to search, then select the command. At this stage, you should be able to see the `conda` environment that you have created. Select the environment: -![Conda environment graphic](../meta/images/vscode_setup_interpreter.png) +![](../meta/images/vscode_setup_interpreter.png) ### Advanced: For those using `venv` / `virtualenv` @@ -95,7 +95,7 @@ We're going to need you to modify your `tasks.json`. To do this, go to **Termina To start a build, go to **Terminal > Run Build Task...** or press `Cmd + Shift + B` for macOS. You can run other tasks by going to **Terminal > Run** and choosing which task you want to run. -![Terminal run graphic](../meta/images/vscode_run.png) +![](../meta/images/vscode_run.png) ## Debugging @@ -138,19 +138,19 @@ Edit the `launch.json` that opens in the editor with: To add a breakpoint in your `pipeline.py` script, for example, click on the left hand side of the line of code: -![Click on code line graphic](../meta/images/vscode_set_breakpoint.png) +![](../meta/images/vscode_set_breakpoint.png) Click on **Debug** button on the left pane: -![Debug graphic](../meta/images/vscode_debug_button.png) +![](../meta/images/vscode_debug_button.png) Then select the debug config **Python: Kedro Run** and click **Debug** (the green play button): -![Debug config graphic](../meta/images/vscode_run_debug.png) +![](../meta/images/vscode_run_debug.png) Execution should stop at the breakpoint: -![Execution stopped at breakpoint graphic](../meta/images/vscode_breakpoint.png) +![](../meta/images/vscode_breakpoint.png) ### Advanced: Remote Interpreter / Debugging @@ -233,7 +233,7 @@ ssh -vNL 3000:127.0.0.1:3000 @ Go to the **Debugging** section in VS Code and select the newly created remote debugger profile: -![Select Kedro remote debugger graphic](../meta/images/vscode_remote_debugger.png) +![](../meta/images/vscode_remote_debugger.png) You will need to set a breakpoint in VS Code as described [above](#debugging) and start the debugger by clicking the green play triangle: @@ -255,4 +255,4 @@ Enter the following in your `settings.json` file: and start editing your `catalog` files. -> Different schemas for different Kedro versions can be found [here](https://github.com/quantumblacklabs/kedro/tree/develop/static/jsonschema). +> Different schemas for different Kedro versions can be found [here](https://github.com/quantumblacklabs/kedro/tree/master/static/jsonschema). diff --git a/docs/source/09_development/02_set_up_pycharm.md b/docs/source/09_development/02_set_up_pycharm.md index 7aec31c6b2..9c7e962e4b 100644 --- a/docs/source/09_development/02_set_up_pycharm.md +++ b/docs/source/09_development/02_set_up_pycharm.md @@ -6,19 +6,19 @@ This section will present a quick guide on how to configure [PyCharm](https://ww Open a new project directory in PyCharm. You will need to add your **Project Interpreter**, so go to **PyCharm | Preferences** for macOS or **File | Settings** for Windows and Linux: -![Pycharm Preferences graphic](../meta/images/pycharm_preferences.png) +![](../meta/images/pycharm_preferences.png) Choose **Project Interpreter**:
-![Project Interpreter graphic](../meta/images/pycharm_project_interpreter.png) +![](../meta/images/pycharm_project_interpreter.png) Click the cog on the right-hand side and click **Add**: -![Add interpreter graphic](../meta/images/pycharm_add_interpreter.png) +![](../meta/images/pycharm_add_interpreter.png) Select **Conda Environment**: -![Add conda environment graphic](../meta/images/pycharm_add_conda_env.png) +![](../meta/images/pycharm_add_conda_env.png) Choose **Existing environment** and navigate your way to find your existing environment. If you don't see your `conda` environment in the dropdown menu then you need to open a `terminal` / `command prompt` with your `conda` environment activated and run: @@ -31,11 +31,11 @@ python -c "import sys; print(sys.executable)" Paste the interpreter path into the file picker and click **OK**:
-![Paste interpreter patch graphic](../meta/images/pycharm_select_conda_interpreter.png) +![](../meta/images/pycharm_select_conda_interpreter.png) Finally, in the **Project Explorer** right-click on `src` and then go to **Mark Directory as | Sources Root**: -![Mark directory as sources root graphic](../meta/images/pycharm_mark_dir_as_sources_root.png) +![](../meta/images/pycharm_mark_dir_as_sources_root.png) ## Set up Run configurations @@ -46,11 +46,11 @@ Here we will walk you through an example of how to setup Run configuration for K Go to **Run | Edit Configurations**: -![Run | Edit Configurations graphic](../meta/images/pycharm_edit_confs.png) +![](../meta/images/pycharm_edit_confs.png) Add a new **Python** Run configuration: -![Add a new Python run configuration graphic](../meta/images/pycharm_add_py_run_config.png) +![](../meta/images/pycharm_add_py_run_config.png) Create a **Run / Debug Configuration** for `kedro run` and get the path to the Kedro CLI script: @@ -64,17 +64,17 @@ python -c "import sys, os.path; print(os.path.join(os.path.dirname(sys.executabl Edit the new Run configuration as follows: -![Edit the new configuration graphic](../meta/images/pycharm_edit_py_run_config.png) +![](../meta/images/pycharm_edit_py_run_config.png) Replace **Script path** with path obtained above and **Working directory** with the path of your project directory and then click **OK**. To execute the Run configuration, select it from the **Run / Debug Configurations** dropdown in the toolbar (if that toolbar is not visible, you can enable it by going to **View > Toolbar**). Click the green triangle: -![Execute the Run configuration graphic](../meta/images/pycharm_conf_run_button.png) +![](../meta/images/pycharm_conf_run_button.png) You may also select **Run** from the toolbar and execute from there.
-![Select Run from the toolbar graphic](../meta/images/pycharm_conf_run_dropdown.png) +![](../meta/images/pycharm_conf_run_dropdown.png) For other `kedro` commands, follow same steps but replace `run` in the `Parameters` field with the other commands that are to be used (e.g., `test`, `package`, `build-docs` etc.). @@ -83,11 +83,11 @@ For other `kedro` commands, follow same steps but replace `run` in the `Paramete To debug, simply click the line number in the source where you want execution to break: -![Add a breakpoint to the source graphic](../meta/images/pycharm_add_breakpoint.png) +![](../meta/images/pycharm_add_breakpoint.png) Then click the bug button in the toolbar (![](../meta/images/pycharm_debugger_button.png)) and execution should stop at the breakpoint: -![Click the bug button to debug graphic](../meta/images/pycharm_debugger_break.png) +![](../meta/images/pycharm_debugger_break.png) >For more information about debugging with PyCharm take a look at the [debugging guide on jetbrains.com](https://www.jetbrains.com/help/pycharm/part-1-debugging-python-code.html). @@ -98,23 +98,23 @@ Then click the bug button in the toolbar (![](../meta/images/pycharm_debugger_bu Firstly, add an SSH interpreter. Go to **Preferences | Project Interpreter** as above and proceed to add a new interpreter. Select **SSH Interpreter** and fill in details of the remote computer: -![Select SSH Interpreter graphic](../meta/images/pycharm_ssh_interpreter_1.png) +![](../meta/images/pycharm_ssh_interpreter_1.png) Click **Next** and add the SSH password or SSH private key: -![Add SSH password/private key graphic](../meta/images/pycharm_ssh_interpreter_2.png) +![](../meta/images/pycharm_ssh_interpreter_2.png) Click **Next** and add the path of the remote interpreter. Assuming a Unix-like OS, this can be found by running `which python` within the appropriate `conda` environment on the remote computer. -![Add path to remote interpreter graphic](../meta/images/pycharm_ssh_interpreter_3.png) +![](../meta/images/pycharm_ssh_interpreter_3.png) Click **Finish**. Go to **Run / Debug Configurations** to add a **Remote Run**. Select the remote interpreter that you have just created. For the script path, get the path of the Kedro CLI on the remote computer by running `which kedro` (macOS / Linux) in the appropriate environment. -![Add a remote runner graphic](../meta/images/pycharm_ssh_runner.png) +![](../meta/images/pycharm_ssh_runner.png) Click **OK** and then select **Remote Run** from the toolbar and click **Run** to execute remotely. -![Select remote run graphic](../meta/images/pycharm_remote_run.png) +![](../meta/images/pycharm_remote_run.png) To remotely debug, click the debugger button as [described above](#debugging). @@ -122,12 +122,12 @@ To remotely debug, click the debugger button as [described above](#debugging). You can enable the Kedro catalog validation schema in your PyCharm IDE to enable real-time validation, autocompletion and see information about the different fields in your `catalog` as you write it. To enable this, open a `catalog.yml` file and you should see "No JSON Schema" in the bottom right corner of your window. Click it and select "Edit Schema Mapping". -![Edit schema mapping graphic](../meta/images/pycharm_edit_schema_mapping.png) +![](../meta/images/pycharm_edit_schema_mapping.png) Add a new mapping using the "+" button in the top left of the window and select the name you want for it. Enter this URL `https://raw.githubusercontent.com/quantumblacklabs/kedro/develop/static/jsonschema/kedro-catalog-0.16.json` in the "Schema URL" field and select "JSON Schema Version 7" in the "Schema version" field. Add the following file path pattern to the mapping: `conf/**/*catalog*`. -![Add file path pattern to mapping graphic](../meta/images/pycharm_catalog_schema_mapping.png) +![](../meta/images/pycharm_catalog_schema_mapping.png) -> Different schemas for different Kedro versions can be found [here](https://github.com/quantumblacklabs/kedro/tree/develop/static/jsonschema). +> Different schemas for different Kedro versions can be found [here](https://github.com/quantumblacklabs/kedro/tree/master/static/jsonschema). diff --git a/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/catalog.yml b/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/catalog.yml index d3d11dc3e8..64fedbeb06 100644 --- a/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/catalog.yml +++ b/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/catalog.yml @@ -2,4 +2,4 @@ # using Kedro {{ cookiecutter.kedro_version }}. # # Documentation for this file format can be found in "The Data Catalog" -# Link: https://kedro.readthedocs.io/en/stable/04_user_guide/04_data_catalog.html +# Link: https://kedro.readthedocs.io/en/stable/05_data/01_data_catalog.html diff --git a/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/parameters.yml b/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/parameters.yml index 587dbb7b88..25a4ce0bd8 100644 --- a/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/parameters.yml +++ b/kedro/templates/pipeline/{{ cookiecutter.pipeline_name }}/config/parameters.yml @@ -1,5 +1,5 @@ # This is a boilerplate parameters config generated for pipeline '{{ cookiecutter.pipeline_name }}' # using Kedro {{ cookiecutter.kedro_version }}. # -# Documentation for this file format can be found in -# https://kedro.readthedocs.io/en/stable/04_user_guide/03_configuration.html#parameters +# Documentation for this file format can be found in "Parameters" +# Link: https://kedro.readthedocs.io/en/stable/04_kedro_project_setup/02_configuration.html#parameters diff --git a/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/catalog.yml b/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/catalog.yml index 7de1ffd49f..122d5ea7fa 100644 --- a/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/catalog.yml +++ b/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/catalog.yml @@ -1,7 +1,7 @@ # Here you can define all your data sets by using simple YAML syntax. # # Documentation for this file format can be found in "The Data Catalog" -# Link: https://kedro.readthedocs.io/en/stable/04_user_guide/04_data_catalog.html +# Link: https://kedro.readthedocs.io/en/stable/05_data/01_data_catalog.html {% if cookiecutter.include_example == "True" %} # # We support interacting with a variety of data stores including local file systems, cloud, network and HDFS diff --git a/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/parameters.yml b/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/parameters.yml index b289bc08f5..b1c7efc5f1 100644 --- a/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/parameters.yml +++ b/kedro/templates/project/{{ cookiecutter.repo_name }}/conf/base/parameters.yml @@ -2,6 +2,8 @@ # Parameters for the example pipeline. Feel free to delete these once you # remove the example pipeline from hooks.py and the example nodes in # `src/pipelines/` +# Documentation for this file format can be found in "Parameters" +# Link: https://kedro.readthedocs.io/en/stable/04_kedro_project_setup/02_configuration.html#parameters example_test_data_ratio: 0.2 example_num_train_iter: 10000 example_learning_rate: 0.01