Skip to content

Commit

Permalink
Revise FAQs and README (#2909)
Browse files Browse the repository at this point in the history
* Revise FAQs and README

Signed-off-by: Jo Stichbury <[email protected]>

* Add back the data layers FAQ as I've no idea where else it fits

Signed-off-by: Jo Stichbury <[email protected]>

* minor changes from review

Signed-off-by: Jo Stichbury <[email protected]>

---------

Signed-off-by: Jo Stichbury <[email protected]>
  • Loading branch information
stichbury authored and lrcouto committed Aug 8, 2023
1 parent 9962ede commit 6922492
Show file tree
Hide file tree
Showing 9 changed files with 45 additions and 65 deletions.
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,13 @@ We welcome any and all contributions to Kedro, at whatever level you can manage.

You can find the Kedro community on our [Slack organisation](https://slack.kedro.org/), which is where we share news and announcements, and answer technical questions. You're welcome to post links to any articles or videos about Kedro that you create or find, such as how-tos, showcases, demos, blog posts or tutorials.

We also curate a [GitHub repo that lists content created by the Kedro community](https://github.com/kedro-org/awesome-kedro).
We also curate a [GitHub repo that lists content created by the Kedro community](https://github.com/kedro-org/awesome-kedro). If you've made something with Kedro, simply add it to the list with a PR!

## Contribute to the project

There are quite a few ways to contribute to the project. To learn how to do so, visit our [Contribute to Kedro](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro) wiki page.
There are quite a few ways to contribute to Kedro, sich as answering questions about Kedro to help others, fixing a typo on the documentation, reporting a bug, reviewing pull requests or adding a feature.

## Join our Technical Steering Committee
Take a look at some of our [contribution suggestions on the Kedro GitHub Wiki](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro)!

Kedro is an incubating project in [LF AI & Data](https://lfaidata.foundation/), a sub-organisation within the Linux Foundation that focuses on open innovation within the data and AI space.

Expand Down
64 changes: 10 additions & 54 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
[![Conda version](https://img.shields.io/conda/vn/conda-forge/kedro.svg)](https://anaconda.org/conda-forge/kedro)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/kedro-org/kedro/blob/main/LICENSE.md)
[![Slack Organisation](https://img.shields.io/badge/slack-chat-blueviolet.svg?label=Kedro%20Slack&logo=slack)](https://slack.kedro.org)
[![Slack Archive](https://img.shields.io/badge/slack-archive-blueviolet.svg?label=Kedro%20Slack%20)](https://linen-slack.kedro.org/)
![CircleCI - Main Branch](https://img.shields.io/circleci/build/github/kedro-org/kedro/main?label=main)
![Develop Branch Build](https://img.shields.io/circleci/build/github/kedro-org/kedro/develop?label=develop)
[![Documentation](https://readthedocs.org/projects/kedro/badge/?version=stable)](https://docs.kedro.org/)
Expand All @@ -14,7 +15,7 @@

## What is Kedro?

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular. You can find out more at [kedro.org](https://kedro.org).

Kedro is an open-source Python framework hosted by the [LF AI & Data Foundation](https://lfaidata.foundation/).

Expand Down Expand Up @@ -51,12 +52,10 @@ _A pipeline visualisation generated using [Kedro-Viz](https://github.com/kedro-o

The [Kedro documentation](https://docs.kedro.org/en/stable/) first explains [how to install Kedro](https://docs.kedro.org/en/stable/get_started/install.html) and then introduces [key Kedro concepts](https://docs.kedro.org/en/stable/get_started/kedro_concepts.html).

- The first example illustrates the [basics of a Kedro project](https://docs.kedro.org/en/stable/get_started/new_project.html) using the Iris dataset
- You can then review the [spaceflights tutorial](https://docs.kedro.org/en/stable/tutorial/tutorial_template.html) to build a Kedro project for hands-on experience
You can then review the [spaceflights tutorial](https://docs.kedro.org/en/stable/tutorial/spaceflights_tutorial.html) to build a Kedro project for hands-on experience

For new and intermediate Kedro users, there's a comprehensive section on [how to visualise Kedro projects using Kedro-Viz](https://docs.kedro.org/en/stable/visualisation/kedro-viz_visualisation.html) and [how to work with Kedro and Jupyter notebooks](https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks).
For new and intermediate Kedro users, there's a comprehensive section on [how to visualise Kedro projects using Kedro-Viz](https://docs.kedro.org/en/stable/visualisation/index.html) and [how to work with Kedro and Jupyter notebooks](https://docs.kedro.org/en/stable/notebooks_and_ipython/index.html). We also recommend the [API reference documentation](/kedro) for additional information.

Further documentation is available for more advanced Kedro usage and deployment. We also recommend the [glossary](https://docs.kedro.org/en/stable/resources/glossary.html) and the [API reference documentation](/kedro) for additional information.

## Why does Kedro exist?

Expand All @@ -68,64 +67,21 @@ Kedro is built upon our collective best-practice (and mistakes) trying to delive
- To increase efficiency, because applied concepts like modularity and separation of concerns inspire the creation of
**reusable analytics code**

Find out more about how Kedro can answer your use cases from the [product FAQs on the Kedro website](https://kedro.org/#faq).

## The humans behind Kedro

The [Kedro product team](https://docs.kedro.org/en/stable/contribution/technical_steering_committee.html#kedro-maintainers) and a number of [open source contributors from across the world](https://github.com/kedro-org/kedro/releases) maintain Kedro.

## Can I contribute?

Yes! Want to help build Kedro? Check out our [guide to contributing to Kedro](https://github.com/kedro-org/kedro/blob/main/CONTRIBUTING.md).
Yes! We welcome all kinds of contributions. Check out our [guide to contributing to Kedro](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro).

## Where can I learn more?

There is a growing community around Kedro. Have a look at the [Kedro FAQs](https://docs.kedro.org/en/stable/faq/faq.html#how-can-i-find-out-more-about-kedro) to find projects using Kedro and links to articles, podcasts and talks.

## Who likes Kedro?

There are Kedro users across the world, who work at start-ups, major enterprises and academic institutions like [Absa](https://www.absa.co.za/),
[Acensi](https://acensi.eu/page/home),
[Advanced Programming Solutions SL](https://www.linkedin.com/feed/update/urn:li:activity:6863494681372721152/),
[AI Singapore](https://makerspace.aisingapore.org/2020/08/leveraging-kedro-in-100e/),
[AMAI GmbH](https://www.am.ai/),
[Augment Partners](https://www.linkedin.com/posts/augment-partners_kedro-cheat-sheet-by-augment-activity-6858927624631283712-Ivqk),
[AXA UK](https://www.axa.co.uk/),
[Belfius](https://www.linkedin.com/posts/vangansen_mlops-machinelearning-kedro-activity-6772379995953238016-JUmo),
[Beamery](https://medium.com/hacking-talent/production-code-for-data-science-and-our-experience-with-kedro-60bb69934d1f),
[Caterpillar](https://www.caterpillar.com/),
[CRIM](https://www.crim.ca/en/),
[Dendra Systems](https://www.dendra.io/),
[Element AI](https://www.elementai.com/),
[GetInData](https://getindata.com/blog/running-machine-learning-pipelines-kedro-kubeflow-airflow),
[GMO](https://recruit.gmo.jp/engineer/jisedai/engineer/jisedai/engineer/jisedai/engineer/jisedai/engineer/jisedai/blog/kedro_and_mlflow_tracking/),
[Indicium](https://medium.com/indiciumtech/how-to-build-models-as-products-using-mlops-part-2-machine-learning-pipelines-with-kedro-10337c48de92),
[Imperial College London](https://github.com/dssg/barefoot-winnie-public),
[ING](https://www.ing.com),
[Jungle Scout](https://junglescouteng.medium.com/jungle-scout-case-study-kedro-airflow-and-mlflow-use-on-production-code-150d7231d42e),
[Helvetas](https://www.linkedin.com/posts/lionel-trebuchon_mlflow-kedro-ml-ugcPost-6747074322164154368-umKw),
[Leapfrog](https://www.lftechnology.com/blog/ai-pipeline-kedro/),
[McKinsey & Company](https://www.mckinsey.com/alumni/news-and-insights/global-news/firm-news/kedro-from-proprietary-to-open-source),
[Mercado Libre Argentina](https://www.mercadolibre.com.ar),
[Modec](https://www.modec.com/),
[Mosaic Data Science](https://www.youtube.com/watch?v=fCWGevB366g),
[NaranjaX](https://www.youtube.com/watch?v=_0kMmRfltEQ),
[NASA](https://github.com/nasa/ML-airport-taxi-out),
[NHS AI Lab](https://nhsx.github.io/skunkworks/synthetic-data-pipeline),
[Open Data Science LatAm](https://www.odesla.org/),
[Prediqt](https://prediqt.co/),
[QuantumBlack](https://medium.com/quantumblack/introducing-kedro-the-open-source-library-for-production-ready-machine-learning-code-d1c6d26ce2cf),
[ReSpo.Vision](https://neptune.ai/customers/respo-vision),
[Retrieva](https://tech.retrieva.jp/entry/2020/07/28/181414),
[Roche](https://www.roche.com/),
[Sber](https://www.linkedin.com/posts/seleznev-artem_welcome-to-kedros-documentation-kedro-activity-6767523561109385216-woTt),
[Société Générale](https://www.societegenerale.com/en),
[Telkomsel](https://medium.com/life-at-telkomsel/how-we-build-a-production-grade-data-pipeline-7004e56c8c98),
[Universidad Rey Juan Carlos](https://github.com/vchaparro/MasterThesis-wind-power-forecasting/blob/master/thesis.pdf),
[UrbanLogiq](https://urbanlogiq.com/),
[Wildlife Studios](https://wildlifestudios.com),
[WovenLight](https://www.wovenlight.com/) and
[XP](https://youtu.be/wgnGOVNkXqU?t=2210).

Kedro won [Best Technical Tool or Framework for AI](https://awards.ai/the-awards/previous-awards/the-4th-ai-award-winners/) in the 2019 Awards AI competition and a merit award for the 2020 [UK Technical Communication Awards](https://uktcawards.com/announcing-the-award-winners-for-2020/). It is listed on the 2020 [ThoughtWorks Technology Radar](https://www.thoughtworks.com/radar/languages-and-frameworks/kedro) and the 2020 [Data & AI Landscape](https://mattturck.com/data2020/). Kedro has received an [honorable mention in the User Experience category in Fast Company’s 2022 Innovation by Design Awards](https://www.fastcompany.com/90772252/user-experience-innovation-by-design-2022).
There is a growing community around Kedro. We encourage you to ask and answer technical questions on [Slack](https://slack.kedro.org/) and bookmark the [Linen archive of past discussions](https://linen-slack.kedro.org/).

We keep a list of [technical FAQs in the Kedro documentation](https://docs.kedro.org/en/stable/faq/faq.html) and you can find a growing list of blog posts, videos and projects that use Kedro over on the [`awesome-kedro` GitHub repository](https://github.com/kedro-org/awesome-kedro). If you have created anything with Kedro we'd love to include it on the list. Just make a PR to add it!

## How can I cite Kedro?

Expand Down
26 changes: 25 additions & 1 deletion docs/source/faq/faq.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
# Frequently asked questions
# FAQs

This is a growing set of technical FAQs. The [product FAQs on the Kedro website](https://kedro.org/#faq) explain how Kedro can answer the typical use cases and requirements of data scientists, data engineers, machine learning engineers and product owners.

## Visualisation

Expand Down Expand Up @@ -46,3 +48,25 @@
* [How do I create a modular pipeline](../nodes_and_pipelines/modular_pipelines.md#how-do-i-create-a-modular-pipeline)?

* [Can I use generator functions in a node](../nodes_and_pipelines/nodes.md#how-to-use-generator-functions-in-a-node)?

## What is data engineering convention?

[Bruce Philp](https://github.com/bruceaphilp) and [Guilherme Braccialli](https://github.com/gbraccialli-qb) are the
brains behind a layered data-engineering convention as a model of managing data. You can find an [in-depth walk through of their convention](https://towardsdatascience.com/the-importance-of-layered-thinking-in-data-engineering-a09f685edc71) as a blog post on Medium.

Refer to the following table below for a high level guide to each layer's purpose

> **Note**:The data layers don’t have to exist locally in the `data` folder within your project, but we recommend that you structure your S3 buckets or other data stores in a similar way.
![data_engineering_convention](../meta/images/data_layers.png)

| Folder in data | Description |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Raw | Initial start of the pipeline, containing the sourced data model(s) that should never be changed, it forms your single source of truth to work from. These data models are typically un-typed in most cases e.g. csv, but this will vary from case to case |
| Intermediate | Optional data model(s), which are introduced to type your :code:`raw` data model(s), e.g. converting string based values into their current typed representation |
| Primary | Domain specific data model(s) containing cleansed, transformed and wrangled data from either `raw` or `intermediate`, which forms your layer that you input into your feature engineering |
| Feature | Analytics specific data model(s) containing a set of features defined against the `primary` data, which are grouped by feature area of analysis and stored against a common dimension |
| Model input | Analytics specific data model(s) containing all :code:`feature` data against a common dimension and in the case of live projects against an analytics run date to ensure that you track the historical changes of the features over time |
| Models | Stored, serialised pre-trained machine learning models |
| Model output | Analytics specific data model(s) containing the results generated by the model based on the `model input` data |
| Reporting | Reporting data model(s) that are used to combine a set of `primary`, `feature`, `model input` and `model output` data used to drive the dashboard and the views constructed. It encapsulates and removes the need to define any blending or joining of data, improve performance and replacement of presentation layer without having to redefine the data models |
4 changes: 2 additions & 2 deletions docs/source/get_started/install.md
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ You should see an ASCII art graphic and the Kedro version number. For example:

![](../meta/images/kedro_graphic.png)

If you do not see the graphic displayed, or have any issues with your installation, check out the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro), or post a new query on the [Slack organisation](https://slack.kedro.org).
If you do not see the graphic displayed, or have any issues with your installation, check out the [searchable archive of Slack discussions](https://linen-slack.kedro.org/), or post a new query on the [Slack organisation](https://slack.kedro.org).


## How to upgrade Kedro
Expand Down Expand Up @@ -187,4 +187,4 @@ pip install kedro
* Installation prerequisites include a virtual environment manager like `conda`, Python 3.7+, and `git`.
* You should install Kedro using `pip install kedro`.

If you encounter any problems as you set up Kedro, ask for help on Kedro's [Slack organisation](https://slack.kedro.org) or review the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro).
If you encounter any problems as you set up Kedro, ask for help on Kedro's [Slack organisation](https://slack.kedro.org) or review the [searchable archive of Slack discussions](https://linen-slack.kedro.org/).
4 changes: 2 additions & 2 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,8 @@ Welcome to Kedro's documentation!
:target: https://slack.kedro.org
:alt: Kedro's Slack organisation

.. image:: https://img.shields.io/badge/slack-archive-blue.svg?label=Kedro%20Slack%20
:target: https://www.linen.dev/s/kedro
.. image:: https://img.shields.io/badge/slack-archive-blueviolet.svg?label=Kedro%20Slack%20
:target: https://linen-slack.kedro.org/
:alt: Kedro's Slack archive

.. image:: https://img.shields.io/badge/code%20style-black-black.svg
Expand Down
Binary file added docs/source/meta/images/data_layers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion docs/source/resources/index.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Resources
# FAQs and resources

```{toctree}
:maxdepth: 1
Expand Down
2 changes: 1 addition & 1 deletion docs/source/tutorial/spaceflights_tutorial.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ If you hit an issue with the tutorial:
* Check the [spaceflights tutorial FAQ](spaceflights_tutorial_faqs.md) to see if we have answered the question already.
* Use [Kedro-Viz](../visualisation/kedro-viz_visualisation) to visualise your project to better understand how the datasets, nodes and pipelines fit together.
* Use the [#questions channel](https://slack.kedro.org/) on our Slack channel to ask the community for help.
* Search the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro).
* Search the [searchable archive of Slack discussions](https://linen-slack.kedro.org/).

## Terminology

Expand Down
2 changes: 1 addition & 1 deletion docs/source/visualisation/kedro-viz_visualisation.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ You should see the following:
If a visualisation panel opens up and a pipeline is not visible, refresh the view, and check that your tutorial project code is complete if you've not generated it from the starter template. If you still don't see the visualisation, the Kedro community can help:

* use the [#questions channel](https://slack.kedro.org/) on our Slack channel to ask the community for help
* search the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro)
* search the [searchable archive of Slack discussions](https://linen-slack.kedro.org/)

To exit the visualisation, close the browser tab. To regain control of the terminal, enter `^+c` on Mac or `Ctrl+c` on Windows or Linux machines.

Expand Down

0 comments on commit 6922492

Please sign in to comment.