Rethink the TSC #148

Open
astrojuanlu opened this issue Apr 9, 2024 · 7 comments

@astrojuanlu
Member

Problems

Some members of the TSC have called out that

  • Not all current members are committed enough. From our own rules,

https://github.com/kedro-org/kedro/blob/f58740dc156d6a8fde2f61a9e0dc1e4030f9d150/docs/source/contribution/technical_steering_committee.md?plain=1#L18-L22

  • Some members are onboarded to the TSC when they just join the QB team. Again, from our rules:

https://github.com/kedro-org/kedro/blob/f58740dc156d6a8fde2f61a9e0dc1e4030f9d150/docs/source/contribution/technical_steering_committee.md?plain=1#L36-L39

  • Some people gain commit access before joining the TSC.

https://github.com/kedro-org/kedro/blob/f58740dc156d6a8fde2f61a9e0dc1e4030f9d150/docs/source/contribution/technical_steering_committee.md?plain=1#L115

Proposed solutions

Some people have suggested shrinking the QB presence in the TSC. But there are some logistical issues:

  • If a QB team member is not on the TSC, what process prescribes how and why they get commit access to the repositories?
  • If some QB team members have commit access and some don't, does it mean that some people need to fork the repositories to contribute?

Next steps

We have to survey other similar projects and understand what they do.

@astrojuanlu astrojuanlu self-assigned this Apr 9, 2024
@deepyaman
Member

Thanks for writing this up!

Proposed solutions

Some people have suggested shrinking the QB presence in the TSC. But there are some logistical issues:

  • If a QB team member is not on the TSC, what process prescribes how and why they get commit access to the repositories?
  • If some QB team members have commit access and some don't, does it mean that some people need to fork the repositories to contribute?

Next steps

We have to survey other similar projects and understand what they do.

Agree that it makes sense to learn from other OSS governance models. My initial thoughts are that it could be as simple as distinguishing committers from maintainers. For example, Apache Mesos does this. (I'm not very familiar with the project; this was more from me trying to find an example project that has separate committer and maintainer roles, since it's something I've seen not infrequently.) From the linked doc:

An Apache Mesos committer is a contributor who has been given write access to the Apache Mesos code repository and related Apache infrastructure. In the Mesos project, each committer is also a voting member of the PMC.

[...]

Maintainers are committers that have spent a significant amount of time and effort in the maintenance of a component in the project. Since maintainers have generally lived through the experience of maintaining a part of the project, they tend to have additional context, a sense of direction, a long-term outlook, and an increased incentive to ensure that development is done in a sustainable manner.

[...]

We’re here to build great software together! Maintainers are a means to ensure that we can continue to build great software while scaling the amount of contributors and committers in the project. The responsibilities listed above are expected from all committers in the project in the work that they do, no matter which component they touch. Maintainers additionally carry the above responsibilities for all changes going into a particular component, no matter who is doing the work.

The whole doc is quite well-written, IMO, so I've copied quite a bit. :)

I don't think Kedro necessarily needs component-level ownership that granular, since it's a bit less broad, but that could still be interesting to explore. Certain people do have areas of depth (e.g. @merelcht for config, @marrrcin for deployment, @Galileo-Galilei for MLflow).

One "risk" of this model may be that some members of the TSC who are not paid to work on Kedro specifically (naturally) spend less time on the project, which may or may not mean they're not maintainer candidates by this definition, or that the maintainer criteria should account more for depth/project understanding. To give an example, @idanov is unquestionably one of the most knowledgeable people about Kedro, and it makes sense to have his input on questions that impact Kedro direction; however, it may not make sense to have him review every PR in certain areas. I do think the following guidance from the Mesos doc goes a long way here:

All committers of the Mesos project are expected to use good judgement when committing patches as to whether the maintainers should be consulted.

Last but not least, I think the above proposal could potentially help alleviate the concerns raised in the "Proposed solutions" section.

@Galileo-Galilei
Member

Hi, I agree with most of @deepyaman's comments. I'd like to share some insight from my personal experience as a non-QB maintainer.

On your questions:

Not all current members are committed enough

As @deepyaman says, it is hard to commit firmly to being available 1 day a week for an unpaid project. I personally choose to commit to the project for the long term, but my direct contributions to the codebase are extremely small. I almost never participate in ceremonies because they obviously happen during working hours, and when I do participate, it is because I am biased towards things that I feel are useful for my actual job. I think this requirement is too strong a constraint and may prevent non-QB users from joining the TSC.

I do contribute in my spare time, and I can't estimate how much time I spend contributing each week. I could technically contribute through coding, but:

  • It requires a lot of focus / time. I can barely find time slots of, say, 2-3 hours straight where I can focus on developing a feature / adding tests / docs properly.
  • Since I am less synchronized with the rest of the team, this leads to a dilemma: either I work on very small features with little impact, or I contribute more complex features, which always raises a lot of discussion about implementation details. That either forces others to change their priorities to fit my schedule (which is not desirable) or leaves the PR open for a very long time, until someone can pick up the open questions.

For these reasons, I have found it much easier to contribute through:

  • plugin(s) with few maintainers, so I have fewer coordination constraints (e.g. kedro-mlflow, kedro-boot and some less fortunate attempts (👋 kedro-pandera))
  • answering Slack questions to help the community
  • issues and design docs, because I can do them asynchronously (I can write and think while commuting)

But I acknowledge that these may not be the kind of contributions that are needed. Contribution is harder to measure than just "a number of commits" to the repo.

Some members are onboarded to the TSC when they just join the QB team

Some people gain commit access before joining the TSC.

Not an issue for me. I think we should adjust the rules (and maybe differentiate committers / maintainers, as suggested). Almost all contributions to the codebase are made by QB users (for an obvious reason: they are paid for it!), and developer experience matters a lot. Forcing them to fork the repo just to give them a "lower" status won't help improve the development of the tool, nor make people think that QB is less involved. This looks hugely counterproductive to me.

There's a lot of churn

As the list grows, maybe instead of mentioning all past maintainers we could refer to the biggest contributors (through commit history) to celebrate past contributors? This is not an ideal solution, since some maintainers (e.g. people with fewer direct contributions to the codebase) would be ignored.

I don't think the current situation is a big problem though.

The TSC is too QB-dominated

Once again, I understand that we'd like to change that in the long run, as it feels "healthier" for the project and may make it more sustainable, but:

  • Non-QB, unpaid maintainers just can't commit to coding as much as QB's can. The truth is that if QB stopped assigning people to Kedro's development tomorrow, the project would go stale and die in the short to medium term.
  • Adding more non-QB maintainers won't change the previous statement. It may make more people feel that there is a more diverse representation, but ultimately money matters to help the project move forward.
  • As a consequence, I think that it is normal to have a QB vote that reflects its impact on the project. What if we had a major disagreement (say, about focusing development on a big feature that non-QB maintainers want but QB does not, and that would cost months of development, mainly done by QB members)? There is a risk that QB could just stop investing in the team, and because the rest of us could not take over, both features would end up not implemented. I understand that when you are investing millions (dozens of engineers for almost 5 years), you want to decide what investments are made and at least have a high-level decision on the roadmap.

The TSC is too big

This is not the real problem; rather, as you said, the problem is that voting takes a long time (people can just skip the notification and forget, non-QB maintainers are less available during work hours...). It may be worth investigating ways to speed up the voting process (reduce the majority needed, remove voting for some actions, give some "super maintainers" (Yetunde, Merel, Ivan?) extra votes, create a "developer" status for QB members who are less committed in the long run...).

Extra comments on committing as an individual

It is hard to contribute as an individual and not on behalf of an enterprise:

  • Accessing internal tools (Miro...) requires an enterprise account. Since I do not contribute on behalf of my enterprise, I cannot access all the internal QB tooling, which makes contributing much harder (e.g. I cannot access TD recordings).
  • I can't participate in ceremonies because they happen during work hours. I don't have a solution for this, but it is harder to have an impact on the project while being "far" from the core team, even if you make a fantastic effort to document everything publicly.
  • Contributing through coding is hard because it requires a lot of focus / time and interaction (with the core team, to define precisely what to do on complex issues that need to be better defined while developing), so I focus on more asynchronous tasks I can do on my phone.

@deepyaman
Member

For these reasons, I have found it much easier to contribute through:

  • plugin(s) with few maintainers, so I have fewer coordination constraints (e.g. kedro-mlflow, kedro-boot and some less fortunate attempts (👋 kedro-pandera))
  • answering Slack questions to help the community
  • issues and design docs, because I can do them asynchronously (I can write and think while commuting)

But I acknowledge that these may not be the kind of contributions that are needed. Contribution is harder to measure than just "a number of commits" to the repo.

I'd argue these are very much types of contributions that are needed! kedro-mlflow (and, it seems, now kedro-boot) have been extremely influential in terms of how a large portion of the userbase uses Kedro. Your multi-part manifesto on universal Kedro deployment (among other things) is the kind of visionary thinking that is extremely valuable in a group that is "committed to Kedro's long-term success."

  • As a consequence, I think that it is normal to have a QB vote that reflects its impact on the project. What if we had a major disagreement (say, about focusing development on a big feature that non-QB maintainers want but QB does not, and that would cost months of development, mainly done by QB members)? There is a risk that QB could just stop investing in the team, and because the rest of us could not take over, both features would end up not implemented. I understand that when you are investing millions (dozens of engineers for almost 5 years), you want to decide what investments are made and at least have a high-level decision on the roadmap.

Very valid point. :) I didn't give enough weight to this.

That said, a majority-QB TSC would still make it very unlikely that the project goes off in a direction that QB would not want to invest in. I think there's a lot of leeway between the current 80+% and that.

@deepyaman
Member

deepyaman commented Oct 16, 2024

One other issue I've been thinking about recently is, what changes should require a TSC vote?

Right now, I believe the places we vote are on:

  • adding or removing maintainers
  • adding or removing core datasets

Should major changes to how Kedro works (not refactors, but more like the proposal to create a new data catalog or to replace Kedro's versioning with a new approach, or historically the reworking of the config loader) require a TSC vote?

If you look at other projects (e.g. the nature of scikit-learn enhancement proposals like https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep010/proposal.html, or Spark SPIPs, just as a couple of examples), major changes should be proposed and require a vote.

P.S. I think this also lends itself back to having more of a committer/maintainer split; committers and others could raise proposals to improve Kedro, but a meaningful maintainer vote should ideally come from a group that has a good understanding across Kedro implementation.

@astrojuanlu
Member Author

Thanks @deepyaman and sorry I never replied here.

I agree in spirit with what you say: that major changes in Kedro should indeed require a TSC vote (after all, if we're voting about adding or removing datasets, why not control for changes that could be even more substantial?).

My gut reaction is that it would be a major governance upgrade that could potentially slow us down. But at the very least we should probably consider it.

@astrojuanlu
Member Author

astrojuanlu commented Oct 30, 2024

Also

I think this also lends itself back to having more of a committer/maintainer split; committers and others could raise proposals to improve Kedro, but a meaningful maintainer vote should ideally come from a group that has a good understanding across Kedro implementation.

Internally I had proposed an "observer" role with no voting rights, but actually the maintainer vs committer divide is probably more apt and also more similar to how I've seen other projects govern themselves. I believe Django works this way, and possibly projects that adopt the C4 model.

@astrojuanlu
Member Author

Did a quick exploration of other graduated LF AI & Data projects:

| Project name | Roles | Governance doc |
| --- | --- | --- |
| DocArray | TSC + Core developers (⊂ TSC) + Contributors | https://github.com/docarray/docarray/blob/v0.40.0/GOVERNANCE.md |
| Egeria | TSC + Maintainers + Contributors | https://lf-aidata.atlassian.net/wiki/spaces/EG/overview |
| Flyte | SC + Maintainers + Committers + Collaborators | https://github.com/flyteorg/community/blob/main/GOVERNANCE.md |
| Horovod | TSC (Voting + Non-voting aka Maintainers) | https://github.com/horovod/horovod/blob/v0.28.1/GOVERNANCE.md |
| Marquez | Leadership (?) + Committers + Contributors | https://github.com/MarquezProject/marquez/blob/0.50.0/GOVERNANCE.md |
| Milvus | TSC + Maintainer + Reviewer + Contributor | https://github.com/milvus-io/web-content/blob/16e951a/community/site/en/communityArticles/contributor_group/membership.md |
| ONNX | SC + Approver + Contributor | https://github.com/onnx/onnx/blob/v1.17.0/community/readme.md |
| OpenLineage | TSC + Committer + Contributor | https://github.com/OpenLineage/OpenLineage/blob/1.24.2/TECHNICAL_STEERING_COMMITTEE.md |

tl;dr: Looks like it's common to have a TSC with voting rights and non-voting maintainer or committer roles.
