Kepler Action should support runners other than Ubuntu system. #50

jiere · 2023-06-25T11:00:22Z

Currently our kepler-action scripts are hard-coded for Ubuntu runners. If users configure self-hosted runners that are not running Ubuntu, many actions will fail at the very beginning.
What's more, it seems that the current kepler-ci-artifacts release tarball, which is mainly for bcc, only publishes deb files.

To support more potential self-hosted runners that are not running Ubuntu, we have to fill the gaps above ASAP.
The first step is to support RHEL/CentOS platforms, IMO.

@SamYuan1990 , @rootfs .
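
As a rough illustration of the first gap, an action script could read the distro from `/etc/os-release` instead of assuming Ubuntu. The sketch below is not taken from kepler-action; the names `parseOsRelease` and `pickPackageManager` are hypothetical, and the function takes the file content as a string so the caller decides how to read it.

```typescript
// Illustrative only: these names are hypothetical and not part of kepler-action.
type PackageManager = "apt" | "dnf";

// Parse KEY=value lines from os-release text, dropping optional quotes.
function parseOsRelease(content: string): Map<string, string> {
  const fields = new Map<string, string>();
  for (const rawLine of content.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq < 0) continue;
    const value = line.slice(eq + 1).replace(/^["']|["']$/g, "");
    fields.set(line.slice(0, eq), value);
  }
  return fields;
}

// Choose a package manager from ID and ID_LIKE; undefined means "unsupported".
function pickPackageManager(osRelease: string): PackageManager | undefined {
  const fields = parseOsRelease(osRelease);
  const ids = [fields.get("ID") ?? "", ...(fields.get("ID_LIKE") ?? "").split(/\s+/)]
    .filter((id) => id !== "");
  if (ids.some((id) => id === "ubuntu" || id === "debian")) return "apt";
  if (ids.some((id) => id === "rhel" || id === "centos" || id === "fedora")) return "dnf";
  return undefined;
}
```

Older releases such as CentOS 7 ship `yum` rather than `dnf`, so a real implementation would probably need a finer-grained check than this one.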

SamYuan1990 · 2023-06-25T11:50:42Z

Is any other OS supported by GitHub Actions?

SamYuan1990 · 2023-06-25T11:52:18Z

For a self-hosted agent, is there any volunteer who can contribute an agent for us?

SamYuan1990 · 2023-06-25T12:02:09Z

Logically, I want to install bcc from the official package, but it is not available; you can find the reason in the bcc repo.
Hence for rpm, well, we can try the official release, if any....
But the question is why should we start this issue without an agent?
And... I don't suppose there is any difference between deb and rpm, as both eBPF and cgroup come from the Linux kernel.
Hence, what is the benefit for us of supporting rpm-based OSes when this action repo runs on a self-hosted runner/agent?

If we just want to support rpm, I am not sure, but @rootfs , would a Red Hat pipeline or a CNCF pipeline be a better option than GitHub Actions?

@jiere , please help clarify with more information.

jiere · 2023-06-25T12:02:13Z

For the platform-validation feature, we plan to provide a manually triggered workflow that runs on self-hosted runners, so that specific cases can be run on specific platforms (acting as self-hosted runners).
In this way, we cannot limit users to installing Ubuntu on their platforms, right?

jiere · 2023-06-25T12:09:57Z

In other words, the platform-validation feature should be a one-shot or on-demand test, so a manually triggered workflow is suitable for it. The runner does not need to be volunteered for community use; a runner self-hosted by the platform vendor is enough.
Our goal is that specific cases can be executed through Kepler's workflow-dispatch workflow.

SamYuan1990 · 2023-06-25T12:14:08Z

> For the platform-validation feature, we plan to provide a manually triggered workflow that runs on self-hosted runners, so that specific cases can be run on specific platforms (acting as self-hosted runners). In this way, we cannot limit users to installing Ubuntu on their platforms, right?

Maybe we should not limit ourselves to GitHub Actions.
Even if we resolve the OS issue, there is still the CPU architecture issue....
If GHA has limitations, maybe we should open our minds to other solutions.
Take another CNCF project, CoCo, as an example: https://github.com/confidential-containers. They have different cloud providers.

Which means, for a hardware provider, if we just run a check once per quarter, in line with our release cycle, are we sure we need a GitHub Actions integration?

It's a valuable topic and discussion, but we should keep an open mind.
I would like to see what happens and keep moving with the first real case. (Maybe someone will contribute a RHEL machine with GHA?)

Otherwise, I suppose it's too early to discuss the details today.

jiere · 2023-06-25T12:24:24Z

Let me clarify one thing here: why do we choose to provide self-hosted runners? Not for the OS distro, but for the bare-metal host.
Platform validation cases should run directly on a bare-metal host, not in a VM. So we cannot rely on VM-based runners to run such cases.
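
To make the bare-metal requirement concrete, a workflow step could refuse to run when the host looks virtualized. The following is only a heuristic sketch, assuming an x86 Linux host, where the kernel lists a `hypervisor` CPU flag in `/proc/cpuinfo` when running under a hypervisor. The function name `looksVirtualized` is hypothetical, and the check does not apply to ARM, whose cpuinfo has no `flags` line.

```typescript
// Hypothetical helper: returns true when any "flags" line of the given
// /proc/cpuinfo text contains the "hypervisor" CPU flag (x86 Linux only).
function looksVirtualized(cpuinfo: string): boolean {
  return cpuinfo
    .split("\n")
    .filter((line) => /^flags\s*:/.test(line))
    .some((line) => {
      const flags = line.slice(line.indexOf(":") + 1).trim().split(/\s+/);
      return flags.indexOf("hypervisor") >= 0;
    });
}
```

A platform-validation job could read `/proc/cpuinfo`, pass its content to this function, and skip or fail the run when it returns true.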

jiere · 2023-06-25T12:27:39Z

> Take another CNCF project, CoCo, as an example: https://github.com/confidential-containers. They have different cloud providers.

This is actually another topic, not in the scope of the current issue :-D
It concerns whether we should find some "permanent volunteer machines" to act as runners.
In this issue, we actually want to address the following scenario:

No volunteer machines yet;
Platform vendors want to test their platforms in Kepler formally, officially and automatically;
Platform vendors do have bare-metal machines for testing, but the machines are provisioned with RHEL, for example.

SamYuan1990 · 2023-06-25T12:36:21Z

> > Take another CNCF project, CoCo, as an example: https://github.com/confidential-containers. They have different cloud providers.
>
> This is actually another topic, not in the scope of the current issue :-D It concerns whether we should find some "permanent volunteer machines" to act as runners. In this issue, we actually want to address the following scenario:
>
> No volunteer machines yet;
> Platform vendors want to test their platforms in Kepler formally and officially;
> Platform vendors do have bare-metal machines for testing, but the machines are provisioned with RHEL, for example.

From my point of view, the volunteer machine is necessary, as different bare-metal (BM) machines may have different power APIs...
Hence we can't assume anything before we have a machine.

In most cases, integrating a new machine may need to follow the guide that is WIP in sustainable-computing-io/kepler-doc#60.

I do agree with these points:
Platform vendors want to test their platforms in Kepler formally, officially and automatically;
Platform vendors do have bare-metal machines for testing, but the machines are provisioned with RHEL, for example.

Currently we support x86, and from the hardware provider's point of view, I suppose the platform vendor would provide other CPU platforms.
As for cloud providers, maybe they just provide a k8s cluster for each run.

SamYuan1990 · 2023-06-25T12:52:42Z

BTW, another open question: should we create the k8s cluster for testing, or do we leave it to the platform vendor to provide a k8s cluster for us?

kenplusplus · 2023-06-25T13:01:04Z

I think:

A GitHub Action should not be bound to a specific OS, especially by using "apt" in a Node.js script.... I am curious why Node.js is used to do deployment work. Isn't Ansible better at handling OS/platform/middleware?
CoCo is also backed by a big group, but it is out of scope for this topic, since VMs are not well supported, correct?

SamYuan1990 · 2023-06-25T13:08:17Z

> I think:
>
> A GitHub Action should not be bound to a specific OS, especially by using "apt" in a Node.js script.... I am curious why Node.js is used to do deployment work. Isn't Ansible better at handling OS/platform/middleware?
>
> CoCo is also backed by a big group, but it is out of scope for this topic, since VMs are not well supported, correct?

I suppose we can start from sustainable-computing-io/kepler#482
@jiere starts from platform validation, but ... I hope to link it with a specific ticket or contributor.
@kenplusplus , whether it is Ansible or Node.js, somehow, somewhere an apt/yum install has to be executed to provide Kepler's dependencies before we deploy Kepler for testing. We implemented it in Node.js simply because our GitHub Actions runs currently use only Ubuntu.
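
To illustrate the point that only the install command differs between distros, here is a minimal sketch of how a Node.js action might build that command per package manager. It is not the existing kepler-action code; `installCommand` and the package name in the usage note are hypothetical placeholders.

```typescript
// Hypothetical sketch, not the current kepler-action implementation.
type PackageManager = "apt" | "dnf";

// Build the argv for a non-interactive install of the given packages.
function installCommand(pm: PackageManager, packages: string[]): string[] {
  if (packages.length === 0) {
    throw new Error("no packages requested");
  }
  switch (pm) {
    case "apt":
      return ["apt-get", "install", "-y", ...packages];
    case "dnf":
      return ["dnf", "install", "-y", ...packages];
  }
}
```

For example, `installCommand("dnf", ["example-package"])` yields `["dnf", "install", "-y", "example-package"]`. The rest of the deployment logic would stay the same whichever branch is taken.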

SamYuan1990 · 2023-06-25T13:33:23Z

@jiere , @kenplusplus , @rootfs
Here is my point.
If our starting point is hardware support for other BM platforms, either ARM (sustainable-computing-io/kepler#482) or Redfish, I would say it's too early to discuss before we have a real BM machine from a contributor.

If our point is to discuss today's code logic: our first infrastructure is GitHub Actions on Ubuntu, and logically we need to install the dependencies before deploying Kepler for testing, so we implemented it in Node.js with apt install.

If our point is a free discussion, as brainstorming, well, personally I am open to any kind of integration: apt or yum, Ansible or GHA, Travis or Tekton, etc., at any level, whether an OS or a k8s cluster. The logical steps for testing are:

we have to install Kepler's dependencies (such as headers?) before we deploy Kepler for testing.
we have to install a k8s cluster before we deploy Kepler for testing.
we have to make sure Kepler mounts enough paths on the host machine for either eBPF or cgroup.
we have to make sure Kepler gets data from external hardware resources if necessary.

If our discussion scope is just switching from apt to yum, or to Ansible ... that is too limited.
From my point of view, a provider can contribute a BM machine to us, and we just deploy Kepler and run test scripts. That way we reduce some security concerns for the provider, as we avoid running apt/yum scripts that modify the OS packages.
Also, the provider is able to ask us to move from GHA to another CI platform. I am open-minded on this.

rootfs · 2023-06-29T12:31:51Z

I like the idea of supporting CI platforms other than GitHub Actions. That will cover many cases where we want to ensure that PRs or releases are fully tested. The limitation of runners is an issue, unfortunately. Self-hosting is an option; shall we start with this step first?

SamYuan1990 · 2023-06-29T13:32:45Z

> I like the idea of supporting CI platforms other than GitHub Actions. That will cover many cases where we want to ensure that PRs or releases are fully tested. The limitation of runners is an issue, unfortunately. Self-hosting is an option; shall we start with this step first?

Do we have any self-hosted GitHub Actions agent available now to help us move forward?

rootfs · 2023-06-29T13:42:27Z

It looks like CNCF can provide Prow for hosted projects:
https://github.com/cncf/servicedesk#continuous-integration

SamYuan1990 · 2023-06-29T14:02:28Z

> It looks like CNCF can provide Prow for hosted projects: https://github.com/cncf/servicedesk#continuous-integration

Do they provide any guidance or examples?
https://docs.prow.k8s.io/docs/components/core/hook/
It seems they still need to work on their documentation...

But as the hook setting page is empty... I suppose we need clear guidance before we give it a try.
And I am confused about where we should add the jobs:
https://github.com/kubernetes/test-infra/blob/master/config/jobs/cadvisor/cadvisor.yaml
or
https://github.com/google/cadvisor/blob/master/.github/workflows/test.yml

SamYuan1990 · 2023-06-29T14:14:15Z

I am extending my search scope from the Prow repos ... to search engines .... to gather more information.

SamYuan1990 · 2024-03-17T13:29:14Z

As it seems there is no Fedora support in GitHub Actions, @rootfs , do we have any idea how to test kepler-action on RHEL?

SamYuan1990 · 2024-04-01T14:12:32Z

Remove apt-get from this repo and have the libbpf dependency handled by local-dev-cluster.

This was referenced Jul 3, 2023:
Add support for Microshift in CI sustainable-computing-io/kepler#759 (Merged)
Bump up Kepler-action version #55 (Closed)

SamYuan1990 mentioned this issue Jul 10, 2023:
Fix issue #26: Cleanup and simplify code logic sustainable-computing-io/local-dev-cluster#28 (Merged)
