Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not start getty on tty0 to avoid screen corruption #4517

Merged
merged 4 commits into from
Jan 15, 2025

Conversation

rucoder
Copy link
Contributor

@rucoder rucoder commented Jan 10, 2025

Replace tty0 with tty1 when starting getty

getty can be started in two different ways:

  1. when 'getty' parameter is specified on kernel coommane line and there is at least one console= parameter
  2. if getty is NOT specified on kernel command line but debug.enable.console=true is set AND at least one console= is specified

most people specify console=tty0 on kernel command line if VGA console is desired. this is technically works becasue during kernel start tty0 points to the first available console which is tty1. However when getty is started after sytems is booted tty0 may not be equal to tty1 anymore e.g. if openvt was called like we do to run monitor app. in this case getty will take over CURRENT virtual console which is tty2 in case of monitor application.

  1. To reproduce the issue (tty0 points to current virtual console e.g. tty2) following conditions must be met
  • there is NO 'getty' specified on kernel command line
  • there is console=tty0 specified
  • debug.enable.console=true is set

if all 3 are true then getty is started by pillar AFTER monitor app is started on tty2 and since tty0 points to tty2 in this case getty revokes console from monitor app and the output gets corrupted

  1. we cannot use /dev/console from rungetty.sh anymore because it corrupts tty2 when running from pillar. Instead we use /dev/tty so the logs will be redirected to proper stdout and it should work in both cases: when getty is started by 'getty' parameter and when it is started from context of pillar. In later case logs will appear from pillar context which is correct

@rene
Copy link
Contributor

rene commented Jan 10, 2025

@rucoder , I think it would be better to fix the original rungetty.sh script (from linuxkit: https://github.com/linuxkit/linuxkit/blob/master/pkg/getty/usr/bin/rungetty.sh) instead of introduce a new one....

@@ -3,3 +3,4 @@
.*\.dts
.*\.md
^.codespellignorelines
pkg/dom0-ztools/rootfs/usr/bin/rungetty.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you want to ignore blank tabs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, because it's not our code originally )

@rene
Copy link
Contributor

rene commented Jan 10, 2025

@rucoder , I think it would be better to fix the original rungetty.sh script (from linuxkit: https://github.com/linuxkit/linuxkit/blob/master/pkg/getty/usr/bin/rungetty.sh) instead of introduce a new one....

☝️ @deitch

Copy link
Member

@OhmSpectator OhmSpectator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we add this new script, but where do we run it? I don't see any invocation...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to point to the source of the file in the commit message if it's not ours.
https://github.com/linuxkit/linuxkit/blob/master/pkg/getty/usr/bin/rungetty.sh, I guess?

@@ -3,3 +3,4 @@
.*\.dts
.*\.md
^.codespellignorelines
pkg/dom0-ztools/rootfs/usr/bin/rungetty.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, because it's not our code originally )

@rucoder
Copy link
Contributor Author

rucoder commented Jan 10, 2025

@rucoder , I think it would be better to fix the original rungetty.sh script (from linuxkit: https://github.com/linuxkit/linuxkit/blob/master/pkg/getty/usr/bin/rungetty.sh) instead of introduce a new one....

☝️ @deitch

@deitch @rene technically this fix is applicable to eve only. no one in their sane mind would run getty manually from another process. I would not accept such fix if I were a maintainer for getty in linux kit

@rucoder
Copy link
Contributor Author

rucoder commented Jan 10, 2025

@rene @OhmSpectator the script is ignored in SPDX and tabs check as well as yetus checks because it is not our code and I just copied original file to minimize changes and make them visible.

@rucoder
Copy link
Contributor Author

rucoder commented Jan 10, 2025

I see we add this new script, but where do we run it? I don't see any invocation...

makes sense, fixed

- Add unmodified file to patch it following commit
The script is taken from getty container from linux kit
https://github.com/linuxkit/linuxkit/blob/master/pkg/getty/usr/bin/rungetty.sh

Signed-off-by: Mikhail Malyshev <[email protected]>
1) tty0 points to current virtual console. it is not a problem when getty
is started very early before pillar/monitor are started however under
following conditions it causes problems:
1. there is NO 'getty' specified on kernel command line
2. there is console=tty0 specified
3. debug.enable.console=true is set

if all 3 are true then getty is started by pillar AFTER monitor app is
started on tty2 and since tty0 points to tty2 in this case getty revokes
console from monitor app and the output gets corrupted

2) we cannot use /dev/console from rungetty.sh anymore because it
corrupts tty2 when running from pillar. Instead we use /dev/tty so
the logs will be redirected to proper stdout and it should work in both
cases: when getty is started by 'getty' parameter and when it is started
from context of pillar. In later case logs will appear from pillar
context which is correct

Signed-off-by: Mikhail Malyshev <[email protected]>
We do not maintain this file so do not check for errors (too many of them
really)

Signed-off-by: Mikhail Malyshev <[email protected]>
@rucoder rucoder force-pushed the rucoder/getty-tty0 branch from 80f6356 to a993891 Compare January 10, 2025 21:44
@deitch
Copy link
Contributor

deitch commented Jan 12, 2025

technically this fix is applicable to eve only. no one in their sane mind would run getty manually from another process. I would not accept such fix if I were a maintainer for getty in linux kit

Exactly. Are we "in our sane mind" if we take a kernel cmdline that included console=tty0 and change it under the covers to tty1? I would think that would upset the user once they figured out what happened ("user" being us when we eventually have a problem and figure out what we did after spending two days trying to understand where getty is when we know we wrote "console=tty0").

becasue during kernel start tty0 points to the first available console which is tty1. However when getty is started after sytems is booted tty0 may not be equal to tty1 anymore e.g. if openvt was called like we do to run monitor app. in this case getty will take over CURRENT virtual console which is tty2 in case of monitor application

This completely confused me. Someone writes tty0 on cmdline, but it actually is tty1, and sometimes tty2?

Going more to the heart of the issue, based on what you described above, it sounds like you are saying, in chronological order:

  1. System starts
  2. TUI starts on tty0
  3. getty starts, tries also to grab tty0, not knowing that something already has it, and they corrupt each other.

Would it make sense then, rather than some hard-coding, to have a detection, some form of, "if ttyX already is in use, skip it"? I don't know how that would work, but it would be a logical thing to do.

@rucoder
Copy link
Contributor Author

rucoder commented Jan 13, 2025

technically this fix is applicable to eve only. no one in their sane mind would run getty manually from another process. I would not accept such fix if I were a maintainer for getty in linux kit

Exactly. Are we "in our sane mind" if we take a kernel cmdline that included console=tty0 and change it under the covers to tty1? I would think that would upset the user once they figured out what happened ("user" being us when we eventually have a problem and figure out what we did after spending two days trying to understand where getty is when we know we wrote "console=tty0").

becasue during kernel start tty0 points to the first available console which is tty1. However when getty is started after sytems is booted tty0 may not be equal to tty1 anymore e.g. if openvt was called like we do to run monitor app. in this case getty will take over CURRENT virtual console which is tty2 in case of monitor application

This completely confused me. Someone writes tty0 on cmdline, but it actually is tty1, and sometimes tty2?

Going more to the heart of the issue, based on what you described above, it sounds like you are saying, in chronological order:

1. System starts

2. TUI starts on tty0

3. getty starts, tries also to grab tty0, not knowing that something already has it, and they corrupt each other.

Would it make sense then, rather than some hard-coding, to have a detection, some form of, "if ttyX already is in use, skip it"? I don't know how that would work, but it would be a logical thing to do.

@deitch

no, this is not how it works. Again, there are 2 scenarios

  1. Getty is started BEFORE pillar and TUI. it happens if you have console= in kernel command line AND getty in kernel command line. while getty is started tty1 is the current console so tty0 points to tty1
  2. Getty is started AFTRER TUI from Pillar. it happens if you DO NOT have getty in kernel command line but you set debug.enable.console.true. it happens when TUI is started on tty2 (it is hardcoded!) and tty0 points to tty2 in this case becasue this is the CURRENT tty.

people use tty0 in kernel command line becasue they do not care which tty kernel is going to use for its console. in case of [1] this is the first available console which is tty1. in case of [2] this is tty2 which is not what we want

EDIT: i think the main confusion here is meaning of tty0. this is not the first graphical tty, this is CURRENT graphical tty

@OhmSpectator
Copy link
Member

OhmSpectator commented Jan 13, 2025

@deitch, from a doc here https://www.kernel.org/doc/html/next/admin-guide/serial-console.html:

Note that if you boot without a console= option (or with console=/dev/tty0), /dev/console is the same as /dev/tty0.

So, technically console=/dev/tty0 means, "we don't care about exactly ttyX to be used". As far as I understand it, tyy0 is a special "master" tty, that is linked to an active foreground tty, that technically can be any other ttyX.

@rucoder
Copy link
Contributor Author

rucoder commented Jan 13, 2025

@deitch, from a doc here https://www.kernel.org/doc/html/next/admin-guide/serial-console.html:

Note that if you boot without a console= option (or with console=/dev/tty0), /dev/console is the same as /dev/tty0.

So, technically console=/dev/tty0 means, "we don't care about exactly ttyX to be used". As far as I understand it, tyy0 is a special "master" tty, that is linked to an active foreground tty, that technically can be any other ttyX.

yes, exactly. And do not confuse with /dev/console and /dev/tty. this blogpost covers it
https://www.baeldung.com/linux/monitor-keyboard-drivers

@OhmSpectator
Copy link
Member

Also, this kernel doc could bring light to the difference between /dev/tty0 and /dev/ttyX:
https://www.kernel.org/doc/html/v6.11/admin-guide/devices.html#virtual-consoles-and-the-console-device

Virtual consoles are full-screen terminal displays on the system video monitor. Virtual consoles are named /dev/tty#, with numbering starting at /dev/tty1; /dev/tty0 is the current virtual console.

Copy link
Member

@OhmSpectator OhmSpectator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks good. Adding a link to the kernel documentation near the change explaining the special meaning of tty0 would make it even better.

@rucoder rucoder force-pushed the rucoder/getty-tty0 branch from 32cf850 to 30c9567 Compare January 13, 2025 13:12
@OhmSpectator
Copy link
Member

For the flaky test I created an Issue here: #4519

@OhmSpectator OhmSpectator self-requested a review January 13, 2025 14:23
Copy link
Member

@OhmSpectator OhmSpectator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restart the tests

@deitch
Copy link
Contributor

deitch commented Jan 13, 2025

So, technically console=/dev/tty0 means, "we don't care about exactly ttyX to be used". As far as I understand it, tyy0 is a special "master" tty, that is linked to an active foreground tty, that technically can be any other ttyX.

Good point, and the link to the kernel doc (and especially the blog post) clarifies.

So the problem isn't, "I started TUI on a known tty, and then getty started on that known tty because it was hard-coded"; it is "I started TUI on a known tty (tty2), and then getty started later on tty0, which just means the currently active tty, so when I switch to the TUI (tty2), it becomes active, and if I was there when getty started, it will mess it up."

Something like that?

I keep trying to come up with reasons why your logic should suggest something else, but I cannot. It makes sense. So much so, that I wonder if it should be upstreamed.

Copy link
Contributor

@deitch deitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some typos, and maybe include the links - the kernel doc and the blog post - in the documentation page?

@OhmSpectator
Copy link
Member

@rucoder, could you please address the comments from @deitch, so we can merge the PR soon?

@rucoder rucoder force-pushed the rucoder/getty-tty0 branch from 30c9567 to 557dc66 Compare January 15, 2025 12:31
@rucoder
Copy link
Contributor Author

rucoder commented Jan 15, 2025

@rucoder, could you please address the comments from @deitch, so we can merge the PR soon?

@OhmSpectator fixed

- Add explanation for differences between /dev/tty0, /dev/console, and /dev/ttyX

Signed-off-by: Mikhail Malyshev <[email protected]>
@rucoder rucoder force-pushed the rucoder/getty-tty0 branch from 557dc66 to ef304df Compare January 15, 2025 12:37
@OhmSpectator OhmSpectator merged commit 5726401 into lf-edge:master Jan 15, 2025
39 of 40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants