Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use AsFd and AsSocket instead of AsRawFd and AsRawSocket #38

Closed
notgull opened this issue Aug 11, 2022 · 15 comments
Closed

Use AsFd and AsSocket instead of AsRawFd and AsRawSocket #38

notgull opened this issue Aug 11, 2022 · 15 comments

Comments

@notgull
Copy link
Member

notgull commented Aug 11, 2022

Rust 1.63 stabilized the I/O safety features in the standard library. As per sunfishcode/io-lifetimes#38, Rust crates in the ecosystem (such as async-std and smol) should be migrated to using the new I/O-safe traits instead of the AsRaw* family of traits.

To port this crate, we could replace the Source trait implementations, currently:

        impl Source for RawFd {
            fn raw(&self) -> RawFd {
                *self
            }
        }

        impl<T: AsRawFd> Source for &T {
            fn raw(&self) -> RawFd {
                self.as_raw_fd()
            }
        }

with this:

        impl Source for BorrowedFd<'_> {
            fn raw(&self) -> RawFd {
                self.as_raw_fd()
            }
        }

        impl<T: AsFd> Source for &T {
            fn raw(&self) -> RawFd {
                self.as_fd().as_raw_fd()
            }
        }

Although, ideally, the trait would return BorrowedFd<'_> instead, and the rest of the crate would be adjusted accordingly.

I am willing to implement this change in this crate as well as in async-io. The main question is if the MSRV bump is worth it.

@taiki-e
Copy link
Collaborator

taiki-e commented Aug 12, 2022

If we do this, it would be a breaking change because it would change types that implement Source . And the new version force downstream to use AsFd/AsSocket. Right?

@notgull
Copy link
Member Author

notgull commented Aug 12, 2022

I agree, this might be premature. Let's wait until the ecosystem evolves to act.

tzemanovic added a commit to heliaxdev/async-io that referenced this issue Sep 20, 2022
work-around for
smol-rs/polling#38
breaking namada tooling build with nightly 2022-05-20
tzemanovic added a commit to heliaxdev/polling that referenced this issue Sep 20, 2022
work-around for
smol-rs#38
breaking namada tooling build with nightly 2022-05-20
tzemanovic added a commit to heliaxdev/async-process that referenced this issue Sep 20, 2022
work-around for
smol-rs/polling#38
breaking namada tooling build with nightly 2022-05-20
@notgull
Copy link
Member Author

notgull commented Jun 14, 2023

Okay, the ecosystem has evolved. We can now definitely use AsFd/AsSocket in our definitions. However, it comes with a catch; for kqueue, IOCP, poll and event ports, the source (the AsFd implementer) has to outlive its registration in the Poller. I'm still brainstorming the best way of implementing this in a way that is compatible with both async-io and calloop.

@notgull
Copy link
Member Author

notgull commented Jun 15, 2023

Unfortunately I can't escape the conclusion that integrating I/O safety into this API would require it becoming either unsafe or very unwieldy. For any given AsFd implementor, there is no guarantee that the file descriptor it returns will be consistent or that it won't be closed before the actual object is dropped. The other way of doing this would be to parameterize the API around BorrowedFd, but that would make the API hard to use, especially for async-io and calloop.

@notgull
Copy link
Member Author

notgull commented Jun 15, 2023

To clarify this issue: the poll() backend and the IOCP backend directly store the raw FD/socket inline. Even without these backends both kqueue and event ports technically store the raw resource as well.

Therefore a file descriptor must be unregistered before it is dropped to prevent soundness issues. The idea was to have a Source<T> type that contains the source as well as a reference to the Poller that unregisters the T before dropping it. However it would still be possible to pass this type in:

struct Bad {
    x: Option<TcpStream>,
    y: TcpStream,
}

impl AsFd for Bad {
    fn as_fd(&self) -> BorrowedFd<'_> {
        self.x.as_ref().unwrap_or(&self.y).as_fd()
    }
}

Then a user could do:

let bad = Bad { .. };
let mut bad: Source<Bad> = poller.register(bad, ..);
bad.get_mut().x = None;
drop(bad);

This means that x is dropped before the Bad source is deregistered, leading to unsound behavior.

This could be solved by marking get_mut as unsafe with the precondition that the inner source must not be closed or changed. However this would mean that, on the async-io crate, Async::get_mut would need to be either removed or made unsafe, potentially crippling the functionality. Not to mention with interior mutability it wouldn't be too hard to make the return value of AsFd inconsistent.

I'm not sure if there is an actual solution to this aside from just making the public API as it currently is unsafe and then letting higher level crates be the explicit safe interface.

@notgull
Copy link
Member Author

notgull commented Jun 16, 2023

@smol-rs/admins @sunfishcode We should brainstorm about this problem. I really don't want to remove the get_mut() function from async_io, but as the moment I can't see a way around that.

@fogti
Copy link
Member

fogti commented Jun 16, 2023

well, if it is unsafe to use, just make it unsafe and document the requirement that the file descriptor must be stable.

@notgull
Copy link
Member Author

notgull commented Jun 17, 2023

If we make direct access to the underlying source unsafe, several other operations also become unsafe, like [read/write]_with_mut and the Read/Write implementations. This would cripple the functionality of the async_io crate, and I would really like to find a way around this if we can.

@fogti
Copy link
Member

fogti commented Jun 18, 2023

hm we could implement some guarding in the _with_mut functions to check if the file descriptor stayed the same. I'm unsure if the rust compiler can optimize out redundant checks, if it can, this shouldn't even incur a runtime cost.

@notgull
Copy link
Member Author

notgull commented Jun 18, 2023

hm we could implement some guarding in the _with_mut functions to check if the file descriptor stayed the same. I'm unsure if the rust compiler can optimize out redundant checks, if it can, this shouldn't even incur a runtime cost.

This might be the best solution, but what should happen //if// the check doesn't match? The only real option would be to abort() the program, since soundness has been violated (and, if checks like this were added in the future, Miri would've crashed the program already).

@notgull
Copy link
Member Author

notgull commented Jun 20, 2023

hm we could implement some guarding in the _with_mut functions to check if the file descriptor stayed the same. I'm unsure if the rust compiler can optimize out redundant checks, if it can, this shouldn't even incur a runtime cost.

I just realized this probably wouldn't work. Consider the following code:

let poller = Arc::new(Poller::new());
let (signal, cont) = mpsc::channel();

// on main thread
struct Bad(Option<TcpStream>);

impl AsFd for Bad {
    fn as_fd(&self) -> BorrowedFd<'_> {
        self.0.as_ref().unwrap().as_fd()
    }
}

let mut source = poller.add(Bad(Some(TcpStream::connect(..))));
source.with_mut(|bad| {
    // Drop the inner TCP socket, closing the FD.
    bad.0 = None;

    // Signal the other thread and block forever.
    signal.send(()).unwrap();
    parking::Parker::new().park();
    unreachable!()
});

// on thread #2
cont.recv().unwrap();

// Allocate a new file descriptor. This probably takes up the previously
// allocated socket's FD.
let _unaware = TcpStream::connect(..);

// This call arbitrarily uses the closed/new socket, immediately leading to undefined behavior.
poller.wait(..)?;

The problem being that there's nothing really stopping us from parking mid-closure and then doing unsound things in the meantime. This could be solved by having a mutex somewhere that doesn't let wait be called while with_mut is active, but that adds overhead and possible deadlocks that I would like to avoid.

One solution would be to make add unsafe with the condition that the provided source must not be closed while it it still registered in the poller. Another solution would be to have an unsafe Source trait that stipulates that there is no way to close the underlying FD from &mut.

See tokio-rs/mio#1588 for the mirrored mio version of this issue, although they haven't started discussing this conundrum yet.

@sunfishcode
Copy link
Contributor

I unfortunately don't have any answers here.

Since kqueue uses file descriptor associations, rather than file description associations, I/O safety would oblige Poller to either make add unsafe, or to have add take ownership of the fds. If it takes ownership, then how does the user get it back when an event happens to do the I/O? It's theoretically doable, by adding a lifetime parameter to Event and having Event have the ability to provide BorrowedFds, but that's not very ergonomic.

@ids1024
Copy link

ids1024 commented Jun 28, 2023

I'm not sure I understand why kqueue is a safety concern here? If a file descriptor added to a kqueue is closed, it is automatically removed from the kqueue. In contrast with epoll, which will still get events after the fd is closed, with the fd number of the closed fd. Neither behavior is inherently unsafe, though I don't know if thats an issue with how it is used in polling or dependent libraries. poll however is obviously more of an issue since the library has to continuously call poll with the fd, and it should be valid to do that.

I/O safety would oblige Poller to either make add unsafe, or to have add take ownership of the fds.

Theoretically there are a few safe options:

  • Take ownership
  • Require file descriptors to be reference counted
  • Poller has a lifetime bound and uses fds with that lifetime
    • May work nicely in some niche cases, but completely unhelpful in most, I expect.

I don't think any of these are viable with how the library would normally be used. Though higher level libraries using polling may be able to provide a safe library by assuming ownership of anything they poll. I think it makes sense to have this unsafe at the polling level, while consumers of the library can then freely adopt any of these solutions if feasible for their use cases.

@notgull
Copy link
Member Author

notgull commented Jun 29, 2023

I'm not sure I understand why kqueue is a safety concern here? If a file descriptor added to a kqueue is closed, it is automatically removed from the kqueue.

kqueue keeps track of the file descriptor rather than the file description, like epoll does. This means that the FD is morally borrowed by kqueue for the duration of its registration. See here for further discussion.

I also think this affects Solaris' event ports, but I can't be bothered to take a deep dive into illumos' source code to figure it out. Some Solaris nerd can tell me if I'm wrong.

I don't think any of these are viable with how the library would normally be used. Though higher level libraries using polling may be able to provide a safe library by assuming ownership of anything they poll. I think it makes sense to have this unsafe at the polling level, while consumers of the library can then freely adopt any of these solutions if feasible for their use cases.

Sounds good to me.

@notgull
Copy link
Member Author

notgull commented Aug 10, 2023

Closed in #123

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants