Cannot put 'Term' inside 'ResourceArc', or: Support for storing references to Erlang datatypes in Rustler structs? #333

Closed
Qqwy opened this issue Oct 1, 2020 · 6 comments

Comments

@Qqwy
Contributor

Qqwy commented Oct 1, 2020

For some use cases (cf. christian-public/Idris2-Erlang#4) it would be nice to store references to Erlang data structures inside a Rust structure.

Converting (serializing) the data to Rust types before storage is slow, and it breaks for certain types such as references: the underlying object might be garbage-collected in the meantime, so the reference may no longer be valid when the data is deserialized later. A prime example is the Erlang 'ref' type (such as used by :atomics, :counters or NIF modules).

A ResourceArc could store a pointer to an arbitrary Erlang term (as long as the ResourceArc as a whole is constructed in/passed back to the same Env the term came from), but currently this is not possible because Term itself does not have a ResourceTypeProvider implementation.

Two possibilities would be either adding a ResourceTypeProvider implementation for Term, or adding a separate type called e.g. TermArc, whose implementation would be similar to ResourceArc but simpler, since no type conversions would need to take place.

@Qqwy Qqwy changed the title Cannot put 'Term' inside 'ResourceArc', or: Support for storing Erlang references in Rustler structs? Cannot put 'Term' inside 'ResourceArc', or: Support for storing references to Erlang datatypes in Rustler structs? Oct 1, 2020
@hansihe
Member

hansihe commented Oct 1, 2020

We can't really do this: Rustler wraps the C NIF API, and this is not supported functionality there. ResourceArc is just a wrapper for resources, and you can't store terms in there safely.

One possibility to accomplish what you are trying to do could be to allocate an owned environment, and then use that for storing terms outside of the process heap. This is a really interesting idea actually, and we should probably look into implementing it. It would require testing to figure out any potential caveats of this approach with regard to memory usage or performance.

@Qqwy
Contributor Author

Qqwy commented Oct 1, 2020

Thank you for your reply!

I had a go at making a simple implementation of a managed mutable box type using an owned environment:

use rustler::env::{OwnedEnv, SavedTerm};
use rustler::{Atom, Env, Term};

// `atoms::ok()` below assumes an `atoms` module (declaring the `ok` atom)
// that is defined elsewhere in the crate.

// Put this in a ResourceArc:
pub struct MutableTermBox {
    inner: std::sync::Mutex<MutableTermBoxContents>,
}

struct MutableTermBoxContents {
    owned_env: OwnedEnv,
    saved_term: SavedTerm,
}

impl MutableTermBox {
    pub fn new(term: Term) -> Self {
        Self {
            inner: std::sync::Mutex::new(MutableTermBoxContents::new(term)),
        }
    }

    pub fn get<'a>(&self, env: Env<'a>) -> Term<'a> {
        let inner = self.inner.lock().unwrap();

        // Copy the term from the owned environment into the target environment
        inner.owned_env.run(|inner_env| {
            let term = inner.saved_term.load(inner_env);
            term.in_env(env)
        })
    }

    pub fn set(&self, term: Term) -> Atom {
        let mut inner = self.inner.lock().unwrap();

        // Clear the owned environment, then save the replacement term into it
        inner.owned_env.clear();
        inner.saved_term = inner.owned_env.save(term);

        atoms::ok()
    }
}

impl MutableTermBoxContents {
    fn new(term: Term) -> Self {
        let owned_env = OwnedEnv::new();
        let saved_term = owned_env.save(term);
        Self { owned_env, saved_term }
    }
}

Since I'm new to Rustler, there are probably ways to improve this further.

@Qqwy
Contributor Author

Qqwy commented Oct 1, 2020

As for memory usage/efficiency: I have not done any profiling so far (nor do I know what a good approach to profiling this kind of code would be), but I did have a look at the internals of the BEAM's erl_nif.c implementation.
As far as I was able to tell, for the above implementation this means that:

  • both reading (get) and writing (set) require a deep copy of the term to be made.
  • allocating the OwnedEnv should be reasonably fast, as much less work is done than when allocating a 'normal' process environment. Not that this is very important, because new will probably be called much less often than get/set.
  • clearing the OwnedEnv (during set) should be very fast, since it does nothing more than call the destructors of (and deallocate) all saved terms (of which there is only one in the above code).
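
To make the cost model in the points above concrete, here is a minimal std-only sketch that can be run outside the BEAM. It is a toy model and not Rustler's or erl_nif's implementation: `Value`, `ModelEnv`, `ModelSaved` and `ModelBox` are hypothetical names, and the generation counter is an assumption of the model used to show why a handle saved before `clear` must not be reused.

```rust
use std::sync::Mutex;

/// Stand-in for an Erlang term: a small tree that has to be deep-copied to be moved.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// Toy model of an owned environment: a private arena plus a generation counter.
pub struct ModelEnv {
    arena: Vec<Value>,
    generation: u64,
}

/// Handle to a value inside a `ModelEnv`; only valid for the generation it was saved in.
#[derive(Clone, Copy)]
pub struct ModelSaved {
    index: usize,
    generation: u64,
}

impl ModelEnv {
    pub fn new() -> Self {
        ModelEnv { arena: Vec::new(), generation: 0 }
    }

    /// Deep-copies `value` into the arena (the cost paid on every write).
    pub fn save(&mut self, value: &Value) -> ModelSaved {
        self.arena.push(value.clone());
        ModelSaved { index: self.arena.len() - 1, generation: self.generation }
    }

    /// Deep-copies the value back out (the cost paid on every read).
    /// Returns `None` for handles that were saved before the last `clear`.
    pub fn load(&self, saved: ModelSaved) -> Option<Value> {
        if saved.generation != self.generation {
            return None;
        }
        self.arena.get(saved.index).cloned()
    }

    /// Drops everything stored so far and invalidates all earlier handles.
    pub fn clear(&mut self) {
        self.arena.clear();
        self.generation += 1;
    }
}

/// Model of the mutable box: one environment and one saved handle behind a mutex.
pub struct ModelBox {
    inner: Mutex<(ModelEnv, ModelSaved)>,
}

impl ModelBox {
    pub fn new(value: &Value) -> Self {
        let mut env = ModelEnv::new();
        let saved = env.save(value);
        ModelBox { inner: Mutex::new((env, saved)) }
    }

    /// Reading copies the stored value out of the private arena.
    pub fn get(&self) -> Value {
        let guard = self.inner.lock().unwrap();
        guard.0.load(guard.1).expect("the handle is refreshed on every set")
    }

    /// Writing clears the arena, then copies the new value in.
    pub fn set(&self, value: &Value) {
        let mut guard = self.inner.lock().unwrap();
        guard.0.clear();
        let saved = guard.0.save(value);
        guard.1 = saved;
    }
}
```

In this model every `get` and every `set` performs exactly one `clone` of the whole value, while `clear` only drops what the arena holds, which mirrors the three observations above.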

@Qqwy
Contributor Author

Qqwy commented Oct 20, 2020

I have another idea: if e.g. an ETS handle is created and then moved into an OwnedEnv, we can read from/write to this ETS data from anywhere without incurring the copying overhead that we'd have from moving terms into and out of the owned environment directly.

So this might make allocating and deallocating an owned environment slightly slower, but it will probably significantly increase the read/write speed from/to it. This is probably worthwhile.

@hansihe
Member

hansihe commented Oct 20, 2020 via email

@Qqwy
Contributor Author

Qqwy commented Dec 21, 2020

@hansihe Yes, unfortunately it does.

I think the 'OwnedEnv' is the best we can do right now. It would be great if a hook were added to erl_nif to tell the runtime that you're storing a reference to an object in your structure (and another one to relinquish this reference later), so that cheap refcounting could be used rather than expensive copying.

EDIT: Of course, it might be difficult to do so without interfering with Erlang's garbage collector, which might invalidate pointers.

I think this would be the final piece of the puzzle to allow us to use native containers (arrays, vectors, queues, and the like), which could be very useful in certain circumstances.
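
As a rough illustration of the difference between copying and refcounting, here is a std-only Rust analogy. It is not an erl_nif or Rustler API (no such hook exists according to this thread); `CopyingBox` and `SharedBox` are hypothetical names, and `Arc` merely stands in for the kind of reference counting the hook would enable.

```rust
use std::sync::{Arc, Mutex};

/// Box whose reads copy the whole payload, like the OwnedEnv-based box.
pub struct CopyingBox {
    inner: Mutex<Vec<u64>>,
}

impl CopyingBox {
    pub fn new(data: &[u64]) -> Self {
        CopyingBox { inner: Mutex::new(data.to_vec()) }
    }

    /// Cost grows with the size of the payload: a fresh allocation per read.
    pub fn get(&self) -> Vec<u64> {
        self.inner.lock().unwrap().clone()
    }
}

/// Box that hands out shared handles instead of copies.
pub struct SharedBox {
    inner: Mutex<Arc<Vec<u64>>>,
}

impl SharedBox {
    pub fn new(data: Arc<Vec<u64>>) -> Self {
        SharedBox { inner: Mutex::new(data) }
    }

    /// Cost is constant: only the reference count is incremented.
    pub fn get(&self) -> Arc<Vec<u64>> {
        let guard = self.inner.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Replacing the payload drops the box's reference to the old one.
    pub fn set(&self, data: Arc<Vec<u64>>) {
        *self.inner.lock().unwrap() = data;
    }
}
```

With `SharedBox`, a value returned by `get` points at the same allocation as the stored payload, whereas `CopyingBox::get` always returns a separate allocation. The analogy only works because `Arc` data never moves; the garbage-collector concern from the EDIT above is exactly what it leaves out.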
