Cannot put 'Term' inside 'ResourceArc', or: Support for storing references to Erlang datatypes in Rustler structs? #333

Qqwy · 2020-10-01T14:01:43Z

For some use cases (cf. christian-public/Idris2-Erlang#4) it would be nice to store references to Erlang data structures inside a Rust structure.

Converting (serializing) the data to Rust types before storage is slow, and it breaks for certain types such as references: the original may be garbage-collected in the meantime, so the data may no longer be valid by the time it is deserialized. A prime example is the Erlang 'ref' type (as used by :atomics, :counters or NIF modules).

ResourceArc might store a pointer to an arbitrary Erlang term inside (as long as the ResourceArc as a whole is constructed/passed back to the same Env the term came from), but currently this is not possible because Term itself does not have a ResourceTypeProvider implementation.

There are two possibilities: either add a ResourceTypeProvider implementation for Term, or add a separate type, e.g. TermArc, whose implementation would be similar to ResourceArc but simpler, since no type conversions would need to take place.

hansihe · 2020-10-01T14:28:07Z

We can't really do this, we wrap the C NIF API, and this is not supported functionality there. ResourceArc is just a wrapper for resources, and you can't store terms in there safely.

One possibility to accomplish what you are trying to do could be to allocate an owned environment, and then use that for storing terms outside of the process heap. This is a really interesting idea actually, and we should probably look into implementing it. It would require testing to figure out any potential caveats to this approach in regards to memory usage or performance.

Qqwy · 2020-10-01T16:03:46Z

Thank you for your reply!

I had a go making a simple implementation of a managed mutable box type using an owned environment:

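use rustler::env::{OwnedEnv, SavedTerm};
use rustler::{Atom, Env, Term};

// Note: `atoms::ok()` below assumes an `atoms` module that defines an `ok` atom
// elsewhere in the crate (e.g. via `rustler::atoms! { ok }`).

// Put this in a ResourceArc: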
pub struct MutableTermBox
{
    inner: std::sync::Mutex<MutableTermBoxContents>,
}

struct MutableTermBoxContents
{
    owned_env: OwnedEnv,
    saved_term: SavedTerm
}

impl MutableTermBox {
    pub fn new(term: Term) -> Self {
        Self{inner: std::sync::Mutex::new(MutableTermBoxContents::new(term))}
    }

    pub fn get<'a>(&self, env: Env<'a>) -> Term<'a> {
        let inner = self.inner.lock().unwrap();

        // Copy over term from owned environment to the target environment
        inner.owned_env.run(|inner_env| {
            let term = inner.saved_term.load(inner_env);
            term.in_env(env)
        })
    }

    pub fn set(&self, term: Term) -> Atom {
        let mut term_ptr = self.inner.lock().unwrap();
        term_ptr.owned_env.clear();
        term_ptr.saved_term = term_ptr.owned_env.save(term);

        atoms::ok()
    }
}

impl MutableTermBoxContents {
    fn new(term: Term) -> Self {
        let owned_env = OwnedEnv::new();
        let saved_term = owned_env.save(term);
        Self { owned_env, saved_term }
    }
}

Since I'm new to Rustler there probably are some ways to improve this further.

Qqwy · 2020-10-01T19:40:22Z

As for memory usage/efficiency: I have not done any profiling so far (nor do I know what a good approach to profiling this kind of code would be), but I did have a look at the internals of the BEAM's erl_nif.c implementation.
As far as I was able to tell, for the above implementation this means that:

- Both reading (get) and writing (set) require a deep copy of the term.
- Allocating the OwnedEnv should be reasonably fast, as much less work is done than when allocating a 'normal' process environment. Not that this matters much, because new will probably be called far less often than get/set.
- Clearing the OwnedEnv (during set) should be very fast, since it does nothing more than call the destructors of (and deallocate) all saved terms (of which the above code has only one).
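
To make the copy cost in the points above concrete, here is a minimal std-only model of the box. It does not use Rustler or the BEAM at all: `ToyTerm`, `ToyTermBox` and `copy_cost_demo` are hypothetical names invented for this sketch, and counting visited nodes is only a rough stand-in for the real cost of copying a term between environments.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for an Erlang term; not a Rustler type.
#[derive(Clone, Debug, PartialEq)]
enum ToyTerm {
    Int(i64),
    List(Vec<ToyTerm>),
}

impl ToyTerm {
    // Number of nodes a deep copy of this term has to visit.
    fn size(&self) -> usize {
        match self {
            ToyTerm::Int(_) => 1,
            ToyTerm::List(items) => 1 + items.iter().map(ToyTerm::size).sum::<usize>(),
        }
    }
}

struct Contents {
    term: ToyTerm,
    nodes_copied: usize,
}

// Models the box: the stored term lives in separate storage, so every
// construction, get and set copies the whole term across the boundary.
struct ToyTermBox {
    inner: Mutex<Contents>,
}

impl ToyTermBox {
    fn new(term: &ToyTerm) -> Self {
        let contents = Contents { term: term.clone(), nodes_copied: term.size() };
        ToyTermBox { inner: Mutex::new(contents) }
    }

    // Reading hands out a deep copy of the stored term.
    fn get(&self) -> ToyTerm {
        let mut inner = self.inner.lock().unwrap();
        let n = inner.term.size();
        inner.nodes_copied += n;
        inner.term.clone()
    }

    // Writing drops the old term and deep-copies the new one in.
    fn set(&self, term: &ToyTerm) {
        let mut inner = self.inner.lock().unwrap();
        inner.term = term.clone();
        inner.nodes_copied += term.size();
    }

    fn nodes_copied(&self) -> usize {
        self.inner.lock().unwrap().nodes_copied
    }
}

// Stores a 1000-element list, reads it once, then overwrites it with a
// small term, and reports how many nodes were copied in total.
fn copy_cost_demo() -> usize {
    let big = ToyTerm::List((0..1000).map(ToyTerm::Int).collect());
    let cell = ToyTermBox::new(&big);
    assert_eq!(cell.get(), big);
    cell.set(&ToyTerm::Int(1));
    cell.nodes_copied()
}
```

Calling `copy_cost_demo()` from a `main` function shows that the cost of each operation scales with the size of the term being moved, not with the size of the box itself.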

Qqwy · 2020-10-20T22:03:29Z

I have another idea: If e.g. an ETS handle is created and then moved to inside an OwnedEnv, we can read/write to this ETS data from anywhere without incurring the copying overhead that we'd have from moving terms into and out of the owned environment directly.

So this might make allocating and de-allocating an owned environment slightly slower, but will probably significantly increase the read/write speed from/to it. This is probably worthwhile.

hansihe · 2020-10-20T22:30:26Z

I'm pretty sure reading and writing from ETS also performs a copy of the term.

Qqwy · 2020-12-21T12:59:56Z

@hansihe Yes, unfortunately it does.

I think the 'OwnedEnv' is the best we can do right now. It would be great if a hook were added to erl_nif to tell the runtime that you're storing a reference to an object in your structure (and another one to relinquish this reference later), so that cheap refcounting could be used rather than expensive copying.

EDIT: Of course, it might be difficult to do so without interfering with Erlang's garbage collector which might invalidate pointers.

I think this would be the final piece of the puzzle to allow us to use native containers (arrays, vectors, queues, and the like), which could be very useful in certain circumstances.
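
The difference between refcounting and copying mentioned above can be illustrated in plain Rust with `std::sync::Arc`. This is only an analogy under stated assumptions: erl_nif offers no such hook for arbitrary terms, and `share_vs_copy` is a hypothetical function written for this sketch, with a `Vec<u64>` standing in for a large term.

```rust
use std::sync::Arc;

// Compares handing out a refcounted handle with making a deep copy.
// Returns (handles share one buffer, copy has its own buffer, refcount).
fn share_vs_copy() -> (bool, bool, usize) {
    // Hypothetical payload standing in for a large term.
    let original: Arc<Vec<u64>> = Arc::new((0..1_000).collect());

    // Refcounted handle: increments a counter, no element is copied.
    let shared = Arc::clone(&original);

    // Deep copy: allocates a new buffer and copies every element.
    let copied: Vec<u64> = (*original).clone();

    let same_buffer = Arc::ptr_eq(&original, &shared);
    let copy_is_separate = copied.as_slice().as_ptr() != original.as_slice().as_ptr();
    (same_buffer, copy_is_separate, Arc::strong_count(&original))
}
```

The shared handle costs the same regardless of payload size, which is the property the proposed hook would bring to stored terms; the caveat about the garbage collector moving or invalidating data still applies on the BEAM side.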

Qqwy changed the title ~~Cannot put 'Term' inside 'ResourceArc', or: Support for storing Erlang references in Rustler structs?~~ Cannot put 'Term' inside 'ResourceArc', or: Support for storing references to Erlang datatypes in Rustler structs? Oct 1, 2020

This was referenced Sep 8, 2021

Any document on type conversions between Rust and Elixir? #363

Open

StoredTerm #382

Closed

Qqwy closed this as completed May 27, 2022
