impl Default for Tensor? #822

emchristiansen · 2023-07-20T02:30:08Z

Thanks for working on this!
I once had a project where we regularly worked with 6-dimensional tensors, and keeping track of the axes was such a pain that we wrote a separate library to do it for us. Something like this would have been great!

Is it possible to impl Default for Tensor in any reasonable way?
E.g. if I only have the type Tensor<S, E, D, T>, can I generate a Tensor of zeros of that type?
So far the only construction method I've seen for Tensors uses a device object.

Why I care: I have crazy nested datastructures that I want to compute gradients through using dfdx.
The datastructures can be parameterized with anything "number like", and for something to be "number like" it has to have an additive identity element (zero), i.e. the default value.
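To make the "number like" requirement concrete, here is a minimal std-only sketch. The trait name `NumLike` and the `Layer` struct are hypothetical illustrations, not part of dfdx or the num crates; the sketch simply treats `Default::default()` as the additive identity, as described above.

```rust
use std::ops::Add;

/// Hypothetical "number like" bound: a type with an additive identity
/// (supplied by `Default`) that can be added to itself.
pub trait NumLike: Default + Clone + Add<Output = Self> {}

// Blanket impl: every type that meets the bounds counts as "number like".
impl<T: Default + Clone + Add<Output = T>> NumLike for T {}

/// A small data structure that is generic over its scalar type.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Layer<N: NumLike> {
    pub weights: Vec<N>,
    pub bias: N,
}

impl<N: NumLike> Layer<N> {
    /// Sums the weights and the bias, starting from the additive identity.
    pub fn total(&self) -> N {
        self.weights
            .iter()
            .cloned()
            .fold(N::default(), |acc, w| acc + w)
            + self.bias.clone()
    }
}
```

With this bound, `Layer<f32>` and `Layer<i64>` both work today; a tensor type could only be plugged in if it also implemented `Default`, which is the gap this issue is about.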

Relatedly, ensuring Tensor<Rank0, _, _, _> impls all the num-like traits would be amazing, as it would make it a drop-in replacement for f32, with the side-effect of getting gradients for free.
This would make me very happy.

coreylowman · 2023-07-20T12:50:05Z

I think this would require thread-local device objects (like rand::thread_rng()), but if we had that it would be possible. Imagining something like:

pub fn thread_cpu() -> Cpu { ... }
pub fn thread_cuda(ordinal: usize) -> Cuda { ... }

impl<S: Shape, E: Dtype> Default for Tensor<S, E, Cpu> {
    fn default() -> Self {
        thread_cpu().zeros()
    }
}

impl<S: Shape, E: Dtype> Default for Tensor<S, E, Cuda> {
    fn default() -> Self {
        thread_cuda(0).zeros()
    }
}

I'm unsure how sound these thread-local objects are, though; I'd have to think about it. It would be weird to mix the use of the thread-local object and a separate device object.
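For illustration only, the thread-local idea can be sketched with the standard library's `thread_local!` macro. `FakeCpu` and `FakeTensor` below are hypothetical stand-ins, not dfdx types, and `thread_cpu` merely borrows the name proposed above; the point is just that a per-thread handle lets `Default` be implemented without a device argument.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a device handle; NOT dfdx's `Cpu`.
/// Clones share one allocation counter, mimicking shared device state.
#[derive(Clone, Default)]
pub struct FakeCpu {
    allocations: Rc<RefCell<usize>>,
}

impl FakeCpu {
    /// Allocates a zero-filled buffer and records the allocation.
    pub fn zeros(&self, len: usize) -> Vec<f32> {
        *self.allocations.borrow_mut() += 1;
        vec![0.0; len]
    }

    /// Number of buffers allocated through this device so far.
    pub fn allocation_count(&self) -> usize {
        *self.allocations.borrow()
    }
}

thread_local! {
    // One device per thread, created lazily on first access.
    static THREAD_CPU: FakeCpu = FakeCpu::default();
}

/// Returns a handle to the current thread's device (hypothetical helper).
pub fn thread_cpu() -> FakeCpu {
    THREAD_CPU.with(|dev| dev.clone())
}

/// Fixed-size tensor stand-in whose `Default` needs no device argument.
pub struct FakeTensor<const N: usize> {
    pub data: Vec<f32>,
}

impl<const N: usize> Default for FakeTensor<N> {
    fn default() -> Self {
        FakeTensor { data: thread_cpu().zeros(N) }
    }
}
```

In this sketch each thread gets its own device, so state recorded on one thread is invisible on another, and a device created separately with `FakeCpu::default()` would not share the thread-local one's state either. That is one way the mixing concern could show up in practice.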

emchristiansen · 2023-07-20T15:27:39Z

As a workaround, assuming I'm doing everything on a single device (say the CPU for now), could I just define something like this and use it for my device everywhere, assuming I'm careful to remain inside the same system thread*?

pub static DFDX_DEVICE: Lazy<Cpu> = Lazy::new(|| Cpu::default());

fn foo() {
  let weight: Tensor<Rank2<4, 2>, f32, _, NoneTape> =
    DFDX_DEVICE.sample_normal();
  ...
}

But even if that worked, what about the gradient tape?
If T is NoneTape it's pretty clear what to do, but what if T is OwnedTape<..>?
What would the correct default value be in that case?

*Also, is thread locality important for Cpu or just Cuda?
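As a rough std-only analogue of the `Lazy` static above, the same "one device for the whole process" pattern can be written with `std::sync::OnceLock`. `FakeDevice` and `global_device` are hypothetical names used only for this sketch, not dfdx APIs. Note that anything stored in a `static` must be `Sync`, which the stand-in satisfies by keeping its state in an atomic.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a device; NOT dfdx's `Cpu`.
#[derive(Default)]
pub struct FakeDevice {
    tensors_created: AtomicUsize,
}

impl FakeDevice {
    /// Creates a zero-filled buffer and counts it.
    pub fn zeros(&self, len: usize) -> Vec<f32> {
        self.tensors_created.fetch_add(1, Ordering::Relaxed);
        vec![0.0; len]
    }

    /// Number of buffers created through this device so far.
    pub fn tensors_created(&self) -> usize {
        self.tensors_created.load(Ordering::Relaxed)
    }
}

/// Returns the single process-wide device, initializing it on first use.
pub fn global_device() -> &'static FakeDevice {
    static DEVICE: OnceLock<FakeDevice> = OnceLock::new();
    DEVICE.get_or_init(FakeDevice::default)
}
```

Every call to `global_device()` returns a reference to the same object, so all tensors created through it share one device's state regardless of which thread makes the call.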

coreylowman · 2023-07-24T13:26:59Z

> As a workaround, assuming I'm doing everything on a single device (say the CPU for now), could I just define something like this and use it for my device everywhere, assuming I'm careful to remain inside the same system thread*?

Yeah definitely!

> But even if that worked, what about the gradient tape?

Probably just call .traced() after construction - none of the tensor creation methods currently create OwnedTapes, so that would be consistent.
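A minimal sketch of that convention, using hypothetical stand-in types rather than dfdx's `NoneTape` and `OwnedTape`: `Default` is implemented only for the untraced state, and tracing is an explicit step taken after construction. The `traced` method here takes no arguments purely for illustration and is not meant to mirror dfdx's actual signature.

```rust
/// Marker for "no gradient tape" (stand-in, NOT dfdx's `NoneTape`).
pub struct NoTape;

/// A tape that records operation names (stand-in, NOT dfdx's `OwnedTape`).
pub struct Recording {
    pub ops: Vec<&'static str>,
}

/// Tensor stand-in, generic over its tape state.
pub struct FakeTensor<T> {
    pub data: Vec<f32>,
    pub tape: T,
}

// `Default` exists only for the untraced state, so there is never a
// question about what a "default tape" should contain.
impl Default for FakeTensor<NoTape> {
    fn default() -> Self {
        FakeTensor { data: vec![0.0; 3], tape: NoTape }
    }
}

impl FakeTensor<NoTape> {
    /// Opts in to recording after construction.
    pub fn traced(self) -> FakeTensor<Recording> {
        FakeTensor { data: self.data, tape: Recording { ops: Vec::new() } }
    }
}

impl FakeTensor<Recording> {
    /// Scales every element and records the operation on the tape.
    pub fn scale(mut self, k: f32) -> Self {
        for x in &mut self.data {
            *x *= k;
        }
        self.tape.ops.push("scale");
        self
    }
}
```

Usage would then read `FakeTensor::<NoTape>::default().traced().scale(2.0)`: construct with the default, opt in to tracing, and only then perform operations that should be recorded.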

> *Also, is thread locality important for Cpu or just Cuda?

If it's just for you, you probably don't need to worry about it. The main thing is to minimize the number of different device objects. For CPU, it's mainly important if you want to enable allocation caching (https://docs.rs/dfdx/latest/dfdx/tensor/trait.Cache.html). The same applies to CUDA, but CUDA also loads kernels into the device object, so you don't want to create multiple ones.
