Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unexpected behavior in [ strided_slice #230

Closed
t-kalinowski opened this issue Apr 2, 2018 · 27 comments
Closed

unexpected behavior in [ strided_slice #230

t-kalinowski opened this issue Apr 2, 2018 · 27 comments

Comments

@t-kalinowski
Copy link
Member

missing elements in an index seem to be ignored.

library(tensorflow)
d <- c(2, 4, 2)
X <- array_reshape(1:prod(d), d)

x <- tf$convert_to_tensor_or_sparse_tensor(X)
sess <- tf$Session()
x_1_3 <-   sess$run(x[1,c(1,3),])
x_1_2_3 <- sess$run(x[1,c(1,2,3),])
identical(x_1_3, x_1_2_3) # should be FALSE
#> [1] TRUE

# expected output:
X[1, c(1,3),]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    5    6

# actual output:
x_1_3
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6

Created on 2018-04-02 by the reprex package (v0.2.0).

@jjallaire
Copy link
Member

@goldingn Is the above expected behavior?

@bwlewis
Copy link
Collaborator

bwlewis commented Apr 3, 2018

[is eventually mapped in this example to tf.strided_slice (https://www.tensorflow.org/api_docs/python/tf/strided_slice) here: https://github.com/rstudio/tensorflow/blob/master/R/extraction.R#L104

Ye gads! tf.strided_slice has an astoundingly horrible API and it does not look like it can handle generic indexing. It hurts my brain to read through that stuff! It's not clear from some quick and dirty experiments I just ran that the suggested https://www.tensorflow.org/api_docs/python/tf/Tensor#__getitem__ method is much better. (I was not able to replicate generic numpy indexing with that either).

Since the tensorflow APIs are so awful, it might be best to just throw an error when presented with non-contiguous bracket function subset ranges until tensorflow itself is improved. Just a cheap proposal...

See also several open/gripe issues on tensorflow related to this like tensorflow/tensorflow#4638 and tensorflow/tensorflow#206

For complete reference, the actual low-level tensorflow implementation is here: https://github.com/tensorflow/tensorflow/blob/7a0def60d45c1841a4e79a0ddf6aa9d50bf551ac/tensorflow/core/ops/array_ops.cc#L1501

edit: PyTorch does this right: http://pytorch.org/docs/master/torch.html#torch.index_select -- tensorflow should follow that lead.

@jjallaire
Copy link
Member

Wow, that's really disappointing that TF isn't smarter about this.

@goldingn Do you agree that we should throw an error in this case? If so could you propose a change to that effect?

@t-kalinowski
Copy link
Member Author

would it be difficult to split the slicing index vector into contiguous sequences, call tf.strided_slice() for each chunk, then tf.concat() the results?

@goldingn
Copy link
Contributor

goldingn commented Apr 4, 2018

Yeah, unfortunately this is the expected behaviour, because tf$strided_slice is quite limited.

I agree we should error on this, rather than silently ignoring the intermediate values.

We should also mention the limitations of this extraction here. This sentence is misleading:

"In constrat to tf$ functions, extraction using [ uses a syntax fully compatible with R extraction, including the use of 1-based rather than 0-based indexes"

I'll add the error handling, a test, and edit the docs then send a PR.

I wonder if there's a more visible place in the docs we can discuss what the extract syntax can do?

@goldingn
Copy link
Contributor

goldingn commented Apr 4, 2018

@t-kalinowski

It would be possible to do a bunch of strided_slice-ing and concating for the user, but that would be a departure from the python tensorflow [ syntax, which is just an interface to strided slice. E.g. in python tensorflow you could do:

x[0, 0:2, :]

but you couldn't pass a vector to the second slice dimension (as far as I know anyway!).

We previously discussed whether to the port full R extract syntax to tensorflow (as I do in greta), but that's inconsistent with the python API, and also introduces a fair bit of overhead.*

I'm planning to trim some fat off that greta implementation, but I think it makes sense to keep the tensorflow API consistent with the python tensorflow API, and retain all the native speed.

I think the best way to achieve the subsetting in your example with the current API would be:

indices <- c(1L, 3L)
x_1_3 <- tf$gather(x, indices - 1L, axis = 1L)
x_1_3 <- x_1_3[1, , ]

*The separate slice and concat operations probably wouldn't be too slow, but R's extract syntax handles a lot of different cases. E.g.:

d <- c(2, 4, 2)
X <- array(seq_len(prod(d)), dim = d)
X[c(TRUE, FALSE, TRUE)]
 [1]  1  3  4  6  7  9 10 12 13 15 16

I think it makes sense to be consistent with either python tensorflow or full R, and providing all this R extraction functionality is expensive.

@goldingn
Copy link
Contributor

goldingn commented Apr 4, 2018

There's a PR for the error message and a test.

I see that tensorflow API doc isn't in the package repo any more. Is it editable on github or just internal now @jjallaire?

@jjallaire
Copy link
Member

It is currently internal but I'm hoping to make it public soon.

In the meantime if you just suggest some copy here I'll make the change on the website (in the meantime I'll merge the PR)

@goldingn
Copy link
Contributor

goldingn commented Apr 4, 2018

OK great. Here's some text.

In contrast to `tf$` functions, extraction with `[` uses 1-based rather than 0-based indexes. Because extraction with `[` is intended to replicate the similar syntax in Python tensorflow, it is more limited than R's extraction syntax for arrays. For example, the extraction indices must be positive integers, and any sequences of integers must be increasing, and have no missing elements.

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Apr 4, 2018

@goldingn, thank you for the pointer to tf.gather() (I had been overlooking it,
assuming it was the equivalent of tidyr::gather)

The reason I encountered this issue is that I have time series data, which is
organized as a 3 dimensional array of shape (samples, timestep, variable), and
I was trying to do a skip-take decimation along the timestep axis such that I
take only every other sample. In python, i would do this with something like
x[:, ::2, :], and the equivalent R I guess is x[,seq(1, dim(x)[2], by = 2),]
In python tensorflow, this kind of skip-take works. eg:

library(tensorflow)
library(reticulate)
d <- c(2, 4, 2); X <- array_reshape(1:prod(d), d)

py$X <- X
py_run_string("
import tensorflow as tf
x = tf.convert_to_tensor_or_sparse_tensor(X)
sess = tf.Session()
x_1_3 = sess.run(x[0,::2,:])
")
py$x_1_3
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    5    6

Do you have any suggestions about how to achieve this from the R side?

Created on 2018-04-04 by the reprex package (v0.2.0).

@bwlewis
Copy link
Collaborator

bwlewis commented Apr 4, 2018

You can use tf.strided_slice directly (https://www.tensorflow.org/api_docs/python/tf/strided_slice) ... but beware that function behaves slightly differently than [ (really __getitem__) in TF. In particular, tf.strided_slice returns a tensor of the same shape as the input. Here is your example:

library(tensorflow)
d = c(2, 4, 2)
X = array_reshape(1:prod(d), d)
x = tf$convert_to_tensor_or_sparse_tensor(X)
sess = tf$Session()
sess$run(tf$strided_slice(x, c(0L,0L,0L), c(1L,4L,2L), c(1L, 2L, 1L)))
# , , 1
#      [,1] [,2]
# [1,]    1    5
#, , 2
#      [,1] [,2]
# [1,]    2    6

You can drop the singleton dimensions on the R side:

drop(sess$run(tf$strided_slice(x, c(0L,0L,0L), c(1L,4L,2L), c(1L, 2L, 1L))))
#       [,1] [,2]
# [1,]    1    2
# [2,]    5    6

I can't quite figure out to invoke the [ (getitem) directly yet. Maybe that would be better...?

@jjallaire
Copy link
Member

I think you can do something like this:

x$`__getItem__`(index)

@jjallaire
Copy link
Member

Thanks @goldingn, just incorporated your updated docs.

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Apr 4, 2018

Here is a quick-n-dirty mockup of another approach to enabling the python slicing syntax for tensorflow tensors. The key differences with this approach are

  • [.tensorflow.tensor is build on top of __getitem__ instead of directly on tf.strided_slice
  • there is a new R function py_slice(), which is a wrapper around the builtin python function slice() that powers syntax like x[start:stop:step] (which is equivalent to x[slice(start, stop, step)].

This enables new syntax for tensorflow subsetting in R, namely, something like
x[ , py_slice(,,2), ] which would be equivalent to python x[:, ::2, :]

There are a bunch of loose ends and this is just a conceptual sketch (0-based vs 1-based indexing isn't coherent yet, subsetting by a tensor isn't supported yet, as.py_slice() should probaby be S3, etc.). I'm not entirely sure if this fits into the bigger picture, but right now it feels wrong that the step slicing argument isn't available from the R side, since it is quite useful often, and I think the goal is to have R be an equally capable interface to tensorflow as from python. I also think that building on top of __getitem__ allows for more closely mirroring the python interface and functionality than building on top of strided_slice (and it should, in theory, also extend to numpy arrays with minimal effort)

library(tensorflow)
library(reticulate)
d <- c(2, 4, 2); X <- array_reshape(1:prod(d), d)

# some helpers
as_nullable_integer <- keras:::as_nullable_integer
parse1 <- function(...) parse(text = paste0(...), keep.source = FALSE)[[1]]
is_scalar <- function(x) identical(length(x), 1L)
ndots <- function(...) length(eval(substitute(alist(...))))


# python style slicing, mainly for the `step` functionality
py_slice <- function(start = NULL, stop = NULL, step = NULL) {
  import_builtins()$slice(
    as_nullable_integer(start),
    as_nullable_integer(stop),
    as_nullable_integer(step))
}

as.py_slice <- function(x) {
  if(identical(x, quote(expr=)) || is.null(x)) {
    py_slice()
  } else if (inherits(x, "python.builtin.slice")) {
    x
 } else if (is_scalar(x)) {
    stopifnot(is.numeric(x))
    x <- x - 1L #  1-based indexing -> 0-based indexing
    py_slice(x, x + 1L)
 } else { # atomic index of length greater than 1
   stopifnot(
     is.numeric(x), # logical, character subsetting not supported
     x[1] > 0, x[length(x)] > 0, # negative indexes not supported
     x[1] == min(x), x[length(x)] == max(x),
     all(x == x[1]:x[length(x)])) # missing indexes not supported
      
   start <- x[1] - 1L # subtract 1 for 1-based -> 0-based indexing
   stop <- x[length(x)] - 1L
   py_slice(start, stop) 
 }
}


`[.tensorflow.tensor` <- function(x, ..., drop = TRUE) {
  shp <- x$get_shape()
  stopifnot(ndots(...) == py_len(shp))
  
  args <- vector("list", py_len(shp))
  
  for (i in seq_len(py_len(shp)))
    args[[i]] <- as.py_slice(eval(parse1("..", i)))
  
  out <- x$`__getitem__`(tuple(args))
  if(isTRUE(drop))
    out <- tf$squeeze(out)
  out
}

sess = tf$Session()
x = tf$convert_to_tensor_or_sparse_tensor(X)
sess$run( x[1, py_slice(,,2), ] )
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    5    6

Created on 2018-04-04 by the reprex package (v0.2.0).

@goldingn
Copy link
Contributor

goldingn commented Apr 5, 2018

Thanks @t-kalinowski! Yeah, it would be nice to have strides available in this syntax.

That function looks good - it's a bit more verbose than in python, but it's explicit what it does. Another syntax option is to use non-standard evaluation to handle double colon notation when extracting from tensors.

I.e. we can make it so that syntax like x[1, 1:10:3, ] works. We unfortunately can't handle this: x[1, ::3, ] since R's parser wigs out at the double colon, though we can use a variant like this: x[1, .:.:3, ], and other combinations of dots and colons. The . could be a different character too if someone has a better idea.

What do people think? Would this more concise python-like syntax be a nice addition, or would the non-standard evaluation confuse matters?


Here's code to mock up the strided indexing on a regular array:

parse_strides <- function (x, text, dim = 10) {
  chunks <- strsplit(text, ":")[[1]]
  if (length(chunks) == 3) {
    suppressWarnings(chunks <- vapply(chunks, as.numeric, 1))
    
    # handle missing indices
    missing <- is.na(chunks)
    if (any(missing)) {
      if (missing[1]) {
        chunks[1] <- 1
      }
      if (missing[2]) {
        chunks[2] <- dim
      }
      if (missing[3]) {
        chunks[3] <- 1
      }
    }
    
    x <- seq(from = chunks[1], to = chunks[2], by = chunks[3])
    
  }
  x
}

weird_array <- function(data = NA, dim = length(data), dimnames = NULL) {
  x <- array(data, dim, dimnames)
  class(x) <- c("weird_array", class(x))
  x
}

# this demo version is hard-coded to be 3D, and missing indices aren't supported
`[.weird_array` <- function (x, i, j, k) {
  
  dims <- dim(x)
  
  text_i <- deparse(substitute(i))
  text_j <- deparse(substitute(j))
  text_k <- deparse(substitute(k))
  
  i <- parse_strides(i, text_i, dims[1])
  j <- parse_strides(j, text_j, dims[2])
  k <- parse_strides(k, text_k, dims[3])
  
  x <- unclass(x)
  x[i, j, k]
}

# test run
d <- c(2, 10, 2)
x <- weird_array(seq_len(prod(d)), dim = d)

x[1, 1:8, 2]
x[1, 1:8:2, 2]
x[1, 1:.:2, 2]
x[1, .:.:., 2]

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Apr 5, 2018

Thanks @goldingn ,
I like the idea of a more compact syntax that mirrors the python syntax closer too. I'm not sure which is better.

If we do decide to go with the more compact syntax, I don't like using . as a place holder here. It'll be somewhat confusing what happens to . when used in a pipe %>%, (also, i think ultimately, we'd want to eval() each of the chunks instead of just calling as.numeric(), so that people can construct slicers with variables like x[.:user_var_stop:user_var_step], which would lead to even more confusing behavior for ..

I played around with the parser a bit just now, how's this for a compact syntax notation that gets past the R parser:

x[ , `::2`, ]

or even more simply:

x[ , "::2", ]

@terrytangyuan
Copy link
Contributor

@jjallaire I just realized that the format of this section seems to be messed up. Maybe missing a closing bracket or something?

@jjallaire
Copy link
Member

Thanks, just fixed! (I needed to escape the $ to avoid their being treated as equations)

@goldingn
Copy link
Contributor

goldingn commented Apr 9, 2018

So the syntax options for implementing python-like strided slice indexing via [ are:

python version

we can't do this in R because the parser won't allow ::

x[, 1:10:2, ]
x[, ::2, ]

R option 1 - a slice function

x[, py_slice(1, 10, 2), ]
x[, py_slice(, , 2), ]

R option 2 - quoted

this could alternatively (or also) use backticks: `::2`

x[, '1:10:2', ]
x[, '::2', ]

R option 3 - some character for missing arguments

the missing value character could be ?, . or something else

x[, 1:10:2, ]
x[, ?:?:2, ]

In all of these, standard increasing indices like 1:10 would work too.

I'm torn between making the syntax minimal and close to the python syntax, and reducing the amount of NSE people need to get their head around. What does everyone think?

@jjallaire
Copy link
Member

I like the backticks for consistency with code people may be porting, but I realize that might seem odd to some. I don't love the py_slice() function.

@bwlewis
Copy link
Collaborator

bwlewis commented Apr 9, 2018

Is it possible to just use seq() instead of a new py_slice() function? In that case at least nothing new to learn for R users.

@goldingn
Copy link
Contributor

goldingn commented Apr 9, 2018

That would definitely work for the first of those cases (and perhaps we should enable that too), but it's not so easy for the second case, where the start and end integers don't need to be specified. We could perhaps parse seq(, , by = 2) but it would be syntax that would break when not used in [

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Apr 9, 2018

I would prefer to avoid NSE in [ as much as possible because it's inconsistent with most other [ methods, and it requires esoteric workarounds if you want to program with it. I agree though that the compact python syntax is very convenient and `::2` saves quite a bit of typing. If we decide on backticks wrapping a slicing spec like this, I suggest we also provide a way that works with standard evaluation as well. (e.g., py_slice())

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Apr 9, 2018

actually, nevermind my last comment. There is no reason why x[ , `start:stop:step`, ] couldn't work like standard evaluation (with start stop and step being expressions or symbols that are eval()ed in the parent.frame(). On second thought, py_slice() isn't necessary.

@t-kalinowski
Copy link
Member Author

OK, unless someone else (@goldingn ?) wants to take the lead here, I volunteer to work up a PR later this week/weekend. The general strategy will be similar to the one in the mockup above. ([ will call x$`__getitem__`() instead of tf$strided_slice()).

Some NSE will be used for the case of 3 : in the expression, enabling syntax like this:

x[,`::step`,]
x[,start:stop:step,] 

otherwise, standard evaluation will be used, but we'll accept a sequence with a strided step.

x[,start:stop,] 
x[,seq(start, stop, by = step),]

@goldingn
Copy link
Contributor

goldingn commented Apr 9, 2018

Sounds perfect! I won't have much time for the next couple of weeks, so very happy for you to take the lead!

Feel free to tag me if any of the existing code doesn't make sense though.

One query - would it be worth putting start:stop:stride syntax in backticks too, do that every thing that is NSE is obviously NSE?

@t-kalinowski
Copy link
Member Author

Great. Good point on the clean NSE separation. 👍

@skeydan skeydan closed this as completed Sep 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants