Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register C Callable #24

Open
dcooley opened this issue Apr 14, 2020 · 5 comments
Open

Register C Callable #24

dcooley opened this issue Apr 14, 2020 · 5 comments

Comments

@dcooley
Copy link

dcooley commented Apr 14, 2020

Inspired by recent discussions and progress in other libraries, and by data.table exposing their C functions, would you consider exposing the C functions from geodist so they can be called by other C/C++ code?

I've made a very quick prototype here, where all that was needed was to register a function as R_RegisterCCallable()

void R_init_geodist(DllInfo *dll)
{
    R_RegisterCCallable("geodist", "R_cheap", (DL_FUNC) &R_cheap);

    R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
    R_useDynamicSymbols(dll, FALSE);
}

Which then makes it availabel to other C/C++ code, for example

library(Rcpp)
cppFunction(
  code = '
    SEXP cheap( SEXP x ) {
      typedef SEXP R_CHEAP_FUN(SEXP); 
      R_CHEAP_FUN *R_cheap = (R_CHEAP_FUN *) R_GetCCallable("geodist","R_cheap");
      return R_cheap( x );
    }
  '
)

And a quick check, using the example from ?geodist

n <- 50
# Default "cheap" distance measure is only accurate for short distances:
x <- cbind (runif (n, -0.1, 0.1), runif (n, -0.1, 0.1))
y <- cbind (runif (2 * n, -0.1, 0.1), runif (2 * n, -0.1, 0.1))
colnames (x) <- colnames (y) <- c ("x", "y")
d0 <- geodist (x) # A 50-by-50 matrix

d1 <- cheap(x)

all.equal( as.numeric(d0), d1 )
# [1] TRUE
@mpadge
Copy link
Member

mpadge commented Apr 14, 2020

All grand by me, but with one caveat: The number of exposed functions is actually pretty huge: geodist_init.c lists them all:

extern SEXP R_cheap(SEXP);
extern SEXP R_cheap_paired(SEXP, SEXP);
extern SEXP R_cheap_paired_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_cheap_range(SEXP);
extern SEXP R_cheap_seq(SEXP);
extern SEXP R_cheap_seq_range(SEXP);
extern SEXP R_cheap_seq_vec(SEXP, SEXP);
extern SEXP R_cheap_vec(SEXP, SEXP);
extern SEXP R_cheap_xy(SEXP, SEXP);
extern SEXP R_cheap_xy_range(SEXP, SEXP);
extern SEXP R_cheap_xy_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_geodesic(SEXP);
extern SEXP R_geodesic_paired(SEXP, SEXP);
extern SEXP R_geodesic_paired_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_geodesic_range(SEXP);
extern SEXP R_geodesic_seq(SEXP);
extern SEXP R_geodesic_seq_range(SEXP);
extern SEXP R_geodesic_seq_vec(SEXP, SEXP);
extern SEXP R_geodesic_vec(SEXP, SEXP);
extern SEXP R_geodesic_xy(SEXP, SEXP);
extern SEXP R_geodesic_xy_range(SEXP, SEXP);
extern SEXP R_geodesic_xy_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_haversine(SEXP);
extern SEXP R_haversine_paired(SEXP, SEXP);
extern SEXP R_haversine_paired_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_haversine_range(SEXP);
extern SEXP R_haversine_seq(SEXP);
extern SEXP R_haversine_seq_range(SEXP);
extern SEXP R_haversine_seq_vec(SEXP, SEXP);
extern SEXP R_haversine_vec(SEXP, SEXP);
extern SEXP R_haversine_xy(SEXP, SEXP);
extern SEXP R_haversine_xy_range(SEXP, SEXP);
extern SEXP R_haversine_xy_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_vincenty(SEXP);
extern SEXP R_vincenty_paired(SEXP, SEXP);
extern SEXP R_vincenty_paired_vec(SEXP, SEXP, SEXP, SEXP);
extern SEXP R_vincenty_range(SEXP);
extern SEXP R_vincenty_seq(SEXP);
extern SEXP R_vincenty_seq_range(SEXP);
extern SEXP R_vincenty_seq_vec(SEXP, SEXP);
extern SEXP R_vincenty_vec(SEXP, SEXP);
extern SEXP R_vincenty_xy(SEXP, SEXP);
extern SEXP R_vincenty_xy_range(SEXP, SEXP);
extern SEXP R_vincenty_xy_vec(SEXP, SEXP, SEXP, SEXP);

Each one is a particular combination of input params. Exposing those would induce some safety compromises, as the code currently relies very strongly on the R-side per-processing to ensure the appropriate C routine is called. That means the code does no checking of arguments at all in any C routine. Direct passing of arbitrary inputs would almost certainly raise problems, and at the least, I'd first have to implement a bunch of C-level code to ensure sane inputs, and doing that would likely slow calculations down somewhat, so there would also be potential disadvantages there. Note also that some SEXPs are assumed to be vectors, while others are assumed to be stored as matrices. R knows no difference between these, but the geodist code sometimes assumes indexing into vectors-as-2-column-matrices, and other times not. All of that would have to be controlled, and there may well be cases where such checking at the C-level might just not be possible.

I don't mean to be obstructive here, and i definitely do think this is a good idea, but would be interested to first have your insights into some of these concerns?

@mdsumner
Copy link
Member

(All interesting stuff for me, just went through my first example and learnt heaps thanks both)

@mpadge
Copy link
Member

mpadge commented Apr 14, 2020

An alternative would be for me to write a C-level meta-wrapper around all of those internal functions. The latter could then be left so, and the exposed bit used to funnel the right combinations of SEXP elements to the right internal calls, just as the current R code does. That should be fairly straightforward.

@dcooley
Copy link
Author

dcooley commented Apr 14, 2020

You bring up some valid concerns for sure. And I think your suggested C-level API would be an easier solution. But, having brought up this issue, I have no real need for it yet, so don't feel like you're required to do this :)

@mpadge
Copy link
Member

mpadge commented Apr 15, 2020

I'm happy to leave this issue open for a while. I'll attend to some others first, and then slowly come around to this. Thanks for the impetus!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants