Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial attempt at efficient range queries across multiple Set objects. #10

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

davidblewett
Copy link
Contributor

This is not compiling yet (using rustc 1.24.0-nightly (0a3761e63 2018-01-03)):

(dblewett)Wed Jan 10, 10:03 | /home/dblewett/src/python-rust-fst/fstwrapper
rusty% cargo build          
 Compiling fst-wrapper v0.3.0 (file:///home/dblewett/src/python-rust-fst/fstwrapper)
error[E0277]: the trait bound `for<'a> &fst::set::Stream<'_>: fst::Streamer<'a>` is not satisfied
 --> src/set.rs:152:36
  |
152 |     let ob = set::OpBuilder::new().add(stream);
  |                                    ^^^ the trait `for<'a> fst::Streamer<'a>` is not implemented for `&fst::set::Stream<'_>`
  |
  = help: the following implementations were found:
            <fst::set::Stream<'s, A> as fst::Streamer<'a>>

error[E0277]: the trait bound `for<'a> &fst::set::Stream<'_>: fst::Streamer<'a>` is not satisfied
 --> src/set.rs:168:8
  |
168 |     ob.push(stream);
  |        ^^^^ the trait `for<'a> fst::Streamer<'a>` is not implemented for `&fst::set::Stream<'_>`
  |
  = help: the following implementations were found:
            <fst::set::Stream<'s, A> as fst::Streamer<'a>>

error: aborting due to 2 previous errors

error: Could not compile `fst-wrapper`.

@@ -146,13 +146,28 @@ pub extern "C" fn fst_set_make_opbuilder(ptr: *mut Set) -> *mut set::OpBuilder<'
}
make_free_fn!(fst_set_opbuilder_free, *mut set::OpBuilder);

#[no_mangle]
pub extern "C" fn fst_set_make_opstreambuilder(ptr: *mut set::Stream) -> *mut set::OpBuilder<'static> {
let stream = ref_from_ptr!(ptr);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidblewett I think you want stream to be a set::Stream here, but it looks like it's a &set::Stream. In particular, this requires passing ownership of the stream into this function, so I imagine you want something like let stream = unsafe { *Box::from_raw(ptr) };. Similarly for the function below.

I haven't audited this code thoroughly, but I might be concerned about memory leaks here. e.g., What's responsible for freeing the memory of an opbuilder?

Copy link
Contributor Author

@davidblewett davidblewett Jan 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/jbaiter/python-rust-fst/blob/master/rust_fst/set.py#L60-L61 :

    # NOTE: No need for `ffi.gc`, since the struct will be free'd
    #       once we call union/intersection/difference

@jbaiter how do I wire up fst_set_opstreambuilder_free ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right! Nice catch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BurntSushi I tried updating to use *mut &set::Stream, but the result didn't change:

  Compiling fst-wrapper v0.3.0 (file:///home/dblewett/src/python-rust-fst/fstwrapper)                          
error[E0277]: the trait bound `for<'a> &fst::set::Stream<'_>: fst::Streamer<'a>` is not satisfied               
   --> src/set.rs:152:36                                
    |                                                   
152 |     let ob = set::OpBuilder::new().add(stream);   
    |                                    ^^^ the trait `for<'a> fst::Streamer<'a>` is not implemented for `&fst::set::Stream<'_>`                                                                                               
    |                                                   
    = help: the following implementations were found:   
              <fst::set::Stream<'s, A> as fst::Streamer<'a>>                                                    

error[E0277]: the trait bound `for<'a> &fst::set::Stream<'_>: fst::Streamer<'a>` is not satisfied               
   --> src/set.rs:168:8                                 
    |                                                   
168 |     ob.push(stream);                              
    |        ^^^^ the trait `for<'a> fst::Streamer<'a>` is not implemented for `&fst::set::Stream<'_>`          
    |                                                   
    = help: the following implementations were found:   
              <fst::set::Stream<'s, A> as fst::Streamer<'a>>                                                    

error: aborting due to 2 previous errors                

error: Could not compile `fst-wrapper`.   

#[no_mangle]
pub extern "C" fn fst_set_opbuilder_push(ptr: *mut set::OpBuilder, set_ptr: *mut Set) {
let set = ref_from_ptr!(set_ptr);
let ob = mutref_from_ptr!(ptr);
ob.push(set);
}

#[no_mangle]
pub extern "C" fn fst_set_opbuilder_push_stream(ptr: *mut set::OpBuilder, stream_ptr: *mut &set::Stream) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be *mut set::Stream, not *mut &set::Stream.

@davidblewett
Copy link
Contributor Author

@BurntSushi fstwrapper is now compiling; thanks!

@jbaiter However, now I'm getting segfaults in trying to test the functionality:

Python 3.6.3 (default, Oct  3 2017, 21:45:48)           
Type 'copyright', 'credits' or 'license' for more information                                                   
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.                                             

In [1]: import rust_fst                                 

In [2]: fst = rust_fst.Set('/home/dblewett/Desktop/bgzf-test/activity.fst')                                     

In [3]: fst.range_union(slice('1000~000003e8-0000-0054-01d3-8bdfe91ec2b7','1000~000003e8-0000-0054-01d3-8bdfe91ec2b7~~'), rust_fst.Set('/home/dblewett/Desktop/bgzf-test/index/nyt/2018/1/2/0/activity.fst'))                   
*** Error in `/home/dblewett/.virtualenvs/fst/bin/python3.6': free(): invalid pointer: 0x00007f88dd4f5cd8 ***   
zsh: abort (core dumped)  ipython

The API signature is a little awkward and should probably get some more thought, but I was just trying to test the plumbing without luck.

@jbaiter
Copy link
Owner

jbaiter commented Jan 21, 2018

Can you add a unit test for the range_union method? Then I can take a shot at debugging it :-)
To fix the merge conflict, just move the new function signatures from _build_ffy.py to rust/rust_fst.h.

* Supports range queries across multiple sets:
  a = Set.from_iter(["bar", "foo"])
  b = Set.from_iter(["baz", "foo"])
  list(UnionSet(a, b)['ba':'bb'])
  ['bar', 'baz']
* Add StreamBuilder, to correspond with the Rust library's abstraction
* Update OpBuilder to support constructing operations against multiple
  underlying types (Set and StreamBuilder for now)
* difference, intersection, symmetric_difference, union
@davidblewett
Copy link
Contributor Author

@jbaiter @BurntSushi Sorry this got delayed so long. I have this working now.

@davidblewett
Copy link
Contributor Author

The tests pass for me (cPython 3.7, Ubuntu 18.04.1):

(fst-test) 
(dblewett)Wed Sep 12, 15:46 | /home/dblewett/src
rusty% py.test python-rust-fst/tests
==================================================================================================================== test session starts ====================================================================================================================
platform linux -- Python 3.7.0b3, pytest-3.8.0, py-1.6.0, pluggy-0.7.1
rootdir: /home/dblewett/src/python-rust-fst, inifile:
collected 47 items                                                                                                                                                                                                                                          

python-rust-fst/tests/test_map.py ..................                                                                                                                                                                                                  [ 38%]
python-rust-fst/tests/test_set.py .............................                                                                                                                                                                                       [100%]

================================================================================================================= 47 passed in 0.18 seconds =================================================================================================================```

@davidblewett
Copy link
Contributor Author

@jbaiter, @BurntSushi what do you think of this? I wasn't sure about the UnionSet name.

Copy link

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick skim, this all looks reasonable to me! Nice work!

@davidblewett
Copy link
Contributor Author

@jbaiter is there interest in including this in this package? If not the entire PR, maybe the supporting C functions? it would be cumbersome to do completely outside this package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants