Roadmap discussion #15
Hey Pravir, finally found some time to respond to this! I am pleased you are thinking about this. The FRB community needs a really solid software package that is well equipped for the future. So I think a refactor is well motivated, but I'm actually wondering whether it's better to start a completely new package and port the useful code over to it?
Data arrays
I love the idea of having a profile, block and cube class. Can these support multiple beams and polarization too? I recently played around with xarray, which I conceptually like the idea of as a base data structure -- from which you could define a profile, cube and block. However there are two issues I've found for high resolution data:
I got a bit carried away and tried to roll my own xarray-like
As well as profile/block/cube, should it support voltage-level products? Perhaps we could learn from baseband, which looks to have a task framework including dedispersion. (I'm not very familiar with these)
Other Dependencies
In terms of unit/coordinates/time handling, I think astropy does a decent job so would be a reasonable dependency. I like the idea of ALL arrays having units attached (i.e. as an
Internal data structure
My opinion is that the sigproc header is too dated, and it doesn't support polarization, so I would suggest using a new and improved internal data structure. (Also storing angles as a float in
Next-gen use cases and questions
I guess the most important thing for adoption is that the package is easy to use for current use cases, and that it can support next-generation use cases too. The UWL and narrow-bandwidth FRBs make a good case study: sub-band searches will probably become more common. How easy would it be to pipe data into tensorflow? How about multiple beams? How about parallelization across nodes (e.g. dask), parallelization on CPUs (openmp) or GPU acceleration (cupy)? It is easy to have 'feature creep' and make the task huge, so it will be important to set a scope and stick to it. What exactly is a 'single pulse toolbox'? And does this need to be high performance for FRB searches, or is usability more important?
Hi @telegraphic, xarray seems to be the perfect replacement for subclassing ndarray. However, one of the issues I found is in the implementation. I like the existing structure of having methods added to the array itself. We can also pipe together methods, e.g.
    tim = TimeSeries(data)
    tim.downsample().pad().toDat()
I don't see a safe way to subclass xarray (docs discourage subclassing). Instead, they suggest using accessors to extend new methods. I am not able to figure out a clean API design using accessors. Another approach would be to wrap the xarray inside the object, and the array can then be accessed by
For FFT, I think the better option will be to use numpy FFT and switch to pyFFTW (if available), thus removing FFTW3 as a dependency in this package. Re sigproc headers, yes, I agree. So, one idea could be to move all sigproc and psrfits related functions to the
    hdr = Header.from_sigproc()
    hdr = Header.from_psrfits()
We still use some of the sigproc cmd tools, e.g.,
Single-pulse toolbox
So, my intention behind a single pulse toolbox is to have a reference place of robust methods to simulate and analyze single pulses/FRBs. For example,
I now think this demands a separate package, which should have the block and profile class and most of the psrchive methods. We can then also have a standard format to store these data (similar to
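The "wrap the array inside the object" approach from this comment can be sketched as below. This is only an illustration of the pattern, not sigpyproc's API: the class layout, the `data` and `tsamp` attributes, and the method arguments are all assumptions, and a plain tuple stands in for the xarray object so the example has no dependencies. Each method returns a new wrapper, which is what keeps the chaining style `tim.downsample().pad()` working without subclassing.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSeries:
    """Hypothetical wrapper: holds the array as an attribute instead of subclassing it."""

    data: tuple[float, ...]  # stand-in for the wrapped xarray/ndarray
    tsamp: float             # sampling time in seconds

    def downsample(self, factor: int = 2) -> TimeSeries:
        """Average consecutive groups of `factor` samples, dropping any remainder."""
        nout = len(self.data) // factor
        new = tuple(
            sum(self.data[i * factor:(i + 1) * factor]) / factor
            for i in range(nout)
        )
        return TimeSeries(new, self.tsamp * factor)

    def pad(self, npad: int = 0, value: float = 0.0) -> TimeSeries:
        """Append `npad` samples of `value` to the end of the series."""
        return TimeSeries(self.data + (value,) * npad, self.tsamp)


tim = TimeSeries((1.0, 3.0, 5.0, 7.0, 9.0), tsamp=0.001)
out = tim.downsample(2).pad(2)
print(out.data, out.tsamp)
```

The underlying array stays reachable through a single attribute (`out.data` here), so users who want the raw array or its native methods are not locked out by the wrapper.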
It seems you came to the same conclusion about xarray! PyFFTW (or the cupy FFT) would be an easy replacement. I love that sigproc2 is PravirK's updates to Evan Keane's fork of Michael Keith's release of Duncan Lorimer's original SIGPROC. Maybe some similar command line tools but with different names to avoid name collisions? e.g.
Adding 'publication quality plots' to the list as you didn't explicitly mention it 📈
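The "numpy FFT, switching to pyFFTW if available" idea from this exchange might look something like the sketch below. It relies on `pyfftw.interfaces.numpy_fft`, which pyFFTW documents as a drop-in for `numpy.fft`; the `power_spectrum` helper and the `BACKEND` flag are hypothetical names added only for illustration.

```python
import numpy as np

# Prefer pyFFTW's numpy-compatible interface when it is installed,
# otherwise fall back to numpy's own FFT. Both expose the same rfft call.
try:
    import pyfftw.interfaces.numpy_fft as fft_backend
    BACKEND = "pyfftw"
except ImportError:
    fft_backend = np.fft
    BACKEND = "numpy"


def power_spectrum(tim):
    """Return |rfft|^2 of a real-valued time series (hypothetical helper)."""
    spec = fft_backend.rfft(np.asarray(tim, dtype=np.float64))
    return np.abs(spec) ** 2


print(BACKEND, power_spectrum([1.0, 1.0, 1.0, 1.0]))
```

Because the backend is chosen once at import time, the rest of the package would not need to know which library is doing the work.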
ok, I have moved the single-pulse toolbox idea (block/profile, psrchive, simulate, plots) to a different package burstpy for now. As we expand the sigpyproc codebase with other formats, a restructuring is required (to v0.6 with #16).
@telegraphic Re
I like the baseband_tasks.io.psrfits framework (also baseband_tasks.io.hdf5), and it will be easy to adapt here; however, it supports only fold-mode. I think the search-mode addition will be easy to implement, so I will try for a PR there. All we need is robust wrappers around astropy.io.fits to read header keywords and data.
Hi, I am starting this thread to discuss plans for sigpyproc.
Current work
I am refactoring the code in the packaging branch based on PEP8 and the new type hints, adding more abstraction and moving the dynamic header class to something more strictly structured. This is going to break the existing API (function names changed to lower case, etc.). Another addition would be to refactor some of the existing code into 3 classes: profile (for 1D pulse profile), block (for 2D freq-time spectrum) and cube (for folded data), similar to psrchive. Also, I will be adding robust S/N estimation (using the pdmp approach).
Future work
FRB simulator
I have plans to integrate @vivgastro's Furby as a class inside sigpyproc (with some additional features and support for UWL-like frequency bands). This will complete sigpyproc as a single-pulse toolbox in the sense that it can generate data/pulses as well as search, visualize and measure properties of those pulses.
PSRFITS support
As @telegraphic suggested, it would be nice to support other formats (e.g., PSRFITS, HDF5). I think we can add support to read those formats into the existing sigpyproc framework. I am not sure if we should also have a unified header (e.g., all PSRFITS keywords) or a writer class, as all these formats (at least sigproc and PSRFITS) are completely different. Also, there are existing packages like your working towards this. IMO we should keep the header keywords (~25) defined in the sigproc docs as the base of this package and read other format files into this framework.
For example, we can have
    from sigpyproc.Readers import FitsReader
with all the functionalities of FilReader.
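One way to keep the sigproc keywords as the base while reading other formats into the same framework is a single header class with one constructor per format. The sketch below is an assumption-laden illustration, not existing sigpyproc code: the `Header` fields are a small subset of the sigproc keywords, the constructors take plain dictionaries rather than files, the PSRFITS keys are flattened into one mapping (real files spread them over several HDUs), and the first-channel frequency is derived assuming evenly spaced channels centred on `OBSFREQ`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Header:
    """Hypothetical unified header built on a subset of sigproc keywords."""

    source_name: str
    nchans: int
    fch1: float    # centre frequency of the first channel (MHz)
    foff: float    # channel width (MHz), negative for descending bands
    tsamp: float   # sampling time (s)
    tstart: float  # start time (MJD)

    @classmethod
    def from_sigproc(cls, keys: Mapping[str, Any]) -> Header:
        """Build from sigproc-style keywords, which map one-to-one."""
        return cls(
            source_name=str(keys["source_name"]),
            nchans=int(keys["nchans"]),
            fch1=float(keys["fch1"]),
            foff=float(keys["foff"]),
            tsamp=float(keys["tsamp"]),
            tstart=float(keys["tstart"]),
        )

    @classmethod
    def from_psrfits(cls, keys: Mapping[str, Any]) -> Header:
        """Build from a flattened dict of PSRFITS keywords (assumed layout)."""
        nchans = int(keys["NCHAN"])
        chan_bw = float(keys["CHAN_BW"])
        # Assumption: channels are evenly spaced and centred on OBSFREQ.
        fch1 = float(keys["OBSFREQ"]) - 0.5 * (nchans - 1) * chan_bw
        # Start MJD = integer day + (seconds past midnight + offset) / 86400.
        seconds = float(keys["STT_SMJD"]) + float(keys["STT_OFFS"])
        tstart = float(keys["STT_IMJD"]) + seconds / 86400.0
        return cls(
            source_name=str(keys["SRC_NAME"]),
            nchans=nchans,
            fch1=fch1,
            foff=chan_bw,
            tsamp=float(keys["TBIN"]),
            tstart=tstart,
        )


hdr = Header.from_psrfits({
    "SRC_NAME": "FRB", "NCHAN": 4, "CHAN_BW": -1.0, "OBSFREQ": 1400.0,
    "TBIN": 6.4e-5, "STT_IMJD": 59000, "STT_SMJD": 43200, "STT_OFFS": 0.0,
})
print(hdr)
```

With this shape, a reader such as the proposed FitsReader would only need to produce the keyword mapping; everything downstream sees the same header type regardless of the input format.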
Roadmap
Should we move towards an entirely Python-based package? Most of the C++ code (running mean/median, FFT) can easily be accessed using NumPy and SciPy. One issue might be the speed and multi-threading, but that can be compensated for using Numba. @telegraphic
We can revive the FRBs/sigproc project to have a fully modern C++ and sigpyproc-like object-oriented framework with proper documentation. The codebase there is very old and can be easily condensed using modern third-party libraries. @evanocathain
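To make the "running mean/median in pure Python" point concrete, here is a minimal NumPy-only sketch. It is not a port of the existing C++ routines: the function names, the centred window, and the choice to pad the edges by repeating the end values are all assumptions made for illustration. It needs NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _padded_windows(x, window):
    """Return an (n, window) view of x after edge-padding so output length equals input length."""
    x = np.asarray(x, dtype=float)
    left = window // 2
    right = window - 1 - left
    padded = np.pad(x, (left, right), mode="edge")
    return sliding_window_view(padded, window)


def running_mean(x, window):
    """Centred running mean; edges are padded by repeating the end values."""
    return _padded_windows(x, window).mean(axis=-1)


def running_median(x, window):
    """Centred running median with the same edge handling as running_mean."""
    return np.median(_padded_windows(x, window), axis=-1)


spiky = [1.0, 1.0, 10.0, 1.0, 1.0]
print(running_mean(spiky, 3))
print(running_median(spiky, 3))
```

The median version materialises an n-by-window array inside `np.median`, so for long time series and wide windows this is where a Numba-compiled loop or a SciPy filter would likely be worth benchmarking.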