Create PLEP on adopting OpenPMD standard? #13

Open
namurphy opened this issue Nov 17, 2017 · 7 comments

Comments

@namurphy
Member

A PlasmaPy issue was raised recently about adopting the OpenPMD standard for particle-mesh data. I personally think that this would be a great topic for a PLEP, since it is an important design decision and it would also be really helpful to have a design document to refer to during development that clarifies what adopting this standard would entail and how we would go about implementing it. I would be happy to help with drafting this, though at this point I have just a basic knowledge of what this standard is.

Original topic: PlasmaPy/PlasmaPy#167
OpenPMD repositories: https://github.com/openPMD
OpenPMD standard: https://github.com/openPMD/openPMD-standard

@tulasinandan

I like the standard defined by these people. Looks like we do not have a PLEP yet. Maybe we should create one. I'll look into it within the next couple of working days.

@ax3l

ax3l commented Jun 19, 2018

Hi there, we are excited that you consider openPMD!

A lot has been added since we last wrote and I just want to give you a short heads-up.

openPMD 2.0 has incorporated a lot of features and if there weren't a ton of conferences ahead we would finish it very soon. But we will get there. There is also a new module/tool called openPMD-updater that can forward-update existing openPMD 1.X files with lightweight metadata updates as soon as 2.0.0 is finalized.

You come at a great time for adopting, since we just started to implement a high-level API! openPMD-api has reached alpha state for C++ and has first Python bindings as well. (Manual, Install) For HDF5, we are h5py-compatible in output and add a nice high-level description that actually understands the openPMD self-describing physics and objects without fiddling with low-level file APIs. Talking of low-level, we already support HDF5 and ADIOS as file backends and plan for NetCDF as well (and even more ;-) ). Also, the same API works if you need scalable I/O for large parallel applications.

That's it so far, feel free to ping us anytime :)

@StanczakDominik
Member

Heyo! We had a talk about this topic today during our bi-weekly telecon (by the way, would you be interested in participating some time in the future?) as this is relevant to @ritiek's work over in PlasmaPy/PlasmaPy#500.

At the moment, we're kind of wrangling with a few design decisions:

  • we'd like to keep being available on PyPI besides just being on conda (which we, as I just remembered, should finally get done)
  • we'd like to keep our feature set consistent between PyPI and conda
  • compiling the API from source is a pain according to @ritiek who tried

At the same time, I personally think that:

  • with a scientific Python package, well, basically almost everyone uses conda anyway so it's not too much of a problem
  • for plasmas, being computationally intensive as they are, you can't always dodge having Fortran/C includes (unless you're fine with rewriting existing codes for years)
  • the API looks neat and I'd like to stay on the high level end of things instead of duplicating your functionality

But of course there may be other approaches, for example we could work with you upstream and ensure CMake compilation does what it's supposed to with pip. I haven't tried, but it theoretically could be feasible.

This discussion seems important, so I'll tag everyone from today's telecon: @namurphy @SolarDrew @tulasinandan @ritiek @samurai688 and I guess @lemmatum might be likely to care

@ax3l

ax3l commented Jun 19, 2018

Hi, thank you for your thoughts and explanations!

Regarding the API: it's all in all just a convenient offering to use it and we have ourselves lived well with h5py and native bindings in the last years to write openPMD markup :-) We have a validator script that can just check if what one wrote is correct.
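To make the h5py route a bit more concrete, here is a minimal sketch of the root-level attributes that openPMD 1.x markup attaches to a file. The attribute names are assumed from the openPMD 1.1.0 base standard and should be checked against it; the helper names `openpmd_root_attributes`, `missing_root_attributes` and `iteration_path` are hypothetical, and the check is only a toy stand-in for the real validator script. With h5py, each entry would typically be set on the file root via `f.attrs[name] = value`.

```python
# Sketch only: root attributes as assumed from the openPMD 1.1.0 base standard.
REQUIRED_ROOT_ATTRIBUTES = (
    "openPMD",
    "openPMDextension",
    "basePath",
    "iterationEncoding",
    "iterationFormat",
)


def openpmd_root_attributes(file_based=True, stem="data"):
    """Return root attributes for a hypothetical output series."""
    attrs = {
        "openPMD": "1.1.0",             # version of the standard being followed
        "openPMDextension": 0,          # bit mask (uint32 on disk); 0 = no extension
        "basePath": "/data/%T/",        # %T stands for the iteration number
        "meshesPath": "meshes/",        # needed when mesh records are written
        "particlesPath": "particles/",  # needed when particle records are written
    }
    if file_based:
        # One file per iteration, e.g. data_100.h5
        attrs["iterationEncoding"] = "fileBased"
        attrs["iterationFormat"] = f"{stem}_%T.h5"
    else:
        # All iterations live as groups inside a single file
        attrs["iterationEncoding"] = "groupBased"
        attrs["iterationFormat"] = "/data/%T/"
    return attrs


def missing_root_attributes(attrs):
    """List required root attributes that are absent (not a full validation)."""
    return [name for name in REQUIRED_ROOT_ATTRIBUTES if name not in attrs]


def iteration_path(attrs, iteration):
    """Expand basePath for one iteration, e.g. '/data/100/'."""
    return attrs["basePath"].replace("%T", str(iteration))
```

This only covers the file root; meshes, particle species and their records carry further required attributes, which is what the validator is for.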

@ritiek I would love your feedback, as an issue over in openPMD-api, on whatever might have caused you pain - I want to hear all of it and see how we can improve it! Install from source must be as easy as cmake .. && make && make install, otherwise we failed. But let us discuss the API install in a separate issue, we are still in alpha and need feedback on things that are unexpectedly troublesome. Contrary to the C++ API (alpha), the Python API is just getting started. Even if you use h5py, which is totally reasonable for PlasmaPy as well, we would love your feedback on the Python API if you have the time. Thoughts like "ugh, returning X is quite unpythonic. Why not interface as Y instead." are very welcome.

Generally, we can totally publish binary and source packages to PyPI (draft of a setup.py) as well if need be, e.g. if someone wants to assist us with setting up a multibuild so we can automate the process as we automate our conda releases.

That said, I personally think that you will generally reach a more modern and more stable experience for users with conda, since it's a better-controlled system for the various libraries you want a controlled install and performance with. Also, the currently standardized Python wheels (e.g. manylinux/manylinux2010) are extremely ancient in compiler requirements, which is caused by some distributions and trickles down to poor PyPI. For openPMD we are agnostic to the distribution channel, publish most things to PyPI and conda, and even go the extra mile to support the old compilers in the list. If we can automate the shipping in a controlled, open build environment (for osx, win and linux) and user experience is not horrible with it, we will ship ;-)

That said, don't wait for us if you like h5py and the HDF5 backend is sufficient for your use case! h5py works great for us as well and we are compatible (e.g. in weird things like representation of bools and string attributes - just make sure to run our validator on your data).

> by the way, would you be interested in participating some time in the future?

certainly! :)

@ritiek

ritiek commented Jun 20, 2018

@ax3l I am really glad you're putting so much effort into getting people to adopt these standards, which I hope people really do!

> Install from source must be as easy as cmake .. && make && make install, otherwise we failed.

Yep, it did build smoothly with no problems when I passed no arguments but there was some trouble when passing -DopenPMD_USE_PYTHON=ON and -DopenPMD_USE_HDF5 to cmake.

It's been a while since I tried to build, but AFAIK there was some error regarding `MPI` and `HDF5`, which I think [building HDF5 and h5py with MPI support](https://gist.github.com/kentwait/280aa20e9d8f2c8737b2bec4a49b7c92) fixed. I think it would be very nice if it pointed to relevant docs for building with MPI support. (Also, it isn't immediately clear what MPI support does; it might be good to have some docs on what additional features it brings to OpenPMD-API :)

That error regarding HDF5 and MPI went away, but now I face this one:

-- HDF5: Using hdf5 compiler wrapper to determine C configuration
-- Can NOT find 'adios_config' - set ADIOS_ROOT, ADIOS_DIR or INSTALL_PREFIX, or check your PATH
-- Could NOT find ADIOS (missing: ADIOS_LIBRARIES ADIOS_INCLUDE_DIRS) (Required is at least version "1.13.1")
CMake Error at CMakeLists.txt:176 (find_package):
  Could not find a package configuration file provided by "pybind11"
  (requested version 2.2.1) with any of the following names:

    pybind11Config.cmake
    pybind11-config.cmake

  Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
  "pybind11_DIR" to a directory containing one of the above files.  If
  "pybind11" provides a separate development package or SDK, be sure it has
  been installed.


-- Configuring incomplete, errors occurred!
See also "/home/ritiek/Downloads/openPMD-api-build/CMakeFiles/CMakeOutput.log".

I tried looking it up and I think installing ADIOS might fix it, but it is listed under optional I/O dependencies in the README (and/or it could be that I have pybind installed incorrectly), and I haven't really tried installing it yet. I guess I'll tinker a bit more and might raise an issue in the OpenPMD-API repo with feedback on what could have been better from my perspective. :)
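For the pybind11 part of the log above, one possible workaround is to pass `pybind11_DIR` to CMake explicitly. The sketch below assembles such a configure command. It assumes a pybind11 release that ships its CMake files and exposes `pybind11.get_cmake_dir()` (newer than the 2.2.1 requested in the log), it assumes `=ON` as the value for both openPMD options, and the helper name `cmake_configure_args` is hypothetical.

```python
def cmake_configure_args(source_dir="..", use_python=True, use_hdf5=True):
    """Build a cmake configure command line as a list of arguments."""
    args = ["cmake", source_dir]
    if use_hdf5:
        args.append("-DopenPMD_USE_HDF5=ON")
    if use_python:
        args.append("-DopenPMD_USE_PYTHON=ON")
        try:
            import pybind11

            # Directory containing pybind11Config.cmake (assumed available
            # only in newer pybind11 releases installed with CMake files).
            args.append(f"-Dpybind11_DIR={pybind11.get_cmake_dir()}")
        except (ImportError, AttributeError):
            # pybind11 is missing or too old: leave the hint out and let
            # CMake search CMAKE_PREFIX_PATH on its own.
            pass
    return args


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell in the build directory.
    print(" ".join(cmake_configure_args()))
```

The ADIOS lines in the log are status messages rather than the reported CMake error, so this sketch does not touch them.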


Also, I tried installing it with spack

spack install openpmd-api +python
spack load --dependencies openpmd-api

but for some reason spack always ended up installing Python 2.7.15, even when I created a symbolic link at /usr/bin/python pointing to my Python 3.6 binary (I wanted it to build OpenPMD-API for Python 3.6).

Initially it did say it was running Python 3.6 with spack python but that changed when I did spack install openpmd-api +python. I am not really sure whether this is something to do with OpenPMD-API or spack itself. I haven't used spack before this so I can't say how it works. Do you guys know of a way to specify the Python version?

And conda installation really did go smooth, no errors in da way. :D


That said, I am myself very new to HDF5 and OpenPMD in general, so it may be just something to do with my lack of clear understanding at the moment. :/

@ax3l

ax3l commented Jun 20, 2018

@ritiek excellent, thank you for the details, wonderful!

I would like to respond to you but am afraid to spam your PLEP proposal. Can we move it to a bug report in https://github.com/openPMD/openPMD-api/issues/new/choose ? (Just copy & paste your last comment and link here.)

@ax3l

ax3l commented Aug 20, 2019

Little update: openPMD 1.x is implemented with PlasmaPy/PlasmaPy#500 (closing PlasmaPy/PlasmaPy#167 ) via h5py 🚀 ✨

Adding to what I said earlier: our low-level reference implementation openPMD-api might be useful for any kind of data pre-/post-processing in the future as well (or you could use it for I/O); it is also currently being advanced for staged (in-transit) workflows. It now provides C++11 and Python3 bindings and is shipped via:

  • spack (auto-build from source)
  • conda-forge (binaries for OSX, Windows, Linux on x86; aarch64 and ppc64le coming soon)
  • pypi (source-build for all platforms; manylinux2010 binary wheels for linux-x86)
  • piwheels (binaries for RaspberryPi; status)

Current file backends include: json, HDF5, ADIOS1 and ADIOS2, the latter three support serial as well as MPI-parallel I/O.
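As a rough illustration of how these backends are told apart in practice: as far as the openPMD-api documentation describes, the backend is selected from the file name extension. The mapping below is a small sketch under that assumption; the names `BACKEND_BY_SUFFIX`, `PARALLEL_CAPABLE`, `backend_for` and `supports_parallel_io` are hypothetical and not part of openPMD-api.

```python
from pathlib import Path

# Assumed extension-to-backend mapping; ADIOS1 and ADIOS2 share ".bp".
BACKEND_BY_SUFFIX = {
    ".json": "JSON",
    ".h5": "HDF5",
    ".bp": "ADIOS1 or ADIOS2",
}

# Per the comment above, HDF5 and both ADIOS backends support MPI-parallel I/O.
PARALLEL_CAPABLE = {"HDF5", "ADIOS1 or ADIOS2"}


def backend_for(filename):
    """Return the backend label implied by a file name's extension."""
    suffix = Path(filename).suffix
    if suffix not in BACKEND_BY_SUFFIX:
        raise ValueError(f"no known backend for extension {suffix!r}")
    return BACKEND_BY_SUFFIX[suffix]


def supports_parallel_io(filename):
    """True if the implied backend can do MPI-parallel I/O."""
    return backend_for(filename) in PARALLEL_CAPABLE
```

For example, `backend_for("data_%T.h5")` yields the HDF5 label, while an unknown extension raises a `ValueError` instead of guessing.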
