FLAME GPU 2.0.0-rc.1

Pre-release
Released by github-actions on 12 Jan 16:28 · 50 commits to master since this release

FLAME GPU 2.0.0-rc.1 is the second release-candidate for FLAME GPU 2.0.0.

As a release-candidate, the API should be stable unless an issue found during the release-candidate phase requires a breaking change to resolve.

There are several important additions and a breaking change since FLAME GPU 2.0.0-rc, which may require changes to your models.
See the changelog for more detail.

This release candidate requires:

  • CMake >= 3.18
  • CUDA >= 11.0 and a Compute Capability >= 3.5 NVIDIA GPU.
  • C++17 capable C++ compiler (host), compatible with the installed CUDA version (e.g. VS2019+, or GCC >= 8.1)
  • git
  • Python >= 3.8 (optional)
  • MPI >= 3 (optional) with CMake >= 3.20

For full version requirements, please see the Requirements section of the README.

Documentation and Support

Installing Pre-compiled Python Binary Wheels

Python binary wheels for pyflamegpu are not currently distributed via pip; however, they can now be installed from the pyflamegpu wheelhouse, whl.flamegpu.com.
They can also be installed manually from assets attached to this release.

To install pyflamegpu 2.0.0rc1 from whl.flamegpu.com, install via pip with --extra-index-url or --find-links and the appropriate URI from whl.flamegpu.com.
E.g. to install the latest pyflamegpu build for CUDA 11.2-11.8 without visualisation:

python3 -m pip install --extra-index-url https://whl.flamegpu.com/whl/cuda112/ pyflamegpu

To install pyflamegpu 2.0.0rc1 manually, download the appropriate .whl file for your platform and install it into your Python environment using pip, e.g.:

python3 -m pip install pyflamegpu-2.0.0rc1+cuda112-cp39-cp39-linux_x86_64.whl

CUDA 11.2-11.8 or CUDA 12.x (including nvrtc) must be installed on a system with a Compute Capability 3.5 or newer NVIDIA GPU.

Python binary wheels are available for x86_64 systems with:

  • Linux with glibc >= 2.17 (e.g. Ubuntu >= 13.04, CentOS/RHEL >= 7, etc.)
  • Windows 10+
  • Python 3.8 - 3.12
  • CUDA 12.x
  • CUDA 11.2 - 11.8
  • Wheels with visualisation enabled or disabled.
    • Note that Linux wheels do not package shared object dependencies at this time.

Wheel filenames are of the format pyflamegpu-2.0.0rc1+cuda<CUDA>[.vis]-cp<PYTHON>-cp<PYTHON>-<platform>.whl, where:

  • cuda<CUDA> encodes the CUDA version used
  • .vis indicates visualisation support is included
  • cp<PYTHON> identifies the Python version
  • <platform> identifies the OS/CPU Architecture

For example:

  • pyflamegpu-2.0.0rc1+cuda120-cp38-cp38-linux_x86_64.whl is a CUDA 12.0-12.x compatible wheel, without visualisation support, for Python 3.8 on Linux x86_64.
  • pyflamegpu-2.0.0rc1+cuda112.vis-cp39-cp39-win_amd64.whl is a CUDA 11.2-11.8 compatible wheel, with visualisation support, for Python 3.9 on Windows 64-bit.
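As a worked illustration of the naming scheme above, the fields can be split out of a filename with a small Python helper (hypothetical, not part of pyflamegpu; it assumes wheels follow exactly the documented pattern):

```python
import re

# Hypothetical helper: parse a pyflamegpu wheel filename into the fields
# described above. Assumes the documented naming scheme.
WHEEL_RE = re.compile(
    r"pyflamegpu-(?P<version>[^+]+)\+cuda(?P<cuda>\d+)(?P<vis>\.vis)?"
    r"-cp(?P<py>\d+)-cp\d+-(?P<platform>.+)\.whl"
)

def parse_wheel(filename):
    m = WHEEL_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognised wheel filename: {filename}")
    return {
        "version": m.group("version"),    # e.g. "2.0.0rc1"
        "cuda": m.group("cuda"),          # e.g. "112" for CUDA 11.2
        "vis": m.group("vis") is not None,  # visualisation build?
        "python": m.group("py"),          # e.g. "39" for Python 3.9
        "platform": m.group("platform"),  # e.g. "linux_x86_64"
    }

print(parse_wheel("pyflamegpu-2.0.0rc1+cuda112.vis-cp39-cp39-win_amd64.whl"))
```

Such a check can be useful in install scripts to confirm a downloaded wheel matches the local CUDA and Python versions before invoking pip.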

Building FLAME GPU from Source

For instructions on building FLAME GPU from source, please see the Building FLAME GPU section of the README.

Known Issues

  • Warnings and a loss of performance due to hash collisions in device code (#356)
  • Multiple known areas where performance can be improved (e.g. #449, #402)
  • Windows/MSVC builds using CUDA 11.0 may encounter errors when performing incremental builds if the static library has been recompiled. If this presents itself, re-save any .cu file in your executable producing project and re-trigger the build.
  • Debug builds under Linux with CUDA 11.0 may encounter CUDA errors during validateIDCollisions. Consider using an alternate CUDA version if this is required (#569).
  • CUDA 11.0 with GCC 9 may encounter a segmentation fault during compilation of the test suite. Consider using GCC 8 with CUDA 11.0.
  • CUDA 12.2+ suffers from poor RTC compilation times; this will be fixed in a future release (#1118).

Breaking changes compared to 2.0.0-rc

  • CUDAEnsemble::getLogs returns std::map<unsigned int, RunLog> rather than std::vector<RunLog>, required for distributed ensemble support. (#1090)
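The signature above is the C++ API; the practical consequence is that per-run logs are now keyed by run index rather than held positionally, and indices need not be contiguous (e.g. in a distributed ensemble where each rank only holds logs for the runs it executed). A minimal sketch of the adjustment, using a plain dict as a stand-in for the returned mapping (placeholder strings substitute for real RunLog objects):

```python
# Stand-in for the mapping now returned by getLogs(): run index -> RunLog.
# Values here are placeholder strings, not real RunLog objects.
logs = {0: "log-for-run-0", 3: "log-for-run-3"}

# Old pattern treated the result as a flat sequence; code which relied on
# positional indexing must now iterate key/value pairs instead.
for run_index in sorted(logs):
    print(f"run {run_index}: {logs[run_index]}")
```

Iterating in sorted key order restores the old run ordering even when some indices are absent.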