FLAME GPU 2.0.0-rc.1
FLAME GPU 2.0.0-rc.1 is the second release candidate for FLAME GPU 2.0.0.
As a release candidate, the API should be stable unless issues found during the release-candidate phase require a breaking change to resolve.
There are several important additions and one breaking change since FLAME GPU 2.0.0-rc, which may require changes to your models.
See the changelog for more detail.
This release candidate requires:
- CMake >= 3.18
- CUDA >= 11.0 and an NVIDIA GPU with Compute Capability >= 3.5
- C++17 capable C++ compiler (host), compatible with the installed CUDA version (i.e. VS2019+, or GCC >= 8.1)
- git
- Python >= 3.8 (optional)
- MPI >= 3 (optional), with CMake >= 3.20
For full version requirements, please see the Requirements section of the README.
Documentation and Support
- Quickstart Guide
- Documentation and User Guide
- GitHub Discussions
- GitHub Issues
- Website
- Pyflamegpu Wheelhouse
Installing Pre-compiled Python Binary Wheels
Python binary wheels for pyflamegpu are not currently distributed via pip's default index; however, they can now be installed from the pyflamegpu wheelhouse, whl.flamegpu.com.
They can also be installed manually from assets attached to this release.
To install pyflamegpu 2.0.0rc1 from whl.flamegpu.com, install via pip with --extra-index-url or --find-links and the appropriate URI from whl.flamegpu.com.
E.g. to install the latest pyflamegpu build for CUDA 11.2-11.8 without visualisation:
python3 -m pip install --extra-index-url https://whl.flamegpu.com/whl/cuda112/ pyflamegpu
To install pyflamegpu 2.0.0rc1 manually, download the appropriate .whl file for your platform, and install it into your python environment using pip. I.e.
python3 -m pip install pyflamegpu-2.0.0rc1+cuda112-cp39-cp39-linux_x86_64.whl
CUDA 11.2-11.8 or CUDA 12.x (including nvrtc) must be installed on your system, which must contain a Compute Capability 3.5 or newer NVIDIA GPU.
Python binary wheels are available for x86_64 systems with:
- Linux with glibc >= 2.17 (i.e. Ubuntu >= 13.04, CentOS/RHEL >= 7, etc.)
- Windows 10+
- Python 3.8 - 3.12
- CUDA 12.x
  - Built with support for Compute Capability 50, 60, 70, 80 and 90 GPUs
- CUDA 11.2 - 11.8
  - Built with support for Compute Capability 35, 50, 60, 70 and 80 GPUs
- Wheels with visualisation enabled or disabled
- Note that Linux wheels do not package shared object dependencies at this time.
Wheel filenames are of the format pyflamegpu-2.0.0rc1+cuda<CUDA>[.vis]-cp<PYTHON>-cp<PYTHON>-<platform>.whl, where:
- cuda<CUDA> encodes the CUDA version used
- .vis indicates visualisation support is included
- cp<PYTHON> identifies the python version
- <platform> identifies the OS/CPU architecture
For example:
- pyflamegpu-2.0.0rc1+cuda120-cp38-cp38-linux_x86_64.whl is a CUDA 12.0-12.x compatible wheel, without visualisation support, for Python 3.8 on Linux x86_64.
- pyflamegpu-2.0.0rc1+cuda112.vis-cp39-cp39-win_amd64.whl is a CUDA 11.2-11.8 compatible wheel, with visualisation support, for Python 3.9 on 64-bit Windows.
Building FLAME GPU from Source
For instructions on building FLAME GPU from source, please see the Building FLAME GPU section of the README.
Known Issues
- Warnings and a loss of performance due to hash collisions in device code (#356)
- Multiple known areas where performance can be improved (e.g. #449, #402)
- Windows/MSVC builds using CUDA 11.0 may encounter errors when performing incremental builds if the static library has been recompiled. If this presents itself, re-save any .cu file in your executable-producing project and re-trigger the build.
- Debug builds under Linux with CUDA 11.0 may encounter CUDA errors during validateIDCollisions. Consider using an alternate CUDA version if this is required (#569).
- CUDA 11.0 with GCC 9 may encounter a segmentation fault during compilation of the test suite. Consider using GCC 8 with CUDA 11.0.
- CUDA 12.2+ suffers from poor RTC compilation times, to be fixed in a future release (#1118).
Breaking changes compared to 2.0.0-rc
- CUDAEnsemble::getLogs returns std::map<unsigned int, RunLog> rather than std::vector<RunLog>, as required for distributed ensemble support (#1090).