GPU2

Fix typo in nbinptp.f, and missing jpred.f and inconsistent install.sh

Jan 19, 2018

5c672bd · Jan 19, 2018

Name	Last commit message	Last commit date
Istat	First commit of NBODY6	Sep 18, 2015
irrlib	NBODY6 update from Sverre Aarseth	Jan 11, 2016
lib	NBODY6 update from Sverre Aarseth	Jan 11, 2016
run	NBODY6 Update (20,Apr,2016, Sverre)	Apr 28, 2016
Makefile	Cumulative update up to 2018-1-18	Jan 18, 2018
Makefile.build	Cumulative update up to 2018-1-18	Jan 18, 2018
Ncode	Cumulative update up to 2018-1-18	Jan 18, 2018
README	First commit of NBODY6	Sep 18, 2015
README.2011	First commit of NBODY6	Sep 18, 2015
SAVE	Improvement and bugs fix	Dec 30, 2015
adjust.f	Improvement and bugs fix	Dec 30, 2015
bhplot.f	NBODY6 Update (20,Apr,2016, Sverre)	Apr 28, 2016
check3.f	First commit of NBODY6	Sep 18, 2015
checkl2.f	Cumulative update up to 2018-1-18	Jan 18, 2018
cmfirr.f	First commit of NBODY6	Sep 18, 2015
cmfirr2.f	First commit of NBODY6	Sep 18, 2015
common6.h	Fix the error of soft link in some directories	Dec 30, 2015
debug.pdf	First commit of NBODY6	Sep 18, 2015
debug.tex	First commit of NBODY6	Sep 18, 2015
energy2.f	First commit of NBODY6	Sep 18, 2015
fpcorr2.f	First commit of NBODY6	Sep 18, 2015
fpoly0.f	First commit of NBODY6	Sep 18, 2015
gpucor.f	Cumulative update up to 2018-1-18	Jan 18, 2018
gpuirr.avx.s	First commit of NBODY6	Sep 18, 2015
gpuirr.s	First commit of NBODY6	Sep 18, 2015
gpuirr.sse.cpp	First commit of NBODY6	Sep 18, 2015
guide2.pdf	First commit of NBODY6	Sep 18, 2015
guide2.tex	First commit of NBODY6	Sep 18, 2015
install.sh	Fix typo in nbinptp.f, and missing jpred.f and inconsistent install.sh	Jan 19, 2018
intgrt.omp.f	Cumulative update up to 2018-1-18	Jan 18, 2018
intgrt.omp.f.stand	First commit of NBODY6	Sep 18, 2015
jpred.f	First commit of NBODY6	Sep 18, 2015
jpred2.f	First commit of NBODY6	Sep 18, 2015
kspert.f	First commit of NBODY6	Sep 18, 2015
kspinit.f	First commit of NBODY6	Sep 18, 2015
kspreg.f	First commit of NBODY6	Sep 18, 2015
ksres3.f	First commit of NBODY6	Sep 18, 2015
long.f	First commit of NBODY6	Sep 18, 2015
nbint.f	Cumulative update up to 2018-1-18	Jan 18, 2018
nbintp.f	Fix typo in nbinptp.f, and missing jpred.f and inconsistent install.sh	Jan 19, 2018
nbintp.f.stand	First commit of NBODY6	Sep 18, 2015
params.h	Fix the error of soft link in some directories	Dec 30, 2015
phicor.f	First commit of NBODY6	Sep 18, 2015
repair.f	First commit of NBODY6	Sep 18, 2015
scale.f	NBODY6 Update (20,Apr,2016, Sverre)	Apr 28, 2016
start.f	Cumulative update up to 2018-1-18	Jan 18, 2018
swap.f	Cumulative update up to 2018-1-18	Jan 18, 2018
sweep2.f	Cumulative update up to 2018-1-18	Jan 18, 2018

README

INSTALLATION NOTES FOR LATEST NBODY6

1. Updates
We added support for AVX.
'Makefile*' files were changed and cleaned up.

2. Installation
(a) For users with AVX support:
Just type
 $ make gpu
to generate the executable ./run/nbody6.gpu,
whose regular force part is accelerated by the GPU,
or
 $ make avx
to generate the executable ./run/nbody6.avx,
which runs without a GPU and has both the regular and
irregular force parts tuned for AVX.
In both versions, AVX is used for the irregular
force part.

(b) For users without AVX support:
Comment out the first line of Makefile so that it reads
#avx       = enable
Then type
 $ make gpu
to generate the executable ./run/nbody6.gpu
or
 $ make sse
to generate the executable ./run/nbody6.sse
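
The choice between cases (a) and (b) depends on whether the CPU reports
AVX. The following is a minimal sketch of that check, assuming a Linux
system where /proc/cpuinfo has a 'flags' line. The helper names
has_avx and pick_target are hypothetical and are not part of NBODY6.

```shell
#!/bin/sh
# Sketch only: has_avx and pick_target are hypothetical helpers.

# Succeeds if the space-separated flag list in $1 contains the word "avx".
has_avx() {
    case " $1 " in
        *" avx "*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Prints the CPU-only make target named in the README for a flag list.
pick_target() {
    if has_avx "$1"; then
        echo avx
    else
        echo sse
    fi
}

# Assumption: Linux, with a 'flags' line in /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "suggested CPU-only target: make $(pick_target "$flags")"
fi
```

For the GPU build on a CPU without AVX, the Makefile edit described in
(b) is still needed; the sketch only picks between 'make avx' and
'make sse'.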

3. Modifications
If the default build does not work on your system, you may need
to modify the Makefile.
(1) If the CUDA version is less than 5, comment out the line
      NVCC += -DWITH_CUDA5
(2) SDK_PATH needs to be set such that 'cutil.h' ('helper_cuda.h'
    in CUDA 5) is found.
(3) If your GPU generation is before Kepler, the option
    -arch=sm_30 needs to be changed to a relevant value.
(4) If 'libhugetlbfs' is not installed or you don't
    want to use huge-pages, comment out the line in Makefile.ncode:
      LD_GPU += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align
    For older versions of libhugetlbfs, the linker option may be
      LD_GPU += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-link=B
    If you use huge-pages, set three environment variables:
      HUGETLB_VERBOSE='2'
      HUGETLB_ELFMAP='W'
      HUGETLB_MORECORE='yes'
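
The three huge-page variables in (4) can be set in the shell that
launches the run. This is a sketch for BASH or another sh-compatible
shell. The values are copied from the list above; the run command is
left as a comment because its input file name is an assumption.

```shell
#!/bin/sh
# Values taken from item (4); export makes them visible to the executable.
export HUGETLB_VERBOSE='2'
export HUGETLB_ELFMAP='W'
export HUGETLB_MORECORE='yes'

# Hypothetical invocation; 'input' and 'output' are placeholder file names.
# ./run/nbody6.gpu < input > output
```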

4. For big N simulations
We added the '-fPIC' option so that compilation succeeds when a big
NMAX (> 100k) is set in 'params.h'.
A big N can still cause a stack overflow and a segmentation fault.
In such cases, increase the stack size with the 'ulimit -s'
command in BASH or 'limit stacksize' in TCSH.
Just type
 $ ulimit -s
which prints the current limit in KB (for example, 10240).
You can increase it with
 $ ulimit -s 20480
or a larger value, or with 'limit stacksize 20480' in TCSH.
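
The stack-size check can also be scripted. The following is a minimal
BASH sketch, not part of NBODY6: the helper name stack_needs_raise is
hypothetical and 20480 is the example value from above. It uses
'ulimit -S -s' so that only the soft limit changes; this is a choice
made for the sketch.

```shell
#!/bin/bash
# Sketch only: stack_needs_raise is a hypothetical helper.
# $1 = current soft limit as printed by 'ulimit -s' (KB or "unlimited")
# $2 = wanted limit in KB
stack_needs_raise() {
    [ "$1" != "unlimited" ] && [ "$1" -lt "$2" ]
}

wanted=20480
current=$(ulimit -s)
if stack_needs_raise "$current" "$wanted"; then
    # This can fail if the hard limit (ulimit -H -s) is below $wanted.
    ulimit -S -s "$wanted" || echo "could not raise stack limit to $wanted KB" >&2
fi
echo "stack limit: $(ulimit -s) KB"
```

The limit applies to the current shell and the processes it starts, so
the run has to be launched from the same shell or script.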

5. Current environment at IoA Cambridge:
CPU      : Core i5-3570 (4 cores, 3.4 GHz)
GPU      : One GeForce GTX 660 Ti (7 SMXs, 1344 CUDA cores)
OS       : CentOS 6.4 for x86_64
Compiler : GCC 4.4.7 (default of CentOS)
CUDA     : CUDA 5.0 Production Release

Here is a screenshot of this configuration integrating
64k stars from t=0 to t=2.

***********************
Initializing NBODY6/GPU library
#CPU 4, #GPU 1
 device: 0
 device 0: GeForce GTX 660 Ti
***********************
***********************
Opened NBODY6/GPU library
#CPU 4, #GPU 1
 device: 0
 0 65546
nbmax = 65546
***********************
**************************** 
Opening GPUIRR lib. AVX ver. 
 nmax = 65546, lmax = 500
**************************** 

***********************
Closed NBODY6/GPU library
time send   : 0.875775 sec
time grav   : 20.004446 sec
time reduce : 0.271385 sec
time regtot : 21.151606 sec
1164.470114 Gflops (gravity part only)
***********************
**************************** 
Closing GPUIRR lib. CPU ver. 
time grav  : 17.729413 sec

perf grav  : 21.715221 Gflops
perf grav  : 163.998749 usec
<#NB>      : 76.406419 
**************************** 


Keigo Nitadori
April 2013
