Swarm-NG
1.1
Class of GPU integrators with a thread for each body-pair.
Classes
class | hermite_adap
GPU implementation of PEC2 Hermite integrator with adaptive time step.
class | hermite
GPU implementation of PEC2 Hermite integrator.
struct | FixedTimeStep
Data structure for fixed time step.
struct | AdaptiveTimeStep
Data structure for adaptive time step.
class | rkck
Runge-Kutta Cash-Karp integrator (fixed or adaptive time step).
struct | EulerPropagatorParams
Parameters for EulerPropagator.
struct | EulerPropagator
GPU implementation of Euler propagator. It is of no practical use.
struct | HermitePropagatorParams
Parameters for HermitePropagator.
struct | HermitePropagator
GPU implementation of Hermite propagator. It is of no practical use, since the hermite integrator implements the same functionality faster.
struct | MidpointPropagatorParams
Parameters for MidpointPropagator.
struct | MidpointPropagator
GPU implementation of modified midpoint method propagator.
struct | MVSPropagatorParams
Parameters for MVSPropagator.
struct | MVSPropagator
GPU implementation of mixed variables symplectic propagator.
struct | VerletPropagatorParams
Parameters for VerletPropagator.
struct | VerletPropagator
GPU implementation of Verlet propagator.
class | integrator
Common functionality and skeleton for body-pair-per-thread integrators.
class | generic
Generic integrator for rapid creation of new integrators.
class | GravitationAcc
Templatized class to calculate acceleration and jerk in parallel.
class | GravitationAccJerk
Templatized class working as a function object to calculate acceleration and jerk in parallel.
struct | GravitationAccScalars
Unit type of the acceleration pairs shared array.
struct | GravitationAccJerkScalars
Unit type of the acceleration and jerk pairs shared array.
class | GravitationAcc_GR
Templatized class to calculate acceleration and jerk in parallel.
class | GravitationLargeN
Gravitation calculation class for a large number of bodies in a system.
class | GravitationMediumN
Gravitation calculation for systems of 10 to 20 bodies. EXPERIMENTAL: this class is not thoroughly tested.
Functions
GPUAPI int | sysid ()
Kernel Helper Function: Extract system ID from CUDA thread ID.
GPUAPI int | sysid_in_block ()
Kernel Helper Function: Extract system sequence number inside current block.
GPUAPI int | thread_in_system ()
Kernel Helper Function: Extract the worker-thread number for current system.
GPUAPI int | system_per_block_gpu ()
Kernel Helper Function: Extract number of systems per block from CUDA thread information.
GPUAPI int | thread_component_idx (int nbod)
Kernel Helper Function: Logical coordinate component id [1:x,2:y,3:z] calculated from thread ID info.
GPUAPI int | thread_body_idx (int nbod)
Kernel Helper Function: Logical body id [0..nbod-1] calculated from thread ID info.
template<class Impl, class T>
GPUAPI void * | system_shared_data_pointer (Impl *integ, T compile_time_param)
Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.
GENERIC double | inner_product (const double a[3], const double b[3])
Helper function for calculating inner product.
template<int nbod>
GENERIC int | first (int ij)
Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first, second); this function returns the first element.
template<int nbod>
GENERIC int | second (int ij)
Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first, second); this function returns the second element.
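The pair helpers first() and second() map a linear pair index to the two bodies of a pair. The sketch below illustrates one such mapping on the host. It is not Swarm-NG's implementation: the names pair_count, pair_first and pair_second are hypothetical, the index is 0-based (the documentation above describes a 1-based range), and the enumeration order (pairs (i, j) with j < i, numbered row by row) is an assumption chosen for simplicity.

```cpp
#include <cassert>

// Number of distinct body pairs in a system of nbod bodies.
template <int nbod>
inline int pair_count() { return nbod * (nbod - 1) / 2; }

// Assumed enumeration (hypothetical, not taken from Swarm-NG):
// pairs (i, j) with 0 <= j < i < nbod are numbered row by row,
// so that ij = i*(i-1)/2 + j for ij in [0, pair_count).
template <int nbod>
inline int pair_first(int ij) {
    assert(ij >= 0 && ij < pair_count<nbod>());
    int i = 1;
    // Advance i until ij falls inside row i of the lower triangle.
    while (i * (i + 1) / 2 <= ij) ++i;
    return i;
}

template <int nbod>
inline int pair_second(int ij) {
    int i = pair_first<nbod>(ij);
    // Offset of ij within row i.
    return ij - i * (i - 1) / 2;
}
```

With this enumeration, a 3-body system has the pairs (1,0), (2,0) and (2,1) at indices 0, 1 and 2, so each of the three pair threads can look up its two bodies independently.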
Assigning a thread to each body-pair parallelizes the integration of an ensemble as much as possible. Threads are assigned to three kinds of work: body pairs, body components, and overall per-system tasks.
We use a predicate barrier for each case, since the number of threads that actually work differs between cases. For example, a system of 3 bodies has 3 pairs and 9 body-components, and needs 1 thread for overall tasks.
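The counts in the 3-body example follow from simple arithmetic, sketched below on the host. The struct ThreadBudget and the helper functions are hypothetical names introduced for illustration. Taking the maximum of the three counts as the number of threads per system is an assumption of this sketch, not a statement about what thread_per_system() returns in Swarm-NG.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of how many threads each kind of work needs.
struct ThreadBudget {
    int pairs;               // one thread per body pair
    int components;          // one thread per body component (x, y, z per body)
    int overall;             // one thread for per-system tasks
    int threads_per_system;  // assumed: enough threads for the largest case
};

inline ThreadBudget thread_budget(int nbod) {
    assert(nbod >= 1);
    ThreadBudget b;
    b.pairs = nbod * (nbod - 1) / 2;
    b.components = 3 * nbod;
    b.overall = 1;
    b.threads_per_system = std::max({b.pairs, b.components, b.overall});
    return b;
}

// Predicates in the spirit of the text: a thread only works on a case
// when its index within the system is below that case's count.
inline bool works_on_pairs(int t, const ThreadBudget& b)      { return t < b.pairs; }
inline bool works_on_components(int t, const ThreadBudget& b) { return t < b.components; }
inline bool works_on_overall(int t, const ThreadBudget& b)    { return t < b.overall; }
```

For nbod = 3 this gives 3 pairs, 9 components and 1 overall thread, matching the example above. Only threads 0 to 2 pass the pair predicate, while all 9 pass the component predicate.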
For better coalesced reads from global and shared memory, the block is structured in a non-traditional way: the innermost (x) component is part of the system id, and the other component is the body-pair.
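The layout described above can be emulated on the host to see how a thread's coordinates translate into logical ids. This is a sketch under stated assumptions, not the library's code: Dim3 and ThreadCtx are hypothetical stand-ins for CUDA's built-in thread variables, the grid is assumed to be one-dimensional, and the split of the in-system thread number into body and component (modulo and division by nbod, both zero-based) is an assumption.

```cpp
#include <cassert>

namespace sketch {

// Host stand-in for CUDA's dim3 / uint3 built-ins (hypothetical).
struct Dim3 { int x, y, z; };

struct ThreadCtx {
    Dim3 threadIdx;  // position of the thread inside its block
    Dim3 blockIdx;   // position of the block inside the grid
    Dim3 blockDim;   // block shape: x = systems per block, y = threads per system
};

// The x component selects the system, so neighbouring threads in x
// touch neighbouring systems.
inline int sysid_in_block(const ThreadCtx& c)   { return c.threadIdx.x; }
inline int system_per_block(const ThreadCtx& c) { return c.blockDim.x; }

// Assumes a one-dimensional grid of blocks.
inline int sysid(const ThreadCtx& c) {
    return c.blockIdx.x * c.blockDim.x + c.threadIdx.x;
}

// The y component is the worker number (body-pair slot) within the system.
inline int thread_in_system(const ThreadCtx& c) { return c.threadIdx.y; }

// Assumed zero-based split of the worker number into body and component.
inline int body_idx(const ThreadCtx& c, int nbod) {
    assert(nbod > 0);
    return thread_in_system(c) % nbod;
}
inline int component_idx(const ThreadCtx& c, int nbod) {
    assert(nbod > 0);
    return thread_in_system(c) / nbod;
}

}  // namespace sketch
```

Because the system id varies fastest across threadIdx.x, the threads of one warp handle the same worker slot in consecutive systems. That is what allows their memory accesses to coalesce when per-system data is interleaved.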
Global functions defined here are used inside kernels for consistent interpretation of thread and shared memory references.
The integrators that derive from this may override three functions: thread_per_system(), shmem_per_system(),
template<class Impl, class T>
GPUAPI void* swarm::gpu::bppt::system_shared_data_pointer (Impl *integ, T compile_time_param)
Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.
This function assumes that the memory is used through CoalescedStructArray with a chunk size of SHMEM_CHUNK_SIZE. This uses overlapping data structures to provide coalescing for shared memory.
Definition at line 122 of file bppt.hpp.
References sysid_in_block().
Referenced by swarm::gpu::bppt::hermite< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::rkck< AdaptationStyle, Monitor, Gravitation >::kernel(), TutorialIntegrator< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::hermite_adap< Monitor, Gravitation >::kernel(), and swarm::gpu::bppt::generic< Propagator, Monitor, Gravitation >::kernel().
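The chunked, overlapping layout that system_shared_data_pointer() assumes can be illustrated with plain offset arithmetic. The sketch below is a host-side illustration, not the code in bppt.hpp. The chunk size of 16, the function names, and the exact offset formula are assumptions. The idea shown is that systems are grouped into chunks, and within a chunk the doubles of different systems are interleaved, so each system's data starts one double after its neighbour's.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value standing in for SHMEM_CHUNK_SIZE.
constexpr int kChunkSize = 16;

// Byte offset of the first double belonging to a system inside the
// block's shared memory, given the bytes each system needs in total.
inline std::size_t system_base_offset(int sysid_in_block,
                                      std::size_t bytes_per_system) {
    assert(sysid_in_block >= 0);
    const std::size_t chunk = sysid_in_block / kChunkSize;  // which chunk
    const std::size_t lane  = sysid_in_block % kChunkSize;  // slot in chunk
    // Whole chunks come first; inside a chunk the systems overlap,
    // each shifted by one double.
    return chunk * kChunkSize * bytes_per_system + lane * sizeof(double);
}

// Byte offset of the k-th double of a system: consecutive elements of
// one system are kChunkSize doubles apart.
inline std::size_t element_offset(int sysid_in_block,
                                  std::size_t bytes_per_system,
                                  std::size_t k) {
    return system_base_offset(sysid_in_block, bytes_per_system)
         + k * kChunkSize * sizeof(double);
}
```

With this layout, element k of systems 0 to 15 occupies 16 adjacent doubles, so threads that handle neighbouring systems read neighbouring addresses.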