Class of GPU integrators with a thread for each body-pair.

Classes
class	hermite_adap
	GPU implementation of PEC2 Hermite integrator with adaptive time step.

class	hermite
	GPU implementation of PEC2 Hermite integrator.

struct	FixedTimeStep
	Data structure for fixed time step.

struct	AdaptiveTimeStep
	Data structure for adaptive time step.

class	rkck
	Runge-Kutta Cash-Karp integrator (fixed or adaptive time step).

struct	EulerPropagatorParams
	Parameters for EulerPropagator.

struct	EulerPropagator
	GPU implementation of Euler propagator. It is of no practical use.

struct	HermitePropagatorParams
	Parameters for HermitePropagator.

struct	HermitePropagator
	GPU implementation of Hermite propagator. It is of no practical use, since the hermite integrator implements the same functionality faster.

struct	MidpointPropagatorParams
	Parameters for MidpointPropagator.

struct	MidpointPropagator
	GPU implementation of modified midpoint method propagator.

struct	MVSPropagatorParams
	Parameters for MVSPropagator.

struct	MVSPropagator
	GPU implementation of mixed variables symplectic propagator.

struct	VerletPropagatorParams
	Parameters for VerletPropagator.

struct	VerletPropagator
	GPU implementation of Verlet propagator.

class	integrator
	Common functionality and skeleton for body-pair-per-thread integrators.

class	generic
	Generic integrator for rapid creation of new integrators.

class	GravitationAcc
	Templatized class to calculate acceleration and jerk in parallel.

class	GravitationAccJerk
	Templatized class working as a function object to calculate acceleration and jerk in parallel.

struct	GravitationAccScalars
	Unit type of the acceleration pairs shared array.

struct	GravitationAccJerkScalars
	Unit type of the acceleration and jerk pairs shared array.

class	GravitationAcc_GR
	Templatized class to calculate acceleration and jerk in parallel.

class	GravitationLargeN
	Gravitation calculation class for large number of bodies in a system.

class	GravitationMediumN
	Gravitation calculation for a number of bodies between 10 and 20. EXPERIMENTAL: This class is not thoroughly tested.

Functions
GPUAPI int	sysid ()
	Kernel Helper Function: Extract system ID from CUDA thread ID.

GPUAPI int	sysid_in_block ()
	Kernel Helper Function: Extract system sequence number inside current block.

GPUAPI int	thread_in_system ()
	Kernel Helper Function: Extract the worker-thread number for current system.

GPUAPI int	system_per_block_gpu ()
	Kernel Helper Function: Extract the number of systems per block from CUDA thread information.

GPUAPI int	thread_component_idx (int nbod)
	Kernel Helper Function: Logical coordinate component id [1:x,2:y,3:z] calculated from thread ID info.

GPUAPI int	thread_body_idx (int nbod)
	Kernel Helper Function: Logical body id [0..nbod-1] calculated from thread ID info.

template<class Impl , class T >
GPUAPI void *	system_shared_data_pointer (Impl *integ, T compile_time_param)
	Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.

GENERIC double	inner_product (const double a[3], const double b[3])
	Helper function for calculating inner product.

template<int nbod>
GENERIC int	first (int ij)
	Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first,second), this function returns the first element.

template<int nbod>
GENERIC int	second (int ij)
	Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first,second), this function returns the second element.

Detailed Description

Class of GPU integrators with a thread for each body-pair.

Assigning a thread to each body-pair parallelizes as much of the work as possible when integrating an ensemble. The thread assignment is as follows:

When computing interaction forces (and higher derivatives) between bodies, one thread is assigned to each pair of bodies.
When integrating quantities for bodies individually, one thread is assigned to each coordinate component of each body.
When advancing the time, checking for stop criteria, or setting the time step, only one thread is used.
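The first two assignments above amount to decoding a worker-thread number into either a body pair or a (body, component) tuple. The following host-side C++ sketch shows one way to do that. The function names `pair_from_index` and `body_component_from_index`, the zero-based indexing, and the enumeration order are all assumptions made for illustration; the library's own `first()`, `second()`, `thread_body_idx()` and `thread_component_idx()` may use a different ordering.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of the library): map a zero-based pair
// index ij in [0, nbod*(nbod-1)/2) to a pair (i, j) with 0 <= j < i < nbod.
// Pairs are enumerated row by row: (1,0), (2,0), (2,1), (3,0), ...
inline std::pair<int, int> pair_from_index(int nbod, int ij) {
    assert(nbod >= 2 && ij >= 0 && ij < nbod * (nbod - 1) / 2);
    int i = 1;
    while (ij >= i) {  // row i holds i pairs: (i,0) .. (i,i-1)
        ij -= i;
        ++i;
    }
    return std::make_pair(i, ij);
}

// Hypothetical helper (not part of the library): map a zero-based worker
// index t in [0, 3*nbod) to (body, component). Components 0, 1, 2 stand
// for x, y, z here; this numbering is an assumption of the sketch.
inline std::pair<int, int> body_component_from_index(int nbod, int t) {
    assert(nbod >= 1 && t >= 0 && t < 3 * nbod);
    return std::make_pair(t % nbod, t / nbod);
}
```

With 3 bodies, pair indices 0, 1, 2 decode to the pairs (1,0), (2,0), (2,1), and worker indices 0 through 8 cover every component of every body exactly once.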

We use predicate barriers for each case, since the number of threads that are actually working differs from case to case. For example, with 3 bodies there are 3 pairs and 9 body-components, and 1 thread is needed for overall tasks.
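The thread counts in this example follow from simple arithmetic, sketched below in host-side C++. All names here (`pair_count`, `component_count`, `threads_needed`, `ThreadRoles`, `roles_for_thread`) are hypothetical, and taking the maximum of the two counts as the number of threads per system is an assumption of the sketch, not a statement about what `thread_per_system()` returns.

```cpp
#include <algorithm>
#include <cassert>

// Number of unordered body pairs: n*(n-1)/2.
inline int pair_count(int nbod) { return nbod * (nbod - 1) / 2; }

// Number of coordinate components: 3 per body.
inline int component_count(int nbod) { return 3 * nbod; }

// Assumed sizing rule: enough threads to cover the widest phase.
inline int threads_needed(int nbod) {
    return std::max(pair_count(nbod), component_count(nbod));
}

// Predicates telling a worker thread which phases it takes part in.
struct ThreadRoles {
    bool pair_worker;       // active while computing pairwise forces
    bool component_worker;  // active while integrating body components
    bool leader;            // active for time advance and stop checks
};

inline ThreadRoles roles_for_thread(int nbod, int t) {
    assert(t >= 0 && t < threads_needed(nbod));
    ThreadRoles r;
    r.pair_worker = t < pair_count(nbod);
    r.component_worker = t < component_count(nbod);
    r.leader = (t == 0);
    return r;
}
```

For 3 bodies this gives 3 pair workers, 9 component workers and 1 leader, so 9 threads cover every phase. For 8 bodies the pair phase dominates (28 pairs against 24 components).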

For better coalesced reads from global and shared memory, the block is structured in a non-traditional way: the innermost (x) component of the thread index is part of the system ID, and the other component identifies the body-pair.
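A host-side C++ sketch of this layout follows. `Dim2` is a hypothetical stand-in for CUDA's `threadIdx` and `blockDim`, and the `sketch_` functions are illustrative guesses at how the x and y components could be interpreted; they are not the library's definitions of `sysid()`, `sysid_in_block()` or `thread_in_system()`.

```cpp
#include <cassert>

// Hypothetical stand-in for the x and y fields of CUDA's threadIdx/blockDim.
struct Dim2 {
    int x;
    int y;
};

// Assumption: the x component selects the system within the block.
inline int sketch_sysid_in_block(Dim2 thread_idx) { return thread_idx.x; }

// Assumption: the y component selects the worker (body-pair) thread.
inline int sketch_thread_in_system(Dim2 thread_idx) { return thread_idx.y; }

// Assumption: global system ID from block number and x component.
inline int sketch_sysid(int block_idx_x, Dim2 block_dim, Dim2 thread_idx) {
    return block_idx_x * block_dim.x + thread_idx.x;
}

// CUDA linearizes a 2D block as x + y * blockDim.x, so threads with
// consecutive linear IDs share y (same body-pair) and differ in x (system).
inline int linear_thread_id(Dim2 block_dim, Dim2 thread_idx) {
    assert(thread_idx.x >= 0 && thread_idx.x < block_dim.x);
    assert(thread_idx.y >= 0 && thread_idx.y < block_dim.y);
    return thread_idx.y * block_dim.x + thread_idx.x;
}
```

Under these assumptions, neighboring threads in a warp handle the same body-pair for neighboring systems, which is what lets their memory accesses fall on adjacent addresses.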

Global functions defined here are used inside kernels for consistent interpretation of thread and shared memory references.

The integrators that derive from this may override three functions: thread_per_system(), shmem_per_sytem(),

Function Documentation

template<class Impl , class T >
GPUAPI void* swarm::gpu::bppt::system_shared_data_pointer (Impl *integ, T compile_time_param)

Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.

This function assumes that the memory is used through CoalescedStructArray with a chunk size of SHMEM_CHUNK_SIZE. This uses overlapping data structures to provide coalescing for shared memory.
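One plausible reading of "overlapping data structures" is an interleaved layout in which the systems of a chunk store their values side by side. The host-side C++ sketch below computes byte offsets under that reading. The constant `kChunkSize` (an illustrative value, not the library's `SHMEM_CHUNK_SIZE`), the function names, and the layout itself are assumptions; the actual computation is in bppt.hpp.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative chunk size only; the library's SHMEM_CHUNK_SIZE may differ.
constexpr std::size_t kChunkSize = 16;

// Hypothetical: byte offset at which the data of system s (its index
// within the block) begins. Systems in the same chunk start one double
// apart, so their structures overlap in address range.
inline std::size_t system_base_offset(std::size_t s,
                                      std::size_t bytes_per_system) {
    std::size_t chunk = s / kChunkSize;  // which chunk the system is in
    std::size_t lane = s % kChunkSize;   // position inside that chunk
    return chunk * kChunkSize * bytes_per_system + lane * sizeof(double);
}

// Hypothetical: byte offset of the k-th double of system s. Consecutive
// values of one system are kChunkSize doubles apart, so the k-th values
// of neighboring systems sit at adjacent addresses.
inline std::size_t element_offset(std::size_t s, std::size_t k,
                                  std::size_t bytes_per_system) {
    assert(k * sizeof(double) < bytes_per_system);
    return system_base_offset(s, bytes_per_system) +
           k * kChunkSize * sizeof(double);
}
```

In this layout, when the threads of neighboring systems all read their k-th value, the addresses are contiguous within a chunk.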

Definition at line 122 of file bppt.hpp.

References sysid_in_block().

Referenced by swarm::gpu::bppt::hermite< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::rkck< AdaptationStyle, Monitor, Gravitation >::kernel(), TutorialIntegrator< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::hermite_adap< Monitor, Gravitation >::kernel(), and swarm::gpu::bppt::generic< Propagator, Monitor, Gravitation >::kernel().
