Swarm-NG
1.1
Defines routines to allocate memory using different APIs and copy between them.
Classes
struct DefaultAllocator< T >
Default allocator that uses C++ new/delete. This class uses standard C++ routines for allocation and memory manipulation: new[], delete[] and std::copy.
struct DeviceAllocator< T >
CUDA device memory allocator that uses cudaMalloc, cudaMemcpy and cudaFree. It creates a pointer that is allocated on the device. The pointer cannot be dereferenced by the calling host code and should only be passed to a CUDA kernel. The copy uses cudaMemcpy to transfer data between two device arrays.
struct HostAllocator< T >
CUDA host memory allocator that uses cudaMallocHost, cudaMemcpy and cudaFreeHost. The host memory allocator is similar to malloc: the pointers point to memory that can be used by C++ code. However, the CUDA documentation states that copying to device memory from a CUDA-allocated host array is faster than copying from memory allocated using malloc.
struct MappedHostAllocator< T >
CUDA host memory allocator, similar to HostAllocator, that uses device-mapped memory. Mapped memory is accessible on both host and device. However, the host and device pointers are different, which complicates its use. According to the CUDA manual, version 4.0 of the CUDA SDK uses unified pointers, so there is no need to map the pointer. In that case, the pointer obtained using this allocator can be passed to a kernel.
Functions
template<class A, class T> void alloc_copy (A, A, T *begin, T *end, T *dst)
Simple copy between arrays of the same allocator. Uses the copy method of the allocator.
template<class T> void alloc_copy (DefaultAllocator< T >, DeviceAllocator< T >, T *begin, T *end, T *dst)
Copy from host memory to device memory.
template<class T> void alloc_copy (DeviceAllocator< T >, DefaultAllocator< T >, T *begin, T *end, T *dst)
Copy from device memory to host memory.
The classes provide a generalized interface for different memory allocators, similar to what the C++ standard library uses. The difference is that device memory allocators allocate memory that is not accessible to the CPU, so in addition to allocation we need copy routines between allocators.
Every allocator provides three basic actions: allocating memory, freeing it, and copying within its own memory space.
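As a rough illustration of what three such actions could look like, here is a minimal sketch modelled on the description of DefaultAllocator (new[], delete[] and std::copy). The struct name `SketchAllocator`, the static method names `alloc` and `free`, and all signatures are assumptions made for this example. This page only confirms that an allocator has a copy method.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for an allocator with three basic actions.
// Names and signatures are illustrative, not taken from allocators.hpp.
template <class T>
struct SketchAllocator {
    // Allocate an array of n elements with new[].
    static T* alloc(std::size_t n) { return new T[n]; }
    // Release an array previously returned by alloc().
    static void free(T* p) { delete[] p; }
    // Copy [begin, end) to dst; both ranges live in the same memory space.
    static void copy(T* begin, T* end, T* dst) { std::copy(begin, end, dst); }
};

// Allocates two arrays, copies one into the other and returns the sum
// of the copied elements.
inline int sketch_roundtrip_sum() {
    typedef SketchAllocator<int> A;
    int* src = A::alloc(4);
    int* dst = A::alloc(4);
    for (int i = 0; i < 4; ++i) src[i] = i + 1;  // 1, 2, 3, 4
    A::copy(src, src + 4, dst);
    int sum = 0;
    for (int i = 0; i < 4; ++i) sum += dst[i];
    A::free(src);
    A::free(dst);
    return sum;
}
```

A device-side allocator would presumably keep the same shape and swap the bodies for the CUDA calls named in the class list (cudaMalloc, cudaFree, cudaMemcpy).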
The copy routine inside an allocator only copies within the same memory domain. For memory transfers between domains, use the function template alloc_copy, which is overloaded for different combinations of source and destination allocators so that the appropriate CUDA call is used for the copy.
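The way alloc_copy selects a copy routine from the allocator types can be sketched in plain C++ without CUDA. In the sketch below, `HostSketch`, `FakeDeviceSketch`, `transfer_count` and `sketch_transfer_roundtrip` are hypothetical names, and the "device" is simulated with ordinary heap memory so that the example runs without a GPU. In real code the two cross-domain overloads would presumably call cudaMemcpy with cudaMemcpyHostToDevice or cudaMemcpyDeviceToHost instead of std::copy.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical allocator tags. Both use heap memory so the sketch runs
// without a GPU; a real device allocator would wrap cudaMalloc/cudaFree.
template <class T>
struct HostSketch {
    static T* alloc(std::size_t n) { return new T[n]; }
    static void free(T* p) { delete[] p; }
    static void copy(T* begin, T* end, T* dst) { std::copy(begin, end, dst); }
};

template <class T>
struct FakeDeviceSketch {
    static T* alloc(std::size_t n) { return new T[n]; }
    static void free(T* p) { delete[] p; }
    static void copy(T* begin, T* end, T* dst) { std::copy(begin, end, dst); }
};

// Counts cross-domain transfers (for illustration only).
inline int& transfer_count() { static int n = 0; return n; }

// Same allocator on both sides: defer to the allocator's own copy.
template <class A, class T>
void alloc_copy(A, A, T* begin, T* end, T* dst) { A::copy(begin, end, dst); }

// Host to simulated device. Real code would use cudaMemcpyHostToDevice.
template <class T>
void alloc_copy(HostSketch<T>, FakeDeviceSketch<T>, T* begin, T* end, T* dst) {
    ++transfer_count();
    std::copy(begin, end, dst);
}

// Simulated device to host. Real code would use cudaMemcpyDeviceToHost.
template <class T>
void alloc_copy(FakeDeviceSketch<T>, HostSketch<T>, T* begin, T* end, T* dst) {
    ++transfer_count();
    std::copy(begin, end, dst);
}

// Host -> "device" -> host round trip; returns true if the data survives.
inline bool sketch_transfer_roundtrip() {
    typedef HostSketch<int> H;
    typedef FakeDeviceSketch<int> D;
    int* host_in = H::alloc(3);
    int* dev = D::alloc(3);
    int* host_out = H::alloc(3);
    host_in[0] = 7; host_in[1] = 8; host_in[2] = 9;
    alloc_copy(H(), D(), host_in, host_in + 3, dev);  // host to device
    alloc_copy(D(), H(), dev, dev + 3, host_out);     // device to host
    bool ok = std::equal(host_in, host_in + 3, host_out);
    H::free(host_in);
    D::free(dev);
    H::free(host_out);
    return ok;
}
```

Passing the allocators as empty tag objects lets the compiler pick the overload at compile time: when the two allocator types differ, deduction of the single parameter `A` fails for the generic version, so only the matching cross-domain overload remains.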
Current allocators are: DefaultAllocator, DeviceAllocator, HostAllocator and MappedHostAllocator.
Definition in file allocators.hpp.