Level 1 API

Tensors

static int ccv_nnc_tensor_nd(const int dim[CCV_NNC_MAX_DIM_ALLOC])

Count the dimensionality of a tensor.

ccv_nnc_tensor_new(const void *const ptr, const ccv_nnc_tensor_param_t params, const int flags)

Create a new tensor.

Parameters
  • ptr: If 0, NNC will allocate the tensor memory itself. Otherwise, the tensor will use the memory region referenced by ‘ptr’.
  • params: Tensor parameters.
  • flags: Reserved flags for the allocation.
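
For instance, a minimal sketch of letting NNC allocate a small float32 CPU tensor and then freeing it (the CPU_TENSOR_NHWC parameter macro and the data.f32 accessor are assumptions about the helper API, not documented in this section):

  #include <ccv_nnc.h>

  /* ptr == 0, so NNC allocates the backing memory itself. */
  ccv_nnc_tensor_t* const a = ccv_nnc_tensor_new(0, CPU_TENSOR_NHWC(32F, 2, 3, 4), 0);
  a->data.f32[0] = 1.0f; /* raw element access through the data union */
  ccv_nnc_tensor_free(a);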

ccv_nnc_tensor(const void *const ptr, const ccv_nnc_tensor_param_t params, const int flags)

Create a new tensor on stack.

Parameters
  • ptr: If 0, NNC will allocate the tensor memory itself. Otherwise, the tensor will use the memory region referenced by ‘ptr’.
  • params: Tensor parameters.
  • flags: Reserved flags for the allocation.

int ccv_nnc_tensor_pin_memory(ccv_nnc_tensor_t *const tensor)

Pin the tensor memory for faster access on GPU.

Return
0 for success.
Parameters
  • tensor: The tensor whose memory we want to pin.

void ccv_nnc_tensor_free(ccv_nnc_tensor_t *const tensor)

Free a tensor object.

Parameters
  • tensor: The tensor to be freed.

ccv_nnc_tensor_view_new(const ccv_nnc_tensor_t *const tensor, const int dim[CCV_NNC_MAX_DIM_ALLOC], const int ofs[CCV_NNC_MAX_DIM_ALLOC], const int inc[CCV_NNC_MAX_DIM_ALLOC])

Create a tensor view. A tensor view can be non-contiguous. Essentially, it provides a view into a tensor.

Parameters
  • tensor: The tensor that we want to view into.
  • dim: The new dimension of the tensor view.
  • ofs: The offset on each dimension.
  • inc: The line size of each dimension.
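
For illustration, a sketch that views the 2x2 region starting at offset (1, 1) inside a 4x4 tensor (CPU_TENSOR_NHWC is the same assumed helper as above; dim / ofs / inc follow the parameter descriptions):

  ccv_nnc_tensor_t* const t = ccv_nnc_tensor_new(0, CPU_TENSOR_NHWC(32F, 4, 4), 0);
  const int dim[CCV_NNC_MAX_DIM_ALLOC] = {2, 2}; /* the view is 2x2 */
  const int ofs[CCV_NNC_MAX_DIM_ALLOC] = {1, 1}; /* starting at row 1, column 1 */
  const int inc[CCV_NNC_MAX_DIM_ALLOC] = {4, 4}; /* line sizes of the underlying 4x4 tensor */
  ccv_nnc_tensor_view_t* const tv = ccv_nnc_tensor_view_new(t, dim, ofs, inc);
  /* ... use tv as a non-contiguous tensor wherever a tensor is accepted ... */
  ccv_nnc_tensor_view_free(tv);
  ccv_nnc_tensor_free(t);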

ccv_nnc_tensor_view(const ccv_nnc_tensor_t *const tensor, const int dim[CCV_NNC_MAX_DIM_ALLOC], const int ofs[CCV_NNC_MAX_DIM_ALLOC], const int inc[CCV_NNC_MAX_DIM_ALLOC])

Create a tensor view on stack.

Parameters
  • tensor: The tensor that we want to view into.
  • dim: The new dimension of the tensor view.
  • ofs: The offset on each dimension.
  • inc: The line size of each dimension.

void ccv_nnc_tensor_view_free(ccv_nnc_tensor_view_t *const tensor_view)

Free a tensor view object.

Parameters
  • tensor_view: The tensor view to be freed.

void ccv_nnc_tensor_zero(void *const tensor)

Zero out a given tensor.

Parameters
  • tensor: The tensor to be zeroed out.

ccv_nnc_tensor_eq(const ccv_nnc_tensor_t *const a, const ccv_nnc_tensor_t *const b)

Compare whether two tensors are equal. This tolerates some floating point error, following http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

Return
0 if equal, -1 otherwise.
Parameters
  • a: Tensor a.
  • b: Tensor b.
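
For illustration, zeroing two freshly allocated tensors and then comparing them (same CPU_TENSOR_NHWC assumption as above):

  ccv_nnc_tensor_t* const a = ccv_nnc_tensor_new(0, CPU_TENSOR_NHWC(32F, 10), 0);
  ccv_nnc_tensor_t* const b = ccv_nnc_tensor_new(0, CPU_TENSOR_NHWC(32F, 10), 0);
  ccv_nnc_tensor_zero(a);
  ccv_nnc_tensor_zero(b);
  if (ccv_nnc_tensor_eq(a, b) == 0) {
    /* The two tensors hold the same values, within floating point tolerance. */
  }
  ccv_nnc_tensor_free(a);
  ccv_nnc_tensor_free(b);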

Commands

enum [anonymous]::__anonymous81

Values:

CCV_NNC_CMD_ATTR_PASSTHROUGH = 0x01

This doesn’t compute anything, but passes the first n tensors to the output (useful for a backprop pass that is an identity).

CCV_NNC_CMD_ATTR_OUTPUT_ONES = 0x02

All the output tensors are 1s (unit).

CCV_NNC_CMD_ATTR_NULL_IS_ONES = 0x04

Accept nullptr inputs as if they were tensors filled with 1s (unit).

enum [anonymous]::__anonymous82

Values:

CCV_NNC_ACCUMULATE_OUTPUT = 0x01

Enable accumulating into outputs (unsupported).

CCV_NNC_ZERO_MEMORY_ALLOC = 0x02

Don’t allocate any extra memory for this operation.

enum [anonymous]::__anonymous83

Values:

CCV_NNC_EXEC_SUCCESS = 0

Successfully executed the command.

CCV_NNC_EXEC_INVALID = -1

Invalid inputs.

CCV_NNC_EXEC_NO_KERNEL = -2

No kernel available for a given command / backend.

CCV_NNC_EXEC_OOM = -3

Out of memory error.

typedef struct ccv_nnc_stream_context_s ccv_nnc_stream_context_t

Opaque pointer to a stream object.

typedef struct ccv_nnc_cmd_s ccv_nnc_cmd_t
typedef int (*ccv_nnc_cmd_exec_f)(const ccv_nnc_cmd_t cmd, const ccv_nnc_hint_t hint, const int flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size, ccv_nnc_stream_context_t *const stream_context)

For forward functions, the input tensors and output tensors can be arbitrary. However, for backward functions (backpropagation, or gradient functions in other libs), the layout is fixed. Inputs: 0~m-1 are the gradients for the output tensors, 1~n are the input tensors of the forward function, and n+1~n+m are the output tensors of the forward function. Outputs: 0~n-1 are the output gradients w.r.t. the input tensors. Which input / output tensors can be ignored can be specified in the cmd config structs.
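
For illustration, a hedged skeleton of a backward function with this layout; my_custom_backward is a placeholder name, and the comments paraphrase the ordering described above:

  static int my_custom_backward(const ccv_nnc_cmd_t cmd, const ccv_nnc_hint_t hint, const int flags,
    ccv_nnc_tensor_t* const* const inputs, const int input_size,
    ccv_nnc_tensor_t* const* const outputs, const int output_size,
    ccv_nnc_stream_context_t* const stream_context)
  {
    /* inputs: first the gradients for the forward outputs, then the forward
     * function's input tensors, then the forward function's output tensors.
     * outputs: the gradients w.r.t. the forward function's input tensors.
     * Entries that the command declares ignorable may be 0. */
    return CCV_NNC_EXEC_SUCCESS;
  }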

typedef int (*ccv_nnc_cmd_autotune_f)(const ccv_nnc_cmd_t cmd, const size_t max_workspace_size, const ccv_nnc_hint_t hint, const int flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size, ccv_nnc_stream_context_t *const stream_context)

The function prototype for autotune. The only difference from ccv_nnc_cmd_exec_f is the max_workspace_size parameter. Implementing this function prototype means the autotune task is handed over to the command itself; you are responsible for selecting the best algorithm.

Return
The selected algorithm.

uint64_t ccv_nnc_cmd_mono_time(void)

Return a high precision time unit. What this time unit represents is platform specific.

Return
A monotonic increasing 64-bit integer w.r.t. passing of time.

ccv_nnc_cmd_name(const uint32_t cmd)

Return UTF-8 encoded name of a given command.

Return
A UTF-8 string (pointing to a static constant).

ccv_nnc_cmd_backend_name(const uint32_t backend)

Return UTF-8 encoded name of a given backend.

Return
A UTF-8 string (pointing to a static constant).

ccv_nnc_cmd_ok(const uint32_t cmd, const uint32_t backend)

Check whether a given backend is available for a given command.

Return
1 if it is available.

ccv_nnc_cmd(const uint32_t cmd, ccv_nnc_cmd_exec_f exec, const ccv_nnc_cmd_param_t params, const int flags)

Create a wrapped command with parameters.

Return
A wrapped ccv_nnc_cmd_t structure.
Parameters
  • cmd: The command identifier.
  • exec: If this is a CCV_NNC_CUSTOM_FORWARD / CCV_NNC_CUSTOM_BACKWARD command, this supplies the custom function.
  • params: The parameters for the command.
  • flags: A reserved field for flags.
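
For example, a hedged sketch of wrapping a custom forward function; my_forward and make_custom_cmd are placeholder names, and the zeroed parameter struct stands in for whatever parameters a real command would need:

  #include <string.h>

  static int my_forward(const ccv_nnc_cmd_t cmd, const ccv_nnc_hint_t hint, const int flags,
    ccv_nnc_tensor_t* const* const inputs, const int input_size,
    ccv_nnc_tensor_t* const* const outputs, const int output_size,
    ccv_nnc_stream_context_t* const stream_context)
  {
    if (input_size < 1 || output_size < 1)
      return CCV_NNC_EXEC_INVALID;
    outputs[0]->data.f32[0] = inputs[0]->data.f32[0]; /* trivial placeholder computation */
    return CCV_NNC_EXEC_SUCCESS;
  }

  static ccv_nnc_cmd_t make_custom_cmd(void)
  {
    ccv_nnc_cmd_param_t params;
    memset(&params, 0, sizeof(params)); /* no command parameters needed for this sketch */
    return ccv_nnc_cmd(CCV_NNC_CUSTOM_FORWARD, my_forward, params, 0);
  }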

ccv_nnc_hint_verify(const ccv_nnc_hint_t hint, const ccv_nnc_cmd_param_t cmd, const ccv_nnc_tensor_param_t a, const ccv_nnc_tensor_param_t b)

Verify whether a hint is compatible with a given command and a given input tensor parameters / output tensor parameters.

Return
1 if it passes.
Parameters
  • hint: The hint for a given command. Hint defines things such as paddings, strides etc. for a given command.
  • cmd: The wrapped command.
  • a: The input tensor parameters.
  • b: The output tensor parameters.

ccv_nnc_hint_auto(const ccv_nnc_cmd_param_t cmd, const ccv_nnc_tensor_param_t a, const ccv_nnc_tensor_param_t b)

Automatically find the best hint for a given input / output (on forward pass only).

Return
Best hint we can guess.
Parameters
  • cmd: The wrapped command.
  • a: The input tensor parameters.
  • b: The output tensor parameters.

void ccv_nnc_hint_tensor_auto(const ccv_nnc_cmd_t cmd, const ccv_nnc_tensor_param_t *const inputs, const int input_size, const ccv_nnc_hint_t hint, ccv_nnc_tensor_param_t *const outputs, const int output_size)

Automatically find the outputs for the given inputs / hint.

Parameters
  • cmd: The wrapped command.
  • inputs: An array of input tensor parameters.
  • input_size: The size of input array.
  • hint: The hint for the given command.
  • outputs: An array for the output tensor parameters.
  • output_size: The size of the output array.
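
A sketch of letting a command infer its output shape before allocating the output tensor; cmd is a wrapped command such as the one built above, and ccv_nnc_no_hint is assumed to be the library's empty-hint constant:

  ccv_nnc_tensor_param_t input_params = CPU_TENSOR_NHWC(32F, 8, 8, 3);
  ccv_nnc_tensor_param_t output_params;
  /* Infer the output tensor parameters from the inputs and the hint. */
  ccv_nnc_hint_tensor_auto(cmd, &input_params, 1, ccv_nnc_no_hint, &output_params, 1);
  ccv_nnc_tensor_t* const b = ccv_nnc_tensor_new(0, output_params, 0);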

ccv_nnc_cmd_find_backend(const ccv_nnc_cmd_t cmd, const int tensor_memory, const int tensor_formats, const int tensor_datatypes)

Find a suitable backend for a given command and tensor settings.

Return
The backend identifier for the selected backend.
Parameters
  • cmd: The wrapped command.
  • tensor_memory: The tensor memory setup (whether it is CPU or GPU).
  • tensor_formats: The tensor layout format (NCHW, NHWC, CHWN etc.)
  • tensor_datatypes: The datatype of a given tensor (FP32 etc.)

ccv_nnc_cmd_autotune(const ccv_nnc_cmd_t cmd, const size_t max_workspace_size, const ccv_nnc_hint_t hint, const int flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size, ccv_nnc_stream_context_t *const stream_context)

Run autotune to find the best kernel and configuration for the given input.

Return
The modified cmd that contains the updated configuration.
Parameters
  • cmd: The original wrapped command.
  • max_workspace_size: The maximum memory allowed for this command to execute.
  • hint: The hint for the given command.
  • flags: The reserved field for flags.
  • inputs: An array of input tensors.
  • input_size: The size of input array.
  • outputs: An array of output tensors.
  • output_size: The size of output array.
  • stream_context: The stream on which to run the autotune. 0 uses the default stream.
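
A minimal sketch of autotuning on the default stream before execution; the 512MiB workspace limit is an arbitrary illustrative choice, and a, b, cmd and ccv_nnc_no_hint refer to the assumptions in the earlier sketches:

  ccv_nnc_tensor_t* inputs[] = {a};
  ccv_nnc_tensor_t* outputs[] = {b};
  /* Returns a copy of cmd with the best backend / algorithm selection filled in. */
  cmd = ccv_nnc_cmd_autotune(cmd, 512 * 1024 * 1024, ccv_nnc_no_hint, 0, inputs, 1, outputs, 1, 0);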

ccv_nnc_cmd_bitmask(const ccv_nnc_cmd_t cmd, const int input_size, const int output_size, const uint64_t *const input_bitmasks, const int input_bitmask_size, const uint64_t *const output_bitmasks, const int output_bitmask_size)

Check whether a given tensor input / output pattern can be computed by the given command. The bitmasks encode whether an input / output tensor is available at a given position.

Return
1 if the command can be executed with the given input / output pattern.
Parameters
  • cmd: The wrapped command to check.
  • input_size: The intended size of the input tensor array.
  • output_size: The intended size of the output tensor array.
  • input_bitmasks: The input tensor array encoding in bitmap, 0: no tensor, 1: has a tensor.
  • input_bitmask_size: The size of the input bitmask array.
  • output_bitmasks: The output tensor array encoding in bitmap.
  • output_bitmask_size: The size of the output bitmask array.

int ccv_nnc_cmd_exec(const ccv_nnc_cmd_t cmd, const ccv_nnc_hint_t hint, const int flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size, ccv_nnc_stream_context_t *const stream_context)

Execute a given command.

Return
CCV_NNC_EXEC_SUCCESS if succeed.
Parameters
  • cmd: The wrapped command to be executed.
  • hint: The hint provided for the command.
  • flags: A reserved field for flags.
  • inputs: The input tensor array.
  • input_size: The size of input tensor array.
  • outputs: The output tensor array.
  • output_size: The size of output tensor array.
  • stream_context: The stream which the command will be executed upon.
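
For example, executing the (possibly autotuned) command from the sketches above on the default stream (stream_context == 0):

  const int status = ccv_nnc_cmd_exec(cmd, ccv_nnc_no_hint, 0, inputs, 1, outputs, 1, 0);
  if (status != CCV_NNC_EXEC_SUCCESS) {
    /* Handle CCV_NNC_EXEC_INVALID / CCV_NNC_EXEC_NO_KERNEL / CCV_NNC_EXEC_OOM here. */
  }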

ccv_nnc_cmd_is_forward(const ccv_nnc_cmd_t cmd)

Check whether the command is a forward pass or not.

Return
1 if it is a forward pass.
Parameters
  • cmd: The wrapped command.

ccv_nnc_cmd_is_backward(const ccv_nnc_cmd_t cmd)

Check whether the command is a backward pass or not.

Return
1 if it is a backward pass.
Parameters
  • cmd: The wrapped command.

ccv_nnc_cmd_attr(const ccv_nnc_cmd_t cmd, const int flags)

Check this command against listed attributes.

Return
1 if the flag is supported by the command.
Parameters
  • cmd: The wrapped command.
  • flags: The flags to check against the command (unsupported).

ccv_nnc_cmd_allow_inplace(const ccv_nnc_cmd_t cmd, const int input_idx, const int input_size, const int output_idx, const int output_size)

Check whether this command allows an in-place operation between a particular input and output (indexed from 0).

Return
1 if the input tensor can be used as the output tensor.
Parameters
  • cmd: The wrapped command.
  • input_idx: The index of the input tensor we want to check.
  • input_size: The total number of inputs.
  • output_idx: The index of the output tensor we want to check.
  • output_size: The total number of outputs.

ccv_nnc_cmd_enforce_inplace(const ccv_nnc_cmd_t cmd, const int input_idx, const int input_size, const int output_idx, const int output_size)

Check whether this command needs to enforce an in-place operation between a particular input and output (indexed from 0).

Return
1 if the input tensor is required to be used as the output tensor.
Parameters
  • cmd: The wrapped command.
  • input_idx: The index of the input tensor we want to check.
  • input_size: The total number of inputs.
  • output_idx: The index of the output tensor we want to check.
  • output_size: The total number of outputs.

struct ccv_nnc_cmd_param_t
#include <ccv_nnc.h>

Parameters for command.

Public Members

int dim[CCV_NNC_MAX_DIM_ALLOC]

[size.dim] The window size for the layer. For a fully connected layer, it is 1 because a fully connected layer is a 1x1 convolutional layer with count filters.

int count

[convolution.count] The number of filters for convolutional layer.

[bnorm.count] The number of axis selected.

[blas.count] The number of outputs for blas layer.

[reduce.count] The number of axis selected.

int groups

[convolution.groups] The number of groups for convolutional layer.

int reserved

[pool.reserved] A reserved field.

float kappa

[rnorm.kappa] As of b[i] = a[i] / (rnorm.kappa + rnorm.alpha * sum(a, i - rnorm.size / 2, i + rnorm.size / 2)) ^ rnorm.beta

float alpha

[rnorm.alpha] See **rnorm.kappa**.

float beta

[rnorm.beta] See **rnorm.kappa**.

int axis[CCV_NNC_MAX_DIM_ALLOC]

[bnorm.axis[]] The axis selected to compute mean / variance.

[reduce.axis[]] The axis selected to reduce.

float epsilon

[bnorm.epsilon] The epsilon for the standard deviation.

int is_test

[bnorm.is_test] Whether in test mode.

float momentum

[bnorm.momentum] running_mean = running_mean * momentum + mean * (1 - momentum).

[minimize.momentum] For SGD, this follows http://www.cs.toronto.edu/%7Ehinton/absps/momentum.pdf.

float rate

[minimize.rate] The learning rate.

float decay

[minimize.decay] This is the weight decay parameter, which represents L2 regularization after momentum applied.

float dampening

[minimize.dampening] This is usually equal to momentum; however, it can be changed.

float a[3]

[blas.a[3]] BLAS scalars.

float p

[dropout.p] Dropout probability.

struct ccv_nnc_hint_t

Public Members

int dim[CCV_NNC_MAX_DIM_ALLOC]

Stride for each dimension.

int begin[CCV_NNC_MAX_DIM_ALLOC]

Padding at the beginning of a dimension.

int end[CCV_NNC_MAX_DIM_ALLOC]

Padding at the end of a dimension.

struct ccv_nnc_cmd_s

Public Members

uint32_t cmd

The identifier for the command.

uint32_t backend

The identifier for the backend.

int algorithm

The algorithm selector (as defined by backend).

ccv_nnc_cmd_param_t info

The command parameters.

int (*exec)(const struct ccv_nnc_cmd_s cmd, const ccv_nnc_hint_t hint, const int flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size, ccv_nnc_stream_context_t *const stream_context)

This has to be the same as the ccv_nnc_cmd_exec_f type. This is for the CCV_NNC_CUSTOM_FORWARD / CCV_NNC_CUSTOM_BACKWARD commands.

Streams

enum [anonymous]::__anonymous84

Values:

CCV_STREAM_CONTEXT_CPU = 0x1

A CPU based stream context (unsupported).

CCV_STREAM_CONTEXT_GPU = 0x2

A GPU based stream context.

typedef struct ccv_nnc_stream_signal_s ccv_nnc_stream_signal_t

Opaque pointer to the signal object.

typedef ccv_nnc_stream_context_t *(*ccv_nnc_stream_context_neighbor_discovery_f)(const int device_id, void *const context)

The neighbor discovery function that will be called with the device id.

ccv_nnc_stream_context_new(const int type)

Create a new stream context.

Return
The newly created stream context.
Parameters
  • type: A combination of CPU / GPU and DEVICE_ID.
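
A sketch that composes the type with the macros listed at the end of this section to target GPU device 0 (assuming such a device exists and that CCV_STREAM_SET_DEVICE_ID modifies the type in place):

  int type = CCV_STREAM_CONTEXT_GPU;
  CCV_STREAM_SET_DEVICE_ID(type, 0); /* bind the stream to GPU device 0 */
  ccv_nnc_stream_context_t* const stream = ccv_nnc_stream_context_new(type);
  /* ... submit commands / graph runs on this stream ... */
  ccv_nnc_stream_context_wait(stream); /* block until all submitted work completes */
  ccv_nnc_stream_context_free(stream);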

ccv_nnc_stream_context_type(const ccv_nnc_stream_context_t *const stream_context)

Get the type of the stream context.

Return
The type of the stream context.
Parameters
  • stream_context: The stream context we want to inspect.

ccv_nnc_stream_context_get_workspace(ccv_nnc_stream_context_t *const stream_context, const size_t workspace_size, const int mem)

Get the stream context's local workspace memory. This memory region will be reused the next time you call this method on the same stream context.

Return
A pointer to the workspace memory.
Parameters
  • stream_context: The stream context which provides the workspace memory.
  • workspace_size: The size of the workspace memory.
  • mem: The memory type of the said workspace memory (GPU or CPU).

void ccv_nnc_stream_context_drain(ccv_nnc_stream_context_t *const stream)

Deallocate any workspace memory on the stream context.

Parameters
  • stream: The stream context to drain workspace memory.

void ccv_nnc_stream_context_wait(const ccv_nnc_stream_context_t *const stream)

Wait until all tasks submitted (command, graph run etc.) on the stream context have completed.

Parameters
  • stream: The stream context to wait.

void ccv_nnc_stream_context_free(ccv_nnc_stream_context_t *const stream_context)

Deallocate the stream context.

Parameters
  • stream_context: The stream context to be destroyed.

ccv_nnc_stream_signal_new(const int type)

Create a new stream signal.

Return
The newly created stream signal.
Parameters
  • type: A composed type that denotes whether the signal is associated with a GPU or CPU stream context, and on which device.

void ccv_nnc_stream_context_emit_signal(ccv_nnc_stream_context_t *const stream, ccv_nnc_stream_signal_t *const signal)

Emit a signal on a stream.

Parameters
  • stream: The stream context where the signal will be emitted.
  • signal: The signal to be emitted. It has to be on the same device as the stream.

void ccv_nnc_stream_context_wait_signal(const ccv_nnc_stream_context_t *const stream, const ccv_nnc_stream_signal_t *const signal)

Wait for a signal on a stream.

Parameters
  • stream: The stream context that will be blocked by the signal.
  • signal: The signal to be waited. It can be on a different device of the stream.
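
A hedged sketch of ordering work across two stream contexts with a signal; stream_a, stream_b and type are assumed to come from the earlier stream sketch:

  ccv_nnc_stream_signal_t* const signal = ccv_nnc_stream_signal_new(type);
  /* Everything submitted to stream_a before this call completes before the signal fires. */
  ccv_nnc_stream_context_emit_signal(stream_a, signal);
  /* Work submitted to stream_b after this call waits for the signal before running. */
  ccv_nnc_stream_context_wait_signal(stream_b, signal);
  ccv_nnc_stream_signal_free(signal);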

ccv_nnc_stream_signal_get_emitter(const ccv_nnc_stream_signal_t *const signal)

Get the stream context on which this signal is going to be emitted.

Return
The most recent stream context you called ccv_nnc_stream_context_emit_signal with.
Parameters
  • signal: The signal we want to inspect.

void ccv_nnc_stream_signal_free(ccv_nnc_stream_signal_t *const signal)

Deallocate the signal.

Parameters
  • signal: The signal to be destroyed.

ccv_nnc_device_count(const int type)

Return number of devices.

Return
The number of devices.
Parameters
  • type: The type of devices (CCV_NNC_STREAM_CONTEXT_GPU / CCV_NNC_STREAM_CONTEXT_CPU)

ccv_nnc_device_remap(const int type, const int source, const int destination)

Remap a source device as the destination device.

Return
0 if the device remap is successful, -1 if it is not.
Parameters
  • type: The type of devices (CCV_NNC_STREAM_CONTEXT_GPU / CCV_NNC_STREAM_CONTEXT_CPU)
  • source: The original device id.
  • destination: The new device id.

void ccv_nnc_stream_context_set_neighbor_discovery(ccv_nnc_stream_context_t *const stream_context, ccv_nnc_stream_context_neighbor_discovery_f discovery, void *const context)

Set the neighbor stream context discovery mechanism. This method exposes how a neighbor should be defined per stream context. It is useful for commands that operate across devices and need to find the correct stream context for those devices. A stream context itself is bound to one device only.

Parameters
  • stream_context: The stream context to bind the discovery mechanism to.
  • discovery: The neighbor discovery function to invoke.
  • context: The associated context with the neighbor discovery function.
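
For illustration, a minimal discovery callback matching the typedef above; my_discovery and the per-device array are placeholders for whatever bookkeeping the application keeps:

  static ccv_nnc_stream_context_t* my_discovery(const int device_id, void* const context)
  {
    /* 'context' points to an application-owned array of per-device stream contexts. */
    ccv_nnc_stream_context_t** const per_device = (ccv_nnc_stream_context_t**)context;
    return per_device[device_id];
  }

  /* Register the callback on a stream context: */
  ccv_nnc_stream_context_set_neighbor_discovery(stream, my_discovery, per_device_streams);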

ccv_nnc_stream_context_find_neighbor(ccv_nnc_stream_context_t *const stream_context, const int device_id)

Find a neighbor stream context on a given device id for current stream context.

Return
0 if no stream context is found. Otherwise, the stream context on that device.
Parameters
  • stream_context: The stream context for which we will look for neighbors.
  • device_id: On which device the stream context may exist.

CCV_STREAM_GET_CONTEXT(type)
CCV_STREAM_GET_DEVICE(type)
CCV_STREAM_GET_DEVICE_ID(type)
CCV_STREAM_SET_DEVICE_ID(type, device_id)