Level 3.5 API

Automatic Differentiation

void ccv_nnc_symbolic_graph_backward(ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_tensor_symbol_t *const f_symbols, const int f_symbol_size, const ccv_nnc_tensor_symbol_t *const wrt_symbols, const int wrt_symbol_size, const ccv_nnc_graph_exec_symbol_t *const sources, const int source_size, const ccv_nnc_graph_exec_symbol_t *const destinations, const int destination_size)

Compute the backward graph, assuming the provided symbolic graph only contains the “forward” part from sources to destinations. Other libraries call this process “autograd” or automatic differentiation (specifically, “reverse AD”). For an expression y = f(x), to compute dx, x is the wrt_symbol and y is the f_symbol.

Parameters
• graph: The symbolic graph.
• f_symbols: The tensor symbols array of the result (or loss).
• f_symbol_size: The size of the f symbols array.
• wrt_symbols: The tensor symbols array of the inputs.
• wrt_symbol_size: The size of the wrt symbols array.
• sources: The source execution nodes array for the computation.
• source_size: The size of the source nodes array.
• destinations: The destination execution nodes array for the computation.
• destination_size: The size of the destination nodes array.
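
As an illustration of what reverse AD computes (not of how the library builds the backward graph), the following self-contained sketch differentiates y = x * x + 3 * x by hand. The forward pass records the intermediate values, and the backward pass walks the same operations in reverse order, starting from dy = 1. All names here (toy_forward_t, toy_forward, toy_backward_dx) are hypothetical and are not part of ccv.

```c
/* Intermediate values recorded by the forward pass of y = x * x + 3 * x. */
typedef struct {
	double x; /* the "wrt" value */
	double a; /* a = x * x */
	double b; /* b = 3 * x */
	double y; /* the "f" value: y = a + b */
} toy_forward_t;

/* Forward pass: compute y and keep every intermediate value. */
toy_forward_t toy_forward(const double x)
{
	toy_forward_t f;
	f.x = x;
	f.a = x * x;
	f.b = 3 * x;
	f.y = f.a + f.b;
	return f;
}

/* Backward pass: visit the forward operations in reverse, propagating gradients. */
double toy_backward_dx(const toy_forward_t f)
{
	const double dy = 1;  /* seed: the gradient of y with respect to itself */
	const double da = dy; /* y = a + b, so da = dy */
	const double db = dy; /* y = a + b, so db = dy */
	double dx = 0;
	dx += db * 3;         /* b = 3 * x contributes 3 * db */
	dx += da * 2 * f.x;   /* a = x * x contributes 2 * x * da */
	return dx;
}
```

In this sketch, x plays the role of a wrt symbol and y the role of an f symbol; toy_backward_dx(toy_forward(2.0)) evaluates 2 * x + 3 at x = 2.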

ccv_nnc_tensor_symbol_for_backward(const ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_tensor_symbol_t symbol)

Get the symbol that contains the gradient. The list will be flushed if the ccv_nnc_symbolic_graph_backward function is called again.

Return
A tensor symbol that represents the gradient.
Parameters
• graph: The symbolic graph.
• symbol: The tensor symbol whose gradient we want to retrieve (must be one of the wrt symbols or the f symbols).

ccv_nnc_graph_exec_symbol_for_backward(const ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_tensor_symbol_t symbol)

Get the execution node symbol for a tensor symbol. This is used to retrieve the execution node for a gradient tensor symbol.

Return
An execution node symbol that generates the gradient.
Parameters
• graph: The symbolic graph.
• symbol: The tensor symbol that represents the gradient (must be one of the wrt symbols).

While Loop Essentials

typedef int (*ccv_nnc_graph_while_f)(ccv_nnc_tensor_t *const *const inputs, const int input_size, const void *const data)

The given tensors contain all the common / input / output tensors specified in the sub-graph.

ccv_nnc_tensor_tape_new(void)

Create a tensor tape that can be used for recording with a while loop or case..of.

Return
A ccv_nnc_tensor_tape_t pointer.

void ccv_nnc_tensor_tape_free(ccv_nnc_tensor_tape_t *const tape)

Deallocate the tensor tape and all the memory it allocated.

Parameters
• tape: The tensor tape object.

ccv_nnc_graph_exec_symbol_t ccv_nnc_symbolic_graph_while(ccv_nnc_symbolic_graph_t *const graph, const uint32_t cmd, ccv_nnc_symbolic_graph_t *const while_graph, const char *const name)

The API to operate on the symbolic graph is more involved than the one for the concrete graph when it comes to while loops. The reason is that a symbolic graph operates in SSA (static single assignment) form; therefore, the while loops for a symbolic graph have to be parameterized.

Return
A while loop execution symbol (backed by a sub-graph) of the given graph.
Parameters
• graph: The symbolic graph.
• cmd: The command identifier, can be either CCV_NNC_GRAPH_FORWARD or CCV_NNC_GRAPH_BACKWARD.
• while_graph: The sub-graph to run the while loop.
• name: The name of the while loop. Optional.

void ccv_nnc_symbolic_graph_set_while_expr(ccv_nnc_symbolic_graph_t *const while_graph, const ccv_nnc_graph_while_f while_expr, const void *const while_data, const ccv_nnc_tensor_symbol_t *const inputs, const int input_size, const ccv_nnc_graph_exec_symbol_t *const breakpoints, const int breakpoint_size)

Set the expression to be evaluated and the nodes at which to evaluate it.

Parameters
• while_graph: The symbolic graph that will run the while loop.
• while_expr: The function pointer to the expression.
• while_data: Custom data provided to the expression evaluation function.
• inputs: The input tensor symbols array to the expression evaluation function.
• input_size: The size of the input tensor symbols array.
• breakpoints: The execution node symbols at which the while loop will pause, evaluate the expression, and choose to either break out or continue.
• breakpoint_size: The size of the execution node symbols array.

void ccv_nnc_symbolic_graph_set_carry_overs(ccv_nnc_symbolic_graph_t *const while_graph, const ccv_nnc_tensor_symbol_map_t *const symbol_map, const int symbol_map_size)

Set the loop carry-over parameters (in a parameterized loop, these are carried over to the next iteration).

Parameters
• while_graph: The symbolic graph that will run the while loop.
• symbol_map: An array of tensor symbol pairs, where the source tensor symbol is the output tensor symbol of this iteration and the destination tensor symbol is the input tensor symbol of the next iteration.
• symbol_map_size: The size of the symbol map array.
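
To make the source / destination pairing concrete, here is a minimal sketch that models carry-overs with plain array slots instead of tensor symbols. The loop body only reads its inputs and only writes its outputs (in the spirit of SSA), and the carry-over table copies each output into the input it feeds in the next iteration. The type toy_symbol_map_t and the slot names are hypothetical stand-ins, not ccv types.

```c
/* Hypothetical stand-in for a symbol map entry: slot indices instead of symbols. */
typedef struct {
	int source;      /* output slot written in this iteration */
	int destination; /* input slot read in the next iteration */
} toy_symbol_map_t;

enum { IN_A, IN_B, OUT_B, OUT_SUM, SLOT_COUNT };

/* Loop body: inputs are only read, outputs are only written. */
void toy_body(long slots[SLOT_COUNT])
{
	slots[OUT_B] = slots[IN_B];
	slots[OUT_SUM] = slots[IN_A] + slots[IN_B];
}

/* Run the body `iterations` times, applying the carry-overs after each one. */
long toy_fibonacci(const int iterations)
{
	static const toy_symbol_map_t carry_overs[] = {
		{ OUT_B, IN_A },   /* next a = this b */
		{ OUT_SUM, IN_B }, /* next b = this a + this b */
	};
	long slots[SLOT_COUNT] = { 0, 1, 0, 0 };
	int i, j;
	for (i = 0; i < iterations; i++) {
		toy_body(slots);
		for (j = 0; j < 2; j++)
			slots[carry_overs[j].destination] = slots[carry_overs[j].source];
	}
	return slots[IN_A];
}
```

Because the sources and destinations never overlap in this sketch, the order in which the carry-overs are applied does not matter.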

ccv_nnc_tensor_symbol_for_while_count(const ccv_nnc_symbolic_graph_t *const while_graph)

Retrieve the special (magical) tensor symbol that retains the while loop counter (thus, dimension of 1x1x1, CCV_64S type).

Return
A tensor symbol that represents the implicit loop count.
Parameters
• while_graph: The symbolic graph that will run the while loop.

ccv_nnc_symbolic_graph_from_while_symbol(const ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_graph_exec_symbol_t while_symbol)

Extract the sub-graph of the while loop from a symbol.

Return
The sub-graph that represents a while loop.
Parameters
• graph: The symbolic graph.
• while_symbol: The execution node symbol.

ccv_nnc_graph_while(ccv_nnc_graph_t *const graph, const uint32_t cmd, ccv_nnc_graph_t *const while_graph)

Construct a looped concrete graph. Note that this interface is a little simpler than the one for the symbolic graph. The reason is that a concrete graph operates on allocated tensors; thus, there is no mapping of tensor symbols between the parent graph and the while graph. (The reason to have a mapping in symbolic graphs is to constrain variable leaking between the sub-graph and the parent graph.)

Return
An execution node that represents the sub-graph.
Parameters
• graph: The concrete graph.
• cmd: The command identifier, can be either CCV_NNC_GRAPH_FORWARD or CCV_NNC_GRAPH_BACKWARD.
• while_graph: The sub-graph to run the while loop.

void ccv_nnc_graph_set_while_expr(ccv_nnc_graph_t *const while_graph, const ccv_nnc_graph_while_f while_expr, const void *const while_data, ccv_nnc_tensor_t *const *const inputs, const int input_size, const ccv_nnc_graph_exec_t *const breakpoints, const int breakpoint_size)

Set the evaluated expression for the while loop. The while loop will break out if the expression evaluates to 0.

Parameters
• while_graph: The concrete graph that will run the while loop.
• while_expr: The function pointer to the expression.
• while_data: Custom data provided to the expression evaluation function.
• inputs: The input tensors array to the expression evaluation function.
• input_size: The size of the input tensors array.
• breakpoints: The execution nodes at which the while loop will pause, evaluate the expression, and choose to either break out or continue.
• breakpoint_size: The size of the execution nodes array.
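
The control flow described above (pause at a breakpoint, evaluate the expression, break out when it evaluates to 0) can be sketched without the library as follows. The callback type mirrors the shape of ccv_nnc_graph_while_f, but toy_tensor_t, toy_while_f, count_below_limit and toy_run_while are hypothetical names. Passing the loop count as inputs[0] and evaluating the expression before the body are assumptions of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a tensor that holds one 64-bit integer. */
typedef struct {
	int64_t i64[1];
} toy_tensor_t;

/* Modeled on the shape of ccv_nnc_graph_while_f, over the toy tensor type. */
typedef int (*toy_while_f)(toy_tensor_t *const *const inputs, const int input_size, const void *const data);

/* Continue while the loop count (inputs[0] here) is below the limit passed in data. */
int count_below_limit(toy_tensor_t *const *const inputs, const int input_size, const void *const data)
{
	const int64_t limit = *(const int64_t *)data;
	return input_size > 0 && inputs[0]->i64[0] < limit;
}

/* Double *value until the expression evaluates to 0; return the iteration count. */
int64_t toy_run_while(const toy_while_f expr, const void *const data, int64_t *const value)
{
	toy_tensor_t count = { { 0 } };
	toy_tensor_t *const inputs[] = { &count };
	for (;;) {
		/* The "breakpoint": pause, evaluate, then break out or continue. */
		if (!expr(inputs, 1, data))
			break;
		*value *= 2;    /* the loop body */
		count.i64[0]++; /* the implicit loop count */
	}
	return count.i64[0];
}
```

With a limit of 5 and a starting value of 1, the body runs five times before the expression returns 0.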

ccv_nnc_tensor_for_while_count(const ccv_nnc_graph_t *const while_graph)

Get the special tensor for the while loop count. It contains one 64-bit integer value. An implicit count is kept while the while loop is evaluated, and you can access it through this tensor.

Return
A special tensor from which you can retrieve the loop count at .data.i64[0].
Parameters
• while_graph: The concrete graph that will run the while loop.

ccv_nnc_graph_from_while_exec(const ccv_nnc_graph_t *const graph, ccv_nnc_graph_exec_t exec)

Retrieve the sub-graph from an execution node.

Return
The sub-graph.
Parameters
• graph: The concrete graph.
• exec: The execution node that represents the sub-graph.

Branching

typedef int (*ccv_nnc_graph_case_of_f)(ccv_nnc_tensor_t *const *const inputs, const int input_size, const void *const data)

Function prototype to evaluate a branch expression.

ccv_nnc_symbolic_graph_case_of_new(ccv_nnc_symbolic_graph_t *const graph, const uint32_t cmd, const ccv_nnc_tensor_symbol_t *const inputs, const int input_size, const ccv_nnc_tensor_symbol_map_t *const symbol_map, const int symbol_map_size, const char *const name)

Create a new case..of execution node symbol.

Return
An execution node symbol that represents the case..of graph.
Parameters
• graph: The symbolic graph.
• cmd: The command identifier, can be either CCV_NNC_GRAPH_FORWARD or CCV_NNC_GRAPH_BACKWARD.
• inputs: The input tensor symbols array for the expression.
• input_size: The size of the input tensor symbols array.
• symbol_map: An array of tensor symbol pairs, where the source is the input tensor symbol and the destination is the output tensor symbol.
• symbol_map_size: The size of the symbol map array.
• name: The name of the case..of graph. Optional.

void ccv_nnc_symbolic_graph_set_case_of_expr(ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_graph_exec_symbol_t exec, ccv_nnc_graph_case_of_f case_of, const void *case_of_data)

Set the expression to be evaluated when choosing which sub-graph to branch to.

Parameters
• graph: The symbolic graph.
• exec: The execution node symbol that represents the case..of graph.
• case_of: The function pointer to evaluate.
• case_of_data: The data associated with the function pointer.

void ccv_nnc_symbolic_graph_set_case_of(ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_graph_exec_symbol_t symbol, ccv_nnc_symbolic_graph_t *const case_graph, const int case_of, const ccv_nnc_tensor_symbol_map_t *const symbol_map, const int symbol_map_size)

Set a sub-graph as one of the branches of the case..of graph.

Parameters
• graph: The symbolic graph.
• symbol: The execution node symbol that represents the case..of graph.
• case_graph: The sub-graph for one of the branches.
• case_of: The index assigned to this sub-graph (expression returns this index to determine which sub-graph to execute).
• symbol_map: An array of tensor symbol pairs, where the source is the output tensor symbol of the sub-graph and the destination is the output tensor symbol of the execution node symbol.
• symbol_map_size: The size of the symbol map array.

ccv_nnc_graph_case_of_new(ccv_nnc_graph_t *const graph, const uint32_t cmd, ccv_nnc_tensor_t *const *const inputs, const int input_size, ccv_nnc_tensor_t *const *const outputs, const int output_size)

Create a new case..of execution node.

Return
An execution node that represents the case..of graph.
Parameters
• graph: The concrete graph.
• cmd: The command identifier, can be either CCV_NNC_GRAPH_FORWARD or CCV_NNC_GRAPH_BACKWARD.
• inputs: The input tensors array supplied to the expression.
• input_size: The size of the input tensors array.
• outputs: The output tensors array.
• output_size: The size of the output tensors array.

void ccv_nnc_graph_set_case_of_expr(ccv_nnc_graph_t *const graph, const ccv_nnc_graph_exec_t exec, ccv_nnc_graph_case_of_f case_of, const void *case_of_data, const int offset)

Set the expression to be evaluated when choosing which sub-graph to branch to.

Parameters
• graph: The concrete graph.
• exec: The execution node that represents the case..of graph.
• case_of: The function pointer to evaluate.
• case_of_data: The data associated with the function pointer.
• offset: An integer added to the expression output to help choose the index. Thus, real index = expression index + offset.

void ccv_nnc_graph_set_case_of(ccv_nnc_graph_t *const graph, const ccv_nnc_graph_exec_t exec, ccv_nnc_graph_t *const case_graph, const int case_of)

Set a sub-graph as one of the branches of the case..of graph.

Parameters
• graph: The concrete graph.
• exec: The execution node that represents the case..of graph.
• case_graph: The sub-graph for one of the branches.
• case_of: The index assigned to this sub-graph (the expression's return value plus offset gives the index that determines which sub-graph to execute).
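
The index arithmetic (real index = expression index + offset) can be sketched as follows, with plain functions standing in for sub-graphs and an int array standing in for tensors. Every name below is hypothetical, and treating an out-of-range index as "no branch taken, the input passes through" is an assumption of this sketch rather than documented library behavior.

```c
#define TOY_BRANCH_COUNT 3

/* Modeled on ccv_nnc_graph_case_of_f, with ints standing in for tensors. */
typedef int (*toy_case_of_f)(const int *const inputs, const int input_size, const void *const data);
/* A "sub-graph" in this sketch is just a function of one int. */
typedef int (*toy_branch_f)(const int x);

int toy_negate(const int x) { return -x; }
int toy_identity(const int x) { return x; }
int toy_double(const int x) { return 2 * x; }

/* Expression: the sign of inputs[0], as -1, 0 or 1. */
int toy_sign_expr(const int *const inputs, const int input_size, const void *const data)
{
	(void)data; /* unused in this sketch */
	if (input_size < 1)
		return 0;
	return (inputs[0] > 0) - (inputs[0] < 0);
}

/* Evaluate the expression, add the offset, and run the selected branch. */
int toy_case_of(const toy_case_of_f expr, const void *const data, const int offset, const int x)
{
	static const toy_branch_f branches[TOY_BRANCH_COUNT] = { toy_negate, toy_identity, toy_double };
	const int index = expr(&x, 1, data) + offset; /* real index = expression index + offset */
	if (index < 0 || index >= TOY_BRANCH_COUNT)
		return x; /* assumption: no branch taken, input passes through */
	return branches[index](x);
}
```

An offset of 1 lets an expression that returns -1, 0 or 1 address branches 0, 1 and 2 without changing the expression itself.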

void ccv_nnc_symbolic_graph_minimize(ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_cmd_t minimizer, const ccv_nnc_tensor_symbol_t *const losses, const int loss_size, const ccv_nnc_tensor_symbol_t *const parameters, const int parameter_size, const ccv_nnc_graph_exec_symbol_t *const sources, const int source_size, const ccv_nnc_graph_exec_symbol_t *const destinations, const int destination_size, ccv_nnc_tensor_symbol_t *const gradients, ccv_nnc_tensor_symbol_t *const updated_parameters, ccv_nnc_tensor_symbol_map_t *const saved_aux, ccv_nnc_graph_exec_symbol_t *const graph_exec_symbols)

This is comparable to Caffe’s solver or TensorFlow’s optimizer. It goes a step further than just computing the gradients: it also applies the gradients to update the parameters and minimize the loss.

Parameters
• graph: The symbolic graph.
• minimizer: The wrapped command that represents a particular optimization strategy.
• losses: The tensor symbols array of losses.
• loss_size: The size of the loss symbols array.
• parameters: The parameter tensor symbols to optimize.
• parameter_size: The size of the parameter symbols array.
• sources: The source execution nodes array.
• source_size: The size of the source nodes array.
• destinations: The destination execution nodes array.
• destination_size: The size of the destination nodes array.
• gradients: The tensor symbols that represent the gradients for the update; should be the same size as the parameters array. This can be 0 (optional).
• updated_parameters: The tensor symbols that represent the updated parameters; should be the same size as the parameters array.
• saved_aux: The tensor symbols that are helpful for a particular optimization strategy.
• graph_exec_symbols: The execution node symbols for the updates; should be the same size as the parameters array.

ccv_nnc_minimizer_saved_aux_size(const ccv_nnc_cmd_t minimizer)

The number of extra saved aux tensors per parameter depends only on the command. For example, SGD with momentum requires 1 aux (for the momentum). Other strategies may require more.

Return
The number of saved aux per parameter.
Parameters
• minimizer: The wrapped command that represents a particular optimization strategy.
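
To show why SGD with momentum needs exactly one saved aux per parameter, here is a minimal sketch of a textbook momentum update over plain float arrays. The function name and the exact update rule are assumptions made for illustration; the library's own SGD command may use a different formulation.

```c
/* One textbook SGD-with-momentum step over `size` parameters.
 * `momentum_aux` is the single saved aux kept per parameter between steps. */
void toy_sgd_momentum_step(float *const parameters, const float *const gradients, float *const momentum_aux, const int size, const float rate, const float momentum)
{
	int i;
	for (i = 0; i < size; i++) {
		/* The aux accumulates a decaying sum of past gradients. */
		momentum_aux[i] = momentum * momentum_aux[i] + gradients[i];
		/* The updated parameter moves against the accumulated direction. */
		parameters[i] -= rate * momentum_aux[i];
	}
}
```

The aux array has to persist across steps, which is why it is carried alongside the parameters rather than recomputed.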

Graph Simplification

enum [anonymous]::__anonymous92

Values:

CCV_NNC_SIMPLIFY_COMMON_SUBEXPRESSION_ELIMINATION

If two commands generate the same outputs, every place where the newer output is used will be replaced by the older output. Later, in the graph pruning stage, the command that generates the newer output will be eliminated.

CCV_NNC_SIMPLIFY_GRAPH_PRUNING

For the given outputs, eliminate unused input tensors, and then eliminate graph execs that don’t contribute to the outputs.

CCV_NNC_SIMPLIFY_DATA_TRANSFER_OPT

For CCV_NNC_DATA_TRANSFER, if the input and output are the same (on the same device, no alias), the transfer can be skipped. Similarly, if they are on the same device but are aliases, the transfer can be skipped in some cases as well (if neither is a carry-over, bypass, etc.).

CCV_NNC_SIMPLIFY_OPS_FUSION

Combine a few smaller ops into a bigger one. For now, this functionality is limited: it can only address ops that are sequential.

void ccv_nnc_symbolic_graph_simplify(ccv_nnc_symbolic_graph_t *const graph, const int *const passes, const int pass_size, const ccv_nnc_tensor_symbol_t *const outputs, const int output_size, const ccv_nnc_graph_exec_symbol_t *const sources, const int source_size, const ccv_nnc_graph_exec_symbol_t *const destinations, const int destination_size)

Simplify a graph with the given list of passes, applied in that particular order. Note that when a graph is simplified, its sources / destinations are changed as well.

Parameters
• graph: The symbolic graph.
• passes: The array of passes we are going to apply.
• pass_size: The size of the passes array.
• outputs: The output tensor symbols we want to retain (we are going to prune any execution nodes that are not related to these outputs).
• output_size: The size of the output array.
• sources: The source execution node symbols array.
• source_size: The size of the source node symbols array.
• destinations: The destination execution node symbols array.
• destination_size: The size of the destination node symbols array.
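
The interplay between common sub-expression elimination and graph pruning can be sketched on a toy command list. This is not the library's algorithm: toy_cmd_t, toy_cse and toy_prune are hypothetical, tensors are named by small integer ids (below TOY_MAX_TENSORS), every command takes exactly two inputs, and the commands are assumed to be in topological order.

```c
#define TOY_MAX_TENSORS 32

/* Hypothetical toy command: out = op(in0, in1), tensors named by integer ids. */
typedef struct {
	char op;   /* e.g. '+' or '*' */
	int in0;
	int in1;
	int out;
	int alive; /* cleared by pruning when the command is not needed */
} toy_cmd_t;

/* CSE: when a later command repeats an earlier one (same op, same inputs),
 * redirect later uses of the newer output to the older output. The duplicate
 * command itself is left in place for the pruning pass to remove. */
void toy_cse(toy_cmd_t *const cmds, const int size)
{
	int i, j, k;
	for (i = 0; i < size; i++) {
		for (j = i + 1; j < size; j++) {
			if (cmds[j].op != cmds[i].op || cmds[j].in0 != cmds[i].in0 || cmds[j].in1 != cmds[i].in1)
				continue;
			for (k = j + 1; k < size; k++) {
				if (cmds[k].in0 == cmds[j].out)
					cmds[k].in0 = cmds[i].out;
				if (cmds[k].in1 == cmds[j].out)
					cmds[k].in1 = cmds[i].out;
			}
		}
	}
}

/* Pruning: keep only the commands that contribute to `output`.
 * Walks backward, marking the inputs of every kept command as needed.
 * Returns the number of commands kept. */
int toy_prune(toy_cmd_t *const cmds, const int size, const int output)
{
	int needed[TOY_MAX_TENSORS] = { 0 };
	int i, kept = 0;
	needed[output] = 1;
	for (i = size - 1; i >= 0; i--) {
		cmds[i].alive = needed[cmds[i].out];
		if (cmds[i].alive) {
			needed[cmds[i].in0] = 1;
			needed[cmds[i].in1] = 1;
			kept++;
		}
	}
	return kept;
}
```

Running the two passes in this order matters in the sketch: CSE only redirects uses, so the duplicate command disappears only once pruning sees that nothing consumes its output.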

Automatic Graph Parallelization

enum [anonymous]::__anonymous93

Values:

CCV_NNC_PARALLEL_REDUCE_OP_SUM

Op for reducer / allreducer. Currently only supports sum.

void ccv_nnc_symbolic_graph_data_parallel(ccv_nnc_symbolic_graph_t *const graph, const int parallel, const ccv_nnc_tensor_symbol_t *const broadcasts, const int broadcast_size, const ccv_nnc_tensor_symbol_t *const allreducers, const int allreducer_size, const ccv_nnc_tensor_symbol_t *const reducers, const int reducer_size, const int reduce_op_type, const ccv_nnc_graph_exec_symbol_t *const sources, const int source_size, const ccv_nnc_graph_exec_symbol_t *const destinations, const int destination_size)

Make the existing graph capable of running on several devices in parallel, each with different data inputs. With this method, additional tensor symbols that live on the different devices will be created. That said, there are the concepts of “broadcast” and “reduce”: “broadcast” tensor symbols will be copied to the different devices, while “reduce” tensors will be summed from the different devices to the default device. The “allreducer” concept is simpler: the allreduce operation will be performed on these tensors, and the results will then be used on the different devices again.

Limitations: right now, reducing / allreducing tensors only supports “sum”. Data parallelism only supports GPUs; thus, the nodes that will be duplicated are GPU computations and GPU-memory-backed tensors. Also, right now, the tensors to be broadcast / allreduced / reduced should have no aliases.

Parameters
• graph: The symbolic graph.
• parallel: Number of devices we want to run on. 0 will use all devices available. 1 will skip.
• broadcasts: The tensor symbols to be broadcast.
• broadcast_size: The size of the broadcast tensor symbols array.
• allreducers: The tensor symbols to be allreduced.
• allreducer_size: The size of the allreducer tensor symbols array.
• reducers: The tensor symbols to be reduced.
• reducer_size: The size of the reducer tensor symbols array.
• reduce_op_type: The reduce op for reducer / allreducer.
• sources: The source execution node symbols array.
• source_size: The size of the source node symbols array.
• destinations: The destination execution node symbols array.
• destination_size: The size of the destination node symbols array.
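
The three data movements can be sketched with one float per device standing in for a tensor. The function names and the fixed device count are hypothetical, and treating index 0 as the default device is an assumption of this sketch.

```c
#define TOY_DEVICE_COUNT 4

/* broadcast: copy the default device's value (index 0) to every other device. */
void toy_broadcast(float values[TOY_DEVICE_COUNT])
{
	int i;
	for (i = 1; i < TOY_DEVICE_COUNT; i++)
		values[i] = values[0];
}

/* reduce: sum the per-device values into the default device (index 0).
 * The values on the other devices are left untouched. */
void toy_reduce_sum(float values[TOY_DEVICE_COUNT])
{
	int i;
	for (i = 1; i < TOY_DEVICE_COUNT; i++)
		values[0] += values[i];
}

/* allreduce: sum the per-device values and make the sum available on every device. */
void toy_allreduce_sum(float values[TOY_DEVICE_COUNT])
{
	toy_reduce_sum(values);
	toy_broadcast(values);
}
```

In a typical data-parallel training setup, parameters would be broadcast and gradients allreduced; that pairing is given here only as a common usage pattern, not as something this API prescribes.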

ccv_nnc_tensor_symbol_copy(const ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_tensor_symbol_t symbol, const int device_id)

Get the symbol that is on a device other than the default one. The list will be flushed if the ccv_nnc_symbolic_graph_data_parallel function is called again.

Return
A tensor symbol that is on a different device.
Parameters
• graph: The symbolic graph.
• symbol: The tensor symbol whose counterpart on a different device we want to retrieve.
• device_id: The device numeric id for this symbol.

ccv_nnc_graph_exec_symbol_copy(const ccv_nnc_symbolic_graph_t *const graph, const ccv_nnc_graph_exec_symbol_t symbol, const int device_id)

Get the execution node that is on a device other than the default one. The list will be flushed if the ccv_nnc_symbolic_graph_data_parallel function is called again.

Return
An execution node that is on a different device.
Parameters
• graph: The symbolic graph.
• symbol: The execution node whose counterpart on a different device we want to retrieve.
• device_id: The device numeric id for this symbol.

While Loop Others

enum [anonymous]::__anonymous91

Values:

CCV_NNC_MULTIVIEW_K0N = 0

All of them are repeated.

CCV_NNC_MULTIVIEW_K1N = 1

The first one is used only once, at the start; repetition begins from the second one (0111111…).

typedef struct ccv_nnc_tensor_multiview_s ccv_nnc_tensor_multiview_t

Augmented tensor to run a graph with a while loop (an obvious example is a dynamic RNN).

void ccv_nnc_tensor_tape_io(ccv_nnc_tensor_tape_t *const tape, const ccv_nnc_graph_t *const graph, const int *const input_flags, ccv_nnc_tensor_t *const *const inputs, const int input_size, const int *const output_flags, ccv_nnc_tensor_t *const *const outputs, const int output_size)

For a given tape on a given graph, update the input / output tensors so that a new version will be created (if needed).

Parameters
• tape: The tensor tape object.
• graph: The concrete graph this tensor tape is executing in.
• input_flags: The flags associated with input tensors.
• inputs: The input tensors.
• input_size: The size of the input tensors array.
• output_flags: The flags associated with output tensors.
• outputs: The output tensors.
• output_size: The size of the output tensors array.

uint64_t ccv_nnc_tensor_tape_numbering(ccv_nnc_tensor_tape_t *const tape, const ccv_nnc_graph_t *const graph, const ccv_nnc_graph_exec_t exec)

Retrieve the number we associated with the execution node that is recorded on the tape for a particular run of the graph.

Return
The number associated with the execution node.
Parameters
• tape: The tensor tape object.
• graph: The concrete graph this tensor tape is executing in.
• exec: The execution node.

void ccv_nnc_tensor_tape_set_numbering(ccv_nnc_tensor_tape_t *const tape, ccv_nnc_graph_t *const graph, const ccv_nnc_graph_exec_t exec, const uint64_t numbering)

Set the number we associate with the execution node that is recorded on the tape for a particular run of the graph.

Parameters
• tape: The tensor tape object.
• graph: The concrete graph this tensor tape is executing in.
• exec: The execution node.
• numbering: The number associated with the execution node.

void ccv_nnc_tensor_multiview(ccv_nnc_tensor_t *data[], const uint8_t kind, const uint16_t repeat, const ccv_nnc_graph_t *const graph, ccv_nnc_tensor_multiview_t *const tensor_multiview)

Set up a tensor multiview with a given set of tensors. A multiview tensor points to a list of tensors, and which one is accessed depends on the loop count. For example, suppose we have a multiview tensor with the list [a, b, c, d], kind 1N, and repeat 3. For loop counts 0, 1, 2, 3, 4, 5, the corresponding tensors used will be a, b, c, d, b, c. If kind is 0N and repeat is 4, they will be a, b, c, d, a, b.

Parameters
• data[]: The pointer to the list of tensors the multiview object can point to.
• kind: Can be either CCV_NNC_MULTIVIEW_K0N or CCV_NNC_MULTIVIEW_K1N, basically whether to keep the initial tensor.
• repeat: The length of the repeat.
• graph: Which graph this multiview object attaches to.
• tensor_multiview: The tensor multiview object to be updated.
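
The selection rule implied by the a, b, c, d, b, c and a, b, c, d, a, b sequences can be written as a small index function. This is a sketch derived from that example only; the names are hypothetical and the library's internal bookkeeping may differ.

```c
enum { TOY_MULTIVIEW_K0N = 0, TOY_MULTIVIEW_K1N = 1 };

/* Index into the tensor list that would be used at a given loop count. */
int toy_multiview_index(const int kind, const int repeat, const int count)
{
	if (kind == TOY_MULTIVIEW_K1N) {
		/* The first tensor is used once; the next `repeat` tensors then cycle. */
		return count == 0 ? 0 : 1 + (count - 1) % repeat;
	}
	/* K0N: all `repeat` tensors cycle from the start. */
	return count % repeat;
}
```

For the list [a, b, c, d], kind 1N with repeat 3 yields indices 0, 1, 2, 3, 1, 2 for loop counts 0 through 5, and kind 0N with repeat 4 yields 0, 1, 2, 3, 0, 1.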

void ccv_nnc_tensor_multiview_free(const ccv_nnc_tensor_multiview_t tensor_multiview)

Since tensor_multiview will never be allocated with a *_new method, the *_free method simply frees anything that is dynamically allocated afterwards (such as the reference items).

Parameters
• tensor_multiview: The tensor multiview object to be deallocated.

void ccv_nnc_tensor_synchronize_to_multiview(ccv_nnc_tensor_multiview_t *const tensor_multiview, ccv_nnc_tensor_t *const tensor)

Set up a tensor as a reference to a tensor multiview; thus, when the tensor multiview’s tu (current tensor) updates, the tensor reference’s data.u8 will get updated as well (pointing to the same memory region as the tu).

Parameters
• tensor_multiview: The tensor multiview object.
• tensor: The tensor that will be updated along with the multiview object.

void ccv_nnc_tensor_multiview_synchronize(ccv_nnc_tensor_multiview_t *const tensor_multiview)

Broadcast to the subscribers of the multiview. Call this at the beginning of exec.

Parameters
• tensor_multiview: The tensor multiview object.

CCV_NNC_MULTIVIEW_DATA(x)
CCV_NNC_MULTIVIEW_PHI

Denotes that this is a phi multi-view tensor.

CCV_NNC_MULTIVIEW_K01(x)
struct ccv_nnc_tensor_multiview_s
#include <ccv_nnc.h>

Augmented tensor to run a graph with a while loop (an obvious example is a dynamic RNN).