NDArray API
Arithmetic Operations
In the following examples, y can be a Real value or another NDArray.
| API | Example | Description |
|---|---|---|
| + | x .+ y | Elementwise summation |
| - | x .- y | Elementwise subtraction |
| * | x .* y | Elementwise multiplication |
| / | x ./ y | Elementwise division |
| ^ | x .^ y | Elementwise power |
| % | x .% y | Elementwise modulo |
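The dotted operators in the table follow Julia's usual broadcasting syntax. As a minimal sketch, the snippet below uses plain Julia arrays as a stand-in for NDArray (an assumption made so that it runs without MXNet); the variable names are illustrative only.

```julia
# Plain Julia arrays stand in for NDArray here (assumption for illustration).
x = [1.0 2.0; 3.0 4.0]
y_scalar = 2.0                      # y as a Real value
y_array = [10.0 20.0; 30.0 40.0]    # y as another array of the same shape

sum_scalar = x .+ y_scalar          # adds 2.0 to every element
sum_array = x .+ y_array            # adds matching elements
pow_scalar = x .^ y_scalar          # squares every element
mod_scalar = x .% y_scalar          # remainder of every element divided by 2.0
```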
Trigonometric Functions
| API | Example | Description |
|---|---|---|
| sin | sin.(x) | Elementwise sine |
| cos | cos.(x) | Elementwise cosine |
| tan | tan.(x) | Elementwise tangent |
| asin | asin.(x) | Elementwise inverse sine |
| acos | acos.(x) | Elementwise inverse cosine |
| atan | atan.(x) | Elementwise inverse tangent |
Hyperbolic Functions
| API | Example | Description |
|---|---|---|
| sinh | sinh.(x) | Elementwise hyperbolic sine |
| cosh | cosh.(x) | Elementwise hyperbolic cosine |
| tanh | tanh.(x) | Elementwise hyperbolic tangent |
| asinh | asinh.(x) | Elementwise inverse hyperbolic sine |
| acosh | acosh.(x) | Elementwise inverse hyperbolic cosine |
| atanh | atanh.(x) | Elementwise inverse hyperbolic tangent |
Activation Functions
| API | Example | Description |
|---|---|---|
| σ | σ.(x) | Sigmoid function |
| sigmoid | sigmoid.(x) | Sigmoid function |
| relu | relu.(x) | ReLU function |
| softmax | softmax.(x) | Softmax function |
| log_softmax | log_softmax.(x) | Softmax followed by log |
Reference
# MXNet.mx.NDArray
— Type.
NDArray{T,N}
Wrapper of the NDArray type in libmxnet. This is the basic building block of tensor-based computation.
Note
C/C++ uses row-major ordering for arrays, while Julia follows column-major ordering. To keep things consistent, we keep the underlying data in its original layout but use the language-native convention when we talk about shapes. For example, a mini-batch of 100 MNIST images is a tensor of C/C++/Python shape (100,1,28,28), while in Julia the same piece of memory has shape (28,28,1,100).
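The shape convention in the note amounts to reversing the dimension tuple. The sketch below uses only plain Julia (no MXNet calls); the variable names are illustrative.

```julia
# Shape as reported by C/C++/Python for a mini-batch of 100 MNIST images.
c_shape = (100, 1, 28, 28)

# The same memory described with Julia's column-major convention.
julia_shape = reverse(c_shape)

# Column-major means the first index varies fastest in memory.
a = reshape(collect(1:6), 3, 2)
second_in_memory = a[2, 1]   # element stored right after a[1, 1]
fourth_in_memory = a[1, 2]   # first element of the second column
```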
source
# Base.cos
— Function.
cos.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L63
source
# Base.cosh
— Function.
cosh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L216
source
# Base.reshape
— Method.
reshape(arr::NDArray, dim; reverse=false)
Defined in src/operator/tensor/matrix_op.cc:L165
source
# Base.reshape
— Method.
reshape(arr::NDArray, dim...; reverse=false)
Defined in src/operator/tensor/matrix_op.cc:L165
source
# Base.sin
— Function.
sin.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L46
source
# Base.sinh
— Function.
sinh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L201
source
# Base.tan
— Function.
tan.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L83
source
# Base.tanh
— Function.
tanh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L234
source
# MXNet.mx.broadcast_axes
— Method.
broadcast_axis(x::NDArray, dim, size)
broadcast_axes(x::NDArray, dim, size)
Broadcasts the input array over the given axis or axes. The parameters dim and size can each be a scalar, a Tuple, or an Array.
broadcast_axes is just an alias.
julia> x
1×2×1 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
julia> mx.broadcast_axis(x, 1, 2)
2×2×1 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
1 2
julia> mx.broadcast_axis(x, 3, 2)
1×2×2 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
[:, :, 2] =
1 2
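For comparison, the same two results can be reproduced on a plain Julia array with Base.repeat. This is only an illustration of the broadcasting semantics under the assumption that a plain Array behaves like the NDArray shown above; it does not call MXNet.

```julia
# A 1×2×1 plain Julia array with the same contents as x above.
x = reshape([1, 2], 1, 2, 1)

# Repeat twice along dimension 1 (compare broadcast_axis(x, 1, 2)).
along_dim1 = repeat(x, outer = (2, 1, 1))

# Repeat twice along dimension 3 (compare broadcast_axis(x, 3, 2)).
along_dim3 = repeat(x, outer = (1, 1, 2))
```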
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L207
source
# MXNet.mx.broadcast_axis
— Method.
broadcast_axis(x::NDArray, dim, size)
broadcast_axes(x::NDArray, dim, size)
Broadcasts the input array over the given axis or axes. The parameters dim and size can each be a scalar, a Tuple, or an Array.
broadcast_axes is just an alias.
julia> x
1×2×1 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
julia> mx.broadcast_axis(x, 1, 2)
2×2×1 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
1 2
julia> mx.broadcast_axis(x, 3, 2)
1×2×2 mx.NDArray{Int64,3} @ CPU0:
[:, :, 1] =
1 2
[:, :, 2] =
1 2
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L207
source
# MXNet.mx.broadcast_to
— Method.
broadcast_to(x::NDArray, dims)
broadcast_to(x::NDArray, dims...)
Broadcasts the input array to a new shape.
If broadcasting does not work out of the box, you can expand the NDArray first.
julia> x = mx.ones(2, 3, 4);
julia> y = mx.ones(1, 1, 4);
julia> x .+ mx.broadcast_to(y, 2, 3, 4)
2×3×4 mx.NDArray{Float32,3} @ CPU0:
[:, :, 1] =
2.0 2.0 2.0
2.0 2.0 2.0
[:, :, 2] =
2.0 2.0 2.0
2.0 2.0 2.0
[:, :, 3] =
2.0 2.0 2.0
2.0 2.0 2.0
[:, :, 4] =
2.0 2.0 2.0
2.0 2.0 2.0
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L231
source
# MXNet.mx.clip!
— Method.
clip(x::NDArray, min, max)
clip!(x::NDArray, min, max)
Clips (limits) the values in an NDArray. Given an interval, values outside the interval are clipped to the interval edges. Clipping x between min and max would be:
clip(x, min_, max_) = max(min(x, max_), min_)
julia> x = NDArray(1:9);
julia> mx.clip(x, 2, 8)'
1×9 mx.NDArray{Int64,2} @ CPU0:
2 2 3 4 5 6 7 8 8
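The clipping formula can be checked on a plain Julia vector. In this sketch, clip_ref is a hypothetical helper written for illustration (it is not part of MXNet), and Base.clamp is used as a cross-check.

```julia
# Reference implementation of the formula max(min(x, max_), min_),
# applied elementwise to a plain Julia array (clip_ref is a made-up name).
clip_ref(x, lo, hi) = max.(min.(x, hi), lo)

x = collect(1:9)
clipped = clip_ref(x, 2, 8)

# Base.clamp implements the same operation for scalars.
same_as_clamp = clamp.(x, 2, 8)
```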
The storage type of the clip output depends on the storage types of the inputs and on the min and max parameter values:
- clip(default) = default
- clip(row_sparse, min <= 0, max >= 0) = row_sparse
- clip(csr, min <= 0, max >= 0) = csr
- clip(row_sparse, min < 0, max < 0) = default
- clip(row_sparse, min > 0, max > 0) = default
- clip(csr, min < 0, max < 0) = csr
- clip(csr, min > 0, max > 0) = csr
Defined in src/operator/tensor/matrix_op.cc:L490
source
# MXNet.mx.clip
— Method.
clip(x::NDArray, min, max)
clip!(x::NDArray, min, max)
Clips (limits) the values in an NDArray. Given an interval, values outside the interval are clipped to the interval edges. Clipping x between min and max would be:
clip(x, min_, max_) = max(min(x, max_), min_)
julia> x = NDArray(1:9);
julia> mx.clip(x, 2, 8)'
1×9 mx.NDArray{Int64,2} @ CPU0:
2 2 3 4 5 6 7 8 8
The storage type of the clip output depends on the storage types of the inputs and on the min and max parameter values:
- clip(default) = default
- clip(row_sparse, min <= 0, max >= 0) = row_sparse
- clip(csr, min <= 0, max >= 0) = csr
- clip(row_sparse, min < 0, max < 0) = default
- clip(row_sparse, min > 0, max > 0) = default
- clip(csr, min < 0, max < 0) = csr
- clip(csr, min > 0, max > 0) = csr
Defined in src/operator/tensor/matrix_op.cc:L490
source
# MXNet.mx.context
— Method.
context(arr::NDArray)
Get the context that this NDArray lives on.
source
# MXNet.mx.empty
— Method.
empty(dims::Tuple[, ctx::Context = cpu()])
empty(dim1, dim2, ...)
Allocate memory for an uninitialized NDArray with the specified shape and element type Float32.
source
# MXNet.mx.empty
— Method.
empty(DType, dims[, ctx::Context = cpu()])
empty(DType, dims)
empty(DType, dim1, dim2, ...)
Allocate memory for an uninitialized NDArray with the specified element type.
source
# MXNet.mx.expand_dims
— Method.
expand_dims(x::NDArray, dim)
Insert a new axis at dimension dim.
julia> x
4 mx.NDArray{Float64,1} @ CPU0:
1.0
2.0
3.0
4.0
julia> mx.expand_dims(x, 1)
1×4 mx.NDArray{Float64,2} @ CPU0:
1.0 2.0 3.0 4.0
julia> mx.expand_dims(x, 2)
4×1 mx.NDArray{Float64,2} @ CPU0:
1.0
2.0
3.0
4.0
Defined in src/operator/tensor/matrix_op.cc:L293
source
# MXNet.mx.log_softmax
— Function.
log_softmax.(x::NDArray, [dim = ndims(x)])
Computes the log softmax of the input. This is equivalent to computing softmax followed by log.
julia> x
2×3 mx.NDArray{Float64,2} @ CPU0:
 1.0  2.0  0.1
 0.1  2.0  1.0
julia> mx.log_softmax.(x)
2×3 mx.NDArray{Float64,2} @ CPU0:
 -1.41703  -0.41703  -2.31703
 -2.31703  -0.41703  -1.41703
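The numbers in this example can be reproduced with a few lines of plain Julia. The function log_softmax_ref below is a hypothetical reference implementation for ordinary arrays, not the MXNet operator; it subtracts the maximum first for numerical stability, which does not change the result.

```julia
# Reference log-softmax along dimension `dim` for plain Julia arrays
# (log_softmax_ref is a made-up name used only for this sketch).
function log_softmax_ref(x::AbstractArray; dim::Int = ndims(x))
    shifted = x .- maximum(x, dims = dim)            # stabilise the exponentials
    return shifted .- log.(sum(exp.(shifted), dims = dim))
end

x = [1.0 2.0 0.1; 0.1 2.0 1.0]
y = log_softmax_ref(x)

# exp of the result is the softmax, so each slice along `dim` sums to 1.
row_sums = sum(exp.(y), dims = 2)
```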
source
# MXNet.mx.relu
— Function.
relu.(x::NDArray)
Computes the rectified linear unit (ReLU), max(x, 0), element-wise.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L83
source
# MXNet.mx.sigmoid
— Function.
σ.(x::NDArray)
sigmoid.(x::NDArray)
Computes sigmoid of x element-wise.
The storage type of the sigmoid output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L102
source
# MXNet.mx.softmax
— Function.
softmax.(x::NDArray, [dim = ndims(x)])
Applies the softmax function.
The resulting array contains elements in the range (0, 1), and the elements along the given axis sum to 1.
Defined in src/operator/nn/softmax.cc:L54
source
# MXNet.mx.σ
— Function.
σ.(x::NDArray)
sigmoid.(x::NDArray)
Computes sigmoid of x element-wise.
The storage type of the sigmoid output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L102
source
# MXNet.mx.@inplace
— Macro.
@inplace
Julia does not support redefining the += operator (unlike __iadd__ in Python). When one writes a += b, it is translated to a = a + b. The expression a + b allocates new memory for the result, and the newly allocated NDArray object is then assigned back to a, while the original contents of a are discarded. This is very inefficient when we want an in-place update.
This macro is a simple utility that implements the in-place behavior. Writing
@mx.inplace a += b
translates into
mx.add_to!(a, b)
which adds the contents of b into a in place.
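The rebinding behavior described above is ordinary Julia semantics and can be observed with plain arrays. The sketch below does not use MXNet; it contrasts a += b with the in-place broadcast assignment that Base offers for ordinary arrays.

```julia
a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]

original = a                 # second reference to the same array
a += b                       # lowered to a = a + b: allocates and rebinds a
rebinding_allocated = a !== original

c = [1.0, 2.0, 3.0]
alias = c                    # second reference to the same array
c .+= b                      # in-place broadcast assignment, no rebinding
updated_in_place = c === alias
```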
source
# Base.Iterators.Flatten
— Method.
Flatten(data)
Flattens the input array into a 2-D array by collapsing the higher dimensions.
Note: Flatten is deprecated. Use flatten instead.
For an input array with shape (d1, d2, ..., dk), the flatten operation reshapes the input array into an output array of shape (d1, d2*...*dk).
Note that the behavior of this function differs from numpy.ndarray.flatten, which behaves like mxnet.ndarray.reshape((-1,)).
Example::
x = [[
    [1,2,3],
    [4,5,6],
    [7,8,9]
],
[   [1,2,3],
    [4,5,6],
    [7,8,9]
]],
flatten(x) = [[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
   [ 1., 2., 3., 4., 5., 6., 7., 8., 9.]]
Defined in src/operator/tensor/matrix_op.cc:L212
Arguments
- data::NDArray-or-SymbolicNode: Input array.
source
# Base.:%
— Method.
.%(x::NDArray, y::NDArray)
.%(x::NDArray, y::Real)
.%(x::Real, y::NDArray)
Elementwise modulo for NDArray.
source
# Base.:*
— Method.
.*(x, y)
Elementwise multiplication for NDArray.
source
# Base.:*
— Method.
*(A::NDArray, B::NDArray)
Matrix/tensor multiplication.
source
# Base.:+
— Method.
+(args...)
.+(args...)
Summation. Multiple arguments of either scalar or NDArray type can be added together. Note that at least the first or second argument needs to be an NDArray to avoid ambiguity with the built-in summation.
source
# Base.:-
— Method.
-(x::NDArray)
-(x, y)
.-(x, y)
Subtraction x - y of scalar types or NDArray, or negation of x.
source
# Base.:/
— Method.
./(x::NDArray, y::NDArray)
./(x::NDArray, y::Real)
./(x::Real, y::NDArray)
- Elementwise division of an NDArray by a scalar or another NDArray of the same shape.
- Elementwise division of a scalar by an NDArray.
- Matrix division (solving linear systems) is not implemented yet.
source
# Base.LinAlg.dot
— Method.
dot(x::NDArray, y::NDArray)
Defined in src/operator/tensor/dot.cc:L62
source
# Base.LinAlg.norm
— Method.
norm(data)
Flattens the input array and then computes the l2 norm.
Examples::
x = [[1, 2], [3, 4]]
norm(x) = [5.47722578]
rsp = x.cast_storage('row_sparse')
norm(rsp) = [5.47722578]
csr = x.cast_storage('csr')
norm(csr) = [5.47722578]
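The value in the example is the square root of the sum of squares of all elements. A minimal check in plain Julia, using the LinearAlgebra standard library rather than MXNet:

```julia
using LinearAlgebra

x = [1.0 2.0; 3.0 4.0]

# Flatten and take the l2 norm: sqrt(1 + 4 + 9 + 16) = sqrt(30).
l2_manual = sqrt(sum(abs2, x))

# For a matrix argument, LinearAlgebra.norm computes the same quantity
# (it treats the array as a flat vector).
l2_builtin = norm(x)
```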
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L266
Arguments
- data::NDArray-or-SymbolicNode: Source input
source
# Base.Math.cbrt
— Method.
cbrt(data)
Returns element-wise cube-root value of the input.
.. math:: cbrt(x) = \sqrt[3]{x}
Example::
cbrt([1, 8, -125]) = [1, 2, -5]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L601
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.Math.gamma
— Method.
gamma(data)
Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
The storage type of the gamma output is always dense.
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base._div
— Method.
_div(lhs, rhs)
_div is an alias of elemwise_div.
Divides arguments element-wise.
The storage type of the elemwise_div output is always dense.
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# Base._sub
— Method.
_sub(lhs, rhs)
_sub is an alias of elemwise_sub.
Subtracts arguments element-wise.
The storage type of the elemwise_sub output depends on the storage types of the inputs:
- elemwise_sub(row_sparse, row_sparse) = row_sparse
- elemwise_sub(csr, csr) = csr
- otherwise, elemwise_sub generates output with default storage
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# Base.abs
— Method.
abs(data)
Returns element-wise absolute value of the input.
Example::
abs([-2, 0, 3]) = [2, 0, 3]
The storage type of the abs output depends on the input storage type:
- abs(default) = default
- abs(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L385
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.acos
— Function.
acos.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L123
source
# Base.acosh
— Function.
acosh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L264
source
# Base.asin
— Function.
asin.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L104
source
# Base.asinh
— Function.
asinh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L250
source
# Base.atan
— Function.
atan.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L144
source
# Base.atanh
— Function.
atanh.(x::NDArray)
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L281
source
# Base.cat
— Method.
cat(dim, xs::NDArray...)
Concatenate the NDArrays, which must have the same element type, along the dimension dim. Building a diagonal matrix is not supported yet.
source
# Base.ceil
— Method.
ceil(data)
Returns element-wise ceiling of the input.
The ceil of the scalar x is the smallest integer i, such that i >= x.
Example::
ceil([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 2., 2., 3.]
The storage type of the ceil output depends on the input storage type:
- ceil(default) = default
- ceil(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L463
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.convert
— Method.
convert(::Type{Array{<:Real}}, x::NDArray)
Convert an NDArray into a Julia Array of a specific type. Data will be copied.
source
# Base.copy!
— Method.
copy!(dst::Union{NDArray, Array}, src::Union{NDArray, Array})
Copy the contents of src into dst.
source
# Base.copy
— Method.
copy(arr :: NDArray)
copy(arr :: NDArray, ctx :: Context)
copy(arr :: Array, ctx :: Context)
Create a copy of an array. When no Context is given, create a Julia Array. Otherwise, create an NDArray on the specified context.
source
# Base.deepcopy
— Method.
deepcopy(arr::NDArray)
Get a deep copy of the data blob in the form of an NDArray of default storage type. This function blocks. Do not use it in performance critical code.
source
# Base.eltype
— Method.
eltype(x::NDArray)
Get the element type of an NDArray.
source
# Base.exp
— Method.
exp(data)
Returns element-wise exponential value of the input.
.. math:: exp(x) = e^x \approx 2.718^x
Example::
exp([0, 1, 2]) = [1., 2.71828175, 7.38905621]
The storage type of the exp output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L641
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.expm1
— Method.
expm1(data)
Returns exp(x) - 1 computed element-wise on the input.
This function provides greater precision than exp(x) - 1 for small values of x.
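The precision claim can be seen with scalar Float64 values in plain Julia, whose Base.expm1 has the same mathematical definition. This sketch does not call the MXNet operator.

```julia
x = 1e-10

# Naive formula: exp(x) is rounded to a value near 1, so the subtraction
# loses most of the significant digits.
naive = exp(x) - 1

# expm1 evaluates the same quantity without the cancellation.
accurate = expm1(x)

# For tiny x the true value is very close to x itself.
naive_error = abs(naive - x)
accurate_error = abs(accurate - x)
```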
The storage type of the expm1 output depends on the input storage type:
- expm1(default) = default
- expm1(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L720
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.fill!
— Method.
fill!(arr::NDArray, x)
Fill the NDArray arr with the value x, like Base.fill!.
source
# Base.floor
— Method.
floor(data)
Returns element-wise floor of the input.
The floor of the scalar x is the largest integer i, such that i <= x.
Example::
floor([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-3., -2., 1., 1., 2.]
The storage type of the floor output depends on the input storage type:
- floor(default) = default
- floor(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L482
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.getindex
— Method.
getindex(arr::NDArray, idx)
Shortcut for slice. A typical use is to write
arr[:] += 5
which translates into
arr[:] = arr[:] + 5
which further translates into
setindex!(arr, getindex(arr, Colon()) + 5, Colon())
Note
The behavior is quite different from indexing into Julia's Array. For example, arr[2:5] creates a copy of the sub-array for a Julia Array, while for an NDArray, this is a slice that shares the memory.
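The difference between a copying index and a memory-sharing slice can be illustrated with plain Julia arrays, where @view plays the role that slicing plays for NDArray. This is an analogy under that assumption, not MXNet code.

```julia
a = collect(1.0:6.0)

# Indexing a Julia Array with a range makes a copy.
copied = a[2:5]
copied[1] = 100.0
a2_after_copy_write = a[2]     # the parent array is unchanged

# A view shares memory with the parent, like an NDArray slice.
shared = @view a[2:5]
shared[1] = -1.0
a2_after_view_write = a[2]     # the parent array sees the write
```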
source
# Base.getindex
— Method.
Shortcut for slice. NOTE: the behavior of Julia's built-in index slicing is to create a copy of the sub-array, while here we simply call slice, which shares the underlying memory.
source
# Base.hcat
— Method.
hcat(x::NDArray...)
source
# Base.identity
— Method.
identity(data)
identity is an alias of _copy.
Returns a copy of the input.
From:src/operator/tensor/elemwise_unary_op_basic.cc:111
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.length
— Method.
length(x::NDArray)
Get the number of elements in an NDArray.
source
# Base.log
— Method.
log(data)
Returns the element-wise natural logarithm of the input.
The natural logarithm is the logarithm in base e, so that log(exp(x)) = x.
The storage type of the log output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L653
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.log10
— Method.
log10(data)
Returns the element-wise base-10 logarithm of the input.
10**log10(x) = x
The storage type of the log10 output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L665
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.log1p
— Method.
log1p(data)
Returns the element-wise log(1 + x) value of the input.
This function is more accurate than log(1 + x) for small x, where 1 + x ≈ 1.
The storage type of the log1p output depends on the input storage type:
- log1p(default) = default
- log1p(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L702
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.log2
— Method.
log2(data)
Returns the element-wise base-2 logarithm of the input.
2**log2(x) = x
The storage type of the log2 output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L677
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.maximum
— Method.
maximum(arr::NDArray, dims)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L160
source
# Base.maximum
— Method.
maximum(arr::NDArray)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L160
source
# Base.mean
— Method.
mean(arr::NDArray, region)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L101
source
# Base.mean
— Method.
mean(arr::NDArray)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L101
source
# Base.minimum
— Method.
minimum(arr::NDArray, dims)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L174
source
# Base.minimum
— Method.
minimum(arr::NDArray)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L174
source
# Base.ndims
— Method.
ndims(x::NDArray)
Get the number of dimensions of an NDArray. This is equivalent to length(size(arr)).
source
# Base.permutedims
— Method.
permutedims(arr::NDArray, axes)
Defined in src/operator/tensor/matrix_op.cc:L257
source
# Base.prod
— Method.
prod(arr::NDArray, dims)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L116
source
# Base.prod
— Method.
prod(arr::NDArray)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L116
source
# Base.repeat
— Method.
repeat(data, repeats, axis)
Repeats elements of an array.
By default, repeat flattens the input array into 1-D and then repeats the elements::
x = [[ 1, 2], [ 3, 4]]
repeat(x, repeats=2) = [ 1., 1., 2., 2., 3., 3., 4., 4.]
The parameter axis specifies the axis along which to perform the repeat::
repeat(x, repeats=2, axis=1) = [[ 1., 1., 2., 2.], [ 3., 3., 4., 4.]]
repeat(x, repeats=2, axis=0) = [[ 1., 2.], [ 1., 2.], [ 3., 4.], [ 3., 4.]]
repeat(x, repeats=2, axis=-1) = [[ 1., 1., 2., 2.], [ 3., 3., 4., 4.]]
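The same three results can be reproduced on a plain Julia matrix with Base.repeat and its inner keyword. This is a sketch of the semantics only: the matrix literal is written so that its rows match the rows of the example above, and no MXNet function is called.

```julia
x = [1 2; 3 4]

# Flatten in row-major reading order, then repeat each element twice.
flat = repeat(vec(permutedims(x)), inner = 2)

# Repeat each element twice along the columns (axis=1 in the example above).
along_columns = repeat(x, inner = (1, 2))

# Repeat each element twice along the rows (axis=0 in the example above).
along_rows = repeat(x, inner = (2, 1))
```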
Defined in src/operator/tensor/matrix_op.cc:L563
Arguments
- data::NDArray-or-SymbolicNode: Input data array
- repeats::int, required: The number of repetitions for each element.
- axis::int or None, optional, default='None': The axis along which to repeat values. Negative numbers are interpreted as counting from the end. By default, the flattened input array is used and a flat output array is returned.
source
# Base.reverse
— Method.
reverse(data, axis)
Reverses the order of elements along given axis while preserving array shape.
Note: reverse and flip are equivalent. We use reverse in the following examples.
Examples::
x = [[ 0., 1., 2., 3., 4.], [ 5., 6., 7., 8., 9.]]
reverse(x, axis=0) = [[ 5., 6., 7., 8., 9.], [ 0., 1., 2., 3., 4.]]
reverse(x, axis=1) = [[ 4., 3., 2., 1., 0.], [ 9., 8., 7., 6., 5.]]
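For comparison, Base.reverse with the dims keyword gives the same results on a plain Julia matrix laid out like the example. The mapping of axis=0 to dims = 1 here refers only to this visual layout and is an assumption of the sketch.

```julia
x = [0 1 2 3 4; 5 6 7 8 9]

# Swap the two rows (axis=0 in the example above).
rows_reversed = reverse(x, dims = 1)

# Reverse the elements within each row (axis=1 in the example above).
columns_reversed = reverse(x, dims = 2)
```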
Defined in src/operator/tensor/matrix_op.cc:L665
Arguments
- data::NDArray-or-SymbolicNode: Input data array
- axis::Shape(tuple), required: The axis along which to reverse elements.
source
# Base.round
— Method.
round(data)
Returns element-wise rounded value to the nearest integer of the input.
Example::
round([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 2., -2., 2., 2.]
The storage type of the round output depends on the input storage type:
- round(default) = default
- round(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L423
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.setindex!
— Method.
setindex!(arr::NDArray, val, idx)
Assign values to an NDArray. The following scenarios are supported:
- Single value assignment via linear indexing: arr[42] = 24
- arr[:] = val: whole-array assignment; val can be a scalar or an array (Julia Array or NDArray) of the same shape.
- arr[start:stop] = val: assignment to a slice; val can be a scalar or an array of the same shape as the slice. See also slice.
source
# Base.sign
— Method.
sign(data)
Returns element-wise sign of the input.
Example::
sign([-2, 0, 3]) = [-1, 0, 1]
The storage type of the sign output depends on the input storage type:
- sign(default) = default
- sign(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L404
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.similar
— Method.
similar(x::NDArray)
Create an NDArray with the same shape, data type, and context as the given one. Note that the returned NDArray is uninitialized.
source
# Base.size
— Method.
size(x::NDArray)
size(x::NDArray, dims...)
Get the shape of an NDArray. The shape is in Julia's column-major convention. See also the notes on NDArray shapes in NDArray.
source
# Base.sort
— Method.
sort(data, axis, is_ascend)
Returns a sorted copy of an input array along the given axis.
Examples::
x = [[ 1, 4], [ 3, 1]]
// sorts along the last axis
sort(x) = [[ 1., 4.], [ 1., 3.]]
// flattens and then sorts
sort(x) = [ 1., 1., 3., 4.]
// sorts along the first axis
sort(x, axis=0) = [[ 1., 1.], [ 3., 4.]]
// in a descend order
sort(x, is_ascend=0) = [[ 4., 1.], [ 3., 1.]]
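The four cases can be mirrored with Base.sort on a plain Julia matrix written with the same rows as the example. The correspondence between axis values and dims values applies to this visual layout only and is an assumption of the sketch.

```julia
x = [1 4; 3 1]

last_axis = sort(x, dims = 2)               # sort within each row
flattened = sort(vec(x))                    # flatten, then sort
first_axis = sort(x, dims = 1)              # sort within each column
descending = sort(x, dims = 2, rev = true)  # sort each row in descending order
```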
Defined in src/operator/tensor/ordering_op.cc:L126
Arguments
- data::NDArray-or-SymbolicNode: The input array
- axis::int or None, optional, default='-1': Axis along which to sort the input tensor. If not given, the flattened array is used. Default is -1.
- is_ascend::boolean, optional, default=1: Whether to sort in ascending or descending order.
source
# Base.split
— Method.
split(data, num_outputs, axis, squeeze_axis)
split is an alias of SliceChannel.
Splits an array along a particular axis into multiple sub-arrays.
Note: SliceChannel is deprecated. Use split instead.
Note that num_outputs should evenly divide the length of the axis along which to split the array.
Example::
x = [[[ 1.]
      [ 2.]]
     [[ 3.]
      [ 4.]]
     [[ 5.]
      [ 6.]]]
x.shape = (3, 2, 1)
y = split(x, axis=1, num_outputs=2) // a list of 2 arrays with shape (3, 1, 1)
y = [[[ 1.]]
     [[ 3.]]
     [[ 5.]]]
    [[[ 2.]]
     [[ 4.]]
     [[ 6.]]]
y[0].shape = (3, 1, 1)
z = split(x, axis=0, num_outputs=3) // a list of 3 arrays with shape (1, 2, 1)
z = [[[ 1.]
      [ 2.]]]
    [[[ 3.]
      [ 4.]]]
    [[[ 5.]
      [ 6.]]]
z[0].shape = (1, 2, 1)
squeeze_axis=1 removes the axis with length 1 from the shapes of the output arrays. Note that setting squeeze_axis to 1 removes the axis with length 1 only along the axis on which the array is split. Also, squeeze_axis can be set to true only if input.shape[axis] == num_outputs.
Example::
z = split(x, axis=0, num_outputs=3, squeeze_axis=1) // a list of 3 arrays with shape (2, 1)
z = [[ 1.]
     [ 2.]]
    [[ 3.]
     [ 4.]]
    [[ 5.]
     [ 6.]]
z[0].shape = (2, 1)
Defined in src/operator/slice_channel.cc:L107
Arguments
- data::NDArray-or-SymbolicNode: The input
- num_outputs::int, required: Number of splits. Note that this should evenly divide the length of the axis.
- axis::int, optional, default='1': Axis along which to split.
- squeeze_axis::boolean, optional, default=0: If true, removes the axis with length 1 from the shapes of the output arrays. Note that setting squeeze_axis to true removes the axis with length 1 only along the axis on which the array is split. Also, squeeze_axis can be set to true only if input.shape[axis] == num_outputs.
source
# Base.sqrt
— Method.
sqrt(data)
Returns element-wise square-root value of the input.
.. math:: \textrm{sqrt}(x) = \sqrt{x}
Example::
sqrt([4, 9, 16]) = [2, 3, 4]
The storage type of the sqrt output depends on the input storage type:
- sqrt(default) = default
- sqrt(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L564
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.squeeze
— Method.
squeeze(data, axis, num_args)
Remove single-dimensional entries from the shape of an array. In most cases the output tensor shape is defined the same way as in numpy.squeeze; see the note below for the exception.
Examples::
data = [[[0], [1], [2]]]
squeeze(data) = [0, 1, 2]
squeeze(data, axis=0) = [[0], [1], [2]]
squeeze(data, axis=2) = [[0, 1, 2]]
squeeze(data, axis=(0, 2)) = [0, 1, 2]
Note: The output of this operator keeps at least one dimension. For example, squeeze([[[4]]]) = [4], while in numpy.squeeze the output becomes a scalar.
Arguments
- data::NDArray-or-SymbolicNode[]: data to squeeze
- axis::int, optional, default='0': The axis in the result array along which the input arrays are stacked.
- num_args::int, required: Number of inputs to be stacked.
source
# Base.sum
— Method.
sum(arr::NDArray, dims)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L85
source
# Base.sum
— Method.
sum(arr::NDArray)
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L85
source
# Base.transpose
— Method.
transpose(arr::NDArray{T, 1}) where T
Defined in src/operator/tensor/matrix_op.cc:L165
source
# Base.transpose
— Method.
transpose(arr::NDArray{T, 2}) where T
Defined in src/operator/tensor/matrix_op.cc:L257
source
# Base.trunc
— Method.
trunc(data)
Return the element-wise truncated value of the input.
The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
Example::
trunc([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 1., 1., 2.]
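Truncation differs from floor and ceil only in which direction it moves: always toward zero. The plain Julia sketch below applies the Base functions of the same names to the example input; it does not call the MXNet operators.

```julia
inputs = [-2.1, -1.9, 1.5, 1.9, 2.1]

truncated = trunc.(inputs)   # discard the fractional part (toward zero)
floored = floor.(inputs)     # largest integer <= x
ceiled = ceil.(inputs)       # smallest integer >= x
```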
The storage type of the trunc output depends on the input storage type:
- trunc(default) = default
- trunc(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L502
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# Base.vcat
— Method.
vcat(x::NDArray...)
source
# MXNet.mx.Activation
— Method.
Activation(data, act_type)
Applies an activation function element-wise to the input.
The following activation functions are supported:
- relu: Rectified Linear Unit, $y = \max(x, 0)$
- sigmoid: $y = \frac{1}{1 + \exp(-x)}$
- tanh: Hyperbolic tangent, $y = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$
- softrelu: Soft ReLU, or SoftPlus, $y = \log(1 + \exp(x))$
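The four formulas can be written directly as scalar Julia functions and broadcast over an array. The names ending in _ref are hypothetical helpers for this sketch, not MXNet functions.

```julia
# Scalar reference versions of the four activation formulas listed above.
relu_ref(x) = max(x, zero(x))
sigmoid_ref(x) = 1 / (1 + exp(-x))
tanh_ref(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
softrelu_ref(x) = log(1 + exp(x))

x = [-2.0, 0.0, 3.0]

relu_out = relu_ref.(x)
sigmoid_out = sigmoid_ref.(x)
tanh_out = tanh_ref.(x)
softrelu_out = softrelu_ref.(x)
```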
Defined in src/operator/nn/activation.cc:L92
Arguments
- data::NDArray-or-SymbolicNode: Input array to activation function.
- act_type::{'relu', 'sigmoid', 'softrelu', 'tanh'}, required: Activation function to be applied.
source
# MXNet.mx.BatchNorm
— Method.
BatchNorm(data, gamma, beta, moving_mean, moving_var, eps, momentum, fix_gamma, use_global_stats, output_mean_var, axis, cudnn_off)
Batch normalization.
Normalizes a data batch by mean and variance, and applies a scale gamma as well as an offset beta.
Assume the input has more than one dimension and we normalize along axis 1. We first compute the mean and variance along this axis:
.. math::
data_mean[i] = mean(data[:,i,:,...]) \\
data_var[i] = var(data[:,i,:,...])
Then compute the normalized output, which has the same shape as the input, as follows:
.. math::
out[:,i,:,...] = \frac{data[:,i,:,...] - data_mean[i]}{\sqrt{data_var[i]+\epsilon}} * gamma[i] + beta[i]
Both mean and var return a scalar by treating the input as a vector.
Assume the input has size k on axis 1; then both gamma and beta have shape (k,). If output_mean_var is set to true, then both data_mean and data_var are output as well, which are needed for the backward pass.
Besides the inputs and the outputs, this operator accepts two auxiliary states, moving_mean and moving_var, which are k-length vectors. They are global statistics for the whole dataset, which are updated by::
moving_mean = moving_mean * momentum + data_mean * (1 - momentum)
moving_var = moving_var * momentum + data_var * (1 - momentum)
If use_global_stats is set to true, then moving_mean and moving_var are used instead of data_mean and data_var to compute the output. It is often used during inference.
The parameter axis specifies which axis of the input shape denotes the 'channel' (separately normalized groups). The default is 1. Specifying -1 sets the channel axis to be the last item in the input shape.
Both gamma and beta are learnable parameters. But if fix_gamma is true, then gamma is set to 1 and its gradient to 0.
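The formulas above can be traced on a small 2-D input in plain Julia. This is a sketch under stated assumptions, not the operator's implementation: the data is a matrix with one row per sample and one column per channel (so column i plays the role of data[:,i]), the variance is taken without Bessel's correction, and batch_norm_ref and update_moving are hypothetical helper names.

```julia
using Statistics

# Normalize each column (channel) of `data`, then scale by gamma and shift by beta.
function batch_norm_ref(data::AbstractMatrix, gamma::AbstractVector,
                        beta::AbstractVector; eps = 0.001)
    data_mean = mean(data, dims = 1)                     # 1×k row of channel means
    data_var = var(data, dims = 1, corrected = false)    # 1×k row of channel variances
    out = (data .- data_mean) ./ sqrt.(data_var .+ eps) .* gamma' .+ beta'
    return out, vec(data_mean), vec(data_var)
end

# Moving-statistics update from the text above.
update_moving(moving, batch_stat; momentum = 0.9) =
    moving .* momentum .+ batch_stat .* (1 - momentum)

data = [1.0 10.0; 3.0 30.0]      # 2 samples, k = 2 channels
gamma = ones(2)
beta = zeros(2)

out, data_mean, data_var = batch_norm_ref(data, gamma, beta)
moving_mean = update_moving(zeros(2), data_mean)
```

With gamma set to 1 and beta set to 0, each column of out has zero mean, and moving_mean moves a fraction (1 - momentum) of the way toward the batch mean.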
Defined in src/operator/nn/batch_norm.cc:L400
Arguments
- data::NDArray-or-SymbolicNode: Input data to batch normalization
- gamma::NDArray-or-SymbolicNode: gamma array
- beta::NDArray-or-SymbolicNode: beta array
- moving_mean::NDArray-or-SymbolicNode: running mean of input
- moving_var::NDArray-or-SymbolicNode: running variance of input
- eps::double, optional, default=0.001: Epsilon to prevent division by zero. Must be no less than CUDNN_BN_MIN_EPSILON defined in cudnn.h when using cudnn (usually 1e-5)
- momentum::float, optional, default=0.9: Momentum for moving average
- fix_gamma::boolean, optional, default=1: Fix gamma while training
- use_global_stats::boolean, optional, default=0: Whether to use global moving statistics instead of local batch statistics. This forces batch-norm to act as a scale-shift operator.
- output_mean_var::boolean, optional, default=0: Output the mean and variance in addition to the normalized output
- axis::int, optional, default='1': Specifies which axis of the input shape is the channel axis
- cudnn_off::boolean, optional, default=0: Do not select CUDNN operator, if available
source
# MXNet.mx.BatchNorm_v1
— Method.
BatchNorm_v1(data, gamma, beta, eps, momentum, fix_gamma, use_global_stats, output_mean_var)
Batch normalization.
Normalizes a data batch by mean and variance, and applies a scale $gamma$ as well as offset $beta$.
Assume the input has more than one dimension and we normalize along axis 1. We first compute the mean and variance along this axis:
.. math::
data_mean[i] = mean(data[:,i,:,...]) \\
data_var[i] = var(data[:,i,:,...])
Then compute the normalized output, which has the same shape as the input, as follows:
.. math::
out[:,i,:,...] = \frac{data[:,i,:,...] - data_mean[i]}{\sqrt{data_var[i]+\epsilon}} * gamma[i] + beta[i]
Both mean and var return a scalar by treating the input as a vector.
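The two formulas above can be traced with a small pure-Python sketch. It assumes a 3-D input laid out as (batch, channel, width) and a biased (population) variance; the function name `batch_norm` is hypothetical and does not call into MXNet.

```python
import math


def batch_norm(data, gamma, beta, eps=1e-3):
    """Normalize a (batch, channel, width) nested list along axis 1.

    Returns the normalized output plus the per-channel mean and variance.
    Assumption: var is the biased (divide-by-count) variance.
    """
    num_channels = len(data[0])
    out = [[None] * num_channels for _ in data]
    means, variances = [], []
    for i in range(num_channels):
        # Treat everything that belongs to channel i as one flat vector.
        values = [v for sample in data for v in sample[i]]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        variances.append(var)
        for n, sample in enumerate(data):
            out[n][i] = [(v - mean) / math.sqrt(var + eps) * gamma[i] + beta[i]
                         for v in sample[i]]
    return out, means, variances


data = [[[1.0, 2.0], [10.0, 20.0]],
        [[3.0, 4.0], [30.0, 40.0]]]
out, means, variances = batch_norm(data, gamma=[1.0, 1.0], beta=[0.0, 0.0])
print(means, variances)
```

Each channel is normalized with its own statistics, so the two channels end up on the same scale even though their raw values differ by a factor of ten.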
Assume the input has size k on axis 1, then both $gamma$ and $beta$ have shape (k,). If $output_mean_var$ is set to be true, then outputs both $data_mean$ and $data_var$ as well, which are needed for the backward pass.
Besides the inputs and the outputs, this operator accepts two auxiliary states, $moving_mean$ and $moving_var$, which are k-length vectors. They are global statistics for the whole dataset, which are updated by::
    moving_mean = moving_mean * momentum + data_mean * (1 - momentum)
    moving_var = moving_var * momentum + data_var * (1 - momentum)
If $use_global_stats$ is set to be true, then $moving_mean$ and $moving_var$ are used instead of $data_mean$ and $data_var$ to compute the output. It is often used during inference.
Both $gamma$ and $beta$ are learnable parameters. However, if $fix_gamma$ is true, $gamma$ is fixed at 1 and its gradient is set to 0.
Defined in src/operator/batch_norm_v1.cc:L90
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to batch normalization -
gamma::NDArray-or-SymbolicNode
: gamma array -
beta::NDArray-or-SymbolicNode
: beta array -
eps::float, optional, default=0.001
: Epsilon to prevent div 0 -
momentum::float, optional, default=0.9
: Momentum for moving average -
fix_gamma::boolean, optional, default=1
: Fix gamma while training -
use_global_stats::boolean, optional, default=0
: Whether to use global moving statistics instead of local batch statistics. This turns batch-norm into a scale-shift operator. -
output_mean_var::boolean, optional, default=0
: Whether to output the mean and variance in addition to the normalized output
source
# MXNet.mx.BilinearSampler
— Method.
BilinearSampler(data, grid)
Applies bilinear sampling to input feature map.
Bilinear sampling is the key component of [NIPS2015] "Spatial Transformer Networks". The usage of the operator is very similar to the remap function in OpenCV, except that this operator has a backward pass.
Given :math:`data` and :math:`grid`, the output is computed by
.. math::
x_{src} = grid[batch, 0, y_{dst}, x_{dst}] \\
y_{src} = grid[batch, 1, y_{dst}, x_{dst}] \\
output[batch, channel, y_{dst}, x_{dst}] = G(data[batch, channel, y_{src}, x_{src}])
:math:`x_{dst}`, :math:`y_{dst}` enumerate all spatial locations in :math:`output`, and :math:`G()` denotes the bilinear interpolation kernel. Out-of-boundary points are padded with zeros. The shape of the output is (data.shape[0], data.shape[1], grid.shape[2], grid.shape[3]).
The operator assumes that :math:`data` has 'NCHW' layout and :math:`grid` has been normalized to [-1, 1].
BilinearSampler is often used together with GridGenerator, which generates sampling grids for BilinearSampler. GridGenerator supports two kinds of transformation: $affine$ and $warp$. If users want to design a CustomOp to manipulate :math:`grid`, please first refer to the code of GridGenerator.
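To make the sampling rule concrete, here is a minimal pure-Python sketch of bilinear interpolation at one normalized source location. It assumes that -1 maps to the first pixel and +1 to the last pixel along each axis, and that out-of-boundary neighbors contribute zero; `bilinear_sample` is a hypothetical helper, not an MXNet function.

```python
import math


def bilinear_sample(image, x_src, y_src):
    """Sample a 2-D image (list of rows) at normalized coordinates in [-1, 1]."""
    height, width = len(image), len(image[0])
    # Assumed mapping from normalized to pixel coordinates.
    x = (x_src + 1) * (width - 1) / 2
    y = (y_src + 1) * (height - 1) / 2
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    # Blend the four surrounding pixels, weighting each by its distance.
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < height and 0 <= xx < width:  # outside pixels count as zero
                value += wy * wx * image[yy][xx]
    return value


image = [[1, 4, 3, 6],
         [1, 8, 8, 9],
         [0, 4, 1, 5],
         [1, 0, 1, 3]]
# A source location two thirds of the way toward the top-left corner.
print(bilinear_sample(image, -2 / 3, -2 / 3))
```

Sampling at a location that falls halfway between four pixels simply averages them, while a location outside [-1, 1] returns zero because all of its neighbors are padding.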
Example 1::
    # Zoom out data two times
    data = array([[[[1, 4, 3, 6],
                    [1, 8, 8, 9],
                    [0, 4, 1, 5],
                    [1, 0, 1, 3]]]])
    affine_matrix = array([[2, 0, 0],
                           [0, 2, 0]])
    affine_matrix = reshape(affine_matrix, shape=(1, 6))
    grid = GridGenerator(data=affine_matrix, transform_type='affine', target_shape=(4, 4))
    out = BilinearSampler(data, grid)
    out
    [[[[ 0,    0,   0, 0],
       [ 0,  3.5, 6.5, 0],
       [ 0, 1.25, 2.5, 0],
       [ 0,    0,   0, 0]]]]
Example 2::
    # Shift data horizontally by -1 pixel
    data = array([[[[1, 4, 3, 6],
                    [1, 8, 8, 9],
                    [0, 4, 1, 5],
                    [1, 0, 1, 3]]]])
    warp_matrix = array([[[[1, 1, 1, 1],
                           [1, 1, 1, 1],
                           [1, 1, 1, 1],
                           [1, 1, 1, 1]],
                          [[0, 0, 0, 0],
                           [0, 0, 0, 0],
                           [0, 0, 0, 0],
                           [0, 0, 0, 0]]]])
    grid = GridGenerator(data=warp_matrix, transform_type='warp')
    out = BilinearSampler(data, grid)
    out
    [[[[ 4, 3, 6, 0],
       [ 8, 8, 9, 0],
       [ 4, 1, 5, 0],
       [ 0, 1, 3, 0]]]]
Defined in src/operator/bilinear_sampler.cc:L245
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the BilinearsamplerOp. -
grid::NDArray-or-SymbolicNode
: Input grid to the BilinearsamplerOp. grid has two channels: x_src, y_src
source
# MXNet.mx.BlockGrad
— Method.
BlockGrad(data)
Stops gradient computation.
Stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. In other words, this operator prevents the contribution of its inputs from being taken into account when computing gradients.
Example::
    v1 = [1, 2]
    v2 = [0, 1]
    a = Variable('a')
    b = Variable('b')
    b_stop_grad = stop_gradient(3 * b)
    loss = MakeLoss(b_stop_grad + a)

    executor = loss.simple_bind(ctx=cpu(), a=(1,2), b=(1,2))
    executor.forward(is_train=True, a=v1, b=v2)
    executor.outputs
    [ 1.  5.]

    executor.backward()
    executor.grad_arrays
    [ 0.  0.]
    [ 1.  1.]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L166
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.Cast
— Method.
Cast(data, dtype)
Casts all elements of the input to a new type.
.. note:: $Cast$ is deprecated. Use $cast$ instead.
Example::
    cast([0.9, 1.3], dtype='int32') = [0, 1]
    cast([1e20, 11.1], dtype='float16') = [inf, 11.09375]
    cast([300, 11.1, 10.9, -1, -3], dtype='uint8') = [44, 11, 10, 255, 253]
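The integer results in the example are consistent with truncation toward zero followed, for uint8, by wrap-around modulo 256. The sketch below reproduces only that arithmetic in plain Python; the helper names are hypothetical, and the float16 case is left out because it depends on the rounding mode of the conversion.

```python
def cast_to_int32(values):
    """Truncate toward zero, as in the int32 example (0.9 -> 0)."""
    return [int(v) for v in values]


def cast_to_uint8(values):
    """Truncate toward zero, then wrap into the 0..255 range."""
    return [int(v) % 256 for v in values]


print(cast_to_int32([0.9, 1.3]))                  # [0, 1]
print(cast_to_uint8([300, 11.1, 10.9, -1, -3]))   # [44, 11, 10, 255, 253]
```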
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L310
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
dtype::{'float16', 'float32', 'float64', 'int32', 'uint8'}, required
: Output data type.
source
# MXNet.mx.Concat
— Method.
Concat(data, num_args, dim)
Note: Concat takes a variable number of positional inputs. So instead of calling Concat([x, y, z], num_args=3), call Concat(x, y, z); num_args will be determined automatically.
Joins input arrays along a given axis.
.. note:: `Concat` is deprecated. Use `concat` instead.
The dimensions of the input arrays should be the same except the axis along which they will be concatenated. The dimension of the output array along the concatenated axis will be equal to the sum of the corresponding dimensions of the input arrays.
Example::
    x = [[1,1],[2,2]]
    y = [[3,3],[4,4],[5,5]]
    z = [[6,6], [7,7],[8,8]]

    concat(x,y,z,dim=0) = [[ 1.,  1.],
                           [ 2.,  2.],
                           [ 3.,  3.],
                           [ 4.,  4.],
                           [ 5.,  5.],
                           [ 6.,  6.],
                           [ 7.,  7.],
                           [ 8.,  8.]]

    Note that you cannot concat x,y,z along dimension 1 since dimension 0
    is not the same for all the input arrays.

    concat(y,z,dim=1) = [[ 3.,  3.,  6.,  6.],
                         [ 4.,  4.,  7.,  7.],
                         [ 5.,  5.,  8.,  8.]]
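The shape rule can be checked with a small sketch that works on 2-D nested lists. The function below is a hypothetical stand-in that mirrors the example, not the MXNet operator, and it only handles dimensions 0 and 1.

```python
def concat(*arrays, dim=1):
    """Concatenate 2-D nested lists along dimension 0 (rows) or 1 (columns)."""
    if dim == 0:
        return [list(row) for array in arrays for row in array]
    if dim == 1:
        # Every dimension other than `dim` must agree, here the row count.
        if len({len(array) for array in arrays}) != 1:
            raise ValueError("all inputs must have the same size along dimension 0")
        return [sum((list(array[r]) for array in arrays), [])
                for r in range(len(arrays[0]))]
    raise ValueError("this sketch only handles 2-D inputs")


x = [[1, 1], [2, 2]]
y = [[3, 3], [4, 4], [5, 5]]
z = [[6, 6], [7, 7], [8, 8]]
print(concat(x, y, z, dim=0))
print(concat(y, z, dim=1))
```

Calling `concat(x, y, z, dim=1)` raises a `ValueError` in this sketch because x has two rows while y and z have three.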
Defined in src/operator/concat.cc:L104
Arguments
-
data::NDArray-or-SymbolicNode[]
: List of arrays to concatenate -
num_args::int, required
: Number of inputs to be concatenated. -
dim::int, optional, default='1'
: The dimension along which to concatenate.
source
# MXNet.mx.Convolution
— Method.
Convolution(data, weight, bias, kernel, stride, dilate, pad, num_filter, num_group, workspace, no_bias, cudnn_tune, cudnn_off, layout)
Compute N-D convolution on (N+2)-D input.
In the 2-D convolution, given input data with shape (batch_size, channel, height, width), the output is computed by
.. math::
out[n,i,:,:] = bias[i] + \sum_{j=0}^{channel} data[n,j,:,:] \star weight[i,j,:,:]
where :math:`\star` is the 2-D cross-correlation operator.
For general 2-D convolution, the shapes are
- data: (batch_size, channel, height, width)
- weight: (num_filter, channel, kernel[0], kernel[1])
- bias: (num_filter,)
- out: (batch_size, num_filter, out_height, out_width).
Define::
    f(x,k,p,s,d) = floor((x+2p-d(k-1)-1)/s)+1
then we have::
    out_height=f(height, kernel[0], pad[0], stride[0], dilate[0])
    out_width=f(width, kernel[1], pad[1], stride[1], dilate[1])
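The output-size formula is easy to check numerically. The sketch below is a direct transcription of f into Python; the function name `conv_output_size` is chosen for the example.

```python
def conv_output_size(x, k, p=0, s=1, d=1):
    """f(x, k, p, s, d) = floor((x + 2p - d(k - 1) - 1) / s) + 1"""
    return (x + 2 * p - d * (k - 1) - 1) // s + 1


# 3x3 kernel, pad 1, stride 1: the spatial size is preserved.
print(conv_output_size(28, k=3, p=1))          # 28
# 7x7 kernel, pad 3, stride 2: the spatial size is halved.
print(conv_output_size(224, k=7, p=3, s=2))    # 112
# Dilation 2 makes a 3x3 kernel cover a 5x5 window.
print(conv_output_size(32, k=3, d=2))          # 28
```

Applying the function once per spatial axis gives out_height and out_width for the 2-D case.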
If $no_bias$ is set to be true, then the $bias$ term is ignored.
The default data $layout$ is NCHW, namely (batch_size, channel, height, width). We can choose other layouts such as NHWC.
If $num_group$ is larger than 1, denoted by g, then split the input $data$ evenly into g parts along the channel axis, and also evenly split $weight$ along the first dimension. Next compute the convolution on the i-th part of the data with the i-th weight part. The output is obtained by concatenating all the g results.
1-D convolution does not have height dimension but only width in space.
- data: (batch_size, channel, width)
- weight: (num_filter, channel, kernel[0])
- bias: (num_filter,)
- out: (batch_size, num_filter, out_width).
3-D convolution adds an additional depth dimension besides height and width. The shapes are
- data: (batch_size, channel, depth, height, width)
- weight: (num_filter, channel, kernel[0], kernel[1], kernel[2])
- bias: (num_filter,)
- out: (batch_size, num_filter, out_depth, out_height, out_width).
Both $weight$ and $bias$ are learnable parameters.
There are other options to tune the performance.
- cudnn_tune: enabling this option leads to a higher startup time but may give faster speed. Options are:
    - off: no tuning
    - limited_workspace: run tests and pick the fastest algorithm that doesn't exceed the workspace limit.
    - fastest: pick the fastest algorithm and ignore the workspace limit.
    - None (default): the behavior is determined by the environment variable $MXNET_CUDNN_AUTOTUNE_DEFAULT$: 0 for off, 1 for limited workspace (default), 2 for fastest.
- workspace: a large number leads to more (GPU) memory usage but may improve performance.
Defined in src/operator/nn/convolution.cc:L170
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the ConvolutionOp. -
weight::NDArray-or-SymbolicNode
: Weight matrix. -
bias::NDArray-or-SymbolicNode
: Bias parameter. -
kernel::Shape(tuple), required
: Convolution kernel size: (w,), (h, w) or (d, h, w) -
stride::Shape(tuple), optional, default=[]
: Convolution stride: (w,), (h, w) or (d, h, w). Defaults to 1 for each dimension. -
dilate::Shape(tuple), optional, default=[]
: Convolution dilate: (w,), (h, w) or (d, h, w). Defaults to 1 for each dimension. -
pad::Shape(tuple), optional, default=[]
: Zero pad for convolution: (w,), (h, w) or (d, h, w). Defaults to no padding. -
num_filter::int (non-negative), required
: Convolution filter(channel) number -
num_group::int (non-negative), optional, default=1
: Number of group partitions. -
workspace::long (non-negative), optional, default=1024
: Maximum temporary workspace allowed for convolution (MB). -
no_bias::boolean, optional, default=0
: Whether to disable bias parameter. -
cudnn_tune::{None, 'fastest', 'limited_workspace', 'off'}, optional, default='None'
: Whether to pick the convolution algorithm by running a performance test. -
cudnn_off::boolean, optional, default=0
: Turn off cudnn for this layer. -
layout::{None, 'NCDHW', 'NCHW', 'NCW', 'NDHWC', 'NHWC'}, optional, default='None'
: Set layout for input, output and weight. Empty for default layout: NCW for 1d, NCHW for 2d and NCDHW for 3d.
source
# MXNet.mx.Convolution_v1
— Method.
Convolution_v1(data, weight, bias, kernel, stride, dilate, pad, num_filter, num_group, workspace, no_bias, cudnn_tune, cudnn_off, layout)
This operator is DEPRECATED. Apply convolution to input then add a bias.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the ConvolutionV1Op. -
weight::NDArray-or-SymbolicNode
: Weight matrix. -
bias::NDArray-or-SymbolicNode
: Bias parameter. -
kernel::Shape(tuple), required
: convolution kernel size: (h, w) or (d, h, w) -
stride::Shape(tuple), optional, default=[]
: convolution stride: (h, w) or (d, h, w) -
dilate::Shape(tuple), optional, default=[]
: convolution dilate: (h, w) or (d, h, w) -
pad::Shape(tuple), optional, default=[]
: pad for convolution: (h, w) or (d, h, w) -
num_filter::int (non-negative), required
: convolution filter(channel) number -
num_group::int (non-negative), optional, default=1
: Number of group partitions. Equivalent to slicing input into num_group partitions, apply convolution on each, then concatenate the results -
workspace::long (non-negative), optional, default=1024
: Maximum temporary workspace allowed for convolution (MB). -
no_bias::boolean, optional, default=0
: Whether to disable bias parameter. -
cudnn_tune::{None, 'fastest', 'limited_workspace', 'off'}, optional, default='None'
: Whether to pick the convolution algorithm by running a performance test. This leads to a higher startup time but may give faster speed. Options are: 'off': no tuning; 'limited_workspace': run tests and pick the fastest algorithm that doesn't exceed the workspace limit; 'fastest': pick the fastest algorithm and ignore the workspace limit. If set to None (default), the behavior is determined by the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT: 0 for off, 1 for limited workspace (default), 2 for fastest. -
cudnn_off::boolean, optional, default=0
: Turn off cudnn for this layer. -
layout::{None, 'NCDHW', 'NCHW', 'NDHWC', 'NHWC'}, optional, default='None'
: Set layout for input, output and weight. Empty for default layout: NCHW for 2d and NCDHW for 3d.
source
# MXNet.mx.Correlation
— Method.
Correlation(data1, data2, kernel_size, max_displacement, stride1, stride2, pad_size, is_multiply)
Applies correlation to inputs.
The correlation layer performs multiplicative patch comparisons between two feature maps.
Given two multi-channel feature maps :math:`f_{1}, f_{2}`, with :math:`w`, :math:`h`, and :math:`c` being their width, height, and number of channels, the correlation layer lets the network compare each patch from :math:`f_{1}` with each patch from :math:`f_{2}`.
For now we consider only a single comparison of two patches. The 'correlation' of two patches centered at :math:`x_{1}` in the first map and :math:`x_{2}` in the second map is then defined as:
.. math::
c(x_{1}, x_{2}) = \sum_{o \in [-k,k] \times [-k,k]} \langle f_{1}(x_{1} + o), f_{2}(x_{2} + o) \rangle
for a square patch of size :math:`K:=2k+1`.
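A single patch comparison can be sketched as follows. The code assumes the correlation sums, over all offsets in the patch, the dot product of the two channel vectors at corresponding positions (the multiplicative comparison described above), and that feature maps are nested lists indexed as [channel][row][col]. `patch_correlation` is a hypothetical helper.

```python
def patch_correlation(f1, f2, x1, x2, k):
    """Correlate the (2k+1) x (2k+1) patches of f1 and f2 centered at x1 and x2.

    x1 and x2 are (row, col) pairs. The caller must keep both patches inside
    the maps; this sketch does no padding or bounds checking.
    """
    total = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            for c in range(len(f1)):  # dot product across channels
                total += (f1[c][x1[0] + dy][x1[1] + dx]
                          * f2[c][x2[0] + dy][x2[1] + dx])
    return total


ones = [[[1, 1, 1], [1, 1, 1], [1, 1, 1]]]    # one channel, 3x3
ramp = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]    # one channel, 3x3
print(patch_correlation(ones, ramp, (1, 1), (1, 1), k=1))
```

There are no weights anywhere in the computation: both operands are data, which is why the layer has nothing to train.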
Note that the equation above is identical to one step of a convolution in neural networks, but instead of convolving data with a filter, it convolves data with other data. For this reason, it has no training weights.
Computing :math:`c(x_{1}, x_{2})` involves :math:`c * K^{2}` multiplications. Comparing all patch combinations involves :math:`w^{2}*h^{2}` such computations.
Given a maximum displacement :math:`d`, for each location :math:`x_{1}` it computes correlations :math:`c(x_{1}, x_{2})` only in a neighborhood of size :math:`D:=2d+1`, by limiting the range of :math:`x_{2}`. We use strides :math:`s_{1}, s_{2}` to quantize :math:`x_{1}` globally and to quantize :math:`x_{2}` within the neighborhood centered around :math:`x_{1}`.
The final output is defined by the following expression:
.. math::
out[n, q, i, j] = c(x_{i, j}, x_{q})
where :math:`i` and :math:`j` enumerate spatial locations in :math:`f_{1}`, and :math:`q` denotes the :math:`q^{th}` neighborhood of :math:`x_{i,j}`.
Defined in src/operator/correlation.cc:L192
Arguments
-
data1::NDArray-or-SymbolicNode
: Input data1 to the correlation. -
data2::NDArray-or-SymbolicNode
: Input data2 to the correlation. -
kernel_size::int (non-negative), optional, default=1
: Kernel size for Correlation; must be an odd number -
max_displacement::int (non-negative), optional, default=1
: Max displacement of Correlation -
stride1::int (non-negative), optional, default=1
: stride1 quantize data1 globally -
stride2::int (non-negative), optional, default=1
: stride2 quantize data2 within the neighborhood centered around data1 -
pad_size::int (non-negative), optional, default=0
: pad for Correlation -
is_multiply::boolean, optional, default=1
: Whether the operation is multiplication (true) or subtraction (false)
source
# MXNet.mx.Crop
— Method.
Crop(data, num_args, offset, h_w, center_crop)
Note: Crop takes a variable number of positional inputs. So instead of calling Crop([x, y, z], num_args=3), call Crop(x, y, z); num_args will be determined automatically.
.. note:: `Crop` is deprecated. Use `slice` instead.
Crops the 2nd and 3rd dimensions of the input data, either to the size given by h_w or to the height and width of the second input symbol. That is, with one input, h_w specifies the crop height and width; otherwise the size of the second input symbol is used.
Defined in src/operator/crop.cc:L50
Arguments
-
data::SymbolicNode or SymbolicNode[]
: Tensor or List of Tensors, the second input will be used as crop_like shape reference -
num_args::int, required
: Number of inputs for crop. If it equals one, h_w is used for the crop height and width; if it equals two, the height and width of the second input symbol (called crop_like here) are used -
offset::Shape(tuple), optional, default=[0,0]
: crop offset coordinate: (y, x) -
h_w::Shape(tuple), optional, default=[0,0]
: crop height and width: (h, w) -
center_crop::boolean, optional, default=0
: If set to true, a center crop is used; otherwise the crop uses the shape of crop_like
source
# MXNet.mx.Custom
— Method.
Custom(data, op_type)
Apply a custom operator implemented in a frontend language (like Python).
Custom operators should override required methods like `forward` and `backward`. The custom operator must be registered before it can be used. Please check the tutorial here: http://mxnet.io/faq/new_op.html.
Defined in src/operator/custom/custom.cc:L369
Arguments
-
data::NDArray-or-SymbolicNode[]
: Input data for the custom operator. -
op_type::string
: Name of the custom operator. This is the name that is passed to `mx.operator.register` to register the operator.
source
# MXNet.mx.Deconvolution
— Method.
Deconvolution(data, weight, bias, kernel, stride, dilate, pad, adj, target_shape, num_filter, num_group, workspace, no_bias, cudnn_tune, cudnn_off, layout)
Computes 1D or 2D transposed convolution (aka fractionally strided convolution) of the input tensor. This operation can be seen as the gradient of Convolution operation with respect to its input. Convolution usually reduces the size of the input. Transposed convolution works the other way, going from a smaller input to a larger output while preserving the connectivity pattern.
Arguments
-
data::NDArray-or-SymbolicNode
: Input tensor to the deconvolution operation. -
weight::NDArray-or-SymbolicNode
: Weights representing the kernel. -
bias::NDArray-or-SymbolicNode
: Bias added to the result after the deconvolution operation. -
kernel::Shape(tuple), required
: Deconvolution kernel size: (w,), (h, w) or (d, h, w). This is the same as the kernel size used for the corresponding convolution -
stride::Shape(tuple), optional, default=[]
: The stride used for the corresponding convolution: (w,), (h, w) or (d, h, w). Defaults to 1 for each dimension. -
dilate::Shape(tuple), optional, default=[]
: Dilation factor for each dimension of the input: (w,), (h, w) or (d, h, w). Defaults to 1 for each dimension. -
pad::Shape(tuple), optional, default=[]
: The amount of implicit zero padding added during convolution for each dimension of the input: (w,), (h, w) or (d, h, w). $(kernel-1)/2$ is usually a good choice. If `target_shape` is set, `pad` will be ignored and a padding that will generate the target shape will be used. Defaults to no padding. -
adj::Shape(tuple), optional, default=[]
: Adjustment for output shape: (w,), (h, w) or (d, h, w). If `target_shape` is set, `adj` will be ignored and computed accordingly. -
target_shape::Shape(tuple), optional, default=[]
: Shape of the output tensor: (w,), (h, w) or (d, h, w). -
num_filter::int (non-negative), required
: Number of output filters. -
num_group::int (non-negative), optional, default=1
: Number of group partitions. -
workspace::long (non-negative), optional, default=512
: Maximum temporary workspace allowed for deconvolution (MB). -
no_bias::boolean, optional, default=1
: Whether to disable bias parameter. -
cudnn_tune::{None, 'fastest', 'limited_workspace', 'off'}, optional, default='None'
: Whether to pick the convolution algorithm by running a performance test. -
cudnn_off::boolean, optional, default=0
: Turn off cudnn for this layer. -
layout::{None, 'NCDHW', 'NCHW', 'NCW', 'NDHWC', 'NHWC'}, optional, default='None'
: Set layout for input, output and weight. Empty for default layout, NCW for 1d, NCHW for 2d and NCDHW for 3d.
source
# MXNet.mx.Dropout
— Method.
Dropout(data, p, mode)
Applies dropout operation to input array.
- During training, each element of the input is set to zero with probability p. The whole array is rescaled by :math:`1/(1-p)` to keep the expected sum of the input unchanged.
- During testing, this operator does not change the input if mode is 'training'. If mode is 'always', the same computation as during training will be applied.
Example::
    random.seed(998)
    input_array = array([[3., 0.5,  -0.5,  2., 7.],
                         [2., -0.4,   7.,  3., 0.2]])
    a = symbol.Variable('a')
    dropout = symbol.Dropout(a, p = 0.2)
    executor = dropout.simple_bind(a = input_array.shape)

    # If training
    executor.forward(is_train = True, a = input_array)
    executor.outputs
    [[ 3.75   0.625 -0.     2.5    8.75 ]
     [ 2.5   -0.5    8.75   3.75   0.   ]]

    # If testing
    executor.forward(is_train = False, a = input_array)
    executor.outputs
    [[ 3.     0.5   -0.5    2.     7.   ]
     [ 2.    -0.4    7.     3.     0.2  ]]
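The rescaling by 1/(1-p) can be illustrated with a few lines of plain Python. This is a sketch of the described behavior under the assumption that each element is dropped independently; the `dropout` function and its `rng` parameter are hypothetical and use Python's own random generator, so the exact pattern of zeros will differ from the MXNet example.

```python
import random


def dropout(values, p=0.5, is_train=True, mode="training", rng=random):
    """Zero each element with probability p (p < 1) and rescale survivors by 1/(1-p)."""
    if not (is_train or mode == "always"):
        return list(values)  # inference with mode='training': identity
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


row = [3.0, 0.5, -0.5, 2.0, 7.0]
rng = random.Random(0)
print(dropout(row, p=0.2, rng=rng))           # survivors are scaled by 1.25
print(dropout(row, p=0.2, is_train=False))    # unchanged at inference time
```

With p = 0.2 a surviving 3.0 becomes 3.75, which matches the scale seen in the training output above.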
Defined in src/operator/nn/dropout.cc:L79
Arguments
-
data::NDArray-or-SymbolicNode
: Input array to which dropout will be applied. -
p::float, optional, default=0.5
: Fraction of the input that gets dropped out during training time. -
mode::{'always', 'training'}, optional, default='training'
: Whether to only turn on dropout during training or to also turn on for inference.
source
# MXNet.mx.ElementWiseSum
— Method.
ElementWiseSum(args)
ElementWiseSum is an alias of add_n.
Note: ElementWiseSum takes a variable number of positional inputs. So instead of calling ElementWiseSum([x, y, z], num_args=3), call ElementWiseSum(x, y, z); num_args will be determined automatically.
Adds all input arguments element-wise.
.. math:: add_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n
$add_n$ is potentially more efficient than calling $add$ n times.
The storage type of the $add_n$ output depends on the storage types of the inputs:
- add_n(row_sparse, row_sparse, ..) = row_sparse
- otherwise, $add_n$ generates output with default storage
Defined in src/operator/tensor/elemwise_sum.cc:L123
Arguments
-
args::NDArray-or-SymbolicNode[]
: Positional input arguments
source
# MXNet.mx.Embedding
— Method.
Embedding(data, weight, input_dim, output_dim, dtype)
Maps integer indices to vector representations (embeddings).
This operator maps words to real-valued vectors in a high-dimensional space, called word embeddings. These embeddings can capture semantic and syntactic properties of the words. For example, it has been noted that in the learned embedding spaces, similar words tend to be close to each other and dissimilar words far apart.
For an input array of shape (d1, ..., dK), the shape of an output array is (d1, ..., dK, output_dim). All the input values should be integers in the range [0, input_dim).
If the input_dim is ip0 and output_dim is op0, then the shape of the embedding weight matrix must be (ip0, op0).
By default, if any index mentioned is too large, it is replaced by the index that addresses the last vector in an embedding matrix.
Examples::
    input_dim = 4
    output_dim = 5

    // Each row in weight matrix y represents a word. So, y = (w0,w1,w2,w3)
    y = [[  0.,   1.,   2.,   3.,   4.],
         [  5.,   6.,   7.,   8.,   9.],
         [ 10.,  11.,  12.,  13.,  14.],
         [ 15.,  16.,  17.,  18.,  19.]]

    // Input array x represents n-grams(2-gram). So, x = [(w1,w3), (w0,w2)]
    x = [[ 1.,  3.],
         [ 0.,  2.]]

    // Mapped input x to its vector representation y.
    Embedding(x, y, 4, 5) = [[[  5.,   6.,   7.,   8.,   9.],
                              [ 15.,  16.,  17.,  18.,  19.]],

                             [[  0.,   1.,   2.,   3.,   4.],
                              [ 10.,  11.,  12.,  13.,  14.]]]
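The lookup itself is just row indexing into the weight matrix. A minimal sketch for a 2-D index array follows; the `embedding` function is hypothetical, clamps indices that are too large to the last row as described above, and does not handle negative indices.

```python
def embedding(indices, weight):
    """Replace every index in a 2-D nested list by the matching row of `weight`."""
    last = len(weight) - 1
    # Indices that are too large fall back to the last embedding vector.
    return [[list(weight[min(int(i), last)]) for i in row] for row in indices]


y = [[0., 1., 2., 3., 4.],
     [5., 6., 7., 8., 9.],
     [10., 11., 12., 13., 14.],
     [15., 16., 17., 18., 19.]]
x = [[1., 3.],
     [0., 2.]]
print(embedding(x, y))
```

The output gains one trailing dimension of size output_dim: a (2, 2) index array yields a (2, 2, 5) result here.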
Defined in src/operator/tensor/indexing_op.cc:L225
Arguments
-
data::NDArray-or-SymbolicNode
: The input array to the embedding operator. -
weight::NDArray-or-SymbolicNode
: The embedding weight matrix. -
input_dim::int, required
: Vocabulary size of the input indices. -
output_dim::int, required
: Dimension of the embedding vectors. -
dtype::{'float16', 'float32', 'float64', 'int32', 'uint8'}, optional, default='float32'
: Data type of weight.
source
# MXNet.mx.FullyConnected
— Method.
FullyConnected(data, weight, bias, num_hidden, no_bias, flatten)
Applies a linear transformation: :math:`Y = XW^T + b`.
If $flatten$ is set to be true, then the shapes are:
- data: `(batch_size, x1, x2, ..., xn)`
- weight: `(num_hidden, x1 * x2 * ... * xn)`
- bias: `(num_hidden,)`
- out: `(batch_size, num_hidden)`
If $flatten$ is set to be false, then the shapes are:
- data: `(x1, x2, ..., xn, input_dim)`
- weight: `(num_hidden, input_dim)`
- bias: `(num_hidden,)`
- out: `(x1, x2, ..., xn, num_hidden)`
The learnable parameters include both $weight$ and $bias$.
If $no_bias$ is set to be true, then the $bias$ term is ignored.
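The transformation Y = XW^T + b can be written out for a 2-D input in a few lines. This is a sketch with nested lists, assuming the data has already been flattened to (batch_size, input_dim); the `fully_connected` function is hypothetical, and passing `bias=None` plays the role of no_bias.

```python
def fully_connected(data, weight, bias=None):
    """data: (batch_size, input_dim), weight: (num_hidden, input_dim), bias: (num_hidden,)."""
    out = []
    for x in data:
        # One dot product per hidden unit, i.e. one row of X times W^T.
        row = [sum(xi * wi for xi, wi in zip(x, w)) for w in weight]
        if bias is not None:
            row = [r + b for r, b in zip(row, bias)]
        out.append(row)
    return out


data = [[1.0, 2.0], [3.0, 4.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # num_hidden = 3
bias = [0.5, 0.5, 0.5]
print(fully_connected(data, weight, bias))
```

Note that each row of weight holds the coefficients of one hidden unit, which is why the formula uses the transpose.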
Defined in src/operator/nn/fully_connected.cc:L98
Arguments
-
data::NDArray-or-SymbolicNode
: Input data. -
weight::NDArray-or-SymbolicNode
: Weight matrix. -
bias::NDArray-or-SymbolicNode
: Bias parameter. -
num_hidden::int, required
: Number of hidden nodes of the output. -
no_bias::boolean, optional, default=0
: Whether to disable bias parameter. -
flatten::boolean, optional, default=1
: Whether to collapse all but the first axis of the input data tensor.
source
# MXNet.mx.GridGenerator
— Method.
GridGenerator(data, transform_type, target_shape)
Generates 2D sampling grid for bilinear sampling.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the function. -
transform_type::{'affine', 'warp'}, required
: The type of transformation. For `affine`, input data should be an affine matrix of size (batch, 6). For `warp`, input data should be an optical flow of size (batch, 2, h, w). -
target_shape::Shape(tuple), optional, default=[0,0]
: Specifies the output shape (H, W). This is required if transformation type is `affine`. If transformation type is `warp`, this parameter is ignored.
source
# MXNet.mx.IdentityAttachKLSparseReg
— Method.
IdentityAttachKLSparseReg(data, sparseness_target, penalty, momentum)
Applies a sparse regularization to the output of a sigmoid activation function.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data. -
sparseness_target::float, optional, default=0.1
: The sparseness target -
penalty::float, optional, default=0.001
: The tradeoff parameter for the sparseness penalty -
momentum::float, optional, default=0.9
: The momentum for running average
source
# MXNet.mx.InstanceNorm
— Method.
InstanceNorm(data, gamma, beta, eps)
Applies instance normalization to the n-dimensional input array.
This operator takes an n-dimensional input array where (n>2) and normalizes the input using the following formula:
.. math::
out = \frac{x - mean[data]}{ \sqrt{Var[data] + \epsilon}} * gamma + beta
This layer is similar to the batch normalization layer (`BatchNorm`) with two differences: first, the normalization is carried out per example (instance), not over a batch. Second, the same normalization is applied both at test and train time. This operation is also known as `contrast normalization`.
If the input data is of shape [batch, channel, spatial_dim1, spatial_dim2, ...], the `gamma` and `beta` parameters must be vectors of shape [channel].
This implementation is based on paper:
.. [1] Instance Normalization: The Missing Ingredient for Fast Stylization, D. Ulyanov, A. Vedaldi, V. Lempitsky, 2016 (arXiv:1607.08022v2).
Examples::
    // Input of shape (2,1,2)
    x = [[[ 1.1,  2.2]],
         [[ 3.3,  4.4]]]

    // gamma parameter of length 1
    gamma = [1.5]

    // beta parameter of length 1
    beta = [0.5]

    // Instance normalization is calculated with the above formula
    InstanceNorm(x,gamma,beta) = [[[-0.997527  ,  1.99752665]],
                                  [[-0.99752653,  1.99752724]]]
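The example values can be reproduced with a short pure-Python sketch. It assumes a (batch, channel, width) input, a biased variance, and epsilon added to the variance inside the square root, which is the reading that matches the numbers in the example; `instance_norm` is a hypothetical helper.

```python
import math


def instance_norm(x, gamma, beta, eps=1e-3):
    """Normalize each (sample, channel) slice of a (batch, channel, width) nested list."""
    out = []
    for sample in x:
        normalized = []
        for c, values in enumerate(sample):
            # Statistics are computed per instance and per channel, never across the batch.
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            normalized.append([(v - mean) / math.sqrt(var + eps) * gamma[c] + beta[c]
                               for v in values])
        out.append(normalized)
    return out


x = [[[1.1, 2.2]],
     [[3.3, 4.4]]]
result = instance_norm(x, gamma=[1.5], beta=[0.5])
print(result)
```

Both instances produce nearly the same output because each one is centered and scaled by its own statistics.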
Defined in src/operator/instance_norm.cc:L95
Arguments
-
data::NDArray-or-SymbolicNode
: An n-dimensional input array (n > 2) of the form [batch, channel, spatial_dim1, spatial_dim2, ...]. -
gamma::NDArray-or-SymbolicNode
: A vector of length 'channel', which multiplies the normalized input. -
beta::NDArray-or-SymbolicNode
: A vector of length 'channel', which is added to the product of the normalized input and the weight. -
eps::float, optional, default=0.001
: An `epsilon` parameter to prevent division by 0.
source
# MXNet.mx.L2Normalization
— Method.
L2Normalization(data, eps, mode)
Normalize the input array using the L2 norm.
For 1-D NDArray, it computes::
    out = data / sqrt(sum(data ** 2) + eps)
For N-D NDArray, if the input array has shape (N, N, ..., N),
with $mode$ = $instance$, it normalizes each instance in the multidimensional array by its L2 norm.::
    for i in 0...N
      out[i,:,:,...,:] = data[i,:,:,...,:] / sqrt(sum(data[i,:,:,...,:] ** 2) + eps)
with $mode$ = $channel$, it normalizes each channel in the array by its L2 norm.::
    for i in 0...N
      out[:,i,:,...,:] = data[:,i,:,...,:] / sqrt(sum(data[:,i,:,...,:] ** 2) + eps)
with $mode$ = $spatial$, it normalizes the cross channel norm for each position in the array by its L2 norm.::
    for dim in 2...N
      for i in 0...N
        out[.....,i,...] = take(out, indices=i, axis=dim) / sqrt(sum(take(out, indices=i, axis=dim) ** 2) + eps)
                -dim-
Example::
    x = [[[1,2],
          [3,4]],
         [[2,2],
          [5,6]]]

    L2Normalization(x, mode='instance')
    =[[[ 0.18257418  0.36514837]
       [ 0.54772252  0.73029673]]
      [[ 0.24077171  0.24077171]
       [ 0.60192931  0.72231513]]]

    L2Normalization(x, mode='channel')
    =[[[ 0.31622776  0.44721359]
       [ 0.94868326  0.89442718]]
      [[ 0.37139067  0.31622776]
       [ 0.92847669  0.94868326]]]

    L2Normalization(x, mode='spatial')
    =[[[ 0.44721359  0.89442718]
       [ 0.60000002  0.80000001]]
      [[ 0.70710677  0.70710677]
       [ 0.6401844   0.76822126]]]
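The 'instance' mode is the simplest to verify by hand: every element of an instance is divided by the L2 norm of that whole instance. The sketch below covers only this mode; the helper names are hypothetical and the other two modes are not implemented.

```python
import math


def flatten(nested):
    """Return all numbers of an arbitrarily nested list as one flat list."""
    if isinstance(nested, (int, float)):
        return [nested]
    return [v for item in nested for v in flatten(item)]


def scale(nested, factor):
    """Multiply every number in a nested list by `factor`, keeping the structure."""
    if isinstance(nested, (int, float)):
        return nested * factor
    return [scale(item, factor) for item in nested]


def l2_normalize_instance(data, eps=1e-10):
    """Divide each instance data[i] by sqrt(sum(data[i] ** 2) + eps)."""
    out = []
    for instance in data:
        norm = math.sqrt(sum(v * v for v in flatten(instance)) + eps)
        out.append(scale(instance, 1.0 / norm))
    return out


x = [[[1, 2], [3, 4]],
     [[2, 2], [5, 6]]]
normalized = l2_normalize_instance(x)
print(normalized)
```

For the first instance the sum of squares is 1 + 4 + 9 + 16 = 30, so the first element becomes 1/sqrt(30), in line with the 'instance' result shown in the example.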
Defined in src/operator/l2_normalization.cc:L93
Arguments
-
data::NDArray-or-SymbolicNode
: Input array to normalize. -
eps::float, optional, default=1e-10
: A small constant for numerical stability. -
mode::{'channel', 'instance', 'spatial'}, optional, default='instance'
: Specify the dimension along which to compute L2 norm.
source
# MXNet.mx.LRN
— Method.
LRN(data, alpha, beta, knorm, nsize)
Applies local response normalization to the input.
The local response normalization layer performs "lateral inhibition" by normalizing over local input regions.
If :math:`a_{x,y}^{i}` is the activity of a neuron computed by applying kernel :math:`i` at position :math:`(x, y)` and then applying the ReLU nonlinearity, the response-normalized activity :math:`b_{x,y}^{i}` is given by the expression:
.. math::
b_{x,y}^{i} = \frac{a_{x,y}^{i}}{\Bigg({k + \alpha \sum_{j=max(0, i-\frac{n}{2})}^{min(N-1, i+\frac{n}{2})} (a_{x,y}^{j})^{2}}\Bigg)^{\beta}}
where the sum runs over :math:`n` "adjacent" kernel maps at the same spatial position, and :math:`N` is the total number of kernels in the layer.
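The expression can be evaluated at a single spatial position with a short sketch. It follows the formula exactly as written in the text and assumes that n/2 is an integer division; the actual operator may differ in such details, and the `lrn` function with its list input is hypothetical.

```python
def lrn(a, alpha=1e-4, beta=0.75, knorm=2.0, nsize=5):
    """Response-normalize the activities a[0..N-1] of N kernel maps at one position."""
    total = len(a)
    half = nsize // 2  # assumption: n/2 is rounded down
    out = []
    for i in range(total):
        lo, hi = max(0, i - half), min(total - 1, i + half)
        squares = sum(a[j] ** 2 for j in range(lo, hi + 1))
        out.append(a[i] / (knorm + alpha * squares) ** beta)
    return out


# Exaggerated parameters make the effect easy to follow by hand.
response = lrn([1.0, 2.0, 3.0], alpha=1.0, beta=1.0, knorm=1.0, nsize=3)
print(response)
```

A strong activity in a neighboring kernel map enlarges the denominator, which is the "lateral inhibition" effect mentioned above.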
Defined in src/operator/lrn.cc:L73
Arguments
-
data::NDArray-or-SymbolicNode
: Input data. -
alpha::float, optional, default=0.0001
: The variance scaling parameter :math:`\alpha` in the LRN expression. -
beta::float, optional, default=0.75
: The power parameter :math:`\beta` in the LRN expression. -
knorm::float, optional, default=2
: The parameter :math:`k` in the LRN expression. -
nsize::int (non-negative), required
: normalization window width in elements.
source
# MXNet.mx.LeakyReLU
— Method.
LeakyReLU(data, gamma, act_type, slope, lower_bound, upper_bound)
Applies Leaky rectified linear unit activation element-wise to the input.
Leaky ReLUs attempt to fix the "dying ReLU" problem by allowing a small `slope` when the input is negative, while keeping a slope of one when the input is positive.
The following modified ReLU Activation functions are supported:
- elu: Exponential Linear Unit. `y = x > 0 ? x : slope * (exp(x)-1)`
- leaky: Leaky ReLU. `y = x > 0 ? x : slope * x`
- prelu: Parametric ReLU. This is the same as leaky except that `slope` is learnt during training.
- rrelu: Randomized ReLU. Same as leaky but the `slope` is uniformly and randomly chosen from [lower_bound, upper_bound) for training, while fixed to be (lower_bound+upper_bound)/2 for inference.
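The variants differ only in how the negative branch is computed, which a scalar sketch makes visible. The `leaky_relu` function below is hypothetical; prelu is omitted because its slope is a learned parameter, and `random.uniform` may in rare cases return the upper bound itself.

```python
import math
import random


def leaky_relu(x, act_type="leaky", slope=0.25, lower_bound=0.125,
               upper_bound=0.334, is_train=False, rng=random):
    """Apply the leaky, elu or rrelu activation to a single number."""
    if act_type == "rrelu":
        # Random slope while training, fixed midpoint at inference time.
        slope = (rng.uniform(lower_bound, upper_bound) if is_train
                 else (lower_bound + upper_bound) / 2)
    if x > 0:
        return x
    if act_type == "elu":
        return slope * (math.exp(x) - 1)
    if act_type in ("leaky", "rrelu"):
        return slope * x
    raise ValueError("unsupported act_type: " + act_type)


print(leaky_relu(-2.0))                    # -0.5
print(leaky_relu(3.0))                     # 3.0
print(leaky_relu(-1.0, act_type="elu"))
print(leaky_relu(-2.0, act_type="rrelu"))
```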
Defined in src/operator/leaky_relu.cc:L58
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to activation function. -
gamma::NDArray-or-SymbolicNode
: Slope parameter for PReLU. Only required when act_type is 'prelu'. It should be either a vector of size 1, or the same size as the second dimension of data. -
act_type::{'elu', 'leaky', 'prelu', 'rrelu'},optional, default='leaky'
: Activation function to be applied. -
slope::float, optional, default=0.25
: Init slope for the activation. (For leaky and elu only) -
lower_bound::float, optional, default=0.125
: Lower bound of random slope. (For rrelu only) -
upper_bound::float, optional, default=0.334
: Upper bound of random slope. (For rrelu only)
source
# MXNet.mx.LinearRegressionOutput
— Method.
LinearRegressionOutput(data, label, grad_scale)
Computes and optimizes for squared loss during backward propagation. Just outputs $data$ during forward propagation.
If :math:\hat{y}_i
is the predicted value of the i-th sample, and :math:y_i
is the corresponding target value, then the squared loss estimated over :math:n
samples is defined as
:math:\text{SquaredLoss}(\textbf{Y}, \hat{\textbf{Y}} ) = \frac{1}{n} \sum_{i=0}^{n-1} \lVert \textbf{y}_i - \hat{\textbf{y}}_i \rVert_2
.. note:: Use the LinearRegressionOutput as the final output layer of a net.
By default, gradients of this loss function are scaled by factor 1/m
, where m is the number of regression outputs of a training example. The parameter grad_scale
can be used to change this scale to grad_scale/m
.
Defined in src/operator/regression_output.cc:L80
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the function. -
label::NDArray-or-SymbolicNode
: Input label to the function. -
grad_scale::float, optional, default=1
: Scale the gradient by a float factor
source
# MXNet.mx.LogisticRegressionOutput
— Method.
LogisticRegressionOutput(data, label, grad_scale)
Applies a logistic function to the input.
The logistic function, also known as the sigmoid function, is computed as :math:\frac{1}{1+exp(-\textbf{x})}
.
Commonly, the sigmoid is used to squash the real-valued output of a linear model :math:`w^T x + b` into the [0,1] range so that it can be interpreted as a probability. It is suitable for binary classification or probability prediction tasks.
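As a small illustration of the paragraph above, the sketch below applies the logistic function to the output of a linear model. The helper name `linear_model_probability` is hypothetical, and the two-branch form is just one common way to avoid overflow in `exp` for large negative inputs.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)) for a single real value."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form that avoids overflow when x is very negative.
    e = math.exp(x)
    return e / (1.0 + e)


def linear_model_probability(w, x, b):
    """Squash the linear score w^T x + b into the [0, 1] range."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(score)
```

A score of zero maps to a probability of 0.5; large positive and negative scores saturate towards 1 and 0.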
.. note:: Use the LogisticRegressionOutput as the final output layer of a net.
By default, gradients of this loss function are scaled by factor 1/m
, where m is the number of regression outputs of a training example. The parameter grad_scale
can be used to change this scale to grad_scale/m
.
Defined in src/operator/regression_output.cc:L122
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the function. -
label::NDArray-or-SymbolicNode
: Input label to the function. -
grad_scale::float, optional, default=1
: Scale the gradient by a float factor
source
# MXNet.mx.MAERegressionOutput
— Method.
MAERegressionOutput(data, label, grad_scale)
Computes mean absolute error of the input.
MAE is a risk metric corresponding to the expected value of the absolute error.
If :math:\hat{y}_i
is the predicted value of the i-th sample, and :math:y_i
is the corresponding target value, then the mean absolute error (MAE) estimated over :math:n
samples is defined as
:math:\text{MAE}(\textbf{Y}, \hat{\textbf{Y}} ) = \frac{1}{n} \sum_{i=0}^{n-1} \lVert \textbf{y}_i - \hat{\textbf{y}}_i \rVert_1
.. note:: Use the MAERegressionOutput as the final output layer of a net.
By default, gradients of this loss function are scaled by factor 1/m
, where m is the number of regression outputs of a training example. The parameter grad_scale
can be used to change this scale to grad_scale/m
.
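The MAE definition above can be sketched in plain Python as follows. This computes only the metric itself from nested lists; it does not model the operator's backward pass or the grad_scale behaviour, and the function name is an illustrative choice.

```python
def mean_absolute_error(targets, predictions):
    """Average over n samples of the L1 norm of (y_i - y_hat_i)."""
    n = len(targets)
    total = 0.0
    for y, y_hat in zip(targets, predictions):
        # L1 norm of the per-sample error vector.
        total += sum(abs(a - b) for a, b in zip(y, y_hat))
    return total / n
```

For two samples with error vectors (0.5, 0) and (1, 2), the L1 norms are 0.5 and 3, so the MAE is 1.75.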
Defined in src/operator/regression_output.cc:L101
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the function. -
label::NDArray-or-SymbolicNode
: Input label to the function. -
grad_scale::float, optional, default=1
: Scale the gradient by a float factor
source
# MXNet.mx.MakeLoss
— Method.
MakeLoss(data, grad_scale, valid_thresh, normalization)
Make your own loss function in network construction.
This operator accepts a customized loss function symbol as a terminal loss and the symbol should be an operator with no backward dependency. The output of this function is the gradient of loss with respect to the input data.
For example, if you are making a cross entropy loss function, assume $out$ is the predicted output and $label$ is the true label; then the cross entropy can be defined as::
cross_entropy = label * log(out) + (1 - label) * log(1 - out)
loss = MakeLoss(cross_entropy)
We will need to use $MakeLoss$ when we are creating our own loss function or we want to combine multiple loss functions. Also we may want to stop some variables' gradients from backpropagation. See more detail in $BlockGrad$ or $stop_gradient$.
In addition, we can give a scale to the loss by setting $grad_scale$, so that the gradient of the loss will be rescaled in the backpropagation.
.. note:: This operator should be used as a Symbol instead of NDArray.
Defined in src/operator/make_loss.cc:L71
Arguments
-
data::NDArray-or-SymbolicNode
: Input array. -
grad_scale::float, optional, default=1
: Gradient scale as a supplement to unary and binary operators -
valid_thresh::float, optional, default=0
: clip each element in the array to 0 when it is less than $valid_thresh$. This is used when $normalization$ is set to $'valid'$. -
normalization::{'batch', 'null', 'valid'},optional, default='null'
: If this is set to null, the output gradient will not be normalized. If this is set to batch, the output gradient will be divided by the batch size. If this is set to valid, the output gradient will be divided by the number of valid input elements.
source
# MXNet.mx.Pad
— Method.
Pad(data, mode, pad_width, constant_value)
Pads an input array with a constant or edge values of the array.
.. note:: Pad
is deprecated. Use pad
instead.
.. note:: Current implementation only supports 4D and 5D input arrays with padding applied only on axes 1, 2 and 3. Expects axes 4 and 5 in pad_width
to be zero.
This operation pads an input array with either a constant_value
or edge values along each axis of the input array. The amount of padding is specified by pad_width
.
pad_width
is a tuple of integer padding widths for each axis of the format $(before_1, after_1, ... , before_N, after_N)$. The pad_width
should be of length $2*N$ where $N$ is the number of dimensions of the array.
For dimension $N$ of the input array, $before_N$ and $after_N$ indicate how many values to add before and after the elements of the array along dimension $N$. The widths of the higher two dimensions $before_1$, $after_1$, $before_2$, $after_2$ must be 0.
Example::
x = [[[[ 1. 2. 3.]
[ 4. 5. 6.]]
[[ 7. 8. 9.]
[ 10. 11. 12.]]]
[[[ 11. 12. 13.]
[ 14. 15. 16.]]
[[ 17. 18. 19.]
[ 20. 21. 22.]]]]
pad(x,mode="edge", pad_width=(0,0,0,0,1,1,1,1)) =
[[[[ 1. 1. 2. 3. 3.]
[ 1. 1. 2. 3. 3.]
[ 4. 4. 5. 6. 6.]
[ 4. 4. 5. 6. 6.]]
[[ 7. 7. 8. 9. 9.]
[ 7. 7. 8. 9. 9.]
[ 10. 10. 11. 12. 12.]
[ 10. 10. 11. 12. 12.]]]
[[[ 11. 11. 12. 13. 13.]
[ 11. 11. 12. 13. 13.]
[ 14. 14. 15. 16. 16.]
[ 14. 14. 15. 16. 16.]]
[[ 17. 17. 18. 19. 19.]
[ 17. 17. 18. 19. 19.]
[ 20. 20. 21. 22. 22.]
[ 20. 20. 21. 22. 22.]]]]
pad(x, mode="constant", constant_value=0, pad_width=(0,0,0,0,1,1,1,1)) =
[[[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 0.]
[ 0. 4. 5. 6. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 7. 8. 9. 0.]
[ 0. 10. 11. 12. 0.]
[ 0. 0. 0. 0. 0.]]]
[[[ 0. 0. 0. 0. 0.]
[ 0. 11. 12. 13. 0.]
[ 0. 14. 15. 16. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 17. 18. 19. 0.]
[ 0. 20. 21. 22. 0.]
[ 0. 0. 0. 0. 0.]]]]
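The two modes shown in the example can be sketched for a single 2-D slice (the last two axes of the input) in plain Python. The function name `pad_2d` and its argument order are assumptions for illustration; "reflect" mode is not covered.

```python
def pad_2d(matrix, before_h, after_h, before_w, after_w,
           mode="constant", constant_value=0.0):
    """Pad one 2-D slice (list of rows) with constant or edge values."""
    if mode not in ("constant", "edge"):
        raise ValueError("only 'constant' and 'edge' are sketched here")
    rows = []
    for row in matrix:
        if mode == "edge":
            left, right = [row[0]] * before_w, [row[-1]] * after_w
        else:
            left = [constant_value] * before_w
            right = [constant_value] * after_w
        rows.append(left + list(row) + right)
    width = len(rows[0])
    if mode == "edge":
        # Repeat the already width-padded first and last rows.
        top = [list(rows[0]) for _ in range(before_h)]
        bottom = [list(rows[-1]) for _ in range(after_h)]
    else:
        top = [[constant_value] * width for _ in range(before_h)]
        bottom = [[constant_value] * width for _ in range(after_h)]
    return top + rows + bottom
```

Calling `pad_2d([[1., 2., 3.], [4., 5., 6.]], 1, 1, 1, 1, mode="edge")` gives the first 4x5 slice of the "edge" example above.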
Defined in src/operator/pad.cc:L766
Arguments
-
data::NDArray-or-SymbolicNode
: An n-dimensional input array. -
mode::{'constant', 'edge', 'reflect'}, required
: Padding type to use. "constant" pads with constant_value; "edge" pads using the edge values of the input array; "reflect" pads by reflecting values with respect to the edges. -
pad_width::Shape(tuple), required
: Widths of the padding regions applied to the edges of each axis. It is a tuple of integer padding widths for each axis of the format $(before_1, after_1, ... , before_N, after_N)$. It should be of length $2*N$ where $N$ is the number of dimensions of the array. This is equivalent to pad_width in numpy.pad, but flattened. -
constant_value::double, optional, default=0
: The value used for padding when mode
is "constant".
source
# MXNet.mx.Pooling
— Method.
Pooling(data, global_pool, cudnn_off, kernel, pool_type, pooling_convention, stride, pad)
Performs pooling on the input.
The shapes for 1-D pooling are
- data: (batch_size, channel, width),
- out: (batch_size, num_filter, out_width).
The shapes for 2-D pooling are
- data: (batch_size, channel, height, width)
-
out: (batch_size, num_filter, out_height, out_width), with::
out_height = f(height, kernel[0], pad[0], stride[0])
out_width = f(width, kernel[1], pad[1], stride[1])
The definition of f depends on $pooling_convention$, which has two options:
-
valid (default)::
f(x, k, p, s) = floor((x+2*p-k)/s)+1
-
full, which is compatible with Caffe::
f(x, k, p, s) = ceil((x+2*p-k)/s)+1
If $global_pool$ is set to true, global pooling is performed instead, which resets $kernel=(height, width)$.
Three pooling options are supported by $pool_type$:
- avg: average pooling
- max: max pooling
- sum: sum pooling
For 3-D pooling, an additional depth dimension is added before height. Namely the input data will have shape (batch_size, channel, depth, height, width).
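The output-size function f for the two pooling conventions can be sketched as below. The function name `pooled_size` is an illustrative choice; integer arithmetic is used so that floor and ceil are exact.

```python
def pooled_size(x, k, p, s, pooling_convention="valid"):
    """Output length along one axis for input x, kernel k, pad p, stride s."""
    span = x + 2 * p - k
    if pooling_convention == "valid":
        return span // s + 1          # floor(span / s) + 1
    if pooling_convention == "full":
        return -(-span // s) + 1      # ceil(span / s) + 1
    raise ValueError("unknown pooling_convention: %r" % (pooling_convention,))
```

The two conventions differ only when the stride does not evenly divide `x + 2*p - k`: for `x=7, k=2, p=0, s=2`, "valid" yields 3 and "full" yields 4.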
Defined in src/operator/nn/pooling.cc:L133
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the pooling operator. -
global_pool::boolean, optional, default=0
: Ignore kernel size, do global pooling based on current input feature map. -
cudnn_off::boolean, optional, default=0
: Turn off cudnn pooling and use MXNet pooling operator. -
kernel::Shape(tuple), required
: Pooling kernel size: (y, x) or (d, y, x) -
pool_type::{'avg', 'max', 'sum'}, required
: Pooling type to be applied. -
pooling_convention::{'full', 'valid'},optional, default='valid'
: Pooling convention to be applied. -
stride::Shape(tuple), optional, default=[]
: Stride: for pooling (y, x) or (d, y, x). Defaults to 1 for each dimension. -
pad::Shape(tuple), optional, default=[]
: Pad for pooling: (y, x) or (d, y, x). Defaults to no padding.
source
# MXNet.mx.Pooling_v1
— Method.
Pooling_v1(data, global_pool, kernel, pool_type, pooling_convention, stride, pad)
This operator is DEPRECATED. Perform pooling on the input.
The shapes for 2-D pooling are
- data: (batch_size, channel, height, width)
-
out: (batch_size, num_filter, out_height, out_width), with::
out_height = f(height, kernel[0], pad[0], stride[0])
out_width = f(width, kernel[1], pad[1], stride[1])
The definition of f depends on $pooling_convention$, which has two options:
-
valid (default)::
f(x, k, p, s) = floor((x+2*p-k)/s)+1
-
full, which is compatible with Caffe::
f(x, k, p, s) = ceil((x+2*p-k)/s)+1
If $global_pool$ is set to true, global pooling is performed instead, which resets $kernel=(height, width)$.
Three pooling options are supported by $pool_type$:
- avg: average pooling
- max: max pooling
- sum: sum pooling
1-D pooling is a special case of 2-D pooling with width=1 and kernel[1]=1.
For 3-D pooling, an additional depth dimension is added before height. Namely the input data will have shape (batch_size, channel, depth, height, width).
Defined in src/operator/pooling_v1.cc:L104
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the pooling operator. -
global_pool::boolean, optional, default=0
: Ignore kernel size, do global pooling based on current input feature map. -
kernel::Shape(tuple), required
: pooling kernel size: (y, x) or (d, y, x) -
pool_type::{'avg', 'max', 'sum'}, required
: Pooling type to be applied. -
pooling_convention::{'full', 'valid'},optional, default='valid'
: Pooling convention to be applied. -
stride::Shape(tuple), optional, default=[]
: stride: for pooling (y, x) or (d, y, x) -
pad::Shape(tuple), optional, default=[]
: pad for pooling: (y, x) or (d, y, x)
source
# MXNet.mx.RNN
— Method.
RNN(data, parameters, state, state_cell, state_size, num_layers, bidirectional, mode, p, state_outputs)
Applies a recurrent layer to input.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to RNN -
parameters::NDArray-or-SymbolicNode
: Vector of all RNN trainable parameters concatenated -
state::NDArray-or-SymbolicNode
: initial hidden state of the RNN -
state_cell::NDArray-or-SymbolicNode
: initial cell state for LSTM networks (only for LSTM) -
state_size::int (non-negative), required
: size of the state for each layer -
num_layers::int (non-negative), required
: number of stacked layers -
bidirectional::boolean, optional, default=0
: whether to use bidirectional recurrent layers -
mode::{'gru', 'lstm', 'rnn_relu', 'rnn_tanh'}, required
: the type of RNN to compute -
p::float, optional, default=0
: Dropout probability, fraction of the input that gets dropped out at training time -
state_outputs::boolean, optional, default=0
: Whether to have the states as symbol outputs.
source
# MXNet.mx.ROIPooling
— Method.
ROIPooling(data, rois, pooled_size, spatial_scale)
Performs region of interest (ROI) pooling on the input array.
ROI pooling is a variant of a max pooling layer, in which the output size is fixed and region of interest is a parameter. Its purpose is to perform max pooling on the inputs of non-uniform sizes to obtain fixed-size feature maps. ROI pooling is a neural-net layer mostly used in training a Fast R-CNN
network for object detection.
This operator takes a 4D feature map as an input array and region proposals as rois
, then it pools over sub-regions of input and produces a fixed-sized output array regardless of the ROI size.
To crop the feature map accordingly, you can resize the bounding box coordinates by changing the parameters rois
and spatial_scale
.
The cropped feature maps are pooled by standard max pooling operation to a fixed size output indicated by a pooled_size
parameter. batch_size will change to the number of region bounding boxes after ROIPooling
.
The size of each region of interest doesn't have to be perfectly divisible by the number of pooling sections (pooled_size
).
Example::
x = [[[[ 0., 1., 2., 3., 4., 5.],
[ 6., 7., 8., 9., 10., 11.],
[ 12., 13., 14., 15., 16., 17.],
[ 18., 19., 20., 21., 22., 23.],
[ 24., 25., 26., 27., 28., 29.],
[ 30., 31., 32., 33., 34., 35.],
[ 36., 37., 38., 39., 40., 41.],
[ 42., 43., 44., 45., 46., 47.]]]]
// region of interest i.e. bounding box coordinates.
y = [[0,0,0,4,4]]
// returns array of shape (2,2) according to the given roi with max pooling.
ROIPooling(x, y, (2,2), 1.0) = [[[[ 14., 16.],
[ 26., 28.]]]]
// region of interest is changed due to the change in spatial_scale parameter.
ROIPooling(x, y, (2,2), 0.7) = [[[[ 7., 9.],
[ 19., 21.]]]]
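The example can be reproduced with a small plain-Python sketch for one channel and one ROI. It assumes the usual Fast R-CNN bin arithmetic (scale and round the corners, then split the region into floor/ceil-bounded bins); the function name `roi_pool` and that arithmetic are assumptions, not a statement of how the operator is implemented.

```python
import math


def roi_pool(feature_map, roi, pooled_size, spatial_scale):
    """Max-pool one ROI of a 2-D feature map into a fixed-size grid.

    roi is (x1, y1, x2, y2) in image coordinates (batch index omitted).
    """
    def c_round(v):
        # Round half up, which matches C-style rounding for non-negative v.
        return int(math.floor(v + 0.5))

    x1, y1, x2, y2 = (c_round(v * spatial_scale) for v in roi)
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    pooled_h, pooled_w = pooled_size
    bin_h = roi_h / pooled_h
    bin_w = roi_w / pooled_w
    height, width = len(feature_map), len(feature_map[0])

    out = []
    for ph in range(pooled_h):
        h_start = min(max(math.floor(ph * bin_h) + y1, 0), height)
        h_end = min(max(math.ceil((ph + 1) * bin_h) + y1, 0), height)
        row = []
        for pw in range(pooled_w):
            w_start = min(max(math.floor(pw * bin_w) + x1, 0), width)
            w_end = min(max(math.ceil((pw + 1) * bin_w) + x1, 0), width)
            values = [feature_map[h][w]
                      for h in range(h_start, h_end)
                      for w in range(w_start, w_end)]
            row.append(max(values) if values else 0.0)  # empty bin -> 0
        out.append(row)
    return out
```

With the 8x6 feature map from the example, `roi_pool(x, (0, 0, 4, 4), (2, 2), 1.0)` covers rows and columns 0 to 4, while a scale of 0.7 shrinks the region to rows and columns 0 to 3.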
Defined in src/operator/roi_pooling.cc:L287
Arguments
-
data::NDArray-or-SymbolicNode
: The input array to the pooling operator, a 4D Feature maps -
rois::NDArray-or-SymbolicNode
: Bounding box coordinates, a 2D array of [[batch_index, x1, y1, x2, y2]], where (x1, y1) and (x2, y2) are top left and bottom right corners of designated region of interest. batch_index
indicates the index of corresponding image in the input array. -
pooled_size::Shape(tuple), required
: ROI pooling output shape (h,w) -
spatial_scale::float, required
: Ratio of input feature map height (or w) to raw image height (or w). Equals the reciprocal of total stride in convolutional layers
source
# MXNet.mx.SVMOutput
— Method.
SVMOutput(data, label, margin, regularization_coefficient, use_linear)
Computes support vector machine based transformation of the input.
This tutorial demonstrates using SVM as output layer for classification instead of softmax: /dmlc/mxnet/tree/master/example/svm_mnist.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data for SVM transformation. -
label::NDArray-or-SymbolicNode
: Class label for the input data. -
margin::float, optional, default=1
: The loss function penalizes outputs that lie outside this margin. Default margin is 1. -
regularization_coefficient::float, optional, default=1
: Regularization parameter for the SVM. This balances the tradeoff between coefficient size and error. -
use_linear::boolean, optional, default=0
: Whether to use L1-SVM objective. L2-SVM objective is used by default.
source
# MXNet.mx.SequenceLast
— Method.
SequenceLast(data, sequence_length, use_sequence_length, axis)
Takes the last element of a sequence.
This function takes an n-dimensional input array of the form [max_sequence_length, batch_size, other_feature_dims] and returns a (n-1)-dimensional array of the form [batch_size, other_feature_dims].
Parameter sequence_length
is used to handle variable-length sequences. sequence_length
should be an input array of positive ints of dimension [batch_size]. To use this parameter, set use_sequence_length
to True
, otherwise each example in the batch is assumed to have the max sequence length.
.. note:: Alternatively, you can also use take
operator.
Example::
x = [[[ 1., 2., 3.], [ 4., 5., 6.], [ 7., 8., 9.]],
[[ 10., 11., 12.],
[ 13., 14., 15.],
[ 16., 17., 18.]],
[[ 19., 20., 21.],
[ 22., 23., 24.],
[ 25., 26., 27.]]]
// returns last sequence when sequence_length parameter is not used
SequenceLast(x) = [[ 19., 20., 21.], [ 22., 23., 24.], [ 25., 26., 27.]]
// sequence_length is used
SequenceLast(x, sequence_length=[1,1,1], use_sequence_length=True) = [[ 1., 2., 3.], [ 4., 5., 6.], [ 7., 8., 9.]]
// sequence_length is used
SequenceLast(x, sequence_length=[1,2,3], use_sequence_length=True) = [[ 1., 2., 3.], [ 13., 14., 15.], [ 25., 26., 27.]]
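The selection rule can be sketched on nested lists laid out as [max_sequence_length][batch_size][features]. This is a plain-Python illustration with a hypothetical function name; passing `sequence_length=None` stands in for leaving use_sequence_length unset.

```python
def sequence_last(data, sequence_length=None):
    """Pick, for each batch entry b, the step at index sequence_length[b] - 1."""
    max_len = len(data)
    batch_size = len(data[0])
    if sequence_length is None:
        # Every sequence is assumed to have the maximum length.
        sequence_length = [max_len] * batch_size
    return [list(data[int(sequence_length[b]) - 1][b])
            for b in range(batch_size)]
```

Each batch entry is indexed independently, which is why `sequence_length=[1,2,3]` in the example picks rows from three different time steps.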
Defined in src/operator/sequence_last.cc:L92
Arguments
-
data::NDArray-or-SymbolicNode
: n-dimensional input array of the form [max_sequence_length, batch_size, other_feature_dims] where n>2 -
sequence_length::NDArray-or-SymbolicNode
: vector of sequence lengths of the form [batch_size] -
use_sequence_length::boolean, optional, default=0
: If set to true, this layer takes in an extra input parameter sequence_length
to specify variable length sequence -
axis::int, optional, default='0'
: The sequence axis. Only values of 0 and 1 are currently supported.
source
# MXNet.mx.SequenceMask
— Method.
SequenceMask(data, sequence_length, use_sequence_length, value, axis)
Sets all elements outside the sequence to a constant value.
This function takes an n-dimensional input array of the form [max_sequence_length, batch_size, other_feature_dims] and returns an array of the same shape.
Parameter sequence_length
is used to handle variable-length sequences. sequence_length
should be an input array of positive ints of dimension [batch_size]. To use this parameter, set use_sequence_length
to True
, otherwise each example in the batch is assumed to have the max sequence length and this operator works as the identity
operator.
Example::
x = [[[ 1., 2., 3.], [ 4., 5., 6.]],
[[ 7., 8., 9.],
[ 10., 11., 12.]],
[[ 13., 14., 15.],
[ 16., 17., 18.]]]
// Batch 1
B1 = [[ 1., 2., 3.], [ 7., 8., 9.], [ 13., 14., 15.]]
// Batch 2
B2 = [[ 4., 5., 6.], [ 10., 11., 12.], [ 16., 17., 18.]]
// works as identity operator when sequence_length parameter is not used
SequenceMask(x) = [[[ 1., 2., 3.], [ 4., 5., 6.]],
[[ 7., 8., 9.],
[ 10., 11., 12.]],
[[ 13., 14., 15.],
[ 16., 17., 18.]]]
// sequence_length [1,1] means 1 of each batch will be kept
// and other rows are masked with default mask value = 0
SequenceMask(x, sequence_length=[1,1], use_sequence_length=True) = [[[ 1., 2., 3.], [ 4., 5., 6.]],
[[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 0.]]]
// sequence_length [2,3] means 2 of batch B1 and 3 of batch B2 will be kept
// and other rows are masked with value = 1
SequenceMask(x, sequence_length=[2,3], use_sequence_length=True, value=1) = [[[ 1., 2., 3.], [ 4., 5., 6.]],
[[ 7., 8., 9.],
[ 10., 11., 12.]],
[[ 1., 1., 1.],
[ 16., 17., 18.]]]
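The masking rule can be sketched on nested lists laid out as [max_sequence_length][batch_size][features]. The function name is illustrative, and `sequence_length=None` stands in for leaving use_sequence_length unset.

```python
def sequence_mask(data, sequence_length=None, value=0.0):
    """Replace every step at or beyond a sequence's length with `value`."""
    if sequence_length is None:
        # Identity: return a copy of the input.
        return [[list(vec) for vec in step] for step in data]
    return [
        [list(vec) if t < sequence_length[b] else [value] * len(vec)
         for b, vec in enumerate(step)]
        for t, step in enumerate(data)
    ]
```

Step `t` of batch entry `b` is kept when `t < sequence_length[b]`, so in the last example only the third step of the first batch entry is overwritten.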
Defined in src/operator/sequence_mask.cc:L114
Arguments
-
data::NDArray-or-SymbolicNode
: n-dimensional input array of the form [max_sequence_length, batch_size, other_feature_dims] where n>2 -
sequence_length::NDArray-or-SymbolicNode
: vector of sequence lengths of the form [batch_size] -
use_sequence_length::boolean, optional, default=0
: If set to true, this layer takes in an extra input parameter sequence_length
to specify variable length sequence -
value::float, optional, default=0
: The value to be used as a mask. -
axis::int, optional, default='0'
: The sequence axis. Only values of 0 and 1 are currently supported.
source
# MXNet.mx.SequenceReverse
— Method.
SequenceReverse(data, sequence_length, use_sequence_length, axis)
Reverses the elements of each sequence.
This function takes an n-dimensional input array of the form [max_sequence_length, batch_size, other_feature_dims] and returns an array of the same shape.
Parameter sequence_length
is used to handle variable-length sequences. sequence_length
should be an input array of positive ints of dimension [batch_size]. To use this parameter, set use_sequence_length
to True
, otherwise each example in the batch is assumed to have the max sequence length.
Example::
x = [[[ 1., 2., 3.], [ 4., 5., 6.]],
[[ 7., 8., 9.],
[ 10., 11., 12.]],
[[ 13., 14., 15.],
[ 16., 17., 18.]]]
// Batch 1
B1 = [[ 1., 2., 3.], [ 7., 8., 9.], [ 13., 14., 15.]]
// Batch 2
B2 = [[ 4., 5., 6.], [ 10., 11., 12.], [ 16., 17., 18.]]
// returns reverse sequence when sequence_length parameter is not used
SequenceReverse(x) = [[[ 13., 14., 15.], [ 16., 17., 18.]],
[[ 7., 8., 9.],
[ 10., 11., 12.]],
[[ 1., 2., 3.],
[ 4., 5., 6.]]]
// sequence_length [2,2] means 2 rows of
// both batch B1 and B2 will be reversed.
SequenceReverse(x, sequence_length=[2,2], use_sequence_length=True) = [[[ 7., 8., 9.], [ 10., 11., 12.]],
[[ 1., 2., 3.],
[ 4., 5., 6.]],
[[ 13., 14., 15.],
[ 16., 17., 18.]]]
// sequence_length [2,3] means 2 of batch B1 and 3 of batch B2
// will be reversed.
SequenceReverse(x, sequence_length=[2,3], use_sequence_length=True) = [[[ 7., 8., 9.], [ 16., 17., 18.]],
[[ 1., 2., 3.],
[ 10., 11., 12.]],
[[ 13., 14., 15.],
[ 4., 5., 6.]]]
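The reversal rule can be sketched on nested lists laid out as [max_sequence_length][batch_size][features]. The function name is illustrative, and `sequence_length=None` stands in for leaving use_sequence_length unset.

```python
def sequence_reverse(data, sequence_length=None):
    """Reverse the first sequence_length[b] steps of each batch entry b."""
    max_len = len(data)
    batch_size = len(data[0])
    if sequence_length is None:
        sequence_length = [max_len] * batch_size
    out = [[None] * batch_size for _ in range(max_len)]
    for b in range(batch_size):
        n = int(sequence_length[b])
        for t in range(max_len):
            # Steps inside the sequence are mirrored; the rest stay in place.
            src = n - 1 - t if t < n else t
            out[t][b] = list(data[src][b])
    return out
```

Steps past a sequence's length are left where they are, which is why the third step of the first batch entry is unchanged when its length is 2.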
Defined in src/operator/sequence_reverse.cc:L113
Arguments
-
data::NDArray-or-SymbolicNode
: n-dimensional input array of the form [max_sequence_length, batch_size, other dims] where n>2 -
sequence_length::NDArray-or-SymbolicNode
: vector of sequence lengths of the form [batch_size] -
use_sequence_length::boolean, optional, default=0
: If set to true, this layer takes in an extra input parameter sequence_length
to specify variable length sequence -
axis::int, optional, default='0'
: The sequence axis. Only 0 is currently supported.
source
# MXNet.mx.SliceChannel
— Method.
SliceChannel(data, num_outputs, axis, squeeze_axis)
Splits an array along a particular axis into multiple sub-arrays.
.. note:: $SliceChannel$ is deprecated. Use $split$ instead.
Note that num_outputs
should evenly divide the length of the axis along which to split the array.
Example::
x = [[[ 1.]
[ 2.]]
[[ 3.]
[ 4.]]
[[ 5.]
[ 6.]]]
x.shape = (3, 2, 1)
y = split(x, axis=1, num_outputs=2)
// a list of 2 arrays with shape (3, 1, 1)
y = [[[ 1.]]
[[ 3.]]
[[ 5.]]]
[[[ 2.]]
[[ 4.]]
[[ 6.]]]
y[0].shape = (3, 1, 1)
z = split(x, axis=0, num_outputs=3)
// a list of 3 arrays with shape (1, 2, 1)
z = [[[ 1.]
[ 2.]]]
[[[ 3.]
[ 4.]]]
[[[ 5.]
[ 6.]]]
z[0].shape = (1, 2, 1)
squeeze_axis=1
removes the axis with length 1 from the shapes of the output arrays. Note that setting squeeze_axis
to $1$ removes the axis with length 1 only along the axis
along which the array is split. Also squeeze_axis
can be set to true only if $input.shape[axis] == num_outputs$.
Example::
z = split(x, axis=0, num_outputs=3, squeeze_axis=1)
// a list of 3 arrays with shape (2, 1)
z = [[ 1.]
[ 2.]]
[[ 3.]
[ 4.]]
[[ 5.]
[ 6.]]
z[0].shape = (2, 1)
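The splitting and squeezing behaviour in the examples can be sketched on nested lists. This is a plain-Python illustration; the function mirrors the `split` call used in the examples but is not the library function, and it assumes a non-empty rectangular input.

```python
def split(data, num_outputs, axis=1, squeeze_axis=False):
    """Split a nested list into num_outputs equal parts along `axis`."""
    def axis_length(arr, ax):
        return len(arr) if ax == 0 else axis_length(arr[0], ax - 1)

    def take(arr, ax, start, stop):
        if ax == 0:
            chunk = arr[start:stop]
            # Squeezing drops the (length 1) split axis itself.
            return chunk[0] if squeeze_axis else chunk
        return [take(sub, ax - 1, start, stop) for sub in arr]

    length = axis_length(data, axis)
    if length % num_outputs != 0:
        raise ValueError("num_outputs must evenly divide the axis length")
    step = length // num_outputs
    if squeeze_axis and step != 1:
        raise ValueError("squeeze_axis requires shape[axis] == num_outputs")
    return [take(data, axis, i * step, (i + 1) * step)
            for i in range(num_outputs)]
```

The two `ValueError` checks correspond to the two constraints stated in the text: even divisibility, and `input.shape[axis] == num_outputs` when squeezing.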
Defined in src/operator/slice_channel.cc:L107
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
num_outputs::int, required
: Number of splits. Note that this should evenly divide the length of the axis
. -
axis::int, optional, default='1'
: Axis along which to split. -
squeeze_axis::boolean, optional, default=0
: If true, removes the axis with length 1 from the shapes of the output arrays. Note that setting squeeze_axis
to $true$ removes the axis with length 1 only along the axis
along which the array is split. Also squeeze_axis
can be set to $true$ only if $input.shape[axis] == num_outputs$.
source
# MXNet.mx.SoftmaxActivation
— Method.
SoftmaxActivation(data, mode)
Applies softmax activation to input. This is intended for internal layers.
.. note::
This operator has been deprecated, please use softmax
.
If mode
= $instance$, this operator will compute a softmax for each instance in the batch. This is the default mode.
If mode
= $channel$, this operator will compute a k-class softmax at each position of each instance, where k
= $num_channel$. This mode can only be used when the input array has at least 3 dimensions. This can be used for fully convolutional network
, image segmentation
, etc.
Example::
input_array = mx.nd.array([[3., 0.5, -0.5, 2., 7.], [2., -.4, 7., 3., 0.2]])
softmax_act = mx.nd.SoftmaxActivation(input_array)
print(softmax_act.asnumpy())
[[ 1.78322066e-02 1.46375655e-03 5.38485940e-04 6.56010211e-03 9.73605454e-01]
[ 6.56221947e-03 5.95310994e-04 9.73919690e-01 1.78379621e-02 1.08472735e-03]]
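The instance mode on a 2-D input amounts to a row-wise softmax, which can be checked against the numbers above with a plain-Python sketch. The function names are illustrative; subtracting the row maximum is a standard stability trick and does not change the result.

```python
import math


def softmax(row):
    """Softmax of one instance, shifted by the max for numerical stability."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_instance(batch):
    """Apply softmax independently to each instance (row) in the batch."""
    return [softmax(row) for row in batch]
```

Each output row sums to one, and the largest input in a row (7. in both rows of the example) receives almost all of the probability mass.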
Defined in src/operator/nn/softmax_activation.cc:L67
Arguments
-
data::NDArray-or-SymbolicNode
: Input array to activation function. -
mode::{'channel', 'instance'},optional, default='instance'
: Specifies how to compute the softmax. If set to $instance$, it computes softmax for each instance. If set to $channel$, It computes cross channel softmax for each position of each instance.
source
# MXNet.mx.SoftmaxOutput
— Method.
SoftmaxOutput(data, label, grad_scale, ignore_label, multi_output, use_ignore, preserve_shape, normalization, out_grad, smooth_alpha)
Computes the gradient of cross entropy loss with respect to softmax output.
-
This operator computes the gradient in two steps. The cross entropy loss does not actually need to be computed.
- Applies softmax function on the input array.
- Computes and returns the gradient of cross entropy loss w.r.t. the softmax output.
-
The softmax function, cross entropy loss and gradient are given by:
-
Softmax Function:
.. math:: \text{softmax}(x)_i = \frac{exp(x_i)}{\sum_j exp(x_j)}
-
Cross Entropy Function:
.. math:: \text{CE(label, output)} = - \sum_i \text{label}_i \log(\text{output}_i)
-
The gradient of cross entropy loss w.r.t softmax output:
.. math:: \text{gradient} = \text{output} - \text{label}
-
During forward propagation, the softmax function is computed for each instance in the input array.
For general N-D input arrays with shape :math:`(d_1, d_2, ..., d_n)`, the size is :math:`s=d_1 \cdot d_2 \cdot \cdot \cdot d_n`. We can use the parameters preserve_shape and multi_output to specify the way to compute softmax:
- By default, preserve_shape is $false$. This operator will reshape the input array into a 2-D array with shape :math:`(d_1, \frac{s}{d_1})` and then compute the softmax function for each row in the reshaped array, and afterwards reshape it back to the original shape :math:`(d_1, d_2, ..., d_n)`.
- If preserve_shape is $true$, the softmax function will be computed along the last axis (axis = $-1$).
- If multi_output is $true$, the softmax function will be computed along the second axis (axis = $1$).
-
During backward propagation, the gradient of cross-entropy loss w.r.t softmax output array is computed. The provided label can be a one-hot label array or a probability label array.
-
If the parameter use_ignore is $true$, ignore_label can specify input instances with a particular label to be ignored during backward propagation. This has no effect when softmax output has the same shape as label.
Example::
data = [[1,2,3,4],[2,2,2,2],[3,3,3,3],[4,4,4,4]]
label = [1,0,2,3]
ignore_label = 1
SoftmaxOutput(data=data, label = label, multi_output=true, use_ignore=true, ignore_label=ignore_label)
forward softmax output
[[ 0.0320586 0.08714432 0.23688284 0.64391428]
[ 0.25 0.25 0.25 0.25 ]
[ 0.25 0.25 0.25 0.25 ]
[ 0.25 0.25 0.25 0.25 ]]
backward gradient output
[[ 0. 0. 0. 0. ]
[-0.75 0.25 0.25 0.25]
[ 0.25 0.25 -0.75 0.25]
[ 0.25 0.25 0.25 -0.75]]
notice that the first row is all 0 because label[0] is 1, which is equal to ignore_label.
-
The parameter `grad_scale` can be used to rescale the gradient, which is often used to give each loss function different weights.
-
This operator also supports various ways to normalize the gradient by `normalization`. The `normalization` is applied if softmax output has different shape than the labels. The `normalization` mode can be set to the following:
- $'null'$: do nothing.
- $'batch'$: divide the gradient by the batch size.
- $'valid'$: divide the gradient by the number of instances which are not ignored.
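The backward rule (gradient = output - label, with ignored instances zeroed) can be sketched in plain Python for a 2-D input with one instance per row and integer class labels. The function name is hypothetical, and grad_scale, normalization and label smoothing are deliberately left out.

```python
import math


def softmax_output_backward(data, label, use_ignore=False, ignore_label=-1):
    """Gradient of cross entropy w.r.t. the softmax input, row by row."""
    grads = []
    for row, target in zip(data, label):
        shift = max(row)
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        if use_ignore and target == ignore_label:
            # Ignored instances contribute no gradient.
            grads.append([0.0] * len(row))
        else:
            # output - one_hot(label)
            grads.append([p - (1.0 if j == target else 0.0)
                          for j, p in enumerate(probs)])
    return grads
```

Applied to the example data with `label=[1,0,2,3]`, `use_ignore=True` and `ignore_label=1`, the first row is zeroed and each remaining row is the uniform softmax output with 1 subtracted at the label position.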
Defined in src/operator/softmax_output.cc:L123
Arguments
-
data::NDArray-or-SymbolicNode
: Input array. -
label::NDArray-or-SymbolicNode
: Ground truth label. -
grad_scale::float, optional, default=1
: Scales the gradient by a float factor. -
ignore_label::float, optional, default=-1
: The instances whose labels
== ignore_label
will be ignored during backward, if use_ignore
is set to $true$. -
multi_output::boolean, optional, default=0
: If set to $true$, the softmax function will be computed along axis $1$. This is applied when the shape of input array differs from the shape of label array. -
use_ignore::boolean, optional, default=0
: If set to $true$, the ignore_label
value will not contribute to the backward gradient. -
preserve_shape::boolean, optional, default=0
: If set to $true$, the softmax function will be computed along the last axis ($-1$). -
normalization::{'batch', 'null', 'valid'},optional, default='null'
: Normalizes the gradient. -
out_grad::boolean, optional, default=0
: Multiplies gradient with output gradient element-wise. -
smooth_alpha::float, optional, default=0
: Constant for computing a label smoothed version of cross-entropy for the backwards pass. This constant gets subtracted from the one-hot encoding of the gold label and distributed uniformly to all other labels.
source
# MXNet.mx.SpatialTransformer
— Method.
SpatialTransformer(data, loc, target_shape, transform_type, sampler_type)
Applies a spatial transformer to input feature map.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the SpatialTransformerOp. -
loc::NDArray-or-SymbolicNode
: localisation net, the output dim should be 6 when transform_type is affine. You should initialize the weight and bias with identity transform. -
target_shape::Shape(tuple), optional, default=[0,0]
: output shape(h, w) of spatial transformer: (y, x) -
transform_type::{'affine'}, required
: transformation type -
sampler_type::{'bilinear'}, required
: sampling type
source
# MXNet.mx.SwapAxis
— Method.
SwapAxis(data, dim1, dim2)
Interchanges two axes of an array.
Examples::
x = [[1, 2, 3]]
swapaxes(x, 0, 1) = [[ 1], [ 2], [ 3]]
x = [[[ 0, 1], [ 2, 3]], [[ 4, 5], [ 6, 7]]] // (2,2,2) array
swapaxes(x, 0, 2) = [[[ 0, 4], [ 2, 6]], [[ 1, 5], [ 3, 7]]]
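The index rule behind these examples (the output at an index equals the input at the same index with positions dim1 and dim2 exchanged) can be sketched on nested lists. This is a plain-Python illustration that assumes a non-empty rectangular input; it is not the library function.

```python
def swapaxes(data, dim1, dim2):
    """Return a nested list with axes dim1 and dim2 interchanged."""
    # Infer the shape by walking down the first element of each level.
    shape = []
    probe = data
    while isinstance(probe, list):
        shape.append(len(probe))
        probe = probe[0]

    new_shape = list(shape)
    new_shape[dim1], new_shape[dim2] = shape[dim2], shape[dim1]

    def get(arr, index):
        for i in index:
            arr = arr[i]
        return arr

    def build(prefix):
        depth = len(prefix)
        if depth == len(new_shape):
            # Map the output index back to the source index.
            src = list(prefix)
            src[dim1], src[dim2] = src[dim2], src[dim1]
            return get(data, src)
        return [build(prefix + [i]) for i in range(new_shape[depth])]

    return build([])
```

Swapping an axis with itself returns a copy of the input, which matches the behaviour implied by the defaults `dim1=0, dim2=0`.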
Defined in src/operator/swapaxis.cc:L70
Arguments
-
data::NDArray-or-SymbolicNode
: Input array. -
dim1::int (non-negative), optional, default=0
: the first axis to be swapped. -
dim2::int (non-negative), optional, default=0
: the second axis to be swapped.
source
# MXNet.mx.UpSampling
— Method.
UpSampling(data, scale, num_filter, sample_type, multi_input_mode, num_args, workspace)
Note: UpSampling takes variable number of positional inputs. So instead of calling as UpSampling([x, y, z], num_args=3), one should call via UpSampling(x, y, z), and num_args will be determined automatically.
Performs nearest neighbor/bilinear up sampling to inputs.
Arguments
-
data::NDArray-or-SymbolicNode[]
: Array of tensors to upsample -
scale::int (non-negative), required
: Up sampling scale -
num_filter::int (non-negative), optional, default=0
: Input filter. Only used by bilinear sample_type. -
sample_type::{'bilinear', 'nearest'}, required
: upsampling method -
multi_input_mode::{'concat', 'sum'},optional, default='concat'
: How to handle multiple inputs. concat means concatenate upsampled images along the channel dimension. sum means add all images together; it is only available for nearest neighbor upsampling. -
num_args::int, required
: Number of inputs to be upsampled. For nearest neighbor upsampling, this can be 1-N; the size of output will be (scale*h_0, scale*w_0) and all other inputs will be upsampled to the same size. For bilinear upsampling this must be 2; 1 input and 1 weight. -
workspace::long (non-negative), optional, default=512
: Tmp workspace for deconvolution (MB)
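As a rough illustration of nearest neighbor upsampling with an integer scale, the following sketch repeats every pixel of a single 2-D image along both axes. It ignores batch and channel dimensions and the multi-input modes, and the function name is hypothetical.

```python
def upsample_nearest(image, scale):
    """Repeat every pixel of a 2-D image `scale` times along both axes."""
    out = []
    for row in image:
        # Widen the row first, then emit it `scale` times.
        wide = [value for value in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


up = upsample_nearest([[1, 2], [3, 4]], 2)
```

An input of size (h, w) produces an output of size (scale*h, scale*w), matching the output size given for num_args above.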
source
# MXNet.mx._CachedOp
— Method.
_CachedOp()
Arguments
source
# MXNet.mx._CrossDeviceCopy
— Method.
_CrossDeviceCopy()
Special op to copy data across devices.
Arguments
source
# MXNet.mx._CustomFunction
— Method.
_CustomFunction()
Arguments
source
# MXNet.mx._Div
— Method.
_Div(lhs, rhs)
_Div is an alias of elemwise_div.
Divides arguments element-wise.
The storage type of elemwise_div output is always dense.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._DivScalar
— Method.
_DivScalar(data, scalar)
_DivScalar is an alias of _div_scalar.
Divide an array with a scalar.
_div_scalar only operates on the data array of the input if the input is sparse.
For example, if an input of shape (100, 100) has only 2 non-zero elements, i.e. input.data = [5, 6], and scalar = nan, the result is output.data = [nan, nan] instead of 10000 nans.
Defined in src/operator/tensor/elemwise_binary_scalar_op_basic.cc:L171
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
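The sparse behavior described above can be illustrated without MXNet. In this sketch a sparse input is represented only by its list of stored values, which is a simplification chosen for the example; both function names are hypothetical.

```python
import math


def div_scalar_sparse(stored_values, scalar):
    """Divide only the explicitly stored values of a sparse array."""
    return [v / scalar for v in stored_values]


def div_scalar_dense(matrix, scalar):
    """Divide every element of a dense 2-D array."""
    return [[v / scalar for v in row] for row in matrix]


nan = float("nan")

# Sparse case: only the two stored values are touched.
sparse_out = div_scalar_sparse([5.0, 6.0], nan)

# Dense case: the same (100, 100) input yields nan in every position.
dense = [[0.0] * 100 for _ in range(100)]
dense[0][0], dense[1][1] = 5.0, 6.0
dense_out = div_scalar_dense(dense, nan)
nan_count = sum(math.isnan(v) for row in dense_out for v in row)
```

The two results differ because 0 / nan is nan in the dense case, while implicit zeros are never visited in the sparse case.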
source
# MXNet.mx._Equal
— Method.
_Equal(lhs, rhs)
_Equal is an alias of _equal.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._EqualScalar
— Method.
_EqualScalar(data, scalar)
_EqualScalar is an alias of _equal_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Greater
— Method.
_Greater(lhs, rhs)
_Greater is an alias of _greater.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._GreaterEqualScalar
— Method.
_GreaterEqualScalar(data, scalar)
_GreaterEqualScalar is an alias of _greater_equal_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._GreaterScalar
— Method.
_GreaterScalar(data, scalar)
_GreaterScalar is an alias of _greater_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Greater_Equal
— Method.
_Greater_Equal(lhs, rhs)
_Greater_Equal is an alias of _greater_equal.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._Hypot
— Method.
_Hypot(lhs, rhs)
_Hypot is an alias of _hypot.
Given the "legs" of a right triangle, return its hypotenuse.
Defined in src/operator/tensor/elemwise_binary_op_extended.cc:L79
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
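A minimal sketch of the element-wise hypotenuse on flat lists, using Python's `math.hypot`. It ignores shapes and broadcasting, and the function name is hypothetical.

```python
import math


def hypot_elemwise(lhs, rhs):
    """Return sqrt(a**2 + b**2) for each pair of legs (a, b)."""
    return [math.hypot(a, b) for a, b in zip(lhs, rhs)]


# Legs (3, 4) and (5, 12) give hypotenuses 5 and 13.
result = hypot_elemwise([3.0, 5.0], [4.0, 12.0])
```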
source
# MXNet.mx._HypotScalar
— Method.
_HypotScalar(data, scalar)
_HypotScalar is an alias of _hypot_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Lesser
— Method.
_Lesser(lhs, rhs)
_Lesser is an alias of _lesser.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._LesserEqualScalar
— Method.
_LesserEqualScalar(data, scalar)
_LesserEqualScalar is an alias of _lesser_equal_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._LesserScalar
— Method.
_LesserScalar(data, scalar)
_LesserScalar is an alias of _lesser_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Lesser_Equal
— Method.
_Lesser_Equal(lhs, rhs)
_Lesser_Equal is an alias of _lesser_equal.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._Maximum
— Method.
_Maximum(lhs, rhs)
_Maximum is an alias of _maximum.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._MaximumScalar
— Method.
_MaximumScalar(data, scalar)
_MaximumScalar is an alias of _maximum_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Minimum
— Method.
_Minimum(lhs, rhs)
_Minimum is an alias of _minimum.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._MinimumScalar
— Method.
_MinimumScalar(data, scalar)
_MinimumScalar is an alias of _minimum_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._MinusScalar
— Method.
_MinusScalar(data, scalar)
_MinusScalar is an alias of _minus_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._ModScalar
— Method.
_ModScalar(data, scalar)
_ModScalar is an alias of _mod_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Mul
— Method.
_Mul(lhs, rhs)
_Mul is an alias of elemwise_mul.
Multiplies arguments element-wise.
The storage type of elemwise_mul output depends on the storage types of the inputs:
- elemwise_mul(default, default) = default
- elemwise_mul(row_sparse, row_sparse) = row_sparse
- elemwise_mul(default, row_sparse) = default
- elemwise_mul(row_sparse, default) = default
- elemwise_mul(csr, csr) = csr
- otherwise, elemwise_mul generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
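The storage-type rules listed above amount to a small lookup. The sketch below encodes exactly that list as given; the function name is hypothetical and storage types are represented as plain strings for illustration.

```python
def elemwise_mul_storage(lhs, rhs):
    """Output storage type of elemwise_mul, per the rules listed above."""
    # Matching sparse inputs keep their sparse storage type.
    if lhs == rhs and lhs in ("row_sparse", "csr"):
        return lhs
    # Every other combination falls back to default (dense) storage.
    return "default"


mixed = elemwise_mul_storage("default", "row_sparse")
both_csr = elemwise_mul_storage("csr", "csr")
```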
source
# MXNet.mx._MulScalar
— Method.
_MulScalar(data, scalar)
_MulScalar is an alias of _mul_scalar.
Multiply an array with a scalar.
_mul_scalar only operates on the data array of the input if the input is sparse.
For example, if an input of shape (100, 100) has only 2 non-zero elements, i.e. input.data = [5, 6], and scalar = nan, the result is output.data = [nan, nan] instead of 10000 nans.
Defined in src/operator/tensor/elemwise_binary_scalar_op_basic.cc:L149
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._NDArray
— Method.
_NDArray(data, info)
Stub for an operator implemented in the native frontend language with ndarray.
Arguments
-
data::NDArray-or-SymbolicNode[]
: Input data for the custom operator. -
info::ptr, required
:
source
# MXNet.mx._Native
— Method.
_Native(data, info, need_top_grad)
Stub for an operator implemented in the native frontend language.
Arguments
-
data::NDArray-or-SymbolicNode[]
: Input data for the custom operator. -
info::ptr, required
: -
need_top_grad::boolean, optional, default=1
: Whether this layer needs out grad for backward. Should be false for loss layers.
source
# MXNet.mx._NoGradient
— Method.
_NoGradient()
Placeholder for a variable that cannot have a gradient computed.
Arguments
source
# MXNet.mx._NotEqualScalar
— Method.
_NotEqualScalar(data, scalar)
_NotEqualScalar is an alias of _not_equal_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Not_Equal
— Method.
_Not_Equal(lhs, rhs)
_Not_Equal is an alias of _not_equal.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._PlusScalar
— Method.
_PlusScalar(data, scalar)
_PlusScalar is an alias of _plus_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._Power
— Method.
_Power(lhs, rhs)
_Power is an alias of _power.
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._PowerScalar
— Method.
_PowerScalar(data, scalar)
_PowerScalar is an alias of _power_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._RDivScalar
— Method.
_RDivScalar(data, scalar)
_RDivScalar is an alias of _rdiv_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._RMinusScalar
— Method.
_RMinusScalar(data, scalar)
_RMinusScalar is an alias of _rminus_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._RModScalar
— Method.
_RModScalar(data, scalar)
_RModScalar is an alias of _rmod_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._RPowerScalar
— Method.
_RPowerScalar(data, scalar)
_RPowerScalar is an alias of _rpower_scalar.
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._add
— Method.
_add(lhs, rhs)
_add is an alias of elemwise_add.
Adds arguments element-wise.
The storage type of elemwise_add output depends on the storage types of the inputs:
- elemwise_add(row_sparse, row_sparse) = row_sparse
- elemwise_add(csr, csr) = csr
- otherwise, elemwise_add generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._arange
— Method.
_arange(start, stop, step, repeat, ctx, dtype)
Returns evenly spaced values within a given interval, similar to NumPy's arange.
Arguments
-
start::double, required
: Start of interval. The interval includes this value. The default start value is 0. -
stop::double or None, optional, default=None
: End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out. -
step::double, optional, default=1
: Spacing between values. -
repeat::int, optional, default='1'
: Number of times each element is repeated. E.g. with repeat=3, the element a is repeated three times: a, a, a. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'},optional, default='float32'
: Target data type.
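The interaction of start, stop, step and repeat can be sketched in plain Python. This is an illustration under assumptions, not the operator itself: the function below is a hypothetical stand-in, it ignores ctx and dtype, and it assumes (following NumPy's convention) that a missing stop means the interval runs from 0 to start.

```python
def arange(start, stop=None, step=1.0, repeat=1):
    """Evenly spaced values in [start, stop), each repeated `repeat` times."""
    if stop is None:
        # Assumption: with no stop given, count from 0 up to `start`.
        start, stop = 0.0, start
    values = []
    i = 0
    while (step > 0 and start + i * step < stop) or \
          (step < 0 and start + i * step > stop):
        values.extend([start + i * step] * repeat)
        i += 1
    return values


repeated = arange(2, 6, step=2, repeat=3)
counted = arange(3.0)
```

The stop value itself is excluded, in line with the description of stop above.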
source
# MXNet.mx._backward_Activation
— Method.
_backward_Activation()
Arguments
source
# MXNet.mx._backward_BatchNorm
— Method.
_backward_BatchNorm()
Arguments
source
# MXNet.mx._backward_BatchNorm_v1
— Method.
_backward_BatchNorm_v1()
Arguments
source
# MXNet.mx._backward_BilinearSampler
— Method.
_backward_BilinearSampler()
Arguments
source
# MXNet.mx._backward_CachedOp
— Method.
_backward_CachedOp()
Arguments
source
# MXNet.mx._backward_Concat
— Method.
_backward_Concat()
Arguments
source
# MXNet.mx._backward_Convolution
— Method.
_backward_Convolution()
Arguments
source
# MXNet.mx._backward_Convolution_v1
— Method.
_backward_Convolution_v1()
Arguments
source
# MXNet.mx._backward_Correlation
— Method.
_backward_Correlation()
Arguments
source
# MXNet.mx._backward_Crop
— Method.
_backward_Crop()
Arguments
source
# MXNet.mx._backward_Custom
— Method.
_backward_Custom()
Arguments
source
# MXNet.mx._backward_CustomFunction
— Method.
_backward_CustomFunction()
Arguments
source
# MXNet.mx._backward_Deconvolution
— Method.
_backward_Deconvolution()
Arguments
source
# MXNet.mx._backward_Dropout
— Method.
_backward_Dropout()
Arguments
source
# MXNet.mx._backward_Embedding
— Method.
_backward_Embedding()
Arguments
source
# MXNet.mx._backward_FullyConnected
— Method.
_backward_FullyConnected()
Arguments
source
# MXNet.mx._backward_GridGenerator
— Method.
_backward_GridGenerator()
Arguments
source
# MXNet.mx._backward_IdentityAttachKLSparseReg
— Method.
_backward_IdentityAttachKLSparseReg()
Arguments
source
# MXNet.mx._backward_InstanceNorm
— Method.
_backward_InstanceNorm()
Arguments
source
# MXNet.mx._backward_L2Normalization
— Method.
_backward_L2Normalization()
Arguments
source
# MXNet.mx._backward_LRN
— Method.
_backward_LRN()
Arguments
source
# MXNet.mx._backward_LeakyReLU
— Method.
_backward_LeakyReLU()
Arguments
source
# MXNet.mx._backward_MakeLoss
— Method.
_backward_MakeLoss()
Arguments
source
# MXNet.mx._backward_Pad
— Method.
_backward_Pad()
Arguments
source
# MXNet.mx._backward_Pooling
— Method.
_backward_Pooling()
Arguments
source
# MXNet.mx._backward_Pooling_v1
— Method.
_backward_Pooling_v1()
Arguments
source
# MXNet.mx._backward_RNN
— Method.
_backward_RNN()
Arguments
source
# MXNet.mx._backward_ROIPooling
— Method.
_backward_ROIPooling()
Arguments
source
# MXNet.mx._backward_SVMOutput
— Method.
_backward_SVMOutput()
Arguments
source
# MXNet.mx._backward_SequenceLast
— Method.
_backward_SequenceLast()
Arguments
source
# MXNet.mx._backward_SequenceMask
— Method.
_backward_SequenceMask()
Arguments
source
# MXNet.mx._backward_SequenceReverse
— Method.
_backward_SequenceReverse()
Arguments
source
# MXNet.mx._backward_SliceChannel
— Method.
_backward_SliceChannel()
Arguments
source
# MXNet.mx._backward_Softmax
— Method.
_backward_Softmax()
Arguments
source
# MXNet.mx._backward_SoftmaxActivation
— Method.
_backward_SoftmaxActivation()
Arguments
source
# MXNet.mx._backward_SoftmaxOutput
— Method.
_backward_SoftmaxOutput()
Arguments
source
# MXNet.mx._backward_SparseEmbedding
— Method.
_backward_SparseEmbedding()
Arguments
source
# MXNet.mx._backward_SpatialTransformer
— Method.
_backward_SpatialTransformer()
Arguments
source
# MXNet.mx._backward_SwapAxis
— Method.
_backward_SwapAxis()
Arguments
source
# MXNet.mx._backward_UpSampling
— Method.
_backward_UpSampling()
Arguments
source
# MXNet.mx._backward__CrossDeviceCopy
— Method.
_backward__CrossDeviceCopy()
Arguments
source
# MXNet.mx._backward__NDArray
— Method.
_backward__NDArray()
Arguments
source
# MXNet.mx._backward__Native
— Method.
_backward__Native()
Arguments
source
# MXNet.mx._backward__contrib_CTCLoss
— Method.
_backward__contrib_CTCLoss()
Arguments
source
# MXNet.mx._backward__contrib_DeformableConvolution
— Method.
_backward__contrib_DeformableConvolution()
Arguments
source
# MXNet.mx._backward__contrib_DeformablePSROIPooling
— Method.
_backward__contrib_DeformablePSROIPooling()
Arguments
source
# MXNet.mx._backward__contrib_MultiBoxDetection
— Method.
_backward__contrib_MultiBoxDetection()
Arguments
source
# MXNet.mx._backward__contrib_MultiBoxPrior
— Method.
_backward__contrib_MultiBoxPrior()
Arguments
source
# MXNet.mx._backward__contrib_MultiBoxTarget
— Method.
_backward__contrib_MultiBoxTarget()
Arguments
source
# MXNet.mx._backward__contrib_MultiProposal
— Method.
_backward__contrib_MultiProposal()
Arguments
source
# MXNet.mx._backward__contrib_PSROIPooling
— Method.
_backward__contrib_PSROIPooling()
Arguments
source
# MXNet.mx._backward__contrib_Proposal
— Method.
_backward__contrib_Proposal()
Arguments
source
# MXNet.mx._backward__contrib_count_sketch
— Method.
_backward__contrib_count_sketch()
Arguments
source
# MXNet.mx._backward__contrib_fft
— Method.
_backward__contrib_fft()
Arguments
source
# MXNet.mx._backward__contrib_ifft
— Method.
_backward__contrib_ifft()
Arguments
source
# MXNet.mx._backward_abs
— Method.
_backward_abs(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_add
— Method.
_backward_add()
Arguments
source
# MXNet.mx._backward_arccos
— Method.
_backward_arccos(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_arccosh
— Method.
_backward_arccosh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_arcsin
— Method.
_backward_arcsin(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_arcsinh
— Method.
_backward_arcsinh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_arctan
— Method.
_backward_arctan(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_arctanh
— Method.
_backward_arctanh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_batch_dot
— Method.
_backward_batch_dot()
Arguments
source
# MXNet.mx._backward_broadcast_add
— Method.
_backward_broadcast_add()
Arguments
source
# MXNet.mx._backward_broadcast_div
— Method.
_backward_broadcast_div()
Arguments
source
# MXNet.mx._backward_broadcast_hypot
— Method.
_backward_broadcast_hypot()
Arguments
source
# MXNet.mx._backward_broadcast_maximum
— Method.
_backward_broadcast_maximum()
Arguments
source
# MXNet.mx._backward_broadcast_minimum
— Method.
_backward_broadcast_minimum()
Arguments
source
# MXNet.mx._backward_broadcast_mod
— Method.
_backward_broadcast_mod()
Arguments
source
# MXNet.mx._backward_broadcast_mul
— Method.
_backward_broadcast_mul()
Arguments
source
# MXNet.mx._backward_broadcast_power
— Method.
_backward_broadcast_power()
Arguments
source
# MXNet.mx._backward_broadcast_sub
— Method.
_backward_broadcast_sub()
Arguments
source
# MXNet.mx._backward_cast
— Method.
_backward_cast()
Arguments
source
# MXNet.mx._backward_cbrt
— Method.
_backward_cbrt(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_clip
— Method.
_backward_clip()
Arguments
source
# MXNet.mx._backward_contrib_bipartite_matching
— Method.
_backward_contrib_bipartite_matching(is_ascend, threshold, topk)
Arguments
-
is_ascend::boolean, optional, default=0
: Use ascending order for scores instead of descending. Please set threshold accordingly. -
threshold::float, required
: Ignore matching when score < thresh, if is_ascend=false, or ignore score > thresh, if is_ascend=true. -
topk::int, optional, default='-1'
: Limit the number of matches to topk, set -1 for no limit
source
# MXNet.mx._backward_contrib_box_iou
— Method.
_backward_contrib_box_iou(format)
Arguments
-
format::{'center', 'corner'},optional, default='corner'
: The box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
source
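The two box encodings named in the format argument can be converted into each other. The sketch below assumes that (x, y) in the "center" encoding is the center point of the box, which the name suggests but the text does not spell out; both helper names are hypothetical.

```python
def center_to_corner(box):
    """[x, y, width, height] -> [xmin, ymin, xmax, ymax]."""
    x, y, w, h = box
    return [x - w / 2, y - h / 2, x + w / 2, y + h / 2]


def corner_to_center(box):
    """[xmin, ymin, xmax, ymax] -> [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [(xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin]


corner = center_to_corner([2.0, 2.0, 2.0, 2.0])
center = corner_to_center(corner)
```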
# MXNet.mx._backward_contrib_box_nms
— Method.
_backward_contrib_box_nms(overlap_thresh, topk, coord_start, score_index, id_index, force_suppress, in_format, out_format)
Arguments
-
overlap_thresh::float, optional, default=0.5
: Overlap (IoU) threshold used to suppress the object with the smaller score. -
topk::int, optional, default='-1'
: Apply nms to the topk boxes with descending scores, -1 for no restriction. -
coord_start::int, optional, default='2'
: Start index of the consecutive 4 coordinates. -
score_index::int, optional, default='1'
: Index of the scores/confidence of boxes. -
id_index::int, optional, default='-1'
: Optional, index of the class categories, -1 to disable. -
force_suppress::boolean, optional, default=0
: Optional. If set to false and id_index is provided, nms is applied only to boxes belonging to the same category. -
in_format::{'center', 'corner'},optional, default='corner'
: The input box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
-
out_format::{'center', 'corner'},optional, default='corner'
: The output box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
source
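The arguments above describe a non-maximum suppression pass. The following is a sketch of one common greedy formulation that uses the same parameter names and defaults; it is not the operator's implementation. It assumes corner-encoded boxes and returns only the kept rows, which may differ from the operator's actual output layout. `iou` and `box_nms` are hypothetical helpers.

```python
def iou(a, b):
    """Intersection over union of two corner-encoded boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def box_nms(rows, overlap_thresh=0.5, topk=-1, coord_start=2,
            score_index=1, id_index=-1, force_suppress=False):
    """Greedy NMS over rows; returns the rows that survive suppression."""
    # Visit boxes from highest to lowest score, optionally only the top k.
    order = sorted(range(len(rows)), key=lambda i: rows[i][score_index],
                   reverse=True)
    if topk > 0:
        order = order[:topk]
    kept = []
    for i in order:
        box_i = rows[i][coord_start:coord_start + 4]
        suppressed = False
        for j in kept:
            # Without force_suppress, only boxes of the same category compete.
            comparable = (force_suppress or id_index < 0
                          or rows[i][id_index] == rows[j][id_index])
            box_j = rows[j][coord_start:coord_start + 4]
            if comparable and iou(box_i, box_j) > overlap_thresh:
                suppressed = True
                break
        if not suppressed:
            kept.append(i)
    return [rows[i] for i in kept]


# Assumed row layout: [class_id, score, xmin, ymin, xmax, ymax].
rows = [
    [0, 0.9, 0.0, 0.0, 2.0, 2.0],
    [0, 0.8, 0.1, 0.1, 2.1, 2.1],
    [1, 0.7, 5.0, 5.0, 6.0, 6.0],
]
kept_rows = box_nms(rows)
```

In this example the second box overlaps the first heavily and has a lower score, so it is dropped, while the distant third box survives.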
# MXNet.mx._backward_copy
— Method.
_backward_copy()
Arguments
source
# MXNet.mx._backward_cos
— Method.
_backward_cos(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_cosh
— Method.
_backward_cosh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_degrees
— Method.
_backward_degrees(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_div
— Method.
_backward_div()
Arguments
source
# MXNet.mx._backward_div_scalar
— Method.
_backward_div_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._backward_dot
— Method.
_backward_dot(transpose_a, transpose_b)
Arguments
-
transpose_a::boolean, optional, default=0
: If true then transpose the first input before dot. -
transpose_b::boolean, optional, default=0
: If true then transpose the second input before dot.
source
# MXNet.mx._backward_expm1
— Method.
_backward_expm1(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_gamma
— Method.
_backward_gamma(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_gammaln
— Method.
_backward_gammaln(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_gather_nd
— Method.
_backward_gather_nd(data, indices, shape)
Accumulates data according to indices and returns the result. It is the backward of gather_nd.
Given data with shape (Y_0, ..., Y_{K-1}, X_M, ..., X_{N-1}) and indices with shape (M, Y_0, ..., Y_{K-1}), the output will have shape (X_0, X_1, ..., X_{N-1}), where M <= N. If M == N, data shape should simply be (Y_0, ..., Y_{K-1}).
The elements in output are defined as follows:
output[indices[0, y_0, ..., y_{K-1}], ..., indices[M-1, y_0, ..., y_{K-1}], x_M, ..., x_{N-1}] += data[y_0, ..., y_{K-1}, x_M, ..., x_{N-1}]
All other entries in output are 0, or the original value if AddTo is triggered.
Examples:
data = [2, 3, 0]
indices = [[1, 1, 0], [0, 1, 0]]
shape = (2, 2)
_backward_gather_nd(data, indices, shape) = [[0, 0], [2, 3]] # Same as scatter_nd
The difference between scatter_nd and scatter_nd_acc is that the latter accumulates the values that point to the same index.
data = [2, 3, 0]
indices = [[1, 1, 0], [1, 1, 0]]
shape = (2, 2)
_backward_gather_nd(data, indices, shape) = [[0, 0], [0, 5]]
Arguments
-
data::NDArray-or-SymbolicNode
: data -
indices::NDArray-or-SymbolicNode
: indices -
shape::Shape(tuple), required
: Shape of output.
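The accumulation rule can be checked against both examples with a small pure-Python sketch. It covers only the case M == N with a flat data list (so each data element is a scalar), starts from an all-zero output, and uses a hypothetical function name.

```python
def backward_gather_nd(data, indices, shape):
    """Scatter-accumulate `data` into a zero array of the given shape.

    Restricted to M == N: indices has one row per output axis, and
    column y holds the full output index for data[y].
    """
    def zeros(dims):
        if len(dims) == 1:
            return [0] * dims[0]
        return [zeros(dims[1:]) for _ in range(dims[0])]

    out = zeros(list(shape))
    last = len(shape) - 1
    for y, value in enumerate(data):
        cell = out
        # Walk down to the innermost list, then accumulate there.
        for axis in range(last):
            cell = cell[indices[axis][y]]
        cell[indices[last][y]] += value
    return out


distinct = backward_gather_nd([2, 3, 0], [[1, 1, 0], [0, 1, 0]], (2, 2))
repeated = backward_gather_nd([2, 3, 0], [[1, 1, 0], [1, 1, 0]], (2, 2))
```

In the second call the values 2 and 3 both target position (1, 1), so they are summed to 5 rather than one overwriting the other.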
source
# MXNet.mx._backward_hypot
— Method.
_backward_hypot()
Arguments
source
# MXNet.mx._backward_hypot_scalar
— Method.
_backward_hypot_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_linalg_gelqf
— Method.
_backward_linalg_gelqf()
Arguments
source
# MXNet.mx._backward_linalg_gemm
— Method.
_backward_linalg_gemm()
Arguments
source
# MXNet.mx._backward_linalg_gemm2
— Method.
_backward_linalg_gemm2()
Arguments
source
# MXNet.mx._backward_linalg_potrf
— Method.
_backward_linalg_potrf()
Arguments
source
# MXNet.mx._backward_linalg_potri
— Method.
_backward_linalg_potri()
Arguments
source
# MXNet.mx._backward_linalg_sumlogdiag
— Method.
_backward_linalg_sumlogdiag()
Arguments
source
# MXNet.mx._backward_linalg_syevd
— Method.
_backward_linalg_syevd()
Arguments
source
# MXNet.mx._backward_linalg_syrk
— Method.
_backward_linalg_syrk()
Arguments
source
# MXNet.mx._backward_linalg_trmm
— Method.
_backward_linalg_trmm()
Arguments
source
# MXNet.mx._backward_linalg_trsm
— Method.
_backward_linalg_trsm()
Arguments
source
# MXNet.mx._backward_linear_reg_out
— Method.
_backward_linear_reg_out()
Arguments
source
# MXNet.mx._backward_log
— Method.
_backward_log(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_log10
— Method.
_backward_log10(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_log1p
— Method.
_backward_log1p(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_log2
— Method.
_backward_log2(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_log_softmax
— Method.
_backward_log_softmax(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_logistic_reg_out
— Method.
_backward_logistic_reg_out()
Arguments
source
# MXNet.mx._backward_mae_reg_out
— Method.
_backward_mae_reg_out()
Arguments
source
# MXNet.mx._backward_max
— Method.
_backward_max()
Arguments
source
# MXNet.mx._backward_maximum
— Method.
_backward_maximum()
Arguments
source
# MXNet.mx._backward_maximum_scalar
— Method.
_backward_maximum_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_mean
— Method.
_backward_mean()
Arguments
source
# MXNet.mx._backward_min
— Method.
_backward_min()
Arguments
source
# MXNet.mx._backward_minimum
— Method.
_backward_minimum()
Arguments
source
# MXNet.mx._backward_minimum_scalar
— Method.
_backward_minimum_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_mod
— Method.
_backward_mod()
Arguments
source
# MXNet.mx._backward_mod_scalar
— Method.
_backward_mod_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_mul
— Method.
_backward_mul()
Arguments
source
# MXNet.mx._backward_mul_scalar
— Method.
_backward_mul_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._backward_nanprod
— Method.
_backward_nanprod()
Arguments
source
# MXNet.mx._backward_nansum
— Method.
_backward_nansum()
Arguments
source
# MXNet.mx._backward_pick
— Method.
_backward_pick()
Arguments
source
# MXNet.mx._backward_power
— Method.
_backward_power()
Arguments
source
# MXNet.mx._backward_power_scalar
— Method.
_backward_power_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_prod
— Method.
_backward_prod()
Arguments
source
# MXNet.mx._backward_radians
— Method.
_backward_radians(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_rcbrt
— Method.
_backward_rcbrt(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_rdiv_scalar
— Method.
_backward_rdiv_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_reciprocal
— Method.
_backward_reciprocal(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_relu
— Method.
_backward_relu(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_repeat
— Method.
_backward_repeat()
Arguments
source
# MXNet.mx._backward_reverse
— Method.
_backward_reverse()
Arguments
source
# MXNet.mx._backward_rmod_scalar
— Method.
_backward_rmod_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_rpower_scalar
— Method.
_backward_rpower_scalar(lhs, rhs, scalar)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input -
scalar::float
: scalar value
source
# MXNet.mx._backward_rsqrt
— Method.
_backward_rsqrt(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_sample_multinomial
— Method.
_backward_sample_multinomial()
Arguments
source
# MXNet.mx._backward_sigmoid
— Method.
_backward_sigmoid(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_sign
— Method.
_backward_sign(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_sin
— Method.
_backward_sin(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_sinh
— Method.
_backward_sinh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_slice
— Method.
_backward_slice()
Arguments
source
# MXNet.mx._backward_slice_axis
— Method.
_backward_slice_axis()
Arguments
source
# MXNet.mx._backward_smooth_l1
— Method.
_backward_smooth_l1(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_softmax
— Method.
_backward_softmax(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_softmax_cross_entropy
— Method.
_backward_softmax_cross_entropy()
Arguments
source
# MXNet.mx._backward_sparse_retain
— Method.
_backward_sparse_retain()
Arguments
source
# MXNet.mx._backward_sqrt
— Method.
_backward_sqrt(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_square
— Method.
_backward_square(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_square_sum
— Method.
_backward_square_sum()
Arguments
source
# MXNet.mx._backward_squeeze
— Method.
_backward_squeeze()
Arguments
source
# MXNet.mx._backward_stack
— Method.
_backward_stack()
Arguments
source
# MXNet.mx._backward_sub
— Method.
_backward_sub()
Arguments
source
# MXNet.mx._backward_sum
— Method.
_backward_sum()
Arguments
source
# MXNet.mx._backward_take
— Method.
_backward_take()
Arguments
source
# MXNet.mx._backward_tan
— Method.
_backward_tan(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_tanh
— Method.
_backward_tanh(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._backward_tile
— Method.
_backward_tile()
Arguments
source
# MXNet.mx._backward_topk
— Method.
_backward_topk()
Arguments
source
# MXNet.mx._backward_where
— Method.
_backward_where()
Arguments
source
# MXNet.mx._broadcast_add!
— Method.
_broadcast_add!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L51
source
# MXNet.mx._broadcast_add
— Method.
_broadcast_add(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L51
source
# MXNet.mx._broadcast_backward
— Method.
_broadcast_backward()
Arguments
source
# MXNet.mx._broadcast_div!
— Method.
_broadcast_div!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L157
source
# MXNet.mx._broadcast_div
— Method.
_broadcast_div(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L157
source
# MXNet.mx._broadcast_equal!
— Method.
_broadcast_equal!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L46
source
# MXNet.mx._broadcast_equal
— Method.
_broadcast_equal(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L46
source
# MXNet.mx._broadcast_greater!
— Method.
_broadcast_greater!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L82
source
# MXNet.mx._broadcast_greater
— Method.
_broadcast_greater(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L82
source
# MXNet.mx._broadcast_greater_equal!
— Method.
_broadcast_greater_equal!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L100
source
# MXNet.mx._broadcast_greater_equal
— Method.
_broadcast_greater_equal(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L100
source
# MXNet.mx._broadcast_hypot!
— Method.
_broadcast_hypot!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L156
source
# MXNet.mx._broadcast_hypot
— Method.
_broadcast_hypot(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L156
source
# MXNet.mx._broadcast_lesser!
— Method.
_broadcast_lesser!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L118
source
# MXNet.mx._broadcast_lesser
— Method.
_broadcast_lesser(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L118
source
# MXNet.mx._broadcast_lesser_equal!
— Method.
_broadcast_lesser_equal!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L136
source
# MXNet.mx._broadcast_lesser_equal
— Method.
_broadcast_lesser_equal(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L136
source
# MXNet.mx._broadcast_maximum!
— Method.
_broadcast_maximum!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L80
source
# MXNet.mx._broadcast_maximum
— Method.
_broadcast_maximum(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L80
source
# MXNet.mx._broadcast_minimum!
— Method.
_broadcast_minimum!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L115
source
# MXNet.mx._broadcast_minimum
— Method.
_broadcast_minimum(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L115
source
# MXNet.mx._broadcast_minus!
— Method.
_broadcast_minus!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L90
source
# MXNet.mx._broadcast_minus
— Method.
_broadcast_minus(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L90
source
# MXNet.mx._broadcast_mod!
— Method.
_broadcast_mod!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L190
source
# MXNet.mx._broadcast_mod
— Method.
_broadcast_mod(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L190
source
# MXNet.mx._broadcast_mul!
— Method.
_broadcast_mul!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L123
source
# MXNet.mx._broadcast_mul
— Method.
_broadcast_mul(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L123
source
# MXNet.mx._broadcast_not_equal!
— Method.
_broadcast_not_equal!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L64
source
# MXNet.mx._broadcast_not_equal
— Method.
_broadcast_not_equal(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:L64
source
# MXNet.mx._broadcast_power!
— Method.
_broadcast_power!(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L45
source
# MXNet.mx._broadcast_power
— Method.
_broadcast_power(x::NDArray, y::NDArray)
Defined in src/operator/tensor/elemwise_binary_broadcast_op_extended.cc:L45
source
# MXNet.mx._contrib_CTCLoss
— Method.
_contrib_CTCLoss(data, label, data_lengths, label_lengths, use_data_lengths, use_label_lengths, blank_label)
Connectionist Temporal Classification Loss.
The shapes of the inputs and outputs:
-
data:
(sequence_length, batch_size, alphabet_size)
-
label:
(batch_size, label_sequence_length)
-
out:
(batch_size)
The data
tensor consists of sequences of activation vectors (without applying softmax), with the i-th channel in the last dimension corresponding to the i-th label for i between 0 and alphabet_size-1 (i.e. always 0-indexed). Alphabet size should include one additional value reserved for the blank label. When blank_label
is $"first"$, the $0$-th channel is reserved for the activation of the blank label; otherwise, if it is "last", the $(alphabet_size-1)$-th channel is reserved for the blank label.
$label$ is an index matrix of integers. When blank_label
is $"first"$, the value 0 is then reserved for blank label, and should not be passed in this matrix. Otherwise, when blank_label
is $"last"$, the value (alphabet_size-1)
is reserved for blank label.
If a sequence of labels is shorter than label_sequence_length, use the special padding value at the end of the sequence to conform it to the correct length. The padding value is 0
when blank_label
is $"first"$, and -1
otherwise.
For example, suppose the vocabulary is [a, b, c]
, and in one batch we have three sequences 'ba', 'cbb', and 'abac'. When blank_label
is $"first"$, we can index the labels as {'a': 1, 'b': 2, 'c': 3}
, and we reserve the 0-th channel for blank label in data tensor. The resulting label
tensor should be padded to be::
[[2, 1, 0, 0], [3, 2, 2, 0], [1, 2, 1, 3]]
When blank_label
is $"last"$, we can index the labels as {'a': 0, 'b': 1, 'c': 2}
, and we reserve the channel index 3 for blank label in data tensor. The resulting label
tensor should be padded to be::
[[1, 0, -1, -1], [2, 1, 1, -1], [0, 1, 0, 2]]
$out$ is a list of CTC loss values, one per example in the batch.
See Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks, A. Graves et al. for more information on the definition and the algorithm.
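The padding convention above can be sketched in plain Python. This is only an illustration of how the label matrix is laid out, not part of the MXNet API; the helper name `encode_ctc_labels` and its parameters are hypothetical.

```python
def encode_ctc_labels(sequences, vocabulary, blank_label="first"):
    """Build a padded label matrix following the convention described above.

    Hypothetical helper for illustration; it is not provided by MXNet.
    """
    if blank_label == "first":
        offset, padding = 1, 0    # index 0 is the blank, padding value is 0
    elif blank_label == "last":
        offset, padding = 0, -1   # last index is the blank, padding value is -1
    else:
        raise ValueError("blank_label must be 'first' or 'last'")
    index = {token: i + offset for i, token in enumerate(vocabulary)}
    width = max(len(seq) for seq in sequences)
    # Encode every token, then pad each row to the longest sequence.
    return [[index[t] for t in seq] + [padding] * (width - len(seq))
            for seq in sequences]


labels_first = encode_ctc_labels(["ba", "cbb", "abac"], "abc", "first")
labels_last = encode_ctc_labels(["ba", "cbb", "abac"], "abc", "last")
print(labels_first)  # [[2, 1, 0, 0], [3, 2, 2, 0], [1, 2, 1, 3]]
print(labels_last)   # [[1, 0, -1, -1], [2, 1, 1, -1], [0, 1, 0, 2]]
```

Both printed matrices match the two padded label tensors given in the description.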
Defined in src/operator/contrib/ctc_loss.cc:L115
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the ctc_loss op. -
label::NDArray-or-SymbolicNode
: Ground-truth labels for the loss. -
data_lengths::NDArray-or-SymbolicNode
: Lengths of data for each of the samples. Only required when use_data_lengths is true. -
label_lengths::NDArray-or-SymbolicNode
: Lengths of labels for each of the samples. Only required when use_label_lengths is true. -
use_data_lengths::boolean, optional, default=0
: Whether the data lengths are decided by data_lengths
. If false, the lengths are equal to the max sequence length. -
use_label_lengths::boolean, optional, default=0
: Whether the label lengths are decided by label_lengths
, or derived from padding_mask
. If false, the lengths are derived from the first occurrence of the value of padding_mask
. The value of padding_mask
is $0$ when the first CTC label is reserved for blank, and $-1$ when the last label is reserved for blank. See blank_label
. -
blank_label::{'first', 'last'},optional, default='first'
: Set the label that is reserved for blank label. If "first", 0-th label is reserved, and label values for tokens in the vocabulary are between $1$ and $alphabet_size-1$, and the padding mask is $0$. If "last", last label value $alphabet_size-1$ is reserved for blank label instead, and label values for tokens in the vocabulary are between $0$ and $alphabet_size-2$, and the padding mask is $-1$.
source
# MXNet.mx._contrib_DeformableConvolution
— Method.
_contrib_DeformableConvolution(data, offset, weight, bias, kernel, stride, dilate, pad, num_filter, num_group, num_deformable_group, workspace, no_bias, layout)
Compute 2-D deformable convolution on 4-D input.
The deformable convolution operation is described in https://arxiv.org/abs/1703.06211
For 2-D deformable convolution, the shapes are
- data: (batch_size, channel, height, width)
- offset: (batch_size, num_deformable_group * kernel[0] * kernel[1], height, width)
- weight: (num_filter, channel, kernel[0], kernel[1])
- bias: (num_filter,)
- out: (batch_size, num_filter, out_height, out_width).
Define::
f(x,k,p,s,d) = floor((x+2p-d(k-1)-1)/s)+1
then we have::
out_height=f(height, kernel[0], pad[0], stride[0], dilate[0])
out_width=f(width, kernel[1], pad[1], stride[1], dilate[1])
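As a quick check of the formula, the output shape can be computed in a few lines of Python. The function names below are hypothetical helpers for illustration only, and the default stride, dilation and padding follow the defaults stated in the argument list (1, 1 and no padding).

```python
def conv_out_size(x, k, p=0, s=1, d=1):
    # f(x,k,p,s,d) = floor((x + 2p - d(k-1) - 1) / s) + 1
    return (x + 2 * p - d * (k - 1) - 1) // s + 1


def deformable_conv_out_shape(data_shape, num_filter, kernel,
                              stride=(1, 1), dilate=(1, 1), pad=(0, 0)):
    """Output shape for NCHW input; hypothetical helper, not an MXNet call."""
    batch_size, _, height, width = data_shape
    out_height = conv_out_size(height, kernel[0], pad[0], stride[0], dilate[0])
    out_width = conv_out_size(width, kernel[1], pad[1], stride[1], dilate[1])
    return (batch_size, num_filter, out_height, out_width)


# A 3x3 kernel with padding 1 preserves the spatial size.
same_shape = deformable_conv_out_shape((2, 3, 32, 32), num_filter=8,
                                       kernel=(3, 3), pad=(1, 1))
# Stride 2 without padding shrinks a 7x7 map to 3x3.
strided_shape = deformable_conv_out_shape((1, 3, 7, 7), num_filter=4,
                                          kernel=(3, 3), stride=(2, 2))
print(same_shape)     # (2, 8, 32, 32)
print(strided_shape)  # (1, 4, 3, 3)
```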
If $no_bias$ is set to be true, then the $bias$ term is ignored.
The default data $layout$ is NCHW, namely (batch_size, channel, height, width).
If $num_group$ is larger than 1, denoted by g, then split the input $data$ evenly into g parts along the channel axis, and also evenly split $weight$ along the first dimension. Next compute the convolution on the i-th part of the data with the i-th weight part. The output is obtained by concatenating all the g results.
If $num_deformable_group$ is larger than 1, denoted by dg, then split the input $offset$ evenly into dg parts along the channel axis, and also split $out$ evenly into dg parts along the channel axis. Next compute the deformable convolution, applying the i-th part of the offset to the i-th part of the output.
Both $weight$ and $bias$ are learnable parameters.
Defined in src/operator/contrib/deformable_convolution.cc:L100
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the DeformableConvolutionOp. -
offset::NDArray-or-SymbolicNode
: Input offset to the DeformableConvolutionOp. -
weight::NDArray-or-SymbolicNode
: Weight matrix. -
bias::NDArray-or-SymbolicNode
: Bias parameter. -
kernel::Shape(tuple), required
: Convolution kernel size: (h, w) or (d, h, w) -
stride::Shape(tuple), optional, default=[]
: Convolution stride: (h, w) or (d, h, w). Defaults to 1 for each dimension. -
dilate::Shape(tuple), optional, default=[]
: Convolution dilate: (h, w) or (d, h, w). Defaults to 1 for each dimension. -
pad::Shape(tuple), optional, default=[]
: Zero pad for convolution: (h, w) or (d, h, w). Defaults to no padding. -
num_filter::int (non-negative), required
: Convolution filter(channel) number -
num_group::int (non-negative), optional, default=1
: Number of group partitions. -
num_deformable_group::int (non-negative), optional, default=1
: Number of deformable group partitions. -
workspace::long (non-negative), optional, default=1024
: Maximum temporary workspace allowed for convolution (MB). -
no_bias::boolean, optional, default=0
: Whether to disable bias parameter. -
layout::{None, 'NCDHW', 'NCHW', 'NCW'},optional, default='None'
: Set layout for input, output and weight. Empty for default layout: NCW for 1d, NCHW for 2d and NCDHW for 3d.
source
# MXNet.mx._contrib_DeformablePSROIPooling
— Method.
_contrib_DeformablePSROIPooling(data, rois, trans, spatial_scale, output_dim, group_size, pooled_size, part_size, sample_per_part, trans_std, no_trans)
Performs deformable position-sensitive region-of-interest pooling on inputs. The DeformablePSROIPooling operation is described in https://arxiv.org/abs/1703.06211. After DeformablePSROIPooling, batch_size changes to the number of region bounding boxes.
Arguments
-
data::SymbolicNode
: Input data to the pooling operator, a 4D feature map -
rois::SymbolicNode
: Bounding box coordinates, a 2D array of [[batch_index, x1, y1, x2, y2]]. (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the designated region of interest. batch_index indicates the index of the corresponding image in the input data -
trans::SymbolicNode
: transition parameter -
spatial_scale::float, required
: Ratio of input feature map height (or width) to raw image height (or width). Equals the reciprocal of the total stride in convolutional layers -
output_dim::int, required
: fix output dim -
group_size::int, required
: fix group size -
pooled_size::int, required
: fix pooled size -
part_size::int, optional, default='0'
: fix part size -
sample_per_part::int, optional, default='1'
: fix samples per part -
trans_std::float, optional, default=0
: fix transition std -
no_trans::boolean, optional, default=0
: Whether to disable trans parameter.
source
# MXNet.mx._contrib_MultiBoxDetection
— Method.
_contrib_MultiBoxDetection(cls_prob, loc_pred, anchor, clip, threshold, background_id, nms_threshold, force_suppress, variances, nms_topk)
Convert multibox detection predictions.
Arguments
-
cls_prob::NDArray-or-SymbolicNode
: Class probabilities. -
loc_pred::NDArray-or-SymbolicNode
: Location regression predictions. -
anchor::NDArray-or-SymbolicNode
: Multibox prior anchor boxes -
clip::boolean, optional, default=1
: Clip out-of-boundary boxes. -
threshold::float, optional, default=0.01
: Threshold to be a positive prediction. -
background_id::int, optional, default='0'
: Background id. -
nms_threshold::float, optional, default=0.5
: Non-maximum suppression threshold. -
force_suppress::boolean, optional, default=0
: Suppress all detections regardless of class_id. -
variances::tuple of float, optional, default=[0.1,0.1,0.2,0.2]
: Variances to be decoded from box regression output. -
nms_topk::int, optional, default='-1'
: Keep maximum top k detections before nms, -1 for no limit.
source
# MXNet.mx._contrib_MultiBoxPrior
— Method.
_contrib_MultiBoxPrior(data, sizes, ratios, clip, steps, offsets)
Generate prior(anchor) boxes from data, sizes and ratios.
Arguments
-
data::NDArray-or-SymbolicNode
: Input data. -
sizes::tuple of float, optional, default=[1]
: List of sizes of generated MultiBoxPriors. -
ratios::tuple of float, optional, default=[1]
: List of aspect ratios of generated MultiBoxPriors. -
clip::boolean, optional, default=0
: Whether to clip out-of-boundary boxes. -
steps::tuple of float, optional, default=[-1,-1]
: Priorbox step across y and x, -1 for auto calculation. -
offsets::tuple of float, optional, default=[0.5,0.5]
: Priorbox center offsets, y and x respectively.
source
# MXNet.mx._contrib_MultiBoxTarget
— Method.
_contrib_MultiBoxTarget(anchor, label, cls_pred, overlap_threshold, ignore_label, negative_mining_ratio, negative_mining_thresh, minimum_negative_samples, variances)
Compute Multibox training targets
Arguments
-
anchor::NDArray-or-SymbolicNode
: Generated anchor boxes. -
label::NDArray-or-SymbolicNode
: Object detection labels. -
cls_pred::NDArray-or-SymbolicNode
: Class predictions. -
overlap_threshold::float, optional, default=0.5
: Anchor-GT overlap threshold to be regarded as a positive match. -
ignore_label::float, optional, default=-1
: Label for ignored anchors. -
negative_mining_ratio::float, optional, default=-1
: Max negative to positive samples ratio, use -1 to disable mining -
negative_mining_thresh::float, optional, default=0.5
: Threshold used for negative mining. -
minimum_negative_samples::int, optional, default='0'
: Minimum number of negative samples. -
variances::tuple of float, optional, default=[0.1,0.1,0.2,0.2]
: Variances to be encoded in box regression target.
source
# MXNet.mx._contrib_MultiProposal
— Method.
_contrib_MultiProposal(cls_score, bbox_pred, im_info, rpn_pre_nms_top_n, rpn_post_nms_top_n, threshold, rpn_min_size, scales, ratios, feature_stride, output_score, iou_loss)
Generate region proposals via RPN
Arguments
-
cls_score::NDArray-or-SymbolicNode
: Score of how likely proposal is object. -
bbox_pred::NDArray-or-SymbolicNode
: BBox Predicted deltas from anchors for proposals -
im_info::NDArray-or-SymbolicNode
: Image size and scale. -
rpn_pre_nms_top_n::int, optional, default='6000'
: Number of top scoring boxes to keep before applying NMS to RPN proposals -
rpn_post_nms_top_n::int, optional, default='300'
: Number of top scoring boxes to keep after applying NMS to RPN proposals -
threshold::float, optional, default=0.7
: Overlap threshold used for non-maximum suppression (suppress boxes with IoU >= this threshold). -
rpn_min_size::int, optional, default='16'
: Minimum height or width in proposal -
scales::tuple of float, optional, default=[4,8,16,32]
: Used to generate anchor windows by enumerating scales -
ratios::tuple of float, optional, default=[0.5,1,2]
: Used to generate anchor windows by enumerating ratios -
feature_stride::int, optional, default='16'
: The size of the receptive field of each unit in the convolution layer of the RPN, for example the product of all strides prior to this layer. -
output_score::boolean, optional, default=0
: Add score to outputs -
iou_loss::boolean, optional, default=0
: Usage of IoU Loss
source
# MXNet.mx._contrib_PSROIPooling
— Method.
_contrib_PSROIPooling(data, rois, spatial_scale, output_dim, pooled_size, group_size)
Performs region-of-interest pooling on inputs. Resize bounding box coordinates by spatial_scale and crop input feature maps accordingly. The cropped feature maps are pooled by max pooling to a fixed size output indicated by pooled_size. batch_size will change to the number of region bounding boxes after PSROIPooling
Arguments
-
data::SymbolicNode
: Input data to the pooling operator, a 4D feature map -
rois::SymbolicNode
: Bounding box coordinates, a 2D array of [[batch_index, x1, y1, x2, y2]]. (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the designated region of interest. batch_index indicates the index of the corresponding image in the input data -
spatial_scale::float, required
: Ratio of input feature map height (or width) to raw image height (or width). Equals the reciprocal of the total stride in convolutional layers -
output_dim::int, required
: fix output dim -
pooled_size::int, required
: fix pooled size -
group_size::int, optional, default='0'
: fix group size
source
# MXNet.mx._contrib_Proposal
— Method.
_contrib_Proposal(cls_score, bbox_pred, im_info, rpn_pre_nms_top_n, rpn_post_nms_top_n, threshold, rpn_min_size, scales, ratios, feature_stride, output_score, iou_loss)
Generate region proposals via RPN
Arguments
-
cls_score::NDArray-or-SymbolicNode
: Score of how likely proposal is object. -
bbox_pred::NDArray-or-SymbolicNode
: BBox Predicted deltas from anchors for proposals -
im_info::NDArray-or-SymbolicNode
: Image size and scale. -
rpn_pre_nms_top_n::int, optional, default='6000'
: Number of top scoring boxes to keep before applying NMS to RPN proposals -
rpn_post_nms_top_n::int, optional, default='300'
: Number of top scoring boxes to keep after applying NMS to RPN proposals -
threshold::float, optional, default=0.7
: Overlap threshold used for non-maximum suppression (suppress boxes with IoU >= this threshold). -
rpn_min_size::int, optional, default='16'
: Minimum height or width in proposal -
scales::tuple of float, optional, default=[4,8,16,32]
: Used to generate anchor windows by enumerating scales -
ratios::tuple of float, optional, default=[0.5,1,2]
: Used to generate anchor windows by enumerating ratios -
feature_stride::int, optional, default='16'
: The size of the receptive field of each unit in the convolution layer of the RPN, for example the product of all strides prior to this layer. -
output_score::boolean, optional, default=0
: Add score to outputs -
iou_loss::boolean, optional, default=0
: Usage of IoU Loss
source
# MXNet.mx._contrib_SparseEmbedding
— Method.
_contrib_SparseEmbedding(data, weight, input_dim, output_dim, dtype)
Maps integer indices to vector representations (embeddings).
This operator maps words to real-valued vectors in a high-dimensional space, called word embeddings. These embeddings can capture semantic and syntactic properties of the words. For example, it has been noted that in the learned embedding spaces, similar words tend to be close to each other and dissimilar words far apart.
For an input array of shape (d1, ..., dK), the shape of an output array is (d1, ..., dK, output_dim). All the input values should be integers in the range [0, input_dim).
If the input_dim is ip0 and output_dim is op0, then shape of the embedding weight matrix must be (ip0, op0).
The storage type of weight must be row_sparse
, and the gradient of the weight will be of row_sparse
storage type, too.
.. Note::
`SparseEmbedding` is designed for the use case where `input_dim` is very large (e.g. 100k).
The operator is available on both CPU and GPU.
Examples::
  input_dim = 4
  output_dim = 5

  // Each row in weight matrix y represents a word. So, y = (w0,w1,w2,w3)
  y = [[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.]]

  // Input array x represents n-grams(2-gram). So, x = [(w1,w3), (w0,w2)]
  x = [[ 1.,  3.],
       [ 0.,  2.]]

  // Mapped input x to its vector representation y.
  SparseEmbedding(x, y, 4, 5) = [[[  5.,   6.,   7.,   8.,   9.],
                                  [ 15.,  16.,  17.,  18.,  19.]],

                                 [[  0.,   1.,   2.,   3.,   4.],
                                  [ 10.,  11.,  12.,  13.,  14.]]]
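The lookup semantics of the example can be reproduced with nested Python lists. This sketch only mirrors the index-to-row mapping; it ignores the row_sparse storage type and gradients, and `embedding_lookup` is a hypothetical name rather than an MXNet function.

```python
def embedding_lookup(indices, weight):
    """Replace every index in a (possibly nested) list by the matching weight row."""
    if isinstance(indices, list):
        return [embedding_lookup(item, weight) for item in indices]
    # Indices arrive as floats in the example, so cast before indexing.
    return list(weight[int(indices)])


# Weight matrix y of shape (input_dim, output_dim) = (4, 5), as in the example.
y = [[float(5 * row + col) for col in range(5)] for row in range(4)]
x = [[1.0, 3.0], [0.0, 2.0]]

out = embedding_lookup(x, y)
for pair in out:
    print(pair)
```

An input of shape (2, 2) yields an output of shape (2, 2, 5), which agrees with the rule that (d1, ..., dK) maps to (d1, ..., dK, output_dim).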
Defined in src/operator/tensor/indexing_op.cc:L294
Arguments
-
data::NDArray-or-SymbolicNode
: The input array to the embedding operator. -
weight::NDArray-or-SymbolicNode
: The embedding weight matrix. -
input_dim::int, required
: Vocabulary size of the input indices. -
output_dim::int, required
: Dimension of the embedding vectors. -
dtype::{'float16', 'float32', 'float64', 'int32', 'uint8'},optional, default='float32'
: Data type of weight.
source
# MXNet.mx._contrib_bipartite_matching
— Method.
_contrib_bipartite_matching(data, is_ascend, threshold, topk)
Compute bipartite matching. The matching is performed on score matrix with shape [B, N, M]
- B: batch_size
- N: number of rows to match
- M: number of columns as reference to be matched against.
Returns: x : matched column indices. -1 indicating non-matched elements in rows. y : matched row indices.
Note::
Zero gradients are back-propagated in this op for now.
Example::
s = [[0.5, 0.6], [0.1, 0.2], [0.3, 0.4]]
x, y = bipartite_matching(s, threshold=1e-12, is_ascend=False)
x = [1, -1, 0]
y = [2, 0]
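One way to arrive at the result above is a greedy pass over the scores in sorted order. The sketch below assumes that strategy, which reproduces the example but is not confirmed by this page as the operator's exact algorithm. It handles a single (N, M) score matrix and omits the batch dimension.

```python
def bipartite_matching(scores, threshold, is_ascend=False, topk=-1):
    """Greedy matching sketch (assumed strategy, plain Python, no batch axis)."""
    n_rows, n_cols = len(scores), len(scores[0])
    row_match = [-1] * n_rows   # x: matched column for each row, -1 if unmatched
    col_match = [-1] * n_cols   # y: matched row for each column, -1 if unmatched
    candidates = sorted(
        ((scores[r][c], r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda item: item[0],
        reverse=not is_ascend,
    )
    matched = 0
    for score, r, c in candidates:
        if topk >= 0 and matched >= topk:
            break
        # Scores are sorted, so once one fails the threshold all later ones do too.
        if (score > threshold) if is_ascend else (score < threshold):
            break
        if row_match[r] == -1 and col_match[c] == -1:
            row_match[r] = c
            col_match[c] = r
            matched += 1
    return row_match, col_match


s = [[0.5, 0.6], [0.1, 0.2], [0.3, 0.4]]
x, y = bipartite_matching(s, threshold=1e-12, is_ascend=False)
print(x)  # [1, -1, 0]
print(y)  # [2, 0]
```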
Defined in src/operator/contrib/bounding_box.cc:L169
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
is_ascend::boolean, optional, default=0
: Use ascend order for scores instead of descending. Please set threshold accordingly. -
threshold::float, required
: Ignore matching when score < thresh, if is_ascend=false, or ignore score > thresh, if is_ascend=true. -
topk::int, optional, default='-1'
: Limit the number of matches to topk, set -1 for no limit
source
# MXNet.mx._contrib_box_iou
— Method.
_contrib_box_iou(lhs, rhs, format)
Bounding box overlap of two arrays. The overlap is defined as Intersection-over-Union, aka, IOU.
- lhs: (a_1, a_2, ..., a_n, 4) array
- rhs: (b_1, b_2, ..., b_n, 4) array
- output: (a_1, a_2, ..., a_n, b_1, b_2, ..., b_n) array
Note::
Zero gradients are back-propagated in this op for now.
Example::
x = [[0.5, 0.5, 1.0, 1.0], [0.0, 0.0, 0.5, 0.5]]
y = [0.25, 0.25, 0.75, 0.75]
box_iou(x, y, format='corner') = [[0.1428], [0.1428]]
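The value 0.1428 in the example follows directly from the Intersection-over-Union definition. Here is a minimal pure-Python sketch for the 'corner' format; `box_iou_corner` is a hypothetical helper, not the operator itself.

```python
def box_iou_corner(a, b):
    """IoU of two boxes encoded as [xmin, ymin, xmax, ymax]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


x = [[0.5, 0.5, 1.0, 1.0], [0.0, 0.0, 0.5, 0.5]]
y = [0.25, 0.25, 0.75, 0.75]

# One IoU per box in x against the single box y, giving a (2, 1) result.
ious = [[box_iou_corner(box, y)] for box in x]
print(ious)
```

Each box in x shares a 0.25 x 0.25 corner with y, so the IoU is 0.0625 / (0.25 + 0.25 - 0.0625) = 1/7, roughly 0.1428.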
Defined in src/operator/contrib/bounding_box.cc:L123
Arguments
-
lhs::NDArray-or-SymbolicNode
: The first input -
rhs::NDArray-or-SymbolicNode
: The second input -
format::{'center', 'corner'},optional, default='corner'
: The box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
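The two encodings can be converted into each other as sketched below. This assumes that (x, y) in the 'center' format denotes the box center, which the name suggests but this page does not state explicitly. Both helper names are hypothetical.

```python
def center_to_corner(box):
    """[x, y, width, height] -> [xmin, ymin, xmax, ymax], assuming (x, y) is the center."""
    x, y, w, h = box
    return [x - w / 2, y - h / 2, x + w / 2, y + h / 2]


def corner_to_center(box):
    """[xmin, ymin, xmax, ymax] -> [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [(xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin]


corner = center_to_corner([0.5, 0.5, 1.0, 1.0])
center = corner_to_center(corner)
print(corner)  # [0.0, 0.0, 1.0, 1.0]
print(center)  # [0.5, 0.5, 1.0, 1.0]
```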
source
# MXNet.mx._contrib_box_nms
— Method.
_contrib_box_nms(data, overlap_thresh, topk, coord_start, score_index, id_index, force_suppress, in_format, out_format)
Apply non-maximum suppression to input.
The output will be sorted in descending order according to score
. Boxes with overlaps larger than overlap_thresh
and smaller scores will be removed and filled with -1; the corresponding position will be recorded for backward propagation.
During back-propagation, the gradient will be copied to the original position according to the input index. For positions that have been suppressed, the in_grad will be assigned 0. In summary, gradients stick to their boxes and are either moved or discarded according to their original index in the input.
Input requirements:
- The input tensor has at least 2 dimensions, (n, k); any higher dims will be regarded
as batch, e.g. (a, b, c, d, n, k) == (a*b*c*d, n, k)
- n is the number of boxes in each batch
- k is the width of each box item.
By default, a box is [id, score, xmin, ymin, xmax, ymax, ...], additional elements are allowed.
-
id_index
: optional, use -1 to ignore, useful if force_suppress=False
, which means
we will skip highly overlapped boxes if one is apple
while the other is car
.
-
coord_start
: required, default=2, the starting index of the 4 coordinates.
Two formats are supported: corner
: [xmin, ymin, xmax, ymax] center
: [x, y, width, height]
-
score_index
: required, default=1, box score/confidence.
When two boxes overlap IOU > overlap_thresh
, the one with smaller score will be suppressed.
-
in_format
and out_format
: default='corner', specify in/out box formats.
Examples::
x = [[0, 0.5, 0.1, 0.1, 0.2, 0.2],
     [1, 0.4, 0.1, 0.1, 0.2, 0.2],
     [0, 0.3, 0.1, 0.1, 0.14, 0.14],
     [2, 0.6, 0.5, 0.5, 0.7, 0.8]]
box_nms(x, overlap_thresh=0.1, coord_start=2, score_index=1, id_index=0,
        force_suppress=True, in_format='corner', out_format='corner') =
    [[2, 0.6, 0.5, 0.5, 0.7, 0.8],
     [0, 0.5, 0.1, 0.1, 0.2, 0.2],
     [-1, -1, -1, -1, -1, -1],
     [-1, -1, -1, -1, -1, -1]]
out_grad = [[0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
            [0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
            [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
            [0.4, 0.4, 0.4, 0.4, 0.4, 0.4]]
exe.backward
in_grad = [[0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
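The forward and backward behaviour in this example can be traced with a small pure-Python sketch. It assumes the common greedy NMS procedure over boxes sorted by score, handles only the 'corner' format and a single batch, and leaves out topk. The function names are illustrative and not the operator's implementation.

```python
def iou(a, b):
    """IoU of two [xmin, ymin, xmax, ymax] boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def box_nms(boxes, overlap_thresh=0.5, coord_start=2, score_index=1,
            id_index=-1, force_suppress=False):
    """Greedy NMS sketch; returns the output rows and the kept input indices."""
    order = sorted(range(len(boxes)),
                   key=lambda i: boxes[i][score_index], reverse=True)
    kept = []
    for i in order:
        coords_i = boxes[i][coord_start:coord_start + 4]
        suppressed = False
        for j in kept:
            # Without force_suppress, only boxes of the same class compete.
            comparable = (force_suppress or id_index < 0
                          or boxes[i][id_index] == boxes[j][id_index])
            coords_j = boxes[j][coord_start:coord_start + 4]
            if comparable and iou(coords_i, coords_j) > overlap_thresh:
                suppressed = True
                break
        if not suppressed:
            kept.append(i)
    width = len(boxes[0])
    out = [list(boxes[i]) for i in kept]
    out += [[-1] * width for _ in range(len(boxes) - len(kept))]
    return out, kept


def box_nms_backward(out_grad, kept, num_boxes):
    """Copy each kept row's gradient back to its input position; others get 0."""
    in_grad = [[0] * len(out_grad[0]) for _ in range(num_boxes)]
    for out_row, in_row in enumerate(kept):
        in_grad[in_row] = list(out_grad[out_row])
    return in_grad


x = [[0, 0.5, 0.1, 0.1, 0.2, 0.2],
     [1, 0.4, 0.1, 0.1, 0.2, 0.2],
     [0, 0.3, 0.1, 0.1, 0.14, 0.14],
     [2, 0.6, 0.5, 0.5, 0.7, 0.8]]
out, kept = box_nms(x, overlap_thresh=0.1, id_index=0, force_suppress=True)
out_grad = [[0.1] * 6, [0.2] * 6, [0.3] * 6, [0.4] * 6]
in_grad = box_nms_backward(out_grad, kept, len(x))
print(out)
print(in_grad)
```

Rows 1 and 2 overlap row 0 with IoU 1.0 and about 0.16, both above 0.1, so only rows 3 and 0 survive. Their gradients are routed back to input positions 3 and 0, as in the example.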
Defined in src/operator/contrib/bounding_box.cc:L82
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
overlap_thresh::float, optional, default=0.5
: Overlapping(IoU) threshold to suppress object with smaller score. -
topk::int, optional, default='-1'
: Apply nms to topk boxes with descending scores, -1 for no restriction. -
coord_start::int, optional, default='2'
: Start index of the consecutive 4 coordinates. -
score_index::int, optional, default='1'
: Index of the scores/confidence of boxes. -
id_index::int, optional, default='-1'
: Optional, index of the class categories, -1 to disable. -
force_suppress::boolean, optional, default=0
: Optional, if set false and id_index is provided, nms will only apply to boxes belonging to the same category -
in_format::{'center', 'corner'},optional, default='corner'
: The input box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
-
out_format::{'center', 'corner'},optional, default='corner'
: The output box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
source
# MXNet.mx._contrib_box_non_maximum_suppression
— Method.
_contrib_box_non_maximum_suppression(data, overlap_thresh, topk, coord_start, score_index, id_index, force_suppress, in_format, out_format)
_contrib_box_non_maximum_suppression is an alias of _contrib_box_nms.
Apply non-maximum suppression to input.
The output will be sorted in descending order according to score
. Boxes with overlaps larger than overlap_thresh
and smaller scores will be removed and filled with -1; the corresponding position will be recorded for backward propagation.
During back-propagation, the gradient will be copied to the original position according to the input index. For positions that have been suppressed, the in_grad will be assigned 0. In summary, gradients stick to their boxes and are either moved or discarded according to their original index in the input.
Input requirements:
- The input tensor has at least 2 dimensions, (n, k); any higher dims will be regarded
as batch, e.g. (a, b, c, d, n, k) == (a*b*c*d, n, k)
- n is the number of boxes in each batch
- k is the width of each box item.
By default, a box is [id, score, xmin, ymin, xmax, ymax, ...], additional elements are allowed.
-
id_index
: optional, use -1 to ignore, useful if force_suppress=False
, which means
we will skip highly overlapped boxes if one is apple
while the other is car
.
-
coord_start
: required, default=2, the starting index of the 4 coordinates.
Two formats are supported: corner
: [xmin, ymin, xmax, ymax] center
: [x, y, width, height]
-
score_index
: required, default=1, box score/confidence.
When two boxes overlap IOU > overlap_thresh
, the one with smaller score will be suppressed.
-
in_format
and out_format
: default='corner', specify in/out box formats.
Examples::
x = [[0, 0.5, 0.1, 0.1, 0.2, 0.2],
     [1, 0.4, 0.1, 0.1, 0.2, 0.2],
     [0, 0.3, 0.1, 0.1, 0.14, 0.14],
     [2, 0.6, 0.5, 0.5, 0.7, 0.8]]
box_nms(x, overlap_thresh=0.1, coord_start=2, score_index=1, id_index=0,
        force_suppress=True, in_format='corner', out_format='corner') =
    [[2, 0.6, 0.5, 0.5, 0.7, 0.8],
     [0, 0.5, 0.1, 0.1, 0.2, 0.2],
     [-1, -1, -1, -1, -1, -1],
     [-1, -1, -1, -1, -1, -1]]
out_grad = [[0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
            [0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
            [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
            [0.4, 0.4, 0.4, 0.4, 0.4, 0.4]]
exe.backward
in_grad = [[0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
Defined in src/operator/contrib/bounding_box.cc:L82
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
overlap_thresh::float, optional, default=0.5
: Overlapping(IoU) threshold to suppress object with smaller score. -
topk::int, optional, default='-1'
: Apply nms to topk boxes with descending scores, -1 for no restriction. -
coord_start::int, optional, default='2'
: Start index of the consecutive 4 coordinates. -
score_index::int, optional, default='1'
: Index of the scores/confidence of boxes. -
id_index::int, optional, default='-1'
: Optional, index of the class categories, -1 to disable. -
force_suppress::boolean, optional, default=0
: Optional, if set false and id_index is provided, nms will only apply to boxes belonging to the same category -
in_format::{'center', 'corner'},optional, default='corner'
: The input box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
-
out_format::{'center', 'corner'},optional, default='corner'
: The output box encoding type.
"corner" means boxes are encoded as [xmin, ymin, xmax, ymax], "center" means boxes are encoded as [x, y, width, height].
source
# MXNet.mx._contrib_count_sketch
— Method.
_contrib_count_sketch(data, h, s, out_dim, processing_batch_size)
Apply CountSketch to input: map d-dimensional data to k-dimensional data.
.. note:: count_sketch
is only available on GPU.
Assume input data has shape (N, d), sign hash table s has shape (N, d), index hash table h has shape (N, d) and mapping dimension out_dim = k, each element in s is either +1 or -1, each element in h is a random integer from 0 to k-1. Then the operator computes:
.. math:: out[h[i]] += data[i] * s[i]
Example::
out_dim = 5
x = [[1.2, 2.5, 3.4],
     [3.2, 5.7, 6.6]]
h = [[0, 3, 4]]
s = [[1, -1, 1]]
mx.contrib.ndarray.count_sketch(data=x, h=h, s=s, out_dim=5) = [[1.2, 0, 0, -2.5, 3.4],
                                                               [3.2, 0, 0, -5.7, 6.6]]
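The update rule out[h[i]] += data[i] * s[i] can be checked against this example with a CPU-only Python sketch. It assumes, as in the example, that a single hash row h and a single sign row s are shared by every data row. The function below is illustrative and is not the GPU operator.

```python
def count_sketch(data, h, s, out_dim):
    """Project each row of data to out_dim entries via out[h[i]] += data[i] * s[i]."""
    out = []
    for row in data:
        sketch = [0.0] * out_dim
        for i, value in enumerate(row):
            # h[i] selects the output bucket, s[i] (+1 or -1) the sign.
            sketch[h[i]] += value * s[i]
        out.append(sketch)
    return out


x = [[1.2, 2.5, 3.4], [3.2, 5.7, 6.6]]
result = count_sketch(x, h=[0, 3, 4], s=[1, -1, 1], out_dim=5)
print(result)  # [[1.2, 0.0, 0.0, -2.5, 3.4], [3.2, 0.0, 0.0, -5.7, 6.6]]
```

When two input positions hash to the same bucket, their signed values are summed, which is what makes the projection a sketch rather than a permutation.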
Defined in src/operator/contrib/count_sketch.cc:L67
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the CountSketchOp. -
h::NDArray-or-SymbolicNode
: The index vector -
s::NDArray-or-SymbolicNode
: The sign vector -
out_dim::int, required
: The output dimension. -
processing_batch_size::int, optional, default='32'
: How many sketch vectors to process at one time.
source
# MXNet.mx._contrib_ctc_loss
— Method.
_contrib_ctc_loss(data, label, data_lengths, label_lengths, use_data_lengths, use_label_lengths, blank_label)
_contrib_ctc_loss is an alias of _contrib_CTCLoss.
Connectionist Temporal Classification Loss.
The shapes of the inputs and outputs:
-
data:
(sequence_length, batch_size, alphabet_size)
-
label:
(batch_size, label_sequence_length)
-
out:
(batch_size)
The data
tensor consists of sequences of activation vectors (without applying softmax), with the i-th channel in the last dimension corresponding to the i-th label for i between 0 and alphabet_size-1 (i.e. always 0-indexed). Alphabet size should include one additional value reserved for the blank label. When blank_label
is $"first"$, the $0$-th channel is reserved for the activation of the blank label; otherwise, if it is "last", the $(alphabet_size-1)$-th channel is reserved for the blank label.
$label$ is an index matrix of integers. When blank_label
is $"first"$, the value 0 is then reserved for blank label, and should not be passed in this matrix. Otherwise, when blank_label
is $"last"$, the value (alphabet_size-1)
is reserved for blank label.
If a sequence of labels is shorter than label_sequence_length, use the special padding value at the end of the sequence to conform it to the correct length. The padding value is 0
when blank_label
is $"first"$, and -1
otherwise.
For example, suppose the vocabulary is [a, b, c]
, and in one batch we have three sequences 'ba', 'cbb', and 'abac'. When blank_label
is $"first"$, we can index the labels as {'a': 1, 'b': 2, 'c': 3}
, and we reserve the 0-th channel for blank label in data tensor. The resulting label
tensor should be padded to be::
[[2, 1, 0, 0], [3, 2, 2, 0], [1, 2, 1, 3]]
When blank_label
is $"last"$, we can index the labels as {'a': 0, 'b': 1, 'c': 2}
, and we reserve the channel index 3 for blank label in data tensor. The resulting label
tensor should be padded to be::
[[1, 0, -1, -1], [2, 1, 1, -1], [0, 1, 0, 2]]
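The two label layouts described above can be reproduced with a short Python sketch. The helper name `encode_ctc_labels` is hypothetical and is not part of MXNet; it only illustrates the indexing and padding conventions stated in the text.

```python
def encode_ctc_labels(sequences, vocabulary, blank_label="first"):
    """Index and pad label sequences following the blank_label convention."""
    if blank_label == "first":
        offset, pad = 1, 0    # 0 is the blank label and also the padding value
    elif blank_label == "last":
        offset, pad = 0, -1   # alphabet_size-1 is the blank label, padding is -1
    else:
        raise ValueError("blank_label must be 'first' or 'last'")
    index = {token: i + offset for i, token in enumerate(vocabulary)}
    width = max(len(seq) for seq in sequences)
    return [[index[t] for t in seq] + [pad] * (width - len(seq))
            for seq in sequences]


print(encode_ctc_labels(["ba", "cbb", "abac"], "abc", "first"))
print(encode_ctc_labels(["ba", "cbb", "abac"], "abc", "last"))
```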
$out$ is a list of CTC loss values, one per example in the batch.
See Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks, A. Graves et al. for more information on the definition and the algorithm.
Defined in src/operator/contrib/ctc_loss.cc:L115
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the ctc_loss op. -
label::NDArray-or-SymbolicNode
: Ground-truth labels for the loss. -
data_lengths::NDArray-or-SymbolicNode
: Lengths of data for each of the samples. Only required when use_data_lengths is true. -
label_lengths::NDArray-or-SymbolicNode
: Lengths of labels for each of the samples. Only required when use_label_lengths is true. -
use_data_lengths::boolean, optional, default=0
: Whether the data lengths are decided by data_lengths
. If false, the lengths are equal to the max sequence length. -
use_label_lengths::boolean, optional, default=0
: Whether the label lengths are decided by label_lengths
, or derived from padding_mask
. If false, the lengths are derived from the first occurrence of the value of padding_mask
. The value of padding_mask
is $0$ when the first CTC label is reserved for blank, and $-1$ when the last label is reserved for blank. See blank_label
. -
blank_label::{'first', 'last'},optional, default='first'
: Set the label that is reserved for the blank label. If "first", the 0-th label is reserved, label values for tokens in the vocabulary are between $1$ and $alphabet_size-1$, and the padding mask is $0$. If "last", the last label value $alphabet_size-1$ is reserved for the blank label instead, label values for tokens in the vocabulary are between $0$ and $alphabet_size-2$, and the padding mask is $-1$.
source
# MXNet.mx._contrib_dequantize
— Method.
_contrib_dequantize(input, min_range, max_range, out_type)
Dequantize the input tensor into a float tensor. [min_range, max_range] are scalar floats that specify the range for the output data.
Each value of the tensor will undergo the following:
out[i] = min_range + (in[i] * (max_range - min_range) / range(INPUT_TYPE))
here range(T) = numeric_limits<T>::max() - numeric_limits<T>::min()
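As a worked illustration of the dequantization formula, the sketch below applies it to a list of uint8 values in plain Python. It assumes range(uint8) = 255 - 0 = 255; the function name `dequantize_uint8` is hypothetical.

```python
UINT8_RANGE = 255 - 0  # assumed range(INPUT_TYPE) for uint8


def dequantize_uint8(values, min_range, max_range):
    """out[i] = min_range + in[i] * (max_range - min_range) / range(uint8)"""
    return [min_range + v * (max_range - min_range) / UINT8_RANGE
            for v in values]


# The smallest and largest uint8 values map to the ends of [min_range, max_range].
print(dequantize_uint8([0, 255], -1.0, 1.0))  # [-1.0, 1.0]
```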
Defined in src/operator/contrib/dequantize.cc:L41
Arguments
-
input::NDArray-or-SymbolicNode
: An ndarray/symbol of type uint8
-
min_range::NDArray-or-SymbolicNode
: The minimum scalar value possibly produced for the input -
max_range::NDArray-or-SymbolicNode
: The maximum scalar value possibly produced for the input -
out_type::{'float32'}, required
: Output data type.
source
# MXNet.mx._contrib_fft
— Method.
_contrib_fft(data, compute_size)
Apply 1D FFT to input.
.. note:: fft
is only available on GPU.
Currently accepts two input data shapes, (N, d) or (N1, N2, N3, d); the data can only be real numbers. The output data has shape (N, 2d) or (N1, N2, N3, 2d), in the format [real0, imag0, real1, imag1, ...].
Example::
data = np.random.normal(0,1,(3,4))
out = mx.contrib.ndarray.fft(data = mx.nd.array(data,ctx = mx.gpu(0)))
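The interleaved output layout can be illustrated with a naive discrete Fourier transform in plain Python. This is a sketch of the data format only, not the GPU operator; it assumes the standard forward DFT sign convention exp(-2πi·k·n/d), and the name `fft_interleaved` is hypothetical.

```python
import cmath


def fft_interleaved(row):
    """Naive DFT of a real-valued row, returned as [real0, imag0, real1, imag1, ...]."""
    d = len(row)
    out = []
    for k in range(d):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * n / d)
                    for n, x in enumerate(row))
        out.extend([coeff.real, coeff.imag])
    return out


# A row of length d = 4 produces 2d = 8 output values.
print(len(fft_interleaved([1.0, 2.0, 3.0, 4.0])))  # 8
```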
Defined in src/operator/contrib/fft.cc:L56
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the FFTOp. -
compute_size::int, optional, default='128'
: Maximum size of sub-batch to be forwarded at one time
source
# MXNet.mx._contrib_ifft
— Method.
_contrib_ifft(data, compute_size)
Apply 1D IFFT to input.
.. note:: ifft
is only available on GPU.
Currently accepts two input data shapes, (N, d) or (N1, N2, N3, d). The data is in the format [real0, imag0, real1, imag1, ...], so the last dimension must be an even number. The output data has shape (N, d/2) or (N1, N2, N3, d/2) and contains only the real part of the result.
Example::
data = np.random.normal(0,1,(3,4))
out = mx.contrib.ndarray.ifft(data = mx.nd.array(data,ctx = mx.gpu(0)))
Defined in src/operator/contrib/ifft.cc:L58
Arguments
-
data::NDArray-or-SymbolicNode
: Input data to the IFFTOp. -
compute_size::int, optional, default='128'
: Maximum size of sub-batch to be forwarded at one time
source
# MXNet.mx._contrib_quantize
— Method.
_contrib_quantize(input, min_range, max_range, out_type)
Quantize an input tensor from float to out_type
, with user-specified min_range
and max_range
.
[min_range, max_range] are scalar floats that specify the range for the input data. Each value of the tensor will undergo the following:
out[i] = (in[i] - min_range) * range(OUTPUT_TYPE) / (max_range - min_range)
here range(T) = numeric_limits<T>::max() - numeric_limits<T>::min()
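The quantization formula can be checked with a few lines of plain Python. The sketch assumes range(uint8) = 255 and returns the unrounded value of the formula, since the text does not say how the operator rounds to an integer; the name `quantize_to_uint8_scale` is hypothetical.

```python
UINT8_RANGE = 255 - 0  # assumed range(OUTPUT_TYPE) for uint8


def quantize_to_uint8_scale(values, min_range, max_range):
    """out[i] = (in[i] - min_range) * range(uint8) / (max_range - min_range)"""
    scale = max_range - min_range
    # Rounding to an integer is left out: the text does not specify it.
    return [(v - min_range) * UINT8_RANGE / scale for v in values]


print(quantize_to_uint8_scale([-1.0, 0.0, 1.0], -1.0, 1.0))  # [0.0, 127.5, 255.0]
```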
Defined in src/operator/contrib/quantize.cc:L41
Arguments
-
input::NDArray-or-SymbolicNode
: An ndarray/symbol of type float32
-
min_range::NDArray-or-SymbolicNode
: The minimum scalar value possibly produced for the input -
max_range::NDArray-or-SymbolicNode
: The maximum scalar value possibly produced for the input -
out_type::{'uint8'},optional, default='uint8'
: Output data type.
source
# MXNet.mx._copy
— Method.
_copy(data)
Returns a copy of the input.
From:src/operator/tensor/elemwise_unary_op_basic.cc:111
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._copyto
— Method.
_copyto(data)
Arguments
-
data::NDArray
: input data
source
# MXNet.mx._crop_assign
— Method.
_crop_assign(lhs, rhs, begin, end, step)
_crop_assign is an alias of _slice_assign.
Assign the rhs to a cropped subset of lhs.
Requirements
- output should be explicitly given and be the same as lhs.
- lhs and rhs are of the same data type, and on the same device.
From:src/operator/tensor/matrix_op.cc:385
Arguments
-
lhs::NDArray-or-SymbolicNode
: Source input -
rhs::NDArray-or-SymbolicNode
: value to assign -
begin::Shape(tuple), required
: starting indices for the slice operation, supports negative indices. -
end::Shape(tuple), required
: ending indices for the slice operation, supports negative indices. -
step::Shape(tuple), optional, default=[]
: step for the slice operation, supports negative values.
source
# MXNet.mx._crop_assign_scalar
— Method.
_crop_assign_scalar(data, scalar, begin, end, step)
_crop_assign_scalar is an alias of _slice_assign_scalar.
Assign the scalar to a cropped subset of the input.
Requirements
- output should be explicitly given and be the same as input
From:src/operator/tensor/matrix_op.cc:410
Arguments
-
data::NDArray-or-SymbolicNode
: Source input -
scalar::float, optional, default=0
: The scalar value for assignment. -
begin::Shape(tuple), required
: starting indices for the slice operation, supports negative indices. -
end::Shape(tuple), required
: ending indices for the slice operation, supports negative indices. -
step::Shape(tuple), optional, default=[]
: step for the slice operation, supports negative values.
source
# MXNet.mx._cvcopyMakeBorder
— Method.
_cvcopyMakeBorder(src, top, bot, left, right, type, value, values)
Pad image border with OpenCV.
Arguments
-
src::NDArray
: source image -
top::int, required
: Top margin. -
bot::int, required
: Bottom margin. -
left::int, required
: Left margin. -
right::int, required
: Right margin. -
type::int, optional, default='0'
: Filling type (default=cv2.BORDER_CONSTANT). -
value::double, optional, default=0
: (Deprecated! Use $values$ instead.) Fill with single value. -
values::tuple of <double>, optional, default=[]
: Fill with value (RGB[A] or gray), up to 4 channels.
source
# MXNet.mx._cvimdecode
— Method.
_cvimdecode(buf, flag, to_rgb)
Decode image with OpenCV. Note: return image in RGB by default, instead of OpenCV's default BGR.
Arguments
-
buf::NDArray
: Buffer containing binary encoded image -
flag::int, optional, default='1'
: Convert decoded image to grayscale (0) or color (1). -
to_rgb::boolean, optional, default=1
: Whether to convert decoded image to mxnet's default RGB format (instead of opencv's default BGR).
source
# MXNet.mx._cvimread
— Method.
_cvimread(filename, flag, to_rgb)
Read and decode image with OpenCV. Note: return image in RGB by default, instead of OpenCV's default BGR.
Arguments
-
filename::string, required
: Name of the image file to be loaded. -
flag::int, optional, default='1'
: Convert decoded image to grayscale (0) or color (1). -
to_rgb::boolean, optional, default=1
: Whether to convert decoded image to mxnet's default RGB format (instead of opencv's default BGR).
source
# MXNet.mx._cvimresize
— Method.
_cvimresize(src, w, h, interp)
Resize image with OpenCV.
Arguments
-
src::NDArray
: source image -
w::int, required
: Width of resized image. -
h::int, required
: Height of resized image. -
interp::int, optional, default='1'
: Interpolation method (default=cv2.INTER_LINEAR).
source
# MXNet.mx._div_scalar
— Method.
_div_scalar(data, scalar)
Divide an array with a scalar.
$_div_scalar$ operates only on the data array of the input if the input is sparse.
For example, if an input of shape (100, 100) has only 2 nonzero elements, i.e. input.data = [5, 6], and scalar = nan, the result is output.data = [nan, nan] instead of 10000 nans.
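The sparse behaviour described above can be illustrated in plain Python by modelling a sparse array as a dictionary that holds only the stored (nonzero) entries. This is a conceptual sketch, not MXNet's storage format; the name `div_scalar_sparse` is hypothetical.

```python
import math


def div_scalar_sparse(stored, scalar):
    """Divide only the stored entries; implicit zeros are never touched."""
    return {index: value / scalar for index, value in stored.items()}


# A (100, 100) array with two stored entries, divided by nan.
stored = {(0, 1): 5.0, (7, 3): 6.0}
result = div_scalar_sparse(stored, float("nan"))
print(len(result), all(math.isnan(v) for v in result.values()))  # 2 True
```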
Defined in src/operator/tensor/elemwise_binary_scalar_op_basic.cc:L171
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._docsig
— Method.
Generate docstring from function signature
source
# MXNet.mx._equal
— Method.
_equal(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._equal_scalar
— Method.
_equal_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._get_ndarray_function_def
— Method.
The libmxnet APIs are automatically imported from libmxnet.so
. The functions listed here operate on NDArray
objects. The arguments to the functions are typically ordered as
func_name(arg_in1, arg_in2, ..., scalar1, scalar2, ..., arg_out1, arg_out2, ...)
unless NDARRAY_ARG_BEFORE_SCALAR
is not set. In this case, the scalars are put before the input arguments:
func_name(scalar1, scalar2, ..., arg_in1, arg_in2, ..., arg_out1, arg_out2, ...)
If ACCEPT_EMPTY_MUTATE_TARGET
is set, an overloaded function without the output arguments will also be defined:
func_name(arg_in1, arg_in2, ..., scalar1, scalar2, ...)
Upon calling, the output arguments will be automatically initialized with empty NDArrays.
Those functions always return the output arguments. If there is only one output (the typical situation), that object (NDArray
) is returned. Otherwise, a tuple containing all the outputs will be returned.
source
# MXNet.mx._grad_add
— Method.
_grad_add(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._greater
— Method.
_greater(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._greater_equal
— Method.
_greater_equal(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._greater_equal_scalar
— Method.
_greater_equal_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._greater_scalar
— Method.
_greater_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._hypot
— Method.
_hypot(lhs, rhs)
Given the "legs" of a right triangle, return its hypotenuse.
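For illustration, the elementwise computation is sqrt(lhs² + rhs²). A plain-Python sketch using the standard library follows; the name `hypot_elementwise` is hypothetical and flat lists stand in for arrays.

```python
import math


def hypot_elementwise(lhs, rhs):
    """Hypotenuse for each pair of legs taken from two equally long lists."""
    return [math.hypot(a, b) for a, b in zip(lhs, rhs)]


print(hypot_elementwise([3.0, 5.0], [4.0, 12.0]))
```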
Defined in src/operator/tensor/elemwise_binary_op_extended.cc:L79
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._hypot_scalar
— Method.
_hypot_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._identity_with_attr_like_rhs
— Method.
_identity_with_attr_like_rhs(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: First input. -
rhs::NDArray-or-SymbolicNode
: Second input.
source
# MXNet.mx._image_adjust_lighting
— Method.
_image_adjust_lighting(data, alpha)
Adjust the lighting level of the input. Follow the AlexNet style.
Defined in src/operator/image/image_random.cc:L118
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
alpha::tuple of <float>, required
: The lighting alphas for the R, G, B channels.
source
# MXNet.mx._image_flip_left_right
— Method.
_image_flip_left_right(data)
Defined in src/operator/image/image_random.cc:L68
Arguments
-
data::NDArray-or-SymbolicNode
: The input.
source
# MXNet.mx._image_flip_top_bottom
— Method.
_image_flip_top_bottom(data)
Defined in src/operator/image/image_random.cc:L76
Arguments
-
data::NDArray-or-SymbolicNode
: The input.
source
# MXNet.mx._image_normalize
— Method.
_image_normalize(data, mean, std)
Defined in src/operator/image/image_random.cc:L52
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
mean::tuple of <float>, required
: Sequence of means for each channel. -
std::tuple of <float>, required
: Sequence of standard deviations for each channel.
source
# MXNet.mx._image_random_brightness
— Method.
_image_random_brightness(data, min_factor, max_factor)
Defined in src/operator/image/image_random.cc:L84
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
min_factor::float, required
: Minimum factor. -
max_factor::float, required
: Maximum factor.
source
# MXNet.mx._image_random_color_jitter
— Method.
_image_random_color_jitter(data, brightness, contrast, saturation, hue)
Defined in src/operator/image/image_random.cc:L111
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
brightness::float, required
: How much to jitter brightness. -
contrast::float, required
: How much to jitter contrast. -
saturation::float, required
: How much to jitter saturation. -
hue::float, required
: How much to jitter hue.
source
# MXNet.mx._image_random_contrast
— Method.
_image_random_contrast(data, min_factor, max_factor)
Defined in src/operator/image/image_random.cc:L90
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
min_factor::float, required
: Minimum factor. -
max_factor::float, required
: Maximum factor.
source
# MXNet.mx._image_random_flip_left_right
— Method.
_image_random_flip_left_right(data)
Defined in src/operator/image/image_random.cc:L72
Arguments
-
data::NDArray-or-SymbolicNode
: The input.
source
# MXNet.mx._image_random_flip_top_bottom
— Method.
_image_random_flip_top_bottom(data)
Defined in src/operator/image/image_random.cc:L80
Arguments
-
data::NDArray-or-SymbolicNode
: The input.
source
# MXNet.mx._image_random_hue
— Method.
_image_random_hue(data, min_factor, max_factor)
Defined in src/operator/image/image_random.cc:L104
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
min_factor::float, required
: Minimum factor. -
max_factor::float, required
: Maximum factor.
source
# MXNet.mx._image_random_lighting
— Method.
_image_random_lighting(data, alpha_std)
Randomly add PCA noise. Follow the AlexNet style.
Defined in src/operator/image/image_random.cc:L125
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
alpha_std::float, optional, default=0.05
: Level of the lighting noise.
source
# MXNet.mx._image_random_saturation
— Method.
_image_random_saturation(data, min_factor, max_factor)
Defined in src/operator/image/image_random.cc:L97
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
min_factor::float, required
: Minimum factor. -
max_factor::float, required
: Maximum factor.
source
# MXNet.mx._image_to_tensor
— Method.
_image_to_tensor(data)
Defined in src/operator/image/image_random.cc:L42
Arguments
-
data::NDArray-or-SymbolicNode
: The input.
source
# MXNet.mx._imdecode
— Method.
_imdecode(mean, index, x0, y0, x1, y1, c, size)
Decode an image, clip to (x0, y0, x1, y1), subtract mean, and write to buffer
Arguments
-
mean::NDArray-or-SymbolicNode
: image mean -
index::int
: buffer position for output -
x0::int
: x0 -
y0::int
: y0 -
x1::int
: x1 -
y1::int
: y1 -
c::int
: channel -
size::int
: length of str_img
source
# MXNet.mx._lesser
— Method.
_lesser(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._lesser_equal
— Method.
_lesser_equal(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._lesser_equal_scalar
— Method.
_lesser_equal_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._lesser_scalar
— Method.
_lesser_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._linalg_gelqf
— Method.
_linalg_gelqf(A)
LQ factorization for general matrix. Input is a tensor A of dimension n >= 2.
If n=2, we compute the LQ factorization (LAPACK gelqf, followed by orglq). A must have shape (x, y) with x <= y, and must have full rank x. The LQ factorization consists of L with shape (x, x) and Q with shape (x, y), so that:
A = L * Q
Here, L is lower triangular (upper triangle equal to zero) with nonzero diagonal, and Q is row-orthonormal, meaning that
Q * Q^T
is equal to the identity matrix of shape (x, x).
If n>2, gelqf is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single LQ factorization
A = [[1., 2., 3.], [4., 5., 6.]]
Q, L = gelqf(A)
Q = [[-0.26726124, -0.53452248, -0.80178373], [0.87287156, 0.21821789, -0.43643578]]
L = [[-3.74165739, 0.], [-8.55235974, 1.96396101]]
// Batch LQ factorization
A = [[[1., 2., 3.], [4., 5., 6.]], [[7., 8., 9.], [10., 11., 12.]]]
Q, L = gelqf(A)
Q = [[[-0.26726124, -0.53452248, -0.80178373], [0.87287156, 0.21821789, -0.43643578]], [[-0.50257071, -0.57436653, -0.64616234], [0.7620735, 0.05862104, -0.64483142]]]
L = [[[-3.74165739, 0.], [-8.55235974, 1.96396101]], [[-13.92838828, 0.], [-19.09768702, 0.52758934]]]
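The defining properties A = L * Q and Q * Q^T = I can be checked with a small plain-Python sketch. It uses classical Gram-Schmidt on the rows of A rather than the LAPACK routines named above, so the diagonal of L comes out positive and some signs differ from the operator's output; the name `lq_gram_schmidt` is hypothetical.

```python
import math


def lq_gram_schmidt(a):
    """LQ factorization of a full-rank (x, y) matrix with x <= y via Gram-Schmidt."""
    q_rows = []
    l = [[0.0] * len(a) for _ in a]
    for i, row in enumerate(a):
        residual = list(row)
        for j, q in enumerate(q_rows):
            # Component of row i along the already orthonormalized row j.
            l[i][j] = sum(x * y for x, y in zip(row, q))
            residual = [r - l[i][j] * y for r, y in zip(residual, q)]
        l[i][i] = math.sqrt(sum(r * r for r in residual))
        q_rows.append([r / l[i][i] for r in residual])
    return q_rows, l


q, l = lq_gram_schmidt([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(l)  # same magnitudes as the example above; the first column has opposite sign
```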
Defined in src/operator/tensor/la_op.cc:L529
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices to be factorized
source
# MXNet.mx._linalg_gemm
— Method.
_linalg_gemm(A, B, C, transpose_a, transpose_b, alpha, beta)
Performs general matrix multiplication and accumulation. Inputs are tensors A, B, C, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, the BLAS3 function gemm is performed:
out = alpha * op(A) * op(B) + beta * C
Here, alpha and beta are scalar parameters, and op() is either the identity or matrix transposition (depending on transpose_a, transpose_b).
If n>2, gemm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply-add
A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
gemm(A, B, C, transpose_b=True, alpha=2.0, beta=10.0) = [[14.0, 14.0, 14.0], [14.0, 14.0, 14.0]]
// Batch matrix multiply-add
A = [[[1.0, 1.0]], [[0.1, 0.1]]]
B = [[[1.0, 1.0]], [[0.1, 0.1]]]
C = [[[10.0]], [[0.01]]]
gemm(A, B, C, transpose_b=True, alpha=2.0, beta=10.0) = [[[104.0]], [[0.14]]]
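The single-matrix case can be reproduced with a plain-Python reference of out = alpha * op(A) * op(B) + beta * C. This is an illustrative sketch on nested lists, not the BLAS-backed operator; the helper names are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def gemm_reference(a, b, c, transpose_a=False, transpose_b=False,
                   alpha=1.0, beta=1.0):
    """out = alpha * op(A) * op(B) + beta * C for 2-D nested lists."""
    op_a = transpose(a) if transpose_a else a
    op_b = transpose(b) if transpose_b else b
    product = [[sum(x * y for x, y in zip(row, col)) for col in zip(*op_b)]
               for row in op_a]
    return [[alpha * p + beta * q for p, q in zip(p_row, c_row)]
            for p_row, c_row in zip(product, c)]


A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(gemm_reference(A, B, C, transpose_b=True, alpha=2.0, beta=10.0))
# [[14.0, 14.0, 14.0], [14.0, 14.0, 14.0]]
```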
Defined in src/operator/tensor/la_op.cc:L69
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
B::NDArray-or-SymbolicNode
: Tensor of input matrices -
C::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose_a::boolean, optional, default=0
: Multiply with the transpose of the first input (A). -
transpose_b::boolean, optional, default=0
: Multiply with the transpose of the second input (B). -
alpha::double, optional, default=1
: Scalar factor multiplied with A*B. -
beta::double, optional, default=1
: Scalar factor multiplied with C.
source
# MXNet.mx._linalg_gemm2
— Method.
_linalg_gemm2(A, B, transpose_a, transpose_b, alpha)
Performs general matrix multiplication. Inputs are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, the BLAS3 function gemm is performed:
out = alpha * op(A) * op(B)
Here alpha is a scalar parameter and op() is either the identity or the matrix transposition (depending on transpose_a, transpose_b).
If n>2, gemm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply
A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
gemm2(A, B, transpose_b=True, alpha=2.0) = [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0]]
// Batch matrix multiply
A = [[[1.0, 1.0]], [[0.1, 0.1]]]
B = [[[1.0, 1.0]], [[0.1, 0.1]]]
gemm2(A, B, transpose_b=True, alpha=2.0) = [[[4.0]], [[0.04]]]
Defined in src/operator/tensor/la_op.cc:L128
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
B::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose_a::boolean, optional, default=0
: Multiply with the transpose of the first input (A). -
transpose_b::boolean, optional, default=0
: Multiply with the transpose of the second input (B). -
alpha::double, optional, default=1
: Scalar factor multiplied with A*B.
source
# MXNet.mx._linalg_potrf
— Method.
_linalg_potrf(A)
Performs Cholesky factorization of a symmetric positive-definite matrix. Input is a tensor A of dimension n >= 2.
If n=2, the Cholesky factor L of the symmetric, positive definite matrix A is computed. L is lower triangular (entries of upper triangle are all zero), has positive diagonal entries, and:
A = L * L^T
If n>2, potrf is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix factorization
A = [[4.0, 1.0], [1.0, 4.25]]
potrf(A) = [[2.0, 0], [0.5, 2.0]]
// Batch matrix factorization
A = [[[4.0, 1.0], [1.0, 4.25]], [[16.0, 4.0], [4.0, 17.0]]]
potrf(A) = [[[2.0, 0], [0.5, 2.0]], [[4.0, 0], [1.0, 4.0]]]
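The factorization A = L * L^T can be sketched with the textbook Cholesky recurrence in plain Python. This is an illustrative reference for the single-matrix case, not the LAPACK-backed operator; the name `cholesky_lower` is hypothetical.

```python
import math


def cholesky_lower(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - partial)
            else:
                l[i][j] = (a[i][j] - partial) / l[j][j]
    return l


print(cholesky_lower([[4.0, 1.0], [1.0, 4.25]]))  # [[2.0, 0.0], [0.5, 2.0]]
```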
Defined in src/operator/tensor/la_op.cc:L178
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices to be decomposed
source
# MXNet.mx._linalg_potri
— Method.
_linalg_potri(A)
Performs matrix inversion from a Cholesky factorization. Input is a tensor A of dimension n >= 2.
If n=2, A is a lower triangular matrix (entries of upper triangle are all zero) with positive diagonal. We compute:
out = A^-T * A^-1
In other words, if A is the Cholesky factor of a symmetric positive definite matrix B (obtained by potrf), then
out = B^-1
If n>2, potri is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
.. note:: Use this operator only if you are certain you need the inverse of B, and cannot use the Cholesky factor A (potrf), together with backsubstitution (trsm). The latter is numerically much safer, and also cheaper.
Examples::
// Single matrix inverse
A = [[2.0, 0], [0.5, 2.0]]
potri(A) = [[0.26563, -0.0625], [-0.0625, 0.25]]
// Batch matrix inverse
A = [[[2.0, 0], [0.5, 2.0]], [[4.0, 0], [1.0, 4.0]]]
potri(A) = [[[0.26563, -0.0625], [-0.0625, 0.25]], [[0.06641, -0.01562], [-0.01562, 0.0625]]]
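The formula out = A^-T * A^-1 can be reproduced for the single-matrix example in plain Python: invert the lower triangular factor by forward substitution, then multiply the transposed inverse by the inverse. This is an illustrative sketch, not the LAPACK routine; the helper names are hypothetical.

```python
def invert_lower(a):
    """Inverse of a lower triangular matrix by forward substitution."""
    n = len(a)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1.0 if row == col else 0.0
            known = sum(a[row][k] * inv[k][col] for k in range(col, row))
            inv[row][col] = (rhs - known) / a[row][row]
    return inv


def potri_reference(a):
    """out = A^-T * A^-1 for a lower triangular Cholesky factor A."""
    inv = invert_lower(a)
    n = len(a)
    return [[sum(inv[k][i] * inv[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


print(potri_reference([[2.0, 0.0], [0.5, 2.0]]))
# [[0.265625, -0.0625], [-0.0625, 0.25]]
```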
Defined in src/operator/tensor/la_op.cc:L236
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices
source
# MXNet.mx._linalg_sumlogdiag
— Method.
_linalg_sumlogdiag(A)
Computes the sum of the logarithms of the diagonal elements of a square matrix. Input is a tensor A of dimension n >= 2.
If n=2, A must be square with positive diagonal entries. We sum the natural logarithms of the diagonal elements; the result has shape (1,).
If n>2, sumlogdiag is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix reduction
A = [[1.0, 1.0], [1.0, 7.0]]
sumlogdiag(A) = [1.9459]
// Batch matrix reduction
A = [[[1.0, 1.0], [1.0, 7.0]], [[3.0, 0], [0, 17.0]]]
sumlogdiag(A) = [1.9459, 3.9318]
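A plain-Python sketch of the reduction, including the batch case, is shown below. The function names are hypothetical, and nested lists stand in for tensors.

```python
import math


def sumlogdiag_reference(a):
    """Sum of natural logarithms of the diagonal of a square matrix."""
    return sum(math.log(a[i][i]) for i in range(len(a)))


def sumlogdiag_batch(batch):
    """Apply the reduction to each matrix of a 3-D batch."""
    return [sumlogdiag_reference(a) for a in batch]


batch = [[[1.0, 1.0], [1.0, 7.0]], [[3.0, 0.0], [0.0, 17.0]]]
print([round(v, 4) for v in sumlogdiag_batch(batch)])  # [1.9459, 3.9318]
```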
Defined in src/operator/tensor/la_op.cc:L405
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of square matrices
source
# MXNet.mx._linalg_syevd
— Method.
_linalg_syevd(A)
Eigendecomposition for symmetric matrix. Input is a tensor A of dimension n >= 2.
If n=2, A must be symmetric, of shape (x, x). We compute the eigendecomposition, resulting in the orthonormal matrix U of eigenvectors, shape (x, x), and the vector L of eigenvalues, shape (x,), so that:
U * A = diag(L) * U
Here:
U * U^T = U^T * U = I
where I is the identity matrix. Also, L(0) <= L(1) <= L(2) <= ... (ascending order).
If n>2, syevd is performed separately on the trailing two dimensions of A (batch mode). In this case, U has n dimensions like A, and L has n-1 dimensions.
.. note:: The operator supports float32 and float64 data types only.
.. note:: Derivatives for this operator are defined only if A is such that all its eigenvalues are distinct, and the eigengaps are not too small. If you need gradients, do not apply this operator to matrices with multiple eigenvalues.
Examples::
// Single symmetric eigendecomposition
A = [[1., 2.], [2., 4.]]
U, L = syevd(A)
U = [[0.89442719, -0.4472136], [0.4472136, 0.89442719]]
L = [0., 5.]
// Batch symmetric eigendecomposition
A = [[[1., 2.], [2., 4.]], [[1., 2.], [2., 5.]]]
U, L = syevd(A)
U = [[[0.89442719, -0.4472136], [0.4472136, 0.89442719]], [[0.92387953, -0.38268343], [0.38268343, 0.92387953]]]
L = [[0., 5.], [0.17157288, 5.82842712]]
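The relation U * A = diag(L) * U means that the rows of U are the eigenvectors. A short plain-Python check of that relation against the single-matrix example follows; `satisfies_syevd` is a hypothetical helper, not part of MXNet.

```python
def satisfies_syevd(a, u, eigenvalues, tol=1e-6):
    """Check U * A == diag(L) * U entry by entry, within a tolerance."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            left = sum(u[i][k] * a[k][j] for k in range(n))
            right = eigenvalues[i] * u[i][j]
            if abs(left - right) > tol:
                return False
    return True


A = [[1.0, 2.0], [2.0, 4.0]]
U = [[0.89442719, -0.4472136], [0.4472136, 0.89442719]]
L = [0.0, 5.0]
print(satisfies_syevd(A, U, L))  # True
```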
Defined in src/operator/tensor/la_op.cc:L598
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices to be factorized
source
# MXNet.mx._linalg_syrk
— Method.
_linalg_syrk(A, transpose, alpha)
Multiplication of matrix with its transpose. Input is a tensor A of dimension n >= 2.
If n=2, the operator performs the BLAS3 function syrk:
out = alpha * A * A^T
if transpose=False, or
out = alpha * A^T * A
if transpose=True.
If n>2, syrk is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply
A = [[1., 2., 3.], [4., 5., 6.]]
syrk(A, alpha=1., transpose=False) = [[14., 32.], [32., 77.]]
syrk(A, alpha=1., transpose=True) = [[17., 22., 27.], [22., 29., 36.], [27., 36., 45.]]
// Batch matrix multiply
A = [[[1., 1.]], [[0.1, 0.1]]]
syrk(A, alpha=2., transpose=False) = [[[4.]], [[0.04]]]
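Both variants of the product can be sketched in plain Python by taking dot products between rows of A (for A * A^T) or between rows of A^T (for A^T * A). This is an illustrative reference, not the BLAS routine; the name `syrk_reference` is hypothetical.

```python
def syrk_reference(a, transpose=False, alpha=1.0):
    """alpha * A * A^T, or alpha * A^T * A when transpose is True."""
    rows = [list(col) for col in zip(*a)] if transpose else a
    return [[alpha * sum(x * y for x, y in zip(r1, r2)) for r2 in rows]
            for r1 in rows]


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(syrk_reference(A))                  # [[14.0, 32.0], [32.0, 77.0]]
print(syrk_reference(A, transpose=True))
```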
Defined in src/operator/tensor/la_op.cc:L461
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose::boolean, optional, default=0
: Use the transpose of the input matrix. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
source
# MXNet.mx._linalg_trmm
— Method.
_linalg_trmm(A, B, transpose, rightside, alpha)
Performs multiplication with a lower triangular matrix. Inputs are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, A must be lower triangular. The operator performs the BLAS3 function trmm:
out = alpha * op(A) * B
if rightside=False, or
out = alpha * B * op(A)
if rightside=True. Here, alpha is a scalar parameter, and op() is either the identity or the matrix transposition (depending on transpose).
If n>2, trmm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single triangular matrix multiply
A = [[1.0, 0], [1.0, 1.0]]
B = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
trmm(A, B, alpha=2.0) = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
// Batch triangular matrix multiply
A = [[[1.0, 0], [1.0, 1.0]], [[1.0, 0], [1.0, 1.0]]]
B = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
trmm(A, B, alpha=2.0) = [[[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]], [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]]
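The roles of transpose and rightside can be illustrated with a plain-Python reference for the single-matrix case. The sketch does not exploit the triangular structure the way BLAS trmm does; the name `trmm_reference` is hypothetical.

```python
def trmm_reference(a, b, transpose=False, rightside=False, alpha=1.0):
    """alpha * op(A) * B, or alpha * B * op(A) when rightside is True."""
    op_a = [list(col) for col in zip(*a)] if transpose else a
    left, right = (b, op_a) if rightside else (op_a, b)
    return [[alpha * sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
            for row in left]


A = [[1.0, 0.0], [1.0, 1.0]]
B = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(trmm_reference(A, B, alpha=2.0))  # [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
```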
Defined in src/operator/tensor/la_op.cc:L293
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices -
B::NDArray-or-SymbolicNode
: Tensor of matrices -
transpose::boolean, optional, default=0
: Use the transpose of the triangular matrix. -
rightside::boolean, optional, default=0
: Multiply triangular matrix from the right to non-triangular one. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
source
# MXNet.mx._linalg_trsm
— Method.
_linalg_trsm(A, B, transpose, rightside, alpha)
Solves a matrix equation involving a lower triangular matrix. Inputs are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, A must be lower triangular. The operator performs the BLAS3 function trsm, solving for out in:
op(A) * out = alpha * B
if rightside=False, or
out * op(A) = alpha * B
if rightside=True. Here, alpha is a scalar parameter, and op() is either the identity or the matrix transposition (depending on transpose).
If n>2, trsm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix solve
A = [[1.0, 0], [1.0, 1.0]]
B = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
trsm(A, B, alpha=0.5) = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
// Batch matrix solve
A = [[[1.0, 0], [1.0, 1.0]], [[1.0, 0], [1.0, 1.0]]]
B = [[[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]], [[4.0, 4.0, 4.0], [8.0, 8.0, 8.0]]]
trsm(A, B, alpha=0.5) = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]
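For the default case (transpose=False, rightside=False), solving A * out = alpha * B with a lower triangular A amounts to forward substitution, column by column. A plain-Python sketch follows; the name `trsm_lower_left` is hypothetical and the other parameter combinations are not covered.

```python
def trsm_lower_left(a, b, alpha=1.0):
    """Solve A * out = alpha * B for out, with A lower triangular."""
    n, m = len(a), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for col in range(m):
        for row in range(n):
            # Entries above `row` in this column are already solved.
            known = sum(a[row][k] * out[k][col] for k in range(row))
            out[row][col] = (alpha * b[row][col] - known) / a[row][row]
    return out


A = [[1.0, 0.0], [1.0, 1.0]]
B = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
print(trsm_lower_left(A, B, alpha=0.5))  # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```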
Defined in src/operator/tensor/la_op.cc:L356
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices -
B::NDArray-or-SymbolicNode
: Tensor of matrices -
transpose::boolean, optional, default=0
: Use the transpose of the triangular matrix. -
rightside::boolean, optional, default=0
: Multiply triangular matrix from the right to non-triangular one. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
source
# MXNet.mx._maximum
— Method.
_maximum(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._maximum_scalar
— Method.
_maximum_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._minimum
— Method.
_minimum(lhs, rhs)
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._minimum_scalar
— Method.
_minimum_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._minus!
— Method.
_minus!(x::NDArray, y::NDArray)
source
# MXNet.mx._minus
— Method.
_minus(x::NDArray, y::NDArray)
source
# MXNet.mx._minus_scalar
— Method.
_minus_scalar(data, scalar)
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
source
# MXNet.mx._mod!
— Method.
_mod!(x::NDArray, y::NDArray)
source
# MXNet.mx._mod
— Method.
_mod(x::NDArray, y::NDArray)
source
# MXNet.mx._mod_scalar!
— Method.
_mod_scalar!(x::NDArray, y::Real)
source
# MXNet.mx._mod_scalar
— Method.
_mod_scalar(x::NDArray, y::Real)
source
# MXNet.mx._mul
— Method.
_mul(lhs, rhs)
_mul is an alias of elemwise_mul.
Multiplies arguments element-wise.
The storage type of elemwise_mul output depends on the storage types of the inputs:
- elemwise_mul(default, default) = default
- elemwise_mul(row_sparse, row_sparse) = row_sparse
- elemwise_mul(default, row_sparse) = default
- elemwise_mul(row_sparse, default) = default
- elemwise_mul(csr, csr) = csr
- otherwise, elemwise_mul generates output with default storage
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# MXNet.mx._mul_scalar
— Method.
_mul_scalar(data, scalar)
Multiply an array with a scalar.
If the input is sparse, _mul_scalar operates only on the stored data array of the input.
For example, if an input of shape (100, 100) has only 2 non-zero elements, i.e. input.data = [5, 6], and scalar = nan, the result is output.data = [nan, nan] rather than 10000 nans.
Defined in src/operator/tensor/elemwise_binary_scalar_op_basic.cc:L149
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
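The sparse behaviour described above can be illustrated without MXNet. This is a plain-Python sketch in which a sparse array is modelled as a dictionary of stored entries. The representation and the helper name `mul_scalar_sparse` are assumptions made for illustration only.

```python
import math


def mul_scalar_sparse(stored, scalar):
    """Multiply only the stored entries of a sparse array by a scalar.

    `stored` maps (row, col) -> value; entries that are not stored are
    implicitly zero and are never touched.
    """
    return {index: value * scalar for index, value in stored.items()}


# A (100, 100) array with only two stored values, 5 and 6.
stored = {(3, 7): 5.0, (42, 0): 6.0}
out = mul_scalar_sparse(stored, math.nan)

# Only the two stored entries become nan; the other 9998 stay implicit zeros.
print(len(out), all(math.isnan(v) for v in out.values()))
```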
source
# MXNet.mx._not_equal
— Method.
_not_equal(lhs, rhs)
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# MXNet.mx._not_equal_scalar
— Method.
_not_equal_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._onehot_encode
— Method.
_onehot_encode(lhs, rhs)
Arguments
- lhs::NDArray: Left operand to the function.
- rhs::NDArray: Right operand to the function.
source
# MXNet.mx._plus!
— Method.
_plus!(x::NDArray, y::NDArray)
source
# MXNet.mx._plus
— Method.
_plus(x::NDArray, y::NDArray)
source
# MXNet.mx._plus_scalar
— Method.
_plus_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._power
— Method.
_power(lhs, rhs)
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# MXNet.mx._power_scalar
— Method.
_power_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._random_exponential
— Method.
_random_exponential(lam, shape, ctx, dtype)
Draw random samples from an exponential distribution.
Samples are distributed according to an exponential distribution parametrized by lambda (rate).
Example::
exponential(lam=4, shape=(2,2)) = [[ 0.0097189 , 0.08999364], [ 0.04146638, 0.31715935]]
Defined in src/operator/random/sample_op.cc:L115
Arguments
- lam::float, optional, default=1: Lambda parameter (rate) of the exponential distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_gamma
— Method.
_random_gamma(alpha, beta, shape, ctx, dtype)
Draw random samples from a gamma distribution.
Samples are distributed according to a gamma distribution parametrized by alpha (shape) and beta (scale).
Example::
gamma(alpha=9, beta=0.5, shape=(2,2)) = [[ 7.10486984, 3.37695289], [ 3.91697288, 3.65933681]]
Defined in src/operator/random/sample_op.cc:L100
Arguments
- alpha::float, optional, default=1: Alpha parameter (shape) of the gamma distribution.
- beta::float, optional, default=1: Beta parameter (scale) of the gamma distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_generalized_negative_binomial
— Method.
_random_generalized_negative_binomial(mu, alpha, shape, ctx, dtype)
Draw random samples from a generalized negative binomial distribution.
Samples are distributed according to a generalized negative binomial distribution parametrized by mu (mean) and alpha (dispersion). alpha is defined as 1/k where k is the failure limit of the number of unsuccessful experiments (generalized to real numbers). Samples will always be returned as a floating point data type.
Example::
generalized_negative_binomial(mu=2.0, alpha=0.3, shape=(2,2)) = [[ 2., 1.], [ 6., 4.]]
Defined in src/operator/random/sample_op.cc:L168
Arguments
- mu::float, optional, default=1: Mean of the negative binomial distribution.
- alpha::float, optional, default=1: Alpha (dispersion) parameter of the negative binomial distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_negative_binomial
— Method.
_random_negative_binomial(k, p, shape, ctx, dtype)
Draw random samples from a negative binomial distribution.
Samples are distributed according to a negative binomial distribution parametrized by k (limit of unsuccessful experiments) and p (failure probability in each experiment). Samples will always be returned as a floating point data type.
Example::
negative_binomial(k=3, p=0.4, shape=(2,2)) = [[ 4., 7.], [ 2., 5.]]
Defined in src/operator/random/sample_op.cc:L149
Arguments
- k::int, optional, default='1': Limit of unsuccessful experiments.
- p::float, optional, default=1: Failure probability in each experiment.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_normal
— Method.
_random_normal(loc, scale, shape, ctx, dtype)
Draw random samples from a normal (Gaussian) distribution.
.. note:: The existing alias normal is deprecated.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
Example::
normal(loc=0, scale=1, shape=(2,2)) = [[ 1.89171135, -1.16881478], [-1.23474145, 1.55807114]]
Defined in src/operator/random/sample_op.cc:L85
Arguments
- loc::float, optional, default=0: Mean of the distribution.
- scale::float, optional, default=1: Standard deviation of the distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_poisson
— Method.
_random_poisson(lam, shape, ctx, dtype)
Draw random samples from a Poisson distribution.
Samples are distributed according to a Poisson distribution parametrized by lambda (rate). Samples will always be returned as a floating point data type.
Example::
poisson(lam=4, shape=(2,2)) = [[ 5., 2.], [ 4., 6.]]
Defined in src/operator/random/sample_op.cc:L132
Arguments
- lam::float, optional, default=1: Lambda parameter (rate) of the Poisson distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._random_uniform
— Method.
_random_uniform(low, high, shape, ctx, dtype)
Draw random samples from a uniform distribution.
.. note:: The existing alias uniform is deprecated.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high).
Example::
uniform(low=0, high=1, shape=(2,2)) = [[ 0.60276335, 0.85794562], [ 0.54488319, 0.84725171]]
Defined in src/operator/random/sample_op.cc:L66
Arguments
- low::float, optional, default=0: Lower bound of the distribution.
- high::float, optional, default=1: Upper bound of the distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._rdiv_scalar
— Method.
_rdiv_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._rminus_scalar
— Method.
_rminus_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._rmod_scalar!
— Method.
_rmod_scalar!(x::NDArray, y::Real)
source
# MXNet.mx._rmod_scalar
— Method.
_rmod_scalar(x::NDArray, y::Real)
source
# MXNet.mx._rpower_scalar
— Method.
_rpower_scalar(data, scalar)
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._sample_exponential
— Method.
_sample_exponential(lam, shape, dtype)
Concurrent sampling from multiple exponential distributions with parameters lambda (rate).
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Examples::
lam = [ 1.0, 8.5 ]

// Draw a single sample for each distribution
sample_exponential(lam) = [ 0.51837951, 0.09994757]

// Draw a vector containing two samples for each distribution
sample_exponential(lam, shape=(2)) = [[ 0.51837951, 0.19866663], [ 0.09994757, 0.50447971]]
Defined in src/operator/random/multisample_op.cc:L284
Arguments
- lam::NDArray-or-SymbolicNode: Lambda (rate) parameters of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
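The output-shape rule ([s]x[t]) can be made concrete with a small sketch. The code below is plain Python built on the standard `random` module, not the MXNet operator. It handles only a one-dimensional parameter list, and the function name is reused purely for readability.

```python
import random


def sample_exponential(lam, shape=()):
    """Draw samples for each rate in `lam` (a flat list, so [s] = (len(lam),)).

    `shape` plays the role of [t]: each rate gets its own block of that shape,
    so the nested result has shape [s] x [t].
    """
    def draw(rate, dims):
        if not dims:                       # no dimensions left: one sample
            return random.expovariate(rate)
        return [draw(rate, dims[1:]) for _ in range(dims[0])]

    return [draw(rate, tuple(shape)) for rate in lam]


random.seed(0)
lam = [1.0, 8.5]
single = sample_exponential(lam)             # shape (2,): one draw per rate
pairs = sample_exponential(lam, shape=(2,))  # shape (2, 2): two draws per rate
print(single)
print(pairs)
```

The same indexing rule applies to the other `_sample_*` operators, with the difference that those taking two parameter arrays pair the parameters element by element.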
source
# MXNet.mx._sample_gamma
— Method.
_sample_gamma(alpha, shape, dtype, beta)
Concurrent sampling from multiple gamma distributions with parameters alpha (shape) and beta (scale).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
alpha = [ 0.0, 2.5 ]
beta = [ 1.0, 0.7 ]

// Draw a single sample for each distribution
sample_gamma(alpha, beta) = [ 0. , 2.25797319]

// Draw a vector containing two samples for each distribution
sample_gamma(alpha, beta, shape=(2)) = [[ 0. , 0. ], [ 2.25797319, 1.70734084]]
Defined in src/operator/random/multisample_op.cc:L282
Arguments
- alpha::NDArray-or-SymbolicNode: Alpha (shape) parameters of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
- beta::NDArray-or-SymbolicNode: Beta (scale) parameters of the distributions.
source
# MXNet.mx._sample_generalized_negative_binomial
— Method.
_sample_generalized_negative_binomial(mu, shape, dtype, alpha)
Concurrent sampling from multiple generalized negative binomial distributions with parameters mu (mean) and alpha (dispersion).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Samples will always be returned as a floating point data type.
Examples::
mu = [ 2.0, 2.5 ]
alpha = [ 1.0, 0.1 ]

// Draw a single sample for each distribution
sample_generalized_negative_binomial(mu, alpha) = [ 0., 3.]

// Draw a vector containing two samples for each distribution
sample_generalized_negative_binomial(mu, alpha, shape=(2)) = [[ 0., 3.], [ 3., 1.]]
Defined in src/operator/random/multisample_op.cc:L293
Arguments
- mu::NDArray-or-SymbolicNode: Means of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
- alpha::NDArray-or-SymbolicNode: Alpha (dispersion) parameters of the distributions.
source
# MXNet.mx._sample_multinomial
— Method.
_sample_multinomial(data, shape, get_prob, dtype)
Concurrent sampling from multiple multinomial distributions.
data is an n-dimensional array whose last dimension has length k, where k is the number of possible outcomes of each multinomial distribution. This operator draws shape samples from each distribution. If shape is empty, one sample is drawn from each distribution.
If get_prob is true, a second array containing the log likelihood of the drawn samples is also returned. This is typically used in reinforcement learning, where the reward can be supplied as the head gradient of this array to estimate the gradient.
Note that the input distribution must be normalized, i.e. data must sum to 1 along its last axis.
Examples::
probs = [[0, 0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1, 0]]

// Draw a single sample for each distribution
sample_multinomial(probs) = [3, 0]

// Draw a vector containing two samples for each distribution
sample_multinomial(probs, shape=(2)) = [[4, 2], [0, 0]]

// requests log likelihood
sample_multinomial(probs, get_prob=True) = [2, 1], [0.2, 0.3]
Arguments
- data::NDArray-or-SymbolicNode: Distribution probabilities. Must sum to one on the last axis.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- get_prob::boolean, optional, default=0: Whether to also return the log probability of the sampled result. This is usually used for differentiating through stochastic variables, e.g. in reinforcement learning.
- dtype::{'int32'}, optional, default='int32': DType of the output in case this can't be inferred. Only int32 is supported for now.
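To show how the sampled indices and the optional second output relate, here is a plain-Python sketch using `random.choices`. It is not the MXNet operator: it draws one sample per distribution only, and it returns natural-log probabilities for the second output, following the description of get_prob above.

```python
import math
import random


def sample_multinomial(probs, get_prob=False):
    """Draw one outcome index per row of `probs` (each row must sum to 1).

    With get_prob=True, also return log(probs[i][sample_i]) for each row.
    """
    samples, log_probs = [], []
    for row in probs:
        k = random.choices(range(len(row)), weights=row)[0]
        samples.append(k)
        log_probs.append(math.log(row[k]))
    return (samples, log_probs) if get_prob else samples


random.seed(1)
probs = [[0, 0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1, 0]]
samples, log_probs = sample_multinomial(probs, get_prob=True)
print(samples, log_probs)
```

Outcomes with probability 0 are never drawn, so the logarithm is always finite here.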
source
# MXNet.mx._sample_negative_binomial
— Method.
_sample_negative_binomial(k, shape, dtype, p)
Concurrent sampling from multiple negative binomial distributions with parameters k (failure limit) and p (failure probability).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Samples will always be returned as a floating point data type.
Examples::
k = [ 20, 49 ]
p = [ 0.4 , 0.77 ]

// Draw a single sample for each distribution
sample_negative_binomial(k, p) = [ 15., 16.]

// Draw a vector containing two samples for each distribution
sample_negative_binomial(k, p, shape=(2)) = [[ 15., 50.], [ 16., 12.]]
Defined in src/operator/random/multisample_op.cc:L289
Arguments
- k::NDArray-or-SymbolicNode: Limits of unsuccessful experiments.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
- p::NDArray-or-SymbolicNode: Failure probabilities in each experiment.
source
# MXNet.mx._sample_normal
— Method.
_sample_normal(mu, shape, dtype, sigma)
Concurrent sampling from multiple normal distributions with parameters mu (mean) and sigma (standard deviation).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
mu = [ 0.0, 2.5 ]
sigma = [ 1.0, 3.7 ]

// Draw a single sample for each distribution
sample_normal(mu, sigma) = [-0.56410581, 0.95934606]

// Draw a vector containing two samples for each distribution
sample_normal(mu, sigma, shape=(2)) = [[-0.56410581, 0.2928229 ], [ 0.95934606, 4.48287058]]
Defined in src/operator/random/multisample_op.cc:L279
Arguments
- mu::NDArray-or-SymbolicNode: Means of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
- sigma::NDArray-or-SymbolicNode: Standard deviations of the distributions.
source
# MXNet.mx._sample_poisson
— Method.
_sample_poisson(lam, shape, dtype)
Concurrent sampling from multiple Poisson distributions with parameters lambda (rate).
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Samples will always be returned as a floating point data type.
Examples::
lam = [ 1.0, 8.5 ]

// Draw a single sample for each distribution
sample_poisson(lam) = [ 0., 13.]

// Draw a vector containing two samples for each distribution
sample_poisson(lam, shape=(2)) = [[ 0., 4.], [ 13., 8.]]
Defined in src/operator/random/multisample_op.cc:L286
Arguments
- lam::NDArray-or-SymbolicNode: Lambda (rate) parameters of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx._sample_uniform
— Method.
_sample_uniform(low, shape, dtype, high)
Concurrent sampling from multiple uniform distributions on the intervals given by [low,high).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
low = [ 0.0, 2.5 ]
high = [ 1.0, 3.7 ]

// Draw a single sample for each distribution
sample_uniform(low, high) = [ 0.40451524, 3.18687344]

// Draw a vector containing two samples for each distribution
sample_uniform(low, high, shape=(2)) = [[ 0.40451524, 0.18017688], [ 3.18687344, 3.68352246]]
Defined in src/operator/random/multisample_op.cc:L277
Arguments
- low::NDArray-or-SymbolicNode: Lower bounds of the distributions.
- shape::Shape(tuple), optional, default=[]: Shape to be sampled from each random distribution.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
- high::NDArray-or-SymbolicNode: Upper bounds of the distributions.
source
# MXNet.mx._scatter_elemwise_div
— Method.
_scatter_elemwise_div(lhs, rhs)
Divides arguments element-wise. If the left-hand-side input is 'row_sparse', then only the values which exist in the left-hand sparse array are computed. The 'missing' values are ignored.
The storage type of _scatter_elemwise_div output depends on the storage types of the inputs:
- _scatter_elemwise_div(row_sparse, row_sparse) = row_sparse
- _scatter_elemwise_div(row_sparse, dense) = row_sparse
- _scatter_elemwise_div(row_sparse, csr) = row_sparse
- otherwise, _scatter_elemwise_div behaves exactly like elemwise_div and generates output with default storage
Arguments
- lhs::NDArray-or-SymbolicNode: first input
- rhs::NDArray-or-SymbolicNode: second input
source
# MXNet.mx._scatter_minus_scalar
— Method.
_scatter_minus_scalar(data, scalar)
Subtracts a scalar from a tensor element-wise. If the left-hand-side input is 'row_sparse' or 'csr', then only the values which exist in the left-hand sparse array are computed. The 'missing' values are ignored.
The storage type of _scatter_minus_scalar output depends on the storage types of the inputs:
- _scatter_minus_scalar(row_sparse, scalar) = row_sparse
- _scatter_minus_scalar(csr, scalar) = csr
- otherwise, _scatter_minus_scalar behaves exactly like _minus_scalar and generates output with default storage
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
source
# MXNet.mx._scatter_plus_scalar
— Method.
_scatter_plus_scalar(data, scalar)
Adds a scalar to a tensor element-wise. If the left-hand-side input is 'row_sparse' or 'csr', then only the values which exist in the left-hand sparse array are computed. The 'missing' values are ignored.
The storage type of _scatter_plus_scalar output depends on the storage types of the inputs:
- _scatter_plus_scalar(row_sparse, scalar) = row_sparse
- _scatter_plus_scalar(csr, scalar) = csr
- otherwise, _scatter_plus_scalar behaves exactly like _plus_scalar and generates output with default storage
Arguments
- data::NDArray-or-SymbolicNode: source input
- scalar::float: scalar input
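The difference between the scatter variant and an ordinary dense addition can be sketched as follows. This is plain Python with a row_sparse array modelled as a dictionary of stored rows. The representation and both helper names are assumptions for illustration, not MXNet API.

```python
def scatter_plus_scalar(stored_rows, scalar):
    """Add `scalar` only to rows that are actually stored (row_sparse-like)."""
    return {r: [v + scalar for v in row] for r, row in stored_rows.items()}


def plus_scalar_dense(stored_rows, num_rows, num_cols, scalar):
    """Ordinary dense addition for contrast: missing rows count as zeros."""
    zero_row = [0.0] * num_cols
    return [[v + scalar for v in stored_rows.get(r, zero_row)]
            for r in range(num_rows)]


stored = {1: [10.0, 20.0]}          # a 3x2 array with only row 1 stored
sparse_out = scatter_plus_scalar(stored, 1.0)
dense_out = plus_scalar_dense(stored, 3, 2, 1.0)
print(sparse_out)  # {1: [11.0, 21.0]}
print(dense_out)   # [[1.0, 1.0], [11.0, 21.0], [1.0, 1.0]]
```

The scatter form keeps the result sparse because the missing rows are left untouched.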
source
# MXNet.mx._scatter_set_nd
— Method.
_scatter_set_nd(data, indices, shape)
This operator has the same functionality as scatter_nd except that it does not reset the elements not indexed by the input index NDArray in the input data NDArray.
.. note:: This operator is for internal use only.
Examples::
data = [2, 3, 0]
indices = [[1, 1, 0], [0, 1, 0]]
out = [[1, 1], [1, 1]]
scatter_nd(data=data, indices=indices, out=out)
out = [[0, 1], [2, 3]]
Arguments
- data::NDArray-or-SymbolicNode: data
- indices::NDArray-or-SymbolicNode: indices
- shape::Shape(tuple), required: Shape of output.
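The example above can be reproduced with a short plain-Python sketch. It is restricted to a 2-D output and assumes, as the example suggests, that indices[d][i] gives the coordinate along axis d at which data[i] is written. The helper is an illustration, not the internal operator.

```python
def scatter_set_nd(data, indices, out):
    """Write data[i] into out at (indices[0][i], indices[1][i]).

    Elements of `out` that are not indexed keep their previous values.
    """
    for i, value in enumerate(data):
        row, col = indices[0][i], indices[1][i]
        out[row][col] = value
    return out


data = [2, 3, 0]
indices = [[1, 1, 0], [0, 1, 0]]
out = [[1, 1], [1, 1]]
scatter_set_nd(data, indices, out)
print(out)  # [[0, 1], [2, 3]]
```

The element at position (0, 1) is never indexed, so it keeps its original value of 1.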
source
# MXNet.mx._set_value
— Method.
_set_value(src)
Arguments
- src::real_t: Source input to the function.
source
# MXNet.mx._slice_assign
— Method.
_slice_assign(lhs, rhs, begin, end, step)
Assign the rhs to a cropped subset of lhs.
Requirements
- output should be explicitly given and be the same as lhs.
- lhs and rhs are of the same data type, and on the same device.
From:src/operator/tensor/matrix_op.cc:385
Arguments
- lhs::NDArray-or-SymbolicNode: Source input
- rhs::NDArray-or-SymbolicNode: value to assign
- begin::Shape(tuple), required: starting indices for the slice operation, supports negative indices.
- end::Shape(tuple), required: ending indices for the slice operation, supports negative indices.
- step::Shape(tuple), optional, default=[]: step for the slice operation, supports negative values.
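As a rough illustration of the begin/end/step semantics, the sketch below performs the same kind of assignment on a nested Python list. It covers the 2-D case only and relies on Python slice rules for negative indices, which are assumed here to match the operator's behaviour. The helper name mirrors the operator for readability only.

```python
def slice_assign(lhs, rhs, begin, end, step=None):
    """Assign rhs into lhs[begin[0]:end[0]:step[0], begin[1]:end[1]:step[1]].

    Works in place on a 2-D nested list and returns it.
    """
    step = step or (1,) * len(begin)   # an empty step means stride 1
    rows = range(len(lhs))[begin[0]:end[0]:step[0]]
    cols = range(len(lhs[0]))[begin[1]:end[1]:step[1]]
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            lhs[r][c] = rhs[i][j]
    return lhs


lhs = [[0] * 4 for _ in range(3)]
slice_assign(lhs, [[1, 2], [3, 4]], begin=(0, 1), end=(2, 3))
print(lhs)  # [[0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
```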
source
# MXNet.mx._slice_assign_scalar
— Method.
_slice_assign_scalar(data, scalar, begin, end, step)
Assign the scalar to a cropped subset of the input.
Requirements
- output should be explicitly given and be the same as input
From:src/operator/tensor/matrix_op.cc:410
Arguments
- data::NDArray-or-SymbolicNode: Source input
- scalar::float, optional, default=0: The scalar value for assignment.
- begin::Shape(tuple), required: starting indices for the slice operation, supports negative indices.
- end::Shape(tuple), required: ending indices for the slice operation, supports negative indices.
- step::Shape(tuple), optional, default=[]: step for the slice operation, supports negative values.
source
# MXNet.mx._sparse_ElementWiseSum
— Method.
_sparse_ElementWiseSum(args)
_sparse_ElementWiseSum is an alias of add_n.
Note: _sparse_ElementWiseSum takes variable number of positional inputs. So instead of calling as _sparse_ElementWiseSum([x, y, z], num_args=3), one should call via _sparse_ElementWiseSum(x, y, z), and num_args will be determined automatically.
Adds all input arguments element-wise.
.. math:: add_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n
add_n is potentially more efficient than calling add n times.
The storage type of add_n output depends on the storage types of the inputs:
- add_n(row_sparse, row_sparse, ..) = row_sparse
- otherwise, add_n generates output with default storage
Defined in src/operator/tensor/elemwise_sum.cc:L123
Arguments
- args::NDArray-or-SymbolicNode[]: Positional input arguments
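The variadic calling convention and the element-wise sum can be sketched in plain Python. The `add_n` below operates on flat lists and only illustrates the call shape and the arithmetic; it is not the MXNet operator.

```python
def add_n(*args):
    """Element-wise sum of any number of equal-length vectors."""
    return [sum(values) for values in zip(*args)]


x, y, z = [1, 2], [10, 20], [100, 200]
total = add_n(x, y, z)   # inputs are passed positionally, not as one list
print(total)  # [111, 222]
```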
source
# MXNet.mx._sparse_abs
— Method.
_sparse_abs(data)
_sparse_abs is an alias of abs.
Returns element-wise absolute value of the input.
Example::
abs([-2, 0, 3]) = [2, 0, 3]
The storage type of abs output depends upon the input storage type:
- abs(default) = default
- abs(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L385
Arguments
- data::NDArray-or-SymbolicNode: The input array.
source
# MXNet.mx._sparse_adam_update
— Method.
_sparse_adam_update(weight, grad, mean, var, lr, beta1, beta2, epsilon, wd, rescale_grad, clip_gradient)
_sparse_adam_update is an alias of adam_update.
Update function for Adam optimizer. Adam is seen as a generalization of AdaGrad.
Adam update consists of the following steps, where g represents gradient and m, v are 1st and 2nd order moment estimates (mean and variance).
.. math::
g_t = \nabla J(W_{t-1})\\
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t\\
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2\\
W_t = W_{t-1} - \alpha \frac{ m_t }{ \sqrt{ v_t } + \epsilon }
It updates the weights using::
m = beta1*m + (1-beta1)*grad
v = beta2*v + (1-beta2)*(grad**2)
w += - learning_rate * m / (sqrt(v) + epsilon)
If w, m and v are all of row_sparse storage type, only the row slices whose indices appear in grad.indices are updated (for w, m and v)::
for row in grad.indices:
    m[row] = beta1*m[row] + (1-beta1)*grad[row]
    v[row] = beta2*v[row] + (1-beta2)*(grad[row]**2)
    w[row] += - learning_rate * m[row] / (sqrt(v[row]) + epsilon)
Defined in src/operator/optimizer_op.cc:L383
Arguments
- weight::NDArray-or-SymbolicNode: Weight
- grad::NDArray-or-SymbolicNode: Gradient
- mean::NDArray-or-SymbolicNode: Moving mean
- var::NDArray-or-SymbolicNode: Moving variance
- lr::float, required: Learning rate
- beta1::float, optional, default=0.9: The decay rate for the 1st moment estimates.
- beta2::float, optional, default=0.999: The decay rate for the 2nd moment estimates.
- epsilon::float, optional, default=1e-08: A small constant for numerical stability.
- wd::float, optional, default=0: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
- rescale_grad::float, optional, default=1: Rescale gradient to grad = rescale_grad*grad.
- clip_gradient::float, optional, default=-1: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
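The update rules above, including the row_sparse case, can be written out as a small plain-Python sketch. It deliberately leaves out wd, rescale_grad and clip_gradient, and models the sparse gradient as a dictionary from row index to gradient row. Both simplifications, and the helper names, are assumptions for illustration.

```python
import math


def adam_update(w, g, m, v, lr, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one Adam step in place to one row of weights."""
    for i in range(len(w)):
        m[i] = beta1 * m[i] + (1 - beta1) * g[i]
        v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
        w[i] -= lr * m[i] / (math.sqrt(v[i]) + epsilon)


def adam_update_row_sparse(w, grad_rows, m, v, lr, **kwargs):
    """Update only the rows whose indices appear in the sparse gradient."""
    for row, g in grad_rows.items():
        adam_update(w[row], g, m[row], v[row], lr, **kwargs)


w = [[1.0, 1.0], [2.0, 2.0]]
m = [[0.0, 0.0], [0.0, 0.0]]
v = [[0.0, 0.0], [0.0, 0.0]]
grad_rows = {1: [0.5, -0.5]}      # only row 1 has a gradient

adam_update_row_sparse(w, grad_rows, m, v, lr=0.1)
print(w[0])  # unchanged: [1.0, 1.0]
print(w[1])  # first entry decreased, second increased
```

Rows absent from the gradient keep both their weights and their moment estimates.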
source
# MXNet.mx._sparse_add_n
— Method.
_sparse_add_n(args)
_sparse_add_n is an alias of add_n.
Note: _sparse_add_n takes variable number of positional inputs. So instead of calling as _sparse_add_n([x, y, z], num_args=3), one should call via _sparse_add_n(x, y, z), and num_args will be determined automatically.
Adds all input arguments element-wise.
.. math:: add_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n
add_n is potentially more efficient than calling add n times.
The storage type of add_n output depends on the storage types of the inputs:
- add_n(row_sparse, row_sparse, ..) = row_sparse
- otherwise, add_n generates output with default storage
Defined in src/operator/tensor/elemwise_sum.cc:L123
Arguments
- args::NDArray-or-SymbolicNode[]: Positional input arguments
source
# MXNet.mx._sparse_arccos
— Method.
_sparse_arccos(data)
_sparse_arccos is an alias of arccos.
Returns element-wise inverse cosine of the input array.
The input should be in the range [-1, 1]. The output is in the closed interval [0, π].
.. math:: arccos([-1, -.707, 0, .707, 1]) = [\pi, 3\pi/4, \pi/2, \pi/4, 0]
The storage type of arccos output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L123
Arguments
- data::NDArray-or-SymbolicNode: The input array.
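The values in the formula above can be verified numerically. The sketch uses Python's standard `math.acos` on a list rather than the MXNet operator, and replaces the rounded value .707 with the exact sqrt(1/2).

```python
import math

xs = [-1.0, -math.sqrt(0.5), 0.0, math.sqrt(0.5), 1.0]
ys = [math.acos(x) for x in xs]
expected = [math.pi, 3 * math.pi / 4, math.pi / 2, math.pi / 4, 0.0]

# Compare with a small absolute tolerance to absorb floating-point rounding.
matches = all(math.isclose(y, e, abs_tol=1e-12) for y, e in zip(ys, expected))
print(matches)
```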
source
# MXNet.mx._sparse_arccosh
— Method.
_sparse_arccosh(data)
_sparse_arccosh is an alias of arccosh.
Returns the element-wise inverse hyperbolic cosine of the input array.
The storage type of arccosh output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L264
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_arcsin
— Method.
_sparse_arcsin(data)
_sparse_arcsin is an alias of arcsin.
Returns element-wise inverse sine of the input array.
The input should be in the range [-1, 1]. The output is in the closed interval [-\pi/2, \pi/2].
.. math:: arcsin([-1, -.707, 0, .707, 1]) = [-\pi/2, -\pi/4, 0, \pi/4, \pi/2]
The storage type of $arcsin$ output depends upon the input storage type:
- arcsin(default) = default
- arcsin(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L104
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_arcsinh
— Method.
_sparse_arcsinh(data)
_sparse_arcsinh is an alias of arcsinh.
Returns the inverse hyperbolic sine of the input array, computed element-wise.
The storage type of $arcsinh$ output depends upon the input storage type:
- arcsinh(default) = default
- arcsinh(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L250
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_arctan
— Method.
_sparse_arctan(data)
_sparse_arctan is an alias of arctan.
Returns element-wise inverse tangent of the input array.
The output is in the closed interval [-\pi/2, \pi/2].
.. math:: arctan([-1, 0, 1]) = [-\pi/4, 0, \pi/4]
The storage type of $arctan$ output depends upon the input storage type:
- arctan(default) = default
- arctan(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L144
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_arctanh
— Method.
_sparse_arctanh(data)
_sparse_arctanh is an alias of arctanh.
Returns the inverse hyperbolic tangent of the input array, computed element-wise.
The storage type of $arctanh$ output depends upon the input storage type:
- arctanh(default) = default
- arctanh(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L281
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_cast_storage
— Method.
_sparse_cast_storage(data, stype)
_sparse_cast_storage is an alias of cast_storage.
Casts tensor storage type to the new type.
When an NDArray with default storage type is cast to csr or row_sparse storage, the result is compact, which means:
- for csr, zero values will not be retained
- for row_sparse, row slices of all zeros will not be retained
The storage type of $cast_storage$ output depends on stype parameter:
- cast_storage(csr, 'default') = default
- cast_storage(row_sparse, 'default') = default
- cast_storage(default, 'csr') = csr
- cast_storage(default, 'row_sparse') = row_sparse
Example::
dense = [[ 0., 1., 0.],
[ 2., 0., 3.],
[ 0., 0., 0.],
[ 0., 0., 0.]]
# cast to row_sparse storage type
rsp = cast_storage(dense, 'row_sparse')
rsp.indices = [0, 1]
rsp.values = [[ 0., 1., 0.],
[ 2., 0., 3.]]
# cast to csr storage type
csr = cast_storage(dense, 'csr')
csr.indices = [1, 0, 2]
csr.values = [ 1., 2., 3.]
csr.indptr = [0, 1, 3, 3, 3]
Defined in src/operator/tensor/cast_storage.cc:L69
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
stype::{'csr', 'default', 'row_sparse'}, required
: Output storage type.
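The compaction rules above can be illustrated with a small pure-Python sketch. It does not call MXNet: `to_row_sparse` and `to_csr` are hypothetical helper names, and nested lists stand in for NDArrays. Under those assumptions it reproduces the indices, values and indptr listed in the example.

```python
def to_row_sparse(dense):
    """Return (indices, values), keeping only rows with at least one non-zero."""
    indices, values = [], []
    for i, row in enumerate(dense):
        if any(v != 0 for v in row):
            indices.append(i)
            values.append(list(row))
    return indices, values


def to_csr(dense):
    """Return (values, indices, indptr) in compressed sparse row layout."""
    values, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)      # stored non-zero value
                indices.append(j)     # its column index
        indptr.append(len(values))    # running count marks the end of this row
    return values, indices, indptr


dense = [[0., 1., 0.],
         [2., 0., 3.],
         [0., 0., 0.],
         [0., 0., 0.]]

rsp_indices, rsp_values = to_row_sparse(dense)
csr_values, csr_indices, csr_indptr = to_csr(dense)
```

Rows 2 and 3 are all zeros, so they vanish from the row_sparse result and contribute no growth to `csr_indptr`.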
source
# MXNet.mx._sparse_ceil
— Method.
_sparse_ceil(data)
_sparse_ceil is an alias of ceil.
Returns element-wise ceiling of the input.
The ceil of the scalar x is the smallest integer i, such that i >= x.
Example::
ceil([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 2., 2., 3.]
The storage type of $ceil$ output depends upon the input storage type:
- ceil(default) = default
- ceil(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L463
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_clip
— Method.
_sparse_clip(data, a_min, a_max)
_sparse_clip is an alias of clip.
Clips (limits) the values in an array.
Given an interval, values outside the interval are clipped to the interval edges. Clipping $x$ between a_min and a_max would be::
clip(x, a_min, a_max) = max(min(x, a_max), a_min)
Example::
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
clip(x,1,8) = [ 1., 1., 2., 3., 4., 5., 6., 7., 8., 8.]
The storage type of $clip$ output depends on storage types of inputs and the a_min, a_max parameter values:
- clip(default) = default
- clip(row_sparse, a_min <= 0, a_max >= 0) = row_sparse
- clip(csr, a_min <= 0, a_max >= 0) = csr
- clip(row_sparse, a_min < 0, a_max < 0) = default
- clip(row_sparse, a_min > 0, a_max > 0) = default
- clip(csr, a_min < 0, a_max < 0) = csr
- clip(csr, a_min > 0, a_max > 0) = csr
Defined in src/operator/tensor/matrix_op.cc:L490
Arguments
-
data::NDArray-or-SymbolicNode
: Input array. -
a_min::float, required
: Minimum value -
a_max::float, required
: Maximum value
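As a rough illustration of the clipping formula, the sketch below applies it to a plain Python list. `clip` and `zero_is_fixed_point` are illustrative names, not MXNet calls. The second helper reflects one plausible reading of the row_sparse storage rules, stated here as an assumption: when a_min <= 0 <= a_max a zero entry stays zero, so the implicit zeros of a sparse input do not have to be materialized.

```python
def clip(xs, a_min, a_max):
    """Clip every element of xs to the closed interval [a_min, a_max]."""
    return [float(max(min(x, a_max), a_min)) for x in xs]


def zero_is_fixed_point(a_min, a_max):
    """True when clipping maps 0 to 0 (assumed reason sparsity can be kept)."""
    return a_min <= 0 <= a_max


x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
clipped = clip(x, 1, 8)
```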
source
# MXNet.mx._sparse_cos
— Method.
_sparse_cos(data)
_sparse_cos is an alias of cos.
Computes the element-wise cosine of the input array.
The input should be in radians (2\pi rad equals 360 degrees).
.. math:: cos([0, \pi/4, \pi/2]) = [1, 0.707, 0]
The storage type of $cos$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L63
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_cosh
— Method.
_sparse_cosh(data)
_sparse_cosh is an alias of cosh.
Returns the hyperbolic cosine of the input array, computed element-wise.
.. math:: cosh(x) = 0.5\times(exp(x) + exp(-x))
The storage type of $cosh$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L216
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_degrees
— Method.
_sparse_degrees(data)
_sparse_degrees is an alias of degrees.
Converts each element of the input array from radians to degrees.
.. math:: degrees([0, \pi/2, \pi, 3\pi/2, 2\pi]) = [0, 90, 180, 270, 360]
The storage type of $degrees$ output depends upon the input storage type:
- degrees(default) = default
- degrees(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L163
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_dot
— Method.
_sparse_dot(lhs, rhs, transpose_a, transpose_b)
_sparse_dot is an alias of dot.
Dot product of two arrays.
$dot$'s behavior depends on the input array dimensions:
- 1-D arrays: inner product of vectors
- 2-D arrays: matrix multiplication
- N-D arrays: a sum product over the last axis of the first input and the first axis of the second input
For example, given 3-D $x$ with shape (n,m,k) and $y$ with shape (k,r,s), the result array will have shape (n,m,r,s). It is computed by::
dot(x,y)[i,j,a,b] = sum(x[i,j,:]*y[:,a,b])
Example::
x = reshape([0,1,2,3,4,5,6,7], shape=(2,2,2))
y = reshape([7,6,5,4,3,2,1,0], shape=(2,2,2))
dot(x,y)[0,0,1,1] = 0
sum(x[0,0,:]*y[:,1,1]) = 0
The storage type of $dot$ output depends on storage types of inputs and transpose options:
- dot(csr, default) = default
- dot(csr.T, default) = row_sparse
- dot(csr, row_sparse) = default
- dot(default, csr) = csr
- otherwise, $dot$ generates output with default storage
Defined in src/operator/tensor/dot.cc:L62
Arguments
-
lhs::NDArray-or-SymbolicNode
: The first input -
rhs::NDArray-or-SymbolicNode
: The second input -
transpose_a::boolean, optional, default=0
: If true then transpose the first input before dot. -
transpose_b::boolean, optional, default=0
: If true then transpose the second input before dot.
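The N-D rule can be checked by hand with nested lists. The sketch below is only an illustration of the index formula for the 3-D case: `dot3d` is a hypothetical name, transposition is not handled, and the lists are written in row-major order to match the `reshape` calls in the example.

```python
def dot3d(x, y):
    """Contract the last axis of x (n, m, k) with the first axis of y (k, r, s)."""
    k = len(y)
    r, s = len(y[0]), len(y[0][0])
    return [[[[sum(x_vec[t] * y[t][a][b] for t in range(k))
               for b in range(s)]
              for a in range(r)]
             for x_vec in x_mat]
            for x_mat in x]


x = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
y = [[[7, 6], [5, 4]], [[3, 2], [1, 0]]]
out = dot3d(x, y)   # nested lists with shape (2, 2, 2, 2)
```

Here `out[0][0][1][1]` pairs `x[0][0] = [0, 1]` with `[y[0][1][1], y[1][1][1]] = [4, 0]`, which gives the 0 shown in the example.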
source
# MXNet.mx._sparse_elemwise_add
— Method.
_sparse_elemwise_add(lhs, rhs)
_sparse_elemwise_add is an alias of elemwise_add.
Adds arguments element-wise.
The storage type of $elemwise_add$ output depends on storage types of inputs
- elemwise_add(row_sparse, row_sparse) = row_sparse
- elemwise_add(csr, csr) = csr
- otherwise, $elemwise_add$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._sparse_elemwise_div
— Method.
_sparse_elemwise_div(lhs, rhs)
_sparse_elemwise_div is an alias of elemwise_div.
Divides arguments element-wise.
The storage type of $elemwise_div$ output is always dense
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._sparse_elemwise_mul
— Method.
_sparse_elemwise_mul(lhs, rhs)
_sparse_elemwise_mul is an alias of elemwise_mul.
Multiplies arguments element-wise.
The storage type of $elemwise_mul$ output depends on storage types of inputs
- elemwise_mul(default, default) = default
- elemwise_mul(row_sparse, row_sparse) = row_sparse
- elemwise_mul(default, row_sparse) = default
- elemwise_mul(row_sparse, default) = default
- elemwise_mul(csr, csr) = csr
- otherwise, $elemwise_mul$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._sparse_elemwise_sub
— Method.
_sparse_elemwise_sub(lhs, rhs)
_sparse_elemwise_sub is an alias of elemwise_sub.
Subtracts arguments element-wise.
The storage type of $elemwise_sub$ output depends on storage types of inputs
- elemwise_sub(row_sparse, row_sparse) = row_sparse
- elemwise_sub(csr, csr) = csr
- otherwise, $elemwise_sub$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx._sparse_exp
— Method.
_sparse_exp(data)
_sparse_exp is an alias of exp.
Returns element-wise exponential value of the input.
.. math:: exp(x) = e^x \approx 2.718^x
Example::
exp([0, 1, 2]) = [1., 2.71828175, 7.38905621]
The storage type of $exp$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L641
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_expm1
— Method.
_sparse_expm1(data)
_sparse_expm1 is an alias of expm1.
Returns $exp(x) - 1$ computed element-wise on the input.
This function provides greater precision than $exp(x) - 1$ for small values of $x$.
The storage type of $expm1$ output depends upon the input storage type:
- expm1(default) = default
- expm1(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L720
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
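The precision claim is easy to see with Python's standard `math` module, used here purely as a stand-in for the operator. For a very small `x`, `exp(x)` rounds to exactly 1.0 in double precision, so the naive difference loses all information while `expm1` keeps it.

```python
import math

x = 1e-20
naive = math.exp(x) - 1.0    # exp(x) rounds to 1.0, so the subtraction cancels
accurate = math.expm1(x)     # computed without forming exp(x) first
```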
source
# MXNet.mx._sparse_fix
— Method.
_sparse_fix(data)
_sparse_fix is an alias of fix.
Returns element-wise rounded value to the nearest integer towards zero of the input.
Example::
fix([-2.1, -1.9, 1.9, 2.1]) = [-2., -1., 1., 2.]
The storage type of $fix$ output depends upon the input storage type:
- fix(default) = default
- fix(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L520
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_floor
— Method.
_sparse_floor(data)
_sparse_floor is an alias of floor.
Returns element-wise floor of the input.
The floor of the scalar x is the largest integer i, such that i <= x.
Example::
floor([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-3., -2., 1., 1., 2.]
The storage type of $floor$ output depends upon the input storage type:
- floor(default) = default
- floor(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L482
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_ftrl_update
— Method.
_sparse_ftrl_update(weight, grad, z, n, lr, lamda1, beta, wd, rescale_grad, clip_gradient)
_sparse_ftrl_update is an alias of ftrl_update.
Update function for Ftrl optimizer. Referenced from Ad Click Prediction: a View from the Trenches, available at http://dl.acm.org/citation.cfm?id=2488200.
It updates the weights using::
rescaled_grad = clip(grad * rescale_grad, clip_gradient)
z += rescaled_grad - (sqrt(n + rescaled_grad**2) - sqrt(n)) * weight / learning_rate
n += rescaled_grad**2
w = (sign(z) * lamda1 - z) / ((beta + sqrt(n)) / learning_rate + wd) * (abs(z) > lamda1)
If w, z and n are all of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated (for w, z and n)::
for row in grad.indices:
    rescaled_grad[row] = clip(grad[row] * rescale_grad, clip_gradient)
    z[row] += rescaled_grad[row] - (sqrt(n[row] + rescaled_grad[row]**2) - sqrt(n[row])) * weight[row] / learning_rate
    n[row] += rescaled_grad[row]**2
    w[row] = (sign(z[row]) * lamda1 - z[row]) / ((beta + sqrt(n[row])) / learning_rate + wd) * (abs(z[row]) > lamda1)
Defined in src/operator/optimizer_op.cc:L520
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
z::NDArray-or-SymbolicNode
: z -
n::NDArray-or-SymbolicNode
: Square of grad -
lr::float, required
: Learning rate -
lamda1::float, optional, default=0.01
: The L1 regularization coefficient. -
beta::float, optional, default=1
: Per-Coordinate Learning Rate beta. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
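To make the update rule concrete, here is a scalar sketch of a single Ftrl step written directly from the formulas in this entry. It is not the MXNet kernel: `ftrl_step` and `clip_grad` are hypothetical names, plain floats stand in for arrays, and the defaults simply mirror the argument list.

```python
import math


def clip_grad(g, clip_gradient):
    """Clip g to [-clip_gradient, clip_gradient]; clipping is off when <= 0."""
    if clip_gradient <= 0:
        return g
    return max(min(g, clip_gradient), -clip_gradient)


def ftrl_step(w, grad, z, n, lr, lamda1=0.01, beta=1.0, wd=0.0,
              rescale_grad=1.0, clip_gradient=-1.0):
    """Return the updated (w, z, n) for one scalar weight."""
    g = clip_grad(grad * rescale_grad, clip_gradient)
    z += g - (math.sqrt(n + g * g) - math.sqrt(n)) * w / lr
    n += g * g
    if abs(z) > lamda1:
        sign_z = (z > 0) - (z < 0)
        w = (sign_z * lamda1 - z) / ((beta + math.sqrt(n)) / lr + wd)
    else:
        w = 0.0   # the (abs(z) > lamda1) factor zeroes the weight
    return w, z, n


w1, z1, n1 = ftrl_step(w=0.0, grad=1.0, z=0.0, n=0.0, lr=0.1)
w2, z2, n2 = ftrl_step(w=0.0, grad=0.005, z=0.0, n=0.0, lr=0.1)
```

The second call shows the L1 effect: the accumulated `z` stays below `lamda1`, so the weight is held at exactly zero.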
source
# MXNet.mx._sparse_gamma
— Method.
_sparse_gamma(data)
_sparse_gamma is an alias of gamma.
Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
The storage type of $gamma$ output is always dense
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_gammaln
— Method.
_sparse_gammaln(data)
_sparse_gammaln is an alias of gammaln.
Returns element-wise log of the absolute value of the gamma function of the input.
The storage type of $gammaln$ output is always dense
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_log
— Method.
_sparse_log(data)
_sparse_log is an alias of log.
Returns element-wise natural logarithmic value of the input.
The natural logarithm is the logarithm in base e, so that $log(exp(x)) = x$
The storage type of $log$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L653
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_log10
— Method.
_sparse_log10(data)
_sparse_log10 is an alias of log10.
Returns element-wise Base-10 logarithmic value of the input.
$10**log10(x) = x$
The storage type of $log10$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L665
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_log1p
— Method.
_sparse_log1p(data)
_sparse_log1p is an alias of log1p.
Returns element-wise $log(1 + x)$ value of the input.
This function is more accurate than $log(1 + x)$ for small $x$, where 1 + x \approx 1.
The storage type of $log1p$ output depends upon the input storage type:
- log1p(default) = default
- log1p(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L702
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
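A quick way to see the accuracy difference is to compare the two forms in Python's standard `math` module, which serves only as a stand-in for the operator here. When `x` is tiny, `1 + x` rounds to exactly 1.0 before the logarithm is taken.

```python
import math

x = 1e-20
naive = math.log(1.0 + x)    # 1.0 + x rounds to 1.0, so this is log(1.0)
accurate = math.log1p(x)     # evaluates log(1 + x) without forming 1 + x
```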
source
# MXNet.mx._sparse_log2
— Method.
_sparse_log2(data)
_sparse_log2 is an alias of log2.
Returns element-wise Base-2 logarithmic value of the input.
$2**log2(x) = x$
The storage type of $log2$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L677
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_make_loss
— Method.
_sparse_make_loss(data)
_sparse_make_loss is an alias of make_loss.
Make your own loss function in network construction.
This operator accepts a customized loss function symbol as a terminal loss and the symbol should be an operator with no backward dependency. The output of this function is the gradient of loss with respect to the input data.
For example, if you are making a cross entropy loss function, assume $out$ is the predicted output and $label$ is the true label. The cross entropy can then be defined as::
cross_entropy = label * log(out) + (1 - label) * log(1 - out)
loss = make_loss(cross_entropy)
We will need to use $make_loss$ when we are creating our own loss function or we want to combine multiple loss functions. Also we may want to stop some variables' gradients from backpropagation. See more detail in $BlockGrad$ or $stop_gradient$.
The storage type of $make_loss$ output depends upon the input storage type:
- make_loss(default) = default
- make_loss(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L199
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_mean
— Method.
_sparse_mean(data, axis, keepdims, exclude)
_sparse_mean is an alias of mean.
Computes the mean of array elements over given axes.
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L101
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::Shape(tuple), optional, default=[]
: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is int, a reduction is performed on a particular axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, reduction will be performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left. -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axes are left in the result as dimensions with size one. -
exclude::boolean, optional, default=0
: Whether to perform reduction on the axes that are NOT in axis instead.
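The interaction between axis, negative indices and exclude can be summarized as a small axis-resolution helper. This is a sketch of the rules as described in the argument list, not MXNet code; `resolve_axes` is a hypothetical name and it returns the list of axes that would be reduced.

```python
def resolve_axes(ndim, axis=(), exclude=False):
    """Return the sorted list of axes to reduce for an ndim-dimensional input."""
    if isinstance(axis, int):
        axis = (axis,)
    for a in axis:
        if not -ndim <= a < ndim:
            raise ValueError("axis %d is out of range for %d dimensions" % (a, ndim))
    # Negative values index from right to left.
    chosen = {a % ndim for a in axis}
    if not chosen:
        # The default axis=() reduces over every element.
        return list(range(ndim))
    if exclude:
        # Reduce over the axes that are NOT listed.
        return [a for a in range(ndim) if a not in chosen]
    return sorted(chosen)
```

For a 3-D input, `resolve_axes(3, axis=-1)` selects the last axis, while `resolve_axes(3, axis=(0,), exclude=True)` selects axes 1 and 2.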
source
# MXNet.mx._sparse_negative
— Method.
_sparse_negative(data)
_sparse_negative is an alias of negative.
Numerical negative of the argument, element-wise.
The storage type of $negative$ output depends upon the input storage type:
- negative(default) = default
- negative(row_sparse) = row_sparse
- negative(csr) = csr
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_norm
— Method.
_sparse_norm(data)
_sparse_norm is an alias of norm.
Flattens the input array and then computes the l2 norm.
Examples::
x = [[1, 2], [3, 4]]
norm(x) = [5.47722578]
rsp = x.cast_storage('row_sparse')
norm(rsp) = [5.47722578]
csr = x.cast_storage('csr')
norm(csr) = [5.47722578]
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L266
Arguments
-
data::NDArray-or-SymbolicNode
: Source input
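The examples return the same value for dense, row_sparse and csr inputs because zeros add nothing to a sum of squares. A minimal sketch of that idea, using nested lists and a hypothetical `l2_norm` helper rather than MXNet:

```python
import math


def l2_norm(rows):
    """Flatten a list of rows and return the square root of the sum of squares."""
    return math.sqrt(sum(v * v for row in rows for v in row))


dense = [[1, 2], [0, 0], [3, 4]]
stored_rows = [[1, 2], [3, 4]]   # what a row_sparse layout would keep

dense_norm = l2_norm(dense)
sparse_norm = l2_norm(stored_rows)
```

Both values equal sqrt(30), which matches the 5.477... shown above up to single-precision rounding.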
source
# MXNet.mx._sparse_radians
— Method.
_sparse_radians(data)
_sparse_radians is an alias of radians.
Converts each element of the input array from degrees to radians.
.. math:: radians([0, 90, 180, 270, 360]) = [0, \pi/2, \pi, 3\pi/2, 2\pi]
The storage type of $radians$ output depends upon the input storage type:
- radians(default) = default
- radians(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L182
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_relu
— Method.
_sparse_relu(data)
_sparse_relu is an alias of relu.
Computes rectified linear.
.. math:: max(features, 0)
The storage type of $relu$ output depends upon the input storage type:
- relu(default) = default
- relu(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L83
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_retain
— Method.
_sparse_retain(data, indices)
Picks the rows specified by the user-supplied index array from a row sparse matrix and saves them in the output sparse matrix.
Example::
data = [[1, 2], [3, 4], [5, 6]]
indices = [0, 1, 3]
shape = (4, 2)
rsp_in = row_sparse(data, indices)
to_retain = [0, 3]
rsp_out = retain(rsp_in, to_retain)
rsp_out.values = [[1, 2], [5, 6]]
rsp_out.indices = [0, 3]
The storage type of $retain$ output depends on storage types of inputs
- retain(row_sparse, default) = row_sparse
- otherwise, $retain$ is not supported
Defined in src/operator/tensor/sparse_retain.cc:L53
Arguments
-
data::NDArray-or-SymbolicNode
: The input array for sparse_retain operator. -
indices::NDArray-or-SymbolicNode
: The index array of row ids that will be retained.
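The example can be reproduced with a few lines of plain Python. In this sketch a row sparse matrix is modelled as a pair of lists (row indices and the matching row values), and `retain` is a hypothetical stand-in for the operator, not the MXNet implementation.

```python
def retain(indices, values, to_retain):
    """Keep only the stored rows whose index appears in to_retain."""
    keep = set(to_retain)
    out_indices, out_values = [], []
    for idx, row in zip(indices, values):
        if idx in keep:
            out_indices.append(idx)
            out_values.append(list(row))
    return out_indices, out_values


indices = [0, 1, 3]
values = [[1, 2], [3, 4], [5, 6]]
out_indices, out_values = retain(indices, values, to_retain=[0, 3])
```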
source
# MXNet.mx._sparse_rint
— Method.
_sparse_rint(data)
_sparse_rint is an alias of rint.
Returns element-wise rounded value to the nearest integer of the input.
.. note::
- For input $n.5$ $rint$ returns $n$ while $round$ returns $n+1$.
- For input $-n.5$ both $rint$ and $round$ return $-n-1$.
Example::
rint([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 1., -2., 2., 2.]
The storage type of $rint$ output depends upon the input storage type:
- rint(default) = default
- rint(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L444
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
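The tie-breaking difference described in the note can be written out explicitly. The two helpers below are hypothetical and only model the behaviour stated in this entry and in the $round$ example (ties go down for $rint$, away from zero for $round$); they do not describe how the operators are implemented.

```python
import math


def rint_like(x):
    """Nearest integer, with exact .5 ties resolved toward the floor."""
    lo, hi = math.floor(x), math.ceil(x)
    return float(lo) if (x - lo) <= (hi - x) else float(hi)


def round_like(x):
    """Nearest integer, with exact .5 ties resolved away from zero."""
    return math.copysign(math.floor(abs(x) + 0.5), x)


inputs = [-1.5, 1.5, -1.9, 1.9, 2.1]
rint_out = [rint_like(v) for v in inputs]
round_out = [round_like(v) for v in inputs]
```

The two lists differ only at 1.5, the positive tie.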
source
# MXNet.mx._sparse_round
— Method.
_sparse_round(data)
_sparse_round is an alias of round.
Returns element-wise rounded value to the nearest integer of the input.
Example::
round([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 2., -2., 2., 2.]
The storage type of $round$ output depends upon the input storage type:
- round(default) = default
- round(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L423
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_rsqrt
— Method.
_sparse_rsqrt(data)
_sparse_rsqrt is an alias of rsqrt.
Returns element-wise inverse square-root value of the input.
.. math:: rsqrt(x) = 1/\sqrt{x}
Example::
rsqrt([4,9,16]) = [0.5, 0.33333334, 0.25]
The storage type of $rsqrt$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L584
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_sgd_mom_update
— Method.
_sparse_sgd_mom_update(weight, grad, mom, lr, momentum, wd, rescale_grad, clip_gradient)
_sparse_sgd_mom_update is an alias of sgd_mom_update.
Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
Momentum update has better convergence rates on neural networks. Mathematically it looks like below:
.. math::
v_1 = -\alpha * \nabla J(W_0)\\
v_t = \gamma v_{t-1} - \alpha * \nabla J(W_{t-1})\\
W_t = W_{t-1} + v_t
It updates the weights using::
v = momentum * v - learning_rate * gradient
weight += v
Where the parameter $momentum$ is the decay rate of momentum estimates at each epoch.
If weight and grad are both of $row_sparse$ storage type and momentum is of $default$ storage type, standard update is applied.
If weight, grad and momentum are all of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated (for both weight and momentum)::
for row in gradient.indices:
    v[row] = momentum[row] * v[row] - learning_rate * gradient[row]
    weight[row] += v[row]
Defined in src/operator/optimizer_op.cc:L265
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
mom::NDArray-or-SymbolicNode
: Momentum -
lr::float, required
: Learning rate -
momentum::float, optional, default=0
: The decay rate of momentum estimates at each epoch. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
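The row-wise variant can be sketched with nested lists. This is an illustration under simplifying assumptions: `sgd_mom_update_row_sparse` is a hypothetical name, the momentum coefficient is a single float, and weight decay, gradient rescaling and clipping are left out.

```python
def sgd_mom_update_row_sparse(weight, mom, grad_indices, grad_values, lr, momentum=0.0):
    """Update, in place, only the rows listed in grad_indices."""
    for row, grad_row in zip(grad_indices, grad_values):
        for j, g in enumerate(grad_row):
            mom[row][j] = momentum * mom[row][j] - lr * g   # v = momentum * v - lr * grad
            weight[row][j] += mom[row][j]                   # weight += v
    return weight, mom


weight = [[1.0], [1.0], [1.0]]
mom = [[0.0], [0.0], [0.0]]
# The gradient has entries for rows 0 and 2 only, so row 1 is never touched.
sgd_mom_update_row_sparse(weight, mom, [0, 2], [[2.0], [4.0]], lr=0.5, momentum=0.5)
```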
source
# MXNet.mx._sparse_sgd_update
— Method.
_sparse_sgd_update(weight, grad, lr, wd, rescale_grad, clip_gradient)
_sparse_sgd_update is an alias of sgd_update.
Update function for Stochastic Gradient Descent (SGD) optimizer.
It updates the weights using::
weight = weight - learning_rate * gradient
If weight is of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated::
for row in gradient.indices:
    weight[row] = weight[row] - learning_rate * gradient[row]
Defined in src/operator/optimizer_op.cc:L222
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
lr::float, required
: Learning rate -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
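A compact sketch of the row-sparse update, including the gradient preprocessing named in the argument list. Two assumptions are made here and are not confirmed by this entry: the gradient is rescaled before it is clipped, and weight decay is ignored because the formulas above do not show how it enters. Both function names are hypothetical.

```python
def preprocess_grad(g, rescale_grad=1.0, clip_gradient=-1.0):
    """Rescale, then clip (assumed order); clipping is off when clip_gradient <= 0."""
    g = rescale_grad * g
    if clip_gradient > 0:
        g = max(min(g, clip_gradient), -clip_gradient)
    return g


def sgd_update_row_sparse(weight, grad_indices, grad_values, lr,
                          rescale_grad=1.0, clip_gradient=-1.0):
    """Apply weight[row] -= lr * grad[row] for the rows present in the gradient."""
    for row, grad_row in zip(grad_indices, grad_values):
        weight[row] = [w - lr * preprocess_grad(g, rescale_grad, clip_gradient)
                       for w, g in zip(weight[row], grad_row)]
    return weight


weight = [[1.0, 1.0], [2.0, 2.0]]
sgd_update_row_sparse(weight, [1], [[4.0, -4.0]], lr=0.5, clip_gradient=1.0)
```

With `clip_gradient=1.0` the gradient row `[4.0, -4.0]` is limited to `[1.0, -1.0]` before the step is taken, and row 0 is left unchanged.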
source
# MXNet.mx._sparse_sigmoid
— Method.
_sparse_sigmoid(data)
_sparse_sigmoid is an alias of sigmoid.
Computes sigmoid of x element-wise.
.. math:: y = 1 / (1 + exp(-x))
The storage type of $sigmoid$ output is always dense
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L102
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_sign
— Method.
_sparse_sign(data)
_sparse_sign is an alias of sign.
Returns element-wise sign of the input.
Example::
sign([-2, 0, 3]) = [-1, 0, 1]
The storage type of $sign$ output depends upon the input storage type:
- sign(default) = default
- sign(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L404
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_sin
— Method.
_sparse_sin(data)
_sparse_sin is an alias of sin.
Computes the element-wise sine of the input array.
The input should be in radians (2\pi rad equals 360 degrees).
.. math:: sin([0, \pi/4, \pi/2]) = [0, 0.707, 1]
The storage type of $sin$ output depends upon the input storage type:
- sin(default) = default
- sin(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L46
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_sinh
— Method.
_sparse_sinh(data)
_sparse_sinh is an alias of sinh.
Returns the hyperbolic sine of the input array, computed element-wise.
.. math:: sinh(x) = 0.5\times(exp(x) - exp(-x))
The storage type of $sinh$ output depends upon the input storage type:
- sinh(default) = default
- sinh(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L201
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_slice
— Method.
_sparse_slice(data, begin, end, step)
_sparse_slice is an alias of slice.
Slices a region of the array.
.. note:: $crop$ is deprecated. Use $slice$ instead.
This function returns a sliced array between the indices given by begin and end with the corresponding step.
For an input array of $shape=(d_0, d_1, ..., d_n-1)$, slice operation with $begin=(b_0, b_1...b_m-1)$, $end=(e_0, e_1, ..., e_m-1)$, and $step=(s_0, s_1, ..., s_m-1)$, where m <= n, results in an array with the shape $(|e_0-b_0|/|s_0|, ..., |e_m-1-b_m-1|/|s_m-1|, d_m, ..., d_n-1)$.
The resulting array's k-th dimension contains elements from the k-th dimension of the input array starting from index $b_k$ (inclusive) with step $s_k$ until reaching $e_k$ (exclusive).
If the k-th elements are None in the sequence of begin, end, and step, the following rule will be used to set default values. If s_k is None, set s_k=1. If s_k > 0, set b_k=0, e_k=d_k; else, set b_k=d_k-1, e_k=-1.
The storage type of $slice$ output depends on storage types of inputs
- slice(csr) = csr
- otherwise, $slice$ generates output with default storage
.. note:: When input data storage type is csr, it only supports step=(), or step=(None,), or step=(1,) to generate a csr output. For other step parameter values, it falls back to slicing a dense tensor.
Example::
x = [[ 1., 2., 3., 4.],
     [ 5., 6., 7., 8.],
     [ 9., 10., 11., 12.]]
slice(x, begin=(0,1), end=(2,4)) = [[ 2., 3., 4.],
                                    [ 6., 7., 8.]]
slice(x, begin=(None, 0), end=(None, 3), step=(-1, 2)) = [[9., 11.],
                                                          [5., 7.],
                                                          [1., 3.]]
Defined in src/operator/tensor/matrix_op.cc:L359
Arguments
-
data::NDArray-or-SymbolicNode
: Source input -
begin::Shape(tuple), required
: starting indices for the slice operation, supports negative indices. -
end::Shape(tuple), required
: ending indices for the slice operation, supports negative indices. -
step::Shape(tuple), optional, default=[]
: step for the slice operation, supports negative values.
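The begin/end/step rules closely resemble Python's built-in `slice` object, which also fills in direction-dependent defaults when an entry is None. The sketch below leans on that similarity for a 2-D list; `slice_2d` is a hypothetical helper, not the operator itself, and it relies on None rather than an explicit end of -1, which Python would interpret differently.

```python
def slice_2d(x, begin, end, step=()):
    """Slice a list of rows along axis 0 and, when given, along axis 1."""
    # Pad step with None so that missing entries use the default step of 1.
    step = tuple(step) + (None,) * (len(begin) - len(step))
    rows = x[slice(begin[0], end[0], step[0])]
    if len(begin) == 1:
        # Axes without begin/end entries are returned whole.
        return [list(r) for r in rows]
    cols = slice(begin[1], end[1], step[1])
    return [list(r[cols]) for r in rows]


x = [[1., 2., 3., 4.],
     [5., 6., 7., 8.],
     [9., 10., 11., 12.]]

forward = slice_2d(x, begin=(0, 1), end=(2, 4))
reverse = slice_2d(x, begin=(None, 0), end=(None, 3), step=(-1, 2))
```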
source
# MXNet.mx._sparse_sqrt
— Method.
_sparse_sqrt(data)
_sparse_sqrt is an alias of sqrt.
Returns element-wise square-root value of the input.
.. math:: \textrm{sqrt}(x) = \sqrt{x}
Example::
sqrt([4, 9, 16]) = [2, 3, 4]
The storage type of $sqrt$ output depends upon the input storage type:
- sqrt(default) = default
- sqrt(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L564
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_square
— Method.
_sparse_square(data)
_sparse_square is an alias of square.
Returns element-wise squared value of the input.
.. math:: square(x) = x^2
Example::
square([2, 3, 4]) = [4, 9, 16]
The storage type of $square$ output depends upon the input storage type:
- square(default) = default
- square(row_sparse) = row_sparse
- square(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L541
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_stop_gradient
— Method.
_sparse_stop_gradient(data)
_sparse_stop_gradient is an alias of BlockGrad.
Stops gradient computation.
Stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. In other words, this operator prevents the contribution of its inputs from being taken into account when computing gradients.
Example::
v1 = [1, 2]
v2 = [0, 1]
a = Variable('a')
b = Variable('b')
b_stop_grad = stop_gradient(3 * b)
loss = MakeLoss(b_stop_grad + a)
executor = loss.simple_bind(ctx=cpu(), a=(1,2), b=(1,2))
executor.forward(is_train=True, a=v1, b=v2)
executor.outputs
[ 1. 5.]
executor.backward()
executor.grad_arrays
[ 0. 0.]
[ 1. 1.]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L166
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_sum
— Method.
_sparse_sum(data, axis, keepdims, exclude)
_sparse_sum is an alias of sum.
Computes the sum of array elements over given axes.
.. Note::
sum and sum_axis are equivalent. For ndarray of csr storage type, summation along axis 0 and axis 1 is supported. Setting keepdims or exclude to True will cause a fallback to dense operator.
Example::
data = [[[1,2],[2,3],[1,3]],
        [[1,4],[4,3],[5,2]],
        [[7,1],[7,2],[7,3]]]
sum(data, axis=1)
[[ 4. 8.]
 [ 10. 9.]
 [ 21. 6.]]
sum(data, axis=[1,2])
[ 12. 19. 27.]
data = [[1,2,0],
        [3,0,1],
        [4,1,0]]
csr = cast_storage(data, 'csr')
sum(csr, axis=0)
[ 8. 3. 1.]
sum(csr, axis=1)
[ 3. 4. 5.]
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L85
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::Shape(tuple), optional, default=[]
: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is int, a reduction is performed on a particular axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, reduction will be performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left. -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axes are left in the result as dimensions with size one. -
exclude::boolean, optional, default=0
: Whether to perform reduction on the axes that are NOT in axis instead.
source
# MXNet.mx._sparse_tan
— Method.
_sparse_tan(data)
_sparse_tan is an alias of tan.
Computes the element-wise tangent of the input array.
The input should be in radians ($2\pi$ rad equals 360 degrees).
.. math:: tan([0, \pi/4, \pi/2]) = [0, 1, -inf]
The storage type of $tan$ output depends upon the input storage type:
- tan(default) = default
- tan(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L83
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_tanh
— Method.
_sparse_tanh(data)
_sparse_tanh is an alias of tanh.
Returns the hyperbolic tangent of the input array, computed element-wise.
.. math:: tanh(x) = sinh(x) / cosh(x)
The storage type of $tanh$ output depends upon the input storage type:
- tanh(default) = default
- tanh(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L234
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_trunc
— Method.
_sparse_trunc(data)
_sparse_trunc is an alias of trunc.
Return the element-wise truncated value of the input.
The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
Example::
trunc([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 1., 1., 2.]
The storage type of $trunc$ output depends upon the input storage type:
- trunc(default) = default
- trunc(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L502
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx._sparse_where
— Method.
_sparse_where(condition, x, y)
_sparse_where is an alias of where.
Return the elements, either from x or y, depending on the condition.
Given three ndarrays, condition, x, and y, return an ndarray with the elements from x or y, depending on the elements from condition are true or false. x and y must have the same shape. If condition has the same shape as x, each element in the output array is from x if the corresponding element in the condition is true, and from y if false.
If condition does not have the same shape as x, it must be a 1D array whose size is the same as x's first dimension size. Each row of the output array is from x's row if the corresponding element from condition is true, and from y's row if false.
Note that all non-zero values are interpreted as $True$ in condition.
Examples::
x = [[1, 2], [3, 4]] y = [[5, 6], [7, 8]] cond = [[0, 1], [-1, 0]]
where(cond, x, y) = [[5, 2], [3, 8]]
csr_cond = cast_storage(cond, 'csr')
where(csr_cond, x, y) = [[5, 2], [3, 8]]
Defined in src/operator/tensor/control_flow_op.cc:L57
Arguments
-
condition::NDArray-or-SymbolicNode
: condition array -
x::NDArray-or-SymbolicNode
: -
y::NDArray-or-SymbolicNode
:
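The two condition shapes described above can be illustrated with a minimal pure-Python sketch on nested lists. This is not the MXNet implementation; the helper below is hypothetical, handles only 2-D inputs, and treats any non-zero condition value as true.

```python
def where(condition, x, y):
    """Pick from x where condition is non-zero, otherwise from y (2-D lists)."""
    if len(x) != len(y) or any(len(xr) != len(yr) for xr, yr in zip(x, y)):
        raise ValueError("x and y must have the same shape")
    if condition and isinstance(condition[0], list):
        # Condition has the same shape as x: choose element by element.
        return [[xv if c != 0 else yv for c, xv, yv in zip(cr, xr, yr)]
                for cr, xr, yr in zip(condition, x, y)]
    # 1-D condition with one entry per row: choose whole rows.
    return [list(xr) if c != 0 else list(yr)
            for c, xr, yr in zip(condition, x, y)]


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
print(where([[0, 1], [-1, 0]], x, y))  # [[5, 2], [3, 8]]
print(where([1, 0], x, y))             # [[1, 2], [7, 8]]
```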
source
# MXNet.mx._sparse_zeros_like
— Method.
_sparse_zeros_like(data)
_sparse_zeros_like is an alias of zeros_like.
Return an array of zeros with the same shape and type as the input array.
The storage type of $zeros_like$ output depends on the storage type of the input
- zeros_like(row_sparse) = row_sparse
- zeros_like(csr) = csr
- zeros_like(default) = default
Examples::
x = [[ 1., 1., 1.], [ 1., 1., 1.]]
zeros_like(x) = [[ 0., 0., 0.], [ 0., 0., 0.]]
Arguments
-
data::NDArray-or-SymbolicNode
: The input
source
# MXNet.mx._square_sum
— Method.
_square_sum(data, axis, keepdims, exclude)
Computes the square sum of array elements over a given axis for row-sparse matrix. This is a temporary solution for fusing ops square and sum together for row-sparse matrix to save memory for storing gradients. It will become deprecated once the functionality of fusing operators is finished in the future.
Example::
dns = mx.nd.array([[0, 0], [1, 2], [0, 0], [3, 4], [0, 0]])
rsp = dns.tostype('row_sparse')
sum = mx.nd._internal._square_sum(rsp, axis=1)
sum = [0, 5, 0, 25, 0]
Defined in src/operator/tensor/square_sum.cc:L63
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::Shape(tuple), optional, default=[]
: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is int, a reduction is performed on a particular axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, reduction will be performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left. -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axes are left in the result as dimensions with size one. -
exclude::boolean, optional, default=0
: Whether to perform reduction on the axes that are NOT in axis instead.
source
# MXNet.mx.adam_update
— Method.
adam_update(weight, grad, mean, var, lr, beta1, beta2, epsilon, wd, rescale_grad, clip_gradient)
Update function for Adam optimizer. Adam is seen as a generalization of AdaGrad.
Adam update consists of the following steps, where g represents gradient and m, v are 1st and 2nd order moment estimates (mean and variance).
.. math::
g_t = \nabla J(W_{t-1})\\
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t\\
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2\\
W_t = W_{t-1} - \alpha \frac{ m_t }{ \sqrt{ v_t } + \epsilon }
It updates the weights using::
m = beta1*m + (1-beta1)*grad
v = beta2*v + (1-beta2)*(grad**2)
w += - learning_rate * m / (sqrt(v) + epsilon)
If w, m and v are all of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated (for w, m and v)::
for row in grad.indices:
    m[row] = beta1*m[row] + (1-beta1)*grad[row]
    v[row] = beta2*v[row] + (1-beta2)*(grad[row]**2)
    w[row] += - learning_rate * m[row] / (sqrt(v[row]) + epsilon)
Defined in src/operator/optimizer_op.cc:L383
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
mean::NDArray-or-SymbolicNode
: Moving mean -
var::NDArray-or-SymbolicNode
: Moving variance -
lr::float, required
: Learning rate -
beta1::float, optional, default=0.9
: The decay rate for the 1st moment estimates. -
beta2::float, optional, default=0.999
: The decay rate for the 2nd moment estimates. -
epsilon::float, optional, default=1e-08
: A small constant for numerical stability. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
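The update rule above can be sketched in plain Python on flat lists. This is an illustrative sketch, not the operator's implementation: it leaves out wd, rescale_grad and clip_gradient, and the `rows` parameter is a hypothetical stand-in for grad.indices in the row_sparse case.

```python
import math


def adam_update(w, g, m, v, lr, beta1=0.9, beta2=0.999, epsilon=1e-8, rows=None):
    """Apply one Adam step in place to the lists w, m and v.

    rows: optional iterable of indices to update (mimics the row_sparse
    case); None updates every entry.
    """
    for i in (range(len(w)) if rows is None else rows):
        m[i] = beta1 * m[i] + (1 - beta1) * g[i]          # 1st moment estimate
        v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2     # 2nd moment estimate
        w[i] -= lr * m[i] / (math.sqrt(v[i]) + epsilon)   # weight update
    return w, m, v


w, m, v = adam_update([1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [0.0, 0.0],
                      lr=0.1, rows=[0])
print(w, m, v)  # only index 0 has changed
```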
source
# MXNet.mx.add_n
— Method.
add_n(args)
Note: add_n takes variable number of positional inputs. So instead of calling as add_n([x, y, z], num_args=3), one should call via add_n(x, y, z), and num_args will be determined automatically.
Adds all input arguments element-wise.
.. math:: add_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n
$add_n$ is potentially more efficient than calling $add$ n times.
The storage type of $add_n$ output depends on storage types of inputs
- add_n(row_sparse, row_sparse, ..) = row_sparse
- otherwise, $add_n$ generates output with default storage
Defined in src/operator/tensor/elemwise_sum.cc:L123
Arguments
-
args::NDArray-or-SymbolicNode[]
: Positional input arguments
source
# MXNet.mx.add_to!
— Method.
add_to!(dst::NDArray, args::NDArrayOrReal...)
Add a bunch of arguments into dst. Inplace updating.
source
# MXNet.mx.argmax
— Method.
argmax(data, axis, keepdims)
Returns indices of the maximum values along an axis.
In the case of multiple occurrences of maximum values, the indices corresponding to the first occurrence are returned.
Examples::
x = [[ 0., 1., 2.], [ 3., 4., 5.]]
// argmax along axis 0
argmax(x, axis=0) = [ 1., 1., 1.]
// argmax along axis 1
argmax(x, axis=1) = [ 2., 2.]
// argmax along axis 1 keeping same dims as an input array
argmax(x, axis=1, keepdims=True) = [[ 2.], [ 2.]]
Defined in src/operator/tensor/broadcast_reduce_op_index.cc:L52
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::int or None, optional, default='None'
: The axis along which to perform the reduction. Negative values mean indexing from right to left. $Requires axis to be set as int, because global reduction is not supported yet.$ -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axis is left in the result as a dimension with size one.
source
# MXNet.mx.argmax_channel
— Method.
argmax_channel(data)
Returns argmax indices of each channel from the input array.
The result will be an NDArray of shape (num_channel,).
In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.
Examples::
x = [[ 0., 1., 2.], [ 3., 4., 5.]]
argmax_channel(x) = [ 2., 2.]
Defined in src/operator/tensor/broadcast_reduce_op_index.cc:L97
Arguments
-
data::NDArray-or-SymbolicNode
: The input array
source
# MXNet.mx.argmin
— Method.
argmin(data, axis, keepdims)
Returns indices of the minimum values along an axis.
In the case of multiple occurrences of minimum values, the indices corresponding to the first occurrence are returned.
Examples::
x = [[ 0., 1., 2.], [ 3., 4., 5.]]
// argmin along axis 0
argmin(x, axis=0) = [ 0., 0., 0.]
// argmin along axis 1
argmin(x, axis=1) = [ 0., 0.]
// argmin along axis 1 keeping same dims as an input array
argmin(x, axis=1, keepdims=True) = [[ 0.], [ 0.]]
Defined in src/operator/tensor/broadcast_reduce_op_index.cc:L77
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::int or None, optional, default='None'
: The axis along which to perform the reduction. Negative values mean indexing from right to left. $Requires axis to be set as int, because global reduction is not supported yet.$ -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axis is left in the result as a dimension with size one.
source
# MXNet.mx.argsort
— Method.
argsort(data, axis, is_ascend)
Returns the indices that would sort an input array along the given axis.
This function performs sorting along the given axis and returns an array of indices, with the same shape as the input array, that index the data in sorted order.
Examples::
x = [[ 0.3, 0.2, 0.4], [ 0.1, 0.3, 0.2]]
// sort along axis -1
argsort(x) = [[ 1., 0., 2.], [ 0., 2., 1.]]
// sort along axis 0
argsort(x, axis=0) = [[ 1., 0., 1.], [ 0., 1., 0.]]
// flatten and then sort
argsort(x, axis=None) = [ 3., 1., 5., 0., 4., 2.]
Defined in src/operator/tensor/ordering_op.cc:L176
Arguments
-
data::NDArray-or-SymbolicNode
: The input array -
axis::int or None, optional, default='-1'
: Axis along which to sort the input tensor. If set to None, the flattened array is used. Default is -1. -
is_ascend::boolean, optional, default=1
: Whether to sort in ascending or descending order.
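The two modes in the example (sorting along the last axis and sorting the flattened array) can be sketched in pure Python. The helper names are hypothetical, the sketch returns integer indices rather than floats, and how ties are ordered here is a property of Python's stable sort, not a statement about MXNet.

```python
def argsort_last_axis(x, is_ascend=True):
    """Indices that sort each row of a 2-D list (the axis=-1 case)."""
    return [sorted(range(len(row)), key=row.__getitem__, reverse=not is_ascend)
            for row in x]


def argsort_flat(x, is_ascend=True):
    """Indices that sort the flattened 2-D list (the axis=None case)."""
    flat = [value for row in x for value in row]
    return sorted(range(len(flat)), key=flat.__getitem__, reverse=not is_ascend)


x = [[0.3, 0.2, 0.4], [0.1, 0.3, 0.2]]
print(argsort_last_axis(x))  # [[1, 0, 2], [0, 2, 1]]
print(argsort_flat(x))       # [3, 1, 5, 0, 4, 2]
```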
source
# MXNet.mx.batch_dot
— Method.
batch_dot(lhs, rhs, transpose_a, transpose_b)
Batchwise dot product.
$batch_dot$ is used to compute dot product of $x$ and $y$ when $x$ and $y$ are data in batch, namely 3D arrays in shape of (batch_size, :, :).
For example, given $x$ with shape (batch_size, n, m) and $y$ with shape (batch_size, m, k), the result array will have shape (batch_size, n, k), which is computed by::
batch_dot(x,y)[i,:,:] = dot(x[i,:,:], y[i,:,:])
Defined in src/operator/tensor/dot.cc:L110
Arguments
-
lhs::NDArray-or-SymbolicNode
: The first input -
rhs::NDArray-or-SymbolicNode
: The second input -
transpose_a::boolean, optional, default=0
: If true then transpose the first input before dot. -
transpose_b::boolean, optional, default=0
: If true then transpose the second input before dot.
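As a rough illustration of the per-batch rule batch_dot(x,y)[i,:,:] = dot(x[i,:,:], y[i,:,:]), here is a pure-Python sketch on nested lists. The helpers `dot` and `transpose` are hypothetical stand-ins written for this example, not MXNet calls.

```python
def transpose(m):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def dot(a, b):
    """Plain matrix product of two 2-D nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def batch_dot(x, y, transpose_a=False, transpose_b=False):
    """Multiply matching matrices of two batches, one pair at a time."""
    out = []
    for xi, yi in zip(x, y):
        a = transpose(xi) if transpose_a else xi
        b = transpose(yi) if transpose_b else yi
        out.append(dot(a, b))
    return out


x = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
y = [[[5, 6], [7, 8]], [[2, 3], [4, 5]]]
print(batch_dot(x, y))  # [[[19, 22], [43, 50]], [[2, 3], [4, 5]]]
```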
source
# MXNet.mx.batch_take
— Method.
batch_take(a, indices)
Takes elements from a data batch.
.. note:: batch_take is deprecated. Use pick instead.
Given an input array of shape $(d0, d1)$ and indices of shape $(i0,)$, the result will be an output array of shape $(i0,)$ with::
output[i] = input[i, indices[i]]
Examples::
x = [[ 1., 2.], [ 3., 4.], [ 5., 6.]]
// takes elements with specified indices
batch_take(x, [0,1,0]) = [ 1. 4. 5.]
Defined in src/operator/tensor/indexing_op.cc:L426
Arguments
-
a::NDArray-or-SymbolicNode
: The input array -
indices::NDArray-or-SymbolicNode
: The index array
source
# MXNet.mx.cast
— Method.
cast(data, dtype)
cast is an alias of Cast.
Casts all elements of the input to a new type.
.. note:: $Cast$ is deprecated. Use $cast$ instead.
Example::
cast([0.9, 1.3], dtype='int32') = [0, 1]
cast([1e20, 11.1], dtype='float16') = [inf, 11.09375]
cast([300, 11.1, 10.9, -1, -3], dtype='uint8') = [44, 11, 10, 255, 253]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L310
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
dtype::{'float16', 'float32', 'float64', 'int32', 'uint8'}, required
: Output data type.
source
# MXNet.mx.cast_storage
— Method.
cast_storage(data, stype)
Casts tensor storage type to the new type.
When an NDArray with default storage type is cast to csr or row_sparse storage, the result is compact, which means:
- for csr, zero values will not be retained
- for row_sparse, row slices of all zeros will not be retained
The storage type of $cast_storage$ output depends on stype parameter:
- cast_storage(csr, 'default') = default
- cast_storage(row_sparse, 'default') = default
- cast_storage(default, 'csr') = csr
- cast_storage(default, 'row_sparse') = row_sparse
Example::
dense = [[ 0., 1., 0.],
[ 2., 0., 3.],
[ 0., 0., 0.],
[ 0., 0., 0.]]
# cast to row_sparse storage type
rsp = cast_storage(dense, 'row_sparse')
rsp.indices = [0, 1]
rsp.values = [[ 0., 1., 0.],
[ 2., 0., 3.]]
# cast to csr storage type
csr = cast_storage(dense, 'csr')
csr.indices = [1, 0, 2]
csr.values = [ 1., 2., 3.]
csr.indptr = [0, 1, 3, 3, 3]
Defined in src/operator/tensor/cast_storage.cc:L69
Arguments
-
data::NDArray-or-SymbolicNode
: The input. -
stype::{'csr', 'default', 'row_sparse'}, required
: Output storage type.
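To make the compact layouts in the example concrete, the following pure-Python sketch converts a dense nested list into csr-style and row_sparse-style components. The function names are hypothetical and the tuples returned are only an illustration of the values, indices and indptr fields shown above, not MXNet's storage classes.

```python
def dense_to_csr(dense):
    """Return (values, indices, indptr) with zero entries dropped."""
    values, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                values.append(value)
                indices.append(col)
        indptr.append(len(values))  # running count of stored entries per row
    return values, indices, indptr


def dense_to_row_sparse(dense):
    """Return (values, indices) keeping only rows with a non-zero entry."""
    indices = [i for i, row in enumerate(dense) if any(v != 0 for v in row)]
    values = [list(dense[i]) for i in indices]
    return values, indices


dense = [[0., 1., 0.],
         [2., 0., 3.],
         [0., 0., 0.],
         [0., 0., 0.]]
print(dense_to_csr(dense))         # ([1.0, 2.0, 3.0], [1, 0, 2], [0, 1, 3, 3, 3])
print(dense_to_row_sparse(dense))  # ([[0.0, 1.0, 0.0], [2.0, 0.0, 3.0]], [0, 1])
```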
source
# MXNet.mx.choose_element_0index
— Method.
choose_element_0index(lhs, rhs)
Choose one element from each line (row for Python, column for R/Julia) in lhs according to the index indicated by rhs. This function assumes rhs uses 0-based indexing.
Arguments
-
lhs::NDArray
: Left operand to the function. -
rhs::NDArray
: Right operand to the function.
source
# MXNet.mx.concat
— Method.
concat(data, num_args, dim)
concat is an alias of Concat.
Note: concat takes variable number of positional inputs. So instead of calling as concat([x, y, z], num_args=3), one should call via concat(x, y, z), and num_args will be determined automatically.
Joins input arrays along a given axis.
.. note:: Concat is deprecated. Use concat instead.
The dimensions of the input arrays should be the same except the axis along which they will be concatenated. The dimension of the output array along the concatenated axis will be equal to the sum of the corresponding dimensions of the input arrays.
Example::
x = [[1,1],[2,2]]
y = [[3,3],[4,4],[5,5]]
z = [[6,6], [7,7],[8,8]]
concat(x,y,z,dim=0) = [[ 1., 1.], [ 2., 2.], [ 3., 3.], [ 4., 4.], [ 5., 5.], [ 6., 6.], [ 7., 7.], [ 8., 8.]]
Note that you cannot concat x,y,z along dimension 1 since dimension 0 is not the same for all the input arrays.
concat(y,z,dim=1) = [[ 3., 3., 6., 6.], [ 4., 4., 7., 7.], [ 5., 5., 8., 8.]]
Defined in src/operator/concat.cc:L104
Arguments
-
data::NDArray-or-SymbolicNode[]
: List of arrays to concatenate -
num_args::int, required
: Number of inputs to be concatenated. -
dim::int, optional, default='1'
: The dimension along which to concatenate.
source
# MXNet.mx.crop
— Method.
crop(data, begin, end, step)
crop is an alias of slice.
Slices a region of the array.
.. note:: $crop$ is deprecated. Use $slice$ instead.
This function returns a sliced array between the indices given by begin and end with the corresponding step.
For an input array of $shape=(d_0, d_1, ..., d_{n-1})$, slice operation with $begin=(b_0, b_1, ..., b_{m-1})$, $end=(e_0, e_1, ..., e_{m-1})$, and $step=(s_0, s_1, ..., s_{m-1})$, where m <= n, results in an array with the shape $(|e_0-b_0|/|s_0|, ..., |e_{m-1}-b_{m-1}|/|s_{m-1}|, d_m, ..., d_{n-1})$.
The resulting array's k-th dimension contains elements from the k-th dimension of the input array starting from index $b_k$ (inclusive) with step $s_k$ until reaching $e_k$ (exclusive).
If the k-th elements are None in the sequence of begin, end, and step, the following rule will be used to set default values. If s_k is None, set s_k=1. If s_k > 0, set b_k=0, e_k=d_k; else, set b_k=d_k-1, e_k=-1.
The storage type of $slice$ output depends on storage types of inputs
- slice(csr) = csr
- otherwise, $slice$ generates output with default storage
.. note:: When input data storage type is csr, it only supports step=(), or step=(None,), or step=(1,) to generate a csr output. For other step parameter values, it falls back to slicing a dense tensor.
Example::
x = [[ 1., 2., 3., 4.], [ 5., 6., 7., 8.], [ 9., 10., 11., 12.]]
slice(x, begin=(0,1), end=(2,4)) = [[ 2., 3., 4.], [ 6., 7., 8.]]
slice(x, begin=(None, 0), end=(None, 3), step=(-1, 2)) = [[9., 11.], [5., 7.], [1., 3.]]
Defined in src/operator/tensor/matrix_op.cc:L359
Arguments
-
data::NDArray-or-SymbolicNode
: Source input -
begin::Shape(tuple), required
: starting indices for the slice operation, supports negative indices. -
end::Shape(tuple), required
: ending indices for the slice operation, supports negative indices. -
step::Shape(tuple), optional, default=[]
: step for the slice operation, supports negative values.
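Python's built-in slice objects apply the same kind of defaults when begin, end or step is None, so the two examples above can be reproduced with a short pure-Python sketch. The helper `slice_2d` is hypothetical and only covers 2-D nested lists; it is not the MXNet operator.

```python
def slice_2d(x, begin, end, step=(None, None)):
    """Slice a 2-D nested list; None entries fall back to default bounds."""
    row_sel = slice(begin[0], end[0], step[0])
    col_sel = slice(begin[1], end[1], step[1])
    return [row[col_sel] for row in x[row_sel]]


x = [[1., 2., 3., 4.],
     [5., 6., 7., 8.],
     [9., 10., 11., 12.]]
print(slice_2d(x, begin=(0, 1), end=(2, 4)))
# [[2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
print(slice_2d(x, begin=(None, 0), end=(None, 3), step=(-1, 2)))
# [[9.0, 11.0], [5.0, 7.0], [1.0, 3.0]]
```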
source
# MXNet.mx.degrees
— Method.
degrees(data)
Converts each element of the input array from radians to degrees.
.. math:: degrees([0, \pi/2, \pi, 3\pi/2, 2\pi]) = [0, 90, 180, 270, 360]
The storage type of $degrees$ output depends upon the input storage type:
- degrees(default) = default
- degrees(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L163
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.div_from!
— Method.
div_from!(dst::NDArray, arg::NDArrayOrReal)
Elementwise divide dst by a scalar or an NDArray of the same shape. Inplace updating.
source
# MXNet.mx.elemwise_add
— Method.
elemwise_add(lhs, rhs)
Adds arguments element-wise.
The storage type of $elemwise_add$ output depends on storage types of inputs
- elemwise_add(row_sparse, row_sparse) = row_sparse
- elemwise_add(csr, csr) = csr
- otherwise, $elemwise_add$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx.elemwise_div
— Method.
elemwise_div(lhs, rhs)
Divides arguments element-wise.
The storage type of $elemwise_div$ output is always dense
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx.elemwise_mul
— Method.
elemwise_mul(lhs, rhs)
Multiplies arguments element-wise.
The storage type of $elemwise_mul$ output depends on storage types of inputs
- elemwise_mul(default, default) = default
- elemwise_mul(row_sparse, row_sparse) = row_sparse
- elemwise_mul(default, row_sparse) = default
- elemwise_mul(row_sparse, default) = default
- elemwise_mul(csr, csr) = csr
- otherwise, $elemwise_mul$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx.elemwise_sub
— Method.
elemwise_sub(lhs, rhs)
Subtracts arguments element-wise.
The storage type of $elemwise_sub$ output depends on storage types of inputs
- elemwise_sub(row_sparse, row_sparse) = row_sparse
- elemwise_sub(csr, csr) = csr
- otherwise, $elemwise_sub$ generates output with default storage
Arguments
-
lhs::NDArray-or-SymbolicNode
: first input -
rhs::NDArray-or-SymbolicNode
: second input
source
# MXNet.mx.fill
— Method.
fill(x, dims, ctx=cpu())
fill(x, dims...)
Create an NDArray filled with the value x, like Base.fill.
source
# MXNet.mx.fill_element_0index
— Method.
fill_element_0index(lhs, mhs, rhs)
Fill one element of each line (row for Python, column for R/Julia) in lhs according to the index indicated by rhs and the values indicated by mhs. This function assumes rhs uses 0-based indexing.
Arguments
-
lhs::NDArray
: Left operand to the function. -
mhs::NDArray
: Middle operand to the function. -
rhs::NDArray
: Right operand to the function.
source
# MXNet.mx.fix
— Method.
fix(data)
Returns element-wise rounded value to the nearest integer towards zero of the input.
Example::
fix([-2.1, -1.9, 1.9, 2.1]) = [-2., -1., 1., 2.]
The storage type of $fix$ output depends upon the input storage type:
- fix(default) = default
- fix(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L520
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.flatten
— Method.
flatten(data)
flatten is an alias of Flatten.
Flattens the input array into a 2-D array by collapsing the higher dimensions.
.. note:: Flatten is deprecated. Use flatten instead.
For an input array with shape $(d1, d2, ..., dk)$, the flatten operation reshapes the input array into an output array of shape $(d1, d2*...*dk)$.
Note that the behavior of this function is different from numpy.ndarray.flatten, which behaves similarly to mxnet.ndarray.reshape((-1,)).
Example::
x = [[
[1,2,3],
[4,5,6],
[7,8,9]
],
[ [1,2,3],
[4,5,6],
[7,8,9]
]]
flatten(x) = [[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[ 1., 2., 3., 4., 5., 6., 7., 8., 9.]]
Defined in src/operator/tensor/matrix_op.cc:L212
Arguments
-
data::NDArray-or-SymbolicNode
: Input array.
source
# MXNet.mx.flip
— Method.
flip(data, axis)
flip is an alias of reverse.
Reverses the order of elements along given axis while preserving array shape.
Note: reverse and flip are equivalent. We use reverse in the following examples.
Examples::
x = [[ 0., 1., 2., 3., 4.], [ 5., 6., 7., 8., 9.]]
reverse(x, axis=0) = [[ 5., 6., 7., 8., 9.], [ 0., 1., 2., 3., 4.]]
reverse(x, axis=1) = [[ 4., 3., 2., 1., 0.], [ 9., 8., 7., 6., 5.]]
Defined in src/operator/tensor/matrix_op.cc:L665
Arguments
-
data::NDArray-or-SymbolicNode
: Input data array -
axis::Shape(tuple), required
: The axis along which to reverse elements.
source
# MXNet.mx.ftml_update
— Method.
ftml_update(weight, grad, d, v, z, lr, beta1, beta2, epsilon, wd, rescale_grad, clip_gradient)
The FTML optimizer described in FTML - Follow the Moving Leader in Deep Learning, available at http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf.
.. math::
g_t = \nabla J(W_{t-1})\\
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2\\
d_t = \frac{ (1 - \beta_1^t) }{ \eta_t } (\sqrt{ \frac{ v_t }{ 1 - \beta_2^t } } + \epsilon)\\
\sigma_t = d_t - \beta_1 d_{t-1}\\
z_t = \beta_1 z_{ t-1 } + (1 - \beta_1^t) g_t - \sigma_t W_{t-1}\\
W_t = - \frac{ z_t }{ d_t }
Defined in src/operator/optimizer_op.cc:L336
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
d::NDArray-or-SymbolicNode
: Internal state $d_t$ -
v::NDArray-or-SymbolicNode
: Internal state $v_t$ -
z::NDArray-or-SymbolicNode
: Internal state $z_t$ -
lr::float, required
: Learning rate -
beta1::float, optional, default=0.9
: The decay rate for the 1st moment estimates. -
beta2::float, optional, default=0.999
: The decay rate for the 2nd moment estimates. -
epsilon::float, optional, default=1e-08
: A small constant for numerical stability. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
source
# MXNet.mx.ftrl_update
— Method.
ftrl_update(weight, grad, z, n, lr, lamda1, beta, wd, rescale_grad, clip_gradient)
Update function for Ftrl optimizer. Referenced from Ad Click Prediction: a View from the Trenches, available at http://dl.acm.org/citation.cfm?id=2488200.
It updates the weights using::
rescaled_grad = clip(grad * rescale_grad, clip_gradient)
z += rescaled_grad - (sqrt(n + rescaled_grad**2) - sqrt(n)) * weight / learning_rate
n += rescaled_grad**2
w = (sign(z) * lamda1 - z) / ((beta + sqrt(n)) / learning_rate + wd) * (abs(z) > lamda1)
If w, z and n are all of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated (for w, z and n)::
for row in grad.indices:
    rescaled_grad[row] = clip(grad[row] * rescale_grad, clip_gradient)
    z[row] += rescaled_grad[row] - (sqrt(n[row] + rescaled_grad[row]**2) - sqrt(n[row])) * weight[row] / learning_rate
    n[row] += rescaled_grad[row]**2
    w[row] = (sign(z[row]) * lamda1 - z[row]) / ((beta + sqrt(n[row])) / learning_rate + wd) * (abs(z[row]) > lamda1)
Defined in src/operator/optimizer_op.cc:L520
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
z::NDArray-or-SymbolicNode
: z -
n::NDArray-or-SymbolicNode
: Square of grad -
lr::float, required
: Learning rate -
lamda1::float, optional, default=0.01
: The L1 regularization coefficient. -
beta::float, optional, default=1
: Per-Coordinate Learning Rate beta. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
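The dense update rule above can be followed step by step in a small pure-Python sketch on flat lists. It is an illustration under simplifying assumptions, not the operator's code: clipping is applied only when clip_gradient > 0, as the argument description states, and the `(abs(z) > lamda1)` factor is written as an explicit branch.

```python
import math


def ftrl_update(w, g, z, n, lr, lamda1=0.01, beta=1.0, wd=0.0,
                rescale_grad=1.0, clip_gradient=-1.0):
    """Apply one FTRL step in place to the lists w, z and n."""
    for i in range(len(w)):
        rg = g[i] * rescale_grad
        if clip_gradient > 0:
            rg = max(min(rg, clip_gradient), -clip_gradient)
        # z uses the old n and the old weight, so update it first.
        z[i] += rg - (math.sqrt(n[i] + rg ** 2) - math.sqrt(n[i])) * w[i] / lr
        n[i] += rg ** 2
        if abs(z[i]) > lamda1:
            sign = 1.0 if z[i] > 0 else -1.0
            w[i] = (sign * lamda1 - z[i]) / ((beta + math.sqrt(n[i])) / lr + wd)
        else:
            w[i] = 0.0  # the L1 term keeps small coordinates at exactly zero
    return w, z, n


w, z, n = ftrl_update([0.0, 0.0], [1.0, 0.005], [0.0, 0.0], [0.0, 0.0], lr=0.1)
print(w, z, n)
```

The second coordinate stays at zero because its accumulated z does not exceed lamda1, which is the sparsity-inducing behavior of the L1 coefficient.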
source
# MXNet.mx.gammaln
— Method.
gammaln(data)
Returns element-wise log of the absolute value of the gamma function of the input.
The storage type of $gammaln$ output is always dense
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.gather_nd
— Method.
gather_nd(data, indices)
Gather elements or slices from data and store to a tensor whose shape is defined by indices.
Given data with shape (X_0, X_1, ..., X_{N-1}) and indices with shape (M, Y_0, ..., Y_{K-1}), the output will have shape (Y_0, ..., Y_{K-1}, X_M, ..., X_{N-1}), where M <= N. If M == N, output shape will simply be (Y_0, ..., Y_{K-1}).
The elements in output are defined as follows::
output[y_0, ..., y_{K-1}, x_M, ..., x_{N-1}] = data[indices[0, y_0, ..., y_{K-1}], ..., indices[M-1, y_0, ..., y_{K-1}], x_M, ..., x_{N-1}]
Examples::
data = [[0, 1], [2, 3]]
indices = [[1, 1, 0], [0, 1, 0]]
gather_nd(data, indices) = [2, 3, 0]
Arguments
-
data::NDArray-or-SymbolicNode
: data -
indices::NDArray-or-SymbolicNode
: indices
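The indexing rule can be traced with a pure-Python sketch restricted to a 2-D indices array of shape (M, Y_0). The function below is a hypothetical illustration on nested lists, not the MXNet operator; when M is smaller than the number of data dimensions it returns whole slices.

```python
def gather_nd(data, indices):
    """indices has shape (M, Y_0): column y holds the M leading indices into data."""
    out = []
    for y in range(len(indices[0])):
        item = data
        for m in range(len(indices)):
            item = item[indices[m][y]]  # descend one dimension per index row
        out.append(item)
    return out


data = [[0, 1], [2, 3]]
print(gather_nd(data, [[1, 1, 0], [0, 1, 0]]))  # [2, 3, 0]
print(gather_nd(data, [[1, 0]]))                # [[2, 3], [0, 1]]
```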
source
# MXNet.mx.is_shared
— Method.
is_shared(j_arr, arr)
Test whether j_arr is sharing data with arr.
Arguments:
-
j_arr::Array
: the Julia Array. -
arr::NDArray
: the NDArray.
source
# MXNet.mx.khatri_rao
— Method.
khatri_rao(args)
Note: khatri_rao takes variable number of positional inputs. So instead of calling as khatri_rao([x, y, z], num_args=3), one should call via khatri_rao(x, y, z), and num_args will be determined automatically.
Computes the Khatri-Rao product of the input matrices.
Given a collection of $n$ input matrices,
.. math:: A_1 \in \mathbb{R}^{M_1 \times N}, \ldots, A_n \in \mathbb{R}^{M_n \times N},
the (column-wise) Khatri-Rao product is defined as the matrix,
.. math:: X = A_1 \otimes \cdots \otimes A_n \in \mathbb{R}^{(M_1 \cdots M_n) \times N},
where the $k$th column is equal to the column-wise outer product ${A_1}_k \otimes \cdots \otimes {A_n}_k$, where ${A_i}_k$ is the $k$th column of the $i$th matrix.
Example::
A = mx.nd.array([[1, -1], [2, -3]])
B = mx.nd.array([[1, 4], [2, 5], [3, 6]])
C = mx.nd.khatri_rao(A, B)
print(C.asnumpy())
[[ 1. -4.]
[ 2. -5.]
[ 3. -6.]
[ 2. -12.]
[ 4. -15.]
[ 6. -18.]]
Defined in src/operator/contrib/krprod.cc:L108
Arguments
-
args::NDArray-or-SymbolicNode[]
: Positional input matrices
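The column-wise definition can be checked against the example with a pure-Python sketch on nested lists. This is an illustrative helper, not the contrib operator; it assumes every input has the same number of columns and raises otherwise.

```python
def khatri_rao(*matrices):
    """Column-wise Khatri-Rao product of 2-D nested lists with equal column counts."""
    n_cols = len(matrices[0][0])
    if any(len(m[0]) != n_cols for m in matrices):
        raise ValueError("all matrices must have the same number of columns")
    result = [[1] * n_cols]
    for m in matrices:
        # For every column k, combine each existing row with each row of m.
        result = [[r[k] * row[k] for k in range(n_cols)]
                  for r in result for row in m]
    return result


A = [[1, -1], [2, -3]]
B = [[1, 4], [2, 5], [3, 6]]
print(khatri_rao(A, B))
# [[1, -4], [2, -5], [3, -6], [2, -12], [4, -15], [6, -18]]
```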
source
# MXNet.mx.linalg_gelqf
— Method.
linalg_gelqf(A)
linalg_gelqf is an alias of _linalg_gelqf.
LQ factorization for general matrix. Input is a tensor A of dimension n >= 2.
If n=2, we compute the LQ factorization (LAPACK gelqf, followed by orglq). A must have shape (x, y) with x <= y, and must have full rank =x. The LQ factorization consists of L with shape (x, x) and Q with shape (x, y), so that:
A = L * Q
Here, L is lower triangular (upper triangle equal to zero) with nonzero diagonal, and Q is row-orthonormal, meaning that
Q * Q^T
is equal to the identity matrix of shape (x, x).
If n>2, gelqf is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single LQ factorization
A = [[1., 2., 3.], [4., 5., 6.]]
Q, L = gelqf(A)
Q = [[-0.26726124, -0.53452248, -0.80178373], [0.87287156, 0.21821789, -0.43643578]]
L = [[-3.74165739, 0.], [-8.55235974, 1.96396101]]
// Batch LQ factorization
A = [[[1., 2., 3.], [4., 5., 6.]], [[7., 8., 9.], [10., 11., 12.]]]
Q, L = gelqf(A)
Q = [[[-0.26726124, -0.53452248, -0.80178373], [0.87287156, 0.21821789, -0.43643578]], [[-0.50257071, -0.57436653, -0.64616234], [0.7620735, 0.05862104, -0.64483142]]]
L = [[[-3.74165739, 0.], [-8.55235974, 1.96396101]], [[-13.92838828, 0.], [-19.09768702, 0.52758934]]]
Defined in src/operator/tensor/la_op.cc:L529
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices to be factorized
source
# MXNet.mx.linalg_gemm
— Method.
linalg_gemm(A, B, C, transpose_a, transpose_b, alpha, beta)
linalg_gemm is an alias of _linalg_gemm.
Performs general matrix multiplication and accumulation. Input are tensors A, B, C, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, the BLAS3 function gemm is performed:
out = alpha * op(A) * op(B) + beta * C
Here, alpha and beta are scalar parameters, and op() is either the identity or matrix transposition (depending on transpose_a, transpose_b).
If n>2, gemm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply-add
A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
gemm(A, B, C, transpose_b=True, alpha=2.0, beta=10.0) = [[14.0, 14.0, 14.0], [14.0, 14.0, 14.0]]
// Batch matrix multiply-add
A = [[[1.0, 1.0]], [[0.1, 0.1]]]
B = [[[1.0, 1.0]], [[0.1, 0.1]]]
C = [[[10.0]], [[0.01]]]
gemm(A, B, C, transpose_b=True, alpha=2.0, beta=10.0) = [[[104.0]], [[0.14]]]
Defined in src/operator/tensor/la_op.cc:L69
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
B::NDArray-or-SymbolicNode
: Tensor of input matrices -
C::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose_a::boolean, optional, default=0
: Multiply with the transpose of the first input (A). -
transpose_b::boolean, optional, default=0
: Multiply with the transpose of the second input (B). -
alpha::double, optional, default=1
: Scalar factor multiplied with A*B. -
beta::double, optional, default=1
: Scalar factor multiplied with C.
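For the n=2 case, the formula out = alpha * op(A) * op(B) + beta * C can be written out in pure Python to check the single-matrix example. The helpers are hypothetical and operate on nested lists; the real operator calls BLAS and also handles batches.

```python
def transpose(m):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two 2-D nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gemm(A, B, C, transpose_a=False, transpose_b=False, alpha=1.0, beta=1.0):
    """Return alpha * op(A) * op(B) + beta * C for 2-D nested lists."""
    a = transpose(A) if transpose_a else A
    b = transpose(B) if transpose_b else B
    ab = matmul(a, b)
    return [[alpha * ab[i][j] + beta * C[i][j] for j in range(len(C[0]))]
            for i in range(len(C))]


A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(gemm(A, B, C, transpose_b=True, alpha=2.0, beta=10.0))
# [[14.0, 14.0, 14.0], [14.0, 14.0, 14.0]]
```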
source
# MXNet.mx.linalg_gemm2
— Method.
linalg_gemm2(A, B, transpose_a, transpose_b, alpha)
linalg_gemm2 is an alias of _linalg_gemm2.
Performs general matrix multiplication. Input are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, the BLAS3 function gemm is performed:
out = alpha * op(A) * op(B)
Here alpha is a scalar parameter and op() is either the identity or the matrix transposition (depending on transpose_a, transpose_b).
If n>2, gemm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply
A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
gemm2(A, B, transpose_b=True, alpha=2.0) = [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0]]
// Batch matrix multiply
A = [[[1.0, 1.0]], [[0.1, 0.1]]]
B = [[[1.0, 1.0]], [[0.1, 0.1]]]
gemm2(A, B, transpose_b=True, alpha=2.0) = [[[4.0]], [[0.04]]]
Defined in src/operator/tensor/la_op.cc:L128
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
B::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose_a::boolean, optional, default=0
: Multiply with the transpose of the first input (A). -
transpose_b::boolean, optional, default=0
: Multiply with the transpose of the second input (B). -
alpha::double, optional, default=1
: Scalar factor multiplied with A*B.
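Batch mode (n>2) applies the same product independently to each trailing two-dimensional slice. A minimal sketch in plain Python, assuming tensors are nested lists; `gemm2_matrix` and `gemm2_batch` are hypothetical names used only for illustration.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def gemm2_matrix(a, b, transpose_a=False, transpose_b=False, alpha=1.0):
    """out = alpha * op(A) * op(B) for one matrix pair."""
    op_a = transpose(a) if transpose_a else a
    op_b = transpose(b) if transpose_b else b
    cols = transpose(op_b)
    return [[alpha * sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in op_a]


def gemm2_batch(a, b, **options):
    """Recurse over leading dimensions until only a matrix remains."""
    # A matrix is a list of rows of numbers; one more level of nesting
    # means a batch of matrices.
    if isinstance(a[0][0], list):
        return [gemm2_batch(x, y, **options) for x, y in zip(a, b)]
    return gemm2_matrix(a, b, **options)


A = [[[1.0, 1.0]], [[0.1, 0.1]]]
B = [[[1.0, 1.0]], [[0.1, 0.1]]]
out = gemm2_batch(A, B, transpose_b=True, alpha=2.0)
print(out)  # approximately [[[4.0]], [[0.04]]], up to floating-point rounding
```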
source
# MXNet.mx.linalg_potrf
— Method.
linalg_potrf(A)
linalg_potrf is an alias of _linalg_potrf.
Performs Cholesky factorization of a symmetric positive-definite matrix. Input is a tensor A of dimension n >= 2.
If n=2, the Cholesky factor L of the symmetric, positive definite matrix A is computed. L is lower triangular (entries of upper triangle are all zero), has positive diagonal entries, and:
A = L * L^T
If n>2, potrf is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix factorization
A = [[4.0, 1.0], [1.0, 4.25]]
potrf(A) = [[2.0, 0], [0.5, 2.0]]
// Batch matrix factorization
A = [[[4.0, 1.0], [1.0, 4.25]], [[16.0, 4.0], [4.0, 17.0]]]
potrf(A) = [[[2.0, 0], [0.5, 2.0]], [[4.0, 0], [1.0, 4.0]]]
Defined in src/operator/tensor/la_op.cc:L178
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices to be decomposed
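To make the factorization concrete, here is a textbook Cholesky routine in plain Python. It is a sketch of the mathematics only, not the LAPACK-backed code the operator calls; `cholesky_lower` is a hypothetical name.

```python
import math


def cholesky_lower(a):
    """Return the lower-triangular L with A = L * L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the columns of L that are already known.
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


single = cholesky_lower([[4.0, 1.0], [1.0, 4.25]])
# Batch mode: factorize each trailing matrix separately.
batch = [cholesky_lower(m) for m in [[[4.0, 1.0], [1.0, 4.25]],
                                     [[16.0, 4.0], [4.0, 17.0]]]]
print(single)  # [[2.0, 0.0], [0.5, 2.0]]
```

The sketch does no validation; a matrix that is not positive definite makes `math.sqrt` fail on a negative argument.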
source
# MXNet.mx.linalg_potri
— Method.
linalg_potri(A)
linalg_potri is an alias of _linalg_potri.
Performs matrix inversion from a Cholesky factorization. Input is a tensor A of dimension n >= 2.
If n=2, A is a lower triangular matrix (entries of upper triangle are all zero) with positive diagonal. We compute:
out = A^-T * A^-1
In other words, if A is the Cholesky factor of a symmetric positive definite matrix B (obtained by potrf), then
out = B^-1
If n>2, potri is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
.. note:: Use this operator only if you are certain you need the inverse of B, and cannot use the Cholesky factor A (potrf), together with backsubstitution (trsm). The latter is numerically much safer, and also cheaper.
Examples::
// Single matrix inverse
A = [[2.0, 0], [0.5, 2.0]]
potri(A) = [[0.26563, -0.0625], [-0.0625, 0.25]]
// Batch matrix inverse
A = [[[2.0, 0], [0.5, 2.0]], [[4.0, 0], [1.0, 4.0]]]
potri(A) = [[[0.26563, -0.0625], [-0.0625, 0.25]], [[0.06641, -0.01562], [-0.01562, 0.0625]]]
Defined in src/operator/tensor/la_op.cc:L236
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices
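The formula out = A^-T * A^-1 can be reproduced with a short plain-Python sketch: invert the triangular factor by forward substitution, then multiply the inverse's transpose by the inverse. `invert_lower` and `potri_reference` are hypothetical names, and this is not how the operator is implemented internally.

```python
def invert_lower(a):
    """Invert a lower-triangular matrix by forward substitution."""
    n = len(a)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1.0 if row == col else 0.0
            s = sum(a[row][k] * inv[k][col] for k in range(col, row))
            inv[row][col] = (rhs - s) / a[row][row]
    return inv


def potri_reference(a):
    """out = A^-T * A^-1, the inverse of B = A * A^T."""
    n = len(a)
    inv = invert_lower(a)
    # (A^-T * A^-1)[i][j] = sum over k of inv[k][i] * inv[k][j]
    return [[sum(inv[k][i] * inv[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


out = potri_reference([[2.0, 0.0], [0.5, 2.0]])
print(out)  # [[0.265625, -0.0625], [-0.0625, 0.25]]
```

The documented value 0.26563 is this 0.265625 rounded to five decimals.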
source
# MXNet.mx.linalg_sumlogdiag
— Method.
linalg_sumlogdiag(A)
linalg_sumlogdiag is an alias of _linalg_sumlogdiag.
Computes the sum of the logarithms of the diagonal elements of a square matrix. Input is a tensor A of dimension n >= 2.
If n=2, A must be square with positive diagonal entries. We sum the natural logarithms of the diagonal elements; the result has shape (1,).
If n>2, sumlogdiag is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix reduction
A = [[1.0, 1.0], [1.0, 7.0]]
sumlogdiag(A) = [1.9459]
// Batch matrix reduction
A = [[[1.0, 1.0], [1.0, 7.0]], [[3.0, 0], [0, 17.0]]]
sumlogdiag(A) = [1.9459, 3.9318]
Defined in src/operator/tensor/la_op.cc:L405
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of square matrices
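The reduction itself is small enough to state directly. A plain-Python sketch, with `sumlogdiag_reference` as a hypothetical name:

```python
import math


def sumlogdiag_reference(a):
    """Sum of the natural logarithms of the diagonal of a square matrix."""
    return sum(math.log(a[i][i]) for i in range(len(a)))


single = sumlogdiag_reference([[1.0, 1.0], [1.0, 7.0]])
# Batch mode: one value per trailing matrix.
batch = [sumlogdiag_reference(m) for m in [[[1.0, 1.0], [1.0, 7.0]],
                                           [[3.0, 0.0], [0.0, 17.0]]]]
print(round(single, 4))  # 1.9459, i.e. log(1) + log(7)
```

One common use, stated here as a mathematical fact rather than as library guidance: if L is the Cholesky factor of B, then log det(B) = 2 * sumlogdiag(L).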
source
# MXNet.mx.linalg_syrk
— Method.
linalg_syrk(A, transpose, alpha)
linalg_syrk is an alias of _linalg_syrk.
Multiplication of matrix with its transpose. Input is a tensor A of dimension n >= 2.
If n=2, the operator performs the BLAS3 function syrk:
out = alpha * A * A^T
if transpose=False, or
out = alpha * A^T * A
if transpose=True.
If n>2, syrk is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix multiply
A = [[1., 2., 3.], [4., 5., 6.]]
syrk(A, alpha=1., transpose=False) = [[14., 32.], [32., 77.]]
syrk(A, alpha=1., transpose=True) = [[17., 22., 27.], [22., 29., 36.], [27., 36., 45.]]
// Batch matrix multiply
A = [[[1., 1.]], [[0.1, 0.1]]]
syrk(A, alpha=2., transpose=False) = [[[4.]], [[0.04]]]
Defined in src/operator/tensor/la_op.cc:L461
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of input matrices -
transpose::boolean, optional, default=0
: Use transpose of input matrix. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
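Both variants reduce to dot products between rows (or columns) of A, as the following plain-Python sketch shows. `syrk_reference` is a hypothetical name and the code only mirrors the formulas above.

```python
def syrk_reference(a, transpose=False, alpha=1.0):
    """alpha * A * A^T if transpose is False, else alpha * A^T * A."""
    # For A^T * A the "rows" are the columns of A.
    rows = [list(col) for col in zip(*a)] if transpose else a
    # Entry (i, j) is the dot product of rows i and j of the chosen operand.
    return [[alpha * sum(x * y for x, y in zip(r, s)) for s in rows]
            for r in rows]


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
outer = syrk_reference(A, transpose=False)
inner = syrk_reference(A, transpose=True)
print(outer)  # [[14.0, 32.0], [32.0, 77.0]]
print(inner)  # [[17.0, 22.0, 27.0], [22.0, 29.0, 36.0], [27.0, 36.0, 45.0]]
```

The result is always symmetric, with shape (rows, rows) for transpose=False and (columns, columns) for transpose=True.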
source
# MXNet.mx.linalg_trmm
— Method.
linalg_trmm(A, B, transpose, rightside, alpha)
linalg_trmm is an alias of _linalg_trmm.
Performs multiplication with a lower triangular matrix. Inputs are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, A must be lower triangular. The operator performs the BLAS3 function trmm:
out = alpha * op(A) * B
if rightside=False, or
out = alpha * B * op(A)
if rightside=True. Here, alpha is a scalar parameter, and op() is either the identity or the matrix transposition (depending on transpose).
If n>2, trmm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single triangular matrix multiply
A = [[1.0, 0], [1.0, 1.0]]
B = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
trmm(A, B, alpha=2.0) = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
// Batch triangular matrix multiply
A = [[[1.0, 0], [1.0, 1.0]], [[1.0, 0], [1.0, 1.0]]]
B = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
trmm(A, B, alpha=2.0) = [[[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]], [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]]
Defined in src/operator/tensor/la_op.cc:L293
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices -
B::NDArray-or-SymbolicNode
: Tensor of matrices -
transpose::boolean, optional, default=0
: Use the transpose of the triangular matrix. -
rightside::boolean, optional, default=0
: Multiply triangular matrix from the right to non-triangular one. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
source
# MXNet.mx.linalg_trsm
— Method.
linalg_trsm(A, B, transpose, rightside, alpha)
linalg_trsm is an alias of _linalg_trsm.
Solves a matrix equation involving a lower triangular matrix. Inputs are tensors A, B, each of dimension n >= 2 and having the same shape on the leading n-2 dimensions.
If n=2, A must be lower triangular. The operator performs the BLAS3 function trsm, solving for out in:
op(A) * out = alpha * B
if rightside=False, or
out * op(A) = alpha * B
if rightside=True. Here, alpha is a scalar parameter, and op() is either the identity or the matrix transposition (depending on transpose).
If n>2, trsm is performed separately on the trailing two dimensions for all inputs (batch mode).
.. note:: The operator supports float32 and float64 data types only.
Examples::
// Single matrix solve
A = [[1.0, 0], [1.0, 1.0]]
B = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
trsm(A, B, alpha=0.5) = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
// Batch matrix solve
A = [[[1.0, 0], [1.0, 1.0]], [[1.0, 0], [1.0, 1.0]]]
B = [[[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]], [[4.0, 4.0, 4.0], [8.0, 8.0, 8.0]]]
trsm(A, B, alpha=0.5) = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]
Defined in src/operator/tensor/la_op.cc:L356
Arguments
-
A::NDArray-or-SymbolicNode
: Tensor of lower triangular matrices -
B::NDArray-or-SymbolicNode
: Tensor of matrices -
transpose::boolean, optional, default=0
: Use the transpose of the triangular matrix. -
rightside::boolean, optional, default=0
: Multiply triangular matrix from the right to non-triangular one. -
alpha::double, optional, default=1
: Scalar factor to be applied to the result.
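For the default case (transpose=False, rightside=False) the solve is a forward substitution, row by row. The sketch below is plain Python with a hypothetical name, `trsm_reference`; the transposed and right-side variants are left out.

```python
def trsm_reference(a, b, alpha=1.0):
    """Solve A * out = alpha * B for lower-triangular A."""
    n, m = len(a), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Subtract the part of row i explained by rows already solved.
            s = sum(a[i][k] * out[k][j] for k in range(i))
            out[i][j] = (alpha * b[i][j] - s) / a[i][i]
    return out


A = [[1.0, 0.0], [1.0, 1.0]]
B = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
solution = trsm_reference(A, B, alpha=0.5)
print(solution)  # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

Only the lower triangle of A is read, so the entries above the diagonal have no effect in this sketch.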
source
# MXNet.mx.load
— Method.
load(filename, ::Type{NDArray})
Load NDArrays from binary file.
Arguments:
-
filename::String
: the path of the file to load. It can be an S3 or HDFS address.
Returns either Dict{Symbol, NDArray} or Vector{NDArray}.
filename can point to s3 or hdfs resources if the libmxnet is built with the corresponding components enabled. Examples:
s3://my-bucket/path/my-s3-ndarray
hdfs://my-bucket/path/my-hdfs-ndarray
/path-to/my-local-ndarray
source
# MXNet.mx.make_loss
— Method.
make_loss(data)
Make your own loss function in network construction.
This operator accepts a customized loss function symbol as a terminal loss and the symbol should be an operator with no backward dependency. The output of this function is the gradient of loss with respect to the input data.
For example, suppose you are making a cross entropy loss function. Assume $out$ is the predicted output and $label$ is the true label; then the cross entropy can be defined as::
cross_entropy = label * log(out) + (1 - label) * log(1 - out)
loss = make_loss(cross_entropy)
We will need to use $make_loss$ when we are creating our own loss function or we want to combine multiple loss functions. Also we may want to stop some variables' gradients from backpropagation. See more detail in $BlockGrad$ or $stop_gradient$.
The storage type of $make_loss$ output depends upon the input storage type:
- make_loss(default) = default
- make_loss(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L199
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.mod_from!
— Method.
mod_from!(x::NDArray, y::NDArray)
mod_from!(x::NDArray, y::Real)
Elementwise modulo for NDArray. In-place updating.
source
# MXNet.mx.mp_sgd_mom_update
— Method.
mp_sgd_mom_update(weight, grad, mom, weight32, lr, momentum, wd, rescale_grad, clip_gradient)
Updater function for the multi-precision SGD optimizer.
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
mom::NDArray-or-SymbolicNode
: Momentum -
weight32::NDArray-or-SymbolicNode
: Weight32 -
lr::float, required
: Learning rate -
momentum::float, optional, default=0
: The decay rate of momentum estimates at each epoch. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
source
# MXNet.mx.mp_sgd_update
— Method.
mp_sgd_update(weight, grad, weight32, lr, wd, rescale_grad, clip_gradient)
Updater function for the multi-precision SGD optimizer.
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
weight32::NDArray-or-SymbolicNode
: Weight32 -
lr::float, required
: Learning rate -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
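The rescale_grad and clip_gradient rules can be sketched for a single scalar gradient. The argument list gives the two formulas but not the order in which they are applied; rescaling first and clipping second is an assumption made here, and `preprocess_gradient` is a hypothetical helper, not an MXNet function.

```python
def preprocess_gradient(grad, rescale_grad=1.0, clip_gradient=-1.0):
    """Apply the documented rescale and clip formulas to one value."""
    # Assumption: the rescale step runs before the clip step.
    g = rescale_grad * grad
    if clip_gradient > 0:
        # grad = max(min(grad, clip_gradient), -clip_gradient)
        g = max(min(g, clip_gradient), -clip_gradient)
    return g


print(preprocess_gradient(3.0, clip_gradient=1.0))    # 1.0
print(preprocess_gradient(-3.0, clip_gradient=1.0))   # -1.0
print(preprocess_gradient(3.0))                       # 3.0, clipping is off
```

With the default clip_gradient=-1 the condition is false, so the gradient passes through unchanged apart from rescaling.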
source
# MXNet.mx.mul_to!
— Method.
mul_to!(dst::NDArray, arg::NDArrayOrReal)
Elementwise multiplication into dst of either a scalar or an NDArray of the same shape. In-place updating.
source
# MXNet.mx.nanprod
— Method.
nanprod(data, axis, keepdims, exclude)
Computes the product of array elements over given axes, treating Not-a-Number values ($NaN$) as one.
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L146
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::Shape(tuple), optional, default=[]
: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is an int, a reduction is performed on that axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, the reduction is performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left. -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axes are left in the result as dimensions with size one. -
exclude::boolean, optional, default=0
: Whether to perform the reduction on the axes that are NOT in axis instead.
source
# MXNet.mx.nansum
— Method.
nansum(data, axis, keepdims, exclude)
Computes the sum of array elements over given axes, treating Not-a-Number values ($NaN$) as zero.
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L131
Arguments
-
data::NDArray-or-SymbolicNode
: The input -
axis::Shape(tuple), optional, default=[]
: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is an int, a reduction is performed on that axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, the reduction is performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left. -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axes are left in the result as dimensions with size one. -
exclude::boolean, optional, default=0
: Whether to perform the reduction on the axes that are NOT in axis instead.
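The NaN handling of nansum and nanprod amounts to replacing each NaN with the identity element of the reduction. A plain-Python sketch over a flat list, which corresponds to the default reduction over all elements; `nansum_reference` and `nanprod_reference` are hypothetical names and axis handling is omitted.

```python
import math


def nansum_reference(values):
    """Sum of the values, counting every NaN as zero."""
    return sum(0.0 if math.isnan(v) else v for v in values)


def nanprod_reference(values):
    """Product of the values, counting every NaN as one."""
    result = 1.0
    for v in values:
        if not math.isnan(v):
            result *= v
    return result


data = [2.0, float("nan"), 3.0]
print(nansum_reference(data))   # 5.0
print(nanprod_reference(data))  # 6.0
```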
source
# MXNet.mx.negative
— Method.
negative(data)
Numerical negative of the argument, element-wise.
The storage type of $negative$ output depends upon the input storage type:
- negative(default) = default
- negative(row_sparse) = row_sparse
- negative(csr) = csr
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.normal
— Method.
normal(loc, scale, shape, ctx, dtype)
normal is an alias of _random_normal.
Draw random samples from a normal (Gaussian) distribution.
.. note:: The existing alias $normal$ is deprecated.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
Example::
normal(loc=0, scale=1, shape=(2,2)) = [[ 1.89171135, -1.16881478], [-1.23474145, 1.55807114]]
Defined in src/operator/random/sample_op.cc:L85
Arguments
-
loc::float, optional, default=0
: Mean of the distribution. -
scale::float, optional, default=1
: Standard deviation of the distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.one_hot
— Method.
one_hot(indices, depth, on_value, off_value, dtype)
Returns a one-hot array.
The locations represented by indices take value on_value, while all other locations take value off_value.
one_hot operation with indices of shape $(i0, i1)$ and depth of $d$ would result in an output array of shape $(i0, i1, d)$ with::
output[i,j,:] = off_value
output[i,j,indices[i,j]] = on_value
Examples::
one_hot([1,0,2,0], 3) = [[ 0. 1. 0.]
[ 1. 0. 0.]
[ 0. 0. 1.]
[ 1. 0. 0.]]
one_hot([1,0,2,0], 3, on_value=8, off_value=1, dtype='int32') = [[1 8 1]
[8 1 1]
[1 1 8]
[8 1 1]]
one_hot([[1,0],[1,0],[2,0]], 3) = [[[ 0. 1. 0.]
[ 1. 0. 0.]]
[[ 0. 1. 0.]
[ 1. 0. 0.]]
[[ 0. 0. 1.]
[ 1. 0. 0.]]]
Defined in src/operator/tensor/indexing_op.cc:L472
Arguments
-
indices::NDArray-or-SymbolicNode
: array of locations where to set on_value -
depth::int, required
: Depth of the one hot dimension. -
on_value::double, optional, default=1
: The value assigned to the locations represented by indices. -
off_value::double, optional, default=0
: The value assigned to the locations not represented by indices. -
dtype::{'float16', 'float32', 'float64', 'int32', 'uint8'},optional, default='float32'
: DType of the output
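The rule for one-dimensional indices can be written out in a few lines of plain Python. `one_hot_reference` is a hypothetical name, and the sketch assumes every index lies in 0..depth-1; behaviour for out-of-range indices is not modelled.

```python
def one_hot_reference(indices, depth, on_value=1.0, off_value=0.0):
    """One row per index: off_value everywhere, on_value at the index."""
    rows = []
    for idx in indices:
        row = [off_value] * depth
        row[idx] = on_value
        rows.append(row)
    return rows


default = one_hot_reference([1, 0, 2, 0], 3)
custom = one_hot_reference([1, 0, 2, 0], 3, on_value=8, off_value=1)
print(default)  # [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
print(custom)   # [[1, 8, 1], [8, 1, 1], [1, 1, 8], [8, 1, 1]]
```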
source
# MXNet.mx.ones
— Method.
ones([DType], dims, [ctx::Context = cpu()])
ones([DType], dims...)
ones(x::NDArray)
Create an NDArray with a specific shape and type, initialized with 1.
source
# MXNet.mx.ones_like
— Method.
ones_like(data)
Return an array of ones with the same shape and type as the input array.
Examples::
x = [[ 0., 0., 0.], [ 0., 0., 0.]]
ones_like(x) = [[ 1., 1., 1.], [ 1., 1., 1.]]
Arguments
-
data::NDArray-or-SymbolicNode
: The input
source
# MXNet.mx.pad
— Method.
pad(data, mode, pad_width, constant_value)
pad is an alias of Pad.
Pads an input array with a constant or edge values of the array.
.. note:: Pad is deprecated. Use pad instead.
.. note:: Current implementation only supports 4D and 5D input arrays with padding applied only on axes 1, 2 and 3. Expects axes 4 and 5 in pad_width to be zero.
This operation pads an input array with either a constant_value or edge values along each axis of the input array. The amount of padding is specified by pad_width.
pad_width is a tuple of integer padding widths for each axis of the format $(before_1, after_1, ... , before_N, after_N)$. The pad_width should be of length $2*N$ where $N$ is the number of dimensions of the array.
For dimension $N$ of the input array, $before_N$ and $after_N$ indicate how many values to add before and after the elements of the array along dimension $N$. The widths of the higher two dimensions $before_1$, $after_1$, $before_2$, $after_2$ must be 0.
Example::
x = [[[[ 1. 2. 3.] [ 4. 5. 6.]]
[[ 7. 8. 9.]
[ 10. 11. 12.]]]
[[[ 11. 12. 13.]
[ 14. 15. 16.]]
[[ 17. 18. 19.]
[ 20. 21. 22.]]]]
pad(x,mode="edge", pad_width=(0,0,0,0,1,1,1,1)) =
[[[[ 1. 1. 2. 3. 3.]
[ 1. 1. 2. 3. 3.]
[ 4. 4. 5. 6. 6.]
[ 4. 4. 5. 6. 6.]]
[[ 7. 7. 8. 9. 9.]
[ 7. 7. 8. 9. 9.]
[ 10. 10. 11. 12. 12.]
[ 10. 10. 11. 12. 12.]]]
[[[ 11. 11. 12. 13. 13.]
[ 11. 11. 12. 13. 13.]
[ 14. 14. 15. 16. 16.]
[ 14. 14. 15. 16. 16.]]
[[ 17. 17. 18. 19. 19.]
[ 17. 17. 18. 19. 19.]
[ 20. 20. 21. 22. 22.]
[ 20. 20. 21. 22. 22.]]]]
pad(x, mode="constant", constant_value=0, pad_width=(0,0,0,0,1,1,1,1)) =
[[[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 0.]
[ 0. 4. 5. 6. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 7. 8. 9. 0.]
[ 0. 10. 11. 12. 0.]
[ 0. 0. 0. 0. 0.]]]
[[[ 0. 0. 0. 0. 0.]
[ 0. 11. 12. 13. 0.]
[ 0. 14. 15. 16. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 17. 18. 19. 0.]
[ 0. 20. 21. 22. 0.]
[ 0. 0. 0. 0. 0.]]]]
Defined in src/operator/pad.cc:L766
Arguments
-
data::NDArray-or-SymbolicNode
: An n-dimensional input array. -
mode::{'constant', 'edge', 'reflect'}, required
: Padding type to use. "constant" pads with constant_value; "edge" pads using the edge values of the input array; "reflect" pads by reflecting values with respect to the edges. -
pad_width::Shape(tuple), required
: Widths of the padding regions applied to the edges of each axis. It is a tuple of integer padding widths for each axis of the format $(before_1, after_1, ... , before_N, after_N)$. It should be of length $2*N$ where $N$ is the number of dimensions of the array. This is equivalent to pad_width in numpy.pad, but flattened. -
constant_value::double, optional, default=0
: The value used for padding when mode is "constant".
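The "edge" and "constant" modes are easiest to see on one two-dimensional slice, which is what the example above does to each trailing matrix of the 4D input. A plain-Python sketch with a hypothetical name, `pad_2d`; the "reflect" mode and the 4D/5D handling of the operator are not covered.

```python
def pad_2d(matrix, before_rows, after_rows, before_cols, after_cols,
           mode="constant", constant_value=0.0):
    """Pad one matrix (list of rows) on its two axes."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = []
    for r in range(-before_rows, n_rows + after_rows):
        row = []
        for c in range(-before_cols, n_cols + after_cols):
            if 0 <= r < n_rows and 0 <= c < n_cols:
                row.append(matrix[r][c])
            elif mode == "constant":
                row.append(constant_value)
            elif mode == "edge":
                # Clamp the coordinates to the nearest valid element.
                rr = min(max(r, 0), n_rows - 1)
                cc = min(max(c, 0), n_cols - 1)
                row.append(matrix[rr][cc])
            else:
                raise ValueError("sketch handles only 'constant' and 'edge'")
        out.append(row)
    return out


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
edge = pad_2d(x, 1, 1, 1, 1, mode="edge")
const = pad_2d(x, 1, 1, 1, 1, mode="constant", constant_value=0.0)
print(edge[0])   # [1.0, 1.0, 2.0, 3.0, 3.0]
print(const[1])  # [0.0, 1.0, 2.0, 3.0, 0.0]
```

The four width arguments play the role of the last four entries of pad_width, (1,1,1,1) in the example above.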
source
# MXNet.mx.pick
— Method.
pick(data, index, axis, keepdims)
Picks elements from an input array according to the input indices along the given axis.
Given an input array of shape $(d0, d1)$ and indices of shape $(i0,)$, the result will be an output array of shape $(i0,)$ with::
output[i] = input[i, indices[i]]
By default, if any index mentioned is too large, it is replaced by the index that addresses the last element along an axis (the clip mode).
This function supports n-dimensional input and (n-1)-dimensional indices arrays.
Examples::
x = [[ 1., 2.], [ 3., 4.], [ 5., 6.]]
// picks elements with specified indices along axis 0
pick(x, y=[0,1], 0) = [ 1., 4.]
// picks elements with specified indices along axis 1
pick(x, y=[0,1,0], 1) = [ 1., 4., 5.]
y = [[ 1.], [ 0.], [ 2.]]
// picks elements with specified indices along axis 1 and dims are maintained
pick(x,y, 1, keepdims=True) = [[ 2.], [ 3.], [ 6.]]
Defined in src/operator/tensor/broadcast_reduce_op_index.cc:L145
Arguments
-
data::NDArray-or-SymbolicNode
: The input array -
index::NDArray-or-SymbolicNode
: The index array -
axis::int or None, optional, default='None'
: The axis along which to perform the reduction. Negative values mean indexing from right to left. $Requires axis to be set as int, because global reduction is not supported yet.$ -
keepdims::boolean, optional, default=0
: If this is set to True, the reduced axis is left in the result as a dimension with size one.
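For a two-dimensional input the selection rule, including the clip behaviour, can be sketched in plain Python. `pick_reference` is a hypothetical name; clamping negative indices to zero is an assumption, since the text only describes indices that are too large, and keepdims is not modelled.

```python
def pick_reference(data, index, axis=1):
    """Pick one element per row (axis=1) or per column (axis=0)."""
    n_rows, n_cols = len(data), len(data[0])
    out = []
    if axis == 1:
        for i in range(n_rows):
            # Clip mode: out-of-range indices address the nearest valid column.
            j = min(max(int(index[i]), 0), n_cols - 1)
            out.append(data[i][j])
    elif axis == 0:
        for j in range(n_cols):
            i = min(max(int(index[j]), 0), n_rows - 1)
            out.append(data[i][j])
    else:
        raise ValueError("sketch supports axis 0 or 1 only")
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(pick_reference(x, [0, 1], axis=0))     # [1.0, 4.0]
print(pick_reference(x, [0, 1, 0], axis=1))  # [1.0, 4.0, 5.0]
print(pick_reference(x, [1, 0, 2], axis=1))  # [2.0, 3.0, 6.0]
```

In the last call the index 2 is out of range for a row of length 2, so it is clipped to 1 and the element 6.0 is returned, matching the keepdims example above.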
source
# MXNet.mx.radians
— Method.
radians(data)
Converts each element of the input array from degrees to radians.
.. math:: radians([0, 90, 180, 270, 360]) = [0, \pi/2, \pi, 3\pi/2, 2\pi]
The storage type of $radians$ output depends upon the input storage type:
- radians(default) = default
- radians(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L182
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.random_exponential
— Method.
random_exponential(lam, shape, ctx, dtype)
random_exponential is an alias of _random_exponential.
Draw random samples from an exponential distribution.
Samples are distributed according to an exponential distribution parametrized by lambda (rate).
Example::
exponential(lam=4, shape=(2,2)) = [[ 0.0097189 , 0.08999364], [ 0.04146638, 0.31715935]]
Defined in src/operator/random/sample_op.cc:L115
Arguments
-
lam::float, optional, default=1
: Lambda parameter (rate) of the exponential distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_gamma
— Method.
random_gamma(alpha, beta, shape, ctx, dtype)
random_gamma is an alias of _random_gamma.
Draw random samples from a gamma distribution.
Samples are distributed according to a gamma distribution parametrized by alpha (shape) and beta (scale).
Example::
gamma(alpha=9, beta=0.5, shape=(2,2)) = [[ 7.10486984, 3.37695289], [ 3.91697288, 3.65933681]]
Defined in src/operator/random/sample_op.cc:L100
Arguments
-
alpha::float, optional, default=1
: Alpha parameter (shape) of the gamma distribution. -
beta::float, optional, default=1
: Beta parameter (scale) of the gamma distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_generalized_negative_binomial
— Method.
random_generalized_negative_binomial(mu, alpha, shape, ctx, dtype)
random_generalized_negative_binomial is an alias of _random_generalized_negative_binomial.
Draw random samples from a generalized negative binomial distribution.
Samples are distributed according to a generalized negative binomial distribution parametrized by mu (mean) and alpha (dispersion). alpha is defined as 1/k where k is the failure limit of the number of unsuccessful experiments (generalized to real numbers). Samples will always be returned as a floating point data type.
Example::
generalized_negative_binomial(mu=2.0, alpha=0.3, shape=(2,2)) = [[ 2., 1.], [ 6., 4.]]
Defined in src/operator/random/sample_op.cc:L168
Arguments
-
mu::float, optional, default=1
: Mean of the negative binomial distribution. -
alpha::float, optional, default=1
: Alpha (dispersion) parameter of the negative binomial distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_negative_binomial
— Method.
random_negative_binomial(k, p, shape, ctx, dtype)
random_negative_binomial is an alias of _random_negative_binomial.
Draw random samples from a negative binomial distribution.
Samples are distributed according to a negative binomial distribution parametrized by k (limit of unsuccessful experiments) and p (failure probability in each experiment). Samples will always be returned as a floating point data type.
Example::
negative_binomial(k=3, p=0.4, shape=(2,2)) = [[ 4., 7.], [ 2., 5.]]
Defined in src/operator/random/sample_op.cc:L149
Arguments
-
k::int, optional, default='1'
: Limit of unsuccessful experiments. -
p::float, optional, default=1
: Failure probability in each experiment. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_normal
— Method.
random_normal(loc, scale, shape, ctx, dtype)
random_normal is an alias of _random_normal.
Draw random samples from a normal (Gaussian) distribution.
.. note:: The existing alias $normal$ is deprecated.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
Example::
normal(loc=0, scale=1, shape=(2,2)) = [[ 1.89171135, -1.16881478], [-1.23474145, 1.55807114]]
Defined in src/operator/random/sample_op.cc:L85
Arguments
-
loc::float, optional, default=0
: Mean of the distribution. -
scale::float, optional, default=1
: Standard deviation of the distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_poisson
— Method.
random_poisson(lam, shape, ctx, dtype)
random_poisson is an alias of _random_poisson.
Draw random samples from a Poisson distribution.
Samples are distributed according to a Poisson distribution parametrized by lambda (rate). Samples will always be returned as a floating point data type.
Example::
poisson(lam=4, shape=(2,2)) = [[ 5., 2.], [ 4., 6.]]
Defined in src/operator/random/sample_op.cc:L132
Arguments
-
lam::float, optional, default=1
: Lambda parameter (rate) of the Poisson distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.random_uniform
— Method.
random_uniform(low, high, shape, ctx, dtype)
random_uniform is an alias of _random_uniform.
Draw random samples from a uniform distribution.
.. note:: The existing alias $uniform$ is deprecated.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high).
Example::
uniform(low=0, high=1, shape=(2,2)) = [[ 0.60276335, 0.85794562], [ 0.54488319, 0.84725171]]
Defined in src/operator/random/sample_op.cc:L66
Arguments
-
low::float, optional, default=0
: Lower bound of the distribution. -
high::float, optional, default=1
: Upper bound of the distribution. -
shape::Shape(tuple), optional, default=[]
: Shape of the output. -
ctx::string, optional, default=''
: Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.rcbrt
— Method.
rcbrt(data)
Returns element-wise inverse cube-root value of the input.
.. math:: rcbrt(x) = 1/\sqrt[3]{x}
Example::
rcbrt([1,8,-125]) = [1.0, 0.5, -0.2]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L618
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
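A scalar sketch of the formula in plain Python. A naive `x ** (1/3)` returns a complex number for negative x in Python, so the sketch takes the root of the magnitude and restores the sign; `rcbrt_reference` is a hypothetical name.

```python
import math


def rcbrt_reference(x):
    """1 / cbrt(x), keeping the sign of x so negative inputs stay real."""
    root = math.copysign(abs(x) ** (1.0 / 3.0), x)
    return 1.0 / root


values = [rcbrt_reference(v) for v in [1.0, 8.0, -125.0]]
print(values)  # close to [1.0, 0.5, -0.2], up to floating-point rounding
```

An input of zero raises ZeroDivisionError in this sketch; how the operator itself treats zero is not stated in the text.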
source
# MXNet.mx.rdiv_from!
— Method.
rdiv_from!(x::Real, y::NDArray)
Elementwise divide a scalar by an NDArray. In-place updating.
source
# MXNet.mx.reciprocal
— Method.
reciprocal(data)
Returns the reciprocal of the argument, element-wise.
Calculates 1/x.
Example::
reciprocal([-2, 1, 3, 1.6, 0.2]) = [-0.5, 1.0, 0.33333334, 0.625, 5.0]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L363
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.reshape_like
— Method.
reshape_like(lhs, rhs)
Reshape lhs to have the same shape as rhs.
Arguments
-
lhs::NDArray-or-SymbolicNode
: First input. -
rhs::NDArray-or-SymbolicNode
: Second input.
source
# MXNet.mx.rint
— Method.
rint(data)
Returns element-wise rounded value to the nearest integer of the input.
.. note::
- For input $n.5$, $rint$ returns $n$ while $round$ returns $n+1$.
- For input $-n.5$, both $rint$ and $round$ return $-n-1$.
Example::
rint([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 1., -2., 2., 2.]
The storage type of $rint$ output depends upon the input storage type:
- rint(default) = default
- rint(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L444
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
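The two rules in the note can be reproduced with simple formulas in plain Python. These expressions are chosen to match the documented behaviour and are not taken from the operator's source; `rint_reference` and `round_reference` are hypothetical names.

```python
import math


def rint_reference(x):
    """Nearest integer, with ties going down: n.5 -> n, -n.5 -> -n-1."""
    return float(math.ceil(x - 0.5))


def round_reference(x):
    """Nearest integer, with ties going away from zero: n.5 -> n+1."""
    return math.copysign(float(math.floor(abs(x) + 0.5)), x)


inputs = [-1.5, 1.5, -1.9, 1.9, 2.1]
print([rint_reference(v) for v in inputs])   # [-2.0, 1.0, -2.0, 2.0, 2.0]
print([round_reference(v) for v in inputs])  # [-2.0, 2.0, -2.0, 2.0, 2.0]
```

The two functions differ only on positive half-way values such as 1.5.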
source
# MXNet.mx.rmod_from!
— Method.
rmod_from!(y::Real, x::NDArray)
Elementwise modulo for NDArray. In-place updating.
source
# MXNet.mx.rmsprop_update
— Method.
rmsprop_update(weight, grad, n, lr, gamma1, epsilon, wd, rescale_grad, clip_gradient, clip_weights)
Update function for RMSProp optimizer.
RMSProp is a variant of stochastic gradient descent where the gradients are divided by a cache which grows with the sum of squares of recent gradients.
RMSProp is similar to AdaGrad, a popular variant of SGD which adaptively tunes the learning rate of each parameter. AdaGrad lowers the learning rate for each parameter monotonically over the course of training. While this is analytically motivated for convex optimizations, it may not be ideal for non-convex problems. RMSProp deals with this heuristically by allowing the learning rates to rebound as the denominator decays over time.
Define the Root Mean Square (RMS) error criterion of the gradient as :math:RMS[g]_t = \sqrt{E[g^2]_t + \epsilon}, where :math:g represents gradient and :math:E[g^2]_t is the decaying average over past squared gradient.
The :math:E[g^2]_t is given by:
.. math:: E[g^2]_t = \gamma * E[g^2]_{t-1} + (1-\gamma) * g_t^2
The update step is
.. math:: \theta_{t+1} = \theta_t - \frac{\eta}{RMS[g]_t} g_t
The RMSProp code follows the version in http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf Tieleman & Hinton, 2012.
Hinton suggests the momentum term :math:\gamma
to be 0.9 and the learning rate :math:\eta
to be 0.001.
Defined in src/operator/optimizer_op.cc:L441
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
n::NDArray-or-SymbolicNode
: n -
lr::float, required
: Learning rate -
gamma1::float, optional, default=0.95
: The decay rate of momentum estimates. -
epsilon::float, optional, default=1e-08
: A small constant for numerical stability. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient). -
clip_weights::float, optional, default=-1
: Clip weights to the range of [-clip_weights, clip_weights]. If clip_weights <= 0, weight clipping is turned off. weights = max(min(weights, clip_weights), -clip_weights).
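The update equations in this entry can be sketched in plain Python as follows. This is an illustrative reference on lists of floats, not the library kernel: the name rmsprop_step is hypothetical, and wd, rescale_grad, clip_gradient and clip_weights are deliberately left out.

```python
import math

def rmsprop_step(weight, grad, n, lr, gamma1=0.95, epsilon=1e-8):
    """One RMSProp step on plain lists; returns (new_weight, new_n).

    Hypothetical helper for illustration only. It follows the formulas in
    this entry and ignores wd, rescale_grad and clipping.
    """
    new_weight, new_n = [], []
    for w, g, n_prev in zip(weight, grad, n):
        # E[g^2]_t = gamma * E[g^2]_{t-1} + (1 - gamma) * g_t^2
        n_t = gamma1 * n_prev + (1.0 - gamma1) * g * g
        # theta_{t+1} = theta_t - eta / sqrt(E[g^2]_t + epsilon) * g_t
        new_weight.append(w - lr * g / math.sqrt(n_t + epsilon))
        new_n.append(n_t)
    return new_weight, new_n
```

The running average n is returned alongside the weights because the caller has to carry it into the next step.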
source
# MXNet.mx.rmspropalex_update
— Method.
rmspropalex_update(weight, grad, n, g, delta, lr, gamma1, gamma2, epsilon, wd, rescale_grad, clip_gradient, clip_weights)
Update function for RMSPropAlex optimizer.
RMSPropAlex
is the centered version of RMSProp
.
Define :math:E[g^2]_t
as the decaying average over past squared gradients and :math:E[g]_t
as the decaying average over past gradients.
.. math:: E[g^2]_t = \gamma_1 * E[g^2]_{t-1} + (1 - \gamma_1) * g_t^2\\ E[g]_t = \gamma_1 * E[g]_{t-1} + (1 - \gamma_1) * g_t\\ \Delta_t = \gamma_2 * \Delta_{t-1} - \frac{\eta}{\sqrt{E[g^2]_t - E[g]_t^2 + \epsilon}} g_t
The update step is
.. math:: \theta_{t+1} = \theta_t + \Delta_t
The RMSPropAlex code follows the version in http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.
Graves suggests the momentum term :math:\gamma_1
to be 0.95, :math:\gamma_2
to be 0.9 and the learning rate :math:\eta
to be 0.0001.
Defined in src/operator/optimizer_op.cc:L480
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
n::NDArray-or-SymbolicNode
: n -
g::NDArray-or-SymbolicNode
: g -
delta::NDArray-or-SymbolicNode
: delta -
lr::float, required
: Learning rate -
gamma1::float, optional, default=0.95
: Decay rate. -
gamma2::float, optional, default=0.9
: Decay rate. -
epsilon::float, optional, default=1e-08
: A small constant for numerical stability. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient). -
clip_weights::float, optional, default=-1
: Clip weights to the range of [-clip_weights, clip_weights]. If clip_weights <= 0, weight clipping is turned off. weights = max(min(weights, clip_weights), -clip_weights).
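As a rough illustration of the equations in this entry, the sketch below performs one step on scalar values in plain Python. The name rmspropalex_step is hypothetical, and wd, rescale_grad and the clipping options are omitted, so this is not the library kernel.

```python
import math

def rmspropalex_step(weight, grad, n, g, delta, lr,
                     gamma1=0.95, gamma2=0.9, epsilon=1e-8):
    """One step for scalar values; returns (weight, n, g, delta)."""
    n = gamma1 * n + (1.0 - gamma1) * grad * grad   # E[g^2]_t
    g = gamma1 * g + (1.0 - gamma1) * grad          # E[g]_t
    # Delta_t = gamma_2 * Delta_{t-1} - eta / sqrt(E[g^2]_t - E[g]_t^2 + eps) * g_t
    delta = gamma2 * delta - lr * grad / math.sqrt(n - g * g + epsilon)
    return weight + delta, n, g, delta
```

The three state values n, g and delta are returned so that they can be passed back in on the next call.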
source
# MXNet.mx.rsqrt
— Method.
rsqrt(data)
Returns element-wise inverse square-root value of the input.
.. math:: rsqrt(x) = 1/\sqrt{x}
Example::
rsqrt([4,9,16]) = [0.5, 0.33333334, 0.25]
The storage type of $rsqrt$ output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L584
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.sample_exponential
— Method.
sample_exponential(lam, shape, dtype)
sample_exponential is an alias of _sample_exponential.
Concurrent sampling from multiple exponential distributions with parameters lambda (rate).
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Examples::
lam = [ 1.0, 8.5 ]
// Draw a single sample for each distribution
sample_exponential(lam) = [ 0.51837951, 0.09994757]
// Draw a vector containing two samples for each distribution
sample_exponential(lam, shape=(2)) = [[ 0.51837951, 0.19866663], [ 0.09994757, 0.50447971]]
Defined in src/operator/random/multisample_op.cc:L284
Arguments
-
lam::NDArray-or-SymbolicNode
: Lambda (rate) parameters of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
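The shape rule described in this entry (output shape [s]x[t]) can be sketched with Python's standard random module. This is only a model of the semantics for a 1-D input and a shape of at most one dimension; the function below is a stand-in written for this sketch, not the MXNet operator, and it ignores dtype.

```python
import random

def sample_exponential(lam, shape=()):
    """Draw samples for each rate in the 1-D list `lam`.

    With shape=() one sample is drawn per rate; with shape=(k,) each rate
    yields a list of k samples, so the result has shape [len(lam)] x [k].
    """
    if not shape:
        return [random.expovariate(rate) for rate in lam]
    (count,) = shape
    return [[random.expovariate(rate) for _ in range(count)] for rate in lam]

single = sample_exponential([1.0, 8.5])              # 2 values
multi = sample_exponential([1.0, 8.5], shape=(2,))   # 2 rows of 2 values
```

The other sample_* operators in this reference describe the same shape rule; only the distribution and its parameters differ.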
source
# MXNet.mx.sample_gamma
— Method.
sample_gamma(alpha, shape, dtype, beta)
sample_gamma is an alias of _sample_gamma.
Concurrent sampling from multiple gamma distributions with parameters alpha (shape) and beta (scale).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
alpha = [ 0.0, 2.5 ]
beta = [ 1.0, 0.7 ]
// Draw a single sample for each distribution
sample_gamma(alpha, beta) = [ 0. , 2.25797319]
// Draw a vector containing two samples for each distribution
sample_gamma(alpha, beta, shape=(2)) = [[ 0. , 0. ], [ 2.25797319, 1.70734084]]
Defined in src/operator/random/multisample_op.cc:L282
Arguments
-
alpha::NDArray-or-SymbolicNode
: Alpha (shape) parameters of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None). -
beta::NDArray-or-SymbolicNode
: Beta (scale) parameters of the distributions.
source
# MXNet.mx.sample_generalized_negative_binomial
— Method.
sample_generalized_negative_binomial(mu, shape, dtype, alpha)
sample_generalized_negative_binomial is an alias of _sample_generalized_negative_binomial.
Concurrent sampling from multiple generalized negative binomial distributions with parameters mu (mean) and alpha (dispersion).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Samples will always be returned as a floating point data type.
Examples::
mu = [ 2.0, 2.5 ]
alpha = [ 1.0, 0.1 ]
// Draw a single sample for each distribution
sample_generalized_negative_binomial(mu, alpha) = [ 0., 3.]
// Draw a vector containing two samples for each distribution
sample_generalized_negative_binomial(mu, alpha, shape=(2)) = [[ 0., 3.], [ 3., 1.]]
Defined in src/operator/random/multisample_op.cc:L293
Arguments
-
mu::NDArray-or-SymbolicNode
: Means of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None). -
alpha::NDArray-or-SymbolicNode
: Alpha (dispersion) parameters of the distributions.
source
# MXNet.mx.sample_multinomial
— Method.
sample_multinomial(data, shape, get_prob, dtype)
sample_multinomial is an alias of _sample_multinomial.
Concurrent sampling from multiple multinomial distributions.
data is an n dimensional array whose last dimension has length k, where k is the number of possible outcomes of each multinomial distribution. This operator will draw shape samples from each distribution. If shape is empty one sample will be drawn from each distribution.
If get_prob is true, a second array containing log likelihood of the drawn samples will also be returned. This is usually used for reinforcement learning where you can provide reward as head gradient for this array to estimate gradient.
Note that the input distribution must be normalized, i.e. data must sum to 1 along its last axis.
Examples::
probs = [[0, 0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1, 0]]
// Draw a single sample for each distribution
sample_multinomial(probs) = [3, 0]
// Draw a vector containing two samples for each distribution
sample_multinomial(probs, shape=(2)) = [[4, 2], [0, 0]]
// requests log likelihood
sample_multinomial(probs, get_prob=True) = [2, 1], [0.2, 0.3]
Arguments
-
data::NDArray-or-SymbolicNode
: Distribution probabilities. Must sum to one on the last axis. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
get_prob::boolean, optional, default=0
: Whether to also return the log probability of sampled result. This is usually used for differentiating through stochastic variables, e.g. in reinforcement learning. -
dtype::{'int32'},optional, default='int32'
: DType of the output in case this can't be inferred. Only support int32 for now.
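The sampling and get_prob behaviour described in this entry can be modelled in plain Python as below. The function is a stand-in written for this sketch, not the MXNet operator: it draws exactly one sample per row (the shape argument is not modelled) and assumes every row already sums to 1.

```python
import math
import random

def sample_multinomial(probs, get_prob=False):
    """Draw one outcome index per row of `probs` (rows must sum to 1)."""
    samples = [random.choices(range(len(row)), weights=row)[0] for row in probs]
    if not get_prob:
        return samples
    # Log likelihood of each drawn outcome under its own distribution.
    log_probs = [math.log(row[i]) for row, i in zip(probs, samples)]
    return samples, log_probs
```

Returning the log likelihood next to the samples is what lets a caller weight it by a reward, as in the reinforcement-learning use mentioned in this entry.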
source
# MXNet.mx.sample_negative_binomial
— Method.
sample_negative_binomial(k, shape, dtype, p)
sample_negative_binomial is an alias of _sample_negative_binomial.
Concurrent sampling from multiple negative binomial distributions with parameters k (failure limit) and p (failure probability).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Samples will always be returned as a floating point data type.
Examples::
k = [ 20, 49 ]
p = [ 0.4 , 0.77 ]
// Draw a single sample for each distribution
sample_negative_binomial(k, p) = [ 15., 16.]
// Draw a vector containing two samples for each distribution
sample_negative_binomial(k, p, shape=(2)) = [[ 15., 50.], [ 16., 12.]]
Defined in src/operator/random/multisample_op.cc:L289
Arguments
-
k::NDArray-or-SymbolicNode
: Limits of unsuccessful experiments. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None). -
p::NDArray-or-SymbolicNode
: Failure probabilities in each experiment.
source
# MXNet.mx.sample_normal
— Method.
sample_normal(mu, shape, dtype, sigma)
sample_normal is an alias of _sample_normal.
Concurrent sampling from multiple normal distributions with parameters mu (mean) and sigma (standard deviation).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
mu = [ 0.0, 2.5 ]
sigma = [ 1.0, 3.7 ]
// Draw a single sample for each distribution
sample_normal(mu, sigma) = [-0.56410581, 0.95934606]
// Draw a vector containing two samples for each distribution
sample_normal(mu, sigma, shape=(2)) = [[-0.56410581, 0.2928229 ], [ 0.95934606, 4.48287058]]
Defined in src/operator/random/multisample_op.cc:L279
Arguments
-
mu::NDArray-or-SymbolicNode
: Means of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None). -
sigma::NDArray-or-SymbolicNode
: Standard deviations of the distributions.
source
# MXNet.mx.sample_poisson
— Method.
sample_poisson(lam, shape, dtype)
sample_poisson is an alias of _sample_poisson.
Concurrent sampling from multiple Poisson distributions with parameters lambda (rate).
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Samples will always be returned as a floating point data type.
Examples::
lam = [ 1.0, 8.5 ]
// Draw a single sample for each distribution
sample_poisson(lam) = [ 0., 13.]
// Draw a vector containing two samples for each distribution
sample_poisson(lam, shape=(2)) = [[ 0., 4.], [ 13., 8.]]
Defined in src/operator/random/multisample_op.cc:L286
Arguments
-
lam::NDArray-or-SymbolicNode
: Lambda (rate) parameters of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
source
# MXNet.mx.sample_uniform
— Method.
sample_uniform(low, shape, dtype, high)
sample_uniform is an alias of _sample_uniform.
Concurrent sampling from multiple uniform distributions on the intervals given by [low,high).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Examples::
low = [ 0.0, 2.5 ]
high = [ 1.0, 3.7 ]
// Draw a single sample for each distribution
sample_uniform(low, high) = [ 0.40451524, 3.18687344]
// Draw a vector containing two samples for each distribution
sample_uniform(low, high, shape=(2)) = [[ 0.40451524, 0.18017688], [ 3.18687344, 3.68352246]]
Defined in src/operator/random/multisample_op.cc:L277
Arguments
-
low::NDArray-or-SymbolicNode
: Lower bounds of the distributions. -
shape::Shape(tuple), optional, default=[]
: Shape to be sampled from each random distribution. -
dtype::{'None', 'float16', 'float32', 'float64'},optional, default='None'
: DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None). -
high::NDArray-or-SymbolicNode
: Upper bounds of the distributions.
source
# MXNet.mx.save
— Method.
save(filename::AbstractString, data)
Save NDArrays to a binary file. Filename could be an S3 or HDFS address, if libmxnet
is built with corresponding support (see load
).
-
filename::String
: path to the binary file to write to. -
data
: data to save to file. Data can be an NDArray
, a Vector
of NDArray
s, or a Dict{Symbol}
containing NDArray
s.
source
# MXNet.mx.scatter_nd
— Method.
scatter_nd(data, indices, shape)
Scatters data into a new tensor according to indices.
Given data
with shape (Y_0, ..., Y_{K-1}, X_M, ..., X_{N-1})
and indices with shape (M, Y_0, ..., Y_{K-1})
, the output will have shape (X_0, X_1, ..., X_{N-1})
, where M <= N
. If M == N
, data shape should simply be (Y_0, ..., Y_{K-1})
.
The elements in the output are defined as follows::
output[indices[0, y_0, ..., y_{K-1}], ..., indices[M-1, y_0, ..., y_{K-1}], x_M, ..., x_{N-1}] = data[y_0, ..., y_{K-1}, x_M, ..., x_{N-1}]
All other entries in the output are 0.
.. warning::
If the indices have duplicates, the result will be non-deterministic and
the gradient of `scatter_nd` will not be correct!!
Examples::
data = [2, 3, 0]
indices = [[1, 1, 0], [0, 1, 0]]
shape = (2, 2)
scatter_nd(data, indices, shape) = [[0, 0], [2, 3]]
Arguments
-
data::NDArray-or-SymbolicNode
: data -
indices::NDArray-or-SymbolicNode
: indices -
shape::Shape(tuple), required
: Shape of output.
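The indexing rule in this entry can be traced with a small plain-Python model. It covers only the M == N case with 1-D data (so indices has shape (N, K)), works on nested lists, and is a stand-in written for this sketch rather than the MXNet operator.

```python
def scatter_nd(data, indices, shape):
    """Scatter 1-D `data` into a zero-filled nested list of `shape` (M == N)."""
    def zeros(dims):
        if len(dims) == 1:
            return [0] * dims[0]
        return [zeros(dims[1:]) for _ in range(dims[0])]

    out = zeros(list(shape))
    for y, value in enumerate(data):
        # Column y of `indices` holds the full output coordinate of data[y].
        coords = [axis_indices[y] for axis_indices in indices]
        target = out
        for c in coords[:-1]:
            target = target[c]
        target[coords[-1]] = value
    return out

result = scatter_nd([2, 3, 0], [[1, 1, 0], [0, 1, 0]], (2, 2))
# result == [[0, 0], [2, 3]]
```

In this model a duplicated coordinate simply keeps the last value written, which is one possible outcome of the non-determinism the warning in this entry describes.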
source
# MXNet.mx.sgd_mom_update
— Method.
sgd_mom_update(weight, grad, mom, lr, momentum, wd, rescale_grad, clip_gradient)
Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
Momentum update has better convergence rates on neural networks. Mathematically it looks like below:
.. math::
v_1 = -\alpha * \nabla J(W_0)\\
v_t = \gamma v_{t-1} - \alpha * \nabla J(W_{t-1})\\
W_t = W_{t-1} + v_t
It updates the weights using::
v = momentum * v - learning_rate * gradient
weight += v
Where the parameter $momentum$ is the decay rate of momentum estimates at each epoch.
If weight and grad are both of $row_sparse$ storage type and momentum is of $default$ storage type, standard update is applied.
If weight, grad and momentum are all of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated (for both weight and momentum)::
for row in gradient.indices:
    v[row] = momentum[row] * v[row] - learning_rate * gradient[row]
    weight[row] += v[row]
Defined in src/operator/optimizer_op.cc:L265
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
mom::NDArray-or-SymbolicNode
: Momentum -
lr::float, required
: Learning rate -
momentum::float, optional, default=0
: The decay rate of momentum estimates at each epoch. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
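The dense update rule in this entry can be written as a short plain-Python sketch. The name sgd_mom_step is hypothetical, the inputs are lists of floats, and wd, rescale_grad and clip_gradient are left out.

```python
def sgd_mom_step(weight, grad, mom, lr, momentum=0.0):
    """Dense momentum SGD step on lists; returns (new_weight, new_mom)."""
    # v = momentum * v - learning_rate * gradient
    new_mom = [momentum * v - lr * g for v, g in zip(mom, grad)]
    # weight += v
    new_weight = [w + v for w, v in zip(weight, new_mom)]
    return new_weight, new_mom

w, v = sgd_mom_step([1.0], [0.5], [0.0], lr=0.1, momentum=0.9)
w, v = sgd_mom_step(w, [0.5], v, lr=0.1, momentum=0.9)
```

Calling the step twice with the same gradient shows the velocity growing in magnitude, which is the accumulation effect that momentum is meant to provide.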
source
# MXNet.mx.sgd_update
— Method.
sgd_update(weight, grad, lr, wd, rescale_grad, clip_gradient)
Update function for Stochastic Gradient Descent (SGD) optimizer.
It updates the weights using::
weight = weight - learning_rate * gradient
If weight is of $row_sparse$ storage type, only the row slices whose indices appear in grad.indices are updated::
for row in gradient.indices:
    weight[row] = weight[row] - learning_rate * gradient[row]
Defined in src/operator/optimizer_op.cc:L222
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
lr::float, required
: Learning rate -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
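The row-wise behaviour described in this entry can be modelled in plain Python. In this sketch a dict from row index to gradient row stands in for a row_sparse gradient; that representation and the name sgd_update_rows are assumptions made for illustration, and wd, rescale_grad and clip_gradient are omitted.

```python
def sgd_update_rows(weight, grad_rows, lr):
    """Update, in place, only the rows of `weight` present in `grad_rows`.

    `grad_rows` maps a row index to that row's gradient, a stand-in for a
    row_sparse gradient (an assumption made for this sketch).
    """
    for row, grad in grad_rows.items():
        weight[row] = [w - lr * g for w, g in zip(weight[row], grad)]
    return weight

weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
sgd_update_rows(weights, {1: [1.0, 1.0]}, lr=0.5)
# Only row 1 changes: [[1.0, 1.0], [1.5, 1.5], [3.0, 3.0]]
```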
source
# MXNet.mx.signsgd_update
— Method.
signsgd_update(weight, grad, lr, wd, rescale_grad, clip_gradient)
Update function for SignSGD optimizer.
.. math::
g_t = \nabla J(W_{t-1})\\ W_t = W_{t-1} - \eta_t \text{sign}(g_t)
It updates the weights using::
weight = weight - learning_rate * sign(gradient)
.. note::
- sparse ndarray not supported for this optimizer yet.
Defined in src/operator/optimizer_op.cc:L55
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
lr::float, required
: Learning rate -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
source
# MXNet.mx.signum_update
— Method.
signum_update(weight, grad, mom, lr, momentum, wd, rescale_grad, clip_gradient, wd_lh)
SIGN momentUM (Signum) optimizer.
.. math::
g_t = \nabla J(W_{t-1})\\ m_t = \beta m_{t-1} + (1 - \beta) g_t\\ W_t = W_{t-1} - \eta_t \text{sign}(m_t)
It updates the weights using::
state = momentum * state + (1-momentum) * gradient
weight = weight - learning_rate * sign(state)
Where the parameter $momentum$ is the decay rate of momentum estimates at each epoch.
.. note::
- sparse ndarray not supported for this optimizer yet.
Defined in src/operator/optimizer_op.cc:L84
Arguments
-
weight::NDArray-or-SymbolicNode
: Weight -
grad::NDArray-or-SymbolicNode
: Gradient -
mom::NDArray-or-SymbolicNode
: Momentum -
lr::float, required
: Learning rate -
momentum::float, optional, default=0
: The decay rate of momentum estimates at each epoch. -
wd::float, optional, default=0
: Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight. -
rescale_grad::float, optional, default=1
: Rescale gradient to grad = rescale_grad*grad. -
clip_gradient::float, optional, default=-1
: Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient). -
wd_lh::float, optional, default=0
: The amount of weight decay that does not go into gradient/momentum calculations; otherwise do weight decay algorithmically only.
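The two-line update in this entry can be sketched in plain Python as below. The names sign and signum_step are hypothetical helpers, and wd, wd_lh, rescale_grad and clip_gradient are not modelled.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def signum_step(weight, grad, state, lr, momentum=0.0):
    """One Signum step on lists; returns (new_weight, new_state)."""
    # state = momentum * state + (1 - momentum) * gradient
    new_state = [momentum * s + (1.0 - momentum) * g for s, g in zip(state, grad)]
    # weight = weight - learning_rate * sign(state)
    new_weight = [w - lr * sign(s) for w, s in zip(weight, new_state)]
    return new_weight, new_state
```

With momentum=0 the state equals the gradient, so in this sketch the step reduces to the SignSGD rule weight = weight - learning_rate * sign(gradient).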
source
# MXNet.mx.slice
— Method.
slice(arr :: NDArray, start:stop)
Create a view into a sub-slice of an NDArray
. Note that only slicing along the slowest-changing dimension is supported. In Julia's column-major perspective, this is the last dimension. For example, given an NDArray
of shape (2,3,4), slice(array, 2:3)
will create an NDArray
of shape (2,3,2), sharing the data with the original array. This operation is used in data parallelization to split a mini-batch into sub-batches for different devices.
source
# MXNet.mx.slice
— Method.
slice(data, begin, end, step)
Slices a region of the array.
.. note:: $crop$ is deprecated. Use $slice$ instead.
This function returns a sliced array between the indices given by begin
and end
with the corresponding step
.
For an input array of $shape=(d_0, d_1, ..., d_n-1)$, slice operation with $begin=(b_0, b_1...b_m-1)$, $end=(e_0, e_1, ..., e_m-1)$, and $step=(s_0, s_1, ..., s_m-1)$, where m <= n, results in an array with the shape $(|e_0-b_0|/|s_0|, ..., |e_m-1-b_m-1|/|s_m-1|, d_m, ..., d_n-1)$.
The resulting array's k-th dimension contains elements from the k-th dimension of the input array starting from index $b_k$ (inclusive) with step $s_k$ until reaching $e_k$ (exclusive).
If the k-th elements are None
in the sequence of begin
, end
, and step
, the following rule will be used to set default values. If s_k
is None
, set s_k=1
. If s_k > 0
, set b_k=0
, e_k=d_k
; else, set b_k=d_k-1
, e_k=-1
.
The storage type of $slice$ output depends on storage types of inputs
- slice(csr) = csr
- otherwise, $slice$ generates output with default storage
.. note:: When input data storage type is csr, it only supports step=(), or step=(None,), or step=(1,) to generate a csr output. For other step parameter values, it falls back to slicing a dense tensor.
Example::
x = [[ 1., 2., 3., 4.], [ 5., 6., 7., 8.], [ 9., 10., 11., 12.]]
slice(x, begin=(0,1), end=(2,4)) = [[ 2., 3., 4.], [ 6., 7., 8.]]
slice(x, begin=(None, 0), end=(None, 3), step=(-1, 2)) = [[9., 11.], [5., 7.], [1., 3.]]
Defined in src/operator/tensor/matrix_op.cc:L359
Arguments
-
data::NDArray-or-SymbolicNode
: Source input -
begin::Shape(tuple), required
: starting indices for the slice operation, supports negative indices. -
end::Shape(tuple), required
: ending indices for the slice operation, supports negative indices. -
step::Shape(tuple), optional, default=[]
: step for the slice operation, supports negative values.
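Python's built-in slicing uses the same begin-inclusive, end-exclusive convention and the same defaults for None entries, so the two examples in this entry can be reproduced on nested lists. The helper slice_2d is hypothetical and handles only 2-D input with a begin and end for both axes.

```python
def slice_2d(x, begin, end, step=(None, None)):
    """Slice a 2-D nested list along both axes using Python slice rules."""
    rows = slice(begin[0], end[0], step[0])
    cols = slice(begin[1], end[1], step[1])
    return [row[cols] for row in x[rows]]

x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
a = slice_2d(x, (0, 1), (2, 4))                       # [[2, 3, 4], [6, 7, 8]]
b = slice_2d(x, (None, 0), (None, 3), step=(-1, 2))   # [[9, 11], [5, 7], [1, 3]]
```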
source
# MXNet.mx.slice_axis
— Method.
slice_axis(data, axis, begin, end)
Slices along a given axis.
Returns an array slice along a given axis
starting from the begin
index to the end
index.
Examples::
x = [[ 1., 2., 3., 4.], [ 5., 6., 7., 8.], [ 9., 10., 11., 12.]]
slice_axis(x, axis=0, begin=1, end=3) = [[ 5., 6., 7., 8.], [ 9., 10., 11., 12.]]
slice_axis(x, axis=1, begin=0, end=2) = [[ 1., 2.], [ 5., 6.], [ 9., 10.]]
slice_axis(x, axis=1, begin=-3, end=-1) = [[ 2., 3.], [ 6., 7.], [ 10., 11.]]
Defined in src/operator/tensor/matrix_op.cc:L446
Arguments
-
data::NDArray-or-SymbolicNode
: Source input -
axis::int, required
: Axis along which to be sliced, supports negative indexes. -
begin::int, required
: The beginning index along the axis to be sliced, supports negative indexes. -
end::int or None, required
: The ending index along the axis to be sliced, supports negative indexes.
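The three examples in this entry, including the negative indices, can be checked against a small nested-list model. The helper slice_axis_2d is hypothetical and only handles 2-D input with axis 0 or 1.

```python
def slice_axis_2d(x, axis, begin, end):
    """Slice a 2-D nested list along one axis; `end` may be None."""
    if axis == 0:
        return [list(row) for row in x[begin:end]]
    return [row[begin:end] for row in x]

x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
rows = slice_axis_2d(x, axis=0, begin=1, end=3)     # [[5, 6, 7, 8], [9, 10, 11, 12]]
cols = slice_axis_2d(x, axis=1, begin=-3, end=-1)   # [[2, 3], [6, 7], [10, 11]]
```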
source
# MXNet.mx.smooth_l1
— Method.
smooth_l1(data, scalar)
Calculate Smooth L1 Loss(lhs, scalar) by summing
.. math::
f(x) =
\begin{cases}
(\sigma x)^2/2,& \text{if }|x| < 1/\sigma^2\\
|x|-0.5/\sigma^2,& \text{otherwise}
\end{cases}
where :math:x
is an element of the tensor lhs and :math:\sigma
is the scalar.
Example::
smooth_l1([1, 2, 3, 4], sigma=1) = [0.5, 1.5, 2.5, 3.5]
Defined in src/operator/tensor/elemwise_binary_scalar_op_extended.cc:L103
Arguments
-
data::NDArray-or-SymbolicNode
: source input -
scalar::float
: scalar input
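The piecewise definition in this entry can be evaluated element by element in plain Python. The sketch compares |x| against the threshold 1/sigma^2, which is an assumption that keeps the function symmetric in x; the function is a stand-in written for illustration, not the MXNet operator.

```python
def smooth_l1(data, sigma):
    """Apply the smooth L1 function to each element of a list."""
    out = []
    for x in data:
        if abs(x) < 1.0 / sigma ** 2:
            out.append((sigma * x) ** 2 / 2.0)      # quadratic region near zero
        else:
            out.append(abs(x) - 0.5 / sigma ** 2)   # linear region
    return out

values = smooth_l1([1, 2, 3, 4], sigma=1)   # [0.5, 1.5, 2.5, 3.5]
```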
source
# MXNet.mx.softmax_cross_entropy
— Method.
softmax_cross_entropy(data, label)
Calculate cross entropy of softmax output and one-hot label.
-
This operator computes the cross entropy in two steps:
- Applies softmax function on the input array.
- Computes and returns the cross entropy loss between the softmax output and the labels.
-
The softmax function and cross entropy loss are given by:
-
Softmax Function:
.. math:: \text{softmax}(x)_i = \frac{exp(x_i)}{\sum_j exp(x_j)}
- Cross Entropy Function:
.. math:: \text{CE(label, output)} = - \sum_i \text{label}_i \log(\text{output}_i)
Example::
x = [[1, 2, 3], [11, 7, 5]]
label = [2, 0]
softmax(x) = [[0.09003057, 0.24472848, 0.66524094], [0.97962922, 0.01794253, 0.00242826]]
softmax_cross_entropy(x, label) = - log(0.66524094) - log(0.97962922) = 0.4281871
Defined in src/operator/loss_binary_op.cc:L59
Arguments
-
data::NDArray-or-SymbolicNode
: Input data -
label::NDArray-or-SymbolicNode
: Input label
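The two steps described in this entry (softmax, then cross entropy against integer class labels) can be reproduced with the standard math module. Both functions below are stand-ins written for this sketch and operate on nested lists; subtracting the row maximum before exponentiating is a numerical-stability choice of the sketch, not something stated in this entry.

```python
import math

def softmax(row):
    """Softmax of one list of scores."""
    shift = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(data, label):
    """Sum over rows of -log(softmax(row)[label])."""
    return sum(-math.log(softmax(row)[k]) for row, k in zip(data, label))

loss = softmax_cross_entropy([[1, 2, 3], [11, 7, 5]], [2, 0])
# loss is approximately 0.42819
```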
source
# MXNet.mx.square
— Method.
square(data)
Returns element-wise squared value of the input.
.. math:: square(x) = x^2
Example::
square([2, 3, 4]) = [4, 9, 16]
The storage type of $square$ output depends upon the input storage type:
- square(default) = default
- square(row_sparse) = row_sparse
- square(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L541
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.stack
— Method.
stack(data, axis, num_args)
Note: stack takes variable number of positional inputs. So instead of calling as stack([x, y, z], num_args=3), one should call via stack(x, y, z), and num_args will be determined automatically.
Join a sequence of arrays along a new axis.
The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last dimension.
Examples::
x = [1, 2]
y = [3, 4]
stack(x, y) = [[1, 2], [3, 4]]
stack(x, y, axis=1) = [[1, 3], [2, 4]]
Arguments
-
data::NDArray-or-SymbolicNode[]
: List of arrays to stack -
axis::int, optional, default='0'
: The axis in the result array along which the input arrays are stacked. -
num_args::int, required
: Number of inputs to be stacked.
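For 1-D inputs the effect of the axis parameter can be seen in a few lines of plain Python. The function below is a stand-in written for this sketch, not the MXNet operator, and it supports only equal-length 1-D lists with axis 0 or 1.

```python
def stack(*arrays, axis=0):
    """Stack equal-length 1-D lists along a new axis (0 or 1)."""
    if axis == 0:
        return [list(a) for a in arrays]
    if axis == 1:
        return [list(column) for column in zip(*arrays)]
    raise ValueError("this sketch only handles axis 0 or 1")

first = stack([1, 2], [3, 4])            # [[1, 2], [3, 4]]
second = stack([1, 2], [3, 4], axis=1)   # [[1, 3], [2, 4]]
```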
source
# MXNet.mx.stop_gradient
— Method.
stop_gradient(data)
stop_gradient is an alias of BlockGrad.
Stops gradient computation.
Stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. In other words, this operator prevents the contribution of its inputs to be taken into account for computing gradients.
Example::
v1 = [1, 2]
v2 = [0, 1]
a = Variable('a')
b = Variable('b')
b_stop_grad = stop_gradient(3 * b)
loss = MakeLoss(b_stop_grad + a)
executor = loss.simple_bind(ctx=cpu(), a=(1,2), b=(1,2))
executor.forward(is_train=True, a=v1, b=v2)
executor.outputs
[ 1. 5.]
executor.backward()
executor.grad_arrays
[ 0. 0.]
[ 1. 1.]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L166
Arguments
-
data::NDArray-or-SymbolicNode
: The input array.
source
# MXNet.mx.sub_from!
— Method.
sub_from!(dst::NDArray, args::NDArrayOrReal...)
Subtract a bunch of arguments from dst
. Inplace updating.
source
# MXNet.mx.sum_axis
— Method.
sum_axis(data, axis, keepdims, exclude)
sum_axis is an alias of sum.
Computes the sum of array elements over given axes.
.. Note::
sum
and sum_axis
are equivalent. For ndarray of csr storage type summation along axis 0 and axis 1 is supported. Setting keepdims or exclude to True will cause a fallback to dense operator.
Example::
data = [[[1,2],[2,3],[1,3]],
        [[1,4],[4,3],[5,2]],
        [[7,1],[7,2],[7,3]]]

sum(data, axis=1)
[[ 4. 8.]
 [ 10. 9.]
 [ 21. 6.]]

sum(data, axis=[1,2])
[ 12. 19. 27.]

data = [[1,2,0],
        [3,0,1],
        [4,1,0]]

csr = cast_storage(data, 'csr')

sum(csr, axis=0)
[ 8. 3. 1.]

sum(csr, axis=1)
[ 3. 4. 5.]
Defined in src/operator/tensor/broadcast_reduce_op_value.cc:L85
Arguments
- data::NDArray-or-SymbolicNode: The input.
- axis::Shape(tuple), optional, default=[]: The axis or axes along which to perform the reduction. The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is an int, a reduction is performed on that particular axis. If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple. If exclude is true, the reduction is performed on the axes that are NOT in axis instead. Negative values mean indexing from right to left.
- keepdims::boolean, optional, default=0: If this is set to True, the reduced axes are left in the result as dimensions with size one.
- exclude::boolean, optional, default=0: Whether to perform the reduction on the axes that are NOT in axis instead.
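The axis, keepdims and exclude semantics can be checked against the first example with plain Julia arrays. This is a sketch only, assuming Julia 1.0 or later (for the `dims` keyword and `dropdims`); it uses Base reductions, not the MXNet operator, and all variable names are illustrative.

```julia
# Plain Julia arrays, assuming Julia 1.0 or later; not the MXNet operator.
nested = [[[1, 2], [2, 3], [1, 3]],
          [[1, 4], [4, 3], [5, 2]],
          [[7, 1], [7, 2], [7, 3]]]

# A[i, j, k] holds nested[i][j][k], so Julia dims 1, 2, 3 play the role of
# the example's 0-based axes 0, 1, 2.
A = [nested[i][j][k] for i in 1:3, j in 1:3, k in 1:2]

# axis=1: reduce Julia dim 2, then drop the reduced dimension.
sum_axis1 = dropdims(sum(A, dims=2), dims=2)

# keepdims: the reduced dimension stays in the result with size one.
sum_axis1_keep = sum(A, dims=2)

# axis=[1,2]: reduce Julia dims 2 and 3.
sum_axis12 = vec(sum(A, dims=(2, 3)))

# exclude with axis=0: reduce every dimension except Julia dim 1.
other_dims = Tuple(setdiff(1:ndims(A), [1]))
sum_exclude0 = vec(sum(A, dims=other_dims))
```

Excluding axis 0 reduces the same dimensions as axis=[1,2], so the last two results are identical.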
source
# MXNet.mx.swapaxes
— Method.
swapaxes(data, dim1, dim2)
swapaxes is an alias of SwapAxis.
Interchanges two axes of an array.
Examples::
x = [[1, 2, 3]]
swapaxes(x, 0, 1) = [[ 1],
                     [ 2],
                     [ 3]]

x = [[[ 0, 1],
      [ 2, 3]],
     [[ 4, 5],
      [ 6, 7]]] // (2,2,2) array

swapaxes(x, 0, 2) = [[[ 0, 4],
                      [ 2, 6]],
                     [[ 1, 5],
                      [ 3, 7]]]
Defined in src/operator/swapaxis.cc:L70
Arguments
- data::NDArray-or-SymbolicNode: Input array.
- dim1::int (non-negative), optional, default=0: The first axis to be swapped.
- dim2::int (non-negative), optional, default=0: The second axis to be swapped.
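Swapping two axes is a permutation of dimensions in which only two entries change places. The sketch below shows this on plain Julia arrays with `permutedims`; the helper name `swap_axes` is hypothetical, and `dim1` and `dim2` are 0-based as in the examples above. It does not call MXNet.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
# dim1 and dim2 are 0-based, as in the examples above.
function swap_axes(A::AbstractArray, dim1::Int, dim2::Int)
    perm = collect(1:ndims(A))
    # Exchange the two requested positions in the identity permutation.
    perm[dim1 + 1], perm[dim2 + 1] = perm[dim2 + 1], perm[dim1 + 1]
    return permutedims(A, perm)
end

# The (2,2,2) example: X[i+1, j+1, k+1] holds the value 4i + 2j + k.
X = [4i + 2j + k for i in 0:1, j in 0:1, k in 0:1]
Y = swap_axes(X, 0, 2)   # Y[k+1, j+1, i+1] == X[i+1, j+1, k+1]
```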
source
# MXNet.mx.take
— Method.
take(a, indices, axis, mode)
Takes elements from an input array along the given axis.
This function slices the input array along a particular axis with the provided indices.
Given an input array with shape $(d0, d1, d2)$ and indices with shape $(i0, i1)$, the output will have shape $(i0, i1, d1, d2)$, computed by::
output[i,j,:,:] = input[indices[i,j],:,:]
Note
- axis: Only slicing along axis 0 is supported for now.
- mode: Only clip mode is supported for now.
Examples::
x = [4. 5. 6.]

// Trivial case, take the second element along the first axis.
take(x, [1]) = [ 5. ]

x = [[ 1., 2.],
     [ 3., 4.],
     [ 5., 6.]]

// In this case we will get rows 0 and 1, then 1 and 2. Along axis 0
take(x, [[0,1],[1,2]]) = [[[ 1., 2.],
                           [ 3., 4.]],
                          [[ 3., 4.],
                           [ 5., 6.]]]
Defined in src/operator/tensor/indexing_op.cc:L371
Arguments
- a::NDArray-or-SymbolicNode: The input array.
- indices::NDArray-or-SymbolicNode: The indices of the values to be extracted.
- axis::int, optional, default='0': The axis of the input array to be taken.
- mode::{'clip', 'raise', 'wrap'}, optional, default='clip': Specifies how out-of-bound indices behave. "clip" means clip to the valid range, so indices that are too large are replaced by the index that addresses the last element along the axis. "wrap" means to wrap around. "raise" means to raise an error.
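The rule output[i,j,:] = input[indices[i,j],:] and the three out-of-bound modes can be sketched on a plain Julia matrix. The helper `take_rows` is hypothetical and assumes Julia 1.0 or later; indices are 0-based as in the examples above. It implements all three modes as the mode argument describes them, although the note above says the operator itself supports only clip.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
# `indices` are 0-based row numbers, as in the examples above.
function take_rows(x::AbstractMatrix, indices::AbstractArray{<:Integer};
                   mode::Symbol=:clip)
    n = size(x, 1)
    # Output shape is the shape of `indices` followed by the row length.
    out = Array{eltype(x)}(undef, size(indices)..., size(x, 2))
    for I in CartesianIndices(indices)
        i = indices[I]
        if mode == :clip
            i = clamp(i, 0, n - 1)       # out-of-range values go to the nearest end
        elseif mode == :wrap
            i = mod(i, n)                # wrap around
        elseif !(0 <= i < n)
            throw(BoundsError(x, i + 1)) # any other mode is treated as "raise"
        end
        out[I, :] .= x[i + 1, :]
    end
    return out
end

x = [1.0 2.0; 3.0 4.0; 5.0 6.0]
picked = take_rows(x, [0 1; 1 2])   # shape (2, 2, 2)
```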
source
# MXNet.mx.tile
— Method.
tile(data, reps)
Repeats the whole array multiple times.
If reps has length d and the input array has n dimensions, there are three cases:
- n=d. Repeat the i-th dimension of the input reps[i] times::
x = [[1, 2],
     [3, 4]]

tile(x, reps=(2,3)) = [[ 1., 2., 1., 2., 1., 2.],
                       [ 3., 4., 3., 4., 3., 4.],
                       [ 1., 2., 1., 2., 1., 2.],
                       [ 3., 4., 3., 4., 3., 4.]]
- n>d. reps is promoted to length n by prepending 1's to it. Thus for an input shape (2,3), reps=(2,) is treated as (1,2)::
tile(x, reps=(2,)) = [[ 1., 2., 1., 2.],
                      [ 3., 4., 3., 4.]]
- n<d. The input is promoted to be d-dimensional by prepending new axes. So a shape (2,2) array is promoted to (1,2,2) for 3-D replication::
tile(x, reps=(2,2,3)) = [[[ 1., 2., 1., 2., 1., 2.],
                          [ 3., 4., 3., 4., 3., 4.],
                          [ 1., 2., 1., 2., 1., 2.],
                          [ 3., 4., 3., 4., 3., 4.]],
                         [[ 1., 2., 1., 2., 1., 2.],
                          [ 3., 4., 3., 4., 3., 4.],
                          [ 1., 2., 1., 2., 1., 2.],
                          [ 3., 4., 3., 4., 3., 4.]]]
Defined in src/operator/tensor/matrix_op.cc:L624
Arguments
- data::NDArray-or-SymbolicNode: Input data array.
- reps::Shape(tuple), required: The number of times for repeating the tensor a. If reps has length d, the result will have dimension of max(d, a.ndim). If a.ndim < d, a is promoted to be d-dimensional by prepending new axes. If a.ndim > d, reps is promoted to a.ndim by prepending 1's to it.
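The promotion rules for the three cases can be written down with Base `repeat` on plain Julia arrays. This is a sketch under stated assumptions: the helper name `tile_array` is hypothetical, it pads either reps or the array shape with leading 1's as described above, and it does not call MXNet.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
function tile_array(x::AbstractArray, reps::Tuple{Vararg{Int}})
    n, d = ndims(x), length(reps)
    m = max(n, d)
    # n > d: pad reps with leading 1's. n < d: pad the shape with leading 1's.
    full_reps = (ntuple(_ -> 1, m - d)..., reps...)
    full_shape = (ntuple(_ -> 1, m - n)..., size(x)...)
    return repeat(reshape(x, full_shape), outer=full_reps)
end

x = [1 2; 3 4]
same_rank = tile_array(x, (2, 3))       # n = d
short_reps = tile_array(x, (2,))        # n > d, reps treated as (1, 2)
long_reps = tile_array(x, (2, 2, 3))    # n < d, x treated as shape (1, 2, 2)
```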
source
# MXNet.mx.topk
— Method.
topk(data, axis, k, ret_typ, is_ascend)
Returns the top k elements in an input array along the given axis.
Examples::
x = [[ 0.3, 0.2, 0.4], [ 0.1, 0.3, 0.2]]
// returns an index of the largest element on last axis
topk(x) = [[ 2.],
           [ 1.]]

// returns the value of top-2 largest elements on last axis
topk(x, ret_typ='value', k=2) = [[ 0.4, 0.3],
                                 [ 0.3, 0.2]]

// returns the value of top-2 smallest elements on last axis
topk(x, ret_typ='value', k=2, is_ascend=1) = [[ 0.2 , 0.3],
                                              [ 0.1 , 0.2]]

// returns the value of top-2 largest elements on axis 0
topk(x, axis=0, ret_typ='value', k=2) = [[ 0.3, 0.3, 0.4],
                                         [ 0.1, 0.2, 0.2]]

// flattens and then returns list of both values and indices
topk(x, ret_typ='both', k=2) = [[[ 0.4, 0.3], [ 0.3, 0.2]] , [[ 2., 0.], [ 1., 2.]]]
Defined in src/operator/tensor/ordering_op.cc:L63
Arguments
- data::NDArray-or-SymbolicNode: The input array.
- axis::int or None, optional, default='-1': Axis along which to choose the top k indices. If not given, the flattened array is used. Default is -1.
- k::int, optional, default='1': Number of top elements to select. It should always be smaller than or equal to the number of elements along the given axis. A global sort is performed if k < 1.
- ret_typ::{'both', 'indices', 'mask', 'value'}, optional, default='indices': The return type. "value" means to return the top k values, "indices" means to return the indices of the top k values, "mask" means to return a mask array containing 0 and 1, where 1 marks the top k values. "both" means to return a list of both values and indices of the top k elements.
- is_ascend::boolean, optional, default=0: Whether to choose the k largest or k smallest elements. The k largest elements are chosen if set to false.
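The 'both' return type along the last axis of a matrix can be sketched with `sortperm` on plain Julia arrays. The helper `topk_rows` is hypothetical, assumes Julia 1.0 or later, and returns 0-based indices to match the examples above; it does not call MXNet and covers only the row-wise case.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
# Returns (values, indices) per row; indices are 0-based as in the examples.
function topk_rows(x::AbstractMatrix; k::Int=1, is_ascend::Bool=false)
    nrows = size(x, 1)
    vals = Matrix{eltype(x)}(undef, nrows, k)
    idxs = Matrix{Int}(undef, nrows, k)
    for r in 1:nrows
        # Sort positions of this row, largest first unless is_ascend is set.
        order = sortperm(x[r, :], rev=!is_ascend)[1:k]
        vals[r, :] .= x[r, order]
        idxs[r, :] .= order .- 1
    end
    return vals, idxs
end

x = [0.3 0.2 0.4; 0.1 0.3 0.2]
largest_vals, largest_idxs = topk_rows(x, k=2)
smallest_vals, _ = topk_rows(x, k=2, is_ascend=true)
```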
source
# MXNet.mx.try_get_shared
— Method.
try_get_shared(arr; sync=:nop)
Try to create a Julia array by sharing the data with the underlying NDArray.
Arguments:
- arr::NDArray: the array to be shared.
Note
The returned array is not guaranteed to share data with the underlying NDArray. In particular, data sharing is possible only when the NDArray lives on CPU.
- sync::Symbol: :nop, :write, or :read. On CPU, invoke _wait_to_read if :read; invoke _wait_to_write if :write.
source
# MXNet.mx.uniform
— Method.
uniform(low, high, shape, ctx, dtype)
uniform is an alias of _random_uniform.
Draw random samples from a uniform distribution.
Note
The existing alias uniform is deprecated.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high).
Example::
uniform(low=0, high=1, shape=(2,2)) = [[ 0.60276335, 0.85794562], [ 0.54488319, 0.84725171]]
Defined in src/operator/random/sample_op.cc:L66
Arguments
- low::float, optional, default=0: Lower bound of the distribution.
- high::float, optional, default=1: Upper bound of the distribution.
- shape::Shape(tuple), optional, default=[]: Shape of the output.
- ctx::string, optional, default='': Context of output, in format cpu|gpu|cpu_pinned. Only used for imperative calls.
- dtype::{'None', 'float16', 'float32', 'float64'}, optional, default='None': DType of the output in case this can't be inferred. Defaults to float32 if not defined (dtype=None).
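The half-open interval [low, high) follows from scaling and shifting samples drawn from [0, 1). A minimal sketch on plain Julia arrays, using Base `rand` rather than the MXNet sampler; the helper name `uniform_samples` is hypothetical.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
function uniform_samples(low::Real, high::Real, shape::Tuple{Vararg{Int}})
    # rand(dims...) draws Float64 values from [0, 1); scale and shift them
    # so they fall in [low, high), up to floating-point rounding.
    return low .+ (high - low) .* rand(shape...)
end

samples = uniform_samples(0, 1, (2, 2))
```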
source
# MXNet.mx.where
— Method.
where(condition, x, y)
Return the elements, either from x or y, depending on the condition.
Given three ndarrays, condition, x, and y, return an ndarray with the elements from x or y, depending on whether the elements of condition are true or false. x and y must have the same shape. If condition has the same shape as x, each element in the output array is from x if the corresponding element in condition is true, and from y if false.
If condition does not have the same shape as x, it must be a 1D array whose size is the same as x's first dimension size. Each row of the output array is from x's row if the corresponding element from condition is true, and from y's row if false.
Note that all non-zero values are interpreted as True in condition.
Examples::
x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
cond = [[0, 1], [-1, 0]]
where(cond, x, y) = [[5, 2], [3, 8]]
csr_cond = cast_storage(cond, 'csr')
where(csr_cond, x, y) = [[5, 2], [3, 8]]
Defined in src/operator/tensor/control_flow_op.cc:L57
Arguments
- condition::NDArray-or-SymbolicNode: condition array
- x::NDArray-or-SymbolicNode
- y::NDArray-or-SymbolicNode
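Both the same-shape case and the 1D row-selecting case can be reproduced with a broadcast `ifelse` on plain Julia arrays. This is a sketch only, not the MXNet operator; the helper name `where_select` is hypothetical. Any non-zero condition value counts as true, as noted above.

```julia
# Hypothetical helper on plain Julia arrays (not the MXNet operator).
# A matrix condition selects per element; a vector condition selects per row,
# because a Julia vector broadcasts as a column against the rows of x and y.
where_select(cond, x, y) = ifelse.(cond .!= 0, x, y)

x = [1 2; 3 4]
y = [5 6; 7 8]

elementwise = where_select([0 1; -1 0], x, y)
rowwise = where_select([1, 0], x, y)   # row 1 from x, row 2 from y
```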
source
# MXNet.mx.zeros
— Method.
zeros([DType], dims, [ctx::Context = cpu()])
zeros([DType], dims...)
zeros(x::NDArray)
Create a zero-filled NDArray with the specified shape and type.
source
# MXNet.mx.zeros_like
— Method.
zeros_like(data)
Return an array of zeros with the same shape and type as the input array.
The storage type of the zeros_like output depends on the storage type of the input:
- zeros_like(row_sparse) = row_sparse
- zeros_like(csr) = csr
- zeros_like(default) = default
Examples::
x = [[ 1., 1., 1.], [ 1., 1., 1.]]
zeros_like(x) = [[ 0., 0., 0.], [ 0., 0., 0.]]
Arguments
- data::NDArray-or-SymbolicNode: The input.
source
# TakingBroadcastSeriously.broadcast_
— Method.
source
# MXNet.mx.@nd_as_jl
— Macro.
Manipulating as Julia Arrays
@nd_as_jl(captures..., statement)
A convenience macro that lets you operate on NDArrays as Julia Arrays. For example,
x = mx.zeros(3,4)
y = mx.ones(3,4)
z = mx.zeros((3,4), mx.gpu())
@mx.nd_as_jl ro=(x,y) rw=z begin
    # now x, y, z are just ordinary Julia Arrays
    z[:,1] = y[:,2]
    z[:,2] = 5
end
Under the hood, the macro converts all the declared captures from NDArray into Julia Arrays by using try_get_shared, and automatically commits the modifications back into the NDArrays that are declared as rw. This is useful for fast prototyping and when implementing non-critical computations, such as AbstractEvalMetric.
Note
- Multiple rw and/or ro capture declarations can be made.
- The macro does not check that ro captures are not modified. If the original NDArray lives in CPU memory, then it is very likely that the corresponding Julia Array shares data with the NDArray, so modifying the Julia Array will also modify the underlying NDArray.
- More importantly, since NDArray operations are asynchronous, we wait for writing on rw variables but only for reading on ro variables. If we write into ro variables and the memory is shared, a race condition might happen, and the behavior is undefined.
- When an NDArray is declared to be captured as rw, its contents are always synced back at the end.
- The result of executing the expanded macro is always nothing.
- The statements are wrapped in a let block, so variables newly introduced inside it will not be available after the statements. Declare such variables before calling the macro if they are needed afterwards.
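The scoping consequence of the let wrapper can be seen with an ordinary Julia let block. The sketch below uses plain Julia only and assumes Julia 1.0 or later (for `@isdefined`); it does not expand @nd_as_jl, and the function and variable names are illustrative.

```julia
# Plain Julia illustration of let scoping; does not use MXNet or @nd_as_jl.
function let_scope_demo()
    total = 0.0                      # declared before the block, so it survives
    let
        scratch = [1.0, 2.0, 3.0]    # introduced inside the block, local to it
        total = sum(scratch)         # assigns to the variable declared outside
    end
    # `scratch` is no longer visible here; `total` keeps its new value.
    return total, @isdefined(scratch)
end

total, scratch_visible = let_scope_demo()
```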
source