"""Losses for training neural networks."""
from __future__ import absolute_import

from .. import ndarray
from ..base import numeric_types
from .block import HybridBlock


def _apply_weighting(F, loss, weight=None, sample_weight=None):
    """Apply weighting to loss.

    Parameters
    ----------
    loss : Symbol
        The loss to be weighted.
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch separately, `sample_weight` should have
        shape (64, 1).

    Returns
    -------
    loss : Symbol
        Weighted loss
    """
    if sample_weight is not None:
        loss = F.broadcast_mul(loss, sample_weight)

    if weight is not None:
        assert isinstance(weight, numeric_types), "weight must be a number"
        loss = loss * weight

    return loss


def _reshape_label_as_output(F, output, label):
    # In imperative (NDArray) mode the output shape is known, so the label can
    # be reshaped to it directly.  In symbolic mode the shape is not yet
    # available, so reshape to an empty shape and let it be inferred later.
    return label.reshape(output.shape) if F is ndarray else label.reshape(())


class Loss(HybridBlock):
    """Base class for loss.

    Parameters
    ----------
    weight : float or None
        Global scalar weight for loss.
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, weight, batch_axis, **kwargs):
        super(Loss, self).__init__(**kwargs)
        self._weight = weight
        self._batch_axis = batch_axis

    def __repr__(self):
        s = '{name}(batch_axis={_batch_axis}, w={_weight})'
        return s.format(name=self.__class__.__name__, **self.__dict__)

    def hybrid_forward(self, F, x, *args, **kwargs):
        """Override to construct the symbolic graph for this `Block`.

        Parameters
        ----------
        x : Symbol or NDArray
            The first input tensor.
        *args : list of Symbol or list of NDArray
            Additional input tensors.
        """
        raise NotImplementedError


class L2Loss(Loss):
    r"""Calculates the mean squared error between output and label:

    .. math::
        L = \frac{1}{2}\sum_i \Vert {output}_i - {label}_i \Vert^2.

    Output and label can have arbitrary shape as long as they have the same
    number of elements.

    Parameters
    ----------
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch, `sample_weight` should have shape (64, 1).
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, weight=1., batch_axis=0, **kwargs):
        super(L2Loss, self).__init__(weight, batch_axis, **kwargs)

    def hybrid_forward(self, F, output, label, sample_weight=None):
        label = _reshape_label_as_output(F, output, label)
        loss = F.square(output - label)
        loss = _apply_weighting(F, loss, self._weight/2, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)


class L1Loss(Loss):
    r"""Calculates the mean absolute error between output and label:

    .. math::
        L = \sum_i \vert {output}_i - {label}_i \vert.

    Output and label must have the same shape.

    Parameters
    ----------
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch, `sample_weight` should have shape (64, 1).
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, weight=None, batch_axis=0, **kwargs):
        super(L1Loss, self).__init__(weight, batch_axis, **kwargs)

    def hybrid_forward(self, F, output, label, sample_weight=None):
        label = _reshape_label_as_output(F, output, label)
        loss = F.abs(output - label)
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)
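# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): how `sample_weight`
# interacts with the regression losses above.  The shapes and values below
# are illustrative assumptions; any weight broadcastable to the loss shape,
# e.g. (batch, 1), works.
#
#     import mxnet as mx
#     from mxnet.gluon.loss import L2Loss
#
#     loss_fn = L2Loss()                                 # default weight=1.
#     output = mx.nd.zeros((4, 10))
#     label = mx.nd.ones((4, 10))
#     mask = mx.nd.array([1, 1, 0, 1]).reshape((4, 1))   # zero out the 3rd sample
#     print(loss_fn(output, label, mask))                # per-sample L2 loss
# ---------------------------------------------------------------------------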
class SigmoidBinaryCrossEntropyLoss(Loss):
    r"""The cross-entropy loss for binary classification. (alias: SigmoidBCELoss)

    BCE loss is useful when training logistic regression.

    .. math::
        loss(o, t) = - 1/n \sum_i (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))


    Parameters
    ----------
    from_sigmoid : bool, default is `False`
        Whether the input is the output of a sigmoid. When this is `False`, the
        loss applies the sigmoid itself and computes BCE via the log-sum-exp
        trick, which is more numerically stable.
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch, `sample_weight` should have shape (64, 1).
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, from_sigmoid=False, weight=None, batch_axis=0, **kwargs):
        super(SigmoidBinaryCrossEntropyLoss, self).__init__(weight, batch_axis, **kwargs)
        self._from_sigmoid = from_sigmoid

    def hybrid_forward(self, F, output, label, sample_weight=None):
        label = _reshape_label_as_output(F, output, label)
        if not self._from_sigmoid:
            # fused sigmoid + BCE, stabilized with the log-sum-exp trick
            max_val = F.maximum(-output, 0)
            loss = output - output*label + max_val \
                   + F.log(F.exp(-max_val) + F.exp(-output - max_val))
        else:
            loss = -(F.log(output + 1e-8)*label + F.log(1. - output + 1e-8)*(1. - label))
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)

SigmoidBCELoss = SigmoidBinaryCrossEntropyLoss


class SoftmaxCrossEntropyLoss(Loss):
    r"""Computes the softmax cross entropy loss. (alias: SoftmaxCELoss)

    If `sparse_label` is `True`, label should contain integer category indicators:

    .. math::
        p = {softmax}({output})

        L = -\sum_i {log}(p_{i,{label}_i})

    Label's shape should be output's shape without the `axis` dimension, i.e. for
    `output.shape` = (1,2,3,4) and axis = 2, `label.shape` should be (1,2,4).

    If `sparse_label` is `False`, label should contain a probability distribution
    with the same shape as output:

    .. math::
        p = {softmax}({output})

        L = -\sum_i \sum_j {label}_j {log}(p_{ij})

    Parameters
    ----------
    axis : int, default -1
        The axis to sum over when computing softmax and entropy.
    sparse_label : bool, default True
        Whether label is an integer array instead of probability distribution.
    from_logits : bool, default False
        Whether input is a log probability (usually from log_softmax) instead
        of unnormalized numbers.
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch, `sample_weight` should have shape (64, 1).
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, axis=-1, sparse_label=True, from_logits=False, weight=None,
                 batch_axis=0, **kwargs):
        super(SoftmaxCrossEntropyLoss, self).__init__(weight, batch_axis, **kwargs)
        self._axis = axis
        self._sparse_label = sparse_label
        self._from_logits = from_logits

    def hybrid_forward(self, F, output, label, sample_weight=None):
        if not self._from_logits:
            output = F.log_softmax(output)
        if self._sparse_label:
            loss = -F.pick(output, label, axis=self._axis, keepdims=True)
        else:
            loss = -F.sum(output*label, axis=self._axis, keepdims=True)
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)

SoftmaxCELoss = SoftmaxCrossEntropyLoss
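# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): sparse vs. dense labels for
# SoftmaxCrossEntropyLoss.  The logits and class indices below are
# illustrative assumptions only.
#
#     import mxnet as mx
#     from mxnet.gluon.loss import SoftmaxCrossEntropyLoss
#
#     logits = mx.nd.random_normal(shape=(2, 5))
#     sparse = mx.nd.array([1, 4])              # integer class indices
#     dense = mx.nd.one_hot(sparse, 5)          # same labels as a distribution
#
#     print(SoftmaxCrossEntropyLoss()(logits, sparse))
#     print(SoftmaxCrossEntropyLoss(sparse_label=False)(logits, dense))
#     # both calls should print the same per-sample loss values
# ---------------------------------------------------------------------------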
class KLDivLoss(Loss):
    r"""The Kullback-Leibler divergence loss.

    KL divergence is a useful distance measure for continuous distributions
    and is often useful when performing direct regression over the space of
    (discretely sampled) continuous output distributions.

    .. _Kullback-Leibler divergence: https://en.wikipedia.org/wiki/Kullback-Leibler_divergence

    .. math::
        L = 1/n \sum_i (label_i * (log(label_i) - output_i))

    Label's shape should be the same as output's.

    Parameters
    ----------
    from_logits : bool, default is `True`
        Whether the input is a log probability (usually from log_softmax) instead
        of unnormalized numbers.
    weight : float or None
        Global scalar weight for loss.
    sample_weight : Symbol or None
        Per sample weighting. Must be broadcastable to
        the same shape as loss. For example, if loss has
        shape (64, 10) and you want to weight each sample
        in the batch, `sample_weight` should have shape (64, 1).
    batch_axis : int, default 0
        The axis that represents mini-batch.
    """
    def __init__(self, from_logits=True, weight=None, batch_axis=0, **kwargs):
        super(KLDivLoss, self).__init__(weight, batch_axis, **kwargs)
        self._from_logits = from_logits

    def hybrid_forward(self, F, output, label, sample_weight=None):
        if not self._from_logits:
            output = F.log_softmax(output)
        loss = label * (F.log(label + 1e-8) - output)
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)
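# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): KLDivLoss expects the
# prediction to already be log-probabilities when from_logits=True (the
# default); with from_logits=False it applies log_softmax internally.  The
# tensors below are illustrative assumptions only.
#
#     import mxnet as mx
#     from mxnet.gluon.loss import KLDivLoss
#
#     target = mx.nd.softmax(mx.nd.random_normal(shape=(3, 4)))     # a valid distribution
#     pred_log = mx.nd.log_softmax(mx.nd.random_normal(shape=(3, 4)))
#
#     print(KLDivLoss()(pred_log, target))                          # inputs are log-probs
#     print(KLDivLoss(from_logits=False)(mx.nd.random_normal(shape=(3, 4)), target))
# ---------------------------------------------------------------------------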