"""Parameter optimizer."""

from .. import optimizer as opt
from ..model import _create_kvstore
from .parameter import ParameterDict, Parameter


class Trainer(object):
    """Applies an `Optimizer` on a set of Parameters. Trainer should
    be used together with `autograd`.

    Parameters
    ----------
    params : ParameterDict
        The set of parameters to optimize.
    optimizer : str or Optimizer
        The optimizer to use. See help on Optimizer for a list of available
        optimizers.
    optimizer_params : dict
        Key-word arguments to be passed to the optimizer constructor. For example,
        `{'learning_rate': 0.1}`. All optimizers accept learning_rate, wd (weight
        decay), clip_gradient, and lr_scheduler. See each optimizer's constructor
        for a list of additional supported arguments.
    kvstore : str or KVStore
        kvstore type for multi-gpu and distributed training. See help on
        :any:`mxnet.kvstore.create` for more information.
    """
    def __init__(self, params, optimizer, optimizer_params=None, kvstore='device'):
        if isinstance(params, (dict, ParameterDict)):
            params = list(params.values())
        if not isinstance(params, (list, tuple)):
            raise ValueError(
                "First argument must be a list or dict of Parameters, "
                "got %s." % (type(params)))
        self._params = []
        for param in params:
            if not isinstance(param, Parameter):
                raise ValueError(
                    "First argument must be a list or dict of Parameters, "
                    "got list of %s." % (type(param)))
            self._params.append(param)

        optimizer_params = optimizer_params if optimizer_params else {}
        self._scale = optimizer_params.get('rescale_grad', 1.0)
        self._contexts = self._check_contexts()
        self._init_optimizer(optimizer, optimizer_params)
        self._kv_initialized = False
        self._kvstore = kvstore

    def _check_contexts(self):
        contexts = None
        for param in self._params:
            ctx = param.list_ctx()
            assert contexts is None or contexts == ctx, \
                "All Parameters must be initialized on the same set of contexts, " \
                "but Parameter %s is initialized on %s while previous Parameters " \
                "are initialized on %s." % (param.name, str(ctx), str(contexts))
            contexts = ctx
        return contexts

    def _init_optimizer(self, optimizer, optimizer_params):
        param_dict = {i: param for i, param in enumerate(self._params)}
        if isinstance(optimizer, opt.Optimizer):
            assert not optimizer_params, \
                "optimizer_params must be None if optimizer is an instance of " \
                "Optimizer instead of str"
            self._optimizer = optimizer
            self._optimizer.param_dict = param_dict
        else:
            self._optimizer = opt.create(optimizer, param_dict=param_dict,
                                         **optimizer_params)

        self._updaters = [opt.get_updater(self._optimizer)
                          for _ in self._contexts]

    def _init_kvstore(self):
        arg_arrays = {param.name: param.data(self._contexts[0])
                      for param in self._params}
        kvstore, update_on_kvstore = _create_kvstore(self._kvstore, len(self._contexts),
                                                     arg_arrays)
        if kvstore:
            if 'dist' in kvstore.type:
                update_on_kvstore = False
            for i, param in enumerate(self._params):
                param_arrays = param.list_data()
                kvstore.init(i, param_arrays[0])
                kvstore.pull(i, param_arrays, priority=-i)
            if update_on_kvstore:
                kvstore.set_optimizer(self._optimizer)
            self._kvstore = kvstore
            self._update_on_kvstore = update_on_kvstore
        else:
            self._kvstore = None
            self._update_on_kvstore = None

        self._kv_initialized = True

    def step(self, batch_size, ignore_stale_grad=False):
        """Makes one step of parameter update. Should be called after
        `autograd.compute_gradient` and outside of `record()` scope.

        Parameters
        ----------
        batch_size : int
            Batch size of data processed. Gradient will be normalized by
            `1/batch_size`. Set this to 1 if you normalized loss manually with
            `loss = mean(loss)`.
        ignore_stale_grad : bool, optional, default=False
            If true, ignores Parameters with stale gradient (gradient that has
            not been updated by `backward` after last step) and skips update.
        """
        if not self._kv_initialized:
            self._init_kvstore()

        self._optimizer.rescale_grad = self._scale / batch_size

        for i, param in enumerate(self._params):
            if param.grad_req == 'null':
                continue
            if not ignore_stale_grad:
                for data in param.list_data():
                    if not data._fresh_grad:
                        raise UserWarning(
                            "Gradient of Parameter `%s` on context %s has not been "
                            "updated by backward since last `step`. This could mean "
                            "a bug in your model that made it only use a subset of "
                            "the Parameters (Blocks) for this iteration. If you are "
                            "intentionally only using a subset, call step with "
                            "ignore_stale_grad=True to suppress this warning and "
                            "skip updating of Parameters with stale gradient"
                            % (param.name, str(data.context)))

            if self._kvstore:
                self._kvstore.push(i, param.list_grad(), priority=-i)
                if self._update_on_kvstore:
                    self._kvstore.pull(i, param.list_data(), priority=-i)
                    continue
                else:
                    self._kvstore.pull(i, param.list_grad(), priority=-i)

            for upd, arr, grad in zip(self._updaters, param.list_data(),
                                      param.list_grad()):
                if not ignore_stale_grad or arr._fresh_grad:
                    upd(i, grad, arr)
                    arr._fresh_grad = False
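

# Illustrative usage sketch (not part of the original module). It shows how a
# Trainer is typically combined with `autograd`: record the forward pass, call
# `backward()`, then `step()` with the batch size. The Dense network, L2 loss,
# and constant data below are placeholder assumptions chosen only for the demo.
if __name__ == '__main__':
    import mxnet as mx
    from mxnet import autograd, gluon

    net = gluon.nn.Dense(1)
    net.initialize(ctx=mx.cpu())
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

    data = mx.nd.ones((4, 2))
    label = mx.nd.zeros((4, 1))
    loss_fn = gluon.loss.L2Loss()

    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()
    # Gradients are rescaled by 1/batch_size inside step().
    trainer.step(batch_size=data.shape[0])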