ó —Àv]c@shdZddddgZddlmZddlZddlZddlZdd lm Z dd l m Z dd l m Z dd lm Z dd l mZddlmZddlmZddlmZddlmZddlmZd„Zd„Zdddd„Zd„Zeaeadaej d„ƒZ!ddddd„Z"d„Z#d„Z$dS(s7Functions for enabling AMP (automatic mixed precision).tinitt init_trainert scale_losstunscaleiÿÿÿÿ(t MethodTypeNi(tsymbol(tSymbol(tcontrib(tndarray(tNDArrayi(tlists(ttrainer(tbase(t optimizer(t LossScalercCs•tjtjf}t|tƒr4tj|d|ƒSt|tƒr|j|kr†|j|kr†|j j dkr†t j|d|ƒS|Sn|SdS(Ntdtypetcpu( tnptfloat16tfloat32t isinstanceRRtamp_castR Rtcontextt device_typeR(tsRt float_types((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_cast_symbol_NDArray&scCs«t|dƒ}tj|ƒ}t|ƒdkrw|dksK|jdƒrh|t|ƒ}||}q¡|}|}n*|jdƒr•|}|}n |}|}||fS(Nt _internalit_random_t_liket_(tgetattrR t_get_op_name_prefixtlentendswitht startswith(tnametmoduletsubmodule_dicttmodule_internaltprefixt func_namet cur_module((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_get_fun_to_wrap4s   c sQdd„}dd„}‡fd†}|tttfkrB|n|}i} x+tjD] } t|| dd!ƒ| | Ntcs(i|]\}}t|ˆƒ|“qS((R(t.0tktv(R-(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pys Ps (tNonetlisttmapttupletitems(targstkwargstnew_args(tcond_argtfR-(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_new_funIs  (t__name__t __module__t__doc__(R<R-R;R=((R;R<R-s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_ndarray_wrapperHs    cs=‡‡‡fd†}ˆj|_ˆj|_ˆj|_|S(Ncs»ˆdk rDˆd|ks4|ˆdˆdkrDˆ||ŽSnˆ||Ž}|jƒ}|jƒ‰tt‡‡fd†|ƒƒ}|jƒ}||Œ}|jd|jƒ|S(Niics |jˆkrt|ˆƒS|S(N(R$R(R,(tauxR-(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyR.`sR$(R3t get_childrentlist_auxiliary_statesR4R5t_gen_atomic_symbolt _set_attrR$(R8R9tsymtinputst atomic_symt wrapped_sym(R;R<R-(RBs6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyR=Xs      (R>R?R@(R<R-R;R=((R;R<R-s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_symbol_wrapperWs    cs:‡‡fd†}ˆj|_ˆj|_ˆj|_|S(Nc s g}t}t|ƒ}x]t|ƒD]O\}}t|ttfƒr%|j|||fƒ|pnt|tƒ}q%q%Wx]|jƒD]O\}}t|ttfƒr…|j|||fƒ|pÎt|tƒ}q…q…W|s…ˆ}xG|D]?\}}}t|tƒrë|jt j kr*t j }q*qëqëWxÌ|D]I\} } }|j|kr5|jˆkr5t j |d|ƒ| | R?R@(R<R=(R-(R<s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_symbol_widest_wrapperks    iiÿÿÿÿ(R3RRtsymbol_contribR t_OP_NAME_PREFIX_LISTRR t FP16_FUNCSR+tsetattrtoptAttributeErrort FP32_FUNCSRRtCONDITIONAL_FP32_FUNCStWIDEST_TYPE_CASTS(R%R-ttarget_precision_opstconditional_fp32_opstfp32_opsRARKR\t_wrapperR&top_name_prefixt wrap_listtfun_nameR*t f_to_wrapRVt arg_values((R-s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_wrap_symbol_functionsFs`  &!   #   &  % /    cs|tkr‡fd†}n d„}xQtjjD]C}y)t||ƒ}t||||ƒƒWq4tk rvq4Xq4WdS(Ncs:‡‡fd†}ˆj|_ˆj|_ˆj|_|S(Ncs>d|kr$|dˆj|dR?R@(R<Rs(Rr(R<s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRiÇs    cs7‡fd†}ˆj|_ˆj|_ˆj|_|S(Ncs tjdˆjƒˆ||ŽS(NsN%s does not support dynamic loss scaling in symbolic and hybridized execution.(tloggingtwarningR>(R8R9(R<(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_warning_wrapperÔs  (R>R?R@(R<Rv((R<s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRiÓs    (RR RtLOSS_OUTPUT_FUNCTIONSRR`Rb(R%RrRiRlRm((Rrs6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt_wrap_loss_output_functionsÅs   ccs~|jdk stdƒ‚|j|jj|_t|ttfƒrkg|D]}||jj^qMVn|jj|VdS(NsJLoss scaler is not initialized, did you forget to call amp.init_trainer()?( t_amp_loss_scalerR3tAssertionErrort_amp_original_scaleRqt_scaleRR4R6(tlosstoptimizer_or_trainertl((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRès %RcCsŸts›|dtjgks'tdƒ‚tatjdƒtj|ƒ}tt ||||ƒtt ||||ƒt ƒa t t t ƒt t t ƒndS(sbInitialize AMP (automatic mixed precision). This needs to be done before model creation. Parameters ---------- target_dtype : {'float16'} Target low precision type for AMP. Currently only float16 is supported. target_precision_ops : list of string Override the list of functions casted to FP16. Entries in this list are names of the functions casted to FP16. conditional_fp32_ops : list of (string, string, list of string) Override the list of functions conditionally casted to FP32. The format of the list is (name of the function, name of the parameter, list of values of the parameter that make the function be casted to FP32). fp32_ops : list of string Override the list of functions casted to FP32. Entries in this list are names of the functions casted to FP32. Rs5AMP currently supports only float16 as a target_dtypes Using AMPN(t_amp_initializedRRRztTrueRttinfoRRoRRRt _loss_scalerRx(R-RfRgRh((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRós        cststdƒ‚ts'tat}n tƒ}t|tjƒrÛ||_ |j |_ |j j ‰|j j|j _‡fd†}t||j ƒ|j _|j j‰|j|_t‡fd†}t||ƒ|_n7t|tjƒrütdƒ‚ntdt|ƒƒ‚dS(sÚInitialize trainer or optimizer to work with AMP dynamic loss scaling. Parameters ---------- optimizer_or_trainer : Optimizer or Trainer MXNet Optimizer or Gluon trainer to initialize with AMP s7AMP not initialized, did you forget to call amp.init()?cs&ˆƒs"|j||||ƒndS(N(told_update_multi_precision(tselfRYtweighttgradtstate(t skip_update(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pytnew_update_multi_precision0s csˆ|jƒ|j|ƒdS(N(t_paramst _old_update(R…tignore_stale_grad(tlaunch_check_overflow(s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyt new_update7s s3AMP is currently only compatible with Gluon TrainersMoptimizer_or_trainer should be a Gluon Trainer or an optimizer, instead is %sN(R€Rzt_amp_loss_scale_initializedRRƒRRR tTrainerRyR|R{twait_and_updatet _optimizertupdate_multi_precisionR„RRŽt_updateRŒRNtoptt Optimizert TypeErrorttype(R~t loss_scalerRŠR((RŽR‰s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRs(        cCs·t|tjƒr|g|jD]}|jdk r|j^q}x-|D]%}x|D]}||j9(qTWqGWd|_n7t|tjƒrt dƒ‚nt dt |ƒƒ‚dS(s-Check and unscale the gradients manually. This function should only be used if accessing gradients is necessary, e.g. for gradient clipping. Parameters ---------- optimizer_or_trainer : Optimizer or Trainer MXNet optimizer or Gluon Trainer used when scaling the gradients gð?s3AMP is currently only compatible with Gluon TrainersMoptimizer_or_trainer should be a Gluon Trainer or an optimizer, instead is %sN( RR R‘R‹t_gradR3R|R–R—R˜R™(R~tpt valid_gradstgradstg((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyRCs .   (%R@t__all__ttypesRRtt contextlibtnumpyRR/RRRR]RR R tgluonR R R R–RšRRR+R3RoRxRNR€RRƒtcontextmanagerRRRR(((s6/tmp/pip-install-Qvdv_2/mxnet/mxnet/contrib/amp/amp.pyts8     ~  $ +