B u `@shddlZddlZddlZddlZddlZddlmZddlm Z m Z m Z m ZmZmZmZddlmZddlmmmZddlmZddlmZddlmZddlm Z m!Z!ddl"m#m$m%Z&ddl'm#m$m(Z(ddl)m*Z*m+Z+m,Z,e e@e@Z-e d \Z.Z/e0e1Z2d Z3d d Z4d dZ5ddZ6ddZ7dddZ8Gddde9Z:Gddde9Z;dS)N) combinations)attempt_importnumpynumpy_availablepandaspandas_availablescipyscipy_available)tree_structure)CreateAbstractScenarioTreeModel) SolverFactory)Block ComponentUID) fit_rect_dist fit_mvn_dist fit_kde_distz4pyomo.contrib.interior_point.inverse_reduced_hessiang?c Cs|d}|dkrd}|}n&|d}||d|}|d|}|d}|d}|}x&tt|dD]} t||| }qjWt||}|dkr|Sy t|}Wn YnX||S)a> Create a Pyomo object from a string; it is attached to instance args: instance: a concrete pyomo model vstr: a particular Var or Param (e.g. "pp.Keq_a[2]") output: the object NOTE: We need to deal with blocks and with indexes that might really be strings or ints [N].)findsplitrangelengetattrint) instancevstrlZindexstrZbasestrrpartsnameretvalir%A/tmp/pip-unpacked-wheel-n62dbgi3/pyomo/contrib/parmest/parmest.py_object_from_string3s(      r'cCsd|d}t||S)z Wrapper for _object_from_string for PySP extensive forms but only for Vars at the node named RootNode. DLW April 2018: needs work to be generalized. zMASTER_BLEND_VAR_RootNode[r)r')Z efinstancerZefvstrr%r%r& _ef_ROOT_node_Object_from_stringXs r(c Cstd|d}t|}|dt| }t|dsDtdnt|j rV|j }n|j }t|dsptdn|j }t |t rt j|dd} nt |tjr|} n td yt| |}Wn&td |d t |YnXt|d r|j} | |} n|} |t | } |j} y|| | d }Wnxtk ry||| |}WnPtk ry|| |}WntdYnXYntdYnXYnXt|dr|j}x<|D]4}t||}||dk r|||nd|_qW|S)a This is going to be called by PySP and it will call into the user's model's callback. Parameters: ----------- scenario_tree_model: `pysp scenario tree` Standard pysp scenario tree, but with things tacked on: `CallbackModule` : `str` or `types.ModuleType` `CallbackFunction`: `str` or `callable` NOTE: if CallbackFunction is callable, you don't need a module. scenario_name: `str` `cb_data`: optional to pass through to user's callback function Scenario name should end with a number node_names: `None` Not used here Returns: -------- instance: `ConcreteModel` instantiated scenario Note: ---- There is flexibility both in how the function is passed and its signature. z(\d+)$rNCallbackFunctionz@Internal Error: tree needs callback in parmest callback functionCallbackModulez=Internal Error: tree needs CallbackModule in parmest callback)packagez"Internal Error: bad CallbackModulezError getting function=z from module=BootList)experiment_numbercb_dataz4Failed to create instance using callback; TypeError+z)Failed to create instance using callback. ThetaValsF)recompilesearchgrouprrhasattr RuntimeErrorcallabler)r* isinstancestrim import_moduletypes ModuleTypeprintrr,r. TypeErrorr/r'Zfixfixed)Zscenario_tree_modelZ scenario_nameZ node_namesZ scen_num_strZscen_numbasenamecallbackZcb_namemodnameZ cb_modulebootlistexp_numZ scen_namer.r thetavalsrobjectr%r%r& _pysp_instance_creation_callbacksh             rGcCst|}t}|jd|jd|jdx4|D],}|jdt||jdt|q.FirstStageCost_rule)rulecSs |j|jS)N)FirstStageCostre)r]r%r%r&TotalCost_rulesz7Estimator._create_parmest_model..TotalCost_rule)rqZsense)Z pyomo.corerorfrrhpyoZVarrd enumeraterZfind_component_onloggerwarningr?reprricomponent_objectsZ deactivateZ ExpressionrrrXreZminimizeZTotal_Cost_ObjectiveZ parmest_model) r[rTror]r$thetaZvar_cuidZ var_validateobjrprsr%r%r&_create_parmest_modelos4     zEstimator._create_parmest_modelc Cst|tjr(|j|ddf}nlt|tr||}t|trDt|try$t |d}t |}WdQRXWqt ddSn t ddS| |}|S)Nr zUnexpected data format)r7pd DataFramelocto_frame transposerRdictr8openjsonloadr=r|)r[r-r.Zexp_datainfiler]r%r%r&_instance_creation_callbacks       z%Estimator._instance_creation_callbackef_ipoptc/ Cs|dks|dkst|dkr(t|j}nttt|j}|jd}|jd}|j|j|<g|j|<d|j|<d|j|<d|_ |j |_ |dk r||_ |dk r||_ |j|_z tj} dt_tjdd|d } Wd| t_X|d kr:| |_d } | r|jrtd |std }|jdk r:x |jD]} |j| |j| <q W| r|j|j|jd d} t| jdkrr| dj}nd}|jj| n|j|j|jd} nDg}x"|jD]}| |jj!|qWt"j#|j||j|jd\} }| j$%| j$&|j'r t(dt)| j*j+i}x| ,D]\}}|||<qW| -}|rht|j}t|}|}d||||}t|dkrg}x||jj.t/d dD]h}i}xR|D]J}|0t)|}dd|1D}t|dkr|d||<n|||<qW| |qWt23|}|r||||fS|||fS|r.|||fS||fSn|dkr| } d}!td }"td}#td}$t4j5t4j5j6d| _7t4j5t4j5j8d| _9t4j5t4j5j8d| _:t4j5t4j5j;d| _t4j5t4j5j;d| _?t4j5t4j5j8d| _@xLtt|jD]:}%|j|%}&tA| |&}'|'B| j>|%d|'B| j?dqWd|$jd<tCdd,}(|(Dd|(Dd|(Dd|(EWdQRX|#j| |!dtCdd}(|(EWdQRX|"j| |!d| jW|||)fStMd$|dS)%a Set up all thetas as first stage Vars, return resulting theta values as well as the objective function value. NOTE: If thetavals is present it will be attached to the scenario tree so it can be used by the scenario creation callback. Side note (feb 2018, dlw): if you later decide to construct the tree just once and reuse it, then remember to remove thetavals from it when none is desired. Zk_augNrrrrezpyomo.contrib.parmest.parmestrG)ZfsfileZfsfct tree_modelrFzUCalculating both the gap and reduced hessian (covariance) is not currently supported.ipopt)rjZload_solutionsr)rj)Zindependent_variablesrlrjz# Solver termination condition = )Z descend_intocSsg|]}t|qSr%)rtvalue).0_r%r%r& Wsz$Estimator._Q_opt..TZ ipopt_sens) directionrHZ compute_invz ipopt.optwzcompute_red_hessian yes zoutput_file my_ouput.txt zrh_eigendecomp yes z k_aug zk_aug red_hesszresult_red_hess.txtr zUnknown solver in Q_Opt=)NAssertionErrorrOrnrrrIrhZStageVariablesZ StageCostr*rr)r/r,rgr.r ZCUID_repr_versionstZ StochSolverZmake_efZ ef_instancecalc_covr rloptionssolverjZsolutionZgapZ solutionsZ load_fromrSZMASTER_BLEND_VAR_RootNodeinverse_reduced_hessianZinv_reduced_hessian_barrierZ scenario_treeZ"pullScenarioSolutionsFromInstancesZsnapshotSolutionFromScenariosrkr=r8solvertermination_conditionZroot_Var_solutionZ root_E_objryr Zfind_component itervaluesr}r~rtZSuffixZ IMPORT_EXPORTZdualZIMPORTZ ipopt_zL_outZ ipopt_zU_outZEXPORTZ ipopt_zL_inZ ipopt_zU_inZ red_hessianZdof_vZrh_namer(Zset_suffix_valuerwritecloseupdate readlinesZMASTER_OBJECTIVE_EXPRESSIONexprrfloatrr5)/r[r/r return_valuesrCrrZstage1Zstage2Z_cuidverZstsolverZneed_gapkeyZ solve_resultZabsgapZind_varsvZ inv_red_hesrEr"ZsolvalobjvalnrZsseZcovZ var_valuesZexp_ivalsvarZ exp_i_vartempr]Z stream_solverrZsipoptZkaugZ vstrindexrZ varobjectfZHessDictlinesr$Zlineinr!jr%r%r&_Q_opts                                    &   zEstimator._Q_optc Cstd}dd}d|_|j|_||_|j|_|jrBt dt |t |dd}yt |j tjdd}Wnd}YnXd }tjj}d }x|jD]} d t | } t || d}|sZ|jrt d | t d tj||ddd\} } } }}t dt | t | t | t |t |||}|jr2t dt |jj|jjtjjkrZ|tjjkrZ|jj}t||j}t|}||7}qW|t|j}|||fS)a~ Return the objective function value with fixed theta values. Parameters ---------- thetavals: dict A dictionary of theta values. Returns ------- objectiveval: float The objective function value. thetavals: dict A dictionary of all values for theta that were input. solvertermination: Pyomo TerminationCondition Tries to return the "worst" solver status across the scenarios. pyo.TerminationCondition.optimal is the best and pyo.TerminationCondition.infeasible is the worst. rcSsdS)Nr%r%r%r%r&z'Estimator._Q_at_theta..Nz! Compute objective at theta = ZFOO1T)activeFrZ scenario_NODEz Experiment = z6 First solve with with special diagnostics wrapperix)Zmax_iterZ max_cpu_timez: status_obj, solved, iters, time, regularization_stat = z,standard solve solver termination condition=)rtr r*rr)r/rgr.rkr=r8rGnextryZ ConstraintTerminationConditionZoptimalrnipopt_solver_wrapperZipopt_solve_with_statsrrr infeasiblerrmrr)r[rEZ optimizerZ dummy_treerfirstZ sillylittleZ WorstStatusZtotobjZsnumZsnameZ status_objZsolvedZiterstimeZreguresultsZ objobjectrr#r%r%r& _Q_at_thetasP      "     zEstimator._Q_at_thetaTc Cst}|dkrBxtt|j|D]\}}||t|fq Wnxt|D]}d}d}d} xj|t|j kr| stj j |j||d} t|  } tt | }| |krd} |d7}||kr^tdq^W||| fqLW|S)NrF)rLTrzInternal error: timeout constructing a sample, the dim of theta may be too close to the samplesize)rRrurrnrSnpsortrrrhrandomchoicetolistuniquer5) r[ samplesizeZ num_samples replacementZ samplelistr$rattemptsZunique_samplesZ duplicatesampler%r%r&_get_sample_lists,  zEstimator._get_sample_listcCsDt|tstt|tstt|tdtfs2t|j||||dS)ag Parameter estimation using all scenarios in the data Parameters ---------- solver: string, optional "ef_ipopt" or "k_aug". Default is "ef_ipopt". return_values: list, optional List of Variable names used to return values from the model bootlist: list, optional List of bootstrap sample numbers, used internally when calling theta_est_bootstrap calc_cov: boolean, optional If True, calculate and return the covariance matrix (only for "ef_ipopt" solver) Returns ------- objectiveval: float The objective function value thetavals: dict A dictionary of all values for theta variable values: pd.DataFrame Variable values for each variable name in return_values (only for ef_ipopt) Hessian: dict A dictionary of dictionaries for the Hessian. The Hessian is not returned if the solver is ef_ipopt. cov: numpy.array Covariance matrix of the fitted parameters (only for ef_ipopt) N)rrrCr)r7r8rrRtyper)r[rrrCrr%r%r& theta_est#s zEstimator.theta_estcCs"t|tstt|tdtfs$tt|ts2tt|tdtfsHtt|tsVt|dkrht|j}|dk r|tj || |||}t |}| |}tt||_t} x6|D].\} } |jt| d\} } | | d<| | qWttt|j|_|| }t|} |s| d=| S)aK Parameter estimation using bootstrap resampling of the data Parameters ---------- bootstrap_samples: int Number of bootstrap samples to draw from the data samplesize: int or None, optional Size of each bootstrap sample. If samplesize=None, samplesize will be set to the number of samples in the data replacement: bool, optional Sample with or without replacement seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers used in each bootstrap estimation Returns ------- bootstrap_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation N)rCZsamples)r7rrrboolrrnrrseedrmpiuParallelTaskManagerglobal_to_local_datarRrrrSrgallgather_global_datar}r~)r[bootstrap_samplesrrrreturn_samples global_listtask_mgr local_listbootstrap_thetaidxrrrEglobal_bootstrap_thetar%r%r&theta_est_bootstrapHs2      zEstimator.theta_est_bootstrapcCs:t|tstt|tdtfs$tt|tdtfs:tt|tsHtt|j|}|dk rjtj ||j ||dd}t t|}| |}tt||_t} xZ|D]R\} } |jt| d\} } tttt|jt| }t|| d<| | qWttt|j|_|| }t|} |s6| d=| S)a Parameter estimation where N data points are left out of each sample Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers that were left out Returns ------- lNo_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation NF)r)rClNo)r7rrrrrrnrrrrrrrrRrrsetrgrrSrr}r~)r[r lNo_samplesrrrrrrZ lNo_thetarrrrEZlNo_srr%r%r&theta_est_leaveNouts.    zEstimator.theta_est_leaveNoutcCs,t|tstt|tdtfs$tt|ts2t|dks>tt|tsLtt|tdtfsbt|dk rvtj||j }|j ||dd}g} x|D]z\} } |j | ddf|_|jj |_ |\} } |j| d|_|jj |_ ||}|j|||| d\}}| | ||fqW||_|jj |_ | S)a Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used bootstrap_samples: int: Bootstrap sample size distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. seed: int or None, optional Random seed Returns ---------- List of tuples with one entry per lNo_sample: * The first item in each tuple is the list of N samples that are left out. * The second item in each tuple is a DataFrame of theta estimated using the N samples. * The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples. For each DataFrame a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results) N)RectMVNKDEF)r)index) distributionalphastest_theta_values)r7rrrrRrrrrgcopyrrrrnrZdroprconfidence_region_testrS)r[rrrrrrrTrrrrr{rzrZtrainingtestr%r%r&leaveNout_bootstrap_tests2*         z"Estimator.leaveNout_bootstrap_testcCst|tjst|j}|d}tt|}| |}t }x@|D]8}| |\}} } | t j jkrD|t ||gqDW||} t |dg} tj| | d} | S)ao Objective value for each theta Parameters ---------- theta_values: DataFrame, columns=theta_names Values of theta used to compute the objective Returns ------- obj_at_theta: DataFrame Objective value for each theta (infeasible solutions are omitted). recordsr{)rTrQ)r7r}r~rrQto_dictrrrrrRrrtrrrSvaluesr)r[ theta_valuesrhZ all_thetasrZ local_thetasZall_objThetar{ZthetvalsZ worststatusZglobal_all_objZdfcols obj_at_thetar%r%r&objective_at_thetas     zEstimator.objective_at_thetac Cst|tjstt|ttfs"tt|ts0tt|ts>t|}t |j }i}xH|D]@}t j j |d} || |dd||<|d||k||<qZW|r||fS|SdS)a Likelihood ratio test to identify theta values within a confidence region using the :math:`\chi^2` distribution Parameters ---------- obj_at_theta: DataFrame, columns = theta_names + 'obj' Objective values for each theta value (returned by objective_at_theta) obj_value: int or float Objective value from parameter estimation using all data alphas: list List of alpha values to use in the chi2 test return_thresholds: bool, optional Return the threshold value for each alpha Returns ------- LR: DataFrame Objective values for each theta value along with True or False for each alpha thresholds: dictionary If return_threshold = True, the thresholds are also returned. rrr{N)r7r}r~rrrrRrrrrgrstatsZchi2Zppf) r[rZ obj_valuerZreturn_thresholdsZLRSZ thresholdsaZchi2_valr%r%r&likelihood_ratio_test<s  zEstimator.likelihood_ratio_testc Cst|tjst|dkstt|ts*tt|tdttjfsDtt|tr`t| }| }|dk rx| }x8|D].}|dkrt ||\}} ||kj dd|| kj dd@||<|dk r||kj dd|| kj dd@||<q|dkrNt |} | |} tj| d|d} | | k||<|dk r| |} | | k||<q|dkrt|} | | } tj| d|d} | | k||<|dk r| | } | | k||<qW|dk r||fS|SdS) aq Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values Parameters ---------- theta_values: DataFrame, columns = theta_names Theta values used to generate a confidence region (generally returned by theta_est_bootstrap) distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. test_theta_values: dictionary or DataFrame, keys/columns = theta_names, optional Additional theta values that are compared to the confidence region to determine if they are inside or outside. Returns ------- training_results: DataFrame Theta value used to generate the confidence region along with True (inside) or False (outside) for each alpha test_results: DataFrame If test_theta_values is not None, returns test theta value along with True (inside) or False (outside) for each alpha )rrrNrr)Zaxisrdr)r7r}r~rrRrrZSeriesrrrrallrZpdfrrZscoreatpercentiler) r[rrrrZtraining_resultsZ test_resultrZlbZubdistZZscorer%r%r&rhsH          z Estimator.confidence_region_test)NFFN)NN)T)NTNF)NNF)N)F)N)r_r`rarbr\r|rrrrrrrrrrrr%r%r%r&rc?s& 4 zK % ? < Q' ,rc)N) s6 $       %0j"