B _@szddlZddlZddlZddlZddlmZddlmZ m Z m Z m Z mZmZe e @e@ZddlmZddlmmmZddlmZddlmZddlmZddlmm m!Z"ddl#mm m$Z$ddl%m&Z&m'Z'm(Z(m)Z)m*Z*m+Z+e rerddl,m-Z-e-.Z/nd Z/e/r(dd l0m1Z1d Z2d d Z3ddZ4ddZ5ddZ6dddZ7Gddde8Z9Gddde8Z:dS)N) combinations)numpynumpy_availablepandaspandas_availablescipyscipy_available)CreateAbstractScenarioTreeModel) SolverFactory)Block) pairwise_plotgrouped_boxplotgrouped_violinplot fit_rect_dist fit_mvn_dist fit_kde_dist) AmplInterfaceF)inv_reduced_hessian_barrierg?c Cs|d}|dkrd}|}n&|d}||d|}|d|}|d}|d}|}x&tt|dD]} t||| }qjWt||}|dkr|Sy t|}Wn YnX||S)a@ Create a Pyomo object from a string; it is attached to instance args: instance: a concrete pyomo model vstr: a particular Var or Param (e.g. "pp.Keq_a[2]") output: the object NOTE: We need to deal with blocks and with indexes that might really be strings or ints [N].)findsplitrangelengetattrint) instancevstrlZindexstrZbasestrrpartsnameretvalir'A/tmp/pip-unpacked-wheel-d4p3hk07/pyomo/contrib/parmest/parmest.py_object_from_string/s(      r)cCsd|d}t||S)z Wrapper for _object_from_string for PySP extensive forms but only for Vars at the node named RootNode. DLW April 2018: needs work to be generalized. zMASTER_BLEND_VAR_RootNode[r)r))Z efinstancer Zefvstrr'r'r( _ef_ROOT_node_Object_from_stringTs r*c Cstd|d}t|}|dt| }t|dsDtdnt|j rV|j }n|j }t|dsptdn|j }t |t rt j|dd} nt |tjr|} n td yt| |}Wn&td |d t |YnXt|d r|j} | |} n|} |t | } |j} y|| | d }Wnxtk ry||| |}WnPtk ry|| |}WntdYnXYntdYnXYnXt|dr|j}x<|D]4}t||}||dk r|||nd|_qW|S)a This is going to be called by PySP and it will call into the user's model's callback. Parameters: ----------- scenario_tree_model: `pysp scenario tree` Standard pysp scenario tree, but with things tacked on: `CallbackModule` : `str` or `types.ModuleType` `CallbackFunction`: `str` or `callable` NOTE: if CallbackFunction is callable, you don't need a module. scenario_name: `str` `cb_data`: optional to pass through to user's callback function Scenario name should end with a number node_names: `None` Not used here Returns: -------- instance: `ConcreteModel` instantiated scenario Note: ---- There is flexibility both in how the function is passed and its signature. z(\d+)$rNCallbackFunctionz@Internal Error: tree needs callback in parmest callback functionCallbackModulez=Internal Error: tree needs CallbackModule in parmest callback)packagez"Internal Error: bad CallbackModulezError getting function=z from module=BootList)experiment_numbercb_dataz4Failed to create instance using callback; TypeError+z)Failed to create instance using callback. ThetaValsF)recompilesearchgrouprrhasattr RuntimeErrorcallabler+r, isinstancestrim import_moduletypes ModuleTypeprintrr.r0 TypeErrorr1r)Zfixfixed)Zscenario_tree_modelZ scenario_nameZ node_namesZ scen_num_strZscen_numbasenamecallbackZcb_namemodnameZ cb_modulebootlistexp_numZ scen_namer0r thetavalsr objectr'r'r( _pysp_instance_creation_callbacksh             rIcCst|}t}|jd|jd|jdx4|D],}|jdt||jdt|q|jD]4}ytd|}d|_ WqHt |d YqHXqHW|j rx| |D] }| qWd d }tj|d |_tjt|j |d |_d d}tj|tjd|_||_|S)zA Modify the Pyomo model for parameter estimation r) Objectiverrfg?)Z initializezmodel.Fz is not a variablecSsdS)Nrr')r_r'r'r(FirstStageCost_rulesz.FirstStageCost_rule)rulecSs |j|jS)N)FirstStageCostrg)r_r'r'r(TotalCost_rulesz7Estimator._create_parmest_model..TotalCost_rule)rsZsense)Z pyomo.corerqrhrrjpyoZVarrfevalrAr?rkcomponent_objectsZ deactivateZ ExpressionrtrZrgZminimizeZTotal_Cost_ObjectiveZ parmest_model) r]rVrqr_thetaZ var_validateobjrrrur'r'r(_create_parmest_modelks(      zEstimator._create_parmest_modelc Cst|tjr(|j|ddf}nlt|tr||}t|trDt|try$t |d}t |}WdQRXWqt ddSn t ddS| |}|S)Nr"zUnexpected data format)r9pd DataFramelocto_frame transposerTdictr:openjsonloadr?r{)r]r/r0Zexp_datainfiler_r'r'r(_instance_creation_callbacks       z%Estimator._instance_creation_callbackef_ipoptc. Cs|dks|dkst|dkr(t|j}nttt|j}|jd}|jd}|j|j|<g|j|<d|j|<d|j|<d|_ |j |_ |dk r||_ |dk r||_ |j|_tjdd|d } |d kr0| |_d } | r|jrtd |std }|jdk r x |jD]} |j| |j| <qW| rl|j|j|jd d} t| jdkrX| dj} nd} |jj| n|j|j|jd} nRtstdnBg}x"|jD]}| |jj!|qWt"|j||j|jd\} }| j#$| j#%|j&rt'dt(| j)j*i}x| +D]\}}|||<qW| ,}|r\t|j}t|}|}d||||}t|dkrg}x~| jj-t.d dD]j}i}xT|D]L}t/dt(|}dd|0D}t|dkr|d||<n|||<qW| |qWt12|}|r ||||fS|||fS|r$|||fS||fSn|dkr| }d} td }!td}"td}#t3j4t3j4j5d|_6t3j4t3j4j7d|_8t3j4t3j4j7d|_9t3j4t3j4j:d|_;t3j4t3j4j:d|_t3j4t3j4j7d|_?xLtt|jD]:}$|j|$}%t@||%}&|&A|j=|$d|&A|j>dqWd|#jd<tBdd,}'|'Cd|'Cd |'Cd!|'DWdQRX|"j|| dtBdd}'|'DWdQRX|!j|| d|j;E|j8|j1sz$Estimator._Q_opt..TZ ipopt_sens) directionrJZ compute_invz ipopt.optwzcompute_red_hessian yes zoutput_file my_ouput.txt zrh_eigendecomp yes z k_aug zk_aug red_hesszresult_red_hess.txtr"zUnknown solver in Q_Opt=)MAssertionErrorrQrprrrKrjZStageVariablesZ StageCostr,rr+r1r.rir0stZ StochSolverZmake_efZ ef_instancecalc_covr rnoptionssolverlZsolutionZgapZ solutionsZ load_from asl_available ImportErrorrUZMASTER_BLEND_VAR_RootNoderZ scenario_treeZ"pullScenarioSolutionsFromInstancesZsnapshotSolutionFromScenariosrmr?r:solvertermination_conditionZroot_Var_solutionZ root_E_objrxr rw itervaluesr|r}rvZSuffixZ IMPORT_EXPORTZdualZIMPORTZ ipopt_zL_outZ ipopt_zU_outZEXPORTZ ipopt_zL_inZ ipopt_zU_inZ red_hessianZdof_vZrh_namer*Zset_suffix_valuerwritecloseupdate readlinesZMASTER_OBJECTIVE_EXPRESSIONexprrfloatrr7).r]r1r return_valuesrErrZstage1Zstage2ZstsolverZneed_gapkeyZ solve_resultZabsgapZind_varsvZ inv_red_hesrGr$Zsolvalobjvalnr!ZsseZcovZ var_valuesZexp_ivalsvarZ exp_i_vartempr_Z stream_solverrZsipoptZkaugZ vstrindexr Z varobjectfZHessDictlinesr&Zlineinr#jr'r'r(_Q_opts                                   &   zEstimator._Q_optc Cstd}dd}d|_|j|_||_|j|_|jrBt dt |t |dd}yt |j tjdd}Wnd}YnXd }tjj}d }x|jD]} d t | } t || d}|sZ|jrt d | t d tj||ddd\} } } }}t dt | t | t | t |t |||}|jr2t dt |jj|jjtjjkrZ|tjjkrZ|jj}t||j}t|}||7}qW|t|j}|||fS)a~ Return the objective function value with fixed theta values. Parameters ---------- thetavals: dict A dictionary of theta values. Returns ------- objectiveval: float The objective function value. thetavals: dict A dictionary of all values for theta that were input. solvertermination: Pyomo TerminationCondition Tries to return the "worst" solver status across the scenarios. pyo.TerminationCondition.optimal is the best and pyo.TerminationCondition.infeasible is the worst. rcSsdS)Nr'r'r'r'r(z'Estimator._Q_at_theta..Nz! Compute objective at theta = ZFOO1T)activeFrZ scenario_NODEz Experiment = z6 First solve with with special diagnostics wrapperix)Zmax_iterZ max_cpu_timez: status_obj, solved, iters, time, regularization_stat = z,standard solve solver termination condition=)rvr r,rr+r1rir0rmr?r:rInextrxZ ConstraintTerminationConditionZoptimalrpipopt_solver_wrapperZipopt_solve_with_statsrrr infeasiblerrorr)r]rGZ optimizerZ dummy_treerfirstZ sillylittleZ WorstStatusZtotobjZsnumZsnameZ status_objZsolvedZiterstimeZreguresultsZ objobjectrr%r'r'r( _Q_at_thetasP      "     zEstimator._Q_at_thetaTc Cst}|dkrBxtt|j|D]\}}||t|fq Wnxt|D]}d}d}d} xj|t|j kr| stj j |j||d} t|  } tt | }| |krd} |d7}||kr^tdq^W||| fqLW|S)NrF)rNTrzInternal error: timeout constructing a sample, the dim of theta may be too close to the samplesize)rT enumeraterrprUnpsortrrrjrandomchoicetolistuniquer7) r] samplesizeZ num_samples replacementZ samplelistr&r!attemptsZunique_samplesZ duplicatesampler'r'r(_get_sample_lists,  zEstimator._get_sample_listcCsDt|tstt|tstt|tdtfs2t|j||||dS)ag Parameter estimation using all scenarios in the data Parameters ---------- solver: string, optional "ef_ipopt" or "k_aug". Default is "ef_ipopt". return_values: list, optional List of Variable names used to return values from the model bootlist: list, optional List of bootstrap sample numbers, used internally when calling theta_est_bootstrap calc_cov: boolean, optional If True, calculate and return the covariance matrix (only for "ef_ipopt" solver) Returns ------- objectiveval: float The objective function value thetavals: dict A dictionary of all values for theta variable values: pd.DataFrame Variable values for each variable name in return_values (only for ef_ipopt) Hessian: dict A dictionary of dictionaries for the Hessian. The Hessian is not returned if the solver is ef_ipopt. cov: numpy.array Covariance matrix of the fitted parameters (only for ef_ipopt) N)rrrEr)r9r:rrTtyper)r]rrrErr'r'r( theta_ests zEstimator.theta_estcCs"t|tstt|tdtfs$tt|ts2tt|tdtfsHtt|tsVt|dkrht|j}|dk r|tj || |||}t |}| |}tt||_t} x6|D].\} } |jt| d\} } | | d<| | qWttt|j|_|| }t|} |s| d=| S)aK Parameter estimation using bootstrap resampling of the data Parameters ---------- bootstrap_samples: int Number of bootstrap samples to draw from the data samplesize: int or None, optional Size of each bootstrap sample. If samplesize=None, samplesize will be set to the number of samples in the data replacement: bool, optional Sample with or without replacement seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers used in each bootstrap estimation Returns ------- bootstrap_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation N)rEZsamples)r9rrrboolrrprrseedrmpiuParallelTaskManagerglobal_to_local_datarTrrrUriallgather_global_datar|r})r]bootstrap_samplesrrrreturn_samples global_listtask_mgr local_listbootstrap_thetaidxrrrGglobal_bootstrap_thetar'r'r(theta_est_bootstrap"s2      zEstimator.theta_est_bootstrapcCs:t|tstt|tdtfs$tt|tdtfs:tt|tsHtt|j|}|dk rjtj ||j ||dd}t t|}| |}tt||_t} xZ|D]R\} } |jt| d\} } tttt|jt| }t|| d<| | qWttt|j|_|| }t|} |s6| d=| S)a Parameter estimation where N data points are left out of each sample Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers that were left out Returns ------- lNo_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation NF)r)rElNo)r9rrrrrrprrrrrrrrTrrsetrirrUrr|r})r]r lNo_samplesrrrrrrZ lNo_thetarrrrGZlNo_srr'r'r(theta_est_leaveNoutbs.    zEstimator.theta_est_leaveNoutcCs,t|tstt|tdtfs$tt|ts2t|dks>tt|tsLtt|tdtfsbt|dk rvtj||j }|j ||dd}g} x|D]z\} } |j | ddf|_|jj |_ |\} } |j| d|_|jj |_ ||}|j|||| d\}}| | ||fqW||_|jj |_ | S)a Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used bootstrap_samples: int: Bootstrap sample size distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. seed: int or None, optional Random seed Returns ---------- List of tuples with one entry per lNo_sample: * The first item in each tuple is the list of N samples that are left out. * The second item in each tuple is a DataFrame of theta estimated using the N samples. * The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples. For each DataFrame a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results) N)RectMVNKDEF)r)index) distributionalphastest_theta_values)r9rrrrTrrrricopyrr~rrprZdroprconfidence_region_testrU)r]rrrrrrrVrrrrrzryrZtrainingtestr'r'r(leaveNout_bootstrap_tests2*         z"Estimator.leaveNout_bootstrap_testcCst|tjst|j}|d}tt|}| |}t }x@|D]8}| |\}} } | t j jkrD|t ||gqDW||} t |dg} tj| | d} | S)ao Objective value for each theta Parameters ---------- theta_values: DataFrame, columns=theta_names Values of theta used to compute the objective Returns ------- obj_at_theta: DataFrame Objective value for each theta (infeasible solutions are omitted). recordsrz)rVrS)r9r|r}rrSto_dictrrrrrTrrvrrrUvaluesr)r] theta_valuesrjZ all_thetasrZ local_thetasZall_objThetarzZthetvalsZ worststatusZglobal_all_objZdfcols obj_at_thetar'r'r(objective_at_thetas     zEstimator.objective_at_thetac Cst|tjstt|ttfs"tt|ts0tt|ts>t|}t |j }i}xH|D]@}t j j |d} || |dd||<|d||k||<qZW|r||fS|SdS)a Likelihood ratio test to identify theta values within a confidence region using the :math:`\chi^2` distribution Parameters ---------- obj_at_theta: DataFrame, columns = theta_names + 'obj' Objective values for each theta value (returned by objective_at_theta) obj_value: int or float Objective value from parameter estimation using all data alphas: list List of alpha values to use in the chi2 test return_thresholds: bool, optional Return the threshold value for each alpha Returns ------- LR: DataFrame Objective values for each theta value along with True or False for each alpha thresholds: dictionary If return_threshold = True, the thresholds are also returned. rrrzN)r9r|r}rrrrTrrrrirstatsZchi2Zppf) r]rZ obj_valuerZreturn_thresholdsZLRSZ thresholdsaZchi2_valr'r'r(likelihood_ratio_tests  zEstimator.likelihood_ratio_testc Cst|tjst|dkstt|ts*tt|tdttjfsDtt|tr`t| }| }|dk rx| }x8|D].}|dkrt ||\}} ||kj dd|| kj dd@||<|dk r||kj dd|| kj dd@||<q|dkrNt |} | |} tj| d|d} | | k||<|dk r| |} | | k||<q|dkrt|} | | } tj| d|d} | | k||<|dk r| | } | | k||<qW|dk r||fS|SdS) aq Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values Parameters ---------- theta_values: DataFrame, columns = theta_names Theta values used to generate a confidence region (generally returned by theta_est_bootstrap) distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. test_theta_values: dictionary or DataFrame, keys/columns = theta_names, optional Additional theta values that are compared to the confidence region to determine if they are inside or outside. Returns ------- training_results: DataFrame Theta value used to generate the confidence region along with True (inside) or False (outside) for each alpha test_results: DataFrame If test_theta_values is not None, returns test theta value along with True (inside) or False (outside) for each alpha )rrrNrr)Zaxisrdr)r9r|r}rrTrrZSeriesrrrrallrZpdfrrZscoreatpercentiler) r]rrrrZtraining_resultsZ test_resultrZlbZubdistZZscorer'r'r(rBsH          z Estimator.confidence_region_test)NFFN)NN)T)NTNF)NNF)N)F)N)rarbrcrdr^r{rrrrrrrrrrrr'r'r'r(re;s& $ hK % ? < Q' ,re)N);r2 importlibr;r=r itertoolsrZpyomo.common.dependenciesrrrrr|rrrZparmest_availableZ pyomo.environenvironrvZpyomo.pysp.util.rapperZpysputilZrapperrZ,pyomo.pysp.scenariotree.tree_structure_modelr Z pyomo.optr r Zpyomo.contrib.parmest.mpi_utilscontribZparmestZ mpi_utilsrZ*pyomo.contrib.parmest.ipopt_solver_wrapperrZpyomo.contrib.parmest.graphicsr r rrrrZpyomo.contrib.pynumero.aslr availablerZ4pyomo.contrib.interior_point.inverse_reduced_hessianr __version__r)r*rIrQrYrHrZrer'r'r'r( s8            %0j"