B @`@slddlZddlZddlZddlZddlZddlmZddlm Z m Z m Z m ZmZmZmZddlmZddlmmmZddlmZddlmZddlmZddlm Z m!Z!ddl"m#m$m%Z&ddl'm#m$m(Z(ddl)m#m$m*Z*e e@e@Z+e d\Z,Z-e.e/Z0d Z1d d Z2d d Z3ddZ4ddZ5dddZ6Gddde7Z8Gddde7Z9dS)N) combinations)attempt_importnumpynumpy_availablepandaspandas_availablescipyscipy_available)tree_structure)CreateAbstractScenarioTreeModel) SolverFactory)Block ComponentUIDz4pyomo.contrib.interior_point.inverse_reduced_hessiang?c Cs|d}|dkrd}|}n&|d}||d|}|d|}|d}|d}|}x&tt|dD]} t||| }qjWt||}|dkr|Sy t|}Wn YnX||S)a> Create a Pyomo object from a string; it is attached to instance args: instance: a concrete pyomo model vstr: a particular Var or Param (e.g. "pp.Keq_a[2]") output: the object NOTE: We need to deal with blocks and with indexes that might really be strings or ints [N].)findsplitrangelengetattrint) instancevstrlZindexstrZbasestrrpartsnameretvalir"A/tmp/pip-unpacked-wheel-bi3529v6/pyomo/contrib/parmest/parmest.py_object_from_string1s(      r$cCsd|d}t||S)z Wrapper for _object_from_string for PySP extensive forms but only for Vars at the node named RootNode. DLW April 2018: needs work to be generalized. zMASTER_BLEND_VAR_RootNode[r)r$)Z efinstancerZefvstrr"r"r# _ef_ROOT_node_Object_from_stringVs r%c Cstd|d}t|}|dt| }t|dsDtdnt|j rV|j }n|j }t|dsptdn|j }t |t rt j|dd} nt |tjr|} n td yt| |}Wn&td |d t |YnXt|d r|j} | |} n|} |t | } |j} y|| | d }Wnxtk ry||| |}WnPtk ry|| |}WntdYnXYntdYnXYnXt|dr|j}x<|D]4}t||}||dk r|||nd|_qW|S)a This is going to be called by PySP and it will call into the user's model's callback. Parameters: ----------- scenario_tree_model: `pysp scenario tree` Standard pysp scenario tree, but with things tacked on: `CallbackModule` : `str` or `types.ModuleType` `CallbackFunction`: `str` or `callable` NOTE: if CallbackFunction is callable, you don't need a module. scenario_name: `str` `cb_data`: optional to pass through to user's callback function Scenario name should end with a number node_names: `None` Not used here Returns: -------- instance: `ConcreteModel` instantiated scenario Note: ---- There is flexibility both in how the function is passed and its signature. z(\d+)$rNCallbackFunctionz@Internal Error: tree needs callback in parmest callback functionCallbackModulez=Internal Error: tree needs CallbackModule in parmest callback)packagez"Internal Error: bad CallbackModulezError getting function=z from module=BootList)experiment_numbercb_dataz4Failed to create instance using callback; TypeError+z)Failed to create instance using callback. ThetaValsF)recompilesearchgrouprrhasattr RuntimeErrorcallabler&r' isinstancestrim import_moduletypes ModuleTypeprintrr)r+ TypeErrorr,r$Zfixfixed)Zscenario_tree_modelZ scenario_nameZ node_namesZ scen_num_strZscen_numbasenamecallbackZcb_namemodnameZ cb_modulebootlistexp_numZ scen_namer+r thetavalsrobjectr"r"r# _pysp_instance_creation_callbacksh             rDcCst|}t}|jd|jd|jdx4|D],}|jdt||jdt|q.FirstStageCost_rule)rulecSs |j|jS)N)FirstStageCostrb)rZr"r"r#TotalCost_rulesz7Estimator._create_parmest_model..TotalCost_rule)rnZsense)Z pyomo.corerlrcrrepyoZVarra enumeraterZfind_component_onloggerwarningr<reprrfcomponent_objectsZ deactivateZ ExpressionrorUrbZminimizeZTotal_Cost_ObjectiveZ parmest_model) rXrQrlrZr!thetaZvar_cuidZ var_validateobjrmrpr"r"r#_create_parmest_modelms4     zEstimator._create_parmest_modelc Cst|tjr(|j|ddf}nlt|tr||}t|trDt|try$t |d}t |}WdQRXWqt ddSn t ddS| |}|S)NrzUnexpected data format)r4pd DataFramelocto_frame transposerOdictr5openjsonloadr:ry)rXr*r+Zexp_datainfilerZr"r"r#_instance_creation_callbacks       z%Estimator._instance_creation_callbackef_ipoptc/ Cs|dks|dkst|dkr(t|j}nttt|j}|jd}|jd}|j|j|<g|j|<d|j|<d|j|<d|_ |j |_ |dk r||_ |dk r||_ |j|_z tj} dt_tjdd|d } Wd| t_X|d krR| |_d } | r|jrtd |std }|jdk r:x |jD]} |j| |j| <q W| r|j|j|jd d} t| jdkrr| dj}nd}|jj| n|j|j|jd} nDg}x"|jD]}| |jj!|qWt"j#|j||j|jd\} }| j$%| j$&|j'r t(dt)| j*j+i}x| ,D]\}}|||<qW| -}|rt|j}t|}|}d||||}t.j/||0|0d}t|dkr6g}x||jj1t2d dD]h}i}xR|D]J}|3t)|}dd|4D}t|dkr|d||<n|||<qW| |qWt./|}|r,||||fS|||fS|rF|||fS||fSn|dkr| } d}!td }"td}#td}$t5j6t5j6j7d| _8t5j6t5j6j9d| _:t5j6t5j6j9d| _;t5j6t5j6jt5j6t5j6jG| j;t(d!|$j| |!di})i}t(d"tDd#d$}(|(H}*WdQRX| jIJ}xtt|*D]}+i|)|j|+<|*|+},t(|,|,K}-x6tt|-D]&}.tL|-|.|)|j|+|j|.<qW|j|+}&tB| |&}'t5M|'||j|+<qVW|||)fStNd%|dS)&a Set up all thetas as first stage Vars, return resulting theta values as well as the objective function value. NOTE: If thetavals is present it will be attached to the scenario tree so it can be used by the scenario creation callback. Side note (feb 2018, dlw): if you later decide to construct the tree just once and reuse it, then remember to remove thetavals from it when none is desired. Zk_augNrrorbzpyomo.contrib.parmest.parmestrD)ZfsfileZfsfct tree_modelrFzUCalculating both the gap and reduced hessian (covariance) is not currently supported.ipopt)rgZload_solutionsr)rg)Zindependent_variablesrirgz# Solver termination condition = )indexrN)Z descend_intocSsg|]}t|qSr")rqvalue).0_r"r"r# Vsz$Estimator._Q_opt..TZ ipopt_sens) directionrEZ compute_invz ipopt.optwzcompute_red_hessian yes zoutput_file my_ouput.txt zrh_eigendecomp yes z k_aug zk_aug red_hesszresult_red_hess.txtrzUnknown solver in Q_Opt=)OAssertionErrorrLrkrrrFreZStageVariablesZ StageCostr'rr&r,r)rdr+r ZCUID_repr_versionstZ StochSolverZmake_efZ ef_instancecalc_covr rioptionssolvergZsolutionZgapZ solutionsZ load_fromrPZMASTER_BLEND_VAR_RootNodeinverse_reduced_hessianZinv_reduced_hessian_barrierZ scenario_treeZ"pullScenarioSolutionsFromInstancesZsnapshotSolutionFromScenariosrhr:r5solvertermination_conditionZroot_Var_solutionZ root_E_objrzr{keysrvr Zfind_component itervaluesrqZSuffixZ IMPORT_EXPORTZdualZIMPORTZ ipopt_zL_outZ ipopt_zU_outZEXPORTZ ipopt_zL_inZ ipopt_zU_inZ red_hessianZdof_vZrh_namer%Zset_suffix_valuerwritecloseupdate readlinesZMASTER_OBJECTIVE_EXPRESSIONexprrfloatrr2)/rXr,r return_valuesr@rrZstage1Zstage2Z_cuidverZstsolverZneed_gapkeyZ solve_resultZabsgapZind_varsvZ inv_red_hesrBrZsolvalobjvalnrZsseZcovZ var_valuesZexp_ivalsvarZ exp_i_vartemprZZ stream_solverrZsipoptZkaugZ vstrindexrZ varobjectfZHessDictlinesr!Zlineinrjr"r"r#_Q_opts                                     &   zEstimator._Q_optc Cstd}dd}d|_|j|_||_|j|_|jrBt dt |t |dd}yt |j tjdd}Wnd}YnXd }tjj}d }x|jD]} d t | } t || d}|sZ|jrt d | t d tj||ddd\} } } }}t dt | t | t | t |t |||}|jr2t dt |jj|jjtjjkrZ|tjjkrZ|jj}t||j}t|}||7}qW|t|j}|||fS)a~ Return the objective function value with fixed theta values. Parameters ---------- thetavals: dict A dictionary of theta values. Returns ------- objectiveval: float The objective function value. thetavals: dict A dictionary of all values for theta that were input. solvertermination: Pyomo TerminationCondition Tries to return the "worst" solver status across the scenarios. pyo.TerminationCondition.optimal is the best and pyo.TerminationCondition.infeasible is the worst. rcSsdS)Nr"r"r"r"r#z'Estimator._Q_at_theta..Nz! Compute objective at theta = ZFOO1T)activeFrZ scenario_NODEz Experiment = z6 First solve with with special diagnostics wrapperix)Zmax_iterZ max_cpu_timez: status_obj, solved, iters, time, regularization_stat = z,standard solve solver termination condition=)rqr r'rr&r,rdr+rhr:r5rDnextrvZ ConstraintTerminationConditionZoptimalrkipopt_solver_wrapperZipopt_solve_with_statsrrr infeasiblerrjrr)rXrBZ optimizerZ dummy_treerfirstZ sillylittleZ WorstStatusZtotobjZsnumZsnameZ status_objZsolvedZiterstimeZreguresultsZ objobjectrr r"r"r# _Q_at_thetasP      "     zEstimator._Q_at_thetaTc Cst}|dkrBxtt|j|D]\}}||t|fq Wnxt|D]}d}d}d} xj|t|j kr| stj j |j||d} t|  } tt | }| |krd} |d7}||kr^tdq^W||| fqLW|S)NrF)rITrzInternal error: timeout constructing a sample, the dim of theta may be too close to the samplesize)rOrrrrkrPnpsortrrrerandomchoicetolistuniquer2) rX samplesizeZ num_samples replacementZ samplelistr!rattemptsZunique_samplesZ duplicatesampler"r"r#_get_sample_lists,  zEstimator._get_sample_listcCsDt|tstt|tstt|tdtfs2t|j||||dS)aP Parameter estimation using all scenarios in the data Parameters ---------- solver: string, optional "ef_ipopt" or "k_aug". Default is "ef_ipopt". return_values: list, optional List of Variable names used to return values from the model bootlist: list, optional List of bootstrap sample numbers, used internally when calling theta_est_bootstrap calc_cov: boolean, optional If True, calculate and return the covariance matrix (only for "ef_ipopt" solver) Returns ------- objectiveval: float The objective function value thetavals: dict A dictionary of all values for theta variable values: pd.DataFrame Variable values for each variable name in return_values (only for solver='ef_ipopt') Hessian: dict A dictionary of dictionaries for the Hessian (only for solver='k_aug') cov: pd.DataFrame Covariance matrix of the fitted parameters (only for solver='ef_ipopt') N)rrr@r)r4r5rrOtyper)rXrrr@rr"r"r# theta_est#s zEstimator.theta_estcCs"t|tstt|tdtfs$tt|ts2tt|tdtfsHtt|tsVt|dkrht|j}|dk r|tj || |||}t |}| |}tt||_t} x6|D].\} } |jt| d\} } | | d<| | qWttt|j|_|| }t|} |s| d=| S)aT Parameter estimation using bootstrap resampling of the data Parameters ---------- bootstrap_samples: int Number of bootstrap samples to draw from the data samplesize: int or None, optional Size of each bootstrap sample. If samplesize=None, samplesize will be set to the number of samples in the data replacement: bool, optional Sample with or without replacement seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers used in each bootstrap estimation Returns ------- bootstrap_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation N)r@Zsamples)r4rrrboolrrkrrseedrmpiuParallelTaskManagerglobal_to_local_datarOrrrPrdallgather_global_datarzr{)rXbootstrap_samplesrrrreturn_samples global_listtask_mgr local_listbootstrap_thetaidxrrrBglobal_bootstrap_thetar"r"r#theta_est_bootstrapGs2      zEstimator.theta_est_bootstrapcCs:t|tstt|tdtfs$tt|tdtfs:tt|tsHtt|j|}|dk rjtj ||j ||dd}t t|}| |}tt||_t} xZ|D]R\} } |jt| d\} } tttt|jt| }t|| d<| | qWttt|j|_|| }t|} |s6| d=| S)a Parameter estimation where N data points are left out of each sample Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used seed: int or None, optional Random seed return_samples: bool, optional Return a list of sample numbers that were left out Returns ------- lNo_theta: DataFrame Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation NF)r)r@lNo)r4rrrrrrkrrrrrrrrOrrsetrdrrPrrzr{)rXr lNo_samplesrrrrrrZ lNo_thetarrrrBZlNo_srr"r"r#theta_est_leaveNouts.    zEstimator.theta_est_leaveNoutcCs,t|tstt|tdtfs$tt|ts2t|dks>tt|tsLtt|tdtfsbt|dk rvtj||j }|j ||dd}g} x|D]z\} } |j | ddf|_|jj |_ |\} } |j| d|_|jj |_ ||}|j|||| d\}}| | ||fqW||_|jj |_ | S)a Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis Parameters ---------- lNo: int Number of data points to leave out for parameter estimation lNo_samples: int Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used bootstrap_samples: int: Bootstrap sample size distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. seed: int or None, optional Random seed Returns ---------- List of tuples with one entry per lNo_sample: * The first item in each tuple is the list of N samples that are left out. * The second item in each tuple is a DataFrame of theta estimated using the N samples. * The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples. For each DataFrame a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results) N)RectMVNKDEF)r)r) distributionalphastest_theta_values)r4rrrrOrrrrdcopyrr|rrkrZdroprconfidence_region_testrP)rXrrrrrrrQrrrrrxrwrZtrainingtestr"r"r#leaveNout_bootstrap_tests2*         z"Estimator.leaveNout_bootstrap_testcCst|tjst|j}|d}tt|}| |}t }x@|D]8}| |\}} } | t j jkrD|t ||gqDW||} t |dg} tj| | d} | S)ao Objective value for each theta Parameters ---------- theta_values: DataFrame, columns=theta_names Values of theta used to compute the objective Returns ------- obj_at_theta: DataFrame Objective value for each theta (infeasible solutions are omitted). recordsrx)rQrN)r4rzr{rrNto_dictrrrrrOrrqrrrPvaluesr)rX theta_valuesreZ all_thetasrZ local_thetasZall_objThetarxZthetvalsZ worststatusZglobal_all_objZdfcols obj_at_thetar"r"r#objective_at_thetas     zEstimator.objective_at_thetac Cst|tjstt|ttfs"tt|ts0tt|ts>t|}t |j }i}xH|D]@}t j j |d} || |dd||<|d||k||<qZW|r||fS|SdS)a Likelihood ratio test to identify theta values within a confidence region using the :math:`\chi^2` distribution Parameters ---------- obj_at_theta: DataFrame, columns = theta_names + 'obj' Objective values for each theta value (returned by objective_at_theta) obj_value: int or float Objective value from parameter estimation using all data alphas: list List of alpha values to use in the chi2 test return_thresholds: bool, optional Return the threshold value for each alpha Returns ------- LR: DataFrame Objective values for each theta value along with True or False for each alpha thresholds: dictionary If return_threshold = True, the thresholds are also returned. rrrxN)r4rzr{rrrrOrrrrdrstatsZchi2Zppf) rXrZ obj_valuerZreturn_thresholdsZLRSZ thresholdsaZchi2_valr"r"r#likelihood_ratio_test;s  zEstimator.likelihood_ratio_testc Cst|tjst|dkstt|ts*tt|tdttjfsDtt|tr`t| }| }|dk rx| }x>|D]4}|dkrt ||\}} ||kj dd|| kj dd@||<|dk r||kj dd|| kj dd@||<q|dkrRt |} | |} tj| d|d} | | k||<|dk r| |} | | k||<q|dkrt |} | | } tj| d|d} | | k||<|dk r| | } | | k||<qW|dk r||fS|SdS) aq Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values Parameters ---------- theta_values: DataFrame, columns = theta_names Theta values used to generate a confidence region (generally returned by theta_est_bootstrap) distribution: string Statistical distribution used to define a confidence region, options = 'MVN' for multivariate_normal, 'KDE' for gaussian_kde, and 'Rect' for rectangular. alphas: list List of alpha values used to determine if theta values are inside or outside the region. test_theta_values: dictionary or DataFrame, keys/columns = theta_names, optional Additional theta values that are compared to the confidence region to determine if they are inside or outside. Returns ------- training_results: DataFrame Theta value used to generate the confidence region along with True (inside) or False (outside) for each alpha test_results: DataFrame If test_theta_values is not None, returns test theta value along with True (inside) or False (outside) for each alpha )rrrNrr)Zaxisrdr)r4rzr{rrOrrZSeriesr}r~rgraphicsZ fit_rect_distallZ fit_mvn_distZpdfrrZscoreatpercentileZ fit_kde_dist) rXrrrrZtraining_resultsZ test_resultrZlbZubdistZZscorer"r"r#rgsH            z Estimator.confidence_region_test)NFFN)NN)T)NTNF)NNF)N)F)N)r\r]r^r_rYryrrrrrrrrrrrr"r"r"r#r`=s& 4 |K $ ? < Q' ,r`)N):r- importlibr6loggingr8r itertoolsrZpyomo.common.dependenciesrrrrrrzrrr Z pyomo.environenvironrqZpyomo.pysp.util.rapperZpysputilZrapperrZpyomo.pysp.scenariotreer Z,pyomo.pysp.scenariotree.tree_structure_modelr Z pyomo.optr r rZpyomo.contrib.parmest.mpi_utilscontribZparmestZ mpi_utilsrZ*pyomo.contrib.parmest.ipopt_solver_wrapperrZpyomo.contrib.parmest.graphicsrZparmest_availablerZ!inverse_reduced_hessian_available getLoggerr\rs __version__r$r%rDrLrTrCrUr`r"r"r"r# s6 $       %0j"