3 *] @s ddlmZddlmZmZmZmZddlmZm Z m Z m Z ddl m Z ddlmZddlmZddlmZddlmZdd lmZmZmZdd lmZdd lmZdd lmZdd l m!Z!eZ"ede"_#e de"_$e de"_%e de"_&eZ'de'j(d_)de'j(dj*d_+de'j(dj,_+de'j(d_-de'j(d_.de'j(d_/de'j0_1eddde'j0_2de'j0_3de'j0_4de'j0_5de'j0_6de'j0_7e de'j0_8e de'j0_9ee'_:ej;d'fe'j<_=edd Z>e Z?d!e?_@eZAdeA_Bd"eA_CdeA_DeZEdeE_Fd#eE_Gd$eE_Hee'e?e"eAeEd%ZId&S)()ClippedPPOAgentParameters)VisualizationParametersPresetValidationParametersTaskParameters Frameworks) TrainingStepsEnvironmentEpisodesEnvironmentStepsRunPhase)GymVectorEnvironment)BasicRLGraphManager)ScheduleParameters)LinearSchedule)CategoricalParameters) NoInputFilterNoOutputFilter InputFilter)ObservationStackingFilter)ObservationToUInt8Filter)MemoryGranularity) environmentsi(ga2U0*3?mainrelu observation@gh㈵>g+?g?g?i@Bg{Gz?gffffff? T)is_a_reference_filterzRover-TrainingGrounds-v2ii) agent_params env_paramsschedule_params vis_paramspreset_validation_paramsNi)JZ!rl_coach.agents.clipped_ppo_agentrrl_coach.base_parametersrrrrrl_coach.core_typesrrr r Z%rl_coach.environments.gym_environmentr Z.rl_coach.graph_managers.basic_rl_graph_managerr Z%rl_coach.graph_managers.graph_managerr Zrl_coach.schedulesrZ)rl_coach.exploration_policies.categoricalrrl_coach.filters.filterrrrZ8rl_coach.filters.observation.observation_stacking_filterrZ8rl_coach.filters.observation.observation_to_uint8_filterrZrl_coach.memories.memoryrmarkovrr#Z improve_stepsZ steps_between_evaluation_periodsZevaluation_stepsZ heatup_stepsr!network_wrappers learning_rateinput_embedders_parametersZactivation_functionmiddleware_parameters batch_sizeoptimizer_epsilonadam_optimizer_beta2 algorithmZ#clip_likelihood_ratio_using_epsilonZclipping_decay_scheduleZ beta_entropyZ gae_lambdadiscountZoptimization_epochsZestimate_state_value_using_gae2num_steps_between_copying_online_weights_to_targetnum_consecutive_playing_steps explorationZ Transitionsmemorymax_sizeZTraining_Ground_Filterr"levelr$dump_csv$dump_signals_to_csv_every_x_episodes tensorboardr%testmin_reward_thresholdmax_episodes_to_achieve_reward graph_managerr@r@"markov/presets/training_grounds.pys`