Make your own environment

Here are the steps required to create a new environment.


Pull requests are welcome!

Set up files

  1. Create a new file in highway_env/envs/

  2. Define a class YourEnv that inherits from AbstractEnv

This class provides several useful functions:

  • A default_config() method, that provides a default configuration dictionary that can be overloaded.

  • A define_spaces() method, that gives access to a choice of observation and action types, set from the environment configuration

  • A step() method, which executes the desired actions (at policy frequency) and simulates the environment (at simulation frequency)

  • A render() method, which renders the environment.

Create the scene

The first step is to create a RoadNetwork that describes the geometry and topology of roads and lanes in the scene. This should be achieved in a YourEnv._make_road() method, called from YourEnv.reset() to set the self.road field.

See Roads for reference, and existing environments as examples.

Create the vehicles

The second step is to populate your road network with vehicles. This should be achieved in a YourEnv._make_vehicles() method, called from YourEnv.reset() to set the self.road.vehicles list of Vehicle.

First, define the controlled ego-vehicle by setting self.vehicle. The class of controlled vehicle depends on the choice of action type, and can be accessed as self.action_type.vehicle_class. Other vehicles can be created more freely, and added to the self.road.vehicles list.

See vehicle behaviors for reference, and existing environments as examples.

Make the environment configurable

To make a part of your environment configurable, overload the default_config() method to define new {"config_key": value} pairs with default values. These configurations can then be accessed in your environment implementation with self.config["config_key"], and once the environment is created, it can be configured with env.configure({"config_key": other_value}) followed by env.reset().

Register the environment

In highway_env/envs/your_env.py, add the following lines:

from gym.envs.registration import register

register(
    id='your-env-v0',
    entry_point='highway_env.envs:YourEnv',
)

and import it from highway_env/envs/__init__.py:

from highway_env.envs.your_env import *


That’s it! You should now be able to run the environment:

import gym
import highway_env

env = gym.make('your-env-v0')
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())


class highway_env.envs.common.abstract.AbstractEnv(config: Optional[dict] = None)[source]

A generic environment for various tasks involving a vehicle driving on a road.

The environment contains a road populated with vehicles, and a controlled ego-vehicle that can change lane and speed. The action space is fixed, but the observation space and reward function must be defined in the environment implementations.


PERCEPTION_DISTANCE

The maximum distance of any vehicle present in the observation [m]

property vehicle: highway_env.vehicle.kinematics.Vehicle

First (default) controlled vehicle.

classmethod default_config() dict[source]

Default environment configuration.

Can be overloaded in environment implementations, or by calling configure().

Returns: a configuration dict

seed(seed: Optional[int] = None) List[int][source]

Sets the seed for this env’s random number generator(s).


Note: Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns (list<bigint>): the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.

define_spaces() None[source]

Set the types and spaces of observation and action from config.

_reward(action: Union[int, numpy.ndarray]) float[source]

Return the reward associated with performing a given action and ending up in the current state.


Parameters: action – the last action performed

Returns: the reward

_is_terminal() bool[source]

Check whether the current state is a terminal state.

Returns: whether the state is terminal

_info(obs: numpy.ndarray, action: Union[int, numpy.ndarray]) dict[source]

Return a dictionary of additional information.

Parameters:

  • obs – current observation

  • action – current action

Returns: the info dict

_cost(action: Union[int, numpy.ndarray]) float[source]

A constraint metric, for budgeted MDP.

If a constraint is defined, it must be used with an alternate reward that doesn’t contain it as a penalty.

Parameters: action – the last action performed

Returns: the constraint signal, the alternate (constraint-free) reward

reset() numpy.ndarray[source]

Reset the environment to its initial configuration.

Returns: the observation of the reset state

_reset() None[source]

Reset the scene: roads and vehicles.

This method must be overloaded by the environments.

step(action: Union[int, numpy.ndarray]) Tuple[numpy.ndarray, float, bool, dict][source]

Perform an action and step the environment dynamics.

The action is executed by the ego-vehicle, and all other vehicles on the road perform their default behaviour for several simulation timesteps until the next decision-making step.

Parameters: action – the action performed by the ego-vehicle

Returns: a tuple (observation, reward, terminal, info)

_simulate(action: Optional[Union[int, numpy.ndarray]] = None) None[source]

Perform several steps of simulation with constant action.

render(mode: str = 'human') Optional[numpy.ndarray][source]

Render the environment.

Create a viewer if none exists, and use it to render an image.

Parameters: mode – the rendering mode

close() None[source]

Close the environment.

Will close the environment viewer if it exists.

get_available_actions() List[int][source]

Get the list of currently available actions.

Lane changes are not available on the boundary of the road, and speed changes are not available at maximal or minimal speed.

Returns: the list of available actions

_automatic_rendering() None[source]

Automatically render the intermediate frames while an action is still ongoing.

This allows rendering the whole video, and not only the single frames corresponding to agent decision-making. If a monitor has been set, use its video recorder to capture intermediate frames.

simplify() highway_env.envs.common.abstract.AbstractEnv[source]

Return a simplified copy of the environment where distant vehicles have been removed from the road.

This is meant to lower the policy computational load while preserving the optimal action set.

Returns: a simplified environment state

change_vehicles(vehicle_class_path: str) highway_env.envs.common.abstract.AbstractEnv[source]

Change the type of all vehicles on the road.

Parameters: vehicle_class_path – The path of the class of behavior for other vehicles. Example: “highway_env.vehicle.behavior.IDMVehicle”

Returns: a new environment with modified behavior model for other vehicles

class highway_env.envs.common.abstract.MultiAgentWrapper(env)[source]

step(action)[source]

Run one timestep of the environment’s dynamics. When the end of the episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters: action (object) – an action provided by the agent

Returns:

  • observation (object): agent’s observation of the current environment

  • reward (float): amount of reward returned after previous action

  • done (bool): whether the episode has ended, in which case further step() calls will return undefined results

  • info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)