Can AI agents think and act in real time? This comes naturally to humans but remains challenging for AI agents. We built a platform to evaluate and advance the real-time reasoning capabilities of language agents.

Interactive Demo

Play with different cognitive loads and time pressures. Notice how the three agents think and act differently. Use the "Random" button to sample different settings quickly.

Snake Tests an agent's ability to steer a growing snake toward food that appears for only a limited time. Notice how reactive agents often trap themselves in cul-de-sacs, while planning agents miss newly spawned food because they react more slowly.

Why do we want agents to be real-time?

The time current LLM-based agents spend thinking is not negligible. In real-world tasks, an agent that thinks for too long may miss critical opportunities or fail to respond to urgent situations.

What is AgileThinker?

Reactive Agent

Bounded computation ensures timeliness but sacrifices decision quality when cognitive load is high.

Planning Agent

Unbounded computation improves decision quality but struggles under time pressure.

AgileThinker

AgileThinker avoids this trade-off by running a reactive system and a planning system in parallel: the planner's streaming output is immediately fed into the reactive system to improve its fast decisions.

Result: Robust performance across different levels of cognitive load and time pressure.

What did we find?

Experiments show that under cognitive load and time pressure, AgileThinker, which combines System 1 and System 2 reasoning across two LLMs, substantially outperforms agents that use only one. Scores are normalized to [0, 1] per game and then averaged.
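The per-game normalization step can be sketched as follows; the game names, agent names, and raw scores here are illustrative placeholders, not results from the paper:

```python
def normalize_scores(raw_scores):
    """Min-max normalize each game's scores to [0, 1] across agents,
    then average per agent over all games. Illustrative sketch, not the
    paper's actual evaluation code."""
    agents = {a for per_game in raw_scores.values() for a in per_game}
    totals = {a: 0.0 for a in agents}
    for per_game in raw_scores.values():
        lo, hi = min(per_game.values()), max(per_game.values())
        span = (hi - lo) or 1.0  # avoid division by zero on ties
        for agent, score in per_game.items():
            totals[agent] += (score - lo) / span
    return {a: t / len(raw_scores) for a, t in totals.items()}
```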

Using the realtimegym Python Package

Installation

Install the Real-Time Reasoning Gym package:

git clone git@github.com:SALT-NLP/RealtimeGym.git
cd RealtimeGym
pip install -e .

Quick Start

Get started with a simple example:

import realtimegym

# Create environment
env, seed, renderer = realtimegym.make('Freeway-v0')
obs, done = env.reset()
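The reset/step signatures above suggest a simple interaction loop. A minimal sketch follows; `CountdownEnv` is an invented stand-in with the same interface, so the snippet runs without the package installed:

```python
class CountdownEnv:
    """Stand-in mirroring the realtimegym reset()/step() signatures."""
    def __init__(self, steps=3):
        self.steps = steps

    def reset(self):
        self.remaining = self.steps
        return {"t": 0}, False  # obs, done

    def step(self, action):
        self.remaining -= 1
        done = self.remaining <= 0
        # obs, done, reward, reset
        return {"t": self.steps - self.remaining}, done, 0.0, False

env = CountdownEnv()
obs, done = env.reset()
while not done:
    obs, done, reward, reset = env.step("noop")  # replace with agent.act()
```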

Available Environments

Freeway

Navigate through dynamic traffic with real-time decision making.

realtimegym.make('Freeway-v0')

Snake

Strategic planning for food collection while avoiding obstacles.

realtimegym.make('Snake-v0')

Overcooked

Cooperative cooking with coordination and task prioritization.

realtimegym.make('Overcooked-v0')

Agent Implementations

Reactive Agent

Fast, intuitive System 1 - Always react quickly with bounded compute; no planning thread.

class ReactiveAgent:
  def think(self, timeout):
    start_reactive_thread(current_observation, "")
    run_reactive_thread(internal_budget)
    if reactive_thread_is_alive():
      s1_budget_forcing()
    return get_reactive_thread_response()
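`s1_budget_forcing` is left abstract above. One plausible token-level sketch (an assumption for illustration, not the package's actual implementation) truncates the reactive model's thinking stream at its budget and appends a suffix that forces an immediate answer:

```python
def budget_force(thinking_tokens, budget, forcing_suffix="\nFinal action:"):
    """Truncate a list of thinking tokens at `budget` and append a
    suffix that pushes the model to commit to an action now.
    Illustrative sketch; the real s1_budget_forcing interacts with a
    live LLM generation."""
    if len(thinking_tokens) <= budget:
        return thinking_tokens
    return thinking_tokens[:budget] + [forcing_suffix]
```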

Planning Agent

Slow, deliberate System 2 - Plan first within the full timeout, then execute the first action.

class PlanningAgent:
  def think(self, timeout):
    if not planning_thread_is_alive():
      start_planning_thread(current_observation)
    run_planning_thread(timeout)
    if not planning_thread_is_alive():
      self.plan = get_planning_thread_response()
    action, self.plan = self.plan[0], self.plan[1:]
    return action

AgileThinker

Parallel: System 1 + System 2 - Plan in parallel with a fast reactive thread; use budget-aware forcing.

class AgileThinker:
  def think(self, timeout):
    if not planning_thread_is_alive():
      start_planning_thread(current_observation)
    run_planning_thread(timeout - internal_budget)
    plan = get_planning_thread_response()
    start_reactive_thread(current_observation, plan)
    run_reactive_thread(internal_budget)
    if reactive_thread_is_alive():
      s1_budget_forcing()
    return get_reactive_thread_response()
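The parallel structure above can be sketched with standard Python threads. `plan_fn` and `react_fn` stand in for the two LLM calls, and this is a simplification: the real system streams planner tokens into the reactive prompt rather than waiting for a finished plan:

```python
import threading

def agile_step(observation, timeout, internal_budget, plan_fn, react_fn):
    """One decision step of the AgileThinker pattern: run a slow planner
    in the background for at most (timeout - internal_budget), then let
    a fast reactive call consume whatever plan is available so far."""
    plan_box = {}

    def planner():
        plan_box["plan"] = plan_fn(observation)

    worker = threading.Thread(target=planner, daemon=True)
    worker.start()
    worker.join(timeout - internal_budget)   # bounded wait for the planner
    partial_plan = plan_box.get("plan", "")  # empty if the planner ran over
    return react_fn(observation, partial_plan)  # fast System-1 decision
```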

Complete Example

import realtimegym
from realtimegym.agents.agile import AgileThinker
from realtimegym.prompts import freeway as prompt

env, seed, _ = realtimegym.make("Freeway-v0")
obs, done = env.reset()

log_file = "freeway_v0_agile.csv"
agent = AgileThinker(prompt, log_file, 'token')

while not done:
    agent.observe(obs)  # Fast observation
    agent.think(timeout=4096)  # Bounded thinking (tokens or seconds)
    action = agent.act()
    obs, done, reward, reset = env.step(action)

BibTeX

@misc{wen2025realtimereasoningagentsevolving,
      title={Real-Time Reasoning Agents in Evolving Environments}, 
      author={Yule Wen and Yixin Ye and Yanzhe Zhang and Diyi Yang and Hao Zhu},
      year={2025},
      eprint={2511.04898},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2511.04898}, 
}

Authors

Yule Wen1,*

Tsinghua University

Yixin Ye2,*

Shanghai Jiao Tong University

Yanzhe Zhang3

Georgia Institute of Technology

Diyi Yang4

Stanford University

Hao Zhu4

Stanford University

1Tsinghua University, 2Shanghai Jiao Tong University, 3Georgia Institute of Technology, 4Stanford University

*Co-leading authors