OTS Prompting Guidelines

Guidelines for creating effective prompts to train AI agents on web applications.


Core Principles

Key Objective

We are teaching agents how to use web apps. Treat each prompt as a raw instruction given to the agent. Assertions check whether the model carried out the instruction correctly, and through learning algorithms the models eventually learn to perform these tasks.

  • Think of yourself as teaching agents - Every prompt should be a learning opportunity
  • Find the objective point - Each prompt must have a clear, measurable goal

Prompt Requirements

Content Guidelines

Requirement | Description
Coverage | Create prompts that cover all available features in the gym app
Objectivity | Prompts should be 100% objective. No subjectivity allowed for state-changing parts
Natural Language | Make prompts as natural as possible - tasks a real human would ask an AI Agent to do
Backstories | You can add backstories to make prompts more realistic

Length Guidelines

  • Standalone prompts: At most about 6 sentences
  • Cross-gym prompts: Around 3-10 sentences
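The length limits above can be checked mechanically before review. The following is a minimal sketch, not an official tool: the names `SENTENCE_RANGES`, `count_sentences`, and `within_length` are hypothetical, and the sentence splitter is a simple heuristic that will miscount abbreviations such as "e.g.". Because the guideline says "around", treat a failure as a warning rather than a hard rejection.

```python
import re

# Assumed (min, max) sentence counts, read off the length guidelines.
SENTENCE_RANGES = {"standalone": (1, 6), "cross_gym": (3, 10)}


def count_sentences(prompt: str) -> int:
    """Count sentences by splitting on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", prompt.strip())
    return len([p for p in parts if p.strip()])


def within_length(prompt: str, kind: str = "standalone") -> bool:
    """Return True if the prompt's sentence count falls inside the range for its kind."""
    low, high = SENTENCE_RANGES[kind]
    return low <= count_sentences(prompt) <= high
```

For example, `within_length("I'm out of milk. Add two cartons to my cart!")` passes for a standalone prompt, while the same text would be flagged as too short for a cross-gym prompt.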

Difficulty Distribution

The prompts should follow this distribution:

Difficulty | Percentage
Easy | 20%
Medium | 20%
Hard | 60%
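To turn these percentages into concrete counts for a batch, one option is the sketch below. The function name `target_counts` is hypothetical, and the rounding rule (floor each share, then give leftover prompts to the largest remainders) is an assumption made so that the counts always sum to the batch size.

```python
# Percentages from the difficulty table.
DIFFICULTY_SHARES = {"easy": 20, "medium": 20, "hard": 60}


def target_counts(total: int) -> dict[str, int]:
    """Split `total` prompts across difficulty levels according to DIFFICULTY_SHARES."""
    # Integer arithmetic avoids floating-point surprises.
    counts = {k: total * pct // 100 for k, pct in DIFFICULTY_SHARES.items()}
    remainders = {k: total * pct % 100 for k, pct in DIFFICULTY_SHARES.items()}
    leftover = total - sum(counts.values())
    # Hand out the prompts lost to rounding, largest remainder first.
    for k in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[k] += 1
    return counts


print(target_counts(50))  # {'easy': 10, 'medium': 10, 'hard': 30}
```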

Prompt Types

Prompts are categorized into three types based on their nature and requirements:

NRD (Non Response Dependent)

Action commands that change state

  • Tasks that involve performing actions in the application
  • Examples: "Add product to cart", "Update user profile", "Delete an item"
  • Focus on state-changing operations
  • Must be 100% objective and verifiable
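"Objective and verifiable" means the outcome can be checked against the application's final state. The sketch below illustrates the idea for the prompt "Add two cartons of oat milk to my cart". The state layout, the SKU `oat-milk-1l`, and the function `cart_has` are all hypothetical; a real gym will expose its state differently.

```python
def cart_has(state: dict, sku: str, quantity: int) -> bool:
    """Return True if the cart in `state` holds exactly `quantity` units of `sku`."""
    return state.get("cart", {}).get(sku) == quantity


# Hypothetical final state captured after the agent finishes the task.
final_state = {"cart": {"oat-milk-1l": 2}}

# The assertion has a single correct outcome, so the prompt is objective.
assert cart_has(final_state, "oat-milk-1l", 2)
```

A prompt such as "Add some nice milk to my cart" could not be checked this way, because neither the product nor the quantity is fixed.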

RD (Response Dependent)

Analysis requests requiring reasoning

  • Tasks that involve analyzing data and providing responses
  • Examples: "Which product has the best rating?", "Compare shipping options", "Analyze user activity"
  • Focus on information retrieval and analysis
  • The response itself should be based on objective data

Special Rules for RD Tasks

  • The model can provide subjective answers but still based on concrete data pulled from the site
  • These tasks require the agent to analyze information and provide insights
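For an RD task, the check targets the agent's response rather than the application state. The sketch below assumes the prompt "Which product has the best rating?" and a hypothetical product list. The helper names are illustrative, and the substring match is a deliberately crude stand-in for whatever response grading a gym actually uses.

```python
def best_rated(products: list[dict]) -> str:
    """Return the name of the product with the highest rating."""
    return max(products, key=lambda p: p["rating"])["name"]


def response_is_grounded(response: str, products: list[dict]) -> bool:
    """Return True if the response names the product the data actually supports."""
    return best_rated(products).lower() in response.lower()


# Hypothetical data as it would appear on the site.
products = [
    {"name": "Trail Mug", "rating": 4.2},
    {"name": "Camp Kettle", "rating": 4.8},
]

assert response_is_grounded("The Camp Kettle has the best rating at 4.8.", products)
```

The wording of the response may vary, but the fact it rests on is fixed by the data.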

Hybrid

Combination of RD and NRD prompts

  • Tasks that combine analysis with action
  • First, the agent analyzes or retrieves information (RD component)
  • Then, the agent performs an action based on that analysis (NRD component)
  • Examples: "Find the cheapest product and add it to cart", "Compare two products and purchase the better one"
  • Most complex task type requiring both reasoning and action
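A hybrid prompt can be verified by deriving the expected answer from the data and then checking the resulting state against it. The sketch below does this for "Find the cheapest product and add it to cart". The product fields, SKUs, and function names are hypothetical, and requiring the cart to contain nothing else is an assumption about how strict the check should be.

```python
def cheapest_sku(products: list[dict]) -> str:
    """RD component: work out which product is cheapest from the site data."""
    return min(products, key=lambda p: p["price"])["sku"]


def hybrid_task_passed(products: list[dict], final_cart: dict[str, int]) -> bool:
    """NRD component: the cart must hold one unit of the cheapest product and nothing else."""
    return final_cart == {cheapest_sku(products): 1}


# Hypothetical catalog and final cart contents.
products = [
    {"sku": "mug-01", "price": 12.50},
    {"sku": "kettle-02", "price": 34.00},
]

assert hybrid_task_passed(products, {"mug-01": 1})
```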

Prompt Type Distribution

When creating prompts for a gym, maintain a balanced distribution across all three types:

Type | Description | Typical Distribution
NRD | Action-based tasks | ~50-60% of prompts
RD | Analysis-based tasks | ~20-30% of prompts
Hybrid | Combined tasks | ~20-30% of prompts
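A quick way to see whether a gym's prompt set follows this distribution is to tally the type labels. The sketch below is illustrative only: `TYPE_RANGES`, `type_shares`, and `out_of_range` are hypothetical names, and since the ranges are marked as approximate, a flagged type is a prompt for review rather than an error.

```python
from collections import Counter

# Typical percentage ranges from the table above.
TYPE_RANGES = {"NRD": (50, 60), "RD": (20, 30), "Hybrid": (20, 30)}


def type_shares(labels: list[str]) -> dict[str, float]:
    """Return the percentage of prompts carrying each type label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {t: 100 * counts[t] / len(labels) for t in TYPE_RANGES}


def out_of_range(labels: list[str]) -> list[str]:
    """List the types whose share falls outside its typical range."""
    shares = type_shares(labels)
    return [t for t, (low, high) in TYPE_RANGES.items() if not low <= shares[t] <= high]


labels = ["NRD"] * 11 + ["RD"] * 5 + ["Hybrid"] * 4
print(out_of_range(labels))  # []
```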

Type Selection

  • Choose NRD for straightforward action tasks
  • Choose RD for information retrieval and analysis
  • Choose Hybrid for complex workflows requiring decision-making followed by action

What to Avoid

Don't Be Too Helpful

Prompts should not be too helpful.

Bad Example: "Go to settings, select drop down value x"

This is too helpful because it tells the agent exactly what to do.

  • All tasks should be objective (except for the open-ended part of RD tasks)
  • Models should still need to find their way to the objective
  • Don't give step-by-step instructions that eliminate the learning challenge
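As a rough aid, prompts that read like click-by-click instructions can be flagged for human review. The sketch below is a heuristic, not a rule from these guidelines: the phrase list in `NAVIGATION_HINTS` and the function `looks_too_helpful` are assumptions, and a natural prompt can trigger it by coincidence.

```python
import re

# Hypothetical phrases that often signal UI-level, step-by-step instructions.
NAVIGATION_HINTS = re.compile(
    r"\b(go to|click|select|open the|scroll|drop ?down)\b",
    re.IGNORECASE,
)


def looks_too_helpful(prompt: str) -> bool:
    """Return True if the prompt contains wording that walks the agent through the UI."""
    return bool(NAVIGATION_HINTS.search(prompt))


# The bad example from above is flagged; a goal-level rewrite is not.
assert looks_too_helpful("Go to settings, select drop down value x")
assert not looks_too_helpful("Switch my account language to French.")
```

The second prompt states the same kind of goal but leaves the agent to find the settings page and the right control on its own.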