OTS Prompting Guidelines

Guidelines for creating effective prompts to train AI agents on web applications.

Core Principles

Key Objective

We are teaching agents how to use web apps. Think of each prompt as a raw instruction. Assertions check whether the model carried it out correctly, and through learning algorithms the models eventually learn to perform these tasks.

Think like a teacher - every prompt should be a learning opportunity
Find the objective point - each prompt must have a clear, measurable goal
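To make the idea of testing a prompt with assertions concrete, here is a minimal sketch. The `AppState` class, the `assert_cart_contains` helper, and the product name are all hypothetical and stand in for whatever state a real gym app exposes; this is not the actual assertion framework.

```python
from dataclasses import dataclass, field


@dataclass
class AppState:
    """Hypothetical snapshot of a gym app after the agent finishes a task."""
    cart: dict[str, int] = field(default_factory=dict)  # product name -> quantity


def assert_cart_contains(state: AppState, product: str, quantity: int) -> bool:
    """Return True only if the cart holds exactly `quantity` of `product`."""
    return state.cart.get(product) == quantity


# Prompt: "I'm running low on coffee. Add two bags of the dark roast to my cart."
# The goal is measurable: either the final state matches or it does not.
final_state = AppState(cart={"Dark Roast Coffee": 2})
passed = assert_cart_contains(final_state, "Dark Roast Coffee", 2)
```

A prompt such as "add some coffee I might like" could not be checked this way, which is why each prompt needs a clear, measurable goal.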

Prompt Requirements

Content Guidelines

Requirement	Description
Coverage	Create prompts that cover all available features in the gym app
Objectivity	Prompts should be 100% objective; no subjectivity is allowed in state-changing parts
Natural Language	Make prompts as natural as possible - tasks a real human would ask an AI Agent to do
Backstories	You can add backstories to make prompts more realistic

Length Guidelines

Standalone prompts: at most about 6 sentences
Cross-gym prompts: about 3-10 sentences
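The length guidelines can be checked roughly with a sentence counter. This is a sketch under stated assumptions: the helper names are hypothetical, the lower bound of 1 for standalone prompts is assumed, and the regex is a heuristic that will miscount abbreviations such as "St." in the middle of a sentence. Since the limits are approximate, a failed check is a signal to review the prompt, not to reject it.

```python
import re

# Soft (min, max) sentence ranges; the standalone minimum of 1 is an assumption.
SENTENCE_RANGES = {"standalone": (1, 6), "cross_gym": (3, 10)}


def count_sentences(prompt: str) -> int:
    """Rough count: split on ., ! or ? when followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", prompt.strip())
    return sum(1 for part in parts if part)


def within_length(prompt: str, kind: str) -> bool:
    """Check a prompt against the soft sentence range for its kind."""
    low, high = SENTENCE_RANGES[kind]
    return low <= count_sentences(prompt) <= high


prompt = "My sister's birthday is Friday. Find a gift under $30 and add it to my cart."
print(count_sentences(prompt))              # 2
print(within_length(prompt, "standalone"))  # True
```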

Difficulty Distribution

The prompts should follow this distribution:

Difficulty	Percentage
Easy	20%
Medium	20%
Hard	60%
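As a worked example of the distribution above, the following sketch converts the percentages into per-difficulty prompt counts for a batch of a given size. The function name is hypothetical, and the largest-remainder rounding is one reasonable choice for batch sizes that do not divide evenly, not a rule from these guidelines.

```python
# Percentages from the difficulty table above.
DIFFICULTY_PERCENT = {"Easy": 20, "Medium": 20, "Hard": 60}


def difficulty_targets(total_prompts: int) -> dict[str, int]:
    """Split total_prompts across difficulty levels using largest remainders."""
    targets = {}
    remainders = {}
    for level, percent in DIFFICULTY_PERCENT.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        targets[level], remainders[level] = divmod(total_prompts * percent, 100)
    leftover = total_prompts - sum(targets.values())
    # Give any remaining prompts to the levels with the largest remainders.
    for level in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        targets[level] += 1
    return targets


print(difficulty_targets(50))  # {'Easy': 10, 'Medium': 10, 'Hard': 30}
```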

Prompt Types

Prompts are categorized into three types based on their nature and requirements:

NRD (Non Response Dependent)

Action commands that change state

Tasks that involve performing actions in the application
Examples: "Add product to cart", "Update user profile", "Delete an item"
Focus on state-changing operations
Must be 100% objective and verifiable

RD (Response Dependent)

Analysis requests requiring reasoning

Special Rules for RD Tasks

The model may give subjective answers, but they must still be grounded in concrete data pulled from the site
These tasks require the agent to analyze information and provide insights

Tasks that involve analyzing data and providing responses
Examples: "Which product has the best rating?", "Compare shipping options", "Analyze user activity"
Focus on information retrieval and analysis
The response itself should be based on objective data

Hybrid

Combination of RD and NRD prompts

Tasks that combine analysis with action
First, the agent analyzes or retrieves information (RD component)
Then, the agent performs an action based on that analysis (NRD component)
Examples: "Find the cheapest product and add it to cart", "Compare two products and purchase the better one"
Most complex task type requiring both reasoning and action

Prompt Type Distribution

When creating prompts for a gym, cover all three types in roughly the following proportions:

Type	Description	Typical Distribution
NRD	Action-based tasks	~50-60% of prompts
RD	Analysis-based tasks	~20-30% of prompts
Hybrid	Combined tasks	~20-30% of prompts
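A batch of prompts can be checked against these ranges with a short script. This is a minimal sketch: the function name and the idea of labelling each prompt with a type string are assumptions for illustration. Because the ranges are approximate, a flagged type is a hint to rebalance rather than a hard failure.

```python
from collections import Counter

# Approximate target ranges from the table above, as fractions of all prompts.
TYPE_RANGES = {"NRD": (0.50, 0.60), "RD": (0.20, 0.30), "Hybrid": (0.20, 0.30)}


def out_of_range_types(prompt_types: list[str]) -> dict[str, float]:
    """Return each type whose share falls outside its target range, with its share."""
    counts = Counter(prompt_types)
    total = len(prompt_types)
    flagged = {}
    for prompt_type, (low, high) in TYPE_RANGES.items():
        share = counts[prompt_type] / total if total else 0.0
        if not low <= share <= high:
            flagged[prompt_type] = share
    return flagged


# Hypothetical batch: 11 NRD, 5 RD and 5 Hybrid prompts.
labels = ["NRD"] * 11 + ["RD"] * 5 + ["Hybrid"] * 5
print(out_of_range_types(labels))  # {}
```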

Type Selection

Choose NRD for straightforward action tasks
Choose RD for information retrieval and analysis
Choose Hybrid for complex workflows requiring decision-making followed by action

What to Avoid

Don't Be Too Helpful

Prompts should state the goal, not the route to it.

Bad Example: "Go to settings, select dropdown value x"

This is too helpful because it tells the agent exactly which steps to take, leaving nothing for it to figure out.

All tasks should be objective (except for the open-ended part of RD tasks)
Models should still need to find their way to the objective
Don't give step-by-step instructions that eliminate the learning challenge