Operator Testing

L - Finding Weekend Activities

Prompt: Can you help find 3 activities close by that I can participate in over this weekend?
Personalization Setting: None
Parameter
Description
UX - Information Exchange
Asked one clarifying question about my location. I responded with ‘San Jose, CA’, and it proceeded to actuation right away.
UX - Ease of Use
Relatively simple to use. The Operator experience is built on an embedded browser, so there is no way for it to use the local browser (with any saved logins, cookies, etc.). This hints that it is a “testing only” capability and that real adoption will come from the CUA model APIs.
Clarification & Higher-Order Thinking
It performs a Bing search for ‘san jose events’ and clicks on the first link available (sanjose.events). The date filter is correctly set to the current weekend.
There is little higher-order thinking, since it does not try to clarify a vague prompt. There could be many different types of activities, but it simply lists the top 3 events happening across both days.
Task Decomposition & Modularity
There is no focused effort at the beginning to gather more information on what the user is looking for. At least one back-and-forth seems to be required before it arrives at the correct results.
After returning the initial set of results, it asks whether these satisfy my criteria or whether I would like to provide more information. I state my preference for wine tastings and stand-up comedy shows.
Application Identification
It does a Bing search initially, then navigates to websites like sanjose.events and Eventbrite.
Personalization
There is some interaction to understand user preferences, but nothing very impressive.
There is a native connection available for StubHub, but at no point did it search for events on StubHub.
Risk Management, Intervention & Handover
While accessing Eventbrite, it landed on a page that required human verification (the standard Cloudflare checkbox). It detected that this was a human verification step and asked me to take control to get past it.
One noticeable feature: when handing control back to Operator, there is a textbox where the user can indicate what kind of task required human intervention. This is perhaps a good strategy for data collection.
Browser Navigation & Integration
Performs well in this area. No browser actuation errors were observed.
Exception Handling & Reliability
No exceptions encountered. Reliability would be low, however: rather than understanding what the user truly wants, it follows a generic path of searching on Bing and clicking on the first search result. Because search rankings change, this path is unlikely to be repeatable.
Time to Completion
Overall, a slow process. Because it relies primarily on vision (a screenshot of the webpage is included at every step), the request payloads are larger than normal.
Consistency
Credential Management
None encountered. As noted in the personalization section, the model does not ask the user to link any accounts for a better experience.
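
The Time to Completion note above attributes the slowness to a screenshot being sent at every step. A rough sketch of how that adds up, assuming the screenshots are base64-encoded into each request body (an assumption; the notes do not say how they are transmitted). The step count, screenshot size, and per-step text size are hypothetical placeholders, not measurements from this test.

```python
import base64
import math


def base64_len(raw_bytes: int) -> int:
    """Size in bytes of the padded base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def estimate_upload_bytes(steps: int, screenshot_bytes: int, text_bytes: int = 2_000) -> int:
    """Total bytes uploaded if every step sends one screenshot plus some text.

    text_bytes is a hypothetical allowance for the prompt and action history.
    """
    return steps * (base64_len(screenshot_bytes) + text_bytes)


# Sanity check the size formula against the standard library encoder.
assert base64_len(1000) == len(base64.b64encode(bytes(1000)))

# Hypothetical run: 30 steps with a 300 KB screenshot at each step.
total = estimate_upload_bytes(steps=30, screenshot_bytes=300_000)
print(f"{total / 1e6:.1f} MB uploaded across the run")
```

Base64 inflates binary data by roughly a third, so under these assumptions the screenshots dominate the payload and the text portion is negligible by comparison.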