The model asked no initial questions about whether I was looking for a men's or women's jacket, or about my size, price, or brand preferences.
User Experience - Ease of Use
Chat interfaces can be annoying, especially when the user has to type in preferences such as brand, price, gender, and colour that could instead be personalized or templatized.
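As a rough illustration of what "templatized" preferences could look like, the sketch below keeps the preferences in a small structure and builds a single up-front question for whatever is still missing. The class, field, and function names are hypothetical and are not taken from the product under review.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ShoppingPreferences:
    """Hypothetical preference template; the field names are illustrative only."""
    gender: Optional[str] = None
    size: Optional[str] = None
    colour: Optional[str] = None
    brand: Optional[str] = None
    max_price: Optional[float] = None


def missing_preferences(prefs: ShoppingPreferences) -> List[str]:
    """Return the names of the fields the user has not filled in yet."""
    return [f.name for f in fields(prefs) if getattr(prefs, f.name) is None]


def clarifying_question(prefs: ShoppingPreferences) -> Optional[str]:
    """Build one question covering every missing field, or None if nothing is missing."""
    missing = missing_preferences(prefs)
    if not missing:
        return None
    return "Before I search, could you share your " + ", ".join(missing) + "?"
```

With only `colour="blue"` filled in, `clarifying_question` would ask for gender, size, brand, and max_price in one message rather than leaving the user to volunteer them later.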
Clarification & Higher-Order Thinking
The following was the model's reasoning trace:
I started by searching for a "blue waterproof hiking jacket" to find relevant options. When I encountered results from The North Face, I selected the site because it's a well-known brand for outdoor gear. I used filters to narrow down the options to jackets suitable for hiking and ensured they were waterproof.
I didn't assume any gender preference, but the available results were categorized by gender, and the first product I found was a men's jacket. I focused on finding a blue waterproof hiking jacket as per your request, without assuming your purpose, price range, or brand preference.
The "Men’s Alta Vista Jacket" was selected because it met the criteria and was available in blue. If you have specific preferences or requirements, please let me know, and I can refine the search further.
Task Decomposition & Modularity
The model generally has a fantastic reasoning trace while navigating. It can go all the way from search to the search results, open several websites simultaneously, roll back from ones that aren't working or valid, navigate to product pages, and analyze and work with filters. Its understanding of a computer is pretty fantastic.
In scenarios such as checkout, it is also able to work around the sign-in step wherever possible.
Application Identification
Instead of navigating to the product on the website itself and clicking on the right product, it got the title of the product, switched to another tab, and opened the result through Google search. This could be because of either scraping restrictions or an inability to click and scrape dynamically on specific search results.
Personalization
It didn't ask for my brand preference, gender, size, or budget. When I later shared my preferences, it then tried to navigate to Amazon.
Risk Management, Intervention & Handover
When asked to look on DICK'S for the same jacket it had found on the North Face website, it couldn't find it and asked whether I wanted to see similar jackets. When I said yes and explicitly told it to follow the same preferences, it still digressed from them.
Browser Navigation & Integration
It can navigate a wide variety of websites irrespective of the type of webpage, largely because it takes screenshots and emulates keyboard input and mouse clicks; this is where that trade-off works very well.
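The screenshot-and-emulate approach can be pictured as a simple observe-and-act loop. The sketch below is only an assumption about the general shape of such an agent, not the model's actual implementation: `Browser`, `Action`, `run_agent`, and the `choose_action` callback are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass
class Action:
    """One emulated input event chosen by the model (hypothetical format)."""
    kind: str      # "click", "type", "scroll", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class Browser(Protocol):
    """Assumed minimal interface of a remote browser session."""
    def screenshot(self) -> bytes: ...
    def click(self, x: int, y: int) -> None: ...
    def type_text(self, text: str) -> None: ...
    def scroll(self, dy: int) -> None: ...


def run_agent(
    browser: Browser,
    choose_action: Callable[[bytes, List[Action]], Action],
    max_steps: int = 20,
) -> List[Action]:
    """Take a screenshot, let the model pick an action, emulate it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        image = browser.screenshot()            # observe the page as pixels
        action = choose_action(image, history)  # the model decides what to do next
        history.append(action)
        if action.kind == "done":
            break
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
        elif action.kind == "scroll":
            browser.scroll(action.y)
    return history
```

Because the loop only depends on pixels and input events, it does not care how a page is built, which matches the breadth observed here. The cost is one screenshot per step.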
Exception Handling & Reliability
Instead of navigating to the product on the website itself and clicking on the right product, it got the title of the product, switched to another tab, and opened the result through Google search. This could be because of either scraping restrictions or an inability to click and scrape dynamically on specific search results.
Time to Completion
Speed is relatively slow, especially the time it takes to scrape search results and to browse and scroll (which requires multiple screenshots), and when the user asks for information that is buried further down the page or that requires navigating to multiple websites to check the price.
Credential Management
The model generally avoids asking the user for any credentials. This may well be for security reasons: these browser instances are hosted in the cloud, and entering user credentials in third-party cloud instances could require MFA, complicating the process and adding user friction.