Solve for X is an experimental AI agent. It uses LLMs to provide a simple approach to automated research, driven by queries the tool generates dynamically for itself. Solve for X is my personal attempt to combine the essence of LangChain and AutoGPT.
It is surprisingly simple to use but performs a gazillion little AI steps to achieve the stated goals. Best of all, there are no inference costs, and it can work with arbitrary LLMs. This example uses PaLM 2 exclusively.
I’m under no illusion - this is not easy.
This R&D excursion began when I penned this little missive on the forum:
Stop Building the Thing
I’m sure this got your attention. What’s the “thing” and what should we build instead? The answer is:
The thing that builds the things…
At Tesla, the thing is the factory. The things are the cars. They invested billions to build the thing. Ergo … in a Coda context:
Stop building the Packs; build the thing that builds the Packs.
That “thing” is AGI. More specifically, a framework for using AI to create code, test it, perfect it, and document it.
Coda has used AI thus far to focus on helping a small slice of users write faster, perhaps better. [yawn] Coda (in my view) should be helping all users extend Coda.
Stop building the Formulas; build the thing that builds the Formulas.
Often, this approach needn’t produce perfect or pristine outcomes to be valuable. It supports my belief that through AI we can finally tap a universal law of productivity that we rarely benefit from today.
Let’s examine a few simple goals where AI has had some success. To demonstrate, imagine you want to build a “thing”, and that thing is a plan to go see the Cybertruck in Los Angeles. Having never been to the Petersen Museum (where the Cybertruck is on display), I would assume the first step is to start Googling. And after a series of fifteen or twenty searches, we can formulate a plan.
This is what I mean by “Building the Thing”
Alternatively, what if I built a thing that builds the plan?
When I see users thrashing away in ChatGPT or Google Search, I liken it to non-trivial human effort being sunk into building parts of a desired outcome. Over time and with great effort, you begin to assemble a strategy to reach your objective, which might be stated like this:
Weekend of Fun, including a visit to the Petersen Museum in Los Angeles.
The subtext of this goal can be stated succinctly:
Plan three days of activities (Fri night through Sun late) that include a half-day at the museum, good restaurants, and sightseeing.
LangChain and AutoGPT exist to mitigate the chat- and search-level thrashing, so common among users, that it takes to reach specific objectives. These tools compress time by letting the LLMs dynamically determine what needs to be done next, and then doing those steps for you.
The trouble with this approach - you need to be a developer to leverage these tools. And despite a lot of noise about shaping them for everyday users, let’s just say those efforts will fall short unless the UX takes a form that business users appreciate. Coda is one such form; Google Workspaces is another.
Here’s the thing I built in Coda to build the “research things” I need to accomplish.
Each row in the Coda table is an AI agent. You provide it with just a goal and an initial task, and it sets off to complete the goal, writing the result into an outcome field. The report is embellished with linked locations, interesting places, and sightseeing ideas. It fully embraced my stated preferences as well. Bear in mind - I primed this process with just two sentences.
This agent also lays out a general timeline and accurately inferred the time frame in which these activities should occur.
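The row-as-agent pattern described above can be sketched as a minimal task-driven loop: execute the current task against the goal, ask the model for follow-up tasks, and repeat until none remain. This is only an illustrative sketch, not the actual implementation - the function names are mine, and the `llm` callable is a stand-in for whatever model you plug in (PaLM 2, in my case).

```python
from collections import deque

def run_agent(goal, initial_task, llm, max_steps=10):
    """Minimal task-driven agent loop (illustrative sketch).
    `llm` is any callable mapping a prompt string to a text response."""
    tasks = deque([initial_task])
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        # Execute the current task in the context of the overall goal.
        result = llm(f"Goal: {goal}\nTask: {task}\nResult:")
        results.append(result)
        # Ask the model for follow-up tasks; an empty reply means done.
        follow_up = llm(
            f"Goal: {goal}\nCompleted task: {task}\n"
            "List any remaining tasks, one per line (blank if none):"
        )
        tasks.extend(t.strip() for t in follow_up.splitlines() if t.strip())
    # The outcome field is just the accumulated task results.
    return "\n".join(results)

# A stub LLM so the sketch runs without any API access.
def stub_llm(prompt):
    if "remaining tasks" in prompt:
        return ""  # no follow-up tasks; the agent stops
    return "A draft itinerary for the weekend."

outcome = run_agent("Weekend of Fun in Los Angeles",
                    "Plan a half-day at the Petersen Museum", stub_llm)
```

In a Coda context, `goal` and `initial_task` would come from the row’s fields and `outcome` would be written back to the outcome column; swapping models is just swapping the `llm` callable.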
It’s Not Perfect
These agents are thorough and fast. They can be easily created, modified, subclassed, and extended - all without rebuilding the “thing”.