GPT-Plus Pack Document

Documentation

The GPT-Plus Pack currently exposes five primary formulas: conversation_raw, chat_raw, gpt, gpt_raw, and dalle. Some usage examples are below; this documentation is a work in progress until I hit v1.0.

GPT-Plus.conversation_raw

conversation_raw never hits a remote endpoint; it just returns a structured object that you can pass to subsequent invocations of GPT-Plus.chat_raw if you’d like to maintain a consistent thread.
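To illustrate the idea of a locally built thread object, here is a minimal Python sketch. The `Conversation` class, its `with_message` method, and the role/content message layout are all assumptions made for illustration (the layout mirrors the message lists chat-style APIs commonly accept); the actual structure that conversation_raw returns is not documented here and may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    """Hypothetical stand-in for the object conversation_raw returns."""

    messages: List[Dict[str, str]] = field(default_factory=list)

    def with_message(self, role: str, content: str) -> "Conversation":
        # Return a new object instead of mutating, so every step of the
        # thread can be stored separately (for example, one per table row).
        return Conversation(self.messages + [{"role": role, "content": content}])


# Building the thread is purely local: no network call is made here.
thread = Conversation().with_message("system", "You are a helpful assistant.")
thread = thread.with_message("user", "Tell me a fairytale about gremlins.")
print(len(thread.messages))
```

The point of the pattern is that the thread state lives in your document, and only the chat call itself would need to reach a remote endpoint.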

GPT-Plus.gpt

gpt is a mostly preconfigured invocation of GPT-3 with sensible defaults and the ability to override a few parameters.

The gpt formula has the following signature: gpt(prompt, user, temperature, size). A typical invocation will use some text prompt with User().Email as the user value, a value between 0 and 1 for temperature, and a size value ranging from 16 up to 4000 for the current configuration.

Arguments

prompt is the input to the GPT engine. You can say something like “Tell me a fairytale about gremlins” or something like “Summarize the following row of data with respect to...” and get a reasonably good output. Your prompt can be fairly long, but if it’s too complex the system may get confused and return an empty string.

user is a unique identifier to track the actual person submitting the prompt. This is for your protection as a document owner — if they are abusing the API and no user tracking is set then you may be held accountable.

temperature is a value that controls how “random” the output is. If temperature is 0, you’ll get essentially the same response to the same prompt every time. If temperature is 1, you’ll get very different responses each time. Try to find a value that works for your use case; this system uses 0.7 as a default value.

size is the maximum number of tokens in the response. Basically, the higher this number, the longer the response it can generate, but the more you’ll be paying for it. Tune this by experimenting: the algorithm doesn’t really hit its stride until about 512, but you can go much higher. Just be careful, because this is directly related to cost. A “token” is currently about 3-4 characters, so the pricing can be tricky to calculate.
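Since the cost arithmetic can be tricky, here is a rough Python sketch of a worst-case estimate. It assumes about 4 characters per token (the upper end of the 3-4 range mentioned above) and that both prompt and response tokens are billed. The function names and the price of 0.02 per 1,000 tokens are placeholders for illustration, not real pricing.

```python
import math

# Rough rule of thumb from the text above; this is not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Approximate the number of tokens in a piece of text."""
    return math.ceil(len(text) / chars_per_token)


def estimate_max_cost(prompt: str, size: int, price_per_1k_tokens: float) -> float:
    """Upper bound on cost: estimated prompt tokens plus the full size budget."""
    total_tokens = estimate_tokens(prompt) + size
    return total_tokens / 1000 * price_per_1k_tokens


prompt = "Tell me a fairytale about gremlins"
# 0.02 is a placeholder price, chosen only to make the arithmetic visible.
print(estimate_tokens(prompt))
print(estimate_max_cost(prompt, size=512, price_per_1k_tokens=0.02))
```

The estimate is an upper bound because the model may stop before using the whole size budget.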

Example

Given the following invocation:

I received the following response:

GPT-Plus.gpt_raw

Unlike gpt, gpt_raw exposes most of the underlying parameters from the GPT-3 API. This lets you tune the behavior of your query in powerful and interesting ways.

Note that not all parameters are mapped as arguments — GPT-3’s API supports a few things that this Pack is not currently capable of supporting, like live-streaming responses back.

Arguments (* denotes required)

prompt* is the input against which to generate completions, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document. See the docs for more information.

user* is a unique identifying string capturing the ID of your user who is invoking GPT, used for security purposes. Use something like the currently logged in user's email address.

model allows you to specify an alternative model to run your prompt against. Different models have different use cases and settings, so only tweak this if you know what you’re doing.

suffix is the text that comes after the inserted completion. See the docs for more information.

max_tokens is the maximum number of tokens to generate in the completion. Higher values allow longer responses but use up more tokens. The token count of your prompt plus max_tokens cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096). See the docs for more info.
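The constraint that prompt tokens plus max_tokens must fit within the context length can be sketched in Python as follows. The helper names are hypothetical, and the default of 2048 is simply the context length quoted above; check the limit for the model you actually use.

```python
def max_tokens_budget(prompt_tokens: int, context_length: int = 2048) -> int:
    """Tokens left for the completion once the prompt is accounted for."""
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the model's context length")
    return remaining


def clamp_max_tokens(requested: int, prompt_tokens: int, context_length: int = 2048) -> int:
    """Reduce a requested max_tokens value so the request stays within limits."""
    return min(requested, max_tokens_budget(prompt_tokens, context_length))


# A 100-token prompt on a 4096-token model leaves 3996 tokens for the reply.
print(clamp_max_tokens(4000, prompt_tokens=100, context_length=4096))
```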

temperature is the sampling temperature. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend altering this or top_p but not both. See the docs to learn more.

top_p is an alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. See the docs to learn more.
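To make temperature and top_p more concrete, here is a small Python sketch of how they are usually described as reshaping a token distribution. It illustrates the general technique on a toy vocabulary and is not OpenAI's actual implementation; the function names and example logits are made up.

```python
import math
from typing import Dict


def apply_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Softmax over logits divided by temperature; 0 is treated as argmax."""
    if temperature == 0:
        best = max(logits, key=logits.get)
        return {tok: (1.0 if tok == best else 0.0) for tok in logits}
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def nucleus(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    mass = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}


toy_logits = {"dragon": 2.0, "gremlin": 1.0, "toaster": 0.0}
print(apply_temperature(toy_logits, 0.7))
print(nucleus(apply_temperature(toy_logits, 1.0), top_p=0.5))
```

Lower temperatures concentrate probability on the top token, while a small top_p removes the unlikely tail entirely, which is one way to see why adjusting both at once is hard to reason about.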

echo is a boolean flag specifying whether or not to echo back the prompt in addition to the completion. See the docs.

stop can contain up to four sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. See the docs to learn more.

presence_penalty is a number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. See the docs.

frequency_penalty is a number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. See the docs.
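The difference between the two penalties is easier to see in code. The Python sketch below follows the logit adjustment OpenAI has described for these parameters, as best understood here: the frequency penalty scales with how often a token has already appeared, while the presence penalty is a flat, one-time deduction. The function name and toy values are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List


def penalize(
    logits: Dict[str, float],
    generated: List[str],
    presence_penalty: float = 0.0,
    frequency_penalty: float = 0.0,
) -> Dict[str, float]:
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = {}
    for tok, logit in logits.items():
        count = counts.get(tok, 0)
        seen = 1.0 if count > 0 else 0.0
        # Frequency penalty grows with each repeat; presence penalty applies once.
        adjusted[tok] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


print(penalize({"cat": 1.0, "dog": 1.0}, ["cat", "cat"],
               presence_penalty=0.5, frequency_penalty=0.25))
```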

best_of is the number of completions to generate before selecting the “best” one (the one with the highest log probability per token). Results cannot be streamed (Pack Author's note: which is cool cuz we don’t do that here anyway). Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. See the docs.

logit_bias modifies the likelihood of specified tokens appearing in the completion. Accepts an array of strings in the format "<token>:<value>" that map tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool (which works for both GPT-2 and GPT-3) to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass ["50256:-100"] to prevent the <|endoftext|> token from being generated. See the docs for more details, but note that our input format in Coda differs from the object notation the raw API normally takes!
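The relationship between the Coda-side "<token>:<value>" strings and the object notation of the raw API can be sketched in Python like this. The `parse_logit_bias` function is a hypothetical helper written for illustration; it is not the Pack's actual conversion code.

```python
from typing import Dict, List


def parse_logit_bias(entries: List[str]) -> Dict[str, float]:
    """Turn ["<token>:<value>", ...] strings into a token-ID-to-bias mapping."""
    bias: Dict[str, float] = {}
    for entry in entries:
        token_id, separator, value = entry.partition(":")
        token_id = token_id.strip()
        if not separator or not token_id.isdigit():
            raise ValueError(f"expected '<token>:<value>', got {entry!r}")
        number = float(value)
        if not -100 <= number <= 100:
            raise ValueError(f"bias must be between -100 and 100, got {number}")
        bias[token_id] = number
    return bias


# The example from the text: ban the <|endoftext|> token (ID 50256).
print(parse_logit_bias(["50256:-100"]))
```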

Examples

Using the default parameters it may be tempting to treat this like a passthrough to gpt, but I used the OpenAI defaults for each parameter. That means that when you run a formula like this:

You get a response like this:

This is because max_tokens (which I called size in gpt but expose under its true name in gpt_raw) falls back to OpenAI’s default of 16 here, not the more sensible 256 or 512.

A more robust invocation of gpt_raw may look something like this:

The above query generated the following response for me:

GPT-Plus.dalle

OpenAI’s other offering is DALL-E, a powerful tool for generating images from text. It has certain limitations right now: for instance, you can’t control the shape, which has to be a square. But the capabilities are astonishing, and it’s always getting better.

Arguments

DALL-E’s API is a lot simpler than GPT-3’s for now; it just wants three arguments: prompt, user, and size.

prompt is your image prompt, something like “A happy puppy playing on the moon” or “a bustling office scene”. Have fun with this!

user is a unique identifying string capturing the ID of your user who is invoking DALL-E, used for security purposes. Use something like the currently logged in user's email address.

size must be either the string “small”, the string “medium” or the string “large”. These map under the hood to “256x256”, “512x512” and “1024x1024”, which are the only three currently supported resolutions.
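The size mapping described above amounts to a small lookup table. Here is a Python sketch of it; the names `SIZE_TO_RESOLUTION` and `resolve_size` are made up for illustration, and whether the Pack accepts other capitalizations is not stated here, so this version is strict.

```python
SIZE_TO_RESOLUTION = {
    "small": "256x256",
    "medium": "512x512",
    "large": "1024x1024",
}


def resolve_size(size: str) -> str:
    """Map a friendly size name to the resolution string sent to DALL-E."""
    try:
        return SIZE_TO_RESOLUTION[size]
    except KeyError:
        allowed = ", ".join(SIZE_TO_RESOLUTION)
        raise ValueError(f"size must be one of: {allowed}; got {size!r}") from None


print(resolve_size("medium"))
```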