Experiment with the API.


Using the Playground, you can experiment with your different GPT-J model APIs.
Customize the parameters, type in any text, and press Submit.
Don't have an account? Get started with our free, public playground.

Key concepts


The prompt is how you “program” the model to achieve the response you’d like. GPT-J can do everything from writing original stories to generating code. Because of its wide array of capabilities, you have to be explicit in showing it what you want. Telling and showing are the secret to a good prompt.
GPT-J tries to guess what you want from the prompt. If you write the prompt “Give me a list of fiction books”, the model may not automatically assume you’re asking for a list of books. Instead, it could interpret the prompt as the first line of a conversation and continue with something like “and I’ll tell you my favorite.”
There are three basic tips to creating prompts:
1. Check your settings
The temperature and top_p parameters are what you will typically configure based on the task. These parameters control how deterministic the model is in generating a response; a common mistake is assuming they control “creativity”. If you're looking for a response that's not obvious, you might want to set them higher. If you're asking for a response where there's only one right answer, you'd want to set them lower. More on GPT-J parameters later.
2. Show and tell
Make it clear what you want through a combination of instructions and examples. Returning to our previous example, instead of:
“Give me a list of fiction books.”
try:
“Give me a list of fiction books. Here’s an example list: Harry Potter, Game of Thrones, Lord of the Rings.”
3. Provide quality data
If you’re trying to classify text or get the model to follow a pattern, make sure that there are enough examples. Not only is providing sufficient examples important, but the examples should be proofread for spelling or grammatical errors. While the model is usually capable of seeing through simple errors, it may believe they are intentional.


Whitespace, or what happens when you press the Spacebar, can be one or more tokens depending on context. Never leave trailing whitespace at the end of your prompt, as it can have unintended effects on the model’s response.


GPT-J understands and processes text by breaking it down into tokens. As a rough rule of thumb, 1 token is approximately 4 characters. For example, the word “television” gets broken up into the tokens “tele”, “vis” and “ion”, while a short and common word like “dog” is a single token. Tokens are important to understand because GPT-J, like other language models, has a maximum context length of 2048 tokens, or roughly 1500 words. The context length includes both the text prompt and generated response.
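The 4-characters-per-token rule of thumb can be sketched as a quick budget check (the exact count depends on GPT-J's actual tokenizer, so this is a heuristic estimate only; the function names here are illustrative, not part of any API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, response_tokens: int, context_limit: int = 2048) -> bool:
    """The 2048-token context window covers both the prompt and the response."""
    return estimate_tokens(prompt) + response_tokens <= context_limit

prompt = "Give me a list of fiction books."
print(estimate_tokens(prompt))       # roughly 8 tokens
print(fits_in_context(prompt, 100))  # True
```

For a precise count you would need to run the model's real tokenizer; use estimates like this only for rough sizing of prompts and response lengths.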


Parameters are different settings that control the way in which GPT-J responds. Becoming familiar with the following parameters will allow you to apply GPT-J to a number of different tasks.
Response length
Response length is the length of the generated text, in tokens, you’d like based on your prompt. A token is roughly 4 characters including alphanumerics and special characters.
Note that the max response length for GPT-J is 2048 tokens.
Temperature
Temperature controls the randomness of the generated text. A value of 0 makes the engine deterministic, which means that it will always generate the same output for a given input text. A value of 1 makes the engine take the most risks.
As a frame of reference, temperature values between 0.7 and 0.9 are common for story completion or idea generation.
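The effect of temperature can be illustrated with a simplified sampling sketch: logits are divided by the temperature before being turned into probabilities, so low values sharpen the distribution and high values flatten it. This is an illustration of the concept, not GPT-J's actual sampling code:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=None):
    """Sample a token index; temperature 0 is treated as greedy (argmax)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    # Softmax with max-subtraction for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index from the resulting distribution.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, 0))  # always index 0: deterministic
```

At temperature 1 the lower-probability tokens are sampled much more often, which is why higher values suit open-ended tasks.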
Top-P
Top-P is an alternative way of controlling the randomness of the generated text. We recommend using only one of Temperature and Top-P; when adjusting one of them, make sure the other is set to 1.
A rough rule of thumb is that Top-P provides better control for applications in which GPT-J is expected to generate text with accuracy and correctness, while Temperature works best for those applications in which original, creative or even amusing responses are sought.
Top-K
Top-K sampling sorts tokens by probability and zeroes out the probabilities of all tokens after the k-th most likely one. A lower value improves quality by removing the low-probability tail, making the model less likely to go off topic.
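In contrast to Top-P's adaptive cutoff, Top-K keeps a fixed number of candidates. A minimal sketch with the same made-up probabilities:

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize.
    probs: list of (token, probability)."""
    ranked = sorted(probs, key=lambda kv: kv[1], reverse=True)
    kept = ranked[:k]
    total = sum(p for _, p in kept)
    return [(token, p / total) for token, p in kept]

probs = [("dog", 0.5), ("cat", 0.3), ("car", 0.15), ("sky", 0.05)]
print(top_k_filter(probs, 2))  # [("dog", 0.625), ("cat", 0.375)]
```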
Repetition penalty
Repetition penalty works by lowering the chances of a word being selected again the more times that word has already been used. In other words, it works to prevent repetitive word usage.
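One common formulation of this idea (introduced in the CTRL paper; the exact formula a given GPT-J service uses may differ) divides the logits of already-generated tokens so they become less likely to be picked again:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Penalize tokens that already appeared: divide positive logits
    (and multiply negative ones) by the penalty. Illustrative only."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty
        else:
            adjusted[token_id] *= penalty
    return adjusted

logits = [3.0, 1.0, -0.5]
# Tokens 0 and 2 were already generated, so both are pushed down.
print(apply_repetition_penalty(logits, [0, 2]))  # [2.5, 1.0, -0.6]
```

A penalty of 1.0 leaves the logits unchanged; larger values suppress repeated words more aggressively.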
Stop sequences
Stop sequences allow you to define one or more character sequences that, when generated, force GPT-J to stop generating further text.
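In practice the service cuts generation off server-side as soon as a stop sequence appears; the behavior can be sketched as truncating at the earliest match (a post-hoc illustration, not the actual implementation, and "###" is just an example delimiter):

```python
def truncate_at_stop(text, stop_sequences):
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

generated = "Harry Potter\n###\nGame of Thrones"
print(repr(truncate_at_stop(generated, ["###"])))  # 'Harry Potter\n'
```

The stop sequence itself is excluded from the returned text, which matches how stop parameters typically behave in completion APIs.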