Deploy vanilla or fine-tuned GPT-J replicas.


The process is the same whether you deploy a vanilla or a fine-tuned GPT-J replica.
This guide deploys a vanilla GPT-J replica.
For more information on deploying a fine-tuned GPT-J model, refer to our guide.

Create a new deployment

Select 'Vanilla GPT-J'

Press 'Deploy'

After you press 'Deploy', the model typically becomes available via the HTTP API or the Playground in less than 5 minutes.
You can also change the number of running replicas to scale with increased usage. Auto-scaling will be available within the next few weeks.
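
Because a new deployment can take a few minutes to come online, a client may want to poll the HTTP API before sending real traffic. The following is a minimal sketch of one way to do that, not the platform's official client. The endpoint URL, the bearer-token header, and the `text` payload field are assumptions made for illustration; take the real values from your deployment's details.

```python
import json
import time
import urllib.request


def build_request(endpoint_url, api_key, prompt):
    """Build a POST request for a completion endpoint.

    The bearer-token header and the "text" field are assumptions;
    adjust them to match your deployment's actual API.
    """
    payload = json.dumps({"text": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def is_ready(request, timeout_s=10):
    """Return True if the endpoint answers the request with HTTP 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def wait_until_ready(check, max_wait_s=300, interval_s=15,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or max_wait_s elapses."""
    deadline = clock() + max_wait_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example usage (hypothetical URL and key):
# request = build_request("https://example.invalid/your-deployment", "YOUR_API_KEY", "Hello")
# ready = wait_until_ready(lambda: is_ready(request))
```

The default `max_wait_s=300` mirrors the "less than 5 minutes" figure above; raise it if your deployment takes longer. Passing `sleep` and `clock` as parameters keeps the polling loop easy to test without real delays.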
