Deploying a finetuned LLM in a serverless container

Hi, I want to deploy a finetuned LLM in a serverless container. Since the model is quite large (>100GB), it's not practical to download the weights at request time.

I have also noticed that sometimes a new request spins up a fresh container, which downloads the image again. How can I make sure that my image and its model weights are downloaded only once?

Hi @Alpaca, you can use the general storage for storing the weights. That volume is shared between all replicas of your deployment. Download the weights to /data in your container, or change the mount point to whatever location your container saves the weights to.
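A minimal sketch of the download-once pattern on a shared volume: each replica checks whether the weights already exist under the mount before downloading, so only the first replica pays the download cost. The `/data/my-finetuned-model` path and the `download` callable are illustrative assumptions, not part of any specific platform's API.

```python
import os

# Assumed mount point of the shared volume (adjust to your deployment).
MODEL_DIR = "/data/my-finetuned-model"

def ensure_weights(model_dir: str, download) -> bool:
    """Download model weights only if they are not already on the shared volume.

    `download` is any callable that populates `model_dir` (e.g. a call to
    huggingface_hub.snapshot_download or an object-store sync).
    Returns True if a download happened, False if the weights were reused.
    """
    # If the directory exists and is non-empty, another replica already
    # downloaded the weights; reuse them instead of downloading again.
    if os.path.isdir(model_dir) and os.listdir(model_dir):
        return False
    os.makedirs(model_dir, exist_ok=True)
    download(model_dir)
    return True
```

Note that with many replicas cold-starting at once, two of them can race past the existence check; for production you may want a lock file or an atomic rename of a temporary download directory.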