Using AtlaAI Selene 1 Mini on AWS

Learn how to deploy Atla AI Selene 1 Mini, a new SOTA SLM-as-a-judge, on Amazon Bedrock and Amazon SageMaker AI

Published Jan 30, 2025
Selene 1 Mini from Atla AI is a new Small Language Model (SLM) that claims to be the best-performing model at "SLM-as-a-judge": using a language model to evaluate the outputs of other language models. Selene Mini outperforms prior small evaluation models on average performance across 11 benchmarks, spanning three types of evaluation tasks:
  • Absolute scoring, e.g. "Evaluate the harmlessness of this response on a scale of 1-5."
  • Classification, e.g. "Does this response address the user query? Answer Yes or No."
  • Pairwise preference, e.g. "Which of the following responses is more logically consistent - A or B?"
On some benchmarks, Selene Mini beats models several times its size, outperforming GPT-4o on RewardBench, EvalBiasBench, and Auto-J. It is also the highest-scoring 8B generative model on RewardBench, and the top-ranking model on Judge Arena. It is a fine-tune of Llama-3.1-8B, trained on dedicated datasets with a combination of direct preference optimization (DPO) and supervised fine-tuning (SFT) to create a general-purpose evaluator. The weights are available as open source on [HuggingFace](https://huggingface.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B).
Atla AI Selene 1 Mini - SLM-as-a-Judge Benchmarking
I will first show you how to deploy it to Amazon SageMaker AI, both on GPU and on AWS Inferentia2 chips; Selene 1 Mini is compatible with Inferentia because it is based on the Llama 3.1 architecture. Then, I will show how to use it with Amazon Bedrock Custom Model Import for serverless inference.

Amazon SageMaker

Deployment on GPU

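As a minimal sketch, the model can be deployed straight from the HuggingFace Hub with the Hugging Face LLM (TGI) container on SageMaker. The instance type, environment values, and endpoint name below are assumptions; adjust them to your account and latency needs:

```python
# Container environment (assumed values; tune to your context-length needs)
GPU_ENV = {
    "HF_MODEL_ID": "AtlaAI/Selene-1-Mini-Llama-3.1-8B",
    "SM_NUM_GPUS": "1",
    "MAX_INPUT_LENGTH": "8192",
    "MAX_TOTAL_TOKENS": "8704",
}

def deploy_selene_mini_gpu(role_arn: str, endpoint_name: str = "selene-1-mini"):
    """Deploy Selene 1 Mini to a SageMaker real-time endpoint on GPU."""
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    model = HuggingFaceModel(
        role=role_arn,
        image_uri=get_huggingface_llm_image_uri("huggingface"),  # latest TGI image
        env=GPU_ENV,
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",  # assumption: a single-GPU G5 instance
        endpoint_name=endpoint_name,
        container_startup_health_check_timeout=600,  # model download takes a while
    )
```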
Once your model is deployed, you can start invoking it:
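A sketch of the invocation using the SageMaker runtime; the request body follows the TGI schema, and the near-zero temperature is an assumption to keep judge outputs close to deterministic:

```python
import json

def build_tgi_payload(prompt: str) -> dict:
    # TGI-style request body; low temperature for stable evaluation scores
    return {"inputs": prompt, "parameters": {"max_new_tokens": 512, "temperature": 0.01}}

def invoke_judge(endpoint_name: str, prompt: str, region: str = "us-east-1") -> str:
    import boto3
    smr = boto3.client("sagemaker-runtime", region_name=region)
    resp = smr.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(build_tgi_payload(prompt)),
    )
    return json.loads(resp["Body"].read())[0]["generated_text"]
```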
More details on prompting strategies are at the end of this blog.

Deployment on Inferentia 2

For this example, I'm using the DJL NeuronX LMI (Large Model Inference) container. The LMI container serves the Llama model with vLLM, so we need to configure it first:
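A sketch of a `serving.properties` for the NeuronX LMI container. The tensor-parallel degree and batch size below are assumptions sized for a small inf2 instance; match the tensor-parallel degree to the number of Neuron cores on your instance:

```properties
engine=Python
option.model_id=AtlaAI/Selene-1-Mini-Llama-3.1-8B
option.rolling_batch=vllm
option.tensor_parallel_degree=2
option.max_rolling_batch_size=4
option.model_loading_timeout=1600
```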
Now we can deploy:
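Deployment then looks much like the GPU case, pointing SageMaker at the NeuronX LMI image. The image version, instance type, and timeouts below are assumptions; check the DJL-Serving LMI documentation for the current image:

```python
def deploy_selene_mini_inf2(role_arn: str, config_s3_uri: str,
                            endpoint_name: str = "selene-1-mini-inf2"):
    """Deploy with the DJL-Serving NeuronX LMI container on Inferentia2."""
    import sagemaker
    from sagemaker.model import Model

    sess = sagemaker.Session()
    image_uri = sagemaker.image_uris.retrieve(
        framework="djl-neuronx", region=sess.boto_region_name, version="0.27.0"
    )
    model = Model(
        image_uri=image_uri,
        model_data=config_s3_uri,  # S3 tarball containing serving.properties
        role=role_arn,
        sagemaker_session=sess,
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.inf2.8xlarge",  # assumption: one Inferentia2 chip, 2 cores
        endpoint_name=endpoint_name,
        container_startup_health_check_timeout=1800,  # Neuron compilation can be slow
    )
```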

Amazon Bedrock - Custom Model Import

Learn more about Amazon Bedrock Custom Model Import [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html).
Start by downloading the model from the HuggingFace Hub:
Once done, sync it with Amazon S3 to wherever you prefer:
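A sketch of those two steps in Python; the bucket and prefix are placeholders, and the sync is shelled out to the AWS CLI since it handles multi-part uploads of the large weight files:

```python
import subprocess

MODEL_ID = "AtlaAI/Selene-1-Mini-Llama-3.1-8B"
LOCAL_DIR = "./selene-1-mini"
S3_URI = "s3://my-bucket/models/selene-1-mini/"  # hypothetical bucket/prefix

def download_and_sync():
    from huggingface_hub import snapshot_download
    # Pull the full checkpoint (config, tokenizer, safetensors) from the Hub
    snapshot_download(MODEL_ID, local_dir=LOCAL_DIR)
    # Multi-part upload with retries via the AWS CLI
    subprocess.run(["aws", "s3", "sync", LOCAL_DIR, S3_URI], check=True)
```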
Now, go to the AWS Console to start an Amazon Bedrock Custom Model Import job, or use the code below:
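A boto3 sketch of starting the import job; the job name, role ARN, and S3 URI are placeholders, and the role must grant Bedrock read access to your bucket:

```python
# Placeholder names/ARNs; replace with your own
JOB_NAME = "selene-1-mini-import"
MODEL_NAME = "selene-1-mini"
ROLE_ARN = "arn:aws:iam::123456789012:role/BedrockCMIRole"  # hypothetical role
S3_URI = "s3://my-bucket/models/selene-1-mini/"             # hypothetical prefix

def start_import_job(region: str = "us-east-1") -> str:
    import boto3
    bedrock = boto3.client("bedrock", region_name=region)
    resp = bedrock.create_model_import_job(
        jobName=JOB_NAME,
        importedModelName=MODEL_NAME,
        roleArn=ROLE_ARN,
        modelDataSource={"s3DataSource": {"s3Uri": S3_URI}},
    )
    return resp["jobArn"]
```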
This process takes ~15 minutes. Once completed, you can start using your model! The first time you invoke the API, you will have to wait a couple of minutes until the model is ready for inference:
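A sketch of invoking the imported model: the `modelId` is the ARN returned by the import job, and the request/response schema below follows the Llama-style format (an assumption; verify it against the Custom Model Import documentation):

```python
import json

def build_body(prompt: str) -> str:
    # Llama-style request schema (assumed; check the CMI docs for your model)
    return json.dumps({"prompt": prompt, "max_gen_len": 512, "temperature": 0.0})

def invoke_imported_model(model_arn: str, prompt: str, region: str = "us-east-1") -> str:
    import boto3
    brt = boto3.client("bedrock-runtime", region_name=region)
    resp = brt.invoke_model(modelId=model_arn, body=build_body(prompt))
    return json.loads(resp["body"].read())["generation"]
```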

Prompting for Selene 1 Mini

Atla AI Selene 1 Mini does not require a specific prompt format, at least according to its HuggingFace model card. However, the authors are kind enough to provide a sample prompt. I've slightly modified it to keep the context separate from the user input; here it is for your convenience:
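The block below is only an illustrative stand-in for that prompt (not Atla's actual wording), showing the shape with `{user_input}` and `{assistant_response}` placeholders kept separate:

```python
# Illustrative evaluation prompt; replace with the (modified) sample prompt
# from the Selene model card.
EVAL_PROMPT = """You are tasked with evaluating an assistant's response to a user input.
Provide a short critique, then give a final score on a scale of 1 to 5.

[User input]
{user_input}

[Assistant response]
{assistant_response}

[Your evaluation]
"""
```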
You can add your context and prompt to the `user_input` variable, and the LLM output to `assistant_response`, with standard string formatting in Python, and then use the result for inference:
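A sketch of that formatting step against a SageMaker endpoint; the template here is a deliberately minimal placeholder, and the endpoint name and request schema are assumptions:

```python
import json

# Minimal placeholder template with the two variables described above
template = ("Evaluate the response.\n[User input]\n{user_input}\n"
            "[Assistant response]\n{assistant_response}\n")

user_input = "What is the capital of France?"           # context + prompt
assistant_response = "The capital of France is Paris."  # LLM output to judge

prompt = template.format(user_input=user_input, assistant_response=assistant_response)

def run_inference(endpoint_name: str, prompt: str, region: str = "us-east-1") -> str:
    import boto3
    smr = boto3.client("sagemaker-runtime", region_name=region)
    resp = smr.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 512}}),
    )
    return json.loads(resp["Body"].read())[0]["generated_text"]
```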

Happy coding! 🚀 If this content has been useful, please leave a like 👍🏻️ or a comment 🗯. This will let me know that my work has been appreciated! 😄

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
