AWS brings RAG evaluation and LLM-as-a-judge feature to Amazon Bedrock

December 2, 2024

The new feature, currently in preview, will allow developers to test and evaluate other models with human-like quality at a lower cost than having humans run these evaluations, according to the company.

LLM-as-a-judge makes it easier for enterprises to go into production by providing fast, automated evaluation of AI-powered applications, shortening feedback loops, and speeding up improvements, AWS said. The evaluations assess multiple quality dimensions including correctness, helpfulness, and responsible AI criteria such as answer refusal and harmfulness.