What Tools Can Newsrooms Use to Evaluate Generative AI Prompts?


A growing list of tools may help you improve your generative AI prompts, but sometimes all you need is a spreadsheet

If your newsroom is using generative AI, you will also want to understand how well your prompts are performing. This is called prompt evaluation, often shortened in industry jargon to “evals.” “…evals are what shape the reliability, usability, and ultimately, the success of AI systems,” according to Braintrust’s marketing material. Braintrust is one of a long list of tools that offer to help you evaluate prompts; others include Promptfoo, EvalLLM, Opik, Evidently, DeepEval, MLflow, Pydantic’s evals, Pytest evals, LangSmith, and ChainForge. Among the big vendors, OpenAI has its own Evals API, Microsoft has the Azure AI Foundry portal, Amazon has Bedrock, and Google offers evaluations in Vertex AI.
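Before reaching for any of these tools, it can help to see how little machinery a basic eval actually requires. The sketch below is a minimal, hypothetical example of the spreadsheet approach: test cases (a prompt plus an expected keyword) live in CSV rows, and each model output is scored by a simple keyword check. The `call_model` function is a placeholder for whatever generative AI API your newsroom uses, and the canned responses exist only so the example runs on its own; the names and test cases are illustrative, not drawn from any of the tools above.

```python
import csv
import io

def call_model(prompt: str) -> str:
    # Placeholder: swap in a real generative AI API call here.
    # Canned responses are included only so this sketch is runnable.
    canned = {
        "Summarize: The council voted 5-2 to approve the budget.":
            "The council approved the budget by a 5-2 vote.",
        "Extract the vote count: The council voted 5-2.":
            "5-2",
    }
    return canned.get(prompt, "")

def run_evals(rows):
    """Score each test case and return per-case results plus a pass rate."""
    results = []
    for row in rows:
        output = call_model(row["prompt"])
        # A crude but common grading rule: does the output contain
        # the keyword the spreadsheet says it should?
        passed = row["expected_keyword"].lower() in output.lower()
        results.append({"prompt": row["prompt"], "passed": passed})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return results, pass_rate

# Test cases as they might be exported from a spreadsheet.
CSV_DATA = """prompt,expected_keyword
"Summarize: The council voted 5-2 to approve the budget.",approved
"Extract the vote count: The council voted 5-2.",5-2
"""

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
    results, pass_rate = run_evals(rows)
    print(f"pass rate: {pass_rate:.0%}")  # prints "pass rate: 100%"
```

Keyword matching is the simplest possible grader; the dedicated tools listed above layer on more sophisticated scoring, such as model-graded rubrics and side-by-side comparisons, but the core loop of "run prompt, compare output, tally results" is the same.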

What Tools Can Newsrooms Use to Evaluate Generative AI Prompts? was originally published in Generative AI in the Newsroom on Medium where you can read the complete article.