By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they're far from simple answer-generators; they can also express abstract concepts, such as ...
Use the R packages vitals and ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.