kipi.ai Interview Question

How do you evaluate large language models?