LLM Evals: Setup and the Metrics That Matter