As tech companies continue to roll out large language models (LLM) with impressive results, measuring their real capabilities is becoming more difficult. According to a technical report released by ...