As tech companies continue to roll out large language models (LLMs) with impressive results, measuring their real capabilities is becoming more difficult. According to a technical report released by OpenAI, GPT-4 performs strongly on bar exams, SAT math tests, and reading and writing exams.
However, tests designed for humans may not be good…