Benchmarks and Evaluation

Formal tests for common sense in AI

Diagnosing hallucinations and novelty handling

Explore how to measure and benchmark common sense in artificial intelligence.