I wish they had broken it out by AI. The article states:
"Gemini performed worst with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance."
But I don't see that anywhere in the linked PDF of the "full results".
This sort of study should also be redone from time to time to track how the results change across AI versions.