Just over a month ago, I asked whether Apple’s Siri was finally starting to improve thanks to the company’s investments both in the service specifically and in AI and machine learning in general. Some of you who responded to that article shared anecdotal evidence similar to my own, and today we got a bit of verifiable data that positive change really is occurring and that Siri is gradually improving.
Gene Munster’s Loup Ventures recently conducted an 800-question test of Apple’s Siri, Google’s Assistant, Amazon’s Alexa, and Microsoft’s Cortana. This is the second installment in what should be a running series tracking the changes and adaptations of all of the major digital assistant platforms.
In the previous test, conducted in April of last year, Siri performed pretty poorly, scoring only a D+. As for what the 800 questions cover, here is a bit from the methodology:
Just as we have in April of this year, we asked 800 questions to Siri on an iPhone X and 8 Plus the last week of December 2017. The queries covered five categories: Local, Commerce, Navigation, Information, and Command. Siri was graded on two metrics: did she understand what was asked? (this can be seen on the device’s screen), and did she answer or execute correctly? It is important to note that we have slightly modified our question set to be more reflective of the changing abilities of AI assistants. As voice computing becomes more versatile and digital assistants become more capable, we will continue to update our question set to be reflective of those improvements going forward. Our changes included questions around the use of smart home devices. We tested Siri with the Philips Hue smart lighting and Wemo Mini smart plugs.
In the previous test, Siri understood 94% of the queries, but only gave the correct answer 66% of the time, hence the low letter grade. In the more recent test, Siri improved from a D+ to a C. This time out, it recognized 99% of the queries and gave correct answers to 75% of them. Note that, while largely the same question set was used (slightly modified, as the methodology above explains), the questions have not been publicly revealed and are not known to the companies involved. As such, while this may not rise to the level of a scientific study, it is still an effective measuring stick that shows verifiable improvement in Siri.
As for the competition, Apple still lags well behind Google Assistant, which isn’t a surprise; Assistant scored 81% accuracy on the same test. What is a little surprising is that Alexa and Cortana placed well behind Siri, with grades of D and F, respectively. While this is interesting, we won’t be able to draw any definite conclusions until we have another year’s worth of these tests and can see Apple’s forthcoming HomePod tested head-to-head against these same assistant platforms. Loup Ventures runs a separate test for smart speakers using the same questions and methodology, which should include the HomePod after its release.
The good news here is that Siri really is getting better. I expect the improvements to remain small for now, but as Apple’s investments in AI and machine learning grow and mature, the pace should pick up. The release of the HomePod could also be a real boon for Siri, as it should spur increased usage of the service. That, in turn, will give Apple an increasing amount of data for training and improving Siri. If all goes well, this could become a positive feedback loop that finally pushes Siri past years of relative stagnation.