Introducing AskCricinfo


An artificial-intelligence-based stats query tool that can answer questions put to it in regular English

Remember the stats scrapbook?

The question is a giveaway to my vintage. How many of you will know how it was to keep track of cricket before ESPNcricinfo came along?

I kept my own database: pages and pages of averages and records clipped from magazines and newspapers and pasted into large notebooks. Because everything got outdated quickly, my little project soon occupied a shelf and threatened to consume many more.

You had to have been there in that long-ago age to know how magical it felt to see averages tick over live, and to watch record pages update themselves on ESPNcricinfo. A few years after, along came Statsguru to spoil numbers geeks. It still remains the cricket world’s default tool to search, dice and zoom in and out of stats across players, teams, eras and formats. Early in my job as editor of the site, I remember the utter panic in a media box once when Statsguru went down for a while towards the end of a game. I could feel eyes on me, but instead of feeling awkward, I was quietly chuffed.

AskCricinfo is a natural, and if I may say, slightly giddy, step forward in that journey: another breakthrough, a little more magic. For all its wonders, Statsguru, with all those filters that allow you to drill and drill, can be a bit overwhelming for the newbie. But what if you could get Virat Kohli’s record against legspinners in the last two seasons in the IPL by just typing out a query about it in regular English?

Or the answers to other queries, like these:

Just AskCricinfo. Get to the page. Key in the question. Or ask it aloud if you have voice input enabled for your phone browser. The answer will flash in a few seconds.

Of course it’s not all that simple at our end. First, the AskCricinfo engine needs to think like a cricket fan and interpret “Which opening batter has faced the most dot balls in the powerplays in an innings in IPL?” or “What is the highest partnership for the sixth wicket after five wickets down for less than 50 in the CPL?” correctly. Next, it has to translate that into language that computers understand, and also correctly identify which database to send the question to – our Statsguru engine, the ball-by-ball database, or the records pages. After these tasks, it needs to return the exact answers to those questions.
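
To picture the routing step described above, here is a toy sketch. It is not how the AskCricinfo engine works: the real system uses AI to interpret questions, while this one only matches a few hard-coded phrases. The `Backend` names, the cue lists and the `route` function are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    """The three data sources named in the article (identifiers are made up)."""
    STATSGURU = "statsguru"
    BALL_BY_BALL = "ball_by_ball"
    RECORDS = "records"


# Hypothetical cue phrases. A real engine would not rely on a fixed list.
BALL_BY_BALL_CUES = ("dot ball", "powerplay", "death overs", " vs ", " v ", " versus ")
RECORDS_CUES = ("highest", "lowest", "most", "fastest")


@dataclass
class RoutedQuery:
    question: str
    backend: Backend


def route(question: str) -> RoutedQuery:
    """Pick a backend for a question using simple substring checks."""
    # Pad with spaces so cues like " v " also match at the edges of the text.
    text = f" {question.lower()} "
    if any(cue in text for cue in BALL_BY_BALL_CUES):
        return RoutedQuery(question, Backend.BALL_BY_BALL)
    if any(cue in text for cue in RECORDS_CUES):
        return RoutedQuery(question, Backend.RECORDS)
    # Anything else falls through to the general stats engine.
    return RoutedQuery(question, Backend.STATSGURU)
```

In this sketch, the dot-balls question would be sent to the ball-by-ball data and the sixth-wicket partnership question to the records pages. A phrase list like this breaks on the first unusual wording, which hints at why the real task is so much harder.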

These are not static web pages called up by keyword-matching, as with normal search engines. Here, each question is unique, there are millions of questions that our databases can answer, and a million ways the engine can go wrong. The enormous complexity of this task is why you haven’t seen anything like this – not for cricket, not for any sport, not for anything else – where a natural language question returns a precise answer from a database.

Since a lot of these answers are derived from ball-by-ball data, this version of AskCricinfo only fully supports T20 cricket: top-flight international cricket, both men’s and women’s, and all the major leagues. We have started with the newest format simply because there is relatively more ball-by-ball data available for T20 as a whole. Our aim is to build AskCricinfo tournament by tournament, and then format by format.

Artificial intelligence, as most of us know by now, gets better with use. Like with the discovery algorithms on streaming apps, AskCricinfo will get smarter the more questions it is asked.

In the period leading up to this launch, we have endured as much frustration as we have experienced exhilaration. When the first answer came through, it felt like magic. But as expectations and ambition grew, there were days when the task felt hopeless. Accepting the pursuit of accuracy as a more realistic goal freed us from the burden of trying to achieve perfection – unattainable in this case.

So let’s set our expectations. AskCricinfo is designed to answer queries based on scoring data. It doesn’t have opinions: you can ask who has taken the most wickets in the death overs, but not “Who is the best bowler in the death overs?” It doesn’t deal in trivia (How many batters have scored hundreds on their birthday? How many sets of brothers currently play in the IPL?) And there is some data it doesn’t process yet. (Who was the fastest bowler? Who has scored the most runs from cover drives?)

Since the tool is in learning mode, the more specific the questions, the better the answers. Try to spell things right, particularly names; use specific terms like “batting”, “bowling”, “in an innings”, “in a match”, “in a season”; “Ben Stokes’ bowling record” instead of “Ben Stokes’ record”. Use “versus” or “vs” or “v” for batter-versus-bowler queries, and “and” to compare two players (“Kohli vs Rashid”, “Compare Warner and Rohit Sharma strike rates in powerplays”).

Those inclined to help, please make sure to click the thumb icons at the bottom of the AskCricinfo page. Particularly the “thumbs down” one if you think the result is incorrect. And if you can, tell us what went wrong in the text field below. Every mistake is an opportunity to learn.

The AskCricinfo project reunited us with one of ESPNcricinfo’s original co-founders, and it began, true to tradition, with a chance conversation online. Vishal Misra, whose most famous contribution to the site is re-engineering the live scorecard overnight during the 1996 World Cup, but who stopped working for Cricinfo long before I came on board, tagged me in a tweet that suggested someone should create an interface for Statsguru using an AI tool that he had come across. “I am waiting for you to do it,” I replied.

That little chat lit a spark. In a few days Vishal, who has been a professor, entrepreneur and consultant, and a cricket nut at his core, sent me a prototype, and in a matter of a couple of months he had put together a small team that has worked with a small team of ours to bring AskCricinfo to life.

Come and share the adventure with us. And let’s take it to the places it can go.

Sambit Bal is editor-in-chief of ESPNcricinfo @sambitbal

Source: ESPNcricinfo
