YodaQA’s abilities are enlarged by traffic domain
Guest post by Petr Marek (source)
Everybody driving a car needs the navigation to get to the destination fast and avoid traffic jam. One of the biggest problems is how to enter fast the destination and how to find where are the congestions, what is the traffic situation. YodaQA Traffic is a project attempting to answer the traffic related questions quickly and efficiently. Drivers may ask questions in natural language like: “What is the traffic situation in the Evropská street?†or “What is the fastest route from Opletalova street to Kafkova street?†You can try out the prototype (demo available only for limited time) – try to ask for example “traffic situation in the Wilsonova street†.
YodaQA Traffic still has some limitations. Currently we only have a browser version not suitable for smart phones. It is answering traffic questions for Prague’s streets only.
But as usual, this whole technology demo is open source – you can find it in the branch f/traffic-flow of our Hub project.
How does it work and where we get the data from?
All YodaQA are first analyzed to recognize and select traffic questions. We do it in two steps. The first step is to recognize the question topic. We use six topics like traffic situation, traffic incident or fastest route. The topic is determined by comparing semantic similarity of the user’s question with a set of reference questions. We estimate the similarity with our Dataset-STS Scoring API. Each reference question is labeled by a “topicâ€. The Sentence Pair Similarity algorithm selects the reference question “topic†with the highest similarity to the question.
Next we need to recognize the location, i.e. to recognize the street name. This is handled by another tool called the Label-lookup which we normally use for entity linking in YodaQA. It compares questions words with a list of all street names in the Prague. We exported the list of streets names in Prague from OpenStreetMap. We do not do exact match, we try to select the closest street name from the list.
The last step is to decide whether the question is really the traffic question, because the Dataset-STS API and Label-lookup can find topic and street name even in a pure movie question like “When was the Nightmare on Elm Street released?â€. The Dataset-STS and Label-lookup return not only topic or street name but also the score, fortunately. We created dataset of over 70 traffic questions and over 300 movies questions and founded the minimal score thresholds, with which the recognition makes the lowest classification error on this dataset.
Once we know the type of question and the location we start a small script accessing the traffic situation data from HERE Maps. The only complication is that the the API doesn’t return traffic situation for particular street, but bounding box only. To overcome this problem we have to find a bounding box for a desired location, using an algorithm we developed for this purpose. Then we call the traffic flow API to acquire the information for all streets in the bounding box. Finally, we filter out the traffic situation for the desired street.
It was great fun to work on this application, it is not perfect but it shows how to create intelligent assistants helping people solving various everyday situations. We are also excited to see, how the users will use the new functionality of YodaQA and how it will help them.
Recent Comments