Archive

Posts Tagged ‘dataset-sts’

YodaQA’s abilities are enlarged by traffic domain

May 23rd, 2016 1 comment

Guest post by Petr Marek (source)

Everybody driving a car needs the navigation to get to the destination fast and avoid traffic jam. One of the biggest problems is how to enter fast the destination and how to find where are the congestions, what is the traffic situation. YodaQA Traffic is a project attempting to answer the traffic related questions quickly and efficiently. Drivers may ask questions in natural language like: “What is the traffic situation in the Evropská street?” or “What is the fastest route from Opletalova street to Kafkova street?” You can try out the prototype (demo available only for limited time) – try to ask for example “traffic situation in the Wilsonova street” .

YodaQA Traffic still has some limitations. Currently we only have a browser version not suitable for smart phones. It is answering traffic questions for Prague’s streets only.

But as usual, this whole technology demo is open source – you can find it in the branch f/traffic-flow of our Hub project.

How does it work and where we get the data from?

All YodaQA are first analyzed to recognize and select traffic questions. We do it in two steps. The first step is to recognize the question topic. We use six topics like traffic situation, traffic incident or fastest route. The topic is determined by comparing semantic similarity of the user’s question with a set of reference questions. We estimate the similarity with our Dataset-STS Scoring API. Each reference question is labeled by a “topic”. The Sentence Pair Similarity algorithm selects the reference question “topic” with the highest similarity to the question.

Next we need to recognize the location, i.e. to recognize the street name. This is handled by another tool called the Label-lookup which we normally use for entity linking in YodaQA. It compares questions words with a list of all street names in the Prague. We exported the list of streets names in Prague from OpenStreetMap. We do not do exact match, we try to select the closest street name from the list.

The last step is to decide whether the question is really the traffic question, because the Dataset-STS API and Label-lookup can find topic and street name even in a pure movie question like “When was the Nightmare on Elm Street released?”. The Dataset-STS and Label-lookup return not only topic or street name but also the score, fortunately. We created dataset of over 70 traffic questions and over 300 movies questions and founded the minimal score thresholds, with which the recognition makes the lowest classification error on this dataset.

Once we know the type of question and the location we start a small script accessing the traffic situation data from HERE Maps. The only complication is that the the API doesn’t return traffic situation for particular street, but bounding box only. To overcome this problem we have to find a bounding box for a desired location, using an algorithm we developed for this purpose. Then we call the traffic flow API to acquire the information for all streets in the bounding box. Finally, we filter out the traffic situation for the desired street.

It was great fun to work on this application, it is not perfect but it shows how to create intelligent assistants helping people solving various everyday situations. We are also excited to see, how the users will use the new functionality of YodaQA and how it will help them.

Categories: ailao, software Tags: , , , , , , ,

Semantic Sentence Pair Scoring

May 20th, 2016 No comments

The blog has been a little bit silent – a typical sign of us working too hard to worry about that! But we’ll satisfy some of your curiosity in the coming weeks as we have about six posts in the pipeline.

The thing I would like to mention first is some fundamental research we work on now. I stepped back from my daily Question Answering churn and took a little look around and decided the right thing to focus for a while are the fundamentals of the NLP field so that our machine learning works better and makes more sense. Warning: We’ll use some scientific jargon in this one post.

So, in the first months of 2016 I focused huge chunk of my research on deep learning of natural language. That means neural networks used on unstructured text, in various forms, shapes and goals. I have set some audacious goals for myself, fell short in some aspects but still made some good progress hopefully. Here’s the deal – a lot of the current research is about processing a single sentence, maybe to classify its sentiment or translate it or generate other sentences. But I have noticed that recently, I have seen many problems that are about scoring a pair of two sentences. So I decided to look into that and try to build something that (A) works better, (B) actually has an API and we can use it anywhere for anything.

My original goal was to build awesome new neural network architectures that will turn the field on its head. But I noticed that the field is a bit of a mess – there is a lot of tasks that are about the same thing, but very little cross-talk between them. So you get a paper that improves the task of Answer Sentence Selection, but could the models do better on the Ubuntu Dialogue task then, or on Paraphrasing datasets? Who knows! Meanwhile, each dataset has its own format and a lot of time is spent only in writing the adapter code for it. Training protocols (from objectives to segmentation to embedding preinitializations) are inconsistent, and some datasets need a lot of improvement. Well, my goal turned to sorting out the field, cross-check the same models on many tasks and provide a better entry point for others than I had.

Software: Getting a few students of the 3C group together, we have created the dataset-sts platform for all tasks and models that are about comparing two sentences using deep learning. We have a pretty good coverage (of both tasks and models), and more brewing in some side branches. It’s in Python and uses the awesome Keras deep learning library.

Paper: To kick things off research-wise, we have posted a paper Sentence Pair Scoring: Towards Unified Framework for Text Comprehension where we summed up what we have learned early in the process. A few highlights:

  • We have a lofty goal of building an universal text comprehension model, a sort of black box that eats your sentences and produces embeddings that correspond to their meaning, which you can use for whatever task you need to do. Long way to go, but we have found that a simple neural model trained on very large data is doing pretty good in this exact setting, and even if applied to tasks and data that look very different from the original. Maybe we are on to something.
  • Our framework is state-of-art on the Ubuntu Dialogue dataset of 1M techsupport IRC dialogs, beating Facebook’s memory network models.
  • It’s hard to compare neural models because if you train a model 16 times with the same data, the result will always be somewhat different. Not a big deal with large test datasets, but a very big deal with small test datasets which are still popular in the research community. Almost all papers ignore this! If you look at evolution of performance of models in some areas like Answer Sentence Selection, we have found that most differences over the last year are deep below per-train variance we see.

Please take a look, and tell us what you think! We’ll shortly cover a follow-up paper here that we also already posted, and we plan to continue the work by improving our task and model coverage further, fixing a few issues with our training process and experimenting with some novel neural network ideas.

More to come, both about our research and some more product-related news, in a few days. We will also talk about how the abstract-sounding research connects with some very practical technology we are introducing.