Posts Tagged ‘geo’

YodaQA’s abilities are enlarged by traffic domain

May 23rd, 2016

Guest post by Petr Marek (source)

Every driver needs navigation to reach the destination quickly and avoid traffic jams. Two of the biggest problems are entering the destination quickly and finding out where the congestion is and what the traffic situation looks like. YodaQA Traffic is a project that attempts to answer traffic-related questions quickly and efficiently. Drivers may ask questions in natural language, such as “What is the traffic situation in the Evropská street?” or “What is the fastest route from Opletalova street to Kafkova street?” You can try out the prototype (demo available for a limited time only); for example, try asking “traffic situation in the Wilsonova street”.

YodaQA Traffic still has some limitations. Currently there is only a browser version, which is not suitable for smartphones, and it answers traffic questions for streets in Prague only.

But as usual, this whole technology demo is open source – you can find it in the branch f/traffic-flow of our Hub project.

How does it work and where do we get the data from?

All YodaQA questions are first analyzed to recognize and select the traffic questions. We do this in two steps. The first step is to recognize the question topic. We use six topics, such as traffic situation, traffic incident or fastest route. The topic is determined by comparing the semantic similarity of the user’s question with a set of reference questions, each of which is labeled with a “topic”. We estimate the similarity with our Dataset-STS Scoring API. The Sentence Pair Similarity algorithm then selects the “topic” of the reference question with the highest similarity to the user’s question.
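The topic selection can be illustrated with a minimal Python sketch. It is not the Dataset-STS Scoring API: the `similarity` function is a crude word-overlap stand-in for a semantic similarity score, and the reference questions and topic labels are hypothetical examples.

```python
import re

# Hypothetical reference questions, each labeled with a topic.
REFERENCE_QUESTIONS = [
    ("What is the traffic situation in Evropská street?", "traffic_situation"),
    ("Is there an accident in Wilsonova street?", "traffic_incident"),
    ("What is the fastest route from Opletalova street to Kafkova street?", "fastest_route"),
]


def tokens(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of word sets; an assumed stand-in for a semantic similarity score."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def classify_topic(question):
    """Return (topic, score) of the reference question most similar to the input."""
    best_text, best_topic = max(
        REFERENCE_QUESTIONS, key=lambda ref: similarity(question, ref[0])
    )
    return best_topic, similarity(question, best_text)
```

Calling `classify_topic("traffic situation in the Wilsonova street")` returns the topic of the closest reference question together with its score, which is what the later thresholding step needs.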

Next we need to recognize the location, i.e. the street name. This is handled by another tool called Label-lookup, which we normally use for entity linking in YodaQA. It compares the words of the question with a list of all street names in Prague, which we exported from OpenStreetMap. We do not require an exact match; we select the closest street name from the list.
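As a rough illustration of closest-match lookup (not the Label-lookup implementation), the following Python sketch uses the standard library's `difflib`. The street list is a hypothetical excerpt, the `cutoff` value is an arbitrary assumption, and only single-word street names are handled.

```python
import difflib
import re

# Hypothetical excerpt; the real list was exported from OpenStreetMap.
STREETS = ["Evropská", "Opletalova", "Kafkova", "Wilsonova"]


def find_street(question, streets=STREETS, cutoff=0.8):
    """Return (street, score) for the best fuzzy match of any question word, or None."""
    by_lower = {s.lower(): s for s in streets}
    best = None
    for word in re.findall(r"\w+", question.lower()):
        # get_close_matches returns at most n candidates scoring >= cutoff
        for cand in difflib.get_close_matches(word, list(by_lower), n=1, cutoff=cutoff):
            score = difflib.SequenceMatcher(None, word, cand).ratio()
            if best is None or score > best[1]:
                best = (by_lower[cand], score)
    return best
```

A misspelled query such as `find_street("traffic situation in the Wilsonowa street")` still resolves to Wilsonova, while a question with no street-like word returns `None`.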

The last step is to decide whether the question really is a traffic question, because the Dataset-STS API and Label-lookup can find a topic and a street name even in a pure movie question like “When was the Nightmare on Elm Street released?”. Fortunately, Dataset-STS and Label-lookup return not only the topic or street name but also a score. We created a dataset of over 70 traffic questions and over 300 movie questions and found the minimal score thresholds at which the recognition makes the lowest classification error on this dataset.
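One simple way to find such a threshold is an exhaustive search over the observed scores. The Python sketch below assumes a single score per question and made-up example data; the actual procedure, and how the topic and street scores were combined, is not described here.

```python
def best_threshold(scored_examples):
    """Pick the score threshold with the fewest misclassifications.

    scored_examples: non-empty list of (score, is_traffic) pairs. A question is
    accepted as a traffic question when its score is >= the threshold.
    Returns (threshold, number_of_errors); ties go to the lowest threshold.
    """
    candidates = sorted({score for score, _ in scored_examples})
    # Also consider a threshold above every score (reject everything).
    candidates.append(candidates[-1] + 1.0)
    best_t, best_errors = None, None
    for t in candidates:
        errors = sum((score >= t) != is_traffic for score, is_traffic in scored_examples)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors


# Hypothetical scores: True marks a traffic question, False a movie question.
EXAMPLES = [(0.9, True), (0.8, True), (0.6, True), (0.5, False), (0.4, False), (0.3, False)]
```

With two scores (topic and street name), the same search could be run over pairs of candidate thresholds.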

Once we know the type of question and the location, we run a small script that accesses traffic situation data from HERE Maps. The only complication is that the API does not return the traffic situation for a particular street, only for a bounding box. To overcome this, we first find a bounding box for the desired location, using an algorithm we developed for this purpose. Then we call the traffic flow API to acquire the information for all streets in the bounding box. Finally, we filter the results down to the traffic situation for the desired street.
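The bounding-box-then-filter idea can be sketched in Python as follows. This is not the algorithm mentioned above: it simply takes the minimum and maximum of a street's coordinates, and both the point list and the record layout (a dict with a `street` key) are hypothetical rather than the HERE response format.

```python
def bounding_box(points, margin=0.001):
    """Return (south, west, north, east) enclosing the (lat, lon) points, padded by margin degrees."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats) - margin, min(lons) - margin, max(lats) + margin, max(lons) + margin)


def filter_street(flow_items, street):
    """Keep only flow records whose street name matches, ignoring case."""
    wanted = street.casefold()
    return [item for item in flow_items if item["street"].casefold() == wanted]


# Hypothetical records, standing in for what a traffic flow query might return for the box.
FLOW = [
    {"street": "Wilsonova", "speed_kmh": 23.0},
    {"street": "Opletalova", "speed_kmh": 41.0},
]
```

The box is passed to the traffic query, and `filter_street(FLOW, "wilsonova")` then discards the records for every other street inside it.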

It was great fun to work on this application. It is not perfect, but it shows how to create intelligent assistants that help people handle various everyday situations. We are also excited to see how users will use the new functionality of YodaQA and how it will help them.

Categories: ailao, software

GPS coordinates of Czech towns and municipalities

February 1st, 2014

To display weather sonde landing positions on IRC, I needed a list of coordinates of Czech towns in a simple CSV format, but it turned out to be surprisingly hard to get hold of one. There is a table on an astronomy website, but the selection of municipalities it includes is rather odd; in some places only a part of a municipality is listed instead of the municipality itself, and so on.

In the end I went the “do it yourself” route, combining a list on Wikipedia, the Google Geocoding API and a bit of XPath.

A list of a reasonable subset of towns can be obtained, for example, with:

curl 'http://cs.wikipedia.org/w/index.php?title=Seznam_obc%C3%AD_s_roz%C5%A1%C3%AD%C5%99enou_p%C5%AFsobnost%C3%AD&action=edit' |
  sed -ne 's/^# \[\[\([^]|]*|\)*\([^]]*\)\]\].*/\2/p' | sort

Given the name of a municipality, I can get its coordinates with this incantation:

m=Aš; curl -s 'http://maps.googleapis.com/maps/api/geocode/xml?address='"${m// /+},+CZ"'&sensor=false' |
  xmllint --xpath '//location[lat or lng]//text()' -

(The important trick is the ,CZ suffix; otherwise Google knows plenty of places called Kolín, and Aš will mean American Samoa. Alternatively, you can filter the Czech results out of the response with the XPath //result[address_component/short_name/text()="CZ"]/geometry/location[lat or lng]//text().)
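For readers who prefer a script to XPath one-liners, the same country filter can be sketched in Python with the standard library. The XML below is an abridged, hand-written sample in the shape of a geocoding response, with illustrative coordinates; it is not real API output.

```python
import xml.etree.ElementTree as ET

# Abridged, hand-written sample; element names follow the XPath used above.
SAMPLE = """<GeocodeResponse>
  <result>
    <address_component><short_name>AS</short_name></address_component>
    <geometry><location><lat>-14.27</lat><lng>-170.13</lng></location></geometry>
  </result>
  <result>
    <address_component><short_name>CZ</short_name></address_component>
    <geometry><location><lat>50.22</lat><lng>12.19</lng></location></geometry>
  </result>
</GeocodeResponse>"""


def czech_locations(xml_text):
    """Return (lat, lng) floats for results with an address component whose short_name is CZ."""
    root = ET.fromstring(xml_text)
    found = []
    for result in root.iter("result"):
        names = [el.text for el in result.findall("address_component/short_name")]
        loc = result.find("geometry/location")
        if "CZ" in names and loc is not None:
            found.append((float(loc.find("lat").text), float(loc.find("lng").text)))
    return found
```

Here `czech_locations(SAMPLE)` keeps only the second result and drops the American Samoa one.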

Now all that is left to generate a simple CSV is to put the pieces together:

curl 'http://cs.wikipedia.org/w/index.php?title=Seznam_obc%C3%AD_s_roz%C5%A1%C3%AD%C5%99enou_p%C5%AFsobnost%C3%AD&action=edit' |
  sed -ne 's/^# \[\[\([^]|]*|\)*\([^]]*\)\]\].*/\2/p' | sort |
  while read -r m; do
    # print the name without a newline; the coordinates follow on the same line
    printf '%s' "$m"
    curl -s 'http://maps.googleapis.com/maps/api/geocode/xml?address='"${m// /+},+CZ"'&sensor=false' |
      xmllint --xpath '//location[lat or lng]//text()' - |
      tr -s '\n' ' ' | tr ' ' ','
    echo
    sleep 0.1
  done | sed 's/,$//'

Would you like the finished CSV?

Bonus: A similarly generated CSV of Prague districts (cadastral areas).

Bonus 2: And one more CSV, of municipalities with delegated powers (further large municipalities and towns).

Categories: linux

Weathersonde – Nearby Landing Notification

January 26th, 2014

At our hackerspace brmlab, one of the things we do is pick up landed weather sondes. In short: fun hardware literally falling out of the sky, several times a day, every day. These are stratospheric balloons used to gather data for weather prediction, launched from various sites. A balloon climbs to an altitude of 35 km and bursts, and the sonde then lands back on the ground at a random location. The whole time, it transmits its current GPS coordinates via radio, making this a rather exciting sub-class of geocaching.

As a simple hack today (idea by chido), I created a script, sonde.sh, that is designed to be run three times a day. It runs a sonde trajectory prediction (a predict.habhub.org service – example) and, if the sonde is predicted to land within a certain radius, reports that with a link to the prediction. By default, it is connected to jendabot, one of our brmlab IRC robotic minions, written in an appealingly crazy way as a collection of bash scripts.
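The "within a certain radius" check boils down to a great-circle distance test. The Python sketch below uses the haversine formula; it is an illustration rather than the logic of sonde.sh (which is a bash script), and the function names and any coordinates or radius passed in are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def lands_nearby(landing, home, radius_km):
    """True if the predicted landing point (lat, lon) lies within radius_km of home."""
    return haversine_km(landing[0], landing[1], home[0], home[1]) <= radius_km
```

A notification script would call `lands_nearby` with the predicted landing point and a home location, and post to IRC only when it returns `True`.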

Categories: life, software