{"id":386,"date":"2016-01-13T03:06:24","date_gmt":"2016-01-13T01:06:24","guid":{"rendered":"http:\/\/log.or.cz\/?p=386"},"modified":"2016-01-13T18:00:12","modified_gmt":"2016-01-13T16:00:12","slug":"keras-for-binary-classification","status":"publish","type":"post","link":"https:\/\/log.or.cz\/?p=386","title":{"rendered":"Keras for Binary Classification"},"content":{"rendered":"<p>So, besides running a few examples, I didn&#8217;t get around to seriously playing with <b><a href=\"http:\/\/keras.io\/\">Keras<\/a><\/b> (a powerful library for building fully-differentiable machine learning models aka <b>neural networks<\/b>) &#8211; until now.  And I have been a bit surprised by how tricky it actually was for me to get a simple task running, despite (or maybe because of) all the docs available already.<\/p>\n<p>The thing is, many of the &#8220;basic examples&#8221; gloss over exactly what the inputs and, more importantly, the outputs look like, and that&#8217;s <em>important<\/em>.  Especially since for me, the archetypal simplest machine learning problem is <b>binary classification<\/b>, but <b>in Keras the canonical task is categorical classification<\/b>.  Only after fumbling around for a few hours did I realize this fundamental rift.<\/p>\n<p>The examples (besides LSTM sequence classification) silently assume that you want to classify into categories (e.g. to predict words etc.), not do a binary 1\/0 classification.  
The consequence is that if you naively copy the example MLP before thinking it through, your model will never learn anything and, to add insult to injury, will always show the accuracy as 1.0.<\/p>\n<p>So, there are a few important things you need to do to perform binary classification:<\/p>\n<ul>\n<li> Pass <code>output_dim=1<\/code> to your final <code>Dense<\/code> layer (this is the obvious one).\n<li> Use <code>sigmoid<\/code> activation instead of <code>softmax<\/code> &#8211; obviously, <b>softmax on a single output will always normalize whatever comes in to 1.0<\/b>.\n<li> Pass <code>class_mode='binary'<\/code> to <code>model.compile()<\/code> (this fixes the accuracy display, possibly more; you will also want to pass <code>show_accuracy=True<\/code> to <code>model.fit()<\/code>).\n<\/ul>\n<p>Other lessons learned:<\/p>\n<ul>\n<li> For some projects, my approach of first cobbling together an example from existing code and <em>then<\/em> thinking harder about it works great; for others, not so much&#8230;\n<li> In IPython, do not forget to reinitialize <code>model = Sequential()</code> in some of your cells &#8211; a lot of confusion ensues otherwise.\n<li> Keras is pretty awesome and powerful. Conceptually, I think I like <a href=\"https:\/\/github.com\/NNBlocks\/NNBlocks\/\">NNBlocks<\/a>&#8217; usage philosophy more (regarding how you build the model), but sadly that library is still at a very early stage (I have created a bunch of gh issues).\n<\/ul>\n<p><em>(Edit: After a few hours, I toned down this post a bit. It wasn&#8217;t meant at all to be an attack on Keras, though someone might perceive it as such. It is just a word of caution to fellow Keras newbies. 
And it shouldn&#8217;t take much <a href=\"https:\/\/github.com\/fchollet\/keras\/issues\/1454\">to improve the Keras docs<\/a>.)<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>So I didn&#8217;t get around to seriously (besides running a few examples) play with Keras (a powerful library for building fully-differentiable machine learning models aka neural networks) &#8211; until now. And I have been a bit surprised about how tricky it actually was for me to get a simple task running, despite (or maybe because [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[150,6],"tags":[158,157,124,142],"class_list":["post-386","post","type-post","status-publish","format-standard","hentry","category-ailao","category-software","tag-keras","tag-ml","tag-nlp","tag-python"],"_links":{"self":[{"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/posts\/386","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/log.or.cz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=386"}],"version-history":[{"count":5,"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/posts\/386\/revisions"}],"predecessor-version":[{"id":392,"href":"https:\/\/log.or.cz\/index.php?rest_route=\/wp\/v2\/posts\/386\/revisions\/392"}],"wp:attachment":[{"href":"https:\/\/log.or.cz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=386"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/log.or.cz\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=386"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/log.or.cz\/index.php?rest_ro
ute=%2Fwp%2Fv2%2Ftags&post=386"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}