{"id":14602,"date":"2015-06-19T10:34:24","date_gmt":"2015-06-19T09:34:24","guid":{"rendered":"http:\/\/blog-journals.internal\/?p=14602"},"modified":"2015-06-19T10:34:24","modified_gmt":"2015-06-19T09:34:24","slug":"machine-learning-helps-computers-predict-near-synonyms","status":"publish","type":"post","link":"https:\/\/www.cambridge.org\/core\/blog\/2015\/06\/19\/machine-learning-helps-computers-predict-near-synonyms\/","title":{"rendered":"Machine learning helps computers predict near-synonyms"},"content":{"rendered":"<div id=\"bsf_rt_marker\"><\/div><p>Choosing the best word or phrase for a given context from among candidate near-synonyms, such as &#8220;slim&#8221; and &#8220;skinny&#8221;, is something that human writers, given some experience, do naturally; but for choices with this level of granularity, it can be a difficult selection problem for computers.<\/p>\n<p>Researchers from Macquarie University in Australia have published an article in the journal <a title=\"Natural Language Engineering\" href=\"http:\/\/journals.cambridge.org\/action\/displayAbstract?fromPage=online&amp;aid=9708315&amp;fulltextType=RA&amp;fileId=S1351324915000157\" target=\"_blank\">Natural Language Engineering<\/a>, investigating whether they could use machine learning to re-predict a particular choice among near-synonyms made by a human author \u2013 a task known as the lexical gap problem.<\/p>\n<p>They used a supervised machine learning approach to this problem in which the weights of different features of a document are learned computationally. 
Using this approach, the computers were able to predict the author's original choice of synonym with greater accuracy and fewer errors.<\/p>\n<p>The initial approach solidly outperformed some standard baselines, and predictions of synonyms made using a small window around the word outperformed those made using a wider context (such as the whole document).<\/p>\n<p>However, they found that this was not the case uniformly across all types of near-synonyms. Those that embodied connotational or affective differences &#8212; such as &#8220;slim&#8221; versus &#8220;skinny&#8221;, which differ in how positively the meaning is presented &#8212; behaved quite differently, in a way that suggested that broader features related to the &#8216;tone&#8217; of the document could be useful, including document sentiment, document author, and a distance metric for weighting the wider lexical context of the gap itself. (For instance, if the chosen near-synonym was negative in sentiment, this might be linked to other expressions of negative sentiment in the document.)<\/p>\n<p>The distance weighting was particularly effective, resulting in a 38% decrease in errors, and these models turned out to improve accuracy not just on affective word choice but also on non-affective word choice.<\/p>\n<p>Read the full article \u2018<a title=\"Predicting word choice in affective text\" href=\"http:\/\/journals.cambridge.org\/action\/displayAbstract?fromPage=online&amp;aid=9708315&amp;fulltextType=RA&amp;fileId=S1351324915000157\" target=\"_blank\">Predicting word choice in affective text<\/a>\u2019 online in the journal <a title=\"Natural Language Engineering\" href=\"http:\/\/journals.cambridge.org\/action\/displayAbstract?fromPage=online&amp;aid=9708315&amp;fulltextType=RA&amp;fileId=S1351324915000157\" target=\"_blank\">Natural Language Engineering<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Choosing the best word or phrase for a given context from among candidate near-synonyms, 
such as &#8220;slim&#8221; and &#8220;skinny&#8221;, is something that human writers, given some experience, do naturally; but for choices with this level of granularity, it can be a difficult selection problem for computers. Researchers from Macquarie University in Australia have published an [&hellip;]<\/p>\n","protected":false},"author":312,"featured_media":14606,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1593],"tags":[16,1625,544],"coauthors":[],"class_list":["post-14602","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-computer-science","tag-linguistics","tag-machine-learning","tag-natural-language-engineering"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/posts\/14602","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/users\/312"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/comments?post=14602"}],"version-history":[{"count":0,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/posts\/14602\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/media\/14606"}],"wp:attachment":[{"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/media?parent=14602"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/categories?post=14602"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cambridge.org\/core\/blog\/wp-json\/wp\/v2\/tags?post=14602"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.c
ambridge.org\/core\/blog\/wp-json\/wp\/v2\/coauthors?post=14602"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}