{"id":455,"date":"2014-09-22T19:36:00","date_gmt":"2014-09-22T23:36:00","guid":{"rendered":"http:\/\/blogs.library.american.edu\/mediaservices\/2014\/09\/22\/dig-into-television-and-film-corpuses-with-bookworm-movies\/"},"modified":"2014-09-22T19:36:00","modified_gmt":"2014-09-22T23:36:00","slug":"dig-into-television-and-film-corpuses-with-bookworm-movies","status":"publish","type":"post","link":"https:\/\/blogs.library.american.edu\/mediaservices\/2014\/09\/22\/dig-into-television-and-film-corpuses-with-bookworm-movies\/","title":{"rendered":"Dig into television and film corpuses with Bookworm Movies"},"content":{"rendered":"<div style=\"clear: both;text-align: center\"><a href=\"http:\/\/movies.benschmidt.org\/\" style=\"margin-left: 1em;margin-right: 1em\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" src=\"http:\/\/1.bp.blogspot.com\/-V6NYk4NGYJk\/VCCxsizL_KI\/AAAAAAAABSs\/Pl9fMrf3rI0\/s1600\/doctor.png\" height=\"178\" width=\"400\" \/><\/a><\/div>\n<p>One handy tool for cultural analysis is to measure how often words are used within a given set of texts, whether that&#8217;s transcripts from Congress or <a href=\"https:\/\/books.google.com\/ngrams\">every document ever written<\/a>. Written text is much easier to search than audio or video, so audio-visual media has largely been left out of this kind of content analysis. Luckily, a very clever professor named Ben Schmidt has leveraged big data to make movies and television shows as searchable as books.<\/p>\n<p>Schmidt&#8217;s new service, <a href=\"http:\/\/movies.benschmidt.org\/\">Bookworm Movies<\/a>, uses the Open Subtitles database to grab the subtitle text from thousands of movies and shows. Punch in any word or phrase \u2013 and, optionally, a specific show or medium \u2013 and Bookworm Movies will produce a detailed graph of how often each word is used relative to its entire corpus. 
<a href=\"http:\/\/movies.benschmidt.org\/#?%7B%22search_limits%22%3A%5B%7B%22word%22%3A%5B%22doctor%22%5D%2C%22MovieYear%22%3A%7B%22%24gte%22%3A1931%2C%22%24lte%22%3A2015%7D%2C%22TV_show__id%22%3A%5B%2233%22%5D%2C%22primary_original_language__id%22%3A%5B%221%22%5D%7D%2C%7B%22word%22%3A%5B%22doctor%22%5D%2C%22MovieYear%22%3A%7B%22%24gte%22%3A1931%2C%22%24lte%22%3A2015%7D%2C%22TV_show__id%22%3A%5B%223%22%5D%2C%22primary_original_language__id%22%3A%5B%221%22%5D%7D%2C%7B%22word%22%3A%5B%22doctor%22%5D%2C%22MovieYear%22%3A%7B%22%24gte%22%3A1931%2C%22%24lte%22%3A2015%7D%2C%22TV_show__id%22%3A%5B%2220%22%5D%2C%22primary_original_language__id%22%3A%5B%221%22%5D%7D%2C%7B%22word%22%3A%5B%22doctor%22%5D%2C%22MovieYear%22%3A%7B%22%24gte%22%3A1931%2C%22%24lte%22%3A2015%7D%2C%22TV_show__id%22%3A%5B%2237%22%5D%2C%22primary_original_language__id%22%3A%5B%221%22%5D%7D%5D%7D\">As shown in the chart above<\/a>, <i>Scrubs<\/i> uses the word &#8220;doctor&#8221; more frequently than many medical dramas, while it appears comparatively little in <i>Grey&#8217;s Anatomy<\/i>. There are all sorts of angles you could take in analyzing that. This is a terrific starting point for seeing how the language of television shows and movies changes over time and how they compare with one another.<\/p>\n<p>The best part? The entirety of <i>The Simpsons<\/i> is included as well. And thankfully, <a href=\"http:\/\/movies.benschmidt.org\/#?%7B%22search_limits%22%3A%5B%7B%22word%22%3A%5B%22selfie%22%5D%2C%22MovieYear%22%3A%7B%22%24gte%22%3A1931%2C%22%24lte%22%3A2015%7D%2C%22TV_show__id%22%3A%5B%222%22%5D%2C%22primary_original_language__id%22%3A%5B%221%22%5D%7D%5D%7D\">they haven&#8217;t used the word &#8220;selfie&#8221; yet<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One handy tool for cultural analysis is to measure how often words are used within a given set of texts, whether that&#8217;s transcripts from Congress or every document ever written. 
Written text is much easier to search than audio or video, so audio-visual media has largely been left out of this kind of content analysis. Luckily, a very [&hellip;]<\/p>\n","protected":false},"author":18,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[212,293],"class_list":["post-455","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-links-of-interest","tag-research"],"_links":{"self":[{"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/posts\/455","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/comments?post=455"}],"version-history":[{"count":0,"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/posts\/455\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/media?parent=455"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/categories?post=455"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.library.american.edu\/mediaservices\/wp-json\/wp\/v2\/tags?post=455"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}