Thursday, July 21, 2011
Against Search
keywords: search, Google, knowledge discovery, digital library, database, classification, folksonomy, information retrieval, HCI, interface, information visualization, digital humanities, cultural analytics, visual analytics, software studies, Manovich
Early 21st century humanities and media studies researchers have access to unprecedented amounts of media – more than they can possibly study, let alone simply watch or even search. (For examples of large media collections, see the list of repositories made available to the participants of the Digging Into Data 2011 Competition, www.diggingintodata.org.)
The basic method of humanities and media studies, which worked fine when the number of media objects was small – see all images or video, notice patterns, and interpret them – no longer works. For example, how do you study 167,000 images in the Art Now Flickr gallery, 236,000 professional design portfolios on coroflot.com (both numbers as of 7/2011), or 176,000 Farm Security Administration/Office of War Information photographs taken between 1935 and 1944 and digitized by the Library of Congress (http://www.loc.gov/pictures/)?
Given the size of typical contemporary digital media collections, simply seeing what’s inside them is impossible.
Although it may appear that the reasons for this are the limitations of human vision and human information processing, I think that it is actually the fault of current interface designs and web technology. Standard interfaces for massive digital media collections, such as list, gallery, grid, and slideshow, do not allow us to see the contents of a whole collection. These interfaces usually display only a few items at a time (regardless of whether you are in a browsing mode or in a search mode). This access method does not allow us to understand the “shape” of the overall collection and notice interesting patterns.
The popular media access technologies of the 19th and 20th centuries, such as slide lanterns, film projectors, microfilm readers, Moviola and Steenbeck, record players, audio and video tape recorders, VCR, and DVD players, were designed to access a single media item at a time at a limited range of speeds. This went hand in hand with the media distribution mechanisms: record and video stores, libraries, television, and radio would all make only a few items available at a time. For instance, you could not watch more than a few TV channels at the same time, or borrow more than a few videotapes from a library.
At the same time, hierarchical classification systems used in library catalogs made it difficult to browse a collection or navigate it in orders not supported by catalogs. When you walked from shelf to shelf, you were typically following a classification based on subjects, with books organized by author names inside each category.
Together, these distribution and classification systems encouraged 20th century media researchers to decide beforehand what media items to see, hear, or read. A researcher usually started with some subject in mind – films by a particular author, works by a particular photographer, or categories such as “1950s experimental American films” and “early 20th century Paris postcards.” It was impossible to imagine navigating through all films ever made or all postcards ever printed. (One of the first media projects that organizes its narrative around the navigation of a media archive is Jean-Luc Godard’s "Histoire(s) du cinéma," which draws samples from hundreds of films.) The popular social science method for working with larger media sets in an objective manner – content analysis, i.e. the tagging of semantics in a media collection by several people using a predefined vocabulary of terms – also requires that a researcher decide beforehand what information would be relevant to tag.
Unfortunately, the current standard in media access – computer search – does not take us out of this paradigm. A search interface is a blank frame waiting for you to type something. Before you click the search button, you have to decide what keywords and phrases to search for. So while search brings a dramatic increase in speed of access, it assumes that you already know something about the collection that is worth exploring further.
We need techniques for efficient browsing of content and discovery of patterns in massive media collections. Consider this definition of “browse”: “To scan, to casually look through in order to find items of interest, especially without knowledge of what to look for beforehand” (“Browse”, Wiktionary). Consider also one of the meanings of the word “exploration”: “to travel somewhere in search of discovery” (“Exploration”, Wiktionary). How can we discover interesting things in massive media collections? That is, how can we browse through them efficiently and effectively, without knowing in advance what we want to find?
---------------------
Anja Wiesinger wrote an interesting response to this post:
http://neuneun.com/2011/07/in-search-of/
---------------------
Some notes on the history of search engines and media collection interfaces - for article
(http://en.wikipedia.org/wiki/Microfilm): "Using the daguerreotype process, John Benjamin Dancer was one of the first to produce micro-photographs, in 1839. He achieved a reduction ratio of 160:1."
"In 1896, Canadian engineer Reginald A. Fessenden suggested microforms were a compact solution to engineers' unwieldy but frequently consulted materials. He proposed that up to 150,000,000 words could be made to fit in a square inch, and that a one foot cube could contain 1.5 million volumes"
"The year 1938 also saw another major event in the history of microfilm when University Microfilms International (UMI) was established by Eugene Power."
(http://en.wikipedia.org/wiki/Emanuel_Goldberg):
Emanuel Goldberg "introduced his “Statistical Machine,” a document search engine that used photoelectric cells and pattern recognition to search the metadata on rolls of microfilmed documents (US patent 1,838,389, 29 December 1931). This technology was used in a variant form in 1938 by Vannevar Bush in his “microfilm rapid selector,” his “comparator” (for cryptanalysis), and was the technological basis for the imaginary Memex in Bush’s influential 1945 essay “As We May Think.”"
Information retrieval
http://en.wikipedia.org/wiki/Information_retreival#Timeline
"1950: The term "information retrieval" appears to have been coined by Calvin Mooers."