projects:

Showing posts with label projects. Show all posts

Friday, February 28, 2014

Gender, age, and ambiguity of selfies on Instagram

Gender and Age Distributions of Selfies


selfiecity.net research update by Mehrdad Yazdani, Research Scientist, Software Studies Initiative.



Who are the people behind selfies? Are they mostly young? Do women prefer taking selfies over men? Do these variations depend on geographic location? We looked at over 4,500 selfies from six cities to gain a sense of the different age groups and genders. (Analysis and visualizations of our findings for 3200 images from five cities from this dataset are available on selfiecity.net.)

We did this by first downloading a random sample of 140,000 images from all Instagram photos shared by people in central areas of 6 global cities during one whole week (Dec 4-12, 2013). Our random sample of Instagram photographs includes:

  • 30,000 images from Tokyo
  • 30,000 images from New York
  • 20,000 images from Bangkok
  • 20,000 images from Berlin
  • 20,000 images from Moscow
  • 20,000 images from Sao Paulo

Now we have to figure out which of these images are actual selfies. We define the selfie as a photograph that you take of yourself. Since this definition can lead to a great deal of ambiguity, we ask several people in order to reach a better consensus. We use Amazon's Mechanical Turk service to find human reviewers (who are paid!) to review each image. At least 3 reviewers review each of the 140,000 images to find all the selfies. Naturally, in some cases the reviewers disagree on whether an image is a genuine selfie or not. To resolve these disagreements, we use a simple majority vote (the mode of votes, in statistics jargon) to make the final call as to whether the image is a selfie or not.
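The majority vote described above can be sketched in a few lines of Python. This is only an illustration: the image names and votes are made up, and `majority_vote` is a hypothetical helper, not the code used in the study.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most common label among reviewer votes (the mode).

    With an odd number of reviewers and a yes/no question there are no ties.
    With an even number, a tie goes to the label that was seen first.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, _count = Counter(votes).most_common(1)[0]
    return label

# Hypothetical votes from three reviewers per image (True = "this is a selfie").
votes_by_image = {
    "img_001": [True, True, False],
    "img_002": [False, False, False],
    "img_003": [True, False, True],
}

is_selfie = {image: majority_vote(votes) for image, votes in votes_by_image.items()}
```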

[Figure: selfie rate by city]

What you see above is what we call the "selfie rate," that is, the percentage of the 140,000 images we collected that our Mechanical Turk reviewers identified as selfies. What is most striking about this figure is that, contrary to popular belief, the selfie is not plastered all over Instagram. In fact, Sao Paulo has a selfie rate of just under 5%! Tokyo, on the other hand, has a significantly lower selfie rate, a hair above 1%.

But we don't stop at just finding selfies in our set of images. If a reviewer thinks that the image is indeed a selfie, he or she also takes a best guess at the gender and age of the person in the selfie. Again, these are reviewers who use Mechanical Turk on a regular basis, so an image-tagging task is on par with their expertise. The graph above shows that the gender distribution of the selfies is heavily skewed towards females. Moscow in particular has a disproportionately large number of female selfies. In fact, a selfie from Moscow is 4 times less likely to be male (with a 95% confidence interval between 3.3 and 5.3).

However, it is not fair to assume that gender is a binary factor that we can neatly divide into "male" or "female." Is there a way for us to measure the ambiguity of a selfie's gender? Answering such a question is extremely difficult, but let's take a data science approach (read: hack). We will assume that if it is difficult to classify a selfie's gender as "male" or "female," then our reviewers from Mechanical Turk will have a harder time making a decision. Since we have multiple reviewers (at least 3), there will be more disagreements if it is truly difficult for the reviewers to determine the selfie's gender. Let's assign a confidence score between 0 and 1 to the collective agreement of the reviewers on the gender of the selfie. What follows are the averages of gender discrimination confidence for the different cities:

[Figure: average gender confidence by city and gender]

We see some very interesting patterns emerging from this figure. Over the entire population, the reviewers are fairly confident (over 95%) of a selfie's gender. However, in every city, the average gender confidence for males is lower than that for females. In the case of Berlin, this difference may well be insignificant and due to chance, but for the other cities we see a much wider gap in confidence. In Sao Paulo and Moscow especially, the reviewers are much more confident at detecting females than in the other cities. One possible interpretation: What makes these cities unique is that women in these cities are unquestionably "female looking" (at least when they take their selfies and post them), so the confidence reviewers have for these female selfies is higher.
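The post does not give the exact formula for the confidence score. One simple possibility, shown here purely as an assumption, is the fraction of reviewers who agree with the most common label. The function name `agreement_confidence` and the sample labels are hypothetical.

```python
from collections import Counter

def agreement_confidence(labels):
    """Fraction of reviewers who chose the most common label.

    Returns 1.0 when all reviewers agree. Lower values mean more disagreement.
    This is an assumed definition, not necessarily the one used in the study.
    """
    if not labels:
        raise ValueError("at least one label is required")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)

# Hypothetical gender guesses from three reviewers for two selfies.
unanimous = agreement_confidence(["female", "female", "female"])
split = agreement_confidence(["female", "male", "female"])
```

Averaging this score over all selfies of one gender in one city would give a per-city figure comparable to the averages discussed above.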

We next take a look at the age distributions of the selfies. Here they are organized by city and gender:

[Figure: age distributions of selfies by city and gender]

The most dramatic result here is that, in every city, men who take selfies are older than their female counterparts. Bangkok has the youngest selfie enthusiasts, while New Yorkers are the oldest. If we look on a log-scale, as the age of a selfie increases, the odds of the selfie being male increase by a factor of 6.7 (with a 95% confidence interval between 4.99 and 9.03). Overall, however, people in their early twenties dominate selfies on Instagram.

As before, we determine the age of the selfie by asking several reviewers to make their best guess. We then estimate the age by taking the median of the reviewers' guesses. As with gender, this can be a very difficult task, and some selfies are harder to judge than others. To measure the level of agreement on a selfie's age, we computed the standard deviation of the reviewers' guesses. A higher standard deviation suggests more disagreement among the reviewers, so we refer to this standard deviation as the "disagreement." Below we show the average disagreement for each city and gender:

[Figure: average age disagreement by city and gender]

With the exception of Berlin and New York (which have the highest disagreements), female age discrimination has the least amount of disagreement. The difference between the disagreement levels of males and females in Berlin does not appear to be significant. By far, Bangkok has the least amount of disagreement for age discrimination of female selfies among all cities. It is difficult to ascertain why this is the case. We welcome any hypotheses for this finding!
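The age estimate (median of guesses) and the "disagreement" (standard deviation of guesses) described above can be computed with Python's standard library. This is a minimal sketch with made-up guesses. It uses the sample standard deviation, which is an assumption, since the post does not say whether the sample or population formula was used.

```python
from statistics import median, stdev

def summarize_age_guesses(guesses):
    """Return (estimated_age, disagreement) for one selfie.

    estimated_age is the median of the reviewers' guesses.
    disagreement is the sample standard deviation of the guesses.
    """
    if len(guesses) < 2:
        raise ValueError("need at least two guesses to measure disagreement")
    return median(guesses), stdev(guesses)

# Hypothetical age guesses from three reviewers for a single selfie.
estimated_age, disagreement = summarize_age_guesses([22, 25, 31])
```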

In summary, our study suggests that selfies are not the dominant imagery shared on Instagram. We have also observed that selfies are especially popular among females and people in their twenties. We are planning more posts on softwarestudies.com with additional details and more results from the selfiecity.net research, so check back to see them.

Tuesday, February 25, 2014

Video about our selfiecity.net project is now on YouTube and Vimeo


selfiecity from Moritz Stefaner on Vimeo.

http://selfiecity.net
Investigating the style of self-portraits (selfies) in five cities across the world.

Selfiecity investigates selfies using a mix of theoretic, artistic and quantitative methods:
We present our findings about the demographics of people taking selfies, their poses and expressions.
Rich media visualizations (imageplots) assemble thousands of photos to reveal interesting patterns.
The interactive selfiexploratory allows you to navigate the whole set of 3200 photos.
Finally, theoretical essays discuss selfies in the history of photography, the functions of images in social media, and methods and dataset.

Learn more at http://selfiecity.net





Thursday, February 20, 2014

Our new project Selfiecity investigates the style of self-portraits (selfies) in five cities across the world.





Our new project is now online:

selfiecity.net


The project investigates selfies using a mix of theoretic, artistic and quantitative methods:

We present our findings about the demographics of people taking selfies, their poses and expressions.

Rich media visualizations (imageplots) assemble thousands of photos to reveal interesting patterns.

The interactive selfiexploratory allows you to navigate the whole set of 3200 photos.

Theoretical essays discuss selfies in the history of photography, the functions of images in social media, and our methods and dataset.


Monday, November 25, 2013

Motion Structures by Everardo Reyes: Visualizing a moving image sequence as a 3D shape



Motion Structures is a new project by Everardo Reyes (Associate Professor, Information and Communication, University of Paris 13, and an active member of our lab). Using ImageJ (the same open source scientific image analysis software we use in the lab to develop the custom plugins ImagePlot, ImageMontage, and ImageSlice), Everardo developed a new plugin. The tool takes any image sequence (film, video, animation) and translates it into a 3D shape. The shape encodes the spatial and temporal transformations in a moving image sequence.

The shape can be represented as perspectival images or printed in 3D. Here is one example from Motion Structures - a 5 second segment from Game of Thrones, visualized as a 3D shape:








Motion Structures is not the first project to extract the structure of a moving image sequence and represent it in a new way. Early 20th century examples include work by Étienne-Jules Marey and Frank and Lillian Gilbreth (see my article Visualizing Vertov for the discussion).

More recently, we saw Ghostcatching by Paul Kaiser and Shelley Eshkar (1999), The Invisible Shapes of Things Past by Art+Com (1995-2007), Cinemetrics by Frederic Brodbeck (2011), and a number of other projects which all use computers.

Everardo adds his own unique take on how moving images can be converted into new visual representations; and since he has made his software tool available, everybody can apply it to other films, videos, TV shows, recordings of dance and other performances, and all other genres of moving images.

Sunday, July 21, 2013

The cover of "Software Takes Command" - visualizing 62.5 hours of video gameplay


Software Takes Command cover spread
The front and back cover of Lev Manovich's Software Takes Command (Bloomsbury Academic, 2012).




The background image is taken from a visualization of video gameplay created in our lab by William Huber. The data are the game play sessions of the video game Kingdom Hearts (2002, Square Co., Ltd.). The game was played from beginning to end in 29 sessions over 20 days. Altogether, these sessions took 62.5 hours.

The video captured from all game sessions was assembled into a single sequence. The sequence was sampled at 6 frames per second. This resulted in 225,000 frames for Kingdom Hearts gameplay. The visualizations use only every 10th frame from the complete frame sets (22,500 frames). Frames are organized in a grid in order of game play (left to right, top to bottom).
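The sampling and grid layout described above can be sketched as follows. The number of columns is a hypothetical choice made only for illustration, since the post does not state how many frames make up one row of the visualization.

```python
def sample_every_nth(frame_ids, step):
    """Keep every `step`-th frame, starting with the first one."""
    return frame_ids[::step]

def grid_position(index, columns):
    """Map a position in play order to (row, column), filling rows left to right."""
    return divmod(index, columns)

total_frames = 225_000                       # figure quoted in the text
sampled = sample_every_nth(range(total_frames), 10)

columns = 150                                # hypothetical grid width
positions = [grid_position(i, columns) for i in range(len(sampled))]
```

With these numbers the 22,500 sampled frames fill a 150 x 150 grid. The first frame lands in the top left corner and the last one in the bottom right.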



Kingdom Hearts is a franchise of video games and other media properties created in 2002 via a collaboration between Tokyo-based videogame publisher Square (now Square-Enix) and The Walt Disney Company, in which original characters created by Square travel through worlds representing Disney-owned media properties (e.g., Tarzan, Alice in Wonderland, The Nightmare before Christmas, etc.). Each world has its own distinct characters derived from the respective Disney-produced films. Each world also features distinct color palettes and rendering styles, which are related to the visual styles of the corresponding Disney film.

Like other software-based artifacts, video games can have infinitely varied realizations (since each game traversal is unique). Compressing many hours of game play into a single image and placing a number of such visualizations next to each other allows us to see the patterns of similarity and difference between these realizations. Such visualizations are also useful in comparing different releases of popular games – such as the two releases of Kingdom Hearts shown in the two visualizations below.



Kingdom Hearts videogame traversal
The complete visualization of Kingdom Hearts (2002) game play: 62.5 hours, in 29 sessions over 20 days. Full size visualization is 10810 x 8000 pixels (download from Flickr).

Kingdom Hearts II videogame traversal
Visualization of Kingdom Hearts II (2005) game play: 37 hours, in 16 sessions over 18 days. Full size visualization is 10759 x 8000 pixels (download from Flickr).










Saturday, July 13, 2013

Visualization of 33292 Instagram photos on The Big Wall (66 million pixels tiled display)


Phototrails update 1

Lev Manovich Instagram Data-6


Last week we published our new project Phototrails - analysis and visualizations of 2.3 million Instagram photos from 13 global cities. Each visualization shows large numbers of photos organized by various attributes such as upload times, filters, or visual characteristics (hue, saturation, etc.).

Such big visual data looks best on big displays - such as The Big Wall constructed by Calit2 (California Institute for Telecommunication and Information). You can both see details of individual photos and the larger patterns at the same time. In contrast, when you use standard desktop applications or media sharing sites, you can see either one or another, but not both at the same time.

Below are a few photos showing two of the Phototrails collaborators (Lev Manovich and Jay Chow) with one of our Instagram visualizations on The Big Wall. The visualization shows 33292 Instagram photos shared by people in Tel Aviv during one week. You can also view and download this and other visualizations (at smaller size) from the project site, or from our Flickr set.



The Big Wall is a tiled display environment consisting of 32 narrow-bezel 55" LCD displays. Each of the displays has full HD resolution (1920x1080 pixels), adding up to 66 million pixels on the entire wall (15,360 x 4,320 pixels).
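The pixel count quoted above can be checked with a few lines of arithmetic. The 8 x 4 arrangement of panels is not stated in the post. It is inferred here from the overall wall dimensions.

```python
panel_width, panel_height = 1920, 1080      # full HD resolution of one display
panel_count = 32
wall_width, wall_height = 15_360, 4_320     # overall resolution quoted in the text

# Inferred layout: how many panels fit across and down the wall.
panels_across = wall_width // panel_width
panels_down = wall_height // panel_height

total_pixels = wall_width * wall_height
print(panels_across, panels_down, total_pixels)
```

The total comes to 66,355,200 pixels, which matches the rounded figure of 66 million.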

Photos by Alex Matthews, Calit2.


Lev Manovich Instagram Data-2


Lev Manovich Instagram Data-14

Lev Manovich Instagram Data-7


Lev Manovich Instagram Data-8

Lev Manovich Instagram Data-21



Tuesday, July 2, 2013

Phototrails: visualizing 2.3 M Instagram photos from 13 global cities











What can billions of Instagram photographs tell us about the world? How can we see larger cultural patterns contained in such massive visual social data? Do these images reflect the specificity of local places?

A group of researchers from the Art History department at the University of Pittsburgh, the Software Studies Initiative at California Institute for Telecommunication and Information and the Computer Science program at The Graduate Center, City University of New York collaborated to investigate these questions.

Their research is the first academic study to investigate Instagram’s big visual data. The result is a project called Phototrails (phototrails.net), which developed new visualization techniques to analyze and compare more than 2.3 million publicly shared Instagram photos from 13 cities such as New York, San Francisco, London and Tokyo.

The team’s findings are published in the July issue of First Monday (http://www.firstmonday.org), an open-access peer-reviewed journal. In addition, all visualizations and findings are available on the project’s web site at www.phototrails.net.

The researchers found that each city has its own unique visual signature on Instagram. Based on measurements of multiple visual attributes such as hue, brightness, line orientation etc., Bangkok was found to be the most visually different from other cities, followed by Singapore, and Tokyo.
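As an illustration of what measuring visual attributes such as hue and brightness can look like, here is a small sketch that summarizes the pixels of one image using only the Python standard library. It is not the Phototrails pipeline. The function name `visual_features` and the input format (a list of RGB tuples) are assumptions made for this example.

```python
import colorsys
import math

def visual_features(pixels):
    """Summarize an image given as a list of (r, g, b) tuples with values 0-255.

    Returns mean brightness and saturation (0 to 1) and the mean hue in degrees.
    Hue is an angle, so it is averaged on the unit circle rather than arithmetically.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    if not hsv:
        raise ValueError("image has no pixels")
    n = len(hsv)
    x = sum(math.cos(2 * math.pi * h) for h, _, _ in hsv)
    y = sum(math.sin(2 * math.pi * h) for h, _, _ in hsv)
    return {
        "hue_deg": math.degrees(math.atan2(y, x)) % 360,
        "saturation": sum(s for _, s, _ in hsv) / n,
        "brightness": sum(v for _, _, v in hsv) / n,
    }

# Two tiny hypothetical "images": pure red and pure blue.
red = visual_features([(255, 0, 0), (255, 0, 0)])
blue = visual_features([(0, 0, 255)])
```

Computing such a summary for every photo from a city, and then comparing the resulting distributions, is one way to arrive at a per-city visual signature.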

“Our visualizations allow us to uncover the aggregated visual characteristics of each city as well as to examine the impact of exceptional events such as hurricane Sandy”, says Nadav Hochman, a Ph.D. student in the History of Art and Architecture department at the University of Pittsburgh.

The study also looked at the patterns of Instagram use among 312,694 people during a four-month period. The great majority of people uploaded only one or a few photos. The proportion of more active users varies significantly from city to city. For example, the percentages of people who uploaded more than 30 photos are 2% in NYC, 6.7% in Moscow, and 10.9% in Tel Aviv.

The researchers also found differences in the use of Instagram filters. In their sample, the proportion of photos to which Instagram users applied filters varies between 68 and 81 percent. The cities with the highest percentage of filtered photos are Tel Aviv, London, and San Francisco, while the city with lowest percentage is New York.

The project is a collaboration between Nadav Hochman (PhD student, History of Art and Architecture University of Pittsburgh), Lev Manovich (Professor at The Graduate Center, CUNY, Visiting Researcher @ Calit2, and Director of Software Studies Initiative), and Jay Chow (graduate of the Interdisciplinary Computing and the Arts undergraduate program at UCSD, and Researcher @ Software Studies Initiative).

Project web site: http://phototrails.net/

For further details and high-resolution visualizations contact:
info@phototrails.net



Thursday, May 30, 2013

We are hiring: Visual Designers (web design/print ) for softwarestudies.com (freelance position)


Software Studies Initiative wants to hire a Visual Designer or Design Team (web design, print, and optionally also information visualization)


Our lab (softwarestudies.com) works on using visualization and computational data analysis to explore massive cultural visual data.

Here are examples of our projects.

And here is the description of our vision for analyzing and visualizing big cultural data (aka cultural analytics).

We want to establish a long term relationship with a talented and versatile designer or design studio in NYC to work on visual design of our new projects during the next few years, new data visualizations, and a book presenting the projects.

The ideal designer/studio will be interested in contributing to the content of our work (exploration of massive cultural data sets using visualization and design), and possible involvement as a co-author on the projects. The designer(s) will receive full credit for all their contributions, and will be paid using industry-level rates for all their work with us.


The position is freelance. We don't require a fixed number of hours - this depends on your availability.

We like to work face to face, and therefore we are looking for a NYC based designer/studio. However, if you have great skills and talent, we will consider working with designer(s) located anywhere else on the planet.


SKILLS

The designer/studio will work on web design, print design, and information design. However, if you are great only in some of these areas, we also want to talk to you.



RESPONSIBILITIES

1| Web design:

Improve the design of softwarestudies.com and manovich.net, making them more visual.

Creation of a new simple portfolio web site to feature our selected "best" projects (both past projects and new ones we will be designing with you).

Visual design of the projects for the new portfolio site. This may also include information visualizations of our data sets, animations, and videos. The portfolio will feature both new projects to be done in 2013-2015, and also some of our older projects. They need to be moved from their present format (blog posts or single web pages) to a visually sophisticated level, compatible with designs for new projects. Examples: One million manga pages; Mondrian vs Rothko: footprints and evolution in style space.


All web designs should use contemporary web technologies (including responsive/adaptive design), and feature social media integration.

Note: while the projects for the portfolio site can be designed specifically for the web, alternatively they can be all done as PDF, to become parts of the full-color book (see below).


2| Book design:

Design a new full-color very visual book (print) by Lev Manovich. The book will feature texts based on our publications about "cultural analytics" and a comprehensive portfolio of our visualization projects. You will need to develop a visual language for presentation of visualizations of cultural data sets and their organization strategy (for example, organizing them by scale, type of media, etc.)

Ideally, the same visual language developed for presentation of individual projects on the web (see 1) will be used in the book.


OPTIONAL:
3| Data visualization:

Create information and data visualizations of our selected cultural data sets (from small to very big) to be featured as part of projects presentations on the new portfolio site, and in the new book (see below).


START TIME

The position can start immediately.


SALARY

We will pay competitive industry rates. The level of pay will depend on your experience, visibility in design/data visualization fields, and the ability to create compelling cutting-edge work.



TO APPLY

Please send email to: manovich {dot} lev {at} gmail {dot} com.

Include the following:

-link to your portfolio web site;
-your desired rates (per hour, task or a project);
-your availability (how many hours per week / month you can work with us).


Use "design applicant for Software Studies Initiative" in the header of the email.
(Any other emails will not be opened, so please make sure to use this line exactly.)


APPLICATIONS REVIEW

Will start immediately. We are hoping to make a decision by the middle of June.

posted: May 30, 2013, 9:47am, EST.

Sunday, March 24, 2013

Our ImagePlot software is used in #OCCUPYDATA NYC Hackathon


Occuprint Image Analysis
Image plot of almost 400 OccuPrint posters organized by visual similarity.

human image plot
"In addition, we’ve a human pattern recognition task underway in the gallery... it appears that gallery visitors are using the same types of features to understand and visualize patterns in the OccuPrint poster set. However, one distinction is incredibly striking, speed. Relative to computer efficiency, human work feels more like geologic time."



#OCCUPYDATA NYC is a series of Hackathons taking place in NYC. (The next one is on April 22nd, 2013. 8pm. Kellen Auditorium @ The New School, 66 Fifth Avenue).

During the last hackathon, which took place at The Graduate Center, CUNY, the participants used our free software tool ImagePlot to analyze and visualize a collection of almost 400 Occuprint posters. (Occuprint "collects, prints and distributes posters from the worldwide Occupy movement.")

"Occuprint Image Analysis" results and visualizations are available here:

Occuprint Image Analysis



ImagePlot software download and documentation:

http://lab.softwarestudies.com/p/imageplot.html

Tuesday, December 18, 2012

AR student project allows museum visitors to look underneath van Gogh paintings


Van Gogh's studio practice: Canvases re-used

vg-3

vg-8

With the support of AR Lab, student Koen Mostert made a tool that enables you to see what is underneath the painting by touching an iPad screen. This was done for five paintings, with a separate iPad for each painting.

The project was presented at the Van Gogh Museum in Amsterdam. It examined van Gogh's various working methods. In addition to allowing visitors to see the earlier paintings which van Gogh painted over, the presentation also examined the deterioration of paintings over the years.

The investigation into Van Gogh’s re-use of canvases is part of a large project entitled Van Gogh’s studio practice. A multidisciplinary team (comprising staff from the Van Gogh Museum, the Netherlands Cultural Heritage Agency and Shell) is examining every imaginable facet of Van Gogh’s artistic process and comparing it with that of his contemporaries. The objective is to gain greater insight into Van Gogh’s working method and to place it in the context of his time.

You can follow the developments of this investigation: www.vangoghsatelierpraktijk.nl


Friday, December 14, 2012

Visualization on tiled displays: new software from the University of Texas at Austin






When I first saw a 55-monitor tiled display at the opening of the Calit2 building in 2005, it changed my life. I realized that such displays, combined with the massive cultural data sets that were becoming available (for example, Artstor), can offer fundamentally new opportunities for the study of cultural processes and dynamics. Two years later, with support from Calit2's visionary leaders Larry Smarr and Ramesh Rao, I established the Software Studies Initiative together with Noah Wardrip-Fruin. We focused on the development of "software-based research methods and next generation cyberinfrastructure tools and resources for the study of massive sets of visual cultural data, asking theoretical questions that are important for humanities."

In 2009 we developed an interactive visualization application for exploring image collections, which ran on the 287 megapixel display at Calit2, made from 70 30-inch displays. (The software is described in Yamaoka, S., Manovich, L., Douglass, J., Kuester, F., Cultural Analytics in Large-Scale Visualization Environments, IEEE Computer, 11/2011.)

Our app allows for interactive manipulation of thousands of images of any size. However, this interactivity currently has a price - the software only works on multi-monitor display systems which run CGLX (A Cross-Platform Cluster Graphics Library); the development of new apps requires experience with CGLX.


Now the Texas Advanced Computing Center at the University of Texas at Austin has released Most Pixels Ever: Cluster Edition (MostPixelsEverCE), a library for extending Processing sketches to multi-node tiled displays:

TACC Develops Visualization Software for Humanities Researchers

Most Pixels Ever: Cluster Edition

MostPixelsEverCE - software

Created by artists Casey Reas and Ben Fry, Processing is an established platform used by numerous artists and designers - a very important factor in the further development of humanities visualization. Artists and designers pioneered innovative visualizations of cultural data almost ten years ago (for example, Manyeyes, 2003). Later, Processing artist Daniel Shiffman developed the original MostPixelsEver library, which inspired University of Texas researchers to develop their software. (Daniel teaches at ITP in NYC and recently published the book The Nature of Code, funded via Kickstarter.)

We are looking forward to "sketching" with the new software on scalable multi-monitor display walls at Calit2. While our latest walls have lower resolution than U. of Texas's amazing 328 megapixel Stallion, they use new monitors with very thin bezels:

Tour (04-11-2012) 23
Lev Manovich demonstrating interactive explorations of image collections (168 paintings by Mark Rothko).




Friday, October 19, 2012

"Big Data, Visualization, and Digital Humanities" - course at CUNY Graduate Center, Spring 2013


One million manga pages
Exploring a visualization of 1 million manga pages on 287 megapixel HIperSpace visualization system at Calit2, 2010.


Big Data, Visualization, and Digital Humanities
CUNY (City University of New York) Graduate Center, 365 5th Avenue, New York City.
Instructor: Lev Manovich
Course numbers: IDS 81650 / MALS 78500
Format: graduate seminar open to PhD and MA students

Want to visit the course and sit in on particular meetings?
It is possible but please email me first [manovich dot lev at gmail dot com].

Classes meet on Mondays, 2pm-4pm. Room: 3309.
First class meeting: Monday, January 28.
CUNY Graduate Center 2012-2013 academic calendar



Schedule, lectures, readings, tutorials, resources (will be updated during the semester).


Twitter:
Use the #digitalgc hashtag on twitter and blog posts.

Graduate Center Digital Initiatives:
http://gcdi.commons.gc.cuny.edu/


Other Digital Humanities / Digital Media courses at CUNY this semester:

Arienne Dwyer - MALS 75500 – Digital Humanities: Methods and Practices - Mondays, 11:45 a.m.-1:45 p.m.

Luke Waltzer and Chris Stein - ITCP 70020 - Interactive Technology and the University: Theory, Design, and Practice - Tuesdays - 4:15 pm - 6:15 pm


Current courses at other universities which cover related topics:

Lauren Klein (CUNY 2011 PhD graduate), Georgia Institute of Technology: Studies in Communication and Culture: Data

Stefan Sinclair, McGill University: Digital Studies/Citizenry

Katy Börner, David Polley, Scott Weingart. Indiana University. Information visualization



Course description:

The explosive growth of social media on the web, combined with the digitization of cultural artifacts by libraries and museums, opens up exciting new possibilities for the study of cultural processes. For the first time, we have access to massive amounts of cultural data from both the past and the present.

How do we navigate and interact with massive cultural collections (billions of objects)?

How do we combine close reading of individual artifacts and “distant reading” of patterns across millions of these artifacts?

What visualization and computational tools are particularly suited for working with large cultural data sets?

How do we use exploratory visualization as a research method in the humanities and social science?

How do we understand visualization theoretically in relation to other visual media, past and present?

This course explores the possibilities, the methods, and the tools for working with large cultural data sets, with a particular focus on data visualization and the analysis of visual media (images and video). It also covers relevant work from digital art and design, media theory and software studies.

We will also discuss cultural, social and technical developments that placed "information" and "data" in the center of contemporary social and economic life (the concepts of information society, network society, software society).

We will critically examine the fundamental paradigms developed by modern societies to analyze patterns in data - statistics, visualization, data mining. This will help us to employ computational tools more reflexively. At the same time, the practical work with these tools will help us to better understand how they are used in society at large - the modes of thinking they enable, their strengths and weaknesses, the often unexamined assumptions behind their use.

Finally, we also want to ask general questions about theory and art in a "postdigital" society.
The arrival of social media and the gradual move of all knowledge and media distribution and cultural communication to networked digital forms has created a new cultural landscape which challenges our existing methods and assumptions:

What new theoretical concepts and models do we need to deal with the new scale of born-digital culture?



What will be covered?

The course is suitable for students from any area of humanities or social sciences.
No technical skills are required beyond the basic digital media literacy.

Because I expect people from a variety of backgrounds, I will not go deeply into statistics, data analytics, and data mining. As examples of what we will cover, I will explain PCA, and show how to do it in R; I will also talk about the concepts of "features" and "feature space" and show examples of features for text, sound, image, and spatial data.
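To give a flavor of what PCA does with a set of features, here is a minimal sketch that finds the first principal component of a tiny data set. It is written in plain Python rather than R, purely for illustration, and it uses power iteration on the covariance matrix instead of a library routine. The function name and the sample data (two made-up features per item) are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows is a list of equal-length feature vectors, one per item.
    direction is a unit vector in feature space. scores are the
    projections of the centered items onto that direction.
    """
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two items")
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, unless the start vector is orthogonal to it.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Four hypothetical items described by two features that vary together.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(data)
```

Here the two features move in lockstep, so a single direction in the feature space captures all of the variation, and each item can be described by one score instead of two numbers.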

I will demo data analysis and visualization tools, and demonstrate their use in class. However, there wil be no required technical assignments. (Given variety of student backgrounds, such assingmemnts will likely to be too simple for some and too challenging for others.) You are strongly encouraged to try the tools and the techniques shown in class outside of the class meetings, and use them in your practical course projects. I will provide simple data sets and exerises you should work on if you want to learn these tools.

As my own research focuses on analysis and visualization of images, video, and interactive media, visual media will be the focus of demos and examples. We will learn about image processing, how to extract features from images, and how to visualize image and video collections. (I will not be giving a comprehensive overview of all DH tools people use to work with texts, maps, or historical data - but we will cover basic concepts.)


Info about my research work:

http://lab.softwarestudies.com/2008/09/cultural-analytics.html

You may want to take a look at my most recent classes, since the new class will draw from them:

http://manovich.net/teaching.php


Course structure:

1/3 instructor presentations;
1/3 discussions of readings and relevant projects from digital media, art, design, cinema, artistic visualization, architecture, museum design, digital humanities;
1/3 demos of software and tools tutorials;

A typical "large cultural data analysis" project involves three parts: data, analysis/visualization, interface. Each concept will be discussed in relation to current industry approaches, relevant projects in the arts and design (historical and recent), media and software studies theory, and practical techniques and tools.


Course requirements:

Students will have a choice of doing the following: a final paper; a series of blog posts examining concepts or presenting a project; a practical project which can be done individually or in collaboration with other students.


Tuesday, October 16, 2012

Veja.vis: preliminary results

Veja.vis is an innovative research project in the field known as "Digital Humanities" that seeks to demonstrate the potential of combining social science research with quantitative data analysis technologies (data analytics). In our specific case, we follow Lev Manovich's theoretical proposal of also analyzing cultural factors through large-scale data processing, producing and applying "cultural" algorithms for recognizing faces, gender, and other attributes, to address questions whose answers would normally take perhaps decades of extensive research in libraries or archives. To process the data we used high-performance grid computing, and the images that follow were rendered using this method. The cases analyzed below demonstrate the potential of our study group's approach within the emerging field of Digital Humanities. In our project, in addition to quantitative data, we created some "qualitative" algorithms with the aim of provoking a discussion about how the magazine with the largest national circulation has been treating certain themes on its covers, even if unconsciously. Because of limits on our hardware capacity and on the computing time needed to render all the images, we used only the digitized images of the magazine's covers, which were the object of our analysis in this academic project, being developed by Marcio Santos, who holds a master's degree in Communication, under the supervision of Cicero Inacio da Silva. The results address qualitative and quantitative questions and yield findings such as:

a) black women account for 0.33% of all covers, even though the most recent IBGE census (2010) shows that half of the female population is black or mixed-race (negra/parda);

b) only 12% of Veja magazine covers feature women as the main subject, although women are now 51% of the population;

c) black women are shown in only 3 types of themes: War (refugees), Carnival, and politics (1 cover = 0.0041%), reinforcing the classic stereotype of the role of women in Brazilian society;

d) black men appear at a slightly higher rate, 2% of all VEJA covers (47 covers);

e) among the themes of the covers featuring black men, three stand out: sports (40%), Crime (11%), and Politics (19%);

f) of the covers with black men, 66% show Brazilians and the rest show foreign personalities.

The Veja.Vis project has been presented at numerous national and international conferences, including Digital Humanities 2012 in Sheffield, England.

Monday, August 20, 2012

NYC OccupyData Hackathon uses software tools developed in our lab


This May Suzanne Tamang used our software tools at the NYC OccupyData Hackathon to visualize Occupy-inspired street art. The next Hackathon will take place in September.

Check out Suzanne's work:

is the occupy movement getting more colorful?




Saturday, May 5, 2012

animated visualization of Arizona Sentinel weekly, 1872-1911


UCSD undergraduate Cyrus Kiani added a new video to his already amazing work visualizing the history of American Newspapers using the collection at Library of Congress.

The new video shows the evolution across 1,962 front pages of the Arizona Sentinel weekly, 1872-1911.

The Arizona Sentinel : 1872-1911

Place of publication: Arizona City [Yuma], Yuma County, A.T. [ Ariz.]

Frequency: Weekly

Language: English

sn 84021912

Chronicling America
Library of Congress
chroniclingamerica.loc.gov


Wednesday, April 25, 2012

Impressionism visualizations: final class project by Megan O'Rourke


Previously we featured visualizations of selected paintings by Impressionist artists created by Megan O'Rourke in my undergraduate class at UCSD (Winter 2012). One visualization compared works of artists using multiple image plots; the second used color histograms to show the same data in a different way.

Here is Megan's final class project which extends her investigation into new areas: comparing how different artists represented faces using image averaging, and visualizing evolution of their paintings over time in relation to brightness, saturation and hue.

To measure the image properties used for the visualizations, Megan used the ImageMeasure macro, which we distribute as part of our free ImagePlot software.
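For readers curious about what image averaging involves, here is a minimal sketch in Python. It assumes every image has already been decoded and resized to the same dimensions and is stored as rows of (r, g, b) tuples; the function name and data layout are hypothetical, and this is not the ImageMeasure macro or the exact procedure Megan used.

```python
def average_images(images):
    """Average same-size images pixel by pixel.

    images: non-empty list of images, each a list of rows, each row a
    list of (r, g, b) tuples. Returns one image in the same layout
    whose channel values are the per-pixel means (as floats).
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same dimensions")
    n = len(images)
    return [
        [
            tuple(sum(img[y][x][c] for img in images) / n for c in range(3))
            for x in range(width)
        ]
        for y in range(height)
    ]

# Two tiny 1x2 "portraits" standing in for real, aligned images.
portrait_a = [[(0, 0, 0), (255, 255, 255)]]
portrait_b = [[(100, 200, 50), (255, 255, 255)]]
blended = average_images([portrait_a, portrait_b])
```

Features shared across the set (for example, where faces tend to sit in the frame) survive the averaging, while details unique to one painting fade out.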

(CLICK ON EACH IMAGE BELOW TO SEE A HIGHER RESOLUTION VERSION)


Impressionist Portraits

Impressionism Sparklines

Impressionist Color Sparklines

Explanation of how this visualization was generated

Project text

Friday, April 20, 2012

Software Studies Initiative awarded $477,000 grant from Mellon Foundation


Project name:

Tools for the Analysis and Visualization of Large Image and Video Collections for the Humanities


Project team:

PI: Dr. Lev Manovich, Professor of Visual Arts, University of California, San Diego (UCSD);
Director, Software Studies Initiative, California Institute for Telecommunications and Information Technology (Calit2).

Almila Akdag, Postdoctoral Researcher, e-Humanities Group, The Royal Netherlands Academy of Arts and Sciences; Visiting Scholar, Visual Arts and Communication Design, Sabanci University, Istanbul, Turkey.

Loretta Auvil, Senior Project Coordinator at Illinois Informatics Institute, University of Illinois; SEASR co-PI.

Jeremy Douglass, Technical Director, Software Studies Initiative, UCSD.

Elizabeth Losh, Director of Academic Programs, Sixth College, Program in Culture, Art, and Technology, UCSD.


Project summary:

Since 2008, Software Studies Initiative at California Institute for Telecommunications and Information Technology (Calit2) and University of California, San Diego (UCSD) has been developing a comprehensive set of software tools for the quantitative analysis and visualization of large collections of images and video. The tools were designed for academic researchers in the humanities, and have already been used by scholars in a number of disciplines including art history, archeology, film and media studies, dance studies, and game studies. We have also been working with a number of prominent cultural institutions and collections including the Library of Congress, Getty Research Institute, the Austrian Film Museum, and the Netherlands Institute for Sound and Image, in using our techniques with their collections and data sets. The software development and its applications have received support from the National Endowment for the Humanities (NEH), the National Science Foundation (NSF), the University of California Humanities Research Institute (UCHRI), UC San Diego, and the California Institute for Telecommunications & Information Technologies (Calit2).

In our new three-year project funded by a $477,000 grant from The Andrew W. Mellon Foundation, we will work to fully integrate our techniques and tools into the SEASR/Meandre environment, a major platform for digital humanities research developed with key support from the Andrew W. Mellon Foundation. The integrated tools will come with comprehensive documentation and a set of examples covering a number of fields in the humanities and humanistic social sciences. This integration will address a current goal of SEASR to “continue to evolve to include processing of images and other multimedia data formats.” We anticipate these tools being used by an ever-expanding range of people, including academics and students in the humanities and humanistic social sciences, museum curators and visitors, and cultural creators who want to better understand how their work fits within a larger context.

In addition to making available to others software tools, accessible user interfaces, documentation, and examples, Software Studies Initiative will also collaborate with other researchers to carry out large-scale case studies. Each case study will demonstrate how, within a particular field, quantitative analysis and visualization of images and/or video can open new research possibilities for that field. Each study will include documentation of the appropriate SEASR workflows, a paper describing the data, the methods used, the findings, and high-resolution still and animated visualizations:

Almila Akdag will lead the case study which will combine network analysis and image processing to explore a few million images and user data from deviantArt (the most popular social network for user-generated art).

Jeremy Douglass will lead the analysis of our one million manga images dataset.

Elizabeth Losh will lead the case study which applies our methods to thousands of hours of political video on the web and TV news.

Over 200 undergraduate and graduate UCSD students will participate in the project over its three-year period, exploring selected data sets as part of their classes in visualization, computational art history, and digital humanities.


Contact:

Lev Manovich, Director, Software Studies Initiative [manovich@ucsd.edu]


More information:

Our methods for the analysis and visualization of large visual data sets

Our projects (analysis of image sets covering video games, visual art, graphic design, magazines, newspapers, comic books, TV, films, animation, motion graphics.)

http://www.flickr.com/photos/culturevis/collections/ (Over 900 visualizations and sketches from our lab)

Our open source software tools (digital image processing and visualization of image sets of any size.)

Case study: One million manga pages

Pilot project: Digging Into Global News




Monday, April 16, 2012

Visualizations of Impressionist artists - color histograms (part 2)


Visualizations of Impressionist artists - color palettes (part 1) were created by UCSD undergraduate student Megan O'Rourke for her homework in my Winter 2012 class on data visualization and computational art history (the link is to the Spring 2012 version of the class).

Here is another innovative visualization created by Megan. She adopted the histogram technique to compare the color palettes of six Impressionist artists. The histograms show the relative proportions of different hues in the set of paintings of each artist. (To make this visualization easier to read, below is the visualization of the same images from the earlier post. It maps paintings according to x-axis = median saturation, y-axis = average hue.)

Together, the two visualizations reveal strong similarity between the color "footprints" of the selected paintings of these artists.

Impressionists Color Ranges

Impressionism Image Plots




Data:
Images of 630 Impressionist paintings.

Source:
Artstor.

Number of paintings per artist:
Artstor contains only some of the paintings by these artists. The differences in the numbers of images available for each artist reflect the differences in the popularity of each artist as well as their varied productivity.

The histograms use the median hue values measured for each painting.
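A histogram of this kind can be sketched in a few lines of Python. The version below is an illustration only, not the code used for Megan's visualization: it assumes each painting is available as a list of (r, g, b) pixels with values 0-255, the function names and the number of bins are hypothetical, and it takes a plain median of hue even though hue is a circular quantity (reds sit at both ends of the scale).

```python
import colorsys
from statistics import median

def median_hue(pixels):
    """Median hue in degrees (0-360) of an iterable of (r, g, b) pixels."""
    hues = [
        colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] * 360
        for r, g, b in pixels
    ]
    return median(hues)

def hue_histogram(median_hues, bins=12):
    """Share of paintings falling into each equal-width hue bin.

    median_hues: one median hue (in degrees) per painting.
    Returns a list of `bins` proportions that sum to 1.
    """
    if not median_hues:
        raise ValueError("need at least one painting")
    width = 360 / bins
    counts = [0] * bins
    for hue in median_hues:
        # Clamp so a hue of exactly 360 lands in the last bin.
        counts[min(int(hue // width), bins - 1)] += 1
    return [count / len(median_hues) for count in counts]

# Made-up median hues for one artist's paintings.
artist_hues = [32.0, 41.5, 48.0, 210.0, 215.5]
proportions = hue_histogram(artist_hues, bins=12)
```

Using proportions rather than raw counts keeps the artists comparable even though the number of paintings available for each one differs.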


Thursday, March 15, 2012

Visualizing newspapers history: The Hawaiian Star, 5930 front pages, 1893-1912

The Hawaiian Star, 5930 front pages, 1893-1912 (Vimeo)


Last September I met with Leslie Johnston (Chief of Repository Development at Library of Congress). We discussed how my lab and the students in my classes can start working on visualizing significant digital archives available through the Library web site.

We both agreed that the digitized archive of American newspapers created by the Library via a partnership with the National Endowment for the Humanities is a good place to start. Currently the archive contains 4,776,214 pages, and it continues to grow. The pages are digitized at 400 dpi resolution.

A group of UCSD undergraduate students who were taking my Fall 2011 class on big cultural data, visualization and digital image processing (current syllabus) figured out how to download high-res images of newspaper pages and metadata using the Library API, and started working on visualizing a number of newspapers. We will be putting a page with this project's results on softwarestudies.com soon. Today we are releasing one of the animated visualizations created by UCSD undergraduate student Cyrus Kiani (embedded at the top of the post).

Kiani's animation uses 5930 front pages from The Hawaiian Star covering the 1893-1912 period. This period is particularly important for the development of modern visual communication (the development of abstract art which leads to modern graphic design, the introduction of image-oriented magazines such as Vogue, the new medium of cinema, the invention of the phototelegraph, the first telefax machine to scan any two-dimensional image, etc.)

The animation of 5930 front pages of a single newspaper published during these 20 years makes visible for the first time how the visual design of modern print media changes over time, in search of the form appropriate to the new conditions of reception and the new rhythm of modern life.


Here are Kiani's other visualizations and analysis of this data set (click on the image to view the original version (3000 x 10650 pixels) on Flickr):

The Hawaiian Star, 5930 front pages, 1893-1912 (width = 1000 pixels)

Visit softwarestudies.com to see our other digital humanities projects, and to download the free software tools we developed for visualization of large image archives.

Saturday, February 11, 2012

Wolfram|Alpha Pro: Launching a Democratization of Data Science

At Software Studies Initiative, we have been working to democratize the use of digital image processing for exploring large image and video collections.

In September 2011, we released ImagePlot - a set of free open-source software tools which we developed in the lab and used in all our own research projects. You can now take a set of images and videos, automatically extract basic visual features, and then explore the patterns in your image or video collection using the extracted data.

So we are very excited to read the latest post from Stephen Wolfram, founder of Mathematica and Wolfram|Alpha, about this week's release of Wolfram|Alpha Pro, and how it automates analysis and visualization of data sets.

http://blog.stephenwolfram.com/2012/02/launching-a-democratization-of-data-science/



Last October I was fortunate to chat with Stephen over lunch at the Wolfram Data Summit 2011 conference. He was interested in our work on analyzing and visualizing the spaces of variations of cultural artifacts. We talked about how the models of biological evolution and variability may relate to cultural evolution and also to the principles described in Wolfram's famous book A New Kind of Science.


Here are a few examples of our visualizations of spaces of variation in different kinds of cultural artifacts:

Mondrian vs Rothko: footprints and evolution in style space

Google logo space

One million manga pages