Software Studies Beta: Style Space: How to compare image sets and follow their evolution

Draft text by Lev Manovich (August 4-6, 2011).

All projects and visualizations are created by members of Software Studies Initiative
(credits appear under the images on Flickr)

Batch image processing software: Sunsern Cheamanunkul and Jeremy Douglass.
ImagePlot visualization software: Lev Manovich, Jeremy Douglass, Nadia Xiangfei Zeng.
ImagePlot documentation: Tara Zepel.
Statistical analysis of manga images and data: Sunsern Cheamanunkul, Bertrand Grandgeorge, Lev Manovich.

Research described in this article was supported by Calit2 UCSD Division, Center for Research in Computing and the Arts (CRCA), NEH Office of Digital Humanities, and National University of Singapore.

----------------------------
style is a "...distinctive manner which permits the grouping of works into related categories."
Fernie, Eric. Art History and its Methods: A critical anthology. London: Phaidon, 1995, p. 361.

----------------------------

AN EXAMPLE: VAN GOGH'S PARIS AND ARLES PAINTINGS

Let's start with an example. We want to compare van Gogh paintings
created when the artist lived in Paris (1886-1888) and in Arles
(1888). We have digital images of most of the paintings done by
the artist in these two places: 1999 for Paris, and 161 for Arles. (We
did not include the paintings done after the ear incident, which took
place at the end of 1888. Although van Gogh remained in Arles
for a few more months, he was in and out of the hospital and his
productivity was severely diminished.)

The following visualizations project each image set into the
same coordinate space. The X-axis represents average brightness;
the Y-axis represents average saturation. (We use the median rather
than the mean since it is less affected by outlier values. The
measurements are done with ImageJ, a free open source digital image
analysis application.)
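
The measurement step can be sketched in a few lines of code. The sketch below is not the ImageJ procedure used for the visualizations; it is a minimal illustration that assumes pixels are converted from RGB to an HSB/HSV representation and that brightness and saturation are rescaled to 0-255. The function name and the four-pixel "painting" are hypothetical.

```python
import colorsys
from statistics import median

def median_brightness_saturation(pixels):
    """Return (median brightness, median saturation) on a 0-255 scale.

    `pixels` is an iterable of (r, g, b) tuples with channels in 0-255.
    Assumption: brightness and saturation are the V and S of HSV.
    """
    brightness, saturation = [], []
    for r, g, b in pixels:
        _hue, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        saturation.append(s * 255.0)
        brightness.append(v * 255.0)
    return median(brightness), median(saturation)

# Hypothetical 4-pixel "painting": two saturated yellows, a gray, a dark blue.
toy_painting = [(255, 220, 0), (250, 200, 10), (128, 128, 128), (20, 30, 90)]
x, y = median_brightness_saturation(toy_painting)
```

Here x and y would be the coordinates of this one image in the brightness/saturation space; repeating the measurement for every image in a set gives the positions plotted in the visualizations.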

Here are Paris paintings:

van_Gogh.Paris.X_brightness.Y_saturation

And here are Arles paintings:

van_Gogh.Arles.X_brightness.Y_saturation

Projecting sets of paintings done in two places into the same
coordinate space allows us to better see the similarities and
differences between the two periods on brightness/saturation
dimensions. We see the parts of the space of visual possibilities
explored in each period. We also see the relative distributions of
their works - the more dense and the more sparse areas, the presence
or absence of clusters, the outliers, etc.

Arles paintings are much less spread out than Paris paintings. Their
cluster is higher and to the right of the cluster formed by Paris
paintings (higher saturation, higher brightness). But these are not
absolute differences. The two clusters overlap significantly. In other
words: while some Arles paintings are exploring a new visual
territory, others are not. Traces of van Gogh's earlier pre-Paris styles
are also still visible: a significant number of Paris paintings and a
number of Arles paintings are quite dark (left quarter of each
visualization).

STYLE SPACE: DEFINITION

A style space is a projection of quantified properties of a
set of cultural artifacts (or their parts) into a 2D plane. X and Y
represent the properties (or their combinations). The position of
each artifact is determined by its values for these properties.

Since the rest of this discussion deals with images, we can rephrase
this definition as follows: A style space is a projection of
quantified visual properties of images into a 2D plane. In the
example above, X axis represents average brightness, and Y axis
represents average saturation. We can also use three visual
properties to map images in a three-dimensional space. Of course,
two or three properties can't capture all the aspects of a visual style.
Since images have many different visual properties, we can create
many 2D visualizations, each using a different combination of
visual properties.

We are not claiming that such representations can capture all
aspects of a visual style. A "style space" representation is
a tool for exploring image sets. (It is particularly effective for
large sets.) It allows us to compare all images in a set (or sets)
according to their visual values. For instance, the two
visualizations above compare van Gogh's Paris and Arles paintings
according to their average brightness and average saturation.
Separating a "style" into distinct visual dimensions and
organizing images according to their values on these dimensions
allows us to see more clearly the differences between the images
in a set. Visual differences are translated into spatial distances.
Images which are visually similar will be close; images which
are different will be further away.
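
The idea that visual differences become spatial distances can be made concrete with a small sketch. The text does not name a particular distance measure; the example below assumes plain Euclidean distance and assumes both axes use the same 0-255 scale. The function name and the three image positions are hypothetical.

```python
import math

def style_distance(a, b):
    """Euclidean distance between two points in the same style space."""
    return math.dist(a, b)

# Hypothetical (brightness, saturation) positions of three images.
image_a = (120.0, 80.0)
image_b = (123.0, 84.0)   # close to image_a on both dimensions
image_c = (30.0, 200.0)   # much darker and much more saturated

near = style_distance(image_a, image_b)
far = style_distance(image_a, image_c)
```

If the two axes were measured on different scales, they would need to be normalized first; otherwise the dimension with the larger numeric range would dominate the distance.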

Here is another application of the style space concept.
We compare 128 paintings by Piet Mondrian (1905-1917) and
151 paintings by Mark Rothko (1944-1957). The two image visualizations
are placed side by side, so they share the same X axis.

X-axis: brightness mean.
Y-axis: saturation mean.

(For a discussion of this example, see Mondrian vs Rothko: footprints
and evolution in style space).

Now, consider a style space where the min and max of each axis are set to
the smallest and biggest possible visual values. All images which were
already created, and all possible images which can be created in the
future, will lie within the boundaries set by these min and max
values.

To illustrate this, we placed a set of specially created black and
white images in a simple style space (X-axis = brightness mean, Y-axis
= brightness standard deviation):

Because brightness mean and brightness standard deviation variables
are correlated, all possible images will lie within a half ellipse,
defined by these coordinates: (0, 0) on the left, (255, 0) on the right, and
(127.5, 126.6) at the top. The images of a particular artist, a particular artistic
school, the pages of a comic, all ads created by a company, or any
other cultural image set will typically occupy only a part of this
ellipse.
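
The boundary described above can be checked numerically. The sketch below relies on a standard statistical result (the Bhatia-Davis inequality): for values limited to the range 0-255, the variance cannot exceed (mean - 0) * (255 - mean). It assumes the population standard deviation, under which the arc peaks at 127.5 for an image that is half black and half white; values measured from actual image files may differ slightly from this ideal bound. The helper names and the synthetic images are hypothetical.

```python
import math
from statistics import fmean, pstdev

def max_possible_std(mean, lo=0.0, hi=255.0):
    """Largest population standard deviation an image with this mean can have."""
    return math.sqrt((mean - lo) * (hi - mean))

def black_white_image(n_pixels, white_fraction):
    """Synthetic image: a share of pure white pixels, the rest pure black."""
    n_white = round(n_pixels * white_fraction)
    return [255] * n_white + [0] * (n_pixels - n_white)

# Pure black-and-white images sit exactly on the outer boundary.
points = []
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    img = black_white_image(1000, fraction)
    points.append((fmean(img), pstdev(img)))

# A flat mid-gray image has the same mean as the half-and-half image,
# but lies on the bottom edge of the space (standard deviation 0).
gray = [128] * 1000
gray_point = (fmean(gray), pstdev(gray))
```

Any image made of intermediate gray tones falls strictly inside the boundary, which is why real image sets occupy only part of the available area.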

The following example maps pages from nine manga titles according to
their brightness mean (X) and brightness standard deviation (Y). The
pages make visible the ellipse shape. Most pages fall within a
particular part of the ellipse. These pages form a pretty tight
cluster; outside of the cluster, the ellipse is only sparsely
populated.

(Note: A manga narrative can be referred to as both a "title" and a
"series" if it consists of many chapters. In this text we use the
word "title," but you may also find the word "series" in descriptions
of our visualizations on Flickr linked here.)

We can refer to a particular part of a style space occupied by a set
of images as a footprint of this set. Informally, we can
characterize a footprint using its center and shape.

Formal descriptions are available in statistics. If we consider
measurements of a single visual dimension (i.e., a single visual
property such as brightness mean), we can characterize their
distribution, the central tendency, and the dispersion
(see http://en.wikipedia.org/wiki/Descriptive_statistics).

If we want to analyze multiple features together, we can apply the techniques of multivariate statistics.
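
As a minimal sketch of what "center and shape" could mean in practice, the example below summarizes a footprint by the mean position of its images, the standard deviation along each axis, and the covariance between the two axes. This is only one possible summary, chosen here for illustration; the function name and both image sets are hypothetical.

```python
from statistics import fmean, pstdev

def footprint_summary(points):
    """Summarize a footprint by its center and its spread along each axis."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    center = (fmean(xs), fmean(ys))
    spread = (pstdev(xs), pstdev(ys))
    # Population covariance: its sign hints at the orientation of the footprint.
    cov_xy = fmean([(x - center[0]) * (y - center[1]) for x, y in points])
    return {"center": center, "spread": spread, "cov_xy": cov_xy}

# Hypothetical (brightness, saturation) measurements for two image sets.
set_a = [(100, 60), (110, 70), (120, 80), (130, 90)]     # tight, diagonal
set_b = [(40, 20), (120, 150), (200, 60), (90, 220)]     # widely scattered

summary_a = footprint_summary(set_a)
summary_b = footprint_summary(set_b)
```

Comparing the two summaries shows a compact footprint (small spread) against a dispersed one (large spread), which is the kind of difference the visualizations make visible at a glance.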

FEATURES

The visualizations above use simple visual features - brightness and
saturation. Digital image processing allows us to measure images on
hundreds of other visual dimensions: colors, textures, lines, shapes,
etc. In computer science, such measurements are often called "image features."

We can map images into a space defined by any combination of these
features. For example, the following visualization of 128 Mondrian
paintings created between 1905 and 1917 uses measures of average
brightness as X, and average hue as Y (the median of the hues
of all pixels, represented on a 0-255 scale). Although an average value
of all pixels' colors may seem like a strange idea, this feature
measurement turns out to be quite meaningful: it reveals that almost
all of the 128 Mondrian paintings created between 1905 and 1917 fall into
two groups: those dominated by brown and red (bottom) and those dominated
by blue and violet (top).

IMAGE FEATURES AND STYLE

To what extent do basic properties of visual cultural artifacts (i.e.,
features) represent "dimensions" of style? In many cases, the basic
"low-level" properties correspond to "high-level" stylistic
attributes. For instance, in the case of many modern abstract artists
such as Mondrian and Rothko, measurements of color saturation and hues
are meaningful and can reveal interesting patterns in the evolution
of each artist's work.

Here is another example of how a low-level feature captures a
high-level style attribute. This feature is entropy
- a measure of unpredictability. If an image has lots of details
and/or textures, it will have high entropy (since it is hard to
predict the value of a pixel based on the values of its
neighbours). If an image consists mostly of flat areas - i.e., a
single gray tone or color without much variation or texture - it
will have low entropy.
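
One simple way to compute such a measure is the Shannon entropy of an image's gray-level histogram, sketched below. This is an assumption made for illustration: the text does not give the exact formula used, and a histogram-based entropy ignores where pixels are located, so it only approximates the neighbour-based intuition described above. The function name and the three tiny "pages" are hypothetical.

```python
import math
from collections import Counter

def histogram_entropy(gray_values):
    """Shannon entropy (in bits) of the gray-level histogram of an image."""
    counts = Counter(gray_values)
    total = sum(counts.values())
    # Each gray level contributes p * log2(1 / p), where p is its share of pixels.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

flat_page = [200] * 64                    # one flat gray tone
two_tone_page = [0] * 32 + [255] * 32     # half black, half white
detailed_page = list(range(0, 256, 4))    # 64 different gray levels

flat = histogram_entropy(flat_page)
two_tone = histogram_entropy(two_tone_page)
detailed = histogram_entropy(detailed_page)
```

The flat page scores lowest, the two-tone page slightly higher, and the page with many distinct gray levels highest, which matches the ordering from "graphic" to "detailed" described in the text.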

This visualization maps one million manga pages according to their
entropy (Y-axis) and standard deviation (X-axis). (Both entropy and
standard deviation are measured using pixels' brightness values.)

The pages in the bottom part of the visualization are the most graphic
and have the least amount of detail. The pages in the upper right have
lots of detail and texture. The pages with the highest contrast are on
the right, while pages with the least contrast are on the left. In
between these four extremes, we find every possible stylistic
variation.

In other words: the footprint of our sample of one million pages
almost completely covers the space of possible values in
entropy/standard deviation space. In addition, a large part of this
footprint is very dense, i.e., the distances between neighboring pages
are tiny. We can call this dense area a "core."

This suggests that our concept of "style" as it is commonly used may not
be appropriate when we consider large cultural data sets. The concept
assumes that we can partition a set of cultural artifacts into a
small number of discrete categories. In the case of our one million
page set, we find practically infinite graphical variations. If we
try to divide this space into discrete stylistic categories, any such
attempt will be arbitrary.

How does the statement we just made, that "our basic concept of
'style' may not be appropriate when we consider large cultural data
sets," fit with the concept of a "style space"? A "style space" is simply a
space of all possible values of particular visual features (either
single features or their combinations) mapped into X and Y. Since we
can measure visual properties of any images, we can represent any
image set in such a space. Such a visualization reveals whether it is
meaningful to speak about a "style" shared by this image set (or its
parts). If an image set is spread out across the space, we
can't talk about its distinct style. If an image set forms a
cluster which only occupies a small part of the space, we may be able
to.

In the case of one million manga images, they completely fill the
whole range of possible values on the entropy dimension (little
texture/detail - lots of texture/detail). But with Mondrian and Rothko
image sets, the paintings produced by each artist in a particular
period we are considering only cover a smaller area of
brightness/saturation space, so it is meaningful to talk about a
"style" of each period. (If we measure and visualize numbers and
characteristics of shapes in paintings of each artist produced in
their later years, the footprints will be even smaller.)

(For more details about our manga data set, see
Douglass, Jeremy, William Huber, Lev Manovich. 2011. Understanding scanlation:
how to read one million fan-translated manga pages.)

DENSITY

Mapping all images in a set into a space defined by some of their
visual features can be very revealing, but it has one limitation:
it can be hard to see the varying density of the images' footprint.
Therefore a visualization which shows images can be supplemented by a
visualization which represents images as points and uses transparency.

The following visualization shows the same set of one million manga
pages mapped in the same way using points. The initial plot was created in
the free Mondrian software, and then colorized in Photoshop.

manga.pages.all_titles.points_size4.blur_4.shadows_highlights_44
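
Density can also be estimated numerically rather than visually. The sketch below is not how the plot above was produced (that used Mondrian and Photoshop); it is an alternative illustration that divides the space into square cells and counts the points in each one. The function name, the cell size, and the six sample points are hypothetical.

```python
from collections import Counter

def density_grid(points, cell_size):
    """Count how many points fall into each square cell of a 2D grid."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical page coordinates, with both features rescaled to 0-100.
points = [(10, 12), (11, 13), (12, 11), (14, 15), (80, 90), (55, 40)]
grid = density_grid(points, cell_size=10)

# The cell holding the most points is a candidate for the dense "core".
densest_cell, densest_count = grid.most_common(1)[0]
```

Drawing each cell with an opacity proportional to its count would give roughly the same impression as overlapping semi-transparent points.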

Another way to visualize density is by graphing values of images on
each single dimension separately. The following graphs show the
distributions of brightness mean and brightness standard deviation
averages calculated for each title in our manga set.

(In statistical terms, each feature is a "random variable."
The values of a single feature for all images in a given set can be
described using univariate statistics: measures of central tendency such as mean or median;
measures of dispersion such as range, variance, and standard deviation;
and graphs of frequency distribution.

If we can fit the data to some well-known distribution such as the normal
distribution, we can characterize what we informally called
"density" more precisely using a probability density function.)

(Note: when using statistics to describe measures of visual features,
we always need to be clear whether we treat our image set as a
complete population or as a sample
from a larger population. For example, we can think of one million
manga pages as a sample of a larger population of all manga. In the
case of van Gogh paintings, a set of all his paintings can be taken as
a complete population.)
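
The univariate measures and the population/sample distinction mentioned above can be illustrated with a short sketch. The brightness values below are hypothetical, not taken from our data, and the example does not check whether a normal distribution is actually a good fit; it only shows the mechanics.

```python
from statistics import NormalDist, fmean, median, pstdev, stdev

# Hypothetical brightness-mean values, one per manga title.
brightness_means = [180.0, 185.0, 190.0, 195.0, 200.0, 205.0, 210.0]

# Measures of central tendency and dispersion.
center_mean = fmean(brightness_means)
center_median = median(brightness_means)
value_range = max(brightness_means) - min(brightness_means)

# Treating the set as a complete population: divide by n.
spread_population = pstdev(brightness_means)
# Treating the set as a sample from a larger population: divide by n - 1.
spread_sample = stdev(brightness_means)

# Fit a normal distribution; from_samples uses the sample standard deviation.
fitted = NormalDist.from_samples(brightness_means)
density_at_center = fitted.pdf(center_mean)
density_in_tail = fitted.pdf(center_mean + 3 * fitted.stdev)
```

The sample standard deviation is slightly larger than the population one, and the fitted density is highest at the center and falls off toward the tails, which is the formal counterpart of a dense "core" surrounded by sparse areas.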

End of part 1.

Continue to Part 2.

Saturday, August 6, 2011