Chapter 2: A Survey of Visual Structure Theory
An information visualization, like any artifact used for communication and reasoning,
is a representation system. This system includes correspondences between
low-level properties of the data and the image, which is the information captured in
the variable encoding model. However, it also includes a system for fitting those properties
into a larger picture: the visual information structure. This structure provides
context for individual data items, suggests patterns and relationships in the overall
data, and assists the user in reasoning about visually presented information.
Current infovis theory has much more to say about the low-level data encoding side
of this representation than about the high-level structural side. Theories in infovis
and diagrammatic reasoning that do consider the importance of visual structure tend
to be either fairly vague or to focus on spatial layout as another encoding dimension.
However, there is work in the related field of human-computer interaction (HCI)
that takes a more concrete view of how visual interfaces suggest structural properties
of systems, pointing to a possible way forward for infovis theory in this regard.
The importance of finding a way to integrate visual structure into infovis theory is
shown by work in cognitive science that highlights the strong effects that structure
and context can have on the perception of visual information. In this chapter, I
will present and discuss this background in visual structure theories in terms of the
attempt to make infovis theory more structurally sound.
2.1 Visual Structure in Infovis Theory
Infovis theory has most often adopted a model of visualization as information
extraction. This model focuses on how data are transformed into visual encodings,
and how a user then translates those visual encodings into internal knowledge. As
a result, theory of this kind tends to be largely concerned with object-level rather
than global properties. When structure is considered, it tends to be restricted to a
question of what data attributes influence an object’s position in space.
The seminal work in visualization theory is Bertin’s Semiology of Graphics [9].
Although Bertin was at the time writing about static diagrams, his work has been
highly influential in modern infovis. Bertin lays out a thorough system of information
graphics, defining “marks” as the primitive graphical objects whose visual and spatial
properties are determined by a mapping from the underlying data. A mark can be any visual
element, such as a shape, line, area, or point, that represents information. He refers
to these visual properties as retinal properties, e.g., color, size, shape, and location.
Based on psychological knowledge about perception, he then provides guidelines for
the mapping of these properties to different types of data, such as categorical, ordinal,
and numerical: color is best suited to categorical data, position is the most precise
mapping for numerical values, and so forth.
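Bertin’s channel guidance can be caricatured as a ranked lookup from data type to visual property. The following sketch is an illustrative simplification — the rankings and channel names are my paraphrase for exposition, not Bertin’s exact ordering:

```python
# Illustrative sketch of Bertin-style channel guidance: visual
# properties ranked by suitability for each data type. The rankings
# are a simplified paraphrase, not Bertin's precise ordering.
CHANNEL_RANKING = {
    "numerical":   ["position", "size", "color_value"],
    "ordinal":     ["position", "size", "color_value", "shape"],
    "categorical": ["color_hue", "shape", "position"],
}

def best_channel(data_type, taken=frozenset()):
    """Return the highest-ranked visual channel not already in use."""
    for channel in CHANNEL_RANKING[data_type]:
        if channel not in taken:
            return channel
    raise ValueError("no free channel for " + data_type)
```

On this view, design guidance reduces to table lookup — which is precisely the object-level framing the rest of this chapter argues is incomplete.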
Bertin also considers spatial structure in his work, primarily focusing on the image
plane and how marks are positioned on it. He calls systems of planar organization
“imposition” and sorts them broadly into diagrams, networks, maps, and symbols,
which can be further classified by the coordinate system used. This part of his theory
has been less broadly influential on infovis practice than the retinal properties, perhaps
because it is less thorough and does not provide such clear guidelines. Another
reason may be that the retinal properties were founded on scientific knowledge about
the capabilities of the human visual system, and no equivalent knowledge existed
at the time about how people understand visual structure. However, when visual
structure has been considered in infovis theory, it has usually resembled Bertin’s
construction.
Like Bertin’s, Cleveland and McGill’s work on graphical perception [17]
explains the comprehension of information graphics through elementary perceptual
tasks, such as discerning angle, direction, area, and curvature of visual marks. Having
identified these tasks, they describe common diagram types like bar charts, pie charts,
and scatterplots in terms of which tasks are used to encode and decode data. Like
Bertin, they go on to make recommendations on the suitability of certain graphics
based on human perceptual abilities. Their theory is based on the idea that reading
visual information is a process of extracting information by decoding the visual
mapping.
These two works have together had a foundational influence on theoretical discussion
of information visualization. In many cases, this influence is direct and explicit:
for example, Mackinlay [42] employs Bertin’s classifications of visual marks and Cleveland
and McGill’s recommendations in his system for automating graph design. Card
and Mackinlay [13] also use a Bertin-inspired system to describe and classify visualization
methods in a taxonomy. In their model, visualization methods are coded
according to mappings between data variables and retinal variables; for example,
data variables are first coded by data type (i.e., nominal, ordered, or quantitative)
and then by what retinal or other visual property they are mapped to in a visualization.
What is striking about this paper is that, when they apply this model to
describing a number of actual infovis systems, it is almost always inadequate to the
task. Nearly every encoding they present includes asterisks and question marks to
indicate special cases, uncertainty about the visual variables being mapped, or what
the authors call “non-semantic use of space-time.” While this taxonomy makes a
heroic attempt to unify data description and visualization description under a single
model, the awkwardness of the fit seems to suggest that there are aspects of this
visual mapping that do not easily fall under variable encoding.
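The spirit of Card and Mackinlay’s coding scheme can be sketched as a record per data variable — type plus mapped channel — with an annotation field standing in for the asterisks and question marks their tables required. The field names and examples here are hypothetical, chosen only to illustrate the shape of the taxonomy:

```python
from dataclasses import dataclass

# Sketch of a Card-and-Mackinlay-style encoding record. Each data
# variable is coded by type (N nominal, O ordered, Q quantitative)
# and the visual property it maps to. The `note` field stands in for
# the special-case annotations the authors needed when the model did
# not cleanly fit a real system.
@dataclass
class Encoding:
    variable: str
    data_type: str   # "N", "O", or "Q"
    channel: str     # e.g. "x", "y", "size", "hue", "area"
    note: str = ""   # special-case annotation, if any

# A scatterplot codes cleanly...
scatterplot = [
    Encoding("horsepower", "Q", "x"),
    Encoding("weight", "Q", "y"),
    Encoding("origin", "N", "hue"),
]
# ...while a space-filling layout needs an escape hatch.
treemap = [
    Encoding("size", "Q", "area"),
    Encoding("hierarchy", "N", "position", note="non-semantic use of space"),
]
```

The telling detail is that the annotation field is doing real work: whatever the layout contributes beyond per-variable encoding has nowhere else to go in the model.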
In other cases, the influence is more subtle, and reflects the emphasis on marks
and their visual properties in a broad range of ways. Wilkinson’s grammar of graphics
[61] attempts to define a language for combining these basic graphic elements. This
grammar takes an object-oriented approach in order to define generalized designs of
graphical representations of data. Like Bertin, Wilkinson considers structure only
in terms of coordinate systems—that is, how the position of marks is determined.
Shneiderman’s task-by-data-type taxonomy [51] classifies data by a similar set of
structure types: one-dimensional, two-dimensional, three-dimensional, multidimensional,
temporal, tree, and network. Although these classifications refer to inherent
data properties, not visual structures, they are nonetheless influenced by assumptions
about on-screen positioning; otherwise there would be no reason to separate two- and
three-dimensional data from the multidimensional category.
This influence is also present in taxonomies that classify visualization methods by
how they encode data, such as Chi’s data state reference model [16]. This model
expands on the steps involved in translating data into visual form, then defines the
behavior of a broad range of visualization methods at each step. This is similar to
Card and Mackinlay’s system, but is more process-oriented, emphasizing the encoding
as a transformation rather than a simple translation. While it is useful to expand on
what is meant by variable encoding, and what this process actually entails, it is still
an expansion on a narrow definition of what is going on in the use of infovis.
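Chi’s model treats visualization as a staged pipeline: raw values pass through analytical and visualization abstractions to a final view, with a transformation between each pair of stages. A minimal sketch of this process orientation, with placeholder transformations standing in for a toy histogram method:

```python
# Minimal sketch of a Chi-style data state pipeline. The staging
# (value -> analytical abstraction -> visualization abstraction ->
# view) follows Chi's model; the concrete transformations below are
# placeholder examples, not drawn from his paper.
def make_pipeline(*stages):
    def run(data):
        for transform in stages:
            data = transform(data)
        return data
    return run

def filter_outliers(xs):            # data transformation
    return [x for x in xs if x < 100]

def bin_counts(xs):                 # visualization transformation
    counts = {}
    for x in xs:
        counts[x // 10] = counts.get(x // 10, 0) + 1
    return counts

def to_bars(counts):                # visual mapping transformation
    return ["#" * n for _, n in sorted(counts.items())]

histogram = make_pipeline(filter_outliers, bin_counts, to_bars)
```

The pipeline framing makes the encoding a process rather than a single translation step, but note that every stage still operates variable-by-variable; the global arrangement of the result is never an object of the model.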
A basic assumption of this area of theory, made explicit in Cleveland and McGill
but implicit elsewhere, is that understanding a visualization is a process of information
extraction. That is, there is some encoding from data property to visual property,
and all a user does to gain knowledge from a visualization is reverse that encoding.
This viewpoint sees all the activity of using infovis happening at the level of individual
graphical marks; it does not allow for overall structural impressions having a
significant impact on understanding.
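The information-extraction assumption can be stated as a toy invertible mapping: encoding sends a data value to a visual magnitude, and “reading” the chart is modeled as applying the inverse. The linear scale and bounds below are arbitrary illustrative choices:

```python
# Toy illustration of the information-extraction view: decoding is
# modeled as the exact inverse of encoding. Scale bounds and pixel
# range are arbitrary examples.
def encode_height(value, vmin=0.0, vmax=100.0, px=300.0):
    """Map a data value linearly onto a bar height in pixels."""
    return (value - vmin) / (vmax - vmin) * px

def decode_height(height, vmin=0.0, vmax=100.0, px=300.0):
    """Recover the data value from the bar height by inverting the map."""
    return height / px * (vmax - vmin) + vmin
```

Under this model a perfect reader recovers exactly what was encoded, and nothing else; any effect of surrounding structure on the reading is, by construction, invisible.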
There have been many practical benefits of this line of research, such as its application
to automatic view generation in the visual analysis system Tableau [43]. Building
on his previous, more theoretical work in automated graph design [42], Mackinlay
provides users of Tableau with the option of automatically choosing the best graph
for their data, based on the type of data dimensions being visualized. The variable
encoding model has also provided a framework for usefully including knowledge of
perception in visualization research. However, a body of theory that concerns only
object properties is in danger of missing the forest for the trees. It is in some ways
surprising that infovis has taken such a narrow view of visual information, since the
closely related field of human-computer interaction (HCI) has dealt extensively with
the idea that a visual interface represents structure.
2.2 Visual Structure in Human-Computer Interaction
In human-computer interaction (HCI), the idea that an interface (the system of
input methods available when using a computer program) contains information about
how its components fit together and how they can be used is a natural one. A common
way of talking about this is in terms of a user’s mental model of a system [48]. That
is, when faced with a novel piece of software, a user tries to figure out how it works
and what interactions are possible based on the appearance of interface components.
These perceptions of form and function compose the user’s mental model, which is
used to make predictions about how to achieve a goal using the interface.
There is evidence that these mental models, far from being a purely abstract design
concept, can have a powerful effect on memory and reasoning in interface use. Kieras
and Bovair [35], in a series of experiments, presented participants with a novel device
consisting of various switches and flashing lights, then taught them how to use the
device either by rote (that is, by explaining what steps to take to achieve a specific
result) or by giving them a model of the device’s purpose and how it works, describing
it in Star Trek-inspired terms as a control panel for a “phaser bank” and assigning
purposes to the various interface components. Users given a meaningful model of how
and why a device works were not only more able to remember memorized tasks using
the device, but were also more likely to spontaneously find a more efficient way to
perform the task.
This work shows how important structure is for understanding a complex system.
While our purpose in infovis is not necessarily to solve problems (although it can be
in some cases), the argument can still be made that exploring a dataset is a matter of
learning a model for a complex system of information. Mental models are therefore
a useful way to think about how a user comes to understand a dataset.
While mental models are a good way of thinking about how people conceive of
the structure of software systems, the question of how people perceive that structure
is perhaps a more pressing one. That is, how do people construct a mental model
of a system, given the appearance and function of its interface? One of the most
common ways to discuss this process in HCI is in terms of perceived affordances.
The concept of affordances is originally derived from Gibson’s ecological perception
theory [29]. Gibson framed perception in terms of what actions a given animal sees
its environment as affording. For example, a solid, flat surface affords supporting the
animal’s movement, while a smooth, sloped surface affords sliding downwards. In
all cases, affordances are relative to the viewer; a given environment affords different
actions to a mouse and to an elephant. Any animal faced with a given environment
will automatically perceive such potentials for movement or action based on apparent
physical properties and the animal’s own abilities.
In HCI, the concept is used in a slightly different fashion, to refer to aspects of a
visual interface that suggest potential actions to a user [47]. For example, an interface
element that is styled to look like a physical toggle button suggests to the user that
it can be pressed. The general model of visual structure in HCI, then, is that people
view an interface in terms of its perceived physical affordances, derive predictions
about what actions they can take based on those affordances, and then derive a
mental model of the system by taking those actions and seeing how they meet their
predictions.
Given the amount of research overlap between HCI and infovis, it is surprising that
visualization is rarely thought of in terms of what mental models a technique suggests
to a user. There are two likely reasons for this. The first is that HCI assumes that the
systems it deals with are interactive, so the ability of a user to predict the outcome
of her actions is an obvious consideration. Infovis, on the other hand, builds on a
history of static depictions of data; interactivity is a more recent development for the
field. Consequently, the ability of a user to perceive data accurately is the primary
consideration.
The other reason is the lack of well-defined tasks in infovis. Having a model of
how a system works is obviously necessary if you need to use it in pursuit of a
goal. Knowing what you can do and how to do it are prerequisites for solving a
problem. But in infovis, we don’t necessarily know what problem we’re trying to
solve. The tasks we feel visualization systems are meant to address are vague ones like
understanding a dataset, forming hypotheses, pattern recognition, and exploration.
These are important tasks, and the possibility of systems that can help perform them
is what excites people about visualization. But they are also tasks that lack a clear
end state. Perhaps this aspect of visualization makes structural properties of the
interface seem less important than in other domains.
However, a task without a clear goal can still benefit from structure, even if the
contribution of a user’s mental model seems less direct or obvious. Some of the ways
that visual structure can affect understanding and general reasoning are illuminated
by work in diagrammatic reasoning and visual cognition.
2.3 Visual Cognition of Diagrams
While the information processing approach has provided a way to apply perception
research to information visualization, it is less well-suited to understanding visualization
from the perspective of higher-level cognition; that is, not only how people
perceive information, but how they learn, reason with, and remember information.
This cognitive perspective forces us to consider the structural properties of visualization
and how they affect not only what information is extracted but how that
information is understood.
Theories that focus on reasoning with visual representations include Stenning and
Oberlander’s view of diagrams and language as logically equivalent yet supporting
different facilities of inference [54]. That is, by making aspects of a problem
concrete through visual representation, diagrams such as Euler circles can render
certain problem constraints explicit and therefore restrict potential inferences to a
smaller, valid subset. Similarly, Larkin and Simon [41] consider the differences between
graphical and verbal representations as differences in what information is made
salient and explicit. In a graphical representation, information is naturally organized
by location, while in a sentential representation it is organized sequentially. This
makes graphs more useful for, e.g., solving geometry problems, and language more
useful for problems that require logical reasoning. The authors consider what effects
the structure of a representation has on understanding, although they focus on the
very broad differences between words and pictures rather than defining differences
among types of graphical structure.
The importance of such differences, however, is illuminated by the extensive body
of work by Tversky and colleagues on how people interpret information presented
in different visual representations. For example, the authors presented the same
simple two-point data as either a bar chart or a line graph and asked for users’
interpretations [63]. They found that those viewing a bar chart tended to describe the
diagram as depicting two separate groups, while those viewing a line graph described
the data as a trend. This effect held even when the interpretations conflicted with
the labels on the data points. For example, a line graph showing the average height
of males versus females prompted one participant to describe the chart as saying
“The more male a person is, the taller he/she is.” These findings and others are
further discussed as examples of how schematic figures such as bars and lines are
interpreted in varying contexts [56]. Many of these figures have seemingly natural
interpretations; for example, lines between marks imply a relationship between the
represented objects, while contours are used for grouping objects. However, in many
cases context aids the interpretation of ambiguous primitive features such as blobs
and lines by fitting their relevant properties to task demands. Understanding the
cognitive basis for these primitive features and how they can be altered in context
would go a long way towards explaining how visualization works.
This work has a particularly direct application to infovis, but it also recalls a
broader area of visual cognition that looks at how people use diagrams as an external
representation to aid in reasoning. Gattis and Holyoak [28] argue that the power of
graphical representations goes beyond Larkin and Simon’s view that they merely allow
for more efficient information access in certain cases. Rather, they see diagrams and
graphs as having a special role in supporting reasoning by mapping conceptual relationships
to spatial ones, so that inferences about spatial properties can be extended
to inferences about the represented information. This view is supported by a number
of studies on diagrammatic reasoning, such as Bauer and Johnson-Laird’s finding [8]
that diagrams improve reasoning if they visually represent meaningful constraints in
a problem and Glenberg and Langston’s demonstration [30] that diagrams only improve
efficiency when their spatial mapping is conceptually meaningful. This work
taken together suggests that graphics can assist in problem solving, but only when
their spatial structure is meaningful in some way. The question of what structures
are meaningful and which are not, however, is not easily answered by existing work.
While this work suggests the importance of structure to the understanding of information
visualization, it offers no clear framework for discussing and analyzing that
structure. Though they intuitively seem to be talking about the same thing, researchers
from different fields and perspectives may refer to these structural properties as visual
framing, spatial layout, graph types, and so on. A common language and theory for
discussing the effects of structure is necessary to integrate it into visualization practice,
as Bertin’s conception of retinal properties has provided a common language to
deal with object properties. A promising source for this theory is visual metaphor.
2014年02月22日 04点02分
1
An information visualization, like any artifact used for communication and reasoning,
is a representation system. This system includes correspondences between
low-level properties of the data and the image, which is the information captured in
the variable encoding model. However, it also includes a system for fitting those properties
into a larger picture: the visual information structure. This structure provides
context for individual data items, suggests patterns and relationships in the overall
data, and assists the user in reasoning about visually presented information.
Current infovis theory has much more to say about the low-level data encoding side
of this representation than about the high-level structural side. Theories in infovis
and diagrammatic reasoning that do consider the importance of visual structure tend
to be either fairly vague or to focus on spatial layout as another encoding dimension.
However, there is work in the related field of human-computer interaction (HCI)
that takes a more concrete view of how visual interfaces suggest structural properties
of systems, which suggests a possible way forward for infovis theory in this regard.
The importance of finding a way to integrate visual structure into infovis theory is
shown by work in cognitive science that highlights the strong effects that structure
and context can have on the perception of visual information. In this chapter, I
will present and discuss this background in visual structure theories in terms of the
attempt to make infovis theory more structurally sound.
2.1 Visual Structure in Infovis Theory
Infovis theory has most often adopted a model of visualization as information
extraction. This model focuses on how data are transformed into visual encodings,
and how a user then translates those visual encodings into internal knowledge. As
a result, theory of this kind tends to be largely concerned with object-level rather
than global properties. When structure is considered, it tends to be restricted to a
question of what data attributes influence an object’s position in space.
The seminal work in visualization theory is Bertin’s Semiology of Graphics [9].
Although Bertin was at the time writing about static diagrams, his work has been
highly influential in modern infovis. Bertin lays out a thorough system of information
graphics, defining “marks” as the primitive graphical object whose visual and spatial
properties are based on a mapping with underlying data. A mark can be any visual
element, such as a shape, line, area, or point, that represents information. He refers
to these visual properties as retinal properties, e.g., color, size, shape, and location.
Based on psychological knowledge about perception, he then provides guidelines for
the mapping of these properties to different types of data, such as categorical, ordinal,
and numerical: color is best suited to categorical data, position is the most precise
mapping for numerical values, and so forth.
Bertin also considers spatial structure in his work, primarily focusing on the image
plane and how marks are positioned on it. He calls systems of planar organization
“imposition” and sorts them broadly into diagrams, networks, maps, and symbols,
which can be further classified by the coordinate system used. This part of his theory
has been less broadly influential on infovis practice than the retinal properties, perhaps
because it is less thorough and does not provide such clear guidelines. Another
reason may be that the retinal properties were founded on scientific knowledge about
the capabilities of the human visual system, and no equivalent knowledge existed
at the time about how people understand visual structure. However, when visual
structure has been considered in infovis theory, it has usually resembled Bertin’s
construction.
Similarly to Bertin, Cleveland and McGill’s work on graphical perception [17]
explains the comprehension of information graphics through elementary perceptual
tasks, such as discerning angle, direction, area, and curvature of visual marks. Having
identified these tasks, they describe common diagram types like bar charts, pie charts,
and scatterplots in terms of which tasks are used to encode and decode data. Like
Bertin, they go on to make recommendations on the suitability of certain graphics
based on human perceptual abilities. Their theory is based on the idea that reading
visual information is a process of extracting information by decoding the visual
mapping.
These two works have together had a foundational influence on theoretical discussion
of information visualization. In many cases, this influence is direct and explicit:
for example, Mackinlay [42] employs Bertin’s classifications of visual marks and Cleveland
and McGill’s recommendations in his system for automating graph design. Card
and Mackinlay [13] also use a Bertin-inspired system to describe and classify visualization
methods in a taxonomy. In their model, visualization methods are coded
according to mappings between data variables and retinal variables; for example,
data variables are first coded by data type (i.e., nominal, ordered, or quantitative)
and then by what retinal or other visual property they are mapped to in a visualization.
What is striking about this paper is that, when they apply this model to
describing a number of actual infovis systems, it is almost always inadequate to the
task. Nearly every encoding they present includes asterisks and question marks to
indicate special cases, uncertainty about the visual variables being mapped, or what
the authors call “non-semantic use of space-time.” While this taxonomy makes a
heroic attempt to unify data description and visualization description under a single
model, the awkwardness of the fit seems to suggest that there are aspects of this
visual mapping that do not easily fall under variable encoding.
In other cases, the influence is more subtle, and reflects the emphasis on marks
and their visual properties in a broad range of ways. Wilkinson’s grammar of graphics
[61] attempts to define a language for combining these basic graphic elements. This
grammar takes an object-oriented approach in order to define generalized designs of
graphical representations of data. Like Bertin, Wilkinson considers structure only
in terms of coordinate systems—that is, how the position of marks is determined.
Shneiderman’s task by data type taxonomy [51] classifies data by a similar set of
structure types: one-dimensional, two-dimensional, three-dimensional, multidimensional,
temporal, tree, and network. Although these classifications refer to inherent
data properties, not visual structures, they are nonetheless influenced by assumptions
about on-screen positioning, or there would be no reason to separate two- and
three-dimensional data from the multidimensional category.
This influence is also present in taxonomies that classify visualization methods by
how they encode data, such as Chi’s data state reference model [16]. This model
expands on the steps involved in translating data into visual form, then defines the
behavior of a broad range of visualization methods at each step. This is similar to
Card and Mackinlay’s system, but is more process-oriented, emphasizing the encoding
as a transformation rather than a simple translation. While it is useful to expand on
what is meant by variable encoding, and what this process actually entails, it is still
an expansion on a narrow definition of what is going on in the use of infovis.
A basic assumption of this area of theory, made explicit in Cleveland and McGill
but implicit elsewhere, is that understanding a visualization is a process of information
extraction. That is, there is some encoding from data property to visual property,
and all a user does to gain knowledge from a visualization is reverse that encoding.
This viewpoint sees all the activity of using infovis happening at the level of individual
graphical marks; it does not allow for overall structural impressions having a
significant impact on understanding.
There have been many practical benefits of this line of research, such as its application
to automatic view generation in the visual analysis system Tableau [43]. Building
on his previous, more theoretical work in automated graph design [42], Mackinlay
provides users of Tableau with the option of automatically choosing the best graph
for their data, based on the type of data dimensions being visualized. The variable
encoding model has also provided a framework for usefully including knowledge of
perception in visualization research. However, a body of theory that concerns only
object properties is in danger of missing the forest for the trees. It is in some ways
surprising that infovis has taken such a narrow view of visual information, since the
closely related field of human-computer interaction (HCI) has dealt extensively with
the idea that a visual interface represents structure.
2.2 Visual Structure in Human-Computer Interaction
In human-computer interaction (HCI), the idea that an interface (the system of
input methods available when using a computer program) contains information about
how its components fit together and how they can be used is a natural one. A common
way of talking about this is in terms of a user’s mental model of a system [48]. That
is, when faced with a novel piece of software, a user tries to figure out how it works
and what interactions are possible based on the appearance of interface components.
These perceptions of form and function compose the user’s mental model, which is
used to make predictions about how to achieve a goal using the interface.
There is evidence that these mental models, far from being a purely abstract design
concept, can have a powerful effect on memory and reasoning in interface use. Kieras
and Bovair [35], in a series of experiments, presented participants with a novel device
consisting of various switches and flashing lights, then taught them how to use the
device either by rote (that is, by explaining what steps to take to achieve a specific
result) or by giving them a model of the device’s purpose and how it works, describing
it in Star Trek-inspired terms as a control panel for a “phaser bank” and assigning
purposes to the various interface components. Users given a meaningful model of how
and why a device works were not only more able to remember memorized tasks using
the device, but were also more likely to spontaneously find a more efficient way to
perform the task.
This work shows how important structure is for understanding a complex system.
While our purpose in infovis is not necessarily to solve problems (although it can be
in some cases), the argument can still be made that exploring a dataset is a matter of
learning a model for a complex system of information. Mental models are therefore
a useful way to think about how a user comes to understand a dataset.
While mental models are a good way of thinking about how people conceive of
the structure of software systems, the question of how people perceive that structure
is perhaps a more pressing one. That is, how do people construct a mental model
of a system, given the appearance and function of its interface? One of the most
common ways to discuss this process in HCI is in terms of perceived affordances.
The concept of affordances is originally derived from Gibson’s ecological perception
theory [29]. Gibson framed perception in terms of what actions a given animal sees
its environment as affording. For example, a solid, flat surface affords supporting the
animal’s movement, while a smooth, sloped surface affords sliding downwards. In
all cases, affordances are relative to the viewer; a given environment affords different
actions to a mouse and to an elephant. Any animal faced with a given environment
will automatically perceive such potentials for movement or action based on apparent
physical properties and the animal’s own abilities.
In HCI, the concept is used in a slightly different fashion, to refer to aspects of a
visual interface that suggest potential actions to a user [47]. For example, an interface
element that is styled to look like a physical toggle button suggests to the user that
it can be pressed. The general model of visual structure in HCI, then, is that people
view an interface in terms of its perceived physical affordances, form predictions
about what actions they can take based on those affordances, and then derive a
mental model of the system by taking those actions and observing whether the
outcomes match their predictions.
Given the amount of research overlap between HCI and infovis, it is surprising that
visualization is rarely thought of in terms of what mental models a technique suggests
to a user. There are two likely reasons for this. The first is that HCI assumes that the
systems it deals with are interactive, so the ability of a user to predict the outcome
of her actions is an obvious consideration. Infovis, on the other hand, builds on a
history of static depictions of data; interactivity is a more recent development for the
field. Consequently, the ability of a user to perceive data accurately is the primary
consideration.
The other reason is the lack of well-defined tasks in infovis. Having a model of
how a system works is obviously necessary if you need to use it in pursuit of a
goal. Knowing what you can do and how to do it are prerequisites for solving a
problem. But in infovis, we don’t necessarily know what problem we’re trying to
solve. The tasks we feel visualization systems are meant to address are vague ones like
understanding a dataset, forming hypotheses, pattern recognition, and exploration.
These are important tasks, and the possibility of systems that can help perform them
is what excites people about visualization. But they are also tasks that lack a clear
end state. Perhaps this aspect of visualization makes structural properties of the
interface seem less important than in other domains.
However, a task without a clear goal can still benefit from structure, even if the
contribution of a user’s mental model seems less direct or obvious. Some of the ways
that visual structure can affect understanding and general reasoning are illuminated
by work in diagrammatic reasoning and visual cognition.
2.3 Visual Cognition of Diagrams
While the information processing approach has provided a way to apply perception
research to information visualization, it is less well-suited to understanding visualization
from the perspective of higher-level cognition; that is, not only how people
perceive information, but how they learn, reason with, and remember information.
This cognitive perspective forces us to consider the structural properties of visualization
and how they affect not only what information is extracted but how that
information is understood.
Theories that focus on reasoning with visual representations include Stenning and
Oberlander’s view of diagrams and language as logically equivalent yet supporting
different modes of inference [54]. That is, by making certain aspects of a problem
concrete through visual representation, diagrams such as Euler circles render
problem constraints explicit and thereby restrict potential inferences to a
smaller, valid subset. Similarly, Larkin and Simon [41] consider the differences between
graphical and verbal representations as differences in what information is made
salient and explicit. In a graphical representation, information is naturally organized
by location, while in a sentential representation it is organized sequentially. This
makes diagrams more useful for tasks such as solving geometry problems, and language
more useful for problems that require logical reasoning. The authors consider what effects
the structure of a representation has on understanding, although they focus on the
very broad differences between words and pictures rather than defining differences
among types of graphical structure.
The importance of such differences, however, is illuminated by the extensive body
of work by Tversky and colleagues on how people interpret information presented
in different visual representations. For example, the authors presented the same
simple two-point data as either a bar chart or a line graph and asked for users’
interpretations [63]. They found that those viewing a bar chart tended to describe the
chart as depicting two separate groups, while those viewing a line graph described
the data as a trend. This effect held even when the interpretations conflicted with
the labels on the data points. For example, a line graph showing the average height
of males versus females prompted one participant to describe the chart as saying
“The more male a person is, the taller he/she is.” These findings and others are
further discussed as examples of how schematic figures such as bars and lines are
interpreted in varying contexts [56]. Many of these figures have seemingly natural
interpretations; for example, lines between marks imply a relationship between the
represented objects, while contours are used for grouping objects. However, in many
cases context aids the interpretation of ambiguous primitive features such as blobs
and lines by fitting their relevant properties to task demands. Understanding the
cognitive basis for these primitive features and how they can be altered in context
would go a long way towards explaining how visualization works.
This work has a particularly direct application to infovis, but it also recalls a
broader area of visual cognition that looks at how people use diagrams as an external
representation to aid in reasoning. Gattis and Holyoak [28] argue that the power of
graphical representations goes beyond Larkin and Simon’s view that they merely allow
for more efficient information access in certain cases. Rather, they see diagrams and
graphs as having a special role in supporting reasoning by mapping conceptual relationships
to spatial ones, so that inferences about spatial properties can be extended
to inferences about the represented information. This view is supported by a number
of studies on diagrammatic reasoning, such as Bauer and Johnson-Laird’s finding [8]
that diagrams improve reasoning if they visually represent meaningful constraints in
a problem and Glenberg and Langston’s demonstration [30] that diagrams only improve
efficiency when their spatial mapping is conceptually meaningful. This work
taken together suggests that graphics can assist in problem solving, but only when
their spatial structure is meaningful in some way. The question of which structures
are meaningful and which are not, however, is not easily answered by existing work.
While this work suggests the importance of structure to the understanding of information
visualization, it offers no clear framework for discussing and analyzing that
structure. Although they intuitively seem to be talking about the same thing, researchers
from different fields and perspectives may refer to these structural properties as visual
framing, spatial layout, graph types, and so on. A common language and theory for
discussing the effects of structure is necessary to integrate it into visualization practice,
just as Bertin’s conception of retinal variables has provided a common language for
dealing with object properties. A promising source for this theory is visual metaphor.