(2011) The Visual Display Of Quantitative Information.pdf
Tufte's writing is important in such fields as information design and visual literacy, which deal with the visual communication of information. He coined the word chartjunk to refer to useless, non-informative, or information-obscuring elements of quantitative information displays. Tufte's other key concepts include what he calls the lie factor, the data-ink ratio, and the data density of a graphic.
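For reference, the three quantities named above are usually stated as simple ratios. The notation below paraphrases Tufte's definitions rather than quoting them:

```latex
\[
\text{lie factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}},
\qquad
\text{data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}},
\qquad
\text{data density} = \frac{\text{number of entries in data matrix}}{\text{area of data graphic}}
\]
```

A lie factor close to 1 indicates that the graphic represents the underlying numbers faithfully, and a data-ink ratio close to 1 indicates that little ink is spent on non-data elements.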
One method Tufte encourages for quick visual comparison of multiple series is the small multiple: a chart with many series on a single pair of axes can often be easier to read when redrawn as several separate pairs of axes placed next to each other. He suggests this is particularly helpful when the series are measured on quite different vertical (y-axis) scales but span the same range on the horizontal x-axis (usually time).
As digital humanists have adopted visualization tools in their work, they have borrowed methods developed for the graphical display of information in the natural and social sciences. These tools carry with them assumptions of knowledge as observer-independent and certain, rather than observer co-dependent and interpretative. This paper argues that we need a humanities approach to the graphical expression of interpretation. To begin, the concept of data as a given has to be rethought through a humanistic lens and characterized as capta, taken and constructed. Next, the forms for graphical expression of capta need to be more nuanced to show ambiguity and complexity. Finally, the use of a humanistic approach, rooted in a co-dependent relation between observer and experience, needs to be expressed according to graphics built from interpretative models. In summary: all data have to be understood as capta and the conventions created to express observer-independent models of knowledge need to be radically reworked to express humanistic interpretation.
Integration through methods. The quantitative and qualitative data were collected concurrently, and the approach to integration involved merging. With the content of the scales on the survey in mind, the mixed methods team developed the open-ended responses on the survey and interview questions for mini focus groups to parallel visual analog scale (VAS) questions about ethical advantages and disadvantages. By making this choice intentionally during the design, integration through merging would naturally follow. The research team conducted separate analyses of the quantitative and qualitative data in parallel. For the quantitative analytics, the team calculated descriptive statistics, mean scores, and standard deviations across the four stakeholder groups. Box plots of the data by group were developed to allow intra- and intergroup comparisons. For the qualitative analytics, the investigators immersed themselves in the qualitative database, developed a coding scheme, and conducted thematic searches using the codes. Since the items on the VASs and the questions on the qualitative interview guides were developed in tandem, the codes in the coding scheme were similarly developed based on the items on the scales and the interview questions. As additional themes emerged, codes to capture these were added. The methodological procedures facilitated thematic searches of the text database about perceived ethical advantages and disadvantages that could be matched and merged with the scaled data on beliefs about ethical advantages and disadvantages.
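The quantitative step described above (descriptive statistics per stakeholder group, plus the five-number summary that a box plot draws) can be sketched as follows. This is a minimal illustration only: the group names and VAS scores are hypothetical placeholders, not data from the study, and the quartile convention is the default of Python's `statistics.quantiles`, which may differ from the one the research team used.

```python
import statistics

# Hypothetical VAS scores (0-100) for four stakeholder groups.
# Names and values are invented for illustration; they are not study data.
vas_scores = {
    "group_a": [62, 70, 55, 81, 67],
    "group_b": [40, 52, 47, 58, 49],
    "group_c": [75, 88, 79, 91, 83],
    "group_d": [30, 45, 60, 38, 52],
}


def summarize(scores):
    """Return the mean, sample SD, and five-number summary for one group."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(scores),
    }


# One summary per group supports both intra- and intergroup comparison.
summaries = {group: summarize(scores) for group, scores in vas_scores.items()}

for group, stats in summaries.items():
    print(f"{group}: mean={stats['mean']:.1f} sd={stats['sd']:.1f} "
          f"median={stats['median']:.1f} IQR=({stats['q1']:.1f}, {stats['q3']:.1f})")
```

Keeping one summary record per group mirrors the merging design: each record can later be placed next to the qualitative themes coded for the same group.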
Background: Those working in healthcare today are challenged more than ever before to quickly and efficiently learn from data to improve their services and delivery of care. There is broad agreement that healthcare professionals working on the front lines benefit greatly from the visual display of data presented in time order.
A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables.
Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from:
Metagenomics is a relatively new branch of science and much of the current research is exploratory. Visualization has thus been a prominent aspect of the field, beginning with the analysis package MEGAN [1, 2]. Distilling metagenomic data into graphical representations, however, is not a trivial task. The foundation of most metagenomic studies is the assignment of observed nucleic acids to taxonomic or functional hierarchies. The various levels of granularity (e.g. ranks) inherent in these classifications pose a challenge for visualization. Node-link diagrams can be used to convey hierarchy, and bar or pie charts can relate abundances at specific levels, but neither of these methods alone creates a complete illustration of classificatory analysis. Furthermore, taxonomic and functional hierarchies are often too complex for all nodes to be shown, and wide variations in abundances can be difficult to represent. MEGAN addresses these problems by augmenting node-link diagrams with small, log-scaled quantitative charts at the nodes. This type of display is also used by the web-based metagenomic platform MG-RAST. The approach has the advantage that nodes are explicitly represented in the hierarchy, regardless of magnitude. Its drawback, though, is that its disparate quantitative charts and logarithmic scaling obfuscate relative differences in abundances. Another web-based platform, METAREP, features naturally scaled heatmaps of abundance, but only for specific ranks. Both MG-RAST and METAREP can also display the relative abundances of children for individual nodes while browsing their hierarchies. A common strength of MEGAN, MG-RAST, and METAREP is that they facilitate direct comparison of multiple datasets at each node, such as metagenomes sampled from different regions or under different conditions. It is important to note, however, that these comparisons will be of predictions, rather than true abundances.
A common criticism of radial space-filling (RSF) displays is the difficulty of comparing similarly sized nodes. To make comparisons easier, Krona sorts nodes by decreasing magnitude with respect to their siblings. In addition, the nodes can be colored using a novel algorithm that works with the sorting to visually emphasize both hierarchy and quantity. This algorithm, which is enabled by default, uses the hue-saturation-lightness (HSL) color model to allow procedural coloring that can adapt to different datasets. First, the hue spectrum is divided among the immediate children of the current root node. Each of these children in turn subdivides its hue range among its children using their magnitudes as weights. Coloring each sorted node by the minimum of its hue range causes recursive inheritance of node hue by the largest child of each generation. The result is visual consistency for lineages that are quantitatively skewed toward particular branches. To distinguish each generation without disrupting this consistency, the lightness aspect of the HSL model is increased with relative hierarchical depth, with saturation remaining constant.
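The coloring scheme described above can be sketched in a few lines. This is an illustrative reading of the description, not Krona's actual rendering code: the `Node` class, the function names, and the lightness parameters (`base_lightness`, `lightness_step`, `max_lightness`) are all hypothetical, and the sketch assumes the hue range is weighted by magnitude at every level, including the first.

```python
import colorsys
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; not a Krona data structure."""
    name: str
    magnitude: float
    children: list = field(default_factory=list)
    hue: float = 0.0        # fraction of the hue circle, 0.0 to 1.0
    lightness: float = 0.5


def assign_colors(node, hue_min=0.0, hue_max=1.0, depth=0,
                  base_lightness=0.5, lightness_step=0.08, max_lightness=0.9):
    """Color a node by the minimum of its hue range, then recurse."""
    node.hue = hue_min
    # Lightness grows with depth; saturation is held constant elsewhere.
    node.lightness = min(base_lightness + depth * lightness_step, max_lightness)

    # Sort siblings by decreasing magnitude so the largest child comes first
    # and therefore starts at (inherits) the parent's hue.
    node.children.sort(key=lambda child: child.magnitude, reverse=True)
    total = sum(child.magnitude for child in node.children)

    start = hue_min
    for child in node.children:
        share = child.magnitude / total if total > 0 else 0.0
        span = (hue_max - hue_min) * share
        assign_colors(child, start, start + span, depth + 1,
                      base_lightness, lightness_step, max_lightness)
        start += span


def to_rgb(node, saturation=0.6):
    """Convert a node's HSL color to an RGB triple of floats in 0.0 to 1.0."""
    # Note that colorsys orders the arguments hue, lightness, saturation.
    return colorsys.hls_to_rgb(node.hue, node.lightness, saturation)


# Small example tree with deliberately unsorted children.
a1, a2 = Node("A1", 40), Node("A2", 20)
a = Node("A", 60, [a2, a1])
b, c = Node("B", 30), Node("C", 10)
root = Node("root", 100, [c, a, b])
assign_colors(root)
```

In this example `A1`, the largest child of `A`, ends up with the same hue as `A` but a higher lightness, which is the lineage consistency the text describes.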
To visualize secondary attributes in addition to magnitude, individual nodes in Krona may be colored by variable. For categorical variables, users may define the color of every node in the XML. For quantitative variables, a gradient may be defined that will color each node by value. An example of this is shown in Figure 3, where each node is colored by a quantitative red-green gradient representing classification confidence.
Coloring by classification confidence. Human gut sample MH0072 from the MetaHIT project was classified using PhymmBL and displayed using Krona. Abundance can be simultaneously visualized with an accessory attribute by linking it to hue. In this example, hue is used to display classification confidence as reported by PhymmBL. The average confidence value for each node is colored from low (red) to high (green), distinguishing uncertain from certain classifications. An interactive version of this chart is available on the Krona website.
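A red-to-green gradient of this kind can be approximated by interpolating hue between 0 degrees (red) and 120 degrees (green). The sketch below is one plausible mapping, not Krona's implementation; the function name, the clamping behavior, and the default saturation and lightness are assumptions made for illustration.

```python
import colorsys


def confidence_to_rgb(value, low=0.0, high=1.0, saturation=1.0, lightness=0.5):
    """Map a value in [low, high] to an RGB triple from red (low) to green (high)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Normalize to 0.0-1.0 and clamp out-of-range values (an assumption).
    t = (value - low) / (high - low)
    t = max(0.0, min(1.0, t))
    # Hue 0 is red and hue 1/3 (120 degrees) is green on the HSL hue circle.
    hue = t * (120.0 / 360.0)
    # colorsys orders the arguments hue, lightness, saturation.
    return colorsys.hls_to_rgb(hue, lightness, saturation)


low_color = confidence_to_rgb(0.0)   # uncertain classification
high_color = confidence_to_rgb(1.0)  # confident classification
```

Interpolating in hue rather than directly in RGB passes through yellow at the midpoint, which keeps intermediate confidence values visually distinct from both ends.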
Additionally, metagenomic data are often generated at discrete points across multiple locations or times. Krona is able to store the data from multiple samples in a single document. Individual samples may then be stepped through, at any zoom level, using the navigation interface at the top left. For example, in Figure 2 Krona is displaying one of seven depth samples from the oceanic water column. Advancing through these samples moves to greater and greater depths. The transition between samples is animated using a polar "tween" effect, emphasizing the difference between samples. The result of this style of navigation is a series of moving pictures, where the taxa dynamically grow and shrink from sample to sample, in this case as sampling descends the water column. This approach is eye-catching for a few samples, but direct comparison between many samples simultaneously is difficult with radial charts. Analysis across many samples is better left to traditional heatmap and differential barchart visualizations.
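One simple way to produce such a polar tween is to interpolate each wedge's start and end angle between the two samples. The sketch below assumes linear interpolation and assumes that a taxon absent from one sample is drawn as a zero-width wedge there; both are illustrative choices, and the function name and data layout are hypothetical rather than taken from Krona.

```python
def tween_angles(before, after, t):
    """Interpolate wedge angles between two samples.

    before, after: dicts mapping a taxon name to (start_angle, end_angle) in radians.
    t: animation progress from 0.0 (first sample) to 1.0 (second sample).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0.0 and 1.0")
    frame = {}
    for name in sorted(before.keys() | after.keys()):
        if name in before and name in after:
            (s0, e0), (s1, e1) = before[name], after[name]
        elif name in before:
            # Taxon disappears: shrink to a zero-width wedge (an assumption).
            s0, e0 = before[name]
            s1 = e1 = s0
        else:
            # Taxon appears: grow from a zero-width wedge (an assumption).
            s1, e1 = after[name]
            s0 = e0 = s1
        frame[name] = (s0 + (s1 - s0) * t, e0 + (e1 - e0) * t)
    return frame


# Two hypothetical samples in which taxon A grows and taxon B shrinks.
shallow = {"A": (0.0, 2.0), "B": (2.0, 6.0)}
deep = {"A": (0.0, 4.0), "B": (4.0, 6.0)}
halfway = tween_angles(shallow, deep, 0.5)
```

Rendering `tween_angles` at a sequence of `t` values between 0 and 1 yields the intermediate frames in which wedges visibly grow and shrink.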