Information Visualization Organizer

Regardless of the total amount and complexity of data, most relationships between data can be categorized into three types: Comparison / Composition / Distribution & Linkage.

Comparison charts are needed when comparing values across categories or over time. When there are only a few items, such as five regions, a column chart works well.

More items - bar charts. When there are more entries, say more than 12, a column chart becomes crowded on a mobile device, and a horizontal bar chart is more appropriate. Keep it to roughly 30 data items or fewer; beyond that the chart becomes a visual and memory burden. Column charts also have many variants, for example stacked bar charts, waterfall charts, horizontal bar charts, and charts with positive and negative values on either side of the axis.

Seeing trends - line charts. Line charts are useful when the X-axis is continuous (e.g. time) and the focus is on trends.

Widening the difference - Nightingale rose chart. Because the area of a sector grows with the square of its radius, a Nightingale rose chart magnifies the differences between values, which makes it suitable for comparing values of similar size. Rose charts also suit periodic time concepts such as weeks and months. Again, no more than 30 data items are recommended; beyond that, consider a bar chart.
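The magnifying effect can be checked with a little arithmetic. The sketch below assumes that each value is mapped directly to the sector radius and that all sectors share the same angle; the two values are made up for illustration.

```python
import math

def sector_area(radius: float, n_sectors: int) -> float:
    """Area of one rose-chart sector whose radius encodes the value."""
    angle = 2 * math.pi / n_sectors  # every sector gets the same angle
    return 0.5 * radius ** 2 * angle

values = [10, 12]  # hypothetical values of similar size
areas = [sector_area(v, n_sectors=len(values)) for v in values]

value_ratio = values[1] / values[0]  # ratio of the raw values
area_ratio = areas[1] / areas[0]     # ratio of the drawn areas (value ratio squared)
print(f"value ratio {value_ratio:.2f}, area ratio {area_ratio:.2f}")
```

A 20% difference in value is drawn as a 44% difference in area, which is why the chart helps when values are close together and can mislead when they are not.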

When comparing data in two categories or even more dimensions, try a bi-directional bar chart. Use color to distinguish regions, and hollow versus solid bars to distinguish receipt and delivery volumes. This lets you compare regions as a whole and also compare them in detail.

Let's raise the difficulty. Add another dimension to the two-way chart: the table below compares the profits and the corresponding revenues and costs of 5 regions. Think about it for a moment before scrolling down to see the recommended chart.

A quick glance at the graph shows that the Shenzhen district is less profitable than the Guangzhou district, and even though its revenue is higher than the Guangzhou district, its costs are relatively higher than the Guangzhou district.

Achievement - bullet charts. Use a bullet chart to examine how far a metric such as revenue has been achieved and which range (excellent, good, poor) it falls into.

Bullet charts are so named because they resemble the trajectory of a fired bullet. Compared with a dashboard gauge, a bullet chart expresses a wealth of information in a small space and conveys it more efficiently.

If you also want to compare 4 quarters of revenue, just differentiate them with different colors. In the chart below, it is clear at a glance that the second quarter performed better, while the first quarter did not.

Performance - radar charts. Multi-dimensional performance data, such as composite ratings, are often shown as radar charts. They are familiar from games, and they are also common in business and finance, for example to show business conditions or financial health. They suit a known result expressed within a fixed framework.

An indicator whose score lies close to the center is in a poor state and should be analyzed and improved; an indicator whose score lies close to the outer edge is in an ideal state. For example, suppose corporate spending is divided into six categories: sales, marketing, R&D, customer service, technology, and management. Plotting budget against actual overhead on a radar chart makes the comparison across these dimensions clear, as in the following chart:

Above is the "comparison" category of common charts, summarized as follows:

Composition charts show how a whole is divided into parts, such as each of the five regions' share of shipments received, or the sources of the company's profits.

Single Level - Pie Chart

In the comparison section, a column chart was used to compare the volume of shipments received in the five regions. If you want to look at shares instead, a pie chart is more appropriate. Pie charts have a weakness: they are good at showing that one category takes up a large share, but poor at comparison. 30% and 35% are hard to tell apart on a pie chart with the naked eye. They also work badly when there are too many categories.

What if there were 17 regions? A pie chart should generally have no more than 9 categories; beyond that, a bar chart is recommended.

In addition to pie charts, donut (ring) charts can also show percentages. The difference is that the middle of the pie is hollowed out, and the hollow area can display text such as a headline, which uses space more efficiently.

Layering - donut charts, sunburst charts

Management needs to start with the big picture and the key points. For example, a regional head needs to see the key regions and key divisions at a glance (as shown below). How should they be shown?

This is called a sunburst chart. Drilling down layer by layer, it shows the key regions and the composition of their divisions at a glance.

Cumulative Trend - Stacked Area Chart

Consider the composition of values over time: how do you visualize the trend in revenue composition over the last four years for the first region (which contains four key regions)?

The recommended solution is a stacked area chart, which shows the contribution of each component (the key regions) to the total (the first region) as well as how the total evolves. Note that each region's area does not start at y=0; it is stacked on top of the areas below it, and together they form the whole.
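The stacking rule, where each layer starts where the previous one ended, can be sketched as follows. The region names and revenue figures are hypothetical, and `stack` is an illustrative helper rather than any charting library's API.

```python
# Hypothetical yearly revenue for four key regions over four years.
revenue = {
    "Region A": [10, 12, 15, 18],
    "Region B": [8, 9, 9, 11],
    "Region C": [5, 6, 8, 9],
    "Region D": [3, 3, 4, 6],
}

def stack(series: dict[str, list[float]]) -> dict[str, list[tuple[float, float]]]:
    """Return (baseline, top) for every point of every layer, stacked in dict order."""
    n_points = len(next(iter(series.values())))
    baseline = [0.0] * n_points  # only the first layer starts at y=0
    bands = {}
    for name, values in series.items():
        top = [b + v for b, v in zip(baseline, values)]
        bands[name] = list(zip(baseline, top))
        baseline = top  # the next layer sits on top of this one
    return bands

bands = stack(revenue)
totals = [top for _, top in bands["Region D"]]  # upper edge of the last layer
print(totals)
```

The upper edge of the last layer is the total, so the same picture shows both the parts and how the whole evolves.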

Guidelines for designing area charts well: put fluctuating categories at the top, use transparent colors, don't use more than 4 categories, start the y-axis at 0, and don't use area charts for discrete data, because only continuous data has intermediate values.

Cumulative Comparisons - Stacked Bar Charts

If you swap the label text (i.e., year) and the legend (i.e., region) on the x-axis of the above chart (Figure A, below) and use it to look at the composition of income in each region over the last four years, which would be a more appropriate chart to use?

Stacked Area Chart A and Stacked Bar Chart B both show cumulative values. The difference is that the x-axis of a stacked area chart is continuous data (e.g., time) and the x-axis of a stacked bar chart is categorical data. In this case, the x-axis is non-continuous categorical data, so scheme B is more appropriate.

Cumulative Increase/Decrease - Waterfall Chart

A waterfall chart can be used to express how a quantity evolves between two data points: a starting value goes through a series of additions and subtractions and ends at a final value. A waterfall chart plots this process and is often used to show income and expenditure in financial analysis.
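The data behind a waterfall chart is just a running total. As a minimal sketch with invented figures, each floating bar spans from the running total before a change to the running total after it:

```python
def waterfall_steps(start: float, changes: list[tuple[str, float]]):
    """Return ([(label, bar_bottom, bar_top), ...], final_value)."""
    steps = []
    running = start
    for label, delta in changes:
        new_total = running + delta
        # A floating bar covers the range between the old and new running totals.
        steps.append((label, min(running, new_total), max(running, new_total)))
        running = new_total
    return steps, running

# Hypothetical income and expenditure items.
changes = [("Sales", 50), ("Refunds", -10), ("Salaries", -25), ("Interest", 5)]
steps, end_value = waterfall_steps(100, changes)

for label, bottom, top in steps:
    print(f"{label:<9} bar from {bottom} to {top}")
print("final value:", end_value)
```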

Distribution & linkage charts allow you to see the distribution of data and find certain links, such as correlations, outliers, and clusters.

Two Variables - Scatterplot

Still using the same business example, the chart below shows the distribution of per-ticket cost and per-ticket revenue for outlets across the national network.

On its own this may not tell you much, but that changes once you add two average lines.

With the averages added, you can tell which outlets are above average and which are below. But there are too many outlets to click through one by one to find out which region each belongs to, so coloring the points by region is very useful.

With this chart, you can see which regions have low per-ticket margins and urgently need improvement, such as the fourth region, whose points cluster in the lower right corner, where per-ticket revenue is below average and per-ticket cost is above average.
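The two average lines split the scatterplot into four quadrants. The sketch below assumes cost is on the X-axis and revenue on the Y-axis, matching the "lower right" description above; the outlets and figures are invented.

```python
from statistics import mean

# Hypothetical outlets: (name, region, cost per ticket, revenue per ticket)
outlets = [
    ("Outlet 1", "Region 1", 4.0, 9.0),
    ("Outlet 2", "Region 2", 5.0, 8.0),
    ("Outlet 3", "Region 4", 7.0, 6.0),
    ("Outlet 4", "Region 4", 8.0, 5.0),
]

avg_cost = mean(cost for _, _, cost, _ in outlets)
avg_revenue = mean(revenue for _, _, _, revenue in outlets)

def quadrant(cost: float, revenue: float) -> str:
    """Locate a point relative to the two average lines."""
    horizontal = "right" if cost > avg_cost else "left"
    vertical = "upper" if revenue > avg_revenue else "lower"
    return f"{vertical} {horizontal}"

# Lower right means above-average cost and below-average revenue.
needs_attention = [
    (name, region)
    for name, region, cost, revenue in outlets
    if quadrant(cost, revenue) == "lower right"
]
print(needs_attention)
```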

Three Variables - The Bubble Chart

As we all know, an outlet's total profit depends not only on per-ticket profit but also on volume (i.e., the number of pieces received). When that volume is encoded as the area of each point, the scatterplot becomes a bubble chart.

Everything that has to do with spatial attributes can be analyzed using a geographic map. For example, sales by region, or density of stores in a commercial area. Bubble charts combined with maps can evolve into heat maps. With a heat map, you can see which outlets are receiving and delivering more shipments and need to deploy resources.

Geographic maps require a coordinate dimension. This can be latitude and longitude, or place names (Shanghai, Beijing). The granularity can be as fine as a specific street or as coarse as all the countries in the world. POIs are a very important element. POI is usually expanded as "Point of Interest" (it is also glossed as "Point of Information"), and each POI contains four kinds of information: name, category, latitude and longitude, and nearby hotels, restaurants, stores, and so on. With POIs, data can be presented along geographic dimensions.

Optimal design guidelines: use thin map contour lines; choose the right color scheme; use fill patterns sparingly; and choose the right data intervals.

User behavior analytics: actions such as browsing, clicking, and visiting a page can be presented as a heat-map overlay. The figure below shows a user's clicking behavior on Google search results.

Summary: once we have the data, we first extract the key information and clarify the data relationships and themes, and then choose the right charts to visualize them.

A good visualization tells a story, revealing the patterns behind the data. Many people's first idea of how to choose a visualization probably comes from the chart below. It is clearly structured, but it only covers Excel charts and is not rich enough.

Dimensions are often mentioned in data analysis. A dimension is the perspective from which data is viewed and described. We can say that region is a dimension, and this dimension includes the cities Shanghai and Beijing. We can also think of sales as a dimension that contains all kinds of sales figures. Dimensions can be expressed as time, numerical values, or text, and text is often used as a category. The essence of data analysis is the combination of various dimensions.

Dimensions mainly come in three data types: text, time, and numerical values. Regions such as Shanghai and Beijing form a text dimension (also called a category dimension), sales volume is a numerical dimension, and time is a time dimension.

Numerical dimensions can be derived from other dimensions: for example, using the region dimension, count how many records are in Shanghai and how many are in Beijing. Dimensions can also be converted. For example, age is originally a numerical dimension, but dividing it into three age groups (child, youth, old age) converts it into a text dimension.
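Both conversions are easy to sketch in code. The records below are invented, and the age cut-offs of 18 and 60 are assumptions chosen only for illustration, since the text does not specify them.

```python
from collections import Counter

# Hypothetical records with a text dimension (region) and a numerical one (age).
records = [
    {"region": "Shanghai", "age": 8},
    {"region": "Beijing", "age": 25},
    {"region": "Shanghai", "age": 67},
    {"region": "Shanghai", "age": 31},
]

# Text dimension -> numerical dimension: count the records in each region.
counts_by_region = Counter(r["region"] for r in records)

def age_group(age: int) -> str:
    """Numerical dimension -> text dimension (cut-offs 18 and 60 are assumed)."""
    if age < 18:
        return "child"
    if age < 60:
        return "youth"
    return "old age"

groups = [age_group(r["age"]) for r in records]
print(dict(counts_by_region), groups)
```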

1. Box plot

Box plots (box-and-whisker plots) are often poorly understood, but they accurately reflect the spread of a data set (maximum, minimum, median, quartiles). Any data set whose dispersion you want to show is suitable for a box plot.

The figure below is a typical application of a box plot. The upper and lower ends of the line represent the maximum and minimum values of a data set. The top and bottom of the box represent the upper quartile (75th percentile) and the lower quartile (25th percentile). The horizontal line in the middle of the box represents the median.
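The five numbers a box plot draws can be computed with the standard library. This is a sketch with made-up data; it follows the description above in placing the whisker ends at the minimum and maximum, and it relies on the default quartile method of `statistics.quantiles`, which can give slightly different quartiles from other tools.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(data)
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, median, q3 = quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]

delivery_hours = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # hypothetical sample
low, q1, median, q3, high = five_number_summary(delivery_hours)
print(low, q1, median, q3, high)
```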

2. Relationship diagrams

Relationship diagrams show how things are connected, such as social chains, brand communication, or some kind of information flow.

If you have a tweet and want to look at its chain of communication (which Vloggers shared it, who shared it before them, and so on), you can draw a diffusion network to analyze the viral marketing process. A relationship graph relies on a large amount of data, and it has no concept of dimensions.

3. Treemap

As mentioned above, column charts are not suitable for data with too many categories (e.g. hundreds). The treemap was created for this case: it represents values by area and categories by color.

In the figure below, each color family represents a category dimension, with multiple secondary categories under it. As a bar chart this would be a disaster; as a treemap it is a breeze.

E-commerce, product sales, and other analyses involving a large number of categories can use treemaps.

4. Sankey diagram

A relatively uncommon chart, often used to show how something changes state and how information flows.

5. Funnel Charts

The well-known conversion-rate visualization. It applies to conversion analysis in a fixed process, and you can think of it as a simplified Sankey diagram. Conversion rates can also be expressed simply as a set of numbers, not necessarily as a funnel chart.
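As a minimal sketch of the "set of numbers" alternative, the conversion rates of a fixed process can be computed stage by stage. The stage names and counts below are hypothetical.

```python
# Hypothetical fixed process: (stage name, number of users reaching it).
stages = [("Visited", 1000), ("Added to cart", 400), ("Ordered", 200), ("Paid", 150)]

def conversion_rates(stages):
    """Return (stage, rate from previous stage, rate from first stage) per stage."""
    first_count = stages[0][1]
    rates = []
    for (_, previous_count), (name, count) in zip(stages, stages[1:]):
        rates.append((name, count / previous_count, count / first_count))
    return rates

for name, step_rate, overall_rate in conversion_rates(stages):
    print(f"{name:<14} step {step_rate:.0%}  overall {overall_rate:.0%}")
```

The step rate shows where the process loses the most users, while the overall rate is what the narrowing shape of a funnel chart depicts.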

Readability

The primary function of a chart is to explain, not to decorate, and many charts fall into the trap of over-design.

Objectivity

Data can be interpreted in many ways because each person brings a different perspective and viewpoint. This is why we often say that statistics can lie.

Below is a bar chart of sales, which doesn't look as if sales have changed much.

Switch to another chart, and a growth trend becomes visible.

In fact, the two charts show the same data. The only difference is in the axes: the first chart's Y axis starts at 0, while the second starts at 2.45. The second is a truncated bar chart.
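How much a truncated axis exaggerates a difference can be quantified. In the sketch below, the axis start of 2.45 comes from the example above, while the two sales values are hypothetical.

```python
def visual_ratio(smaller: float, larger: float, axis_start: float) -> float:
    """Ratio of the two drawn bar heights when the Y axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

sales_a, sales_b = 2.50, 2.60  # hypothetical sales figures

honest = visual_ratio(sales_a, sales_b, axis_start=0)        # about 1.04
truncated = visual_ratio(sales_a, sales_b, axis_start=2.45)  # about 3
print(f"axis from 0: {honest:.2f}x, axis from 2.45: {truncated:.2f}x")
```

A 4% increase is drawn as a bar roughly three times as tall, which is why the second chart seems to show strong growth.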

Uniformity

If the overall color of the chart is cool, don't add warm colors.

If the chart text is set in Microsoft YaHei, don't mix in Song (SimSun).

If one region's data is compared with a bar chart, the other regions should follow the same bar chart style.

If one chart uses red for women and blue for men, that convention should be followed in all charts. The same applies to design elements other than color.

If there are multiple charts, chart elements should be standardized, such as title, axis scale, and axis position.

Why do users need to "turn data into charts"?

The ultimate answer must return to the first principle of business management: increase revenue and reduce costs. Businesses need data to analyze how to save more money and how to make more money. In the future, BI cannot position itself as a "tool", but rather as a "service".

1. In terms of process, an exploratory visualization looks like this:

This type of visualization focuses on the fine-grained features of a chart, such as reference lines, alerts, various chart types, and so on.

2. Interpretive Visualization Requirements

These generally serve storytelling scenarios, after you have finished exploring your data and developed some insights. The "one diagram to understand XXX" pieces you see on the web are interpretive visualizations.

This category focuses on the overall presentation, such as combining multiple charts into a report or storyboard, so it provides features like a title editor, a layout editor, and so on. BI products on the market today, like NetEase, BDP, Tableau, and PowerBI, are all based on this model.

1. This business-oriented product framework is not well suited to the domestic market.

Because these products are basically aimed at professional users (data analysts), they ignore the fact that most Chinese companies don't have dedicated data analysis positions. The companies that can afford to have data analysts are generally medium to large enterprises, and they may have more ability to pay, but that also means fewer users.

Professional users are data analysts, while semi-professional users are people in finance, sales, HR, and similar roles, who are expert in their business but not in data analysis. Their daily work generally centers on explanatory visualization: year-end summaries, annual planning, and monthly reports all need data visualization. The process for this type of user looks like this:


Users can import data and generate charts without much complexity. Problems:

Understanding of visualization: information visualization is the correct graphical representation of complex information and logical relationships, in order to:

• Attract readers through the distinctive beauty and interest of pictures

• Make content easier to understand by optimizing its presentation

• Narrow the distance between the reader and the product, and enhance brand awareness

Work 1: Security products home page display

Inspiration: the requirements document showed that the sub-products had names such as Imperial Guards, Bagua Formation, and Imperial City River, which I found very interesting. A picture of an ancient city immediately came to mind, surrounded by soldiers, a Bagua formation, the imperial river, and so on. After I shared the idea with the visual designer, we hit it off and eventually produced this design. The building in the middle of the city was red at first, which was a little too eye-catching; to keep it from dominating the scene and to convey a feeling of data protection, it was changed to this semi-transparent, data-like virtual look.

Work 2: Product Structure

Inspiration: competitive analysis showed that our counterparts at home and abroad work very hard at this, so we also strove to describe the product structure and relationships clearly in a single diagram. The next article will cover the specific design process.

Work 3: Usage flow diagram

Inspiration: the diagram the product manager provided was very rigorous but hard for users to understand, so it was first simplified into a one-way flowchart as a wireframe, but that was not attractive or intuitive enough. The visual designer solved this by embellishing the graphic.

Modified (partial):

Improved:

Work 4: Schematic description of the program

Here too, the information logic was sorted out first and expressed in a more comprehensible way, and then the visual designer embellished it.

Improved diagram:

Getting something right starts with knowing what the criteria for getting it right are. Putting these failures together, one can roughly conclude what the reasons for failure are, and what the criteria for good are.


In terms of form, infographics as a visual tool include six categories: charts, diagrams, graphs, tables, maps, and lists.

By form, we usually divide them into five types: relational flow charts, narrative illustrations, tree structure diagrams, time-based diagrams, and spatial structure diagrams.

1. Relational flow charts

2. Narrative illustration charts

Narrative charts emphasize the time dimension: the information keeps changing as time passes.

3. Tree structure diagrams

Complicated data is expressed clearly by combing it into branches. The data is grouped, and each group is categorized again under the main frame to show the master-slave structure.

4. Time-based diagrams

A time-based diagram takes a timeline as its center and attaches text and data to it. From a design point of view, working the theme into the graphic and selecting the important events to explain makes the picture attractive and deepens understanding.

5. Spatial structure diagrams

The purpose of a spatial structure diagram is to use design language to model and virtualize a complex structure.

This process needs to be done collaboratively. Data must be screened and organized: accuracy is the first requirement, followed by sorting. Then identify the main logic, filter the secondary content, and create a well-crafted design.

1. Creative basic graphics

Column charts and pie charts are the two most commonly used basic graphics, but simple geometric forms rarely feel designed. Giving the basic graphics a creative treatment that highlights the design theme achieves more with less.

In the picture above, the content on the left and the right is exactly the same, but the version on the right can be understood even by a reader who does not look closely.

2. Highly engaging and visually appealing

From traditional web pages to social microblogging, users skim information faster and faster, so the ability to engage them is the most valuable asset.

3. Simplicity and clarity

4. Symbolism

In design, keep the style consistent so that the result is visually coherent and pleasing to the eye.

1. Pie charts in the wrong order

Pie charts are a very simple visualization tool, but they are often made overly complex. Slices should be sorted visually, and there should be no more than 5 of them. Either of the following two sorting methods lets your readers quickly grasp the most important information.

Method 1: Place the largest slice at 12 o'clock going clockwise, the second largest at 12 o'clock going counterclockwise, and the remaining slices after it, continuing counterclockwise.

Method 2: Place the largest slice at 12 o'clock and arrange the rest clockwise in descending order.
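The two orderings can be sketched as a small helper. The category names and shares are hypothetical, and Method 1 is read here as: the largest slice runs clockwise from 12 o'clock, and the remaining slices run counterclockwise from 12 o'clock in descending order. That reading is an interpretation, not something the text spells out.

```python
def order_clockwise(shares: dict[str, float], method: int = 2) -> list[str]:
    """Return slice labels in clockwise drawing order, starting at 12 o'clock."""
    ranked = sorted(shares, key=shares.get, reverse=True)
    if method == 2:
        # Method 2: simply descending, clockwise.
        return ranked
    # Method 1: the largest slice first; the others descend counterclockwise
    # from 12 o'clock, so read clockwise they appear in reverse order.
    return ranked[:1] + ranked[1:][::-1]

shares = {"C": 20, "A": 40, "D": 15, "B": 25}  # hypothetical percentages
print(order_clockwise(shares, method=1))
print(order_clockwise(shares, method=2))
```

With Method 1 the two largest slices meet at the top of the pie, where readers tend to look first.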

2. Use dotted lines in line charts

Dotted lines can be distracting; solid lines in well-chosen colors are easier to tell apart.

3. Data not ordered intuitively

Your content should guide the reader through the data logically and visually. Sort categories alphabetically, by frequency, or by numeric size.

4. Obscured data

Make sure that data is not lost or covered up by the design. For example, use transparency in area charts so that users can see all of the data.

5. Making the reader work harder

Make the data easier to understand by using auxiliary graphical elements, such as a trend line added to a scatterplot.

6. Presenting the data inaccurately


Make sure that every representation is accurate. For example, the size of each bubble in a bubble chart should be proportional to its value, and labels should not be placed arbitrarily.
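A common way to get bubble sizes wrong is to scale the radius linearly with the value, which makes the area grow with the square of the value. A minimal sketch of area-proportional sizing follows; the function name and the maximum radius of 40 are assumptions for illustration.

```python
import math

def bubble_radius(value: float, max_value: float, max_radius: float = 40.0) -> float:
    """Radius that makes the bubble's area proportional to value."""
    # Area grows with radius squared, so the radius must grow with sqrt(value).
    return max_radius * math.sqrt(value / max_value)

def area(radius: float) -> float:
    return math.pi * radius ** 2

big = bubble_radius(100, max_value=100)   # 40.0
small = bubble_radius(25, max_value=100)  # 20.0: half the radius, a quarter of the area
print(big, small, area(big) / area(small))
```

A value four times larger gets a bubble with four times the area, not four times the radius.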

7. Use different colors in heatmaps

Some colors stand out more than others, giving parts of the data unwarranted weight. Instead, use a single color and express the values through its shades.
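A single-hue scale can be sketched by blending from white to one base color. The base color below is an arbitrary dark blue, and the sketch assumes `vmax` is greater than `vmin`.

```python
def shade(value: float, vmin: float, vmax: float, hue_rgb=(8, 81, 156)) -> str:
    """Map a value to a hex color between white (vmin) and hue_rgb (vmax)."""
    t = (value - vmin) / (vmax - vmin)  # position within the data range
    t = min(1.0, max(0.0, t))           # clamp values outside the range
    r, g, b = (round(255 + (channel - 255) * t) for channel in hue_rgb)
    return f"#{r:02x}{g:02x}{b:02x}"

for v in (0, 50, 100):  # hypothetical cell values
    print(v, shade(v, vmin=0, vmax=100))
```

Because only the lightness changes, no cell stands out for reasons unrelated to its value.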

8. Columns are too wide or too narrow

It's a good idea to set the spacing between columns to 1/2 of the column width.

9. Difficulty in comparing the data

Comparison is an effective way to present differences, but it is much less effective when your readers can't compare easily. Presenting the data in a consistent way lets your readers compare.

10. Use three-dimensional diagrams

While these diagrams may look exciting, 3D diagrams tend to distort perception and obscure the data; stick to 2D.

The essence of numerical visualization is to use a variety of visual attributes to express the magnitude of a data value. There are several types of visual attributes: position, length, area, and color. These correspond to points, lines, areas, and color values in visual design.

The core idea of this kind of visualization is to relate the number, based on context, to real-world things that carry such values, in a figurative way.

If someone runs at 15 km/h, a drawing of a running athlete can represent that number. At 70 km/h, you can draw a running cheetah with a blurred background to express how fast it is. To describe a mountain 5 km high, you can draw a mountain towering into the sky to give a visual impression of height. More creative designs can be developed through imagination.

The speed of a moving car can be divided into slow, medium, and speeding, as shown in the figure on the left below. When expressing evaluative information, you need to build associations from the context. For example, for 50 millimeters of precipitation, we might imagine a test tube holding 50 millimeters of water.

A one-dimensional table, as shown below, has a single row or column of data. We need to analyze the goal of the visualization; by goal, the data falls into the following categories:

• Data that emphasizes absolute values;

• Data that emphasizes trends;

• Percentage data;

• Data of different types.

3.1.1 Bar Charts

An income of $10,000 is twice an income of $5,000, and a GDP of one trillion is twice a GDP of five hundred billion; this type of data is called ratio data. Readers of bar charts are generally drawn to the bars themselves and pay no attention to the starting point of the vertical axis; they assume by default that the length of a bar represents the absolute value. So the vertical axis of a bar chart must start at zero.

3.1.2 Histograms

The essential difference between a histogram and a bar chart is that a histogram expresses the distribution of quantities over consecutive intervals. In statistics, the vertical axis of a histogram must be count data; that is, a histogram counts the number of objects in each interval.
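Counting objects per interval is all a histogram needs. The sketch below uses equal-width, left-closed intervals and invented ages; the helper name and bin settings are assumptions.

```python
def histogram(values, start: float, width: float, n_bins: int) -> list[int]:
    """Count how many values fall into each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts

ages = [3, 17, 18, 25, 34, 41, 59, 60, 72]  # hypothetical sample
counts = histogram(ages, start=0, width=20, n_bins=4)

for i, count in enumerate(counts):
    print(f"[{i * 20}, {(i + 1) * 20}): {'#' * count} {count}")
```

The intervals are consecutive and cover a continuous range, which is what separates a histogram from a bar chart over unrelated categories.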

3.1.3 Bar Chart Variation: Horizontal Bar Chart

Horizontal bar charts also have a great typographic advantage: text and bars can be displayed side by side, and explanations can be appended to categories. In China, unless layout calls for it, use horizontal bars with caution.

3.1.4 Bar chart variant: counting bars

3.1.5 Bar chart variant: radial bar, spiral

Bars are bent and twisted to fit a layout area or to make the graphic more interesting.

3.1.6 Bar chart variant: replacing the bars with pictograms

In graphic design, posters, and promotional pages, pictograms are often used to make data more vivid. The basic idea is to build associations around the subject of the data and replace the bars with a figurative object.

Example 1: If the content is about soccer, replace the bars with images of a soccer ball.

Example 2: If the content is star-related, replace the bars with images of a star.

Example 3: If the content is about differences between men and women, use images of men and women instead of bars.

Example 4: If the data is about smoking, bars shaped like cigarette butts fit exactly.

Example 5: If the data is the height of mountains, use the shape of a mountain.

3.1.7 Bar chart variant: redesigning around a particular dimension

In the previous section, replacing bars with figurative objects still stayed within the framework of the bar chart. Very often, though, it is possible to leave the bar chart behind and develop associations from keywords. In doing so, just remember the essence of data visualization mentioned in the first chapter: represent the size of the data through four visual elements, namely position, length, area, and color.

Example 2 : PM2.5 values for cities and provinces (hypothetical data)

PM2.5 is an abstract concept that is hard to picture, so the visualization is unlikely to be built around PM2.5 itself. This data can therefore only be developed from the keyword "location" and presented as a map.

A province is itself an area of fixed shape and size on the map, so the values can be represented by color, as a heat map (below, left).

Example 3: Visits by website

Example 4: Migration map

The underlying data for a single city's migration map is still a one-dimensional array. When the design is developed with the map as a dimension, what needs to be expressed is each city's connection to Beijing. The length of a line is already determined by the distance from the city to Beijing, so only the color of the line can represent the value.

3.2 Emphasizing Trend Data

3.2.2 Variation of the Line Chart: Curve Chart

3.2.3 Variation of the Line Chart: SMA Chart

3.2.4 Variation of the Line Chart: Area Chart

3.2.5 Variation of the Line Chart: Stock Index Chart

In general, percentage data is expressed using a pie chart (or a ring chart), which is the most conventional.

The difference between a ring chart and a pie chart is that a ring chart allows for a better integration of the topic with the chart.

3.3.2 Pie Chart Variation: Replacing the pie with a figurative shape

Example 1: If the data describes the components of the human body, the visualization can be built around the human figure, turning the pie into the shape of a person.

Example 2: To describe the percentage of people working in various industries, consider drawing 100 people and styling each industry's share differently, as in the figure below left. The same approach applies when describing the sources of guns used in various shootings, as in the figure below right.

STEP1: Get the idea right

"Correct" is the most basic requirement for an infographic, so the first thing to do is make sure its content is correct.

For products whose business is complex and hard to understand, let the product manager first draw a diagram according to their own understanding; the designer and product manager then talk it through to confirm that they understand it the same way.

"Taobao technology in this decade" puts it well: "a good architecture diagram is full of beauty". Taobao's engineers demonstrated this over ten years. And it is not just technical architecture diagrams: good flowcharts, structure diagrams, and infographics are all full of beauty.

How do you optimize the presentation of an infographic? If it's a complex diagram with complex logic, you can do something like this:

Although there is no error in the logic, the arrows cross, and it doesn't look good.