Getting Started

How to load & format data

The data for each Swarm is provided in a CSV (comma-separated values) File.

You can import as many separate CSV Files into separate Flow Swarms as it takes to present your data story.

The CSV File is typically maintained using Excel or Google Sheets, and the CSV File may also be exported from various databases or data analysis environments.  Each row within the CSV File consists of a single data value (Dot) plus other data fields that specify items such as:

  • Location information within the 3-D space
  • Values used as search criteria in data filters
  • Optional parameters that control display items such as color, the size of the Dots, and other settable parameters
  • Special computations, such as Mean (Average), Standard Deviation, Z-Score, etc., that have been computed using Excel, Google Sheets, or other software and that may be represented in Charts within a Flow

The first row of the CSV File must always contain the column names for the data contained within the CSV File.  Although "newline" characters and other special characters can appear in the column names, a Flow Best Practice is to avoid using these types of characters in column names.

The data within each CSV File should conform to the following conventions:

 

First row contains the column titles

These column titles may appear in various dropdown menus within the Flow, so it is best to keep the lengths of the column titles fairly short.
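The header conventions above can be checked before a file is imported. The following is a minimal sketch in Python using only the standard library. It is not part of Flow: the function name check_column_titles and the 20-character threshold are assumptions made for illustration, since Flow does not state a specific length limit.

```python
import csv
import io

# Hypothetical threshold; the source only says titles should be "fairly short".
MAX_TITLE_LENGTH = 20


def check_column_titles(csv_text, max_length=MAX_TITLE_LENGTH):
    """Return a list of warnings about the first (header) row of a CSV document."""
    # newline="" keeps line breaks inside quoted titles intact for the csv module.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader, None)
    if header is None:
        return ["file is empty: no header row found"]

    warnings = []
    for index, title in enumerate(header, start=1):
        # Newlines and tabs are legal in a quoted title but are best avoided.
        if any(ch in "\r\n\t" for ch in title):
            warnings.append(f"column {index}: title contains a newline or tab")
        # Long titles are harder to read in dropdown menus.
        if len(title) > max_length:
            warnings.append(
                f"column {index}: title is longer than {max_length} characters"
            )
    return warnings
```

To check a file on disk, read it with open(path, newline="", encoding="utf-8") and pass the text to the function. An empty result means no warnings were found under these assumed rules.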

 

One data value per line

For a given Swarm, the CSV File can contain any number of supporting data fields, but each line must contain exactly one data "value" in any column that either provides data values to be displayed in a Chart or provides the members of an axis within the 3-D space.

For example, in the "Humanity" Featured Flow, if a Chart is to show Life Expectancy as a function of Year and Country, the CSV File can contain any number of fields, such as Year, Country, Continent, Rainfall Amount, Average Temperature, etc., and must also contain a single instance of the value (Life Expectancy) that is to be displayed in a Chart, and a single instance of each data element that defines the axes for a Chart (Year and Country).  Fields other than Year, Country, and Life Expectancy can be used to filter records that appear in the Chart.

The Flow team is adept at transforming CSV Files that contain multiple "values" per line into CSV Files that conform to the rules noted above, so please let us know if you need any assistance in transforming your CSV File.  A classic example of a CSV File that needs to be "unpivoted" is one that contains a category (such as Country) and a set of values (such as Life Expectancy) for multiple years within a single record.
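For readers who prefer to unpivot a file themselves, here is a minimal sketch in Python using only the standard library. It is an illustration rather than the Flow team's own method. The function name unpivot and its parameters are hypothetical, and the sketch assumes that every column not listed as an identifying column holds one value per year.

```python
import csv
import io


def unpivot(wide_csv_text, id_columns, name_column, value_column):
    """Turn one wide row per category into one row per data value."""
    reader = csv.DictReader(io.StringIO(wide_csv_text, newline=""))
    fieldnames = reader.fieldnames or []
    # Assumption: every column that is not an id column holds values (e.g. one per year).
    value_columns = [c for c in fieldnames if c not in id_columns]

    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(id_columns) + [name_column, value_column])

    for row in reader:
        for column in value_columns:
            value = row.get(column)
            if value is None or value == "":
                continue  # skip missing cells instead of emitting empty values
            # The old column title (e.g. "1950") becomes a field in the new row.
            writer.writerow([row[c] for c in id_columns] + [column, value])
    return out.getvalue()


wide = (
    "Continent,Country,1950,1951\n"
    "Africa,Egypt,38.1,38.8\n"
    "Africa,Kenya,45.2,45.4\n"
)
long_form = unpivot(wide, ["Continent", "Country"], "Year", "Life Expectancy")
print(long_form)
```

Each input record with two year columns becomes two output lines, each holding a single Life Expectancy value along with its Continent, Country, and Year.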

 

Sorting of records within the CSV File

For some types of Charts, such as line charts, the Dots are connected by default in the order in which their rows appear in the input CSV File: each line runs from the data value provided in one CSV row to the data value provided in the next CSV row.  When using this default way of connecting the Dots, it is important to sort the CSV File appropriately before importing it into the Swarm.  This restriction may be modified or eliminated in the future, but it is the convention currently in effect.

Here is a simplified example of the ordering for the time-series data for the "Humanity" Featured Flow.  The data value of interest is Life Expectancy, and the other data fields provide information for placing the data value into the 3-D space, sequencing the Dots that will be connected by straight lines in line charts, and other supporting purposes.

 

Continent   Country   Year   Life Expectancy   Other Columns
Africa      Egypt     1950   38.1              . . .
Africa      Egypt     1951   38.8              . . .
. . .                                          . . .
Africa      Kenya     1950   45.2              . . .
Africa      Kenya     1951   45.4              . . .
. . .                                          . . .
Americas    Brazil    1950   51.8              . . .
Americas    Brazil    1951   52                . . .

Table 1 - Segments of a CSV file that are appropriately sorted for display in a line chart
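The ordering shown in Table 1 can be produced outside of a spreadsheet as well. The following is a minimal sketch in Python using only the standard library. The function name sort_for_line_chart and its parameters are hypothetical, and the sketch assumes that the sequencing column (Year here) is numeric.

```python
import csv
import io


def sort_for_line_chart(csv_text, group_columns, sequence_column):
    """Sort rows by the grouping columns, then by the numeric sequence column."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    fieldnames = reader.fieldnames or []
    rows = list(reader)

    # Group values compare as text; the sequence value compares as a number
    # (assumption: the sequence column, e.g. Year, is always numeric).
    rows.sort(
        key=lambda row: tuple(row[c] for c in group_columns)
        + (float(row[sequence_column]),)
    )

    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


unsorted_csv = (
    "Continent,Country,Year,Life Expectancy\n"
    "Americas,Brazil,1951,52\n"
    "Africa,Kenya,1950,45.2\n"
    "Africa,Egypt,1951,38.8\n"
    "Africa,Egypt,1950,38.1\n"
)
sorted_csv = sort_for_line_chart(unsorted_csv, ["Continent", "Country"], "Year")
print(sorted_csv)
```

Sorting by Continent and Country first keeps all rows for one line together, and sorting by Year within each group puts the Dots in the order in which they should be connected.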

 

Errors within a CSV File

If you have structural data errors within a CSV File, the issue will probably be seen in the Charts that you generate within Flow.  You may resolve errors within the CSV File at any time, and you may easily reload a CSV File.  Editing of data values within Flow is supported in some contexts, but at this time, you should assume that the maintenance of data values is usually performed within the CSV Files.

 

Formatting Specifications

For more information, see the formatting specifications for Dates and Times and for Geospatial data.

 
