Data

Manage your data on Columns.

Columns stores data only for the "Upload" type. All other data types are live connections, so you see the most recent data at analysis time.

Some sources

Google spreadsheet#

This is a very straightforward use case: paste a spreadsheet ID or the full URL to create a new data connection.

Connect Google Spreadsheet
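Since either a bare ID or a full URL is accepted, it can help to see how the two relate. The sketch below is not how Columns itself parses the input (that is not documented here). It only relies on the standard Google Sheets URL layout, where the ID is the path segment after "/d/". The function name spreadsheet_id is a hypothetical helper.

```python
import re

# Google Sheets URLs carry the spreadsheet ID in the path segment after "/d/".
_SHEET_URL = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def spreadsheet_id(value: str) -> str:
    """Return the spreadsheet ID from either a bare ID or a full URL.

    Hypothetical helper for illustration; not part of any Columns API.
    """
    value = value.strip()
    match = _SHEET_URL.search(value)
    if match:
        return match.group(1)
    # No URL pattern found: assume the caller pasted a bare ID.
    return value
```

For a URL such as https://docs.google.com/spreadsheets/d/SOME_ID/edit, the helper returns SOME_ID. A bare ID is returned unchanged apart from trimmed whitespace.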

NOTE: The bot account "columns.ai@gmail.com" has been deprecated, and the Columns bot has been migrated to "bot@columns.ai". When you use this option to provide access to your Google Spreadsheet, please share it with bot@columns.ai with viewer permission.

Upload files#

For every user, Columns provides up to 5GB of cloud storage for uploaded data files, which you can then run data analysis on. Today, Columns supports CSV/TSV files only.

Compressed CSV/TSV#

Do you have a large CSV file? Large files take a long time to upload, and they take a long time for Columns to load too. So Columns supports CSV/TSV files compressed with GZIP. Because CSV/TSV files are usually highly compressible, you may see a 20x or 30x size reduction, which speeds up everything and also saves your storage quota.

For example, I had a 500MB CSV file that was only 20MB after compression, a 25x reduction. To keep things simple, just run the gzip command on your file. It keeps the original .csv or .tsv extension in the output file name, which is how Columns can tell whether a compressed file is CSV or TSV.

Here is a simple example. Assume large.csv is 1GB in size:

    gzip large.csv
    output: large.csv.gz

Now you only need to upload large.csv.gz, and Columns handles the rest. We recommend compressing your file before uploading it when the raw size is larger than 50MB. Enjoy the speed and use less storage space!
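If the gzip command is not available, for instance on a machine without a Unix shell, the same result can be produced with Python's standard library. This is a minimal sketch. The helper names gzip_file and compression_ratio are hypothetical, and unlike the gzip command this version leaves the original file in place.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path: str) -> Path:
    """Compress `path` into `path + '.gz'` and return the new path.

    The .csv or .tsv extension stays in the output name (large.csv becomes
    large.csv.gz). Unlike the gzip command, the original file is kept.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    # Stream the copy so a large file is never loaded into memory at once.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def compression_ratio(src: Path, dst: Path) -> float:
    """Return original size divided by compressed size (500MB / 20MB = 25.0)."""
    return src.stat().st_size / dst.stat().st_size
```

Calling gzip_file("large.csv") writes large.csv.gz next to the original, ready to upload. The actual ratio depends on how repetitive the data is.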

Data warehouse#

Data warehouse connections provide a SQL-oriented approach for building up a data catalog. In the Columns SQL IDE, you can write and test your SQL query, then save it as a data model.

Here is how it works on the Columns side: Columns executes the SQL query, fetches its results, and caches them in Columns distributed storage for an interval of time defined by the user. This keeps data freshness at a user-defined level and provides very fast data analysis performance. That is why users experience low latency even when they analyze real big data in an interactive slice-and-dice fashion. Enjoy the speed!
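The caching behavior described above can be pictured as a time-to-live (TTL) cache keyed by the SQL text. The sketch below is purely illustrative and is not Columns' implementation: the class QueryCache, its parameters, and the in-memory dictionary (standing in for distributed storage) are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class QueryCache:
    """Illustrative TTL cache: re-run a query only after its interval expires."""

    def __init__(
        self,
        run_query: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query   # hypothetical function that hits the warehouse
        self._ttl = ttl_seconds       # the user-defined refresh interval
        self._clock = clock           # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, sql: str) -> Any:
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh: serve the cached result without touching the warehouse.
            return entry[1]
        # Missing or expired: execute the query and remember when it ran.
        result = self._run_query(sql)
        self._entries[sql] = (now, result)
        return result
```

A shorter interval gives fresher data at the cost of more warehouse queries. A longer interval means analysis is served from cached results more often.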

Please refer to this video on how to connect your data warehouse.

HTTP endpoints#

HTTP endpoints usually refer to a REST API (commonly with a JSON payload) or data files hosted on a website. As with uploads, Columns supports CSV/TSV for HTTP data files and JSON for HTTP APIs. This Medium post shows how to build HTTP data connections, illustrated by two examples.
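To make the three supported formats concrete, here is a minimal sketch of turning a response body into rows. It does not reflect how Columns parses responses internally. The function rows_from_body and its kind parameter are hypothetical, and it assumes a JSON API returns either a list of objects or a single object.

```python
import csv
import io
import json
from typing import Dict, List


def rows_from_body(body: str, kind: str) -> List[Dict[str, object]]:
    """Turn an HTTP response body into a list of row dictionaries.

    `kind` is "json", "csv", or "tsv" (a hypothetical parameter for this sketch).
    """
    if kind == "json":
        payload = json.loads(body)
        # Assumption: the API returns a list of objects or a single object.
        return payload if isinstance(payload, list) else [payload]
    if kind in ("csv", "tsv"):
        delimiter = "," if kind == "csv" else "\t"
        # The first line is treated as the header row.
        return [dict(row) for row in csv.DictReader(io.StringIO(body), delimiter=delimiter)]
    raise ValueError(f"unsupported kind: {kind}")
```

Note that CSV/TSV values come back as strings, while JSON keeps numbers and booleans typed.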

Columns Api Endpoint#

Columns Ai provides a simple API for users to pump data into our reliable, fast, auto-scaled realtime engine. We have a free plan for most use cases and a paid plan for large use cases. Check out Columns Realtime Api.

Once you have realtime data streaming into your API endpoint, you can create a data source that connects to it. In the Columns UI, you can analyze, visualize, and tell stories from this data source as you would with other data sources, with one difference.

This is realtime streaming data, so you can choose to turn your visual story live, which means the story updates with animation as time goes on. In short, there are only two steps:

  • Pump your data into this API endpoint.
  • Build and share a live story by connecting to it.

Other sources to support - coming soon#

Columns is working on adding support for these sources:

  • Cloud storage: Google Drive
  • Cloud storage: Dropbox
  • Cloud storage: OneDrive