How to Begin Streaming Your Data to Vega

The primary method of streaming buoy data into the Vega database is through a desktop application called Ziggy Stardust.

To download Ziggy, please go here, click “Downloads” and choose the appropriate version for the computer that will be running it.

Once Ziggy is downloaded, wikis with step-by-step setup instructions can be found here. Which instruction set to follow depends on the type of data file used.

Description

For most sites, buoy data arrive in a simple CSV format. The date/time format varies widely, and other nuances add complexity to a system that must parse incoming data as they arrive, but these issues have largely been dealt with. Most data logging systems stream data by appending to a single file, creating over time a large text file of time-series data.

In most instances, the files are nearly devoid of descriptive metadata. A few of the more modern logging systems write simple headers that may include a user-defined, free-text name and perhaps units. Many of the existing systems, though, are of an older style that includes no headers, just raw text.

 

Figure 1: An example older-style CSV data file with no headers. Date/time is stored as year, day of year, and a 3- or 4-digit time (100 would be 1:00 AM, 1300 would be 1:00 PM).
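The date/time encoding described in Figure 1 can be decoded along the following lines. This is only a minimal sketch, not Ziggy's actual parser: the class name LegacyTimestamp is hypothetical, and the assumption that year, day of year, and time occupy the first three columns of each row is ours, since the column order is not stated here.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical helper (not part of Ziggy): decodes the older-style
// year / day-of-year / HHMM timestamp into a LocalDateTime.
final class LegacyTimestamp {
    private LegacyTimestamp() {}

    // hhmm is a 3- or 4-digit time: 100 means 01:00, 1300 means 13:00.
    static LocalDateTime parse(int year, int dayOfYear, int hhmm) {
        int hour = hhmm / 100;    // leading one or two digits
        int minute = hhmm % 100;  // trailing two digits
        return LocalDate.ofYearDay(year, dayOfYear).atTime(hour, minute);
    }

    // ASSUMPTION: columns 0, 1, and 2 hold year, day of year, and time.
    static LocalDateTime parseRow(String csvLine) {
        String[] fields = csvLine.split(",");
        return parse(
                Integer.parseInt(fields[0].trim()),
                Integer.parseInt(fields[1].trim()),
                Integer.parseInt(fields[2].trim()));
    }
}
```

For example, LegacyTimestamp.parse(2010, 32, 1300) yields 1 February 2010 at 13:00, because day 32 of a year is always February 1.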

 

Ziggy Stardust is a Java application written to run on the client desktop machine. Ziggy monitors selected files for updates (changes in modified date/time), parses any new data written to the file, attaches the required metadata, and then forwards those data on to the next step. The software was written so that the data handling modules are loosely coupled, allowing a choice of data sources and destinations. For example, one might configure a raw text file as a data source and a MySQL database as the destination.
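The monitoring step described above (notice a change in the modified date/time, then read only what was appended) might be sketched as follows. This is an illustration under our own assumptions, not Ziggy's source code: the class name FileTailer and its poll method are hypothetical, and the sketch assumes UTF-8 text with newline-terminated rows.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Ziggy's code): returns rows appended since the last poll.
final class FileTailer {
    private final Path file;
    private FileTime lastModified = FileTime.fromMillis(0);
    private long offset = 0; // bytes already consumed

    FileTailer(Path file) {
        this.file = file;
    }

    List<String> poll() throws IOException {
        List<String> lines = new ArrayList<>();
        FileTime modified = Files.getLastModifiedTime(file);
        if (modified.equals(lastModified)) {
            return lines; // modified time unchanged, nothing new to parse
        }
        lastModified = modified;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < offset) {
                offset = 0; // file was truncated or replaced, so start over
            }
            // The cast assumes each poll sees well under 2 GB of new data.
            byte[] buf = new byte[(int) (length - offset)];
            raf.seek(offset);
            raf.readFully(buf);
            // Consume only up to the last newline so a half-written row waits
            // for the next poll.
            int end = buf.length;
            while (end > 0 && buf[end - 1] != '\n') {
                end--;
            }
            offset += end;
            String text = new String(buf, 0, end, StandardCharsets.UTF_8);
            for (String line : text.split("\n")) {
                String trimmed = line.trim(); // also drops a trailing '\r'
                if (!trimmed.isEmpty()) {
                    lines.add(trimmed);
                }
            }
        }
        return lines;
    }
}
```

Tracking a byte offset alongside the modified time means each row is handed on once, even though the logger keeps appending to the same ever-growing file.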

 

Figure 2: Screenshot of Ziggy Stardust. It is monitoring files for multiple instrumented sites administered by the UW-Center for Limnology.

 

Almost all sites currently use a module designed to upload data robustly to the GLEON central repository. The “robust upload” mechanism caches all data it receives locally in XML files and makes unlimited attempts to upload those files to a remote FTP server. When an upload succeeds, the file is deleted. If an upload fails, it is retried one minute later, and retries continue until the upload succeeds or Ziggy is shut down. When Ziggy is re-opened, upload attempts resume, including for files cached before the shutdown.
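The cache-and-retry behaviour described above can be sketched roughly as follows. This is not Ziggy's implementation: the Uploader interface and RobustUploadQueue class are hypothetical names introduced for illustration, the FTP transfer itself is left behind the Uploader abstraction, and the sketch assumes cached files carry an .xml extension.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical transport abstraction; a real one would wrap an FTP client.
interface Uploader {
    void upload(Path file) throws IOException;
}

// Hypothetical sketch (not Ziggy's code) of the cache-and-retry loop.
final class RobustUploadQueue {
    private final Path cacheDir;
    private final Uploader uploader;

    RobustUploadQueue(Path cacheDir, Uploader uploader) {
        this.cacheDir = cacheDir;
        this.uploader = uploader;
    }

    // One pass over the cache. Returns how many files remain for the next pass.
    int flushOnce() throws IOException {
        int remaining = 0;
        try (DirectoryStream<Path> cached = Files.newDirectoryStream(cacheDir, "*.xml")) {
            for (Path file : cached) {
                try {
                    uploader.upload(file);
                    Files.delete(file); // delete only after a successful upload
                } catch (IOException e) {
                    remaining++; // keep the file; it is retried on the next pass
                }
            }
        }
        return remaining;
    }

    // Runs a pass immediately, then one minute after each pass finishes.
    // Files left in the cache by an earlier run are picked up by the first pass.
    ScheduledExecutorService start() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(() -> {
            try {
                flushOnce();
            } catch (IOException | RuntimeException e) {
                // Swallow the error so the schedule keeps running.
            }
        }, 0, 1, TimeUnit.MINUTES);
        return timer;
    }
}
```

Because the cache lives on disk rather than in memory, nothing is lost when the application closes; calling shutdown() on the returned executor stops the retries, and the next start() resumes with whatever files are still cached.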