SELKIELogger  1.0.0
SLConvert

NAME

SLConvert - SELKIE Logger file conversion utility

SYNOPSIS

SLConvert -h

SLConvert [-v] [-c VARFILE] [-d] [-r RESAMPLE] [-o OUTFILE] [-T ID] [-t {csv,xlsx,parquet,mat}] [-z] DATFILE

DESCRIPTION

SLConvert is the main utility to convert recorded data files into formats better suited for analysis.

Currently, these formats include:

  • Comma Separated Values (CSV), optionally compressed using GZip
  • Excel spreadsheet
  • MATLAB compatible data file
  • Apache Parquet files

If a channel mapping (.var) file corresponding to the data file is available, it is used to speed up processing. The file is either located automatically based on the data file name or specified explicitly. If no mapping file is found or specified, the data file itself is scanned to identify the sources and channels in use. This takes longer, as the data file must be processed twice.
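The lookup by data file name can be pictured roughly as follows. This is a minimal sketch, assuming the mapping file sits next to the data file and shares its base name with a .var extension. The helper name find_varfile is hypothetical, and the rules SLConvert actually applies may differ.

```python
from pathlib import Path
from typing import Optional


def find_varfile(datfile: str, explicit: Optional[str] = None) -> Optional[Path]:
    """Return the channel mapping file to use, or None if there is none.

    Assumption: the default mapping file is the data file name with its
    extension replaced by ".var". Illustrative only, not SLConvert's code.
    """
    if explicit is not None:
        # A file named explicitly (as with -c/--varfile) takes priority.
        return Path(explicit)
    candidate = Path(datfile).with_suffix(".var")
    # Returning None signals that the data file must be scanned instead.
    return candidate if candidate.is_file() else None
```

A None result corresponds to the slower path described above, where the data file is read once to discover the channels and a second time to convert them.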

OVERVIEW

Data processing

Records stored in the data file are aggregated based on the timestamps generated by the data source selected as the main clock. The interval between timestamps from this source defines the output data rate. The default clock source corresponds to the timestamps generated by the main Logger software, but an alternative source can be specified using the -T/--timesource option.

When no resampling period is given, one record will be generated for each timestamp. Each record contains the value of each data channel received in the interval since the timestamp, with missing channels represented as a blank or otherwise invalid value. Missing values can occur when sources provide data at different rates, or if input data is corrupted or missing. If multiple values are received for a given channel during a single interval, the last value received is used.
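The aggregation rule can be illustrated with pandas. The sketch below uses invented messages and invented column names (tick, channel, value); it is not SLConvert's implementation, only a small model of "one record per timestamp, last value wins, missing channels left blank".

```python
import pandas as pd

# Hypothetical messages: the clock tick they belong to, the channel, the value.
messages = pd.DataFrame(
    {
        "tick": [1000, 1000, 1000, 1001, 1002],
        "channel": ["temp", "temp", "speed", "temp", "speed"],
        "value": [20.1, 20.3, 4.0, 20.4, 4.2],
    }
)

# Keep the last value per (tick, channel), then spread channels into columns.
# Channels with no message for a tick become NaN in that row.
records = messages.groupby(["tick", "channel"])["value"].last().unstack("channel")
print(records)
```

Here the temp channel reports twice during tick 1000, so only the later value (20.3) is kept, while speed has no value for tick 1001 and appears as a missing entry.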

By default, a record is output for each timestamp (or resampling interval), even if no other data has been received. These empty records can be suppressed using the -d/--dropna option.
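As an illustration of what dropping empty records means, the following sketch assumes that an "empty" record is one in which every channel is missing. The data and column names are invented, and the pandas call shown is an analogy rather than a statement about SLConvert's internals.

```python
import numpy as np
import pandas as pd

records = pd.DataFrame(
    {"temp": [20.3, np.nan, 20.4], "speed": [4.0, np.nan, np.nan]},
    index=pd.Index([1000, 1001, 1002], name="tick"),
)

# how="all" removes only rows in which every channel is missing;
# the partially filled row (tick 1002) is kept.
non_empty = records.dropna(how="all")
print(non_empty)
```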

Resampling

When resampling is enabled using the -r/--resample option, the records are aggregated over the resampling interval and the mean of each channel is output. Note that while this correctly discards missing or numerically invalid values, no allowance is made for what each channel may represent. For example, the arithmetic mean of angular values that wrap around (such as headings near 0 and 360 degrees) may not be meaningful.
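Both points can be seen in a small pandas sketch. The timestamps, channel names and values below are invented for illustration, and the code models the described behaviour rather than reproducing SLConvert's source.

```python
import numpy as np
import pandas as pd

# Four hypothetical records, half a second apart.
index = pd.date_range("2024-01-01 00:00:00", periods=4, freq="500ms")
data = pd.DataFrame(
    {
        "temp": [20.0, np.nan, 21.0, 23.0],      # one missing sample
        "heading": [350.0, 10.0, 5.0, 355.0],    # degrees, all close to north
    },
    index=index,
)

# Average each channel over one-second intervals; NaN values are skipped.
averaged = data.resample("1s").mean()
print(averaged)
```

The missing temp sample is ignored, so the first interval reports 20.0. The heading channel, however, averages to 180.0 in both intervals even though every sample points close to north, which is the kind of result to watch for with angular data.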

The resampling interval must be specified as a value followed by a suffix taken from the frequency strings defined for Pandas DateOffset objects, as listed in the DateOffset documentation. Common examples are 1 second ('1s') and 10 minutes ('10min' or '10T'). Intervals shorter than one second can be specified in milliseconds (using the 'L' or 'ms' suffix).
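To check how a given interval string will be interpreted, it can be parsed with pandas directly. This sketch assumes pandas is installed and uses only the long-form suffixes; recent pandas releases deprecate the single-letter 'T' and 'L' aliases in favour of 'min' and 'ms', so the longer forms are the safer choice.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Parse each interval string and show the duration it represents.
for text in ("1s", "10min", "250ms"):
    offset = to_offset(text)
    print(text, "->", pd.Timedelta(offset))
```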

OPTIONS

-h, --help : Display short help message. All other options will be ignored.

-v, --verbose : Increase output verbosity

-c VARFILE, --varfile VARFILE : Path to channel mapping file (.var file).

-d, --dropna : Drop empty records during processing

-r RESAMPLE, --resample RESAMPLE : Resample data before writing to file

-o OUTFILE, --output OUTFILE : Output file name

-T ID, --timesource ID : Data source to use as main clock

-t FORMAT, --format FORMAT : Output file format, where FORMAT is one of csv, xlsx, parquet, mat

-z, --compress : Enable compression of CSV output
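The option list maps naturally onto Python's argparse module. The sketch below is not SLConvert's actual source: the function name build_parser, the attribute names, the treatment of -v as a simple flag and of the source ID as text, and the file name run1.dat are all assumptions made for illustration. No defaults are set because none are stated above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="SLConvert", description="SELKIE Logger file conversion utility"
    )
    # Assumption: -v is an on/off flag; the real tool may count repeats.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Increase output verbosity")
    parser.add_argument("-c", "--varfile", metavar="VARFILE",
                        help="Path to channel mapping file (.var file)")
    parser.add_argument("-d", "--dropna", action="store_true",
                        help="Drop empty records during processing")
    parser.add_argument("-r", "--resample", metavar="RESAMPLE",
                        help="Resample data before writing to file")
    parser.add_argument("-o", "--output", metavar="OUTFILE",
                        help="Output file name")
    # Assumption: the source ID is kept as text; its real type is not stated.
    parser.add_argument("-T", "--timesource", metavar="ID",
                        help="Data source to use as main clock")
    parser.add_argument("-t", "--format",
                        choices=["csv", "xlsx", "parquet", "mat"],
                        help="Output file format")
    parser.add_argument("-z", "--compress", action="store_true",
                        help="Enable compression of CSV output")
    parser.add_argument("datfile", metavar="DATFILE",
                        help="Recorded data file to convert")
    return parser


# Equivalent to: SLConvert -d -r 10min -t csv -z run1.dat
args = build_parser().parse_args(
    ["-d", "-r", "10min", "-t", "csv", "-z", "run1.dat"]
)
print(args.resample, args.format, args.compress, args.datfile)
```

The parsed example corresponds to converting a data file to compressed CSV, averaged over ten-minute intervals, with empty records dropped.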