Data Analysis Toolbox

Are you interested in analysing automatic monitoring data from a waterbody, but don’t know where to start? Start here!

The advent and proliferation of high frequency in situ data collection from lakes has brought a need to process unprecedented quantities of data in a useful and effective manner. This need has driven the development, or adoption, of a variety of techniques, programs and methodologies for working with high frequency lake data, and it was therefore thought timely to provide an easily accessible and digestible synopsis of some of these topics. Discussions between interested members of the NETLAKE COST Action (ES1201), together with a poll of Action members, identified a range of topics felt to be of broad potential interest to those collecting high frequency data from lakes, and indeed rivers. Specialists were identified for each topic, each then writing a ‘factsheet’ intended as a beginner’s guide to that topic. Each factsheet briefly describes the objective of the method, a specific application for it, the background knowledge and data requirements necessary for its use, and, in broad terms, how the method should be implemented. It also includes advice in the form of tips, pointers to further information and details of how to access any code. The factsheets were peer reviewed by experts within NETLAKE, then edited and collated. They can be downloaded individually or collectively.

Data analysis factsheet 001: Data cleaning and QA/QC

The objective of this factsheet is to describe some of the procedures that can be used to process high frequency monitoring (HFM) data in order to ensure that obvious errors have been removed and that the data can be considered quality controlled. Some examples from two long running monitoring stations are presented and discussed.
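As a rough illustration of what removing obvious errors can look like, the sketch below applies two common checks to a short series: a plausible-range test and a simple spike test against a local median. It is a minimal example under stated assumptions and is not taken from the factsheet; the function names, the window size, the thresholds and the sample readings are all hypothetical.

```python
from statistics import median


def flag_out_of_range(values, low, high):
    """Return True for each reading outside the plausible [low, high] range."""
    return [not (low <= v <= high) for v in values]


def flag_spikes(values, window=5, threshold=2.0):
    """Return True for each reading that differs from the median of a
    centred window by more than `threshold` (same units as the data)."""
    half = window // 2
    flags = []
    for i, v in enumerate(values):
        neighbours = values[max(0, i - half): i + half + 1]
        flags.append(abs(v - median(neighbours)) > threshold)
    return flags


# Hypothetical water temperatures (deg C): one spike (25.0) and one
# physically implausible value (-5.0). Limits are assumed, not prescribed.
temps = [12.1, 12.2, 12.1, 25.0, 12.3, 12.2, -5.0, 12.4]
range_flags = flag_out_of_range(temps, low=0.0, high=40.0)
spike_flags = flag_spikes(temps)

# Replace flagged readings with None so they can be reviewed or gap-filled later.
cleaned = [None if r or s else t
           for t, r, s in zip(temps, range_flags, spike_flags)]
```

Flagging suspect readings rather than deleting them keeps the raw record intact, so the limits can be revisited for a particular sensor or site.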

Example file

Suggested citation: de Eyto, E. and Pierson, D. 2016. Data handling: cleaning and quality control. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 1). Technical report. NETLAKE COST Action ES1201. pp. 2-6. https://research.thea.ie/handle/20.500.12065/1947.

Data analysis factsheet 002: Lake Heat Flux Analyzer

The software tool Lake Heat Flux Analyzer (LHFA) has been written to enable the calculation of lake heat fluxes, and related terms, from standard meteorological variables.

Heat flux

Suggested citation: Jones, I.D. 2016. Lake Heat Flux Analyzer (LHFA). In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 2). Technical report. NETLAKE COST Action ES1201. pp. 7-10. https://research.thea.ie/handle/20.500.12065/1948.

Data analysis factsheet 003: General Lake Model (GLM)

The General Lake Model (GLM) is a one-dimensional hydrodynamic model. Hydrodynamic models describe the thermal properties and the mixing dynamics of water bodies. Based on inflow and outflow data, as well as meteorological data, GLM calculates a water and energy balance, resulting in vertical profiles of temperature, salinity and density over time.

GLM

Suggested citation: Frassl, M., Weber, M. and Bruce, L. 2016. The General Lake Model (GLM). In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 3). Technical report. NETLAKE COST Action ES1201. pp. 11-15. https://research.thea.ie/handle/20.500.12065/1949.

Attachment: netlake_toolbox_03_glm_0.pdf (570.9 KB)

Data analysis factsheet 004: Lake Metabolizer

Lake Metabolizer is an R package for estimating lake metabolism and related terms, with relative ease, from data collected by high frequency, in situ lake monitoring stations. The package can be used to calculate lake metabolism using five different methods: bookkeeping, ordinary least squares, maximum likelihood, Kalman filter, and Bayesian.

Lake Metabolizer

Suggested citation: Woolway, R.I. 2016. Lake Metabolizer. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 4). Technical report. NETLAKE COST Action ES1201. pp. 16-22. https://research.thea.ie/handle/20.500.12065/1950.

Data analysis factsheet 005: ECDSOFT and OnLineMonitor

ElectroChemistry Data SOFTware (ECDSOFT) (Omanović, D., Branica, M. 1998) is designed to treat data from electrochemical methods, but it is general enough to accept any set of signals that matches the required format. The software is continuously upgraded and the authors are open to further improvements on request.

Example file

Suggested citation: Omanović, D. and Pižeta, I. 2016. High Frequency data treatment and visualization with ECDSOFT and OnLineMonitor. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 5). Technical report. NETLAKE COST Action ES1201. pp. 23-27. https://research.thea.ie/handle/20.500.12065/1951.

Attachment: netlake_toolbox_05_ecdsoft.pdf (777.81 KB)

Data analysis factsheet 006: Ice Modelling with MyLake

MyLake is a simple one-dimensional (1D) daily time-step model that can be used to simulate seasonal changes in ice coverage in lakes. The model is aimed at researchers who prefer to use the Matlab/Octave language for scientific computing applications. This factsheet briefly describes how to set up the MyLake model in order to simulate thermal stratification and ice phenology in a lake.

MyLake

Suggested citation: Couture, R.M. and Tominaga, K. 2016. Lake stratification and ice phenology: Modelling with MyLake. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 6). Technical report. NETLAKE COST Action ES1201. pp. 28-34. https://research.thea.ie/handle/20.500.12065/1952.

Attachment: netlake_toolbox_06_mylake.pdf (736.54 KB)

Data analysis factsheet 007: Data Mining

Example file

Suggested citation: Pižeta, I. 2016. Knowledge Discovery in Databases - Data Mining. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 7). Technical report. NETLAKE COST Action ES1201. pp. 35-39. https://research.thea.ie/handle/20.500.12065/1953.

Attachment: netlake_toolbox_07_dms.pdf (532.26 KB)

Data analysis factsheet 008: Bayesian calibration

Bayesian

Suggested citation: Honti, M. 2016. Bayesian calibration of mechanistic models of lake metabolism. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 8). Technical report. NETLAKE COST Action ES1201. pp. 40-46. https://research.thea.ie/handle/20.500.12065/1954.

Attachment: netlake_toolbox_08_bayesian.pdf (2.55 MB)

Data analysis factsheet 009: Whole-column metabolism

This technique allows determination of the metabolic rates gross primary production (GPP), ecosystem respiration (ER) and net ecosystem production (NEP) for different depth layers along the water column, as well as areal, depth-integrated rates (i.e. per unit area).
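The step from layer-specific to areal rates can be sketched as a thickness-weighted sum over depth layers. The example below is a simplified illustration, not the factsheet's procedure: the function name and all numbers are hypothetical, ER is treated as a positive quantity so that NEP = GPP - ER, and the layers are assumed to share the same horizontal area (no weighting by lake hypsography).

```python
def areal_rate(volumetric_rates, layer_thicknesses):
    """Integrate volumetric rates (e.g. mmol O2 m^-3 d^-1) over depth layers
    of given thickness (m) to an areal rate (mmol O2 m^-2 d^-1)."""
    if len(volumetric_rates) != len(layer_thicknesses):
        raise ValueError("one thickness is required per layer")
    return sum(rate * dz for rate, dz in zip(volumetric_rates, layer_thicknesses))


# Hypothetical three-layer profile, surface to bottom.
thickness = [1.0, 2.0, 2.0]      # layer thickness in m
gpp_layers = [40.0, 20.0, 5.0]   # GPP per layer, mmol O2 m^-3 d^-1
er_layers = [30.0, 25.0, 10.0]   # ER per layer, mmol O2 m^-3 d^-1

gpp_areal = areal_rate(gpp_layers, thickness)
er_areal = areal_rate(er_layers, thickness)
nep_areal = gpp_areal - er_areal  # negative means net heterotrophy in this example
```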

Whole-lake

Suggested citation: Obrador, B., Christensen, J. and Staehr, P.A. 2016. Determination of whole-column metabolism from profiling data. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 9). Technical report. NETLAKE COST Action ES1201. pp. 47-51. https://research.thea.ie/handle/20.500.12065/1955.

Data analysis factsheet 010: Dynamic Factor Analysis

Dynamic Factor Analysis (DFA) is a dimension-reduction method that estimates underlying common patterns in a set of time series. The main tool is the MARSS (Multivariate Autoregressive State-Space; Holmes et al., 2012) R package.
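For orientation, a DFA is commonly written as a state-space model of the following general form. This is the standard textbook formulation, given here as a sketch rather than as the exact specification used in the factsheet; the symbols are introduced only for this illustration.

```latex
\begin{aligned}
\mathbf{y}_t &= \mathbf{Z}\,\mathbf{x}_t + \mathbf{a} + \mathbf{v}_t,
  & \mathbf{v}_t &\sim \mathrm{MVN}(\mathbf{0}, \mathbf{R}),\\
\mathbf{x}_t &= \mathbf{x}_{t-1} + \mathbf{w}_t,
  & \mathbf{w}_t &\sim \mathrm{MVN}(\mathbf{0}, \mathbf{Q}).
\end{aligned}
```

Here $\mathbf{y}_t$ holds the $n$ observed time series at time $t$, $\mathbf{x}_t$ holds the $m < n$ common trends (modelled as random walks), $\mathbf{Z}$ is the $n \times m$ matrix of factor loadings, $\mathbf{a}$ is a vector of offsets, and $\mathbf{R}$ and $\mathbf{Q}$ are the observation and process error covariance matrices. The dimension reduction comes from describing $n$ series with only $m$ trends.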

DFA

Suggested citation: Aguilera, R. and Marcé, R. 2016. Pattern detection using Dynamic Factor Analysis (DFA). In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 10). Technical report. NETLAKE COST Action ES1201. pp. 52-56. https://research.thea.ie/handle/20.500.12065/1956.

Attachment: netlake_toolbox_10_dfa.pdf (930.62 KB)

Data analysis factsheet 011: Inferential modelling of time series by evolutionary computation

The hybrid evolutionary algorithm (HEA) has been designed: 1) to represent and forecast multivariate relationships between environmental conditions and population densities by inferential (IF-THEN-ELSE) models, and 2) to quantify ‘tipping points’ for population outbreaks by IF-conditions.
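To make the form of such a model concrete, the sketch below shows what a single inferential rule might look like once written as code. It is purely illustrative: the predictor variables, thresholds and coefficients are hypothetical and are not output of the HEA, which searches for the rule structure and its parameters from data. The thresholds in the IF-condition play the role of the 'tipping points'.

```python
def predict_density(water_temp_c, nitrate_mg_l):
    """Hypothetical IF-THEN-ELSE rule predicting a population density
    (arbitrary units) from two environmental conditions."""
    # IF-condition: assumed thresholds marking the outbreak regime.
    if water_temp_c > 20.0 and nitrate_mg_l < 0.5:
        # THEN-branch: equation applied under outbreak conditions.
        return 150.0 * water_temp_c - 2000.0
    else:
        # ELSE-branch: equation applied under all other conditions.
        return 50.0 * nitrate_mg_l + 10.0


outbreak = predict_density(25.0, 0.2)    # conditions satisfy the IF-condition
background = predict_density(15.0, 1.0)  # conditions fall to the ELSE-branch
```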

Suggested citation: Recknagel, F. and Ostrovsky, I. 2016. Inferential modelling of time series by evolutionary computation. In Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes (Factsheet 11). Technical report. NETLAKE COST Action ES1201. pp. 57-60. https://research.thea.ie/handle/20.500.12065/1957.

Attachment: netlake_toolbox_11_hea.pdf (699.42 KB)

Full set of data analysis factsheets.

Suggested citation for the complete set of factsheets: Obrador, B., Jones, I.D. and Jennings, E. (Eds.) NETLAKE toolbox for the analysis of high-frequency data from lakes. Technical report. NETLAKE COST Action ES1201. 60 pp. https://research.thea.ie/handle/20.500.12065/1946.