Characterizing Noise

pydaddy

A package to derive an SDE from data.

pydaddy is a Python package implementing a data-driven SDE method. It provides an interface that takes time series data as input, runs the analysis, and returns an output object through which the data and analysed results can be systematically visualized and saved.

How it works

Suppose m(t) is a time series generated by an SDE.

$image4$

The package calculates the deterministic (drift) and stochastic (diffusion) components of dm/dt.

$image5$	$image6$
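As a rough illustration of the idea (not pydaddy's own implementation), the sketch below simulates a simple SDE and recovers its drift and diffusion by averaging the first and second jump moments in bins of m. The model dm/dt = -m + noise, the function names simulate_sde and binned_jump_moments, and the bin choices are all assumptions made for this example.

```python
import numpy as np

def simulate_sde(n_steps=200000, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dm/dt = -m + eta(t) (illustrative model only)."""
    rng = np.random.default_rng(seed)
    m = np.zeros(n_steps)
    kicks = np.sqrt(dt) * rng.standard_normal(n_steps - 1)
    for i in range(n_steps - 1):
        m[i + 1] = m[i] - m[i] * dt + kicks[i]
    return m

def binned_jump_moments(m, dt, bin_edges):
    """Average the first and second jump moments of m within each bin of m."""
    dm = np.diff(m)                              # jumps m(t + dt) - m(t)
    idx = np.digitize(m[:-1], bin_edges) - 1     # bin index of each starting point
    n_bins = len(bin_edges) - 1
    drift = np.full(n_bins, np.nan)
    diffusion = np.full(n_bins, np.nan)
    for b in range(n_bins):
        jumps = dm[idx == b]
        if jumps.size > 0:
            drift[b] = jumps.mean() / dt             # first jump moment
            diffusion[b] = (jumps ** 2).mean() / dt  # second jump moment
    return drift, diffusion

dt = 0.01
m = simulate_sde(dt=dt)
edges = np.linspace(-1.0, 1.0, 11)
centers = 0.5 * (edges[:-1] + edges[1:])
drift, diffusion = binned_jump_moments(m, dt, edges)

# For this model the drift should be roughly -m and the diffusion roughly 1.
slope = np.polyfit(centers, drift, 1)[0]
print("fitted drift slope:", round(slope, 2))
print("mean diffusion:", round(float(np.nanmean(diffusion)), 2))
```

Here the second jump moment estimates the squared noise strength. Points falling outside the bin range are simply ignored.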

This data-driven SDE method assumes that the noise in the time series is uncorrelated and Gaussian, with zero mean and unit variance.

$image9$

pydaddy extracts the noise from the data and checks whether it satisfies these assumptions.

$image10$	$image11$
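The kind of check described above can be sketched as follows. This is a minimal example under stated assumptions, not the package's internal procedure: the drift and diffusion functions are taken as known, the helper names (simulate_ou, noise_residuals, autocorrelation) are hypothetical, and the residual is standardized by the square root of the diffusion times the time step.

```python
import numpy as np

def simulate_ou(n_steps=50000, dt=0.01, seed=1):
    """Simulate dm/dt = -m + eta(t) with Euler-Maruyama (illustrative model only)."""
    rng = np.random.default_rng(seed)
    m = np.zeros(n_steps)
    kicks = np.sqrt(dt) * rng.standard_normal(n_steps - 1)
    for i in range(n_steps - 1):
        m[i + 1] = m[i] - m[i] * dt + kicks[i]
    return m

def noise_residuals(m, dt, drift_fn, diffusion_fn):
    """Remove the drift from each jump and rescale by the noise strength."""
    x = m[:-1]
    return (np.diff(m) - drift_fn(x) * dt) / np.sqrt(diffusion_fn(x) * dt)

def autocorrelation(x, max_lag):
    """Normalized autocorrelation of x for lags 0..max_lag."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:x.size - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

dt = 0.01
m = simulate_ou(dt=dt)
# Assumed known for this example: drift(m) = -m, diffusion(m) = 1.
residuals = noise_residuals(m, dt, lambda x: -x, lambda x: np.ones_like(x))
acf = autocorrelation(residuals, max_lag=5)

print("mean:", round(float(residuals.mean()), 3))
print("variance:", round(float(residuals.var()), 3))
print("lag-1 autocorrelation:", round(float(acf[1]), 3))
```

If the assumptions hold, the mean and the autocorrelation at nonzero lags should be close to zero and the variance close to one.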

Features

Simple one line execution of analysis.
Produces intuitive visuals of data and obtained result.
Supports time series data with both scalar and vector order parameters.
Generates interactive drift and diffusion sliders for a user-defined timescale range or list.
Diagnostics help in understanding how the order of drift and diffusion changes with varying timescale.

Requirements

Python:

Python version >=3.5, <=3.8

Package dependencies:

statsmodels, version 0.11.1
matplotlib, version 3.2.2
scipy, version 1.5.2
numpy, version 1.19.1
seaborn, version 0.10.1
tqdm, version 4.48.2
plotly, version 4.14.3
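A quick way to check the interpreter against the range above is sketched below. The helper python_is_supported is hypothetical, and it assumes the bounds apply to the major.minor version (so any 3.8.x release counts as supported).

```python
import sys

SUPPORTED_MIN = (3, 5)  # lower bound from the requirements above
SUPPORTED_MAX = (3, 8)  # upper bound, compared on major.minor only (assumption)

def python_is_supported(version_info=None):
    """Return True if the given (or running) Python falls in the supported range."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX

print("This interpreter is supported:", python_is_supported())
```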

Installation

pydaddy is available on both PyPI and Anaconda Cloud, so there are two recommended installation methods: the pip package manager or the conda package manager.

Note

To run the example notebooks on your system after installing the package, please download the notebook files from the GitHub repo.

Alternatively, it is recommended to simply clone or download the git repository.

Installing via conda

Conda is a package manager that handles sourcing of all dependencies in a relatively straight-forward, cross-platform manner. pydaddy is available on the tee-lab channel.

Important

Make sure you have anaconda or miniconda installed and have activated the default conda (base) environment using conda activate base before proceeding.

Installing the pydaddy package with all its dependencies using conda simply requires executing the command

conda install -c tee-lab pydaddy

To install pydaddy in a clean virtual environment use

conda create --name MY_ENV_NAME -c tee-lab pydaddy

Replace MY_ENV_NAME with the name you want for the environment.

You can now access the pydaddy package by activating the newly created environment with conda activate MY_ENV_NAME

Installing via pip

To install using pip package manager, run:

python -m pip install pydaddy

Manual installation

This method is not recommended unless you experience problems with conda or pip. To install using setuptools, download the source code manually and run python setup.py install from the terminal.

This will install the package in your current environment (if you are working in one).

If you have git installed, you can clone the repo and install using the following commands.

$ git clone https://github.com/tee-lab/pydaddy.git
$ cd pydaddy
$ python setup.py install

Important

Without git, you will need to download the zip-file of the code, extract it and follow the above instructions.

Click here to download source code zip file.

Usage

The time series data is given as input to the Characterize method along with all other optional parameters.

Characterize

class pydaddy.Characterize(data, t=1.0, Dt=1, dt=1, bins=None, inc=None, inc_x=None, inc_y=None, slider_timescales=None, n_trials=1, show_summary=True, drift_threshold=None, diff_threshold=None, drift_degree=5, diff_degree=5, drift_alpha=0, diff_alpha=0, fit_functions=False, **kwargs)

Bases: object

Analyse time series data and get drift and diffusion plots.

Parameters

data (list) – time series data to be analysed, data = [x] for scalar data and data = [x1, x2] for vector where x, x1 and x2 are of numpy.array object type

t (float, array, optional(default=1.0)) –
float if it is the time increment between observations

numpy.array if it is the time stamps of the time series

Dt (int,'auto', optional(default='auto')) –
time scale for drift

if ‘auto’, the time scale is decided based on the drift order.

dt (int, optional(default=1)) – time scale for diffusion

inc (float, optional(default=0.01)) – increment in order parameter for scalar data

inc_x (float, optional(default=0.1)) – increment in order parameter for vector data x1

inc_y (float, optional(default=0.1)) – increment in order parameter for vector data x2

fft (bool, optional(default=True)) – if true use fft method to calculate autocorrelation else, use standard method

slider_timescales (list, optional(default=None)) – List of timescale values to include in slider.

n_trials (int, optional(default=1)) – Number of trials, concatenated timeseries of multiple trials is used.

show_summary (bool, optional(default=True)) – print data summary and show summary chart.

**kwargs – all the parameters for inherited methods.

Returns

output – object to access the analysed data, parameters, plots and save them.

Return type

pydaddy.output.Output

Key parameters

data: list
time series data to be analysed, data = [x] for scalar data and data = [x1, x2] for vector where x, x1 and x2 are of numpy.array types.

t: array or float
float if it is the time increment between observations, numpy.array if it is the time stamps of the time series

See doc strings or code documentation for more information.
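To make the expected input shapes concrete, here is a small sketch that builds scalar and vector inputs in the documented form. The arrays are random placeholders rather than real data, and the 0.12 time step is just an example value. The Characterize calls are left as comments because they require pydaddy to be installed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Scalar order parameter: a single numpy array wrapped in a list.
x = rng.standard_normal(n)
scalar_data = [x]

# Vector order parameter: two numpy arrays of equal length in a list.
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
vector_data = [x1, x2]

# t can be a float (time increment between observations) ...
t_step = 0.12
# ... or a numpy array of time stamps, one per observation.
t_stamps = np.arange(n) * t_step

# With pydaddy installed, the inputs would be passed as documented above:
# ddsde = pydaddy.Characterize(scalar_data, t_step)
# ddsde = pydaddy.Characterize(vector_data, t_stamps)
print(len(scalar_data), len(vector_data), t_stamps.shape)
```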

Example using sample data set

See Data set description for more information about the datasets.

import pydaddy

# Load a sample data set: data is a list of arrays, t is the time information
data, t = pydaddy.load_sample_dataset('model-data-vector-ternary')
# Analyse the time series
ddsde = pydaddy.Characterize(data, t)

# Show drift slider plot
ddsde.drift()
# Show diffusion slider plot
ddsde.diffusion()
# Show time series plot
ddsde.timeseries()
# Show histograms
ddsde.histograms()
# Show all input, calculated and assumed parameters of the analysis
ddsde.parameters()
# Export data to disk
ddsde.export_data()

Characterize returns an output object in which all analysed results are stored. Results can be visualised or stored by calling appropriate functions:

summary(): show summary
drift(): drift slider plot
diffusion(): diffusion slider plot
timeseries(): time series plot
histograms(): histogram plots
noise_characterstics(): noise characteristics plots
visualise(timescale): drift and diffusion plots for a timescale
diagnostics(): diagnostics plots
data(timescale): get drift and diffusion data for a timescale
export_data(): save data as csv files and mat files
plot_data(data): plot data on a 3d axis

For more examples see this notebook.

Motivation

This project is motivated by the study of group behaviour dynamics of animals, especially schooling fish. The sample data distributed along with this package is from experiments conducted by TEElab, IISc.

Data set description

pydaddy includes several sample data sets, which can be loaded using the load_sample_dataset(dataset_name) function.

pydaddy.load_sample_dataset

class pydaddy.load_sample_dataset(name)

Load sample data set provided.

Available data sets:

‘fish-data-etroplus’

‘model-data-scalar-pairwise’

‘model-data-scalar-ternary’

‘model-data-vector-pairwise’

‘model-data-vector-ternary’

Parameters

name (str) – name of the data set

Returns

data (list) – timeseries data

t (float, array) – timescale

Experiment data (from experimentation or from observations)

fish-data-etroplus:
Data from an experiment conducted with a group of 30 fish, in which the group polarity in the x and y directions is recorded every 0.12 seconds.

Source

The fish data is part of the work done in Noise-Induced Schooling of Fish [2].

Simulation data

Synthetic data sets obtained from the simulation of fish interactions.

model-data-vector-pairwise:
Pairwise interaction of fish simulated in two dimensions.

model-data-vector-ternary:
Ternary interaction of fish simulated in two dimensions.

model-data-scalar-pairwise:
Pairwise interaction of fish simulated in one dimension.

model-data-scalar-ternary:
Ternary interaction of fish simulated in one dimension.

Source

The simulation method is inspired by the work done in Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data [1].

Glossary

drift
First jump moment

diffusion
Second jump moment
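For orientation, a common textbook form of the two jump moments is written out below. The symbols F and G, and the use of separate time scales Δt (drift) and δt (diffusion), are notational assumptions made here. The package's own expressions may differ in detail, for example in how the drift contribution is treated in the second moment.

```latex
% First jump moment (drift), conditioned on the current state m(t) = m
F(m) = \left\langle \frac{m(t + \Delta t) - m(t)}{\Delta t} \right\rangle_{m(t) = m}

% Second jump moment (diffusion), conditioned on the current state m(t) = m
G^{2}(m) = \left\langle \frac{\left( m(t + \delta t) - m(t) \right)^{2}}{\delta t} \right\rangle_{m(t) = m}
```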

Acknowledgement

Licence

Distributed under GNU General Public License v3.0. See Licence.txt for more information.

Reference

[1] Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data Preprint Github

[2] Noise-Induced Schooling of Fish Preprint Github