Characterizing Noise

pydaddy

A package to derive an SDE from the data.


pydaddy is a Python package implementing a data-driven SDE method. It provides an interface that takes time series data as input, runs the analysis, and returns an output object through which the data and the analysed results can be systematically visualised and saved.

How it works

Suppose m(t) is time series data generated by an SDE.

image4

The package calculates the deterministic (drift) and stochastic (diffusion) components of dm/dt.

image5

image6

image7

image8
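As a rough illustration of the idea, the sketch below estimates drift and diffusion for a scalar time series by binning m and averaging the increments (for drift) and the squared increments (for diffusion) within each bin. It is a minimal NumPy sketch under stated assumptions, not pydaddy's implementation: the helper name `estimate_drift_diffusion`, its parameters, and the simulated test series are all made up for this example.

```python
import numpy as np


def estimate_drift_diffusion(m, t_inc, Dt=1, dt=1, n_bins=20):
    """Binned estimates of drift and diffusion for a scalar series m.

    Hypothetical helper for illustration; not part of pydaddy.
    t_inc is the sampling interval; Dt and dt are the drift and
    diffusion time scales, in number of samples.
    """
    m = np.asarray(m, dtype=float)
    edges = np.linspace(m.min(), m.max(), n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    def binned_mean(values, start):
        # Average `values` over the bins that the starting points fall in.
        idx = np.clip(np.digitize(start, edges) - 1, 0, n_bins - 1)
        sums = np.bincount(idx, weights=values, minlength=n_bins)
        counts = np.bincount(idx, minlength=n_bins)
        with np.errstate(invalid="ignore", divide="ignore"):
            return sums / counts  # NaN where a bin is empty

    # First jump moment: mean increment per unit time.
    drift = binned_mean((m[Dt:] - m[:-Dt]) / (Dt * t_inc), m[:-Dt])
    # Second jump moment: mean squared increment per unit time.
    diffusion = binned_mean((m[dt:] - m[:-dt]) ** 2 / (dt * t_inc), m[:-dt])
    return centers, drift, diffusion


# Simulated test series (an assumption, not a pydaddy data set):
# dm = -m dt + 0.5 dW, integrated with the Euler-Maruyama scheme.
rng = np.random.default_rng(0)
t_inc, n = 0.01, 200000
m = np.zeros(n)
kicks = rng.normal(size=n - 1) * np.sqrt(t_inc)
for i in range(n - 1):
    m[i + 1] = m[i] - m[i] * t_inc + 0.5 * kicks[i]

centers, drift, diffusion = estimate_drift_diffusion(m, t_inc)
```

For this simulated series the drift estimate should fall roughly on a line of slope -1 near the centre of the range, and the diffusion estimate should sit near 0.5 squared. Bins at the edges of the range hold few samples, so their estimates are noisy.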

This data driven SDE method is based on the assumption that the noise in the time series is uncorrelated and Gaussian in nature, with zero mean and unit variance.

image9

pydaddy extracts the noise from the data and checks whether it satisfies these assumptions.

image10

image11

image12

image13
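The kind of check described above can be sketched as follows: given a residual noise series, compute its mean, its variance, and its autocorrelation at a few lags, and compare them with the assumed values (zero mean, unit variance, no correlation beyond lag 0). This is a minimal sketch and not pydaddy's own diagnostics; `noise_diagnostics` is a hypothetical helper, the residuals here are stand-in white noise, and Gaussianity is not tested.

```python
import numpy as np


def noise_diagnostics(noise, max_lag=20):
    """Summary statistics for checking a residual noise series.

    Hypothetical helper for illustration; not part of pydaddy.
    Returns the mean, the variance and the autocorrelation at
    lags 0..max_lag.
    """
    noise = np.asarray(noise, dtype=float)
    centered = noise - noise.mean()
    var = centered.var()
    acf = np.array([
        np.mean(centered[: len(centered) - lag] * centered[lag:]) / var
        for lag in range(max_lag + 1)
    ])
    return noise.mean(), var, acf


# Stand-in residuals (an assumption): white Gaussian noise. With real data,
# the residuals would be the increments of m minus the estimated drift term.
rng = np.random.default_rng(1)
mean, var, acf = noise_diagnostics(rng.normal(size=50000))
```

If the assumptions hold, `mean` is close to 0, `var` is close to 1, and every entry of `acf` after the first is close to 0.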

Features

  • Simple one-line execution of the analysis.

  • Produces intuitive visuals of the data and the obtained results.

  • Supports time series data with both scalar and vector order parameters.

  • Generates interactive drift and diffusion sliders for a user-defined timescale range or list.

  • Diagnostics help understand how the order of drift and diffusion changes with varying timescale.

Requirements

Python:

  • Python version >=3.5, <=3.8

Package dependencies:

  • statsmodels, version 0.11.1

  • matplotlib, version 3.2.2

  • scipy, version 1.5.2

  • numpy, version 1.19.1

  • seaborn, version 0.10.1

  • tqdm, version 4.48.2

  • plotly, version 4.14.3
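The Python requirement listed above can be checked from inside the interpreter before installing. The snippet below is a small sketch; `python_supported` is a hypothetical helper, and it assumes that "<=3.8" is meant to include every 3.8.x release.

```python
import sys


def python_supported(version_info=None):
    """Return True if the interpreter version is within 3.5 to 3.8.

    Hypothetical helper for illustration; assumes any 3.8.x release
    counts as "<=3.8".
    """
    major_minor = tuple((version_info or sys.version_info)[:2])
    return (3, 5) <= major_minor <= (3, 8)


print("Supported Python version:", python_supported())
```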

Installation

pydaddy is available on both PyPI and Anaconda Cloud, so there are two recommended installation methods: the pip package manager or the conda package manager.

Note

To run the example notebooks on your system after installing the package, please download the notebook files from the GitHub repo.

Alternatively, it is recommended to simply clone or download the git repository.

Installing via conda


Conda is a package manager that handles sourcing of all dependencies in a relatively straightforward, cross-platform manner. pydaddy is available on the tee-lab channel.

Important

Make sure you have Anaconda or Miniconda installed and have activated the default (base) conda environment using conda activate base before proceeding.

Installing pydaddy with all its dependencies using conda simply requires executing the command

conda install -c tee-lab pydaddy

To install pydaddy in a clean virtual environment use

conda create --name MY_ENV_NAME -c tee-lab pydaddy

Replace MY_ENV_NAME with your desired environment name.

You can now access the pydaddy package by activating the newly created environment with conda activate MY_ENV_NAME.

Installing via pip

To install using pip package manager, run:

python -m pip install pydaddy

Manual installation

This method is not recommended unless you experience problems with conda or pip. To install using setuptools, download the source code manually and run python setup.py install from the terminal.

This will install the package into whichever environment is currently active.

If you have git installed, you can clone the repo and install using the following commands.

$ git clone https://github.com/tee-lab/pydaddy.git
$ cd pydaddy
$ python setup.py install

Important

Without git, you will need to download the zip-file of the code, extract it and follow the above instructions.

Click here to download source code zip file.
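Whichever installation method is used, it can be useful to confirm that the package is importable from the environment you intend to work in. The snippet below is a small sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pydaddy"):
    print("pydaddy is available in this environment")
else:
    print("pydaddy not found; check that the right environment is active")
```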

Usage

The time series data is passed as input to Characterize, along with any optional parameters.


Characterize

class pydaddy.Characterize(data, t=1.0, Dt=1, dt=1, bins=None, inc=None, inc_x=None, inc_y=None, slider_timescales=None, n_trials=1, show_summary=True, drift_threshold=None, diff_threshold=None, drift_degree=5, diff_degree=5, drift_alpha=0, diff_alpha=0, fit_functions=False, **kwargs)

Bases: object

Analyse time series data and get drift and diffusion plots.

Parameters
  • data (list) – time series data to be analysed, data = [x] for scalar data and data = [x1, x2] for vector where x, x1 and x2 are of numpy.array object type

  • t (float, array, optional(default=1.0)) –

    float if it is the time increment between observations

    numpy.array if it gives the time stamps of the time series

  • Dt (int, 'auto', optional(default='auto')) –

    time scale for drift

    if ‘auto’, the time scale is decided based on the drift order.

  • dt (int, optional(default=1)) – time scale for diffusion

  • inc (float, optional(default=0.01)) – increment in order parameter for scalar data

  • inc_x (float, optional(default=0.1)) – increment in order parameter for vector data x1

  • inc_y (float, optional(default=0.1)) – increment in order parameter for vector data x2

  • fft (bool, optional(default=True)) – if true use fft method to calculate autocorrelation else, use standard method

  • slider_timescales (list, optional(default=None)) – List of timescale values to include in slider.

  • n_trials (int, optional(default=1)) – Number of trials; the concatenated time series of multiple trials is used.

  • show_summary (bool, optional(default=True)) – print data summary and show summary chart.

  • **kwargs – all the parameters for inherited methods.

Returns

output – object to access the analysed data, parameters, plots and save them.

Return type

pydaddy.output.Output

Key parameters

data: list

time series data to be analysed, data = [x] for scalar data and data = [x1, x2] for vector where x, x1 and x2 are of numpy.array types.

t: array or float

float if it is the time increment between observations; numpy.array if it gives the time stamps of the time series

See the docstrings or code documentation for more information.
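The two key parameters can be prepared as shown in the sketch below, which builds scalar and vector inputs and both forms of t. The arrays are made-up placeholders used only to show the expected shapes, and the variable names are assumptions for this example.

```python
import numpy as np

# Hypothetical measurements, used only to show the expected shapes.
x = np.sin(np.linspace(0, 10, 500))    # scalar order parameter
x1 = np.cos(np.linspace(0, 10, 500))   # first vector component
x2 = np.sin(np.linspace(0, 10, 500))   # second vector component

scalar_data = [x]        # data = [x] for scalar data
vector_data = [x1, x2]   # data = [x1, x2] for vector data

t_increment = 0.12                           # float: constant time step
t_stamps = np.arange(len(x)) * t_increment   # array: one stamp per sample

# With pydaddy installed, either form of t can then be passed on, e.g.
#   pydaddy.Characterize(scalar_data, t_increment)
#   pydaddy.Characterize(vector_data, t_stamps)
```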

Example using sample data set

See Data set description for more information about the datasets.

import pydaddy
# Load data
data, t = pydaddy.load_sample_dataset('model-data-vector-ternary')
# Analyse
ddsde = pydaddy.Characterize(data, t)

# Show drift slider plot
ddsde.drift()
# Show diffusion slider plot
ddsde.diffusion()
# Show time series plot
ddsde.timeseries()
# Show histograms
ddsde.histograms()
# Show all input, calculated and assumed parameters of the analysis
ddsde.parameters()
# Export data to disk
ddsde.export_data()

Characterize returns an output object in which all analysed results are stored. Results can be visualised or saved by calling the appropriate functions:


pydaddy.output.output


  • summary(): show summary

  • drift() : drift slider plot

  • diffusion() : diffusion slider plot

  • timeseries(): time series plot

  • histograms() : histogram plots

  • noise_characterstics(): noise characteristics plots

  • visualise(timescale): drift and diffusion plots for a timescale

  • diagnostics(): diagnostics plots

  • data(timescale): get drift and diffusion data for a timescale

  • export_data(): Save data as csv files and mat files

  • plot_data(data): plot data on a 3d axis
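A few of the methods listed above can be combined into a short inspection routine for a single timescale, as in the sketch below. `explore` is a hypothetical helper, `timescale=1` is an arbitrary choice, and the structure of the value returned by data(timescale) is not described here, so the sketch simply passes it through.

```python
def explore(output, timescale=1):
    """Call a few of the documented output methods for one time scale.

    Hypothetical helper for illustration. `output` is assumed to be the
    object returned by pydaddy.Characterize; `timescale=1` is an
    arbitrary choice.
    """
    output.summary()               # show summary
    output.visualise(timescale)    # drift and diffusion plots for a timescale
    output.diagnostics()           # diagnostics plots
    return output.data(timescale)  # drift and diffusion data for a timescale


# Usage, assuming pydaddy is installed:
#   import pydaddy
#   data, t = pydaddy.load_sample_dataset('model-data-scalar-pairwise')
#   ddsde = pydaddy.Characterize(data, t)
#   results = explore(ddsde, timescale=1)
```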

For more examples see this notebook.


Motivation

image17

This project is motivated by the study of group behaviour dynamics of animals, especially schooling fish. The sample data distributed along with this package is from experiments conducted by TEElab, IISc.

Data set description

pydaddy includes sample data sets with the package, which can be loaded using the load_sample_dataset(dataset_name) function.


pydaddy.load_sample_dataset

class pydaddy.load_sample_dataset(name)


Load sample data set provided.

Available data sets:

‘fish-data-etroplus’

‘model-data-scalar-pairwise’

‘model-data-scalar-ternary’

‘model-data-vector-pairwise’

‘model-data-vector-ternary’

Parameters

name (str) – name of the data set

Returns

  • data (list) – timeseries data

  • t (float, array) – time increment between observations (float) or time stamps of the time series (array)
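Since every data set is loaded the same way, the available names can be looped over to see which ones are scalar and which are vector. The sketch below takes the loader as an argument so that it does not depend on pydaddy being installed; `describe_datasets` and `SAMPLE_DATASETS` are hypothetical names, and the scalar/vector test relies on the data = [x] versus data = [x1, x2] convention described for Characterize.

```python
SAMPLE_DATASETS = [
    "fish-data-etroplus",
    "model-data-scalar-pairwise",
    "model-data-scalar-ternary",
    "model-data-vector-pairwise",
    "model-data-vector-ternary",
]


def describe_datasets(loader, names=SAMPLE_DATASETS):
    """Load each data set with `loader` and label it scalar or vector.

    Hypothetical helper; `loader` is expected to behave like
    pydaddy.load_sample_dataset, returning (data, t).
    """
    summary = {}
    for name in names:
        data, t = loader(name)
        summary[name] = "vector" if len(data) == 2 else "scalar"
    return summary


# Usage, assuming pydaddy is installed:
#   import pydaddy
#   print(describe_datasets(pydaddy.load_sample_dataset))
```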

Experiment data (from experimentation or from observations)

fish-data-etroplus:

Data from an experiment conducted with a group of 30 fish, in which the group polarity in the x and y directions is recorded every 0.12 seconds.
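To show how data of this kind maps onto the input format, the sketch below computes a two-component group polarity from individual heading angles and pairs it with a 0.12 second time increment. It assumes that polarity is the average of the unit heading vectors, which is a common definition but is not stated in this document; `group_polarity` is a hypothetical helper and the headings are random placeholders, not the experimental data.

```python
import numpy as np


def group_polarity(headings):
    """Mean unit heading vector per frame.

    Hypothetical helper for illustration. `headings` has shape
    (n_frames, n_fish) and holds angles in radians.
    """
    headings = np.asarray(headings, dtype=float)
    mx = np.cos(headings).mean(axis=1)
    my = np.sin(headings).mean(axis=1)
    return mx, my


# Made-up headings: 1000 frames of 30 fish with random orientations.
rng = np.random.default_rng(2)
headings = rng.uniform(-np.pi, np.pi, size=(1000, 30))

mx, my = group_polarity(headings)
data = [mx, my]   # vector data in the form [x1, x2]
t = 0.12          # seconds between observations
```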

Source

  • The fish data is a part of the work done in the Noise-Induced Schooling of Fish

Simulation data

Synthetic data sets obtained from simulations of fish interactions.

model-data-vector-pairwise:

Pairwise interaction of fish simulated in two dimensions.

model-data-vector-ternary:

Ternary interaction of fish simulated in two dimensions.

model-data-scalar-pairwise:

Pairwise interaction of fish simulated in one dimension.

model-data-scalar-ternary:

Ternary interaction of fish simulated in one dimension.

Source

  • The simulation method is inspired by the work done in Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data

Glossary

drift

First jump moment

diffusion

Second jump moment
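In symbols, a common way to write these two jump moments for a scalar series m(t) is sketched below. The notation is assumed for this illustration: F and G for drift and diffusion, angle brackets for averages over all times at which m(t) equals m, and Δt and δt for the drift and diffusion time scales. pydaddy's own convention may differ in detail, for example in whether the drift contribution is subtracted before squaring.

```latex
F(m) = \left\langle \frac{m(t+\Delta t) - m(t)}{\Delta t} \right\rangle_{m(t) = m},
\qquad
G^{2}(m) = \left\langle \frac{\bigl(m(t+\delta t) - m(t)\bigr)^{2}}{\delta t} \right\rangle_{m(t) = m}
```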

Acknowledgement

Licence

Distributed under GNU General Public License v3.0. See Licence.txt for more information.

Reference

[1] Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data Preprint Github

[2] Noise-Induced Schooling of Fish Preprint Github