pydaddy
A package to derive an SDE from data.
pydaddy is a Python package implementing a data-driven SDE method. It provides an interface that takes time series data as input, runs the analysis, and returns an output object through which the data and the analysed results can be systematically visualized and saved.
How it works
Suppose m(t) is time series data generated by an SDE.
The package calculates the deterministic (drift) and stochastic (diffusion) components of dm/dt.
This data-driven SDE method assumes that the noise in the time series is uncorrelated and Gaussian, with zero mean and unit variance.
pydaddy extracts the noise from the data and checks whether these assumptions hold.
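The idea can be illustrated with a short NumPy sketch. This is not pydaddy's implementation: the helper estimate_drift_diffusion, the binning scheme, the linear drift fit and the simulated test process (dm = -m dt + 0.5 dW) are all assumptions made for illustration. Drift is estimated as the mean increment per unit time in each bin of m, diffusion as the mean squared increment per unit time, and the residual noise is then checked for zero mean, unit variance and lack of correlation.

```python
import numpy as np


def estimate_drift_diffusion(m, dt, n_bins=20):
    """Bin-wise drift and diffusion estimates for a scalar series m(t).

    Hypothetical helper, not part of pydaddy. Drift is the mean increment
    per unit time in each bin; diffusion is the mean squared increment per
    unit time, which estimates the squared noise amplitude.
    """
    m = np.asarray(m, dtype=float)
    dm = np.diff(m)
    m0 = m[:-1]
    edges = np.linspace(m0.min(), m0.max(), n_bins + 1)
    idx = np.clip(np.digitize(m0, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    counts = np.bincount(idx, minlength=n_bins)
    drift = np.full(n_bins, np.nan)
    diffusion = np.full(n_bins, np.nan)
    for b in np.flatnonzero(counts):
        sel = idx == b
        drift[b] = dm[sel].mean() / dt
        diffusion[b] = (dm[sel] ** 2).mean() / dt
    return centers, drift, diffusion, counts


# Simulated test data (assumed process): dm = -m dt + 0.5 dW
rng = np.random.default_rng(0)
dt, n = 0.01, 100_000
kicks = 0.5 * np.sqrt(dt) * rng.standard_normal(n - 1)
m = np.zeros(n)
for i in range(n - 1):
    m[i + 1] = m[i] - m[i] * dt + kicks[i]

centers, drift, diffusion, counts = estimate_drift_diffusion(m, dt)
ok = counts > 0

# Linear drift fit, weighting each bin by the square root of its sample count
slope, intercept = np.polyfit(centers[ok], drift[ok], 1, w=np.sqrt(counts[ok]))
# Sample-weighted average of the bin-wise diffusion estimates
g2 = np.average(diffusion[ok], weights=counts[ok])

# Residual noise: remove the fitted drift and rescale by the noise amplitude
eta = (np.diff(m) - (slope * m[:-1] + intercept) * dt) / np.sqrt(g2 * dt)
lag1 = np.corrcoef(eta[:-1], eta[1:])[0, 1]

print("drift slope:", slope)
print("diffusion:", g2)
print("noise mean, std, lag-1 autocorrelation:", eta.mean(), eta.std(), lag1)
```

If the assumptions hold, the residual noise should have a mean near 0, a standard deviation near 1, and a lag-1 autocorrelation near 0.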
Features
Simple one line execution of analysis.
Produces intuitive visuals of the data and the obtained results.
Supports time series data with both scalar and vector order parameters.
Generates interactive drift and diffusion sliders for a user-defined timescale range or list.
Diagnostics help to understand how the order of drift and diffusion changes with timescale.
Requirements
Python:
Python version >=3.5, <=3.8
Package dependencies:
statsmodels, version 0.11.1
matplotlib, version 3.2.2
scipy, version 1.5.2
numpy, version 1.19.1
seaborn, version 0.10.1
tqdm, version 4.48.2
plotly, version 4.14.3
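An existing environment can be checked against this list with a short standard-library script. This is only a sketch: the helpers missing_packages and python_supported are hypothetical, and the check covers whether each package is importable, not which version is installed.

```python
import importlib.util
import sys

# Package names taken from the dependency list above
REQUIRED = ["statsmodels", "matplotlib", "scipy", "numpy",
            "seaborn", "tqdm", "plotly"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_supported(version=sys.version_info):
    """True if the interpreter version lies in the stated 3.5 to 3.8 range."""
    return (3, 5) <= tuple(version[:2]) <= (3, 8)


print("Python version in supported range:", python_supported())
print("Missing packages:", missing_packages())
```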
Installation
pydaddy is available on both PyPI and Anaconda Cloud, so there are two recommended ways to install it: with the pip package manager or with the conda package manager.
Note
To run the example notebooks on your system after installing the package, please download the notebook files to your PC from the github repo.
Alternatively, it is recommended to simply clone or download the git repository.
Installing via conda
Conda is a package manager that handles sourcing of all dependencies in a relatively straightforward, cross-platform manner. pydaddy is available on the tee-lab channel.
Important
Make sure you have anaconda or miniconda installed and have activated the conda default (base) environment using conda activate base, before proceeding.
Installing pydaddy with all its dependencies using conda only requires running the command
conda install -c tee-lab pydaddy
To install pydaddy in a clean virtual environment use
conda create --name MY_ENV_NAME -c tee-lab pydaddy
Replace MY_ENV_NAME with your desired name for the environment.
You can now access the pydaddy package by activating the newly created environment with conda activate MY_ENV_NAME
Installing via pip
To install using the pip package manager, run:
python -m pip install pydaddy
Manual installation
This method is not recommended unless you experience problems with conda or pip. To install using setuptools, download the source code manually and run python setup.py install from the terminal.
This installs the package into your currently active environment, if you are working in one.
If you have git installed, you can clone the repo and install using the following commands.
$ git clone https://github.com/tee-lab/pydaddy.git
$ cd pydaddy
$ python setup.py install
Important
Without git, download the zip file of the source code from the GitHub repository, extract it, and follow the instructions above.
Usage
The time series data is given as input to Characterize, along with any optional parameters.
- class pydaddy.Characterize(data, t=1.0, Dt=1, dt=1, bins=None, inc=None, inc_x=None, inc_y=None, slider_timescales=None, n_trials=1, show_summary=True, drift_threshold=None, diff_threshold=None, drift_degree=5, diff_degree=5, drift_alpha=0, diff_alpha=0, fit_functions=False, **kwargs)
Bases: object
Analyse time series data and get drift and diffusion plots.
- Parameters
data (list) – time series data to be analysed; data = [x] for scalar data and data = [x1, x2] for vector data, where x, x1 and x2 are numpy.array objects
t (float, array, optional(default=1.0)) –
float: the time increment between observations
numpy.array: the time stamps of the time series
Dt (int,'auto', optional(default='auto')) –
time scale for drift
if ‘auto’, the time scale is decided based on the drift order.
dt (int, optional(default=1)) – time scale for diffusion
inc (float, optional(default=0.01)) – increment in order parameter for scalar data
inc_x (float, optional(default=0.1)) – increment in order parameter for vector data x1
inc_y (float, optional(default=0.1)) – increment in order parameter for vector data x2
fft (bool, optional(default=True)) – if True, use the FFT method to calculate the autocorrelation; otherwise, use the standard method
slider_timescales (list, optional(default=None)) – List of timescale values to include in slider.
n_trials (int, optional(default=1)) – Number of trials, concatenated timeseries of multiple trials is used.
show_summary (bool, optional(default=True)) – print data summary and show summary chart.
**kwargs – all the parameters for inherited methods.
- Returns
output – object to access the analysed data, parameters, plots and save them.
- Return type
Key parameters
- data: list
time series data to be analysed, data = [x] for scalar data and data = [x1, x2] for vector where x, x1 and x2 are of numpy.array types.
- t: array or float
a float giving the time increment between observations, or a numpy.array holding the time stamps of the time series
See the docstrings or code documentation for more information.
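The input conventions for data and t can be illustrated as follows. This is a sketch on synthetic arrays: check_input and run_characterize are hypothetical helpers, not part of pydaddy, and the sine and cosine series are placeholders for real measurements. The only pydaddy call used is Characterize with the data, t and show_summary arguments listed in the signature above, and it sits inside a function so the block runs without pydaddy installed.

```python
import numpy as np


def check_input(data, t):
    """Validate data and t against the conventions described above.

    Hypothetical helper, not part of pydaddy. Returns 'scalar' or 'vector'.
    """
    if not isinstance(data, list) or len(data) not in (1, 2):
        raise ValueError("data must be [x] or [x1, x2]")
    arrays = [np.asarray(a, dtype=float) for a in data]
    if any(a.ndim != 1 for a in arrays) or len({a.shape[0] for a in arrays}) != 1:
        raise ValueError("all components must be 1-D arrays of equal length")
    n = arrays[0].shape[0]
    if not np.isscalar(t) and np.asarray(t).shape != (n,):
        raise ValueError("t must be a float or one time stamp per sample")
    return "scalar" if len(arrays) == 1 else "vector"


# Placeholder series standing in for real measurements
grid = np.linspace(0.0, 10.0, 500)
x = np.sin(grid)                      # scalar order parameter
x1, x2 = np.cos(grid), np.sin(grid)   # two components of a vector order parameter

t_step = 0.12                         # t as a constant time increment
t_stamps = np.arange(x.size) * t_step  # t as explicit time stamps

print(check_input([x], t_step))
print(check_input([x1, x2], t_stamps))


def run_characterize(data, t):
    """Validate the input, then hand it to pydaddy (requires pydaddy)."""
    import pydaddy
    check_input(data, t)
    return pydaddy.Characterize(data, t, show_summary=False)
```

With pydaddy installed, run_characterize([x1, x2], t_stamps) would pass the vector data on to Characterize.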
Example using sample data set
See Data set description for more information about the datasets.
import pydaddy
# Load data
data, t = pydaddy.load_sample_dataset('model-data-vector-ternary')
# Analyse
ddsde = pydaddy.Characterize(data, t)
# Show drift slider plot
ddsde.drift()
# Show diffusion slider plot
ddsde.diffusion()
# Show time series plot
ddsde.timeseries()
# Show histograms
ddsde.histograms()
# Show all input, calculated and assumed parameters of the analysis
ddsde.parameters()
# Export data to disk
ddsde.export_data()
Characterize returns an output object in which all analysed results are stored. Results can be visualised or stored by calling the appropriate functions:
- summary(): show summary
- drift(): drift slider plot
- diffusion(): diffusion slider plot
- timeseries(): time series plot
- histograms(): histogram plots
- noise_characterstics(): noise characteristics plots
- visualise(timescale): drift and diffusion plots for a timescale
- diagnostics(): diagnostics plots
- data(timescale): get drift and diffusion data for a timescale
- export_data(): save data as csv files and mat files
- plot_data(data): plot data on a 3d axis
For more examples see this notebook.
Motivation
This project is motivated by the study of group behaviour dynamics of animals, especially schooling fish. The sample data distributed along with this package is from experiments conducted by TEElab, IISc.
Data set description
pydaddy includes six sample data sets, which can be loaded using the load_sample_dataset(dataset_name) function.
- class pydaddy.load_sample_dataset(name)
Load sample data set provided.
Available data sets:
‘fish-data-etroplus’
‘model-data-scalar-pairwise’
‘model-data-scalar-ternary’
‘model-data-vector-pairwise’
‘model-data-vector-ternary’
- Parameters
name (str) – name of the data set
- Returns
data (list) – timeseries data
t (float, array) – timescale
Experimental data (from experiments or observations)
- fish-data-etroplus:
Data from an experiment conducted with a group of 30 fish, in which the group polarity in the x and y directions is recorded every 0.12 seconds.
Source
The fish data is part of the work reported in Noise-Induced Schooling of Fish.
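Since this data set stores polarity as separate x and y components sampled every 0.12 seconds, one may want to look at the polarity magnitude and an explicit time axis. The sketch below shows both on a few made-up values. The helpers polarization_magnitude and time_stamps are hypothetical, not part of pydaddy, and the sample numbers are not taken from the data set.

```python
import numpy as np


def polarization_magnitude(mx, my):
    """Magnitude of the polarity vector at each time step."""
    return np.hypot(np.asarray(mx, dtype=float), np.asarray(my, dtype=float))


def time_stamps(n_samples, step=0.12):
    """Time stamps in seconds for n_samples observations taken every `step` s."""
    return np.arange(n_samples) * step


# Made-up polarity components, for illustration only
mx = np.array([0.6, 0.0, -0.3])
my = np.array([0.8, 0.5, 0.4])

magnitude = polarization_magnitude(mx, my)
stamps = time_stamps(mx.size)
print(magnitude)
print(stamps)
```

With pydaddy installed, the components would presumably come from data, t = pydaddy.load_sample_dataset('fish-data-etroplus') followed by mx, my = data, assuming the [x1, x2] layout described under Usage.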
Simulation data
Synthetic data sets obtained from simulations of fish interactions.
- model-data-vector-pairwise:
Pairwise interaction of fish simulated in two dimensions.
- model-data-vector-ternary:
Ternary interaction of fish simulated in two dimensions.
- model-data-scalar-pairwise:
Pairwise interaction of fish simulated in one dimension.
- model-data-scalar-ternary:
Ternary interaction of fish simulated in one dimension.
Source
The simulation method is inspired by the work reported in Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data.
Licence
Distributed under the GNU General Public License v3.0. See Licence.txt for more information.
Reference
[1] Noise-induced Effects in Collective Dynamics and Inferring Local Interactions from Data (preprint; code on GitHub)