NSDF file writer

nsdfwriter Module

Writer for NSDF file format.

class nsdf.nsdfwriter.NSDFWriter(filename, dialect='ONED', mode='a', **h5args)[source]

Bases: object

Writer for NSDF files.

An NSDF file has three main groups: /model, /data and /map.

mode

str

File open mode. Defaults to append (‘a’); ‘w’ and ‘w+’ are also accepted.

dialect

nsdf.dialect member

ONED for storing nonuniformly sampled and event data in 1D arrays.

VLEN for storing such data in 2D VLEN datasets.

NANPADDED for storing such data in 2D homogeneous datasets with NaN padding.
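
The three dialects can be pictured without touching HDF5. A stdlib-only sketch (the Python lists stand in for the datasets nsdf would create; the source names and event times are invented):

```python
import math

# Event times for two hypothetical sources; "a" has 3 events, "b" has 2.
events = {"a": [0.1, 0.5, 0.9], "b": [0.2, 0.7]}

# ONED: one separate 1D array per source.
oned = {uid: list(times) for uid, times in events.items()}

# VLEN: one 2D ragged array, one row per source.
vlen = [events["a"], events["b"]]

# NANPADDED: one homogeneous 2D array, shorter rows padded with NaN.
width = max(len(times) for times in events.values())
nanpadded = [times + [math.nan] * (width - len(times))
             for times in events.values()]
```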

model

h5.Group

/model group

data

h5.Group

/data group

mapping

h5.Group

/map group

time_dim

h5.Group

/map/time group contains the sampling time points as dimension scales of data. It is mainly used for nonuniformly sampled data.

modeltree

h5.Group

/model/modeltree group can be used for storing the model in a hierarchical manner. Each subgroup under modeltree is a model component and can contain other subgroups representing subcomponents. Each group stores the unique identifier of the model component it represents in the string attribute uid.

add_event_1d(source_ds, data_object, source_name_dict=None, fixed=False)[source]

Add event time data when data from each source is in a separate 1D dataset.

For a population of sources called {population}, a group /map/event/{population} must first be created (using add_event_ds). This is passed as the source_ds argument.

When adding the data, the uid of the sources and the names for the corresponding datasets must be specified in source_name_dict and this function will create one dataset for each source under /data/event/{population}/{name} where {name} is the name of the data_object, preferably the field name.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset /map/event/{populationname}/{variablename} created for this population of sources (created by add_event_ds_1d). The name of this group reflects that of the group under /data/event which stores the datasets.
  • data_object (nsdf.EventData) – NSDFData object storing the data for all sources in source_ds.
  • source_name_dict (dict) – mapping from source id to dataset name. If None (default) it tries to use the uids in the source_ds. If the uids do not fit the hdf5 naming convention, the index of the entries in source_ds will be used.
  • fixed (bool) – if True, the data cannot grow. Default: False
Returns:

dict mapping source ids to datasets.
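
The naming fallback described for source_name_dict can be sketched in plain Python. This is an illustration of the documented rule (use the uid unless it is not a valid HDF5 name, i.e. contains ‘.’ or ‘/’, in which case fall back to indices), not the library's actual helper:

```python
def dataset_names(uids):
    # Use each uid as its dataset name; if any uid is not a valid
    # HDF5 name (contains '.' or '/'), fall back to list indices.
    if any('.' in uid or '/' in uid for uid in uids):
        return {uid: str(index) for index, uid in enumerate(uids)}
    return {uid: uid for uid in uids}

dataset_names(["cell0", "cell1"])            # {'cell0': 'cell0', 'cell1': 'cell1'}
dataset_names(["net/cell.0", "net/cell.1"])  # {'net/cell.0': '0', 'net/cell.1': '1'}
```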

add_event_ds(name, idlist)[source]

Create a group under /map/event with name name to store mapping between the datasources and event data.

Parameters:
  • name (str) – name with which the datasource list should be stored. This will represent a population of data sources.
  • idlist (list) – unique ids of the data sources.
Returns:

The HDF5 Group /map/event/{name}.

add_event_ds_1d(popname, varname, idlist)[source]

Create a dataset named varname under /map/event/{popname} to store the mapping between the datasources and event data.

Parameters:
  • popname (str) – name of the group under which the datasource list should be stored. This will represent a population of data sources.
  • varname (str) – name of the dataset mapping source uid to data. This should be the same as the name of the recorded variable.
  • idlist (list) – unique ids of the data sources.
Returns:

The HDF5 Dataset /map/event/{popname}/{varname}.

add_event_nan(source_ds, data_object, fixed=False)[source]

Add event data when data from all sources in a population is stored in a 2D array with NaN padding.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset under /map/event created for this population of sources (created by add_event_ds).
  • data_object (nsdf.EventData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, this is a one-time write and the data cannot grow. Default: False
Returns:

HDF5 Dataset containing the data.
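
Because event times are never NaN, the padding is lossless and the original ragged rows can be recovered. A stdlib sketch of the round trip (illustrative helpers, not the writer's internal code):

```python
import math

def nan_pad(rows):
    # Pad ragged rows with NaN to form a homogeneous 2D array.
    width = max(len(row) for row in rows)
    return [row + [math.nan] * (width - len(row)) for row in rows]

def nan_strip(row):
    # Drop the NaN padding to recover the original events.
    return [x for x in row if not math.isnan(x)]

rows = [[0.1, 0.4], [0.2], [0.05, 0.3, 0.8]]
padded = nan_pad(rows)
recovered = [nan_strip(row) for row in padded]  # equals rows
```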

add_event_vlen(source_ds, data_object, fixed=False)[source]

Add event data when data from all sources in a population is stored in a 2D ragged array.

When adding the data, the uid of the sources and the names for the corresponding datasets must be specified and this function will create the dataset /data/event/{population}/{name} where {name} is the name of the data_object, preferably the name of the field being recorded.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset under /map/event created for this population of sources (created by add_event_ds).
  • data_object (nsdf.EventData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, this is a one-time write and the data cannot grow. Default: False
Returns:

HDF5 Dataset containing the data.

Notes

Concatenating old data with new data and reassigning is a poor choice for saving data incrementally. HDF5 does not seem to support appending data to VLEN datasets.

h5py does not support VLEN datasets with float64 elements. Change the dtype to np.float64 once such support is available.

add_model_filecontents(filenames, ascii=True, recursive=True)[source]

Add the files and directories listed in filenames to /model/filecontents.

This function is for storing the contents of model files in the NSDF file. It is useful for external formats such as NeuroML, NineML, SBML and NEURON/GENESIS scripts. Each directory is stored as a group and each file is stored as a dataset.

Parameters:
  • filenames (sequence) – the paths of files and/or directories which contain model information.
  • ascii (bool) – whether the files are ASCII text.
  • recursive (bool) – whether to recursively store subdirectories.
add_modeltree(root, target='/')[source]

Add an entire model tree. This will cause the modeltree rooted at root to be written to the NSDF file.

Parameters:
  • root (ModelComponent) – root of the source tree.
  • target (str) – target node path in NSDF file with respect to ‘/model/modeltree’. root and its children are added under this group.
add_nonuniform_1d(source_ds, data_object, source_name_dict=None, fixed=False)[source]

Add nonuniform data when data from each source is in a separate 1D dataset.

For a population of sources called {population}, a group /map/nonuniform/{population} must first be created (using add_nonuniform_ds). This is passed as the source_ds argument.

When adding the data, the uid of the sources and the names for the corresponding datasets must be specified and this function will create one dataset for each source under /data/nonuniform/{population}/{name} where {name} is the name of the data_object, preferably the name of the field being recorded.

This function can be used when different sources in a population are sampled at different time points for a field value. Such a case may arise when each member of the population is simulated using a variable-timestep method like CVODE and the timestep is not global.

Parameters:
  • source_ds (HDF5 dataset) – the dataset /map/nonuniform/{population}/{variable} created for this population of sources (created by add_nonuniform_ds_1d).
  • data_object (nsdf.NonuniformData) – NSDFData object storing the data for all sources in source_ds.
  • source_name_dict (dict) – mapping from source id to dataset name. If None (default), the uids of the sources will be used as dataset names. If the uids are not compatible with HDF5 names (contain ‘.’ or ‘/’), then the index of the source in source_ds will be used.
  • fixed (bool) – if True, the data cannot grow. Default: False
Returns:

dict mapping source ids to the tuple (dataset, time).

Raises:

AssertionError when dialect is not ONED.
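
The data shape this dialect handles can be pictured with plain Python (the source names and numbers below are invented for illustration): each source carries its own value array and its own, differently sized, time array, so a shared 2D dataset would not fit.

```python
# Two sources integrated with independent variable timesteps,
# so their sampling times differ in both values and length.
recordings = {
    "cell0": {"time": [0.0, 0.13, 0.41], "Vm": [-70.0, -65.2, -50.1]},
    "cell1": {"time": [0.0, 0.27], "Vm": [-70.0, -68.9]},
}

# Within a source, values and times must stay aligned ...
for rec in recordings.values():
    assert len(rec["time"]) == len(rec["Vm"])

# ... but across sources the lengths need not match.
lengths = {uid: len(rec["time"]) for uid, rec in recordings.items()}
```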

add_nonuniform_ds(popname, idlist)[source]

Add the sources listed in idlist under /map/nonuniform/{popname}.

Parameters:
  • popname (str) – name with which the datasource list should be stored. This will represent a population of data sources.
  • idlist (list of str) – list of unique identifiers of the data sources.
Returns:

An HDF5 Dataset storing the source ids when dialect is VLEN or NANPADDED. This is converted into a dimension scale when actual data is added.

Raises:

AssertionError if idlist is empty or dialect is ONED.

add_nonuniform_ds_1d(popname, varname, idlist)[source]

Add the sources listed in idlist under /map/nonuniform/{popname}/{varname}.

In case of 1D datasets, for each variable we store the mapping from source id to dataset reference in a two-column compound dataset with dtype=[(‘source’, VLENSTR), (‘data’, REFTYPE)].

Parameters:
  • popname (str) – name with which the datasource list should be stored. This will represent a population of data sources.
  • varname (str) – name of the variable being recorded. The same name should be passed when the actual data is added.
  • idlist (list of str) – list of unique identifiers of the data sources.
Returns:

An HDF5 Dataset storing the source ids in source column.

Raises:

AssertionError if idlist is empty or if dialect is not ONED.

add_nonuniform_nan(source_ds, data_object, fixed=False)[source]

Add nonuniform data when data from all sources in a population is stored in a 2D array with NaN padding.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset under /map/nonuniform created for this population of sources (created by add_nonuniform_ds).
  • data_object (nsdf.NonuniformData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, this is a one-time write and the data cannot grow. Default: False
Returns:

HDF5 Dataset containing the data.

Notes

Concatenating old data with new data and reassigning is a poor choice for saving data incrementally. HDF5 does not seem to support appending data to VLEN datasets.

h5py does not support VLEN datasets with float64 elements. Change the dtype to np.float64 once such support is available.

add_nonuniform_regular(source_ds, data_object, fixed=False)[source]

Append nonuniformly sampled variable values from sources to data. In this case the sampling times of all the sources are the same, and the data is stored in a 2D dataset.

Parameters:
  • source_ds – the dataset storing the source ids under map. This is attached to the stored data as a dimension scale called source on the row dimension.
  • data_object (nsdf.NonuniformData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, the data cannot grow. Default: False
Returns:

HDF5 dataset storing the data

Raises:
  • KeyError if the sources in `data_object` do not match those in `source_ds`.
  • ValueError if the data arrays are not all equal in length.
  • ValueError if dt is not specified or <= 0 when inserting data for the first time.
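
The equal-length constraint behind the ValueError can be sketched as a plain-Python check (a hypothetical helper mirroring the documented behaviour, not the writer's code):

```python
def check_regular(rows, times):
    # Every source shares the same sampling times, so every row
    # must have exactly one value per time point.
    if any(len(row) != len(times) for row in rows):
        raise ValueError("data arrays are not all equal in length")
    return [list(row) for row in rows]

times = [0.0, 0.1, 0.3, 0.7]            # one shared nonuniform grid
data = check_regular([[1.0, 2.0, 3.0, 4.0],
                      [5.0, 6.0, 7.0, 8.0]], times)
```
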
add_nonuniform_vlen(source_ds, data_object, fixed=False)[source]

Add nonuniform data when data from all sources in a population is stored in a 2D ragged array.

When adding the data, the uid of the sources and the names for the corresponding datasets must be specified and this function will create the dataset /data/nonuniform/{population}/{name} where {name} is the name of the data_object, preferably the name of the field being recorded.

This function can be used when different sources in a population are sampled at different time points for a field value. Such a case may arise when each member of the population is simulated using a variable-timestep method like CVODE and the timestep is not global.

Parameters:
  • source_ds (HDF5 dataset) – the dataset under /map/nonuniform created for this population of sources (created by add_nonuniform_ds).
  • data_object (nsdf.NonuniformData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, this is a one-time write and the data cannot grow. Default: False
Returns:

tuple containing HDF5 Datasets for the data and sampling times.

TODO:

Concatenating old data with new data and reassigning is a poor choice. We are waiting for a response from the h5py mailing list about appending data to rows of VLEN datasets. If that is not possible, a VLEN dataset is technically a poor choice.

h5py does not support VLEN datasets with float64 elements. Change the dtype to np.float64 once such support is available.

add_static_data(source_ds, data_object, fixed=True)[source]

Append static data variable values from sources to data.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset storing the source ids under map. This is attached to the stored data as a dimension scale called source on the row dimension.
  • data_object (nsdf.EventData) – NSDFData object storing the data for all sources in source_ds.
  • fixed (bool) – if True, the data cannot grow. Default: True

Returns:

HDF5 dataset storing the data

Raises:
  • KeyError if the sources in `data_object` do not match those in `source_ds`.
add_static_ds(popname, idlist)[source]

Add the sources listed in idlist under /map/static.

Parameters:
  • popname (str) – name with which the datasource list should be stored. This will represent a population of data sources.
  • idlist (list of str) – list of unique identifiers of the data sources.
Returns:

An HDF5 Dataset storing the source ids. This is converted into a dimension scale when actual data is added.

add_uniform_data(source_ds, data_object, tstart=0.0, fixed=False)[source]

Append uniformly sampled variable values from sources to data.

Parameters:
  • source_ds (HDF5 Dataset) – the dataset storing the source ids under map. This is attached to the stored data as a dimension scale called source on the row dimension.
  • data_object (nsdf.UniformData) – Uniform dataset to be added to file.
  • tstart (double) – (optional) start time of this dataset recording. Defaults to 0.
  • fixed (bool) – if True, the data cannot grow. Default: False
Returns:

HDF5 dataset storing the data

Raises:
  • KeyError if the sources in `data_object` do not match those in `source_ds`.
  • ValueError if dt is not specified or <= 0 when inserting data for the first time.
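
For uniformly sampled data the time axis is implicit: only dt (and optionally tstart) need be stored. A stdlib sketch of how sampling times relate to tstart and dt, including the documented dt check (the helper name is made up):

```python
def sample_times(n, dt, tstart=0.0):
    # Reconstruct the implicit time axis of a uniform dataset.
    if dt <= 0:
        raise ValueError("dt must be specified and > 0")
    return [tstart + i * dt for i in range(n)]

sample_times(4, 0.25)             # [0.0, 0.25, 0.5, 0.75]
sample_times(3, 0.5, tstart=1.0)  # [1.0, 1.5, 2.0]
```
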
add_uniform_ds(name, idlist)[source]

Add the sources listed in idlist under /map/uniform.

Parameters:
  • name (str) – name with which the datasource list should be stored. This will represent a population of data sources.
  • idlist (list of str) – list of unique identifiers of the data sources.
Returns:

An HDF5 Dataset storing the source ids. This is converted into a dimension scale when actual data is added.

contributor[source]

List of contributors to the content of this file.

description[source]

Description of the file. A text string.

license[source]

License information about the file. This is a text string.

method[source]

(numerical) methods applied in generating the data.

rights[source]

The rights of the file contents.

set_properties(properties)[source]

Set the file attributes (environments).

Parameters:
  • properties (dict) – mapping property names to values. It must contain the following keys: title (str), creator (list of str), software (list of str), method (list of str), description (str), rights (str), tstart (datetime.datetime), tend (datetime.datetime), contributor (list of str).
Raises:

KeyError if not all environment properties are specified in the dict.
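
The required keys can be validated up front with a small stdlib check (the key list is taken from the documentation above; the helper itself is illustrative, not part of the library):

```python
REQUIRED_PROPERTIES = {
    "title", "creator", "software", "method", "description",
    "rights", "tstart", "tend", "contributor",
}

def check_properties(properties):
    # Raise KeyError, as set_properties does, when keys are missing.
    missing = REQUIRED_PROPERTIES - set(properties)
    if missing:
        raise KeyError("missing properties: %s" % sorted(missing))
```
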
software[source]

Software (one or more) used to generate the data in the file.

tend[source]

End time of the simulation/recording.

title[source]

Title of the file

tstart[source]

Start time of the simulation / data recording. A string representation of the timestamp in ISO format.

nsdf.nsdfwriter.add_model_component(component, parentgroup)[source]

Add a model component as a group under parentgroup.

This creates a group component.name under parent group if not already present. The uid of the component is stored in the uid attribute of the group. Key-value pairs in the component.attrs dict are stored as attributes of the group.

Parameters:
  • component (ModelComponent) – model component object to be written to NSDF file.
  • parentgroup (HDF Group) – group under which this component’s group should be created.
Returns:

HDF Group created for this model component.

Raises:

KeyError if the parentgroup is None and no group corresponding to the component’s parent exists.
nsdf.nsdfwriter.match_datasets(hdfds, pydata)[source]

Match entries in hdfds with those in pydata. Returns True if the two sets are equal, False otherwise.

nsdf.nsdfwriter.write_ascii_file(group, name, fname, **compression_opts)[source]

Add a dataset name under group and store the contents of text file fname in it.

nsdf.nsdfwriter.write_binary_file(group, name, fname, **compression_opts)[source]

Add a dataset name under group and store the contents of binary file fname in it.

nsdf.nsdfwriter.write_dir_contents(root_group, root_dir, ascii, **compression_opts)[source]

Walk the directory tree rooted at root_dir and replicate it under root_group in HDF5 file.

This is a helper function for copying a model directory structure and file contents into an HDF5 file. If ascii=True, all files are treated as ASCII text; otherwise all files are taken as binary blobs.

Parameters:
  • root_group (h5py.Group) – group under which the directory tree is to be created.
  • root_dir (str) – path of the directory from which to start traversal.
  • ascii (bool) – whether to treat each file as ascii text file.
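
The traversal can be mimicked with os.walk, with nested dicts standing in for HDF5 groups and bytes values for datasets (an illustrative stand-in, not the library's implementation):

```python
import os
import tempfile

def replicate(root_dir):
    # Mirror a directory tree as nested dicts: directories become
    # dicts ("groups"), files become bytes values ("datasets").
    tree = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        rel = os.path.relpath(dirpath, root_dir)
        node = tree
        if rel != os.curdir:
            for part in rel.split(os.sep):
                node = node.setdefault(part, {})
        for fname in filenames:
            with open(os.path.join(dirpath, fname), "rb") as fhandle:
                node[fname] = fhandle.read()
    return tree

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    with open(os.path.join(root, "model.txt"), "w") as out:
        out.write("soma")
    with open(os.path.join(root, "sub", "params.txt"), "w") as out:
        out.write("dt=0.1")
    tree = replicate(root)
```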