storage

Experiment data storage.

There are two main use cases for the functionality in this module: reading/writing data during an experiment session, and reading data once an experiment is complete (i.e. for analysis). See the user guide for information on these use cases.

class axopy.storage.Storage(root='data', allow_overwrite=False)[source]

Top-level data storage maintainer.

See the user guide for more information.

Parameters:
  • root (str, optional) – Path to the root of the data storage file structure. By default, ‘data’ is used. If the directory doesn’t exist, it is created.

  • allow_overwrite (bool, optional) – Specifies whether or not the storage interface allows you to overwrite a task’s data for a subject if it already exists.

create_task(task_id)[source]

Create a task for the current subject.

Parameters:

task_id (str) – The ID of the task to add. The name must not have been used for another task for the current subject.

Returns:

writer – A new TaskWriter for storing task data.

Return type:

TaskWriter

require_task(task_id)[source]

Retrieves a task for the current subject.

Parameters:

task_id (str) – The ID of the task to look for. The task must have already been run with the current subject.

Returns:

reader – A new TaskReader for working with the existing task data.

Return type:

TaskReader

property subject_id

The current subject ID.

When setting the subject ID for a new subject (i.e. one that doesn’t exist already), storage for that subject is created.

property subject_ids

Generate subject IDs found in storage sorted in alphabetical order.

Returns:

subject_id – ID of the subject found.

Return type:

str
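The alphabetical ordering described above can be sketched with the standard library alone. This is a minimal illustration, not AxoPy's implementation: the function name iter_subject_ids is hypothetical, and the one-directory-per-subject layout is an assumption inferred from the example path ‘data/subject_1/taskname’ given for TaskWriter.

```python
import os
import tempfile


def iter_subject_ids(root):
    """Yield the names of subdirectories of `root` in alphabetical order.

    Hypothetical helper: assumes each subject has its own directory
    directly under the storage root.
    """
    for name in sorted(os.listdir(root)):
        if os.path.isdir(os.path.join(root, name)):
            yield name


# Build a throwaway hierarchy of the form <root>/<subject>/<task>.
root = tempfile.mkdtemp()
for subject in ("subject_2", "subject_1"):
    os.makedirs(os.path.join(root, subject, "taskname"))

subject_ids = list(iter_subject_ids(root))
print(subject_ids)
```

Because the helper is a generator, nothing is yielded for an empty root, which mirrors the behaviour described for task_ids.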

property task_ids

Generate names of tasks found for the current subject.

Note that there may be no tasks found if the subject_id has not been set or if the subject hasn’t started any tasks. In this case, nothing is yielded.

to_zip(outfile)[source]

Create a ZIP archive from a data storage hierarchy.

For more information, see storage_to_zip().

class axopy.storage.TaskReader(root)[source]

High-level interface to task storage.

Parameters:

root (str) – Path to task’s root directory. This is the directory specific to a task which contains a trials.csv file and HDF5 array files.

array(name)[source]

Retrieve an array type’s data for all trials.

iterarray(name)[source]

Iteratively retrieve an array for each trial.

Parameters:

name (str) – Name of the array type.

pickle(name)[source]

Load a pickled object from storage.

Parameters:

name (str) – Name of the pickled object (no extension).

property trials

A Pandas DataFrame representing the trial data.

class axopy.storage.TaskWriter(root)[source]

The main interface for storing data from a task.

Usually you get a TaskWriter from Storage, so you don’t normally need to create one yourself.

Parameters:

root (str) – Path to the task root (e.g. ‘data/subject_1/taskname’).

trials

TrialWriter for storing trial data.

Type:

TrialWriter

pickle(obj, name)[source]

Write a generic object to storage.

This can be useful to persist an object from one task to another, or to store something that doesn’t easily fit into the AxoPy storage model (trial attributes and arrays). Be cautious, however, as pickles are not the best way to store things long-term nor securely. See the advice given here, for example: http://scikit-learn.org/stable/modules/model_persistence.html

Parameters:
  • obj (object) – The object to pickle.

  • name (str) – Name of the pickle to save (no extension).
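The round trip of saving an object under a name and loading it back later can be sketched as follows. This uses only the standard pickle module and does not show AxoPy's internals: the helper names save_pickle and load_pickle are hypothetical, and the ".pkl" extension is an assumption, since the documentation only says the name is given without an extension.

```python
import os
import pickle
import tempfile


def save_pickle(task_root, obj, name):
    """Pickle `obj` to <task_root>/<name>.pkl (extension is an assumption)."""
    path = os.path.join(task_root, name + ".pkl")
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return path


def load_pickle(task_root, name):
    """Load the object previously stored under `name`."""
    path = os.path.join(task_root, name + ".pkl")
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a task directory such as 'data/subject_1/taskname'.
task_root = tempfile.mkdtemp()
calibration = {"gain": 2.5, "channels": [0, 1, 2]}

saved_path = save_pickle(task_root, calibration, "calibration")
restored = load_pickle(task_root, "calibration")
```

As the note above warns, only unpickle files you created yourself: loading a pickle can execute arbitrary code.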

write(trial)[source]

Write trial data.

This must be the last thing done for the current trial. That is, make sure all arrays have accumulated all data required. This method flushes trial and array data to files for you.

Important note: The trial’s arrays are cleared after writing.

Parameters:

trial (Trial) – Trial data. See TrialWriter.write() and Trial for details.

class axopy.storage.TrialWriter(filepath)[source]

Writes trial data to a CSV file line by line.

Parameters:

filepath (str) – Path to the file to create.

data

Dictionary containing all trial data written so far.

Type:

dict

write(data)[source]

Add a single row to the trials dataset.

Data is immediately added to the file on disk.

Parameters:

data (dict) – Data values to add.
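Writing each row to disk as soon as it arrives can be sketched with the standard csv module. The class below is a hypothetical stand-in, not AxoPy's TrialWriter: the name CsvRowWriter and the column-to-list layout of its data attribute are assumptions made for illustration.

```python
import csv
import os
import tempfile


class CsvRowWriter:
    """Hypothetical stand-in that appends dict rows to a CSV file."""

    def __init__(self, filepath):
        self.filepath = filepath
        # Assumed layout: column name -> list of values written so far.
        self.data = {}

    def write(self, row):
        first_row = not self.data
        if first_row:
            # The first row fixes the set and order of columns.
            self.data = {key: [] for key in row}
        for key in self.data:
            self.data[key].append(row[key])
        # Open, append, and close on every call so the row is on disk
        # before write() returns.
        with open(self.filepath, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(self.data))
            if first_row:
                writer.writeheader()
            writer.writerow(row)


path = os.path.join(tempfile.mkdtemp(), "trials.csv")
trials = CsvRowWriter(path)
trials.write({"trial": 0, "target": "left"})
trials.write({"trial": 1, "target": "right"})

# Read the file back; CSV stores everything as text.
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))
```

Appending per row costs one file open per trial, but a crash in the middle of a session loses at most the trial in progress.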

axopy.storage.makedirs(path, exist_ok=False)[source]

Recursively create directories.

This is needed for Python versions earlier than 3.2. On Python 3.2 and later, os.makedirs(path, exist_ok=True) would suffice.

Parameters:
  • path (str) – Path to directory to create.

  • exist_ok (bool, optional) – If exist_ok is False (default), an exception is raised if the directory already exists. Set to True if it is acceptable that the directory already exists.
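One common way to get this behaviour on interpreters whose os.makedirs lacks the exist_ok argument is to catch the "already exists" error. The sketch below shows that pattern; it is not necessarily how AxoPy implements it, and the name makedirs_compat is hypothetical.

```python
import errno
import os
import tempfile


def makedirs_compat(path, exist_ok=False):
    """Recursively create `path`, optionally tolerating an existing directory."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Swallow the error only if the caller allows it and the existing
        # path really is a directory (not a file with the same name).
        already_there = exc.errno == errno.EEXIST and os.path.isdir(path)
        if not (exist_ok and already_there):
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "data", "subject_1", "taskname")

makedirs_compat(target)                 # creates all three levels
makedirs_compat(target, exist_ok=True)  # second call is a no-op

try:
    makedirs_compat(target)             # exist_ok defaults to False
    raised = False
except OSError:
    raised = True
```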

axopy.storage.read_hdf5(filepath, dataset='data')[source]

Read the contents of a dataset.

This function assumes the dataset in the HDF5 file exists at the root of the file (i.e. at ‘/’). It is primarily for internal usage but you may find it useful for quickly grabbing an array from an HDF5 file.

Parameters:
  • filepath (str) – Path to the file to read from.

  • dataset (str, optional) – Name of the dataset to retrieve. By default, ‘data’ is used.

Returns:

data – The data (read into memory) as a NumPy array. The dtype, shape, etc. are all determined by whatever is in the file.

Return type:

ndarray

axopy.storage.storage_to_zip(path, outfile=None)[source]

Create a ZIP archive from a data storage hierarchy.

The contents of the data storage hierarchy are all placed in the archive, with the top-level folder in the archive being the data storage root folder itself. That is, all paths within the ZIP file are relative to the dataset root folder.

Parameters:
  • path (str) – Path to the root of the dataset.

  • outfile (str, optional) – Name of the ZIP file to create. If not specified, the file is created in the same directory as the data root with the same name as the dataset root directory (with “.zip” added).

Returns:

outfile – The name of the ZIP file created.

Return type:

str
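An archive layout like the one described above can be sketched with the standard zipfile module. This is an illustration under stated assumptions rather than AxoPy's code: the function name zip_hierarchy is hypothetical, and the sketch takes the reading in which the root folder itself appears as the top-level entry in the archive. Empty directories are not stored.

```python
import os
import tempfile
import zipfile


def zip_hierarchy(path, outfile=None):
    """Zip everything under `path`, keeping `path` as the top-level folder."""
    path = os.path.abspath(path)
    parent = os.path.dirname(path)
    if outfile is None:
        # Default: next to the data root, named after it.
        outfile = path + ".zip"
    with zipfile.ZipFile(outfile, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(path):
            for filename in filenames:
                full = os.path.join(dirpath, filename)
                # Archive names are taken relative to the root's parent,
                # so they start with the root folder's own name.
                zf.write(full, arcname=os.path.relpath(full, parent))
    return outfile


# Build a tiny hierarchy: data/subject_1/taskname/trials.csv
base = tempfile.mkdtemp()
root = os.path.join(base, "data")
task_dir = os.path.join(root, "subject_1", "taskname")
os.makedirs(task_dir)
with open(os.path.join(task_dir, "trials.csv"), "w") as f:
    f.write("trial,target\n0,left\n")

archive = zip_hierarchy(root)
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
```

Writing the archive beside the root rather than inside it keeps the ZIP file from being picked up by the directory walk.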

axopy.storage.write_hdf5(filepath, data, dataset='data')[source]

Write data to an HDF5 file.

The data is written to a new file with a single dataset in the root group, named by the dataset parameter (“data” by default). It is primarily for internal usage but you may find it useful for quickly writing an array to an HDF5 file.

Parameters:
  • filepath (str) – Path to the file to be written.

  • data (ndarray) – NumPy array containing the data to write. The dtype, shape, etc. of the resulting dataset in storage is determined by this array directly.

  • dataset (str, optional) – Name of the dataset to create. Default is ‘data’.