flexiznam.schema package¶
Module contents¶
This module handles all the different types of dataset we can have.
The main functions common to all datasets are implemented in the main flexiznam.schema.datasets.Dataset class.
Each type of dataset has a subclass, such as flexiznam.schema.camera_data.CameraData, that inherits from the main Dataset class and can re-implement any method if needed.
For convenience, all these dataset classes are imported in the __init__.py. You can therefore just use schema.CameraData. The __init__.py also populates the Dataset class property Dataset.SUBCLASSES, which lists all the existing subclasses. It is a dictionary with flexilims datatype names as keys and the corresponding class objects as values.
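The registry described above can be pictured with a small, self-contained sketch. It does not import flexiznam: the classes below are hypothetical stand-ins, and the explicit registration loop and the `class_for` helper are assumptions about how such a mapping could be built and queried, not the package's actual code.

```python
class Dataset:
    """Stand-in for the generic dataset class (hypothetical, not flexiznam's)."""

    DATASET_TYPE = None
    SUBCLASSES = {}  # filled in below, as the package __init__.py is described to do


class CameraData(Dataset):
    DATASET_TYPE = "camera"


class HarpData(Dataset):
    DATASET_TYPE = "harp"


# Assumed registration step: map each datatype name to its class object.
for cls in (CameraData, HarpData):
    Dataset.SUBCLASSES[cls.DATASET_TYPE] = cls


def class_for(dataset_type):
    """Return the registered subclass, or the generic Dataset if the type is unknown."""
    return Dataset.SUBCLASSES.get(dataset_type, Dataset)
```

With such a mapping, looking up a class from a datatype name is a dictionary access, and falling back to the generic class for unregistered types takes a single `get` call.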
Usage¶
The simplest entry points are the class methods flexiznam.schema.datasets.Dataset.from_folder and flexiznam.schema.datasets.Dataset.from_flexilims.
flexiznam.schema.datasets module¶
- class flexiznam.schema.datasets.Dataset(path, is_raw, dataset_type, name=None, extra_attributes=None, created=None, project=None, project_id=None, origin_id=None, flm_session=None)[source]¶
Bases: object
Master class. Should be inherited by all datasets
Class to handle dataset identification and validation
SUBCLASSES are held in different files and added to the Dataset class by schema.__init__.py
- SUBCLASSES = {'camera': <class 'flexiznam.schema.camera_data.CameraData'>, 'harp': <class 'flexiznam.schema.harp_data.HarpData'>, 'microscopy': <class 'flexiznam.schema.microscopy_data.MicroscopyData'>, 'scanimage': <class 'flexiznam.schema.scanimage_data.ScanimageData'>}¶
- associated_files(folder=None)[source]¶
Give a list of all files associated with this dataset
- Parameters
folder – Folder in which to look for files. Defaults to self.path
Returns:
- property dataset_type¶
Type of the dataset. Must be in PARAMETERS[‘dataset_types’]
- flexilims_report(flm_data=None)[source]¶
Describe the difference between the dataset and what is on flexilims
Differences are returned in a dictionary: property: (value in dataset, value in flexilims)
Attributes not present in either dataset or on flexilims are labelled as ‘N/A’
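As an illustration of the report format only, here is a minimal sketch that compares two plain dictionaries. The function name `report_differences` and the sample attributes are hypothetical; the real method works on the dataset and its flexilims entry and may differ in detail.

```python
def report_differences(dataset_attrs, flexilims_attrs):
    """Map each differing property to (value in dataset, value in flexilims)."""
    differences = {}
    for key in sorted(set(dataset_attrs) | set(flexilims_attrs)):
        # A property missing on one side is labelled 'N/A' on that side.
        local = dataset_attrs.get(key, "N/A")
        online = flexilims_attrs.get(key, "N/A")
        if local != online:
            differences[key] = (local, online)
    return differences


# Hypothetical attribute values, for illustration only.
local = {"path": "proj/mouse/session", "is_raw": "yes"}
online = {"path": "proj/mouse/session", "is_raw": "no", "created": "2021-03-01"}
diff = report_differences(local, online)
```

Here `diff` holds one entry for `is_raw`, whose values differ, and one for `created`, which exists only online. The matching `path` is omitted.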
- flexilims_status()[source]¶
Status of the dataset on flexilims
Status can be ‘up-to-date’, ‘different’ or ‘not online’
This function does not check the following flexilims-only values: ‘createdBy’, ‘objects’, ‘dateCreated’, ‘customEntities’, ‘incrementalId’, ‘id’, ‘origin_id’
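The three status values can be sketched as follows. This is an assumption-laden illustration on plain dictionaries: the function name `status` and the use of `None` for a missing entry are hypothetical, and only the list of ignored fields is taken from the description above.

```python
# Fields listed above as not being checked.
IGNORED = {
    "createdBy", "objects", "dateCreated", "customEntities",
    "incrementalId", "id", "origin_id",
}


def status(dataset_attrs, flexilims_entry):
    """Return 'not online', 'up-to-date' or 'different'."""
    if flexilims_entry is None:  # hypothetical marker for "no entry found"
        return "not online"
    local = {k: v for k, v in dataset_attrs.items() if k not in IGNORED}
    online = {k: v for k, v in flexilims_entry.items() if k not in IGNORED}
    return "up-to-date" if local == online else "different"
```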
- format(mode='flexilims')[source]¶
Format a dataset
This can generate either a ‘flexilims’ type of output (a series similar to get_entities output) or a ‘yaml’ type as that used by flexiznam.camp
The flexilims series will not include elements that are not used by the Dataset class such as created_by
- Parameters
mode – ‘flexilims’ or ‘yaml’
- static from_flexilims(project=None, name=None, data_series=None, flm_session=None)[source]¶
Loads a dataset from flexilims.
If the dataset_type attribute of the flexilims entry is defined in Dataset.SUBCLASSES, this subclass will be used. Otherwise a generic Dataset is returned
- Parameters
project – Name of the project or hexadecimal project_id
name – Unique name of the dataset on flexilims
data_series – Defaults to None. pd.Series as returned by flz.get_entities. If provided, supersedes project and name
flm_session – authentication session to access flexilims
- classmethod from_folder(folder, verbose=True, flm_session=None)[source]¶
Try to load all datasets found in the folder.
Will try all defined subclasses of datasets and keep everything that does not crash. If you know which dataset to expect, use the subclass directly
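The "try everything and keep what does not crash" strategy can be sketched without flexiznam. The loader functions and `load_all` below are hypothetical stand-ins for the subclass loaders. Only the file extensions are taken from the VIDEO_EXTENSIONS and VALID_EXTENSIONS values listed on this page.

```python
def load_camera(files):
    """Hypothetical loader: succeeds only if a video file is present."""
    videos = [f for f in files if f.endswith((".avi", ".bin", ".mp4"))]
    if not videos:
        raise IOError("no video file found")
    return videos


def load_microscopy(files):
    """Hypothetical loader: succeeds only if a .czi file is present."""
    images = [f for f in files if f.endswith(".czi")]
    if not images:
        raise IOError("no .czi file found")
    return images


def load_all(files, loaders, verbose=True):
    """Run every loader and keep the results of those that do not raise."""
    found = {}
    for name, loader in loaders.items():
        try:
            found[name] = loader(files)
        except Exception as err:  # deliberately broad, to skip any failing loader
            if verbose:
                print(f"Skipping {name}: {err}")
    return found


result = load_all(
    ["cam0.mp4", "cam0_timestamps.csv"],
    {"camera": load_camera, "microscopy": load_microscopy},
    verbose=False,
)
```

Catching every exception is what makes this approach forgiving, and it is also why calling the expected subclass directly is preferable when the dataset type is known: a genuine error is then reported instead of being silently skipped.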
- static from_origin(project=None, origin_type=None, origin_id=None, origin_name=None, dataset_type=None, conflicts=None, flm_session=None)[source]¶
Creates a dataset of a given type as a child of a parent entity
- Parameters
project (str) – Name of the project or hexadecimal project_id
origin_type (str) – sample type of the origin
origin_id (str) – hexadecimal ID of the origin. This or origin_name must be provided
origin_name (str) – name of the origin. This or origin_id must be provided
dataset_type (str) – type of dataset to create. Must be defined in the config file
conflicts (str) – What to do if a dataset of this type already exists as a child of the parent entity:
- append: Create a new dataset with a new name and path
- abort or None: Raise a flexiznam.errors.NameNotUniqueError and exit
- skip or overwrite: Return a Dataset corresponding to the existing entry if there is exactly one existing entry, otherwise raise a flexiznam.errors.NameNotUniqueError
flm_session (flexilims.Flexilims) – authentication session to connect to flexilims
- Returns
a dataset object (WITHOUT updating flexilims)
- Return type
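The conflicts options of from_origin can be summarised with a small decision function. This is only a sketch: `resolve_conflict`, the local `NameNotUniqueError` stand-in, and the numeric suffix used to build a new name in append mode are assumptions for illustration, not flexiznam's naming scheme.

```python
class NameNotUniqueError(Exception):
    """Local stand-in for flexiznam.errors.NameNotUniqueError."""


def resolve_conflict(existing, base_name, conflicts=None):
    """Return the dataset name to use, given existing datasets of the same type."""
    if not existing:
        return base_name  # nothing to resolve
    if conflicts is None or conflicts == "abort":
        raise NameNotUniqueError(f"{len(existing)} dataset(s) of this type already exist")
    if conflicts in ("skip", "overwrite"):
        if len(existing) == 1:
            return existing[0]
        raise NameNotUniqueError("more than one existing dataset, cannot pick one")
    if conflicts == "append":
        # Hypothetical naming: first unused numeric suffix.
        suffix = 0
        while f"{base_name}_{suffix}" in existing:
            suffix += 1
        return f"{base_name}_{suffix}"
    raise ValueError(f"unknown conflicts mode: {conflicts!r}")
```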
- get_flexilims_entry()[source]¶
Get the flexilims entry for this dataset
- Returns
a dictionary or [] if the entry is not found
- Return type
- property is_raw¶
Whether the dataset contains raw or processed data
- is_valid()[source]¶
Dummy method definition. Should be reimplemented in children classes
Should return True if the dataset is found to be valid, False otherwise
- property name¶
Full name of the dataset as it would appear on Flexilims.
Including mouse, sample, session and recording, whichever apply.
- property path_full¶
Get full path including the CAMP root
- property path_root¶
Get CAMP root path that should apply to this dataset
- property project¶
Parent project. Must be defined in config project list
- property project_id¶
Hexadecimal ID of the parent project. Must be defined in config project list
- update_flexilims(mode='safe')[source]¶
Create or update flexilims entry for this dataset
- Parameters
mode (str) – One of: ‘update’, ‘overwrite’, ‘safe’ (default). If ‘safe’, will only create entry if it does not exist online. If ‘update’ will update existing entry but keep any existing attributes that are not specified. If ‘overwrite’ will update existing entry and clear any attributes that are not specified.
- Returns
Flexilims reply
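The difference between the three modes can be illustrated on plain dictionaries. `apply_update` is a hypothetical helper, not part of flexiznam. The description above does not say what 'safe' does when the entry already exists, so this sketch assumes the existing entry is simply left untouched.

```python
def apply_update(online, new_attrs, mode="safe"):
    """Return the attributes that would be stored online after the call."""
    if mode not in ("safe", "update", "overwrite"):
        raise ValueError(f"unknown mode: {mode!r}")
    if online is None:
        # No entry online yet: every mode creates it.
        return dict(new_attrs)
    if mode == "safe":
        # Assumption: an existing entry is not modified in 'safe' mode.
        return dict(online)
    if mode == "update":
        # Keep existing attributes that are not specified, replace the others.
        return {**online, **new_attrs}
    # 'overwrite': attributes that are not specified are cleared.
    return dict(new_attrs)


existing = {"path": "old/path", "note": "keep me"}
new = {"path": "new/path"}
```

With these inputs, 'update' yields the new path while keeping `note`, whereas 'overwrite' yields only the new path.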
flexiznam.schema.camera_data module¶
- class flexiznam.schema.camera_data.CameraData(path, timestamp_file, metadata_file, video_file, name=None, extra_attributes=None, created=None, project=None, is_raw=True, flm_session=None)[source]¶
Bases: flexiznam.schema.datasets.Dataset
- DATASET_TYPE = 'camera'¶
- VALID_EXTENSIONS = {'.avi', '.bin', '.csv', '.mp4', '.txt'}¶
- VIDEO_EXTENSIONS = {'.avi', '.bin', '.mp4'}¶
- from_flexilims(name=None, data_series=None, flm_session=None)[source]¶
Create a camera dataset from flexilims entry
flexiznam.schema.harp_data module¶
- class flexiznam.schema.harp_data.HarpData(name, path, binary_file, csv_files=None, extra_attributes=None, created=None, project=None, is_raw=True, flm_session=None)[source]¶
Bases: flexiznam.schema.datasets.Dataset
- DATASET_TYPE = 'harp'¶
- classmethod from_flexilims(project=None, name=None, data_series=None, flm_session=None)[source]¶
Create a harp dataset from flexilims entry
flexiznam.schema.scanimage_data module¶
- class flexiznam.schema.scanimage_data.ScanimageData(path, name=None, tif_files=None, csv_files=None, extra_attributes=None, created=None, project=None, is_raw=True, flm_session=None)[source]¶
Bases: flexiznam.schema.datasets.Dataset
- DATASET_TYPE = 'scanimage'¶
- from_flexilims(name=None, data_series=None, flm_session=None)[source]¶
Create a scanimage dataset from flexilims entry
- static from_folder(folder, verbose=True, mouse=None, session=None, recording=None, flm_session=None)[source]¶
Create a scanimage dataset by loading info from folder
- property tif_files¶
List of tif files, sorted alphabetically (automatically)
flexiznam.schema.microscopy_data module¶
- class flexiznam.schema.microscopy_data.MicroscopyData(path, name=None, extra_attributes=None, created=None, project=None, is_raw=True, flm_session=None)[source]¶
Bases: flexiznam.schema.datasets.Dataset
Subclass to handle detection of ex vivo microscopy images
- DATASET_TYPE = 'microscopy'¶
- VALID_EXTENSIONS = {'.czi'}¶
- from_flexilims(name=None, data_series=None, flm_session=None)[source]¶
Create a microscopy dataset from flexilims entry