Getting started with Flexiznam
===============================

Creating a project
------------------

To create a new project, log into `Flexilims `_ and click **Add New Stuff** >
**New Project**. Once created, add the hexadecimal project ID to the config file
in ``~/.flexiznam/config.yml``.

Adding mice
-----------

The best way to add a new mouse to flexilims is to import the data directly from
MCMS. This can also be done from the command line using ``flexiznam add-mouse``.
See ``flexiznam add-mouse --help`` for documentation. The command logs in to
MCMS, looks for a mouse based on its name and downloads a one-line CSV with the
info about that mouse. It then reads the downloaded file, loads it into a pandas
DataFrame and deletes the file (so that it can be re-downloaded without naming
issues).

Mice can also be added manually through the Python API using
:py:meth:`flexiznam.main.add_mouse`.

Acquisition and file transfer
-----------------------------

This will work only if the acquisition pipeline works as expected, which means
that at acquisition:

* files are named automatically
* scanimage creates a directory for each recording
* file names are: ``MOUSE_SESSION_RECORDING_PROTOCOL``
* all paths are relative to ``DATAROOT``: ``////...``

File transfer is not handled by flexiznam. You should transfer all the data to
CAMP first; flexiznam only checks that the data are available. The path to the
CAMP folder containing the projects must be set in the config file.

The best way to transfer files from a Windows computer might be to use
`robocopy `_. Mount the CAMP drive and just do::

    robocopy /e /z

Here ``/e`` copies recursively, including empty directories, and ``/z`` uses
restartable mode (in case the connection is lost).

.. note::

    You can also consider:

    * ``/j`` to copy using unbuffered I/O (recommended for large files).
    * ``/copy:DAT`` to copy data, attributes and timestamps but not ownership or ACL.
    * ``/move`` to move instead of copying (delete after successful upload)

Syncing data
------------

The most efficient way to upload data to Flexilims is using a YAML file. Briefly,
the YAML file format is:

.. literalinclude:: ../../flexiznam/camp/yaml_format.yml
    :language: yaml

Fortunately, you do not need to type out all the datasets and paths by hand.
Instead, you can create a minimal YAML file and then use Flexiznam to fill in
the details. See the example below:

.. literalinclude:: ../../flexiznam/camp/minimal_example_acquisition_yaml.yml
    :language: yaml

You can also include notes or other optional attributes. This YAML file must be
saved in the session folder. The mouse folder must be named after the mouse, the
session folder after the session and the recording folder after the recording.

Once all the data are on CAMP, the first step is to validate and autopopulate
the YAML::

    flexiznam process-yaml --source_yaml "path/to/acq_yaml.yml" --target_yaml "path/to/target_yaml.yml"

This will call the :py:meth:`flexiznam.camp.sync_data.parse_yaml` method and
create a local copy of the YAML called ``acq_yaml_autogenerated_full_file.yml``.
If a dataset cannot be located or loaded, the YAML file will contain a warning
starting with ``XXERRORXX``. A list of such errors will also be printed on the
console. For instance, here is an example output::

    Reading example_acquisition_yaml.yml
    Found some issues with the yaml:
    - Dataset: `ref_for_motion`
       Could not find dataset "ref_for_motion". Found "PZAH4.1c_S20210513_R181858_SphereCylinder00001, PZAH4.1c_S20210513_R182025_SphereCylinder00001, PZAH4.1c_S20210513_R182758_SphereCylinder00001" instead
    - Dataset: `overview_picture_02`
       Could not find dataset "overview_picture_02". Found "overview00001, overview00002" instead
    - Dataset: `harp_data_csv`
       Dataset not found.
       Path /Volumes/lab-znamenskiyp/home/shared/projects/3d_vision/Data/ParamLog/R193432_Retinotopy does not exist
    Fix manually these errors before uploading to flexilims
    Processed yaml saved to example_acquisition_yaml_autogenerated_full_file.yml

Before uploading, you must manually edit the YAML to fix these errors. You can
run ``process-yaml`` again on the fixed YAML until no errors remain. Finally,
you can add the entries to flexilims::

    flexiznam yaml-to-flexilims --source_yaml "path/to/processed_yaml.yml"

Querying the database
---------------------

:py:mod:`flexiznam.main` provides high-level functions to retrieve and update
entries on the database. Methods of :py:mod:`flexiznam.main` are directly
available in the :py:mod:`flexiznam` namespace.

.. py:currentmodule:: flexiznam.main

First, create a Flexilims session by calling :py:meth:`get_flexilims_session`.
This returns a :py:class:`flexilims.Flexilims` object with your authentication
credentials that you can pass to other methods. The simplest way is to provide
only the project name and use the authentication details stored in the config
files::

    import flexiznam as flz
    flz_session = flz.get_flexilims_session(project)

:py:meth:`get_entities` is the most generic method and will retrieve any data
type, filtered by name, id, origin, or arbitrary attribute. It returns a
:py:class:`pandas.DataFrame` by default.
:py:meth:`get_entity` has the same functionality but expects only a single
result and returns a :py:class:`pandas.Series`::

    exp_session = flz.get_entity(
        datatype='session',
        name=session_name,
        flexilims_session=flz_session
    )

Other useful methods include :py:meth:`get_children`, which returns all children
of a given entity, and :py:meth:`get_datasets`, which returns a dictionary
containing paths to all datasets of a given type in a given session, for
example::

    si_datasets = flz.get_datasets(
        exp_session['id'],
        recording_type='two_photon',
        dataset_type='scanimage',
        flexilims_session=flz_session
    )

Adding processed datasets
-------------------------

New entries for pre-processed datasets can be added by calling the
:py:meth:`add_dataset` method. However, this is not recommended.

.. py:currentmodule:: flexiznam.schema.datasets

Instead, when the new processed dataset is created as a child of an existing
entity, such as an experimental session or recording, it is best to use the
static :py:meth:`Dataset.from_origin` method of the :py:class:`Dataset` class,
found in :py:mod:`flexiznam.schema`::

    from flexiznam.schema import Dataset
    suite2p_dataset = Dataset.from_origin(
        project=project,
        origin_type='session',
        origin_id=exp_session['id'],
        dataset_type='suite2p_rois',
        conflicts=conflicts
    )

This method automatically sets the flexilims name and path attribute of the new
dataset, based on the path attribute of the parent passed by ``origin_id``, and
returns an instance of :py:class:`Dataset`. It also handles conflicts
automatically, providing options to ``append``, ``overwrite``, ``abort`` or
``skip`` if a dataset of the given type is already associated with the parent
entity.

.. note::

    When using the ``skip`` mode, :py:meth:`Dataset.from_origin` returns a
    :py:class:`Dataset` object corresponding either to the existing entry, if
    there is one, or to a new entry. You can use
    :py:meth:`Dataset.get_flexilims_entry` to check whether the entry already
    exists: it returns ``None`` if it does not.
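The existence check described in the note can be wrapped in a small helper. This is only a sketch: it assumes that ``get_flexilims_entry`` can be called without arguments, and ``FakeDataset`` is a hypothetical stand-in used so that the example runs without a flexilims connection.

```python
def has_flexilims_entry(dataset):
    """Return True if `dataset` already has an entry on flexilims.

    Assumption: `get_flexilims_entry()` takes no arguments and returns
    None when no entry exists, as described in the note.
    """
    return dataset.get_flexilims_entry() is not None


class FakeDataset:
    """Hypothetical stand-in for a Dataset, used only for illustration."""

    def __init__(self, entry):
        self._entry = entry

    def get_flexilims_entry(self):
        return self._entry


# One dataset that already has an entry and one that does not
existing = FakeDataset(entry={"name": "example_entry"})
missing = FakeDataset(entry=None)
print(has_flexilims_entry(existing), has_flexilims_entry(missing))
```

With a real ``Dataset`` returned by ``Dataset.from_origin``, such a check could be used to decide whether pre-processing needs to be run at all.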
.. warning::

    The output of :py:meth:`Dataset.from_origin` is an abstraction of the
    dataset you *would like to create*. The method itself does not update the
    database. It is a good idea to update the database only after the
    pre-processing step has completed, in case pre-processing crashes.

You can set any additional attributes using the ``extra_attributes`` property of
the ``Dataset`` object. When ready (i.e. once pre-processing is completed and the
output files have been saved), you can push the changes to flexilims by calling
the :py:meth:`Dataset.update_flexilims` method of the ``Dataset`` object::

    suite2p_dataset.update_flexilims(mode='overwrite')
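The recommended order (process first, update flexilims last) can be sketched as a small wrapper. This is an illustration, not part of flexiznam: ``process_then_push``, ``run_preprocessing``, ``RecordingDataset`` and the ``n_rois`` attribute are hypothetical names, and the sketch assumes that ``extra_attributes`` behaves like a dict.

```python
def process_then_push(dataset, run_preprocessing, mode="overwrite"):
    """Run pre-processing, then store its outputs and update flexilims.

    `run_preprocessing` is a hypothetical callable returning a dict of
    attributes. If it raises, `update_flexilims` is never reached, so
    the database is left untouched.
    """
    attributes = run_preprocessing()
    # Assumption: `extra_attributes` behaves like a dict
    dataset.extra_attributes.update(attributes)
    dataset.update_flexilims(mode=mode)
    return dataset


class RecordingDataset:
    """Hypothetical stand-in that records calls instead of contacting flexilims."""

    def __init__(self):
        self.extra_attributes = {}
        self.pushed_with = None

    def update_flexilims(self, mode):
        self.pushed_with = mode


def failing_step():
    raise RuntimeError("pre-processing crashed")


# Successful run: attributes are stored and the update is pushed
ok = process_then_push(RecordingDataset(), lambda: {"n_rois": 12})

# Crashing run: the exception propagates before any update is pushed
crashed = RecordingDataset()
try:
    process_then_push(crashed, failing_step)
except RuntimeError:
    pass
```

With a real ``Dataset`` object, the same wrapper would call ``update_flexilims`` only once the pre-processing function has returned.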