RPIBioinformatics/SpecDB

SpecDB: A Relational Database for archiving biomolecular NMR Spectra Data

SpecDB is a relational database to store and distribute biomolecular NMR experimental data. SpecDB stores the raw Free Induction Decay (fid) from an NMR experiment in a structured way that matches fid records to biomolecular sample and experimental metadata.

Repository Organization

├── LICENSE
├── README.md
├── TUTORIAL.md
├── sample
│   ├── sample.db
│   ├── sample_forms/
│   └── sample_sessions/
├── specdb
│   ├── specdb
│   └── specdb.py
└── sql
    ├── specdb.sql
    └── template.str

Above is the tree layout for SpecDB's repository structure.

  • sample/: location of sample data for the tutorial
    • sample/sample.db: sample SQLite SpecDB database
    • sample/sample_forms/: contains example forms for different data types to enter into SpecDB
    • sample/sample_sessions/: contains example Bruker data collection sessions
  • specdb/: location of the specdb Python library and command line interface (CLI)
    • specdb/specdb: the specdb CLI
    • specdb/specdb.py: the specdb library. Users will typically only interact with the CLI
  • sql/: directory where the SpecDB SQLite schema resides
    • sql/specdb.sql: the SpecDB SQLite schema definitions
    • sql/template.str: the minimal NMR-STAR file SpecDB attempts to write
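
Because sample/sample.db is an ordinary SQLite file, its contents can be inspected with Python's standard sqlite3 module before any SpecDB tooling is installed. The following is a minimal sketch, not part of SpecDB itself: list_tables is a hypothetical helper name, and the sketch assumes only that the file is a valid SQLite database.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically.

    Hypothetical helper for illustration; it is not part of the specdb library.
    """
    # Open read-only through a file: URI so that a mistyped path raises an
    # error instead of silently creating a new, empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables("sample/sample.db") from the repository root should list whatever tables the sample database defines; the authoritative definitions remain those in sql/specdb.sql.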

Getting Started

The goal of SpecDB is to capture and organize the time-domain data generated by biomolecular NMR experiments. To make that data useful for downstream applications, the experiment's metadata, such as the protein sequence, buffer information, and the specific pulse sequence performed, should be captured as well. SpecDB provides users with a set of tools to enter sample and experiment metadata into a SQLite database.

This Getting Started guide covers two common scenarios for users installing SpecDB: (1) a machine where the user has install permissions, and (2) a shared cluster where the user does not have install permissions.

To operate SpecDB on a machine where you do have install permissions, here are the steps we recommend.

  1. clone this repository: git clone https://github.rpi.edu/RPIBioinformatics/SpecDB.git
  2. to make SpecDB operate as a command line tool, the PATH environment variable needs to be amended: export PATH=$PATH:{location of SpecDB}/SpecDB/specdb/
    • if a bash profile file is being amended, be sure to source the profile after the PATH environment variable is edited
  3. the required third-party modules for SpecDB are pandas, ruamel.yaml, and pynmrstar. Run pip3 install pandas ruamel.yaml pynmrstar to install the three libraries
  4. after the pip install, SpecDB should be operational. Verify that SpecDB is on the PATH and that the libraries are installed with specdb --help. The output should be the following:

usage: specdb [-h] {create,forms,insert,summary,query,backup,restore} ...

Command line tool for interfacing with SpecDB

positional arguments:
  {create,forms,insert,summary,query,backup,restore}
                        command description
    create              instantiate a new database
    forms               generate a template form for requested table tables to
                        generate forms for are: `user`, `project`, `target`
                        `construct`, `expression`, `batch`, `buffer`, `pst`,
                        `spectrometer`, and a JSON for a session
    insert              insert a single json file into SpecDB
    summary             make a summary report for SpecDB database. if ran with
                        no table provided, then a summary for every table is
                        made.
    query               query records from SpecDB summary table. If no
                        --output is given then results are simply print to
                        screen
    backup              perform incremental backup. specdb configure must be
                        ran first
    restore             perform database restoration from a SpecDB backup

optional arguments:
  -h, --help            show this help message and exit

example command lines:
  specdb create --db my.db --backup /backups/my.backup.db
  specdb query --db my.db --sql "SELECT * FROM table users LIMIT 10"
  specdb insert --file specdb.yaml --env my.db --write
  specdb backup --db my.db --backup /backups/my.backup.db
  specdb forms --table user project --num 3 1
  specdb summary --table user --db my.db
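
If specdb --help fails, it helps to know whether the problem is a missing library or a PATH entry. The check from step 4 can be sketched in Python as below. This is an illustrative sketch only: module_available and check_install are hypothetical names, not part of SpecDB, and the module names are the three listed in step 3.

```python
import importlib.util
import shutil

# The three third-party modules named in the installation steps.
REQUIRED_MODULES = ("pandas", "ruamel.yaml", "pynmrstar")


def module_available(name):
    """Return True if `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a parent
        # package such as `ruamel` is itself missing.
        return False


def check_install(modules=REQUIRED_MODULES, command="specdb"):
    """Report which modules are importable and whether the CLI is on PATH."""
    report = {name: module_available(name) for name in modules}
    report[command + " on PATH"] = shutil.which(command) is not None
    return report
```

Printing check_install() gives one True/False entry per requirement, which narrows down whether step 2 (PATH) or step 3 (pip install) needs to be repeated.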

To operate SpecDB on a shared cluster without install permissions, we first recommend following the cluster's standard operating procedures for software installation. If virtual environments are an option, the following steps work for SpecDB:

  1. make a Python virtual environment with venv: python3 -m venv {name of your environment}
  2. to activate the environment, run source {name of environment}/bin/activate
  3. perform the pip install described above
  4. to exit the virtual environment, run deactivate (the command takes no arguments)
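
For scripted setups, such as a cluster job that prepares its own environment, the same steps can be sketched with Python's standard venv module. This is a sketch under stated assumptions rather than a SpecDB feature: create_env and pip_install_command are hypothetical helper names, and the package list is the one given in the installation steps above.

```python
import os
import venv

# Packages named in the installation steps above.
REQUIREMENTS = ["pandas", "ruamel.yaml", "pynmrstar"]


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    if os.name == "nt":
        # Windows places the interpreter under Scripts\ instead of bin/.
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def pip_install_command(python_path, packages=REQUIREMENTS):
    """Build the pip command that installs the packages into that environment."""
    return [python_path, "-m", "pip", "install", *packages]
```

A typical use would be subprocess.run(pip_install_command(create_env("specdb-env")), check=True), where specdb-env is a placeholder directory name. Invoking the environment's own interpreter with -m pip installs into that environment without activating it first.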

Acknowledgements

The functions to perform the incremental backup are taken from the following repository: https://github.com/nokibsarkar/sqlite3-incremental-backup.git.
