Showing 110 changed files with 15,724 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
# SpecDB: A Relational Database for Archiving Biomolecular NMR Spectral Data | ||
|
||
SpecDB is a relational database for storing and distributing biomolecular NMR experimental data. SpecDB stores the raw **F**ree **I**nduction **D**ecay (__fid__) from an NMR experiment in a structured way that matches __fid__ records to biomolecular sample and experimental metadata. | ||
|
||
|
||
## Repository Organization | ||
|
||
``` | ||
├── LICENSE | ||
├── README.md | ||
├── TUTORIAL.md | ||
├── sample | ||
│   ├── sample.db | ||
│   ├── sample_forms/ | ||
│   └── sample_sessions/ | ||
├── specdb | ||
│   ├── specdb | ||
│   └── specdb.py | ||
└── sql | ||
    ├── specdb.sql | ||
    └── template.str | ||
``` | ||
|
||
Above is the tree layout for SpecDB's repository structure. | ||
|
||
* `sample/`: location of sample data for the tutorial | ||
  - `sample/sample.db`: sample SpecDB SQLite database | ||
  - `sample/sample_forms/`: example forms for the different data types that can be entered into SpecDB | ||
  - `sample/sample_sessions/`: example Bruker data collection sessions | ||
* `specdb/`: location of the **specdb** Python library and command line interface (__CLI__) | ||
  - `specdb/specdb`: the **specdb** CLI | ||
  - `specdb/specdb.py`: the **specdb** library; users will typically interact only with the CLI | ||
* `sql/`: directory where the **SpecDB** SQLite schema resides | ||
  - `sql/specdb.sql`: the **SpecDB** SQLite schema definitions | ||
  - `sql/template.str`: the minimal NMR-STAR file **SpecDB** attempts to write | ||
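
To check which tables a SpecDB database file contains without going through the CLI, Python's built-in `sqlite3` module can read the table names from SQLite's `sqlite_master` catalog. This is a minimal sketch: the helper name `list_tables` is hypothetical and is not part of SpecDB.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted."""
    # Open read-only through a URI so that a mistyped path raises an error
    # instead of silently creating a new, empty database file.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Assuming the script is run from the repository root, `list_tables("sample/sample.db")` would list the tables of the sample database.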
|
||
## Getting Started | ||
|
||
The goal of SpecDB is to capture and organize the time-domain data generated | ||
by biomolecular NMR experiments. To make the time-domain data useful for | ||
downstream applications, the experiment's metadata, such as protein sequence, | ||
buffer information, the specific pulse sequence performed, and much more, should | ||
all be captured. SpecDB provides users with a set of tools to enter sample and experiment metadata into a SQLite database. | ||
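
The idea of tying each time-domain record to its sample and experiment metadata can be illustrated with a toy relational schema. The sketch below is an illustration only: the table and column names (`sample`, `fid`, `pulse_sequence`, and so on) and the inserted placeholder values are invented for this example and are not taken from `sql/specdb.sql`.

```python
import sqlite3

# Toy schema, NOT the SpecDB schema: every name below is hypothetical.
SCHEMA = """
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    protein_sequence TEXT NOT NULL,
    buffer TEXT NOT NULL
);
CREATE TABLE fid (
    fid_id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    pulse_sequence TEXT NOT NULL,
    raw_data BLOB NOT NULL
);
"""


def build_toy_db():
    """Create an in-memory database holding one sample and one fid record."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    # Placeholder values, invented for the example.
    conn.execute("INSERT INTO sample VALUES (1, 'MKTAYIAK', 'placeholder buffer')")
    conn.execute(
        "INSERT INTO fid VALUES (1, 1, 'placeholder_pulse_sequence', ?)",
        (b"\x00\x01",),
    )
    conn.commit()
    return conn


def fids_with_metadata(conn):
    """Join each fid record to the metadata of the sample it was recorded on."""
    return conn.execute(
        "SELECT f.fid_id, f.pulse_sequence, s.protein_sequence, s.buffer "
        "FROM fid AS f JOIN sample AS s ON s.sample_id = f.sample_id "
        "ORDER BY f.fid_id"
    ).fetchall()
```

Because `fid.sample_id` is a foreign key, a time-domain record cannot be stored in this toy schema without a matching sample row, which is the kind of guarantee a relational layout gives over loose files.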
|
||
This _Getting Started_ guide covers two common installation scenarios: (1) a machine where the user has install permissions, and (2) a shared cluster where the user does not have install permissions. | ||
|
||
|
||
To operate SpecDB on a machine where you have install permissions, we recommend the following steps. | ||
1. Clone this repository: `git clone https://github.rpi.edu/RPIBioinformatics/SpecDB.git` | ||
2. To make SpecDB operate as a command line tool, the `PATH` environment variable needs to be amended: `export PATH=$PATH:{location of SpecDB}/SpecDB/specdb/` | ||
   - If a bash profile file is being amended, be sure to `source` the profile after the `PATH` environment variable is edited. | ||
3. The required third-party modules for SpecDB are `pandas`, `ruamel.yaml`, and `pynmrstar`. Run `pip3 install pandas ruamel.yaml pynmrstar` to install the three libraries. | ||
4. After the `pip3 install`, SpecDB should be operational. Verify that SpecDB is on the `PATH` and that the libraries are installed with `specdb --help`. The output should be the following: | ||
|
||
``` | ||
usage: specdb [-h] {create,forms,insert,summary,query,backup,restore} ... | ||
Command line tool for interfacing with SpecDB | ||
positional arguments: | ||
{create,forms,insert,summary,query,backup,restore} | ||
command description | ||
create instantiate a new database | ||
forms generate a template form for requested table tables to | ||
generate forms for are: `user`, `project`, `target` | ||
`construct`, `expression`, `batch`, `buffer`, `pst`, | ||
`spectrometer`, and a JSON for a session | ||
insert insert a single json file into SpecDB | ||
summary make a summary report for SpecDB database. if ran with | ||
no table provided, then a summary for every table is | ||
made. | ||
query query records from SpecDB summary table. If no | ||
--output is given then results are simply print to | ||
screen | ||
backup perform incremental backup. specdb configure must be | ||
ran first | ||
restore perform database restoration from a SpecDB backup | ||
optional arguments: | ||
-h, --help show this help message and exit | ||
example command lines: | ||
specdb create --db my.db --backup /backups/my.backup.db | ||
specdb query --db my.db --sql "SELECT * FROM table users LIMIT 10" | ||
specdb insert --file specdb.yaml --env my.db --write | ||
specdb backup --db my.db --backup /backups/my.backup.db | ||
specdb forms --table user project --num 3 1 | ||
specdb summary --table user --db my.db | ||
``` | ||
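
If `specdb --help` does not produce this output, it can help to check the two prerequisites separately. The sketch below is not part of SpecDB, and the helper names `missing_modules` and `cli_on_path` are hypothetical. It uses only the standard library to test whether the three required modules can be located and whether a `specdb` executable is on the `PATH`.

```python
import importlib.util
import shutil

# The three third-party modules named in the installation steps.
REQUIRED_MODULES = ("pandas", "ruamel.yaml", "pynmrstar")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this Python environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # For dotted names such as `ruamel.yaml`, find_spec imports the
            # parent package first and raises if the parent is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


def cli_on_path(command="specdb"):
    """Return True if `command` resolves to an executable on the PATH."""
    return shutil.which(command) is not None
```

An empty list from `missing_modules()` together with `True` from `cli_on_path()` suggests both installation steps took effect in the current shell and Python environment.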
|
||
To operate SpecDB on a shared cluster without install permissions, first follow whatever standard operating procedures exist for software installation. If virtual environments are an option, the following steps work for SpecDB: | ||
1. Make a Python virtual environment with `venv`: `python3 -m venv {name of your environment}` | ||
2. To activate the environment, run `source {name of environment}/bin/activate` | ||
3. Perform the `pip3 install` described above | ||
4. To exit the virtual environment, run `deactivate` | ||
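
The first step can also be scripted, since `python3 -m venv` is a command line front end to the standard library's `venv` module. This is a minimal sketch in which the helper name `make_env` is a placeholder. Activation still has to happen in the shell, because it changes the shell's own environment.

```python
import venv
from pathlib import Path


def make_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return the path of its activate script."""
    # with_pip=True mirrors the command line default, so that `pip3 install`
    # is available once the environment has been activated.
    venv.create(env_dir, with_pip=with_pip)
    # POSIX layout; on Windows the scripts live under `Scripts` instead of `bin`.
    return Path(env_dir) / "bin" / "activate"
```

The returned path is the file to pass to `source` in step 2.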
|
||
## Acknowledgements | ||
|
||
The functions to perform the incremental backup are taken from the following repository: https://github.com/nokibsarkar/sqlite3-incremental-backup.git. | ||