another update for the reviews
xperthunter committed May 9, 2022
1 parent b5f321a commit fbb1f42
Showing 9 changed files with 756 additions and 7 deletions.
6 changes: 3 additions & 3 deletions TUTORIAL.md
@@ -4,7 +4,7 @@ The purpose of the following tutorial is to provide examples of how to use SpecD

## 1. SpecDB help menus and subcommands

The first entry point to look for guidance on SpecDB functions is to use the help menus. If `specdb help` results in the help menu for SpecDB, then it is installed correctly. SpecDB has seven sub commands, each listed below and the command line arguments each take:
The first place to look for guidance on SpecDB functions is the help menus. If `specdb --help` prints the SpecDB help menu, then SpecDB is installed correctly. SpecDB has seven subcommands, listed below with the command-line arguments each takes:

1. `specdb create --db --backup`
2. `specdb forms --table --num`
@@ -14,7 +14,7 @@ The first entry point to look for guidance on SpecDB functions is to use the hel
6. `specdb backup --db --backup`
7. `specdb restore --backup --backup`

The subcommands listed above in the logical order the commands are used in. Each subcommand has a separate help menu from `specdb --help` that can be accessed, (e.g `specdb forms --help`). Users first need to create a SpecDB SQLite database file with `create`. Next, users need to populate the database with information, the `forms` command make the forms for the data fields needed for the SpecDB schema. With a filled form, users use `insert` to insert insert the form into their database. To verify/check what they inserted, users can use `summary` to investigate the contents of any SpecDB table. Users can pull data out of the database with `query`. With `query` users provide a SQL SELECT statement on the SpecDB summary view to pull data out of the database. Commands `backup` and `restore` are for the incremental backup operations.
The subcommands above are listed in the logical order in which they are used. Each subcommand has its own help menu, separate from `specdb --help` (e.g., `specdb forms --help`). Users first create a SpecDB SQLite database file with `create`. Next, users populate the database with information; the `forms` command generates forms for the data fields required by the SpecDB schema. With a filled form, users run `insert` to insert the form into their database. To verify what they inserted, users can run `summary` to inspect the contents of any SpecDB table. Users pull data out of the database with `query`, which takes a SQL SELECT statement on the SpecDB summary view. The `backup` and `restore` commands handle the incremental backup operations.

## 2. Instantiating a new SpecDB database

@@ -131,7 +131,7 @@ buffer_components: # describe the component(s) of a buffer, REQUIRED: `buffer_id

If the `--num` option is provided, the number of values given must match the number of tables requested. In the above case, the `buffer` form was created once because of the `1` after `--num`, and three `buffer_components` forms were made because of the `3` that follows the `1`. The values given to `--table` and `--num` are in one-to-one correspondence with each other. If `--num` is not provided, each table is assumed to be produced just once.
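
The one-to-one pairing between `--table` and `--num` can be illustrated with a short Python sketch. This is not SpecDB's implementation; the function name `pair_tables_with_counts` is hypothetical, and the sketch only assumes the behavior described above (equal-length lists, and a default of one form per table when no counts are given).

```python
from typing import Dict, List, Optional


def pair_tables_with_counts(
        tables: List[str],
        nums: Optional[List[int]] = None) -> Dict[str, int]:
    """Pair each requested table with the number of forms to produce.

    Hypothetical helper: mirrors the --table / --num rule described in
    the tutorial, not SpecDB's actual code.
    """
    # No counts given: every table is produced exactly once.
    if not nums:
        return {table: 1 for table in tables}

    # Counts given: they must line up one-to-one with the tables.
    if len(nums) != len(tables):
        raise ValueError(
            f"{len(tables)} tables requested but {len(nums)} counts given")

    return dict(zip(tables, nums))


if __name__ == "__main__":
    # One buffer form and three buffer_components forms
    print(pair_tables_with_counts(["buffer", "buffer_components"], [1, 3]))
```

Raising an error on mismatched lengths is a design choice of the sketch; it makes a forgotten count visible instead of silently dropping a table.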

Inspecting `sample/sample_forms/complete_sample.yaml` will find all the information required to describe a biomolecular NMR sample. It is recommended that users use `specdb forms` to create the forms they when they need because users can define multiple entities at a time, and one general form will not suffice. However, it is instructive to see all the metadata items that are tracked in SpecDB by looking at `complete_sample.yaml`.
Inspecting `sample/sample_forms/complete_sample.yaml` shows all the information required to describe a biomolecular NMR sample. It is recommended that users create forms with `specdb forms` as they need them, because users can define multiple entities at a time and one general form will not suffice. However, it is instructive to see all the metadata items that are tracked in SpecDB by looking at `complete_sample.yaml`.

To follow along with the sample forms provided in the repository, perform the following commands:

15 changes: 13 additions & 2 deletions cli/specdb
@@ -88,7 +88,7 @@ sdb_query = sdb_subs.add_parser('query',
    ("query records from SpecDB summary table. If no --output is given"),
    (" then results are simply printed to screen"))))
sdb_query.add_argument('--sql', nargs='+', type=str, metavar='<str>',
required=True, help=''.join((
required=False, default=False, help=''.join((
('query using sql syntax. The query can be on any table.'),
(' If no --output format is given, results are printed to screen.'))))
sdb_query.add_argument('--star', action='store_true', required=False,
@@ -99,6 +99,12 @@ sdb_query.add_argument('--db', type=str, metavar='<path>', required=True,
(' use `specdb create` to create a new database file'))))
sdb_query.add_argument('--out', type=str, metavar='<path>', required=True,
help='directory to place results of the query')
sdb_query.add_argument('--indices', type=str, metavar='<str>', nargs='+',
    required=False, default=False, help=''.join((
    ("provide a list of row ids in the summary table to collect.\n"),
    ("Users can provide the ids directly on the command line, space "),
    ("separated, or in a .csv file with all ids comma separated "),
    ("on the first line."))))

# backup level parser
sdb_backup = sdb_subs.add_parser('backup',
@@ -178,7 +184,12 @@ elif sdb.command == 'summary':
Summary.summary(db=sdb.db, table=sdb.table)

elif sdb.command == 'query':
Query.query(db=sdb.db, sql=sdb.sql[0], star=sdb.star, output_dir=sdb.out)
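    Query.query(
        db=sdb.db,
        # --sql and --indices are optional and default to False, so only
        # index into them when a value was actually supplied
        sql=sdb.sql[0] if sdb.sql else False,
        indices=sdb.indices[0] if sdb.indices else False,
        star=sdb.star,
        output_dir=sdb.out)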

elif sdb.command == 'backup':
Backup.backup(db=sdb.db, object_dir=sdb.objects, backup_file=sdb.shafile)
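
The handling of the optional `--sql` and `--indices` arguments can be sketched in isolation. The following is a minimal, self-contained illustration and not the SpecDB CLI itself: the parser name, the `None` defaults, and the helper `resolve_indices` are assumptions made for the example. It shows why an optional `nargs='+'` argument should be checked before indexing, and one way to accept row ids either directly on the command line or from the first line of a `.csv` file, as the help text describes.

```python
import argparse
import csv
import os


def build_query_parser():
    """Build a stand-in parser with the two optional query arguments."""
    parser = argparse.ArgumentParser(prog="specdb-query-sketch")
    # With nargs='+', the attribute is a list when given, else the default.
    parser.add_argument("--sql", nargs="+", type=str, default=None)
    parser.add_argument("--indices", nargs="+", type=str, default=None)
    return parser


def resolve_indices(values):
    """Return row ids as ints from CLI tokens or a one-line .csv file.

    Hypothetical helper, not part of SpecDB.
    """
    if not values:
        return []
    # A single token naming an existing .csv file: read ids from line one.
    if (len(values) == 1 and values[0].endswith(".csv")
            and os.path.isfile(values[0])):
        with open(values[0], newline="") as handle:
            values = next(csv.reader(handle), [])
    return [int(value) for value in values if value.strip()]


if __name__ == "__main__":
    args = build_query_parser().parse_args(["--indices", "3", "7", "11"])
    # Guard before indexing: --sql was not supplied here.
    sql = args.sql[0] if args.sql else None
    print(sql, resolve_indices(args.indices))
```

Using `None` instead of `False` as the default is only a preference of this sketch; either works as long as the value is tested before it is subscripted.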
105 changes: 105 additions & 0 deletions specdb/Backup.py
@@ -0,0 +1,105 @@
#!/usr/bin/env python3

"""
Module for backing up and restoring a SpecDB database
"""

from hashlib import sha256
import os

# length of the magic string that starts every SQLite 3 file
SQLITE_HEADER_LENGTH = 16
# byte offsets of the page size and page count in the database header
SQLITE_PAGE_SIZE_INDEX = 16
SQLITE_PAGE_COUNT_INDEX = 28

def backup(db=None, object_dir='objects', backup_file='backup.txt'):
    """
    This function performs the incremental backup
    This function is taken with slight modifications from the following Github
    repository: https://github.com/nokibsarkar/sqlite3-incremental-backup.git
    For this function, we implemented the python version of the backup at
    sqlite3-incremental-backup/python/sqlite3backup/python
    Much credit to Github user nokibsarkar

    Parameters
    ----------
    + db            path to sqlite database to backup
    + object_dir    path to objects directory where all pages will reside
    + backup_file   file to save sha256 hashes of pages

    Returns
    -------
    True if backup successful
    """

    # Open the database and read the page layout from its header.
    with open(db, "rb") as db_file_object:
        assert (
            db_file_object.read(SQLITE_HEADER_LENGTH) == b"SQLite format 3\x00"
        ), "not an SQLite 3 database file"
        db_file_object.seek(SQLITE_PAGE_SIZE_INDEX, os.SEEK_SET)
        # page size is a 2-byte big-endian integer; the value 1 means 65536
        page_size = int.from_bytes(db_file_object.read(2), 'big')
        if page_size == 1:
            page_size = 65536
        db_file_object.seek(SQLITE_PAGE_COUNT_INDEX, os.SEEK_SET)
        page_count = int.from_bytes(db_file_object.read(4), 'big')

    pages = []
    with open(db, "rb") as db_file_object:
        for page_number in range(page_count):
            db_file_object.seek(page_number * page_size, os.SEEK_SET)
            page = db_file_object.read(page_size)
            page_hash = sha256(page).hexdigest()
            directory, filename = page_hash[:2], page_hash[2:]
            file_path = os.path.join(object_dir, directory, filename)
            # only write pages that are not already in the object directory
            if not os.path.exists(file_path):
                os.makedirs(os.path.dirname(file_path), exist_ok=True)
                with open(file_path, "wb") as file_object:
                    file_object.write(page)
            pages.append(page_hash)

    # Write the ordered list of page hashes to the backup file.
    with open(backup_file, 'w') as fp:
        fp.write('\n'.join(pages))

    return True


def restore(backup=None, backup_file=None, object_dir=None):
    """
    This function performs the restore function from an incremental backup
    This function is taken with slight modifications from the following Github
    repository: https://github.com/nokibsarkar/sqlite3-incremental-backup.git
    For this function, we implemented the python version of the backup at
    sqlite3-incremental-backup/python/sqlite3backup/python
    Much credit to Github user nokibsarkar

    Parameters
    ----------
    + backup        path to sqlite database file to restore to
    + object_dir    path to objects directory where all pages reside
    + backup_file   file holding the sha256 hashes of the pages, in order

    Returns
    -------
    True if restore successful
    """

    # Read the page hashes from the backup file
    with open(backup_file, 'r') as fp:
        pages = fp.read().splitlines()

    # Open the database.
    with open(backup, "wb") as db_file_object:
        # Iterate through the pages and write them to the database.
        for page in pages:
            path = os.path.join(object_dir, page[:2], page[2:])
            with open(path, "rb") as file_object:
                db_file_object.write(file_object.read())

    # Restoration is complete
    return True
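
The page-level backup and restore idea can be exercised end to end with a small round-trip sketch. This is an illustration under stated assumptions, not the module above: the function names `read_page_size`, `backup_pages`, and `restore_pages` are hypothetical, and the sketch walks the file until it is exhausted instead of trusting the page count stored in the header. It reads the page size from the SQLite header, stores each page under the name of its SHA-256 hash, and then rebuilds the file from the ordered hash list.

```python
import hashlib
import os
import sqlite3
import tempfile


def read_page_size(db_path):
    """Read the page size from the SQLite header (bytes 16-17, big-endian)."""
    with open(db_path, "rb") as handle:
        header = handle.read(100)
    if header[:16] != b"SQLite format 3\x00":
        raise ValueError("not an SQLite 3 database file")
    raw = int.from_bytes(header[16:18], "big")
    return 65536 if raw == 1 else raw  # the value 1 encodes 65536


def backup_pages(db_path, object_dir):
    """Store each page under its SHA-256 hash; return the ordered hashes."""
    page_size = read_page_size(db_path)
    hashes = []
    with open(db_path, "rb") as handle:
        while True:
            page = handle.read(page_size)
            if not page:
                break
            digest = hashlib.sha256(page).hexdigest()
            target = os.path.join(object_dir, digest[:2], digest[2:])
            # Identical pages share one object, so unchanged pages cost nothing.
            if not os.path.exists(target):
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with open(target, "wb") as out:
                    out.write(page)
            hashes.append(digest)
    return hashes


def restore_pages(hashes, object_dir, restored_path):
    """Concatenate the stored pages, in order, into a new database file."""
    with open(restored_path, "wb") as out:
        for digest in hashes:
            source = os.path.join(object_dir, digest[:2], digest[2:])
            with open(source, "rb") as page_file:
                out.write(page_file.read())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        db_path = os.path.join(workdir, "demo.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
        conn.commit()
        conn.close()

        objects = os.path.join(workdir, "objects")
        restored = os.path.join(workdir, "restored.db")
        restore_pages(backup_pages(db_path, objects), objects, restored)
        with open(db_path, "rb") as a, open(restored, "rb") as b:
            print("round trip identical:", a.read() == b.read())
```

Because objects are named by content hash, a second backup of a mostly unchanged database only writes the pages that differ, which is what makes the scheme incremental.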
3 changes: 1 addition & 2 deletions specdb/Forms.py
@@ -73,7 +73,6 @@ def collect_schema_comments(tables=None, num=None):
form_dic[table_name].yaml_add_eol_comment(
comment, column, column=35)

print('form_dic num', num)
return form_dic

def forms(table=None, num=None, input_dict=None):
@@ -105,7 +104,7 @@ def forms(table=None, num=None, input_dict=None):

#print(table)
form_dic = collect_schema_comments(tables=table, num=num)
print(json.dumps(form_dic,indent=2))
#print(json.dumps(form_dic,indent=2))
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True

32 changes: 32 additions & 0 deletions specdb/Insert.py
@@ -783,6 +783,38 @@ def insert(file=None, db=None, write=False):

conn.commit()

    if 'default_processing_scripts' in record:

        for index, scripts in record['default_processing_scripts'].items():

            print(scripts)

            # resolve the script path given in the form
            try:
                processing_script_path = os.path.abspath(
                    scripts['default_processing'])
            except (KeyError, TypeError):
                print('cannot get default processing script path')
                print(f"given: {scripts.get('default_processing')}")
                print('Aborting')
                sys.exit()

            # store the script contents, not the path, in the database
            with open(processing_script_path, 'rb') as fp:
                fbytes = fp.read()

            scripts['default_processing'] = fbytes

            status, value = insert_logic(
                table='default_processing_scripts',
                dic=scripts,
                cursor=c,
                write=write
            )

            # put the absolute path back in place of the raw bytes
            scripts['default_processing'] = processing_script_path
            print(status, value)



# now insert sessions if present
if 'session' in record:
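
The step of reading a processing script from disk and storing its bytes in the database can be sketched on its own. The example below is only an illustration: the table `scripts_demo`, its columns, and the helper `store_script` are hypothetical and do not reflect the SpecDB schema or `insert_logic`. It shows the general pattern of resolving a path, reading the file in binary mode, and binding the bytes to a BLOB column with a parameterized statement.

```python
import os
import sqlite3
import tempfile


def store_script(cursor, script_path):
    """Insert a script's absolute path and raw bytes; return the path.

    Hypothetical helper operating on a demo table, not SpecDB's schema.
    """
    abs_path = os.path.abspath(script_path)
    with open(abs_path, "rb") as handle:
        script_bytes = handle.read()
    # Binding bytes with a '?' placeholder stores them as a BLOB.
    cursor.execute(
        "INSERT INTO scripts_demo (path, body) VALUES (?, ?)",
        (abs_path, script_bytes))
    return abs_path


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scripts_demo (path TEXT, body BLOB)")

with tempfile.TemporaryDirectory() as workdir:
    script = os.path.join(workdir, "proc.com")
    with open(script, "wb") as handle:
        handle.write(b"#!/bin/csh\n")
    store_script(conn.cursor(), script)

conn.commit()
row = conn.execute("SELECT body FROM scripts_demo").fetchone()
print(row[0])
```

Keeping the path in one column and the contents in another means the script can still be recovered from the database after the original file has moved or been deleted.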
