scriptutils.part_retrieval

Module Contents

Classes

ImportFile

Record for a file in the package parts inventory, containing all information needed for collation

PackageInventory

List of all of the parts imported into a package in various files

Functions

accession_to_sbol_uri(accession: str, prefix: str = NCBI_PREFIX) → str

Change an NCBI accession ID to an equivalent NCBI SBOL URI

generic_part_download(urls: list[str], package: str) → list[str]

Attempt to download parts of unknown format from generic URLs. Detect format by parsing retrieved material

import_parts(package: str) → list[str]

Compare package specification and inventory and attempt to import all missing parts

package_parts_inventory(package: str, targets: List[str] = None) → PackageInventory

Search all of the SBOL, GenBank, and FASTA files of a package to find what parts have been downloaded

remap_prefix(uri: str) → str

retrieve_genbank_accessions(ids: List[str], package: str) → List[str]

Retrieve a set of nucleotide accessions from GenBank

retrieve_igem_parts(ids: List[str], package: str) → List[str]

Retrieve a set of iGEM parts from SynBioHub when possible, direct from the Registry when not.

retrieve_parts(ids: List[str], package: str) → List[str]

Attempt to download parts from various servers

retrieve_synbiohub_parts(ids: List[str], package: str) → List[str]

Retrieve a set of SBOL parts from SynBioHub

sbol_uri_to_accession(uri: str, prefix: str = NCBI_PREFIX, remaps: dict[str, str] = None) → str

Change an NCBI SBOL URI to an accession ID

Attributes

FASTA_iGEM_PATTERN

IGEM_FASTA_CACHE_FILE

IGEM_SBOL2_CACHE_FILE

IGEM_SBOL2_TRANSIENT_CACHE_FILE

IGEM_SBOL3_CACHE_FILE

NCBI_GENBANK_CACHE_FILE

NCBI_PREFIX

OTHER_FASTA_CACHE_FILE

OTHER_GENBANK_CACHE_FILE

SBOL_iGEM_PATTERNS

iGEM_SOURCE_PREFIX

prefix_remappings

source_list

FASTA_iGEM_PATTERN = http://parts.igem.org/cgi/partsdb/composite_edit/putseq.cgi?part={}
IGEM_FASTA_CACHE_FILE = iGEM_raw_imports.fasta
IGEM_SBOL2_CACHE_FILE = iGEM_SBOL2_imports.nt
IGEM_SBOL2_TRANSIENT_CACHE_FILE = iGEM_SBOL2_imports.xml
IGEM_SBOL3_CACHE_FILE = iGEM_SBOL3_imports.nt
class ImportFile(path: str, file_type: str = sbol3.SORTED_NTRIPLES, namespace: str = None)

Record for a file in the package parts inventory, containing all information needed for collation

id_to_uri

For FASTA and GenBank, map from ID to full URI

get_sbol3_doc(self) → sbol3.Document

Access a file’s contents in SBOL3 format. If not loaded, they will be loaded. If not in SBOL3, they will be converted.

Returns

SBOL3 document for the file’s contents

NCBI_GENBANK_CACHE_FILE = NCBI_GenBank_imports.gb
NCBI_PREFIX = https://www.ncbi.nlm.nih.gov/nuccore/
OTHER_FASTA_CACHE_FILE = Other_FASTA_imports.fasta
OTHER_GENBANK_CACHE_FILE = Other_GenBank_imports.gb
class PackageInventory

List of all of the parts imported into a package in various files

add(self, import_file, uri: str, *aliases: str) → None
SBOL_iGEM_PATTERNS = ['https://synbiohub.org/public/igem/BBa_{}', 'https://synbiohub.org/public/igem/{}']
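
The `{}` in FASTA_iGEM_PATTERN and SBOL_iGEM_PATTERNS looks like a `str.format` placeholder for a part identifier. The sketch below assumes that reading; `candidate_urls` is a hypothetical helper and the part ID is only illustrative, so treat this as one plausible use of the patterns rather than the module's actual lookup logic.

```python
# Values copied from the attribute listing in this reference.
FASTA_iGEM_PATTERN = "http://parts.igem.org/cgi/partsdb/composite_edit/putseq.cgi?part={}"
SBOL_iGEM_PATTERNS = [
    "https://synbiohub.org/public/igem/BBa_{}",
    "https://synbiohub.org/public/igem/{}",
]


def candidate_urls(part_id: str) -> list[str]:
    """Hypothetical helper: fill each pattern's `{}` slot with a part ID."""
    sbol_urls = [pattern.format(part_id) for pattern in SBOL_iGEM_PATTERNS]
    # Assumption: the FASTA endpoint is tried after the SBOL locations.
    return sbol_urls + [FASTA_iGEM_PATTERN.format(part_id)]


urls = candidate_urls("J23101")  # illustrative part ID
for url in urls:
    print(url)
```
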
accession_to_sbol_uri(accession: str, prefix: str = NCBI_PREFIX) → str

Change an NCBI accession ID to an equivalent NCBI SBOL URI

Parameters
  • accession – accession ID to convert

  • prefix – prefix to use with accession, defaulting to NCBI nuccore

Returns

equivalent URI

generic_part_download(urls: list[str], package: str) → list[str]

Attempt to download parts of unknown format from generic URLs. Detect format by parsing retrieved material

Parameters

urls – list of URLs to download from

Returns

list of parts that were successfully downloaded and parsed
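
The description says the format is detected by parsing the retrieved material. As a rough illustration only, the sketch below swaps full parsing for a much lighter first-line check: FASTA records start with a `>` header and GenBank flat files start with a `LOCUS` line. The helper name `guess_format` and its return values are hypothetical, not part of this module.

```python
from typing import Optional


def guess_format(text: str) -> Optional[str]:
    """Hypothetical helper: guess a sequence format from the first non-blank line."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"  # FASTA records begin with a '>' header line
        if stripped.startswith("LOCUS"):
            return "genbank"  # GenBank flat files begin with a LOCUS line
        return None  # first content line matches neither format
    return None  # empty input


print(guess_format(">example_part\nACGTACGT"))
print(guess_format("<html>not a sequence file</html>"))
```

A production version would presumably hand the text to a real parser and treat a parse failure as "not this format", since a first-line check can be fooled by malformed files.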

iGEM_SOURCE_PREFIX = http://parts.igem.org/
import_parts(package: str) → list[str]

Compare package specification and inventory and attempt to import all missing parts

Parameters

package – path of package to search

Returns

list of part URIs imported

package_parts_inventory(package: str, targets: List[str] = None) → PackageInventory

Search all of the SBOL, GenBank, and FASTA files of a package to find what parts have been downloaded

Parameters
  • package – path of package to search

  • targets – list of parts we are looking for, for adding prefixes to FASTA and GenBank materials

Returns

dictionary mapping URIs and alias URIs to available URIs
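
To make the "URIs and alias URIs" idea concrete, here is a toy inventory whose `add` method mirrors the documented `add(self, import_file, uri, *aliases)` signature. `ToyInventory`, its two dictionaries, and the placeholder URI are all assumptions made for illustration; the real PackageInventory may store this information differently.

```python
class ToyInventory:
    """Hypothetical stand-in for PackageInventory, not the module's implementation."""

    def __init__(self) -> None:
        self.locations: dict[str, str] = {}  # canonical URI -> file that holds it
        self.aliases: dict[str, str] = {}    # URI or alias -> canonical URI

    def add(self, import_file: str, uri: str, *aliases: str) -> None:
        self.locations[uri] = import_file
        self.aliases[uri] = uri  # a URI always resolves to itself
        for alias in aliases:
            self.aliases[alias] = uri


inventory = ToyInventory()
canonical = "https://www.ncbi.nlm.nih.gov/nuccore/EXAMPLE_1"  # placeholder URI
inventory.add("NCBI_GenBank_imports.gb", canonical, "EXAMPLE_1")

# Either the full URI or the alias leads back to the same stored part.
print(inventory.aliases["EXAMPLE_1"])
print(inventory.locations[inventory.aliases["EXAMPLE_1"]])
```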

prefix_remappings
remap_prefix(uri: str) → str
retrieve_genbank_accessions(ids: List[str], package: str) → List[str]

Retrieve a set of nucleotide accessions from GenBank

Parameters
  • ids – SBOL URIs to retrieve

  • package – path where retrieved items should be stored

Returns

list of items retrieved

retrieve_igem_parts(ids: List[str], package: str) → List[str]

Retrieve a set of iGEM parts from SynBioHub when possible, direct from the Registry when not.

Parameters
  • ids – SBOL URIs to retrieve

  • package – path where retrieved items should be stored

Returns

list of items retrieved

retrieve_parts(ids: List[str], package: str) → List[str]

Attempt to download parts from various servers

Parameters
  • ids – list of URIs

  • package – path of package to retrieve from

Returns

list of URIs successfully retrieved
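
One way to picture "various servers" is a dispatcher that sorts URIs by prefix before handing each group to a source-specific retriever. The sketch below only does the sorting step. `group_by_source`, `SYNBIOHUB_PREFIX`, and `IGEM_SYNBIOHUB_PREFIX` are hypothetical names (the last is derived from SBOL_iGEM_PATTERNS), and the actual routing rules in this module may differ.

```python
NCBI_PREFIX = "https://www.ncbi.nlm.nih.gov/nuccore/"       # documented attribute
iGEM_SOURCE_PREFIX = "http://parts.igem.org/"               # documented attribute
IGEM_SYNBIOHUB_PREFIX = "https://synbiohub.org/public/igem/"  # assumption
SYNBIOHUB_PREFIX = "https://synbiohub.org/"                 # assumption


def group_by_source(ids: list[str]) -> dict[str, list[str]]:
    """Hypothetical helper: bucket URIs by the server they appear to come from."""
    groups: dict[str, list[str]] = {"genbank": [], "igem": [], "synbiohub": [], "other": []}
    for uri in ids:
        if uri.startswith(NCBI_PREFIX):
            groups["genbank"].append(uri)
        elif uri.startswith((iGEM_SOURCE_PREFIX, IGEM_SYNBIOHUB_PREFIX)):
            # Checked before the general SynBioHub prefix, which it overlaps.
            groups["igem"].append(uri)
        elif uri.startswith(SYNBIOHUB_PREFIX):
            groups["synbiohub"].append(uri)
        else:
            groups["other"].append(uri)
    return groups


groups = group_by_source([
    "https://www.ncbi.nlm.nih.gov/nuccore/EXAMPLE_1",
    "https://synbiohub.org/public/igem/BBa_J23101",
    "https://synbiohub.org/public/example/part_1",
    "https://example.org/some_part",
])
print(groups)
```

Each bucket could then be passed to the matching function (retrieve_genbank_accessions, retrieve_igem_parts, retrieve_synbiohub_parts, or generic_part_download), with the returned lists concatenated into the overall result.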

retrieve_synbiohub_parts(ids: List[str], package: str) → List[str]

Retrieve a set of SBOL parts from SynBioHub

Parameters
  • ids – SBOL URIs to retrieve

  • package – path where retrieved items should be stored

Returns

list of items retrieved

sbol_uri_to_accession(uri: str, prefix: str = NCBI_PREFIX, remaps: dict[str, str] = None) → str

Change an NCBI SBOL URI to an accession ID

Parameters
  • uri – URI to convert

  • prefix – prefix to use with accession, defaulting to NCBI nuccore

Returns

equivalent accession ID
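
The two conversion functions are described as inverses around a shared prefix. A minimal sketch of that round trip, assuming the conversion is plain prefix concatenation and stripping, could look like the following. `to_uri` and `to_accession` are hypothetical names; the real functions may apply extra normalization, and the `remaps` argument is not modelled here.

```python
NCBI_PREFIX = "https://www.ncbi.nlm.nih.gov/nuccore/"  # documented attribute


def to_uri(accession: str, prefix: str = NCBI_PREFIX) -> str:
    """Hypothetical forward conversion: prepend the prefix."""
    return prefix + accession


def to_accession(uri: str, prefix: str = NCBI_PREFIX) -> str:
    """Hypothetical reverse conversion: strip the prefix, rejecting other URIs."""
    if not uri.startswith(prefix):
        raise ValueError(f"{uri} does not start with {prefix}")
    return uri[len(prefix):]


uri = to_uri("EXAMPLE_1")  # placeholder accession ID
print(uri)
print(to_accession(uri))
```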

source_list