Scripts: Exploration
Description:
#TODO
compare_padmet
- Description:
Compare 1-n padmet files and create an output folder containing the following files:
- genes.csv:
- fieldnames = [gene, padmet_a, padmet_b, padmet_a_rxn_assoc, padmet_b_rxn_assoc]
- line = [gene-a, 'present' (if in padmet_a), 'present' (if in padmet_b), rxn-1;rxn-2 (names of reactions associated to gene-a in padmet_a), rxn-2]
- reactions.csv:
- fieldnames = [reaction, padmet_a, padmet_b, padmet_a_genes_assoc, padmet_b_genes_assoc, padmet_a_formula, padmet_b_formula]
- line = [rxn-1, 'present' (if in padmet_a), 'present' (if in padmet_b), 'gene-a;gene-b', 'gene-a', 'cpd-1 + cpd-2 => cpd-3', 'cpd-1 + cpd-2 => cpd-3']
- pathways.csv:
- fieldnames = [pathway, padmet_a_completion_rate, padmet_b_completion_rate, padmet_a_rxn_assoc, padmet_b_rxn_assoc]
- line = [pwy-a, 0.80, 0.30, rxn-a;rxn-b, rxn-a]
- compounds.csv:
- fieldnames = [metabolite, padmet_a_rxn_consume, padmet_a_rxn_produce, padmet_b_rxn_consume, padmet_b_rxn_produce]
- line = [cpd-1, rxn-1, '', rxn-1, '']
usage:
compare_padmet.py --padmet=FILES/DIR --output=DIR [--padmetRef=FILE] [-v]
option:
-h --help Show help.
--padmet=FILES/DIR pathname of the padmet files, sep all files by ',', ex: /path/padmet1.padmet;/path/padmet2.padmet OR a folder
--output=DIR pathname of the output folder
--padmetRef=FILE pathname of the database ref in padmet
padmet_utils.exploration.compare_padmet.compare_padmet(padmet_path, output, padmetRef=None, verbose=False)
Compare 1-n padmet files and create an output folder containing genes.csv, reactions.csv, pathways.csv and compounds.csv, with the fieldnames and line layout given in the description of compare_padmet above.
Parameters:
- padmet_path (str) – pathname of the padmet files, sep all files by ‘,’, ex: /path/padmet1.padmet;/path/padmet2.padmet OR a folder
- output (str) – pathname of the output folder
- padmetRef (padmet.classes.PadmetRef) – padmet containing the reference database, needed to calculate the pathway completion rate
- verbose (bool) – if True, print information
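As an illustration of how the genes.csv layout described for compare_padmet might be consumed, the following minimal sketch lists the genes marked as present in only one of two padmets. The tab delimiter, the sample rows and the helper name `genes_only_in` are assumptions made for this example; they are not taken from padmet_utils.

```python
import csv
import io

# Hypothetical sample following the genes.csv fieldnames described above.
# The tab delimiter is an assumption; adjust it to the actual output files.
SAMPLE = (
    "gene\tpadmet_a\tpadmet_b\tpadmet_a_rxn_assoc\tpadmet_b_rxn_assoc\n"
    "gene-a\tpresent\tpresent\trxn-1;rxn-2\trxn-2\n"
    "gene-b\tpresent\t\trxn-3\t\n"
    "gene-c\t\tpresent\t\trxn-4\n"
)


def genes_only_in(handle, column, others, delimiter="\t"):
    """Return genes marked 'present' in `column` and in none of `others`."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [
        row["gene"]
        for row in reader
        if row[column] == "present"
        and all(row[other] != "present" for other in others)
    ]


only_a = genes_only_in(io.StringIO(SAMPLE), "padmet_a", ["padmet_b"])
only_b = genes_only_in(io.StringIO(SAMPLE), "padmet_b", ["padmet_a"])
print("only in padmet_a:", only_a)
print("only in padmet_b:", only_b)
```

With a real output folder, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle on genes.csv.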
compare_sbml
- Description:
Compare the reactions of two sbml files.
Report reactions that are missing from one of the files, and reactions that share the same id but use different species or a different reversibility.
usage:
compare_sbml.py --sbml1=FILE --sbml2=FILE
option:
-h --help Show help.
--sbml1=FILE path of the first sbml file
--sbml2=FILE path of the second sbml file
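The comparison described above can be sketched on plain Python data. This is a simplified illustration, not the script's implementation: the `Reaction` tuple and the `compare_reactions` function are hypothetical names, and the real script reads its reactions from SBML files.

```python
from typing import Dict, FrozenSet, NamedTuple


class Reaction(NamedTuple):
    """Hypothetical, simplified view of an SBML reaction."""
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    reversible: bool


def compare_reactions(model_1: Dict[str, Reaction], model_2: Dict[str, Reaction]):
    """Report ids missing from either model and shared ids whose content differs."""
    report = {
        "only_in_1": sorted(model_1.keys() - model_2.keys()),
        "only_in_2": sorted(model_2.keys() - model_1.keys()),
        "different_species": [],
        "different_reversibility": [],
    }
    for rxn_id in sorted(model_1.keys() & model_2.keys()):
        r1, r2 = model_1[rxn_id], model_2[rxn_id]
        if (r1.reactants, r1.products) != (r2.reactants, r2.products):
            report["different_species"].append(rxn_id)
        if r1.reversible != r2.reversible:
            report["different_reversibility"].append(rxn_id)
    return report


model_1 = {
    "rxn-1": Reaction(frozenset({"cpd-1", "cpd-2"}), frozenset({"cpd-3"}), False),
    "rxn-2": Reaction(frozenset({"cpd-3"}), frozenset({"cpd-4"}), True),
    "rxn-3": Reaction(frozenset({"cpd-4"}), frozenset({"cpd-5"}), False),
}
model_2 = {
    "rxn-1": Reaction(frozenset({"cpd-1"}), frozenset({"cpd-3"}), False),
    "rxn-2": Reaction(frozenset({"cpd-3"}), frozenset({"cpd-4"}), False),
    "rxn-4": Reaction(frozenset({"cpd-5"}), frozenset({"cpd-6"}), False),
}
report = compare_reactions(model_1, model_2)
print(report)
```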
compare_sbml_padmet
- Description:
Compare the reactions of an sbml file and a padmet file.
usage:
compare_sbml_padmet.py --padmet=FILE --sbml=FILE
option:
-h --help Show help.
--padmet=FILE path of the padmet file
--sbml=FILE path of the sbml file
padmet_utils.exploration.compare_sbml_padmet.compare_sbml_padmet(sbml_document, padmet)
Compare reaction ids in sbml vs padmet; return the number of reactions found in both, and the ids of the reactions missing from the sbml or from the padmet.
Parameters:
- padmet (padmet.classes.PadmetSpec) – padmet to update
- sbml_file (libsbml.document) – sbml document
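Comparing reaction ids comes down to set operations. The sketch below assumes the ids have already been extracted from both files into plain lists and use the same naming; `compare_reaction_ids` is a hypothetical helper, not a function of padmet_utils.

```python
def compare_reaction_ids(sbml_ids, padmet_ids):
    """Return (number of shared ids, ids not in sbml, ids not in padmet)."""
    sbml_ids, padmet_ids = set(sbml_ids), set(padmet_ids)
    in_both = sbml_ids & padmet_ids
    not_in_sbml = sorted(padmet_ids - sbml_ids)
    not_in_padmet = sorted(sbml_ids - padmet_ids)
    return len(in_both), not_in_sbml, not_in_padmet


nb_both, not_in_sbml, not_in_padmet = compare_reaction_ids(
    ["rxn-1", "rxn-2", "rxn-3"], ["rxn-2", "rxn-3", "rxn-4"]
)
print(nb_both, not_in_sbml, not_in_padmet)
```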
convert_sbml_db
- Description:
This tool uses the MetaNetX database to check or convert a sbml. Flat files from MetaNetX are required to run this tool. They can be found in the aureme workflow or on the MetaNetX website. To use the tool, set:
mnx_folder= the path to a folder containing the MetaNetX flat files. The files must be named 'reac_xref.tsv' and 'chem_xref.tsv'.
Or set the paths of the flat files manually with:
mnx_reac= path to the flat file for reactions
mnx_chem= path to the flat file for chemical compounds (species)
- To check the database used in a sbml:
- to check all elements of the sbml (reactions and species), set to-map=all
- to check only the reactions of the sbml, set to-map=reaction
- to check only the species of the sbml, set to-map=species
- To map a sbml and obtain a file of mapping ids to a given database, set:
- to-map: as previously explained
- db_out: the name of the target database: ['metacyc', 'bigg', 'kegg'] only
- output: the path to the output file
For a given sbml using a specific database, return a dictionary of mapping.
The output is a file with line = reaction id or species id in the sbml, reaction id or species id in the db_out database.
- ex: for a sbml based on the kegg database with db_out=metacyc, the output file will contain for example:
R02283 ACETYLORNTRANSAM-RXN
usage:
convert_sbml_db.py --mnx_reac=FILE --mnx_chem=FILE --sbml=FILE --to-map=STR [-v]
convert_sbml_db.py --mnx_folder=DIR --sbml=FILE --to-map=STR [-v]
convert_sbml_db.py --mnx_folder=DIR --sbml=FILE --output=FILE --db_out=ID --to-map=STR [-v]
convert_sbml_db.py --mnx_reac=FILE --mnx_chem=FILE --sbml=FILE --output=FILE --db_out=ID --to-map=STR [-v]
options:
-h --help Show help.
--to-map=STR select the part of the sbml to check or convert, must be in ['all', 'reaction', 'species']
--mnx_reac=FILE path to the MetaNetX file for reactions
--mnx_chem=FILE path to the MetaNetX file for compounds
--sbml=FILE path to the sbml file to convert
--output=FILE path to the file containing the mapping, sep = " "
--db_out=ID id of the output database in ["BIGG","METACYC","KEGG"]
-v verbose.
padmet_utils.exploration.convert_sbml_db.check_sbml_db(sbml_file, to_map, verbose=False, mnx_reac_file=None, mnx_chem_file=None, mnx_folder=None)
Check the database used in a given sbml.
Parameters:
- sbml_file (str) – path to the sbml file to convert
- to_map (str) – select the part of the sbml to check, must be in [‘all’, ‘reaction’, ‘species’]
- verbose (bool) – if true: more info during process
- mnx_reac_file (str) – path to the flat file for reactions (can be None if given mnx_folder)
- mnx_chem_file (str) – path to the flat file for chemical compounds (species) (can be None if given mnx_folder)
- mnx_folder (str) – the path to a folder containing MetaNetx flat files
Returns: (name of the best matching database, dict of matching)
Return type:
padmet_utils.exploration.convert_sbml_db.map_sbml(sbml_file, to_map, db_out, output, verbose=False, mnx_reac_file=None, mnx_chem_file=None, mnx_folder=None)
Map a sbml and obtain a file of mapping ids to a given database.
Parameters:
- sbml_file (str) – path to the sbml file to convert
- to_map (str) – select the part of the sbml to check, must be in [‘all’, ‘reaction’, ‘species’]
- db_out (str) – the name of the database target: [‘metacyc’, ‘bigg’, ‘kegg’] only
- output (str) – path to the file containing the mapping, sep = ” “
- verbose (bool) – if true: more info during process
- mnx_reac_file (str) – path to the flat file for reactions (can be None if given mnx_folder)
- mnx_chem_file (str) – path to the flat file for chemical compounds (species) (can be None if given mnx_folder)
- mnx_folder (str) – the path to a folder containing MetaNetx flat files
Returns: (name of the best matching database, dict of matching)
Return type:
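The check and map steps can be pictured as two lookups through a cross-reference table: ids from different databases that point to the same MetaNetX id are treated as equivalent. The sketch below is an illustration under stated assumptions, not the code of convert_sbml_db. The column layout ("<db>:<id>", tab, "<MNX id>"), the database prefixes, the MNX ids and all function names are hypothetical; check the real layout against your version of the flat files.

```python
import io
from collections import Counter, defaultdict

# Hypothetical excerpt in the spirit of a MetaNetX xref flat file.
XREF_SAMPLE = (
    "#XREF\tMNX_ID\n"
    "kegg:R02283\tMNXR_EXAMPLE_1\n"
    "metacyc:ACETYLORNTRANSAM-RXN\tMNXR_EXAMPLE_1\n"
    "kegg:R00001\tMNXR_EXAMPLE_2\n"
)


def load_xref(handle):
    """Return (external id -> (db, MNX id), MNX id -> {db: [external ids]})."""
    id_to_mnx = {}
    mnx_to_ids = defaultdict(lambda: defaultdict(list))
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        xref, mnx_id = line.rstrip("\n").split("\t")[:2]
        db, _, ext_id = xref.partition(":")
        id_to_mnx[ext_id] = (db, mnx_id)
        mnx_to_ids[mnx_id][db].append(ext_id)
    return id_to_mnx, mnx_to_ids


def best_matching_db(ids, id_to_mnx):
    """Guess the source database as the one that matches the most ids."""
    counts = Counter(id_to_mnx[i][0] for i in ids if i in id_to_mnx)
    return counts.most_common(1)[0][0] if counts else None


def map_ids(ids, db_out, id_to_mnx, mnx_to_ids):
    """Map each known id to its equivalents in db_out, via the shared MNX id."""
    mapping = {}
    for i in ids:
        if i in id_to_mnx:
            targets = mnx_to_ids[id_to_mnx[i][1]].get(db_out)
            if targets:
                mapping[i] = targets
    return mapping


id_to_mnx, mnx_to_ids = load_xref(io.StringIO(XREF_SAMPLE))
sbml_ids = ["R02283", "R00001", "R99999"]
source_db = best_matching_db(sbml_ids, id_to_mnx)
mapping = map_ids(sbml_ids, "metacyc", id_to_mnx, mnx_to_ids)
print(source_db, mapping)
```

Ids without an equivalent in the target database (here R00001 and the unknown R99999) are simply left out of the mapping.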
dendrogram_reactions_distance
- Description:
Use the reactions.csv file from compare_padmet.py to create a dendrogram using a Jaccard distance.
From the absence/presence matrix of reactions in the different species, compute a Jaccard distance between these species. Apply a hierarchical clustering with complete linkage to these distances, then create a dendrogram. Intervene is also used to create an upset graph of the data.
usage:
dendrogram_reactions_distance.py --reactions=FILE --output=FILE [--padmetRef=STR] [--pvclust] [--upset=INT] [-v]
option:
-h --help Show help.
-r --reactions=FILE pathname of the file containing reactions in each species of the comparison.
-o --output=FOLDER path to the output folder.
--pvclust launch pvclust dendrogram using R
--padmetRef=STR path to the padmet Ref file
-u --upset=INT number of cluster in the upset graph.
-v verbose mode.
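To make the distance and the clustering concrete, here is a minimal pure-Python sketch of a Jaccard distance between reaction sets followed by a naive complete-linkage agglomeration. It is not the script's implementation; the function names and the sample species are hypothetical, and no dendrogram is drawn.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def complete_linkage(reactions_by_species):
    """Naive agglomerative clustering; returns the merges in the order they occur."""
    names = sorted(reactions_by_species)
    dist = {
        frozenset((x, y)): jaccard_distance(
            reactions_by_species[x], reactions_by_species[y]
        )
        for x, y in combinations(names, 2)
    }

    def linkage(pair):
        # Complete linkage: cluster distance = largest pairwise distance.
        c1, c2 = pair
        return max(dist[frozenset((x, y))] for x in c1 for y in c2)

    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(c1), sorted(c2), linkage((c1, c2))))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


reactions_by_species = {
    "species_a": {"rxn-1", "rxn-2", "rxn-3"},
    "species_b": {"rxn-1", "rxn-2", "rxn-3", "rxn-4"},
    "species_c": {"rxn-5", "rxn-6"},
}
merges = complete_linkage(reactions_by_species)
for left, right, height in merges:
    print(left, right, height)
```

Each merge records the two clusters joined and the height at which they join, which is the information a dendrogram displays.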
flux_analysis
- Description:
Run a flux balance analysis (FBA) with the cobra package. If the flux is > 0, also run a flux variability analysis (FVA) and print the results to standard output.
usage:
flux_analysis.py --sbml=FILE
flux_analysis.py --sbml=FILE --seeds=FILE --targets=FILE
flux_analysis.py --sbml=FILE --all_species
option:
-h --help Show help.
--sbml=FILE pathname to the sbml file to test for fba and fva.
--seeds=FILE pathname to the sbml file containing the seeds (medium).
--targets=FILE pathname to the sbml file containing the targets.
--all_species run FBA on all the metabolites of the given model.
get_pwy_from_rxn
- Description:
From a file containing a list of reactions, return the pathways in which these reactions are involved. ex: if rxn-a is in pwy-x, return: pwy-x; all rxn ids in pwy-x; all rxn ids in pwy-x FROM the list; ratio
usage:
get_pwy_from_rxn.py --reaction_file=FILE --padmetRef=FILE --output=FILE
options:
-h --help Show help.
--reaction_file=FILE pathname of the file containing the reactions id, 1/line
--padmetRef=FILE pathname of the padmet representing the database.
--output=FILE pathname of the file with line = pathway id, all reactions id, reactions ids from reaction file, ratio. sep = " "
padmet_utils.exploration.get_pwy_from_rxn.dict_pwys_to_file(dict_pwy, output)
Create a csv file from dict_pwy. dict_pwy is obtained with extract_pwys().
Parameters:
padmet_utils.exploration.get_pwy_from_rxn.extract_pwys(padmet, reactions)
Extract from padmet the pathways containing 1-n reactions from the set of reactions ‘reactions’. Return a dict of data.
Parameters:
- padmet (padmet.classes.PadmetSpec) – padmet to update
- reactions (set) – set of reactions to match with pathways
Returns: dict, k=pathway_id, v=dict with k in [total_rxn, rxn_from_list, ratio]. ex: {pwy-x: {‘total_rxn’: [a,b,c], ‘rxn_from_list’: [a], ‘ratio’: 1/3}}
Return type:
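The shape of the returned dict can be illustrated with a small sketch that works on a plain mapping of pathway ids to reaction ids instead of a padmet object. `pathways_from_reactions` and its input mapping are hypothetical; only the keys total_rxn, rxn_from_list and ratio come from the description of extract_pwys.

```python
def pathways_from_reactions(pathway_reactions, reactions):
    """Keep pathways sharing at least one reaction with `reactions`.

    pathway_reactions: dict pathway id -> iterable of reaction ids
    (a hypothetical stand-in for what would be read from a padmet).
    """
    result = {}
    for pwy_id, rxns in pathway_reactions.items():
        total_rxn = sorted(set(rxns))
        rxn_from_list = sorted(set(total_rxn) & set(reactions))
        if rxn_from_list:
            result[pwy_id] = {
                "total_rxn": total_rxn,
                "rxn_from_list": rxn_from_list,
                "ratio": len(rxn_from_list) / len(total_rxn),
            }
    return result


dict_pwy = pathways_from_reactions(
    {"pwy-x": ["a", "b", "c"], "pwy-y": ["d"]},
    {"a", "e"},
)
print(dict_pwy)
```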
padmet_stats
- Description:
From a file containing a list of reactions, return the pathways in which these reactions are involved. ex: if rxn-a is in pwy-x, return: pwy-x; all rxn ids in pwy-x; all rxn ids in pwy-x FROM the list; ratio
usage:
get_pwy_from_rxn.py --reaction_file=FILE --padmetRef=FILE --output=FILE
options:
-h --help Show help.
--reaction_file=FILE pathname of the file containing the reactions id, 1/line
--padmetRef=FILE pathname of the padmet representing the database.
--output=FILE pathname of the file with line = pathway id, all reactions id, reactions ids from reaction file, ratio. sep = " "
padmet_utils.exploration.get_pwy_from_rxn.main()
report_network
- Description:
Create reports of a padmet file.
all_pathways.tsv: header = [“dbRef_id”, “Common name”, “Number of reaction found”, “Total number of reaction”, “Ratio (Reaction found / Total)”]
all_reactions.tsv: header = [“dbRef_id”, “Common name”, “formula (with id)”, “formula (with common name)”, “in pathways”, “associated genes”]
all_metabolites.tsv: header = [“dbRef_id”, “Common name”, “Produced (p), Consumed (c), Both (cp)”]
usage:
report_network.py --padmetSpec=FILE --output_dir=dir [--padmetRef=FILE] [-v]
options:
-h --help Show help.
--padmetSpec=FILE pathname of the padmet file.
--padmetRef=FILE pathname of the padmet file used as database
--output_dir=dir directory for the results.
-v print info.
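The "Produced (p), Consumed (c), Both (cp)" column of all_metabolites.tsv can be pictured with the following sketch. It assumes each reaction is given as a pair (reactants, products) and ignores reversibility; `classify_metabolites` is a hypothetical helper, not part of report_network.

```python
def classify_metabolites(reactions):
    """Label each metabolite 'p' (produced), 'c' (consumed) or 'cp' (both).

    reactions: iterable of (reactants, products) pairs, a simplified and
    hypothetical representation that ignores reversibility.
    """
    consumed, produced = set(), set()
    for reactants, products in reactions:
        consumed.update(reactants)
        produced.update(products)
    status = {}
    for cpd in sorted(consumed | produced):
        if cpd in consumed and cpd in produced:
            status[cpd] = "cp"
        elif cpd in produced:
            status[cpd] = "p"
        else:
            status[cpd] = "c"
    return status


status = classify_metabolites(
    [({"cpd-1", "cpd-2"}, {"cpd-3"}), ({"cpd-3"}, {"cpd-4"})]
)
print(status)
```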