Add kMetaShot and DM for kMetaShot #7421
Open
SantaMcCloud wants to merge 29 commits into galaxyproject:main from SantaMcCloud:add_kMetaShot
Commits (29):
3862f71 Add kMetaShot and DM for kMetaShot (SantaMcCloud)
f251d0e fix test and remove unneeded file (SantaMcCloud)
4ef6892 Update tools/kmetashot/kmetashot.xml (SantaMcCloud)
7e91306 Update tools/kmetashot/kmetashot.xml (SantaMcCloud)
a961b6c Update data_managers/data_manager_kmetashot/data_manager_conf.xml (SantaMcCloud)
2953837 make file names shell safe (SantaMcCloud)
5e9d599 Merge branch 'add_kMetaShot' of https://github.com/SantaMcCloud/tools… (SantaMcCloud)
cfa9bf1 change help text for param (SantaMcCloud)
e10b88e more information about the output (SantaMcCloud)
0acaf08 rewrote DM and change help section (SantaMcCloud)
4d0b6eb add pyscript (SantaMcCloud)
6a8350d fix test (SantaMcCloud)
a56eec4 fix linting (SantaMcCloud)
0acc0e1 typo (SantaMcCloud)
be65946 fix assert_content in test (SantaMcCloud)
c22e642 change DM to a single file tool (SantaMcCloud)
61157d7 remove .py file§ (SantaMcCloud)
ed60219 chage profile version (SantaMcCloud)
e3d8e77 remove output label (SantaMcCloud)
a230944 change single quotes to double quotes (SantaMcCloud)
102fb8f restart test (SantaMcCloud)
2ed91cf fix test (SantaMcCloud)
e9bbec2 Merge branch 'add_kMetaShot' of https://github.com/SantaMcCloud/tools… (SantaMcCloud)
b1b05c1 maybe test fix (SantaMcCloud)
587e629 maybe test fix (SantaMcCloud)
4e6cfc8 maybe test fix (SantaMcCloud)
d7821e7 fix test (SantaMcCloud)
bd8a260 change download link from version 1 (SantaMcCloud)
77d0f1b Update kmetashot.xml (SantaMcCloud)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| categories: | ||
| - Data Managers | ||
| - Metagenomics | ||
| homepage_url: https://github.com/gdefazio/kMetaShot | ||
| description: Data manager for kMetaShot reference data | ||
| long_description: Data manager for kMetaShot reference data | ||
| name: kmetashot_build_database | ||
| owner: iuc | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/data_managers/data_manager_kmetashot | ||
| type: unrestricted |
83 changes: 83 additions & 0 deletions
data_managers/data_manager_kmetashot/data_manager/kmetashot_datamanager.xml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| <tool id="kmetashot_build_database" name="kMetaShot" tool_type="manage_data" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>database builder</description> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">2.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">24.1</token> | ||
| </macros> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">kmetashot</requirement> | ||
| </requirements> | ||
| <command><![CDATA[ | ||
| mkdir -p "$out_file.extra_files_path" && | ||
| #if $test != "true": | ||
| #if $release == "1": | ||
| wget "https://zenodo.org/records/17591095/files/kMetaShot_reference.h5" && | ||
| mv "kMetaShot_reference.h5" "$out_file.extra_files_path" && | ||
| #else: | ||
| wget "https://zenodo.org/records/17375120/files/kMetaShot_bacteria_archaea_2025-05-22.h5" && | ||
| mv "kMetaShot_bacteria_archaea_2025-05-22.h5" "$out_file.extra_files_path" && | ||
| #end if | ||
| #else: | ||
| touch "$out_file.extra_files_path/kMetaShot_bacteria_archaea_2025-05-22.h5" && | ||
| #end if | ||
| cp "$dmjson" "$out_file" | ||
| ]]></command> | ||
| <configfiles> | ||
| <configfile name="dmjson"><![CDATA[ | ||
| { | ||
| "data_tables":{ | ||
| "kmetashot":[ | ||
| { | ||
| "dbkey":"kmetashot", | ||
| "version":"${release}", | ||
| ## "path" is relative to the extra files directory of out_file; data_manager_conf.xml moves it into place | ||
| #if $test == "true": | ||
| "path":"kMetaShot_bacteria_archaea_2025-05-22.h5", | ||
| "name":"kMetaShot reference data 2025-05-22 - TEST", | ||
| "value":"2025-05-22" | ||
| #else: | ||
| #if $release == "1": | ||
| "path":"kMetaShot_reference.h5", | ||
| "name":"kMetaShot reference data 2022-07-31", | ||
| "value":"2022-07-31" | ||
| #else: | ||
| "path":"kMetaShot_bacteria_archaea_2025-05-22.h5", | ||
| "name":"kMetaShot reference data 2025-05-22", | ||
| "value":"2025-05-22" | ||
| #end if | ||
| #end if | ||
| } | ||
| ] | ||
| } | ||
| }]]> | ||
| </configfile> | ||
| </configfiles> | ||
| <inputs> | ||
| <param name="release" type="select" multiple="false" label="kMetaShot reference data release"> | ||
| <option value="1">First release</option> | ||
| <option value="2">Second release</option> | ||
| </param> | ||
| <param name="test" type="hidden" value="" label="Run test"/> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="out_file" format="data_manager_json" /> | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="1"> | ||
| <param name="release" value="2"/> | ||
| <param name="test" value="true"/> | ||
| <output name="out_file"> | ||
| <assert_contents> | ||
| <has_text text="2025-05-22"/> | ||
| <has_text text="kMetaShot reference data 2025-05-22 - TEST"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
| Download the kMetaShot reference data (a single .h5 file) from Zenodo and register it in the kmetashot data table. | ||
| ]]></help> | ||
| <citations> | ||
| <citation type="doi">10.1093/bib/bbae680</citation> | ||
| </citations> | ||
| </tool> |
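
The branching in the `dmjson` config file above amounts to a small lookup from release to file name and date. A minimal Python sketch of that logic, run outside Galaxy, might look as follows. The names `RELEASES`, `build_entry` and `path_prefix` are illustrative assumptions and not part of the PR; `path_prefix` stands in for whatever directory the template puts in front of the file name. File names, dates and keys are copied from the template.

```python
import json
import posixpath

# File name and date label per release, copied from the Cheetah template.
RELEASES = {
    "1": ("kMetaShot_reference.h5", "2022-07-31"),
    "2": ("kMetaShot_bacteria_archaea_2025-05-22.h5", "2025-05-22"),
}


def build_entry(release, path_prefix="", test=False):
    """Return the data manager JSON document as a dict (hypothetical helper)."""
    if test:
        # The template's test branch always uses the 2025-05-22 file name.
        filename, date = RELEASES["2"]
        name = f"kMetaShot reference data {date} - TEST"
    else:
        filename, date = RELEASES[release]
        name = f"kMetaShot reference data {date}"
    entry = {
        "dbkey": "kmetashot",
        "version": release,
        # An empty prefix leaves the bare file name unchanged.
        "path": posixpath.join(path_prefix, filename),
        "name": name,
        "value": date,
    }
    return {"data_tables": {"kmetashot": [entry]}}


if __name__ == "__main__":
    print(json.dumps(build_entry("2", test=True), indent=2))
```

Keeping the release metadata in one table makes it harder for the command block and the JSON block to drift apart when a third release is added.
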
20 changes: 20 additions & 0 deletions
data_managers/data_manager_kmetashot/data_manager_conf.xml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| <data_managers> | ||
| <data_manager tool_file="data_manager/kmetashot_datamanager.xml" id="kmetashot_build_database"> | ||
| <data_table name="kmetashot"> | ||
| <output> | ||
| <column name="value"/> | ||
| <column name="dbkey"/> | ||
| <column name="name"/> | ||
| <column name="version"/> | ||
| <column name="path" output_ref="out_file"> | ||
| <move type="file"> | ||
| <source>${path}</source> | ||
| <target base="${GALAXY_DATA_MANAGER_DATA_PATH}">kmetashot/${value}/${path}</target> | ||
| </move> | ||
| <value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/kmetashot/${value}/${path}</value_translation> | ||
| <value_translation type="function">abspath</value_translation> | ||
| </column> | ||
| </output> | ||
| </data_table> | ||
| </data_manager> | ||
| </data_managers> |
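
Both the `<move>` target and the `value_translation` interpolate `${value}` and `${path}` into `kmetashot/${value}/${path}`. The sketch below shows how such a template resolves for a bare file name and for an absolute path. It assumes plain `${...}` substitution followed by a POSIX path join, which is a simplification of what Galaxy does; `resolve_target` and the example paths are hypothetical.

```python
import posixpath
from string import Template

# Target template copied from data_manager_conf.xml.
TARGET = Template("kmetashot/${value}/${path}")


def resolve_target(base, value, path):
    """Expand the template and join it onto the base directory (hypothetical helper)."""
    relative = TARGET.substitute(value=value, path=path)
    return posixpath.normpath(posixpath.join(base, relative))


if __name__ == "__main__":
    # A bare file name yields a flat, predictable location.
    print(resolve_target("/data", "2025-05-22", "ref.h5"))
    # An absolute path gets nested below the value directory.
    print(resolve_target("/data", "2025-05-22", "/tmp/job/ref.h5"))
```

Under these assumptions, an absolute `path` value ends up nested under `kmetashot/${value}/`, which resembles the long path in the second row of the test `.loc` file in this PR.
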
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| | ||
| 25-05-22 kMetaShot-25-05-22 kMetaShot reference data 2025-05-22 2 /tmp/tmpf_hplx2a/galaxy-dev/tool-data/kmetashot/2/kMetaShot_bacteria_archaea_2025-05-22.h5 | ||
| 2025-05-22 kmetashot kMetaShot reference data 2025-05-22 - TEST 2 /home/sf373/sf373/galaxy/tool-data/kmetashot/2025-05-22/tmp/tmpt500ppxr/job_working_directory/000/1/outputs/dataset_46c18f80-8c8e-416d-ad88-f91119feab82_files/kMetaShot_bacteria_archaea_2025-05-22.h5 |
9 changes: 9 additions & 0 deletions
data_managers/data_manager_kmetashot/tool-data/kmetashot.loc.sample
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| #This is a sample file distributed with Galaxy that enables tools | ||
| #to use the kMetaShot database. | ||
| #You will need to create these data files using the following command: | ||
| | ||
| #wget [selected version] [url_from_download] | ||
| | ||
| #The <version> column indicates which release of the kMetaShot reference data was downloaded. | ||
| | ||
| #25-05-22 kMetaShot-25-05-22 kMetaShot reference data 2025-05-22 2 /mnt/galaxyIndices/kMetaShot_database/kMetaShot_bacteria_archaea_2025-05-22.h5 |
6 changes: 6 additions & 0 deletions
data_managers/data_manager_kmetashot/tool_data_table_conf.xml.sample
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| <tables> | ||
| <table name="kmetashot" comment_char="#" allow_duplicate_entries="False"> | ||
| <columns>value, dbkey, name, version, path</columns> | ||
| <file path="tool-data/kmetashot.loc" /> | ||
| </table> | ||
| </tables> |
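
The table definition above names five columns and `#` as the comment character. As a rough illustration of how a `.loc` file with that layout can be read, here is a minimal Python sketch. It assumes the fields are tab-separated, as in Galaxy `.loc` files; `parse_loc` is a hypothetical helper and not Galaxy's own loader.

```python
# Column order copied from tool_data_table_conf.xml.sample.
COLUMNS = ["value", "dbkey", "name", "version", "path"]


def parse_loc(text, comment_char="#"):
    """Parse .loc content into a list of dicts keyed by column name."""
    entries = []
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        entries.append(dict(zip(COLUMNS, fields)))
    return entries


if __name__ == "__main__":
    sample = (
        "#value\tdbkey\tname\tversion\tpath\n"
        "2025-05-22\tkmetashot\tkMetaShot reference data 2025-05-22\t2\t/data/ref.h5\n"
    )
    for entry in parse_loc(sample):
        print(entry["value"], entry["path"])
```
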
7 changes: 7 additions & 0 deletions
data_managers/data_manager_kmetashot/tool_data_table_conf.xml.test
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| <tables> | ||
| <!-- Location of kmetashot indexes for testing --> | ||
| <table name="kmetashot" comment_char="#" allow_duplicate_entries="False"> | ||
| <columns>value, dbkey, name, version, path</columns> | ||
| <file path="${__HERE__}/test-data/kmetashot.loc" /> | ||
| </table> | ||
| </tables> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| name: kmetashot | ||
| owner: iuc | ||
| description: an alignment-free taxonomic classifier based on k-mer/minimizer counting | ||
| long_description: | | ||
| kMetaShot is a bioinformatic approach that relies on k-mer/minimizer profiling | ||
| of reference prokaryotic genomes to build a concise | ||
| representation of genomic diversity and perform MAG taxonomic | ||
| classification up to the strain level. | ||
| homepage_url: https://github.com/gdefazio/kMetaShot | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/kmetashot | ||
| categories: | ||
| - Metagenomics | ||
| type: unrestricted |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| <tool id="kmetashot" name="kMetaShot" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>an alignment-free taxonomic classifier based on k-mer/minimizer counting</description> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">2.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">24.1</token> | ||
| </macros> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">kmetashot</requirement> | ||
| </requirements> | ||
| <command detect_errors="exit_code"> | ||
| <![CDATA[ | ||
| #import re | ||
| | ||
| mkdir "output" "bins" && | ||
| | ||
| #for $file in $bins_dir: | ||
| #set $identifier = re.sub("[^\s\w\-]", "_", str($file.element_identifier)) | ||
| ln -s "$file" "bins/${identifier}.$file.ext" && | ||
| #end for | ||
| | ||
| kMetaShot_classifier_NV.py | ||
| -b "bins" | ||
| -o "output" | ||
| -r "$reference.fields.path" | ||
| -p "\${GALAXY_SLOTS:-1}" | ||
| -a ${ass2ref} | ||
| | ||
| ]]> | ||
| </command> | ||
| <inputs> | ||
| <param argument="--bins_dir" type="data" multiple="true" format="fasta,fasta.gz" label="Bin(s)/MAG(s) fasta file"/> | ||
| <param argument="--reference" type="select" label="Select reference"> | ||
| <options from_data_table="kmetashot"> | ||
| <filter type="sort_by" column="2"/> | ||
| </options> | ||
| <validator type="no_options" message="No reference data for kMetaShot is installed. Please contact the Galaxy administrators to request that one be installed."/> | ||
| </param> | ||
| <param argument="--ass2ref" type="float" min="0.0" value="0.0" max="1.0" label="Set ass2ref parameter" help="Set the ass2ref value (0.0 to 1.0), which is based on the number of non-redundant minimizers found in the classified MAG for the classified strain"/> | ||
| </inputs> | ||
| <outputs> | ||
| <collection name="result" type="list"> | ||
| <discover_datasets pattern="(?P&lt;designation&gt;.*)\.csv" format="csv" directory="output"/> | ||
| </collection> | ||
| </outputs> | ||
| <tests> | ||
| <!-- The tool needs its reference data to run, so it cannot be tested end to end; these tests only check that the tool starts with the expected command line --> | ||
| <test expect_exit_code="1" expect_failure="true"> | ||
| <param name="bins_dir" value="all_contig.fasta.gz" ftype="fasta.gz"/> | ||
| <param name="ass2ref" value="0.2"/> | ||
| <assert_command> | ||
| <has_text text="kMetaShot_classifier_NV.py -b bins"/> | ||
| <has_text text="-o output"/> | ||
| <has_text text="-a 0.2"/> | ||
| </assert_command> | ||
| </test> | ||
| <test expect_exit_code="1" expect_failure="true"> | ||
| <param name="bins_dir" value="all_contig.fasta.gz" ftype="fasta.gz"/> | ||
| <param name="ass2ref" value="0.3"/> | ||
| <assert_command> | ||
| <has_text text="kMetaShot_classifier_NV.py -b bins"/> | ||
| <has_text text="-o output"/> | ||
| <has_text text="-a 0.3"/> | ||
| </assert_command> | ||
| </test> | ||
| </tests> | ||
| <help> | ||
| <![CDATA[ | ||
| | ||
| To learn more about how the tool works, visit the `kMetaShot GitHub page <https://github.com/gdefazio/kMetaShot>`_. | ||
| | ||
| **Input** | ||
| | ||
| One or more bin/MAG files in FASTA format, plain or gzip-compressed (allowed extensions: .fa, .fasta, .fna, .fa.gz, .fasta.gz, .fna.gz). | ||
| | ||
| **Reference** | ||
| | ||
| The reference data must be installed by a Galaxy administrator with the kMetaShot data manager, which can install either the first or the second release of the data. | ||
| | ||
| **Output** | ||
| | ||
| The output is a collection of CSV files containing the classification of the input bin(s)/MAG(s). | ||
| | ||
| Each line of an output file has the following structure: num,bin,ass2ref,taxid,species,genus,family,order,class,phylum,superkingdom,organism_name | ||
| For every bin that the tool can classify, the NCBI taxonomy IDs are written in this order, followed by the organism name. | ||
| | ||
| ]]> | ||
| </help> | ||
| <citations> | ||
| <citation type="doi">10.1093/bib/bbae680</citation> | ||
| </citations> | ||
| </tool> | ||
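
The `#for` loop in the command block makes element identifiers shell-safe before symlinking each input into `bins/`. The same substitution can be tried in plain Python, as sketched below. The regular expression is copied from the tool; `safe_link_name` and the example identifiers are hypothetical.

```python
import re


def safe_link_name(element_identifier, ext):
    """Build the symlink path used for one input bin (hypothetical helper).

    Every character that is not whitespace, a word character or a hyphen
    is replaced with an underscore, as in the Cheetah loop.
    """
    identifier = re.sub(r"[^\s\w\-]", "_", str(element_identifier))
    return f"bins/{identifier}.{ext}"


if __name__ == "__main__":
    for name in ["bin.1", "sample/A;rm", "MAG-007"]:
        print(safe_link_name(name, "fasta.gz"))
```

Whitespace is left in place by this pattern, which is why the tool quotes the link name in the `ln -s` call.
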
Binary file not shown.
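
The help text of `kmetashot.xml` lists the comma-separated fields of each output line. A minimal Python sketch for reading such a file into dictionaries could look like this. It is not part of the PR; `read_classification` is a hypothetical helper, the sample row is made up purely for illustration, and whether real output files start with a header line is not stated in the PR, so the sketch tolerates both cases.

```python
import csv
import io

# Field order copied from the tool's help text.
COLUMNS = [
    "num", "bin", "ass2ref", "taxid", "species", "genus", "family",
    "order", "class", "phylum", "superkingdom", "organism_name",
]


def read_classification(handle):
    """Read kMetaShot-style CSV rows into dicts keyed by field name."""
    rows = []
    for record in csv.reader(handle):
        # Skip empty lines and an optional header line.
        if not record or record == COLUMNS:
            continue
        if len(record) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(record)}")
        rows.append(dict(zip(COLUMNS, record)))
    return rows


if __name__ == "__main__":
    # Made-up row; the numbers are placeholders, not real taxonomy IDs.
    sample = io.StringIO("0,bin_a,0.5,1,2,3,4,5,6,7,8,Example organism\n")
    for row in read_classification(sample):
        print(row["bin"], row["ass2ref"], row["organism_name"])
```
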