Services - emiraest (SOAP and Soaplab) [Provided by: European Bioinformatics Institute (EMBL-EBI)]

About Soaplab

Soaplab services are command-line applications wrapped as SOAP services and served from a Soaplab server. Because they share a standardised interface, all Soaplab services expose the same generic set of SOAP operations (the exact set depends on the Soaplab version).

Certain tools, like the Taverna workflow workbench, provide automatic support for the Soaplab way of executing these services. In some cases you will need to use the Soaplab Server Base URL rather than the WSDL location in these tools.

Provider:
European Bioinformatics Institute (EBI)

Location:
UNITED KINGDOM

Submitter / Source:
Gilles (29 days ago)

Base URL:
http://www.ebi.ac.uk/soaplab/services/assembly_fragment_assembly.emiraest

WSDL Location:
http://www.ebi.ac.uk/soaplab/services/assembly_fragment_assembly.emiraest?wsdl
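
Some clients want the Soaplab Base URL and others want the WSDL location. On this page the two differ only by the `?wsdl` suffix, so one can be derived from the other. The following is a minimal sketch under that assumption; the helper names `wsdl_location` and `service_id` are hypothetical and are not part of Soaplab.

```python
from urllib.parse import urlsplit

# Base URL exactly as listed on this page.
BASE_URL = "http://www.ebi.ac.uk/soaplab/services/assembly_fragment_assembly.emiraest"


def wsdl_location(base_url: str) -> str:
    """Append '?wsdl' to a Soaplab base URL (assumed convention, as seen on this page)."""
    return base_url.rstrip("/") + "?wsdl"


def service_id(base_url: str) -> str:
    """Return the last path segment of the base URL."""
    return urlsplit(base_url).path.rstrip("/").rsplit("/", 1)[-1]


print(wsdl_location(BASE_URL))
print(service_id(BASE_URL))
```

Tools that ask for the Soaplab Server Base URL would be given `BASE_URL` itself rather than the derived WSDL address.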

Documentation URL(s):
None

Description(s):
No description(s) yet

Details (from Soaplab server):
from Soaplab server (8 days ago)

ds_lsr_analysis :
- analysis :
  - name : emiraest
  - output :
  - type : Assembly Fragment Assembly
  - version : 6.3.0
  - installation : Soaplab2 default installation
  - description : MIRAest fragment assembly program
  - analysis_extension :
    - parameter :
      - standard :
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Set parameters suited to the input type
        
        list_item :
        
        shown_as : Unspecified
        
        level : 0
        
        value : unspecified
        
        shown_as : Fasta
        
        level : 0
        
        value : fasta
        
        shown_as : PHD
        
        level : 0
        
        value : phd
        
        shown_as : CAF
        
        level : 0
        
        value : caf
        
        type : full
        
        repeatable :
      - base :
      - data :
        
        ioformat : url
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : url
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - data :
        
        ioformat : unspecified
        
        iotype : input
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Quality grades of de-novo assembly
        
        list_item :
        
        shown_as : Draft
        
        level : 0
        
        value : draft
        
        shown_as : Normal
        
        level : 0
        
        value : normal
        
        shown_as : Accurate
        
        level : 0
        
        value : accurate
        
        type : full
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Quality grades for mapping
        
        list_item :
        
        shown_as : Draft
        
        level : 0
        
        value : draft
        
        shown_as : Normal
        
        level : 0
        
        value : normal
        
        shown_as : Accurate
        
        level : 0
        
        value : accurate
        
        type : full
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Clipping grade modifiers
        
        list_item :
        
        shown_as : Light
        
        level : 0
        
        value : light
        
        shown_as : Medium
        
        level : 0
        
        value : medium
        
        shown_as : Heavy
        
        level : 0
        
        value : heavy
        
        type : full
        
        repeatable :
      - base :
      - base :
        
        ordering : 21
        
        name : highlyrepetitive
        
        help : A modifier switch for genome data that is deemed to be highly repetitive. The assemblies will run slower due to more iterative cycles that give mira a chance to resolve nasty repeats.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : highlyrepetitive
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Highly repetitive DNA
      - base :
        
        ordering : 22
        
        name : highqualitydata
        
        help : A modifier switch for when the sequences used are of exceptional quality. mira will then bump up a few quality parameters, which should lead to fewer false positives in the repeat and SNP detection routines.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : highqualitydata
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : High quality data
      - base :
        
        ordering : 23
        
        name : estmode
        
        help : Switches mira to a good initial preset for assembling EST data. Note that this is not needed (and even counterproductive) when used with miraEST.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : estmode
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Preset EST assembly mode
      - base :
        
        ordering : 24
        
        name : horrid
        
        help : Sets a number of parameters useful when dealing with really horrid data sets. Useful means that parameters are chosen so that time and memory consumption do not explode beyond all hope of the program returning. Note that MIRA will in most cases return useful assemblies with this switch, but these might not be as optimised as with normal operation. The definition of ‘horrid’ is a bit flexible, for example, (a) a genomic project with more than 2,000 reads that all seem to align partly to each other but have different repetitive structures or (b) EST clusters with a few thousand almost similar reads.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : horrid
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Preset horrid data set mode
      - base :
        
        ordering : 25
        
        name : borg
        
        help : Sets several parameters to have mira try to assemble as many reads as possible. Will probably slow down the assembly process and use more memory. ‘We are MIRA of borg. You will be assembled, resistance is futile!’
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : borg
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Force assembly
      - standard :
        
        list :
        
        name : Load Job Type
        
        list_item :
        
        shown_as : EXP files from a file of filenames
        
        level : 0
        
        value : fofnexp
        
        shown_as : Load and assemble FASTA
        
        level : 0
        
        value : fasta
        
        shown_as : Load and assemble CAF
        
        level : 0
        
        value : caf
        
        shown_as : Load and assemble PHD
        
        level : 0
        
        value : phd
        
        shown_as : PHD files from a file of filenames
        
        level : 0
        
        value : fofnphd
        
        type : full
        
        repeatable :
      - base :
      - base :
        
        ordering : 27
        
        name : fo
        
        help : If set to ‘Y’, the project will not be assembled and no assembly output files will be produced. Instead, the project files will only be loaded. This switch is useful for checking consistency of input files.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : fo
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Filecheck only
      - base :
        
        ordering : 28
        
        name : mxti
        
        help : Some file formats above (FASTA, PHD or even CAF and EXP) possibly don’t contain all the info necessary or useful for each read of an assembly. Should additional information, such as clipping positions, be available in an XML trace info file in NCBI format (see File formats), then set this option to ‘Y’ and it will be merged into the data loaded. Please note that quality clippings given here will override quality clippings loaded earlier or performed by mira. Minimum clippings will still be made by the program, though.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : mxti
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Merge XML trace info
      - standard :
        
        list :
        
        name : Read Naming Scheme
        
        list_item :
        
        shown_as : Sanger
        
        level : 0
        
        value : sanger
        
        shown_as : TIGR
        
        level : 0
        
        value : tigr
        
        type : full
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : External quality
        
        list_item :
        
        shown_as : None
        
        level : 0
        
        value : none
        
        shown_as : SCF
        
        level : 0
        
        value : SCF
        
        type : full
        
        repeatable :
      - base :
      - base :
        
        ordering : 31
        
        name : eqo
        
        help : Only takes effect when ‘lj’ is fofnexp. Defines whether or not the qualities from the external source override the possibly loaded qualities from the load job project. This might be of use in case some post-processing software fiddles around with the quality values of the input file but one wants to have the original ones.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : eqo
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : External quality override
      - base :
        
        ordering : 32
        
        name : droeqe
        
        help : Defines whether a read is excluded from the assembly when there is a major mismatch between the external quality source and the sequence (e.g. the base sequence read from an SCF file does not match the originally read base sequence). If the read is not excluded, it will use the qualities it had before trying to load the external qualities (either default qualities or the ones loaded from the original source).
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : droeqe
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Discard read on eq error
      - base :
        
        ordering : 33
        
        name : uti
        
        help : Two reads sequenced from the same clone template form a read pair with a known minimum and maximum distance. This feature will definitely help for contigs containing lots of repeats. Set this to ‘Y’ if your data contains information on insert sizes. Information on insert sizes can be given via the SI tag in EXP files (for each read pair individually), or for the whole project using dismin and dismax.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : uti
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Use template information
      - range :
        
        format : %d
        
        max : 4
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 34
        
        name : ess
        
        help : Controls the starting step of the EST assembly and is therefore only useful in miraEST. EST assembly is a three-step process, each step with different settings for the assembly engine, with the result of each step being saved to disk. If results of previous steps are present in a directory, one can easily ‘play around’ with different settings for subsequent steps by reusing the results of the previous steps and directly starting with step two or three.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 4
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : ess
        
        default : 1
        
        mandatory : false
        
        type : long
        
        prompt : Integer start step
      - base :
        
        ordering : 35
        
        name : ps
        
        help : Controls whether date and time are printed out during the assembly. Suppressing it isn’t useful in normal operation, only when debugging or benchmarking.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : ps
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Print date
      - base :
        
        ordering : 36
        
        name : lsd
        
        help : Straindata is a key value file, one read per line. First the name of the read, then the strain name of the organism the read comes from. It is used by the program to differentiate different types of SNPs appearing in organisms and classifying them.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : lsd
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Load straindata
      - base :
        
        ordering : 37
        
        name : lb
        
        help : A backbone is a sequence (or a previous assembly) that is used as a template for the current assembly. The current assembly process will first assemble reads to loaded backbone contigs before creating new contigs. This feature is helpful for assembling against previous (and already possibly edited) assembly iterations, or to make a comparative assembly of two very closely related organisms. Please read ‘very closely related’ as in ‘only SNP mutations or short indels present’.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : lb
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Load backbone
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 38
        
        name : sbuip
        
        help : When assembling against backbones, this parameter defines the pass iteration (see nop) from which on the backbones will be really used. In the passes preceding this number, the non-backbone reads will be assembled together as if no backbones existed. This allows mira to correctly spot repetitive stretches that differ by single bases and tag them accordingly. Rule of thumb: if backbones belong to the same strain as the reads to assemble, set to 1. If backbones are a different strain, then set sbuip to 1 lower than nop (example: nop 4 and sbuip 3).
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : sbuip
        
        default : 3
        
        mandatory : false
        
        type : long
        
        prompt : Start backbone usage in pass
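
The rule of thumb in the `sbuip` help text can be written down as a small helper. This is only an illustrative sketch: the function `suggest_sbuip` and its `same_strain` flag are hypothetical names, and the clamp to 1 is an assumption taken from the minimum of 1 listed for this parameter.

```python
def suggest_sbuip(nop: int, same_strain: bool) -> int:
    """Suggest a value for sbuip following the rule of thumb in the help text.

    nop         -- number of assembly passes (listed minimum: 1)
    same_strain -- True if the backbones are the same strain as the reads
    """
    if nop < 1:
        raise ValueError("nop must be at least 1")
    if same_strain:
        # Same strain: use the backbones from the first pass on.
        return 1
    # Different strain: one pass lower than nop, never below the listed minimum of 1.
    return max(1, nop - 1)


print(suggest_sbuip(4, same_strain=False))
print(suggest_sbuip(4, same_strain=True))
```

With `nop` set to 4 and a backbone from a different strain, the helper reproduces the example from the help text (sbuip 3).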
      - standard :
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Backbone File type
        
        list_item :
        
        shown_as : Fasta
        
        level : 0
        
        value : fasta
        
        shown_as : CAF
        
        level : 0
        
        value : caf
        
        shown_as : GenBank
        
        level : 0
        
        value : gbf
        
        type : full
        
        repeatable :
      - base :
      - range :
        
        format : %d
        
        max : 3000
        
        min : 1000
        
        repeatable :
      - base :
        
        ordering : 41
        
        name : brl
        
        help : Parameter for the internal sectioning size of the backbone. Extremely repetitive sequences may require reducing the default value, but the default value should work well in 99.9% of all cases.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 3000
        
        type : style
        
        name : scalemin
        
        value : 1000
        
        type : style
        
        qualifier : brl
        
        default : 2500
        
        mandatory : false
        
        type : long
        
        prompt : Backbone rail length
      - range :
        
        format : %d
        
        max : 100
        
        min : -1
        
        repeatable :
      - base :
        
        ordering : 42
        
        name : bbq
        
        help : Defines the default quality that the backbone sequences have if they came without quality values in their files (like in GBF format or when FASTA is used without .qual files). A value of -1 causes mira to use the same default quality for backbones as for reads.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : -1
        
        type : style
        
        qualifier : bbq
        
        default : -1
        
        mandatory : false
        
        type : long
        
        prompt : Backbone base quality
      - base :
        
        ordering : 43
        
        name : abnc
        
        help : The standard mode of the assembler is to assemble available reads to a backbone and make new contigs with the remaining reads. If this option is set to ‘N’, the reads that cannot be assembled into existing contigs are put as singlets into the assembly, not forming new contigs.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : abnc
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Also build new contigs
      - range :
        
        format : %d
        
        min : 20
        
        repeatable :
      - base :
        
        ordering : 44
        
        name : mrl
        
        help : Minimum length that reads must have to be considered for the assembly. Shorter sequences will be filtered out at the beginning of the process and won’t be present in the final project.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 20
        
        type : style
        
        qualifier : mrl
        
        default : 40
        
        mandatory : false
        
        type : long
        
        prompt : Minimum read length
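
The integer parameters listed so far each carry a range and a default. A client could check values against these ranges before submitting a job. The sketch below copies the minimum, maximum and default values from the listing above; the names `IntRange`, `INT_PARAMS` and `check_int_param` are hypothetical, and whether the Soaplab server itself enforces these limits is not stated on this page.

```python
from typing import NamedTuple, Optional


class IntRange(NamedTuple):
    minimum: Optional[int]  # None means no lower bound is listed
    maximum: Optional[int]  # None means no upper bound is listed
    default: int


# Values transcribed from the parameter listing on this page.
INT_PARAMS = {
    "ess": IntRange(1, 4, 1),
    "sbuip": IntRange(1, None, 3),
    "brl": IntRange(1000, 3000, 2500),
    "bbq": IntRange(-1, 100, -1),
    "mrl": IntRange(20, None, 40),
}


def check_int_param(name: str, value: int) -> int:
    """Return value unchanged if it lies inside the listed range, else raise."""
    rng = INT_PARAMS[name]
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} expects an integer")
    if rng.minimum is not None and value < rng.minimum:
        raise ValueError(f"{name}={value} is below the minimum {rng.minimum}")
    if rng.maximum is not None and value > rng.maximum:
        raise ValueError(f"{name}={value} is above the maximum {rng.maximum}")
    return value


print(check_int_param("brl", INT_PARAMS["brl"].default))
```
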
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 45
        
        name : nop
        
        help : Defines how many iterations of the whole assembly process are done. Rule of thumb – for quick and dirty assembly use 1 (not recommended). For assembly using read extensions and / or automatic contig editing (-ure and -ace) use at least 2. The recommended setting is 3 or higher, as some knowledge generated by the assembler can be used only from the third iteration on. More than 3 passes might be useful for projects containing many repetitive elements. See also -rbl and -mr for parameters that affect the assembly and disentanglement of possible repeats.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : nop
        
        default : 3
        
        mandatory : false
        
        type : long
        
        prompt : Number of passes
      - base :
        
        ordering : 46
        
        name : sep
        
        help : Defines whether the skim algorithm (and with it also the recalculation of Smith-Waterman alignments) is called in between each main pass. If set to ‘N’, skimming is done only when needed by the workflow, either when read extensions are searched for (-ure) or when possible vector leftovers are to be clipped (-pvc). Setting this option to ‘Y’ is highly recommended, setting it to ‘N’ is only for quick and dirty assemblies.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : sep
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Skim each pass
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 47
        
        name : rbl
        
        help : Defines the maximum number of times a contig can be rebuilt during main assembly passes (-nop) if misassemblies, due to possible repeats, are found.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : rbl
        
        default : 2
        
        mandatory : false
        
        type : long
        
        prompt : RMB break loops
      - base :
        
        ordering : 48
        
        name : sd
        
        help : Default is ‘Y’ for mira and ‘N’ for miraEST. A spoiler can be either a chimeric read or a read with long parts of unclipped vector sequence still included (that was too long for the -pvc vector leftover clipping routines). A spoiler typically prevents contigs from being joined; MIRA will cut spoilers back so that they no longer harm the assembly. Recommended for mid-to-high coverage genomic assemblies; not recommended for assemblies of ESTs, as one might lose splice variants. A minimum of two assembly passes (-nop) must be run for this option to take effect.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : sd
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Spoiler detection
      - base :
        
        ordering : 49
        
        name : sdlpo
        
        help : Defines whether the spoiler detection algorithms are run only for the last pass or for all passes (-nop). Takes effect only if spoiler detection (-sd) is on.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : sdlpo
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Spoiler detection last pass only
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 50
        
        name : bdq
        
        help : Defines the default base quality of reads that have no quality read from a file.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : bdq
        
        default : 10
        
        mandatory : false
        
        type : long
        
        prompt : Base default quality
      - base :
        
        ordering : 51
        
        name : ugpf
        
        help : MIRA has two different pathfinder algorithms it chooses from to find its way through the (more or less) complete set of possible sequence overlaps: a genomic and an EST pathfinder. The genomic pathfinder looks a bit into the future of the assembly and tries to stay on safe ground, using a maximum of information already present in the contig that is being built. The EST version, on the contrary, will directly jump at the complex cases posed by very similar repetitive sequences and try to solve those first; it is willing to fall back to brute force when really bad cases (such as coverage with thousands of sequences) are encountered. Generally, the genomic pathfinder will also work quite well with EST sequences (but might get slowed down a lot in pathological cases), while the EST algorithm does not work so well on genomes. If in doubt, leave as ‘Y’ for genome projects and set to ‘N’ for EST projects.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : ugpf
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Use genomic pathfinder
      - base :
        
        ordering : 52
        
        name : uess
        
        help : Another important switch if you plan to assemble non-normalised EST libraries, where some ESTs may reach coverages of several hundreds or thousands of reads. This switch lets MIRA save a lot of computational time when aligning those extremely high coverage areas (but only there), at the expense of some accuracy.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : uess
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Use emergency search stop
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 53
        
        name : esspd
        
        help : Defines the number of potential partners a read must have for MIRA to switch into emergency search stop mode for that read.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : esspd
        
        default : 500
        
        mandatory : false
        
        type : long
        
        prompt : Emergency search stop partner depth
      - base :
        
        ordering : 54
        
        name : umcbt
        
        help : Defines whether there is an upper limit of time to be used to build one contig. Set this to ‘Y’ in EST assemblies where you think that extremely high coverages occur. Less useful for assembly of genomic sequences.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : umcbt
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Use max contig build time
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 55
        
        name : bts
        
        help : Depending on -umcbt above, this number defines the time in seconds allotted to building one contig.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : bts
        
        default : 10000
        
        mandatory : false
        
        type : long
        
        prompt : Build time in seconds
      - base :
        
        ordering : 56
        
        name : ure
        
        help : Defines whether the read extension routines are run. These routines use a sliding window approach on Smith-Waterman alignments and can be called before assembly and/or after each assembly pass (see -rewl, -rewme and -feip).
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : ure
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Use read extension
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 57
        
        name : rewl
        
        help : Only takes effect when -ure is set to ‘Y’. The read extension routines use a sliding window approach on Smith-Waterman alignments. This parameter defines the window length.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : rewl
        
        default : 30
        
        mandatory : false
        
        type : long
        
        prompt : Read extension window length
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 58
        
        name : rewme
        
        help : Only takes effect when -ure is set to ‘Y’. The read extension routines use a sliding window approach on Smith-Waterman alignments. This parameter defines the maximum number of errors (disagreements) between two alignments in the given window.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : rewme
        
        default : 2
        
        mandatory : false
        
        type : long
        
        prompt : Read extension with max errors
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 59
        
        name : feip
        
        help : Only takes effect when -ure is set to ‘Y’. The read extension routines can be called before assembly and/or after each assembly pass (see -nop). This parameter defines the first pass in which the read extension routines are called. The default of 0 tells mira to extend the reads the first time before the first assembly pass.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : feip
        
        default : 0
        
        mandatory : false
        
        type : long
        
        prompt : First extension in pass
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 60
        
        name : leip
        
        help : Only takes effect when -ure is set to ‘Y’. The read extension routines can be called before assembly and/or after each assembly pass (see -nop). This parameter defines the last pass in which the read extension routines are called. The default of 0 tells mira to extend the reads the last time before the first assembly pass.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : leip
        
        default : 0
        
        mandatory : false
        
        type : long
        
        prompt : Last extension in pass
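The interplay of -ure, -feip and -leip can be sketched as a simple pass filter. This is only an illustration of the help texts above, not MIRA's code: the function name is hypothetical, and it assumes that pass 0 stands for "before the first assembly pass" and that both bounds are inclusive.

```python
def extension_runs_in_pass(pass_no: int, ure: bool, feip: int = 0, leip: int = 0) -> bool:
    """Return True if read extension would be called in the given pass.

    Assumption: pass_no 0 means "before the first assembly pass",
    and feip/leip are inclusive bounds.
    """
    if not ure:
        # Read extension is switched off entirely.
        return False
    return feip <= pass_no <= leip


# With the defaults, extension happens only once, before the first pass.
print(extension_runs_in_pass(0, ure=True))
print(extension_runs_in_pass(1, ure=True))
```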
      - base :
        
        ordering : 61
        
        name : tpae
        
        help : This option is useful in EST assembly. Poly-AT stretches at the end of reads that were not correctly masked or clipped in pre-processing steps from external programs get tagged here. The assembler will not use these stretches for critical operations. Additionally, the tags do provide a good visual anchor when looking at the assembly with different programs.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : tpae
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Tag poly-AT at ends
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 62
        
        name : pbwl
        
        help : Only takes effect when -tpae is set to ‘Y’. Defines the window length within which all bases (except the maximum number of errors allowed) must be either A or T to be considered a polybase stretch.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : pbwl
        
        default : 7
        
        mandatory : false
        
        type : long
        
        prompt : Polybase window length
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 63
        
        name : pbwme
        
        help : Only takes effect when -tpae is set to ‘Y’. Defines the maximum number of errors allowed in a given window length such that a stretch is considered to be a polybase stretch. The distribution of these errors is not important.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : pbwme
        
        default : 2
        
        mandatory : false
        
        type : long
        
        prompt : Polybase window max errors
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 64
        
        name : pbwgd
        
        help : Only takes effect when -tpae is set to ‘Y’. Defines the number of bases from the end of a sequence (if masked, from the end of the masked area) within which a polybase stretch is looked for without finding one.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : pbwgd
        
        default : 9
        
        mandatory : false
        
        type : long
        
        prompt : Polybase window grace distance
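A minimal sketch of the window test described by -pbwl and -pbwme. It assumes that a window counts as a polybase stretch when it is poly-A or poly-T apart from at most -pbwme deviating bases; the function name is hypothetical and the grace distance (-pbwgd) is not modelled.

```python
def is_polybase_window(window: str, pbwme: int = 2) -> bool:
    """Check one window (its length plays the role of -pbwl).

    Assumption: the window must be all A or all T, except for at most
    `pbwme` bases; where those errors sit in the window does not matter.
    """
    window = window.upper()
    for base in "AT":
        errors = sum(1 for b in window if b != base)
        if errors <= pbwme:
            return True
    return False


# A 7-base window (the default -pbwl) with two errors still qualifies.
print(is_polybase_window("AAGAACA"))
```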
      - base :
        
        ordering : 65
        
        name : pvc
        
        help : Mira will try to identify possible sequencing vector relicts present at the start of a sequence and clip them away. These relicts are usually a few bases long and were not correctly removed from the sequence in data pre-processing steps of external programs. You might want to turn off this option if you know (or think) that your data contains a lot of repeats and the option below to fine tune the clipping behaviour does not give the expected results.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : pvc
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Possible vector clip
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 66
        
        name : pvcmla
        
        help : The clipping of possible vector relicts option works quite well. Unfortunately the bounds of repeats or differences in EST splice variants sometimes show the same alignment behaviour as possible sequencing vector relicts and could therefore also be clipped. To stop the vector clipping from mistakenly clipping repetitive regions or EST splice variants, this option puts an upper bound to the number of bases a potential clip is allowed to have. If the number of bases is below or equal to this threshold then the bases are clipped. If the number of bases exceeds the threshold then the clip is NOT performed. Setting the value to 0 turns off the threshold i.e. clips are then always performed if a potential vector is found.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : pvcmla
        
        default : 18
        
        mandatory : false
        
        type : long
        
        prompt : Possible vector clip max length allowed
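The threshold rule in the -pvcmla help can be written down directly. The sketch below only restates that rule; the function name is hypothetical and detecting the potential relict itself is out of scope.

```python
def should_clip_vector(relict_length: int, pvcmla: int = 18) -> bool:
    """Decide whether a potential vector relict of this length is clipped."""
    if relict_length <= 0:
        # Nothing was found, so there is nothing to clip.
        return False
    if pvcmla == 0:
        # 0 turns the threshold off: every potential relict is clipped.
        return True
    # Clip only if the relict does not exceed the threshold.
    return relict_length <= pvcmla


print(should_clip_vector(18))   # at the threshold: clipped
print(should_clip_vector(19))   # above the threshold: kept
```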
      - base :
        
        ordering : 67
        
        name : qc
        
        help : Default is ‘N’, but is automatically set to ‘Y’ when using the setparam options ‘fasta’ or ‘phd’ (can be turned off again by subsequent options afterwards). This will let mira perform its own quality clipping before sequences are entered into the assembly. The clip function performed is a sequence end window quality clip with back iteration to get a maximum number of bases as useful sequence. Note that the bases clipped away here can still be used afterwards if there is enough evidence supporting their correctness when the option -ure is turned on.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : qc
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Quality clip
      - range :
        
        format : %d
        
        max : 35
        
        min : 15
        
        repeatable :
      - base :
        
        ordering : 68
        
        name : qcmq
        
        help : This is the minimum quality required of bases in a window in order to be accepted. Please be cautious and don’t use extreme values here, because then the clipping will be too lax or too harsh. Values below 15 and higher than 35 are disallowed.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 35
        
        type : style
        
        name : scalemin
        
        value : 15
        
        type : style
        
        qualifier : qcmq
        
        default : 20
        
        mandatory : false
        
        type : long
        
        prompt : Quality clip minimum quality
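A client may want to check -qcmq against the declared range (15 to 35) before submitting a job. The helper below is a hypothetical client-side check, not part of Soaplab or MIRA.

```python
def validate_qcmq(value: int, minimum: int = 15, maximum: int = 35) -> int:
    """Return the value unchanged if it lies in the allowed range, else raise."""
    if not minimum <= value <= maximum:
        raise ValueError(
            f"qcmq must be between {minimum} and {maximum}, got {value}"
        )
    return value


print(validate_qcmq(20))
```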
      - range :
        
        format : %d
        
        min : 10
        
        repeatable :
      - base :
        
        ordering : 69
        
        name : qcwl
        
        help : This is the length of a window in bases for the quality clip.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 10
        
        type : style
        
        qualifier : qcwl
        
        default : 30
        
        mandatory : false
        
        type : long
        
        prompt : Quality clip window length
      - base :
        
        ordering : 70
        
        name : mbc
        
        help : This will let mira perform a ‘clipping’ of bases that were masked out (replaced with the character X). It is generally not a good idea to use mask bases to remove unwanted portions of a sequence; the EXP file format and the NCBI traceinfo format have excellent possibilities to circumvent this. But because a lot of pre-processing software is built around cross_match, scylla- and phrap-style base masking, the need arose for mira to be able to handle this too. mira will look at the start and end of each sequence to see whether there are masked bases that should be ‘clipped’.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : mbc
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Masked bases clip
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 71
        
        name : mbcgs
        
        help : While performing the clip of masked bases, mira will look if it can merge larger chunks of masked bases that are a maximum of -mbcgs apart.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : mbcgs
        
        default : 20
        
        mandatory : false
        
        type : long
        
        prompt : Masked bases clip gap size
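The merging step controlled by -mbcgs can be pictured as joining runs of X that lie close together. This is a rough sketch under the assumption that the gap is measured in unmasked bases between two runs; the function name is hypothetical and the front and end gap limits (-mbcmfg, -mbcmeg) are not modelled.

```python
import re


def merged_masked_runs(seq: str, mbcgs: int = 20) -> list:
    """Return (start, end) spans of masked runs, merged across small gaps.

    Spans are half-open, like Python slices. Two runs are merged when
    at most `mbcgs` unmasked bases separate them (an assumption).
    """
    runs = [(m.start(), m.end()) for m in re.finditer(r"X+", seq.upper())]
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= mbcgs:
            # Close enough to the previous run: extend it.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


read = "XXXX" + "AC" + "XXX" + "A" * 30 + "XX"
print(merged_masked_runs(read))
```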
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 72
        
        name : mbcmfg
        
        help : While performing the clip of masked bases at the start of a sequence, mira will allow up to this number of unmasked bases in front of a masked stretch.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : mbcmfg
        
        default : 40
        
        mandatory : false
        
        type : long
        
        prompt : Masked bases clip max front gap
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 73
        
        name : mbcmeg
        
        help : While performing the clip of masked bases at the end of a sequence, mira will allow up to this number of unmasked bases behind a masked stretch.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : mbcmeg
        
        default : 60
        
        mandatory : false
        
        type : long
        
        prompt : Masked bases clip max end gap
      - base :
        
        ordering : 74
        
        name : emlc
        
        help : If on, ensures a minimum left clip on each read according to the parameters in -mlcr & -smlc
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : emlc
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Ensure minimum left clip
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 75
        
        name : mlcr
        
        help : If -emlc is ‘Y’, checks whether there is a left clip whose length is at least the size specified here.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : mlcr
        
        default : 25
        
        mandatory : false
        
        type : long
        
        prompt : Minimum left clip required
      - range :
        
        format : %d
        
        min : 0
        
        repeatable :
      - base :
        
        ordering : 76
        
        name : smlc
        
        help : If -emlc is ‘Y’ and the actual left clip is < -mlcr, then set the left clip of read to the value given here.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        qualifier : smlc
        
        default : 30
        
        mandatory : false
        
        type : long
        
        prompt : Set minimum left clip
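Taken together, -emlc, -mlcr and -smlc describe a small rule. The sketch below restates it with a hypothetical function name; it assumes clip positions are plain base counts.

```python
def ensure_min_left_clip(left_clip: int, emlc: bool = False,
                         mlcr: int = 25, smlc: int = 30) -> int:
    """Return the left clip after applying the minimum-left-clip rule."""
    if emlc and left_clip < mlcr:
        # The existing clip is shorter than required: set it to -smlc.
        return smlc
    return left_clip


print(ensure_min_left_clip(10, emlc=True))   # too short, raised to smlc
print(ensure_min_left_clip(40, emlc=True))   # long enough, unchanged
```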
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 77
        
        name : bph
        
        help : Default is 14 on 32 bit systems and 16 on 64 bit systems. Controls the number of consecutive bases n which are used as a word hash. The higher the value the faster the search. The lower the value the more weak matches are found. Values below 10 are not recommended.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : bph
        
        default : 14
        
        mandatory : false
        
        type : long
        
        prompt : Bases per hash
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 78
        
        name : hss
        
        help : This is a parameter controlling the stepping increments with which hashes are generated. This allows for a more fine-grained search as matches are now found with at least n+s (see -bph) equal bases instead of the SSAHA 2n. The higher the value the faster the search. The lower the value the more weak matches are found.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : hss
        
        default : 4
        
        mandatory : false
        
        type : long
        
        prompt : Hash saving step
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 79
        
        name : pr
        
        help : Controls the relative percentage of exact word matches in an approximate overlap that has to be reached to accept the overlap as a possible match. Increasing this number will decrease the number of possible alignments that have to be checked by Smith-Waterman later on in the assembly, but it might also lead to the rejection of weaker overlaps (i.e. overlaps that contain a higher number of mismatches).
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : pr
        
        default : 50
        
        mandatory : false
        
        type : long
        
        prompt : Percent required
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 80
        
        name : mhpr
        
        help : Controls the maximum number of possible hits one read can transport to the Smith-Waterman alignment phase. If more potential hits are found, only the best ones are taken. This is an important option for tackling projects that contain extreme assembly conditions. For example, 5000 reads that are all very similar would generate around 40 to 50 million possible alignments (forward and reverse complement). Setting this parameter to 200 reduces the number of alignments to check to around 1.5-2 million. As the assembly increases in passes (-nop), different combinations of possible hits will be checked, always the probably best ones first. So the accuracy of the assembly should only suffer when lowering this number too much.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : mhpr
        
        default : 200
        
        mandatory : false
        
        type : long
        
        prompt : Max hits per read
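The effect of -mhpr can be illustrated by keeping only the best-scoring hits per read and by a rough upper bound on the alignments to check. Both helpers are hypothetical; the bound assumes every kept hit is checked in forward and reverse-complement orientation, which matches the order of magnitude quoted in the help text but is not taken from MIRA itself.

```python
import heapq


def best_hits(hits, mhpr: int = 200):
    """Keep the `mhpr` best hits of one read; hits are (score, read_name) pairs."""
    return heapq.nlargest(mhpr, hits, key=lambda hit: hit[0])


def alignment_upper_bound(n_reads: int, hits_per_read: int) -> int:
    """Rough upper bound: each read, each kept hit, two orientations."""
    return n_reads * hits_per_read * 2


print(best_hits([(0.9, "r2"), (0.5, "r3"), (0.7, "r4")], mhpr=2))
print(alignment_upper_bound(5000, 4999))  # no cap: every other read is a hit
print(alignment_upper_bound(5000, 200))   # capped with mhpr=200
```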
      - range :
        
        format : %d
        
        max : 100
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 81
        
        name : bip
        
        help : The banded Smith-Waterman alignment uses this percentage number to compute the bandwidth it has to use when computing the alignment matrix. E.g. with an expected overlap of 150 bases and bip=10, the banded SW will compute a band of 15 bases to each side of the expected alignment diagonal, thus allowing up to 15 unbalanced inserts/deletes in the alignment. Increasing this number will find more non-optimal alignments but will also increase SW runtime (somewhere between linearly and quadratically); decreasing it works the other way round (it might miss a few bad alignments but gains speed).
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : bip
        
        default : 15
        
        mandatory : false
        
        type : long
        
        prompt : Bandwidth in percent
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 82
        
        name : bmin
        
        help : Minimum bandwidth in bases to each side.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : bmin
        
        default : 25
        
        mandatory : false
        
        type : long
        
        prompt : Bandwidth minimum
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 83
        
        name : bmax
        
        help : Maximum bandwidth in bases to each side.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : bmax
        
        default : 50
        
        mandatory : false
        
        type : long
        
        prompt : Bandwidth maximum
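The three bandwidth settings can be combined in a small calculation. This sketch assumes that the percentage from -bip is applied to the expected overlap with integer rounding and that the result is then clamped to the range given by -bmin and -bmax; the exact rounding and clamping are assumptions, and the function name is hypothetical.

```python
def band_width(expected_overlap: int, bip: int = 15,
               bmin: int = 25, bmax: int = 50) -> int:
    """Bandwidth in bases to each side of the expected alignment diagonal."""
    band = expected_overlap * bip // 100   # percentage of the overlap
    return max(bmin, min(bmax, band))      # clamp to [bmin, bmax]


# The example from the -bip help text (150 bases, bip=10) gives 15 bases
# when the clamp is effectively switched off ...
print(band_width(150, bip=10, bmin=1, bmax=1000))
# ... and is raised to the default -bmin otherwise.
print(band_width(150, bip=10))
```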
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 84
        
        name : mo
        
        help : Minimum number of overlapping bases needed in an alignment of two sequences to be accepted.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : mo
        
        default : 15
        
        mandatory : false
        
        type : long
        
        prompt : Minimum overlap
      - range :
        
        format : %d
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 85
        
        name : ms
        
        help : Describes the minimum score of an overlap to be taken into account for assembly. mira uses a default scoring scheme for SW align. Each match counts 1, a match with an N counts 0, each mismatch with a non-N base -1 and each gap -2. Use a bigger score to weed out a number of chance matches, a lower score to perhaps find the single (short) alignment that might join two contigs together (at the expense of computing time and memory).
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : ms
        
        default : 15
        
        mandatory : false
        
        type : long
        
        prompt : Minimum score
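The default scoring scheme from the -ms help text can be applied to a pair of already aligned rows. The sketch below is illustrative only: it assumes gaps are written as "-", that an N on either side counts 0, and the function name is hypothetical.

```python
def sw_score(row_a: str, row_b: str) -> int:
    """Score two gapped alignment rows of equal length.

    Match +1, any pairing with N 0, mismatch -1, gap -2.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    score = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" or b == "-":
            score -= 2
        elif a == "N" or b == "N":
            score += 0
        elif a == b:
            score += 1
        else:
            score -= 1
    return score


score = sw_score("ACGNA-", "ACTTAC")
print(score, score >= 15)   # compare against the default -ms of 15
```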
      - range :
        
        format : %d
        
        max : 100
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 86
        
        name : mrs
        
        help : Describes the min percentage of matching between two reads to be considered for assembly. Increasing this number will save memory but one might lose possible alignments. A maximum of 80 is probably sensible here. Decreasing below 55 will probably make memory and time consumption explode.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : mrs
        
        default : 65
        
        mandatory : false
        
        type : long
        
        prompt : Minimum relative score
      - base :
        
        ordering : 87
        
        name : egp
        
        help : Defines whether or not to increase penalties applied to alignments containing long gaps. Setting this to ‘Y’ might help in projects with frequent repeats. On the other hand, it is definitely disruptive when assembling very long reads containing multiple long indels in the called base sequence … although this should not happen in the first place and is a sure sign of problems lying ahead. When in doubt, set it to ‘Y’ for EST projects and de-novo genome assembly, set it to ‘N’ for assembly of closely related strains (assembly against a backbone). When set to ‘N’, it is recommended to have -amgb and -amgbemc both set to ‘Y’.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : egp
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Extra gap penalty
      - standard :
        
        list :
        
        name : Extra gap penalty level
        
        list_item :
        
        shown_as : Low
        
        level : 0
        
        value : low
        
        shown_as : Medium
        
        level : 0
        
        value : medium
        
        shown_as : High
        
        level : 0
        
        value : high
        
        shown_as : EST split splices
        
        level : 0
        
        value : est
        
        type : full
        
        repeatable :
      - base :
      - range :
        
        format : %d
        
        max : 100
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 89
        
        name : megpp
        
        help : Has no effect if extra_gap_penalty is off. Defines the maximum extra penalty in percent applied to ‘long’ gaps.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : megpp
        
        default : 100
        
        mandatory : false
        
        type : long
        
        prompt : Maximum extra gap penalty percent
      - standard :
        
        repeatable :
      - base :
      - standard :
        
        list :
        
        name : Contig analysis
        
        list_item :
        
        shown_as : None
        
        level : 0
        
        value : none
        
        shown_as : Text
        
        level : 0
        
        value : text
        
        shown_as : Signal
        
        level : 0
        
        value : signal
        
        type : full
        
        repeatable :
      - base :
      - range :
        
        format : %d
        
        max : 100
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 92
        
        name : rodirs
        
        help : When adding reads to a contig, reject the reads if the drop in the quality of the consensus is > the given value in %. Lower values mean stricter checking. This value is doubled should a read be entered that has a template partner (a read pair) at the right distance.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : rodirs
        
        default : 15
        
        mandatory : false
        
        type : long
        
        prompt : Reject on drop in relscore
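The -rodirs rule, including the doubling for reads with a template partner, can be sketched as follows. The function name and the way the quality drop is passed in are hypothetical; how MIRA measures the drop is not described here.

```python
def reject_read(drop_percent: float, rodirs: int = 15,
                has_template_partner: bool = False) -> bool:
    """Return True if the read would be rejected for lowering consensus quality."""
    # The tolerated drop is doubled for a read whose partner sits at the right distance.
    limit = rodirs * 2 if has_template_partner else rodirs
    return drop_percent > limit


print(reject_read(20))                             # above 15 %: rejected
print(reject_read(20, has_template_partner=True))  # limit is 30 %: accepted
```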
      - range :
        
        format : %d
        
        max : 100
        
        min : 1
        
        repeatable :
      - base :
        
        ordering : 93
        
        name : dmer
        
        help : When adding reads to a contig, reject the reads if the error in zones known as dangerous exceeds the given value in %. Lower values mean stricter checking in these danger zones. For the time being, only regions tagged as ALUS or REPT in the experiment file are considered dangerous.
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemax
        
        value : 100
        
        type : style
        
        name : scalemin
        
        value : 1
        
        type : style
        
        qualifier : dmer
        
        default : 1
        
        mandatory : false
        
        type : long
        
        prompt : Danger max error rate
      - base :
        
        ordering : 94
        
        name : mr
        
        help : One of the most important switches in MIRA. If set to ‘Y’, MIRA will try to resolve misassemblies due to repeats by identifying single base stretch differences and tag those critical bases as RMB (Repeat Marker Base, weak or strong). This switch is also needed when MIRA is run in EST mode to identify possible inter-, intra- and intra-and-interorganism SNPs.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : mr
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Mark repeats
      - base :
        
        ordering : 95
        
        name : asir
        
        help : Only takes effect when -mr is set to ‘Y’; the effect also depends on whether strain data (see -lsd) is present or not. Usually, mira will mark bases that differentiate between repeats when a conflict occurs between reads that belong to one strain. If the conflict occurs between reads belonging to different strains, they are marked as SNP. However, if this switch is set to ‘Y’, then conflicts within a strain are also marked as SNP. This switch is mainly used in assemblies of ESTs; it should not be set for genomic assembly.
        
        option :
        
        name : EDAM:
        
        value : Generic boolean
        
        type : normal
        
        qualifier : asir
        
        default : false
        
        mandatory : false
        
        type : boolean
        
        prompt : Assume SNP instead of repeat
      - range :
        
        format : %d
        
        min : 2
        
        repeatable :
      - base :
        
        ordering : 96
        
        name : mrpg
        
        help : Only takes effect when -mr is set to ‘Y’. This defines the minimum number of reads in a group that are needed for the RMB (Repeat Marker Base) or SNP detection routines to be triggered. A group is defined by the reads carrying the same nucleotide at a given position, i.e., an assembly with mrpg=2 will need at least two groups of two reads each, the reads within a group carrying the same nucleotide (with at least the quality defined in -mgqrt), for the position to be recognised as a repeat marker or a SNP. Setting this to a low number increases sensitivity, but might produce a few false positives, resulting in reads being thrown out of contigs because of falsely identified possible repeat markers (or wrongly recognised as SNP).
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 2
        
        type : style
        
        qualifier : mrpg
        
        default : 2
        
        mandatory : false
        
        type : long
        
        prompt : Minimum reads per group
      - range :
        
        format : %d
        
        min : 25
        
        repeatable :
      - base :
        
        ordering : 97
        
        name : mgqrt
        
        help : Only takes effect when -mr is set to ‘Y’. This defines the minimum quality of a group of bases to be taken into account as potential repeat marker. The lower the number, the more sensitive you get, but lowering below 25 is not recommended as a lot of wrongly called bases can have a quality approaching this value and you’d end up with a lot of false positives. The higher the overall coverage of your project the better, and the higher you can set this number. A value of 35 will probably remove all false po
        
        option :
        
        name : EDAM:
        
        value : Generic integer
        
        type : normal
        
        name : scalemin
        
        value : 25
        
        type : style
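
The interplay of the mrpg and mgqrt settings described above can be illustrated with a small Python sketch. This is not MIRA's actual implementation: the function names, the representation of an alignment column as (base, quality) pairs, and the simplification that each read must individually reach the mgqrt quality are assumptions made for illustration only. The lower bounds of 2 and 25 are taken from the range entries in the listing.

```python
from collections import Counter

# Lower bounds taken from the "range : min" entries of the listing.
MRPG_MIN = 2
MGQRT_MIN = 25


def qualifying_groups(column, mrpg, mgqrt):
    """Return the bases whose group holds at least `mrpg` reads.

    `column` is a list of (base, quality) pairs for one alignment
    position. Assumption: a read only counts towards its group if its
    own quality is >= mgqrt (a simplification of "group quality").
    """
    counts = Counter(base for base, qual in column if qual >= mgqrt)
    return sorted(base for base, n in counts.items() if n >= mrpg)


def is_candidate_marker(column, mrpg, mgqrt):
    """True if at least two distinct base groups qualify at this position."""
    if mrpg < MRPG_MIN:
        raise ValueError(f"mrpg must be >= {MRPG_MIN}")
    if mgqrt < MGQRT_MIN:
        raise ValueError(f"mgqrt must be >= {MGQRT_MIN}")
    return len(qualifying_groups(column, mrpg, mgqrt)) >= 2


if __name__ == "__main__":
    # Hypothetical column: two good 'A' reads, two good 'G' reads, one poor 'G'.
    column = [("A", 40), ("A", 35), ("G", 38), ("G", 31), ("G", 12)]
    print(is_candidate_marker(column, mrpg=2, mgqrt=30))
    print(is_candidate_marker(column, mrpg=3, mgqrt=30))
```

With mrpg=2 the hypothetical column has two qualifying groups and is flagged; raising mrpg to 3 leaves no qualifying group, which mirrors the sensitivity trade-off described in the help text.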

License(s): None
Cost: No info yet
Usage Conditions: No info yet
Contact Info: None
Publications: None
Citations: None