Genome Atlas SOAP

Provider:
Center for Biological Sequence Analysis (CBS)

Location:
European Union

Submitter / Source:
pfhallin (about 1 year ago)

Base URL:
http://ws.cbs.dtu.dk/cgi-bin/soap/ws/quasi.cgi

WSDL Location:
http://www.cbs.dtu.dk/ws/GenomeAtlas/GenomeAtlas_3_0_ws1.wsdl

Documentation URL(s):
http://www.cbs.dtu.dk/ws/GenomeAtlas (pfhallin, about 1 year ago)

Description(s) (from provider’s description doc, about 1 year ago):
This Web Service accesses the database records and various tools of the
GenomeAtlas database v3. The records maintained by this database are synchronized regularly
with the Entrez Genome Project (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi?view=1)

#
# DATABASE LOOK-UP FUNCTIONS
#
1. getSeq
Get one or more genomic sequences from the Genome Atlas database (updated regularly against
Entrez Microbial Genomes) by providing the GenBank accession number.
Input:
* ‘genbank’ : A GenBank accession number
Output:
* ‘sequencedata’
  * ‘sequence’ [array]
    * ‘id’ : Sequence identifier
    * ‘comment’ : Comment of the sequence
    * ‘seq’ : The DNA sequence of the genome

2. getProt
Get the protein sequences encoded by the annotated coding regions of a GenBank record
Input:
* ‘genbank’ : A GenBank accession number
Output:
* ‘sequencedata’
  * ‘sequence’ [array]
    * ‘id’ : Sequence identifier
    * ‘comment’ : Comment of the sequence
    * ‘seq’ : The translations of the predicted protein coding genes

3. getOrfs
Get the nucleotide sequences of the annotated coding regions of a GenBank record
Input:
* ‘accession’ : One or more GenBank accession numbers
Output:
* ‘contig’
  * ‘id’ : Accession number as provided in input
  * ‘sequencedata’ : An array of sequencedata objects
    * ‘id’ : The identifier of the sequence (output from GenBank record converter)
    * ‘seq’ : Protein coding DNA sequence

4. queryGenomes
Query records of the GenomeAtlas database
Input:
* ‘search’ : Records can be searched by various optional fields (AND separated). All fields
except ‘pid’ are surrounded by wildcards.
  * ‘kingdom’ : bacteria / archaea
  * ‘phyla’ : Phyla
  * ‘pid’ : Project id
  * ‘organism’ : Organism name
  * ‘genbank’ : GenBank accession number
  * ‘refseq’ : RefSeq accession number
  * ‘segment’ : Segment / replicon name (e.g. ‘GENOME[PID]’, ‘Chromosome’, ‘pVir’ …)
  * ‘hideMerged’ : yes / no: Hide merged segments (GENOME[PID])

Output: An array of entries containing:
* ‘descriptions’ : A genome atlas database record
  * ‘entry’
    * ‘field’ : The name of the field (e.g. ATCONTENT, NGENES)
    * ‘description’ : A descriptive text for the field
* ‘entry’ : A genome atlas database record
  * ‘kingdom’ : bacteria / archaea
  * ‘phyla’ : Phyla
  * ‘pid’ : Project id
  * ‘organism’ : Organism name
  * ‘genbank’ : GenBank accession number
  * ‘refseq’ : RefSeq accession number
  * ‘segment’ : Segment / replicon name (e.g. ‘GENOME[PID]’, ‘Chromosome’, ‘pVir’ …)
  * ‘properties’ : The calculated genomic properties of the segment
    * ‘ATCONTENT’
    * ‘NGENES’
    * ‘LENGTH’
    * ‘BPPRGENE’
    * ‘CODING_FRACTION’
    * ‘GEOMETRY’
    * ‘RNAMMER_TSU_COUNT’
    * ‘RNAMMER_SSU_COUNT’
    * ‘RNAMMER_LSU_COUNT’
    * ‘GLO_DIR_REPEAT’
    * ‘GLO_INV_REPEAT’
    * ‘SR_PERCENT’
    * ‘ANN_TRNA_COUNT’
    * ‘TRNA_SCAN_COUNT’
    * ‘TRUE_PROTEINS’
    * ‘TRUE_PROT_RATIO’
    * ‘60_ORIGIN’
    * ‘60_TERMINUS’
    * ‘ADNACC’
    * ‘CURVATURE_AVG’
    * ‘ELHASSAN_AVG’
    * ‘OLSON_AVG’
    * ‘ORNSTEIN_AVG’
    * ‘RRRECIEVER_COUNT’
    * ‘HISKA_1_COUNT’
    * ‘HISKA_2_COUNT’
    * ‘HISKA_3_COUNT’
    * ‘HISKA_COUNT’
    * ‘HWE_HK_COUNT’
    * ‘LOC_DIR_REPEAT’
    * ‘LOC_EVR_REPEAT’
    * ‘LOC_INV_REPEAT’
    * ‘LOC_MIR_REPEAT’

5. getFeatures
Get details for all annotated features of a single genbank record
Input:
* ‘accession’ : Genbank accession number
* ‘features’ : Comma separated list of features to be returned
(e.g. all or cds,rrna,trna)
* ‘keys’ : Comma separated list of keys to be returned
(e.g. all or locus_tag,gene,translation)

Output: ‘features’: An array of ‘feature’ elements, containing:
* ‘type’ : feature type, e.g. CDS, rRNA, tRNA
* ‘begin’ : lower boundary of annotation
* ‘end’ : upper boundary of annotation
* ‘dir’ : Annotation direction, + or -
* ‘label’ : Acquired from ‘gene’ annotation
* ‘featurekey’ : An array of additional annotation keys provided in the GenBank record
  * ‘Key’ : the annotation key, e.g. ‘product’
  * ‘Value’ : the annotation value, e.g. ‘16S ribosomal RNA’

Please be aware that begin and end refer to the outer boundaries of the annotation:
if the annotation contains multiple concatenations/junctions, begin and end only give
the smallest and largest of those coordinates. A detailed map of the junctions is found
in the ‘featurekey’ element having attribute key=coordinates.
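
The relation between a joined annotation and its reported boundaries can be sketched as follows. The list-of-tuples input is a hypothetical representation chosen for illustration; the actual format of the ‘coordinates’ value is not described here.

```python
from typing import List, Tuple


def annotation_bounds(segments: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (begin, end): the smallest and largest coordinate over all segments."""
    if not segments:
        raise ValueError("at least one segment is required")
    positions = [pos for segment in segments for pos in segment]
    return min(positions), max(positions)


# Hypothetical joined annotation made of three segments.
print(annotation_bounds([(120, 180), (250, 400), (520, 610)]))
```

The gaps between the segments are invisible in begin and end, which is why the junction map has to be read from the ‘featurekey’ element.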

#
# TOOLS
#

6. DNApropertyRun
Calculates structural and physical properties of the DNA molecule. These properties
are used in the DNA Atlas representation on the Genome Atlas web pages. Properties include
Intrinsic Curvature, Stacking energy, position preference, various repeats etc. (please see
below for documentation). Use operation ‘pollQueue’ to poll the status of the job.

Input:
* ‘method’ : Calculation method, specifying which result is to be generated,
e.g. ‘Intrinsic Curvature’ (see documentation below)
* ‘sequence’
  * ‘id’ : Sequence identifier
  * ‘seq’ : DNA sequence

The following DNA properties can be calculated:

Intrinsic Curvature
DNA curvature is calculated using the CURVATURE programme (Bolshoy et al. 1991, Shpigelman
et al. 1993). The term curved DNA here refers to DNA that is intrinsically curved
in solution and can be readily characterised by anomalous migration in acrylamide
gels. There are different models for curved DNA (Sinden et al. 1998), although the
predictions for the curvature of fragments larger than a few hundred bp are essentially the
same (Haran et al. 1994). The scale is in arbitrary “Curvature units”, which range
from 0 (i.e. no curvature) to 1.0, which is the curvature of DNA when wrapped around
the nucleosome. The scale used for this atlas ranges 3 standard deviations around
the mean.

* R.R. Sinden and C.E. Pearson and V.N. Potaman and D.W. Ussery DNA: Structure and
Function (1998) 5A:1-141

* E.S. Shpigelman and E.N. Trifonov and A. Bolshoy CURVATURE: Software for the Analysis
of Curved DNA. (1993) 9:435-444

* T.E. Haran and J.D. Kahn and D.M. Crothers Sequence elements responsible for
DNA curvature (1994) 225:729-738

* A. Bolshoy and P. McNamara and R.E. Harrington and E.N. Trifonov Curved DNA Without
A-A – Experimental Estimation of All 16 DNA Wedge Angles (1991) 88:2312-2316

Position Preference
Position preference is a trinucleotide model based on the preferential location
of sequences within nucleosomal core sequences (Satchwell et al. 1986). We use the
magnitude (i.e. absolute values) of the trinucleotide numbers as a measure of DNA
flexibility (Baldi et al. 1996). The trinucleotide values range from essentially
zero (0.003, presumably more flexible), to 0.28 (considered rigid). Since very few
of the trinucleotides have values close to zero (i.e. little preference for nucleosome
positioning), this measure is considered most sensitive towards the low (“flexibility”)
end of the scale.

* S.C. Satchwell and H.R. Drew and A.A. Travers Sequence periodicities in chicken
nucleosome core DNA (1986) 191:659-675

* P. Baldi and S. Brunak and Y. Chauvin and A. Krogh Naturally occurring nucleosome
positioning signals in human exons and introns. (1996) 263:503-510

Stacking Energy
Base-stacking energies are from the dinucleotide values provided by Ornstein et
al. (1978). The scale is in kcal/mol, and the dinucleotide values range from -3.82
kcal/mol (will melt easily) down to -14.59 kcal/mol (which would
require more energy to destack or melt the helix). All 10 values are listed in the
table below. A positive peak in base-stacking (i.e., numbers closer to 0) reflects regions
of the helix which would de-stack or melt more readily. Conversely, minima (larger
negative numbers) in this plot would represent more stable regions of the chromosome.

Dinucleotide melting energies in kcal/mol:

(GC).(GC) -14.59
(AC).(GT) -10.51
(TC).(GA) -9.81
(CG).(CG) -9.61
(GG).(CC) -8.26
(AT).(AT) -6.57
(TG).(CA) -6.57
(AG).(CT) -6.78
(AA).(TT) -5.37
(TA).(TA) -3.82

* R.L. Ornstein and R. Rein and D.L. Breen and R.D. MacElroy An optimized potential
function for the calculation of nucleic acid interaction energies. I. Base stacking
(1978) 17:2341-2360
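
As a rough local illustration of how the table above maps onto a sequence, the sketch below looks up each dinucleotide step and averages the result. It is not the service's implementation: the function names are invented for this example, the input is assumed to contain only A, C, G and T, and no smoothing window is applied.

```python
# Values copied from the dinucleotide table above (kcal/mol).
# Each table row covers a step and its reverse complement.
STACKING = {
    "GC": -14.59,
    "AC": -10.51, "GT": -10.51,
    "TC": -9.81, "GA": -9.81,
    "CG": -9.61,
    "GG": -8.26, "CC": -8.26,
    "AT": -6.57,
    "TG": -6.57, "CA": -6.57,
    "AG": -6.78, "CT": -6.78,
    "AA": -5.37, "TT": -5.37,
    "TA": -3.82,
}


def stacking_profile(seq):
    """Return one stacking energy per dinucleotide step along the sequence."""
    seq = seq.upper()
    return [STACKING[seq[i:i + 2]] for i in range(len(seq) - 1)]


def mean_stacking(seq):
    """Average stacking energy over all steps of the sequence."""
    profile = stacking_profile(seq)
    if not profile:
        raise ValueError("sequence must contain at least two bases")
    return sum(profile) / len(profile)


print(stacking_profile("GCTA"))
```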

Protein Deformability
“Protein Induced Deformability” dinucleotide values are from protein induced deformation
of DNA helices as determined by examination of more than a hundred crystal
structures of DNA/protein complexes (Olson et al. 1998). The dinucleotide values
range from 2.1 (the least deformable dinucleotide), to 12.1 (i.e., the dinucleotide
step (CpG), which is often deformed by proteins). Thus, on this scale, a larger value
reflects a more deformable sequence whilst a smaller value indicates a region where
the DNA helix is less likely to be changed dramatically by proteins. The average
protein deformability value in the entire E. coli K-12 genome is 5.12.

* Goffeau et al. The Yeast Genome Directory (1997) 387 (supplement):5-105

* W.K. Olson and A.A. Gorin and X.J. Lu and L.M. Hock and V.B. Zhurkin DNA sequence-dependent
deformability deduced from protein-DNA crystal complexes. (1998) 95:11163-11168

Propeller twist
We use propeller twist as a measure of helix rigidity, since the propeller twist
angles have been shown to be inversely related to rigidity of the DNA helix in crystals
(el Hassan et al. 1996). Thus, a region with high propeller twist would
mean the helix is quite rigid in this area, and similarly regions that are quite
flexible would have a low propeller twist. Propeller twist values were obtained from
crystallographic data (el Hassan et al. 1996), with the exception of the TA
step, which was taken from a theoretical estimate (Gorin et al. 1995). Plots using
other sets of propeller twist dinucleotide values were very similar (data not shown).
The average propeller twist value in the entire E. coli K-12 genome is -12.63 degrees.

* Goffeau et al. The Yeast Genome Directory (1997) 387 (supplement):5-105

* M.A. el Hassan and C.R. Calladine Propeller-twisting of base-pairs and the conformational
mobility of dinucleotide steps in DNA. (1996) 259:95-103

* A.A. Gorin and V.B. Zhurkin and W.K. Olson B-DNA twisting correlates with base-pair
morphology. (1995) 247:34-48

DNase I Sensitivity
DNase I values are based on experimentally determined trinucleotide values (Brukner
et al. 1995, Brukner et al. 1995). These values are reflective of the anisotropic
flexibility or “bendability” of a particular DNA sequence. The trinucleotide values
range from -0.280 (rigid) to +0.194 (very “bendable” towards the major groove). Smoothing
over large regions (which is necessary for viewing entire genomes) tends to smooth
out differences in bendability. The average DNase I (“bendability”) value in the

* I. Brukner and R. Sanchez and D. Suck and S. Pongor Sequence-dependent bending
propensity of DNA as revealed by DNase I: parameters for trinucleotides. (1995) 14:1812-1818

* I. Brukner and R. Sanchez and D. Suck and S. Pongor Trinucleotide models for DNA
bending propensity: comparison of models based on DNaseI digestion and nucleosome
packaging data. (1995) 13:309-317

Palindromic hexamers
For a given sequence, any palindrome of 6 nt (e.g., AAATTT) is given a value of 1,
while all bases not included in palindromic hexamers are given a value of 0 (van Noort et
al. 2003).

* van Noort V, Worning P, Ussery DW, Rosche WA, Sinden RR Strand misalignments lead
to quasipalindrome correction (2003) 19:365-9
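
The 0/1 track described above can be sketched locally as follows. This is an illustration under assumptions, not the service's code: a palindrome is taken to mean a hexamer equal to its own reverse complement, and hexamers containing characters other than A, C, G and T are skipped.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def palindromic_hexamer_track(seq):
    """Return a list with 1 for bases inside any palindromic hexamer, else 0."""
    seq = seq.upper()
    track = [0] * len(seq)
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if set(hexamer) <= set("ACGT") and hexamer == reverse_complement(hexamer):
            for j in range(i, i + 6):
                track[j] = 1
    return track


print(palindromic_hexamer_track("CCAAATTTCC"))
```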

G Content
The “G Content” of a given sequence is merely the fraction of G’s in a given sequence
(Jensen et al. 1999). It can range from 0 (no G’s) to 1 (all G’s). For a sequence
that is 50% AT content, one would expect roughly 25% G’s.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777
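
A minimal sketch of the base-content fraction, assuming the fraction is taken over the full length of the sequence. The helper name is invented for this example, and the same function covers the A, T and C content properties.

```python
def base_content(seq, base):
    """Fraction of positions in seq that equal the given base (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq.count(base.upper()) / len(seq)


print(base_content("GGAT", "G"))
```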

A Content
The “A Content” of a given sequence is merely the fraction of A’s in a given sequence
(Jensen et al. 1999). It can range from 0 (no A’s) to 1 (all A’s). For a sequence
that is 50% AT content, one would expect roughly 25% A’s.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

T Content
The “T Content” of a given sequence is merely the fraction of T’s in a given sequence
(Jensen et al. 1999). It can range from 0 (no T’s) to 1 (all T’s). For a sequence
that is 50% AT content, one would expect roughly 25% T’s.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

C Content
The “C Content” of a given sequence is merely the fraction of C’s in a given sequence
(Jensen et al. 1999). It can range from 0 (no C’s) to 1 (all C’s). For a sequence
that is 50% AT content, one would expect roughly 25% C’s.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

GC Skew
For many genomes there is a strand bias, such that one strand tends to have more
G’s, whilst the other strand has more C’s. This GC-skew bias can be measured as the number
of G’s minus the number of C’s over a fixed length (e.g. 10,000 bp) of DNA (Jensen
et al. 1999). The values can range from +1 (all G’s on the examined sequence, with
all C’s on the other strand), to -1 (the reverse case: all C’s on the examined sequence,
and all G’s on the other strand). There is a correlation between GC-skew and the replication
leading and lagging strands.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777
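
The skew calculation can be sketched as below. The text describes the skew as G minus C; dividing by G plus C is an assumption made here so that the result falls in the stated -1 to +1 range. The function names and the non-overlapping default step are likewise illustrative choices, not the service's documented behaviour.

```python
def skew(window, plus="G", minus="C"):
    """Normalised skew (plus - minus) / (plus + minus) for one window."""
    window = window.upper()
    p, m = window.count(plus), window.count(minus)
    if p + m == 0:
        return 0.0  # neither base present: report no bias
    return (p - m) / (p + m)


def skew_profile(seq, window_size=10000, step=None, plus="G", minus="C"):
    """Skew for consecutive windows; windows do not overlap unless step is given."""
    step = step or window_size
    last_start = len(seq) - window_size
    return [skew(seq[i:i + window_size], plus, minus)
            for i in range(0, last_start + 1, step)]


print(skew_profile("GGGGCCCC", window_size=4))
```

Passing plus="A" and minus="T" gives the corresponding AT skew.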

Percent AT
The percent AT is a running average of the AT content over a given window size.
Typically, for a bacterial genome of about 5 Mbp, the window size is 10,000 bp. The
Percent AT can range from 0 (no AT content) to 1 (100% AT). The Percent AT is correlated
with other DNA structural features, such that AT rich regions are often more readily
melted and tend to be less flexible and more rigid, although they can also be readily
compacted by chromatin proteins (Pedersen et al. 2000).

* A.G. Pedersen and L.J. Jensen and H.H. Staerfeldt and S. Brunak and D.W. Ussery
A DNA structural atlas of E. coli (2000) 299:907-930
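
A running average of the AT content can be sketched with a sliding window, as below. This is a local illustration only: the function name is invented, the window advances one base at a time, and the sequence is treated as linear (no wrap-around for circular replicons), all of which are assumptions.

```python
def percent_at_profile(seq, window_size=10000):
    """AT fraction (0.0 to 1.0) for every full window, sliding one base at a time."""
    seq = seq.upper()
    if window_size <= 0 or window_size > len(seq):
        raise ValueError("window_size must be between 1 and the sequence length")
    is_at = [1 if base in ("A", "T") else 0 for base in seq]
    count = sum(is_at[:window_size])
    profile = [count / window_size]
    for i in range(window_size, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        count += is_at[i] - is_at[i - window_size]
        profile.append(count / window_size)
    return profile


print(percent_at_profile("AATTGGCC", window_size=4))
```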

AT Skew
For some genomes there is also an AT strand bias, such that one strand tends to have
more A’s, whilst the other strand has more T’s. This AT-skew bias is measured as the
number of A’s minus the number of T’s over a fixed length (e.g. 10,000 bp) of DNA
(Jensen et al. 1999). The values can range from +1 (all A’s on the examined sequence,
with all T’s on the other strand), to -1 (the reverse case: all T’s on the examined
sequence, and all A’s on the other strand). For some genomes, there is a correlation
between AT-skew and the replication leading and lagging strands.

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Global Direct Repeats
Global Direct repeats are found by taking the first 100 bp of sequence, and
looking for the best match within the whole segment, on the same strand, in the
same direction [5′ to 3′] (Skovgaard et al. 2002). Values are binned into 10
values, and represent the lower end of the best match, and range from 0 (10% or
less match) to 9 (at least 90 out of the 100 nucleotides match perfectly).

Global Inverted Repeats
Global Inverted repeats are found by taking the first 100 bp of sequence, and
looking for the best match within the whole segment, on the opposite strand, in
the same direction [5′ to 3′] (Skovgaard et al. 2002). Values are binned into
10 values, and represent the lower end of the best match and range from 0 (10%
or less match) to 9 (at least 90 out of the 100 nucleotides match perfectly).

* M. Skovgaard and L.J. Jensen and C. Friis and H.H. Staerfeldt and P. Worning
and S. Brunak The Atlas Visualization of Genomewide Information (2002) 33:49-63

Direct Repeats
Local Direct repeats are found by taking a 100 bp sequence window, and looking for
the best match of a 30 bp piece within that window, on the same strand, in the same
direction (Jensen et al. 1999). Values can range from 0 (no match at all) to 1 (one
or more perfect matches within the window).

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Everted Repeats
Local Everted repeats are found by taking a 100 bp sequence window, and looking for
the best match of a 30 bp piece within that window, on the opposite strand, in the
same direction (Jensen et al. 1999). Values can range from 0 (no match at all) to
1 (one or more perfect matches within the window).

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Local Inverted Repeats
Local Inverted repeats are found by taking a 100 bp sequence window, and looking
for the best match of a 30 bp piece within that window, on the opposite strand, in
the opposite direction (Jensen et al. 1999). Values can range from 0 (no match at
all) to 1 (one or more perfect matches within the window).

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Mirror Repeats
Local Mirror repeats are found by taking a 100 bp sequence window, and looking for
the best match of a 30 bp piece within that window, on the same strand, in the opposite
direction (Jensen et al. 1999). Values can range from 0 (no match at all) to 1 (one
or more perfect matches within the window).

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Quasi-palindromes
“Quasi-palindromes” are short inverted repeats, which are found by taking a 30 bp
piece of sequence, and looking for matches with at least 6 out of 7 nt matching, on
the opposite strand, in the opposite direction (van Noort et al. 2003). Values can range
from 0 (no match at all) to 1 (one or more perfect matches within the window).

* van Noort V, Worning P, Ussery DW, Rosche WA, Sinden RR Strand misalignments lead
to quasipalindrome correction (2003) 19:365-9

Perfect-palindromes
“Perfect-palindromes” are short inverted repeats, which are found by taking a 30
bp piece of sequence, and looking for perfect matches of 7 nt or longer, on the opposite
strand, in the opposite direction (van Noort et al. 2003). Values can range from 0 (no match
at all) to 1 (one or more perfect matches within the window).

* van Noort V, Worning P, Ussery DW, Rosche WA, Sinden RR Strand misalignments lead
to quasipalindrome correction (2003) 19:365-9

Simple Repeats
A “simple repeat” is a region which contains a simple oligonucleotide repeat, like
microsatellites. Simple repeats are found by looking for tandem repeats of length
R within a 2R-bp window. By using the values 12, 14, 15, 16, and 18 for R, all simple
repeats of lengths 1 through 9 are calculated, with a length of at least 24 bp (Jensen
et al. 1999). Values can range from 0 (no match at all) to 1 (one or more perfect
matches within the window).

* L. J. Jensen and C. Friis and D.W. Ussery Three views of complete chromosomes
(1999) 150:773-777

Currently undocumented properties are:
AAAA
CCCC
TTTT
GGGG
T4 or C4 vs. A4 or G4
(Y)10 vs. (R)10
(CR)5 vs. (YG)5
(CA)3
(CG)3
(TA)3
(TG)3
(YR)5

Output:
* ‘jobid’ : The 32 byte identification string of the job
* ‘datetime’ : The last timepoint at which the status of the job has changed
* ‘status’ : Possible values are QUEUED, ACTIVE, FINISHED, WAITING, REJECTED,
UNKNOWN JOBID or QUEUE DOWN
* ‘expires’ : Normal expiry time is 24 hours. Job results should be downloaded
before that.

7. DNApropertyFetchResult
Retrieves the result from a job submitted using ‘DNApropertyRun’

Input:
* ‘jobid’ : The 32 byte identification string of the job
Output:
* ‘method’ : Method, as provided in request
* ‘values’ : Calculation results given as a string separated by comma. Each
position in the list corresponds to the position in the input
sequence.
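
A small sketch of how the returned ‘values’ string might be lined up with the submitted sequence. It assumes the values are numeric, that there is exactly one value per input base, and that positions are counted from 1; the function names are invented for this example.

```python
def parse_values(values):
    """Split the comma-separated result string into floats."""
    return [float(v) for v in values.split(",") if v.strip()]


def pair_with_sequence(seq, values):
    """Return (position, base, value) tuples, with 1-based positions."""
    numbers = parse_values(values)
    if len(numbers) != len(seq):
        raise ValueError("expected one value per sequence position")
    return list(zip(range(1, len(seq) + 1), seq, numbers))


print(pair_with_sequence("ACG", "0.1,0.2,0.3"))
```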

8. trnascanRun
Submits the input parameter(s) and sequence data to tRNAscan-SE 1.23 (April 2002)
and returns a job identifier.

Input:
* ‘kingdom’ : The kingdom of the genomic sequence.
Three kingdoms are available: bac, euk, arc. This is specified
only once for the sequences in the current job.
* ‘sequence’ : A single sequence object containing:
  * ‘id’ : The identifier of the sequence
  * ‘seq’ : The sequence specified as one continuous string
Output:
* ‘jobid’ : The 32 byte identification string of the job
* ‘datetime’ : The last timepoint at which the status of the job has changed
* ‘status’ : Possible values are QUEUED, ACTIVE, FINISHED, WAITING, REJECTED,
UNKNOWN JOBID or QUEUE DOWN

9. trnascanFetchResult
Once the status is ‘FINISHED’, the results generated by the Web Service can be retrieved by
specifying the jobid.

Input:
* ‘jobid’ : The 32 byte identification string of the job
Output:
* ‘annsource’
  * ‘method’ : The name of the prediction method
  * ‘version’ : Version of the prediction method
  * ‘ann’ (ann object with the following content:)
    * ‘sequence’
      * ‘id’ : sequence identifier as uploaded by the user
      * ‘seq’ : sequence as uploaded by the user
    * ‘annrecords’
      * ‘annrecord’
        * ‘feature’ : E.g. ‘Ala,TGC’
        * ‘range’
          * ‘begin’ : begin position of the tRNA gene
          * ‘end’ : end position of the tRNA gene
        * ‘score’
          * ‘value’ : Cove score

10. pollQueue [common]
Once obtained from ‘runService’, a job identification can be used to poll the
status to see if the result is ready for download.

Input:
* ‘jobid’ : The 32 byte identification string of the job
Output:
* ‘jobid’ : The 32 byte identification string of the job
* ‘datetime’ : The last timepoint at which the status of the job has changed
* ‘status’ : Possible values are QUEUED, ACTIVE, FINISHED, WAITING, REJECTED,
UNKNOWN JOBID or QUEUE DOWN
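
The polling pattern can be sketched as below. The `poll` argument is a hypothetical callable that wraps the pollQueue SOAP call and returns the ‘status’ string; how that call is made depends on the SOAP client in use and is not shown. Treating REJECTED, UNKNOWN JOBID and QUEUE DOWN as failures, and QUEUED, ACTIVE and WAITING as "keep waiting", is an assumption based on the status names.

```python
import time

# Statuses assumed to mean the job will never finish.
FAILURE_STATUSES = {"REJECTED", "UNKNOWN JOBID", "QUEUE DOWN"}


def wait_for_job(poll, jobid, interval=5.0, max_polls=120, sleep=time.sleep):
    """Call poll(jobid) until it reports FINISHED, a failure, or max_polls is hit."""
    for _ in range(max_polls):
        status = poll(jobid)
        if status == "FINISHED":
            return status
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"job {jobid} ended with status {status}")
        sleep(interval)  # QUEUED, ACTIVE or WAITING: try again later
    raise TimeoutError(f"job {jobid} not finished after {max_polls} polls")


# Usage with a stand-in poll function instead of a real SOAP call.
fake_statuses = iter(["QUEUED", "ACTIVE", "FINISHED"])
print(wait_for_job(lambda jobid: next(fake_statuses), "example-job", sleep=lambda s: None))
```

Once the job reports FINISHED, the matching FetchResult operation can be called with the same jobid.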

11. aaUsage
Calculates the amino acid usage in a genome (proteome) and generates
a base64 encoded image (PNG) showing a diagram of this usage.
Input:
* ‘contig’ : Array of genome sequences
  * ‘id’ : Identifier of the genome
  * ‘sequencedata’ : Container for one or more sequences (typically a proteome)
    * ‘sequence’
      * ‘id’ : Id of the protein
      * ‘seq’ : Protein sequence
Output:
* ‘sequence’
  * ‘id’ : Genome identifier, provided in the input
  * ‘image’ : Image object
    * ‘comment’ : Description of the image
    * ‘encoding’ : Encoding of the binary content of the image (base64)
    * ‘MIMEtype’ : File type (image/png)
    * ‘content’ : Encoded binary content
  * ‘aaUsage’
    * ‘entry’
      * ‘name’ : Name of the amino acid, e.g. Ala, Val, Leu …
      * ‘count’ : Number of occurrences in the genome
      * ‘freq’ : Frequency of the amino acid
      * ‘group’ : Amino acid class: Polar, Aromatic, Sulfur,
        Aliphatic, Structural, +, -
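
Turning the returned ‘content’ field back into a PNG file can be sketched as follows. The function name is invented for this example; the check against the standard PNG file signature is an extra safeguard added here, not something the service is documented to require.

```python
import base64

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image(content, encoding="base64", mime_type="image/png"):
    """Decode the 'content' field of an image object into raw bytes."""
    if encoding.lower() != "base64":
        raise ValueError(f"unsupported encoding: {encoding}")
    data = base64.b64decode(content)
    if mime_type == "image/png" and not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded content does not look like a PNG file")
    return data


# Usage with a stand-in payload (signature plus dummy bytes, not a real image).
payload = base64.b64encode(PNG_SIGNATURE + b"dummy").decode("ascii")
print(len(decode_image(payload)))
```

The returned bytes can be written to disk with a file opened in binary mode.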

12. codonUsage
Calculates the codon usage in a genome (orfs) and generates
a base64 encoded image (PNG) showing a diagram of this usage.
Input:
* ‘contig’ : Array of genome sequences
  * ‘id’ : Identifier of the genome
  * ‘sequencedata’ : Container for one or more sequences (typically a proteome)
    * ‘sequence’
      * ‘id’ : Id of the protein
      * ‘seq’ : Protein sequence
Output:
* ‘sequence’
  * ‘id’ : Genome identifier, provided in the input
  * ‘image’ : Image object
    * ‘comment’ : Description of the image
    * ‘encoding’ : Encoding of the binary content of the image (base64)
    * ‘MIMEtype’ : File type (image/png)
    * ‘content’ : Encoded binary content
  * ‘aaUsage’
    * ‘entry’
      * ‘codon’ : DNA triplet (e.g. ATG …)
      * ‘freq’ : Frequency of the triplet
      * ‘count’ : Number of occurrences in each genome
      * ‘aa’ : Corresponding amino acid
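
To make the ‘codon’, ‘count’ and ‘freq’ fields concrete, the sketch below counts codons locally over a list of ORF nucleotide sequences. It is not the service's implementation: reading each ORF in frame from its first base, dropping trailing bases that do not fill a codon, and defining ‘freq’ as a fraction of all codons counted are assumptions made for this example.

```python
from collections import Counter


def codon_usage(orfs):
    """Return {codon: {'count': n, 'freq': n / total}} over all ORFs."""
    counts = Counter()
    for orf in orfs:
        orf = orf.upper()
        usable = len(orf) - len(orf) % 3  # ignore an incomplete trailing codon
        counts.update(orf[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: {"count": n, "freq": n / total} for codon, n in counts.items()}


usage = codon_usage(["ATGAAATAA", "ATGTAA"])
print(usage["ATG"])
```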

For more information, please contact Peter F. Hallin (pfh@cbs.dtu.dk),
David W. Ussery (dave@cbs.dtu.dk), or Kristoffer Rapacki (rapacki@cbs.dtu.dk).

pfhallin (about 1 year ago): This Web Service accesses the database records and various tools of the
CBS GenomeAtlas database v3. The records maintained by this database are synchronized regularly
with the Entrez Genome Project (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi?view=1)

The service allows calculation of various DNA properties (intrinsic curvature, stacking energy, position preference, repeats, and base compositions) as well as extraction of proteomes and genome sequences. Various properties such as gene count, length, replicon count, and replicon names can be obtained for each project/GenBank accession number.
