Nucleotide frequency

Calculate nucleotide composition (A, C, G, T/U) from FASTA or plain sequences. Returns counts and percentages, including ambiguous bases.

Limits: up to 20 MB (uncompressed) • up to 800,000 records

About this tool

Nucleotide Frequency calculates base composition from FASTA or plain sequence input, reporting counts and percentages for A, C, G, T/U and ambiguous characters. This is useful for composition analysis, bias detection, and quick QC checks. For GC% only, use the GC content calculator. For full dataset statistics (N50, length distribution, exports), use Sequence stats. For nucleotide alphabets and ambiguity codes, see the Sequence alphabets reference.

For larger datasets, multi-file runs, or more involved workflows, this calculation can be run separately as a custom analysis.

Tool guarantees

✓ No hidden transformations
✓ Input processed only for this request
✓ FASTA structure preserved in output

Results

Nucleotide frequency: counts and percentages of each base (A, C, G, T/U) in the input.
Ambiguous bases: characters such as N are reported separately from standard bases.
Percentages: computed relative to total valid nucleotides unless otherwise specified.
DNA vs RNA: T and U are handled depending on the specified alphabet.
See Sequence alphabets for valid symbols and ambiguity codes.
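
As a rough illustration of the definitions above, the following Python sketch counts bases in FASTA or plain sequence text and reports ambiguous symbols separately. It is not the tool's implementation. The function name `nucleotide_frequency` is hypothetical, and the sketch assumes that "valid nucleotides" means the standard bases A, C, G, T and U, so ambiguous symbols are left out of the percentage denominator.

```python
from collections import Counter

STANDARD = set("ACGTU")
# IUPAC ambiguity codes for nucleotides
AMBIGUOUS = set("RYSWKMBDHVN")

def nucleotide_frequency(text):
    """Return (standard counts, ambiguous counts, percentages) for FASTA or plain input."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA header lines
        counts.update(line.upper())

    standard = {b: n for b, n in counts.items() if b in STANDARD}
    ambiguous = {b: n for b, n in counts.items() if b in AMBIGUOUS}

    # Assumption: percentages are relative to standard bases only.
    total = sum(standard.values())
    percentages = {b: 100.0 * n / total for b, n in standard.items()} if total else {}
    return standard, ambiguous, percentages

fasta = ">seq1\nACGTNACG\n>seq2\nGGCA\n"
standard, ambiguous, percentages = nucleotide_frequency(fasta)
print(standard)   # {'A': 3, 'C': 3, 'G': 4, 'T': 1}
print(ambiguous)  # {'N': 1}
print({b: round(p, 1) for b, p in percentages.items()})
```

Characters outside both sets (gaps, digits, stray punctuation) are silently ignored in this sketch. A different denominator, such as all symbols including N, would change the percentages but not the counts.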

Inspect base composition for quality control and bias detection.
Identify overrepresented or underrepresented nucleotides in sequences.
Compare composition across samples or processing steps.
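
For the last use case, comparing composition across samples or processing steps, one simple approach is to subtract per-base percentages. The sketch below assumes DNA input and uses hypothetical helper names (`base_percentages`, `composition_shift`). It illustrates the idea and is not a feature of the tool.

```python
from collections import Counter

def base_percentages(seq, alphabet="ACGT"):
    """Percentage of each base in `alphabet`, relative to the bases in that alphabet."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in alphabet)
    return {b: (100.0 * counts[b] / total if total else 0.0) for b in alphabet}

def composition_shift(before, after, alphabet="ACGT"):
    """Change in percentage points for each base between two sequences."""
    p_before = base_percentages(before, alphabet)
    p_after = base_percentages(after, alphabet)
    return {b: p_after[b] - p_before[b] for b in alphabet}

raw = "AAAACCGGTT"    # A 40%, C 20%, G 20%, T 20%
trimmed = "AACCGGTT"  # 25% each
shift = composition_shift(raw, trimmed)
print(shift)  # {'A': -15.0, 'C': 5.0, 'G': 5.0, 'T': 5.0}
```

A large shift in one base after a processing step, such as the drop in A here, can point to something like a removed poly-A stretch.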

Nucleotide composition provides a more detailed view than GC content alone, showing the balance between all bases and revealing potential biases.

It is commonly used in exploratory analysis and quality control, especially when evaluating sequencing output or preparing datasets for downstream workflows.
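
To make the comparison with GC content concrete, the short Python sketch below builds two toy DNA sequences with identical GC% but very different base balance. The sequences and the function names `gc_percent` and `base_counts` are invented for illustration.

```python
from collections import Counter

def gc_percent(seq):
    """GC content as a percentage of A, C, G and T."""
    c = Counter(seq.upper())
    total = c["A"] + c["C"] + c["G"] + c["T"]
    return 100.0 * (c["G"] + c["C"]) / total if total else 0.0

def base_counts(seq):
    """Counts of the four standard DNA bases."""
    c = Counter(seq.upper())
    return {b: c[b] for b in "ACGT"}

balanced = "AATTGGCC"
skewed = "AAAAGGGG"

print(gc_percent(balanced), gc_percent(skewed))  # 50.0 50.0
print(base_counts(balanced))  # {'A': 2, 'C': 2, 'G': 2, 'T': 2}
print(base_counts(skewed))    # {'A': 4, 'C': 0, 'G': 4, 'T': 0}
```

Both sequences report 50% GC, yet the second contains no C or T at all. Only the per-base counts reveal that imbalance.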
