Nucleotide frequency
Calculate nucleotide composition (A, C, G, T/U) from FASTA or plain sequences. Returns counts and percentages for each base, including ambiguous bases.
About this tool
Nucleotide Frequency calculates base composition from FASTA or plain sequence input, reporting counts and percentages for A, C, G, T/U and ambiguous characters. It is useful for composition analysis, bias detection, and quick QC checks. For GC% only, use the GC content calculator. For full dataset statistics (N50, length distribution, exports), use Sequence stats. For nucleotide alphabets and ambiguity codes, see the Sequence alphabets reference.
- ✓ No hidden transformations
- ✓ Input processed only for this request
- ✓ FASTA structure preserved in output
Details
- Nucleotide frequency: counts and percentages of each base (A, C, G, T/U) in the input.
- Ambiguous bases: characters such as N are reported separately from standard bases.
- Percentages: computed relative to total valid nucleotides unless otherwise specified.
- DNA vs RNA: T (DNA) and U (RNA) are handled according to the specified alphabet.
- See Sequence alphabets for valid symbols and ambiguity codes.
- Inspect base composition for quality control and bias detection.
- Identify overrepresented or underrepresented nucleotides in sequences.
- Compare composition across samples or processing steps.
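The counting rules described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation. The function names `read_sequences` and `nucleotide_frequency` are hypothetical, and several behaviours are assumptions made for illustration: input is upper-cased before counting, "valid nucleotides" means the standard bases of the chosen alphabet plus the IUPAC ambiguity codes, and any other character (including T under the RNA alphabet or U under the DNA alphabet) is left out of the totals.

```python
from collections import Counter

# Standard bases per alphabet, plus IUPAC ambiguity codes (assumed definition of "valid").
STANDARD = {"dna": "ACGT", "rna": "ACGU"}
AMBIGUOUS = "RYSWKMBDHVN"


def read_sequences(text):
    """Split FASTA or plain input into (header, sequence) pairs.

    Plain input without a '>' line yields a single record with header None.
    """
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None or chunks:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None or chunks:
        records.append((header, "".join(chunks)))
    return records


def nucleotide_frequency(sequence, alphabet="dna"):
    """Count standard and ambiguous bases; percentages are relative to valid symbols."""
    counts = Counter(sequence.upper())  # assumption: counting is case-insensitive
    bases = {b: counts[b] for b in STANDARD[alphabet]}
    ambiguous = {c: counts[c] for c in AMBIGUOUS if counts[c]}
    total = sum(bases.values()) + sum(ambiguous.values())
    percent = {
        sym: (100.0 * n / total if total else 0.0)
        for sym, n in {**bases, **ambiguous}.items()
    }
    return {"counts": bases, "ambiguous": ambiguous, "total": total, "percent": percent}
```

Under these assumptions, `nucleotide_frequency("ACGTNNACGT")` counts two of each standard base and two N, so each symbol accounts for 20% of the 10 valid symbols. Keeping ambiguous codes in a separate dictionary mirrors the way they are reported separately from standard bases.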
Nucleotide composition provides a more detailed view than GC content alone, showing the balance between all bases and revealing potential biases.
It is commonly used in exploratory analysis and quality control, especially when evaluating sequencing output or preparing datasets for downstream workflows.