Compute GC content (GC%) from FASTA or plain sequence input. Supports DNA and RNA alphabets with explicit handling of ambiguous bases.

Input
Accepts plain-text FASTA or raw sequence text. Input is processed synchronously.
Counted alphabet: DNA (A/C/G/T) or RNA (A/C/G/U)
Limits: up to 20 MB (uncompressed) • up to 800,000 records

About this tool

GC Content Calculator computes GC% from FASTA or plain sequence input using either the DNA (A/C/G/T) or the RNA (A/C/G/U) alphabet. Ambiguous or non-standard characters are excluded from the GC% denominator and reported separately. This makes the tool useful for quick sequence composition checks and quality control. If you need full dataset metrics (N50, length distribution, counts, exports), use Sequence stats. For nucleotide alphabets and ambiguity codes, see the Sequence alphabets reference.

Tool guarantees
  • No hidden transformations
  • Input processed only for this request
  • FASTA structure preserved in output

Details

  • GC%: percentage of G and C bases among valid nucleotides (A, C, G, T/U).
  • Ambiguous bases: ambiguity codes such as N, and any other non-standard characters, are excluded from the GC% denominator and reported separately.
  • See Sequence alphabets for valid nucleotide symbols.
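
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `gc_content`, the case-insensitive matching, the skipping of whitespace, and the `None` result for an empty denominator are all assumptions made for the example.

```python
def gc_content(sequence, alphabet="DNA"):
    """Return (gc_percent, valid_count, excluded_count) for one sequence.

    Hypothetical helper: only A/C/G/T (DNA) or A/C/G/U (RNA) count toward
    the denominator; everything else is tallied as excluded.
    """
    valid = set("ACGT") if alphabet.upper() == "DNA" else set("ACGU")
    gc = valid_count = excluded = 0
    for ch in sequence.upper():      # assumption: case-insensitive matching
        if ch.isspace():             # assumption: whitespace is ignored
            continue
        if ch in valid:
            valid_count += 1
            if ch in "GC":
                gc += 1
        else:
            excluded += 1            # ambiguous or non-standard character
    # Assumption: GC% is undefined (None) when no valid bases are present.
    gc_percent = 100.0 * gc / valid_count if valid_count else None
    return gc_percent, valid_count, excluded


print(gc_content("ACGTNN"))  # (50.0, 4, 2)
```

In the example call, the two N characters are left out of the denominator, so GC% is 2 of 4 valid bases (50%) rather than 2 of 6.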

Common uses

  • Quickly check sequence composition before downstream analysis.
  • Detect contamination or unexpected organism origin via abnormal GC%.
  • Compare GC% across samples after trimming or filtering steps.
  • Validate synthetic sequences or primers for expected composition.
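
As a rough sketch of the contamination check mentioned above, the following Python computes GC% per FASTA record and flags records that sit far from the median. Everything here is illustrative: the helper names, the 10-percentage-point `max_delta` threshold, and the choice to count both T and U as valid bases are assumptions for this example, not behavior of the tool.

```python
import statistics


def parse_fasta(text):
    """Yield (header, sequence) pairs; headerless input yields one record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None or chunks:
                yield header or "", "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None or chunks:
        yield header or "", "".join(chunks)


def gc_percent(seq):
    """GC% over A/C/G/T/U only; returns None if no valid bases are present."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    valid = gc + seq.count("A") + seq.count("T") + seq.count("U")
    return 100.0 * gc / valid if valid else None


def flag_gc_outliers(fasta_text, max_delta=10.0):
    """Return (header, gc%) for records more than max_delta points from the median."""
    per_record = [(h, gc_percent(s)) for h, s in parse_fasta(fasta_text)]
    values = [v for _, v in per_record if v is not None]
    if not values:
        return []
    median = statistics.median(values)
    return [(h, v) for h, v in per_record
            if v is not None and abs(v - median) > max_delta]


example = ">a\nATATGC\n>b\nATGCAT\n>c\nGGGGCCCC\nGC\n"
print(flag_gc_outliers(example))  # [('c', 100.0)]
```

A flagged record is only a prompt for closer inspection; a suitable threshold depends on the organism and on how much GC% varies across the dataset.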

GC content is a simple but widely used measure of sequence composition. Different organisms and genomic regions often exhibit characteristic GC levels, making it useful for quick sanity checks.

While GC% alone is not sufficient for biological interpretation, it can highlight obvious issues such as contamination, incorrect references, or unexpected sequence subsets before running heavier analyses.