Inspect small genome assembly FASTA files with assembly-focused metrics: total length, contig or scaffold count, N50/N90, L50/L90, GC%, N content, and longest records.

Input
Accepts assembly FASTA only, as plain text or gzip-compressed. FASTQ input is intentionally rejected.
Intended for small assemblies and teaching examples.
Limits: up to 15 MB (uncompressed) • up to 100,000 records
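
The input rules above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual implementation: the function name load_assembly_fasta, the error messages, and the reading of "15 MB" as decimal megabytes are all assumptions made for the example.

```python
import gzip
from typing import List, Optional, Tuple

# Assumptions for illustration only: "15 MB" is read as decimal megabytes.
MAX_BYTES = 15 * 1000 * 1000
MAX_RECORDS = 100_000


def load_assembly_fasta(data: bytes) -> List[Tuple[str, str]]:
    """Parse FASTA bytes (plain or gzip) into (header, sequence) pairs."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        # Note: this inflates the whole payload before the size check below.
        data = gzip.decompress(data)
    if len(data) > MAX_BYTES:
        raise ValueError("input exceeds the uncompressed size limit")

    text = data.decode("ascii")
    first = text.lstrip()[:1]
    if first == "@":
        raise ValueError("FASTQ input is not accepted")
    if first != ">":
        raise ValueError("input does not start with a FASTA header")

    records: List[Tuple[str, str]] = []
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if header is not None:
                records.append((header, "".join(chunks)))
            if len(records) >= MAX_RECORDS:
                raise ValueError("input exceeds the record limit")
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    records.append((header or "", "".join(chunks)))
    return records
```

Calling load_assembly_fasta on the bytes of a small FASTA file, or on the same bytes after gzip compression, returns the same list of records; input that starts with "@" raises ValueError.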

About this tool

Assembly FASTA statistics summarizes contig or scaffold FASTA files for quick inspection and teaching. It reports descriptive assembly metrics only; it does not assess gene completeness, taxonomy, contamination, or biological quality. For general FASTA/FASTQ summaries with interactive length plots and a filterable table, use Sequence stats. For metric definitions and interpretation notes, see the Genome assembly statistics reference. For strict file validation, use the FASTA/FASTQ validator.

For larger datasets, multi-file runs, or more involved workflows, the same analysis can be run separately as a custom analysis.

Tool guarantees
  • No hidden transformations
  • Input processed only for this request
  • FASTA structure preserved in output

Details

N50 is the largest record length such that records of that length or longer together contain at least half of the total assembly span. L50 is the smallest number of longest records needed to reach that point. N90 and L90 use the same idea at 90%.
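
The definitions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's own code: the function name nx_lx and its percent parameter are hypothetical, and the input is simply a list of record lengths.

```python
from typing import List, Tuple


def nx_lx(lengths: List[int], percent: int) -> Tuple[int, int]:
    """Return (Nx, Lx) for a percentage of total span, e.g. 50 for N50/L50."""
    if not lengths:
        raise ValueError("at least one record length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")

    total = sum(lengths)
    running = 0
    # Walk records from longest to shortest, accumulating span.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length, count
    raise AssertionError("unreachable: the full set always reaches the threshold")


lengths = [100, 80, 60, 40, 20]  # total span 300
n50, l50 = nx_lx(lengths, 50)    # 100 + 80 = 180 >= 150, so N50 = 80, L50 = 2
n90, l90 = nx_lx(lengths, 90)    # 100 + 80 + 60 + 40 = 280 >= 270, so N90 = 40, L90 = 4
```

In this example the two longest records already cover half of the 300-base span, while four records are needed to cover 90% of it.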

When assemblies have similar expected size and scope, a higher N50 usually means longer contiguous records, while a lower L50 means fewer records cover half of the assembly. These are contiguity indicators only. They do not measure completeness, contamination, correctness, or annotation quality.