Which file format is considered the most raw output from an Illumina sequencer?

  1. fasta
  2. fastq
  3. bcl
  4. bam

As per my research it should be BCL, but I am not sure.
  • What do you mean by “raw”? None of the files you’ve listed are “raw” in any real sense. In fact, FASTQ and BCL are more or less equivalent, just stored in different ways. For the difference between FASTA, FASTQ and BAM, refer to https://bioinformatics.stackexchange.com/a/385/29. – Konrad Rudolph Jun 01 '21 at 13:56

1 Answer

BCL (binary base call) files are the rawest data that you will likely get. Strictly speaking, the images taken by the machine in each cycle are even more raw, but it makes no sense to interact with them directly. The base calls are derived from the fluorescent signal of the spots (clusters) on those images.

For any useful application you would convert them to FASTQ (e.g. using Illumina's bcl2fastq).
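
As a rough illustration of that conversion step, here is a minimal Python sketch that assembles a basic bcl2fastq call. It assumes bcl2fastq is installed and on your PATH; the function names and the paths `my_run_folder` and `fastq_out` are hypothetical placeholders, and a real run will usually need more options (sample sheet, threads, and so on) than shown here.

```python
import shutil
import subprocess


def build_bcl2fastq_command(run_folder, output_dir):
    """Return the argument list for a basic BCL -> FASTQ conversion.

    Only the two most basic bcl2fastq options are used here:
    --runfolder-dir (the sequencer's run folder) and --output-dir.
    """
    return [
        "bcl2fastq",
        "--runfolder-dir", str(run_folder),
        "--output-dir", str(output_dir),
    ]


def convert_run(run_folder, output_dir):
    """Run the conversion; fail early if bcl2fastq is not installed."""
    if shutil.which("bcl2fastq") is None:
        raise FileNotFoundError("bcl2fastq was not found on PATH")
    subprocess.run(build_bcl2fastq_command(run_folder, output_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths: replace with your own run folder and output location.
    print(" ".join(build_bcl2fastq_command("my_run_folder", "fastq_out")))
```

Building the command separately from running it lets you inspect the exact call before starting a conversion on a real run.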

What is the goal of your question? Do you suspect a problem with your data?
Adding this information will make it easier to tell what kind of answer you are looking for, and less likely that your question is downvoted.

PPK