The raw Illumina output for paired-end data (required for sequencing with the new RADseq protocol in the Rapture method; Ali et al. 2015) needs to be organized so that all the ligated ends line up on the left side. The perl "flip and trim" script does this. To gauge the efficiency of your ligation and library prep, first compare the total number of reads in the raw data with the number remaining after "flip and trim". Ideally you want at least 70% of the reads retained (80% is even better).

You can do this on the command line with the following steps (assuming you are in the directory where both the original and "FLIP" files are located).

1. Count the number of lines in the original raw file.

$ wc -l originalfilename.fastq # gives the number of lines in the original .fastq

405258908

2. Count the number of lines in the "FLIP" file.

$ wc -l FLIPfilename.fastq # gives the number of lines in the Flipped .fastq

315350592

3. Divide each number by 4 (because each read takes up 4 lines in a .fastq file) with the simple "expr" command (it works on integers only, and it needs spaces between the numbers and the operator).

$ expr 405258908 / 4

101314727

$ expr 315350592 / 4

78837648

4. Divide the FLIP reads by the raw reads and multiply by 100 to get percentage. Here we use a simple awk command.

$ awk 'BEGIN{ # press Return here to get the ">" continuation prompt

> print (78837648/101314727*100); # do the calculation

> }' # show the answer on the screen

77.8146

So in this case 77.8% of the reads were retained after the flip/trim script, which is good!
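
If you run this check often, the four steps above can be folded into one small shell function. This is only a sketch: the name "percent_retained" is made up for illustration, and it assumes both files are uncompressed .fastq files with exactly 4 lines per read.

```shell
# Hypothetical helper (not part of the flip/trim script).
# Usage: percent_retained RAW.fastq FLIP.fastq
# Prints the percentage of reads retained, to one decimal place.
percent_retained() {
    # Reading from stdin (<) makes wc print only the count, without the filename.
    raw_lines=$(wc -l < "$1")
    flip_lines=$(wc -l < "$2")

    # awk handles the floating-point math; each read is 4 lines,
    # so divide both line counts by 4 before taking the ratio.
    awk -v raw="$raw_lines" -v flip="$flip_lines" 'BEGIN {
        if (raw == 0) { print "raw file is empty" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", (flip / 4) / (raw / 4) * 100
    }'
}
```

After pasting the function into your shell, you would call it as: percent_retained originalfilename.fastq FLIPfilename.fastq. With the line counts shown above it should report the same 77.8 that the manual steps give.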
