Genome assembly of Parapercis colias (Blue cod/Rāwaru): an endemic marine fish species of New Zealand
Fishery stocks are important resources for communities and nations, but many have been heavily exploited and have become vulnerable to collapse. Fisheries science aims to understand the nature and structure of fish stocks so that fishing pressure can be adjusted to sustainable levels. Genetic methods have made many important contributions to fisheries management; however, there has been limited development of whole-genome sequencing techniques, which are much more powerful and promising than the methods currently in use. Technical advances in genetic approaches would provide an invaluable resource for improving and developing New Zealand fisheries and aquaculture species. Blue cod / Rāwaru (Parapercis colias) supports a popular commercial, recreational and customary fishery in New Zealand. Rāwaru is a marine fish species endemic to New Zealand, where it has recently been considered for aquaculture development. Rāwaru fisheries have been heavily exploited and have experienced several closures in the past, which has led to management changes in how the species can be caught. To support Rāwaru fisheries science, the goal of this thesis research is to produce the first whole-genome assembly for the species and to compare its genome structure with that of other fish species for which whole-genome sequences are available. Three assemblies were initially generated: one using Illumina short-read sequences (32x coverage) with the assembler MaSuRCA, another using Pacific Biosciences HiFi long-read sequences (10x coverage) with the assembler Peregrine, and a third using Pacific Biosciences HiFi data in combination with Hi-C data generated on a PacBio system, using the assembler Hifiasm. Completeness statistics of these assemblies were compared to determine the best assembly for annotation and downstream analyses.
The Illumina assembly produced a BUSCO completeness score of 85.80% and an N50 of 35.3 kb, the Peregrine assembly a BUSCO score of 97.70% and an N50 of 551.4 kb, and the Hifiasm assembly a BUSCO score of 96% and an N50 of 348.7 kb. The Peregrine assembly was therefore selected as the final genome assembly for all subsequent analyses. The final genome assembly is 643 Mb long and consists of 3,535 scaffolds. Repetitive elements cover 36.17% of the final assembly. In comparison with six other New Zealand fish species (King Tarakihi, Tarakihi, Barracouta, Blue moki, Butterfish and Kahawai), Rāwaru has the second largest proportion of repetitive elements, behind Barracouta. Rāwaru also has the second largest genome within this group of species. This is consistent with the trend that the proportion of repetitive elements increases with genome size, which was supported by a Pearson correlation coefficient of r = 0.96 (P = 0.0006). Rāwaru also ranks second among these seven species for protein-coding genes, with 22,842. There is, however, no significant correlation between genome size and either the number of genes (r = 0.2656, P = 0.57) or gene length (r = -0.1374104, P = 0.7689). This is the first time the Rāwaru genome has been assembled and annotated. MAKER predicted 22,842 genes, of which 22,720 were annotated through BLAST; BLAST results were filtered to five hits per gene, and the hit with the highest bit score was used to represent each gene in the annotation. InterProScan was also used to annotate the Rāwaru genome: GO terms were identified for 82.8% of the predicted genes, and 99.5% of gene models received a BLASTP annotation.
More than 98% of the Rāwaru genes were functionally annotated. No exon statistics could be provided because no transcriptomic data were used in this thesis. This thesis provides a valuable resource to assist studies of population genomics, conservation, fisheries management and eco-evolutionary comparisons in other teleost fish, both those endemic to New Zealand and those found globally.
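
The N50 values reported above summarise assembly contiguity. As a minimal sketch of how such a value can be derived from a list of contig or scaffold lengths, the Python function below uses the common definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. The function name `n50` and the example lengths are illustrative assumptions, not the scripts or data used in this thesis.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is taken here as the length of the sequence at which the
    running total, counted from the longest sequence downwards,
    first reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (not taken from the Rāwaru assembly).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

In practice the lengths would be read from the assembly FASTA file, and dedicated assembly-statistics tools report the same figure alongside related metrics.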