An analysis of the Kozak consensus in retinal genes and its relevance to gene therapy.
McClements ME., Butt A., Piotter E., Peddle CF., MacLaren RE.
Purpose: The classic Kozak consensus is a critical genetic element included in gene therapy transgenes to encourage translation of the therapeutic coding sequence. Despite optimization of other transgene elements, the Kozak consensus has not yet been considered for tissue-specific sequence refinement. We screened the -9 to -1 region relative to the AUG start codon of retina-specific genes to determine whether a Kozak consensus that differs from the classic sequence may be more appropriate for gene therapy transgenes that treat inherited retinal disease.
Methods: Sequences for 135 genes known to cause nonsyndromic inherited retinal disease were extracted from the NCBI database, and the -9 to -1 nucleotides were compared. This panel was then refined to 75 genes with specific retinal functions. For each of these genes, the -9 to -1 nucleotides were placed in front of a GFP transcript sequence, and RNAfold predictions were performed. The predictions were compared with that for a GFP sequence carrying the classic Kozak consensus (GCCGCCACC), and retinal gene sequences with minimum free energy (MFE) predictions greater than that of the reference sequence were selected to generate an optimized Kozak consensus sequence. The original Kozak consensus and the refined retina Kozak consensus were each placed upstream of the Renilla luciferase coding sequence, and the resulting constructs were used to transfect the retinoblastoma cell lines Y-79 and WERI-RB-1 and HEK 293T/17 cells.
Results: The nucleotide frequencies of the original panel of genes were comparable to the classic Kozak consensus. RNAfold analysis of a GFP transcript with the classic Kozak sequence in the 5' untranslated region (UTR) generated an MFE prediction of -503.3 kcal/mol. RNAfold analysis was then performed with a GFP transcript containing the -9 to -1 Kozak sequence of each of the 75 retinal genes. Thirty-eight of the 75 genes gave a greater (less negative) MFE value than -503.3 kcal/mol and showed no stable secondary structures before the AUG codon. The -9 to -1 nucleotide frequencies of these genes identified a Kozak consensus of ACCGAGACC, which differs from the classic Kozak consensus at positions -9, -5, and -4. Applying this sequence to the GFP transcript increased the MFE prediction to -500.1 kcal/mol. The newly identified retina Kozak sequence was also applied to Renilla luciferase and to the REP1 and RPGR transcripts used in current clinical trials. In every case, the predicted transcript MFE score increased compared with the current transcript sequences containing classic Kozak consensus sequences. In vitro transfections showed a 7%-9% increase in Renilla activity with the optimized Kozak sequence.
Conclusions: The Kozak consensus is a critical element of eukaryotic genes and is therefore a required feature of gene therapy transgenes. To date, the classic sequence of GCCRCC (-6 to -1) has typically been incorporated in gene therapy transgenes, but the analysis described here suggests that, for vectors targeting the retina, a Kozak consensus derived from retinal genes can provide increased expression of the target product.
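The selection and consensus steps described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the tie-breaking rule, and the toy inputs in the usage note are assumptions made for illustration, and the MFE values are taken as given inputs (for example, computed separately with RNAfold). Only the classic sequence GCCGCCACC, the retina consensus ACCGAGACC, and the reference value of -503.3 kcal/mol come from the abstract.

```python
from collections import Counter

CLASSIC_KOZAK = "GCCGCCACC"  # -9 to -1, as stated in the abstract
RETINA_KOZAK = "ACCGAGACC"   # -9 to -1, as stated in the abstract
REFERENCE_MFE = -503.3       # kcal/mol, GFP transcript with the classic Kozak


def select_by_mfe(mfe_by_gene, reference=REFERENCE_MFE):
    """Keep genes whose predicted MFE is greater (less negative) than the reference.

    mfe_by_gene maps a gene label to an MFE value obtained elsewhere;
    this helper is hypothetical and performs no folding itself.
    """
    return [gene for gene, mfe in mfe_by_gene.items() if mfe > reference]


def position_frequencies(sequences):
    """Return one Counter of nucleotides per position, ordered from -9 to -1."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    return [Counter(seq[i] for seq in sequences) for i in range(length)]


def consensus(sequences):
    """Most frequent nucleotide at each position.

    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    result = []
    for counts in position_frequencies(sequences):
        best = max(counts.values())
        result.append(min(nt for nt, n in counts.items() if n == best))
    return "".join(result)


def differing_positions(a, b):
    """Positions, numbered relative to the AUG (-len to -1), where a and b differ."""
    offset = len(a)
    return [i - offset for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

As a usage example, `differing_positions(CLASSIC_KOZAK, RETINA_KOZAK)` returns `[-9, -5, -4]`, matching the positions reported in the Results. With made-up inputs, `select_by_mfe({"gene_A": -500.0, "gene_B": -510.0})` keeps only `gene_A`, and `consensus` applied to the -9 to -1 sequences of the selected genes gives the per-position majority sequence.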