Open reading frame
In
In
Biological significance
One common use of open reading frames (ORFs) is as one piece of evidence to assist in
Short ORFs (sORFs)
Short ORFs (sORFs), also called small open reading frames,[7] are usually fewer than 100 codons long.[8] Some sORFs that lack the classical hallmarks of protein-coding genes, in both ncRNAs and mRNAs, can produce functional peptides.[9] The 5′-UTRs of about 50% of mammalian mRNAs are known to contain one or several sORFs,[10] also called upstream ORFs or uORFs. However, fewer than 10% of the vertebrate mRNAs surveyed in an older study contained AUG codons upstream of the major ORF. Interestingly, uORFs were found in two thirds of proto-oncogenes and related proteins.[11] Between 64% and 75% of experimentally identified translation initiation sites of sORFs are conserved between the human and mouse genomes, which may indicate that these elements are functional.[12] However, sORFs are often found only in minor forms of mRNAs and so may avoid selection; the high conservation of their initiation sites may instead be connected with their location inside the promoters of the relevant genes. This is the case for the SLAMF1 gene, for example.[13]
Six-frame translation
Since DNA is interpreted in groups of three nucleotides (codons), a DNA strand has three distinct reading frames.[14] The double helix of a DNA molecule has two anti-parallel strands; with the two strands having three reading frames each, there are six possible frame translations.[14]
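The six frames described above can be illustrated with a short Python sketch. It assumes the standard genetic code and an input containing only A, C, G and T; the function names (`reverse_complement`, `translate`, `six_frame_translation`) and the frame labels such as `+1` and `-3` are illustrative choices, not taken from any particular tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon; trailing 1-2 nucleotides are ignored.

    Codons with unexpected characters are rendered as 'X' (an assumption
    made for this sketch).
    """
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels ('+1'..'+3', '-1'..'-3') to peptides."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, peptide in six_frame_translation("ATGGCCTAA").items():
        print(label, peptide)
```

In this sketch `*` marks a stop codon, so an open reading frame appears as a stretch of letters uninterrupted by `*` in one of the six outputs.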
Software
ORF Finder
The ORF Finder (Open Reading Frame Finder)[15] is a graphical analysis tool which finds all open reading frames of a selectable minimum size in a user's sequence or in a sequence already in the database. This tool identifies all open reading frames using the standard or alternative genetic codes. The deduced amino acid sequence can be saved in various formats and searched against the sequence database using the basic local alignment search tool (BLAST) server. The ORF Finder should be helpful in preparing complete and accurate sequence submissions. It is also packaged with the Sequin sequence submission software (sequence analyser).
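The general idea of scanning for ORFs above a selectable minimum size can be sketched in Python as follows. This is not the ORF Finder implementation; it is a minimal illustration that assumes start-to-stop ORFs, the standard genetic code's start and stop codons as defaults, and hypothetical names (`find_orfs`, `min_codons`). Passing different `starts` or `stops` sets stands in for choosing an alternative genetic code.

```python
START_CODONS = {"ATG"}                 # default start codon (assumption)
STOP_CODONS = {"TAA", "TAG", "TGA"}    # stop codons of the standard code


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_codons=30, starts=START_CODONS, stops=STOP_CODONS):
    """Find complete start-to-stop ORFs in all six reading frames.

    Returns a list of (strand, frame, start, end) tuples. Coordinates are
    0-based, end-exclusive, include the stop codon, and are relative to the
    strand that was scanned. ORFs with fewer than `min_codons` codons
    (not counting the stop codon) are skipped.
    """
    dna = dna.upper()
    results = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if orf_start is None:
                    if codon in starts:
                        orf_start = i          # open a candidate ORF
                elif codon in stops:
                    n_codons = (i - orf_start) // 3
                    if n_codons >= min_codons:
                        results.append((strand, frame + 1, orf_start, i + 3))
                    orf_start = None           # close it and keep scanning
    return results


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))
```

ORFs that reach the end of the sequence without a stop codon are not reported in this sketch; a real tool may offer an option to report such partial ORFs.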
ORF Investigator
ORF Investigator
ORF Predictor
OrfPredictor[17] is a web server designed for identifying protein-coding regions in expressed sequence tag (EST)-derived sequences. For query sequences with a hit in BLASTX, the program predicts the coding regions based on the translation reading frames identified in the BLASTX alignments; otherwise, it predicts the most probable coding region based on the intrinsic signals of the query sequences. The output is the predicted peptide sequences in FASTA format, with a definition line that includes the query ID, the translation reading frame, and the nucleotide positions where the coding region begins and ends. OrfPredictor facilitates the annotation of EST-derived sequences, particularly for large-scale EST projects.
ORF Predictor uses a combination of the two different ORF definitions mentioned above. It searches stretches starting with a start codon and ending at a stop codon. As an additional criterion, it searches for a stop codon in the 5' untranslated region (UTR or NTR, nontranslated region[18]).
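One possible reading of these criteria is sketched below in Python: find a stretch that begins at a start codon and ends at an in-frame stop codon, then check whether an in-frame stop codon also occurs upstream of the start, in the presumed 5' UTR. This is an interpretation for illustration only, not the tool's actual algorithm; the function names and the choice of `ATG` as the only start codon are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_start_to_stop(seq):
    """Return (start, end) of the first ATG...stop stretch, or None.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    Only the given strand is scanned.
    """
    seq = seq.upper()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream in the frame set by this start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
    return None


def has_upstream_inframe_stop(seq, orf_start):
    """Check whether a stop codon lies upstream of orf_start in the same frame."""
    seq = seq.upper()
    for i in range(orf_start - 3, -1, -3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


if __name__ == "__main__":
    transcript = "TAAGCCATGAAATGA"
    orf = find_start_to_stop(transcript)
    print(orf, has_upstream_inframe_stop(transcript, orf[0]))
```

An in-frame stop codon upstream of the start suggests that the ORF cannot extend further in the 5' direction, which is one reason such a check is useful for sequences that may be incomplete, such as ESTs.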
ORFik
ORFik is an R package in Bioconductor for finding open reading frames and for using next-generation sequencing data to support ORF predictions.[19][20]
orfipy
orfipy is a tool written in Python and Cython to extract ORFs in an extremely fast and flexible manner.[21] orfipy can work with plain or gzipped FASTA and FASTQ sequences, and provides several options to fine-tune ORF searches; these include specifying the start and stop codons, reporting partial ORFs, and using custom translation tables. The results can be saved in multiple formats, including the space-efficient BED format. orfipy is particularly fast for data containing many smaller FASTA sequences, such as de novo transcriptome assemblies.[22]
See also
- Coding region
- Putative gene
- Sequerome – A sequence profiling tool that links each BLAST record to the NCBI ORF enabling complete ORF analysis of a BLAST report.
- Micropeptide
References
- PMID 29366605.
- Brody LC (2021-08-25). "Stop Codon". National Human Genome Research Institute. National Institutes of Health. Retrieved 2021-08-25.
- OCLC 185042615.
- PMID 9300666.
- ISBN 978-0-387-98785-9.
- PMID 9415985.
- S2CID 254966620.
- PMID 35154250.
- S2CID 206639549.
- PMID 24163100.
- PMID 8016865.
- PMID 22927429.
- PMID 27424222.
- S2CID 6413018.
- "ORFfinder". National Center for Biotechnology Information.
- Dhar DV, Kumar MS (2012). "ORF Investigator: A New ORF finding tool combining Pairwise Global Gene Alignment". Research Journal of Recent Sciences. 1 (11): 32–35.
- "OrfPredictor". bioinformatics.ysu.edu. Archived from the original on 2015-12-22. Retrieved 2015-12-17.
- PMID 2319646.
- .
- PMID 34147079.
- PMID 33576786.
- Singh U (2021-02-13), urmi-21/orfipy, retrieved 2021-02-13
External links
- Translation and Open Reading Frames
- hORFeome V5.1 - A web-based interactive tool for CCSB Human ORFeome Collection
- ORF Marker - A free, fast and multi-platform desktop GUI tool for predicting and analyzing ORFs
- StarORF - A multi-platform, java-based, GUI tool for predicting and analyzing ORFs and obtaining reverse complement sequence
- ORFPredictor Archived 2015-12-22 at the Wayback Machine - A webserver designed for ORF prediction and translation of a batch of EST or cDNA sequences