RNA secondary structure prediction using conditional random fields model

2013 | International Journal of Data Mining and Bioinformatics

Authors:
Chinae Thammarongtham
Robert W. Cutler
Jeerayut Chaijaruwanich

Abstract

Non-coding RNAs (ncRNAs) perform important biological functions in living cells, and these functions depend on their conserved secondary structures. Here, we focus on computational RNA secondary structure prediction from primary sequences and complementary base-pair interactions using a Conditional Random Fields (CRFs) model, which treats RNA structure prediction as a sequence labelling problem. To extract suitable features from known RNA secondary structures, we developed a feature extraction method based on the loop and stem characteristics of natural RNAs. Our CRFs models predict the secondary structures of the test RNAs with optimal F-scores ranging from 56.61% to 98.20% across different RNA families.
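
To illustrate what "sequence labelling" and "F-score" can mean in this setting, the following is a minimal sketch and not the authors' implementation. It assumes structures are written in dot-bracket notation, uses a hypothetical two-label scheme ("S" for stem, "L" for loop), and assumes the F-score is computed over predicted base pairs; the abstract does not specify any of these details. The function names are illustrative only.

```python
def dot_bracket_to_pairs(structure):
    """Return the set of (i, j) base-pair positions in a dot-bracket string."""
    stack = []
    pairs = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def pairs_to_labels(length, pairs):
    """Label each position 'S' (paired, stem) or 'L' (unpaired, loop).

    This two-label scheme is an assumption made for illustration.
    """
    labels = ["L"] * length
    for i, j in pairs:
        labels[i] = "S"
        labels[j] = "S"
    return labels


def f_score(true_pairs, predicted_pairs):
    """Harmonic mean of precision and recall over base pairs."""
    true_positives = len(true_pairs & predicted_pairs)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_pairs)
    recall = true_positives / len(true_pairs)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up structures, not data from the paper).
reference = "((((...))))"
predicted = "(((.....)))"

ref_pairs = dot_bracket_to_pairs(reference)
pred_pairs = dot_bracket_to_pairs(predicted)

print("".join(pairs_to_labels(len(reference), ref_pairs)))
print(round(f_score(ref_pairs, pred_pairs), 4))
```

In this toy case the prediction recovers three of the four reference base pairs and predicts no false pairs, so precision is 1.0, recall is 0.75, and the F-score is 6/7. A CRF would be trained to emit a label sequence such as the one printed first, given features derived from the primary sequence.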