
LncPTPred is a machine learning (ML) based tool for predicting interactions between long non-coding RNAs (lncRNAs) and proteins. It was built in two phases: data curation and machine learning. In the data curation phase, data were collected from three experimental assays, Photoactivatable Ribonucleoside-enhanced Crosslinking and Immunoprecipitation (PAR-CLIP), High-throughput Sequencing of RNA isolated by Crosslinking Immunoprecipitation (HITS-CLIP) and Enhanced Crosslinking and Immunoprecipitation (eCLIP), to extract the RNA binding positions for a given protein. These positions were then rigorously pre-processed against the LncRbase v.2 and Ensembl BioMart databases to identify positively bound lncRNAs. Finally, the non-interacting segments were screened to generate the negative data. Within the interacting segments, we checked for specific sequence motifs that affect an lncRNA’s binding affinity; these are shown in a motif binding plot.
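To make the idea of deriving negative data from non-interacting segments concrete, here is a minimal sketch in Python. It is not the LncPTPred pipeline itself: it assumes that binding positions are available as half-open (start, end) intervals on a single transcript and that a negative candidate is any stretch not covered by a binding site. The function names and the min_len parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumption: a binding site is a half-open interval [start, end)
# in transcript coordinates.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def non_interacting_segments(
    length: int, bound: List[Interval], min_len: int = 1
) -> List[Interval]:
    """Return the stretches of a transcript not covered by any binding site.

    Hypothetical helper: gaps shorter than min_len are discarded.
    """
    gaps: List[Interval] = []
    cursor = 0  # first position not yet assigned to a gap or a binding site
    for start, end in merge_intervals(bound):
        # Clip the interval to the transcript boundaries.
        start = min(max(0, start), length)
        end = min(max(0, end), length)
        if start - cursor >= min_len:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if length - cursor >= min_len:
        gaps.append((cursor, length))
    return gaps


# A 100-nt transcript with three reported binding sites, two of which overlap.
print(non_interacting_segments(100, [(10, 20), (15, 30), (60, 70)]))
# → [(0, 10), (30, 60), (70, 100)]
```

Merging the binding sites first keeps overlapping peaks from different assays from producing spurious short gaps. The actual tool may apply further filters when screening negative segments.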
For the prediction task, we use a LightGBM-based ensemble model. The distinctive feature of our tool is its ability to predict the interacting lncRNA segments and their corresponding binding probabilities (reported as Final_Interacting_Score) for a particular protein.
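As an illustration of how an ensemble can yield one probability per lncRNA segment, the sketch below averages the probabilities returned by several member models (soft voting) and ranks the segments by that average. This is an assumption made for illustration only: the description above does not state how Final_Interacting_Score is aggregated. The scorers are plain Python callables standing in for trained models, not LightGBM objects, and every name in the block is hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: each ensemble member maps a feature vector to a probability in [0, 1].
Scorer = Callable[[Sequence[float]], float]


def averaged_score(features: Sequence[float], scorers: Sequence[Scorer]) -> float:
    """Soft voting: mean of the member probabilities for one segment."""
    if not scorers:
        raise ValueError("at least one scorer is required")
    return mean(scorer(features) for scorer in scorers)


def rank_segments(
    segments: Dict[str, Sequence[float]], scorers: Sequence[Scorer]
) -> List[Tuple[str, float]]:
    """Score every segment and sort from most to least likely to interact."""
    scored = [(name, averaged_score(feats, scorers)) for name, feats in segments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Stand-in members: each simply reads one (already normalised) feature.
members: List[Scorer] = [lambda f: f[0], lambda f: f[1]]

# Hypothetical segments with two made-up features each.
candidates = {
    "segment_A": [0.9, 0.7],
    "segment_B": [0.2, 0.4],
}

for name, score in rank_segments(candidates, members):
    print(f"{name}\t{score:.2f}")
```

In a real setting each stand-in would be replaced by a trained classifier's probability output for the positive class. Averaging is only one of several possible aggregation schemes.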
The standalone version of the LncPTPred tool can be accessed from here.
Run your analysis from here.