Portfolio
Featured Projects

Deciphering Protein Sequence Relationships: A Python-Based Approach
- Utilized Python scripting and bioinformatics libraries (e.g., Biopython) to analyze protein sequences.
- Gained proficiency in sequence alignment, phylogenetic tree construction, and structure prediction using BLAST, InterProScan, and SWISS-MODEL.
- Identified key functional domains and structural motifs in the analyzed proteins, contributing to a deeper understanding of protein function and evolution.
NLP-Driven Gene Function Prediction using BERT in Biological Sequences
- Developing a gene function prediction model that applies NLP techniques (BERT) to biological sequences, implemented in TensorFlow.
- Training BERT on biological sequences to identify gene regions and predict function in novel sequences.
- Automating gene identification and characterization with BERT can advance the understanding of gene function in genomic research and its applications.


Highly Accurate Protein Structure Prediction with AlphaFold
- Leveraged AlphaFold and ColabFold to explore protein folding, employing predictive analysis and visualization tools (e.g., PyMOL) for structural assessment.
- Demonstrated understanding of deep learning models for protein structure prediction and their significance in decoding complex protein architectures.
- Predicted protein structures with high confidence (e.g., pLDDT > 90) using AlphaFold and ColabFold, illuminating potential applications in structural biology, drug discovery, and collaborative research.
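
To make the pLDDT figure above concrete, here is a minimal sketch of how per-residue confidence could be summarized from a predicted model. It relies on the fact that AlphaFold-style PDB files store pLDDT in the B-factor column; the function names and the 90 threshold default are illustrative assumptions, not the code used in this project.

```python
def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from C-alpha ATOM records.

    Assumes an AlphaFold-style PDB file, where the B-factor column holds pLDDT.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16,
        # chain ID in column 22, residue number in 23-26, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            scores[(chain, residue_number)] = float(line[60:66])
    return scores


def summarize_plddt(scores, threshold=90.0):
    """Return (mean pLDDT, fraction of residues with pLDDT above threshold)."""
    values = list(scores.values())
    if not values:
        raise ValueError("no C-alpha atoms found")
    mean = sum(values) / len(values)
    fraction_above = sum(v > threshold for v in values) / len(values)
    return mean, fraction_above
```

Passing the text of a predicted model file to plddt_per_residue and then to summarize_plddt gives the mean confidence and the share of residues above the chosen threshold.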
Data Analysis of Anemia Prevalence and Its Contributing Factors in Pregnant and Non-Pregnant Women
- Performed comprehensive statistical analysis of anemia prevalence in women using WHO datasets and R/Python.
- Analyzed hemoglobin concentration distribution through box plots, confusion matrices, and histograms, identifying potential biases and disparities.
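
As an illustration of the kind of prevalence calculation described above, the sketch below classifies hemoglobin readings and computes the anemic fraction per group. The cut-offs (11.0 g/dL for pregnant women, 12.0 g/dL for non-pregnant women) are commonly cited WHO thresholds and are an assumption here; the thresholds, record layout, and function name actually used in the project are not stated.

```python
# Assumed hemoglobin cut-offs in g/dL (commonly cited WHO values);
# the project's actual thresholds may differ.
ANEMIA_CUTOFF_G_DL = {"pregnant": 11.0, "non_pregnant": 12.0}


def anemia_prevalence(records):
    """Return {group: fraction anemic} for (group, hemoglobin_g_dl) records.

    A record counts as anemic when its hemoglobin is below the group's cut-off.
    """
    totals = {}
    anemic = {}
    for group, hemoglobin in records:
        totals[group] = totals.get(group, 0) + 1
        if hemoglobin < ANEMIA_CUTOFF_G_DL[group]:
            anemic[group] = anemic.get(group, 0) + 1
    return {group: anemic.get(group, 0) / n for group, n in totals.items()}
```

The same per-group split is what a box plot or histogram of hemoglobin concentration would be drawn over, so the function can double as a quick sanity check on the plotted data.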

Featured Reports
Epigenetic Study of Socioeconomic Hardship from EHR-linked Biodata Bank
This seminar report covers "Epigenetic Study of Socioeconomic Hardship from EHR-linked Biodata Bank," a talk presented by Dr. Yaomin Xu, who holds a PhD in Statistics from Case Western Reserve University.