AI in Bioinformatics

I am a bioinformatics professional pursuing an M.S. in Bioinformatics at Indiana University, with a B.E. in Biomedical Engineering from Mumbai University. My expertise spans NGS data analysis, variant calling, RNA-seq, workflow automation (Snakemake, Nextflow), and cloud computing (AWS, GCP).

I am passionate about the intersection of AI and bioinformatics, particularly in machine learning-driven genomics. Currently, I am working on gene prediction using BERT NLP models and experimenting with AI integration in bioinformatics workflows. I enjoy exploring and implementing AI-driven solutions using ChatGPT, DeepSeek, and Gemini APIs to enhance genomic data interpretation and automation.

Projects

Bulk RNA-Seq Analysis

Analyzed bulk RNA-seq data to identify differentially expressed genes (DEGs), enriched pathways, and key biomarkers using DESeq2, KEGG, and Reactome. Applied NGS pipelines for gene expression profiling, enabling insights into disease mechanisms and therapeutic targets.

Single-Cell RNA-Seq

Developed single-cell RNA-seq pipelines using Seurat and Scanpy to explore cellular heterogeneity and immune responses. Applied clustering, normalization, and differential analysis to uncover key insights in tumor microenvironments, immunotherapy, and disease progression.

Variant Calling & Annotation

Designed variant calling pipelines (BWA, GATK, VEP) for detecting clinically significant mutations and novel variants. Automated workflows to enhance genomic data interpretation, supporting applications in precision medicine, rare disease research, and population genetics.

ATAC-Seq & Epigenomics

Performed chromatin accessibility analysis using ATAC-Seq and MACS2 to map open chromatin regions and regulatory elements. Integrated data with RNA-seq and ChIP-seq to investigate gene regulation, epigenetic modifications, and transcription factor binding sites.

Drug Profiler: ADMET Prediction

ADMET AI is a tool that lets drug discovery researchers quickly assess the pharmacokinetic and toxicity profiles of potential drug candidates. Using machine learning models trained on Therapeutics Data Commons (TDC) datasets, it predicts key ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties that are critical for drug development.

Clinical Trial Monitoring System

The Clinical Trial Monitoring System is a web-based dashboard designed for clinical researchers, data scientists, and medical professionals to monitor ongoing clinical trials. The system integrates multi-omics data (proteomics and genomics), clinical observations, and adverse event tracking to provide a holistic view of the trial data.

Professional Experience

Genomics Analyst

Genomics Analyst Intern at Karyosoft Inc., building scalable variant analysis pipelines for long- and short-read sequencing. Automating workflows with Snakemake and AWS Batch. Enhancing genomic interpretation with NLP, troubleshooting workflows, supporting clients, and maintaining documentation for transparency and best practices.

Research Assistant

Conducting genetic language modeling using BERT transformers to predict gene functions with 89.5% accuracy. Developing single-cell RNA-seq pipelines for tumor analysis. Presenting research findings, coordinating lab discussions, and contributing to a book chapter on RNA modifications under faculty supervision.

Teaching Assistant

Supporting students in the Introduction to Bioinformatics course by grading assignments, evaluating project reports, and resolving coding queries in Python and R. Coordinating guest lectures and assisting in course management. Facilitating discussions on computational tools and workflows to enhance student understanding of bioinformatics analysis.

Active Leadership & Academic Involvement

Graduate Ambassador

Facilitating student engagement by mentoring prospective students, coordinating with university departments, and organizing events like orientations and workshops. Leading outreach initiatives through virtual events and social media. Representing the university at recruitment events, open days, and campus tours to promote academic programs.

President, Luddy Journal Club

Leading the Luddy Journal Club to foster academic discussions and knowledge-sharing. Organizing events, inviting speakers, and facilitating Q&A sessions. Advocating for funding at ICSC meetings. Enhancing networking, communication, and event management skills while engaging with emerging topics in biomedical informatics.

Co-Director, Graphic Design (SAPB)

Designing visual materials to promote campus events as part of the Student Activities Programming Board (SAPB). Collaborating on marketing strategies to boost student engagement. Aligning designs with event goals, assisting in event management, and enhancing branding to create a dynamic and connected student community at Indiana University.

The Bioinformatics Behind Eli Lilly’s Oral GLP-1 Breakthrough: A Computational Journey from Molecule to Medicine

Introduction: Eli Lilly's recent announcement of successful Phase 3 trial results for orforglipron marks a significant milestone in metabolic disease treatment. As the first small molecule oral GLP-1 receptor agonist to complete a Phase 3 trial, orforglipron has...

How to Upload Local Scripts to a Specific Folder in a GitHub Repository

Use Case: You have a set of files (e.g., .py, .ipynb, etc.) on your local machine and want them to appear inside a specific...

Exploring Hypoxia-Induced Gene Expression and Pathway Alterations in Prostate Cancer Cells

Abstract: Hypoxia, a condition of reduced oxygen, drives significant changes in prostate cancer progression. This study analyzes the impact of hypoxia on gene expression in two prostate cancer cell lines, LNCaP (androgen-sensitive) and PC3 (androgen-independent), using...

R Shiny Single-Cell RNA-seq Analysis Dashboard

Introduction: As a student in bioinformatics, I am constantly dealing with data from single-cell RNA sequencing (scRNA-seq). Analyzing these large datasets can be challenging, especially when it comes to visualizing and exploring clusters within the data. So, for one of...

Exploring Gene Expression Changes in Prostate Cancer Cells Under Hypoxia

In this project, I explore differential gene expression and pathway enrichment in two prostate cancer cell lines—LNCaP and PC3—under different oxygen conditions: hypoxia (low oxygen) and normoxia (normal oxygen). Hypoxia is a common feature of tumor environments and...

Understanding Genome Assembly Using De Bruijn Graphs: From Concepts to Code

Genome assembly is a key concept in bioinformatics, and it involves reconstructing a long DNA sequence from short, overlapping fragments known as reads. The purpose of this article is to explain how we can use De Bruijn graphs and Eulerian paths to solve genome...

RNA-seq Analysis Pipeline

A Step-by-Step Guide to RNA-seq Data Processing: From Data Download to Raw Counts. In this article, I will walk you through the process of analyzing RNA-seq data, starting from downloading the raw data to obtaining gene expression counts. This is an essential workflow...

My Role as a Graduate Ambassador at the Luddy School of Informatics

As a Graduate Ambassador at the Luddy School of Informatics, Computing, and Engineering at Indiana University, I play a pivotal role in bridging the gap between the university and prospective students. This position allows me to engage in a variety of activities aimed...

Different Types of Bioinformatics Data

Bioinformatics data encompasses various types, each crucial for advancing our understanding of biological systems and enhancing medical research. Bioactivity data reveals how compounds interact with biological systems, essential for drug discovery and safety...

Evolution of DNA Sequence Alignment: Key Milestones

From the foundational Needleman-Wunsch algorithm in 1970 to AI-powered techniques in 2024, the field of DNA sequence alignment has seen transformative advancements. These innovations have revolutionized genomics, making sequence analysis faster, more accurate, and...