About Me

Shiva Sadeghpour

I'm a bioinformatician specializing in computational tools for genomics and data science. I started as a wet-lab technician in Caltech's neuroscience department, running experiments and analyzing gene expression data, before pivoting to computational biology. During my master's in bioinformatics I worked with 1.5 TB of metagenomic data and developed expertise in Unix, R, Python, and Nextflow. Today, I build scalable genomic pipelines, apply machine learning to multi-omic datasets, and engineer data solutions for biomarker discovery and early disease detection. I've collaborated with teams at NASA, CSU, and biotech startups, translating complex biological data into actionable insights for diagnostics and treatment.

Research

Large-Scale Metagenomic Analysis

The Problem: Analyzing a massive metagenomic dataset (1.5 TB across 60 samples) with traditional alignment tools like BLAST is computationally prohibitive. Runs take weeks to complete and become a bottleneck in the research workflow.

My Solution: I architected a scalable bioinformatics pipeline that uses profile hidden Markov models (pHMMs) in place of direct sequence alignment. I implemented parallelized workflows on high-performance computing (HPC) infrastructure to optimize resource usage, so that more than 640,000 predicted proteins could be processed concurrently.
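To illustrate the general idea, here is a minimal Python sketch of how a protein FASTA file might be split into shards, each paired with its own pHMM search command that an HPC scheduler could run as an independent task. It is not the pipeline itself. It assumes HMMER's hmmsearch as the search tool, and the file names, shard size, CPU count, and E-value cutoff are all hypothetical placeholders.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def shard_records(records: Iterable[Record], shard_size: int) -> Iterator[List[Record]]:
    """Group records into lists of at most shard_size entries."""
    if shard_size < 1:
        raise ValueError("shard_size must be at least 1")
    it = iter(records)
    while True:
        batch = list(islice(it, shard_size))
        if not batch:
            return
        yield batch


def hmmsearch_command(hmm_path: str, shard_path: str, out_path: str,
                      cpus: int = 4, evalue: float = 1e-5) -> List[str]:
    """Build an hmmsearch argument list for one shard.

    Assumption: HMMER's hmmsearch is the pHMM tool; cpus and evalue
    are illustrative defaults, not values from the original pipeline.
    """
    return [
        "hmmsearch",
        "--cpu", str(cpus),
        "-E", str(evalue),
        "--noali",               # skip alignment output to keep files small
        "--tblout", out_path,    # per-target tabular hits
        hmm_path,
        shard_path,
    ]


if __name__ == "__main__":
    # Hypothetical toy input standing in for a file of predicted proteins.
    toy = [">p1", "MKT", ">p2", "MAA", "GLL", ">p3", "MVV"]
    for i, shard in enumerate(shard_records(read_fasta(toy), shard_size=2)):
        cmd = hmmsearch_command("families.hmm", f"shard_{i}.faa", f"shard_{i}.tbl")
        print(len(shard), " ".join(cmd))
```

Because each shard's command is independent, the commands can be dispatched as separate scheduler tasks (for example, one per array-job index) and the tabular outputs concatenated afterwards.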

The Impact: The pipeline identified light-harvesting gene families across all samples, demonstrating how algorithmic optimization and workflow automation can turn a computational bottleneck into a tractable problem. Its efficiency and reproducibility proved critical for generating publication-ready results linking gene distribution to environmental variables in extreme desert ecosystems.

Contact

Email: shiva.sghpr@gmail.com

Resume

Resume Page 1
Resume Page 2