(ID: 2026-2574)

Axle is a bioscience and information technology company that offers advancements in translational research, biomedical informatics, and data science applications to research centers and healthcare organizations nationally and abroad. With experts in biomedical science, software engineering, and program management, we focus on developing and applying research tools and techniques to empower decision-making and accelerate research discoveries. We work with some of the top research organizations and facilities in the country, including multiple institutes at the National Institutes of Health (NIH).

Benefits We Offer:

100% Medical, Dental & Vision Coverage for Employees
Paid Time Off and Paid Holidays
401(k) match up to 5%
Educational Benefits for Career Growth
Employee Referral Bonus
Flexible Spending Accounts:
- Healthcare (FSA)
- Parking Reimbursement Account (PRK)
- Dependent Care Assistance Program (DCAP)
- Transportation Reimbursement Account (TRN)

We are seeking a Data Scientist II to join our vibrant team supporting the National Cancer Institute (NCI) at the NIH in Rockville, MD. This role is embedded within NCI's Center for Biomedical Informatics and Information Technology (CBIIT), where you will directly advance cancer research by building the computational infrastructure that scientists depend on every day.

You will support the full omics data lifecycle across a broad spectrum of modalities, including bulk RNA-seq, single-cell RNA-seq (scRNA-seq), spatial transcriptomics, Digital Spatial Profiling (DSP), whole genome and exome sequencing (WGS/WES), metagenomics, metabolomics, and proteomics, as well as clinical, imaging, and biospecimen data. A core part of this role involves developing workflows that integrate these modalities to support systems-level biological questions, cross-cohort studies, and NCI CBIIT initiatives.

You will collaborate closely with NCI scientists, bioinformaticians, clinician-researchers, data engineers, software developers, and government stakeholders to ensure analytical infrastructure is FAIR-compliant, containerized, version-controlled, well-documented, and purpose-built for long-term reuse across the research community.

Key Responsibilities

Bioinformatics Workflow and Data Pipeline Development: Design, build, and maintain reproducible pipelines for diverse biomedical data types — including genomic, transcriptomic, single-cell, spatial, proteomic, metagenomic, metabolomic, and clinical datasets. Develop reusable transformation logic and curated datasets supporting analytics, dashboards, APIs, notebooks, and downstream research workflows.
Multi-Omics Analysis: Support NCI CBIIT labs in their analysis workflows including bulk RNA-seq (QC, DEG, GSEA), single-cell RNA-seq (clustering, UMAP/t-SNE, cell type annotation, DEG), and Digital Spatial Profiling (annotation, QC, normalization, spatial deconvolution, volcano plots, heatmaps).
Data Integration and Lifecycle Support: Enable reliable data movement from source systems into structured, analysis-ready formats. Support ingestion, curation, metadata capture, source-to-target mapping, schema management, provenance tracking, and long-term maintainability of data products.
Statistical Modeling and Machine Learning: Apply statistical and ML methods — including hypothesis testing, regression, clustering, PCA, UMAP, t-SNE, and classification — to biomedical datasets. Incorporate AI/LLM-based extraction where appropriate, with clear validation and communication to stakeholders.
Researcher-Facing Applications and Visualization: Build and support interactive dashboards (Shiny, Streamlit), notebooks, reports, and APIs enabling researchers to explore multi-omics and clinical data. Support figure generation for QC, differential expression, pathway, and spatial analyses.
Collaboration: Partner with data scientists, bioinformaticians, researchers, developers, and government stakeholders to translate scientific needs into technical specifications, data models, and reusable workflows that accelerate biomedical research.

Required Qualifications

Education & Background: Bachelor's degree in Data Science, Bioinformatics, Computer Science, Biological Sciences, or a related field (advanced degree preferred), or equivalent experience. Demonstrated experience in a data-intensive role supporting biomedical research or scientific computing.
Data Science and Bioinformatics Expertise: Strong proficiency in Python and R for analysis, scripting, and visualization. Hands-on experience with at least two omics data types (e.g., bulk RNA-seq, scRNA-seq, spatial transcriptomics, proteomics, metagenomics, GWAS).
Analytical Skills: Solid understanding of statistical modeling, dimensionality reduction, clustering, differential expression, and pathway analysis. Ability to work with structured, semi-structured, and unstructured data across relational and data lake environments.
Collaboration & Communication: Strong problem-solving skills with the ability to communicate effectively across technical and non-technical audiences. Able to translate scientific needs into technical solutions and clearly articulate risks, assumptions, and limitations.
Domain Alignment: Genuine interest in biomedical and translational research. Ability to quickly learn domain-specific terminology and workflows, with awareness of data governance, privacy, and compliance requirements for clinical and research data.

Preferred Qualifications

Data Platform Experience: Experience building analytics solutions in platforms such as Snowflake, Databricks, or cloud data warehouses, with integrations across databases, APIs, dashboards, and application environments.
Bioinformatics Workflow Tooling: Experience with workflow and reproducibility tools and platforms such as Galaxy, Terra, Nextflow/WDL, Snakemake, Singularity, or CWL. Familiarity with the scverse Python ecosystem (Scanpy, Squidpy, SCIMAP, AnnData) and spatial single-cell analysis methods, including PhenoGraph, Louvain/Leiden clustering, UMAP, and Ripley's L statistic, is a plus.
Research and Application Enablement: Experience preparing curated datasets for dashboards, APIs, and web applications. Familiarity with Posit Connect, R/Shiny, Streamlit, Jupyter, or similar platforms is a plus.
Cloud, HPC, Storage, and Automation: Experience with AWS (EC2, S3, Lambda), object storage, relational databases, scheduled jobs, API integrations, and secure data movement. Familiarity with HPC environments, SLURM/SGE, or NIH Biowulf is preferred.
Biomedical Domain Knowledge: Background in biomedical research, clinical research, or healthcare analytics. Familiarity with standards such as HL7/FHIR, CDISC, or OMOP, and experience with clinical, genomic, or biospecimen data is a plus.
Governance and Reproducibility: Experience with metadata management, data lineage, open-source code release, containerized analyses, and secure handling of de-identified or access-controlled research datasets.
Training and Scientific Enablement: Experience creating documentation, training materials, or workshops for researchers and non-coder audiences. Ability to support tool adoption and explain workflows and results clearly is strongly preferred.

Disclaimer: The above description is meant to illustrate the general nature of work and level of effort being performed by individuals assigned to this position or job description. It is not intended as a complete list of all skills, responsibilities, duties, and/or assignments required. Individuals may be required to perform duties outside of their position, job description, or responsibilities as needed.

The diversity of Axle's employees is a tremendous asset. We are firmly committed to providing equal opportunity in all aspects of employment and to deterring those who aid, abet, or induce discrimination or coerce others to discriminate. We will not tolerate any illegal discrimination or harassment based on age, race, gender, religion, national origin, disability, marital status, covered veteran status, sexual orientation, status with respect to public assistance, or other characteristics protected under state, federal, or local law.

Accessibility: If you need an accommodation as part of the employment process please contact: [email protected]

This role has a market-competitive salary with an anticipated base compensation range listed below. Actual salaries will vary depending on a candidate's experience, qualifications, skills, and location.

Salary Range

$130,000–$145,000 USD

Data Scientist II
