Role: Sr. Data Engineer / Scientist
Location: Huntsville, AL
Clearance: Active TS/SCI
We are seeking a Senior Data Scientist / Engineer with a strong background in data infrastructure and large-scale data environments.
This role supports the definition, architecture, and engineering of a large-scale on-premises data environment designed to ingest, store, process, and distribute space domain data across multiple classification levels.
This is not a traditional data science role. The position sits at the intersection of:
- Data engineering
- Data architecture
- Infrastructure engineering
- Systems engineering
The position will play a key role in shaping the technical direction, requirements development, and acquisition strategy for mission data systems.
This role focuses on designing and supporting a large-scale data environment that will process and manage significant volumes of space domain data. The system is expected to scale to tens of petabytes of data and hundreds of infrastructure racks, supporting advanced analytics and mission operations.
The ideal candidate has experience working with large data environments, distributed systems, or high-performance computing platforms, and understands how data platforms are designed from both a software and infrastructure perspective.
This position offers the opportunity to support critical national security missions while working on complex, large-scale data and infrastructure challenges.
Requirements
- Bachelor’s degree in computer science, data science, engineering, or related quantitative field, and a current Top Secret/Sensitive Compartmented Information (TS/SCI) security clearance
- Minimum 7 years of progressive experience across both data engineering and data science disciplines, including expert proficiency in Python and Structured Query Language (SQL); demonstrated success designing Extract, Transform, Load/Extract, Load, Transform (ETL/ELT) pipelines; developing and deploying machine learning models; and working with big data technologies such as Apache Spark, Apache Airflow, or Apache Hadoop
- Demonstrated knowledge of enterprise data architecture and storage hardware (e.g., Storage Area Networks (SAN), Network Attached Storage (NAS), and object storage), data center design principles, and data governance practices, plus an understanding of space data characteristics, including experience with the National Space Intelligence Center (NSIC) Data Catalog
Responsibilities
- Develop architecture for large-scale data platforms supporting petabyte-scale storage and analytics workloads
- Define infrastructure requirements including:
- Support planning for scalable environments ranging from 200 to 500 racks
- Design distributed data environments including:
- Translate mission needs into technical system requirements and design artifacts
- Support capacity planning, including threshold vs. objective growth models
- Assist with Other Transaction Authority (OTA), Request for Proposal (RFP), and acquisition planning activities
- Evaluate vendor solutions and participate in technical reviews
- Collaborate across cybersecurity, network, infrastructure, and mission teams
- Support development of architecture documentation and engineering packages
Desired Skills
- Master’s degree or Doctor of Philosophy (PhD) in computer science, engineering, data science, or related field
- Experience designing or supporting large-scale data environments (petabyte-scale)
- Strong understanding of data center infrastructure and architecture
- Experience with distributed data systems and high-performance computing (HPC) environments
- Knowledge of data storage architectures
- Experience with data ingestion and large-scale analytics pipelines
- Familiarity with DoD or Intelligence Community environments
- Understanding of multi-classification or cross-domain architectures
- Experience supporting technical requirements development and acquisition processes
- Knowledge of data center and telecommunications standards
T1454624CHS-LF_1781279123
