Software Engineer - ML Infrastructure

San Francisco | Full Time | Posted Oct 3, 2025

Company Background
Specter's mission is to help automate the physical world.

Today, we build video sensors with state-of-the-art AI agents that answer any question, anywhere in their environments. Our systems can automatically detect and reason about any physical activity captured on camera, from security incidents (e.g. perimeter intrusion, theft, LPR), to safety monitoring (e.g. PPE detection, injured people), to operational efficiency (e.g. material tracking, congestion monitoring). We offer both long-range wireless (1 km range) and wired sensor variants to suit any deployment.

Our co-founders, Xerxes and Philip, are passionate about empowering our partners in the fast-approaching world of physical AI and robotics. We are a small, fast-growing team whose members hail from Anduril, Tesla, Uber, and the U.S. Special Forces.

The Role
Specter is hiring an ML Infrastructure Engineer to build and scale the machine learning systems that power real-time perception and inference across our edge-cloud platform. This role owns the data, training, deployment, and serving infrastructure for the computer vision and VLM systems that enable autonomous monitoring and orchestration across our customers' physical assets.

Responsibilities:

Designing and implementing scalable ML training and inference pipelines for perception models (object detection, tracking, classification, segmentation) and VLMs.
Developing continuous training and evaluation systems to improve model performance from production data feedback loops.
Designing large-scale multi-modal data pipelines for ingesting, processing, and indexing video and sensor data spanning both batch and streaming workloads.
Creating data pipelines for ingesting, labeling, versioning, and managing massive multi-modal sensor datasets (video, radar, lidar, thermal).
Implementing model monitoring, A/B testing frameworks, and performance analytics for deployed perception systems.
Collaborating with perception researchers to transition models from research to production at scale across thousands of edge nodes.
Building tools and infrastructure for distributed training, hyperparameter optimization, and experiment tracking.

Qualifications:

Strong software engineering fundamentals in Python, with working proficiency in a systems language (Rust, Go, C++) for performance-sensitive data and inference paths.
Working proficiency with ML frameworks (PyTorch, TensorFlow) and model optimization tooling.
Deep experience building and operating model inference systems at scale: request routing, batching, autoscaling, caching, and latency/throughput tuning under real production load.
Hands-on experience with distributed compute frameworks for ML and data workloads (Ray, Spark, or equivalent), including GPU cluster management and orchestration.
Strong understanding of distributed systems fundamentals: partitioning, replication, backpressure, exactly-once semantics.
Experience with vector databases (Qdrant, LanceDB, or equivalent) for similarity search and retrieval workloads.
Familiarity with LLM/VLM serving frameworks (vLLM, SGLang, TensorRT-LLM) in production is a strong plus.
Familiarity with video processing, sensor fusion, or multi-modal perception systems is a plus.