Meta is seeking a highly motivated and experienced Operations Platform Engineer to own and evolve the robotics test and data infrastructure that underpins motor validation, experimentation, and downstream analysis. We are building next-generation robotic systems that operate in the real world and generate large volumes of high-fidelity data. As these systems scale, we need robust software infrastructure to ensure that motor control, testing, telemetry, and data pipelines are repeatable, observable, and usable across teams.
This is a hands-on, builder role at the intersection of robotics hardware, system software, and data platforms. You’ll help research and engineering teams scale beyond one-off solutions by turning early prototypes into reliable, extensible systems that increase test throughput, improve data quality, and accelerate iteration.
Responsibilities
- Design and build motor and actuator test infrastructure, including control loops, data capture, and validation tooling
- Develop and standardize repeatable test stations that scale across hardware variants, labs, and teams
- Define and implement telemetry schemas and data contracts for robotic systems (commands, feedback, environment, failures), ensuring consistency across programs
- Build time-synchronized data pipelines to support debugging, replay, offline analysis, and training workflows
- Establish observability standards for robotic systems, including metrics, logging, diagnostics, anomaly detection, and dashboards
- Partner closely with robotics hardware, firmware, research, safety, and operations teams to ensure systems are reliable, safe, and extensible
- Identify and eliminate bottlenecks in data quality, test throughput, and system reliability as usage scales to more teams and more robots
- Drive architecture decisions that balance rapid experimentation with long-term maintainability, operational robustness, and scalability
- Support fleet and lab validation workflows by enabling consistent test execution across platforms (e.g., Lithium, Ber, Boron, Carbon, Aloha, Mimmic, Trossen)
- Contribute to system-level failure understanding by enabling instrumentation and workflows that accelerate failure triage and root cause analysis
Minimum Qualifications
- 7+ years of experience in software engineering, systems engineering, robotics engineering, or related fields
- 3+ years of experience working close to hardware, including motors, sensors, actuators, embedded systems, and/or embedded Linux environments
- Proven ability to design and build test frameworks or infrastructure for physical systems (labs, manufacturing tests, reliability rigs, end-of-line, or similar)
- Experience building data ingestion pipelines for high-frequency and/or real-time telemetry (including time sync, buffering, backpressure, and schema evolution)
- Systems engineering fundamentals: APIs, data schemas, failure modes, reliability, operational discipline, and maintainable interfaces
- Ability to operate effectively in ambiguous, fast-moving environments with evolving requirements
- Proven communication and collaboration skills across hardware, software, and research disciplines
Preferred Qualifications
- Experience in industrial robotics, automation, embedded Linux, or real-time systems
- Experience with robotics data formats, replay systems, simulation pipelines, or log-based debugging at scale
- Familiarity with observability tooling and practices (metrics, logging, tracing, dashboards, alerting)
- Experience supporting ML or research teams through infrastructure (data capture, labeling support, dataset generation, evaluation pipelines) rather than model development
- Prior experience bringing prototype systems into scaled, multi-team usage, including documentation, onboarding, and operational support
- Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
- Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
- Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies