Machine Learning Engineer - ML Training Platform

This role has expired and is no longer accepting applications.
Pluralis Research
Melbourne, VIC
hybrid
Full Time / Permanent

Posted 3 months ago

Pluralis Research carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.

We're looking for Senior/Staff engineers with 5+ years of experience in distributed systems and large-scale ML training. You'll be implementing a novel substrate for distributed training of ML models over consumer-grade internet connections.

Responsibilities

Distributed Training Architecture & Optimization

  • Design and implement large-scale distributed training systems optimized for heterogeneous hardware operating under low-bandwidth, high-latency conditions.
  • Develop and optimize parallel training strategies (data, tensor, and pipeline parallelism) with custom sharding techniques that minimize communication overhead.
  • Optimize GPU utilization, memory efficiency, and compute performance across distributed nodes.
  • Implement robust checkpointing, state synchronization, and recovery mechanisms for long-running, fault-prone training jobs.
  • Build monitoring and metrics systems to track training progress, model quality, and system bottlenecks.

Decentralized Networking & Resilience

  • Architect resilient training systems where nodes can fail, networks can partition, and participants can dynamically join or leave.
  • Design and optimize peer-to-peer topologies for decentralized coordination across non-co-located nodes.
  • Implement NAT traversal, peer discovery, dynamic routing, and connection lifecycle management.
  • Profile and optimize communication patterns to reduce latency and bandwidth overhead in multi-participant environments.

What You'll Bring

  • Strong experience building and operating distributed systems in production.
  • Hands-on expertise with distributed training frameworks (FSDP, DeepSpeed, Megatron, or similar).
  • Deep understanding of parallel training strategies (data, tensor, and pipeline parallelism).
  • Expert-level Python with production experience (concurrency, error handling, retry logic, clean architecture).
  • Strong networking fundamentals: P2P systems, gRPC, routing, NAT traversal, distributed coordination.
  • Experience optimizing GPU workloads, memory management, and large-scale compute efficiency.

What We Offer

  • Equity-heavy compensation with meaningful ownership in a mission-driven company
  • Competitive base salary for senior engineering roles in Australia
  • Visa sponsorship available for exceptional candidates
  • Remote-first with optional access to our Melbourne hub
  • World-class team: teammates were previously at Google, Amazon, Microsoft, and leading startups

Backed by Union Square Ventures and other tier-1 investors, we're a world-class, deeply technical team of ML researchers and engineers.

Pluralis is unapologetically ideological. We believe the world will be a better place if we succeed in what we are attempting, and we see Protocol Learning as the only plausible way to prevent a handful of massive corporations from monopolising model development, access and release, and from achieving massive economic capture. If this resonates, please apply.