Machine Learning Engineer - ML Training Platform at Pluralis Research (Expired)

AI Jobs Australia

Pluralis Research

Melbourne, VIC

hybrid

Full Time / Permanent

Posted 4 months ago

Pluralis Research carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.

We're looking for Senior/Staff engineers with 5+ years of experience in distributed systems and large-scale ML training. You'll be implementing a novel substrate for distributed training of ML models that works over consumer-grade internet connections.

Responsibilities

Distributed Training Architecture & Optimization

Design and implement large-scale distributed training systems optimized for heterogeneous hardware operating under low-bandwidth, high-latency conditions.
Develop and optimize parallel training strategies (data, tensor, and pipeline parallelism) with custom sharding techniques that minimize communication overhead.
Optimize GPU utilization, memory efficiency, and compute performance across distributed nodes.
Implement robust checkpointing, state synchronization, and recovery mechanisms for long-running, fault-prone training jobs.
Build monitoring and metrics systems to track training progress, model quality, and system bottlenecks.

Decentralized Networking & Resilience

Architect resilient training systems where nodes can fail, networks can partition, and participants can dynamically join or leave.
Design and optimize peer-to-peer topologies for decentralized coordination across non-co-located nodes.
Implement NAT traversal, peer discovery, dynamic routing, and connection lifecycle management.
Profile and optimize communication patterns to reduce latency and bandwidth overhead in multi-participant environments.

What You'll Bring

Strong experience building and operating distributed systems in production.
Hands-on expertise with distributed training frameworks (FSDP, DeepSpeed, Megatron, or similar).
Deep understanding of training parallelism (data, tensor, and pipeline parallelism).
Expert-level Python with production experience (concurrency, error handling, retry logic, clean architecture).
Strong networking fundamentals: P2P systems, gRPC, routing, NAT traversal, distributed coordination.
Experience optimizing GPU workloads, memory management, and large-scale compute efficiency.

What We Offer

Equity-heavy compensation with meaningful ownership in a mission-driven company
Competitive base salary for senior engineering roles in Australia
Visa sponsorship available for exceptional candidates
Remote-first with optional access to our Melbourne hub
World-class team: teammates were previously at Google, Amazon, Microsoft, and leading startups

Backed by Union Square Ventures and other tier-1 investors, we're a world-class, deeply technical team of ML researchers and engineers.

Pluralis is unapologetically ideological. We believe the world will be a better place if we succeed in what we are attempting, and we see Protocol Learning as the only plausible way to prevent a handful of massive corporations from monopolising model development, access, and release, and from achieving massive economic capture. If this resonates, please apply.