What are the responsibilities and job description for the Staff Machine Learning Architect position at Neurophos?

About Neurophos

We are developing an ultra-high-performance, energy-efficient photonic AI inference system. We’re transforming AI computation with the first-ever metamaterial-based optical processing unit (OPU).

As AI adoption accelerates, data centers face significant power and scalability challenges. Traditional solutions are struggling to keep up, leading to rapidly rising energy consumption and costs. We’re solving both problems with an OPU that integrates over one million micron-scale optical processing components on a single chip. This architecture will deliver up to 100 times the energy efficiency of existing solutions while significantly improving large-scale AI inference performance.

We’ve assembled a world-class team of industry veterans and recently raised a $110M Series A led by Gates Frontier. Participants include M12 (Microsoft’s Venture Fund), Carbon Direct Capital, Aramco Ventures, Bosch Ventures, Tectonic Ventures, Space Capital, and others. We have also been recognized on the EE Times Silicon 100 list for several consecutive years.

Join us and shape the future of optical computing!

Location: Austin, TX or San Francisco, CA. Full-time onsite position.

Position Overview

We are seeking an experienced machine learning architect to lead the porting and optimization of large language models (LLMs), diffusion models, and other ML applications to our revolutionary optical inference engines. This role is critical to demonstrating the full potential of our metamaterial-based optical processing units (OPUs) by adapting state-of-the-art AI models to leverage our ultra-high-throughput, low-precision compute architecture. The ideal candidate will bridge the gap between cutting-edge ML research and novel hardware capabilities, ensuring customers can seamlessly deploy their AI workloads on Neurophos hardware.

Key Responsibilities

Lead the porting of LLM applications, diffusion models, and visual ML applications to Neurophos optical inference engines
Adapt models from diverse sources, including GitHub, Hugging Face, other open-source repositories, and customer private models
Work with models in various formats, including PyTorch, Triton, JAX, and emerging frameworks
Develop and implement quantization strategies to migrate models from higher-precision formats (FP8, INT8, and above) to our optimized 4-bit precision (FP4/INT4) for weights and activations
Design and execute re-quantization, retraining, and other model adaptation techniques to minimize accuracy loss during precision reduction
Create tools and workflows, or integrate third-party ones, for efficient model porting and optimization
Optimize GEMM operations for high-throughput execution
Develop benchmarking methodologies to measure and validate model quality post-porting, including perplexity metrics and other quality indicators
Collaborate with hardware and software teams to co-optimize model architectures for optical compute characteristics
Publish research papers on novel optimization techniques and methodologies (with appropriate IP protection)

Qualifications

MS or PhD in Computer Science, Data Science, Machine Learning, Mathematics, or a related field
7 years of experience in machine learning engineering, with at least 3 years focused on model optimization and deployment
Deep expertise in neural network quantization techniques, including post-training quantization (PTQ) and quantization-aware training (QAT)
Strong proficiency in PyTorch and familiarity with other ML frameworks (JAX, Triton, TensorFlow)
Hands-on experience with transformer architectures, LLMs, and diffusion models
Experience with low-precision inference optimization (INT8, FP8, or lower)
Strong understanding of GEMM operations and linear algebra optimizations for deep learning
Experience with model evaluation metrics, including perplexity, accuracy, and benchmark suites
Track record of successfully deploying ML models on specialized hardware accelerators
Excellent communication skills with the ability to collaborate across hardware and software teams

Preferred Skills

Experience with sub-8-bit quantization (INT4, FP4) and mixed-precision inference
Familiarity with the Hugging Face Transformers library and model hub ecosystem
Experience with ONNX, TensorRT, or other model optimization frameworks
Background in analog or optical computing architectures
Knowledge of in-memory computing paradigms and matrix-vector multiplication acceleration
Published research in model compression, quantization, or efficient inference
Experience with large-scale batch inference optimization
Familiarity with prefill vs. decode optimization strategies in LLM inference

What We Offer

A pivotal role in an innovative startup redefining the future of AI hardware.
A collaborative and intellectually stimulating work environment.
Competitive compensation, including salary and equity options.
Good benefits: health, vision, dental, 401(k), etc.
Opportunities for career growth and future team leadership.
Access to cutting-edge technology and state-of-the-art facilities.
Opportunity to publish research and contribute to the field of efficient AI inference.

If you are passionate about pushing the boundaries of model optimization and driving impact in the semiconductor industry, we want to hear from you! This is a rare opportunity to work on a game-changing technology at the intersection of photonics and AI. As part of our elite team, you’ll contribute to a platform that redefines computational performance and accelerates the future of artificial intelligence. Be a key player in bringing this transformative innovation to the world.
