Research Engineer, Inference

Normal Computing · New York City

Exclusive · Full-time · $250,000 – $325,000 · Posted 2h ago

About the role

About Normal Computing

Normal Computing builds silicon that turns thermal noise from an obstacle into a computational resource. Conventional chips spend most of their energy forcing determinism onto physics; ours compute with it. Stochastic, in-memory, asynchronous: the result is 10-100× more AI inference per dollar, per watt.

We co-design the full stack: AI-native EDA systems in production with the world's largest semiconductor companies, and the advanced ASICs they make possible. We are backed by $85M+ from the world's leading deep-tech investors, and our team is made up of scientists, engineers, and operators from the labs that built modern computing.

Normal works as one team across New York, Silicon Valley, London, Copenhagen, and Seoul. We hire people who want the hardest version of their craft, across every discipline, at every seniority.

The Role

As a Research Engineer focused on inference, you will develop the computational methods that make AI inference run efficiently on Normal's thermodynamic hardware. The core challenge is not adapting standard GPU kernels to a new chip. It is rethinking how operations like attention, memory access, and long-context decoding behave when the underlying substrate uses stochastic analog computation in memory rather than conventional digital logic.

Normal's ASICs run the heaviest operations of large model inference inside memory itself. Your job is to develop the algorithms that exploit this natively: understanding which transformer workloads are well-suited to stochastic analog execution, designing numerical methods that map onto the hardware's physical dynamics, and validating them against real silicon or high-fidelity simulation.

This is a co-design role. The hardware and the algorithms are developed in parallel, which means you will influence architectural decisions, not just implement against a fixed spec. The strongest candidates have a deep understanding of both large model inference and the mathematics of stochastic systems, and have built things that run on real hardware, not just in theory.

What You'll Own

Algorithm Development: Develop algorithms for transformer inference workloads running on stochastic analog processing-with-memory hardware.
Hardware Co-Design: Work directly with hardware and architecture teams to shape what the chip can and should compute natively.
Numerical Methods: Design numerical methods that exploit thermal noise and analog dynamics rather than working around them.
Evaluation & Benchmarks: Build evaluation frameworks and benchmarks that characterize algorithm behavior on real hardware or simulation.
Workload Translation: Translate insights about model workloads into constraints and opportunities for hardware design.
Rapid Prototyping: Prototype and iterate rapidly as hardware evolves from simulation to silicon.

What Makes You a Great Fit

Deep understanding of large model inference: attention mechanisms, KV cache, long-context decoding, memory bandwidth constraints
Experience with inference optimization: quantization, sparsity, kernel fusion, or memory-efficient attention
Familiarity with stochastic systems, probabilistic methods, numerical analysis, or analog computation
Experience implementing algorithms close to hardware, not just in high-level frameworks
Comfort reasoning from first principles about what a novel substrate can do efficiently
Track record of taking ideas from theory to working implementation on real hardware
Strong programming skills in Python and at least one systems language
Collaborative instinct and ability to work across hardware, architecture, and software teams

Bonus Points

PhD in machine learning, applied mathematics, physics, electrical engineering, or a related field
Exposure to analog or mixed-signal systems, in-memory compute, or non-von-Neumann architectures
Experience working on hardware that did not yet exist when you joined
Publications or open-source work in efficient inference, stochastic algorithms, or novel computing

Equal Employment Opportunity Statement

Normal Computing is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other legally protected status.

Accessibility Accommodations

Normal Computing is committed to providing reasonable accommodations to individuals with disabilities. If you need assistance or an accommodation due to a disability, please let us know at accommodations@normalcomputing.com.

Privacy Notice

By submitting your application, you agree that Normal Computing may collect, use, and store your personal information for employment-related purposes in accordance with our Privacy Policy.

Skills

Python
TensorFlow
PyTorch
NumPy
SQL
