Research Engineer, Inference

Normal Computing · New York City

Exclusive · Full-time · $250,000 – $325,000 · posted 2h ago

About the role

About Normal Computing

Normal Computing builds silicon that turns thermal noise from an obstacle into a computational resource. Conventional chips spend most of their energy forcing determinism onto physics; ours compute with it. Stochastic, in-memory, asynchronous: the result is 10-100× more AI inference per dollar, per watt.

We co-design the full stack: AI-native EDA systems in production with the world's largest semiconductor companies, and the advanced ASICs they make possible. We are backed by $85M+ from the world's leading deep-tech investors, and our team is made up of scientists, engineers, and operators from the labs that built modern computing.

Normal works as one team across New York, Silicon Valley, London, Copenhagen, and Seoul. We hire people who want the hardest version of their craft, across every discipline, at every seniority.

The Role

As a Research Engineer focused on inference, you will develop the computational methods that make AI inference run efficiently on Normal's thermodynamic hardware. The core challenge is not adapting standard GPU kernels to a new chip. It is rethinking how operations like attention, memory access, and long-context decoding behave when the underlying substrate uses stochastic analog computation in memory rather than conventional digital logic.

Normal's ASICs run the heaviest operations of large model inference inside memory itself. Your job is to develop the algorithms that exploit this natively: understanding which transformer workloads are well suited to stochastic analog execution, designing numerical methods that map onto the hardware's physical dynamics, and validating them against real silicon or high-fidelity simulation.

This is a co-design role. The hardware and the algorithms are developed in parallel, which means you will influence architectural decisions, not just implement against a fixed spec. The strongest candidates have a deep understanding of both large model inference and the mathematics of stochastic systems, and have built things that run on real hardware, not just in theory.

What You'll Own

  • Algorithm Development: Develop algorithms for transformer inference workloads running on stochastic analog processing-with-memory hardware.

  • Hardware Co-Design: Work directly with hardware and architecture teams to shape what the chip can and should compute natively.

  • Numerical Methods: Design numerical methods that exploit thermal noise and analog dynamics rather than working around them.

  • Evaluation & Benchmarks: Build evaluation frameworks and benchmarks that characterize algorithm behavior on real hardware or simulation.

  • Workload Translation: Translate insights about model workloads into constraints and opportunities for hardware design.

  • Rapid Prototyping: Prototype and iterate rapidly as hardware evolves from simulation to silicon.

What Makes You a Great Fit

  • Deep understanding of large model inference: attention mechanisms, KV cache, long-context decoding, memory bandwidth constraints

  • Experience with inference optimization: quantization, sparsity, kernel fusion, or memory-efficient attention

  • Familiarity with stochastic systems, probabilistic methods, numerical analysis, or analog computation

  • Experience implementing algorithms close to hardware, not just in high-level frameworks

  • Comfort reasoning from first principles about what a novel substrate can do efficiently

  • Track record of taking ideas from theory to working implementation on real hardware

  • Strong programming skills in Python and at least one systems language

  • Collaborative instinct and ability to work across hardware, architecture, and software teams

Bonus Points

  • PhD in machine learning, applied mathematics, physics, electrical engineering, or a related field

  • Exposure to analog or mixed-signal systems, in-memory compute, or non-von-Neumann architectures

  • Experience working on hardware that did not yet exist when you joined

  • Publications or open-source work in efficient inference, stochastic algorithms, or novel computing

Equal Employment Opportunity Statement

Normal Computing is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other legally protected status.

Accessibility Accommodations

Normal Computing is committed to providing reasonable accommodations to individuals with disabilities. If you need assistance or an accommodation due to a disability, please let us know at accommodations@normalcomputing.com.

Privacy Notice

By submitting your application, you agree that Normal Computing may collect, use, and store your personal information for employment-related purposes in accordance with our Privacy Policy.

Skills

  • Python
  • TensorFlow
  • PyTorch
  • NumPy
  • SQL
