Hedge Fund #SQ9

Location: Bala Cynwyd, PA (Philadelphia Area)
Department: Infrastructure Engineering & Systems Optimization
Start Date: Immediate
Job Type: Full-Time | Experienced Hire

About the Role

We’re looking for an HPC Engineer to lead the design, management, and performance tuning of our high-performance computing environment. This is your chance to work at the core of a fast-moving, technology-driven organization that powers complex research and trading systems at scale.

You’ll build resilient compute clusters, streamline automation, and ensure our platforms remain highly available, ultra-fast, and future-ready.

What You’ll Be Doing

Automate and Optimize

Build internal tooling (Python, Bash) for diagnostics, monitoring, and maintenance
Automate daily operations to free up time for performance tuning and strategic improvements

Collaborate Across Teams

Partner with quants, developers, and app teams to solve infrastructure bottlenecks
Work hand-in-hand with storage and networking teams to deliver end-to-end solutions

Performance Engineering

Tune OS-level settings for Linux and Windows across large-scale distributed systems
Conduct deep dives into system failures and latency issues: identify the cause, fix it, and improve the system

Cluster & Scheduler Management

Manage the entire HPC stack: compute nodes, job schedulers (SLURM, HTCondor), and interconnects
Deploy and optimize parallel filesystems (Lustre, VAST, GPFS)

Scalable Storage Engineering

Build and operate high-throughput, low-latency storage platforms
Ensure smooth data delivery for compute-intensive and real-time workflows

Capacity Planning & Forecasting

Analyze usage trends, model future resource needs, and design scalable systems

Troubleshooting at Scale

Use system tools (sysctl, tcpdump, Wireshark, strace, procmon) to identify and resolve root causes quickly

What We’re Looking For

Bachelor’s degree in Computer Science, Engineering, or a similar technical field
5+ years of experience in HPC systems engineering on Linux (Windows experience a plus)
Deep understanding of OS internals, I/O performance tuning, and system profiling
Proven experience with parallel filesystems: Lustre, GPFS, VAST
Proficient with resource schedulers such as SLURM or HTCondor
Strong scripting ability in Python and Bash
Solid communicator and cross-functional collaborator

Why Join Us?

High-impact role driving the backbone of our research and trading platforms
Work with brilliant minds across infrastructure and quantitative research
Build bleeding-edge systems that push the limits of speed, scale, and reliability