
Hedge Fund #SQ9
Location: Bala Cynwyd, PA (Philadelphia Area)
Department: Infrastructure Engineering & Systems Optimization
Start Date: Immediate
Job Type: Full-Time | Experienced Hire
About the Role
We’re looking for an HPC Engineer to lead the design, management, and performance tuning of our high-performance computing environment. This is your chance to work at the core of a fast-moving, technology-driven organization that powers complex research and trading systems at scale.
You’ll build resilient compute clusters, streamline automation, and ensure our platforms remain highly available, ultra-fast, and future-ready.
What You’ll Be Doing
Automate and Optimize
- Build internal tooling (Python, Bash) for diagnostics, monitoring, and maintenance
- Automate daily operations to free up time for performance tuning and strategic improvements
Collaborate Across Teams
- Partner with quants, developers, and app teams to solve infrastructure bottlenecks
- Work hand-in-hand with storage and networking teams to deliver end-to-end solutions
Performance Engineering
- Tune OS-level settings for Linux and Windows across large-scale distributed systems
- Conduct deep dives into system failures and latency issues: identify, fix, and improve
Cluster & Scheduler Management
- Manage the entire HPC stack: compute nodes, job schedulers (SLURM, HTCondor), and interconnects
- Deploy and optimize parallel filesystems (Lustre, VAST, GPFS)
Scalable Storage Engineering
- Build and operate high-throughput, low-latency storage platforms
- Ensure smooth data delivery for compute-intensive and real-time workflows
Capacity Planning & Forecasting
- Analyze usage trends, model future resource needs, and design scalable systems
Troubleshooting at Scale
- Use system tools (sysctl, tcpdump, Wireshark, strace, procmon) to identify and resolve root causes quickly
✅ What We’re Looking For
- Bachelor’s degree in Computer Science, Engineering, or a similar technical field
- 5+ years of experience in HPC systems engineering on Linux (Windows experience a plus)
- Deep understanding of OS internals, I/O performance tuning, and system profiling
- Proven experience with parallel filesystems: Lustre, GPFS, VAST
- Proficiency with resource schedulers such as SLURM, HTCondor, or similar
- Strong scripting ability in Python and Bash
- Solid communicator and cross-functional collaborator
🎯 Why Join Us?
- High-impact role driving the backbone of our research and trading platforms
- Work with brilliant minds across infrastructure and quantitative research
- Build bleeding-edge systems that push the limits of speed, scale, and reliability