Senior Linux Engineer – Architect & Optimize High-Performance Compute Systems

Hedge Fund #003

We are seeking a highly skilled Linux Engineer to help scale and optimize a mission-critical global compute platform. This role requires a deep understanding of Linux internals, automation, system performance tuning, and networking. If you’re passionate about low-latency, high-availability compute environments and thrive on solving complex technical challenges, this is an opportunity to push the limits of systems engineering.

Key Responsibilities

🛠 Compute Optimization & Performance Tuning

Optimize system performance through low-level kernel tuning, NUMA configurations, CPU affinity, and hardware acceleration techniques.
Analyze and enhance I/O throughput, network stack optimizations, and memory management for high-demand workloads.
Conduct deep-dive root cause analysis on performance bottlenecks, leveraging tools like perf, eBPF, tcpdump, strace, and SystemTap.

⚙️ Automation & Infrastructure as Code

Design, develop, and maintain automated provisioning pipelines for Linux compute clusters.
Implement infrastructure-as-code (IaC) principles using Ansible, Terraform, and Packer to manage large-scale deployments.
Develop self-healing automation scripts for system failures and service degradation using Python or Go.

🌎 Scalability & Compute Engineering

Architect large-scale bare-metal and virtualized compute environments with a focus on reliability and fault tolerance.
Manage large Linux server fleets, ensuring automated configuration management, fleet consistency, and rolling OS upgrades.
Drive adoption of containerization strategies (Docker, Kubernetes) and hybrid cloud infrastructure.

🔍 System Diagnostics & Incident Response

Troubleshoot complex distributed system issues, correlating network, OS, and application-level data.
Investigate and remediate low-latency networking challenges, including kernel bypass (DPDK, XDP), TCP tuning, and packet loss analysis.
Implement real-time monitoring solutions using Prometheus, Grafana, and the ELK stack to ensure infrastructure observability.

Required Qualifications

✔ 5+ years of experience designing, implementing, and managing Linux-based compute environments at scale.
✔ Deep expertise in Linux (RHEL, CentOS, Ubuntu), including kernel tuning, process scheduling, and file system performance.
✔ Strong proficiency in low-level OS debugging tools (gdb, perf, lsof, tcpdump, strace).
✔ Experience automating OS build, test, and release pipelines.
✔ Advanced knowledge of networking fundamentals (TCP/IP, DNS, DHCP, HTTP, BGP, VPN, load balancing).

Preferred Qualifications

🚀 Expertise in Python or Go for automation, system tooling, and infrastructure development.
🔧 Experience with API development for automation and integration with compute management services.
🏗 Strong infrastructure automation skills (Terraform, Ansible, Puppet, SaltStack).
🌐 Familiarity with Kubernetes, service mesh architectures, and cloud infrastructure (GCP, AWS, Azure).
🖥 Experience with large-scale virtualization (VMware, KVM, QEMU) and hypervisor performance tuning.

Why Join Us?

⚡ Lead Cutting-Edge Compute Engineering – Push the boundaries of high-performance Linux infrastructure.
🔍 Tackle Real-World Performance Challenges – Work on low-latency, high-throughput compute clusters.
🌎 Scale Globally – Engineer systems that support thousands of mission-critical workloads.
💡 Continuous Learning & Growth – Work alongside principal engineers and learn from industry experts.
