Site Reliability Engineer - Data Platform (Remote)

  • Full-Time
  • Remote
  • Posted on March 16, 2022

Fixed Income Firm #016

Responsibilities:

Support the release of new services through capacity planning, rollout planning and release management.
In collaboration with data engineers, define and implement monitoring strategies, define SLAs and error budgets.
Build and deploy automation tooling for supported services and data pipelines.
Troubleshoot and remediate issues with the services you manage.
Manage and run critical production services.
Track and execute continuous improvements.

Required Qualifications:

Strong understanding of Linux. Windows a plus.
Strong proclivity for automation and for DevOps practices and tooling such as Git, Ansible and Terraform.
Strong experience working with monitoring and logging tools: Prometheus, ELK, Grafana.
Good programming experience in at least one of Bash, Python, C++ or Java.
Familiarity with container orchestration platforms such as Kubernetes or Nomad.
Understanding of general networking protocols such as TCP/IP, DNS and TLS.
Broad exposure to at least one cloud platform: AWS, Google Cloud or Azure.
Experience with PostgreSQL, SQL Server, Oracle, Redis or Kafka a strong plus.
Familiarity working with the open source software community a strong plus.
Strong verbal and written communication skills.
Financial Services experience a plus but not required.
BS or higher in a technical field: CS, Physics, Maths, etc.

To apply for this job, email your details to Graham.Gates@TechExecOnline.com