FortWorthRecruiter Since 2001
the smart solution for Fort Worth jobs

Senior Platform Reliability Engineer

Company: G-Research
Location: Dallas
Posted on: May 6, 2024

Job Description:

Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?

G-Research is a leading quantitative research and technology firm, with offices in London and Dallas. We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded.

This is a hybrid role based in our new Dallas infrastructure hub, where we work on the latest technologies in a cutting-edge environment.

The role

The Reliability Engineering team, part of our Platforms as a Service (PaaS) function, works with a variety of technologies, including multiple Kubernetes clusters, multiple database technologies, low-latency networks and big data warehouses across multiple regions around the globe.

We are actively seeking an experienced Site Reliability Engineer (SRE) with a proven track record in building up and bootstrapping SRE functions across multiple teams.

We want an individual who excels in ensuring the robustness, scalability, and fault tolerance of large-scale infrastructure. The ideal candidate will have a comprehensive understanding of the intricacies involved in architecting, deploying, and maintaining high-performance solutions, coupled with a track record of implementing and enhancing reliability measures across all infrastructure ecosystems.

This role demands hands-on experience in orchestrating resilient systems, fine-tuning performance, and implementing proactive strategies to mitigate potential downtime or disruptions. The successful candidate will play a pivotal role in driving the reliability, efficiency, and scalability of our infrastructure platform through innovative solutions and best-in-class practices.

In return, you will gain exposure to the latest hardware and software technologies in a forward-thinking company that values innovation, personal development and training.

Key responsibilities of the role include:

- Leading efforts to enhance existing practices across teams, fostering collaboration and synchronization to optimize system reliability and scalability
- Driving strategies for enhancing systems performance, leveraging innovative approaches to improve efficiency and streamline processes
- Implementing best practices for system reliability, fault tolerance, and scalability, ensuring alignment with evolving industry standards
- Cultivating a culture of continuous improvement, encouraging regular reviews and iterative enhancements to tools, methodologies, and processes
- Enhancing incident response processes by conducting comprehensive reviews, implementing improvements, and integrating lessons learned into future strategies
- Leading efforts to optimize capacity planning strategies, ensuring systems are prepared for future scaling while maximizing resource utilization
- Collaborating with security teams to fortify and enhance security measures within systems, ensuring compliance with evolving policies and standards
- Collaborating effectively with other SREs within PaaS and with colleagues in different time zones (Dallas and London)

Who are we looking for?

The successful candidate will be an experienced Platform Reliability Engineer who is enthusiastic about contributing to an automated, scalable, reliable and high-performing Infrastructure and Platform as a Service:

- A strong desire to continually learn about new technologies, approaches, and systems, along with the agility to work across multiple teams
- A strong communicator with excellent written communication skills for technical and non-technical audiences
- A self-starter with excellent problem-solving skills
- Proficient in Go or another programming language such as Python, Rust or Java for automation and development tasks
- Extensive Linux, networking and infrastructure knowledge
- Experience with CI/CD (preferably Jenkins and ArgoCD) and configuration management tools, such as Ansible and Terraform
- Experience deploying and running applications on Docker and Kubernetes, including the creation of Helm charts
- Familiarity with monitoring tools like Prometheus, Grafana, OpenTelemetry and the ELK stack (Elasticsearch, Logstash, Kibana), or similar
- Understanding of core SRE concepts and their implementation in platform engineering

Beneficial experience would include:

- Experience building and bootstrapping an SRE organization across multiple teams
- Experience working on large-scale infrastructure to improve performance, stability and efficiency

Why should you apply?

- Market-leading compensation plus annual discretionary bonus
- Informal dress code and excellent work/life balance
- Excellent paid time off allowance of 25 days
- Sick days, military leave, and family and medical leave
- Generous 401(k) plan
- 16 weeks of fully paid parental leave
- Medical and prescription, dental, and vision insurance
- Life and Accidental Death & Dismemberment (AD&D) insurance
- Employee Assistance and Wellness programs
- Generous relocation allowance and support
- Great selection of office snacks, and hot and cold drinks
- On-site gym and car parking

Keywords: G-Research, Fort Worth, Senior Platform Reliability Engineer, Accounting, Auditing, Dallas, Texas
