Site Reliability Engineer (SRE)

Capital One

This is a full-time position in Haslet, TX, posted September 9, 2021.

Knolls 2 (12036), United States of America, Glen Allen, Virginia

We are looking for an experienced Site Reliability Engineer with an operational and/or site reliability engineering background and a passion for providing superior system availability and customer experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As an SRE, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of critical services.
  • Effectively manage troubleshooting and recovery of complex production incidents, ranging from low to critical impacts
  • Drive incident resolution through a systematic problem-solving approach, coupled with a strong sense of ownership and drive
  • Actively participate in the team's Agile stories (project work) to streamline and enhance the team's day-to-day operations
  • Create, manage and utilize appropriate technical procedural documentation (runbooks)
  • Proactively monitor all of the applications and infrastructure behind Capital One's external and internal customer-facing services, including their availability, latency, performance, and capacity
  • Influence resiliency and scalability in production environments in Amazon Web Services (AWS)
  • Identify opportunities and develop proactive automated monitoring and alerting solutions by utilizing available tools (Splunk, Datadog, etc.)
  • Assist with conducting Root Cause Analysis (RCA) on critical production outages, and develop and implement mitigation strategies
  • Utilize production support expertise to influence and support new designs, architectures, standards, and methods, maintaining stability and availability for large-scale distributed systems
  • Proactively identify and implement opportunities for automation of routine maintenance tasks, data gathering and resolution of common issues
  • Continuously seek to develop new skills and technical expertise, as well as proactively share knowledge with others

Basic Qualifications:
  • Bachelor's degree
  • At least 2 years of experience in technology production support

Preferred Qualifications:
  • AWS Associate level certification (Solutions Architect, SysOps Administrator, or Developer)
  • 2+ years of experience with Linux, UNIX, Python, Ruby, Go, JavaScript, or NoSQL
  • 2+ years of experience with AWS, Azure or GCP
  • 2+ years of experience with web API services
  • 2+ years of experience with Splunk, ELK, New Relic, or Datadog monitoring and alerts

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.

by Jobble