Site Reliability Engineer
This role arises from BigChange's rapid business growth. JobWatch is a multi-tenanted SaaS platform with mobile applications that provides 24×7 transaction processing for operational data from all customers' business and resource locations, minute by minute. BigChange manages Quality and Information Security to ISO standards and focuses on continuous improvement of IT services, processes, cloud infrastructure and software. This hands-on role analyses availability, support and security incidents, trends and change to manage and deliver a backlog of continuous improvement actions and projects.
- Operate our cloud infrastructure to the highest levels of availability, reviewing alerts, trends, logs and metrics through proactive daily system monitoring.
- Own availability and security incident responses: triage, troubleshoot and remediate incidents, escalating where appropriate, and own root cause analysis to identify improvement actions/projects.
- Progress a backlog of improvement actions/projects and security vulnerabilities, elaborating, contributing, collaborating and seeing them through to completion.
- Contribute to, maintain and improve process, infrastructure and application documentation, including standard operating procedures and customer-facing technical documents.
- Manage infrastructure and application software change, taking responsibility for planning, process validation, and internal and external impact communication.
- Review and verify automated scheduled jobs, including backups and patching, and carry out routine BAU auditing tasks and run books.
- Administer system access rights, including managing, maintaining and auditing approval and revocation requests, and review existing access with a least-privilege approach.
- Identify areas for improvement, collaborating with internal teams to improve tools and processes in a constant effort to improve platform availability.
- Help maintain the highest security standards by proactively responding to notifications, reviewing events and dashboards, and identifying response and improvement actions.
- Compile management-level reports including KPIs, trend analysis, system availability metrics and security incident remediation.
- Experience working in a cloud environment (AWS preferred, Azure useful)
- Extensive experience administering Windows and Linux/Unix, including system diagnostics for troubleshooting incidents and events.
- Experience of working in an ISO 9001/ISO 27001 environment.
- Working knowledge of system and network performance and availability monitoring.
- Strong experience of log file analysis (CloudWatch Logs, including SQL querying).
- Familiarity with networking technologies including firewalls, routing, and load balancing.
- Exposure to infrastructure-as-code (CloudFormation).
- Exposure to configuration management (Ansible, PowerShell), version control and pipelines (Git, Azure DevOps, Jenkins).
- Basic scripting ability using Bash, PowerShell or Python.
- Writes clear and concise technical documentation, appropriate to the target audience.
- Approachable and friendly, a trusted source of technical advice to others in the company
- Agile management approach including backlog, documentation, review and testing
- An ability to advise senior stakeholders, work comfortably without definition and apply a progressive technical approach to any problem
- You’ll be an inquisitive technologist who naturally encourages others to be the same. You will have good people skills and the ability to handle objections where required.
- Ability to work individually and within a team
- Ability to communicate at all levels, both in writing and verbally
- You will have an organised and proactive approach to work
- Equally comfortable interacting with colleagues face-to-face and remotely.
- ITIL and ITSM certification
- AWS Certified Associate level, Microsoft Azure Certification.
- Exposure to the following technologies:
- .NET Core, PHP, Node.js, Python
- EC2, ECS, Lambda, VPC, ELB, CloudFront, Route53, S3, CodeDeploy, SecurityHub, GuardDuty, SES, RDS, ElastiCache, KMS, WAF, CloudWatch
- MSSQL, MySQL, Redis, MongoDB, RabbitMQ
- Salary: Competitive
- Expenses paid for all overnight stays, subsistence and mileage whilst on business.
- 33 days holiday, plus bank holidays, plus ‘BigChange Birthday’.
- Pension plan (NEST)
- Health and Wellbeing contribution of £20 gross per month
- Free Virtual Yoga Classes
- Company-contributed Cash Health Plan
- Flexi Friday
- “Motivational Mondays” – inspiring talks monthly from extraordinary people.
- Local fruit delivered weekly to the office.
- Being part of a supportive team with the ability to learn new skills and grow within the company.
- Experience cutting-edge technology and be part of a company that is shaping the future.
All rewards are agreed between each employee and the company based on their role/offer letter and may differ from the standard.
Location: Office Based
Reporting to: Technical Support Manager
To apply for this job, use the link below and email us your details.