Are you our missing ingredient?
The Zonal group are one of the UK’s largest technology providers to the hospitality industry.
Our products are used by over 11,000 pubs, restaurants and hotels. Customers include national brands like Pizza Express, JD Wetherspoon and All Bar One.
If you’ve booked a table or hotel room, ordered and paid for food and drinks, received loyalty offers, or downloaded your favourite hangout’s app, you will likely have used a Zonal product.
We are a family business with Scottish roots. We operate from our modern head office in Edinburgh, our division in Staffordshire, our Innovation Centre in Abingdon and our hotel management solutions base in Cardiff.
The makeup of our systems is changing rapidly and we’d like you to play a key part in helping us drive this forward. We’re moving towards a modern DevOps landscape with technologies like Docker, IaC and microservices.
Initially, we are working with our own hosted data centre infrastructure; however, you’ll play a key role in our drive towards a future hybrid public-cloud position.
We offer remote working within the UK. We also reward our staff well with very competitive salaries, generous holiday allocation and well-structured career development plans.
What you’ll do
This role sits within the Zonal Managed Services team and is part of the wider Zonal Technical Services business unit.
Our suite of SaaS, distributed systems and product integrations helps our customers run their critical business operations and, in turn, provide their own customers with industry-leading hospitality technology products.
You’ll play a key role in the formation of a new area within Zonal. Our Production Operations (ProdOps) team aims to drive operational excellence and customer focus into the operation of our SaaS hosted application suite.
As a Support Technician, you’ll bring your experience of providing level 4 support for distributed and centralised systems, integrations, and deployments to the ProdOps team, which underpins our industry-leading SaaS solution.
You will triage and take ownership of incidents and requests from level 3 Helpcentre support, working closely with teammates in SRE, Development and Platform Delivery to provide a responsive, stateful incident workflow. Along the way, you will identify opportunities to share knowledge, enhance documentation, improve tooling and reduce toil.
You and your team will:
- Build strong, collaborative relationships, acting as the glue between in-house customer-facing support and delivery teams and platform engineering (R&D) teams
- Own, run and continually improve:
  - Incident triage and response, through to resolution
  - Logging, monitoring, and alerting services and infrastructure
  - Dashboards, internal and external status pages
  - Automation and tooling of manual processes
  - Team processes, driving technical debt down
  - Capacity analytics and demand management
  - Disaster Recovery models, planning and testing
- Reduce toil (work that is largely manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as our services grow), maximising engineering capacity
- Bring expertise and a streetwise perspective to problem solving and to reducing operational complexity
- Participate in On-Call cover and Incident Response
- Proactively manage delivery of key SLOs covering detection, early warning and self-healing
- Act as key stakeholders in the technical debt reduction of our Products
Who you are
You will have a background in deploying, managing, and operating mission-critical SaaS and distributed systems, having spent at least some of your career as a member of a fast-paced product engineering, web operations and/or platform delivery team.
Ideally you will have a demonstrable track record of operating systems in hybrid datacentre/cloud infrastructures.
- A self-starter with a passion for technology and problem solving and excellent analytical skills, who thrives in a fast-paced, autonomous environment
- Solid experience in scripting, tooling, automation, and data access; experience with PowerShell, T-SQL and MySQL would be an advantage
- Excellent understanding of traditional ops in a virtualised Windows/Linux environment
- Knowledge and experience of monitoring frameworks such as Zabbix, and of data retrieval and event correlation with Graylog
- Quick to spot opportunities and new capabilities in technologies
- Familiar with Docker and container ecosystems
- Comfortable in complex provisioning and deployment scenarios
- A strong, organised collaborator and a safe pair of hands
- A team player who enjoys influencing change and representing the operational and customer impacts in technical debt prioritisation
- Comfortable interacting with mixed audiences of Support, Product Delivery, Engineering, and Incident Management
- At least 3 years’ experience operating and supporting production software