Site Reliability Engineer

  • Salary:
    negotiable / month
  • Posted:
    2 days ago

Versent is an Australian-born technology company focused on architecting, building and operating cloud-native applications, data streams, platforms and services. Our solutions are centred around AWS and best-of-breed technology. From a standing start in 2014, we’ve grown to over 360 family members, both locally and internationally. With a diversified offering in professional services, managed services and product, we are poised for significant growth in 2020.

Do you want to automate? Do you think of infrastructure as code? Do you enjoy building tools and software? If you like deep dives into code to identify and fix bugs, you will love this role!

Due to expansion, we are seeking an experienced SRE. You will be responsible for maintaining the availability and reliability of the platform, but 40-50% of your time will be spent developing and improving what’s there, e.g. improving the monitoring infrastructure. You will work with some of Versent’s key enterprise clients in a customer-facing role. This is not a reactive role; it is a highly proactive engineering role that adds clear value to our clients.

Day-to-day responsibilities will include:

  • Deploy, support and monitor new and existing services, platforms, and application stacks.
  • Design, create and deliver software to improve the availability, scalability and security of our clients’ mission-critical infrastructure and platforms.
  • Build and manage systems, infrastructure and applications through automation.
  • Provide performance tuning and recommendations for mission-critical platforms and infrastructure.
  • Identify, troubleshoot and resolve issues in a live production environment that supports mission-critical applications for our clients.
  • Create useful and effective documentation, including postmortems, production readiness reviews and playbooks.
  • Participate in an on-call roster.

Skills required:

  • Proven ability to write programs in a high-level programming language such as Java, Python or Go.
  • Experience designing and deploying infrastructure and applications on public cloud platforms such as AWS.
  • Experience with, and understanding of, continuous integration, delivery, deployment and testing.
  • Experience handling large numbers of diverse systems with configuration management systems such as Ansible or Puppet.
  • Experience working with *nix systems.
  • Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI model, subnetting and load-balancing strategies.
  • Familiarity with distributed systems concepts is a plus, including the CAP theorem, microservices and the Twelve-Factor App.
  • Experience with Kubernetes, Envoy, Prometheus and/or Docker would be an asset.
  • Passion for eliminating repetitive manual processes using automation.