AI - Site Reliability Engineer

Deutsche Bank

Software Engineering, Data Science
Pune, Maharashtra, India
Posted on Monday, June 3, 2024

Job Description:

Job Title: Site Reliability Engineer


Role Description:

At the “Service Solutions and AI” Tribe, our mission is to revolutionize our Private Bank process landscape by implementing holistic, front-to-back process automation. We are committed to enhancing efficiency, agility, and innovation, with a keen focus on aligning every step of our process with the customer’s needs and expectations. Our dedication extends to driving innovative technologies, such as AI & workflow services, to foster continuous improvement. We aim to deliver ‘best in class’ solutions across products, channels, brands, and regions, thereby transforming the way we serve our customers and setting new benchmarks in the industry.

As a Site Reliability Engineer (SRE), your role is at the heart of our operations. You are tasked with maintaining the reliability and efficiency of our software systems. Your responsibilities include diagnosing and resolving system issues, automating manual processes, monitoring system performance, and implementing improvements. You will work closely with the development team to ensure scalable and efficient system designs. Additionally, you will handle system incidents as part of on-call rotations and continuously evaluate and improve existing systems and processes. Your contributions are vital in ensuring the smooth operation of our systems, leading to user satisfaction and the overall success of our business.

Join us on our journey to redefine the future of banking with AI and service solutions.

What we’ll offer you

As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:

  • Best in class leave policy
  • Gender-neutral parental leave
  • 100% reimbursement under childcare assistance benefit (gender-neutral)
  • Sponsorship for industry-relevant certifications and education
  • Employee Assistance Program for you and your family members
  • Comprehensive hospitalization insurance for you and your dependents
  • Accident and term life insurance
  • Complimentary health screening for employees aged 35 and above

Your key responsibilities

  • System Troubleshooting and Problem Solving: You will diagnose and resolve system issues to ensure optimal performance and uptime. This includes debugging complex network issues, optimizing cloud resource allocation, and implementing robust monitoring tools.
  • Automation: You will automate manual processes, striving for efficiency and accuracy. This includes writing scripts and using infrastructure as code (IaC) tools to automate system deployments and configurations. You will also build software for IT operations, for example to automate health checks, functional and non-functional monitoring and alerting, and automatic recovery from failures.
  • Performance Tuning: You will monitor system performance and implement tuning improvements to ensure high availability and optimal load distribution.
  • Collaboration: You will work closely with the development team to ensure system designs are scalable, reliable, and efficient. This includes participating in system design consulting, platform management, and capacity planning. You will promote SRE culture by example and collaborate with application developers to build elegant handling of negative/fault scenarios upfront.
  • Incident Management: You will participate in on-call rotations, handling system incidents to minimize downtime and impact on business operations. You will support traditional IT operations teams as required to troubleshoot critical production issues, and then create and own backlog items to prevent or recover from similar faults.
  • Continuous Improvement: You will continuously evaluate existing systems and processes, making recommendations for improvements and driving their implementation.
  • Testing: You will craft high-quality software with working functional and non-functional tests.
  • Peer Review: You will peer review code written by other colleagues and provide constructive feedback to help drive overall code quality.

Your skills and experience

  • Passion for SRE Practices: You should have a deep passion for Site Reliability Engineering practices and be able to exhibit them in your ways of working.
  • Development and Support Experience: You should have prior hands-on development and support experience, predominantly in the Java/Kubernetes/messaging ecosystem. A heavy focus on non-functional requirements like resilience, scalability, performance, and incident response is essential.
  • Automation: A passion for automation is crucial. You should have experience in both formal and informal automation of manual IT operations tasks. Knowledge of Python, Terraform, and Ansible is required to automate manual tasks.
  • CI/CD Pipelines and Testing: Extensive experience in Continuous Integration/Continuous Deployment (CI/CD) pipelines, End-to-End (E2E) automation testing, and Chaos testing in active-active topologies is a must.
  • Linux Scripting and SQL Knowledge: Strong Linux scripting and SQL knowledge are required. You should have extensive experience in troubleshooting and preventing database and application-level performance issues. Your skills in these areas will be critical in maintaining the reliability and efficiency of our software systems.
  • Cloud Knowledge: Strong knowledge of public cloud (preferably GCP) is required. You should have extensive experience with cloud infrastructure and monitoring, including strong knowledge of New Relic and Google Cloud monitoring and alerting.
  • ITIL Knowledge: You should have knowledge of and certification in the ITIL framework, along with experience in incident management.

How we’ll support you

  • Training and development to help you excel in your career
  • Coaching and support from experts in your team
  • A culture of continuous learning to aid progression
  • A range of flexible benefits that you can tailor to suit your needs

About us and our teams

Please visit our company website for further information:

Our values define the working environment we strive to create – diverse, supportive and welcoming of different views. We embrace a culture reflecting a variety of perspectives, insights and backgrounds to drive innovation. We build talented and diverse teams to drive business results and encourage our people to develop to their full potential. Talk to us about flexible work arrangements and other initiatives we offer.

We promote good working relationships and encourage high standards of conduct and work performance. We welcome applications from talented people from all cultures, countries, races, genders, sexual orientations, disabilities, beliefs and generations and are committed to providing a working environment free from harassment, discrimination and retaliation.

Visit Inside Deutsche Bank to discover more about the culture of Deutsche Bank including Diversity, Equity & Inclusion, Leadership, Learning, Future of Work and more besides.