Senior Cloud Application Support Engineer

Full-time
Latin America
Senior Level
Posted 5 hours ago

Atmosera empowers businesses to redefine what's possible with modern technology and human expertise. Our exceptional experience across Applications, Data & AI, DevOps, Security, and the Microsoft Azure platform enables organizations to accelerate innovation, enhance security, and optimize operational agility. As a Microsoft Partner with seven specializations, GitHub AI Partner of the Year, a member of the GitHub Advisory Board, and a member of the prestigious Microsoft Intelligent Security Association (MISA), Atmosera delivers cutting-edge, integrated solutions that drive business value.

What We’re Looking For:

We are looking for a Senior Cloud Application Support Engineer to join our team of professionals. In this role, you will focus on expert-level application observability, incident dispositioning, and support maturation. Rather than holding sole technical accountability for architecture, this role spearheads product strategy and works closely with engineering teams to stabilize, govern, and optimize critical client cloud environments through advanced technical analysis and leadership.

What You'll Do

  • Execute expert-level real-time monitoring and incident dispositioning for critical client applications by leveraging deep technical knowledge of Dynatrace and Azure Insights.

  • Correlate complex data across metrics, traces, and logs to perform deep-dive root cause analysis and identify performance bottlenecks in distributed environments.

  • Lead the triage of complex alerting environments to filter noise and ensure that high-priority incidents are identified and managed with surgical precision.

  • Analyze high-level metrics and daily reports to detect subtle system variations, proactively identifying potential problems to avoid service disruptions.

  • Evaluate the quality of existing runbooks and spearhead the creation of new standards for operating procedures, governance, and management of client environments.

  • Act as the primary technical point of contact for P1 incidents, ensuring high-level communication and coordination between all technical and business stakeholders.

  • Drive the automation of manual reporting processes to improve operational efficiency and provide more accurate insights into environment health and performance.

  • Enforce SRE best practices and SLA compliance, guiding the team on the proper management of incidents and the strategic creation of problem records.

  • Mentor junior staff on the execution of complex procedures and the interpretation of APM telemetry to foster a culture of technical excellence and proactivity.

  • Collaborate on product strategy and the implementation of best practices to optimize the performance and stability of global client environments.

The Skills You'll Need

  • Expert-level proficiency in Dynatrace and Azure Insights with a focus on advanced configuration and environment optimization.

  • Advanced technical expertise in correlating metrics, traces, and logs to perform deep-dive root cause analysis.

  • Deep understanding of SRE principles and proven experience managing critical P1 incidents under strict SLAs.

  • Strong leadership and communication skills to take point on P1/P2 tickets and coordinate with high-level stakeholders.

  • Ability to evaluate existing support documentation and establish new standards for governance and operating procedures.

  • Experience in automating manual reporting processes and translating telemetry into actionable business insights.

  • Strategic analytical skills to detect subtle patterns of instability and prevent potential service disruptions.

  • Capacity to guide and mentor junior team members on technical best practices and APM tool interpretation.

  • Proven track record of collaborating on product strategy and managing complex client cloud environments.

Qualifications:

  • Bachelor’s degree in computer science or a related technical major, or equivalent professional experience in high-level cloud operations.

  • 5+ years of technical experience, with a strong background in managed service providers or cloud hosting environments and a focus on senior systems administration.

  • Bilingual proficiency is required to effectively collaborate across our distributed teams and client base.

  • Advanced certifications in Dynatrace or other APM platforms are highly preferred to demonstrate expert-level observability skills.

  • Microsoft Azure certifications are required within 90 days of employment based on current certificates and skill level.

  • Technical certificates in Azure, Windows, O365, SQL, Linux, VMware, Cisco, Palo Alto, AWS, GCP, Terraform, Dynatrace, or DevOps are a plus.
