Senior Core Infrastructure Engineer

Oracle · Oracle Recruiting
Bengaluru, India · Full-time · Posted Jun 30, 2026

Designs, implements, and optimizes components in distributed systems with an emphasis on scalability, resiliency, and operability. Delivers features and load/performance tests; leverages data plane platforms and distributed state tools for high-volume retrieval, storage, and processing; and reviews peers’ implementations for scalability compliance. Builds fault-tolerant paths (redundancy, replication, automatic failover), applies recovery‑oriented principles, and implements retries, circuit breakers, and timeouts. Proactively detects and mitigates issues via tests, alarms, dashboards, and telemetry; authors runbooks and participates in incident response and RCAs. Implements standard replication and synchronization, develops automation/IaC for troubleshooting and maintenance, and applies advanced security controls (encryption, access, remediation) while ensuring change, compliance, and documentation standards are met.

Key Responsibilities
System Design & Architecture - System Scalability:
–Implements and contributes to the development of distributed system components that support horizontal and vertical scaling, including by leveraging distributed state management tools.
–Optimizes code and systems for large-scale data processing.
–Implements scalability requirements for assigned components and reviews team members' implementations.
–Leverages components of data plane platforms to handle large-scale data retrieval, storage, and processing.
–Implements performance and load testing.
System Design & Architecture - System Reliability Design:
–Collaborates with the team to build fault-tolerant components capable of withstanding in-service updates by implementing redundancy, replication, and automatic failover mechanisms.
–Applies recovery-oriented computing principles to design components that effectively handle service disruptions.
–Implements retry mechanisms, circuit breakers, and timeouts to help handle network unreliability.
System Design & Architecture - System Reliability Performance:
–Implements tests and alarm configurations to proactively detect and address issues/failures.
–Supports efforts to recover from failures by drafting and executing runbooks and operational procedures.
–Builds and customizes dashboards, telemetry systems, and alerting mechanisms to monitor component health.
System Design & Architecture - Correctness / Availability:
–Designs and implements functional requirements and testing for assigned features within an existing system.
–Implements test scenarios (e.g., fault-injection, brown-out) to evaluate system correctness.
–Implements standard data replication and synchronization techniques to maintain data integrity and availability.
Operational Troubleshooting & Incident Management:
–Diagnoses, debugs, and resolves issues in system components to support ongoing operation.
–Implements basic strategies to prevent interruptions, ensuring no maintenance windows are required for customers and users when resolving issues.
–Designs and implements automation scripts and tooling used to troubleshoot operational issues.
–Participates in operational support rotations, assisting in incident responses and root cause investigations.
Compliance & Security:
–Applies advanced security measures to protect data and applications in multi-tenant environments, including encryption and access controls.
–Implements remediation plans to continuously improve security.
–Collaborates with the team to ensure cloud infrastructure complies with relevant industry standards and regulations and that documentation is up to date.
Automation & Change Management:
–Maintains automation scripts and tools (e.g., Infrastructure as Code (IaC)) for managing cloud infrastructure.
–Adheres to change management plans for patching, updating, and rolling back applications.


Core Responsibilities
Planning & Execution:
–Tracks timelines with minimal supervision, ensuring work is completed on time and aligns with project requirements.
–Prioritizes and adjusts work as resources or timelines change, with some guidance.
Collaboration & Partnership:
–Collaborates across teams to align on expectations and achieve shared objectives. Builds and maintains a comprehensive understanding of business, stakeholder, and/or customer needs to build and support effective partnerships. Actively listens to diverse perspectives and asks questions to ensure understanding of others.
Problem Solving:
–Independently identifies and addresses standard and non-standard issues in accordance with standard practices, escalating more complex issues as appropriate. Analyzes data and information from multiple sources to troubleshoot standard and non-standard errors. Contributes to knowledge sharing and best practices.
Continuous Learning:
–Embraces continuous learning by actively seeking to build knowledge and new skills and tools, and by staying current with industry trends and best practices. Seeks out and leverages feedback and training to improve skills. Contributes to a culture of continuous learning and knowledge sharing with team members.
Continuous Improvement:
–Develops ideas and recommends updates to increase the efficiency and effectiveness of processes, protocols, and workflows within a team. Seeks input from team members on alternative approaches and methods for improving work.

Career Level - IC3
