Site Reliability Engineer

at CRED • Full-time

Location

in-office (Bengaluru, India)

About this Opportunity

What You Will Do


  • Work with large-scale data engineering infrastructure and data-native technologies such as Spark/EMR, Flink, Apache Pinot, Kafka, Airflow, Tableau, NiFi, Metabase, and Databricks

  • Work with observability tools such as Loki, VictoriaMetrics, and Datadog

  • Apply best practices for running and managing self-managed platforms on Kubernetes, ensuring complete observability, high availability (HA), and a self-service CI/CD system

  • Foster cross-team collaboration by building and maintaining relationships with customer teams, architects, and engineering teams to jointly achieve key deliverables while ensuring production scalability and stability

  • Demonstrate strong troubleshooting and debugging skills, including conducting post-incident reviews and root cause analysis, and triaging product or system issues to identify their sources and impacts and resolve them, maintaining service operations and quality


You Should Apply If You:


  • have experience in SRE/DevOps, with a focus on distributed cloud-native systems design, observability, container orchestration, maintenance, and troubleshooting

  • have experience with public cloud platforms, preferably AWS

  • have hands-on experience with Kubernetes/EKS, building and operating large-scale production systems with stringent SLOs and SLAs

  • are proficient in programming and scripting languages commonly used in DevOps: Shell, Python, and Go

  • have experience in Linux infrastructure management and systems administration

  • have experience with infrastructure as code and configuration management using tools such as Terraform, Helm, and Ansible

  • have expertise in Continuous Integration and Deployment (CI/CD) and release orchestration using Jenkins, ArgoCD, GitHub Actions, etc.

  • have expertise in big data systems such as Spark/EMR, Flink, and Airflow

  • have expertise in pub/sub solutions such as Kafka

  • are familiar with system observability tools such as ELK/EFK, Prometheus, Grafana, Alertmanager, Sysdig, Datadog, and VictoriaMetrics

  • have exceptional interpersonal, verbal, and written communication skills

