Job Description
Key Responsibilities
- Architect, develop, and maintain scalable backend services and REST/async APIs
- Build robust, fault-tolerant distributed systems with low latency and high throughput
- Lead technical design discussions, architecture reviews, and code reviews
- Develop and deploy Operations Research models into production systems
- Own the full software lifecycle: requirements analysis, development, CI/CD, monitoring, and incident response
- Maintain containerized cloud-native apps on Azure using Infrastructure-as-Code (Terraform, Bicep)
- Own performance tuning, debugging, and system observability
- Mentor junior engineers, drive knowledge sharing, and influence technical hiring
Technical Skills
Programming & Languages
- Strong proficiency in Python
- Deep understanding of data structures, algorithms, and time/space complexity
- Ability to write clean, testable, production-ready code
System Design & Architecture
- Skilled in designing horizontally scalable, fault-tolerant systems
- Proficient in high-level design (HLD), database schema design, API contracts, and distributed architecture
- Strong grasp of CAP theorem, eventual consistency, idempotency, and backpressure
Backend Development
- Expert in building RESTful and asynchronous APIs using Flask, FastAPI, or similar frameworks
- Skilled in microservices, monolith decomposition, and service orchestration
- Familiar with message brokers such as Apache Kafka, RabbitMQ, or Redis Streams, and asynchronous task queues such as Celery
Cloud & DevOps
- Proficient in Azure (preferred) or other cloud platforms like AWS or GCP
- Familiar with Infrastructure as Code (IaC) tools like Terraform or Bicep
- Experience with CI/CD pipelines using GitHub Actions, Azure DevOps, or Jenkins
- Comfortable with Docker; working knowledge of Kubernetes or Azure Container Apps
Data Handling
- Experience with data validation and schema modeling (e.g., Pydantic, Marshmallow)
- Expertise in handling Excel, CSV, and database ingestion/egress at scale
- Familiar with OLTP/OLAP systems, SQL, NoSQL, and graph databases (e.g., PostgreSQL, MongoDB, Azure Cosmos DB, Neo4j)
Observability & Optimization
- Skilled in performance tuning, profiling, and bottleneck resolution
- Experience with observability stacks like Prometheus, Grafana, OpenTelemetry, or Azure Monitor
- Knowledge of resilience patterns such as circuit breakers, bulkheads, timeouts, and retry mechanisms
Security & Authorization
- Experience designing and implementing authentication and authorization layers using OAuth2, JWT, and RBAC
- Experience building plug-and-play auth middleware for legacy integration
Operations Research in Production
- Experience developing and deploying Operations Research models in real-world production environments
- Experience integrating optimization and forecasting models with backend APIs for real-time and batch processing