This is an excellent opportunity for individuals who want to be part of a 0→1→10 journey over the next two years, experience what it takes to build a product and a business from scratch, and hopefully start their own venture someday. It is not a good fit for individuals who prefer stability over growth!
This role requires a strong background in backend engineering and machine learning, proficiency in the relevant programming languages and tools, a willingness to embrace challenges, and a commitment to best practices in software development and testing. Familiarity with cloud platforms and a dedication to staying current with industry trends are also essential for success in this role.
Python Experience: 5+ years of experience with Python. Experience with Django is an added advantage.
Cloud Experience: You should be familiar with cloud computing platforms. Expertise in AWS is preferred, along with knowledge of platforms like Google Cloud Platform (GCP) or Microsoft Azure.
Experience with Docker and Kubernetes: You should be proficient in Docker and Kubernetes.
Test-Driven Development: Belief in and adherence to Test-Driven Development practices is essential. This means writing tests before writing code to ensure the quality and correctness of your work.
Generative AI Experience: You must have knowledge of LLMs like Llama and Mistral and other Generative AI models like Whisper and Stable Diffusion.
Miscellaneous: You should also have knowledge of frameworks and tools such as Django REST Framework, FastAPI, gRPC, and Airflow, as well as cloud-native networking.
Deep understanding of GPU Architecture: You should have deep knowledge of GPU architecture and of GPUs such as the NVIDIA A100, A10G, and T4. Experience with CUDA is a plus.
Familiarity with LLM optimization techniques: You should have a working understanding of optimization techniques such as quantization, speculative decoding, and continuous batching.
Design and Develop Scalable Machine Learning Systems: You will be responsible for collaborating with the tech team to design and build machine learning systems that are scalable and ready for production use from the start. This involves the end-to-end development of machine learning models and pipelines. You should be able to deploy and benchmark an ML model in under 30 minutes.
Conduct Extensive Research: You'll need to stay current with the latest machine learning technologies and research to identify the best approaches and tools for the job.
Improve Metrics: You will develop strategies for improving metrics using real-world data. This typically involves optimizing and fine-tuning machine learning models to achieve better results.
Infrastructure Improvements: As part of the engineering team, you will develop application layers on top of infrastructure components. This also includes designing and building server architecture, setting up databases, and managing server resources efficiently.