
Senior Data Scientist
- Singapore
- Permanent
- Full-time
This is a full-stack applied AI role, covering data handling, model training, deployment, monitoring, and optimization in production.
Key Responsibilities
1. LLM Fine-tuning & Evaluation
- Fine-tune and adapt LLMs for domain-specific configuration assistance.
- Apply instruction tuning, LoRA, RLHF, and domain adaptation.
- Establish automated evaluation pipelines for accuracy, latency, and safety.
- Design, test, and optimize prompt strategies for varied scenarios, personas, and workflows.
- Develop reusable prompt templates and dynamic context injection logic.
- Run A/B tests to measure prompt impact on user outcomes.
- Implement semantic retrieval with vector search tools (e.g., FAISS, Pinecone, Weaviate).
- Build graph database pipelines (e.g., Neo4j, TigerGraph) to represent and query configuration relationships.
- Combine embedding search with graph reasoning for richer context in LLM outputs.
- Optimize retrieval for both latency and relevance.
- Apply quantization, pruning, and distillation to right-size LLMs for deployment.
- Benchmark trade-offs between quality, speed, and cost across CPU/GPU/edge.
- Collaborate with infrastructure teams on inference optimization.
- Extract, clean, and structure configuration and schema data (JSON, YAML, XML).
- Use SQL to query and transform relational datasets.
- Build automated pipelines for continuous retraining and RAG index updates.
- Apply schema-aware data modeling for improved retrieval and training.
- Collaborate with software engineers to integrate AI into live products.
- Develop APIs and microservices for LLM-powered features.
- Set up monitoring dashboards, drift detection, and feedback loops.
- Implement safety guardrails to prevent hallucinations and unsafe recommendations.
- Ensure compliance with data privacy regulations and security standards (e.g., GDPR, SOC 2).
- Apply data anonymization and access control practices.
- Design output filtering to avoid sensitive or incorrect recommendations.
- 3+ years in Data Science, ML, or NLP with hands-on LLM fine-tuning experience.
- Proven skills in prompt engineering and RAG pipeline development.
- Experience with VectorDB and GraphDB integration.
- Hands-on experience with model quantization and optimization.
- Proficiency in Python (Hugging Face Transformers, PyTorch, LangChain).
- Proficiency with SQL and relational data modeling.
- Knowledge of YAML, JSON, XML, and schema-based data structures.
- Strong grasp of MLOps principles for production deployment.
- Experience with GPU optimization tools (ONNX Runtime, TensorRT).
- Background in software configuration management systems.
- Familiarity with CI/CD, Docker, and Kubernetes for ML services.
- Experience with LLM evaluation frameworks (e.g., Ragas, HELM, OpenAI Evals).