Human Interview for LLM Evaluation
2 weeks ago

Job description
Similar jobs
We're looking for an LLM Evaluation, Benchmarking & Experimentation Engineer to rigorously test our proprietary LLM API and build the infrastructure for systematic model improvement. · The primary focus is on executing established benchmarks with methodological rigor, while s ...
3 weeks ago
Role: Lead Applied Scientist/Generative AI Engineer · Location: Gurugram · Experience: 8+ years · Job Summary · We are looking for a Senior Generative AI Engineer with strong hands-on experience in AI/ML and LLM technologies. The role requires excellent communication, leadership ...
2 days ago
We refer candidates to our partner that collaborates with the world's leading AI research labs to build and train cutting-edge AI models. · Frame and design high-quality machine learning tasks to enhance LLM capabilities. · Build and optimize ML models for NLP, classification, pre ...
1 month ago
In this role, you will be working on projects to help fine-tune large language models (like ChatGPT) using your strong analytical and English comprehension skills. · ...
3 weeks ago
I'm seeking a technical mentor to help deepen my understanding of LLM evaluation and benchmarking. · ...
1 month ago
MUST HAVE: PhD in Maths domain. · This role involves working on projects to fine-tune large language models using analytical and English comprehension skills. · ...
3 weeks ago
LLM Evaluation Lead (Applied Scientist) 8+ Years · Key Skills: · Deep understanding of LLM behavior & failure modes · Prompt sensitivity, hallucination analysis, RAG failure modes · Evaluation strategy ownership (offline & online) · Tools: Ragas, DeepEval, TruLens · Red-teaming & ...
2 days ago
Combine deep software engineering expertise with frontier AI research to influence how large language models understand and solve real-world coding problems. · ...
1 week ago
Auditing and improving Large Language Models (LLMs) for finance tasks. · Responsibilities: Evaluating LLM outputs for accuracy in financial tasks. · ...
1 month ago
Senior Machine Learning Engineer – LLM Evaluation
We are hiring Senior Machine Learning Engineers to support advanced LLM evaluation and benchmarking initiatives for a leading AI research lab. · Frame novel ML problems aimed at improving LLM capabilities · Design, build, and optimize ML models (classification, prediction, NLP, r ...
1 month ago
LLM Evaluation Specialist for AI Chat Workflows
We're building an AI-first knowledge management platform with chat-based agents that edit documents and manage plans. · Evaluate output quality. · Diagnose failure modes (hallucinations, grounding issues). · ...
1 week ago
Creative Writer with Statistical Expertise Needed for LLM Evaluation
We are seeking a talented creative writer who possesses a strong statistical background. · ...
1 month ago
LLM + Retrieval Engineer … Build a Source-Grounded Outreach Suggestion System + Evaluation Loop
We're building an internal system that helps B2B teams write non-generic outreach by using structured information pulled from public sources (company websites, competitor sites, LinkedIn posts, YouTube video transcripts etc.). The system should generate actionable outreach sugges ...
1 month ago
Full-Time (40 hrs/week) Finance RLHF / LLM Evaluation Assistant (Rubrics, Golden Answers, QA)
I'm hiring a full-time assistant to help me execute and scale finance-focused RLHF / LLM evaluation contracts. · ...
3 weeks ago
We are looking for a Senior LLM Engineer with strong hands-on experience in agentic AI systems. The role focuses on designing and implementing multi-agent workflows, building LLM evaluation pipelines, and optimizing LLM behavior using prompt engineering. · ...
1 month ago
Job Description: · We are seeking an experienced AI Engineer with a strong background in Natural Language Understanding (NLU) who is passionate about pushing the boundaries of Conversational AI. In this role, you will design, develop, and deploy scalable AI solutions leveraging L ...
4 days ago
We are seeking a highly skilled Machine Learning Engineer with 4+ years of experience to bridge the gap between Generative AI and actionable business intelligence. You will be responsible for optimizing Large Language Model (LLM) performance and building sophisticated clustering ...
5 days ago
Job summary · We are seeking an experienced AI Engineer with a strong background in Natural Language Understanding (NLU) who is passionate about pushing the boundaries of Conversational AI. Responsibilities: Design, fine-tune, and deploy LLM-based applications for Conversational AI ...
1 month ago
This internship offers the opportunity to work on interesting client projects implementing explainability frameworks for AI model outputs. · Work on interesting client projects implementing explainability frameworks (RAGAS) for AI model outputs. · Fine-tune open-source LLMs using ...
2 weeks ago
The company is seeking an experienced Gen AI/LLM Engineer to design, develop and deploy scalable AI/ML solutions focused on Large Language Models (LLMs). The engineer will collaborate with business partners, data scientists and product teams to deliver innovative, reliable and com ...
1 month ago