Fieldguide

Powering the future of trust with modern software for assurance & advisory firms.

AI Engineer, Quality

AI Engineer, Machine Learning Engineer, Full Time, Remote, Senior, Team 11-50, H1B Sponsor

Location

California

Posted

29 days ago

Salary

$170K - $220K / year

Seniority

Senior

Bachelor Degree, Experience accepted, English, Postgres, Python, React, TypeScript

Job Description

  • Design and build a unified evaluation platform that serves as the single source of truth for all of our agentic systems and audit workflows
  • Build observability systems that surface agent behavior, trace execution, and failure modes in production, and feedback loops that turn production failures into first-class evaluation cases
  • Own the evaluation infrastructure stack, including integration with LangSmith and LangGraph
  • Translate customer problems into concrete agent behaviors and workflows
  • Integrate and orchestrate LLMs, tools, retrieval systems, and logic into cohesive, reliable agent experiences
  • Build automated pipelines that evaluate new models against all critical workflows within hours of release
  • Design evaluation harnesses for our most complex agentic systems and workflows
  • Implement comparison frameworks that measure effectiveness, consistency, latency, and cost across model versions
  • Design guardrails and monitoring systems that catch quality regressions before they reach customers
  • Use AI as core leverage in how you design, build, test, and iterate
  • Prototype quickly to resolve uncertainty, then harden systems for enterprise-grade reliability
  • Build evaluations, feedback mechanisms, and guardrails so agents improve over time
  • Work with SMEs and ML Engineers to create evaluation datasets by curating production traces
  • Design prompts, retrieval pipelines, and agent orchestration systems that perform reliably at scale
  • Define and document evaluation standards, best practices, and processes for the engineering organization
  • Advocate for evaluation-driven development and make it easy for the team to write and run evals
  • Partner with product and ML engineers to integrate evaluation requirements into agent development from day one
  • Take full ownership of large product areas rather than executing on narrow tasks

Job Requirements

  • Multiple years of experience shipping production software in complex, real-world systems
  • Experience with TypeScript, React, Python, and Postgres
  • Built and deployed LLM-powered features serving production traffic
  • Implemented evaluation frameworks for model outputs and agent behaviors
  • Designed observability or tracing infrastructure for AI/ML systems
  • Worked with vector databases, embedding models, and RAG architectures
  • Experience with evaluation platforms (LangSmith, Langfuse, or similar)
  • Comfort operating in ambiguity and taking responsibility for outcomes
  • Deep empathy for professional-grade, mission-critical software (experience with audit and accounting workflows is not required)

Benefits

  • Health insurance
  • Professional development opportunities
  • Flexible work arrangements
