Innodata Inc.
Innodata solves your toughest data engineering challenges using artificial intelligence and human expertise.
Senior Language Data Scientist
Location
New Jersey
Posted
10 days ago
Salary
Not specified
Postgraduate Degree | 5 yrs exp | English | Python
Job Description
• Lead projects and own processes for creating, validating and annotating data for use in LLM/ML applications
• Design/improve workflows to create data for AI/ML training and evaluation
• Dive deep into existing workflows and processes to gather data and insights
• Work closely with client stakeholders to understand goals, gather requirements, propose solutions, and execute them
• Contribute to establishing best practices and standards for generative AI development
Job Requirements
- 5+ years of relevant experience with data creation, curation, and analysis for GenAI applications
- MA in (computational) linguistics, data science, computer science (AI / ML / NLU), quantitative social sciences, or a related scientific / quantitative field; PhD strongly preferred
- Ability to collaborate directly with technical stakeholders including senior project managers, data engineers, and research scientists
- Knowledge of how the components of GenAI products or services work together
- Excellent problem-solving skills, with the ability to think critically and creatively to develop innovative AI solutions
- Experience with Natural Language Processing (NLP) techniques and tools, such as spaCy, NLTK, or Hugging Face
- Proficiency in Python for handling and transforming large datasets
Benefits
- Providing technical mentorship and guidance to junior team members