Analytica
Data-driven consulting and technology services
Data Scientist – NLP
Location
District of Columbia
Posted
116 days ago
Salary
Not specified
Postgraduate Degree, 4 yrs exp, English, AWS, Azure, Cloud, Keras, Open Source, Python, PyTorch, Spark, TensorFlow
Job Description
• Support long-term federal client engagements in the DC Metro area
• Apply statistical programming, modeling, visualization techniques, data mining, and forecasting skills to analyze challenging public sector problems
• Pre-processing - Demonstrate the skills and experience to collect, clean, and prepare data sets for input into a computational model using Python
• Feature Engineering and Attribute Evaluation - Candidate must demonstrate experience with NLP feature engineering methods such as TF-IDF, word2vec, GloVe, and FastText
• Modeling - Candidates will have practiced skills and experience selecting classification modeling techniques to fit the business problem
• Validation - Strong candidates will describe their experience with investigating, reporting, and justifying model results
• Visualization - Experience in presenting the results of their modeling activities, depicting the insights realized, and explaining the relevance of their results to the organization’s business challenges
Job Requirements
- Master's degree required, PhD preferred, in Statistics, Mathematics, Computer Science, or a similar field
- High degree of experience utilizing SAS, R, or Python to support NLP use cases such as Document Summarization, Named Entity Recognition, Sentiment Analysis, and/or Topic Modeling
- At least four years of experience developing scalable, production-ready NLP solutions using scikit-learn, Keras, TensorFlow, PyTorch, and Spark NLP
- Experience using Git/GitHub to version control source code
- Experience leveraging transformer architecture to develop NLP models
- Experience with open source NLP packages such as Gensim, spaCy, or NLTK
- Experience with BERT, GPT-J, RoBERTa, T5 or other transformers
- Experience with GenAI and Prompt Engineering is a plus
- Experience with Databricks and MLflow is a plus
- Experience with machine translation and transcription of foreign language documents using Microsoft Azure translation services is a plus
- Experience working in an AWS cloud environment and with related AWS services such as Bedrock and Textract
- Experience coordinating and maintaining user stories
- Must be a US citizen
- Must be able to obtain and maintain a Public Trust security clearance
Benefits
- Competitive compensation with opportunities for bonuses
- Employer-paid health care
- Training and development funds
- 401(k) match