Socure
The leading provider of digital identity verification and fraud solutions. Salesinfo@socure.com
Staff Data Scientist – Entity Resolution, IDGraph
Location
United States
Posted
3 days ago
Salary
$170K - $205K / year
Postgraduate Degree, 5 yrs exp, English, PySpark, Python
Job Description
• Lead the evaluation and continuous improvement of entity resolution and entity linking pipelines.
• Debug new builds, identify anomalies, and recommend modeling or system-level improvements.
• Define, implement, and maintain scalable performance and quality metrics, leveraging automation and LLM-based approaches where appropriate.
• Partner with Engineering to optimize entity linking and ranking systems using Learning-to-Rank and related techniques.
• Design methods to assess and classify entity confidence and quality across the graph.
• Design and implement a comprehensive data quality framework for graph-based identity data.
• Translate abstract quality concepts (e.g., reliability, stability, consistency) into measurable signals.
• Use data quality insights to guide modeling decisions, experimentation strategy, and product prioritization.
• Identify and operationalize generalized, high-impact predictive signals derived from graph structure, temporal dynamics, and relational patterns.
• Develop scalable approaches to link prediction, label propagation, and semi-supervised learning within the ID Graph.
• Explore and evaluate advanced graph modeling techniques, including graph-based ML, knowledge graph methods, and Graph Neural Networks (GNNs), when appropriate.
• Focus on durable abstractions rather than one-off features, ensuring solutions are explainable, compliant, and reusable across multiple products.
• Collaborate closely with Engineering, Product Management, Compliance, and downstream product teams.
• Act as a technical leader within the Identity organization, influencing modeling standards, experimentation rigor, and best practices.
• Translate complex technical findings into clear insights and recommendations for both technical and non-technical stakeholders.
• Support the launch of new product capabilities built on top of the ID Graph.
• Demonstrate strong ownership, strategic impact, and assertive communication.
• Mentor peers, foster a culture of growth, and build authentic relationships across teams.
• Embrace feedback, adapt resiliently to challenges, and pursue continual self-improvement.
Job Requirements
- Master’s or PhD in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field
- 5+ years of experience in applied data science, machine learning, or artificial intelligence, with a focus on graph-based modeling and large-scale data systems
- Strong proficiency in Python and PySpark
- Deep experience with classification models, Learning-to-Rank, anomaly detection, and statistical modeling
- Experience building and maintaining production-grade ML systems at scale
- Hands-on experience with Databricks
- Familiarity with graph databases and query languages, such as Amazon Neptune and openCypher
- Experience with graph processing frameworks (e.g., GraphFrames)
Benefits
- Offers Equity
- Offers Bonus