iBusiness Funding
Helping to provide capital in an efficient and transparent manner to every small and medium-sized business in America.
AI Knowledge Data Engineer
Location
Florida
Posted
82 days ago
Salary
$180K - $240K / year
Bachelor Degree, English, Airflow, Elasticsearch, Python, PyTorch, Spark, TensorFlow
Job Description
• Architect, implement, and optimize retrieval-augmented generation (RAG) workflows by integrating local LLMs (e.g., Llama) with retrieval mechanisms (vector search, Elasticsearch, FAISS, Weaviate)
• Design, build, and maintain scalable data pipelines for ingesting, transforming, indexing, and retrieving structured and unstructured data from diverse sources
• Design, build, and scale addressable services and tool specifications that LLMs and agents can use to orchestrate workflows
• Orchestrate and scale training data operations, including data curation, versioning, and lineage tracking for large-scale LLM training and fine-tuning
• Develop and maintain ontologies, knowledge graphs, and semantic data models to structure and integrate domain-specific knowledge for improved retrieval and reasoning
• Implement and optimize knowledge retrieval strategies (dense/sparse retrieval, ranking algorithms) to maximize system accuracy and relevance
• Fuse disparate knowledge bases and heterogeneous data into a unified source of relevant contextual information
• Design cognitive memory systems for AI agents, enabling persistent knowledge retention and contextual awareness across interactions
• Collaborate with AI researchers, data scientists, and engineers to align knowledge architecture with business objectives and ensure data quality
• Evaluate and integrate new technologies and research advancements in LLMs, RAG, information retrieval, and knowledge representation
• Maintain clear and comprehensive documentation of models, pipelines, and workflows
Job Requirements
- Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field
- Proven experience designing and scaling data pipelines and training data workflows for LLMs or similar AI systems
- Strong background in information retrieval systems, vector search technologies, and RAG frameworks (e.g., FAISS, Pinecone, Elasticsearch, Milvus)
- Proficiency in programming (Python) and machine learning libraries (TensorFlow, PyTorch)
- Experience with ontologies, knowledge graphs, and semantic technologies (RDF, OWL, SPARQL)
- Familiarity with distributed data processing and orchestration tools (e.g., Spark, Airflow, Kubeflow)
- Excellent analytical, problem-solving, and communication skills
- Ability to work collaboratively in a cross-functional, fast-paced environment
Benefits
- Medical, dental, and vision coverage
- 401(k) with company match
- Paid time off