iBusiness Funding

Helping to provide capital in an efficient and transparent manner to every small and medium-sized business in America.

AI Knowledge Data Engineer

Full Time, Remote, Team 201-500, Since 2013, H1B No Sponsor

Location

Florida

Posted

82 days ago

Salary

$180K - $240K / year

Bachelor Degree, English, Airflow, Elasticsearch, Python, PyTorch, Spark, TensorFlow

Job Description

  • Architect, implement, and optimize retrieval-augmented generation (RAG) workflows by integrating local LLMs (e.g., Llama) with retrieval mechanisms (vector search, Elasticsearch, FAISS, Weaviate)
  • Design, build, and maintain scalable data pipelines for ingesting, transforming, indexing, and retrieving structured and unstructured data from diverse sources
  • Design, build, and scale addressable services and tools specifications that can be leveraged by LLMs and Agents to orchestrate workflows
  • Orchestrate and scale training data operations, including data curation, versioning, and lineage tracking for large-scale LLM training and fine-tuning
  • Develop and maintain ontologies, knowledge graphs, and semantic data models to structure and integrate domain-specific knowledge for improved retrieval and reasoning
  • Implement and optimize knowledge retrieval strategies (dense/sparse retrieval, ranking algorithms) to maximize system accuracy and relevance
  • Aggregate disparate knowledge bases and heterogeneous data into a fused approach for access to relevant contextual information
  • Design cognitive memory systems for AI agents, enabling persistent knowledge retention and contextual awareness across interactions
  • Collaborate with AI researchers, data scientists, and engineers to align knowledge architecture with business objectives and ensure data quality
  • Evaluate and integrate new technologies and research advancements in LLMs, RAG, information retrieval, and knowledge representation
  • Maintain clear and comprehensive documentation of models, pipelines, and workflows.

Job Requirements

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field
  • Proven experience designing and scaling data pipelines and training data workflows for LLMs or similar AI systems
  • Strong background in information retrieval systems, vector search technologies, and RAG frameworks (e.g., FAISS, Pinecone, Elasticsearch, Milvus)
  • Proficiency in programming (Python) and machine learning libraries (TensorFlow, PyTorch)
  • Experience with ontologies, knowledge graphs, and semantic technologies (RDF, OWL, SPARQL)
  • Familiarity with distributed data processing and orchestration tools (e.g., Spark, Airflow, Kubeflow)
  • Excellent analytical, problem-solving, and communication skills
  • Ability to work collaboratively in a cross-functional, fast-paced environment.

Benefits

  • Medical, dental, and vision coverage
  • 401(k) with company match
  • Paid time off
