Data Engineer (m/f/d) - New Delhi, India - Codepan GmbH

    Codepan GmbH New Delhi, India

    2 weeks ago

    Description

    Codepan, founded in 2014, is a Berlin-based AI innovation hub. Our team of passionate data scientists, engineers, and technologists applies machine learning to solve real-world problems for clients, as well as to incubate and accelerate our own AI product ideas.

    Codepan is currently developing an AI-based product that uses state-of-the-art LLM technologies in the space of intelligent document processing.

    Tasks

    As a Data Engineer in our team, you'll architect, build, and maintain advanced data pipelines and storage solutions. You'll play a pivotal role in enabling our analytics and AI teams to work efficiently with large datasets, including those used for training and deploying Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) models.

    Key Responsibilities:

    • Design and optimize scalable data pipelines to support advanced analytics, machine learning, and AI projects, with a particular focus on applications involving LLMs and RAG.
    • Develop robust data warehousing solutions that ensure fast, reliable access to large volumes of data, optimizing for query performance and system scalability.
    • Collaborate with AI research and development teams to understand data requirements and ensure the seamless integration of AI models with our data ecosystem.
    • Implement data governance and security measures, adhering to best practices and regulatory standards to safeguard sensitive information.
    • Utilize and advocate for cloud-based technologies and services to enhance our data processing capabilities, ensuring our infrastructure is both flexible and cost-effective.
    • Regularly evaluate and adopt new tools and technologies to keep our data infrastructure at the forefront of industry standards, particularly those enhancing LLM and RAG functionalities.
    • Simplify complex data flows, making data easily accessible for non-technical stakeholders while maintaining the integrity and confidentiality of the data.

    Requirements

    • Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field.
    • At least 5 years of proven experience in data engineering, with a track record of developing scalable data solutions.
    • Strong technical expertise in SQL/NoSQL databases, Python, Java, and ETL processes.
    • Experience with cloud platforms (Azure/GCP) and familiarity with big data technologies.
    • A keen interest in AI technologies, especially LLMs and RAG models, with a desire to stay updated on the latest trends and techniques.
    • Solid foundation in data security principles and a commitment to implementing privacy-compliant data management practices.
    • Excellent problem-solving skills, the ability to work collaboratively in a team environment, and strong communication skills for explaining complex technical concepts.

    Join us to contribute to cutting-edge projects in AI and analytics, leveraging your expertise to create impactful data solutions.

    Salary: 30L-35L, remote job in India