Senior Data Engineer

Erlangen
Berlin
AI & Biometrics
Full-time

Worldcoin participates in the E-Verify Program

Worldcoin is an Equal Opportunity / Affirmative Action employer committed to diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, age, national origin, disability, protected veteran status, gender identity or any other factor protected by applicable federal, state or local laws.

Worldcoin is also committed to working with and providing reasonable accommodations to individuals with disabilities. Please let your recruiter know if you need an accommodation at any point during the interview process.

About the Company:

Worldcoin is a new, collectively owned global currency that will be distributed fairly to as many people as possible. Worldcoin will launch by giving a free share to everyone on Earth. We believe that this is an essential step to accelerate the transition towards a more inclusive global economy, providing new ways for everyone to share future prosperity. We hope you’ll join us on our ambitious journey.

About the AI & Biometrics Team:

The AI & Biometrics team is building a biometric iris recognition system that can work reliably with more than a billion users and enables them to claim their free share of WLD. We use cutting-edge machine learning deployed on custom hardware to enable high-quality image acquisition, identification, and fraud prevention, all while requiring minimal user interaction. Our technology, coupled with privacy-preserving data collection, allows us to increase system performance and reduce model bias.

About the Opportunity:

This role is responsible for the data pipelines that fuel our machine learning engines. From dedicated field tests all over the world, we receive millions of images monthly. Our ML models—especially the identifier models—require large high-quality datasets. To create datasets on such a scale, images need to be pre-processed and passed through our labeling services; this role is responsible for designing and building such pipelines to generate high-quality datasets.

In this role you will: 

  • Design data pipelines to handle large-scale data ingestion. This includes figuring out ways to store and process the data with robust features for filtering, pre-processing, and versioning.
  • Create frameworks and build tools to monitor our datasets for their integrity and health.
  • Develop end-to-end automated data pipelines for the ingestion, transformation, and distribution of data.
  • Improve our data quality by deploying techniques like semi-supervised learning, human-in-the-loop machine learning, and fine-tuning with human feedback.
  • Build out data infrastructure to train large neural networks using self-supervised and contrastive learning.
  • Work closely with other internal stakeholders to incorporate their data usage needs.
  • Build and refine custom data labeling services that directly influence the quality of our ML models.

About You:

  • Industry experience as a Data Engineer or Machine Learning Engineer dealing with data infrastructure, distributed systems, and fault-tolerant data pipelines.
  • Experience deploying models and data infrastructure on Kubernetes using reusable Helm charts.
  • Experience with infrastructure tools for provisioning, deployment, and monitoring, such as Terraform, Docker, Datadog, and Grafana, and with AWS services such as Glue, Kinesis, Analytics, and Lambda.
  • Experience with graph databases such as GraphDB, Amazon Neptune, or Neo4j is a big plus.
  • Experience with automated data pipeline and workflow management tools, e.g. Argo Workflows, Airflow, or Dagster.
  • Experience with MongoDB, PostgreSQL, and Redis.
