Full job description
Samba is seeking a Senior Software Engineer for the Identity team in Amsterdam. The role involves designing, building, and operating production-grade data engineering systems that handle identity-linked data at scale. Candidates should have 8+ years of experience in software engineering focused on data engineering, backend systems, or distributed data infrastructure. Required skills include Python, SQL, JavaScript, distributed processing frameworks (Spark, Databricks), cloud platforms (AWS/GCP), workflow orchestration tools (Airflow, dbt, Prefect), and data warehousing technologies like Snowflake. Responsibilities include developing ETL/ELT workflows, building API-first services, ensuring data quality and privacy compliance (GDPR, CCPA), mentoring engineers, and collaborating cross-functionally. Preferred experience includes streaming data frameworks (Kafka, Flink), AI/ML integration, and familiarity with ad tech or identity resolution. Benefits include health insurance, wellness offerings, life and disability insurance, retirement plans, paid holidays and PTO, and various incentives.
What you'll do
- Design, build, and maintain reliable data pipelines for ingestion, transformation, and distribution of identity-linked data at scale
- Develop ETL/ELT workflows using distributed computing frameworks on cloud infrastructure
- Design and build API-first services exposing processed identity data to internal and external consumers
- Implement data quality validation, monitoring, and observability for owned components
- Contribute to platform-grade reusable components enabling downstream teams
- Take end-to-end ownership of key components within identity resolution systems
- Design and implement privacy-compliant data handling practices that apply GDPR, CCPA, and data governance policies
- Engage cross-functional stakeholders to support downstream use cases
- Drive technical design, produce design documents, and contribute to architecture discussions
- Conduct rigorous code reviews and uphold high standards for code quality
- Mentor engineers through feedback, pairing, and design review
- Collaborate across teams and advocate for shared standards
- Own reliability of components, monitor health, respond to incidents, and follow up on improvements
- Participate in on-call rotations and improve operational practices
- Drive improvements to CI/CD pipelines, deployment processes, and testing coverage
Requirements
- 8+ years of professional software engineering experience with focus on data engineering, backend systems, or distributed data infrastructure
- Proficient in Python and SQL; comfortable with JavaScript in full-stack or API contexts
- Strong hands-on experience with distributed processing frameworks (e.g., Spark, Databricks) working with large-scale datasets in production
- Practical experience with cloud platforms (AWS and/or GCP) and their core data services
- Hands-on experience with workflow orchestration tools (Apache Airflow, dbt, Prefect, or equivalent)
- Strong familiarity with data warehousing and lakehouse technologies, including Snowflake
- Solid understanding of data privacy regulations (GDPR, CCPA) and practical experience building compliant systems
- Familiarity with platform thinking and API-first service design
- Clear communicator and cross-functional collaborator
- Active mentor who invests in others and provides direct feedback
- Preferred: experience with streaming data processing frameworks (Kafka, Flink, Spark Streaming)
- Preferred: experience incorporating AI and machine learning capabilities into production data workflows
- Preferred: exposure to ad tech, identity resolution, data licensing, or digital media; familiarity with device graphs, audience segmentation, or measurement
Tech stack
Python, SQL, JavaScript, Spark, Databricks, AWS, GCP, Apache Airflow, dbt, Prefect, Snowflake, Kafka, Flink, Spark Streaming
Benefits
- Health insurance
- Wellness offerings
- Life and disability insurance
- Retirement savings plan
- Paid holidays
- Paid time off (PTO)
- Bonuses, short-term incentives, and long-term incentives