Samba is hiring a Senior Software Engineer for the Data Integration team in Amsterdam. The role involves designing, building, and operating components of the data integration platform that handles data ingestion, processing, enrichment, and distribution. Candidates must have 5+ years of software engineering experience focused on data engineering, backend systems, or distributed data infrastructure. Required skills include proficiency in Python and SQL, experience with distributed processing frameworks like Spark or Databricks, cloud-native data systems on AWS or GCP, API-first service development, streaming/event-driven frameworks such as Kafka or Flink, and workflow orchestration tools like Apache Airflow or dbt. Familiarity with GDPR and CCPA data privacy regulations is required. Responsibilities include building reliable data pipelines, owning platform components, collaborating across teams, ensuring operational reliability, participating in on-call rotations, and maintaining CI/CD pipelines. Benefits include health insurance, wellness offerings, life and disability insurance, retirement plans, paid holidays, PTO, and incentive bonuses.
What you'll do
- Design and build reliable data pipelines for ingestion, transformation, and distribution of large-scale datasets
- Develop ETL/ELT workflows using distributed computing frameworks on cloud infrastructure
- Build API-first services exposing ingestion, processing, and distribution capabilities
- Implement data quality validation, monitoring, and observability
- Build reusable platform components serving downstream consumers
- Take ownership of components within the data integration platform and drive their reliability and iteration
- Build partner and destination integrations end-to-end including throughput tuning and operational handoff
- Apply GDPR, CCPA, and Samba data governance requirements to systems
- Collaborate with team members and adjacent teams to understand downstream use cases
- Drive technical design for components, produce design documents, and participate in architecture discussions
- Conduct code reviews and uphold standards for code quality, testability, and maintainability
- Build working relationships with adjacent teams and reason about cross-functional requirements
- Own reliability of components, monitor health, respond to incidents, and follow through on post-mortem improvements
- Participate in on-call rotations and contribute to improving operational practices
- Build and maintain CI/CD pipelines, deployment processes, and testing coverage
Requirements
- 5+ years of professional software engineering experience with a Bachelor's degree in Computer Science, Software Engineering, or a related technical field (or 3+ years with a Master's degree, a PhD with no prior experience, or equivalent)
- Meaningful focus on data engineering, backend systems, or distributed data infrastructure
- Proficiency in Python and SQL; ability to write clean, well-tested, production-ready code
- Hands-on experience with distributed processing frameworks (e.g., Spark, Databricks) in production
- Hands-on production experience building cloud-native data systems on AWS, GCP, or Databricks
- Experience building API-first services with focus on correctness and reliability
- Working experience with streaming or event-driven data processing frameworks (e.g., Kafka, Flink, Spark Streaming)
- Experience with workflow orchestration tools (Apache Airflow, dbt, Prefect)
- Familiarity with data privacy regulations (GDPR, CCPA) and understanding of their impact on system design
- Clear communicator who participates actively in design discussions and works well across teams
- Comfortable advising more junior engineers on technical matters
Tech stack
Python, SQL, Spark, Databricks, AWS, GCP, Kafka, Flink, Spark Streaming, Apache Airflow, dbt, Prefect, Snowflake
Benefits
- Health insurance
- Wellness offerings
- Life and disability insurance
- Retirement savings plan
- Paid holidays
- Paid time off (PTO)
- Bonuses, short-term incentives, and long-term incentives