AdTechTalent
Data Science · 82 days ago · On-site

Samba TV

Data Scientist

python, pyspark, databricks, delta lake, sql, aws, gcp, airflow, machine learning, mlops, ai, rag, llm, vector databases, semantic search, knowledge graph, entity resolution, probabilistic record linkage, embedding-based matching, causal inference, media, ad tech, audience modeling, measurement

Key details

Salary: Not specified

Employment type: Full-time

Seniority: Mid-level

Years experience: 3-5

Location: Warsaw, Poland

Full job description

Mid-level Data Scientist role in Warsaw, responsible for end-to-end delivery of data science projects with minimal guidance. The role requires expertise in measurement or audience modeling and the ability to build production-ready ML and AI solutions. Responsibilities include project ownership, methodology decisions, solution design, coding in Python and PySpark on Databricks, mentoring juniors, and cross-functional collaboration. Candidates need a bachelor's degree (master's preferred) in a quantitative field, 3-5 years of experience, advanced Python, SQL, PySpark, Databricks, cloud knowledge (AWS/GCP), core ML skills, MLOps proficiency, and exposure to modern AI methods. Preferred skills include knowledge graphs, probabilistic record linkage, causal inference, and media/ad tech experience.

What you'll do

  • Own end-to-end delivery of significant data science projects from problem scoping to production deployment
  • Make independently reasoned decisions on methodology, model selection, and evaluation; document technical solutions
  • Lead solution design; break down complex epics into user stories with acceptance criteria; adopt DataOps and MLOps best practices
  • Build production-quality Python and PySpark code on Databricks; implement advanced ML and AI workflows including entity resolution, probabilistic record linkage, embedding-based matching, semantic similarity, and LLM-augmented pipelines
  • Develop and maintain reusable tools, libraries, and documentation; conduct code reviews with constructive feedback
  • Mentor junior data scientists on technical execution, code quality, and career development; lead internal talks or workshops on ML topics
  • Collaborate cross-functionally with product, engineering, and operations; translate business requirements into technical specifications; partner with data engineering on scalable pipeline design; participate in design reviews and working groups

Requirements

  • Bachelor's degree in Statistics, Data Science, Computer Science, Mathematics or related quantitative field; Master's preferred
  • 3–5 years of hands-on data science experience, with the ability to own and deliver complex projects independently
  • Advanced Python with production-quality code, testing, and documentation
  • Strong SQL and PySpark skills for billion-row datasets
  • Experience with Databricks workflows, Delta Lake, and job orchestration
  • Working knowledge of cloud platforms (AWS or GCP)
  • Solid command of core ML techniques: regression, classification, clustering, model evaluation, experimental design
  • Proficiency with MLOps practices: experiment tracking, pipeline orchestration (Airflow), reproducible model deployment
  • Exposure to modern AI methodologies: RAG systems, LLM-augmented models, vector databases, semantic search
  • Strong communication skills for documentation and cross-functional collaboration
  • Ability to mentor junior data scientists and contribute to team standards

Tech stack

Python, PySpark, Databricks, Delta Lake, SQL, AWS, GCP, Airflow, ML, AI, RAG systems, LLM-augmented models, vector databases, semantic search, knowledge graph, RDF, OWL, SPARQL
