Full job description
Merkle, a dentsu company, is seeking a mid- to senior-level Data Engineer to design and build big data architectures and pipelines for data lakehouses in cloud environments. Responsibilities include data ingestion and processing, supporting machine learning model deployment, collaborating with BI consultants, and automating CI/CD pipelines. Candidates should have 3-7 years of experience with Python, cloud data services (Azure, AWS, GCP), big data technologies (Databricks, Snowflake, etc.), relational and NoSQL databases, streaming technologies (Kafka, Event Hubs), orchestration tools (Airflow, dbt), and CI/CD tools (Azure DevOps, Jenkins). Strong analytical and communication skills are required. The role offers hybrid work flexibility and is located in Prague, Czech Republic.
What you'll do
- Design and implement data ingestion and processing pipelines using public cloud services
- Assist stakeholders with data-related technical issues and infrastructure needs
- Collaborate with Business Intelligence consultants to assemble large and complex data sets
- Support machine learning teams in deployment and optimization of AI/ML models and algorithms
- Develop data pipelines for marketing automation, customer acquisition, and other business areas
- Use infrastructure as code and CI/CD tools to automate release pipelines
- Document data pipelines and logic using Confluence
- Plan activities using Agile methodology in Jira
- Support pre-sales by proposing technical solutions and estimating effort
Requirements
- Experience in building and productionizing big data architectures, pipelines and data sets
- Understanding of data concepts and patterns: big data, data lake, lambda architecture, stream processing, DWH, and BI
- 3-7 years of experience in a Data Engineer role
- Advanced Python programming skills
- Experience with object-oriented, functional, or scripting languages like Bash, Scala, Java, R, PowerShell
- Experience with data services from Azure, AWS, and GCP public clouds
- Experience with big data technologies such as Databricks, Fabric, AWS Glue, Snowflake, Dataproc, BigQuery
- Experience with relational databases like MS SQL, Postgres, Aurora DB and NoSQL databases like DynamoDB, MongoDB, Elasticsearch, Redis
- Experience with streaming technologies like Kafka, Event Hubs, Kinesis
- Experience with orchestration, compute, and ETL services like dbt, Airflow, Cloud Composer, AWS Step Functions, AWS Lambda, Azure Functions
- Strong analytical skills working with structured and unstructured datasets
- Experience building processes supporting data transformation, data structures, metadata, dependency and workload management
- Experience setting up and using CI/CD automation tools like Azure DevOps, GitHub Actions, Jenkins
- Good communication skills and ability to adapt to changing circumstances
Tech stack
Python, Bash, Scala, Java, R, PowerShell, Azure, AWS, GCP, Databricks, Fabric, AWS Glue, Snowflake, Dataproc, BigQuery, MS SQL, Postgres, Aurora DB, DynamoDB, MongoDB, Elasticsearch, Redis, Kafka, Event Hubs, Kinesis, dbt, Airflow, Cloud Composer, AWS Step Functions, AWS Lambda, Azure Functions, Azure DevOps, GitHub Actions, Jenkins, Power BI, Tableau, Looker, Adobe Analytics, Google Analytics, Salesforce Cloud
Benefits
- Hybrid work models with adaptable hours and home office options
- Competitive salaries
- Volunteer days
- Wellness days
- Modern offices
- Multi-cultural environment fostering creativity and innovation
- Global collaboration with over 70,000 colleagues across 145 countries
- Inclusive and supportive culture celebrating diversity
- Innovation-driven projects with next-generation technologies
- Purpose-driven culture fostering sustainable growth
- Growth opportunities through training, mentorship, and data-driven projects
- Access to opportunities across the dentsu global network