Full job description
The Director, Data Collective leads Sovrn's data platform team, which manages pipelines, the lakehouse, data services, and cloud infrastructure. The role includes team leadership, architecture, hands-on engineering, AI-native data engineering, operational excellence, and cross-functional collaboration. It requires 10+ years of software/data engineering experience, 4+ years leading engineering teams, and proficiency with Python, Spark, Kafka/Redpanda, Databricks, AWS, Terraform, CI/CD, and AI/agentic production systems. Experience with large-scale data systems, vector databases, adtech infrastructure, and data security compliance is preferred. Location: Boulder, CO, with a hybrid work option. Salary: $225,000-$250,000 plus bonus, equity, and benefits.
What you'll do
- Own the skill mix of the Data Collective team; lead hiring and performance management for engineers from mid-level to Principal
- Set technical and cultural standards for team including design, code review, on-call, and cross-team partnership
- Mentor and grow engineers through hands-on design collaboration, technical coaching, and career frameworks
- Partner with engineering leadership on org-wide planning, budgeting, and roadmap tradeoffs; represent team to executives
- Drive architectural decisions across pipeline design, data modeling, lakehouse architecture, and data services
- Contribute to the design and architecture of data pipelines, the lakehouse, and data services, including streaming, batch, and petabyte-scale storage and query
- Lead design reviews and set technical standards; raise engineering rigor, observability, and operational excellence
- Make tradeoffs on performance, cost, governance, and reliability; validate team's estimates and risk assessments
- Set team direction on AI-native data engineering, including LLMs, RAG, agentic workflows, and AI-assisted tooling
- Establish standards for evaluating, trusting, and operating AI-powered systems in production, including observability, fallback, governance, and cost control
- Identify high-leverage AI applications in the data stack, such as pipeline optimization, anomaly detection, automated data quality, forecasting, and LLM-powered data services
- Own the operational posture of the data platform, including SLOs, on-call health, incident response, and continuous improvement
- Manage infrastructure cost footprint across AWS and Databricks; drive cost improvements through architecture and commitment management
- Drive Infrastructure as Code adoption, policy-as-code, governance frameworks (RBAC/ABAC, IAM, SCIM), and CI/CD for infrastructure
- Balance investment in new capability, platform health, and tech debt
- Provide domain expertise to enable business growth through data services and data models
- Partner with Product, Data Science, AI/ML, Platform, and Security teams to ship end-to-end and improve data asset usability and safety
- Serve as senior counsel to internal teams, leadership, and external customers of Data-as-a-Service products
- Communicate clearly across formats, from architecture documents and design reviews to executive updates on cost, capacity, and risk
Requirements
- 10+ years of software / data engineering experience with hands-on track record in data platforms, distributed systems, or backend infrastructure
- 4+ years leading and growing engineering teams including hiring, leveling, and performance management of senior and principal-level engineers
- Deep, current technical proficiency including reading code, writing design docs, and leading architecture
- Hands-on experience in big data and distributed data processing in AWS ecosystem (Python, Spark, Kafka/Redpanda, Databricks or similar lakehouse platforms)
- Experience operating data systems at scale: real-time streaming, batch pipelines, data lakes, metadata management, lineage, and governance
- Working knowledge of cloud platform engineering practices: IaC (Terraform), CI/CD, observability, IAM, and cost management
- Track record of leading or contributing to AI/agentic engineering efforts in production
- Hands-on experience operating production vector databases at scale with pipelines and infrastructure to refresh hundreds of millions of vectors daily
- Familiarity with adtech data infrastructure and programmatic ecosystem is a strong plus
- Experience with data security and compliance (PII, CCPA, GDPR)
- Ability to communicate architectural concepts and team strategy to engineers, executives, and the board
- Comfort driving technical and organizational decisions in ambiguous, fast-moving environments
Tech stack
Python, Redpanda, Kafka, Databricks, Spark, AWS, S3, Terraform, Datadog, GitHub, LLMs, agentic tooling, CI/CD, IAM, SCIM, vector databases
Benefits
- Medical, dental, and vision coverage
- Short and long-term disability
- Life insurance
- Paid parental leave
- 401(k) plan and match
- 11 paid holidays
- Flexible vacation
- Commuter benefits
- Bonus and equity