Full job description
Lead the strategy, architecture, and technical execution of a modern data platform powering internal intelligence, AI systems, and external B2B data products. Build data capabilities including knowledge graph products, enterprise-grade APIs, MCP-compatible services, developer-facing data tools, AI-ready data services, and commercial data products. Define product, graph, API, semantic, governance, and commercialization requirements. Partner with the enterprise data platform team to deliver these capabilities. Collaborate with the Chief AI Officer and with Product, Engineering, and Commercial leadership. Responsibilities include enterprise data strategy, platform leadership, data architecture, modeling, interoperability, semantic foundation, ontology, knowledge graph, rights and provenance management, AI-ready platform enablement, external data product commercialization, governance, trust, lifecycle management, and team leadership. Requires 10+ years of experience in data engineering, architecture, and platform leadership, with expertise in cloud data platforms, entity resolution, taxonomy design, Snowflake or comparable systems, graph and vector databases, ML/AI infrastructure, data governance, and commercialization. Strong leadership, communication, and strategic skills are needed. Location: New York, NY.
What you'll do
- Define and lead the company’s enterprise product data strategy, architecture, and operating model
- Shape, scale, and steward the shared enterprise product data platform across ingestion, storage, transformation, orchestration, governance, access, and activation
- Evaluate and select core technologies across data warehousing, Snowflake, lakehouse infrastructure, orchestration, graph databases, vector databases, metadata tooling, and ML/AI infrastructure
- Design and operationalize a modern Snowflake/lakehouse architecture capable of supporting structured, semi-structured, and unstructured data at scale
- Lead the development of robust ETL and ELT pipelines across batch, streaming, and event-driven workflows
- Establish canonical data models, semantic layers, and shared definitions across business and product domains
- Drive interoperability across internal systems and external products to support internal operations and external commercial use cases
- Establish data contracts, quality thresholds, freshness standards, schema versioning, lineage requirements, and validation systems for production-grade data infrastructure
- Design and govern enterprise ontology frameworks for consistency across entities, attributes, behaviors, relationships, and events
- Architect and scale a commercial intelligence graph connecting creators, sites, content, topics, entities, products, brands, retailers, user intent, audience behavior, licensing rights, attribution requirements, and downstream customer use cases
- Establish a clear semantic foundation to reduce disconnected schemas and one-off pipelines
- Design data models and access systems preserving source, rights, permissions, attribution, consent, freshness, licensing status, and commercial usage constraints
- Ensure external data products can answer questions about provenance, ownership, freshness, usage, and value flow
- Define how the intelligence layer is exposed to AI systems, agents, LLM applications, enterprise copilots, search products, commerce platforms, and developer ecosystems
- Support AI-native applications including model training, retrieval, inference, personalization, agentic workflows, and context delivery
- Support integration of structured and unstructured data, vector-based retrieval, and model-facing services for AI and machine learning
- Build evaluation and trust mechanisms for AI-facing data products including retrieval quality, source ranking, freshness scoring, confidence signals, hallucination-reduction workflows, provenance checks, and feedback loops
- Partner with product, engineering, and commercial leadership to turn core data assets into external B2B offerings including APIs, MCP-compatible services, developer tools, intelligence products, and data licensing models
- Build secure, reliable, enterprise-grade data services for customers, partners, applications, agents, and LLM ecosystems
- Define technical and operational requirements for data productization including access patterns, permissions, tenancy, SLAs, observability, documentation, and monetization support
- Partner with commercial leadership to support packaging, pricing, metering, entitlement management, usage reporting, customer-level access controls, and contract-specific restrictions
- Establish strong standards for data governance, lineage, metadata, cataloging, privacy, quality, security, and compliance
- Create frameworks for data stewardship, lifecycle management, and long-term retention of high-value longitudinal data
- Ensure platform reliability, security, privacy, and access control
- Protect creator value by preserving attribution, usage boundaries, licensing status, content ownership, monetization logic, and downstream reporting
- Build and lead a high-performance team spanning data engineering, data architecture, platform engineering, and ontology and semantic modeling
- Translate complex technical tradeoffs into clear business decisions, investment priorities, and product implications
- Serve as a senior strategic voice on how data can become a durable competitive advantage
Requirements
- 10+ years of experience in data engineering, data architecture, platform engineering, or related leadership roles
- Proven success building and scaling modern cloud-based data platforms in complex, high-volume environments
- Deep experience with entity resolution, identity graphs, canonical entity modeling, deduplication, taxonomy design, and reconciliation of messy real-world data across content, commerce, behavioral, and partner datasets
- Deep expertise in modern cloud data architecture, including Snowflake or comparable warehouse/lakehouse systems, graph databases, vector stores, orchestration frameworks, metadata systems, and production-grade data APIs
- Demonstrated experience leading enterprise data platform strategy, architecture, and evolution
- Strong experience with modern data stack technologies across storage, compute, orchestration, transformation, observability, and governance
- Hands-on understanding of ontology design, semantic modeling, metadata strategy, and knowledge graph architecture
- Experience building data platforms that support AI and machine learning use cases, including unstructured data, vector-based retrieval, and model-facing services
- Experience exposing data capabilities as external products, such as APIs, developer platforms, partner integrations, or commercially licensed data services
- Strong understanding of enterprise-grade reliability, security, privacy, and access control
- Demonstrated ability to lead both strategy and execution, from architecture decisions to org design to delivery
- Experience managing and developing senior technical talent across multiple data disciplines
- Strong cross-functional communication skills and ability to work effectively with executive, product, engineering, and commercial leaders
- Ability to operate in ambiguous environments and create structure, standards, and momentum where they do not yet exist
- Track record of personally leading platform architecture decisions and shipping production data products
- Comfortable moving between whiteboard architecture, schema design, API product requirements, vendor evaluation, executive tradeoff discussions, and team-building
- Experience building externally consumed, customer-facing data platforms with developer documentation, authentication, entitlements, SLAs, versioning, observability, and customer support workflows
Tech stack
Snowflake, lakehouse, graph databases, vector databases, metadata tooling, ML/AI infrastructure, ETL, ELT, APIs, MCP-compatible services, orchestration frameworks, data warehousing, data modeling, ontology design, semantic modeling, knowledge graph, data governance, data cataloging, data security, data privacy, data compliance, data licensing, developer platforms, retrieval systems, webhooks, model orchestration, agent frameworks