Senior Software Developer - Data Platform - Watson Orders
Watson Orders is an IBM technology group based in Silicon Valley, working on a world-class conversational AI system. Our mission is to deliver advanced solutions that address real-world needs in the quick service restaurant industry. We use state-of-the-art machine learning and related technologies to deliver a product that will help serve tens of millions of customers per day.
Your Role and Responsibilities
We are seeking a talented software developer to work on the data platform that powers transformative AI/ML products reaching tens of millions of customers per day, feeding billions of customers worldwide. The department covers data infrastructure, data pipelines, analysis, and performance optimization.
The ideal candidate has experience architecting, developing, and supporting large-scale data platforms & infrastructure with a focus on resilience, scalability, and performance within a fast-growing, agile environment.
• Develop and maintain the petabyte-scale data lake, warehouse, pipelines, and query layers.
• Develop and support a multi-region data ingestion system fed by geographically distributed edge AI systems.
• Develop and support AI research pipelines, training and evaluation pipelines, audio re-encoding and scanning pipelines, and various analysis outputs for business users.
• Use pipelines to manage resilient, idempotent coordination with external databases, APIs, and systems.
• Work with AI Speech and Audio engineers to support and co-develop heterogeneous pipelines over large flows of conversational AI data, accelerating experimentation with new AI models and improvements.
Required Technical and Professional Expertise
- 5+ years of professional Python experience
- 2+ years of pub/sub experience (Kafka, Kinesis, SQS, MQTT, etc.)
- 3+ years working with petabyte-scale data platforms
- 3+ years working in AWS
- Experience building schema-based parsers or ETLs using standard tooling in Python
- Experience developing with Apache Avro, Parquet schemas, SQLAlchemy (or similar ORMs), and PySpark in Python
Preferred Technical and Professional Expertise
- Professional experience with data platforms powering conversational AI (chatbots, virtual assistants, etc.)
- Professional experience developing and supporting large-scale lakehouses
- Professional experience architecting and implementing large-scale query engines such as Presto