Lead Data Engineer - PySpark - Hyderabad
Salesforce
This job is no longer accepting applications
To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.
Job Category
Software Engineering
Job Details
About Salesforce
We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too: driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good – you’ve come to the right place.
The Data and Analytics Organization (DnA) is Salesforce's cornerstone for fostering growth and margins through unparalleled data insights. From robust governance to strategic execution, we support data pioneers with an unbiased approach. Our Enterprise Data Strategy builds a solid data foundation, fostering a culture of data-driven decisions. We ensure end-to-end quality through a cohesive data supply chain. By deploying and integrating platform tools, we enable seamless data access and automated data management, driving efficiency and growth with actionable insights.
Your Impact:
- Be responsible for the technical solution design, lead the technical architecture and implementation of data acquisition and integration projects, both batch and real time
- Define the overall solution architecture needed to implement a layered data stack that ensures a high level of data quality and timely insights
- Communicate with product owners and analysts to clarify requirements
- Craft technical solutions and assemble design artifacts (functional design documents, data flow diagrams, data models, etc.)
- Build data pipelines using data processing tools and technologies from open source and proprietary products
- Serve the team as a domain expert and mentor for ETL design and other related big data and programming technologies
- Identify incomplete data, improve quality of data, and integrate data from several data sources
- Proactively identify performance & data quality problems and drive the team to remediate them. Advocate architectural and code improvements to the team to improve execution speed and reliability
- Design and develop tailored data structures
- Reinvent prototypes to create production-ready data flows
- Support Data Science research by designing, developing, and maintaining all parts of the Big Data pipeline for reporting, statistical, machine learning, and computational requirements
- Perform data profiling, sophisticated sampling, statistical testing, and testing of reliability on data
- Clearly articulate pros and cons of various technologies and platforms in open source and proprietary products. Implement proofs of concept on new technologies and tools to help the organization pick the best tools and solutions
- Strong SQL optimization and performance tuning experience in a high volume data environment that uses parallel processing
- Teams are using the following: SQL, Python, Airflow, AWS, Spark, Tableau, AWS EMR, Snowflake
- Participate in the team’s on-call rotation to address sophisticated problems in real time and keep services operational and highly available
Required Skills:
- 4-12 years of experience in data engineering
- Build programmatic ETL pipelines with SQL based technologies and platforms
- Solid understanding of databases, and working with sophisticated datasets
- Data governance, verification, and data documentation using current and future tools and platforms
- Work with different technologies (Python, shell scripts) and translate logic into well-performing SQL
- Perform tasks such as writing scripts, web scraping, getting data from APIs etc.
- Automate data pipelines using scheduling tools like Airflow
- Experience with CI/CD technologies and tools like Jenkins, Ant or Gradle, GitHub
- Be prepared for changes in business direction and understand when to adjust designs
- Experience writing production level SQL code and good understanding of Data Engineering pipelines
- Experience with Hadoop ecosystem and similar frameworks
- Previous projects should display technical leadership with an emphasis on data lake, data warehouse solutions, business intelligence, big data analytics, enterprise-scale custom data products
- Knowledge of data modeling techniques and high-volume ETL/ELT design
- Experience with version control systems (GitHub, Subversion) and deployment tools (e.g. continuous integration) required
- Experience working with Public Cloud platforms like GCP, AWS, or Snowflake
- Ability to work effectively in an unstructured and fast-paced environment both independently and in a team setting, with a high degree of self-management with clear communication and commitment to delivery timelines
- A related technical degree required
Accommodations
If you require assistance due to a disability when applying for open positions, please submit a request via the Accommodations Request Form.
Posting Statement
At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at www.equality.com and explore our company benefits at www.salesforcebenefits.com.
Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.
Salesforce welcomes all.