Apply Directly on :https://www.capgemini.com/in-en/jobs/GLE4oI8BR2H0kp-eBn7f/palantir-data-engineer--9-to-15-years--pan-india/
Job Responsibilities
Ingest data into Foundry from external data sources and legacy systems using Agents, Magritte connectors, and Data Connection; work with raw files.
Excellent proficiency in data-processing scripting languages such as, but not limited to, Python, PySpark, and SQL.
Design, create, and maintain an optimal data pipeline architecture in Foundry.
Create and optimize data pipelines using PySpark for the back end and TypeScript for the front end; publish and consume shared libraries in Code Repositories.
Assemble large, complex data sets in Foundry that meet functional and non-functional business requirements.
Identify, design, and implement internal process improvements in Foundry: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Schedule pipeline jobs in Palantir; monitor data pipeline health and configure health checks and alerts (Data Expectations).
Build analytics tools using Contour, Quiver, the Workshop application, and Slate that utilize the data pipeline to provide actionable insights into KPIs such as customer acquisition, operational efficiency, and other key business performance metrics.
Good understanding of and working knowledge of Foundry tools: Ontology, Contour, Object Explorer, Ontology Manager, Object Editor using Actions/TypeScript, Code Workbook, Code Repository, Foundry ML.
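The health-check and Data Expectations responsibilities above can be sketched, outside Foundry, as a minimal framework-agnostic validator. All function and column names below are hypothetical; in Foundry itself, equivalent checks would be configured as Data Expectations attached to a pipeline transform:

```python
# Minimal sketch of data-expectation checks, independent of Foundry.
# The rule and column names are illustrative only.
from dataclasses import dataclass

@dataclass
class ExpectationResult:
    name: str
    passed: bool
    failing_rows: int = 0

def expect_no_nulls(rows, column):
    """Fail if any row has a null/missing value in `column`."""
    failing = sum(1 for r in rows if r.get(column) is None)
    return ExpectationResult(f"no_nulls:{column}", failing == 0, failing)

def expect_unique(rows, column):
    """Fail if `column` contains duplicates (a common primary-key check)."""
    values = [r.get(column) for r in rows]
    failing = len(values) - len(set(values))
    return ExpectationResult(f"unique:{column}", failing == 0, failing)

def run_health_checks(rows, checks):
    """Run all checks; a scheduler could page or alert on any failure."""
    return [check(rows) for check in checks]

rows = [
    {"customer_id": 1, "region": "EU"},
    {"customer_id": 2, "region": None},
    {"customer_id": 2, "region": "US"},
]
results = run_health_checks(rows, [
    lambda r: expect_no_nulls(r, "region"),
    lambda r: expect_unique(r, "customer_id"),
])
```

The same pattern scales to PySpark by replacing the row scans with DataFrame aggregations; the point is that each expectation yields a named pass/fail result a monitoring job can act on.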
Candidates must have 5+ years of experience in a Data Engineer role and should have experience using the following software/tools:
Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres, Cassandra, and MongoDB.
Experience with stream-processing systems: Storm, Spark Streaming, etc.
Experience with object-oriented and functional scripting languages: Python, Java, C++, Scala, etc.
Advanced working SQL knowledge and the ability to quickly envision a technical solution based on functional requirements: at least 4 years in SQL.
Experience building and optimizing big-data pipelines, architectures, and data sets: at least 5 years in PySpark/Python.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Strong analytical skills related to working with large datasets.
Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
Experience supporting and working with cross-functional teams in a dynamic environment.
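As an illustration of the SQL proficiency asked for above, a KPI such as monthly customer acquisition can be expressed as a grouped aggregation with a window function. The schema and data here are hypothetical, and SQLite is used purely for portability:

```python
import sqlite3

# Hypothetical schema: one row per customer with a signup date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, signed_up TEXT);
    INSERT INTO customers VALUES
        (1, '2024-01-05'), (2, '2024-01-19'),
        (3, '2024-02-02'), (4, '2024-02-27'), (5, '2024-02-28');
""")

# Customer-acquisition KPI: new customers per month, plus a running
# total computed with a window function over the grouped result.
query = """
    SELECT strftime('%Y-%m', signed_up) AS month,
           COUNT(*)                     AS new_customers,
           SUM(COUNT(*)) OVER (ORDER BY strftime('%Y-%m', signed_up))
                                        AS cumulative_customers
    FROM customers
    GROUP BY month
    ORDER BY month
"""
kpi = conn.execute(query).fetchall()
# kpi == [('2024-01', 2, 2), ('2024-02', 3, 5)]
```

The same query translates directly to Spark SQL or a PySpark `groupBy` plus `Window` aggregation when run inside a Foundry pipeline.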
Location
Bengaluru, Karnataka, India
About Company
Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organization of 350,000 team members in more than 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering and platforms. The Group reported 2022 global revenues of €22 billion.
Get The Future You Want | www.capgemini.com