Big Data Engineer
Experis Singapore | Date Posted: 14-Feb-2021
EA Licence No: 02C3423
Bachelor's / Honours, Masters / PhD
- Data platform construction and administration (design, capacity planning, installation, monitoring, optimization, etc.)
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Working with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
- Setting up and managing new and existing Spark and HBase clusters.
- Performance tuning and debugging of memory- and server-related exceptions in Hadoop, Spark, HBase, etc.
- Cluster maintenance as well as creation and removal of nodes using tools such as Ganglia, Nagios, Cloudera Manager Enterprise and Dell OpenManage.
- Performance tuning of Hadoop clusters and setting up the monitoring and notification frameworks.
- Working with Cloudera support on performance tuning and version upgrades.
- Working with the team to build and manage CI/CD pipelines, containerization and deployments.
- Applying database design and structure, real-time data integration and database tools to deliver top-notch data management
- Develop, administer, maintain, and implement data governance policies and procedures for ensuring the security, quality, integrity and availability of the company's data
- Provide domain-based experience with toolsets to help get the most from current and next-generation data technologies
- Manage the migration to new data technologies for structured, unstructured, streaming and high volume data
- Solution design using proven patterns, awareness of anti-patterns, performance benchmarking
- Collaborate with stakeholders and engineers to ensure that solutions meet business needs
- Degree in Computer Science, Software Engineering, Electrical Engineering, Applied Mathematics or related field of study.
- Significant experience working with big data processing tools and technologies such as Hadoop, YARN, Hive, Impala, Sqoop, Spark, Spark Streaming, HBase, Flume, Kudu, Kafka, NiFi, etc.
- Good troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.
- Good knowledge of Linux and Networking.
- Familiarity with open-source configuration management and deployment tools such as Puppet, Chef, Docker and Kubernetes, and with Linux scripting.
- Mastery of Unix commands
- Proficiency in SQL queries and relational databases
- Experience in MPP (massively parallel processing)
- Experience with Azure, AWS or other cloud-based PaaS/SaaS environments is a plus
Octavius, Whei Jie Yong | EA Licence No.: 02C3423 | Personnel Registration No.: R1110096