Looking for rockstar US-based engineers who have been itching for an opportunity to flex their programming and analytical skills in creating next-generation network technology products.

Skill Set:
Python, Spark, Node.js, Kafka, Scala, HDFS, and Java (any combination of the listed skills is fine)
NoSQL and PostgreSQL

Job Duties:
Design and develop artificial intelligence and machine learning applications in big data ecosystems using Spark.
Develop and maintain the backend databases and ETL flows for the client's enterprise portal.
Handle development (coding the application), code review and maintenance, unit and integration testing, and application build and deployment.
Develop data pipelines to integrate UI frameworks (Node.js) with messaging systems (Kafka); a minimal producer sketch follows this list.
Analyze existing databases (PostgreSQL and MySQL) to develop effective systems.
Develop stored procedures in Oracle and MySQL and optimize queries to improve frontend performance.
Participate in discussions with clients to analyze user requirements for new projects and assess the feasibility of implementation with existing technologies.
Develop Spark applications for processing batch and streaming data sets using Python, Scala, and Java.
Create a synchronization mechanism from a local server to HDFS using LFTP and Python; see the sync sketch after this list.
Set up and maintain a multi-node Kafka cluster for the production environment.
Analyze information to recommend and plan the installation of new systems or modification of existing ones.
Set up Hadoop and Spark clusters from scratch for the preproduction environment.
Write SQL queries and perform back-end testing for data validation to verify data integrity during migration from back end to front end.
Use the Hive data warehouse tool to analyze data in HDFS and develop Hive queries.
Implement custom Spark UDFs for comprehensive data analysis using both Scala and Python.
Provide POCs on core Spark and Spark SQL that replicate the existing ETL logic.
Develop ETL flows for data movement using Python and PySpark; an illustrative PySpark sketch follows this list.
Take on tasks across the various phases of software development, including preparing software specifications and business requirement documents.
Create dimensional models for the backend tables in MySQL.
Work extensively in data analysis and wireless systems to develop predictive and forecasting algorithms.
Create and populate Hive external tables with structured data for the data scientists to train their models.
Collect real-time networking data from the provided REST APIs and push it to the database using Python; see the REST-to-database sketch below.
Use the custom anomaly detection function developed by the data scientists as a Spark UDF and create an RDBMS table for the UI dashboards; a hedged UDF sketch closes this list.
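
To illustrate the Kafka integration duty above, here is a minimal Python producer sketch using the kafka-python client. The broker address, topic name, and payload fields are hypothetical stand-ins, not details from this posting.

import json
from kafka import KafkaProducer

# Hypothetical broker list and topic; replace with the cluster's real values.
producer = KafkaProducer(
    bootstrap_servers=["broker1:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish one illustrative event for downstream consumers (e.g. a UI pipeline).
producer.send("network-events", {"device_id": "sw-01", "status": "up"})
producer.flush()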
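For the LFTP-to-HDFS synchronization duty, a minimal sketch driving the lftp and hdfs command-line tools from Python; every host name and path here is a hypothetical placeholder.

import subprocess

REMOTE = "sftp://user@fileserver"       # hypothetical remote source
LOCAL_DIR = "/data/staging"             # hypothetical local landing directory
HDFS_DIR = "/warehouse/raw/network"     # hypothetical HDFS target

def sync_to_hdfs():
    # Mirror only newer files from the remote outbound directory to local staging.
    subprocess.run(
        ["lftp", "-e", f"mirror --only-newer /outbound {LOCAL_DIR}; quit", REMOTE],
        check=True,
    )
    # Push the staged files into HDFS, overwriting any existing copies.
    subprocess.run(["hdfs", "dfs", "-put", "-f", LOCAL_DIR, HDFS_DIR], check=True)

if __name__ == "__main__":
    sync_to_hdfs()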
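For the Python/PySpark ETL duty, a minimal batch-flow sketch; the input path, column names, and output layout are assumptions made for illustration.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Hypothetical raw input; header=True reads the first CSV row as column names.
raw = spark.read.option("header", True).csv("hdfs:///raw/events")

cleaned = (
    raw.dropDuplicates(["event_id"])                        # drop repeated events
       .withColumn("event_ts", F.to_timestamp("event_ts"))  # parse the timestamp
       .filter(F.col("event_ts").isNotNull())               # discard unparsable rows
       .withColumn("event_date", F.to_date("event_ts"))     # derive a partition key
)

# Land curated data back in HDFS, partitioned by day for downstream consumers.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet("hdfs:///curated/events")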
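For the REST-to-database duty, a sketch using the requests and psycopg2 libraries; the endpoint, DSN, table, and field names are all hypothetical.

import requests
import psycopg2

API_URL = "https://example.com/api/metrics"      # hypothetical REST endpoint
DSN = "dbname=network user=etl host=localhost"   # hypothetical Postgres DSN

def load_metrics():
    records = requests.get(API_URL, timeout=30).json()
    conn = psycopg2.connect(DSN)
    try:
        # The with-block commits on success and rolls back on error.
        with conn, conn.cursor() as cur:
            for rec in records:
                cur.execute(
                    "INSERT INTO network_metrics (device_id, metric, value, ts) "
                    "VALUES (%s, %s, %s, %s)",
                    (rec["device_id"], rec["metric"], rec["value"], rec["ts"]),
                )
    finally:
        conn.close()

if __name__ == "__main__":
    load_metrics()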
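Finally, for the anomaly-detection UDF duty: the data scientists' actual function is not shown in this posting, so a simple relative-deviation score stands in; the input path, table name, and connection details are likewise assumptions.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

def anomaly_score(value, baseline):
    # Stand-in for the data scientists' custom function (not shown in the posting).
    if value is None or baseline in (None, 0.0):
        return 0.0
    return abs(value - baseline) / baseline

score_udf = F.udf(anomaly_score, DoubleType())

metrics = spark.read.parquet("hdfs:///curated/metrics")   # hypothetical input
scored = metrics.withColumn("score", score_udf("value", "baseline"))

# Persist scores to an RDBMS table that the UI dashboards can query directly.
(scored.write.format("jdbc")
    .option("url", "jdbc:postgresql://localhost/dashboards")
    .option("dbtable", "anomaly_scores")
    .option("user", "etl")
    .option("driver", "org.postgresql.Driver")
    .mode("overwrite")
    .save())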