Antrixsh Gupta
Blockchain Engineer, Data Scientist, ML/AI Expert
Danalitic India Pvt. Ltd
11 years of total IT experience across all phases of Hadoop and Java development, along with experience in application development, data modeling, data mining, data science, machine learning, deep learning, and NLP; also a blockchain enthusiast.
-Good experience with Big Data ecosystems and ETL.
-Expertise in Java, Python, and Scala.
-Experience in data architecture, including data ingestion pipeline design, data analysis and analytics, and advanced data processing; experienced in optimizing ETL workflows.
-Experience in Hadoop (Cloudera, Hortonworks, MapR, IBM BigInsights): architecture, deployment, and development.
-Experience in extracting source data from sequential files, XML files, and Excel files, then transforming and loading it into the target data warehouse.
-Experience with SQL and NoSQL databases (MongoDB, Cassandra).
-Hands on experience with Hadoop Core Components (HDFS, MapReduce) and Hadoop Ecosystem (Sqoop, Flume, Hive, Pig, Impala, Oozie, HBase).
-Experience in ingesting real-time/near-real-time data using Flume, Kafka, and Storm (see the Kafka sketch after this list).
-Experience in importing and exporting data between relational databases and HDFS, in both directions, using Sqoop (see the Sqoop sketch after this list).
-Hands-on experience with Linux systems.
-Experience in using SequenceFile, Avro, and Parquet file formats; managing and reviewing Hadoop log files.
-Good knowledge of writing Spark applications in Python, Scala, and Java (see the PySpark sketch after this list).
-Experience in writing MapReduce jobs (see the Hadoop Streaming sketch after this list).
-Efficient in analyzing data using HiveQL and Pig Latin, partitioning existing data sets with static and dynamic partitions, and tuning data for optimal query performance (see the partitioning sketch after this list).
-Good experience with transformation and storage: HDFS, MapReduce, and Spark.
-Good understanding of HDFS architecture.
-Experienced in database development, ETL, OLAP, and OLTP.
-Knowledge of extracting an Avro schema using avro-tools and evolving it by editing the JSON schema files (see the Avro sketch after this list).
-Experience in loading large sets of structured, semi-structured, and unstructured data from UNIX systems, NoSQL stores, and a variety of sources into HBase tables.
-Experience in UNIX Shell scripting.
-Experience developing and maintaining applications on the AWS platform, including Amazon Simple Storage Service (S3), Amazon DynamoDB, Amazon Simple Queue Service (SQS), Amazon Simple Notification Service (SNS), Amazon Simple Workflow Service (SWF), AWS Elastic Beanstalk, and AWS CloudFormation (see the boto3 sketch after this list).
-Skilled at picking the right AWS services for the application.
-Proven expertise in supervised and unsupervised learning techniques (clustering, classification, PCA, decision trees, KNN, SVM), predictive analytics, optimization methods, natural language processing (NLP), and time series analysis.
-Experienced in machine learning regression algorithms such as simple, multiple, and polynomial regression, SVR (Support Vector Regression), decision tree regression, and random forest regression.
-Experienced in advanced statistical analysis and predictive modeling in structured and unstructured data environments.
-Strong expertise in business and data analysis, data profiling, data migration, data conversion, data quality, data governance, data lineage, data integration, Master Data Management (MDM), metadata management services, and Reference Data Management (RDM).
-Hands-on experience with data science libraries in Python such as pandas, NumPy, SciPy, scikit-learn, Matplotlib, Seaborn, BeautifulSoup, Orange, rpy2, LIBSVM, neurolab, and NLTK.
-Good understanding of building artificial neural networks and deep learning models using the Theano and TensorFlow packages in Python.
-Experienced in machine learning classification algorithms such as logistic regression, k-NN, SVM, kernel SVM, Naive Bayes, decision tree, and random forest classification (see the scikit-learn sketch after this list).
-Hands-on experience with R packages and libraries such as ggplot2, Shiny, h2o, dplyr, reshape2, plotly, R Markdown, ElemStatLearn, and caTools.
-Efficiently accessed data via multiple vectors (e.g. NFS, FTP, SSH, SQL, Sqoop, Flume, Spark).
-Experience in various phases of the software development life cycle (analysis, requirements gathering, design), with expertise in writing and documenting Technical Design Documents (TDD), Functional Specification Documents (FSD), test plans, gap analyses, and source-to-target mapping documents.
-Excellent understanding of Hadoop architecture, MapReduce concepts, and the HDFS framework.
-Strong understanding of project life cycle and SDLC methodologies including RUP, RAD, Waterfall and Agile.
-Very good knowledge and understanding of Microsoft SQL Server, Oracle, Teradata, and Hadoop/Hive.
-Strong expertise in ETL, Data warehousing, Operational Data Store (ODS), Data Marts, OLAP and OLTP technologies.
-Experience working on BI visualization tools (Tableau, Shiny & QlikView).
-Analytical, performance-focused, and detail-oriented professional offering in-depth knowledge of data analysis and statistics; utilized complex SQL queries for data manipulation.
-Experienced in statistical techniques including correlation, hypothesis testing, and inferential statistics, as well as data mining and modeling techniques using linear and logistic regression, decision trees, and k-means clustering (see the SciPy sketch after this list).
-Expertise in building supervised and unsupervised machine learning experiments on Microsoft Azure, utilizing multiple algorithms to perform detailed predictive analytics and building web-service models for all types of data: continuous, nominal, and ordinal.
-Expertise in using linear and logistic regression, classification modeling, decision trees, Principal Component Analysis (PCA), and cluster and segmentation analyses; has authored and coauthored several scholarly articles applying these techniques.
-Mitigated risk factors through careful analysis of financial and statistical data; transformed and processed raw data for further analysis, visualization, and modeling.
-Proficient in researching current processes and emerging technologies that require analytic models, data inputs and outputs, analytic metrics, and user-interface considerations.
-Assisted in determining the full domain of the MVP, created and implemented its data model for the app, and worked with app developers to integrate the MVP into the app and any backend domains.
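
Illustrative sketches

Several bullets above point to the sketches below. Each is a minimal, hedged illustration under stated assumptions, not production code. First, the Kafka sketch: a small near-real-time consumer loop using the kafka-python client; the topic name, broker address, and JSON value format are assumptions.

```python
# Minimal Kafka consumer sketch (kafka-python); the topic, broker, and
# value format are assumptions for illustration.
from kafka import KafkaConsumer  # pip install kafka-python
import json

consumer = KafkaConsumer(
    "events",                            # hypothetical topic name
    bootstrap_servers="localhost:9092",  # hypothetical broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value
    print(record)  # in practice: hand off to Spark or write to HDFS/HBase
```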
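The Sqoop sketch: a table import from a relational database into HDFS, driven from Python via subprocess. The JDBC URL, credentials, table, and target directory are illustrative assumptions, and Sqoop is assumed to be installed and on PATH; the export direction works analogously with sqoop export.

```python
# Sketch of a Sqoop import invoked from Python; all connection details
# below are hypothetical.
import subprocess

subprocess.run([
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/sales",  # hypothetical JDBC URL
    "--username", "etl_user", "-P",                  # -P prompts for password
    "--table", "orders",                             # hypothetical source table
    "--target-dir", "/user/etl/orders",              # HDFS destination
    "--num-mappers", "4",                            # parallel import tasks
], check=True)
```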
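The PySpark sketch: a small Spark application that reads a CSV from HDFS, aggregates it, and writes Parquet. The paths and column names are illustrative assumptions.

```python
# Minimal PySpark application sketch; paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-sales-rollup").getOrCreate()

# Read raw sales data from HDFS with header and schema inference.
df = spark.read.csv("hdfs:///data/sales.csv", header=True, inferSchema=True)

# Aggregate the amount column per day.
daily = df.groupBy("sale_date").agg(F.sum("amount").alias("total_amount"))

# Persist the rollup as Parquet for downstream queries.
daily.write.mode("overwrite").parquet("hdfs:///warehouse/daily_sales")
spark.stop()
```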
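The Hadoop Streaming sketch: a word-count mapper and reducer in one Python script, selected by a mode argument. Wired into a job via the hadoop-streaming jar (the script name and job wiring are assumptions), it forms a complete MapReduce job.

```python
#!/usr/bin/env python3
# wordcount.py -- Hadoop Streaming word count; run with "map" or "reduce"
# as the mode argument. The script name and job wiring are illustrative.
import sys

def mapper():
    # Emit a tab-separated (word, 1) pair for every token on stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Streaming delivers input sorted by key, so counts per word arrive
    # contiguously and can be summed with a single running total.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```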
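The partitioning sketch: static versus dynamic HiveQL partitions, issued here through a Hive-enabled SparkSession (the plain Hive CLI accepts the same statements). The table and column names, including the staging_sales source table, are assumptions.

```python
# Sketch of static vs. dynamic partitioning in HiveQL; table and column
# names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_part (id INT, amount DOUBLE)
    PARTITIONED BY (country STRING)
    STORED AS PARQUET
""")

# Static partition: the partition value is fixed in the statement itself.
spark.sql("""
    INSERT INTO sales_part PARTITION (country='IN')
    SELECT id, amount FROM staging_sales WHERE country = 'IN'
""")

# Dynamic partition: Hive derives partition values from the SELECT output.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
    INSERT INTO sales_part PARTITION (country)
    SELECT id, amount, country FROM staging_sales
""")
```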
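The Avro sketch: schema evolution in which a reader schema adds a nullable field with a default, so records written under the old schema remain readable. The fastavro library stands in here for the avro-tools workflow (with the CLI, `getschema` handles the extraction step); the record and field names are assumptions.

```python
# Sketch of Avro schema evolution with fastavro; record/field names are
# hypothetical.
from fastavro import parse_schema, writer, reader
import io

v1 = parse_schema({
    "type": "record", "name": "User",
    "fields": [{"name": "id", "type": "long"}],
})

# Evolved schema: a new nullable field with a default keeps it
# backward compatible with data written under v1.
v2 = parse_schema({
    "type": "record", "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
})

buf = io.BytesIO()
writer(buf, v1, [{"id": 1}])               # written with the old schema
buf.seek(0)
for rec in reader(buf, reader_schema=v2):  # read with the evolved schema
    print(rec)                             # {'id': 1, 'email': None}
```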
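The boto3 sketch: touching S3, DynamoDB, and SQS from Python. The bucket, table, and queue names are illustrative assumptions; credentials come from the standard AWS configuration, and the DynamoDB table is assumed to exist with event_id as its key.

```python
# Minimal boto3 sketch across three AWS services; all resource names are
# hypothetical.
import boto3

# S3: upload a local file to an object key.
s3 = boto3.client("s3")
s3.upload_file("report.csv", "my-data-bucket", "reports/report.csv")

# DynamoDB: write one item to an existing table.
table = boto3.resource("dynamodb").Table("events")
table.put_item(Item={"event_id": "e-001", "status": "processed"})

# SQS: notify a downstream consumer that the upload happened.
sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="ingest-queue")["QueueUrl"]
sqs.send_message(QueueUrl=queue_url, MessageBody="report.csv uploaded")
```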
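The scikit-learn sketch: the standard supervised workflow behind the classification bullets above (split, fit, evaluate), shown with a random forest on the bundled iris dataset so it runs as-is. Swapping in LogisticRegression, KNeighborsClassifier, SVC, or GaussianNB is a one-line change.

```python
# Sketch of a supervised classification workflow with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hold out 25% of the data for evaluation.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a random forest; any other classifier drops in here.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Score on the held-out split.
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```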
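The SciPy sketch: Pearson correlation and a two-sample t-test, the kind of correlation and hypothesis-testing work listed above, run on synthetic data generated purely for illustration.

```python
# Sketch of correlation and hypothesis testing with SciPy on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(50, 5, 200)            # synthetic sample
y = 0.8 * x + rng.normal(0, 3, 200)   # correlated counterpart

r, p_corr = stats.pearsonr(x, y)      # correlation coefficient and p-value
t, p_ttest = stats.ttest_ind(x, y)    # two-sample t-test on the means

print(f"Pearson r={r:.3f} (p={p_corr:.3g}); t={t:.3f} (p={p_ttest:.3g})")
```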