Faiyaz Ahmad

I'm looking for a role

About

  • Recently graduated with an MS in Data Science with High Distinction from the University of Michigan-Dearborn on April 27, 2024.
  • 4.5+ years of experience in data science across manufacturing operations, healthcare, and sports analytics.
  • Expertise in developing, fine-tuning, and deploying machine learning, deep learning, computer vision, and natural language processing (NLP) models.
  • Hands-on experience developing cost-optimized, efficient, secure, scalable, and highly available machine learning and data analytics solutions on AWS and GCP using compute, database, object storage, data warehousing, and serverless services.
  • Strong technical background in Python and R programming, data structures and algorithms, data modeling, data normalization, ad-hoc data analysis, and complex SQL querying.
  • Proficient in handling large-scale data and maintaining robust ETL/ELT pipelines to support data analysis and machine learning workflows.
  • Reach out via email: faiyazahmad5045@gmail.com

Education

Master of Science, Data Science

Aug'2022 - Apr'2024

University of Michigan-Dearborn, Michigan, USA

Relevant courses: Database Systems, Cloud Computing (GCP), Big Data, Multivariate Statistics, Deep Learning, Applied Regression Analysis, Natural Language Processing, Data Security and Privacy, and Artificial Intelligence.

Bachelor of Technology, Mechanical Engineering

2014 - 2018

Motilal Nehru National Institute of Technology, Prayagraj, India

Relevant courses: Industrial Engineering, Numerical Methods and Statistical Techniques, Quality Engineering, and Software Project Management.

Professional Experience

Machine Learning Engineer

Jun'2024 - Present

ScriptChain Health, Boston, MA, USA

  • Architected a data preprocessing pipeline on S3 and EC2, and used SageMaker for ML model training and deployment.
  • Fine-tuned the Llama 2 model using Quantized Low-Rank Adaptation (QLoRA) with 8-bit quantization for clinical note generation from patient health records, reducing the GPU VRAM requirement in AWS SageMaker by 50%.
  • Built a hybrid model using a Graph Attention module, Time-LLM, and Llama to predict 30-day hospital readmission.
  • Reduced data preprocessing time from 500 hours to 10 minutes by implementing efficient table joins on 500 million rows using AWS S3, Glue, Athena, and optimized SQL queries.
  • Led a cost-saving initiative for the AI team’s AWS resources, optimizing compute and storage allocation and saving $6,000 annually.

Research Assistant (Data Science)

Jan'2024 - Apr'2024

Sustainability Center-University of Michigan, Dearborn, MI, USA

  • Implemented ResNet50, MobileNetV2, and U-Net architectures with modified dense layers for facial landmark detection on thermal images, trained with a custom loss function (wing loss). Achieved a Normalized Mean Error of 0.04.
  • Built a low-latency video streaming analysis system for object detection on a Raspberry Pi mini autonomous vehicle using Kinesis; debugged and installed the C++ Kinesis Video Streaming SDK producer on the Raspberry Pi module.
  • Developed a Django-based full-stack web application for advanced data analysis and machine learning model development using JavaScript, HTML, Python, S3, API Gateway, Lambda, and SageMaker. Built a Python REST API for CRUD operations.

Data Science Analyst

Aug'2023 - Nov'2023

iLabs-University of Michigan, Dearborn, MI, USA

  • Collected data through web scraping with Python scripts. Performed data preprocessing, data modeling, exploratory data analysis, and statistical analysis to pinpoint critical factors contributing to a hockey team’s success.
  • Built an interactive Tableau dashboard to track player and team performance.

Data Scientist

Jul'2018 - Mar'2022

Hero MotoCorp Limited, Halol, Gujarat, India

  • Led the implementation of 10+ computer vision systems for quality initiatives (component identification, anomaly detection, and missing-part detection on the production line) in collaboration with cross-functional teams, achieving $100,000 in annual savings.
  • Conducted A/B testing to optimize heat treatment process parameters for crankshaft manufacturing, achieving a 5% improvement in quality rate, and performed statistical analysis in R to monitor variations in process and quality parameters.
  • Designed a CNN-based autoencoder with PyTorch for automated defect detection on painted components. Reduced the manpower requirement by 10, cut quality stations from 10 to 3, and ensured 100% quality inspection coverage.
  • Implemented anomaly detection on streaming sensor data from multiple machines using PCA, Isolation Forest, and K-Means clustering, increasing machine availability by 100+ hours/month and productivity by 10%.
  • Trained predictive maintenance models (Linear Regression, Random Forest, and CNN) to predict the residual life of machines from sensor data. The CNN model achieved a generalization error of 9.8 RMSE.
  • Achieved a 10% productivity improvement in the machine shop by developing a real-time production monitoring dashboard to quickly locate failure points and support root cause analysis. Implemented using AWS cloud services such as Kinesis, Lambda, and DynamoDB, along with Grafana and Tableau, to track KPIs such as equipment efficiency, productivity, and quality rate.
  • Reduced regular and ad-hoc reporting time by 90% by implementing ETL systems with AWS Glue, S3, Lambda, and Athena, extracting data from various sources including ERP systems (SAP) and SQL databases.
  • Led 20+ major and minor data analytics projects on resource optimization, process improvement, and quality improvement in the manufacturing unit, covering project planning, scheduling, resource allocation, project execution, and communication and collaboration with stakeholders.
  • Designed dashboards using Amazon QuickSight, Looker and Tableau to track machine OEE, productivity metrics, MTBF, and key operational parameters by extracting data from Redshift and BigQuery through complex SQL queries.
  • Performed batch data processing using Spark on EMR, staged data in S3, and loaded it into Redshift for complex querying and analysis.
  • Utilized Athena with the Glue Catalog for ad-hoc querying, integrating with Tableau for analysis, visualization, and reporting.
  • Performed statistical analysis and hypothesis testing in R on manufacturing machine processes to assess quality parameters, identify variations, and optimize product quality.
  • Executed data modeling and normalization, and conducted data quality checks to ensure data accuracy and integrity.

Portfolio Projects

Batch and Streaming Big Data Analysis Projects
End to End Football Leagues Analytics    [Live Dashboard]
  • Designed a robust football data analytics pipeline on both AWS and GCP, encompassing 950+ global leagues, ensuring automated data collection, preprocessing, and storage.
  • Deployed a cloud solution emphasizing serverless architecture, automated scheduling, minimal overhead cost, real-time data processing, and seamless integration with visualization and machine learning tools.
  • Orchestrated data warehousing solutions using GCP’s BigQuery and AWS’s Redshift, enabling robust SQL querying, seamless data transfer, and integration with Looker, Tableau and machine learning platforms like Vertex AI and SageMaker.
Twitter Streaming Sentiment Analysis    [Report]
  • Built a streaming pipeline using the Twitter API, NiFi, Kafka, and Spark Streaming on an AWS EC2 instance.
  • Trained Word2Vec and Decision Tree models for sentiment prediction. Achieved 78% accuracy and 0.69 AUC-ROC.
  • Built a dashboard for visualizing sentiment on a particular topic and running ad-hoc queries on streaming data over 5-minute windows.
SQL Query Performance Evaluation    [Report]
  • Extracted the IMDB dataset from the server, preprocessed it, and created the ER diagram and the DDL and DML statements.
  • Created different analytical queries and evaluated the performance of lookup queries with and without indexing.
  • Indexing reduced lookup query time by about 85%.
Machine Learning Projects
Bank Telemarketing Term Deposit Subscription Analysis and Prediction    [Report]
  • Performed data cleaning, exploratory data analysis, ANOVA, and chi-square tests to check feature independence.
  • Trained logistic regression and decision tree classifiers to predict the subscription probability. Achieved 0.60 AUC-ROC.
Adult Income Class Classification
  • Performed data preprocessing, exploratory data analysis, feature engineering, and treatment of imbalanced data.
  • Trained logistic regression and random forest classifiers to predict the annual income class. Achieved a 0.66 F1 score and a 0.88 ROC-AUC score.
Deep Learning Projects
Deep Implicit Movie Recommendation system
  • Implemented a deep implicit movie recommendation system on the IMDB dataset to predict a user's rating for a particular movie.
  • Optimized the model using triplet loss, which measures similarity between user and item embeddings. Achieved 92% accuracy.
Autonomous Driving Object Detection using YOLOv2
  • Developed an object detection model to identify 80 different object classes with 5 anchor boxes using the YOLO algorithm.
  • Performed transfer learning using a pre-trained YOLOv2 model and increased the model's accuracy by applying non-max suppression.
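The non-max suppression step mentioned above can be sketched in plain Python. This is a minimal illustration, not the project's code: it assumes boxes are given as (x1, y1, x2, y2) corners with one confidence score each, and the names `iou` and `non_max_suppression` and the 0.5 threshold are hypothetical choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In a multi-class detector this would typically be run once per class, so that overlapping boxes of different classes do not suppress each other.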
Remaining Useful Life Prediction for the Aircraft Gas Turbine Engine    [Report]
  • Performed exploratory data analysis and feature engineering, and built a CNN architecture for time-series analysis.
  • Trained the CNN model and fine-tuned it to reduce overfitting. Achieved 21.5 MAE and 24.3 RMSE.
Semantic Image Segmentation using U-Net Architecture
  • Implemented semantic image segmentation on the CARLA self-driving car dataset.
  • Used encoder and decoder blocks with skip connections at each level to improve mask prediction accuracy.
  • Applied sparse categorical cross-entropy loss to train the model for pixelwise prediction.
Handwritten digit recognition
  • Trained convolutional neural network (CNN) and artificial neural network (ANN) classifiers to recognize handwritten digits in 28x28-pixel images.
  • Achieved 98% accuracy with the CNN and 97% with the ANN. Evaluated F1 score, recall, precision, and the confusion matrix.
NLP Projects
Neural Machine Translation using Attention Model
  • Developed an attention-based model for Neural Machine Translation (NMT) specifically designed to translate human-readable dates into machine-readable dates.
  • The model incorporates a pre-attention Bi-LSTM layer and a post-attention LSTM layer to enhance translation accuracy.
Named-Entity Recognition to Process Resume using Transformer Model
  • Developed a Transformer-based model to process resumes and extract information such as name, skills, and designation.
  • Performed transfer learning using the DistilBERT fast tokenizer and a pre-trained transformer model for parsing resumes.
Part-of-Speech Tagging using Viterbi Algorithm
  • Implemented the Viterbi algorithm for part-of-speech tagging of each word in a sentence using the Penn Treebank dataset.
  • Achieved 92.3% accuracy on the test set using the Viterbi algorithm, compared with 85.18% accuracy when assigning the most frequent tag.
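For readers unfamiliar with the technique, a minimal sketch of Viterbi decoding for a bigram HMM tagger follows. It is an illustration under stated assumptions rather than the project's implementation: the add-one smoothing, the `<s>` start symbol, and the function names `train`, `log_prob`, and `viterbi` are all hypothetical choices, and a real run would train on Penn Treebank sentences instead of a toy corpus.

```python
import math
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sent in tagged_sentences:
        prev = "<s>"  # assumed sentence-start symbol
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    vocab = {w for counts in emit.values() for w in counts}
    return trans, emit, vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Return the most likely tag sequence for words under the bigram HMM."""
    if not words:
        return []
    tags = list(emit)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 slot for unseen words
    # best[i][t] = (log prob of best path ending in tag t at i, backpointer)
    best = [{}]
    for t in tags:
        score = log_prob(trans["<s>"], t, n_tags) + log_prob(emit[t], words[0], n_words)
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            prev, score = max(
                ((p, best[i - 1][p][0] + log_prob(trans[p], t, n_tags)) for p in tags),
                key=lambda x: x[1],
            )
            best[i][t] = (score + e, prev)
    # Backtrack from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]
```

Working in log space avoids numerical underflow on long sentences, which is why the scores are summed rather than multiplied.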
Word Sense Disambiguation
  • Applied the Naïve Bayes algorithm to identify the correct sense of a particular word in a given context.
  • Created a bag of words for each sense of a given word in the training set.
  • Implemented 5-fold cross-validation and achieved 90.18% accuracy on the ‘bass’ dataset.
Extractive Text Summarization using TF-IDF Vectorizer
  • Developed a text summarizer using TF-IDF and centroid-based methods on the CNN News dataset.
  • The centroid-based approach captured the central idea of the text better and achieved an average ROUGE-1 of 0.42.

Skills

Programming: Python, C++, JavaScript, R, MATLAB, C (90%)
Batch Processing: Redshift, Glue, EMR, BigQuery, Cloud Scheduler (80%)
Visualization: Tableau, QuickSight, Looker, Seaborn, ggplot (70%)
Big Data: Spark, Spark SQL, Spark Streaming (90%)
Data Streaming: Kinesis, Kafka, Spark Streaming (80%)
Compute Services: EC2, ECS, EKS, Lambda, Compute Engine, GKE (80%)
Object Storage: AWS S3, Cloud Storage (80%)
SQL Databases: PostgreSQL, MySQL, RDS, MS SQL Server (80%)
ML/DL Framework: Scikit-Learn, TensorFlow, Keras, PyTorch (75%)
Statistical Analysis: t-test, z-test, chi-square test, ANOVA, A/B testing (80%)
Batch Processing: EMR, Glue, DataFlow, Cloud Scheduler (85%)
Machine Learning: Linear Regression, Logistic Regression, Random Forest, SVM, Gradient Boosting (90%)
Deep Learning: CNN, RNN, LSTM, GRU, GAN, YOLO, Attention, Transformers, BERT (90%)
Machine Learning Platform: Vertex AI, AWS SageMaker (80%)

Recommendations

Here are a few recommendations from my reporting managers, colleagues, and friends.

Faiyaz has accomplished various projects under my mentorship and supervision, including setting up a reliability lab, an export assembly line, and machine shop expansion. Additionally, he has demonstrated exceptional skills in implementing projects related to computer vision to prevent model mixing in engine assembly, and data acquisition systems for vehicle dynamometer. He possesses a strong work ethic, adaptability, willingness to learn, good decision-making abilities, and a sense of responsibility. He is also an excellent project planner, collaborator, leader, and has demonstrated his ability to solve complex problems.

Parthesh Mehta

Deputy General Manager, Kohler India

Faiyaz was on my Production Engineering team, taking care of the Machine Shop, Engine Assembly, OBL, and IBL. We did a lot of improvement projects at the Hero MotoCorp Halol plant. I still remember his contributions to the Machine Shop capacity expansion, the new SQA lab installation, the export line installation, and the digitalization of Final Inspection. He has superb analytical skills. He is committed and dedicated to his work and does things differently than others. He knows how to use engineering in a practical way, and his out-of-the-box thinking differentiates him from others. Keep learning and keep growing.

Rahul Ranjan

Section Head, Production Engineering, Hero MotoCorp Limited

In my time with Faiyaz, I was consistently impressed by his deliverables. His data engineering work was performed at an advanced level, his data visualization was intentional and informative, and his ability to provide thorough documentation was something that was an unexpected (but welcome) surprise. Faiyaz has a unique combination of skills in multiple analytics disciplines. I believe he would benefit any future employer, and particularly an employer who can only support a small data analytics team. He is thoughtful, responsible, and doesn't take shortcuts. He proudly does his work right the first time and makes sure the output delivers on all objectives. I strongly recommend Faiyaz.

Christopher Jimenez

Project Manager, iLabs-University of Michigan-Dearborn