“The true sign of intelligence is not knowledge but imagination.”
I am just a curious learner. I explore a lot of things.
That has let me learn about photography, Carnatic music, Bharatanatyam dance, guitar,
djembe, tabla, paper quilling, and assorted arts and crafts, and yes, I did fail at a few :P.
You can find some of my works at the bottom of the page or on YouTube.
Enough about my creative side, I guess! It will still peek out at times.
Professionally, I am a "techie".
I graduated with an MSc in Big Data Science, driven by the desire to move into
the world of data science. I look forward to joining an organization where my growth in
data science aligns with the organization's growth.
"Never Give Up! Life is too short. Learn as much as you can even if you fail many times!"
Anuja Jyothi Vijayakumar,
Wembley, United Kingdom
anujajo84@gmail.com
Master of Science in Big Data Science • Sep 2020 - Sep 2021
Modules: Applied Statistics • Big Data Processing • Data Mining • Deep Learning and Computer Vision • Digital Media and Social Networks • Natural Language Processing • Neural Networks and NLP • Machine Learning
Diploma in 3D Animation & Visual Effects • April 2017 - March 2018
Autodesk Maya • Adobe Flash • Nuke • Fusion • Adobe After Effects • Adobe Illustrator • Adobe Photoshop • CorelDRAW
Diploma in Kerala Mural Painting • April 2016 - March 2017
Training Programme on IBM Mainframe S/390 • April 2007 - Jan 2008
CICS Application programming • COBOL programming • JCL • SQL Basics & DB2 Application programming
Bachelor of Technology in Information Technology • August 2003 - April 2007
Relevant Modules: Object Oriented Programming • Probability and Statistics • Data Structures and Algorithms • Database Management System • Software Engineering • Program Design and Development • Object Oriented Analysis and Design • Software Quality Management • Total Quality Management
Programming Languages: Python • SQL • PySpark
Databases: Oracle • MySQL • Snowflake • NoSQL • DynamoDB • Microsoft SQL
Data Visualization: Microsoft Power BI • Seaborn • Matplotlib • Pattern Recognition • Anomaly Detection
Text Classification for Deception Detection in Amazon Reviews
Pre-processed the review text using regular expressions, trained a Linear SVC classifier with k-fold cross-validation, and measured precision, recall, accuracy, and F-score. Compared the results after adding extra features and applying different stemming and lemmatization processes. Finally, performed an exploratory analysis of the dataset using seaborn and Matplotlib to examine linguistic and stylistic traits of the reviews and to compare the two classes.
Sequence Labelling for Movie Queries
CRF tagging of movie queries. Tagging on the movie queries dataset was first done using BIO tags and then improved by adding more features and POS tags.
Identifying the Genders of Characters in the British Soap Opera EastEnders
The following were used to experiment with and optimize the classification: preprocessing, tokenization, different n-gram models, line- and sentence-level features, cross-validation, and more.
Word Representation and Text Classification with Neural Networks
Sentiment analysis of Twitter and the IMDB Large Movie Review dataset. Custom word embeddings were obtained by training a skip-gram neural network model with negative sampling. GloVe embeddings were also used as pre-trained embeddings. Architectures such as LSTM and CNN were explored. Finally, all of these were applied to real-world text classification tasks such as sentiment analysis on Twitter and IMDB movie reviews.
Neural Machine Translation and Neural Dialogue Systems
Implemented a version of the seq2seq NMT model and enriched it with attention, used a pre-trained BERT as a classifier model, implemented a series of dialogue act taggers, and created an end-to-end dialogue system.
Deep Learning Coursework
Critical analysis and study of Generative Adversarial Nets and of super-resolution using convolutional neural networks. Both were applied to the MNIST dataset, exploring the PyTorch framework.
Data Mining Coursework
Data preprocessing; data exploration and visualization (exploring the sklearn, seaborn, pandas, numpy, and matplotlib libraries); data warehousing and online analytical processing (exploring the cubes library); classification, clustering, and association analysis (exploring the mlxtend library); outlier detection; and web mining (exploring the BeautifulSoup library).
Big Data Processing Project
Ethereum analysis involving time analysis, scam analysis, and a comparative evaluation of Spark and MrJob. The analyses covered the number of transactions per month, average transactions per month, top ten most popular stories, top ten most active miners, lucrative scams, scams over time and their correlation, and gas used over time.
Consultant • June 2022 - Present
Data Technical Trainer • Jan 2023 - Mar 2023
Consultant • Aug 2018 - July 2020
Software Engineer & Onsite Co-ordinator • Jan 2008 - Aug 2014