Sun. Jul 14th, 2024

    I think it is safe to assume that every professional here on LinkedIn has at some point heard that, in the 21st century, no business can afford not to take full advantage of the data it generates. Customers and clients speak to you through the data they generate when they interact with your offerings. The ability to understand these hidden messages has propelled the title “Data Scientist” to be crowned the sexiest job of the 21st century. Here are some stats:

    • 90% of all data has been created in the last two years
    • Today it would take a person approximately 181 million years to download all the data from the internet
    • Google gets over 3.5 billion searches daily
    • WhatsApp users exchange up to 65 billion messages daily
    • 80-90% of the data we generate today is unstructured
    • 95% of businesses cite the need to manage unstructured data as a problem for their business

    The last two bullet points are quite telling. Most of our data has no proper structure to it: social media posts, videos, images, emails, and text messages. Most of it is not labelled. This presents unique challenges when it comes to analysing the data and uncovering insights. This is where seasoned data scientists come in. They have the skill sets and tools to dig into those gigabytes of voice calls using algorithms that “understand” what the client is talking about. I used the word “seasoned” for a reason. It takes a lot of hard work, searching, and reading to get to that level. That journey to up-skill yourself can be a very frustrating one, because a large body of machine learning content online focuses on Supervised Machine Learning, where one deals with much cleaner datasets that have a structure to them and, most importantly, labels. I too teach a course on this via Udemy.


    But if 80-90% of our data is unstructured, there is a need to learn how to perform Unsupervised Machine Learning, and that is what I am teaching in my new Udemy course. For those reading this who are data science students or practitioners, I discuss dimension reduction techniques, clustering, and autoencoders. There are projects with code written in Python: fraud detection (anomaly detection), customer segmentation (clustering), and feature extraction (autoencoders).

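To give a flavour of what clustering unlabelled data looks like, here is a minimal sketch of customer segmentation with k-means. It is not taken from the course: the customer figures, the `kmeans` helper, and the choice of two segments are all made up for illustration, and a real project would more likely use a library such as scikit-learn and scale the features first.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain k-means. Returns (centroids, labels)."""
    # Assumption for reproducibility: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical customers: (monthly spend, visits per month). No labels given.
customers = [(20, 1), (220, 9), (25, 2), (200, 8), (30, 1), (210, 10)]
centroids, labels = kmeans(customers, k=2)

for customer, segment in zip(customers, labels):
    print(customer, "-> segment", segment)
```

The algorithm is never told which customers are "low spend" or "high spend"; it groups them purely by how close they sit to one another, which is the point of unsupervised learning.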

    You can access the course here:

    Udemy published the course today, so there is a coupon valid for the next 5 days that gives you 50% off. The code is UNSUPMLFP2020.

    Happy learning and it’s a brave new world!
