The video is about handling categorical variables (or the category data type) that contain a lot of categories.
In certain situations, you will find a categorical variable that has many categories, and many of them are very small, each occupying only a tiny proportion of the data. In such cases, you can reduce the number of categories in your data by combining the very small categories into one single category.
Complete PANDAS COURSE for FREE: http://surl.li/cectw
Join ML+ membership for exclusive Data science content
Checkout complete Data Scientist Learning Path here: https://edu.machinelearningplus.com/s...
🔹 Tips and Tricks on Combining Categorical Variables in Python Pandas.
We typically name this combined category "other", but you can give it any name. Let's look at an example based on book orders data. So this is the data set. In it we have a city building column, which is the city where the order was placed. There are various categories inside it.
Let's look at the value counts of this column. It is a long list, and at the top there are a few big, prominent cities: Karachi, Lahore, Islamabad, Rawalpindi, Faisalabad. But as you go further down the list, certain cities have only one or two mentions.
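A minimal sketch of the value counts step. The DataFrame name `orders`, the column name `city_building`, and the sample rows are assumptions made up for illustration; they are not the actual book orders data from the video.

```python
import pandas as pd

# Hypothetical stand-in for the book orders data (illustrative rows only).
orders = pd.DataFrame(
    {
        "city_building": [
            "Karachi", "Lahore", "Karachi", "karachi",
            "Islamabad", "Lahore", "Karachi", "Multan",
        ]
    }
)

# Count how many orders each city has, most frequent first.
city_counts = orders["city_building"].value_counts()
print(city_counts)
```

Note that in this made-up sample "Karachi" and "karachi" are counted as separate categories, which is the casing problem the lowercase step addresses.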
So what we are going to do first is convert the text of this column to lowercase, because in certain instances I have seen Karachi written with a small k instead of a capital K. The two spellings might be considered two different cities, and we want to avoid that situation.
So first, let's convert everything to lowercase. You can do that by accessing the str attribute, which exposes all the string methods. lower is a string method, so apply it to city building and save the result back into city building itself. Let's run that. So that's done. Now let's look at the city building column.
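The lowercase step can be sketched like this. As before, `orders`, `city_building`, and the sample values are hypothetical names and data chosen for illustration.

```python
import pandas as pd

# Hypothetical sample with inconsistent casing (illustrative rows only).
orders = pd.DataFrame(
    {"city_building": ["Karachi", "karachi", "Lahore", "LAHORE", "Islamabad"]}
)

# The .str accessor exposes string methods; lower() is applied element-wise
# and the result is saved back into the same column.
orders["city_building"] = orders["city_building"].str.lower()
print(orders["city_building"])
```

After this, spellings that differ only by case fall into the same category.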
So everything is now lowercase. Let's call value counts on this and get the nine largest cities. So these are the nine largest cities.
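Here is one possible sketch of picking the largest cities and then combining everything else into a single category, as described at the start. It assumes a small made-up data set, so it keeps the top 3 cities instead of the top 9. The names `orders`, `city_building`, `city_grouped`, `top_n`, and the label "other" are assumptions for illustration, and `Series.where` with `isin` is just one way to do the replacement, not necessarily the approach used in the video.

```python
import pandas as pd

# Hypothetical, already-lowercased sample (illustrative rows only).
orders = pd.DataFrame(
    {
        "city_building": (
            ["karachi"] * 5
            + ["lahore"] * 4
            + ["islamabad"] * 3
            + ["multan", "quetta", "sukkur"]
        )
    }
)

top_n = 3  # the video keeps the 9 largest; 3 fits this small sample

# Names of the top_n most frequent cities.
top_cities = orders["city_building"].value_counts().nlargest(top_n).index

# Keep the city name where it is one of the top cities, else use "other".
orders["city_grouped"] = orders["city_building"].where(
    orders["city_building"].isin(top_cities), "other"
)
print(orders["city_grouped"].value_counts())
```

Writing the result to a new column keeps the original city names available in case you want to try a different cutoff later.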
Let me know in the comments section if you have any questions!
🤝 Like, Share, Subscribe for more!
Follow us on our social media handles for all updates, events and live sessions-
✅ Instagram: / machinelearningplus
✅ LinkedIn: / machine-learning-plus
✅ YouTube: / numyard
✅ Twitter: / r_programming
✅ Website: https://www.machinelearningplus.com/
If you enjoyed this video, be sure to throw it a like and make sure to subscribe to not miss any future videos!
Thanks for watching!
#machinelearningplus #python #pandas #datascience