DSL-732
Assignment 3: Data Science
Take any data science problem and briefly describe the problem you are looking at. What are the variable(s) you would like to study? What are the sources of data? Please share the rough schema of your project data. Please perform the first cut analysis using data collection, data pre-processing and data visualization and share insights for your project. Please also upload your scripts properly commented and the links to the data sources with your assignment.
Men's Shoe Prices
A list of 10,000 men's shoes and the various prices at which they are sold
​
About Dataset
​
Context
A list of 10,000 men's shoes and the various prices at which they are sold.
​
Content
This is a list of 10,000 men's shoes provided by Datafiniti's Product Database.
The dataset includes shoe name, brand, price, and more. Each shoe will have an entry for each price found for it and some shoes may have multiple entries.
Note that this is a sample of a large dataset. The full dataset is available through Datafiniti.
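Because a shoe can appear once per price found, a first pre-processing step might collapse those repeated entries into one representative price per shoe. The sketch below is only an illustration under stated assumptions: the field names `name`, `brand`, and `price` and the sample rows are hypothetical and not taken from the actual export, and the median is just one reasonable choice of summary.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: field names and values are illustrative only,
# not taken from the actual Datafiniti export.
rows = [
    {"name": "Runner A", "brand": "BrandX", "price": "59.99"},
    {"name": "Runner A", "brand": "BrandX", "price": "64.99"},
    {"name": "Runner A", "brand": "BrandX", "price": "79.99"},
    {"name": "Loafer B", "brand": "BrandY", "price": "120.00"},
    {"name": "Loafer B", "brand": "BrandY", "price": ""},  # missing price
]


def collapse_prices(rows):
    """Return one median price per (brand, name), skipping unusable prices."""
    grouped = defaultdict(list)
    for row in rows:
        try:
            price = float(row["price"])
        except (KeyError, ValueError):
            continue  # drop rows whose price is missing or not numeric
        grouped[(row["brand"], row["name"])].append(price)
    return {key: median(prices) for key, prices in grouped.items()}


per_shoe = collapse_prices(rows)
for (brand, name), price in sorted(per_shoe.items()):
    print(f"{brand:8} {name:10} {price:.2f}")
```

In a real run the rows would come from the downloaded CSV (for example via `csv.DictReader`), with the column names adjusted to match the published schema.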
​
Acknowledgements
What I Can Do with This Data
You can use this data to determine brand markups, pricing strategies, and trends for luxury shoes. For example:
- What is the average price of each distinct brand listed?
- Which brands have the highest prices?
- Which ones have the widest distribution of prices?
- Is there a typical price distribution (e.g., normal) across brands or within specific brands?
Further processing of the data would also let you:
- Correlate specific product features with changes in price.
You can cross-reference this data with a sample of our Women's Shoe Prices to see if there are any differences between women's brands and men's brands.
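The brand-level questions above (average price, highest prices, widest spread) could be approached with a simple per-brand summary. This is a minimal sketch, assuming the data has already been reduced to (brand, price) pairs; the brand names and prices are made up for illustration, and the sample standard deviation is used here as one possible measure of how widely prices are spread.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (brand, price) pairs; values are made up for illustration.
records = [
    ("BrandX", 60.0), ("BrandX", 80.0), ("BrandX", 100.0),
    ("BrandY", 120.0), ("BrandY", 130.0),
    ("BrandZ", 45.0),
]


def brand_summary(records):
    """Return {brand: {"n", "mean", "max", "stdev"}} from (brand, price) pairs."""
    by_brand = defaultdict(list)
    for brand, price in records:
        by_brand[brand].append(price)
    summary = {}
    for brand, prices in by_brand.items():
        summary[brand] = {
            "n": len(prices),
            "mean": mean(prices),
            "max": max(prices),
            # Sample standard deviation needs at least two prices.
            "stdev": stdev(prices) if len(prices) > 1 else 0.0,
        }
    return summary


summary = brand_summary(records)
# Brand with the highest average price, and brand with the widest spread.
priciest = max(summary, key=lambda b: summary[b]["mean"])
widest = max(summary, key=lambda b: summary[b]["stdev"])
print("Highest average price:", priciest)
print("Widest price spread:", widest)
```

Brands with only one or two listings give unreliable averages and spreads, so a real analysis would probably filter on the `n` count before ranking.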
​
Data Schema
A full schema for the data is available in our support documentation.
​
About Datafiniti
Datafiniti provides instant access to web data. We compile data from thousands of websites to create standardized databases of business, product, and property information.
​
​
Data visualization