Plot a boxplot of “price” vs “cut” from the dataset “diamond.csv”. Which category under “cut” has the highest median price? (Data science using Python in Anaconda / Jupyter)
# Import pandas for data handling, and matplotlib/seaborn for plotting
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Place diamond.csv in the notebook's working directory (here, the Jupyter root directory)
# Read the dataset into a DataFrame
data1 = pd.read_csv('diamond.csv')
# Plot a boxplot of “price” vs “cut” from the dataset “diamond.csv”
#=====================================
# Box plot for two variables:
#=====================================
sns.boxplot(x="cut", y="price", data=data1)
plt.show()
# Which category under “cut” has the highest median price?
# Compute the median price for each cut category
data1.groupby('cut')['price'].median()
Output:-
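The groupby step above returns one median per cut category, so the answer can also be read off programmatically rather than by eye. The sketch below is only an illustration: it uses a small made-up table (the name `toy` and its values are assumptions, not rows from diamond.csv), sorts the medians from highest to lowest, and picks the top category with `idxmax()`.

```python
import pandas as pd

# Toy stand-in for diamond.csv: made-up values for illustration only,
# NOT taken from the real dataset.
toy = pd.DataFrame({
    "cut": ["Fair", "Fair", "Good", "Good", "Ideal", "Ideal", "Ideal"],
    "price": [300, 500, 400, 800, 200, 250, 900],
})

# Median price per cut category, sorted from highest to lowest
medians = toy.groupby("cut")["price"].median().sort_values(ascending=False)

# Label of the category with the highest median, and that median value
top_cut = medians.idxmax()
top_median = medians.max()

print(medians)
print(f"Highest median price: {top_cut} ({top_median})")
```

To apply the same idea to the real data, replace `toy` with `data1`; the result then depends on the contents of diamond.csv.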