I have a dataset with precomputed means and standard deviations. The values depend on three different categorical variables. I would like to create two barplots, splitting the first categorical variable across them. The other two categorical variables should be separated on the x-axis and by color.
In seaborn terms, I want to create barplots with seaborn.catplot using a categorical x with a custom order, as well as the hue and col arguments, while being able to add my own precomputed standard deviations as error bars.
The following code produces the bar means in a straightforward way:
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
tip_sumstats = (tips.groupby(["day", "sex", "smoker"])
                    .total_bill
                    .agg(["mean", "sem"])
                    .reset_index())

sns.catplot(
    data=tip_sumstats,
    x="day",
    order=["Sun", "Thur", "Fri", "Sat"],
    y="mean",
    hue="smoker",
    col="sex",
    kind="bar",
    height=4,
)
This answer solves the problem when hue and order are not involved. For the case above, however, I adapted it as follows:
def errplot(x, y, yerr, **kwargs):
    # Draw onto the current facet's axes, using the data subset passed in by FacetGrid.
    ax = plt.gca()
    data = kwargs.pop("data")
    data.plot(x=x, y=y, yerr=yerr, kind="bar", ax=ax, **kwargs)

g = sns.FacetGrid(tip_sumstats, col="sex", hue="smoker", height=4)
g.map_dataframe(errplot, "day", "mean", "sem")
I do not understand how to modify this version so that it respects the categorical order on the x-axis given by an order argument. I also do not understand how to get the effect of dodge=True, so that the differently colored bars appear next to each other instead of on top of each other.
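To make concrete what I mean by "order" and "dodge", here is a rough sketch of the behavior I am after, written in plain matplotlib rather than seaborn. It is only an illustration under assumptions: the helper name dodged_bars and its parameters are made up by me, and the numbers in the small demo frame are invented placeholders rather than the real tips summary.

```python
import matplotlib.pyplot as plt
import pandas as pd


def dodged_bars(data, x, y, yerr, hue, order, hue_order, ax, width=0.8):
    """Hypothetical helper: grouped bars with precomputed error bars on one Axes."""
    bar_width = width / len(hue_order)
    for i, level in enumerate(hue_order):
        # Rows for this hue level, reindexed so the bars follow `order`.
        sub = data.loc[data[hue] == level, [x, y, yerr]]
        sub = sub.set_index(sub[x].astype(object)).reindex(order)
        # Shift each hue level sideways so its bars sit next to the others.
        offset = (i - (len(hue_order) - 1) / 2) * bar_width
        positions = [k + offset for k in range(len(order))]
        ax.bar(positions, sub[y], bar_width, yerr=sub[yerr], capsize=3, label=level)
    ax.set_xticks(range(len(order)))
    ax.set_xticklabels(order)
    ax.set_xlabel(x)
    ax.set_ylabel(y)


# Invented placeholder values, shaped like tip_sumstats but much smaller.
demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri"] * 2,
    "smoker": ["Yes", "No"] * 4,
    "sex": ["Male"] * 4 + ["Female"] * 4,
    "mean": [19.0, 20.0, 17.0, 18.0, 16.0, 15.0, 13.0, 14.0],
    "sem": [1.0, 0.9, 1.5, 1.2, 1.1, 0.8, 1.4, 1.3],
})
order = ["Fri", "Thur"]
hue_order = ["Yes", "No"]
sexes = ["Male", "Female"]

# One subplot per level of the column variable, mimicking col="sex".
fig, axes = plt.subplots(1, len(sexes), sharey=True, figsize=(8, 4))
for ax, sex in zip(axes, sexes):
    dodged_bars(demo[demo["sex"] == sex], "day", "mean", "sem",
                "smoker", order, hue_order, ax)
    ax.set_title(f"sex = {sex}")
axes[0].legend(title="smoker")
```

This bypasses catplot and FacetGrid entirely, which is exactly what I would like to avoid: I am looking for a way to get the same result while staying within seaborn's order, hue, and col arguments.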
This question tries to solve something similar. However, the approach is very technical and not straightforward at all. To me, it seems weird that no straightforward solution exists.


