Introduction
- Matplotlib is Python’s plotting library built on NumPy that can integrate directly with Pandas
- Workflow: data → plot → axis on figure → customize → save & share plot
Figure and plot are used interchangeably
Importing and getting started
plt.plot(); creates a blank plot or figure (in a notebook, the trailing ; stops the returned list, [], from being echoed)
- if the ; is annoying, you can call plt.show() after plt.plot() to get the same effect
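A minimal sketch of what the ; is hiding. The Agg backend line is an assumption added so the snippet also runs as a plain script; it is not needed in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt

# With no data, plt.plot() draws empty axes and returns an empty list.
# A notebook echoes that list unless the line ends with ; or is followed by plt.show().
returned = plt.plot()
print(returned)

# In a notebook this renders the figure without echoing the list;
# on the Agg backend it is effectively a no-op.
plt.show()
```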
Pyplot vs object-oriented
- The Matplotlib documentation recommends the object-oriented interface, especially for complicated plots and reusable code; pyplot is convenient for quick interactive work
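The same line drawn both ways, as a rough side-by-side sketch. The x and y values are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [11, 22, 33, 44]

# Pyplot (stateful) style: pyplot keeps track of the "current" figure and axes
plt.figure()
plt.plot(x, y)
plt.title("pyplot style")

# Object-oriented style: hold explicit Figure and Axes objects and call their methods
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("object-oriented style")
```

With the object-oriented style it is always clear which axes a call modifies, which matters once a figure has more than one.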
Methods of getting started
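Three common ways to get a figure and axes, sketched here as one plausible reading of this heading. The sample data and the axes position values are assumptions.

```python
import matplotlib.pyplot as plt

# Method 1: create a figure, then add a subplot to it
fig1 = plt.figure()
ax1 = fig1.add_subplot(1, 1, 1)

# Method 2: create a figure, then add axes at an explicit position
# [left, bottom, width, height] are fractions of the figure size
fig2 = plt.figure()
ax2 = fig2.add_axes([0.1, 0.1, 0.8, 0.8])

# Method 3: create the figure and axes in one call (the usual choice)
fig3, ax3 = plt.subplots()

# Each axes object plots the same way
for ax in (ax1, ax2, ax3):
    ax.plot([1, 2, 3, 4], [11, 22, 33, 44])
```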
Example workflow
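A minimal sketch that follows the workflow from the introduction (data, plot, axis on figure, customize, save). The data, labels, and the file name sample-plot.png are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# 1. Prepare some data
x = np.linspace(0, 10, 100)
y = x ** 2

# 2. Set up the figure and its axes
fig, ax = plt.subplots(figsize=(10, 6))

# 3. Plot the data on the axes
ax.plot(x, y)

# 4. Customize the plot
ax.set(title="Sample plot", xlabel="x", ylabel="x squared")

# 5. Save the whole figure so it can be shared
fig.savefig("sample-plot.png")
```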
Different types of plots
ax.scatter(data, data): creates a scatter plot (needs both x and y values)
ax.bar(data, data): creates a bar plot
ax.barh(list(data), list(data)): creates a horizontal bar plot
ax.hist(data): creates a histogram
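A rough sketch of the four plot types above, drawn on one figure. The random values and the prices dictionary are invented sample data.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.random(50)
y = rng.random(50)
prices = {"Almond butter": 10, "Peanut butter": 8, "Cashew butter": 12}

fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))

ax1.scatter(x, y)                                      # scatter plot: x and y values
ax2.bar(list(prices.keys()), list(prices.values()))    # bar plot: categories and heights
ax3.barh(list(prices.keys()), list(prices.values()))   # horizontal bar plot
ax4.hist(rng.normal(size=1000))                        # histogram of one variable

fig.tight_layout()
```

barh is given lists rather than dictionary views, which matches the list(...) calls in the notes.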
Combining plots
Plotting from pandas DataFrames
car_sales.plot(x="column name", y="column name"): pandas' built-in plotting method, which draws with Matplotlib under the hood
car_sales.plot.hist(): histogram of the DataFrame's numeric columns
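A minimal sketch of both calls. The real car_sales columns are not shown in these notes, so the column names and values below are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the car_sales DataFrame
car_sales = pd.DataFrame({
    "Odometer (KM)": [11179, 32549, 87899, 150043, 213095],
    "Price": [22000, 7000, 5000, 4000, 3500],
})

# Line plot of one column against another; pandas returns the Matplotlib Axes
line_ax = car_sales.plot(x="Odometer (KM)", y="Price")

# Histogram of a single column, selected as a one-column DataFrame
hist_ax = car_sales[["Price"]].plot.hist(bins=5)
```

Because both calls return a Matplotlib Axes, any of the ax methods in these notes can be used on the result.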
Pyplot vs matplotlib object-oriented method
- For a quick visualization, pyplot is fine, but the object-oriented method is better in all other cases
Object-oriented method
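A sketch of the object-oriented route with a DataFrame: create the axes first, then pass them to pandas with ax=. The column names and values are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the car_sales DataFrame
car_sales = pd.DataFrame({
    "Odometer (KM)": [11179, 32549, 87899, 150043, 213095],
    "Price": [22000, 7000, 5000, 4000, 3500],
})

# Create the figure and axes explicitly, then have pandas draw on them
fig, ax = plt.subplots(figsize=(8, 5))
car_sales.plot(kind="scatter", x="Odometer (KM)", y="Price", ax=ax)

# Keep customizing through the same axes object
ax.set(title="Price vs odometer")
ax.axhline(car_sales["Price"].mean(), linestyle="--")  # horizontal line at the mean price
```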
Object-oriented subplots
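A minimal sketch of two stacked subplots that share an x-axis. The sine and cosine data are placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# One figure, two axes stacked vertically, sharing the x-axis
fig, (ax0, ax1) = plt.subplots(nrows=2, ncols=1, figsize=(8, 8), sharex=True)

ax0.plot(x, np.sin(x))
ax0.set(title="sin(x)", ylabel="value")

ax1.plot(x, np.cos(x))
ax1.set(title="cos(x)", xlabel="x", ylabel="value")

# A title for the whole figure rather than for one axes
fig.suptitle("Two subplots on one figure", fontsize=14)
```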
Customizing plots
plt.style.available: lists the style names available in your installation
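plt.style.use('seaborn-v0_8-whitegrid'): called 'seaborn-whitegrid' before Matplotlib 3.6, which renamed the seaborn styles with a seaborn-v0_8 prefix
plt.style.use('seaborn-v0_8'): called 'seaborn' before Matplotlib 3.6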
plt.style.use('ggplot')
ax.set(title="", xlabel="", ylabel="")
ax.legend().set_visible(True)
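cmap='winter': colormap argument for calls such as ax.scatter; other names are listed in the Matplotlib colormap reference
ax.set_xlim([min, max]): sets the x-axis limits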
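A rough sketch that combines the customization calls above. The data, labels, and limits are made up, and plt.style.context is used instead of plt.style.use so the style applies only inside the with block (a choice made for this example, not something from the notes).

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=42)
x = rng.random(100) * 100
y = rng.random(100) * 100

# Apply the ggplot style to this figure only
with plt.style.context("ggplot"):
    fig, ax = plt.subplots(figsize=(8, 5))

    # c=y colors each point by its y value, mapped through the 'winter' colormap
    points = ax.scatter(x, y, c=y, cmap="winter", label="samples")

    ax.set(title="Customized scatter", xlabel="x value", ylabel="y value")
    ax.set_xlim([0, 50])            # show only x values between 0 and 50
    ax.legend().set_visible(True)   # legend uses the label passed to scatter
    fig.colorbar(points, ax=ax, label="y value")
```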