Show and Tell - Datamallet: Python library for Automatic Data Visualization and Data Preprocessing

Good day everyone!
I am pleased to introduce datamallet to you.

Datamallet is an open-source collection of helpful functions and modules, built by data scientists for data scientists, to help speed up the data science workflow.

Datamallet is built on top of Scikit-learn, plotly, pandas, numpy and scipy. It contains helpful scikit-learn transformers for preprocessing data, creating new features and automatic data visualization.

Installation

pip install datamallet

Since this is the plotly forum, I would like to show how the AutoPlot class within datamallet works.

Library imports

from datamallet.visualization import AutoPlot
import plotly.express as px

Use a built-in dataframe from plotly express (this can be replaced with your own dataframe)

tips = px.data.tips()

Instantiate an AutoPlot object

autoplot = AutoPlot(
    df=tips,
    include_scatter=True,
    include_pie=True,
    include_box=True,
    include_sunburst=True,
    include_violin=True,
    include_treemap=True,
    include_histogram=True,
    include_correlation=True,
    create_html=True,
    filename='autoplot',
)

Call the show method. This creates an HTML file with all the plotly charts and returns a list of plotly graph objects.

list_of_charts = autoplot.show()

Look in your working directory, and there should be an autoplot.html file (you can set the filename argument to anything you want). This file contains scatter plots, correlation plots, histograms, box plots, violin plots, treemaps, sunburst charts and pie charts, with more chart types to come.

These charts are chosen by heuristics that I have come across in my day-to-day job as a data scientist. More attributes can be specified in the AutoPlot class.

As of version 0.10.2 (over 20 releases so far), datamallet supports the following chart types:

- Scatter plots
- Correlation plots
- Histograms
- Box plots
- Violin plots
- Treemaps
- Sunburst charts
- Pie charts

It also includes many scikit-learn-compatible data transformers.
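For readers new to the term, "scikit-learn compatible" roughly means a class with fit and transform methods that can be dropped into a Pipeline. The sketch below shows the general pattern with a made-up `RatioFeatureAdder` that creates a new feature; it is not one of datamallet's transformers.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class RatioFeatureAdder(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: adds numerator / denominator as a new column."""

    def __init__(self, numerator, denominator, new_column):
        self.numerator = numerator
        self.denominator = denominator
        self.new_column = new_column

    def fit(self, X, y=None):
        # Nothing to learn; fit exists so the class works inside a Pipeline.
        return self

    def transform(self, X):
        X = X.copy()  # avoid mutating the caller's dataframe
        X[self.new_column] = X[self.numerator] / X[self.denominator]
        return X


bills = pd.DataFrame({"total_bill": [10.0, 20.0, 40.0], "tip": [1.0, 3.0, 4.0]})
pipeline = Pipeline([("tip_rate", RatioFeatureAdder("tip", "total_bill", "tip_rate"))])
with_rate = pipeline.fit_transform(bills)
```

Storing the constructor arguments unchanged as attributes is what lets BaseEstimator provide get_params and set_params, which tools such as grid search rely on.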

I would welcome new collaborators, including people interested in becoming core developers :sunglasses:.

Important links

GitHub repo: https://github.com/bodealamu/datamallet
PyPI page: https://pypi.org/project/datamallet/

Hi Olabode @bodealamu
Thank you for sharing Datamallet with us. I’m very intrigued. Would you mind adding images or GIFs to the post so we get a better understanding of what Datamallet is capable of? I think that would also help community members quickly understand the full potential of Datamallet.

I will do so. Also, as of version 0.12.0, datamallet supports density contour charts as well as density heatmaps.

I recently uploaded a short video to YouTube demonstrating datamallet for visualization. Anyone interested can jump to the 5-minute mark.

Great video that explains Datamallet. Thanks for sharing it with us, @bodealamu

Hi bodealamu

Looks :+1:, Thanks :pray: for sharing.

Thank you!

Thank you! The focus of the video was just on the visualization aspect, but datamallet does much more.
