GUI tools for data visualization: Tableau
Georgetown University
Spring 2024
On to the lab
1 - Connecting to a File: This section of the start page is where you connect to data files saved on your computer.
2 - Connecting to a Server: You’ll usually use this section if you’re working for a company that uses specific servers. Tableau can connect to multiple different servers such as Oracle, PostgreSQL, Azure, Dropbox, and more.
3 - Saved Data Sources: These are sample data sets that Tableau provides. When you download the software, Tableau also creates a repository folder that holds these clean sample data sets as Excel files. When you click a sample, it loads automatically without you needing to find the file.
4 - Open a Workbook: This area is where you can find recent workbooks opened and quickly load them in if you want to work on them.
5 - Sample Workbooks: This section provides you with sample workbooks that you can open to play around in and see how they were built. Clicking on “More Samples” will lead you to a gallery of downloadable sample workbooks.
6 - Discover: The Discover side bar provides you with links to the Tableau training videos, blog, forums, and Tableau Prep (a data cleaning and prepping software).
1 - This shows you the data file you loaded. Here you can rename your data source or edit your connection to the source.
2 - The Data Interpreter is a built-in data cleaner that Tableau provides. If you choose to use it, it identifies potential areas to clean, re-formats them, and provides a log of the changes it made.
3 - This area shows the “sheets/tabs” that are in your data source.
4 - If you want to use multiple sheets, you can drag them into the main space here to make a join or union connection.
5 - Here you can choose the data connection you want your workbook to have. In the simplest terms, live connections have real-time updates when you’re connected to the database while an extract provides a snapshot of the data which can be refreshed at will.
6 - Here is where you can view your data, which needs to be in a tabular format with clean headers. Tableau will analyze your data and automatically assign a data type to each field. You can change the data types. Tableau does not change your original data source!
6 - In the data tab you can find all measures. In the analytics tab you can supplement your views with reference bands, forecasts, trend lines, and more.
7 - This is a very handy little button that shows you a quick look into your data table. Instead of switching back and forth from your workspace to the Data Source page to look at your data, you can just click that button to quickly see your data.
8 - The Dimensions tab is where you can find all your categorical fields.
9 - The Measures tab is where you can find all your numerical measures.
10 - You can click these little tabs to open a new Sheet, Dashboard, or Story.
11 - The Filters card is where you can drag various fields to filter your view.
12 - This bar in the Marks card is a drop-down menu of the different chart and graph types you can use, like bar, area, Gantt, pie chart, and more.
13 - These cards in the Marks card are used if you want to add color, sizes, text, tooltips, or more to your visualizations.
14 - The Columns and Rows section is how you build your views. You can drag various fields to this area to make your visualizations.
15 - This is where your visualizations will appear. You can also drop fields here to let Tableau automatically choose how to visualize them.
Tableau uses these definitions:
In simpler terms
Dimension role
Measure role
Also, Tableau assumes that columns are fields, so it assumes tidy data
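To make the tidy-data assumption concrete, here is a minimal sketch in plain Python that reshapes a "wide" table (one column per year) into a tidy one (one column per field). The function name `wide_to_long`, the field names, and the numbers are all made up for illustration; they are not from the lab data.

```python
def wide_to_long(rows, id_field, value_fields, name_field="year", value_field="sales"):
    """Reshape wide records (one column per year) into tidy records."""
    tidy = []
    for row in rows:
        for field in value_fields:
            # Each wide column becomes one row: the old header moves into
            # name_field and the old cell moves into value_field.
            tidy.append({id_field: row[id_field],
                         name_field: field,
                         value_field: row[field]})
    return tidy

# Hypothetical wide table: the years are column headers, not a field.
wide = [
    {"region": "East", "2022": 10, "2023": 12},
    {"region": "West", "2022": 7, "2023": 9},
]

tidy = wide_to_long(wide, "region", ["2022", "2023"])
for record in tidy:
    print(record)
```

In the tidy version, "year" and "sales" are ordinary columns, so Tableau can treat one as a dimension and the other as a measure.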
Einstein Discovery
Einstein Discovery is a no-code ML product for predictive analytics that is a sister product to Tableau (under the Salesforce umbrella)
SCRIPT functions
Rserve package in your installation
For running on your own machine, this is served on localhost:6311. However, the Rserve instance can be deployed on a remote server, with or without SSL encryption.
If you don’t include the args above, you may get an error from R:
Fatal error: you must specify ‘--save’, ‘--no-save’ or ‘--vanilla’
3. Now, create the connection in Tableau
Install TabPy. This is best done using pip and not conda.
This sets up tabpy as a service that can be started at the command line and connected to at localhost:9004.
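Before configuring the connection in Tableau, it can help to confirm that something is actually listening on the expected port. The sketch below is not part of TabPy or Rserve; `is_port_open` is a hypothetical helper that only tests whether a TCP connection succeeds, using the default ports mentioned in these slides (6311 for Rserve, 9004 for TabPy).

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

# Ports taken from the slides; change them if your services use other ports.
for name, port in [("Rserve", 6311), ("TabPy", 9004)]:
    status = "reachable" if is_port_open("localhost", port) else "not reachable"
    print(f"{name} on localhost:{port} is {status}")
```

A successful connection only shows that some process is listening on the port, not that it is the right service, so the connection test in Tableau's own dialog is still worth running.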
In Tableau, we can configure this from the same dialog we used for Rserve, except we’ll choose the TabPy option
We’ll use an Airbnb dataset composed of all Airbnb properties in New York City that were listed on 1 September 2015.
Make sure that Rserve is running!!
We’re going to create a Calculated Field, where the calculation will be in SCRIPT_INT
Within SCRIPT_INT, we specify the inputs from the Worksheet as .arg1, .arg2, etc.
You can now select the linkage and number of clusters in a hierarchical clustering (hclust) in Tableau and see the visualization update.
Make sure that tabpy is running!!
We create a new Parameter, called “Clustering Algorithm”
We create a Calculated Field called “Clustering” here, too
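As a rough illustration of the kind of Python a “Clustering” Calculated Field could run through TabPy, here is a small pure-Python agglomerative clustering sketch. It is not the code from the lab workbook: the function name `agglomerative_labels`, its parameters, and the sample coordinates are assumptions, and in practice one would more likely call a clustering library. The point is the contract: a Python script in a SCRIPT_INT calculation receives each input column as a list (`_arg1`, `_arg2`, ...) and should return a list of integers of the same length.

```python
import math

def agglomerative_labels(xs, ys, n_clusters, linkage="single"):
    """Cluster 2-D points bottom-up and return one integer label per point."""
    points = list(zip(xs, ys))
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and the number of points")
    # How distances between members are combined into a cluster distance.
    combine = {"single": min,
               "complete": max,
               "average": lambda ds: sum(ds) / len(ds)}[linkage]

    # Start with every point in its own cluster (lists of row indices).
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        dists = [math.dist(points[i], points[j]) for i in a for j in b]
        return combine(dists)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p].extend(clusters[q])
        del clusters[q]

    # Number clusters from 1 in order of their first row, one label per row.
    clusters.sort(key=min)
    labels = [0] * len(points)
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels

# Made-up coordinates standing in for two worksheet columns.
lat = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
lon = [0.0, 0.1, 0.0, 5.0, 5.2, 9.0]
print(agglomerative_labels(lat, lon, 3, linkage="complete"))
```

The `linkage` and `n_clusters` arguments play the role of Tableau Parameters: changing either one changes the returned labels, which is what makes the view update. This brute-force version recomputes all pairwise distances at every merge, so it is only suitable for small examples.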
https://towardsdatascience.com/integrating-tableau-and-r-for-regression-analyses-c3cac7e199cf
https://www.tableau.com/learn/tutorials/on-demand/using-r-within-tableau
https://help.tableau.com/current/pro/desktop/en-us/r_connection_manage.htm
https://www.tableau.com/blog/building-advanced-analytics-applications-tabpy-64916
https://tableau.github.io/TabPy/
https://tableau.github.io/analytics-extensions-api/docs/ae_example_tabpy.html
DSAN 5200 | Spring 2024 | https://gu-dsan.github.io/5200-spring-2024/