Category Archives: PowerBI

The Little of Power BI Visualization Design – Part 2

Continuing our journey of applying Andy Kirk's tips and tricks from his series “The Little of Visualization Design”, we are now at part 2, Clever Axis Scaling. As last time, I suggest you read his post first so we have some common ground.

Why use clever axis scaling

Clever axis scaling is a tool for creating some drama in your visualization. It can also help you highlight values and draw your consumer's eye towards them. Things that stand out get attention; our brain is simple in this regard, and in this case that is what we are after.

In the example the y-axis is set to 50, but the maximum value is 76. Now, my quick thought was “Great! This is easy, let's just set the y-axis maximum to 50 so the top helper line is 50”. This is easy to do, but the chart actually gets cut at 50. So it ends up looking like the image below, which is not what we want at all. We are now hiding the most interesting data!

Max y-axis value set to 50 truncates your graph

So I tried setting it to 100. That works okay, but the highest line on the y-axis is no longer 50, so the dramatic effect of 76 shooting way above the last helper line disappears, even though we still see the biggest increase on the chart to the left. So what is the solution? You need to try out what works best with your data. In this case it seems to work fine to choose 76, the maximum value in the chart. This will not always be the case, because we cannot control how many lines we get on the y-axis. If I make the chart taller you can see that the numbers on the y-axis change as well.

Trying out different max values on the Y-axis
Different heights can remove some of the effect we are after

Use it carefully

This solution also has the drawback that you are hardcoding the minimum and maximum value. So if you suddenly have a value higher than 76 you will lose it! In the end it comes down to what you want to tell with your chart, whether its values are going to change often, and how dramatic you want it to be. If you have no idea how your numbers will behave in the future I would not advise you to hardcode min/max values unless you need it for a specific occasion like a presentation. When you are done with that specific occasion I suggest you turn them back to automatic to minimize confusion.
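To make the risk concrete, here is a minimal Python sketch. It is not Power BI code, just an illustration: the function name and the sample series are made up, and only the values 50 and 76 come from the example above. It lists which data points a hardcoded y-axis maximum would cut off.

```python
def clipped_points(values, axis_max):
    """Return the values that a fixed y-axis maximum would cut off."""
    return [v for v in values if v > axis_max]

# Hypothetical series; only the peak of 76 comes from the example above.
series = [12, 18, 25, 31, 76]

print(clipped_points(series, 50))         # the peak is hidden
print(clipped_points(series, 76))         # nothing is hidden
print(clipped_points(series + [80], 76))  # a new, higher value would be lost
```

A check like this could be run against fresh data before a presentation to see whether the hardcoded maximum still fits.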

Final result

As with all tools, PowerBI has some limitations compared to custom code, for example something like D3.js where you can do absolutely everything you want! These limitations can make it a challenge to use all these tips and tricks going forward, but we will do the best we can! In this case we might have some problems creating a more dramatic effect in our storytelling, both with the axes, as we have seen here, and with data labeling, as PowerBI does not let you choose which data points to highlight. So labeling only the highest value, without having to hover over it, is not possible. Or at least I didn’t manage to, but if you do, please let me know how you did it.

Also, if anyone has a way of hiding the ESRI logo in Power BI Desktop please let me know. It is not pretty and it is driving me crazy!

Speaking at SQLSaturday Oslo, September 2nd 2017

I got some great news during my vacation: I got a mail saying I was selected to give a talk at SQLSaturday in Oslo this year. I also got notified by the tweet below, which I had completely forgotten about from last year. It was a nice surprise to see that I have successfully reached a goal of mine, even though I had forgotten I had set it! I’ve been to every SQLSaturday that has been held in Oslo and it has always been a great event, so I am looking forward to contributing with a talk myself this year!

My talk is titled “Data Visualization – More Than a Hygiene Factor”, based on a quote from this Medium Post. You can read my abstract below.

"For many companies data visualization is still a hygiene factor; necessary but not crucial"

In a world where everyone wants to use data to drive their business forward it is important to be able to communicate and speak the language of data even though data itself can be complex. One way of doing this is by making good data visualisations. Good data visualisations are engaging, they are informative and they let your data tell you its story. Too often data visualisation gets a low priority, leaving the final result feeling lackluster and the users uninspired.

In this session we look at some data visualisation principles and best practices, in order to deliver your message with a clear point of view and minimize confusion. Lastly we will look at how you can use these practices with Power BI in order to improve how data can be communicated to your end users in the best possible way, making them come back over and over.

SQLSaturday is a free 1-day training event for Microsoft Data Platform and SQL Server professionals, providing a variety of high-quality technical sessions. If you work on the Microsoft Data Platform, SQLSaturday is a great way to get inspired and hear about new things. You can find more information about SQLSaturday, September 2nd in Oslo here!

“The Little of Visualization design” – with Power BI

Andy Kirk has an excellent series called “The Little of Visualization Design” where he gives small tips and tricks that can improve your data visualizations. If you have not seen it I strongly recommend it. What I am going to try to do every week after summer vacation is show you how you can take these tricks and use them with Power BI. But let’s kick-start it now with part 1, dual labeling. I suggest that you read the original post by Andy first so we have common ground about what we are going to look at, which is this pie chart.

Dual labeling. It is surprisingly common to see, and it generates more clutter in your data visualisation than you need. Repeating something will not make things clearer; it will just create more ink on your graph and make it harder to focus on what’s important.

Now if you punch in the data and create a pie chart in Power BI we get what is shown below.

So Power BI does not give you a dual labeling issue out of the box, but it is quite easy to reproduce. In the “Format” pane you have a bunch of options, which usually are great, but you have to use them with care and have a clear vision of why you are changing the original chart. If not, you can end up with all of these different variations.

The one in the bottom left is probably the closest to the one in the original post. It has dual labeling, and it has quite similar colors on the pie slices. Andy Kirk’s proposed solution is to remove the duplicate labeling and put the labels directly on the pie, since the colors in the original graph are so similar. Now, that doesn’t sound too far away from the default graph that Power BI provides us with. However, the default is not perfect, and here is what I would do in order to improve it:

  1. In Label Style choose “Category, data value”. This makes us see the actual number.
  2. Increase font size of detail label.
  3. Increase font size of the title. In general I think all default font sizes in Power BI are too small. I always feel like I need stronger contact lenses when creating a chart…
  4. Sort the chart by value so the slices appear in order of size.
    Note: I had originally made the font size of the detail label a bit bigger. However, this made the detail label for Canada disappear, probably because it would take up the same space as the label for Israel. I wish the position of the label were a bit more dynamic.

In the end we end up with the chart below. So all in all the default chart Power BI created wasn’t too bad, but it could be improved. And be aware that not all options in the format pane in Power BI make your data visualisation better; some can make it worse!

I’m looking forward to some weeks of summer and then I’ll continue this series when I am back! Thanks for reading. If you have any questions or feedback drop me a comment, it is greatly appreciated.

Creating a dynamic card with text value in Power BI

Alternative title: Finding string value from a dimension with highest numeric value

The company I work for uses Yammer, and I founded a Data Visualization group there which I am, manually, keeping some statistics about using Power BI. The other day I found myself wanting a dynamic card in Power BI in order to highlight which day was the most active when it came to number of posts. In order to do this I had to figure out how to make a measure returning not the max value, but the day which had the maximum number of posts. My use case is finding a weekday, but maybe you want to see which product is the most popular, which county has the most purchases or which salesperson has the most sales. All these cases should be able to reuse this measure.

First try

My initial thought was to create a table in my calculation and then slice it to return only one row and one column, leaving me with one cell which had the day with the most posts. I made a measure which used TOPN to return the row with the most posts, followed by SELECTCOLUMNS to select only the column which had the weekday in it. This turned out to return the correct day overall, but it did not work when I added a filter, and the Power BI visualization returned errors, so I had to start over.

The solution

I have created three measures to solve this.

1) One simple sum of [Number of Posts]:


Number of Posts = SUM(Sheet1[NumberOfPosts])

2) Finding the highest number of posts for any single weekday by using MAXX:


MaxPostsPerWeekDay = MAXX(VALUES(Sheet1[WeekDay]);[Number of Posts])

3) Using FIRSTNONBLANK on my WeekDay column to return the weekday whose sum equals the maximum value. In the end my measure looks like this:

Most Popular Day = 
IF(
    ISBLANK([Number of Posts]);
    BLANK();
    FIRSTNONBLANK(
        Sheet1[WeekDay];
        IF(
            [Number of Posts] =
            CALCULATE(
                [MaxPostsPerWeekDay];
                VALUES(Sheet1[WeekDay])
            );
            1;
            BLANK()
        )
    )
)

The first IF is there to remove days that have no posts; in my case there is not much activity on weekends, so they get filtered out. The beauty of this measure is that it is not limited to cards; it is also easy to use with your filters and in tables where you want it.
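For readers who find the logic easier to follow outside DAX, the same idea can be sketched in Python. This is only an illustration under my own assumptions, not what Power BI runs: the function name and the sample rows are made up, and the tie-breaking rule (the label that sorts first wins) is how I understand FIRSTNONBLANK to behave when several weekdays share the maximum.

```python
from collections import defaultdict

def most_popular(rows):
    """rows: iterable of (label, count) pairs, e.g. (weekday, number of posts).

    Returns the label with the highest total, or None when there is nothing
    to count. Ties go to the label that sorts first (an assumption about how
    FIRSTNONBLANK picks among several matching values).
    """
    totals = defaultdict(int)
    for label, count in rows:
        if count is not None:          # skip rows without any posts
            totals[label] += count
    if not totals:
        return None                    # mirrors the outer IF returning BLANK()
    highest = max(totals.values())     # the MAXX step
    return min(label for label, total in totals.items() if total == highest)

# Hypothetical sample data.
posts = [("Monday", 4), ("Tuesday", 7), ("Monday", 2),
         ("Wednesday", 7), ("Saturday", None)]

print(most_popular(posts))
# Applying a filter first, as a slicer would:
print(most_popular(row for row in posts if row[0] != "Tuesday"))
```

Because the function only needs label and count pairs, the same sketch covers the product, county and salesperson cases mentioned earlier.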

Creating a guest user for Datazen

If you ever want to have public access to your dashboards you can get this by creating a guest user and giving that user access to the dashboards you want to be publicly available. This is great if you want to place dashboards on a web page without requiring users to log in to see them.

You can create a guest user by creating a new user on the server with this info:
Username: guest
Mail: guest@guest.com
Name: Guest

Now, remember that this user should only have access to dashboards that are intended for everyone to see. If you give it access to everything, business-critical information can become available to anyone.

Using JSON functions in SQL Server 2016

With SQL Server 2016 we are finally able to analyze and query JSON data. It is not that often I use XML, but JSON is so widely used that it is about time we can use it in SQL Server. In this entry I’ll let you follow along as I take a first look at it using some data from the New York Times.

The New York Times has a web app called Chronicle, which lets you see how many articles have mentioned specific words, and you can also export this data as JSON. I chose to use the words Radio, Television, Mail and Internet and downloaded one JSON file per word (for some reason it doesn’t work to get the data for several terms at once). I also chose to remove one of the sets of square brackets, since our graph_data will only have one term and not a list of terms. I end up with four files that look like this.

graph_data

So graph_data has a term, which in our case is mail, radio, television or internet, and an array called data. Each element of the array holds the number of articles containing the term, the year, and the total number of articles published in that year, so we can, for example, calculate the percentage of articles the term appeared in.

In order to read the JSON from the file I first load the entire file into a variable using the following code

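-- Load the whole file into a variable; the path is specific to my machine.
DECLARE @ChronicleMail VARCHAR(MAX);
SELECT @ChronicleMail = BulkColumn FROM OPENROWSET(BULK 'C:\Users\CTP3\Documents\JSON\MailOrg.json', SINGLE_BLOB) q;
-- Inspect what ended up in the variable.
SELECT @ChronicleMail;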

The last line selects the text so we can have a look at what is saved in the variable, which basically is just one long string.

Not so interesting so far, but let's start using the new JSON functions. The JSON_VALUE function returns one scalar value from a JSON string. If you try using JSON_VALUE on something that returns an array you will get NULL back. To get an array returned you must use the JSON_QUERY function. This is fine, but if you want to insert your data into a table, the function you want is OPENJSON, which lets you reference an array in your JSON and return its elements as rows.

JSONValueAndQuery
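The split between scalars and arrays can be mimicked in a few lines of Python. This is only a rough imitation to show the idea, not how SQL Server implements it: the two helper names are borrowed from the T-SQL functions for readability, the path is passed as plain keys instead of a '$.a.b' string, and the sample document is made up.

```python
import json

def _walk(document, path):
    """Follow a sequence of keys into a JSON document; None if a key is missing."""
    node = json.loads(document)
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

def json_value(document, *path):
    """Return a scalar, or None when the target is an object or array."""
    node = _walk(document, path)
    return None if isinstance(node, (dict, list)) else node

def json_query(document, *path):
    """Return an object or array as JSON text, or None when the target is a scalar."""
    node = _walk(document, path)
    return json.dumps(node) if isinstance(node, (dict, list)) else None

# Made-up document in the same shape as the files described above.
doc = '{"graph_data": {"term": "mail", "data": [{"year": 1990}]}}'

print(json_value(doc, "graph_data", "term"))   # scalar comes back
print(json_value(doc, "graph_data", "data"))   # array gives None
print(json_query(doc, "graph_data", "data"))   # array comes back as JSON text
```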

In our case the data is an array, so let's call OPENJSON on it.

SELECT
    *
FROM OPENJSON(@ChronicleMail, '$.graph_data.data')

JSONData

From the result set we can see that we now have one row per year in our data. That is cool and all, but we want the values in separate columns, not the entire JSON in a single column named value. To fix this you can add a WITH clause after the OPENJSON function.

SELECT 	
	[NumberOfArticles]
	,[TotalArticles]
	,[Year]
FROM OPENJSON(@ChronicleMail, '$.graph_data.data')
WITH(
	[NumberOfArticles] int '$.article_matches',
	[Year] int '$.year',
	[TotalArticles] int '$.total_articles_published'
)

JSONDataColumns

Excellent! We now have the data in a table structure and we can insert into an actual table or do whatever we want with it. I wanted to add the term to the data so I ended up just joining it to this dataset using the JSON_VALUE function to pull only the term.
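If you want to sanity-check one of the files outside SQL Server, the same flattening can be sketched with Python's json module. The key names are the ones used in the queries above; the sample numbers are made up, and the percentage column is my own addition based on the calculation mentioned earlier.

```python
import json

# Made-up sample in the shape described above; the numbers are not real data.
sample = """
{"graph_data": {"term": "mail",
                "data": [{"article_matches": 120, "year": 1990, "total_articles_published": 100000},
                         {"article_matches": 150, "year": 1991, "total_articles_published": 120000}]}}
"""

def flatten(document):
    """Turn one term's document into a list of flat row dictionaries."""
    graph = json.loads(document)["graph_data"]
    rows = []
    for entry in graph["data"]:
        rows.append({
            "Term": graph["term"],
            "Year": entry["year"],
            "NumberOfArticles": entry["article_matches"],
            "TotalArticles": entry["total_articles_published"],
            # Share of that year's articles mentioning the term, in percent.
            "Percentage": 100 * entry["article_matches"] / entry["total_articles_published"],
        })
    return rows

for row in flatten(sample):
    print(row)
```

Each dictionary corresponds to one row of the result set above, with the term added as an extra column.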

After doing this for all four files I now have a table with all terms and data for each year, and I am free to use whatever tool I’d like to visualize it, e.g. PowerBI or Datazen, or, since we’re in SQL Server 2016 now, a mobile report in SSRS. I chose PowerBI for now, added a calculation for percentage and added year as a date to produce this.

PowerBI

Lastly, not sure if this is fun-fact worthy, but if you use OPENJSON directly on graph_data you can see the key and data type of each element. From this you can tell that you have to use JSON_VALUE for the key term, and JSON_QUERY for the key data since it is an array. All in all I think these JSON functions are a great addition to SQL Server 2016, and I am sure I will be using them quite a bit!

Datazen Branding

It is possible to do branding in Datazen so you can use your own backgrounds, create a custom color palette, etc. However, the product documentation only lists the names of the files you need in the brand package. It does not show what the non-image files should look like, so you can create the images but not the complete brand package.

I have been trying to find out how these files should be created for a while, and yesterday luck finally struck. So, a big thanks to @cmfinlan for sharing a brand package template. You can find the template below. Enjoy creating custom branding for your needs, and may all your maps be pink!

BrandTemplate

PinkMap

One way of auto refreshing Datazen in your browser

Yesterday I saw Christopher Finlan tweet about one way of making your Datazen dashboard auto refresh. He used a Firefox add-in for this, and I’d like to show you how I have been doing this.

If you want to show Datazen on a big screen there is at this point no out-of-the-box way to auto refresh your dashboard, meaning you have to open it to get new data. If you use the browser viewer you can quite quickly create one of the simplest websites in the world to do this.

In the header of your .html file add the following line:

<meta http-equiv="refresh" content="90" />

It will make your page refresh every 90 seconds. This can also be used if you want to cycle through some of your dashboards, sort of like a carousel. The line below will, after 30 seconds, go to the specified URL instead of just refreshing the page. To complete the cycle, make sure the page for your last dashboard points back to the first one.

<meta http-equiv="refresh" content="30; URL='NextDashboard'">

So what does the .html file look like for the easiest example? Something like this, just exchange “DashboardURL” with your URL:

<!DOCTYPE html>
<html>
<head>
<title>Name of Dashboard</title>
<meta http-equiv="refresh" content="90">
</head>
<body style="margin:0px;padding:0px;overflow:hidden">
<iframe src="DashboardURL" frameborder="0" style="overflow:hidden;overflow-x:hidden;overflow-y:hidden;height:100%;width:100%;position:absolute;top:0px;left:0px;right:0px;bottom:0px" height="100%" width="100%"></iframe>
</body>
</html>

Note: If you are going to set up an auto refresh I’d suggest you also set up your guest account so you can have access to those dashboards without having to log in. You can then use the public dashboard URL in the source of your iframe.
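If you have more than a couple of dashboards in the carousel, writing the pages by hand gets tedious. Here is a small Python sketch that builds one wrapper page per dashboard, each redirecting to the next and the last one back to the first. The file names, the function name and the example URLs are all made up for illustration; swap in your own public dashboard URLs.

```python
from html import escape

PAGE = """<!DOCTYPE html>
<html>
<head>
<title>{title}</title>
<meta http-equiv="refresh" content="{seconds}; URL='{next_page}'">
</head>
<body style="margin:0px;padding:0px;overflow:hidden">
<iframe src="{dashboard}" frameborder="0" style="position:absolute;top:0px;left:0px;height:100%;width:100%"></iframe>
</body>
</html>
"""

def carousel_pages(dashboard_urls, seconds=30):
    """Return {file name: html}; each page redirects to the next, the last to the first."""
    names = ["dashboard{}.html".format(i + 1) for i in range(len(dashboard_urls))]
    pages = {}
    for i, url in enumerate(dashboard_urls):
        pages[names[i]] = PAGE.format(
            title="Dashboard {}".format(i + 1),
            seconds=seconds,
            next_page=names[(i + 1) % len(names)],  # wraps around at the end
            dashboard=escape(url, quote=True),
        )
    return pages

# Hypothetical URLs; replace them with your own.
pages = carousel_pages(["http://example.com/a", "http://example.com/b"])
for name in pages:
    print(name)
```

Saving each entry of the returned dictionary under its file name in the same folder and opening dashboard1.html starts the loop.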

Creating a custom map for Datazen

Datazen lets you create and use custom maps, which can be a really useful feature. To create these maps I use good old Paint, the internet and QGIS, which you can download here: http://www.qgis.org/en/site/forusers/download.html.

Say you want to create a map of your office. Maybe not the most useful map, but it is a small job to create an office layout in Paint, so let's start there. You can use other images as your starting point, but I want to make the entire thing from scratch to show that it is possible. My office will look like the one below, with eight desks that I am going to map to different persons so we can see who actually does some work around the office! (Just to be sure: none of the names or data are real.)

Office

Next we need to convert this .png file to a vector file. I’m using http://www.autotracer.org/ for this, but there are other ways of doing it, such as desktop software or other websites. Upload your file and make sure to choose DXF as the format of your output file.

convert

Now that we have a vector file of our office, let's open up QGIS and just drag the .dxf file into it. Choose the default coordinate system and the layer with geometry type “linestring”. You should now have something like this in QGIS.

qgisOutline

Right click your layer, choose “save as…” and choose ESRI shapefile in the popup. When this is done you can right click your “linestring” layer and remove it. We are now going to edit the shapefile a little to make it simpler and also give the different shapes the names of people in our office so we can map data to their names. Right click the layer and press toggle editing. You are now able to edit, merge or delete sections. We can see that our vector image has some double lines around each desk, probably because we used too thick a brush in Paint, so let's remove them.

toggelEditing

Find the “Select Feature” on your toolbar and select a section of your layer. This section becomes yellow and you can delete it. After deleting some of the sections I end up with a simple layer with eight squares, my office desks.

SelectFeature

We need to give the shapes some more friendly names. Right click your layer and select “Open attribute table”. You get a popup, and in my case I see that all my shapes have the name C7 in the layer column. When I select a row in this attribute table, the related section in my layer behind it turns yellow. I now know which section I am editing, and I’ll say that Joe works at this desk. Then I do the same for the rest of the sections.

Attribute

Save your layer after giving all rows a name. One more step and it is ready to be imported into Datazen. If you try to import it at this stage your data will only end up on the lines of the desk and it is REALLY hard to see what color it has. Press Vector -> Geometry tools -> Lines to polygon to make your lines into polygons, an area. You will need to create a new output shapefile and save that.

Area

Alright, time to open up Datazen. Drag a map of your choice onto the Layout View, select “custom map from file” and find your shapefile. You will need the .shp and the .dbf file. Note: You cannot have the shapefile open in QGIS at this point. It’s like reading from an Excel file; you can’t have it open when adding it to Datazen.

Press preview, and voila! You have created a map and inserted it into Datazen. Have fun; now the only limit is your mind or your customer's needs!

OfficeDatazen

Datazen: Brand package error

EDIT: If you want to get a brand package template you can find one here.

Do not copy/paste filenames for a brand package directly from the Datazen “End User Documentation”! Some of the dashes are em dashes and are not recognized by the server when you try to upload the .zip file with all the images. This results in the error message below, where you can see that the character in the red circle, which looked like a dash, is now read as a û.

InvalidCharacter
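A quick way to catch this before uploading is to scan the file names for anything that is not plain ASCII. The Python sketch below does that; the function name and the two example file names are made up and are not the actual names from the Datazen documentation.

```python
def suspicious_names(file_names):
    """Return {name: [offending characters]} for names that are not plain ASCII."""
    problems = {}
    for name in file_names:
        bad = [ch for ch in name if ord(ch) > 127]
        if bad:
            problems[name] = bad
    return problems

# Hypothetical file names; the second contains a long dash (written as an
# escape so it is visible here) instead of an ordinary hyphen.
names = ["example-file.png", "example\u2014file.png"]

for name, bad in suspicious_names(names).items():
    print(repr(name), [hex(ord(ch)) for ch in bad])
```

To check a real package you could pass it the list returned by zipfile.ZipFile(path).namelist() from the standard library, where path is your own .zip file.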

Now, this wouldn’t be too bad if it wasn’t for the fact that I am now unable to delete this brand package as shown below. I uploaded this package to the server and I can upload another brand package to my individual HUBs, but for now I have yet to be able to delete the faulty brand package. I’m guessing I need to be able to access the RavenDB and delete it from there, but since this is a test server without a lot of dashboards I think I’ll just reinstall the server and start from scratch again.

deleteError