All posts by barmartland

Speaking at SQLSaturday Oslo, September 2nd 2017

I got some great new during my vacation I got a mail saying I was selected to give a talk at SQLSaturday in Oslo this year. I also got notified by the tweet below which I had completely forgotten from last year. Nice surprise to see I have succesfully made a goal of mine, even though I had forgotten I had set it! I’ve been to every SQLSaturday that has been in Oslo and it has always been a great event so looking forward to be able to contribute with a talk myself this year!

My talk is titled “Data Visualization – More Than a Hygiene Factor”, based on a quote from this Medium Post. You can read my abstract below.

"For many companies data visualization is still a hygiene factor; necessary but not crucial"

In a world where everyone wants to use data to drive their business forward it is important to be able to communicate and speak the language of data even though data itself can be complex. One way of doing this is by making good data visualisations. Good data visualisations are engaging, they are informative and they let your data tell you its story. Too often data visualisation gets a low priority making the final result feeling lacklustered and making the users uninspired. 

In this session we look at some data visualisation principles and best practices, in order to deliever your message with a clear point of view and minimize confusion. Lastly we will look at how you can use these practices with Power BI in order to improve how data can be communicated to your end users in the best possible way making them come back over and over.

SQLSaturday is a free 1-day training event for Microsoft Data Platform and SQL Server professionals, providing a variety of high-quality technical sessions. If you work on the Microsoft Data Platform SQLSaturday is a great way to get inspired and hear about new things. You can find more information about SQLSaturday, September 2nd in Oslo here!

 

 

“The Little of Visualization design” – with Power BI

Andy Kirk has an excellent series called “The Little of Visualization Design” where he gives small tips and tricks that can improve your data visualizations. If you have not seen it I strongly recommend it. Now, what I am going to try and do every week after summer vacation is to try and show you have you can take these tricks and use them with Power BI. But let’s kick start it now with part 1, dual labeling. I  suggest that you read the original post by Andy first so we are at a common ground about what we are going to look at which is this pie chart.

Dual labeling. It is suprisingly normal to see and it generates more cluster on your data visualisation than you need. Repeating something will not make things clearer, it will just create more ink on your graph and make it harder to focus on what’s important.

Now if you punch in the data and create a pie chart in Power BI we get what is shown below.

So Power BI does not provide you with a dual labeling issue at front, but it is quite easy to reproduce it with Power BI. In the “Format” pane you have a bunch of options which usually are great, but you have to use it with care and have a clear vision of why you are changing the original chart if not you can end up with all of these different variations.

The one in the bottom left is probably the closest to the one in the original post. It has dual labeling, and it has quite similar colors on the pie slices. Andy Kirk’s proposed solution is to remove the labeling and provide it directly onto the pie since the colors in the original graph is so similar. Now, that doesn’t  sound to far away from the default graph that Power BI provides us with. However the default is not perfect and here is what I would do in order to improve it:

  1. In Label Style choose “Category, data value”. This makes us see the actual number.
  2. Increase font size of detail label.
  3. Increase font size of the title. In general I think all default font sizes in Power BI are too small. I always feel like I need stronger contact lenses when creating a chart…
  4. Sort the chart by value so the slices appear in order of size.
    Note: I had originally made the font size of the detail label a bit bigger. However, this made the detail label for Canada disappear. Probably because it would take up the same space as Israel. So I wish they could make the position of the label a bit more dynamic.

In the end we end up with the chart below. So all in all the default chart Power BI created wasn’t too bad, but it could be improved. And make sure you are aware that not all options in the format pane in Power BI makes your data visualisation better, it could make it worse!

I’m looking forward to some weeks of summer and then I’ll continue this series when I am back! Thanks for reading. If you have any questions or feedback drop me a comment, it is greatly appreciated.

Learning by Sharing

I am currently in the process of re-learning D3.js, a javascript library used for data visualisation. That means I am always looking interesting datasets or existing visualisations I can remake. Over at reddit.com/r/dataisbeautiful some time ago one of the busiest pie charts I have ever seen was posted. It was showing how many players was playing at certain maps in the game Team Fortress 2. Unfortunately there where over 14 000 slices in this pie chart and it just didn’t make much sense.

                                                        Worldst busiest pie chart

I used about 5 minutes in Power BI to create a remake that was a bar chart in order to show why that is a better graph choice to show ratio between different categories. However, as some other users commented that removed some of the complexity which also is quite interesting. The fact that more than 14 000 maps are being played at in a month in a game that is ten years old is quite impressive.

                                                5 minute remake to show ratios better

So I ended up starting to think about a remake and asked the user who posted the pie chart if he had the dataset he had used. The internet is full of awesome people and in about one hour I had a csv files with 14 000 datapoints.

I have always gone code first, but I thought this time I would at least try to sketch some ideas on paper. I’m not the greatest at drawing, but that is also not the point. So I made this sketch. Initially I wanted to be able to create an overview which then could be broken down by some interaction. Unfortunately I did not have both the datasets I wanted at the time I built the visualisation, but I do now so maybe I have to revist this some time.

With that design in mind I built the visualisation. I’m not going to go into detail about how I did that here. You can see my cource code here if you want to have a look. The original is interactive so you can hover over a bubble to see more information.

                                           Finished visualisation

Context, context, context

Now, even though I made an interactive version I ended up deciding to post a static image of the chart. That was a mistake. The original pie chart had created quite a bit of discussion at /r/dataisbeautiful so my head took it for granted that everyone knew what this represented. They did not. Internet is big, and not everyone lives in one subreddit. So many users did not get the background or the context, and even though I had written that in a comment, comments are not read by a lot of Reddit’s users. The chart had no context and as of today the comment with the highest rating still reads: “Looks nice but no fucking idea what anything is representing”.

However, a lot of people did actually get what I was trying to show. How complex and how distrubuted the player base in Team Fortess is. I have gone through the thread and followed up question and such and as long as you can filter out the trolls and take things as constructive feedback you can get some really nice feedback. Where else would you get feedback from many, many thousands of people on your work?

  1. Context is hard on Reddit as a platform. You basically only have the title to explain as many users won’t read anything else, only look at the image.
  2. I should have posted the interactive version. It provides some more detail that people where missing. I also, for some reason, had removed the text stating “hover over a bubble for more information” in the static image. Including that text might have made more people go to the interactive version.
  3. Check more colorblind simulators. I used some simulators, but I think I forgot it with the last version. THere will be colorblind people when you share it with a userbase this big. If you need to test yours you can try this one.
  4. People in general have a lower threshold for saying negative things so you need to be able to filter comments quite hard and which you decide to listen to.
  5. It is written bubble, while boubble, as I had written all over the comments, is a river in France!

All in all, sharing your work is a really good exercise. Working on a project it is easy to go blind, but the internet won’t let you go easy if they see something to pick on.

If you create something you are proud of make sure to post it to /r/dataisbeautiful over at Reddit. We are always looking for original content!

                                    Gif of some steps of development

Creating a dynamic card with text value in Power BI

Alternative title: Finding string value from a dimension with highest numeric value

The company i work for uses Yammer and I have founded a Data Visualization group which I am, manually, keeping some statistics about using Power BI. The other day I found myself wanting a dynamic card in Power BI in order to highlight which day was the most active when it came to amount of posts. In order to do this I had to figure out how to make a measure returning not the max value, but the day which had the maximum amount of posts. My usecase is finding a weekday, but maybe you want to see which product is the most popular, which county has the most purchases or which salesperson has the most sales. All these cases should be able to reuse this measure.

 

First try

My inital thought was to create a table in my calculation and then slice it to return only one row and one column leaving me with one cell which had the day with the most number of posts. I made a measure which used a TOPN returning the row with most posts followed by a SELECTCOLUMN to select only the column which had the weekday in it. Now, this turned out to return me the overall correct day, but it did not work when I added a filter and the POwer BI visualization returned errors so I had to start over.

 

The solution

I have created three measures to solve this.

1) One simple sum of [Number of Posts]:


Number of Posts = SUM(Sheet1[NumberOfPosts])

2) Finding the day with the most posts by using MAXX


MaxPostsPerWeekDay = MAXX(VALUES(Sheet1[WeekDay]);[Number of Posts])

3) Using FIRSTNONBLANK on my WeekDay column and then return the value where the sum equals the maximum value. So in the end my measure looks like this.

Most Popular Day = 
IF(
    ISBLANK([Number of Posts]);
    BLANK();
    FIRSTNONBLANK(
        Sheet1[WeekDay];
        IF(
            [Number of Posts] =
            CALCULATE(
                [MaxPostsPerWeekDay];
                VALUES(Sheet1[WeekDay])
            );
            1;
            BLANK()
        )
    )
)

The first IF is to remove days that has no posts, in my case there is not much activity in the weekends so they will get filtered out. The beauty of this measure is that it is not limited to crads, but also easy to use with your filters and in tables where you would want it.

Tracking myself in 2017

2017 is here and once again I’ll have a goal of writing here at least one time every month. Last year I failed this goal, but this year? Let’s give it another go.

I have a list of projects I want to do and I’ve finally started to track what I am doing every day. I’ve thought of this for a while as I think it would be interesting to see if I am able to find out if there are parameters in my life that affect things like my mood and evergy level. To do this tracking I’ve made a very simple website that is hosted in Azure with an Azure SQL Database behind to store my data. This was needed to make the registration as easy as possible. I had an Excel sheet the first couple of days before I got this site up and running, and I would not recommand that approach at all if you’re going to do the same. It is a too big effort to open an excel sheet every time you drink something or whatever you want to register. So far I am very happy with how it is working out, but I have started to get quite the backlog of increasing size with ideas for improvements. I have told myself to not change anything during January and then do an evalutation of how I think it is working out and which changes are absolutely needed. I’ve also decided not too look at any of the data until January is over. I don’t want to be too aware of what I am logging in the beginning, but rather try to make logging into somethig I just do and then when I have some months of data I can see if something sticks out from them.

So far I have split registration into two parts. First part is activities during the day. Things such as what I eat, drink, if I work out and for how long etc. The other part is a registration I do at the end of each day. Here I note down my mood, energy level, stress level and overall feeling as well as some small notes about f.ex if I hang out with friends during the day or did 100 math puzzles on an app I downloaded. As mentioned I am hoping to see if there is something that affects f.ex my mood and I am able to more actively apporach these things in the future. Not really sure if this will pop up, but at least I’ll have some more data on myself and what I do during the day.

Ever done something like this or have ideas of things that are worth noting down during the day? Let me know. I know some people take these kind of things to the extreme and I probably won’t do that, but I’m always interested in new input or ideas.

Note at the end: I really like how easy it is to create a web app in Azure and everytime I now do a push to my master branch in my Git reposit the website updates within seconds. Easy to set up a Visual Studio Team Services account and create the reposit to start working right away.

Activity registration

Creating a Dynamic Dashboard of Datazen Dashboards

One of the projects I’ve used Datazen was a project that needed operational reporting. These dashboards was to be updated at least every 10 minutes and was placed on big screens around the work area where it was needed. We also created an overview, or landing page, for super users that needed to have quick access to all dashboards without waiting for the dashboards loop on the big screens.

In the beginning this landing page was just a static html site, as we didn’t want people to use the Datazen portal at that point. Every dashboard had a static thumbnail and when a user clicked on a thumbnail they were brought to the respective Datazen dashboard.

This worked fine, but it required every user to go into each dashboard to have a look at the status since the thumbnail did not change. So we decided to spend some time to create a dynamic landing page, or a dashboard of dashboards that showed real time thumbnails of the dashboards. By doing this they couldn’t necesserily see the numbers on all of the dashboards, but they would be able to see status and where things where green or red.

Datazen can be embedded in web pages by the use of iframes. Doing this gave us a more dynamic page that was refreshed when a user entered the site or after a given interval if they stayed on it. What you will notice if you try this is that a click on a dashboard will not open the dashboard itself, but instead you will be able to interact with the dashboard in the iframe. In our case, this was not what we wanted as the purpose was to just see the status and when a user clicked on a dashboard they were sent to that specific dashboard to get more details and in full screen.

In order to solve this we created a new <div> called clickCatcher with the same size as the Datazen “thumbnail”, and made transparent. This allowed us to display the dashboard in the size of a thumbanil, but open the dashboard when it was clicked instead of interacting with the small version. The code for one thumbnail is posted below.

It’s a neat little trick that made us able to create a solution that was easier, an better, for the customer. Too see how you we auto refreshed dashboard pages you can take a look here.

HTML

<div id="kpi" class="kpi">
<a href="LinkToDashboardPage" class="thumb">
<div class="clickCatcher"></div>
<iframe src="LinkToPublicDatazenDashboard" style="overflow:hidden;overflow-x:hidden;overflow-y:hidden" frameborder="0" height="157px" width="300px"></iframe>
</a>
</div>

CSS

.clickCatcher{
display: block;
background-color:rgba(255, 0, 0, 0.0);
height: 157px;
width: 300px;
position: absolute;
}

LoadingDashboards
All dashboards are refreshed when the landing page is opened
DashboardDashboardExample
How the dashboard might look. Clicking on a thumbnail will take you to the full size dashboard

 

Everyone should try improvisation at least once

Improvisation, a word that scares many people, appeals to some and a skill everyone uses without perhaps thinking about it. During my time at university I spent about four years doing improvisation theater. At that point in time it was a way for me to relax and do something completely different than studying, but when I look back at it now there are some many things I learned doing improv that I now use every day in my work life. In my opinion everyone should try to do improv at least once during their life.

When I mention I used to do improvisation many react with “How did you dare to do that? I wouldn’t know what to do!”, but had they thought out how this conversation with me would be before it started? Had they every line thoughtout before it started? Every day everyone improvises, life doesn’t come with a script. It might be in a meeting at work, with your kids, friends or family, you don’t have to be on stage to do it. There are some valuable points that are important in improvisation, but that I also believe everyone would benefit from having training in, and doing improv is just one way of doing this.

COMMUNICATION

When I ask people what they think about improvisation if they have seen it on stage they say it is funny, and improv is fun, but being funny is not the fundamental thing in improvisation. In my opinion the most important thing about improvisation is communication, and listening to the ones you interact with on stage, or off stage. If you don’t listen to what the other persons are saying you will not be able to tell a story that seems cohesive. If one actor tries to force his story through the other actors it will get noticed by the audience and the result is usually worse than if they had reacted to what was happening elsewhere and built upon that.

Now think about a meeting you have been in where some people clearly are not ready to listen to what the other meeting attendants have to say. Others try to come with suggestion, but they just keep forcing their view through. When you leave the meeting you feel like you have not been heard and maybe a feeling that the outcome was not the best it could have been. The communication only went one way and they where simply not open for new input. Which brings me to the next point.

SAYING YES

I remember this as one of the hardest things to learn, saying yes, and not block ideas from others. If you say no to a suggestion on stage, called blocking, you have killed the momentum in a story. Saying no is so much easier though, both on stage or in real life. By saying no, you don’t have to continue on what was started, you can start over with your own idea, but real magic happens when people say yes to new ideas and build upon them.

How easy isn’t it to say no to a new idea in a meeting? It is also quite lazy by the people saying no right away, you don’t need to think what this new idea means, what could come from it or how it can affect you, your company or your customers. I try to go into every meeting open minded, listen to what people say and every time a new idea comes up use at least a couple of seconds before I respond. Not all ideas are good, but I believe all ideas, and their creators, deserves to feel like they have been heard, and maybe someone else will be able to spin a new idea out from the original.

Try it out, say yes to more and think about how often your say no to things or ideas. Saying yes is not as easy as it sounds.

MENTAL EXERCISE

Communication and saying yes is what I believe is the two most important things I brought with me from doing improvisation, but iprov is also a great mental exercise and a way of really pushing your imagination and brain in a way it might not have been challenged before. When we had our weekly training we did some training on techniques to use on stage, but also a lot of games to expand our imagination and brain processing speed. ALl of this to be prepared to create magic on stage, but also to get to know everyone in the group and feel safe around them. And again this are things that relates directly to work life. In teams where there is a relaxed and safe environment people will dare to say the ideas they have, and I also think they will have more fun and be more productive. Knowing that your team will back yoo up and help you when you struggle or do a mistake is worth gold.

If you have the opportunity to attend an improvisation course I strongly recommend you try it out even though it might seem scary at first. You might get so much out of it, I know I have.

If you want to read more about improv I can recommend the book Improv by Keith Johnstone as a starter.

SunsetPeople

Datazen Guest Authentication Will Be Missed

About a week ago a blog post was released called “Announcing End of SUpport for Datazen Products“. Now, this shouldn’t come as a surprise since Datazen is a part of Reporting Services in SQL Server 2016, more specifically mobile reports. Overall this is great news, there was however a few points I was a bit disappointed in seeing was not part of the initial release.

A really great thing that is mentioned is that a focus area has been that Datazen customers should be able to migrate to SQL Server 2016 Reporting Services easily. This is great, no need in recreating everything you have done so far, and there has been mentioned a migration tool earlier so I hope this will be a simple case once customers start migrating HUBs, users and dashboards.

My biggest dissapointment was to see that Custom authentication and Public/guest authentication is not part of the initial release, but at least on the roadmap. I have struggled with SSRS on public webpages before, and Datazen gave me an easy way of doing this. Hacking a way so that a public Sharepoint portal had to impersonate a user when a user went to pages with SSRS elements was not an easy task, and that hack also did not work when the site was migrated from SharePoint 2010 to SharePoint 2013 so we had to do it all over again. And this was even the solution that Microsoft themselves said we should go for!

Authentication via SSRS have some clear advantages, but I’m still dissapointed that there was not room for guest authentication. In the end, a lot of information is usable and important for the public, and if it is not seen what use is it? I’m crossing my fingers that this will be implemented relatively quick. If we’re lucky maybe they are able to let us use SSRS paginated reports on public sites as well!

Creating a guest user for Datazen

If you ever want to have public access to your dashboards you can get this by creating a guest user and giving that user access to the dashboards you want to be publicly available. This is great if you want to place dashboards on a web page without haveing the users to log in to see them.

You can create a guest user by creating a new user on the server with this info:
Username: guest
Mail: guest@guest.com
Name: Guest

Now, remember that this user should only have access to dashboards that is intended for everyone to see. If you give it access to everything, business critical information can be available to anyone.

Using JSON functions in SQL Server 2016

With SQL Server 2016 we are finally able to analyze and query JSON data. It is not that often I use XML, but JSON is so much used it is about time we can use it in SQL Server. In this entry I’ll let you follow me as I take a first look at it using some data from New York Times.

New York Times has a web app called Chronicle, which lets you see how many articles has mentioned specific words and this data you can also export as JSON. I chose to use the words Radio, Televion, Mail and Internet and downloaded one JSON file per word (for some reason it doesn’t work to get the data for several terms at once). I also chose to remove one of the sets of square parenthesis since our graph_data only will have one term and not a list of terms. I end up with four files that looks like this.

graph_data

So the graph_data has a term that in our case is mail, radio, television and internet and an array of data which has the number of articles containing the number of articles with the term, the year and the total number of articles published in that year so we for example can calculate a percentage of how many articles the term was in.

In order to read the JSON from the file I first load the entire file into a variable using the following code

DECLARE @ChronicleMail VARCHAR(MAX)
SELECT @ChronicleMail = BulkColumn FROM OPENROWSET(BULK'C:\Users\CTP3\Documents\JSON\MailOrg.json', SINGLE_BLOB) q;
SELECT @ChronicleMail

The last line selects the text so we can have a look at what is saved in the variable, which basicly is just one long string.

Not so interesting so far, but lets start using the new JSON functions. The JSON_VALUE function returns one scalar value from a JSON string. If you try using JSON_VALUE on something returning an array you will get a NULL returned. To get an array returned you must use the JSON_QUERY function. This is fine, but if you want to insert your data into a table the function you want to use is the OPENJOSN that lets you reference some array in your JSON and then return the elements.

JSONValueAndQuery

In our case the data is an array so lets call OPENJSON and call it on the data.

SELECT
    *
FROM OPENJSON(@ChronicleMail, '$.graph_data.data')

JSONData

From the result set we can see that we now have one row per year in our data. Now that is cool and all, but we kind of want the values in different columns, not the entire JSON in a column with name value. To fix this you can add a WITH clause after the OPENJSON function.

SELECT 	
	[NumberOfArticles]
	,[TotalArticles]
	,[Year]
FROM OPENJSON(@ChronicleMail, '$.graph_data.data')
WITH(
	[NumberOfArticles] int '$.article_matches',
	[Year] int '$.year',
	[TotalArticles] int '$.total_articles_published'
)

JSONDataColumns

Excellent! We now have the data in a table structure and we can insert into an actual table or do whatever we want with it. I wanted to add the term to the data so I ended up just joining it to this dataset using the JSON_VALUE function to pull only the term.

After doing this for all for files I know have a table with all terms and data for each year and I am free to use whatever tool I’d like to visualize it, f.ex PowerBI or Datazen, or since we’re in SQL Server 2016 now we can make a mobile report in SSRS. I chose PowerBI for now and added a calculation for percentage and also added year as a date to produce this.

PowerBI

Lastly, not sure if fun fact worthy, but if you would have used the OPENJSON directly on graph_data you will be able to see the key and datatype of the other elements and from this you will see that you would have to use JSON_VALUE for the key term, while the JSON_QUERY for the key data since it is an array. All in all I think these JSON functions is a great addition to SQL Server in 2016, I am sure I will be using them quite a bit!