Ever used shapes or images in Power BI and wanted a more flexible way of moving things forward and backward on your canvas? Yeah, me too!
Under “View” there is a pane called “Selection Pane”. When it is checked you will see all the different elements on your page, and you can arrange them as you like: the ones at the top of the list will be in front of the elements below them on your page. So if you use a shape for a background, as I do below, just pull it all the way to the bottom, and hepp! It is correctly placed as a background, and not covering the important stuff.
2018, another year where I got a new job, a year where my list of projects I’d like to do some time kept growing and a year where Twitter surprised me as a great discussion platform. To everyone I’ve talked to, everyone creating content I have seen, read or listened to and everyone that has read or heard anything I’ve done: Thank you for a great year!
New Job… Again…
When I graduated, doing technical stuff was the best part of my job: digging into data and writing code. To this day I enjoy this a lot. However, at some point I realized that what’s intriguing to me about the data and analytics domain is that it needs both the technical side and the more human side, where culture, change and organization come in. At every job change since my first one I have mentioned this, but when I started looking back earlier this year I realized that this has probably been taken as a positive trait in a developer, and not as a sign that I was interested in moving more towards the less technical side. I believe there is a huge advantage in coming from the technical side, but I was never able to get a project where I could work a lot more on the non-technical side. Even though I have tried, sometimes it comes down to the fact that a technical resource doesn’t always have the mandate to go as deep into the business side as I’d like. In the end I found out it was time to move on, and search for something a bit different from what I have done so far.
After a lot of thinking and a lot of interviews I ended my career as a consultant and started in a new job on the 1st of October as Head of BI & Analytics in Sector Alarm Group. It is a position I can shape a bit like I want, and since we are a small team, for now, I can still do technical stuff when needed or if I feel like my hands are getting too clean and need some real data shuffling. So far it has been all about finding the baseline for where we are, and where we need to go, but I have high dreams of what we can do in 2019, and also as we move further into the future!
Talks and Presentations
I’ve always enjoyed doing talks and presentations about topics I am interested in. 2018 started out with a presentation for the Microsoft Data Platform User Group here in Oslo, and several other presentations have followed, both internally and externally. #SQLSatOslo is always a joy to attend and speak at, and I have never spoken as much about data visualization as I have done in 2018, both in projects with clients and internally at my previous employer. I still believe data visualization is one of the best ways of making data more human, and I will continue to spread the importance of it in my new job, and at external events where I can.
Working With Graduates
In 2018 I’ve been able to work quite a lot with graduates, and this is one thing I’m going to miss by not working as a consultant in a relatively big company. Seeing how hungry graduates are to learn and jump into new things is extremely inspiring. And even though I sometimes feel like I am doing absolutely nothing because they do incredible work, I hope I have been able to provide them with a good understanding of data as a domain and a good foundation for their continued careers. I wish the best for every graduate I have been able to work with this year!
As most people who know me know, I love data visualization, and Twitter is my go-to source for this: #dataviz. Where I earlier was only looking for cool visualizations, this year I have noticed something else about this community. They have some really high-quality discussions on this forum where you can’t write more than 280 characters in one tweet! I am astounded that some of the best discussions and feedback loops come in this forum. Other forums, such as LinkedIn, too often become a place where everyone agrees on everything and it is all about trying to praise the poster and what they have done. The data visualization community can one day salute a great visualization, and the next day have an intense discussion where no one agrees, but where everyone still responds in a way that shows respect. It feels like they truly want to help out, and push both the poster and the data visualization field forward!
The biggest thing I know for 2019 is that we are expecting a new child at the start of February, and it is going to be great! Although I wonder how much we can have forgotten about babies over the past three and a half years. At work it is time to stop talking about what we should do and start doing some actual work: delivering new, user-friendly, great data products and platforms while building a business culture that is thirsty for data, and where data is an asset that everyone knows the importance of and uses every day in their work. High dreams? Hello! But we will start small both on the technical side and the business side to make sure they both can feed the other side with what they need.
Other than that I have hopes of crossing out more of the projects on my list of projects that I never get to do. I really wish the day had 27 hours, because cutting sleep isn’t really an option. I’d like to do more custom data visualization and perhaps even mix it with 3D printing. I hope to write a lot more, both technical stuff here, but also more in the form of fiction. I’d like to learn how to play the guitar better, I bought a harmonica at an auction which I’d like to use more, and I got a jaw harp for Christmas that I have to learn to play. And to top off the music part I am still working on turning datasets into music, not by using machine learning, but more as an art project. And the list goes on, most likely long into 2020.
2018 has been a great year when I look back, and here’s to 2019 being even better! And I hope you will be able to cross off some things on your list of things you’d like to do as well!
Have you ever been in a situation where you just know the next word to come out of a person’s mouth is the word “but”? In a meeting where someone says “I totally agree with you, but…” and then ends up in a rant that shows they don’t agree with you at all? In a feedback situation where your manager starts out with “You’re doing a great job, but…”? This small word is often misused and we end up feeling confused about what the message of the person in front of us actually was. It is time to be more aware of when we use it, and to completely stop using it in feedback situations.
Cambridge Dictionary defines “but” as a word “used to introduce an added statement, usually something that is different from what you have said before”. “But” can be compared to a mental eraser that simply removes whatever you said before. What happens in a feedback situation, or a meeting, is that when we use the word “but” people start to build up their defenses so that they stop paying attention to what you are saying and might miss the most important points.
When I studied I also did improvisational theater, and after each show we had a feedback round. In these sessions we had some basic rules:
Everyone can give everyone feedback.
If you hear something for the first time, forget about it. If it is the second or third time you hear it, you might want to consider doing something about it.
You are not allowed to separate two sentences with the word “but”. Stop the first sentence, then start the second as a new one.
To this day I really like these simple rules and I try to use them, for example when we have a retrospective in a project. It is hard to see yourself from all angles, so everyone has to be able to provide feedback for you to grow. If you hear something for the first time it might be a one-time mistake, so you shouldn’t focus too hard on changing something right away if it won’t happen again. Now, why did we try not to use the word “but”? My main reason is that it often nullifies both sentences separated by the word “but”. It makes us go from two potentially constructive pieces of feedback a person can work with, to no feedback they can, or want to, work with. In addition you can end up making yourself less trustworthy in the future.
I’ve come across articles saying that you can swap out the word “but” with “and”. For example “you are good at x, but you’re bad at y” can become “you are good at x, and if you keep working at y, you’ll be even better”. Now in this case I don’t really mind, but they are already rephrasing the second part of the sentence, so why can’t we just say part one, take a pause to let the receiver absorb what was said and then say part number two? A direct substitution of “but” with “and” is not recommended, as you’ll have to think about how you say it in order to not make it sound like a “but”. Does “you are good at x, and you’re bad at y” really sound any better than if we used “but”?
During the rest of the day, or in your next meeting, think about how often you, or others, say something like “yes, but…”. It is surprisingly often, and if someone says it as a response to something you said, think about what you feel when they say it. Do you feel respected? Understood? Do you feel good or bad? Try to make yourself more aware of when you use this little word and it might make your communication towards others clearer. Maybe you can even feel an attitude shift in the people around you by just using this small word less.
We should strive to make our communication, and especially feedback, as clear as possible, so don’t let the word “but” hold you back!
A very normal calculation we do is to compare a value, for example sales, against how we did last year. The easiest way to do this is by using the SAMEPERIODLASTYEAR function to create a measure that looks like this
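To make the same-date comparison concrete outside Power BI, here is a minimal Python sketch. It is not the DAX measure itself: `same_date_last_year`, `sales_last_year` and the sample figures are hypothetical names and data made up for illustration, and mapping 29 February to 28 February is an assumption of this sketch.

```python
from datetime import date


def same_date_last_year(d: date) -> date:
    """Return the same calendar date one year earlier."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        # 29 February has no counterpart in a non-leap year;
        # falling back to 28 February is an assumption of this sketch.
        return d.replace(year=d.year - 1, day=28)


def sales_last_year(sales_by_date: dict, d: date) -> float:
    """Look up the sales booked on the same calendar date last year (0 if none)."""
    return sales_by_date.get(same_date_last_year(d), 0)


# Hypothetical sample data, for illustration only.
sales = {date(2017, 9, 17): 250.0}
print(sales_last_year(sales, date(2018, 9, 17)))
```

The lookup compares calendar dates only, so the two dates will usually fall on different weekdays.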
As shown in the image above this works fine for a given date, but what if we want to look at how one week compared to last year? For retailers this is a very normal demand. How can we compare Monday in one week to Monday in the same week last year? After all, there is a big difference in how much is sold on a Saturday compared to a Sunday; the shop might even be closed on Sunday. For this we need to create a calculated column to help us out. We are going to call it DWY (for DayWeekYear) and we can do it like this
DWY = WEEKDAY(DimDate[Date])*1000000 + WEEKNUM(DimDate[Date])*10000 + YEAR(DimDate[Date])
Note: If you do not use the American calendar you may need to prepare the week number in another way to get the ISO week, for example in your date dimension. Then you can do the calculation. The important part is that it lines up with your definition of day in week, week number and year.
Here you can see that the 17th of September 2017 is a Sunday, but in 2018 it is a Monday, so we are unable to compare the two. Now, the beauty here with our new DWY column is that if we subtract one from a DWY value this year we get the same day in the same week last year. So, DWY Last Year = DWY - 1. Which is great! So we can use this to create a measure that compares day in week vs the same day in week last year.
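The DWY arithmetic can be checked with a few lines of Python. This is a sketch under the assumption that ISO weekday and ISO week numbers are used (Monday = 1), which differs from the default numbering of the DAX WEEKDAY function; `dwy` is a hypothetical helper name.

```python
from datetime import date


def dwy(d: date) -> int:
    """Build a DayWeekYear key: weekday * 1000000 + week * 10000 + year."""
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return iso_weekday * 1_000_000 + iso_week * 10_000 + iso_year


this_year = date(2018, 9, 17)   # Monday, ISO week 38 of 2018
last_year = date(2017, 9, 18)   # Monday, ISO week 38 of 2017

# Subtracting one only changes the year part of the key,
# so it points at the same weekday in the same week last year.
print(dwy(this_year) - 1 == dwy(last_year))
```

One thing to keep in mind with ISO weeks: a date in week 53 has no counterpart in a previous year that only had 52 weeks, so those days will not find a match.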
This works great when we look at a specific date, but you will see that the total row for the new measure is the same as the last value, and not the sum, so we have to extend it a little bit to make it sum up all the days we are showing. To do this we will use the SUMX function to create this measure
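The idea of summing per day, rather than evaluating the measure once for the whole selection, can be sketched in Python. This only mirrors conceptually what an iterator such as SUMX does; it is not the DAX measure itself. The helper names and sales figures are hypothetical, and ISO week numbering is assumed.

```python
from datetime import date


def dwy(d: date) -> int:
    """DayWeekYear key based on ISO weekday, ISO week and ISO year."""
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return iso_weekday * 1_000_000 + iso_week * 10_000 + iso_year


def sales_ly_week(sales_by_dwy: dict, d: date) -> float:
    """Sales on the same weekday in the same week last year (0 if none)."""
    return sales_by_dwy.get(dwy(d) - 1, 0)


def sales_ly_week_total(sales_by_dwy: dict, selected_dates) -> float:
    """Evaluate the per-day value for every selected date and add them up."""
    return sum(sales_ly_week(sales_by_dwy, d) for d in selected_dates)


# Hypothetical sales for Monday to Wednesday in ISO week 38 of 2017.
sales_2017 = {
    date(2017, 9, 18): 100.0,
    date(2017, 9, 19): 120.0,
    date(2017, 9, 20): 90.0,
}
sales_by_dwy = {dwy(d): amount for d, amount in sales_2017.items()}

# The same three weekdays in ISO week 38 of 2018.
selected = [date(2018, 9, 17), date(2018, 9, 18), date(2018, 9, 19)]
print(sales_ly_week_total(sales_by_dwy, selected))
```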
There you have it. But can we do more? In general I don’t like having more than one measure saying the same thing. We are looking at sales for last year in both measures we have created; it is just that one is comparing with a specific date and one is comparing with day in week. I would like to have just one measure that is Sales Last Year, and then we can say that if the user is filtering on a week or a weekday we will show them the day in week last year value, and otherwise we will show them the specific date. Now, this might vary from case to case, but if you keep both measures available to the users they will have to remember that they cannot use the Sales LY Week Total measure if they want to look at a specific date; they have to use the “normal” Sales LY measure. In my experience this is often a source of confusing user errors, so we can hide this complexity by combining the measures we have created.
Sales LY Total =
IF(
    ISFILTERED(DimDate[Week]) || ISFILTERED(DimDate[DayInWeek])
    ;FactSales[Sales LY Week Total]
    ;FactSales[Sales LY]
)
The finished result will behave as shown below. The same measure is used in both tables, but the one on the left is filtered on week while the one on the right is filtered by dates. If you have more ways of showing week or weekday in your date dimension, like the weekday name, remember to include them in the IF statement to make the measure work as we want it to.
A while back Stephen Few argued that data cannot be described as being beautiful. At that point in time I did not agree with this statement. I do think data can be beautiful, but as the term “smart data” is popping up more and more I find myself unable to use that term to describe data. Am I being a hypocrite? Can data be beautiful, but not smart?
To me data is one of the dumbest things out there; it just exists. Data on its own creates absolutely nothing, and it wouldn’t even have existed if someone didn’t create it. So even though I find myself saying that data can be beautiful, I don’t like the term smart data. Data can be structured, organized and formatted beautifully, but being smart? No!
By stating that data is smart we take away the credit from the many people who work with data each day. Smart people, really smart people! They are the ones that deserve the credit for all the amazing things data can be used for, not the data itself. Do we call trees smart just because we can do amazing things with them? No, but the inventors of paper, housing and so on, they were geniuses!
Give credit where credit is due. Data is lazy; it would lie on the couch all day if it could. The people that work with data, on the other hand, get the data up from the couch and create amazing products and services with it. They do all the dirty work and should also get all the glory!
In a project we are using Talend to load a lot of data each night, and we are randomly getting “Connection does not exists” error messages during our data load. This can happen at any time, both during the connection phase and in the middle of a read, and so far we have been unable to see any real signs of why and when it is happening. In addition, the connection reset often leads to our data being corrupt and unusable, meaning we have to start all over. We have therefore set up error handling when reading from this data source.
Setting up a try/catch in Talend
Create a context variable that we call continueLooping. This boolean will be used to end our loop when we reach our maximum number of attempts or the connection has been successful
Add a tJava where you initialize the variable to true
Then add a tLoop as a While and condition context.continueLooping
Now we add a tJavaFlex where our try/catch block will be. Put the try block in the start code and the catch block in the end code. Mine looks something like this. Feel free to add some logging in here as well so you can keep track of where the error is happening. In the main code I make the job sleep for a little while in order to give our connection some time to get back up.
Add a tJava with a “On component Ok” trigger on your database connection. Here set the continueLooping to false to stop the loop.
In the end it should look something like this:
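As a rough sketch of the same control flow outside Talend, the loop, the try/catch and the success trigger could look like this in Python. The names `read_with_retries`, `max_attempts` and `wait_seconds` are made up for this example, and catching `ConnectionError` is an assumption; in the Talend job the corresponding code is Java.

```python
import time


def read_with_retries(read_from_source, max_attempts=3, wait_seconds=5.0,
                      sleep=time.sleep):
    """Call read_from_source until it succeeds or max_attempts is reached."""
    continue_looping = True   # plays the role of context.continueLooping
    attempt = 0
    result = None
    while continue_looping:
        attempt += 1
        try:
            result = read_from_source()
            continue_looping = False   # success: stop the loop
        except ConnectionError as error:
            print(f"Attempt {attempt} failed: {error}")
            if attempt >= max_attempts:
                raise   # give up and let the caller handle the error
            sleep(wait_seconds)   # give the connection time to come back
    return result
```

Passing `sleep` in as a parameter is only there to make the sketch easy to try out without actually waiting between attempts.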
Extending the error handling
In our case we are already looping our read by using a job above this one to read data one month at a time. The output of this job is large .csv files which we then upload to Azure blob storage in order to use PolyBase to finally move the data into our SQL DWH data warehouse. Since we know we can lose the connection in the middle of our read, we need to clean up our .csv files before starting the loop all over for the month that we are reading. This is done by adding an If trigger on the tJavaFlex that checks which iteration we are in. If we are not in the first iteration of our loop, something has gone wrong and we need to do some cleanup to make sure our data is correct in the end. We therefore remove all rows for the month we are supposed to read before we let the loop start over. Now, the only way I have been able to do this is by creating a copy of our existing file, filtering out the rows for the current month and then writing it back as the original file. In the end it looks like this:
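Outside Talend, the copy, filter and write-back step could be sketched like this in Python. The column name `date`, the `YYYY-MM` month format and the function name `remove_month_rows` are assumptions made for the example; the real files will have their own layout.

```python
import csv
import os
import tempfile


def remove_month_rows(csv_path, month, date_column="date"):
    """Rewrite csv_path without the rows whose date starts with month (YYYY-MM).

    Returns the number of rows that were removed.
    """
    directory = os.path.dirname(os.path.abspath(csv_path))
    # Write the filtered copy next to the original file first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    removed = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            os.fdopen(fd, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[date_column].startswith(month):
                removed += 1   # row belongs to the month being re-read
                continue
            writer.writerow(row)
    # Replace the original file with the cleaned copy.
    os.replace(tmp_path, csv_path)
    return removed
```

Writing to a temporary file and swapping it in at the end means the original file is only replaced once the filtered copy is complete.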
Overall it seems to work very nicely when we are unable to trust that our data source will keep our connection open for the whole duration.
Continuing our journey on applying Andy Kirk’s tips and tricks from his series “The Little of Visualization Design”, we are now at part 2, Clever Axis Scaling. As last time, I suggest you read his post first so we already have some common ground.
Why use clever axis scaling
Clever axis scaling is a tool for creating some drama in your visualization. It can also help you highlight values and draw your consumer’s eye towards them. Things that stand out will get attention; our brain is simple in this regard, and in this case that is what we are after.
In the example the y-axis is set to 50, but the maximum value is 76. Now, my quick thought was “Great! This is easy, let’s just set the y-axis to 50 so the helper line is 50”. This is easy to do, however the chart actually gets cut at 50. So it ends up looking like the image below, which is not what we want at all. We are now hiding the most interesting data!
So I tried setting it to 100. Now, that works okay, but the highest line on the y-axis is now not 50, so the dramatic effect of 76 shooting way above the last helper line on the y-axis disappears, even though we still see the biggest increase on the chart to the left. So what is the solution? You need to try out what works best with your data. In this case it seems to work fine to choose 76, the maximum value in the chart. Now, this will not always be the case, because we cannot control how many lines on the y-axis we get. If I make the chart taller you can see that the numbers on the y-axis change as well.
Use it carefully
This solution also has the drawback that you are hardcoding the minimum and maximum value. So if you suddenly have a value higher than 76 you will lose it! In the end it comes down to what you want to tell with your chart, whether your chart is going to change values often and how dramatic you want it. If you have no idea how your numbers will behave in the future I would not advise you to hard code min/max values unless you need it for a specific occasion like a presentation. When you are done with that specific occasion I suggest you turn them back to automatic to minimize confusion.
As with all tools, Power BI has some limitations compared to custom code, for example something like D3.js, where you can do absolutely everything you want! Having these limitations can make it a challenge to use all these tips and tricks going forward, but we will do the best we can! In this case we might have some problems trying to create a more dramatic effect in our storytelling, both with the axes, as we have seen here, and with data labeling, as Power BI does not let you choose which data points to highlight. So labeling only the highest value, without hovering over it, is not possible. Or at least I didn’t manage to do it, but if you do please let me know how you did it.
Also, if anyone has a way of hiding the ESRI logo in Power BI Desktop please let me know. It is not pretty and it is driving me crazy!
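A small check like the following can tell you whether a hard-coded axis maximum would cut off any values. It is a generic Python sketch with made-up numbers, meant to be run outside Power BI; `values_cut_off` is a hypothetical helper.

```python
def values_cut_off(values, axis_max):
    """Return the values that would be cut off by a fixed axis maximum."""
    return [v for v in values if v > axis_max]


# Hypothetical data where 76 is the highest value, as in the example chart.
values = [12, 30, 45, 76]

print(values_cut_off(values, 50))   # the highest value is hidden
print(values_cut_off(values, 76))   # nothing is hidden
```

If the list is not empty for the data you are about to show, the fixed maximum is hiding something and automatic scaling is probably the safer choice.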
Isn’t the beauty of beauty that we all find different things beautiful?
The other day Stephen Few posted a blog post named “Data Is Not Beautiful”. I have strong reasons to believe this was written because we at /r/dataisbeautiful contacted him and wondered if he was interested in doing an AMA, Ask Me Anything, where our users could ask questions and he could provide his point of view and thoughts about data visualizations. Now, I have great respect for Stephen Few and I often send people interested in data visualization his way to read his material, because he is so black and white and therefore it is easy to grasp what he thinks good data visualization is. In this blog post, however, I think he misses the mark.
My beautiful might not be your beautiful
I have never been a fan of discussions regarding what is beautiful. What is beautiful is completely up to the consumer of some material, which can be music, art, nature or even data. Everyone has different taste, and I never feel like these kinds of discussions solve anything or contribute to moving a discussion forward in any way. The goal often seems to be simply to put oneself above others by claiming that they are not entitled to call anything beautiful. However, isn’t the beauty of beauty that we all find different things beautiful?
Data can indeed be beautiful
There are many things that can be beautiful about data, and even though I understand Few’s point that this means it is the attributes of data that are beautiful and not the data itself, I find that to be a weird formulation. What about music is beautiful? It’s the combination of attributes: the composition, volume, chords, etc. All of which are attributes of a piece of music, which indeed can be beautiful.
Attributes of data can be things like structure, format and how it is organized, and beyond that we can find a story to be beautiful. However, even without the story the data itself can be in what I’d call a beautiful state. It is not often I go to a customer and find beautiful data, but when I do it is a beautiful sight indeed! Data that is well organized, formatted and of good quality. I have no problem with using the word beautiful to describe this phenomenon.
I believe that data can indeed be beautiful, but in the end even if the data is beautiful it has little to no value if all it does is lie hidden within a database or an Excel sheet. Therefore it is our job to take that data and turn it into something useful for example through good visualizations. And when it comes to this I believe this older blog post from Few, “Should Data Visualizations Be Beautiful” is a lot more interesting than his recent post.
On a last note: As the moderator that reached out to Stephen Few and asked if he was interested in doing an AMA, I still think it is a shame he declined. I personally find his point of view very interesting and I am sure a lot of users on /r/dataisbeautiful would have thought so too. There is something refreshing about a person speaking so loudly about what he believes is right, so if he at any time changes his mind we would be very happy to have him do an AMA at /r/dataisbeautiful.
I got some great news during my vacation: I got a mail saying I was selected to give a talk at SQLSaturday in Oslo this year. I also got notified by the tweet below, which I had completely forgotten from last year. Nice surprise to see I have successfully reached a goal of mine, even though I had forgotten I had set it! I’ve been to every SQLSaturday that has been held in Oslo and it has always been a great event, so I am looking forward to being able to contribute with a talk myself this year!
My talk is titled “Data Visualization – More Than a Hygiene Factor”, based on a quote from this Medium Post. You can read my abstract below.
"For many companies data visualization is still a hygiene factor; necessary but not crucial"
In a world where everyone wants to use data to drive their business forward it is important to be able to communicate and speak the language of data even though data itself can be complex. One way of doing this is by making good data visualisations. Good data visualisations are engaging, they are informative and they let your data tell you its story. Too often data visualisation gets a low priority, leaving the final result feeling lackluster and the users uninspired.
In this session we look at some data visualisation principles and best practices, in order to deliver your message with a clear point of view and minimize confusion. Lastly we will look at how you can use these practices with Power BI in order to improve how data can be communicated to your end users in the best possible way, making them come back over and over.
SQLSaturday is a free 1-day training event for Microsoft Data Platform and SQL Server professionals, providing a variety of high-quality technical sessions. If you work on the Microsoft Data Platform SQLSaturday is a great way to get inspired and hear about new things. You can find more information about SQLSaturday, September 2nd in Oslo here!
Andy Kirk has an excellent series called “The Little of Visualization Design” where he gives small tips and tricks that can improve your data visualizations. If you have not seen it I strongly recommend it. Now, what I am going to do every week after summer vacation is to try and show you how you can take these tricks and use them with Power BI. But let’s kick-start it now with part 1, dual labeling. I suggest that you read the original post by Andy first so we are on common ground about what we are going to look at, which is this pie chart.
Dual labeling. It is surprisingly normal to see, and it generates more clutter on your data visualisation than you need. Repeating something will not make things clearer; it will just create more ink on your graph and make it harder to focus on what’s important.
Now if you punch in the data and create a pie chart in Power BI we get what is shown below.
So Power BI does not give you a dual labeling issue up front, but it is quite easy to reproduce it with Power BI. In the “Format” pane you have a bunch of options which usually are great, but you have to use them with care and have a clear vision of why you are changing the original chart. If not, you can end up with all of these different variations.
The one in the bottom left is probably the closest to the one in the original post. It has dual labeling, and it has quite similar colors on the pie slices. Andy Kirk’s proposed solution is to remove the labeling and provide it directly on the pie, since the colors in the original graph are so similar. Now, that doesn’t sound too far away from the default graph that Power BI provides us with. However, the default is not perfect and here is what I would do in order to improve it:
In Label Style choose “Category, data value”. This makes us see the actual number.
Increase font size of detail label.
Increase font size of the title. In general I think all default font sizes in Power BI are too small. I always feel like I need stronger contact lenses when creating a chart…
Sort the chart by value so the slices appear in order of size.
Note: I had originally made the font size of the detail label a bit bigger. However, this made the detail label for Canada disappear. Probably because it would take up the same space as Israel. So I wish they could make the position of the label a bit more dynamic.
In the end we end up with the chart below. So all in all the default chart Power BI created wasn’t too bad, but it could be improved. And make sure you are aware that not all options in the format pane in Power BI make your data visualisation better; they could make it worse!
I’m looking forward to some weeks of summer and then I’ll continue this series when I am back! Thanks for reading. If you have any questions or feedback drop me a comment, it is greatly appreciated.