I rarely use R in Power BI, but when I do, I make sure to keep the data mashup out of it. I don’t have anything against R, it’s arguably the best statistical tool out there, however when it comes to using an expressive language for doing data wrangling, Power Query (M) is my language of choice.
Although R has been introduced to Power BI ecosystem as a complimentary tool, its capabilities exceed by far those of the Microsoft’s analytics suite. That is why in areas where Power BI is falling behind, I can always rely on R to bridge the gap.
Below I describe 5 patterns of how I use R.Execute() in Power Query. I have divided these in 2 classes – as intended and unintended use cases – just to separate the expectation for how much support these patterns will be getting from Power BI team.
Note: When using R.Execute() the only rule I follow is that if I need some data out of an R routine, I must put the execution results into a dataframe. Other than that all R rules apply.
The intended use of R in Power Query
Executing an algorithm
This is probably one of the most legit use-cases for using R in Power BI. Because there is no language/framework/library that can compare to the vast variety of algorithms implemented in R. If you need to run some exotic algorithm on your data, most likely there is an optimised version written an R, you just have to find the right package (and the right repository: CRAN or MRAN to name a few).
Here is a standard R call executing KMeans clustering on iris datataset.
Calling an R function on Power Query input
Regex is the perfect example here. While we wait for its implementation to arrive in Power Query, why not use R regex related functions regexpr, regmatches, etc?
Here is an example for parsing @mentions and #tags from a twitter feed.
Unintended uses of R in Power Query
My favorite ones of the lot. I must admit these are the patterns that I
abuse the most.
Performing file based operations
Say you have some CSV data stored in a ZIP file on your file system somewhere, the only way to get a hold of that data is either by manually unzipping the data, or use one of the lovely UnZip functions written in Power Query by either KenR or Mark White. The R way is even simpler, use read.csv(unz(zip,file.csv)) and you’re done!
Exporting data from Power Query
You’ve done your data analysts in Power Query and you’re stuck with all this data in the $Embedded$ Tabular Data Model. The only way to get the data out is by exporting this data via a table chart, into a CSV. On the other hand you can use R, and export effortlessly millions of records via write.csv function and not worry about rendering problems that might be caused by the table chart visualisation.
Using STDOUT as a data source in Power Query
This pattern was inspired by one of the questions from the Power BI community. Again, referring back to R, you can call a system commands via system2() function and use its console output as a data source in Power Query.