I’m pleased to announce the release this week of my newest course at Pluralsight, Getting Started with R in the Microsoft Data Platform. If you’re not sure why you would need R in SQL Server or how you would go about implementing it, this course is a great introduction that explores various nooks and crannies. Sometimes it’s just easier to see how it’s done rather than simply read about it.
Full disclosure…the course is not free. Unless you’ve never used Pluralsight before, in which case, you can sign up for a free trial of up to 200 minutes over 10 days. My course is 173 minutes, just under the upper limit of the trial. If you already have a Pluralsight subscription and are R-curious, please go check it out.
Today I’d like to explain who the intended audience for this course is and what the course covers (and doesn’t).
Who is this course for?
My course is NOT going to teach you R. At least that’s not its explicit goal. There are plenty of R resources at Pluralsight available to cover this topic at various levels. The good news is that you don’t already need to know R in order to understand the key points of the course. So who is it for?
- If you’re a DBA that needs to support R implementations, you’ll see where R fits into the stack and understand what configuration tasks need to be performed on the server to support R functionality. You’ll also get a taste of what the R folks are going to do to your server, so that you can plan accordingly.
- If you’re a data professional that hasn’t yet learned R, you’ll see a variety of working examples to give you a sense of what’s possible across the stack.
- On the other hand, if you’re a seasoned R practitioner and have access to SQL Server or Power BI in your environment, you’ll see how you can leverage these tools to process and analyze data and to share your results with others.
R in SQL Server
While R has been a tool for data analysts for over 20 years, it’s a technology that has long been constrained by hardware limitations of a user’s computer which typically doesn’t have the CPU or memory of a server. In 2010, Revolution Analytics released an R distribution that lifted these constraints and enabled parallel processing to improve performance of processing very large data sets. Several years later, Microsoft acquired Revolution Analytics and embedded this technology into SQL Server 2016.
As I was developing my Pluralsight course, the SQL Server 2017 changed R Services to Machine Learning Services which enhanced the R library and added Python support. I did add some coverage of implementation and configuration tasks specific to SQL Server 2017 as well as specific new features such as native and real-time scoring.
After exploring the configuration of SQL Server to ensure you have all the components necessary for using the R capabilities added with R Services or Machine Learning Services, the course highlights the functions in the RevoScaleR package (created originally by Revolution Analytics) that enable you to load data into SQL Server, perform operations on SQL Server data, or to extract data from the server.
This section of the course also shows you how to operationalize your R scripts for execution on SQL Server. Specifically, you learn how to encapsulate your R code into a T-SQL script.
If you use R to explore data by using visualizations, you’ll be interested in the section that explains how to generate image files on SQL Server. And if anyone out there is still using Reporting Services (which I sincerely hope because it’s near and dear to my heart!), you’ll learn how to display an R visualization in a report by calling an T-SQL stored procedure.
R is not just about exploring or visualizing data. A powerful capability is predictive modeling, which of course is supported by many different RevoScaleR functions. I briefly describe the range of modeling support available, and then demonstrate how to perform predictive modeling on SQL Server by using RxDForest, a decision forest algorithm. I also show you to evaluate the model.
R in Integration Services
There are several ways that you can use in Integration Services. One way is to call an R script directly by using an Execute Process task. Or you can call a T-SQL stored procedure that runs an R script, either by using an Execute SQL Task in the Control Flow or in an OLE DB data source component in the Data Flow. I show you all the options in the course.
R in Power BI
Power BI can do so much, and of course it can do R, too! One option is to use an R script as a query to bring data into the model. I’ve done all kinds of crazy things with this feature. In this course, I show you how to get and transform some weather data for a specific time and place (which is not the craziest thing I’ve ever done with R), but also I show you how to parameterize the query so that you can customize the results to whatever you want.
Of course, the techniques that I describe can be used for any other type of data. The key is that you see what’s necessary to use a script, how the data is handled by Power BI, and how to use arguments in an R script to pass in variable values dynamically.
Also, I explain how to use R visualizations in Power BI. There are a few gotchyas that I explain as well. While Power BI has some pretty cool ways to visualize your data, the ability to use R really expands your options.
As I was working on the last module of this course, I had an environment with a mix of R libraries on it which led me to encounter an issue in Power BI that puzzled me for a bit. The Home Directory setting didn’t behave as I expected, which led me to write a post that describes how to determine which R library tree is in use.
Hopefully something in this course description sparks your interest enough to go watch some or all of the course. The Pluralsight platform makes it easy to go through a course from beginning to end (and you can speed it up to get through it faster) or to jump into a particular topic of interest.
If you have any questions about using R in the Microsoft Data Platform, please leave a comment below. And if you have any suggestions for future Pluralsight courses, I’d love to hear those too!