When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example, two ordered categorical vectors ranging from 1 to 5).
When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the “cor” command, with its method argument set to “spearman”, on a matrix of variables).
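As a minimal sketch of that step, the snippet below builds a Spearman correlation matrix with base R's `cor`. The data are simulated and the item names (q1, q2, q3) are made up for illustration; they are not from any real questionnaire. Note that `cor` accepts a whole matrix or data frame, whereas `cor.test` tests one pair of vectors at a time.

```r
set.seed(1)  # make the simulated answers reproducible
n <- 200

# Hypothetical questionnaire: three Likert items scored 1..5
base <- sample(1:5, n, replace = TRUE)
likert <- data.frame(
  q1 = base,
  # q2 follows q1 up to a shift of -1, 0 or +1, clamped to the 1..5 range
  q2 = pmin(5, pmax(1, base + sample(-1:1, n, replace = TRUE))),
  # q3 is unrelated noise
  q3 = sample(1:5, n, replace = TRUE)
)

# Pairwise Spearman (rank) correlations between all items
spearman_mat <- cor(likert, method = "spearman")
print(round(spearman_mat, 2))
```

If the data contain missing answers, `cor` also takes a `use` argument (for example `use = "pairwise.complete.obs"`) to control how they are handled.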
Yet, a challenge appears once we wish to plot this correlation matrix. The challenge stems from the fact that the classic presentation for a correlation matrix is a scatter plot matrix – but scatter plots don’t (usually) work well for ordered categorical vectors since the dots on the scatter plot often overlap each other.
There are four solutions for the point-overlap problem that I know of:
- Jitter the data a bit to give a sense of the “density” of the points
- Use a color spectrum to show when a point actually represents “many points”
- Use different point sizes to show when there are “many points” at the location of that point
- Add a LOWESS (or LOESS) line to the scatter plot – to show the trend of the data
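To make the last two ideas concrete, here is a rough sketch (not necessarily the code used for the plot in this post) that combines point sizes with a LOWESS line inside a base R `pairs` scatter plot matrix. The data are simulated, and the helper names `count_pairs` and `panel_sized_lowess` are made up for this example.

```r
set.seed(2)  # make the simulated answers reproducible
n <- 200

# Hypothetical questionnaire: three Likert items scored 1..5
base <- sample(1:5, n, replace = TRUE)
likert <- data.frame(
  q1 = base,
  q2 = pmin(5, pmax(1, base + sample(-1:1, n, replace = TRUE))),
  q3 = sample(1:5, n, replace = TRUE)
)

# Hypothetical helper: count the respondents at each observed (x, y) combination
count_pairs <- function(x, y) {
  tab <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
  tab <- tab[tab$Freq > 0, ]
  data.frame(x = as.numeric(tab$x), y = as.numeric(tab$y), n = tab$Freq)
}

# Hypothetical panel function for pairs()
panel_sized_lowess <- function(x, y, ...) {
  cnt <- count_pairs(x, y)
  # Different point sizes: point area grows with the number of overlapping answers
  points(cnt$x, cnt$y, pch = 19, col = "grey40",
         cex = 3 * sqrt(cnt$n / max(cnt$n)))
  # LOWESS line: smoothed trend fitted to the raw (unaggregated) answers
  lines(lowess(x, y), col = "red", lwd = 2)
}

pairs(likert, panel = panel_sized_lowess)
```

Scaling `cex` by the square root of the count keeps the dot's area, rather than its diameter, proportional to the number of answers it stands for. For the jitter approach, `points(jitter(x), jitter(y))` could be used inside the panel function instead.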
In this post I will offer the code for a solution that uses solutions 3 and 4 (and possibly 2; please read the comments on this post). Here is the output:
And here is the code to produce this plot: