A very common task in data analysis is plotting values from multiple sources. For instance, you may have a bunch of .csv or .txt files and want to plot some values from them. You can import them manually one by one, or find a better way. With RStudio this task can be accomplished very simply. Let’s first take a look at some files to import; see the following picture.

[Figure: some .csv files to import, opened in a text editor.]

In this example I have 50 .csv files in the same folder, each with three columns and 50 rows.

The first task is to set the working directory to the folder where the files are stored. Menu: Session -> Set Working Directory -> Choose Directory.

Or use the following command in the console:

setwd("~/Dropbox/MADOC_Tests_for_parameters/Zachary/P40 G50 M0.05/plots")

The second step is to import all the files into RStudio as data with the following commands:

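# list.files() expects a regular expression, so match names ending in .csv
temp = list.files(pattern="\\.csv$")
# Create one data frame per file, named after the file (e.g. `plot-1.csv`)
for (i in seq_along(temp)) assign(temp[i], read.csv(temp[i]))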
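
As an alternative to creating one variable per file with assign(), the files can be kept together in a single named list. The following is only a minimal sketch under some assumptions: it writes three small made-up files to a temporary folder so that it can run anywhere, and the names demo_dir, files and data_list are purely illustrative.

```r
# Made-up demo data: three tiny CSV files in a temporary folder
# (an assumption for illustration, not the real files from this post)
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  demo <- data.frame(Generation = 1:5, Best = i / 10 + (1:5) / 100)
  write.csv(demo, file.path(demo_dir, paste0("plot-", i, ".csv")), row.names = FALSE)
}

# list.files() takes a regular expression; "\\.csv$" matches names ending in .csv
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into one list and name each element after its file
data_list <- lapply(files, read.csv)
names(data_list) <- basename(files)

# Access one data set by its file name
head(data_list[["plot-1.csv"]]$Best)
```

With a list, the whole collection can be passed around or looped over as one object instead of 50 separate variables.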

Once this is done, you can access the data from RStudio and interact with it, including plotting.

To plot several lines, use the following commands:

# Empty plot that only sets up the axes and labels
plot(1,type='n',xlim=c(1,50),ylim=c(0.0,0.42),xlab='Generations', ylab='Fitness')
# sample(rainbow(50), 1) picks a single random color for each line
lines(`plot-1.csv`$Best, type='l', col=sample(rainbow(50), 1), lwd=2)
lines(`plot-2.csv`$Best, type='l', col=sample(rainbow(50), 1), lwd=2)
lines(`plot-3.csv`$Best, type='l', col=sample(rainbow(50), 1), lwd=2)
# ... repeat for as many .csv or .txt files as you want to plot ...
lines(`plot-50.csv`$Best, type='l', col=sample(rainbow(50), 1), lwd=2)

Here, xlim=c(1,50) sets the X axis from 1 to 50 (the number of rows in my files), ylim=c(0.0,0.42) sets the Y axis from 0 to 0.42 (the maximum value that I expect to see), xlab is the label for the X axis, and ylab is the label for the Y axis. With type='n', this creates the plot without drawing any points.
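
If the number of rows or the maximum value is not known in advance, the axis limits could also be computed from the data instead of being typed by hand. This is just a sketch: the list runs below holds made-up stand-ins for the imported data frames, and it assumes each one has a Best column.

```r
# Made-up stand-ins for the imported data frames (assumption: each has a Best column)
runs <- list(
  data.frame(Best = c(0.10, 0.20, 0.30)),
  data.frame(Best = c(0.05, 0.25, 0.40))
)

# The longest series gives the X range; the pooled values give the Y range
n_rows <- max(sapply(runs, nrow))
y_range <- range(unlist(lapply(runs, function(d) d$Best)))

# Same empty plot as before, but with limits taken from the data
plot(1, type = "n", xlim = c(1, n_rows), ylim = y_range,
     xlab = "Generations", ylab = "Fitness")
```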

The lines are added one by one (I wrote a small Java program to do it, but a better way would be to build a for loop, just like the one used to read the files).
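
A for loop along those lines might look like the sketch below. It is not the code used for this post: it generates three small made-up files in a temporary folder so that it runs on its own, and the names demo_dir, files, cols and run are illustrative.

```r
# Made-up demo data: three tiny CSV files in a temporary folder (assumption)
demo_dir <- file.path(tempdir(), "loop_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  demo <- data.frame(Best = seq(0.05, 0.1 * i, length.out = 10))
  write.csv(demo, file.path(demo_dir, paste0("plot-", i, ".csv")), row.names = FALSE)
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
cols <- rainbow(length(files))  # one distinct color per file

# Empty plot first, then one lines() call per file
plot(1, type = "n", xlim = c(1, 10), ylim = c(0, 0.42),
     xlab = "Generations", ylab = "Fitness")
for (i in seq_along(files)) {
  run <- read.csv(files[i])
  lines(run$Best, col = cols[i], lwd = 2)
}
```

For the real data, demo_dir would be replaced by the working directory and the xlim value by the number of rows in the files.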

In each lines() call we set the data read from the .csv file, the variable to plot (in this example $Best), the type of the line ('l'), and the color: col receives a random color from the function rainbow(50), where 50 is the number of lines to plot.
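
One thing to keep in mind is that independent random draws can pick the same color for two different lines. The short sketch below contrasts a single random draw with indexing into the palette; the variable names are illustrative, and set.seed() is there only to make the draw repeatable.

```r
palette_50 <- rainbow(50)   # 50 colors spread around the color wheel

set.seed(1)                 # only so the random draw is repeatable
one_random <- sample(palette_50, 1)   # a single random color

# Indexing the palette by line number gives every line its own color
line_colors <- palette_50[1:3]
```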

The result is below. I hope this can be useful for someone else.

[Figure: Rplot, the resulting plot.]
