Daina Chiba

Ph.D. Candidate, Rice University

Computing

I have put together some small tips and scripts for using R and Stata efficiently. For LaTeX presentation material, I created a separate page. The first part of this page talks about using an external text editor for efficient programming, and the second part explains how to run Stata and R code remotely on server computers.

Using an external text editor

When you write long, complicated program code (e.g., a Stata do-file or an R source file), the built-in text editors for Stata and R are probably not the best choice. I suggest that you use an external text editor instead. Their main strengths include features such as syntax highlighting and parenthesis matching, both of which are covered below.

External text editors are NOT only for sophisticated programmers. On the contrary, if you never make dumb mistakes in programming, you most probably don't need a powerful editor. But if you have ever been frustrated by small, annoying errors such as mismatched parentheses, these external editors will save the day. And most of the text editors introduced here are free.

for Mac OS X

I am no snob about text editors, so I use different editors for different purposes. (You might already have a favorite editor with which to take part in the notorious editor war.) Anyway, I use TextWrangler to edit Stata do-files, Aquamacs Emacs to edit R source files, and TeXShop for LaTeX typesetting. The latter two need little customizing, but TextWrangler needs some adjustment before it works efficiently as an external do-file editor for Stata. Below I explain how to customize TextWrangler to take full advantage of it.

  1. Syntax highlighting
    Download and install language modules for Stata and R. Procedures are detailed in this blog entry (for Stata) and this blog entry (for R).
  2. Parentheses matching
    Go to Preferences > Editor Defaults, and check "Balance while typing."
  3. Send to Stata
    When writing a do-file, we normally proceed by trial and error; that is, we write some code, test it, and then modify it. You can test the commands you wrote by submitting the text to Stata from within TextWrangler, either as a highlighted selection or as the entire file. To set this up, follow the instructions in this blog entry at Dataninja and install several AppleScripts. You should assign shortcut keys to these scripts, as explained in the blog entry.
  4. Send to R
    I wrote an AppleScript (SendToR_sel.scpt) that submits a selection of your source file to R (hacked from Do Selection.scpt for Stata introduced here). Both of these scripts should be saved in Username/Library/Application Support/TextWrangler/Scripts/.
    Again, don't forget to assign shortcut keys:
    1. Window > Palettes > Scripts
    2. Select SendToR_Sel script
    3. Choose "Set Key."
    You can select several lines of your R source code by highlighting, and then send them to R.

for Windows

Windows users have more options. I personally prefer Tinn-R for editing R code. WinEdt (along with the RWinEdt plug-in) is also good, but the plug-in is not free. Finally, Professor John Fox recommends Emacs/ESS. For more information about Emacs on Windows, consult ESS and XEmacs for Windows Users of R.

For editing Stata do-files, I've heard that Crimson Editor works well.


Stata and R on Rice's Unix Servers

You may want to run an extremely long statistical program that takes days or weeks to finish. In that case, you can submit your program to the Unix server and let it run in the background, so that you can log out of your computer and get some rest.

To run R, Rice students can use the grid computing servers (such as SUG@R) or the VET server. The grid computers are much faster, while the VET server is easier to work with. Stata is available only on the VET server. Below, I explain how to set up the path and begin using these two statistical packages on Rice's Unix servers.

Obtain an account

To begin, you need a separate account in addition to your regular net ID. The VET account is easy to obtain: just go to the IT website and apply.

To apply for an account on the grid servers, you need sponsorship from a Rice faculty member. Read this page for more information.

SSH to the server

Once you have an account, you can log in to the server through your terminal (the command prompt on a Windows machine, or Applications/Utilities/Terminal.app on a Mac). To log in to the VET server, type:

ssh -Y username@vet.ruf.rice.edu
To log in to the SUG@R server, type:
ssh -Y username@sugar.rice.edu
You will be prompted to type in your login password.

Set the path

After logging in to a server, you need to set a path to the statistical program you wish to use. The path can be set in each SSH session, or you can set it permanently by modifying the .cshrc or .bashrc file.

Caution: Edit the .cshrc/.bashrc files with care.
Caution: The commands differ slightly depending on the type of shell. For example, SUG@R uses bash, whereas VET uses csh.
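
If you are not sure which shell an account uses, you can check before editing anything. The following is only a small sketch in Bourne-style shell syntax (sh/bash); the function name rc_file_for_shell is made up for illustration, and it assumes the SHELL environment variable holds the path of your login shell, which is the usual case.

```shell
# Map a shell path (e.g. the value of $SHELL) to the startup file
# discussed on this page. The function name is hypothetical.
rc_file_for_shell() {
    case "$(basename "$1")" in
        bash)     echo ".bashrc" ;;
        csh|tcsh) echo ".cshrc" ;;
        *)        echo "unknown" ;;
    esac
}

echo "Login shell: ${SHELL:-not set}"
echo "Startup file to edit: $(rc_file_for_shell "${SHELL:-}")"
```

On a csh account you can get the same information simply by typing echo $SHELL at the prompt.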

On SUG@R

To set up a path to R on SUG@R, first you need to check the version of R available on the server. Issue the following command from the terminal:

module avail
You will then see a list of the applications available on the server. If, say, R/2.9.0-gcc.4.1.2 is the latest version of R listed, type:
echo 'export PATH=$PATH:/opt/apps/R/2.9.0-gcc.4.1.2/bin' >> .bashrc
Type it exactly as shown, from your home directory. In particular, use two greater-than signs (>>) so that the line is appended to the existing .bashrc file rather than replacing it. To check that this worked, display the .bashrc file:
cat .bashrc
The last line of the output should be
export PATH=$PATH:/opt/apps/R/2.9.0-gcc.4.1.2/bin
Once the path is set, you can invoke R by typing R from the terminal.
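
One thing to watch out for: if you run the echo command more than once, the same line ends up in .bashrc several times. A minimal sketch of a helper that appends a line only when it is not already present is shown below. The function name append_once is my own invention for illustration, not something provided by the server.

```shell
# Append a line to a file only if that exact line is not already in it.
# append_once is a hypothetical helper name.
append_once() {
    line=$1
    file=$2
    touch "$file"                       # create the file if it does not exist
    # -x: match whole lines only, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example call (same line as above, safe to repeat):
# append_once 'export PATH=$PATH:/opt/apps/R/2.9.0-gcc.4.1.2/bin' "$HOME/.bashrc"
```

Calling it twice with the same arguments leaves a single copy of the line in the file.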

Note: You are allowed to use R interactively only for debugging and the like. Large jobs must be submitted using PBS. For more information, read the instructions at SUG@R.
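
To give a rough idea of what a PBS submission looks like, here is a sketch that writes a minimal job script for an R source file. The #PBS directives and the PBS_O_WORKDIR variable are standard PBS/TORQUE features, but the job name, resource request, and walltime are placeholders I made up, and SUG@R may require additional settings (such as a queue name). Treat this as an illustration and follow the SUG@R instructions for the real thing.

```shell
# Write a minimal PBS job script that runs an R source file in batch mode.
# write_pbs_script is a hypothetical helper; job name, nodes, and walltime
# are placeholder values, not SUG@R's actual requirements.
write_pbs_script() {
    rfile=$1      # R source file to run, e.g. yourRfile.R
    out=$2        # job script to create, e.g. myjob.pbs
    cat > "$out" <<EOF
#!/bin/bash
#PBS -N rjob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=24:00:00
cd \$PBS_O_WORKDIR
R CMD BATCH $rfile
EOF
}

# Example:
# write_pbs_script yourRfile.R myjob.pbs
# qsub myjob.pbs
```

The cd line makes the job start in the directory from which you ran qsub, so that R finds your source file and writes its log there.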

On VET

To set up a path for R, issue the following command from the terminal:

echo 'set path = ($path /usr/site/R/bin)' >> .cshrc
This command appends the line set path = ($path /usr/site/R/bin) to your .cshrc file. Again, you can verify the result by typing cat .cshrc from the terminal. Once the path is set, you can invoke R by typing R from the terminal.

For Stata, issue the following command from the terminal:

echo 'set path = ($path /usr/site/stata)' >> .cshrc
Once the path is set, you can invoke Stata by typing in stata from the terminal.

Run Stata in the background

To run a do file in the background, type

nice stata -b do yourdofile.do &
A log file named yourdofile.log is automatically created to store the results of your do file.
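
If you want to check on the job after logging back in, it helps to record its process ID when you start it. Below is a minimal sketch in Bourne-style shell syntax (sh/bash, so on a csh account you would put it in a script and run it with sh). The function name run_bg and the file name stata.pid are made up for illustration. It also wraps the command in nohup, so that the job ignores the hangup signal sent at logout.

```shell
# Start a command in the background at low priority, protected from
# hangup, and save its process ID in a file. run_bg is a hypothetical name.
run_bg() {
    pidfile=$1
    shift
    # Terminal output is discarded; Stata's batch mode writes its own log file.
    nohup nice "$@" > /dev/null 2>&1 &
    echo $! > "$pidfile"
}

# Example:
# run_bg stata.pid stata -b do yourdofile.do
# Later, to see whether it is still running:
# kill -0 "$(cat stata.pid)" && echo "still running"
```

kill -0 does not stop the job; it only tests whether a process with that ID still exists.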

Run R in the background

To run an R source file in the background, type

R CMD BATCH yourRfile.R &
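A log file is automatically created to store the results. Current versions of R name it yourRfile.Rout (the input name with any .R extension stripped and .Rout appended); on an older installation you may see yourRfile.R.Rout instead.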

Alternatively,

R --vanilla --slave < yourRfile.R &
also works.
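
When a job started with R CMD BATCH has been running for days, you will want a quick way to tell from the log whether it finished. The sketch below relies on two things R does by default in batch mode: it prints "Execution halted" when the script stops with an error, and it runs proc.time() at the very end of a successful run. The function name rout_status is hypothetical, and the check will report "running" forever if you started R with --no-timing, since that option suppresses the final proc.time() call.

```shell
# Report the state of an R CMD BATCH job by inspecting its log file.
# rout_status is a hypothetical helper name.
rout_status() {
    rout=$1
    if [ ! -f "$rout" ]; then
        echo "missing"       # no log file yet
    elif grep -q 'Execution halted' "$rout"; then
        echo "failed"        # R stopped with an error
    elif grep -q '^> proc.time()' "$rout"; then
        echo "finished"      # the timing call at the end was reached
    else
        echo "running"
    fi
}

# Example:
# rout_status yourRfile.Rout
```

To watch the log as it grows, tail -f on the log file also works.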
