Computing
I have put together some small tips and scripts for using R and Stata efficiently. For LaTeX presentation material, I created a separate page. The first part of this page discusses using an external text editor for efficient programming, and the second part explains how to run Stata and R code remotely on server computers.
Using an external text editor
When you write long, complicated program code (e.g., a Stata do-file or an R source file), the built-in text editors of Stata and R are probably not the best choice. I suggest that you use an external text editor instead. Some of the main strengths of external text editors are as follows:
- Parentheses matching and syntax highlighting available in text editors make it easier to detect mistakes in your code.
- Text editors are equipped with more powerful search-and-replace options than the built-in editors. With text editors, more systematic corrections and alterations of your code are possible.
- Text editors offer more editing options (e.g., the ability to "undo" your edits many times, the ability to assign more keyboard shortcuts, easy-to-retrieve templates, etc.).
External text editors are NOT only for sophisticated programmers. On the contrary, if you are one of those who never make careless programming mistakes, you probably don't need a powerful editor. But if you have ever been frustrated by small, annoying errors such as mismatched parentheses, an external editor will save the day. And most of the text editors introduced here are free.
for Mac OS X
I am no snob about text editors, so I use different editors for different purposes. (You might already have a favorite editor with which to take part in the notorious editor war.) Anyway, I use TextWrangler to edit Stata do-files, Aquamacs Emacs to edit R source files, and TeXShop for LaTeX typesetting. The latter two need little customizing, while TextWrangler needs some adjustment before it works efficiently as an external do-file editor for Stata. Below I explain how to customize TextWrangler to take full advantage of it.
- Syntax highlighting: Download and install language modules for Stata and R. The procedures are detailed in this blog entry (for Stata) and this blog entry (for R).
- Parentheses matching: Go to Preferences > Editor Defaults, and check "Balance while typing."
- Send to Stata: In writing a do-file, we normally proceed by trial and error; that is, we write some code, test it, and then modify it. You can test the commands you wrote by submitting the text to Stata from within TextWrangler, either a highlighted selection or the entire file. To set this up, follow the instructions in this blog entry at Dataninja and install several AppleScripts. You should assign shortcut keys to run these scripts, as explained in the blog entry.
- Send to R: I wrote an AppleScript (SendToR_sel.scpt) that submits a selection of your source file to R (hacked from Do Selection.scpt for Stata, introduced here). Both of these scripts should be saved in Username/Library/Application Support/TextWrangler/Scripts/. Again, don't forget to assign shortcut keys:
- Window > Palettes > Scripts
- Select the SendToR_sel script
- Choose "Set Key."
for Windows
Windows users have more options. I personally prefer Tinn-R for editing R code. WinEdt (along with the RWinEdt plug-in) is also good, but the plug-in is not free. Finally, Professor John Fox recommends Emacs/ESS. For more information about Emacs on Windows, consult ESS and XEmacs for Windows Users of R.
For editing Stata do-files, I've heard that Crimson Editor works well.
Stata and R on Rice's Unix Servers
You may want to run an extremely long statistical program that can take days or weeks to finish. In that case, you can submit your program to the Unix server and let it run in the background, so that you can log out from your computer and get some rest.
To use R, Rice students can use grid computing servers (such as SUG@R) or the VET server. The grid computers are much faster, while the VET server is easier to work with. Stata is available only on the VET server. Below, I explain how to set up the path and begin using these two statistical programs on Rice's Unix servers.
Obtain an account
To begin, you need a separate account in addition to the regular net ID. The VET account is easy to obtain. Just go to the IT website and apply.
To apply for an account on the grid servers, you need sponsorship from a Rice faculty member. Read this page for more information.
SSH to the server
Once you have an account, you can log in to the server through your terminal (the command prompt on a Windows machine, or Applications/Utilities/Terminal.app on a Mac). To log in to the VET server, type in:
Set the path
After logging in to a server, you need to set a path to the statistical program you wish to use. The path can be set in each ssh session, or you can set it permanently by modifying the .cshrc or .bashrc file.
Caution: Do not edit the .cshrc/.bashrc files carelessly.
Caution: Depending on the type of shell, you need slightly different commands. For example, SUG@R uses bash, whereas VET uses csh.
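As a rough illustration of the difference between the two shells, the snippet below prepends a directory to the search path in bash and shows the csh counterpart as a comment. The directory /opt/example/R/bin is a made-up placeholder, not the actual location on SUG@R or VET; substitute the directory given in the server's documentation.

```shell
# bash: works for the current session, or put it in ~/.bashrc to make it permanent.
# /opt/example/R/bin is a hypothetical placeholder directory.
R_BIN_DIR="/opt/example/R/bin"
export PATH="$R_BIN_DIR:$PATH"

# csh equivalent (for the current session or ~/.cshrc); kept as a comment
# because it is not valid bash:
#   set path = ( /opt/example/R/bin $path )

# Print the first entry of PATH to confirm the change took effect.
echo "${PATH%%:*}"
```

Prepending (rather than appending) the directory means the shell finds this copy of the program before any other copy that may be installed elsewhere on the path.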
On SUG@R
To set up a path to R on SUG@R, first you need to check the version of R available on the server. Issue the following command from the terminal:
Note: You are allowed to use R interactively only for debugging and the like. Large jobs must be submitted using PBS. For more information, read the instructions at SUG@R.
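A PBS batch job is usually described in a small shell script whose #PBS comment lines request resources. The sketch below writes such a script for an R job. The job name, resource request, walltime, and file names are all illustrative assumptions; the actual limits, and any queue or environment settings the cluster requires, should be taken from the SUG@R instructions.

```shell
# Write a minimal PBS job script to rjob.pbs (all names and limits are hypothetical).
cat > rjob.pbs <<'EOF'
#!/bin/bash
#PBS -N my_r_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=24:00:00
#PBS -j oe

# PBS starts jobs in the home directory; move to where the job was submitted.
cd "$PBS_O_WORKDIR"

# Run the R source file non-interactively; output goes to myscript.Rout.
R CMD BATCH --no-save myscript.R myscript.Rout
EOF

# Submit the job from the login node (commented out: qsub exists only on the cluster).
# qsub rjob.pbs
```

Once submitted, the job runs without an open terminal, so you can log out while it is queued or running.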
On VET
To set up a path for R, issue the following command from the terminal:
For Stata, issue the following command from the terminal:
Run Stata in the background
To run a do-file in the background, type:
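A typical form of such a command on a Unix system is sketched below. It assumes the Stata executable is called stata (some installations use stata-se or stata-mp instead) and uses a made-up do-file name, myjob.do; neither is confirmed for the VET server.

```shell
# Hypothetical do-file name; replace it with your own.
DOFILE="myjob.do"

# -b runs Stata in batch mode, which writes the results to myjob.log
# in the current directory.
# nohup keeps the job alive after you log out; the trailing & puts it
# in the background and returns the prompt immediately.
nohup stata -b do "$DOFILE" > /dev/null 2>&1 &

# Keep the process ID so the job can be checked (ps -p) or stopped (kill) later.
STATA_PID=$!
echo "Stata job started with PID $STATA_PID"
```

While the job runs, `tail -f myjob.log` shows its progress.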
Run R in the background
To run an R source file in the background, type:
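One common way to do this is with R CMD BATCH, sketched below. The file names myscript.R and myscript.Rout are made-up placeholders, and the sketch assumes R is already on your path.

```shell
# Hypothetical file names; replace them with your own.
R_SOURCE="myscript.R"
R_OUTPUT="myscript.Rout"

# R CMD BATCH runs the source file non-interactively and writes everything
# that would appear on the console to the output file.
# --no-save stops R from saving the workspace when the job ends.
# nohup keeps the job alive after you log out; & puts it in the background.
nohup R CMD BATCH --no-save "$R_SOURCE" "$R_OUTPUT" > /dev/null 2>&1 &

# Keep the process ID so the job can be checked (ps -p) or stopped (kill) later.
R_PID=$!
echo "R job started with PID $R_PID"
```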
Alternatively,