Step 4 - Using the SDS

Filesystem layout

The main filesystem is $HOME (9TB). The directory structure replicates that of a typical Census RDC node, on which both the synthetic data and the completed gold standard data reside:

  • /temporary/ for scratch space (for both SAS and Stata)
  • /rdcprojects/co/co00517 (for SSB; consult the )
  • /rdcprojects/tr/tr00612 (for SynLBD)

Data

The most current data are noted below. Older data releases may also still be present on the server.

  • SSB data resides in /rdcprojects/co/co00517/SSB/data/v5.1
  • SynLBD resides in /rdcprojects/tr/tr00612/data/synlbd/v2.0.2
  • Zero-obs datasets from the Census RDC are available under /rdcprojects/virtualrdc/, in locations that otherwise mirror their locations on the Census RDC. For example, the CBO files are in /economic/cbo/microdata on the Census RDC and in /rdcprojects/virtualrdc/economic/cbo/microdata on the SSG.
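
The zero-obs mapping described above amounts to prepending a fixed prefix to the Census RDC path. A minimal R sketch of that rule follows; the helper name to_virtualrdc is hypothetical (it is not something installed on the server), and only base R functions are used.

```r
# Root of the zero-obs datasets, as stated in the list above.
virtualrdc_root <- "/rdcprojects/virtualrdc"

# Hypothetical helper: translate a Census RDC path into the
# corresponding zero-obs location on this server.
to_virtualrdc <- function(census_path) {
  # Drop leading slashes so that file.path() does not produce "//".
  file.path(virtualrdc_root, sub("^/+", "", census_path))
}

print(to_virtualrdc("/economic/cbo/microdata"))
```

Keeping the prefix in a single variable means a program can be moved to the Census RDC by changing one line.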

User-created programs

Users should create programs OUTSIDE of their home directories (see backup policy below). Create a directory for your project under

  • /rdcprojects/co/co00517/SSB/programs/users/(LOGIN ID)
  • /rdcprojects/tr/tr00612/programs/users/(LOGIN ID)

This ensures ease of replication on the Census internal computers.
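
As an illustration, the per-user directory can be created from within R. This is a sketch only: the function name make_user_dirs is hypothetical, specXXX stands in for your login ID, and the data subdirectory follows the layout used in the SAS, Stata, and R examples on this page.

```r
# Hypothetical helper: create <base>/programs/users/<myid>/data
# and return both paths. Uses base R only.
make_user_dirs <- function(base, myid) {
  userdir <- file.path(base, "programs", "users", myid)
  datadir <- file.path(userdir, "data")
  # recursive = TRUE also creates any missing parent directories;
  # showWarnings = FALSE keeps reruns quiet if the directory exists.
  dir.create(datadir, recursive = TRUE, showWarnings = FALSE)
  c(programs = userdir, data = datadir)
}

# Example call for SynLBD (for SSB, base would be /rdcprojects/co/co00517/SSB):
# dirs <- make_user_dirs("/rdcprojects/tr/tr00612", "specXXX")
```

The same result can be obtained with mkdir -p from the command line; the R version is shown only because R is already used elsewhere on this page.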

The most robust way to ensure ease of replication is to NEVER hard-code paths. Suggested practice is to use macro/global variables to encode such paths:

SAS

%let base=/rdcprojects/tr/tr00612;
%let version=2.0.2;
%let myid=specXXX;
%let prefix=synlbd;

libname inputs "&base./data/synlbd/&version." access=readonly;
libname mydata "&base./programs/users/&myid./data";

data mydata.analysis_file;
set inputs.&prefix.1992c;
run;

Stata

global base /rdcprojects/tr/tr00612
global version 2.0.2
global myid specXXX
global prefix synlbd

global inputs $base/data/synlbd/$version
global mydata $base/programs/users/$myid/data

use ${inputs}/${prefix}1992c
...
save ${mydata}/analysis_file

R

base = "/rdcprojects/tr/tr00612"
version = "2.0.2"
myid = "specXXX"
prefix = "synlbd"
library(foreign)

inputs = paste(base,"/data/synlbd/",version,sep="")
mydata = paste(base,"/programs/users/",myid,"/data",sep="")

analysis_file <- read.dta(paste(inputs,"/",prefix,"1992c.dta",sep=""))
...
save(analysis_file,file=paste(mydata,"/analysis_file.RData",sep=""))

Statistical and other software

The following software is available:

  • SAS (9.3)
  • Stata (12.1 MP)
  • R (2.14.0)
  • OpenOffice (3.1)

SAS and Stata can be launched from the KDE/Gnome application menus. R must be started from the command line.

Packages

R: We regularly add certain R packages, but if you need anything in particular, please contact the Help Desk. Since you cannot access the internet from within the SDS, we will need to transfer the R packages for you.

Stata: We occasionally mirror the RePEc repository of Stata packages to /rdcprojects/public/. Users can install a package by running commands such as the following. In the first line, the last path component is the first character of the package name (here "e" for estout):

net from /rdcprojects/public/fmwww.bc.edu/repec/bocode/e
net install estout, replace

System

The server has 128GB of RAM and two 6-core CPUs @ 2.67 GHz, with Hyperthreading turned on (so there appear to be 24 cores).

Backup

Due to the restricted-access nature of the server, we provide backup of critical files. However, we do not back up all files on the system, so in order to ensure that your critical programs get backed up, please note the following backup policy:

  • Files in your home directory (/home/(userid)) (and your desktop) are NOT backed up.
  • Files under /rdcprojects/co/co00517 and /rdcprojects/tr/tr00612 are generally backed up, but user-created data files (in the users/ directories) may be excluded in the future.
  • User-created programs under /rdcprojects/{co,tr}/{co00517,tr00612}/.../programs/users are ALWAYS backed up.
  • Files in the scratch space are never backed up, and are regularly removed to efficiently manage space.

Keeping informed

By default, we will subscribe you to an announcement-only mailing list (virtualrdc-sds-l@cornell.edu) to notify you of any important information about the server.

  • If you wish to receive additional notifications about general changes or events at the VirtualRDC, you can subscribe to our RSS feed from our front page, or sign up via Google for email notifications of the RSS feed.
  • If you wish to be notified at a different email address, send an email to listmanager@list.cornell.edu with the body of the message stating "subscribe virtualrdc-sds-l".

Getting help

If you need further assistance, please consult our Help page on how best to direct your inquiry.