What computing resources are available at the Economics Department

ECCO is a Linux compute cluster. As compute clusters go, it is small compared to XSEDE or similar clusters; compared to a typical single workstation under a desk, it is large. Size, however, is only part of the question of when and how to use it. To assess whether ECCO is appropriate for your specific research, work through the scenarios outlined below and consider the other resources listed after them.

Scenarios

  • MPI-type jobs will require several separate nodes, or multiple cores on the same node
    • Learning: To learn how to configure MPI, the older ECCO nodes are sufficient, and preferable to the newer nodes (note that the oldest nodes have about the same amount of memory per CPU as the newest AMD node)
    • Research: Actual computation should use the largest group of homogeneous cores in the cluster (such as the AMD nodes), or move to an even bigger cluster elsewhere (see below).
  • Single-threaded jobs (most run-of-the-mill Stata, SAS, Matlab jobs) require the fastest CPUs
    • Learning: any node will suffice, or you might want to use non-ECCO workstations or resources (see below) that may offer faster single-threaded performance.
    • Research: the newest Intel node is the preferred node (can handle jobs with 24GB/core and up to 1TB of RAM). Note that by using the job scheduler, you in effect "reserve" the node (within the limits of the job submission system), so that no other job runs at the same time.
  • Multi-threaded jobs (non-MPI) should work well on all nodes, depending on the number of threads:
    • Monte-Carlo or bootstrap-type applications may work best on massive multi-core nodes
    • Limited-threading (SAS, Stata) will work well on all nodes, and may depend more on other factors
  • Large-memory jobs (single-threaded non-parallelizable) or database-driven jobs (using in-memory or RAM-disk-based databases)
    • will most likely benefit from the large-memory Intel node (up to 1TB of RAM, which can be used by the application or as a RAM-disk)
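
The scenarios above amount to a small lookup from job type (and, where it matters, purpose) to a preferred class of node. A minimal Python sketch of that lookup follows. The function name `suggest_node`, the job-type keys, and the returned strings are hypothetical labels invented for illustration; they paraphrase the list above and are not actual ECCO node or queue names.

```python
# Hypothetical sketch: keys and strings paraphrase the scenarios above.
# None of these are real ECCO node names, queue names, or scheduler options.

# Scenarios where the advice differs between learning and research.
BY_PURPOSE = {
    ("mpi", "learning"): "older ECCO nodes",
    ("mpi", "research"): "largest group of homogeneous cores (such as the AMD nodes)",
    ("single-threaded", "learning"): "any node",
    ("single-threaded", "research"): "newest Intel node",
}

# Scenarios where the advice is the same for either purpose.
ANY_PURPOSE = {
    "monte-carlo": "massive multi-core nodes",
    "limited-threading": "any node",
    "large-memory": "large-memory Intel node",
}


def suggest_node(job_type, purpose="research"):
    """Return the class of node that the scenarios above point to."""
    if purpose not in ("learning", "research"):
        raise ValueError(f"unknown purpose: {purpose!r}")
    if (job_type, purpose) in BY_PURPOSE:
        return BY_PURPOSE[(job_type, purpose)]
    if job_type in ANY_PURPOSE:
        return ANY_PURPOSE[job_type]
    raise ValueError(f"unknown job type: {job_type!r}")


if __name__ == "__main__":
    print(suggest_node("mpi", "learning"))
    print(suggest_node("large-memory"))
```

A table like this is only a starting point: the scenarios also mention factors (memory per core, number of threads) that a real decision would need to weigh.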

Other resources

  • CISER - a collection of very large Windows servers [utilization, software]. Appropriate if you are running a run-of-the-mill job that requires more resources than your laptop or desktop, but want to or need to stay in the Windows world. Does not run multi-node jobs, and does not use a scheduler, so cannot guarantee certain memory or CPU resources. CISER usage is typically cost-free to members of the Economics Department.  [Request account]
  • CRADC @ CISER - similar to the regular CISER nodes, but specifically designed for confidential datasets. If any of your research involves confidential data, then ECCO is not appropriate, and you should (or even must) use CRADC. [Contact]
  • RedCloud - including RedCloud with Matlab. RedCloud works on the same basic system as Amazon - virtual servers that are leased for specific time periods, in varying quantities. While the overall computing capacity of RedCloud as of Nov 2013 is less than what's available on ECCO, this can change over time. Systems on RedCloud come "raw" - basic OS (other than the Matlab nodes), and it is up to the user to configure everything, including job schedulers. Using RedCloud is appropriate if ultimately you want to scale up to Amazon or similarly configured resources, or if ECCO does not meet your software needs in some way. For advanced users. RedCloud is NOT free.
  • Amazon EC2 - the ultimate "elastic" compute resources. Run a custom system for free at minimal usage levels, or build a 26,000-core supercomputer for that one 18-hour massive compute job. Setting up a job scheduler-based compute system is straightforward, though not for the faint-hearted. Cost varies from $0 to quite a lot, depending on your needs. You must handle all software installation yourself. For advanced users.
  • XSEDE - For statistical jobs that can scale to a large cluster, XSEDE is the NSF-funded national infrastructure. Read the "Getting started" guide. The job schedulers on the different clusters are similar to the one used on ECCO (not an accident), but you will lack commercial software (SAS, Stata, etc.), although you will find R and Python. Provides a lot more support than an Amazon cluster might, and possibly for free, but requires an application process (exploratory allocations are easy and relatively fast; full-scale applications are judged on merit, similar to full grant proposals).
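
The descriptions above can be read as a rough screening order for choosing among these resources. The Python sketch below is one possible reading, not an official policy. The function name `suggest_resource` and its flags are hypothetical. Only the rule about confidential data is stated as a requirement in the text; the ordering of the remaining checks is an assumption made for illustration.

```python
# Hypothetical sketch of a screening order based on the descriptions above.
# Only the confidential-data rule is stated as a requirement in the text;
# the order of the remaining checks is an assumption.

def suggest_resource(
    confidential_data=False,          # any confidential datasets involved
    must_stay_on_windows=False,       # want or need to stay in the Windows world
    scales_to_large_cluster=False,    # job can scale to a large cluster
    needs_commercial_software=False,  # SAS, Stata, etc.
    needs_custom_system=False,        # ECCO lacks needed software, or Amazon is the eventual target
):
    """Return a suggested resource name for the given job characteristics."""
    if confidential_data:
        # ECCO is not appropriate for confidential data.
        return "CRADC"
    if must_stay_on_windows:
        return "CISER"
    if scales_to_large_cluster and not needs_commercial_software:
        # XSEDE clusters offer R and Python but lack commercial packages.
        return "XSEDE"
    if needs_custom_system:
        # Both require the user to configure the system; RedCloud is not free.
        return "RedCloud or Amazon EC2"
    return "ECCO"


if __name__ == "__main__":
    print(suggest_resource(confidential_data=True))
    print(suggest_resource(scales_to_large_cluster=True))
    print(suggest_resource())
```

The function ignores cost and application effort, which the list above treats as real constraints, so treat its output as a first filter only.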