Infrastructure
The survey involves several moving parts and vendors. This section documents the responsibilities of each component of the system.
- Google App Engine: GAE provides a hosting service for web apps written in either Java or Python. Because Clojure compiles to JVM bytecode, it is indistinguishable from Java once compiled and can also run on GAE. The libraries that facilitate Clojure on GAE are on GitHub. GAE handles randomization, creating the HTML version of questions, storing respondent answers, and providing basic security for the data (e.g., users cannot fill out the survey twice). GAE will also handle communication with the other services listed below.
- Amazon EC2: EC2 is a scalable computing platform on which one buys time on Amazon's servers. While GAE is fairly restrictive about what can execute in its environment, EC2 runs any software that the user installs. In our case, this is Apache and PostgreSQL with the PostGIS extensions. This allows queries of the user's location and community polygons against administrative units in Canada. For example, we can query the respondent's latitude and longitude and get back the appropriate province or local political district. The EC2 instance also serves .kml files that describe the district boundaries and can be loaded into a Google Map in a browser for display.
- Google Apps for Domains: GAE uses the Apps for Domains service for DNS. The domain is mappingcommunities.ca. The admin user is admin and the password is drawingftw.
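The latitude/longitude lookup described for the EC2 instance could look something like the following. This is only a sketch: the table name districts, its columns name and geom, and the database name gis are hypothetical, and SRID 4326 (WGS 84) is an assumption about how the boundary data is stored. ST_Contains, ST_SetSRID, and ST_MakePoint are standard PostGIS functions.

```shell
# Sketch only: "districts", "name", and "geom" are hypothetical names,
# and SRID 4326 (WGS 84) is an assumption about the boundary data.
build_district_query() {
  lat="$1"
  lon="$2"
  # Reject anything containing characters other than digits, '.', and '-',
  # so that arbitrary text never reaches the SQL statement.
  for value in "$lat" "$lon"; do
    case "$value" in
      ''|*[!0-9.-]*) echo "not a number: $value" >&2; return 1 ;;
    esac
  done
  # ST_MakePoint takes x (longitude) first, then y (latitude).
  printf "SELECT name FROM districts WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326));\n" "$lon" "$lat"
}

# Example (not run here): look up the district containing a point in Ottawa.
# build_district_query 45.4215 -75.6972 | psql -d gis
```

The function only prints the SQL, so it can be inspected before being piped to psql. In the real system the PHP layer would presumably build an equivalent query from the request sent by GAE.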
This diagram shows the request-response cycle for a user getting a survey page, which includes a dynamically generated district shape file that will be displayed on the user's map.
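To make the "dynamically generated district shape file" more concrete, here is a rough sketch of wrapping a KML geometry fragment in a minimal KML 2.2 document. It assumes the geometry fragment comes from something like the PostGIS ST_AsKML function, which returns only the geometry element and not a full document. The function name wrap_kml and the sample district name are hypothetical, and this is not necessarily how the scripts in the repository do it.

```shell
# Wrap a KML geometry fragment (for example, the output of PostGIS ST_AsKML)
# in a minimal KML 2.2 document. wrap_kml is a hypothetical helper; the
# district name is inserted as-is and is not XML-escaped in this sketch.
wrap_kml() {
  name="$1"
  geometry="$2"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>$name</name>
    $geometry
  </Placemark>
</kml>
EOF
}

# Example (not run here):
# wrap_kml "Some District" "<Polygon>...</Polygon>" > district.kml
```

A file produced this way can be served by Apache and loaded as a layer in a Google Map.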
The first thing one must select when setting up an Amazon EC2 instance is the base operating system. The Ubuntu Linux distribution has several images available. The most recent stable release appears to be Oneiric Ocelot. On top of this we will need to install (if not installed by default):
- Apache: for handling web requests from GAE
- PHP: probably the best tool for reading the requests, generating the queries, and returning the results
- PostgreSQL: the database engine
- PostGIS (and related libraries): adds spatial indexes and functions to PostgreSQL
These are installed from the install script, along with the Canadian census boundary data.
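After the install script finishes, a quick check along these lines can confirm that the pieces listed above are available. This is a sketch and not part of the repository's install script; the command names in the usage line (apache2, php, psql, shp2pgsql) are assumptions about what the Ubuntu packages put on the PATH.

```shell
# Print the name of each command that cannot be found on the PATH.
# Prints nothing when every command is available.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Example (command names are assumptions, not taken from the install script):
# missing_commands apache2 php psql shp2pgsql
```

An empty result means all of the named commands were found.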
The Amazon service requires an image. The US East Oneiric image is ami-29f43840.
You must enable SSH access in the security group step of setting up the instance (I missed this on the first few attempts). You will also need to use the SSH key provided in the setup.
The gis directory in the repository holds the scripts and setup for the instance. You will need to connect to the GitHub repository to get them. To do so, either add a new public key to your GitHub account or bring your private key over to the EC2 instance. I found it easiest to use the Amazon key pair I generated. From your own machine, copy the private key to the Amazon instance, and then log in. My key pair is called ec2kp.pem.
$ ssh-add ~/.ssh/ec2kp.pem
$ scp ~/.ssh/ec2kp.pem ubuntu@URL:~/.ssh
$ ssh ubuntu@URL
This should log you into the EC2 instance. To see what the public key looks like:
$ cat ~/.ssh/authorized_keys
Add this as a new SSH key in your GitHub account. To install the software:
$ sudo apt-get install git make
$ ssh-agent bash
$ ssh-add ~/.ssh/ec2kp.pem
$ git config --global user.name "YOUR NAME"
$ git config --global user.email "YOUR EMAIL"
$ git clone git@github.com:bowers-illinois-edu/community-maps.git
$ cd community-maps/gis
$ make
Both Amazon and Google are supposed to support autoscaling. We want to use it, but we are not yet sure how to test it.
On Amazon: We think that we need to load instances from S3. We don't plan to save data on Amazon, so we don't need to worry about the Elastic Block Store.
On Google: We'll set a daily budget and adjust it over time, setting it very high on the first day and decreasing it thereafter. One of us will want to keep an eye on the dashboards for both Google and Amazon.
We do not yet have a domain name but will need one.