Modify SimpleGrid
Overview
This project seeks to address the primary problem associated with the TeraGrid community account model, namely, all requests from the science gateway look the same to the resource provider, and therefore the identity of the end user is unknown to the resource provider. A typical scenario is as follows. A user logs into a web portal with her username and password. Within the portal, the user creates a job that requires the use of TeraGrid resources. The portal, on behalf of the user, submits the job to a TeraGrid resource provider. Using a proxy certificate signed by its community credential, the portal authenticates to the resource provider and submits the job. Since the proxy certificate does not contain any information about the end user, the resource provider finds it difficult to exercise fine-grained access control, strong auditing and effective incident response. See a paper recently presented at the GCE Workshop at SC07 for details.
Our solution involves embedding additional information into the proxy certificate in the form of a SAML assertion, which is stored in a non-critical X.509 extension. On the client side, information such as the portal user's username and email address can be encoded into the SAML assertion. On the TeraGrid server side, the SAML assertion can be extracted from the proxy certificate and logged, and the information in the assertion can be used for fine-grained access control.
For our proof of concept, we use SimpleGrid, which is a basic toolkit for building and teaching science gateways. SimpleGrid is built on GridSphere, which is an open-source portlet-based Web portal toolkit. After installing and configuring GridSphere and SimpleGrid, we needed to make a few modifications so that we could embed a SAML assertion into a proxy certificate. The major change is the addition of a new Java class. A method in this class takes a proxy credential along with some user information (such as username and email address) and returns a new proxy credential with an embedded SAML assertion containing the user information. This is accomplished by using the GridShib SAML Tools.
One issue is the internal representation of the proxy credentials. The SimpleGrid code uses class GSSCredential, while the SAML Tools rely on class GlobusCredential. It is possible to convert an object of type GSSCredential to an object of type GlobusCredential, but we needed to find the correct parameters to do so. Another issue is determining the portal user's IP address (which the resource provider may use for access control). According to JSR 168 (section PLT.16.3.3, "Request and Response for Included Servlets/JSPs"), the getRemoteAddr method must return null when called from a servlet or JSP included by a portlet. We are still looking for alternative ways to get the user's IP address. Apart from those issues, most of the remaining changes were made to the user interface, to pass the username and email address among the various Java files.
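Because getRemoteAddr returns null in this setting, any code that consumes the address has to tolerate a missing value. The following is a minimal sketch of such a guard, not the code used in SimpleGrid: the class name, the method name resolveClientIp, and the fallback parameter are all hypothetical, and the addresses in main are placeholders.

```java
public class RemoteAddrFallback {

    /**
     * Returns the address reported by the container, or the caller-supplied
     * fallback when the container reports nothing (null or blank).
     * Hypothetical helper for illustration only.
     */
    public static String resolveClientIp(String reportedAddr, String fallback) {
        if (reportedAddr == null || reportedAddr.trim().isEmpty()) {
            return fallback;
        }
        return reportedAddr.trim();
    }

    public static void main(String[] args) {
        // Inside a portlet-included servlet/JSP the reported address is null,
        // so the placeholder is used.
        System.out.println(resolveClientIp(null, "192.168.0.1"));
        // When an address is available, it is passed through unchanged.
        System.out.println(resolveClientIp("192.0.2.10", "192.168.0.1"));
    }
}
```

A guard like this only keeps the rest of the code from failing on a null value. It does not recover the real client address, so a placeholder should not be relied on for access control decisions.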
First Steps - Get SimpleGrid Up and Running
- I started by upgrading my VMware box to Fedora 7 and installing the latest RPMs for Tomcat (5.5.23), Ant (1.6.5) (including ant-scripts, ant-nodeps, and ant-contrib), the MySQL client and server (5.0.37), a MySQL JDBC connector (mysql-connector-java-3.1.12), jakarta-commons-codec-1.3, and Sun's Java (1.6.0_02).
- In MySQL, I created two databases:
CREATE DATABASE gridsphere;
CREATE DATABASE simplegrid;
- Next, I downloaded and installed GridSphere 3.0.5. At the time the current version available was 3.0.8, but Shaowen told me that the database config and loading mechanisms changed between v.3.0.5 and v.3.0.8, and this adversely affected SimpleGrid. Since I was user "root" when I deployed GridSphere into Tomcat, I had to fix ownership and permissions of the files:
chgrp -R tomcat $CATALINA_HOME/webapps/gridsphere
chmod -R g+w $CATALINA_HOME/webapps/gridsphere
- Then I created a new user in GridSphere and logged in once. Note that I had to keep the GridSphere source directory around since the build step of SimpleGrid depends on libraries in the GridSphere source tree.
- I downloaded the SimpleGrid 0.5 package from Yan Liu's and Shaowen Wang's site. The idea was to get the basic SimpleGrid portlet up and running.
- I had to configure SimpleGrid for building and deploying.
- In simplegrid/build.properties, I set gridsphere.home and gridsphere.build to point to the source directory for GridSphere.
- In simplegrid/webapp/simplegrid.properties, I modified all properties to point to the appropriate servers and directories for my account (tfleury).
- Since I built/deployed SimpleGrid as root, I had to fix ownership and permissions of the $CATALINA_HOME/webapps/simplegrid directory as I did for gridsphere above.
- I restarted Tomcat, logged back into GridSphere, and modified the "Layout" for the "loggedin" user by adding a "SimpleGridUserGisolve" portlet and a "SimpleGridDMSGisolve" portlet. The former allows a user to get a credential from a MyProxy server or from a file on disk. The latter allows the user to submit a simple job using GridFTP and GRAM.
- I had to copy $CATALINA_HOME/webapps/simplegrid/storage/samples/sample to $CATALINA_HOME/webapps/simplegrid/storage/tfleury/dms/datasets/ and $CATALINA_HOME/webapps/simplegrid/storage/samples/sample.jpg to $CATALINA_HOME/webapps/simplegrid/storage/tfleury/dms/images
- On each of the TeraGrid servers I wanted to use, I had to copy the correct SimpleGrid/applications/dms.*.tar.gz tarball to my home directory and untar it.
- I downloaded the TeraGrid root certificates and installed them in /etc/grid-security/certificates.
- At this point, I was able to use the SimpleGrid portlet to fetch my credential from myproxy.teragrid.org and submit a GT4 job to each of the TeraGrid servers.
Next Steps - GridShib/SAML Enable the X.509 Credentials
- Now came the task of modifying the SimpleGrid code so as to utilize Tom's GridShib-SAML-tools to embed a SAML assertion in the X.509 credential used when submitting the job to GRAM.
- Initially, Tom gave me some jar files that had certain Java methods exposed which allowed me to do what I wanted to do. Then, Tom formalized these methods into an API, which required me to alter my code. The rest of these instructions use the new API.
- I downloaded the GridShib-SAML-Tools v.0.2.0. I ran "ant" to build the jars.
- I copied three files from gridshib-saml-tools-0_2_0/lib to SimpleGrid/simplegrid/lib/globus:
globus-opensaml-1.1.jar
gridshib-common-0_2_0.jar
gridshib-saml-tools.jar
- For the main code that embeds a SAML assertion into a credential, I elected to write a static method. See the attached file SamlCred.java for the source code. This file resides at simplegrid/src/org/gisolve/demo/grid/security/SamlCred.java.
- The static method org.gisolve.demo.grid.security.SamlCred.embedSAMLInCred is called twice in simplegrid/src/org/gisolve/demo/grid/security/SimpleCred.java, once in the load() method (when a credential is read in from disk) and once in the logon() method (when a credential is fetched from a MyProxy server). I also added a new method "samlProxyInfo" to display the contents of the SAML-embedded credential in the portlet (using "openssl x509 -noout -text -in filename"). It's a total hack and unnecessary for the functionality we are seeking, but it allowed me to see if my static method had worked or not.
- I also made a bunch of other little changes to the source code (both Java and JSP). For example, I added a new checkbox to the portlet to "Embed SAML into proxy when getting". I also extracted out the GridSphere login name to be used for the SAML assertion.
- At this point (August 14, 2007) there are two outstanding issues.
- First, the directories for the TeraGrid sites are hard-coded in SimpleGrid's config file. This may not be a big deal if we use a "community" account where all processing is handled with a single credential. If we need access to multiple TeraGrid accounts, then the current code is not up to the task.
- Second, we cannot get the user's remote IP address for embedding into the SAML assertion. This is a limitation of the portlet standard (JSR 168), which GridSphere implements. Currently, I have hard-coded "192.168.0.1" as the IP address. Yan and Shaowen have indicated that they will investigate this problem.
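The credential check mentioned above (displaying the SAML-embedded credential with "openssl x509 -noout -text -in filename") can be approximated outside the portlet. The following is a minimal sketch, not the actual samlProxyInfo method: the class name ProxyInfoSketch and both method names are hypothetical, and it assumes the openssl binary is on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ProxyInfoSketch {

    /** Builds the openssl command quoted in the text for the given file. */
    public static List<String> opensslCommand(String credentialPath) {
        return Arrays.asList("openssl", "x509", "-noout", "-text", "-in", credentialPath);
    }

    /** Runs the command and returns its combined stdout/stderr as text. */
    public static String dumpCredential(String credentialPath)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(opensslCommand(credentialPath));
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor(); // wait for openssl to exit before returning
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: pass the path of a proxy credential file.
        if (args.length != 1) {
            System.err.println("usage: java ProxyInfoSketch <credential-file>");
            return;
        }
        System.out.print(dumpCredential(args[0]));
    }
}
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the credential path contains spaces. The extensions section of the output is where an embedded assertion would be expected to appear.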
View the List of Code Modifications (i.e. output of "cvs diff")
Check It Out!
You can get the latest version of the code from the CVS server at NCSA:
cvs -d:ext:USERNAME@cvs.ncsa.uiuc.edu:/CVS/srd checkout simplegrid
Be sure to substitute your actual username for USERNAME above. Currently, you must be a member of the Security Research & Development (SRD) group at NCSA to get access to the code.
The code resides in the simplegrid/simplegrid directory. To build the code:
ant jar; ant war; ant deploy
Now restart the Tomcat server.
