- Functionality:
- Fault Tolerance
- System Elements:
-
- The CORBA ORB: Sun's implementation of an ORB for Java. It supports
naming and RPCs.
- Casino Server: The middleware
server responsible for serving blackjack to clients. The game server
registers one interface with the ORB naming service, the Floor
interface, which contains methods that allow entrance to and exit from
the gaming system. It also exports two other interfaces on request: the
Bank interface and the Tables interface. These handle the monetary
transactions and the actual game mechanics, respectively.
- Database Server: The MS-SQL
system that stores all the state. It currently serves one middleware
server and is intended to serve multiple servers in the future.
- Replication Manager: The application that starts the servers,
monitors them for failure, restarts them if they fail, and maintains
the Naming Service.
- Player Application: The
application that the player uses to play the game. It has one
interface, the Player interface, that it exports to the game server to
allow the game server to control the flow of game play.
- Player: The human controlling the
client application
- Test distribution:
-
- /afs/ece/class/ece841/public_html/team5/Team5-20030314.tar.gz
- Setting up the System
-
- Installing the binaries.
- All the project binaries other than the database are
in the 'tar' file. Unpack this in your AFS home
directory. Since we have a distributed file system to work
with, we can install one image and use it on any host.
- Running the Database Server
- The Database Server is considered "sacred" and is
always running.
- Running the Casino Server
- The Casino Server is controlled by the Replication Manager and
is not generally started directly.
- Running the Replication Manager
- Log in to any host in the ECE Linux cluster.
- Change directory to where you unpacked the
tarball.
- Source the project environment file with the command
". team5.bsh" if you are running the BASH shell
or "source team5.csh" if you are running the
C shell.
- Start the ORB with the startOrb script. If it is
already running, kill it, delete the orbd/
directory, and restart it. Wait a minute for the
ORB to finish starting.
- Start the Replication Manager with the command
"./startRepman.sh ".
- You should see the output from each server as it
starts up and initializes itself.
- Once you see the message "---Server ready and
running ....." from each server, the system is
ready to run.
- Running a Player Application
- Log in to any host in the ECE Linux cluster.
- Change directory to where you unpacked the
tarball.
- Source the project environment file with the command
". team5.bsh" if you are running the BASH shell
or "source team5.csh" if you are running the
C shell.
- Start the Player Application with the command
"./startPlayer.sh ".
- Test 1 - Fault Tolerance
-
In this test we demonstrate that the Player Application can
recover from a loss of contact with the primary server and
re-connect with the new primary. This facility is almost
entirely handled by the Replication Manager. The client just
needs to try to re-bind to the same name that it had bound
to before, as the Replication Manager swaps in a reference to a
new server automatically. The Replication Manager also alerts
the new primary that it is the new primary so that it can
retrieve its state from the database. Note that the Player
Application, the database server, and the ORB are considered
"sacred".
- At the Player Application Prompt " Q--Please enter
your Player name: ", enter your name.
- At the next prompt, do nothing yet.
- From another shell, ssh to the host running the primary
server. Type "kill -9 PID", where PID is the
process number of the primary server. The host and PID
for the primary server are displayed in the output from the
Replication Manager as it starts the servers.
- The Replication Manager should indicate that it has lost
contact with the primary, that it is assigning the old
backup to be the primary, and that it is bringing up a new
server to replace the lost one.
- Now type the number of chips that you want to buy at the
Player Application prompt. The Player Application should behave
as if nothing adverse happened once it has found a new server.
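The client-side recovery exercised by Test 1 can be sketched as follows. This is a minimal illustration, not the project's code: `FloorRef`, `RebindingClient`, and the `Supplier` used for the lookup are hypothetical stand-ins for the real Floor interface and the naming-service resolve. It assumes that a lost primary surfaces as an unchecked exception, as CORBA system exceptions do in the Java mapping, and that the Replication Manager has already re-pointed the name by the time the client looks it up again.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the remote Floor reference (not the project's IDL type).
interface FloorRef {
    String enter(String playerName);
}

// Sketch: when a call fails, resolve the same name again and retry the call.
final class RebindingClient {
    private final Supplier<FloorRef> lookup; // stands in for a naming-service resolve
    private final int maxAttempts;
    private FloorRef current;

    RebindingClient(Supplier<FloorRef> lookup, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.lookup = lookup;
        this.maxAttempts = maxAttempts;
        this.current = lookup.get();
    }

    String enter(String playerName) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return current.enter(playerName);
            } catch (RuntimeException e) {
                last = e;
                // Re-bind to the same name; the reference behind it is assumed
                // to have been swapped to the new primary.
                current = lookup.get();
            }
        }
        throw last; // every attempt failed
    }
}
```

A real client would probably pause between attempts, since the new primary needs time to retrieve its state from the database.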
- Test 2 - Recovery
-
In this test we demonstrate that we can kill the primary
server, recover and kill the primary again and recover again.
This is all handled by the replication manager automatically.
- At the Player Application Prompt " Q--Please enter
your Player name: ", enter your name.
- Continue following the prompts until you are inside a game.
- From another shell, ssh to the host running the primary
server. Type "kill -9 PID", where PID is the
process number of the primary server. The host and PID
for the primary server are displayed in the output from the
Replication Manager as it starts the servers.
- The Replication Manager should indicate that it has lost
contact with the primary, that it is assigning the old
backup to be the primary, and that it is bringing up a new
server to replace the lost one.
- The Player Application should behave as if nothing
adverse happened, although it might re-prompt you with the
last question it had asked before the crash.
- Repeat step 3, this time killing the new primary.
- Again, play should continue without interruption, apart from
possibly one repeated query.
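The promote-and-replace behavior that Test 2 exercises twice in a row can be sketched as below. This is an assumption-laden illustration rather than the Replication Manager's actual logic: `FailoverSketch`, `onServerLost`, and the `Supplier` that "starts" a replacement server are hypothetical names, servers are represented by plain strings, and updating the Naming Service binding is reduced to a comment.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Sketch: track one primary plus backups; on loss of the primary, promote
// the oldest backup and start a replacement server.
final class FailoverSketch {
    private String primary;
    private final Deque<String> backups = new ArrayDeque<>();
    private final Supplier<String> spawner; // hypothetical: starts a server, returns its id

    FailoverSketch(String primary, String backup, Supplier<String> spawner) {
        this.primary = primary;
        this.backups.addLast(backup);
        this.spawner = spawner;
    }

    String primary() {
        return primary;
    }

    int backupCount() {
        return backups.size();
    }

    // Called when the monitor loses contact with a server.
    void onServerLost(String server) {
        if (server.equals(primary)) {
            // Promote the old backup. A real manager would also re-bind the
            // well-known name to it and tell it to load state from the database.
            primary = backups.removeFirst();
        } else if (!backups.remove(server)) {
            return; // unknown server: nothing to replace
        }
        backups.addLast(spawner.get()); // bring up a server to replace the lost one
    }
}
```

Because a replacement is started after every loss, the backup pool is never empty when the next primary is killed, which is what allows the kill-and-recover cycle to repeat.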
- Test 3 - Exception Handling
-
In this test we demonstrate the ability of our code to
handle exceptions.
- Handled Exceptions
-
- If the ORB is not running in the right place and you try
to start the Replication Manager, it will shut down
gracefully.
- If the ORB is not running and the Player tries to
register with the server, the Player Application will handle
it.
- If a server is not running and the Player tries to
register with it, the Player application will handle
it.
- The prompt for buying quantities of chips will reject
out-of-range or nonsensical responses.
- The prompt for betting will not allow you to bet more
than you have, and in any case rejects bets less than 10 or
greater than 25000.
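The betting rule in the last item can be expressed as a small check. This is a sketch only: the limits 10 and 25000 come from the text above, while `BetRules`, `isValidBet`, and `parseBet` are hypothetical names and not the project's API.

```java
import java.util.OptionalLong;

// Sketch of the betting limits described above; all names are hypothetical.
final class BetRules {
    static final long MIN_BET = 10;     // from the text: bets below 10 are rejected
    static final long MAX_BET = 25000;  // from the text: bets above 25000 are rejected

    private BetRules() {}

    // True when the bet is within the limits and the player can afford it.
    static boolean isValidBet(long bet, long chipsHeld) {
        return bet >= MIN_BET && bet <= MAX_BET && bet <= chipsHeld;
    }

    // Parses raw prompt input; empty result for missing, non-numeric, or invalid bets.
    static OptionalLong parseBet(String input, long chipsHeld) {
        if (input == null) {
            return OptionalLong.empty();
        }
        try {
            long bet = Long.parseLong(input.trim());
            return isValidBet(bet, chipsHeld) ? OptionalLong.of(bet) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

Returning an empty result instead of throwing lets the prompt loop simply ask again, which matches the "reject and re-prompt" behavior described for the prompts.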
- Unhandled Exceptions
-
- If a client is killed, the server can get into a race
condition.
- If the database is in a corrupt state from a
previous test, the server cannot handle it.
- Limitations
-
- The database should be cleared before the Replication
Manager is started, as unhandled exceptions can result from
starting the system with uncleared data.
- The ORB should be started fresh before the Replication
Manager is started.
- None of the applications handle poorly configured or
missing properties files.
- The system does not provide security.
- During a game, if the primary server has been replaced, the
player will be re-queried about whether to "Hit" or "Stay".
- No encryption of data.
- No logic checks to prevent false wins and artificially
high win amounts.
- If two players register with the same name, they get
unhandled exceptions on both the Server and the Player
Application.
Running more than one Player Application at a time is not
supported.
- Killing the client can put the server in an indeterminate
state.
Last modified: Sat Mar 15 16:17:31 EST 2003