Project Feedback and Evaluation
Dependable Distributed Middleware Systems
18-749, Spring 2006
Team #: 1
Application: Su-Del-Ku -- A real-time, fault-tolerant, high-performance game where two or more sudoku
players can pit their intelligence against each other
Middleware + Platform: JBoss, MySQL, Linux
Team Members: Christopher
Nelson <crnelson@cs.cmu.edu>, Saul Jaspan <saul.jaspan@gmail.com>,
Lucia de Lascurain <ldelascu@andrew.cmu.edu>, Jose Luis Rios Trevino
<jriostre@andrew.cmu.edu>, Yudi Nagata <ynagata@andrew.cmu.edu>
2/11/2006:
Feedback on project proposal
- Good division of
project responsibilities
- Interesting baseline
application (this has not been done before)
- Where are you considering
using asynchronous messages (since you mention that you might need them)?
- You’ve talked
about a server crash not affecting the continuation of the game. What
about the start-up of a new game or the addition of a client to the
game?
- Under the real-time
requirements, you have two separate requirements – client receives
reply in 200ms and game resumes within 2 seconds. Does this mean that
the 200ms applies only in the fault-free case? Also, does the 200ms
apply to ALL clients connected to the game?
- Does a client map
onto a player? You mention the term “player” under the performance
requirements
- You have a lot of
functionality listed in the baseline application features – you might
want to break those up into mandatory and optional features, just to
ensure that enough of the application is implemented and that you don’t
run out of time
Feedback on project interfaces and end-to-end use case
MySQL on Linux working
JBoss on Linux working –
need custom JBoss because of Lomboz
Group to send me an email with
the quota that you need for JBoss and Eclipse
joinGameRoom to be the first
end-to-end use case
Recommended to start with listGames
instead because this is an end-to-end read-only method
Add a transaction id for every
parameter and also in the database tables
Recommended to take a look
at JNDI from previous years’ teams
Recommended to take a look
at previous years’ teams with the same configuration
Don’t hard code port numbers
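One way to act on the port-number item above is to read the server endpoint from a properties object with a fallback default. This is only a minimal sketch: the class name ServerEndpointConfig and the property keys are hypothetical, and the jnp:// URL form with port 1099 is assumed as the usual JBoss naming default rather than taken from the team's setup.

```java
import java.util.Properties;

// Hypothetical helper: resolves the server endpoint from configuration
// instead of compiling host and port into the client.
class ServerEndpointConfig {
    // Assumed property names; not taken from the team's code base.
    static final String HOST_KEY = "sudelku.server.host";
    static final String PORT_KEY = "sudelku.server.port";

    final String host;
    final int port;

    ServerEndpointConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Reads host and port from the given properties, falling back to
    // defaults (localhost, 1099) when a key is absent.
    static ServerEndpointConfig fromProperties(Properties props) {
        String host = props.getProperty(HOST_KEY, "localhost");
        int port = Integer.parseInt(props.getProperty(PORT_KEY, "1099").trim());
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new ServerEndpointConfig(host, port);
    }

    // Builds a naming provider URL in the assumed jnp://host:port form.
    String providerUrl() {
        return "jnp://" + host + ":" + port;
    }
}
```

The properties can come from a file or from System.getProperties(), so the same client build can be pointed at different machines without recompiling.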
3/2/2006:
Feedback on end-to-end use case
- Current platform:
JBoss and MySQL
- Implemented use
case that queries database and lists available games
- Demonstrated use-case
with client, server and database running on different machines
- Updated baseline
design to reflect mandatory and optional features
- Every member of
the team displayed a good understanding of their system --- Good job :)
- Need to think about
how client will locate multiple servers in fault-tolerant design
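As a starting point for the server-location question above, the client could hold a list of candidate servers and move to the next one when a call fails. The sketch below is an assumption about one possible shape, not the team's design: FailoverInvoker is a hypothetical name, the server type is left generic, and a failed call is modelled as a RuntimeException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical client-side helper: tries each known server in turn and
// remembers the last one that answered.
class FailoverInvoker<S> {
    private final List<S> servers;
    private int current = 0; // index of the server currently treated as primary

    FailoverInvoker(List<S> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Applies the call to the current server; on failure, advances to the
    // next server and retries until every server has been tried once.
    <R> R invoke(Function<S, R> call) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < servers.size(); attempt++) {
            S server = servers.get(current);
            try {
                return call.apply(server);
            } catch (RuntimeException e) {
                last = e;
                current = (current + 1) % servers.size();
            }
        }
        throw new IllegalStateException("all servers failed", last);
    }
}
```

The list itself could be filled from configuration or from a naming lookup; the sketch only covers what the client does once it has the candidates.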
3/22/2006: Checkpoint 1 Presentation
Architecture very well described
Team is on top of things and
has their implementation under control
Things to think about at this
stage – where is most of your multithreading and concurrency? This
will start to affect your performance in the last phase
Why did you need a callback,
and how did you implement it? Are there alternatives?
Annotate the methods that you
might not need if you go the stateless route
Identify which methods invoke
the recovery mechanisms (e.g., setAsPrimary)
Have you played with the interceptor
yet? Do you have a backup plan? I would recommend that you try a
simple “kill -9” script first and then go fancier – saves you
time
4/3/2006:
Fault-tolerant Baseline
- Team has a good
understanding of their system architecture
- Using passive replication
for fault-tolerance with stateless middle-tier
- For the demo, used
a list of hard-coded servers but their current implementation allows
servers to be dynamically located
- Used global JNDI
that pushes state to database allowing potential recovery of JNDI entries
if server fails (remember to mention this in your report :)
- Discussed tradeoffs
which occur during network partitions
- Overall, the team
did a very good job in the demo
5/2/2006: Performance Experimentation
- Good understanding
of the experimental process
- Very good “lessons
learned” section
- Latency vs. throughput
figure: forgot to characterize latency (mean? max? 99%?) and throughput
(average? peak?)
- Overall, the team
did a very good job with the experiments
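To make the latency characterization concrete, a figure could state which statistic it plots. The sketch below computes mean, maximum, and 99th percentile from a set of latency samples. LatencySummary is a hypothetical name, and the nearest-rank percentile definition is one common choice, assumed here rather than prescribed by the course.

```java
import java.util.Arrays;

// Hypothetical summary of latency samples (in milliseconds).
class LatencySummary {
    final double meanMs;
    final double maxMs;
    final double p99Ms;

    LatencySummary(double meanMs, double maxMs, double p99Ms) {
        this.meanMs = meanMs;
        this.maxMs = maxMs;
        this.p99Ms = p99Ms;
    }

    static LatencySummary of(double[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);

        double sum = 0;
        for (double s : sorted) {
            sum += s;
        }

        // Nearest-rank 99th percentile: rank = ceil(0.99 * n), computed in
        // integer arithmetic to avoid floating-point rounding.
        int rank = (99 * sorted.length + 99) / 100;

        return new LatencySummary(
                sum / sorted.length,
                sorted[sorted.length - 1],
                sorted[rank - 1]);
    }
}
```

Reporting which of the three values a latency vs. throughput curve uses, and whether throughput is averaged over the run or taken at its peak, would remove the ambiguity noted above.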
Final check-off list
Baseline:
Failover:
Exception-handling:
Recovery:
Latency:
Performance:
Test-cases:
Additional features:
Overall comments
Marks