Here I list all common mistakes

Here I list all common mistakes in assignment 1, particularly with respect to synchronization.

Assignment 1

1. Implementing the active customer list as a Vector is quite natural. However, some students use the index number to access a "customer" later. This is wrong because Vector is a dynamic array: when some other customer logs out and you remove that "customer" from the Vector, the indexes of the elements after it shift, so a saved index no longer points to the same customer.

The correct way is to store the "String" form of the customer ID in the Vector, so that even though Vector doesn't guarantee the "index" of your data remains the same, you can still find it by searching for the "String" form of the customer ID.

{ The following comment is not correct!!!

However, there is a little trap here. In java, you are actually "copying" the customer data from Vector or HashMap or HashTable. Some students claim that the only "critical section" occurs at "get" and "put" data from and to your data structure. So, they only "synchronized" the "list" when data is accessed.

}

The correct one is that you are indeed using "reference" of the customer object.

So, switching from the "copy" notion back to "reference", the code below has another problem: the "referenced" customer data, or "custStruct", may be manipulated by two threads at once. Unless your "Customer" class itself is a thread-safe implementation (i.e. any modification of multiple fields of Customer is done inside a synchronized function), you will have trouble.

So, we don't have the "stale data" or "dirty data" problem, because we hold a reference instead of a copy. However, this introduces another piece of global data: the customer structure itself is shared through the reference.

i.e.

Hashtable customerList=new Hashtable();  //I cheated by changing original HashMap to Hashtable

....

    //first we

    synchronized(customerList) {
        customerList.put(custNo, custStruct);
        }

        ....//here we modify the data of customer  //What if there are two threads both modifying our customer data?

    synchronized (customerList){
        custStruct = (customerstruct) (customerList.get(custNo));
        }
 

Here it has two mistakes. The first one, which I mentioned in the "CORBA faq", is a minor performance issue. You don't have to synchronize "get" and "put" on a Hashtable, since Hashtable is a synchronized class whose methods are all already synchronized. (I originally claimed this for HashMap. Someone sent in an argument that HashMap is not a synchronized class, and he is right. Indeed HashMap is NOT a thread-safe class! With HashMap, the synchronized blocks around "get" and "put" above are required.)

 

{ The following comment is WRONG!

The second one is a serious logical mistake because when the modified customer data is "put" back, you don't know if it is already modified by other thread or not.

i.e. The customer issued two consecutive requests to buy stock "IBM" of 50 and 100.  In Broker server there will be two threads handling the buying requests. At the beginning the balance of "IBM" stock is 0 and both threads copies 0 from "get" statement. It is possible the first thread modifies customer data and "puts" back 50. Then the second thread also modifies the data and "puts" back 100.

So, the best way is to check if the data in list is the same as before or not. Or the simpler solution to include whole into one big critical section which is a little bit inefficient.

}

However, even though the above comment is not correct, we still learn a big lesson here. Whatever you retrieve from a container, as long as it is not a primitive type like integer, float etc., you are using its reference, which means other threads may be using the same reference retrieved from the same container. So you need to make the element class stored in the container thread-safe.

{

My mistake comes purely from C++ STL, because in STL everything you get from a container like vector, map, set, list etc. is simply a "copy". Just imagine that your container is a dynamic container with varying length, as in the case of vector: how can you maintain a pointer when the internal array or linked list is modified? However, in Java every object lives in dynamic memory, which means the container is actually maintaining pointers or references. This is something I figured out this morning. Try my example and see what you can observe.

}

2. Another common mistake happens when you "search" in a Vector and then "get" (copy data) from the Vector. Since these two operations together are not "atomic", you have to synchronize them.

i.e.

    int index=myVector.indexOf(custID);

    //here is the context switch glitch: some other thread might have modified the Vector or even your customer data.

    MyCustomer cust=(MyCustomer)myVector.elementAt(index);

So, the correct way is to add "custSemaphore.acquire()" before and "custSemaphore.release()" after these two statements.
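The fix can be sketched as follows. This is only a minimal illustration: it uses Java's built-in `synchronized` block on the Vector instead of the course semaphore, and the class `CustomerLookup`, its methods and the `custID` field are hypothetical names, not taken from any submission.

```java
import java.util.Vector;

// Hypothetical customer type used only for this sketch.
class MyCustomer {
    final String custID;
    MyCustomer(String custID) { this.custID = custID; }
}

class CustomerLookup {
    private final Vector<MyCustomer> myVector = new Vector<>();

    void add(MyCustomer c) { myVector.add(c); }

    // Removal takes the same lock as the lookup below.
    void remove(String custID) {
        synchronized (myVector) {
            myVector.removeIf(c -> c.custID.equals(custID));
        }
    }

    // Search and fetch happen under one lock, so no other thread can
    // remove an element between finding it and reading it.
    MyCustomer findCustomer(String custID) {
        synchronized (myVector) {
            for (int i = 0; i < myVector.size(); i++) {
                MyCustomer c = myVector.elementAt(i);
                if (c.custID.equals(custID)) return c;
            }
            return null; // not logged in
        }
    }
}
```

Vector's own methods lock the Vector object itself, so synchronizing on `myVector` cooperates with them. Note that this only protects the list; the returned customer object still needs its own protection.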

 

3. Some students try a shortcut to avoid complex file operations by using a separate file for each account, i.e. for each customer they create a file with the account number as its file name. It seems that there is no synchronization problem, because each call uses a local file pointer to write the file.

i.e.

void writeToFile(MyCustomer cust) //here we don't declare synchronized

{   

    RandomAccessFile file=new RandomAccessFile(Integer.toString(cust.accountNumber)+".txt", "rw");  //the constructor requires an access mode

    //overwrite data into file

...

}

The concept seems correct, because synchronization is only necessary when we use shared global data or resources. Here the "file pointer" is declared inside the function and is not shared with any other function. So there seems to be no need for synchronization at all.

Is it correct? It depends. Even though the file pointer is not global, the file is still global. For example, if the code is not properly designed, we might have two threads call this "writeToFile" with the same customer data. (This does not happen in this assignment, because only one logout request will be executed for a customer and all other logout requests will be ignored. But suppose you are not handling it this way, and two customers who share an account issue logout requests at the same time.) Since your customer data has multiple fields like account, balance, stock items etc., it cannot be written back to the file with one statement or instruction, so it is possible that two threads write to the same file through two file pointers. In the end the file is garbled with interleaved, corrupted data.

i.e.

    file.writeInt(cust.account);  //AT THE SAME TIME ANOTHER THREAD WITH ANOTHER FILE POINTER ALSO WRITING TO THIS FILE.

    file.writeFloat(cust.balance);

    ...

 

4. There are also some logical mistakes in the assignments. Quite a few of you simply don't check whether "buyStock" has enough balance to pay or "sellStock" has enough stock to sell. This seems trivial, but it is common sense, and if we really didn't care about it we would not have any "transaction" problem at all, because a transaction fails precisely when money or stock runs short! So, it must be handled properly.

And quite a few of you simply don't follow the requirement of a "randomly generated account number in openAccount". Is it too hard to implement? If not, why not just follow the requirement?

 

5. Some of you even forget to submit a design report. And someone even "slipped" his report inside the theory assignment. So here I formally declare that if in a future assignment you don't submit a separate report with the file name "design_choice" or "design_report", I will consider that there is no design document.

 

6. And in the design report, some of you simply don't know what to write. It is important to write down your class design, the functionality of each module etc. And it is even more important to talk about your design concerns regarding correctness and efficiency. For example, one student argues that the stock symbol data in the file "stock" will be frequently used and should be kept in memory, say in some kind of data structure. This is a typical design choice, and I just cannot tolerate seeing some of you repeatedly read the file "stock" whenever there is a request to check a stock symbol!

Another example is what kind of data structure you choose to store the "stock symbol". Some of you use a simple array, and every time iterate over the array to find the symbol, or search to the end and report that it is not found. This is quite inefficient, given that Java provides a powerful Hashtable to speed up searching. (If you feel energetic, you can implement your own binary search algorithm. Oops, I am kidding. :) ) So, you can talk about why you chose this and what other choices you had. Even when something is hard for you to really implement, you are highly recommended to talk about your design, because this is a design course. If you have a good design report and you explain your idea clearly, it will deserve a good mark even though you cannot completely implement it.

One more example is to show how you analyze problems. Taking the above example, reading the "stock symbol" is an operation on global data, in the sense that there might be multiple buying and selling requests which all need to validate a stock symbol against this table. However, it needs no synchronization, because once the table is loaded these are all read-only operations, and synchronization is only needed when the global data is modified. This is a typical analysis of a problem. If you can show this kind of analysis, your report surely demonstrates your understanding of the problem.

7. Test cases are a big problem in this assignment. Till now I haven't seen any decent testing. Here I am not talking about the empty theory of software engineering. I am simply asking for proof from real-world test runs. The best test is simply to create some real jobs and run your program to see if it gives the correct result. Since our system is a distributed system which allows multiple accesses, why don't you create multiple clients and run them simultaneously? Using a command line menu and showing that it works is only a basic requirement, which shows that you have just finished coding your program. The real debugging may only begin when you run a series of tasks at the same time. And some submissions don't even follow the basic requirement of attaching some screenshots to show your program running. This is, I should say, the most basic requirement of testing.

8. Some students want to avoid checking for an existing customer account by using an incremental account number. This is not what you are supposed to do. Besides, I found another possible synchronization problem. For example, in OpenAccount one student plans to append new account data at the end of an "Account.txt" file. Even though he is not using a global file pointer, he still needs to synchronize the file access, because now the file itself is a global resource. See below:

RandomAccessFile raf = new RandomAccessFile(".\\Data\\account.txt", "rw");  //this is a local file pointer
raf.seek( raf.length() );
raf.writeBytes(strNewAccount +"\r\n");
raf.close();

 

If the above code is not protected by a semaphore, it is possible that two threads both seek to the same end-of-file position and overwrite each other at the same location.
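A minimal sketch of the protected version is shown here. It assumes all appends go through one helper in a single JVM; the class `AccountFile`, the method `appendAccount` and the lock object `FILE_LOCK` are hypothetical names, and a plain `synchronized` block stands in for the semaphore.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class AccountFile {
    // One lock shared by every thread that touches the account file.
    // The file pointer stays local; the lock guards the file itself.
    private static final Object FILE_LOCK = new Object();

    static void appendAccount(String path, String strNewAccount) throws IOException {
        synchronized (FILE_LOCK) {
            // seek-to-end and write now form one critical section
            try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
                raf.seek(raf.length());
                raf.writeBytes(strNewAccount + "\r\n");
            }
        }
    }
}
```

The lock only helps if every writer uses it. A second server process writing the same file would need a different mechanism.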

9. thread-safe class vs. non-thread-safe class

Let me repeat it again here. HashMap is NOT a thread-safe class. So, its "get" and "put" methods must be synchronized if you don't already call them within your critical section. As for Hashtable, since it is a thread-safe class, you don't have to synchronize its "get" and "put" methods. However, if the object reference returned from "get" is shared global data, the object itself needs to be synchronized.

i.e. MyCustomer is the element type in the Hashtable and it will be modified by multiple threads, so you can make its methods "synchronized" by declaring them like this:

MyCustomer

{

    public synchronized void modifyCustomer(String name, float balance, ...);

    ...

}

However, retrieving the MyCustomer object from the Hashtable is already synchronized by Hashtable itself.

MyCustomer cust = (MyCustomer)myHashTable.get(custName);  //the get method is thread-safe, no need to synchronize

 

But if you choose to use HashMap, the above "get" must be synchronized because HashMap is not a thread-safe class!
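Both levels of protection can be put side by side in a small sketch. Everything here is illustrative: the fields of `MyCustomer`, the method `buyStock(quantity, price)` and the wrapper class `CustomerTables` are assumptions made for the example, not the assignment's required interface.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class MyCustomer {
    private float balance;
    private int shares;

    MyCustomer(float balance) { this.balance = balance; }

    // Both fields change under the object's own lock, so no thread ever
    // sees the balance reduced without the matching share count.
    synchronized boolean buyStock(int quantity, float price) {
        float cost = quantity * price;
        if (cost > balance) return false; // not enough money: reject
        balance -= cost;
        shares += quantity;
        return true;
    }

    synchronized int getShares() { return shares; }
    synchronized float getBalance() { return balance; }
}

class CustomerTables {
    private final Map<String, MyCustomer> safeTable = new Hashtable<>();
    private final Map<String, MyCustomer> plainMap = new HashMap<>();

    // Hashtable: get and put are already synchronized internally.
    MyCustomer getFromHashtable(String custName) { return safeTable.get(custName); }
    void putInHashtable(String custName, MyCustomer c) { safeTable.put(custName, c); }

    // HashMap: every access needs an explicit lock.
    MyCustomer getFromHashMap(String custName) {
        synchronized (plainMap) { return plainMap.get(custName); }
    }
    void putInHashMap(String custName, MyCustomer c) {
        synchronized (plainMap) { plainMap.put(custName, c); }
    }
}
```

The container lock and the element lock are separate: the first protects the table structure, the second protects the fields of one customer.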

 

10. Common mistakes in openAccount?

Most students realized that it is necessary to synchronize the file access operation in openAccount. But is this enough? The detail is that you need to include both "searching in database" and "inserting the new account into database" in one big synchronized block. Otherwise, when there are two openAccount requests at the same time, it is possible that they both generate the same random account number. (This is highly unlikely when you use class Random, but logically it is possible, because we cannot depend on the implementation of how the random account number is generated.) Then the first thread finishes "searching in database" and decides to insert the new account. Before it does so there is a context switch, and the second thread gets the chance to do the same "searching". Finally both threads claim to have successfully opened the same new account, and one client definitely loses his balance from OpenAccount.
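The "one big synchronized block" can be sketched like this. An in-memory map stands in for the account database, and the class `AccountRegistry`, the four-digit number range and the method names are assumptions made only for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class AccountRegistry {
    // Stands in for the account database (hypothetical).
    private final Map<Integer, Float> accounts = new HashMap<>();
    private final Random random;

    AccountRegistry(Random random) { this.random = random; }

    // Generate, search and insert form a single critical section, so two
    // threads can never both decide that the same number is free.
    synchronized int openAccount(float balance) {
        int accountNo;
        do {
            accountNo = 1000 + random.nextInt(9000); // four-digit number, assumed range
        } while (accounts.containsKey(accountNo));
        accounts.put(accountNo, balance);
        return accountNo;
    }

    synchronized int size() { return accounts.size(); }
}
```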

 

11. Common mistakes in login?

Most students don't check whether the account is already "logged-in" in memory before they search the database for the account.

However, the order of checking memory and disk is quite subtle here. If we check memory first and then check the file to load data into memory, there is a chance that two threads are both trying to "login" to the same account. When they both check in memory, neither finds the account, and they both try to load the data into memory. If we are extremely unlucky and you don't double-check the in-memory customer table once loading succeeds, the two threads might overwrite each other. Or what's worse? If the second thread is very slow in loading data and there is already a sellStock request executing on the data loaded by the first thread, the overwrite will definitely make your customer lose the stock he just bought.

So, to avoid checking the in-memory table twice, the best way is to check the disk first and load the data. Then, before inserting into the in-memory customer table, check the table. By doing this, we avoid the above overwriting when two threads both login to the same account. Of course, here we assume access to the in-memory customer table is correctly synchronized.

However, a subtle question remains: can we rely on a thread-safe container class without adding proper synchronization of our own?

For example, we know Hashtable is a thread-safe class whose 'put' and 'get' methods are already synchronized. However, in the above case, it is not enough to rely on this. See below.

...loading data from disk to Customer object cust and now ready to check customerTable for inserting.

if (customerTable.get(cust.custNo)==null)

{                    //this is the context switch glitch, what if another thread insert same customer here?

    customerTable.put(cust.custNo, cust);

}

Here we can see we still need to synchronize this whole block, because even though "searching" and "inserting" are each atomic operations, the combined check-then-insert of "login" is not.
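A minimal sketch of the synchronized version of the block above follows. The classes `Customer` and `LoginTable` and the method `insertIfAbsent` are hypothetical names for the illustration.

```java
import java.util.Hashtable;

class Customer {
    final String custNo;
    Customer(String custNo) { this.custNo = custNo; }
}

class LoginTable {
    private final Hashtable<String, Customer> customerTable = new Hashtable<>();

    // The check and the insert run under one lock, so only one of several
    // concurrent logins for the same customer can succeed.
    boolean insertIfAbsent(Customer cust) {
        synchronized (customerTable) {
            if (customerTable.get(cust.custNo) == null) {
                customerTable.put(cust.custNo, cust);
                return true;
            }
            return false; // somebody else logged in first
        }
    }

    Customer get(String custNo) { return customerTable.get(custNo); }
}
```

On Java 8 and later, `customerTable.putIfAbsent(cust.custNo, cust) == null` expresses the same check-then-insert as a single atomic call on a Hashtable.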

 

12. Common mistakes in logout?

Maybe logout is regarded by some students as the easy one among all operations. Well, believe it or not, it has the most difficult synchronization issue.

First, if you want to make the in-memory customer table memory efficient, you need to remove the data of logged-out customers from the table. But if you do remove it, then you need to synchronize with "buyStock", "sellStock" and possibly "login", because they might be using the same customer object. If you think I am talking about "customerTable"-level synchronization, you might think it is easy. But I am talking about the case where "buyStock" or "sellStock" is in operation when the "logout" request comes in. Obviously you need to wait for those operations to finish before you can log out.

In the first assignment, this issue might not be easy for you to discover, because "buy" and "sell" are executed in local memory and they are very fast. However, in the second assignment, when "tradeStock" is done between brokers through the network, you will observe corrupted data frequently if you don't take care of this issue, because a remote transaction can be very slow.

Till now I have only seen one or two students explicitly deal with this problem, and their approaches are quite different. First let me explain the possible "starvation" scenario in this difficult issue. Your logout should check whether the number of threads currently executing on the customer account is zero. If yes, you can safely log out by removing the data from "activeCustomerTable". If no, you have to wait. But what if requests for "buying" and "selling" keep coming, so that while the old threads are decreasing the "reference number", new threads are increasing it? Your logout thread can be waiting forever. This is the so-called "starvation" problem.

The quick and easy way is to set up a flag indicating the intention to log out, so that all new "buying" and "selling" threads will quit when they check this "flag" at the beginning. Then the logout thread only waits until all in-process threads finish. This can be done by using a "reference counter" declared in each customer instance.
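The flag plus reference counter idea can be sketched as below. This is an illustration under assumptions: the class `CustomerSession` and its method names are hypothetical, and `wait()`/`notifyAll()` replace a busy "while-loop" so the logout thread sleeps instead of spinning.

```java
class CustomerSession {
    private int activeOps = 0;          // the "reference counter"
    private boolean loggingOut = false; // the "want-to-log-out" flag

    // Called at the start of buyStock/sellStock. Once logout has been
    // requested, new operations are refused, so logout cannot starve.
    synchronized boolean beginOperation() {
        if (loggingOut) return false;
        activeOps++;
        return true;
    }

    // Called when buyStock/sellStock finishes (ideally in a finally block).
    synchronized void endOperation() {
        activeOps--;
        notifyAll(); // wake a waiting logout thread
    }

    // Blocks until every in-process operation has finished.
    synchronized void logout() throws InterruptedException {
        loggingOut = true;
        while (activeOps > 0) {
            wait(); // releases the lock while sleeping
        }
        // here it is safe to save the data and remove the customer
    }

    synchronized boolean isLoggingOut() { return loggingOut; }
}
```

A buy or sell handler would call `beginOperation()`, return an error if it gets `false`, and otherwise call `endOperation()` when done.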

Besides the "want-to-log-out-flag" design, the two students use quite different approaches. One sets up an extra "in-transaction customer table" to hold all requests in process. By searching this Hashtable, you can see whether you can safely log out. The other student uses the more intuitive idea explained above, that is, to check an "active-thread-counter" for each customer logging out. However, the tricky thing here is that he doesn't choose the simple "while-loop" to wait for the counter to become zero. Instead he uses the "counter" value as the parameter for another "counting semaphore", which I found not very good. Between checking the counter and passing that number to the "counting semaphore", it is possible that some threads have already finished their "buying" and "selling". Then your "counting semaphore" starts with a count that is too big to ever reach zero, and your "logout" thread will wait forever.

Another simple piece of advice is about multiple semaphores. It is much better to use multiple special-purpose semaphores than one single global semaphore. However, you have to be extremely careful, otherwise things may become very tricky. For example, I strongly suggest not using semaphores in a nested manner, because it is very error-prone when your code misses the release of a semaphore.

 

13. What is a good way of synchronization analysis?

It seems to me the simplest and easiest way is to analyze function by function. You can simply tell me in which function you use what synchronization mechanism to prevent what kind of scenario of multiple accesses to what global resource.

 

14. > I've already read some comments about "logout" for assignment 1 on
> website. I have some confusion and hope you to make me out.
> If any users follow the general rule,in fact which is a common sense, to
> operate stock accounts like this:
> login(accountNo);
> buy(...); or sell(...);
> logout(accountNo);
> you only need to remember how many times the unqiue account is login. Why
> need to remember the amount of threads or to use some busy-waiting flags as
> a condition to identify when "logout" operation should actually perform.
>
> BTW, the assignment doesn't need us to consider security issues. That is,
> buy or sell method must guarantee the user who does those operations
> has login the account and receive and handle requirements from the same
> communication channel. However, we have to assume our underlying network
> level to support them for the assignment, otherwise the assignment should
> make use SSL to satisfy them, but it is not a two-week period project at
> all. -:)


The concern has nothing to do with security.
The problem is a simple observation about data correctness.
1. It seems that you are still viewing the whole thing as a series of 'synchronized', sequential operations issued from the same client program. In fact, in a distributed system, you never know whether these operations come from the same client program or not. And in an asynchronous server, everything happens concurrently.
2. The general situation is that when a logout request comes in, some buy or sell operations may still be in process. In the case of tradestock, since it is very slow, this is very likely. Then if your logout doesn't wait for all in-process operations to finish, you will store incomplete data back to disk, i.e. some customer gets a success reply from buyStock, while the data is lost because it was NOT saved back to disk.
3. Counting the number of logins does not help, because we only allow ONE login. All following login requests will be ignored or treated as errors. You cannot assume a 'session' concept exists UNLESS YOU IMPLEMENT IT by some technique like a unique session id passed as an implicit parameter to all sell and buy operations, because you never know whether a particular buy or sell operation originated from a particular 'login' client. And since our 'login' just asks for customer data to be loaded into memory, it has no relation to security or sessions or whatever.

4. However, I should say this 'wait' for all in-process operations to finish is quite an advanced topic for some students, depending on their implementations. If they choose a simple way, by using a global semaphore or by 'synchronizing' buy, sell, logout etc., they actually make all operations run in a single-threaded, sequential manner. Then there is no need for waiting, because when 'logout' gets the chance to execute, buy or sell cannot execute. So, this issue is only meaningful for those with advanced techniques like using multiple semaphores or pushing deeper to fine-grained synchronization. For example, some advanced students designed synchronization at the level of 'transactions', where a transaction is simply a particular operation on a particular customer's stock item. This is sure to achieve the highest efficiency. But it is possible that your logout might wait forever for the 'transactions' to finish. We call this the 'starvation problem'.



15. A little more about OpenAccount.

Efficiency might be a consideration for some students.

Most students simply search the file every time a random account number is generated. This is a bit inefficient. I saw two kinds of improvements. One idea is quite easy and efficient. You can keep an in-memory customer list which reads all customers from the file when the program starts. Then whenever a new customer calls openAccount, you simply search in memory and add the new account to this customer list.

Another way is quite creative and, I should say, not easy for everybody to figure out. It is to generate all the unique random account numbers the first time the server runs and store them in some file. Then later, whenever an OpenAccount request comes in, you just read the next random account number from this list, which might be loaded into memory when the server starts. Personally I think this may be the more professional solution.
 

16. A little more about preventing multiple logins.

The way to deal with multiple logins I mentioned above might not be the best way, because multiple login requests all check the existence of the customer account on disk and only then search the 'login-customer-table'. They only fail after they have searched the disk, on realizing that another thread is also doing the 'login'. Then why don't we set up a flag in the 'login-customer-table' to prevent further login requests before we start searching the disk? This is exactly what one student invented, and I want to post it here because I think this issue is ignored by most students.

...using a Hashtable as the login-customer-table, the key is the customer number.

...login synchronizes on the login-customer-table to search for the customer number.

...if not found, put the customer number with a placeholder value in the Hashtable to prevent further login requests (the idea was described with a 'null' value, but note that java.util.Hashtable rejects null values, so a dummy object is needed)

...if found, return failure

...here we are sure no further login request will reach this point, so we begin searching the disk to see whether the customer number exists or not. (Of course, here you can combine this with the technique mentioned above of using an in-memory customer list, to search memory instead of disk.)
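The steps above can be sketched as follows. This is not the student's code: the class `LoginGate`, the placeholder object `LOADING` and the method names are hypothetical, and `Hashtable.putIfAbsent` (Java 8 and later) replaces the explicit synchronized search-then-put.

```java
import java.util.Hashtable;

class LoginGate {
    // Hashtable rejects null values, so a shared placeholder object
    // marks "login in progress" (hypothetical name).
    private static final Object LOADING = new Object();
    private final Hashtable<String, Object> loginTable = new Hashtable<>();

    // Step 1: reserve the customer number before touching the disk.
    // Returns false if another login already holds or reserved it.
    boolean tryReserve(String custNo) {
        return loginTable.putIfAbsent(custNo, LOADING) == null;
    }

    // Step 2a: the disk search succeeded, replace the placeholder.
    void complete(String custNo, Object customerData) {
        loginTable.put(custNo, customerData);
    }

    // Step 2b: the account does not exist on disk, release the
    // reservation so the number is not blocked forever.
    void abandon(String custNo) {
        loginTable.remove(custNo, LOADING);
    }

    boolean isLoggedIn(String custNo) {
        Object v = loginTable.get(custNo);
        return v != null && v != LOADING;
    }
}
```

Operations such as buy and sell have to treat the placeholder as "not logged in yet", which is what `isLoggedIn` does here.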

 

17. 

> 1) When you just allow one "login" and ignore others, that means that you
> already allow other clients who share the same account can buy or sell
> stocks on different sites but never "login" the account. Is it normal for
> stock trade? Is it not security?
> 2) In fact, the stock trade should be executed serially on a single client
> because it follows user's habit:
> 1. login his/her account and establish unique communication channel with
> server.
> 2. does buy or sell operation and wait for results whatever success or
> fail.
> 3. logout his/her account and disconnect unique communication channel
> with server.
> According the process, I can make a inclusion: logout should execute after
> all of buy or sell finished whatever success or fail. So count
> "login" number give me a picture how many clients sharing with the same
> account are interested in the same object located on server.
> 3) Totally, I agree with you about "login" and "logout" should synchronize
> with "buy" or "sell" in order to support software robust. In assignment 1, I
> ignor this. I will support it in assignment 2.



First, please note our 'login' is only named so; it is not like the normal concept of login, which requires some security check.
Second, as I have pointed out, from the perspective of a single client GUI program, it is very natural that you can control the execution sequence of login first, buy or sell second. But from the perspective of the server, can you control the sequence? The server just operates on whatever request it receives, with some logic. That's why we need a design to guarantee reasonable operations. As you said, it doesn't make sense that clients can buy or sell stocks on different sites without even logging in to the account. I agree. But can you prevent such requests from arriving at the server? Obviously not. You have to deal with it and eliminate it.

Third, you mentioned a communication channel between client and server. However, in RPC this channel only exists between the request and reply of one single RPC. Consecutive RPCs have no sense of a channel, i.e. a buy after a login cannot be recognized as coming from the same channel. Just imagine the RPC is implemented by some TCP-like reliable request-reply pairs. You can notice this feature in 'web services', which use HTTP request-reply to mimic RPC, even though not perfectly.
So, as a conclusion, from the server's perspective you never have a so-called 'context' or 'state'. And that's why we call it 'stateless': the server never 'remembers' which client makes what RPC calls.

The traditional 'client-server' concept in a LAN may not be applicable to distributed systems in general. For example, you may want to implement a client program which imposes your login-buy-logout logic. However, a robust server design cannot rely on a robust client program, because we don't have a 'channel' here. The requests from multiple client programs interleave their 'login-buy-logout' sequences.

 

18.

> 4) Multiple semaphores, I have to point out my view. Although
> multi-semaphore could provide fine grained critical section protection, we
> have to consider thread switch still need time to save its context even it
> is much low compared with process switch. So when you use multi-semaphore
> protect multi-section, it is possible that the execute time of multi
> critical section with a semaphore is smaller than the time of those with
> multi semaphores at the most time. That is, too much fine grained semaphore
> could become the bottle neck of system performance such that how to design
> multi-semaphore is a important issue for performance. Certainly, other
> issues for example thread of user level or thread of kernel level also
> affect performance significantly when using multi-semaphore.

 

Your point is quite interesting, and I personally agree that too many semaphores may not be better than a single semaphore, in the sense of too many 'thread-level' context switches. However, generally speaking, the high performance of concurrency originates from the overlap of computation and IO, i.e. a computation job which uses no IO can run in parallel with a pure IO job with no computation. Here IO may refer to disk IO or network transmission latency. For example, in our case a tradestock function may spend most of its time waiting for the remote transaction result, and at the same time you can definitely proceed with a local buy or sell operation, even though there is some context switch cost.

These are quite common observations.
Here I talk about some of my personal feelings on this question, and I don't guarantee they are correct or make much sense. You see, the above all comes from classic textbooks, and everybody who took an OS course knows it. However, in the real world, how many thread jobs are either pure computation or pure IO? I guess very few, because most real programs are just a mix of computation and IO in one small thread job. If there is no semaphore or synchronization to restrict the operation of the threads, then the OS will automatically optimize their operation. For example, when one thread is doing IO it will be put in the sleeping queue so that another thread can execute. However, if we put restrictions on threads through synchronization, this 'natural' optimization by the OS is destroyed. That is, a particular thread must finish its whole mixed IO-and-computation job before any other thread can go. In other words, 'synchronization' destroys the OS's 'natural' optimization of concurrency. That is why we generally prefer 'asynchronized' to 'synchronized'. You will notice this issue at the end of term when the professor introduces 'virtual synchrony'.
So, the big idea is to make threads as 'asynchronized' as possible and let 'chaos' find a better way of concurrency with the help of the OS's algorithms, because people have spent so much energy and time optimizing how the OS schedules concurrent jobs.
Therefore, even though you make the point that too many semaphores may increase context switch cost, in the sense that a semaphore itself is an expensive operation, we may still gain a benefit from 'asynchronization'. Of course the above point is purely my own understanding and it is open for discussion.

Generally speaking, as long as you get it right, fine-grained synchronization is better than coarse-grained.

 

19. Why do I need to synchronize with my OpenAccount?

I have mentioned this problem in the comments on some of the assignments I marked. However, I would like to repeat it here.

It may seem to many of us that OpenAccount doesn't need synchronization except for the file access part. However, is there any possibility that two OpenAccount threads will generate the same account number? To many of us, it seems impossible, since we might be sure that the Random class would never generate two conflicting numbers in a series of "sequential" calls. Yes, that may be right. But what about two Random objects created with the same seed? They will definitely give the same random numbers. This issue might become serious when we move on to web services, where each OpenAccount request may create a servlet or an instance of your implementation class. If you are not careful, you might NOT declare the Random data member static. Then each servlet would have an independent Random object, and if those objects share a seed they will generate the same account number. It is possible that the data files of your two customers overwrite each other even though you correctly synchronize the file operation, i.e. OpenAccount(100) and OpenAccount(1000) will return the same customer no. 1234 and there is only one balance in the file, either 100 or 1000.

However, this problem will not show up as long as you are using RMI or CORBA, which create only one single instance of the implementation class.
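The same-seed effect is easy to reproduce. The following is a minimal sketch; the class name SameSeedDemo, the seed 42 and the range 10000 are arbitrary choices for illustration and are not taken from the assignment.

```java
import java.util.Random;

class SameSeedDemo {
    // Each call builds a fresh Random, the way a per-request instance with a
    // non-static Random member would, and returns its first "account number".
    static int firstAccountNo(long seed) {
        Random random = new Random(seed);
        return random.nextInt(10000);
    }

    public static void main(String[] args) {
        int first = firstAccountNo(42L);
        int second = firstAccountNo(42L);
        // Two independent generators with the same seed give the same number.
        System.out.println(first == second); // prints "true"
    }
}
```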


20. And how about the efficiency of OpenAccount?

I should say that even some of the best programmers in this class have neglected this efficiency issue in OpenAccount. Generally speaking, Random generates the same random numbers from the same seed because they are pseudo-random numbers. So, in the first run of your server, your program generates a series of new accounts. If the second run starts from the same seed, all those random.nextInt() calls will generate exactly the same account numbers, which conflict with all the previously opened accounts. In other words, your program becomes slower and slower after each run, because it takes more and more retries before a new account is finally added to the file. This phenomenon will only be observed when you shut down your server at midnight and start it again at 6:00am like "MyConcordia.ca". (I am kidding.) However, I am disappointed to say that very few people ever gave a thought to this comparatively unimportant performance issue, for example by setting a varying seed, say random.setSeed(System.nanoTime()), before calling random.nextInt().

By the way, Random is a thread-safe class. So it does no harm to declare it as a static member of your implementation class, which saves you trouble in web services.
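Putting comments 19 and 20 together, one possible shape is sketched below. It is only an illustration under assumptions: the class AccountNumbers, the in-memory USED set and the four-digit range are hypothetical, and in the assignment the record of existing accounts lives in the customer file instead.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class AccountNumbers {
    // One generator shared by every instance of the implementation class.
    // The no-argument constructor chooses its own seed, so the sequence
    // differs from one server run to the next.
    private static final Random RANDOM = new Random();

    // Hypothetical in-memory record of the numbers already handed out.
    private static final Set<Integer> USED = new HashSet<>();

    // Draws until an unused number turns up. The check and the reservation
    // happen under one lock, so two threads never get the same number.
    static int next() {
        synchronized (USED) {
            int candidate;
            do {
                candidate = 1000 + RANDOM.nextInt(9000);
            } while (!USED.add(candidate));
            return candidate;
        }
    }
}
```

Random being thread-safe only protects the generator itself. The lock is still needed because "is this number free?" and "reserve it" must happen as one step.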

 

Assignment 2

(I would rather call the following comments than common mistakes because some of them are not so "common".)

1. The very simple mistake is that somebody doesn't use any TCP socket to communicate with the remote broker. This may be due to the misunderstanding that all brokers run on the same server.

 

2. Very few students even mention the deadlock problem, let alone how to solve it. One student gave a very good analysis and solution, and I want everybody to know it.

The basic idea comes from an insight into how deadlock happens. He points out that locking accounts without a fixed sequence is the origin of this deadlock. So the solution is to require every "tradeStock" to lock accounts in a certain sequence: of the two customer accounts in "tradeStock", the one with the smaller "broker ID" is locked first. If the broker IDs are the same (a trade between customers under the same broker), then the account with the smaller account number is locked first. Please note that this is exactly the method used in OS for "avoiding" deadlock.

Generally speaking, there are two big schemes for dealing with deadlock. One is to "avoid" it and the other is to "break" it. To avoid it, you can lock all resources in a certain order. Another simple way, as I suggested in the tutorial, is to use "non-blocking" locking methods so that you never run into deadlock, e.g. "tryLock" on a semaphore, or asynchronous socket sending and receiving.
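The ordering rule of that student can be sketched as follows. This is a simplified, single-process illustration and not his actual code: the Account class, its fields and the lockFirst helper are hypothetical, and locking an account that lives on a remote broker over TCP is not shown. The trade is reduced to moving qty shares from one account to the other.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedLocking {
    // Hypothetical account: identified by broker ID plus account number.
    static class Account {
        final int brokerId;
        final int accountNo;
        int shares;
        final ReentrantLock lock = new ReentrantLock();

        Account(int brokerId, int accountNo, int shares) {
            this.brokerId = brokerId;
            this.accountNo = accountNo;
            this.shares = shares;
        }
    }

    // True if a must be locked before b:
    // smaller broker ID first, then smaller account number.
    static boolean lockFirst(Account a, Account b) {
        if (a.brokerId != b.brokerId) {
            return a.brokerId < b.brokerId;
        }
        return a.accountNo < b.accountNo;
    }

    // Every thread takes the two locks in the same global order,
    // whichever direction the trade goes.
    static void tradeStock(Account from, Account to, int qty) {
        Account first = lockFirst(from, to) ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.shares -= qty;
                to.shares += qty;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

With this rule, tradeStock(a, b, 1) and tradeStock(b, a, 1) running at the same time both try to lock the same account first, so neither can end up holding one lock while waiting for the other.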

3. A simple but not common mistake is that "tradeStock" doesn't even use a TCP socket to communicate with the other broker server. This definitely means failing to meet the requirement of A2.

4. Another simple but not common mistake is that the broker server doesn't use a "thread" to monitor incoming messages. A trivial variant is that someone is so careless that the "run" function of the thread actually runs only once, i.e. there is no "while-loop" in the "run" function. Such a thread is almost useless because it handles one message and then exits.
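A minimal sketch of such a monitoring thread is given below. The class name BrokerListener and the handled counter are hypothetical, and the real message handling is left out; the only point is the "while-loop" around accept().

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

class BrokerListener implements Runnable {
    private final ServerSocket serverSocket;
    private final AtomicInteger handled = new AtomicInteger();

    BrokerListener(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
    }

    int handledCount() {
        return handled.get();
    }

    @Override
    public void run() {
        // Without this loop the thread would serve one connection and exit.
        while (!serverSocket.isClosed()) {
            try (Socket peer = serverSocket.accept()) {
                // Hypothetical: read and process the message from the peer here.
                handled.incrementAndGet();
            } catch (IOException e) {
                // accept() fails once the server socket is closed;
                // the loop condition then ends the thread.
            }
        }
    }
}
```

It would be started with something like new Thread(new BrokerListener(new ServerSocket(port))).start(), where port is whatever your broker listens on, and stopped by closing the server socket.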

5. Someone misunderstands the usage of semaphores by "new"-ing a semaphore whenever he wants to "lock" it. He defines a semaphore as a data member of class Customer. Then, within the "tradeStock" function, he "new"s the semaphore like this:

Customer myCust= findCustomerByID(cust1);

myCust.sema= new Semaphore(1);

myCust.sema.acquire();

...//here we do the trade stock for cust1

myCust.sema.release();

Please note that this creates a new semaphore every time, which makes the semaphore meaningless, because synchronization is achieved through "cooperation" between threads on the "same" semaphore. If you create a new one every time, who else can use it? Think about it!
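For comparison, here is a minimal sketch of the intended usage, with the semaphore created exactly once per customer. The field shares and the method addShares are hypothetical and only stand for whatever "tradeStock" does to the customer.

```java
import java.util.concurrent.Semaphore;

class Customer {
    // Created once together with the customer and never replaced,
    // so every thread acquires and releases the same semaphore.
    final Semaphore sema = new Semaphore(1);
    int shares;

    void addShares(int qty) throws InterruptedException {
        sema.acquire();
        try {
            shares += qty;
        } finally {
            // Release in finally so that an exception cannot leave it locked.
            sema.release();
        }
    }
}
```

Declaring the member final also lets the compiler reject a later myCust.sema = new Semaphore(1).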

6. Quite strangely, more than one student misunderstands the meaning of "blocking and non-blocking". They claimed that the "acquireUninterruptibly" method of Semaphore is a "non-blocking" method which can solve the deadlock problem. However, read the explanation at www.sun.com; it says very clearly that it is exactly the opposite, a "blocking" method. So this misunderstanding seems to be a kind of infectious disease.

7. As for marks, I should give a little explanation here. I set up a marking principle: if no analysis and solution for the deadlock problem is mentioned and shown in both the report and the code, the highest possible mark is 65 out of 70. If your implementation meets the basic requirements, your mark will be 60 or above, depending on implementation quality, such as whether you differentiate local exchange from remote exchange. If you don't implement the tradeStock function, or forget to, your mark may be under 45 out of 70, depending on the quality of your report.

8. Some students comment in their reports that the OS may be smart enough to switch to local RPC automatically, i.e. the OS can discover that sender and receiver are at the same IP address and avoid socket communication automatically. Basically, I find this hard to believe.

9. Another interesting solution for deadlock is to use the "tryAcquire" method of Semaphore only for the "second customer" in tradeStock. At the beginning, I thought it wouldn't work. However, it seems to be OK: as long as the second semaphore is taken with a "non-blocking" method, a thread never blocks while holding the first one, provided it releases the first semaphore when "tryAcquire" fails. It is a very interesting method, even though I think it makes the coding a bit more complicated than doing all the locking in "non-blocking" style.
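A minimal sketch of this idea, as I understand it, is shown below. The names TryAcquireTrade and Account are hypothetical, the trade is reduced to moving qty shares, and the retry policy after a failed attempt is left to the caller.

```java
import java.util.concurrent.Semaphore;

class TryAcquireTrade {
    // Hypothetical account guarded by its own semaphore.
    static class Account {
        final Semaphore sema = new Semaphore(1);
        int shares;

        Account(int shares) {
            this.shares = shares;
        }
    }

    // Blocks only on the first account. The second one is taken with
    // tryAcquire(); if it is busy, nothing is changed, the first semaphore
    // is released and false is returned so that the caller can retry.
    static boolean tradeStock(Account first, Account second, int qty) {
        first.sema.acquireUninterruptibly();
        try {
            if (!second.sema.tryAcquire()) {
                return false;
            }
            try {
                first.shares -= qty;
                second.shares += qty;
                return true;
            } finally {
                second.sema.release();
            }
        } finally {
            first.sema.release();
        }
    }
}
```

A caller would typically loop until tradeStock returns true, ideally with a short pause between attempts so that two opposing trades do not keep colliding.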

10. A very common mistake is that students forget to apply the same synchronization as in assignment 1 when accessing common resources such as the in-memory customer list, customer stock accounts, etc. However, I decided not to search specifically for this kind of error unless a submission forces me to. So it is possible that I don't deduct any points for this kind of serious problem.

11. And there is one more thing I forgot to mention. The business logic of "tradeStock" is NOT equal to "buyStock + sellStock"! That is to say, "tradeStock" doesn't touch the balance; it is purely an "exchange" of stock between two customers' accounts.