CORBA FAQ

1. How to say hello to CORBA?

This is the hello world example, and I always try to make hello world as simple as possible. Download the sample file into your "classpath", then from the command line run "compile.bat" followed by "run.bat". To see how compiling and running work, open both batch files in an editor; they are short enough to understand at a glance.

2. Do you have to worry about whether RPC is reliable? i.e. after the client makes an
> RMI invocation on the server, the server successfully completes the request
> and responds to the client. But for some reason, the client does not receive
> the reply, so the client thinks the request has failed. In
> this case, we say the client and server are inconsistent.

RPC aims to implement "exactly once" semantics. So, basically, you don't have to worry about it; otherwise you would be reinventing the wheel. The same goes for CORBA. (How is it implemented? Remember that TCP is a reliable transport and you will see that this is handled at a lower level.)

3. The second one is about assignment 2. In the requirement, professor
> said, "Furthermore, implement all communication among StockBrokers
> using TCP protocol". Does this imply we should develop raw socket
> programs to implement communications between StockBrokers?
> Furthermore, is there any special technology in CORBA that we can
> use to implement a transaction service? It seems that OMG defined a
> specification for a transaction service, but I don't know if there is
> a product we can use to implement atomic transactions in this
> assignment, especially in the JDK. If possible, please give me some
> suggestions.

CORBA is a rather loose standard: it is supported by multiple vendors and is not strict about how things are implemented. Even though CORBA specifies a standard transaction service, it is up to each vendor to implement it. As far as I know, we are supposed to implement a simple transaction by ourselves. Honestly speaking, I don't know whether Java IDL implements the transaction service or not. Possibly, but you need to check for yourself if you are interested.

4. HashMap is not thread-safe in 'put' and 'get'. So you do need a 'synchronized' keyword for it.

i.e.

HashMap customerList = new HashMap();

....

//here we retrieve our customer data first

   synchronized (customerList) {
        custStruct = (customerstruct) customerList.get(custNo);
   }

....//here we modify customer data, outside the lock on the map

//here we put back modified data

   synchronized (customerList) {
        customerList.put(custNo, custStruct);
   }

So, the above 'synchronized' blocks are necessary for a HashMap. Hashtable and Vector are the 'thread-safe' classes: all of their methods have a 'synchronized' keyword before the method name, which is what someone showed in his report with the method declarations. HashMap does no locking of its own; if you want that behaviour, wrap it with Collections.synchronizedMap. Here is a little testing program. Note that it only tests the 'put' method, and a test like this can pass by luck, so it does not prove thread safety. Also keep in mind that if two threads change the same customer between the 'get' and the 'put', one update can overwrite the other, so the modification itself may need protection too.

5. How to interpret 'tradeStock'?

First, please note that the two customerNo values may belong to different brokers or to the same broker. In the latter case, the transaction becomes local and you save the TCP connection latency.

Second, I want to apologize for what I said in last Thursday's tutorial about the values of the two stocks having to be the same. This requirement is not stated in the assignment, so you don't have to assume it. By the way, in the real world, when two companies want to merge and trade each other's stock, they don't really require the exchanged values to match exactly. So the conclusion is that you only need to check the availability of each stock, without caring whether the total values are equal. (Also, floating point calculation can seldom give you exact equality of two stock values.)

Third, you have to explicitly create one thread which creates a server socket and keeps listening for incoming requests.

class StockBroker {
    private MyThread myThread = new MyThread();
    private Vector customerList; // ... other data members

    // ...

    class MyThread extends Thread {
        public void run() {
            // As an inner class of StockBroker, MyThread can access
            // any data member of StockBroker here.
        }
    }
}
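As a rough illustration of the listener thread described above, here is a minimal sketch. The class name StockBrokerSketch, the one-line text protocol and the "ACK" reply are assumptions made up for this example; they are not part of the assignment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class StockBrokerSketch {
    private final ServerSocket serverSocket;
    private final ListenerThread listener = new ListenerThread();

    StockBrokerSketch(int port) throws IOException {
        // Port 0 asks the OS for any free port; loopback keeps the demo local.
        serverSocket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        listener.setDaemon(true);
        listener.start();
    }

    int getPort() {
        return serverSocket.getLocalPort();
    }

    void close() throws IOException {
        serverSocket.close(); // makes accept() fail, which ends the listener loop
    }

    // Hypothetical request handler: a real broker would parse and run the trade.
    String handle(String request) {
        return "ACK " + request;
    }

    // Inner class, so it can read serverSocket and call handle() directly.
    class ListenerThread extends Thread {
        @Override
        public void run() {
            while (!serverSocket.isClosed()) {
                try (Socket peer = serverSocket.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    String line = in.readLine();
                    if (line != null) {
                        out.println(handle(line));
                    }
                } catch (IOException e) {
                    // Socket closed or peer failed; the loop condition decides what next.
                }
            }
        }
    }
}
```

This version serves one request at a time, which is the single-threaded design suggested in question 9 a).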

Fourth, you need to implement this function as a transaction, which means all or nothing. You are free to start from either the remote or the local broker. However, if either one fails, the other must be undone. So the best strategy is to check the local broker first and then set up some kind of protection of the data (for example, set a 'lock' on that customer or on that particular stock item, depending on whether you want a 'record lock' or a 'table lock' as in a database). And don't forget to check whether any transaction is in progress when you 'logout', because a remote transaction may complete while you are logging the customer out. This would create inconsistency in trading. An intuitive solution is a reference count check (if you want even higher efficiency, I will talk about the scheme later) or a 'lock' check.

(A warning: those who had difficulty with assignment one can completely ignore the discussion below.)

Fifth, what kind of check is necessary? This 'tradeStock' actually consists of two steps, or two small transactions. For each customer, you add one stock and subtract another. 'Adding' never has any limit. (Would you mind if your account were credited with more money?) So the 'credit' stock needs no lock at all. (Unless you introduce a 'rollback' mechanism yourself. Generally speaking, we first check availability without modifying anything. When the check is finished, we proceed with the modification, so there is no need for rollback.) Therefore the 'credit' stock item can be left free for other operations. How about the 'debit' stock item? It depends. If a 'buyStock' operation is applied to this stock item, I don't see any trouble at all. So the only operation you need to guard against is another 'sellStock', because it reduces the quantity you have just checked.

6. Why can't I run orbd even though I already changed the 'ORBInitialPort' to another number?

Internally 'orbd' uses another port for its local server (the activation port, set with '-port') apart from the 'ORBInitialPort'. When you run into this kind of problem, it is probably because the default port is already used by another process. Try running it like this.

start orbd -port SomeBigPortNumber -ORBInitialPort AnotherBigPortNumber

7. Can I really have a data member in an 'interface' definition?

The answer is yes and no. Usually an interface corresponds to an 'abstract' class in OOP and can never be instantiated, so it makes no sense to define a data member in an 'interface'. However, you CAN define an attribute in IDL, which looks exactly like a data member. Why? IDL is simply a high-level interface definition language, independent of any real OOP language. Whatever is defined in IDL must be mapped to the native OOP language. So an attribute is typically mapped to a pair of access methods. (A read-only attribute gets only the 'get' method.)
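To make the attribute mapping concrete, here is a hand-written stand-in for the kind of Java that the IDL-to-Java mapping produces. The Account interface, its two attributes and the AccountImpl class are hypothetical names for this sketch, not actual 'idlj' output; in the Java mapping the accessor methods carry the attribute's own name.

```java
// Hypothetical IDL used for this sketch:
//   interface Account {
//       attribute float balance;           // read-write
//       readonly attribute string owner;   // read-only
//   };

// Stand-in for the generated operations interface.
interface AccountOperations {
    float balance();                 // accessor for 'balance'
    void balance(float newBalance);  // mutator for 'balance'
    String owner();                  // readonly attribute: accessor only
}

// The "data members" only exist in the implementation class.
class AccountImpl implements AccountOperations {
    private float balance;
    private final String owner;

    AccountImpl(String owner, float balance) {
        this.owner = owner;
        this.balance = balance;
    }

    public float balance() {
        return balance;
    }

    public void balance(float newBalance) {
        this.balance = newBalance;
    }

    public String owner() {
        return owner;
    }
}
```

In a real servant the implementation class would extend the generated POA class (see the next question) instead of implementing the interface directly.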

8. Which class should I inherit from? And how?

Run 'idlj' with -fall and you will get a class named after your interface plus 'POA', which is the class you need to inherit from. e.g. if our interface is Hello, we will get HelloPOA.

9. More about "tradeStock" function.

a) How do I design the thread for the TCP socket? Do I need to make it multi-threaded?

My opinion is to keep things easy. I think a single-threaded server socket is enough for "tradeStock", because this kind of operation is usually not very frequent. The example on the course website is a bit too complicated to handle. See my example here.

b) The assignment also states that "in order to perform a trade stock operation, two StockBrokers need to communicate between themselves." The elegant answer is, I think, service discovery. I know we will learn the right way with Web Services later, so can we bypass this, or is the assignment's intent to demonstrate the difficulties and hardships of service discovery without web services?

In practical terms, do we

1) hard-code 2 stockbrokers to know about each other
2) implement a home-made "register" and "lookup" through a file (would work on a shared network file for remote stockbrokers)
3) implement "service discovery" with something like Jini register() and lookup()?

This is a big question with a simple answer. Usually you can use option one: hard-code two or more brokers, either in a file or passed as command line parameters. This is of course the simplest solution. For advanced students, I would also suggest another way that makes migration more flexible: the "nameservice" in CORBA.

For example, both client and server programs only need to know the host name and port number of the "orbd" server. Then each broker server can register itself in the "nameservice" under a pre-agreed name like "Broker100", "Broker101" etc. Since the name service is very powerful, it can store more data under one "directory" or "NamingContext", such as the TCP port number, host name etc. This is the same idea as the "discovery service" in web services: at the very least you must know the basic place or directory in which to search for your information. Here we use the pre-agreed "name string" like "Broker100" to retrieve the "NamingContext". I don't know Jini. However, I believe all kinds of "discovery service" use the same basic logic.

c) ...I had a thought. I don't know how feasible it is but I would like to hear
your comments about it.

Just like clients connect to servers and are allowed to perform some
operations defined in the interface. Could we not have a similar thing for
servers to connect to each other? That way CORBA will ensure exactly once
semantics without the need for implementing sockets. We would need to
provide an identification for the server.

This is a feasible idea. However, I would say it is not a very good design, in the sense that we usually don't use CORBA for communication between servers because it is very expensive. Besides, this kind of initialization is usually not used frequently. Combining this rather internal method with your business logic in the "interface" design is awkward, because business logic may stay unchanged for quite a long time. (For example, our sellStock, buyStock etc. usually don't change.) System setup methods, however, may change as the software is upgraded. Then you would be stuck with a deprecated, out-of-date method in your interface. So, purely from a design perspective, it is not a good idea.

10. > I want to ask some questions:
>
> When I test my code, I ran my client on one host with one
> server1(stock broker1), and I ran another server2(stock broker2) on
> different host. I can not run my client to connect server2 and
> server1. I found when executing client program by batch
> file(run.bat), there is a line:"java HelloClient.HelloClient
> -ORBInitialHost localhost -ORBInitialPort 4531," if I modified the
> "localhost" to remote hostname, the client program can find the
> remote host, but it can not find the localhost? If I didn't modify
> this line, the client only can find the local host and it can not
> find the remote host. How can I modify this file (or other method) so
> that the client program can find local host and remote host?

First, you need to run "orbd" on some host, say "hostA". i.e. at the command line of hostA, type "orbd -ORBInitialHost localhost -ORBInitialPort 1234"

Then you can run any of your servers from "hostB" by typing at the command line of "hostB":
java HelloServer.HelloServer -ORBInitialHost hostA -ORBInitialPort 1234

Then you can run your client from "hostC" by typing at the command line of "hostC":

java HelloClient.HelloClient -ORBInitialHost hostA -ORBInitialPort 1234

So the point is that you need to tell both client and server on which host and at which port number your "orbd" server is running.

11. > BTW, I read "list common mistakes", and I knew I made some mistakes
> but I still conflict with what is exact right way using
> "synchronized". And about the "stock symbols" files, I also made this
> mistake. Because there is not requirement "copy stock information to
> memory" in assignment 1, I am afraid I will make a mistake if I
> copied it just like customer information. I thought I should strictly
> follow the requirement.
>

1. Even though there is no requirement to copy the data into memory, it is a programmer's common sense to analyze where the efficiency bottleneck is. I should say this is not easy for everybody and not required of everybody, because the result is only low efficiency, not incorrectness.
2. If you have any doubt about a requirement in the assignment, you can either ask me by email or write down your assumption about the requirement. Then explain the reason for your assumption in your report.

12. Is there anything wrong here?

It seems RMI/CORBA is not reliable, at least they do not guarantee the consistency
between client and server.

I wrote a simple test tonight, using CORBA. In this test, client sent a sellStock request
to server. At the server side, the sellStock method is intentionally delayed for 10
seconds (normally it responds to the client request within one second). During the waiting
period, at client side, I cut down the connection to the server(Actually,I just ctrl-C
the client program, because the client and server were on the same host, so we can say
the connection between client and its ORB was broken), hence the client had no way to
receive the reply. But the result showed that server had actually sold that stock.

First of all, this should be a question for everybody. Secondly, ask yourself whether it is a problem of CORBA or RMI in particular. Indeed, it is a problem for ALL RPC.

Recall the essential difference between RPC and simple message exchange. RPC requires exactly-once semantics. That is, the sender and receiver will not see a repeated RPC request or mix up RPC replies. This can be achieved with a reliable communication channel like TCP, which internally uses sequence numbers for its packets. Therefore an RPC server built on a TCP-like channel no longer checks whether a request is repeated or whether the reply is guaranteed to be delivered, because this becomes the responsibility of the communication layer. So there is nothing wrong with CORBA or RMI in the above situation. This is the nature of network communication, namely unreliability.

Your intention to force data consistency between the client side and the server side is purely application-level business logic, and it is a little advanced at this point. However, you will realize by the end of this term that you need to handle such situations. This is not a problem of RPC, because you shut down the client program instead of cutting the communication channel. So it becomes a fault-tolerance problem. It can be solved simply by assigning each RPC request a unique request ID. For example, whenever the client sends a request, it explicitly passes a unique request ID or session ID as one function parameter. By checking the ID, the server will know whether the request has already been fulfilled when the recovered client resends the same request ID.
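The request ID idea can be sketched roughly as follows. This is only an illustration: the class DedupBrokerSketch, its in-memory table and the always-succeeding doSell helper are assumptions for the example, and a real server would also need to persist the table to survive its own restart.

```java
import java.util.HashMap;
import java.util.Map;

class DedupBrokerSketch {
    // requestId -> result of the request that has already been executed
    private final Map<String, Boolean> completed = new HashMap<String, Boolean>();
    private int sharesSold = 0;

    // The client generates requestId and reuses it when it retries after a crash.
    synchronized boolean sellStock(String requestId, String stockSym, int qty) {
        Boolean previous = completed.get(requestId);
        if (previous != null) {
            return previous; // replayed request: report the old result, sell nothing
        }
        boolean result = doSell(stockSym, qty);
        completed.put(requestId, result);
        return result;
    }

    // Hypothetical stand-in for the real selling logic.
    private boolean doSell(String stockSym, int qty) {
        sharesSold += qty;
        return true;
    }

    synchronized int getSharesSold() {
        return sharesSold;
    }
}
```

If the client resends sellStock with the same request ID after recovering, the stock is sold only once and the client still learns the outcome.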

Remember, when you shut down the client program, there is nothing left for RPC to check. And since the RPC server leaves "exactly once checking" to TCP, it is not even aware that the client program has shut down. Does that make sense?

13. How can we use the naming service to store some useful information like the "host name" and "port number" of a broker server?

Here is a simple demo. For more advanced usage I need some time to write more examples, if anyone is really interested. Or you can check it out yourself here.

14. How to tackle deadlock issue in "tradeStock"?

a) First of all, how is deadlock created? Let's look at a possible implementation. Assume class Customer contains a customer ID, a balance and a list of the stocks he owns.

class Customer {
    public int custID;
    public String custName;
    public float balance;
    public Vector stocks;

    // operationType indicates whether this is a buy, sell or trade operation.
    // The method is 'synchronized', so the caller holds this customer's lock throughout.
    synchronized boolean updateStock(String stockSym, int qty, int operationType) {
        switch (operationType) {
        // ... cases for buy and sell go here
        case TRADE_OPERATION_LOCAL:
            // first we check whether the local qty is enough
            if (!doCheckLocalStock(stockSym, qty)) return false;
            // <1> send the request over the TCP socket and wait for the result
            if (!doRemoteTrade(stockSym, qty)) return false;
            doUpdateStock(stockSym, qty, TRADE_OPERATION_LOCAL); // do update stock
            return true;
        case TRADE_OPERATION_REMOTE:
            // check whether the stock qty is enough
            if (!doCheckLocalStock(stockSym, qty)) return false;
            doUpdateStock(stockSym, qty, TRADE_OPERATION_REMOTE); // update stock accordingly
            return true;
        // ...
        }
    }
}

The above is pseudo code for a possible implementation of trade stock, with the buy and sell parts omitted. However, there is a big problem in this implementation: it can create a deadlock. For example, let's make two tradeStock calls in opposite directions.

Assume cust1 is under broker1 and cust2 is under broker2. From broker1 we call tradeStock(cust1, ..., cust2), and from broker2 we call tradeStock(cust2, ..., cust1). In broker1, the call stops at <1> in "doRemoteTrade" because it is waiting for the result from broker2. In broker2, the call also stops at <1> in "doRemoteTrade" because it is waiting for the result from broker1. However, your TCP server socket runs in another thread. When it receives the request, it also needs to call "updateStock" to do the TRADE_OPERATION_REMOTE. But it is blocked outside this function, because your request thread is still inside it at <1> and holds the lock. This is how the deadlock happens.

Another typical implementation may look like this:

class Broker {
    void tradeStock(int cust1, String stockSym1, int qty1, int cust2, String stockSym2, int qty2) {
        Customer customer1 = getCustomerFromList(cust1); // retrieve from active_customer_table
        synchronized (customer1) {
            // send the request over the TCP socket and wait for the reply
            if (doRemoteTrade(cust2, stockSym2, qty2)) {
                doLocalTrade(cust1, stockSym1, qty1); // update local customer data
            }
            // ...
        }
        // ...
    }
}

This implementation is essentially the same as the one above, even though they look different, because both use the "synchronized" mechanism: a synchronized method and a synchronized block both acquire an object's lock. The problem is that both are "blocking", which means the thread gets blocked and cannot return from the function call.

b) How to solve it?

A simple solution is to stop using "synchronized" and use a "Semaphore" with its "non-blocking" methods.

For example, the above code can be transformed into equivalent Semaphore operations. Assume we declare "Semaphore sema" as a data member of class Customer. Then each buy, sell, tradeLocal and tradeRemote call will call "sema.tryAcquire()", which is a non-blocking function. If "sema" is already held by another thread, the call fails immediately instead of blocking (with java.util.concurrent.Semaphore it returns false rather than throwing an exception). By checking the result we count how many times the attempt fails. (Of course you may call Thread.sleep so that the calling thread yields and other threads can finish their jobs.) If the number of attempts reaches some maximum, we know there is no way for our transaction to finish and we can safely return failure to the client program. By proceeding "cautiously" like this we won't run into a deadlock situation.
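A minimal sketch of this retry scheme, assuming java.util.concurrent.Semaphore, whose tryAcquire() reports failure by returning false. The class CustomerSketch, the limits MAX_TRIES and RETRY_DELAY_MS, and the simplified sell operation are made up for this illustration.

```java
import java.util.concurrent.Semaphore;

class CustomerSketch {
    static final int MAX_TRIES = 5;          // assumed retry limit
    static final long RETRY_DELAY_MS = 20;   // assumed pause between attempts

    private final Semaphore sema = new Semaphore(1);
    private int qty;

    CustomerSketch(int qty) {
        this.qty = qty;
    }

    // Returns false if the stock is insufficient or the lock could not be obtained.
    boolean sell(int amount) throws InterruptedException {
        for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
            if (sema.tryAcquire()) {          // non-blocking: false if already held
                try {
                    if (qty < amount) {
                        return false;
                    }
                    qty -= amount;
                    return true;
                } finally {
                    sema.release();
                }
            }
            Thread.sleep(RETRY_DELAY_MS);     // let the current holder finish
        }
        return false;                         // gave up instead of blocking forever
    }

    // Exposed only so the demo can simulate another operation holding the lock.
    Semaphore semaphore() {
        return sema;
    }

    int getQty() {
        return qty;
    }
}
```

The caller gets a failure after a bounded wait, so two brokers that each hold one customer and wait for the other will both give up rather than hang.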

15. Deadlock continued...

a) Does it mean you don't have the deadlock problem of 14 a) if you use a "semaphore"?

As you can observe in 14 b), if you don't use the non-blocking version of "acquire" of "Semaphore", you will run into the same problem as with "synchronized", which is essentially a blocking version.

b) What if I am using a transaction list? Can I prevent deadlock from happening?

I should say that unless you have a global view of the whole distributed system, it is hard to detect this kind of deadlock, because it involves remote hosts. For example, somebody might use a transaction list to implement mutual exclusion.

Create a kind of transaction manager class which treats all operations like buy, sell and trade stock as transactions. For "trade" we must handle the local and remote parts separately: on the local host, the transaction manager only handles "trade of cust1", and on the remote host, the transaction manager handles "trade of cust2". As you guessed, the transaction manager maintains a big list of the currently active "buy, sell, trade stock" transaction items. Each operation, namely buy, sell and trade, must consult this transaction manager to see whether a conflicting item is already in the list. With this check you can successfully guard against simultaneous access to items. However, you cannot be sure whether there is a deadlock or not. For example, your program may see a conflict between "trade_local(cust1)" and "trade_remote(cust1)", and if you jump to the conclusion that this is a deadlock you might be wrong. See, tradeStock(cust1, ..., cust2,...) and tradeStock(cust3,..., cust1,...) will not be a deadlock, because there is no conflict between cust2 and cust3, which means sooner or later the two transactions will finish as long as cust2 and cust3 are not blocked.
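A very small sketch of such a transaction list is shown below. It assumes that a "conflict" simply means "another active transaction on the same customer"; the class name TransactionManagerSketch and the methods tryBegin and end are hypothetical. As the text says, this guards simultaneous access but tells you nothing about whether a refusal is a real deadlock.

```java
import java.util.HashSet;
import java.util.Set;

class TransactionManagerSketch {
    // IDs of customers that currently have an active buy, sell or trade item.
    private final Set<Integer> activeCustomers = new HashSet<Integer>();

    // Registers a transaction for custId; returns false if one is already active.
    synchronized boolean tryBegin(int custId) {
        return activeCustomers.add(custId);
    }

    // Removes the transaction item once the operation has finished or failed.
    synchronized void end(int custId) {
        activeCustomers.remove(custId);
    }
}
```

A caller would wrap each operation in tryBegin and end, and treat a false result as "busy, retry or report failure" rather than as proof of deadlock.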

The more complicated cases are bigger "loops" of deadlock rather than one between two nodes. For example, a triangle deadlock among cust1, cust2 and cust3 formed by three circular "tradeStock" calls. And squares, pentagons... etc.

c) Do I have other choices?

I have been asking myself the same question since last term. Some of my classmates suggested a complicated algorithm to detect deadlock. I should say it is not easy, and quite expensive, to implement and run. Generally there are two big schemes in the area of deadlock. One is to prevent deadlock from happening by detecting any possible deadlock in advance. The other is to break a deadlock when it actually occurs. Generally speaking, it is always harder to prevent something from happening than to tackle it when it really happens. For example, the USA spends a lot of money building an anti-missile shield, which is very complicated. Russia doesn't have enough money, so it simply implements a counter-strike retaliation system, which is cheap and effective.

However, for those who are really interested in this area, maybe we can have some open discussions.

d) Does your code deal with this extreme case?

Suppose a careless customer stays up very late and issues the command tradeStock(cust1, stock1, qty1, cust1, stock2, qty2), and by chance the "user-interface-design" programmer never imagined such a thing. Your server then runs into a kind of deadlock. Do you know why?

Probably the "server-side" programmer knows how to differentiate the "local-trade" from the "remote-trade" (I hope so. :) ) and writes a "doLocalTrade" function like this:

boolean doLocalTrade(Customer cust1, String stock1, int qty1, Customer cust2, String stock2, int qty2) {
    synchronized (cust1) {
        synchronized (cust2) {
            if (checkStockAvailable(cust1, stock1, qty1) && checkStockAvailable(cust2, stock2, qty2)) {
                doExchangeStock(cust1, stock1, qty1, cust2, stock2, qty2);
                return true;
            } else return false;
        }
    }
}

Please note that if cust1==cust2, this nested locking can become a deadlock. (Strictly speaking, Java's "synchronized" is reentrant, so a thread that already holds the lock on cust1 can enter the inner block again and the literal code above would not hang. Treat it as pseudo code for a semaphore: if class Customer has a semaphore data member with a single permit and every mutually exclusive operation must acquire it, the second acquire on the same customer blocks forever.)
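The difference between the two locking styles in the cust1==cust2 case can be checked with a small sketch. The class ReentrancyDemo and its two methods are made up for this illustration, and the semaphore version uses the non-blocking tryAcquire() so that the demo reports the problem instead of hanging.

```java
import java.util.concurrent.Semaphore;

class ReentrancyDemo {
    // Nested synchronized blocks: the same thread may re-enter a lock it already holds.
    static boolean nestedSynchronized(Object cust1, Object cust2) {
        synchronized (cust1) {
            synchronized (cust2) {
                return true;
            }
        }
    }

    // Nested one-permit semaphores: a second acquire on the same semaphore cannot succeed.
    static boolean nestedSemaphore(Semaphore s1, Semaphore s2) {
        if (!s1.tryAcquire()) {
            return false;
        }
        try {
            if (!s2.tryAcquire()) {
                return false;   // with a blocking acquire() this would wait forever
            }
            s2.release();
            return true;
        } finally {
            s1.release();
        }
    }
}
```

Either way, the simplest fix is to reject a trade where both customers are the same before taking any lock.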

Honestly speaking, I seldom like to code in a "fancy style". However, by reading your source code I do learn something exotic. But this one is purely my imagination.