RMI FAQ

1. Why can't I compile the "shapelist" example?

You are probably not compiling from the proper path. If you compile from the command line, do this:

a) Check your "classpath" environment variable and note the path it contains, say path1.

b) Go to path1 and create the directory for whatever "package" the example uses. The "shapelist" example uses the package "example.RMIShape", so you need to create the directory "example\RMIShape\" and copy all the source code there.

c) Compile all the source code from path1 by running "javac example\RMIShape\*.java" from the command line.

2. Why can't I run the "shapelist" example?

If you compile with the above method, you may run into exceptions like "access denied" or something similar when you run the program. In that case, try replacing the original "Naming.rebind" call with "LocateRegistry". To see exactly how, download the modified code here.

Please note that I removed the redundant "example" directory and use only "RMIShape".
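To give an idea of what binding through "LocateRegistry" can look like, here is a minimal sketch. It is my own illustration, not the downloadable code: the names Greeter, GreeterImpl and RegistryDemo are made up for this FAQ, and I assume the registry is created inside the server JVM.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface, standing in for the example's ShapeList.
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

// Hypothetical implementation class.
class GreeterImpl implements Greeter {
    public String greet(String name) { return "Hello, " + name; }
}

class RegistryDemo {
    // Starts a registry inside this JVM and binds the object under the given name.
    static Registry startServer(Greeter impl, int port, String name) throws RemoteException {
        Registry registry = LocateRegistry.createRegistry(port);
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind(name, stub);
        return registry;
    }

    // Unexports everything so that the JVM can exit.
    static void stopServer(Registry registry, Greeter impl) throws RemoteException {
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }

    public static void main(String[] args) throws Exception {
        Greeter impl = new GreeterImpl();
        Registry registry = startServer(impl, 1099, "Greeter");
        System.out.println("Bound: " + String.join(", ", registry.list()));
        // A client would call LocateRegistry.getRegistry(host, 1099).lookup("Greeter").
        stopServer(registry, impl);
    }
}
```

A real server would of course not call stopServer right away; it is only there so that the demo terminates.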

3. How do I run a program with RMI?

It is as simple as running any other Java program. From your "classpath" directory, type "java RMIShape.ShapeListServer" to run the server. Then, from another DOS window, type "java RMIShape.ShapeListClient".

4. Do I need to create threads explicitly in the server?

No, at least not for this assignment. The RMI runtime dispatches incoming calls on threads of its own, so requests from different clients can already run concurrently. As for the client side, it depends. If you run your client program on one machine and want the effect of concurrent access, you might need to create threads. However, you can achieve almost the same effect by running your client program on multiple machines at the same time.

5. Then what is the most important part of this assignment?

The essence of server design is to make the server correct, robust, efficient, and fault-tolerant. In this assignment, if the methods of your server class are not thread-safe, you do not satisfy the correctness requirement, and the server is basically useless.

6. What exactly is "thread-safe"? And which parts might not be thread-safe?

Please check my slides on synchronization and concentrate on the beginning part, because some of the advanced techniques will not be necessary until your later assignments and the project.

Generally speaking, whenever there is a chance that a shared resource is accessed by multiple threads, you need a thread-safe mechanism. For example, the assignment doesn't say that a customer account cannot be shared, which means the same customer account may be used concurrently by multiple users. This is common in the real world: a customer account is shared between the members of a family and they all try to manipulate it through the internet. Then we have a scenario in which multiple threads try to manipulate the data of the same account. This is definitely a place where you have to take care of thread safety.

Another example is more obvious. When multiple customers log out and your server writes their data to files, the threads are probably using the same file pointer. If you don't protect this file operation, there is a good chance that customer data gets corrupted, i.e. two threads move the file pointer back and forth and one customer's data overwrites another's.

As for other scenarios, I leave them for you to think about.
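The shared-account scenario above can be sketched as follows. This is only an illustration under my own assumptions: the class names Account and SharedAccountDemo are made up, and a real account holds more than a single balance.

```java
class Account {
    private int balance;

    Account(int initialBalance) { balance = initialBalance; }

    // synchronized makes the read-modify-write of balance atomic for this account.
    synchronized void deposit(int amount) { balance += amount; }

    // Check and update happen under the same lock, so the balance never goes negative.
    synchronized boolean withdraw(int amount) {
        if (amount > balance) return false;
        balance -= amount;
        return true;
    }

    synchronized int getBalance() { return balance; }
}

class SharedAccountDemo {
    // Runs `threads` threads that each deposit 1 unit `perThread` times into one account.
    static int run(int threads, int perThread) throws InterruptedException {
        Account account = new Account(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) account.deposit(1);
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return account.getBalance();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4, 10000)); // prints 40000
    }
}
```

If you remove "synchronized" from deposit, the final balance will often be less than 40000, which is exactly the kind of corruption described above.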

7. Do I need to use "rmic" to compile?

Generally speaking, you don't need "rmic". It is a bit old: since Java 5 the stubs can be generated dynamically at run time, and last term we all used plain "javac" to compile without any problem. If you are using Eclipse, that should also be OK, but you have to set the correct parameters. In the modified example, which is compiled with "javac" only, I use "LocateRegistry" instead of "Naming.lookup". Check the modified example.

8. Do I need to use a DB, e.g. Access, MySQL etc.?

Last term, some of my classmates did something like that, except they used Access instead of MySQL because it is the only DB we have in the lab. But later they ran into problems in the project, as Access doesn't give you full control of transactions etc. You see, this is a design course and we concentrate on raw methods that give full control of every part of the program. If you choose to use a DB, you will sooner or later run into difficulty with transaction control or the like. My suggestion is to just write a simple file DB that stores records.

The following part is not required; it is only a suggestion for those who always pursue high efficiency. I am posting it only because I already sent it to one student and it is fair to give everyone the same tip, even though it will probably spoil your creativity. (I think some of you might be annoyed to see me shrink your imagination space by giving too many hints.)

You can still improve performance with a better design of your database. For example, instead of storing records linearly, you can invent a kind of "index file" that stores record pointers, so that with RandomAccessFile the file access time is roughly constant. This is exactly what I did last term. I used the account number (4 digits) as the position of the record pointer in the index file; reading that record pointer gives you the real file offset of the record, so you never do a linear search for a record. This is just a suggestion and you are free to implement the structure of your file DB any way you like.

By the way, if you have taken "advanced database design", you surely remember how records are stored in files and how index pointers are stored to speed up queries.
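A minimal sketch of the index-file idea, not my code from last term. I assume here, for illustration only, that account numbers run from 0 to 9999, that each index slot holds one 8-byte pointer, and that a record is a single string; the class name IndexedFileDb is made up.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class IndexedFileDb implements AutoCloseable {
    private static final int MAX_ACCOUNTS = 10000;    // assumed: 4-digit account numbers
    private static final int SLOT_SIZE = Long.BYTES;  // one record pointer per account
    private final RandomAccessFile index;
    private final RandomAccessFile data;

    IndexedFileDb(File indexFile, File dataFile) throws IOException {
        index = new RandomAccessFile(indexFile, "rw");
        data = new RandomAccessFile(dataFile, "rw");
        if (index.length() == 0) {
            index.write(new byte[MAX_ACCOUNTS * SLOT_SIZE]); // every slot 0 = "no record"
        }
    }

    // Appends the record to the data file and stores its offset in the account's slot.
    synchronized void put(int accountNo, String record) throws IOException {
        long offset = data.length();
        data.seek(offset);
        data.writeUTF(record);
        index.seek((long) accountNo * SLOT_SIZE);
        index.writeLong(offset + 1); // stored +1 so that 0 can mean "empty"
    }

    // Two seeks and no scan: the cost does not grow with the number of records.
    synchronized String get(int accountNo) throws IOException {
        index.seek((long) accountNo * SLOT_SIZE);
        long pointer = index.readLong();
        if (pointer == 0) return null;
        data.seek(pointer - 1);
        return data.readUTF();
    }

    @Override
    public void close() throws IOException {
        index.close();
        data.close();
    }
}
```

Note that an updated record is simply appended and the old copy stays in the data file as garbage. Whether you reclaim that space is another design choice.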

9. The assignment talks about capturing the output. Obviously we should test with multiple clients. Would you like to see the clients running from different computers? I can run the server on my desktop, and clients on my desktop & laptop.

Definitely. I expect to see a client running on a remote host, because that is the very purpose of a distributed system. Only by doing so can you expose potential problems that you cannot see when everything runs on the local host.

10. ... I'm working on the structure that we are supposed to copy data from the secondary storage to, and I decided to make it a multi-dimensional object array. I was wondering if it's OK to do that.

The data will include the balance, the stock name, etc., which are obviously of different types, so a "multi-dimensional array" is not appropriate. The simplest solution is to define a "struct" class and store the data in an array of such objects. The following is only a suggestion and you are free to choose your own design.

For example:

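import java.util.Vector;

// One stock position held by a customer.
class MyStock {
    public String stockSymbol;
    public int balance;
}

// One customer account with all of its stock positions.
class MyCustomer {
    public Vector<MyStock> stockVect = new Vector<>();
    public String cust_name;
    public int cust_account;
    // ...
}

// One broker with all of its customers.
class MyBroker {
    public Vector<MyCustomer> customerVect = new Vector<>();
}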

11.

> ... i have a question should i save all
> of the stock names along with their account in one string in the
> structure array for each customer.
>

This is a design issue that you need to think through yourself. What I suggest here is only a small hint.
For example, a customer may hold an arbitrary number of stocks in an account, and each holding includes the abbreviated symbol name (i.e. Nasdaq etc.), possibly the full name of the stock, and definitely the balance of that stock. (The full name is a design option: if you store all stock names in a common table or file, you don't have to store the full name in each customer's data. You only need a reference number or an index for the stock. I leave this for your consideration.)
So, you see, each customer really has a variable-length record. What is the best way to store such records in a database or in files? If you have taken database design, you know there are several ways to solve this problem, and some require quite a lot of file structure technique, like block design. Here I give my own design idea, only as a suggestion; you are ABSOLUTELY free to do whatever you think best, because this course does not concentrate on database design.
Instead of allocating an arbitrary number of stock items for each customer, you can simply assume that each customer may potentially buy every kind of stock. For example, if we have 100 different kinds of stock, I give each customer account an integer array of size 100, so that whenever the customer buys a stock I use the stock index to access the array and modify the balance of that stock. Please note that I am talking about file structure design here: a file is basically consecutive storage, and if you don't pre-allocate a chunk for each customer, you have to shift all the following data whenever you add a new item to a chunk! In memory, however, you don't need such luxurious allocation. The simplest choice is a "vector" of your struct class, or a dynamic array. You only need to transform this in-memory storage into the file structure when the customer logs out.
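The pre-allocated layout makes the file offset of any stock quantity a simple calculation. A minimal sketch, assuming 100 kinds of stock, 4-byte integer quantities and customers numbered 0, 1, 2, ... in file order (the class name FixedSlotHoldings and these numbers are only for illustration):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class FixedSlotHoldings {
    static final int STOCK_KINDS = 100;                          // assumed number of distinct stocks
    static final int RECORD_BYTES = STOCK_KINDS * Integer.BYTES; // fixed chunk per customer

    // Byte offset of one stock's quantity inside the file.
    static long offsetOf(int customerIndex, int stockIndex) {
        return (long) customerIndex * RECORD_BYTES + (long) stockIndex * Integer.BYTES;
    }

    // Overwrites one quantity in place; nothing else in the file has to move.
    static void setQuantity(RandomAccessFile file, int customerIndex, int stockIndex, int qty)
            throws IOException {
        file.seek(offsetOf(customerIndex, stockIndex));
        file.writeInt(qty);
    }

    // Reads one quantity; a slot beyond the end of the file counts as 0.
    static int getQuantity(RandomAccessFile file, int customerIndex, int stockIndex)
            throws IOException {
        long offset = offsetOf(customerIndex, stockIndex);
        if (offset + Integer.BYTES > file.length()) return 0;
        file.seek(offset);
        return file.readInt();
    }
}
```

For example, customer 2 and stock 5 land at offset 2 * 400 + 5 * 4 = 820. A real implementation would also write out each customer's whole chunk when the account is opened, so that every slot has a defined value.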

Hope this answers your question.

12. I am a bit surprised that nobody is asking a basic question: should each broker run as an independent server on a different host?

The answer should be yes, because this is the purpose of a distributed system. Otherwise, why would we call them different brokers if they all live in the same server? Probably nobody asks this question because most of you already take it as common sense.

13. What if I declare all methods of my class to be "synchronized"?

This will only demonstrate that you are not clear about where your server needs to be "thread-safe". Your implementation will also be extremely inefficient, which defeats the purpose of "design of distributed system". And I can assure you that I will deduct a fair number of points from your total mark for this assignment.
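To show the difference in granularity, here is one possible sketch that locks only the account being changed instead of the whole server object. It is an illustration, not a required design; the class FineGrainedBroker and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

class FineGrainedBroker {
    // One single-element array per account holds the balance; the map itself is thread-safe.
    private final ConcurrentHashMap<Integer, int[]> balances = new ConcurrentHashMap<>();

    void openAccount(int custNo, int initialBalance) {
        balances.putIfAbsent(custNo, new int[] { initialBalance });
    }

    // Locks only the affected account, so requests for other accounts run in parallel.
    boolean adjust(int custNo, int delta) {
        int[] cell = balances.get(custNo);
        if (cell == null) return false;            // unknown account
        synchronized (cell) {
            if (cell[0] + delta < 0) return false; // would overdraw
            cell[0] += delta;
            return true;
        }
    }

    int balanceOf(int custNo) {
        int[] cell = balances.get(custNo);
        if (cell == null) throw new IllegalArgumentException("unknown account " + custNo);
        synchronized (cell) {
            return cell[0];
        }
    }
}
```

With "synchronized" on every method of the server, two customers with different accounts would wait for each other for no reason. Here they only wait if they touch the same account.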

14.
> In RMI, is there any method that server side could use to find out
> client's host name or IP address.
>
RMI does provide a way: inside a remote method, the server can call the static method "RemoteServer.getClientHost()" to get the host of the current caller. However, relying on it is not recommended, because it goes against one of the fundamental concepts of server design: statelessness.

1. You will soon be taught the concept of "stateless" and the professor will explain it clearly. However, let me give you some tips beforehand to help you understand this important concept. Usually the server is not responsible for maintaining the state of its clients, i.e. the server doesn't remember the last request from a particular client. This is because the server works concurrently on requests from multiple clients, and concurrent requests that modify the server state make any "remembered client state" stale. For example, two customers are selling stock at the same time and both first query the current price. If the market price is fluctuating very fast, the quotation given to a client will quickly be out of date. If the server had to maintain a table of when and how much it quoted to whom, that table would be huge and a heavy burden on performance. That is why the server usually doesn't remember what state a client is in.
2. However, in some cases we do need the server to know the state of a client. For example, in e-commerce we expect the server to know exactly which orders or prices the customer has checked. To achieve that while keeping the server stateless, we ask the client to send in its state using a technique like a "cookie". That is, the client receives some encrypted data from the server which carries the state of the client.
3. This is a design course and I think students are always encouraged to do whatever they like as long as they can justify it. In this particular assignment, you don't have to worry about issues like security or "session maintaining" as long as you fulfil the basic functions specified in the assignment. However, nothing stops you from implementing a really practical stock server. If you can express clearly what your design is and how you implemented it, I am sure you will receive extra points for that.
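For completeness on the original question: the standard library call for finding the caller's host is "java.rmi.server.RemoteServer.getClientHost()". It only works while a remote method is being served and throws ServerNotActiveException otherwise. A small sketch (the helper class CallerInfo is made up for this FAQ):

```java
import java.rmi.server.RemoteServer;
import java.rmi.server.ServerNotActiveException;

class CallerInfo {
    // Returns the calling client's host when invoked from inside a remote method,
    // or "local" when no remote call is being served on the current thread.
    static String callerHost() {
        try {
            return RemoteServer.getClientHost();
        } catch (ServerNotActiveException e) {
            return "local";
        }
    }
}
```

This is fine for logging. Using it to decide what a client is allowed to do would tie your server to client state, which is what the points above argue against.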

15.
> Could I make some assumptions for my assignment, e.g., a client can
> only operate on the same host after he sign in the broker. If we
> don't make assumptions, for correctness, the program will be very
> complicated.
>
I think the specification of the assignment contains no assumption that an account won't be shared. Therefore the scenario you mention is quite possible. However, at the current stage you don't have to worry about correctness when two client programs access the same account from different hosts, or even from the same host. As far as I understand, the "interface" given in the assignment cannot prevent this from happening.
However, from a practical perspective there are ways to work around it. For example, whenever a customer logs in, the server can explicitly return a unique security ID for that particular session. In all following requests from this customer, your client program sends in this security ID as an extra parameter, so that the server can verify that the request comes from the same client program (or at least, logically, from the same session). Again, this is purely a design suggestion and you are not required to implement it. You are absolutely free to implement your server as you wish, as long as you fulfil the basic requirements and justify whatever you specify.
Actually, you may find that you are required to do this in your project.
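A minimal sketch of the security ID idea, under my own assumptions: the ID is a random UUID string, and the class SessionManager with its methods is hypothetical, not part of the assignment interface.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

class SessionManager {
    // session ID -> customer number that logged in with it
    private final Map<String, Integer> sessions = new ConcurrentHashMap<>();

    // Called from login: hands the client a hard-to-guess ID for this session.
    String open(int custNo) {
        String id = UUID.randomUUID().toString();
        sessions.put(id, custNo);
        return id;
    }

    // Called at the start of every later request from the client.
    boolean isValid(String sessionId, int custNo) {
        if (sessionId == null) return false;
        Integer owner = sessions.get(sessionId);
        return owner != null && owner == custNo;
    }

    // Called from logout.
    void close(String sessionId) {
        if (sessionId != null) sessions.remove(sessionId);
    }
}
```

The client keeps the returned string and passes it along with each buy or sell request; a request carrying an unknown or closed ID is rejected.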

16.

> The last one is about the test. Could you set some test standards for
> our application. This is because it is hard of us to test all aspects
> of the application. Even if I had run thousands of tests, including
> buying and selling stocks and got the correct results, I could not
> say my program is correct because there maybe some situations that I
> did not cover.
>
Generally speaking, there is no such thing as exhaustive testing in software engineering, and the same goes for our assignment. This course places "design" first, which means your design choices are our main concern. The testing or demo is just a way to prove what you have claimed in your report. Specify which problems concern you in the server implementation, show how you designed your solution, explain how you tested it, and show me some evidence such as snapshots of runs or result files with appropriate explanation. If you can do that, I think you have done very well in your assignment.

17. Are there any criteria for program testing?

I only want to talk about my own understanding of program testing, particularly for this course. As we all know, this course is called "distributed system design" and our ultimate goal is to design efficient distributed systems that have the highest performance with maximum reliability. And first of all they must be correct. So, let's say we have at least three criteria: correctness, efficiency, reliability.

As for correctness, you have to show that the server gives correct output in various situations, such as concurrent access by multiple client programs. For example, design a test in which a certain number of clients (threads in one program, or multiple client programs on multiple hosts) try to buy or sell stock from the same customer account. There are other scenarios too, like multiple customers logging off at the same time with their data correctly saved to file.

As for performance, you can demo how much throughput your server can achieve, i.e. how long it takes to finish a number of tasks, say 100 transactions. Or how much concurrency your server can handle, i.e. how many client programs it can serve at the same time.

The above are only suggestions and you are absolutely free to design your own test cases as long as you can justify them.
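A throughput measurement of the kind described above could be sketched like this. The class ThroughputTest is hypothetical, and the Runnable passed in stands for whatever remote call (buy, sell, ...) you want to time; here a simple counter takes its place.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThroughputTest {
    // Runs `requests` copies of `task` on `clients` threads and returns the elapsed milliseconds.
    static long timeRequests(int clients, int requests, Runnable task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) pool.submit(task);
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("requests did not finish in time");
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        // Stand-in for a remote call such as a buy or sell request.
        long ms = timeRequests(10, 100, done::incrementAndGet);
        System.out.println(done.get() + " transactions in " + ms + " ms");
    }
}
```

Varying the number of client threads while keeping the number of requests fixed gives you a simple concurrency curve for your report.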

18. How do I define packages and how do I register in RMI?

This may be one of the most common problems for an RMI beginner. Download the simplified sample code here.

a) Set up a "classpath" in the "environment variables" under the "properties" of "My Computer", say "c:\MyJava".

b) Set up a working directory for the server under your "classpath", say "c:\MyJava\HelloServer".

c) Set up a working directory for the interface under your "classpath", say "c:\MyJava\helloInterface".

d) Set up a working directory for the client under your "classpath", say "c:\MyJava\HelloClient".

e) Copy HelloServer.java and HelloImpl.java, HelloInterface.java, and HelloClient.java to the corresponding directories. (In the "zip" file they are already located in the correct directories.) Please note that the "package name" is exactly the name of its directory. This is useful because classes in the same package need no import. For example, HelloImpl.java and HelloServer.java are in the same package "HelloServer" and can access each other.

f) From the command line go to your "classpath" directory, i.e. c:\MyJava. Type "javac HelloServer\*.java" and "javac HelloClient\*.java".

g) Run the server with "java HelloServer.HelloServer".

h) Run the client with "java HelloClient.HelloClient".

Please note that the name at the end of the RMI registry URL in my code is simply the name of the directory where your "interface" is located. Any name works as long as the server and the client use the same one.

i.e. registryURL = "rmi://localhost:1099/HelloInterface";

Or, if you are lazy, you can go to your "classpath" directory from the command line and run the batch file "runHello.bat" from there.

Please also note that I defined three packages, which correspond to the three directories.

19. ... The first question is, for a "login logout"
> program, besides username and password, should we consider create a
> new user if there is not user? or we just consider there are some
> customers in the files which stored username and password?
>

1. Please read the requirements carefully and note that the "password" is not a required parameter of the "login" interface. We are dealing with distributed systems and the professor has relaxed the security requirements. So, even though it is common sense that a login needs a password, you are not required to implement one. Of course, if you feel comfortable doing it, there is no limit, and you will surely learn more from doing so.
2. So we are not using a password to check whether a user exists. Instead we use the data files to check whether the customer has already opened an account.
3. If the customer has not opened an account, i.e. you cannot find the login custno in the data file, you can ignore the request.

20.

> Second: do we need set quantity for every stock? if it is, how to set
> quantity for every stock?
>

1. Please read the OpenAccount requirement carefully. You are only required to initialize the "balance", which means every stock quantity is of course zero when a new account is opened.
2. However, I guess you might be asking about the total "available" stock in the stock market. If that is the case, I think the available quantity of each stock can be regarded as infinite.

3. As a side story to this question: you may be on the edge of committing a sort of economic crime. In many Hollywood films you will notice gangsters setting up a fake casino to cheat gamblers. (I think the film is called "stinger", not quite sure.) Similarly, suppose you act as a stock broker and ask people to deposit money to open accounts in your small broker office. There is a good chance that some customers will buy IBM stock while others are selling IBM stock. Instead of buying and selling on behalf of your customers, you simply net the buying against the selling. For example, if the total buying quantity is 1000 shares and the selling is 990, you can simply buy 1000-990 = 10 shares instead of really buying 1000 and selling 990. By doing this, you save the transaction fees and can also choose whatever prices suit you, i.e. charge buyers the highest possible price and pay sellers the lowest possible price.

So, in general the broker is not supposed to hold the stock, because the broker is not the market.

(I hope people are not annoyed by my side stories. After all, this assignment really is a real-world game, which I think should inspire a lot of enthusiasm and creativity.)

21.

It seems that nobody is asking this common question: can two RMI servers run on the same host with the same port number?

Generally speaking, two processes cannot share the same port number on one host. Here, however, the RMI port number is the port of the registry, and the registry URL is only used to pinpoint the remote object. So, if you run two servers on the same host against the same registry and with exactly the same registry name, the later one simply overwrites the earlier registry entry (when it uses "rebind"), without any exception about port conflicts.

22.

> Now I have one question: The assignment required to record information
> including "buying price" of stock, but if when customer buy a same
> kind of stock at different time, the price perhaps is different, how
> can I record this price? using average price or?
>

This is a very good question and I wonder why nobody asked it earlier. I would say the average price is a reasonable solution. As far as I know, investors mostly care about their investment cost, namely the average buying price, so that they immediately know at what price they can sell for a profit. So, please use the average price.
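The average buying price is a quantity-weighted average that can be updated on every purchase. A small sketch; the class Holding is hypothetical, and I assume that selling leaves the average price of the remaining shares unchanged, which is one common convention and not something the assignment prescribes.

```java
class Holding {
    private int quantity;
    private double averagePrice;

    // Folds a new purchase into the running quantity-weighted average.
    void buy(int qty, double price) {
        if (qty <= 0) throw new IllegalArgumentException("quantity must be positive");
        averagePrice = (averagePrice * quantity + price * qty) / (quantity + qty);
        quantity += qty;
    }

    // Assumption: selling reduces the quantity but keeps the average buying price.
    void sell(int qty) {
        if (qty <= 0 || qty > quantity) throw new IllegalArgumentException("invalid quantity");
        quantity -= qty;
        if (quantity == 0) averagePrice = 0;
    }

    int getQuantity() { return quantity; }

    double getAveragePrice() { return averagePrice; }
}
```

For example, buying 100 shares at 10 and then 50 shares at 16 gives (100 * 10 + 50 * 16) / 150 = 12 as the average buying price.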

23. In case somebody asks: if you test your server's throughput by sending a huge queue of requests, say 3000 random requests of buying, selling, opening, login, logout etc., what would you expect to see? In my experience, you might notice RMI throwing exceptions such as the socket refusing to connect. Is this normal?

This might be caused by exhausting the TCP port numbers when a huge number of connection requests is thrown into the channel. Each new connection uses an anonymous (ephemeral) port on the client side, and a closed connection keeps its port occupied for a while. If too many connections are requested within a short period, TCP cannot handle them and further connection requests are refused. I noticed that this happened when I sent more than one thousand randomly chosen requests.

24. Any questions about running multiple servers?

First, it depends on whether you want to run your servers on the same host or on multiple hosts. I suggest that you test them at least once running on different hosts simultaneously, because some problems might only be exposed then. For convenience in debugging and testing you might prefer to run them on the same host, possibly the local host of the clients. That is reasonable.

Second, how do you plan to run them on the same host or local host? The best way is to run each in a separate directory, so that each server or broker has a different registry name (the directory name in the registry URL will differ) and its own data file, with the same file name, in its own directory. This might make things easier.

Third, a common observation: there are two usual ways to make running servers easier. One is to write a "batch file" for each server that passes that server's command-line parameters. It also saves others a lot of typing when they run your servers. The other common way is to put a configuration file in each running directory, so that each server reads its parameters from its own configuration file. Either way saves a lot of description in the "readme.txt" file explaining how to run your program. I really hope you will do something like this to ease my job. Thanks in advance.
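The configuration file approach can be as small as a java.util.Properties file per server directory. A sketch with made-up keys ("broker.name", "registry.port", "data.file") and made-up defaults; the class ServerConfig is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

class ServerConfig {
    final String brokerName;
    final int registryPort;
    final String dataFile;

    // Missing keys fall back to defaults, so a config file may list only what differs.
    ServerConfig(Properties p) {
        brokerName = p.getProperty("broker.name", "Broker1");
        registryPort = Integer.parseInt(p.getProperty("registry.port", "1099"));
        dataFile = p.getProperty("data.file", "customers.dat");
    }

    // Reads key=value lines from any Reader, e.g. a FileReader on "server.properties".
    static ServerConfig load(Reader source) throws IOException {
        Properties p = new Properties();
        p.load(source);
        return new ServerConfig(p);
    }
}
```

Each server would call ServerConfig.load on the file in its own working directory at startup, so the same class files can run as several different brokers.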

25.

> ...requirement of assignment about "getQuote(stockSyms)", does this mean
> that I should get number of price of stocks not fixed? if it does,
> except vector or list, are there any other methods? For example, does
> java support changed number of parameters(not overload)?
>

Read the requirement carefully and you will notice that "stockSyms" is nothing but a series of stock symbol names separated by commas, e.g.
"Dell, IBM, Intel". So the type is basically just "String", and in Java the length of a string is variable, which means you don't have to worry about the length. (In RPC the length of a parameter usually DOES have restrictions, but in our case we don't worry about this.)

Also, for your information: in an "interface", try not to use any "fancy" types. Stick to simple types instead, i.e. if you can use "String", don't use a self-defined class. Later you will see that self-defined types may cause problems when you want to extend your interface to CORBA, which is language independent. The interface in CORBA restricts parameters to fewer than two dozen types. You can declare your own types, but it is a bit complicated. However, this is not something you have to worry about for the time being.
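On the server side, the comma-separated "stockSyms" string can be split into individual symbols in a few lines. A sketch (the helper class StockSyms is made up; how you handle unknown symbols is up to your design):

```java
import java.util.ArrayList;
import java.util.List;

class StockSyms {
    // Splits "Dell, IBM, Intel" into [Dell, IBM, Intel], ignoring blanks and empty entries.
    static List<String> parse(String stockSyms) {
        List<String> symbols = new ArrayList<>();
        for (String part : stockSyms.split(",")) {
            String symbol = part.trim();
            if (!symbol.isEmpty()) symbols.add(symbol);
        }
        return symbols;
    }
}
```

The result for several symbols can be returned the same way, as one comma-separated String of prices, which keeps the interface free of self-defined types.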

26. Can I use more than primitive types for a return value or argument in RMI, e.g. a return value that is a self-defined class?

(This is a complex issue and I warn you not to read it unless you are confident that you can finish your work early!!)

The answer is yes, and usually it is straightforward. However, I was tempted by some demon to declare the return class as extending "RemoteObject". Judging by the name, it looked pretty much like what I wanted. Only after struggling a whole morning with an "unmarshalling" exception did I recall that I had made the same mistake before. Hardly anybody else is likely to make this mistake, but I still want to share this stupid experience. Download the code here.

RemoteObject belongs to something like the "callback" mechanism in RMI: it lets you use a "server-side" object by manipulating its "client-side" image. In my case I don't need a so-called "RemoteObject", because all I need is a return value. The simplest way is therefore to declare a class that implements the Serializable interface. (In the example code you don't see any explicit serialization code. I guess that as long as the fields are primitive types you don't have to do anything more; only when a field is of another class type does that class need to be handled as well.)

For example:

public interface HelloInterface extends Remote
{
    public Struct sayHello(String name) throws java.rmi.RemoteException;
    public Struct[] sayHelloAll() throws java.rmi.RemoteException;
} //end interface

(In the file "Struct.java", try modifying the declaration of Struct by following my comments and see what happens.)

//The correct declaration of "Struct" is like this:

public class Struct implements Serializable //this will work!!

//But what I did is like this, and this is wrong!

public class Struct extends RemoteObject //This won't work!!

This problem is quite tricky, because what I read in the examples is that the "return value" class should implement the Serializable interface. "RemoteObject" implements both the "Serializable" and "Remote" interfaces, so it seems it should work. Why can't I use it when it implements more than is necessary?

***********************************************************************************************************************

The basic question is this:

The only requirement for a return value is that it implements the "Serializable" interface. One way is to implement it yourself. The other way is to inherit from a class that has already implemented it. But why does the second approach not work?

Here comes one suggestion:

>>>Serializable, like all interfaces, is not inherited I believe.

Thanks a lot for your message. At first glance I thought you were 100% correct. However, I tried it like this:
//public class Struct implements Serializable //this will work!!!!!!!!
public class Struct extends RemoteObject implements Serializable //this will not work!!!!!!!!!!

And the runtime "unmarshalling" error remains.
Then I gave your suggestion a second thought. Even if you are right that an "interface" cannot be inherited, a class implementation CAN be inherited. That is to say, my derived class "Struct" inherits whatever RemoteObject has implemented, including its implementation of Serializable. So I don't think this is the origin of the problem, even though it looks almost like a solution.
What I suspect is that "RemoteObject" is meant to function like a callback: I suspect its methods are meant to be used from the client side, and only after the object has been "exported" at the server. Anyway, this is pure imagination.
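For reference, the working variant (a plain Serializable value class) can be tried without any RMI at all. The sketch below copies an object through an in-memory object stream, which is roughly what happens to an RMI return value. The two fields of Struct are invented for illustration, since the downloadable Struct.java is not shown here, and StructRoundTrip is a made-up helper.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Plain value class; the fields are hypothetical.
class Struct implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    int count;

    Struct(String name, int count) {
        this.name = name;
        this.count = count;
    }
}

class StructRoundTrip {
    // Serializes and deserializes in memory, giving back an independent copy.
    static Struct copy(Struct original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Struct) in.readObject();
        }
    }
}
```

The client therefore receives a copy: changing a field on the client side has no effect on the server's object.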


27. How about RemoteException? Does it cost anything to use it as a kind of return value?

Has anybody taken advantage of RemoteException as a kind of return value when an error is encountered? For example, when you run into an error on the server side, you throw a RemoteException there and the client side can "catch" it. What magic! Suppose you check the data file and find that the requested "custNo" doesn't exist; then you can simply throw a RemoteException to notify the client side.

Isn't it good? Maybe yes, maybe no. When I did my assignment last term, I did use this method, and there are costs. First, the remote exception mechanism is probably complicated and expensive; from a performance perspective, it is wasteful compared with a simple return value. Second, this exception is native to RMI, which means you cannot find it in CORBA. When I tried to move to CORBA, it became too complicated and I quickly gave up.
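The alternative, a simple return value, could look like the sketch below. The interface BrokerService, its status codes and the implementation class are all hypothetical; the assignment defines its own interface. Note that a remote method must still declare RemoteException, but here it is left to the RMI runtime for communication failures.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical remote interface that reports errors through plain int codes.
interface BrokerService extends Remote {
    int OK = 0;
    int NO_SUCH_CUSTOMER = 1;

    // RemoteException stays in the signature, but only for communication failures.
    int login(int custNo) throws RemoteException;
}

class BrokerServiceImpl implements BrokerService {
    private final Set<Integer> knownCustomers = ConcurrentHashMap.newKeySet();

    void addCustomer(int custNo) { knownCustomers.add(custNo); }

    // An unknown customer is an ordinary result, not an exception.
    @Override
    public int login(int custNo) {
        return knownCustomers.contains(custNo) ? OK : NO_SUCH_CUSTOMER;
    }
}
```

Plain int codes also carry over to CORBA without change, which is the second cost mentioned above.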

28. Has anybody asked the question: are the methods of "Vector" thread-safe? For example, is vector.addElement(obj) synchronized or not?

Why do I ask? Because on "www.sun.com" I did not find a statement that the methods of class Vector are thread-safe. Can we really assume they are? I was not convinced until I ran the following experiment.

In the "VectorTest" class, I define an array of threads which all call the "addElement" method of one global vector, each writing its own thread id. If this method were not thread-safe, the data in the vector would be corrupted; at the very least, the count of each thread id would be wrong. For example, I create ten threads and each writes 100 times by calling "vector.addElement(id)". In the end I should get exactly 100 each of 0's, 1's, 2's, ..., 9's. This can only be reliably true if the method is properly synchronized. Download the Java code.
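The experiment can be sketched as follows. This is a reconstruction from the description above, not the downloadable file, so the details of the real VectorTest class may differ.

```java
import java.util.Arrays;
import java.util.Vector;

class VectorTest {
    // Each of `threads` threads appends its own id `perThread` times to one shared Vector.
    // Returns how often each id ended up in the vector.
    static int[] run(int threads, int perThread) throws InterruptedException {
        Vector<Integer> shared = new Vector<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) shared.addElement(id);
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();

        int[] counts = new int[threads];
        for (int id : shared) counts[id]++;
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Ten threads, 100 writes each: every count should be 100.
        System.out.println(Arrays.toString(run(10, 100)));
    }
}
```

Replacing Vector with an unsynchronized ArrayList in this test is a quick way to see what goes wrong without synchronization.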

For comparison, I also wrote a C/C++ demo with similar code. You can see that if the "mutex" is not added, you almost always get a run-time error, because "vector<int>" is not thread-safe. Basically, none of the STL containers is safe for concurrent writes.

In Java, programmers are lucky, as they don't even have to think about these issues. However, it is a pity that they pay for this convenience in efficiency. Download the C++ code here.