Simple Scanner
A.First Edition
This is the first edition of my simple scanner. There is nothing remarkable about it except that it is
built directly from DFA transition functions. I put all the transition functions in a uniform format so that
they can be stored in an array of function pointers. The return value of each function is the index of the
state it leads to, so the scanner can call the series of checking functions in a comfortable way. At least
it suits my habits.
The task is to write the simplest possible scanner that checks whether a token is a legal identifier
in a simplified, C++-like sense. That is, it should start with an alphabetic character, followed by
alphabetic characters, digits, or '-'. However, there should be at most one '-' in a row, and a '-'
must not be the end of the token.
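As a cross-check of the rule stated above, here is a minimal sketch that expresses the same set of tokens as a regular expression. It is not part of the scanner itself: the helper name matchesTokenRule and the use of std::regex are assumptions made only for illustration, and the pattern encodes the reading "a letter first, then letters or digits, each optionally preceded by a single '-'".

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of the original scanner: it checks the token
// rule with a regular expression instead of hand-written state functions.
// Pattern: one letter, then any number of letters/digits, each of which may
// be preceded by a single '-'. This forbids "--" and a trailing '-'.
bool matchesTokenRule(const std::string& token)
{
    static const std::regex pattern("[A-Za-z](-?[A-Za-z0-9])*");
    // regex_match succeeds only if the whole token matches the pattern.
    return std::regex_match(token, pattern);
}
```

Calling matchesTokenRule("a-bkdk0") should agree with what the DFA scanner reports for the same token, which makes it a convenient way to compare the two approaches on a few inputs.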
The translation of a DFA into a programming language can be one-to-one, and it should be! After
checking whether the token has ended, the scanner repeatedly calls the scan function of the current
state. If the return value of a checking function is -1, meaning the character was rejected, the
scanner returns false. When the string ends, the scanner checks whether the current state is a final
state by looking it up in the boolean array of states. Very similar to a DFA? It actually is one,
since each state transition function returns a deterministic value. This is
FUNCTION DEFINITION BY DISCRETE MATHEMATICS!
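To make the "function definition" reading concrete, the transition functions scan0, scan1 and scan2 in the listing can be written as one piecewise function. The notation is an assumption introduced here for illustration: L stands for the letters a-z and A-Z, D for the digits 0-9, and -1 marks rejection rather than a member of the state set Q.

```latex
\[
Q = \{0, 1, 2\}, \qquad q_0 = 0, \qquad F = \{1\}
\]
\[
\delta(0, c) =
\begin{cases}
1  & \text{if } c \in L \\
-1 & \text{otherwise}
\end{cases}
\]
\[
\delta(1, c) =
\begin{cases}
1  & \text{if } c \in L \cup D \\
2  & \text{if } c = \texttt{-} \\
-1 & \text{otherwise}
\end{cases}
\]
\[
\delta(2, c) =
\begin{cases}
1  & \text{if } c \in L \cup D \\
-1 & \text{otherwise}
\end{cases}
\]
```

Each case corresponds to one branch of the matching scan function, and F = {1} corresponds to the single true entry in the finalStates array.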
C.Further improvement
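#include <cstring>
#include <iomanip>
#include <iostream>
using namespace std;

int scan0(char ch);
int scan1(char ch);
int scan2(char ch);

// finalStates[i] is true when state i is an accepting state.
bool finalStates[3] = {false, true, false};
// scanArray[i] is the transition function of state i.
int (*scanArray[3])(char ch) = {scan0, scan1, scan2};

bool scanner(const char* str);

int main()
{
    char buffer[20];
    cout << "please input your token\nc:>";
    // setw limits each read to the buffer size, so long input cannot overflow it;
    // the loop also stops cleanly at end of input.
    while (cin >> setw(sizeof(buffer)) >> buffer)
    {
        if (strcmp(buffer, "exit") == 0)
        {
            break;
        }
        cout << buffer << (scanner(buffer) ? " is accepted!" : " is rejected!") << endl;
        cout << "c:>";
    }
    return 0;
}

bool scanner(const char* str)
{
    int counter = 0, result = 0;
    while (str[counter] != '\0') // scan all characters
    {
        result = scanArray[result](str[counter++]); // follow one transition
        if (result == -1) // rejected; never use -1 as an array index
        {
            return false;
        }
    }
    return finalStates[result];
}

// State 0: the first character must be a letter.
int scan0(char ch)
{
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z'))
    {
        return 1;
    }
    else
    {
        return -1;
    }
}

// State 1: letters and digits stay here; '-' moves to state 2.
int scan1(char ch)
{
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9'))
    {
        return 1;
    }
    else
    {
        if (ch == '-')
        {
            return 2;
        }
        else
        {
            return -1;
        }
    }
}

// State 2: a '-' must be followed by a letter or digit.
int scan2(char ch)
{
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9'))
    {
        return 1;
    }
    else
    {
        return -1;
    }
}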
Here is the result:
please input your token
c:>abd
abd is accepted!
c:>a-bkdk0
a-bkdk0 is accepted!
c:>0ofjj
0ofjj is rejected!
c:>afo-b9sf
afo-b9sf is accepted!
c:>1-
1- is rejected!
c:>afgddf-
afgddf- is rejected!
c:>ao9-pppo0po-o
ao9-pppo0po-o is accepted!
c:>exit
Press any key to continue