November 2008 - Posts
-
I never really expected to witness events like those we are seeing nowadays. Certainly, we are experiencing a huge impact on our conception of society, mainly driven by the adjustment of the economic system; probably an era is ending and a new one is about to be born, though I cannot possibly know that for sure. I just hope for a better one. Normally I am a pessimistic character, and more than ever we have reasons for being so. However, in this post I want to be a little less pessimistic than usual, though with prudence. I just needed to reorder my thoughts for the sake of my mental health: I have been looking at so much C++, JavaScript and similar code lately.
I am neither an economist nor a sociologist, nor anything close; needless to say, I am just a regular old IT guy. I just want to reflect here, quite informally, on the IT model, and more specifically the software model, and its role in our society in a context like the one we are currently experiencing. What is a "software model", by the way? I use the term here (by overloading) as follows: a society has state, political, economic and legal models, among others. Good or bad, recognized as such or not, I think we have also invented a software model which to an important extent orchestrates the other, historically better-known models; let us call them (more) natural models. I remember the 90s, as the new millennium approached and many conjectures were made about the Y2K problem, especially concerning legacy software systems, and its potentially devastating costs and negative effects. For instance, from here, let us take some quotes (it is worth following the related articles pointed to):
"People have been sounding the alarm about the costs of the millennium bug--the software glitch that could paralyze computers come Jan. 1, 2000--for a couple of years. Now, the hard numbers are coming in and, if the pattern holds, they point to an even larger bill than many feared just a few months ago. [...] Outspoken Y2K-watcher Edward E. Yardeni, chief economist at Deutsche Bank Securities Inc., says the numbers show that some organizations are ''just starting'' to wake up to Y2K's potential for damage--but he believes the possible impact is enormous. In fact, Yardeni puts the chances of a recession in 2000 or 2001 at 70% because of ''a glitch in the flow of information"
Suddenly, after reading that again, the old-fashioned term "software crisis" (attributed to Bauer in 1968 and popularized by Dijkstra's seminal work in the seventies), as taught in college classrooms, seemed to make more sense than ever. But romanticism aside, we in IT know that software is still quite problematic, for the same reasons as back then. It is now a matter of size. In Dijkstra's own words almost forty years ago (The Humble Programmer):
"To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem."
But those were surely different times, weren't they? Exaggerated or not (as a source of business or even religious opportunities), Y2K nevertheless gave us a serious, global warning about the extent to which software had grown into many parts of our normal society model, even at a time when the Web was apparently not yet wired into the global business model the way it has been in the last decade.
I do not know whether the final Y2K costs were as big as or even bigger than predicted, but I would guess the problem certainly gave a huge impulse to investment in the software branch. I am afraid this did not always lead to an improvement of the software model and its practices. Those were the golden times, probably ending now, when appearance was frequently valued more than content and Artificial Intelligence became a movie.
What will happen to the software model if the underlying economic model is now adjusting so dramatically, and the other, more natural models along with it? Will the shortage of capital for investment and consumption stop the evolution and development of the software model, or change it (again) for the worse?
I do believe our software model remains essentially as bad as it was in the Y2K epoch. This is a consequence of its own abstract character, living in a more and more "concrete" business world. I would further guess that it is even worse now, because software has proliferated so much and has become structurally more complicated. Maintenance and formal understanding are now harder, among other reasons because of dynamism, lack of standardization, and because external functionality has tended to be valued more than maintainability and soundness. And exactly for that reason I also believe that (no matter what exactly the new economic model is going to look like) software will have to be structurally stronger and more reliable: after economic stabilization and restart, whenever that happens, businesses will become stricter in order to prevent mere appearance from generating wealth once again.
I would expect (wish) that software for effectively manipulating software (legacy and fresher, dynamic spaghetti included) will be in greater demand than ever as a consequence. More effective testing and dynamic maintenance will be required during any adjustment phase of the new economic model. I also see standardization as a natural requirement and driver of a higher level of software quality. Platform independence will be more important than ever, I guess. No matter which agents become winners during the adjustment and after the recovery of the economic system, a better software model will be critical for each one of them, globally. Software, as an expression of substantial knowledge, will in any case be considered an asset. I think we should be seeing a less speculative and more consistent economic model, in which a better software model will be precisely the differentiator for competing with real substance and not just with fancy emptiness. The statement must be demonstrable.
Sometimes things do not turn out as bad as they appeared. Sometimes they turn out even worse, and now we absolutely do have to be prepared for such a scenario. But I also want to believe there can be an opportunity out there. I would like to think that an improvement of the software model will be an essential piece of the economic transition, and that a transition to something better in software will take place in the end. I wish for it, at least.
I recommend reading Dijkstra again, especially these days, just as an interesting historical comparison; and, honestly trying not to fool myself, I quote him referring to his vision of a better software model, as we have called it here:
"There seem to be three major conditions that must be fulfilled. The world at large must recognize the need for the change; secondly the economic need for it must be sufficiently strong; and, thirdly, the change must be technically feasible."
I think we might have the first two of them. Concerning the third one, and slowly returning to the C++ code I am looking at, I just reshape his words: I absolutely fail to see how I could keep programs firmly within my intellectual grip when this programming language escapes my intellectual control. But we have to, exactly.
|
-
As outlined in a previous post, we are building a tool for reasoning about CSS and related HTML styling tools using a logic framework (logic grammars in human-readable style are usually highly ambiguous, and consequently an interesting case study). So we need a textual language, a DSL, for the tool. I was asked to try MGrammar (simply Mg), a DSL generator that is part of the Microsoft Oslo SDK, recently presented at MS PDC.
Actually, I am unaware of the details of the Oslo project, so I was a little skeptical about the thing, also because it would entail learning yet another new language (M). But I fought against any personal preconception and decided just to take a look and try to have fun with it. I do not intend to blog here about the whole Oslo project, of course; just to tell a simple story about using Mg in its current state, no more.
In general, I very much like the idea that data specification (models), data storage, querying and the like can be separated from procedural languages in a declarative and yet more expressive way; hopefully, in my own way of expressing my model (beyond standard diagrams and graphic perspectives). Oslo and M address this general vision of model-oriented software development, and Mg addresses the expressivity part, as far as I understand the source material, of which there is not much yet.
What I have seen so far of M is indeed very interesting, and I really liked it; it extends LINQ, which I consider a very interesting proposal of MS's; it also reminded me of OCL in some ways. In fact, there exists an open source OCL library called Oslo, which has nothing to do with MS; it is just funny to mention.
Mg
I focused on trying out Mg; we do not consider M at all. We do not present a tutorial here; there is a nice one, for instance, here. If you are interested in my example and sources, please mail me and I will send them to you.
The SDK also contains a nice demo of a musical language (Song grammar) and its player in C#; moreover, M and Mg themselves are specified in Mg, and the corresponding sources are available in the SDK. Hence, you learn by reading grammars, because the documentation is not yet as rich as I would like.
Mg source files resemble, at least in principle, those of other parser generators (like ANTLR, Coco, JavaCC, YACC/Lex and relatives). I do not know the details of the parsing technique behind Mg. However, judging by what mg.exe reports, it would be LR parsing.
Mg offers really powerful features. As an example, here are some tokenizing rules:
token Digit = "0".."9";
token Digits = Digit+;
token Sign = ("+"|"-");
token Number = Sign? Digits ("." Digits)?;
Or constrain the repetition factor (at least 1, at most 10) as in:
token ANotTooLongNumber = Digit#1..10;
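For readers more used to regular expressions, the token rules above can be approximated in Python. This is only an analogy, assuming the usual meaning of `+`, `?` and bounded repetition; the helper names `is_number` and `is_not_too_long_number` are mine, and nothing here is part of Mg or the Oslo SDK.

```python
import re

# Rough regex counterparts of the Mg tokens (an analogy, not Mg semantics).
DIGITS = r"[0-9]+"                                   # token Digits = Digit+;
NUMBER = re.compile(rf"[+-]?{DIGITS}(\.{DIGITS})?")  # Sign? Digits ("." Digits)?
A_NOT_TOO_LONG_NUMBER = re.compile(r"[0-9]{1,10}")   # Digit#1..10


def is_number(text: str) -> bool:
    """True if the whole text is an optionally signed number."""
    return NUMBER.fullmatch(text) is not None


def is_not_too_long_number(text: str) -> bool:
    """True if the whole text consists of 1 to 10 digits."""
    return A_NOT_TOO_LONG_NUMBER.fullmatch(text) is not None
```

The `#1..10` bound of Mg corresponds to the `{1,10}` quantifier here.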
You may also have parameters in scanner rules, so you may write (taken from the nice tutorial here):
token QuotedText(Q) = Q (^('\n' | '\r') - Q)* Q;
token SingleQuotedText = QuotedText("'");
token DoubleQuotedText = QuotedText('"');
This handles different types of quotation depending on the parameter Q! Notice the subpattern ^('\n' | '\r') - Q: it means any character except newline or carriage return, excluding the character Q.
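The idea of a parameterized token can be imitated in Python with a function that builds a pattern from the quote character. This is a minimal sketch under my own assumptions: `quoted_text` is a hypothetical helper, and Q is assumed to be a single character.

```python
import re


def quoted_text(q: str) -> re.Pattern:
    """Pattern for text delimited by q, containing no newline, return or q.

    Assumes q is a single character, mirroring QuotedText(Q) above.
    """
    quote = re.escape(q)
    return re.compile(rf"{quote}[^\n\r{quote}]*{quote}")


SINGLE_QUOTED_TEXT = quoted_text("'")
DOUBLE_QUOTED_TEXT = quoted_text('"')
```

One definition then serves both kinds of quotation, which is what the parameter Q achieves in the grammar.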
The Oslo SDK provides an editor called Intellipad, which can be used for both M and Mg development. You need to experiment a little before getting used to it, but after a short time it turns out to be a very nice tool. A command-line compiler for Mg is also included in the SDK, so you may edit your grammar using any text editor and compile it from a shell. But Intellipad is quite powerful: it allows editing and debugging your grammar simultaneously and quickly. Intellipad shows you several panes to work with: in one you write your grammar, and in another you enter input for your grammar, which is immediately checked against your rules. A third (tree view) pane shows you the AST projected by your grammar from the input you provide.
Finally, a fourth pane shows you the error messages, which include any parsing ambiguities arising from the given input and grammar. Interestingly, only then do you notice any potential conflict. I did not see any form of static analysis in the editor for warning about such cases. Is there any?
Besides that, I actually enjoyed using it. I am not sure I appreciate the whole functionality, but it was quite easy to create my grammar. Mg is modular, as M is, so you can organize and reuse your grammar parts. That is very nice. To start defining a language, you write something like:
language LogicFormulae {
    // rules go here inside
}
To define the language you mainly use "token" and "syntax" statements, among others. In this case, the name "LogicFormulae" will be used later on when we access the corresponding parser programmatically.
Case Study
Back to our case study: our simple language contains logic formulae like "p&(p->q) <-> p&q" (called well-formed formulae, or wff), where as usual a lot of ambiguity (conflicts) might occur. The frequent result is that you have to rewrite your grammar into a usually uglier one in order to recover determinism, and so the fun is gone. What I liked about Mg is its very readable style for handling such cases. For instance, I have a rule like
syntax ComposedWff =
    precedence 1: ParenthesisWff
  | precedence 1: PossibilityWff
  | precedence 1: NecessityWff
  | precedence 1: QuantifiedWff
  | precedence 2: NotWff
  | precedence 3: AndWff
  | precedence 4: ImpliesWff
  | precedence 4: EquivWff
  | precedence 5: OrWff;
Using the keyword precedence I can "reorder" the rule to avoid ambiguity. Thus, "all x.p & q" parses as "(all x.p) & q" under this rule because of the precedence I chose.
Another example shows how to deal with associativity and operator precedence:
syntax AndWff = Wff left(4) "&" Wff;
syntax ImpliesWff = Wff right(3) "->" Wff;
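To make concrete what such left and right associativity declarations ask of a parser, here is a small hand-written precedence-climbing parser in Python. It is a sketch under my own assumptions, not how Mg works internally: the binding powers, the node labels "And" and "Implies", and all function names are hypothetical, and I assume that "&" binds tighter than "->".

```python
import re

# Hypothetical binding powers: a higher number binds tighter.
OPERATORS = {
    "&": (2, "left", "And"),
    "->": (1, "right", "Implies"),
}
TOKEN = re.compile(r"\s*(->|&|\(|\)|[a-z]\w*)")


def tokenize(text):
    """Split a formula into operators, parentheses and identifiers."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse a formula into nested [label, left, right] lists."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0, 1)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


def parse_expr(tokens, pos, min_power):
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        power, assoc, label = OPERATORS[tokens[pos]]
        if power < min_power:
            break
        # A left-associative operator needs a strictly tighter right operand;
        # a right-associative one accepts the same power again on its right.
        next_min = power + 1 if assoc == "left" else power
        right, pos = parse_expr(tokens, pos + 1, next_min)
        left = [label, left, right]
    return left, pos


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        tree, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("missing ')'")
        return tree, pos + 1
    if token in OPERATORS or token == ")":
        raise SyntaxError(f"unexpected token {token!r}")
    return token, pos + 1
```

With these assumptions, `parse("p & q & r")` groups to the left and `parse("p -> q -> r")` groups to the right. The point of the Mg declarations is that you get this behaviour without writing any of the above by hand.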
Mg automatically projects syntax into its D-Graphs using the name of the rule and the token images, so it always generates an AST without your specifying it, which is very useful. Thus, the "AndWff" rule will produce a tree with three children (including the token "&"), labeled "AndWff". You can instead specify your own constructors to be produced:
syntax AndWff = x:Wff left(4) "&" y:Wff => And[x, y];
Constructors such as "And" need not be classes; they will just be labels on nodes.
Conflicts
Conflicts were apparently solved this way, because the engine behind Intellipad no longer complained about my test cases. I do not know whether there is a way to verify the grammar using Intellipad, so I just assumed it was conflict-free. However, when I compiled it using mg.exe (with the option -reportConflicts) I got a list of warnings, as with any LR parser generator. That was disappointing. Are we seeing different parsing engines?
Using C#
The generated parser and the graphs it produces can be accessed programmatically, which was of special interest in my case, beyond working with Intellipad. Because I was not able to create a parser image directly from Intellipad, I did the following: I compiled my grammar using the mg.exe command, producing a so-called image, a binary "mgx" file (other targets are possible, I think). This mgx image can be loaded in a C# project using some libraries of the SDK. (It can also be executed using mgx.exe.) You would load the image like this:
parser = MGrammarCompiler.LoadParserFromMgx(stream, languageName);
where "stream" is a stream referring to the mgx file I produced with the mg compiler, and "languageName" is a string indicating the language I want to use for parsing ("LogicFormulae" in my case). I then parse the file indicated by the string variable "input" as follows:
rootNode = parser.ParseObject(input, ErrorReporter.Standard);
The variable rootNode (of type object, by the way!) refers to the root of the parsing graph. A special kind of object, of class GraphBuilder, gives you access to the nodes, node labels and node children, if any. Hence, using such an object you visit the graph as usual, for instance with a pattern like:
void VisitAst(GraphBuilder builder, object node) {
    // Leaves are strings; anything else is an inner node to descend into.
    foreach (object childNode in builder.GetSequenceElements(node)) {
        if (childNode is string)
            VisitValue(childNode);
        else
            VisitAst(builder, childNode);
    }
}
As you have probably noticed, the library for some reason uses object as the node type.
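The same traversal pattern can be tried without the SDK. The Python sketch below is a stand-in, not the Oslo API: I assume an inner node is a list whose first element is its label and whose remaining elements are children, and that leaves are plain strings. The function name `visit_ast` is mine.

```python
def visit_ast(node, leaves):
    """Collect the string leaves of a tree depth-first, left to right.

    Assumed shape: an inner node is [label, child, ...]; a leaf is a string.
    """
    _label, *children = node
    for child in children:
        if isinstance(child, str):
            leaves.append(child)      # plays the role of VisitValue
        else:
            visit_ast(child, leaves)  # descend into the inner node
    return leaves
```

For a tree shaped like the default "AndWff" projection, with the operator token kept as a child, calling `visit_ast(tree, [])` gives back the tokens in input order.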
Conclusions
I found Mg very powerful, and working with Intellipad for prototyping was actually fun. Well, almost always. I only have some doubts concerning use and performance. For instance, using Intellipad you are able to develop and test very fast, but there seems to be no static analysis tool inside Intellipad for warning about possible conflicts in the grammar. The other unclear thing I saw is the loading time of the engine in C#: it takes really too long before you get the parser object from the mgx file. In fact, the mg compiler seems to work too slowly, in particular compared with Intellipad. It seems as though mg.exe and Intellipad were not connected as a tool chain, or something like that. But surely it is just me, because this is just my first contact with this SDK.