Welcome to the Institute for End User Computing, Inc. — A 501(c)(3) not for profit corporation Forging the Future for End Users Like you.

Why Yet Another Language?

Programmers generally love the many languages we use to craft the code that we all depend on. It is a badge of honor to acquire skills in as many as possible, with each new language offering a resume boost.

In fact, many love programming languages so much that they write one of their own. And indeed, there are thousands of them out there.

Only a few succeed in finding a large audience. Those that do tend to be the product of a very small design team if not an individual mind. For one to really catch on, it needs a niche and a unique take on the language design problem.

Forth was Chuck Moore's answer to creating very efficient and compact low level code. APL was Ken Iverson's expression of mathematical elegance in its uniform support of higher dimensional matrices and meta-programming. The History of Programming Languages holds many such examples.

On the other side of the coin, there have been a few successful languages designed by committee, like Sun's Java. Although they tend not to rate as well in terms of sheer elegance, they often benefit from a richer array of libraries and wider deployment in the consumer sector, as in JavaScript's nearly universal presence in any User's web browser of choice.

The problem with this rich set of choices is that most languages are wedded to one or two dominant programming paradigms, each of which makes it really easy to solve some kinds of problems while making it truly beastly to tackle others. So we wind up having to build hybrid solutions combining languages, or manually converting code written in one to run in another by essentially re-implementing the core functionality of the programming language that code was originally written in.

This is possible because, in a strictly mathematical sense, all general-purpose programming languages have the same computational power, a property called Turing Equivalence.

Another interesting feature of modern languages is that many of them share a nearly identical syntax derived from the C Programming Language in which Unix was written. This minimizes the learning curve for picking up the syntax of just one more language if you know the syntax of one of its predecessors, but from a human factors perspective it is very dangerous, since the same syntactic construct can have subtly different semantics across languages, making it easy to create syntactically valid programs that crash (or worse, don't crash but do the wrong thing) at run time.
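One small illustration of the hazard described above, offered as a sketch rather than a catalogue: integer division and remainder are written with similar operators in C and Python, yet they disagree on negative operands. C99 truncates toward zero, while Python's `//` rounds toward negative infinity. The helper names `c_style_divide` and `c_style_remainder` are hypothetical and exist only to mimic the C behaviour inside Python.

```python
def c_style_divide(a, b):
    """Integer division as C99 defines it: truncate toward zero."""
    quotient = abs(a) // abs(b)
    # The result is negative exactly when the operands have different signs.
    return quotient if (a < 0) == (b < 0) else -quotient


def c_style_remainder(a, b):
    """Remainder consistent with truncating division, as in C99."""
    return a - b * c_style_divide(a, b)


# Python's own operators round toward negative infinity.
print(-7 // 2, -7 % 2)                                   # -4 1
# The same expression under C semantics gives different answers.
print(c_style_divide(-7, 2), c_style_remainder(-7, 2))   # -3 -1
```

A program ported between the two languages with the operators left alone would still be accepted, but would quietly compute different results for negative inputs.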

Even if you get things right, it is a major waste of mental energy to simultaneously wrap your brain around chunks of code in multiple languages.

We have also noticed at conferences that the need to communicate often trumps one's desire to present actual source code, because most conferees won't be familiar enough with its implementation language and coding conventions to be able to understand it. Instead the speaker will transliterate the notation into unambiguous English, explicitly describing its underlying semantics with the terms of art shared by all computer scientists across programming language discourse communities.

Thus our starting position is the proposition that such a Quasi Natural Language representation should be the first approximation of our dream programming language and we call this naturalistic expression IEUC Conversational Form.

The Nature of IEUC Conversational Form

To make it work we will have to employ more advanced parsing techniques than are commonly used, since many language notations were selected in part to simplify the parsing task facing compiler writers in days gone by. In other words, all language users for all time have been paying a price each time they create new code, just to make the parsing problem more tractable for the individual or group performing the one time task of developing the language implementation!

By using a State of the Art Parsing Expression Grammar oriented language specification, we can admit more natural linguistic constructs and, through the use of Machine Translation (which is feasible in the unambiguous and highly restricted domain of computer source code), express those constructs in alternate natural languages including an expanded End User friendly dialect employing circumlocutions to avoid forcing End Users to master any Jargon they don't want to learn about.

We also will follow the lead of the PLT Scheme community and employ the notion of Language Levels, which can be thought of as nested Russian Dolls (or matryoshka in the original Russian) in which each language level encompasses the functionality of those levels it encapsulates, starting with a very simple core and adding new concepts as they are learned. This way a student or programmer can restrict what would otherwise be a very expansive and potentially hard to reason about language to the right small language for what they are trying to accomplish at a given point.
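To make the nesting concrete, here is a minimal sketch that uses Python's own syntax tree as a stand-in for a leveled language. The level numbers and the assignment of constructs to levels in `MIN_LEVEL` are assumptions made purely for illustration, and `violations` is a hypothetical helper, not part of any existing tool.

```python
import ast

# Hypothetical assignment of constructs to levels. Each gated construct names
# the lowest level that admits it; anything not listed is allowed at level 1.
MIN_LEVEL = {
    ast.If: 2,
    ast.FunctionDef: 2,
    ast.While: 3,
    ast.For: 3,
    ast.ClassDef: 4,
}


def violations(source, level):
    """List (line, construct, level needed) for constructs beyond `level`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        needed = MIN_LEVEL.get(type(node), 1)
        if needed > level:
            found.append((node.lineno, type(node).__name__, needed))
    return sorted(found)


program = "x = 1\nwhile x < 3:\n    x = x + 1\n"
print(violations(program, 2))   # [(2, 'While', 3)]
print(violations(program, 3))   # []
```

Because every level admits everything the lower levels admit, a program accepted at one level is accepted at all higher ones, which is the Russian Doll property.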

We further posit that the Language Level approach can be naturally extended to encompass multiple paradigms (although one can nest paradigms in different orders, breaking the Russian Doll analogy, which does work within a paradigm and to a lesser degree across paradigms depending on one's ontology of programming languages) represented by a set of readily comprehensible Kernel Languages in the Van Roy and Haridi - Mozart/Oz sense.

In either approach entire classes of encoding and logical errors which could be made if one accidentally employed features one didn't yet fully understand are foreclosed.

Of course the foregoing approach addresses the challenge of getting one's fine detail right, but for coding in the large we need to look to Knuth's Literate Programming methodology and Squeak's interactivity, with a little Expert System goodness along the lines of The Programmer's Apprentice thrown in as a binding agent.

Our ideas here are that one should be able to progressively sketch up a solution, referencing tasks by name and deferring their implementation for later elaboration, moving around independently of the structural constraints imposed by the needs of a language's implementation. With literate programming, if a language requires all variable names to be listed at the start of the program, you could introduce them anywhere in your code and let the system gather them up in the right place for you. As with Smalltalk & Squeak, you should be able to inspect the program structure as it is successively refined, so you don't have to keep it all in your head at once. Moreover, as with The Programmer's Apprentice, the system should provide expert assistance with these tasks by offering prompts and feedback based on your high level expression of the design patterns around which you are structuring your work. This functionality should be accessible through the same sort of Quasi Natural Language that the code itself is written in. Note that all of this can be accomplished textually without any dependencies on assigning semantics to whitespace as in Python, thus promoting maximum accessibility at the core language and read-evaluate-print-loop level.
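The "reference a task by name, elaborate it later" idea can be sketched with a tiny tangling step in the spirit of literate programming tools. The `<<name>>` reference syntax, the chunk names, and the `tangle` function are all assumptions for illustration. A referenced chunk that has not been written yet is emitted as a placeholder comment rather than treated as an error.

```python
import re

# A line consisting only of <<chunk name>> refers to another named chunk.
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(chunks, name):
    """Expand chunk `name` into output lines, following references."""
    if name not in chunks:
        # Deferred task: mentioned by name but not yet elaborated.
        return ["# TODO: " + name]
    lines = []
    for line in chunks[name].splitlines():
        match = CHUNK_REF.match(line)
        if match:
            indent, ref = match.groups()
            lines.extend(indent + sub for sub in tangle(chunks, ref))
        else:
            lines.append(line)
    return lines


chunks = {
    "program": "<<declarations>>\n<<report top score>>",
    "declarations": "scores = [3, 9, 4]",
}
print("\n".join(tangle(chunks, "program")))
# scores = [3, 9, 4]
# # TODO: report top score
```

The chunks can be written in whatever order suits the author, and the system assembles them in the order the target language needs. This sketch does not guard against chunks that refer to each other in a cycle.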

Beyond the base language, we envision a much richer graphical development environment which we will eventually discuss elsewhere.

Development Methodology

IEUC Conversational Form will begin as a pseudo-code notation without an implementation. We really want to get a bi-directional English to Code mapping in place that will make it easier for programmers and end users to communicate with each other and with their machines.

To that end, we are taking an inductive language design approach, starting with a wide sampling of actual code and roughly transliterating it into English using terms of art.

So the C code:

int *(*topScore)();

becomes

a pointer named topScore to a function returning a pointer to an integer

or in a more verbose form sans jargon

A location in memory named topScore that holds a function (i.e. code that doesn't touch anything else in memory) which returns the location in memory of a whole number.
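The two renderings above suggest that one underlying description of the type can be read out through different vocabularies. The sketch below assumes the declaration has already been parsed into a nested tuple. The `TERSE` and `PLAIN` tables and the `describe` function are hypothetical, and the jargon-free wording in `PLAIN` is this sketch's own rather than the exact phrasing quoted above.

```python
# Each vocabulary maps a type constructor to (noun phrase, linking phrase).
TERSE = {
    "pointer": ("a pointer", "to"),
    "function": ("a function", "returning"),
    "int": ("an integer", None),
}
PLAIN = {
    "pointer": ("a location in memory", "that holds the location of"),
    "function": ("a function", "which returns"),
    "int": ("a whole number", None),
}

# int *(*topScore)();  written from the outside in as nested tuples
TOP_SCORE = ("pointer", ("function", ("pointer", ("int",))))


def describe(type_, vocab, name=None):
    """Render a nested type description as an English phrase."""
    kind, *inner = type_
    noun, link = vocab[kind]
    phrase = noun if name is None else f"{noun} named {name}"
    if inner:
        phrase += f" {link} {describe(inner[0], vocab)}"
    return phrase


print(describe(TOP_SCORE, TERSE, "topScore"))
# a pointer named topScore to a function returning a pointer to an integer
print(describe(TOP_SCORE, PLAIN, "topScore"))
```

Swapping the vocabulary table changes the register of the output without touching the structure being described.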

In the other direction, we are looking at things we want to say in English and abstracting out of them suitable computational semantics.

So:

The account balance must not become negative.

would have the effect of telling the system

To treat each attempted update of the "account balance" variable as a transaction at the end of which the system is to test the balance to determine if it is negative in which case it should roll back the transaction and throw an error.
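A minimal sketch of those semantics in Python follows, assuming the balance lives on an object so that every update can be intercepted. The `Account` class and the `InvariantViolation` exception are hypothetical names introduced for illustration.

```python
class InvariantViolation(Exception):
    """Raised when an update would break a declared constraint."""


class Account:
    def __init__(self, balance=0):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, new_value):
        previous = self._balance      # remember the state before the update
        self._balance = new_value     # tentatively apply the update
        if self._balance < 0:         # test the constraint afterwards
            self._balance = previous  # roll back
            raise InvariantViolation("The account balance must not become negative.")


account = Account(100)
account.balance -= 30          # fine: the balance is now 70
try:
    account.balance -= 500     # would go negative, so it is rolled back
except InvariantViolation as error:
    print(error)
print(account.balance)         # 70
```

The calling code never checks the balance itself. The constraint is stated once and enforced on every attempted update.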

Eventually, we will develop a language implementation that can effect such semantics directly. But during our design phase, it will be much more productive to express them as source code in one of today's popular scripting languages.

Thus the early prototypes of IEUC Conversational Form will be language pre-processors generating code augmented with calls to a runtime library, all of which will be written on top of, and leverage, languages widely available in the real world today. Specifically, we are currently leaning toward Python (possibly its Jython Java-based implementation), although JavaScript, Scheme, and Haskell are also in the running.
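As a rough sketch of what such a pre-processor might look like at its very smallest, the function below recognizes one stylized sentence pattern and emits a line of Python that calls into a runtime library. Both the sentence pattern and the `runtime.forbid_negative` call are assumptions. No such library exists yet, and a real pre-processor would use a proper grammar rather than a single regular expression.

```python
import re

# One stylized sentence shape: "The <name> must not become negative."
RULE = re.compile(r"^The (?P<name>[a-z ]+) must not become negative\.$")


def preprocess(sentence):
    """Translate a recognized constraint sentence into generated source code."""
    match = RULE.match(sentence)
    if match is None:
        # Refuse rather than guess at the meaning of unrecognized input.
        raise ValueError("unrecognized sentence: " + sentence)
    variable = match.group("name").replace(" ", "_")
    # `runtime.forbid_negative` is a hypothetical runtime library entry point.
    return f'runtime.forbid_negative("{variable}")'


print(preprocess("The account balance must not become negative."))
# runtime.forbid_negative("account_balance")
```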

Programming Paradigms

There are several fundamental approaches to how one can write a program.

Among these are:

  • Imperative Programming - in which you think in terms of changing variables and executing loops over data structures.
  • Declarative or Logic Programming - in which you describe logical relationships and rules that define what properties a correct solution to your problem must exhibit and then let the programming language generate and test possibilities until it finds the solution.
  • Functional Programming - in which you think primarily in terms of mathematical functions that don't change any variables outside their scope and use recursion (functions that call themselves to solve simplified versions of the problem they have been tasked with) and higher order functions (functions that take functions as inputs and possibly return new functions as their outputs).
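The three styles can be placed side by side in a single language. The sketch below uses Python, which supports the first and third directly and can only imitate the second with an explicit generate-and-test search. The function names and the small puzzle are invented for illustration.

```python
from itertools import product


def sum_even_squares_imperative(numbers):
    """Imperative: loop over the data and update a variable."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


def sum_even_squares_functional(numbers):
    """Functional: compose higher order functions, no variables updated."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    return sum(map(lambda n: n * n, evens))


def find_pairs(total, prod, limit=20):
    """Declarative flavor: state what a solution must satisfy, then
    generate candidates and test them against that description."""
    return [(x, y) for x, y in product(range(limit), repeat=2)
            if x <= y and x + y == total and x * y == prod]


print(sum_even_squares_imperative([1, 2, 3, 4]))   # 20
print(sum_even_squares_functional([1, 2, 3, 4]))   # 20
print(find_pairs(10, 21))                          # [(3, 7)]
```

In a logic programming language the search in `find_pairs` would be supplied by the language itself, so only the conditions would need to be written.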

In most cases, one of these approaches will make most of what you are trying to do very easy while making the rest very hard to accomplish. In a multiparadigm language, you can easily combine more than one approach with direct language support for doing so; otherwise you would often have to essentially implement a second language inside your language of choice, which is decidedly sub-optimal.

Languages like Leda avoid this problem.

Quasi Natural Language

Early Human-Computer Interaction research largely dismissed Natural Language User Interfaces because they don't provide any affordances to help End Users know what subset of the infinite number of possible sentences a program could understand.

It was correctly observed that presenting a user with a naked command line offered no more guidance than dropping someone into a Text Adventure Game. But unlike a game, where the difficult challenge of trying to inductively solve the puzzle of how the interface works is seen as a good thing, when one is trying to accomplish a task with possibly irreversible consequences it is not.

The problem with raw natural language is that of pervasive ambiguity that humans resolve through contextual knowledge. In a command and control context, this can be particularly problematic.

At the opposite end of the spectrum is the conventional Programming Language with an unambiguous grammar, fixed semantics, and a notation optimized for making it easy to write a computer program to parse its syntax.

The problem with conventional programming languages is that they were often designed to minimize their learning curve for skilled programmers who had already internalized the idiosyncrasies of earlier languages. Worse, since there are many new languages based on earlier notational systems but with subtly different semantics, working in a multi-language environment is fraught with the danger of mis-encoding a program as a syntactically valid program that does something different. This is a function of the lack of redundancy in artificial notations, a weakness not found in Natural Language, where referents are often identified by extended phrases that could never be substituted by a one letter typo.

Thus at conferences we often see a researcher put up a slide of some programming language source code and then transliterate it into its unambiguous English equivalent drawing on shared programming language independent concepts that don't depend on any particular terse symbolic notation. So '(car users)' becomes "the first item in the user list".

We call this approach Quasi Natural Language. It is the use of unambiguous stylized natural language constructs to express corresponding programmatic concepts that can be directly manipulated or executed.
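To show what "directly executed" could mean, here is a deliberately tiny evaluator for three stylized phrases. The phrase patterns, the `evaluate` function, and the sample data are assumptions. The actual notation is still being designed, and a real implementation would rest on a full grammar rather than a few regular expressions.

```python
import re

# Each stylized phrase pairs a fixed English pattern with an operation.
PATTERNS = [
    (re.compile(r"the first item in (.+)"), lambda seq: seq[0]),
    (re.compile(r"the rest of (.+)"), lambda seq: seq[1:]),
    (re.compile(r"the number of items in (.+)"), len),
]


def evaluate(phrase, env):
    """Evaluate a stylized phrase against named values in `env`."""
    for pattern, operation in PATTERNS:
        match = pattern.fullmatch(phrase)
        if match:
            return operation(evaluate(match.group(1), env))
    # Anything else must be a known name; unknown phrases raise KeyError
    # instead of being guessed at.
    return env[phrase]


env = {"the user list": ["ada", "grace", "alan"]}
print(evaluate("the first item in the user list", env))                 # ada
print(evaluate("the first item in the rest of the user list", env))     # grace
print(evaluate("the number of items in the user list", env))            # 3
```

Each accepted phrase has exactly one reading, which is what separates this stylized subset from raw natural language.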

The learning curve of such a notation can be dramatically reduced through the use of an Inverse Parser to provide visual affordances when constructing new expressions.

Advanced Parsing

Most of today's language-based user interfaces are designed around a set of parsing formalisms that grew out of early work on natural language processing, in which multiple parse trees could be derived from a source text. When used for programming languages, these tools don't always map well onto how the programming language designer thinks. In the 1970s a formalism now known as the Parsing Expression Grammar was suggested, and it now appears to have an efficient implementation.
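The distinguishing feature of a Parsing Expression Grammar is ordered choice: alternatives are tried in sequence and the first one that succeeds is taken, so a given input has at most one parse. The sketch below builds the core operators as plain Python functions. Each parser takes a string and a position and returns the new position, or `None` on failure. The function names are this sketch's own, and it is a recognizer only, building no parse tree and doing none of the memoization an efficient implementation would use.

```python
def literal(text):
    def parse(s, i):
        return i + len(text) if s.startswith(text, i) else None
    return parse


def sequence(*parsers):
    def parse(s, i):
        for parser in parsers:
            i = parser(s, i)
            if i is None:
                return None
        return i
    return parse


def choice(*parsers):
    def parse(s, i):
        for parser in parsers:
            j = parser(s, i)
            if j is not None:
                return j          # ordered choice: the first success wins
        return None
    return parse


def zero_or_more(parser):
    def parse(s, i):
        while True:
            j = parser(s, i)
            if j is None or j == i:   # stop on failure or on no progress
                return i
            i = j
    return parse


# Sum <- Digit ('+' Digit)*
digit = choice(*(literal(d) for d in "0123456789"))
sum_expr = sequence(digit, zero_or_more(sequence(literal("+"), digit)))


def matches(parser, s):
    return parser(s, 0) == len(s)


print(matches(sum_expr, "1+2+3"))                     # True
print(matches(sum_expr, "1+"))                        # False
print(choice(literal("a"), literal("ab"))("ab", 0))   # 1
```

The last line shows the consequence of ordered choice. The first alternative matches one character, so the second alternative is never tried, and the grammar author controls the outcome by the order in which alternatives are written.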

The PEG grammar formalism promises to support a new generation of more elegant programming languages which may in turn lead to more secure and less buggy software that End Users might be able to modify and customize on their own. These documents describe this exciting new class of parsers.