An Essential and Indispensable Component for Social Networking

May 30, 2011

Linguistic Agents now possesses a unique technological capability that allows it to address the natural-language-processing needs of the rapidly expanding social and mobile markets.

Our linguistic technology produces higher resolution output and has significant advantages over statistical NLP.

Some companies in the field appear to be using a statistical approach to NLP. In a recent lecture, Chomsky blasts statistical NLP, calling it “a total failure, a radical failure.” He adds that these methods don’t work, and yet they are “regarded as a success”.

link to Chomsky’s lecture (video)
(beginning at minute 34)

Linguistic Agents is concentrating on computerizing Chomskian linguistics.

Once available, linguistic support is likely to become an essential and indispensable component for social networking.

Monosyllabic Roots in Hebrew

October 17, 2010

New article “Monosyllabic Roots in Hebrew” at Lingbuzz

Syntactic Graph

October 15, 2010

What is a Syntactic Graph? It is the concept that, in the Minimalist Framework, replaces the “Parsing Tree” of earlier research. Although the term “Syntactic Graph” itself was not used, the concept was introduced by Chomsky as part of a radical simplification of syntactic theory.
When you build a syntactic object by recursively “merging” and “re-merging”, what you get is, mathematically speaking, a Graph.

Start with a list of Vertices (the “Numeration”). Then gradually add directed Edges, removing from the Numeration any Vertex that an Edge points to. Every new Edge must run from a Source in the Numeration to a Goal that is either in the Numeration or accessible from the Source. This paragraph presents, in terms of Graph Theory, the entire algorithm for building Syntactic Objects in the Minimalist Framework.

Adding an Edge that points to a Vertex still in the Numeration is called “Merge”; adding one that points to a Vertex no longer in the Numeration is called “Re-Merge”. The existing literature on the Syntactic Graph relies heavily on the terminology of theoretical linguistics.
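
The edge-adding procedure described above can be sketched in a few lines of Python. This is only an illustration of the graph-theoretic description given here, not Linguistic Agents' implementation. The class name `SyntacticGraph`, the method names, and the three-word example are all hypothetical, and "accessible from Source" is assumed to mean "reachable from Source by following directed Edges". Rejecting an Edge from a Vertex to itself is a further assumption that the description above does not state.

```python
class SyntacticGraph:
    """Toy model of the edge-adding procedure (all names are hypothetical)."""

    def __init__(self, numeration):
        # The Numeration starts out holding every Vertex.
        self.numeration = set(numeration)
        # Outgoing directed Edges, keyed by their Source Vertex.
        self.edges = {vertex: [] for vertex in numeration}

    def accessible(self, source):
        """Return every Vertex reachable from `source` along directed Edges.

        Assumption: this is what "accessible from Source" means.
        """
        seen = set()
        stack = list(self.edges[source])
        while stack:
            vertex = stack.pop()
            if vertex not in seen:
                seen.add(vertex)
                stack.extend(self.edges[vertex])
        return seen

    def add_edge(self, source, goal):
        """Add the Edge source -> goal; return "merge" or "re-merge"."""
        if source not in self.numeration:
            raise ValueError("the Source must still be in the Numeration")
        if source == goal:
            # Assumption: a Vertex may not point to itself.
            raise ValueError("the Source and the Goal must differ")
        if goal in self.numeration:
            kind = "merge"
        elif goal in self.accessible(source):
            kind = "re-merge"
        else:
            raise ValueError("the Goal is neither in the Numeration "
                             "nor accessible from the Source")
        self.edges[source].append(goal)
        # Any Vertex pointed to by an Edge leaves the Numeration.
        self.numeration.discard(goal)
        return kind


# Toy usage; the vertex labels are illustrative, not a linguistic analysis.
graph = SyntacticGraph(["T", "left", "John"])
steps = [
    graph.add_edge("left", "John"),  # Goal is still in the Numeration
    graph.add_edge("T", "left"),     # Goal is still in the Numeration
    graph.add_edge("T", "John"),     # Goal is gone, but reachable from "T"
]
print(steps, graph.numeration)
```

In this sketch the procedure stops naturally when a single Vertex remains in the Numeration, since that Vertex is then the only possible Source and everything else is reachable from it.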

The Syntactic Graph feeds both (so-called) “Interfaces”: (a) phonological Spell-Out, (b) the “Conceptual-Intentional” system (formerly Logical Form).

In the Minimalist Framework, each word is usually represented (in the Syntactic Graph) by a Vertex or two, with a somewhat complex internal structure. Nanosyntax is a further development of the Minimalist Framework at a finer-grained resolution, with the Syntactic Graph handling word-internal structure.

Syntactic Graph is the natural language for describing the meaning of natural language.

Representing text in computers by alphabetic characters is archaic technology – it already was ancient in Antiquity… The transition to using Syntactic Graph simply makes sense – scientifically valid representation of text and its meaning.

Increased Interest in the Semantic Web

August 9, 2010

The Semantic Web has been receiving more and more attention recently, as can be seen from the fact that Google bought Metaweb, the company that developed the social semantic database. It is less well known that Microsoft has recently licensed advanced linguistic technology from Cognition. As a result, Microsoft now has access to the technology of both Powerset and Cognition.

When Microsoft bought Powerset they did not integrate it into their main search product Bing, so one might have concluded that Microsoft was not really that interested in utilizing Natural Language technology, but simply wanted to make use of the brain power of Powerset’s team of computer scientists. But now that Microsoft has also licensed the scientific linguistic parsing from Cognition, it seems clear that they do in fact have plans to utilize Natural Language technology.

At this point, out of the three companies which concentrate on the computerization of the most recent theoretical linguistics, two (Powerset and Cognition) have been grabbed up by Microsoft. This leaves Linguistic Agents’ technology as the only available alternative on the market.

“Scrambling and phrase structure in synchronic and diachronic perspective” (a new PhD dissertation)

January 17, 2010

A new PhD dissertation by Joel C. Wallenberg at University of Pennsylvania.

Title: “Antisymmetry and the Conservation of c-command: Scrambling and Phrase Structure in Synchronic and Diachronic Perspective.”

Linguistic Web Initiative

January 10, 2010

Language Faculty

The Language Faculty is the unique natural human ability to process complex syntactic structures. Together with the “conceptual-intentional” and “sensorimotor” subsystems, it facilitates the human abilities of understanding and speech.

Language Faculty is a subject of study by theoretical linguists.

Theoretical Linguistics

After many years of intensive research in the field of theoretical linguistics, there has been significant progress in the deciphering of the general properties of human language. The recent expansion of research beyond English and other familiar European languages has enabled the refinement and verification of the central discoveries of theoretical linguistics.

Software startups compete in how many decades of fruitful research in theoretical linguistics they choose to ignore. A typical startup skips four or five. One of the leading linguistic companies is way ahead, using a linguistic parser that implements the state-of-the-art linguistic framework from the late eighties…

Meaning Description Language

Relying on outdated theoretical frameworks is not the only problem of today’s linguistic technologies.  Another problem is the fact that the output produced is a complex and model-dependent parse tree. This does not provide application developers with any significant advantage in comparison to dealing with plain text directly.

The progress achieved by theoretical linguistics in the study of the human Language Faculty has opened the possibility of expressing at least part of the meaning of natural language phrases in a way that is precise and scientifically sound.

The early version of this idea was awarded the 2007 Horizon Award by Computerworld.

Linguistic Web

Standardization of the Meaning Description Language will enable the clear division of labor between providers of linguistic analysis technology on the one side, and the application developers on the other.  While Linguistic intelligence companies will provide the service of translating the natural language to the standard formal meaning description language, the language oriented applications will put these results to practical use.  This, in turn, makes the Linguistic Web possible.

The Scientific Infrastructure for the Linguistic Web

January 5, 2010

There are two major pre-requisites for the emergence of the Linguistic Web:

1) A solid Linguistic Parser

2) An extensive Lexical Semantic Database

Not only do these two elements need to exist; they must also be generally available to developers worldwide. We will now examine the current status of each of these two crucial components of the Linguistic Web.

Linguistic Parser

Just about everybody has heard about the big buzz generated by Powerset’s acquisition by Microsoft. This is just an example of the magnitude of effort needed for the development of realistic, industrial strength linguistic software.  

Scientific linguistic technology necessitates a very long period of development and significant financial investment.  Powerset’s collection of natural language technologies incorporates over 25 years of intense scientific research, which originated at the PARC (Palo Alto Research Center). 

After all this effort invested in research and development, it is not clear what Microsoft’s policy will be with regard to making its sophisticated linguistic platform generally available.

And yet, there are other players on the block with advanced linguistic parsers, who just may go ahead and make available the scientific technology which can be used in the Linguistic Web for the massive production of language oriented applications.

Lexical Semantic Database

To meet the requirements of the Linguistic Web, any Semantic Ontology must be constructed in terms of natural semantic concepts used by Language Faculty (FL), the basis for the inborn human ability to process language.

What is needed is something such as the “Semantic Map” developed by Cognition Technologies.  It took more than 20 years to build, and is probably the largest scientific linguistic database for English in the industry.

Is there a comparable Lexical Semantic Database available for everyone? Not at the moment. Perhaps what is needed is a Wikipedia-type collective effort in order to build a global Lexical Semantic Database. Obviously, this effort must be in sync with the accumulated insights of the last 60 years of intense research in theoretical linguistics. 


Any way you look at it, an infrastructure which contains both a Linguistic Parser and a Lexical Semantic Database will be needed in order to jumpstart the Linguistic Web.

Imagine the economic impact of all these various natural language solutions, in all the major languages, being developed worldwide, all using the same underlying linguistic platform. Of course, a new standard format for the representation of Natural Language Objects will also be necessary, but this is a subject for another posting.

Why Venture Capital Needs to Pay Close Attention to the Linguistic Web

December 9, 2009

It’s not about investing in one company or another. It’s not about investing in Natural Language Processing startups. As soon as the Linguistic Web comes into existence, all software businesses will have to adjust. Therefore, the Venture Capital industry should be fully aware of this highly anticipated development, which is expected in the immediate future. The success of every new Internet startup may depend on its ability to integrate with the nascent Linguistic Web.

The moment one of the serious linguistic technologies, such as Powerset, became freely available (for example, by being included in a browser), the reality of the web would be instantly changed. Linguistic Parsers belong in a browser, not in proprietary data-mining services.

Alternatively, as long as all the powerful linguistic technologies are kept proprietary, any startup that wants to develop natural language applications has to start from the very beginning and develop their own language processing system. Presently available public domain systems are simply not strong enough for practical applications. Building a Natural Language platform is clearly way too big a challenge for the typical application oriented startup company. 

How likely is it that one of the linguistic platform developers will make their advanced technologies freely available for everyone to use?   Not very.  Likewise, for some reason, the browser companies apparently are not rushing into including linguistic technology.  Does this mean that there is nothing to be done?  Not at all. 

There are a few companies concentrating on the building of sophisticated linguistic systems, and they may decide to make the technology available in one of the formats such as JavaScript, Silverlight, Java Applet, or Flash.

Of course, any serious linguistic platform must have a solid, scientific basis. We’re not talking here about linguistically amateur experiments by developers who typically are not familiar with contemporary theoretical linguistics beyond the level achieved, at best, in the early seventies. Any project that does not implement the most recent scientific advances in theoretical linguistics would simply be totally inadequate. This would be comparable to trying to reach the moon by practicing your high-jump skills. You may be the most awesome athlete with the best trainers, motivation, and steroids, but ya ain’t gonna make it to the moon.

Since the necessary technology exists, the big switch to Natural Language Interface is just a matter of choice.

Israel: Leader of Business Innovation –

November 8, 2009



Dan Senor, co-author of ‘Start-up Nation: The Story of Israel’s Economic Miracle,’ discusses with CNBC how Israel has managed to become a leader in business innovation.

“Event Structure and the Encoding of Arguments”

October 30, 2009

As a result of sixty years of trying to define the right theory of syntax, the scientific understanding of language has reached the level where practical applications are becoming possible.

One of the remaining obstacles to effective computerization of syntactic theory is the fact that the relevant literature is not easily accessible to computer scientists.

One important exception is the MIT PhD thesis “Event structure and the encoding of arguments: the syntax of the Mandarin and English verb phrase” by Jimmy Lin, written under the joint supervision of leading linguistic theorists and leading computer scientists, something that, it seems, has never happened before or since.

This dissertation is a unique source that may give some understanding of present-day syntactic theory without presupposing the reader’s familiarity with ongoing linguistic research and its special terminology.