Thanks to deep parsing, the Synomia engine automatically understands the intrinsic semantic structure of the corpus
The Synomia engine builds a semantic representation of the analyzed content by linking all the extracted phrases (syntagms) through a number of linguistic relations. It does this in an entirely automatic and endogenous way, without relying on dictionaries or domain-specific ontologies.
In particular, thanks to syntactic analysis, the Synomia engine brings to the fore semantic classes of words and phrases by grouping those that share the same syntactic contexts, i.e. those that tend to complement the same verbs and the same nouns, and to be modified by the same adjectives.
For instance, from a corpus of texts about health care, the Synomia engine extracts a class made up of the nouns plan, coverage, and program, because they all complement the same verbs (buy, apply for, enroll in, choose, etc.) and the same nouns (information, cost, type, enrollment, etc.). The Synomia engine extracts tens of classes of this type.
Semantics emerge from a corpus via syntax. This is the principle of distributional analysis advocated by the American linguist Zellig S. Harris (1909-1992) in the 1950s.
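The distributional principle can be illustrated with a minimal sketch. It is not the Synomia engine's implementation: the (word, relation, governor) triples, the relation names, the Jaccard measure, and the 0.4 threshold are all assumptions chosen for illustration. Words are grouped when the sets of syntactic contexts they occur in overlap enough.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical parser output: (word, relation, governor) triples.
# Relation names and data are invented for this illustration.
TRIPLES = [
    ("plan", "object_of", "buy"),
    ("plan", "object_of", "choose"),
    ("plan", "complement_of", "cost"),
    ("coverage", "object_of", "buy"),
    ("coverage", "object_of", "choose"),
    ("coverage", "complement_of", "cost"),
    ("program", "object_of", "choose"),
    ("program", "object_of", "enroll in"),
    ("program", "complement_of", "cost"),
    ("doctor", "object_of", "see"),
    ("doctor", "object_of", "call"),
]


def contexts_by_word(triples):
    """Map each word to the set of syntactic contexts it appears in."""
    contexts = defaultdict(set)
    for word, relation, governor in triples:
        contexts[word].add((relation, governor))
    return contexts


def jaccard(a, b):
    """Share of contexts two words have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def semantic_classes(triples, threshold=0.4):
    """Group words whose context sets overlap at or above the threshold.

    The threshold is an assumed value; groups are the connected
    components of the "similar enough" relation.
    """
    contexts = contexts_by_word(triples)
    parent = {word: word for word in contexts}

    def find(word):
        # Follow parent links to the representative of the group.
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for a, b in combinations(sorted(contexts), 2):
        if jaccard(contexts[a], contexts[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for word in contexts:
        groups[find(word)].add(word)
    # Largest classes first, then alphabetical.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


classes = semantic_classes(TRIPLES)
print(classes)
```

With this toy data, plan, coverage, and program end up in one class because they share most of their verb and noun contexts, while doctor stays on its own. A real system would work from parsed text and would likely weight contexts rather than treat them all equally.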
The Synomia engine discovers semantic content from a corpus of texts without any preconceptions. Our engine performs all its analyses no matter what corpus it is analyzing, and returns a usable result. There is no need to provide domain-specific dictionaries or to customize extraction rules.
The density of the semantic network around a phrase indicates how important that phrase is in the corpus. Because no frequency filter is applied, the engine identifies not only major trends but also weak signals: low-frequency phrases that sit in sparser areas of the network yet connect to linguistically dense semantic clusters.
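The idea of weak signals can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual method: the phrase frequencies, the links between phrases, and both cutoffs (max_frequency, min_hub_degree) are hypothetical, and density is approximated simply by the number of neighbors a phrase has in the network.

```python
from collections import defaultdict

# Hypothetical data: phrase frequencies and undirected links
# (linguistic relations) between phrases.
FREQUENCY = {
    "health plan": 120,
    "coverage": 95,
    "enrollment": 80,
    "premium": 60,
    "telehealth visit": 3,
    "parking": 2,
}
LINKS = [
    ("health plan", "coverage"),
    ("health plan", "enrollment"),
    ("health plan", "premium"),
    ("coverage", "enrollment"),
    ("coverage", "premium"),
    ("telehealth visit", "health plan"),
    ("telehealth visit", "coverage"),
]


def build_neighbors(links):
    """Map each phrase to the set of phrases it is linked to."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def weak_signals(frequency, links, max_frequency=5, min_hub_degree=3):
    """Return rare phrases that are linked to densely connected phrases.

    Both cutoffs are assumed values for this sketch. Nothing is
    discarded for being rare: low frequency is what qualifies a phrase.
    """
    neighbors = build_neighbors(links)
    hubs = {p for p, linked in neighbors.items()
            if len(linked) >= min_hub_degree}
    return sorted(
        phrase
        for phrase, count in frequency.items()
        if count <= max_frequency and neighbors.get(phrase, set()) & hubs
    )


signals = weak_signals(FREQUENCY, LINKS)
print(signals)
```

In this toy network, "telehealth visit" is rare but attached to two densely connected phrases, so it is reported as a weak signal. "parking" is equally rare but isolated, so it is not.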