Pubblications - Formal languages and automata

Journal papers

S. Crespi Reghizzi, V. Lonati, D. Mandrioli, M. Pradella, Toward a theory of input-driven locally parsable languages, Theoretical Computer Science, 658: 105-121 (2017)
V. Lonati, D. Mandrioli, F. Panella, M. Pradella.
Operator Precedence Languages: Their Automata-Theoretic and Logic Characterization,
SIAM Journal on Computing (SICOMP), 2015, Vol. 44, No. 4, pp. 1026-1088.
Preprint (pdf)
V. Lonati, M. Pradella.
Strategies to scan pictures with automata based on Wang tiles.
R.A.I.R.O. Theoretical Informatics and Applications, Vol 45(1): 163-180, 2011.

Preprint (pdf)

Abstract: Wang automata are devices for picture language recognition recently introduced by us, which characterize the class REC of recognizable picture languages. Thus, Wang automata are equivalent to tiling systems or online tessellation acceptors, and are based like Wang systems on labeled Wang tiles. The present work focus on scanning strategies, to prove that the ones Wang automata are based on are those following four kinds of movements: boustrophedonic, ``L-like'', ``U-like'', and spirals.
V. Lonati, M. Pradella.
Deterministic recognizability of picture languages with Wang automata.
Discrete Mathematics and Theoretical Computer Science, Vol 12(4): 73-94, 2010.
The paper includes the results presented at MFCS 2009 and SOFSEM 2010.

Preprint (pdf)

Abstract: We present a model of automaton for picture language recognition, called Wang automaton, which is based on labeled Wang tiles. Wang automata combine features of both online tessellation acceptors and 4-ways automata: as in online tessellation acceptors, computation assigns states to each picture position; as in 4-way automata, the input head visits the picture moving from one pixel to an adjacent one, according to some scanning strategy. Wang automata recognize the class REC, i.e. they are equivalent to tiling systems or online tessellation acceptors, and hence strictly more powerful than 4-way automata.
We also introduce a natural notion of determinism for Wang automata, and study the resulting class, extending the more traditional approach of diagonal-based determinism, used e.g. by deterministic tiling systems. In particular, we prove that the concept of row (or column) ambiguity defines the class of languages recognized by Wang automata directed by boustrophedonic scanning strategies.
A. Bertoni, M. Goldwurm, and V. Lonati.
The Complexity of Unary Tiling Recognizable Picture Languages: Nondeterministic and Unambiguous Cases.
Fundamenta Informaticae, Vol. 91(2): 231-249, 2009.
The paper includes the results presented at STACS 2007.

Preprint (pdf)

Abstract: In this paper we consider the classes REC1 and UREC1 of unary picture languages that are tiling recognizable and unambiguously tiling recognizable, respectively. By representing unary pictures by quasi-unary strings we characterize REC1 (resp. UREC1) as the class of quasi-unary languages recognized by nondeterministic (resp. unambiguous) linearly space-bounded one-tape Turing machines with constraint on the number of head reversals. We apply such a characterization in two directions. First we prove that the binary string languages encoding tiling recognizable unary square languages lies between NTime(2^n) and NTime(4^n); by separation results, this implies there exists a non-tiling recognizable unary square language whose binary representation is a language in NTime(4^n \log n). In the other direction, by means of results on picture languages, we are able to compare the power of deterministic, unambiguous and nondeterministic one-tape Turing machines that are linearly space-bounded and have constraint on the number of head reversals.
P. Boldi, V. Lonati, R. Radicioni, M. Santini,
The Number of Convex Permutominoes.
Information and Computation, Vol. 206(9-10): 1074-1083, 2008.
The paper includes the results presented at LATA 2007.

Preprint (pdf)

Abstract: Permutominoes are polyominoes defined by suitable pairs of permutations. In this paper we provide a formula to count the number of convex permutominoes of given perimeter. To this aim we define the transform of a generic pair of permutations, we characterize the transform of any pair defining a convex permutomino, and we solve the counting problem in the transformed space.
P. Boldi, V. Lonati, M. Santini, S. Vigna.
Graph fibrations, graph isomorphism, and PageRank.
R.A.I.R.O. Theoretical Informatics and Applications, Vol. 40: 227-253, 2006.

Preprint (pdf)

Abstract: PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local isomorphism of in-neighbourhoods, much in the same way a covering projection is a local isomorphism of neighbourhoods. We show that a deep connection relates fibrations and Markov chains with restart, a particular kind of Markov chains that include the PageRank one as a special case. This fact provides constraints on the values that PageRank can assume. Using our results, we show that a recently defined class of graphs that admit a polynomial-time isomorphism algorithm based on the computation of PageRank is really a subclass of fibration-prime graphs, which possess simple, entirely discrete polynomial-time isomorphism algorithms based on classical techniques for graph isomorphism. We discuss efficiency issues in the implementation of such algorithms for the particular case of web graphs, in which O(n) space occupancy (where n is the number of nodes) may be acceptable, but O(m) is not (where m is the number of arcs).
M. Goldwurm, and V. Lonati.
Pattern statistics and Vandermonde matrices.
Theoretical Computer Science, Vol. 356: 153-169, 2006.
The paper includes the results presented at STACS 2005.

Preprint (pdf)

Abstract: In this paper we determine some limit distributions of pattern statistics in rational stochastic models. We present a general approach to analyze these statistics in rational models having an arbitrary number of strongly connected components. We explicitly establish the limit distributions in most significant cases; they are characterized by a family of unimodal density functions defined by means of confluent Vandermonde matrices.
A. Bertoni, C. Choffrut, M. Goldwurm, and V. Lonati.
Local limit properties for pattern statistics and rational models.
Theory of Computing Systems, Vol. 39: 209-235, 2006.
The paper includes the results presented at Stacs 2004 and Dlt 2004.

Preprint (pdf)

Abstract: Motivated by problems of pattern statistics, we study the limit distribution of the random variable counting the number of occurrences of the symbol a in a word of length n chosen at random in {a,b}*, according to a probability distribution defined via a rational formal series s with positive real coefficients. Our main result is a local limit theorem of Gaussian type for these statistics under the hypothesis that s is a power of a primitive series. This result is obtained by showing a general criterion for (Gaussian) local limi t laws of sequences of integer random variables. To prove our result we also introduce and analyze a notion of symbol-periodicity for irreducible matrices, whose entries are polynomials over positive semirings; the properties we prove on this topic extend the classical Perron-Frobenius theory of non-negative real matrices. As a further application we obtain some asymptotic evaluations of the maximum coefficient of monomials of given size for rational series in two commutative variables.
D. de Falco, M. Goldwurm, V. Lonati.
Frequency of symbol occurrences in bicomponent stochastic models
Theoretical Computer Science, Vol. 327(3): 269-300, 2004.
The paper includes the results presented at Words 2003 e Dlt 2003.

Preprint (pdf)

Abstract: We give asymptotic estimates of the frequency of occurrences of a symbol in a random word generated by any bicomponent stochastic model. More precisely, we consider the random variable Y_n representing the number of occurrences of a given symbol in a word of length n generated at random; the stochastic model is defined by a rational formal series r having a linear representation with two primitive components. This model includes the case when r is the product or the sum of two primitive rational formal series. We obtain asymptotic evaluations for the mean value and the variance of Y_n and its limit distribution.
A. Bertoni, C. Choffrut, M. Goldwurm, and V. Lonati.
On the number of occurrences of a symbol in words of regular languages.
Theoretical Computer Science, Vol. 302(1-3): 431-456, 2003.

Preprint (pdf)

Abstract: We study the random variable Y_n representing the number of occurrences of a symbol a in a word of length n chosen at random in a regular language L contained in {a,b}*, where the random choice is defined via a nonnegative rational formal series r of support L. Assuming that the transition matrix associated with r is primitive we obtain asymptotic estimates for the mean value and the variance of Y_n and present a central limit theorem for its distribution. under a further condition on such a matrix, we also derive an asymptotic approximation of the discrete Fourier transform of Y_n that allows to prove a local limit theorem for Y_n. Further consequences of our analysis concern the growth of the coefficients in rational formal series; in particular, it turns out that, for a wide class of regular languages L, the maximum number of words of length n in L having the same number of occurrences of a given symbol is of the order of growth L^n / n^(1/2) for some constant L.

Proceedings papers

S. Crespi Reghizzi, V. Lonati, D. Mandrioli, M. Pradella.
Locally Chain-Parsable Languages .
In Proceedings MFCS 2015, 40th International Symposium on Mathematical Foundations of Computer Science, Milan, August 24-28, 2015.
Lecture Notes in Computer Science, Vol 9234: 154-166, 2015.

Preprint (pdf)
V. Lonati, D. Mandrioli, F. Panella. M. Pradella.
First-order Logic Definability of Free Languages.
In Proceedings CSR 2015, 10th International Computer Science Symposium in Russia, Listvyanka, July 13-17, 2015
Lecture Notes in Computer Science, Vol 9139, pp. 310-324.

Preprint (pdf)
V. Lonati, D. Mandrioli, F. Panella. M. Pradella.
Free Grammars and Languages.
In Proceedings ICTCS 2013, 14th Italian Conference on Theoretical Computer Science, Palermo, September 9-11, 2013.

Extended abstract (pdf)
F. Panella. M. Pradella. V. Lonati, D. Mandrioli,
Operator Precedence omega-languages.
In Proceedings DLT 2013, 17th International Conference on Developments in Language Theory, Paris, June 18-21, 2013.
Lecture Notes in Computer Science, Vol 7907: 396-408, 2013. .

Preprint (pdf)

Abstract: Recent literature extended the analysis of omega-languages from the regular ones to various classes of languages with "visible syntax structure", such as visibly pushdown languages (VPLs). Operator precedence languages (OPLs), instead, were originally defined to support deterministic parsing and exhibit interesting relations with these classes of languages: OPLs strictly include VPLs, enjoy all relevant closure properties and have been characterized by a suitable automata family and a logic notation. We introduce here operator precedence omega-languages (omega-OPLs), investigating various acceptance criteria and their closure properties. Whereas some properties are natural extensions of those holding for regular languages, others require novel investigation techniques.Application-oriented examples show the gain in expressiveness and verifiability offered by ωOPLs w.r.t. smaller classes.
V. Lonati, D. Mandrioli, M. Pradella.
Logic Characterization of Invisibly Structured Languages: the Case of Floyd Languages.
In Proceedings SOFSEM 2013, 39th International Conference on Current Trends in Theory and Practice of Computer Science, January 26-31, 2013.
Lecture Notes in Computer Science, Vol 7741: 307-318, 2013. .

Preprint (pdf)

Abstract: Operator precedence grammars define a classical Boolean and deterministic context-free language family (called Floyd languages or FLs). FLs have been shown to strictly include the well-known Visibly Pushdown Languages, and enjoy the same nice closure properties. In this paper we provide a complete characterization of FLs in terms of a suitable Monadic Second-Order Logic. Traditional approaches to logic characterization of formal languages refer explicitly to the structures over which they are interpreted - e.g, trees or graphs - or to strings that are isomorphic to the structure, as in parenthesis languages. In the case of FLs, instead, the syntactic structure of input strings is "invisible" and must be reconstructed through parsing. This requires that logic formulae encode some typical context-free parsing actions, such as shift-reduce ones.
V. Lonati, D. Mandrioli, M. Pradella.
Automata and Logic for Floyd Languages.
In proceedings ICTCS 2012, 13th Italian Conference on Theoretical Computer Science. Varese, Italy, September 19-21, 2012.

Extended abstract (pdf)
V. Lonati, M. Pradella.
Towards more expressive 2D deterministic automata.
In Proceedings CIAA 2011, International Conference on Implementation and Application of Automata, July 12-16, 2011.
Lecture Notes in Computer Science, Vol 6807: 225-237, 2011.

Preprint (pdf)

Abstract: REC defines an important class of picture languages that is considered a 2D analogous of regular languages. In this paper we recall some of the most expressive operational approaches to define deterministic subclasses of REC. We summarize their main characteristics and properties and try to understand if it is possible to combine their main features to define a larger deterministic subclass. We conclude by proposing a convenient generalization based on automata and study some of its formal properties.
V. Lonati, D. Mandrioli, M. Pradella.
Precedence Automata and Languages.
In Proceedings CSR 2011, 6th International Computer Science Symposium in Russia, June 14-18, 2011.
Lecture Notes in Computer Science, Vol 6651: 291-304, 2011.

Preprint (pdf)

Abstract: Operator precedence grammars define a classical Boolean and deterministic context-free family (called Floyd languages or FLs). FLs have been shown to strictly include the well-known visibly pushdown languages, and enjoy the same nice closure properties. We introduce here Floyd automata, an equivalent operational formalism for defining FLs. This also permits to extend the class to deal with infinite strings to perform for instance model checking.
V. Lonati, M. Pradella.
Picture recognizability with automata based on Wang tiles.
In Proceedings SOFSEM 2010, 36th International Conference on Current trends in Theory and Practice of Computer Sciences. Czech Republic, January 23-29, 2010.
Lecture Notes in Computer Science, Vol. 5901: 576-587, 2010.

Preprint (pdf)

Abstract: We introduce a model of automaton for picture language recognition which is based on tiles and is called Wang automaton, since its description relies on the notation of Wang systems. Wang automata combine features of both online tessellation acceptors and 4-ways automata: as in online tessellation acceptors, computation assigns states to each picture position; as in 4-way automata, the input head visits the picture moving from one pixel to an adjacent one, according to some scanning strategy. We prove that Wang automata recognize the class REC, i.e. they are equivalent to tiling systems or online tessellation acceptors, and hence strictly more powerful than 4-way automata. We also consider a very natural notion of determinism for Wang automata, and study the resulting class, comparing it with other deterministic classes considered in the literature, like DREC and Snake-DREC.
V. Lonati, M. Pradella.
Deterministic recognizability of picture languages by Wang automata.
In proceedings ICTCS 2009, 11th Italian Conference on Theoretical Computer Science. Cremona, Italy, September 28-30, 2009.

Extended abstract (pdf)
V. Lonati, M. Pradella.
Snake-Deterministic Tiling Systems.
In proceedings MFCS 2009, 34st International Symposium on Mathematical Foundations of Computer Science, Novy Smokovec (Slovakia), August 24-28, 2009.
Lecture Notes in Computer Science, Vol. 5734: 549-560.

Preprint (pdf)

Abstract: The concept of determinism, while clear and well assessed for string languages, is still matter of research as far as picture languages are concerned. We introduce here a new kind of determinism, called snake, based on the boustrophedonic scanning strategy, that is a natural scanning strategy used by many algorithms on 2D arrays and pictures. We consider a snake-deterministic variant of tiling systems, which defines the so-called Snake-DREC class of languages. Snake-DREC properly extends the more traditional approach of diagonal-based determinism, used e.g. by deterministic tiling systems, and by online tessellation automata. Our main result is showing that the concept of snake-determinism of tiles coincides with row (or column) unambiguity.
P. Boldi, V. Lonati, M. Santini, R. Radicioni.
The Number of Convex Permutominoes.
In proceedings LATA 2007, 1st International Conference on Language and Automata Theory and Applications, Tarragona (Spain) March 29 - April 4, 2007.

Preprint (pdf)

Abstract: Permutominoes are polyominoes defined by suitable pairs of permutations. In this paper we provide a formula to count the number of convex permutominoes of given perimeter. To this aim we define the transform of a generic pair of permutations, we characterize the transform of any pair defining a convex permutomino, and we solve the counting problem in the transformed space.
A. Bertoni, M. Goldwurm, and V. Lonati.
On the complexity of unary tiling-recognizable picture languages.
In proceedings STACS 2007, Aachen (Germany) February 22-24, 2007.
Lecture Notes in Computer Science, Vol. 4393: 381-392.

Preprint (pdf)

Abstract: We give a characterization, in terms of computational complexity, of the family REC_U of unary picture languages that are tiling recognizable. We introduce quasi-unary strings to represent unary pictures and we prove that any unary two-dimensional language L is in REC_U if and only if the set of all quasi-unary strings encoding the elements of L is recognizable by a one-tape nondeterministic Turing machine that is space and head-reversal linearly bounded. In particular, the result implies that the family of binary string languages corresponding to tiling-recognizable square languages lies between NTime(2^n) and NTime(4^n). This also implies the existence of a nontiling-recognizable unary square language that corresponds to a binary string language recognizable in nondeterministic time O(4^n log n).
M. Goldwurm, and V. Lonati.
Pattern occurrences in multicomponent models.
In proceedings STACS 2005, Stuttgart (Germany) February 24-26, 2005.
Lecture Notes in Computer Science, Vol. 3404, 680-692.

Preprint (pdf)

Abstract: In this paper we determine some limit distributions of pattern statistics in rational stochastic models, defined by means of nondeterministic weighted finite automata. We present a general approach to analyze these statistics in rational models having an arbitrary number of connected components. We explicitly establish the limit distributions in the most significant cases; these ones are characterized by a family of unimodal density functions defined by polynomials over adjacent intervals.
C. Choffrut, M. Goldwurm, and V. Lonati.
On the maximum coefficients of rational formal series in commuting variables.
In proceedings DLT'04, Auckland (New Zealand), December 13-17 2004.
Lecture Notes in Computer Science, Vol. 3340: 114-126, 2004.

Preprint (pdf)

Abstract: We study the growth function (R+) rational formal series in two commuting variables Defined, for every n in N, as the maximum of the coefficients of monomials of size n. We give a rather general sufficient condition under which the growth function of such a series is of the order n^(k/2) L^n for any integer k>-2 and any positive real L. Our analysis is related to the study of limit distributions in pattern statistics. In particular, we prove a general criterion for establishing Gaussian local limit laws for sequences of discrete positive random variables.
A. Bertoni, C. Choffrut, M. Goldwurm, and V. Lonati.
Local limit distributions in pattern statistics: beyond the Markovian model.
In proceedings STACS 2004, Montpellier (France), March 25-27, 2004.
Lecture Notes in Computer Science, Vol. 2996: 117-128, 2004.

Preprint (pdf)

Abstract: Motivated by problems of pattern statistics, we study the limit distribution of the random variable counting the number of occurrences of the symbol `a' in a word of length n chosen at random in {a,b}*, according to a probability distribution defined via a finite automaton equipped with positive real weights. We determine the local limit distribution of such a quantity under the hypothesis that the transition matrix naturally associated with the finite automaton is primitive. Our probabilistic model extends the Markovian models traditionally used in the literature on pattern statistics.
This result is obtained by introducing a notion of symbol-periodicity for irreducible matrices whose entries are polynomials in one variable over an arbitrary positive semiring. This notion and the related results we prove are of interest in their own right, since they extend classical properties of the Perron-Frobenius Theory for non-negative real matrices.
D. de Falco, M. Goldwurm, and V. Lonati.
Pattern statistics in bicomponent stochastic models.
In Proceedings Words '03, Turku (Finland), September 10-13, 2003. Turku (Finland), TUCS General Publication vol. 27: 344-357.

Formato pdf

Abstract: We give asymptotic estimates of the frequency of occurrences of a symbol in a random word generated by any (non-ergodic) bicomponent stochastic model. More precisely, we consider the random variable Y_n representing the number of occurrences of a given symbol in a word of length n generated at random; the stochastic model is defined by a rational formal series r having a linear representation with two primitive components. This model includes the case when r is the product or the sum of two primitive rational for mal series. We obtain asymptotic evaluations for the mean and the variance of Y_n and its limit distribution. These results improve the analysis presented in a recent work dealing with the particular case where r is the product of two primitive rational formal series.
D. de Falco, M. Goldwurm, and V. Lonati.
Frequency of symbol occurrences in simple non-primitive stochastic models.
In Proceedings 7th D.L.T. Conference, Szeged (Hungary), July 7-11, 2003.
Lecture Notes in Computer Science, Vol. 2710, pag 242-253, 2003.

Preprint (pdf)

Abstract: We study the random variable Y_n representing the number of occurrences of a given symbol in a word of length n generated at random. The stochastic model we assume is a simple non-ergodic model defined by the product of two primitive rational formal series, which form two distinct ergod ic components. We obtain asymptotic evaluations for the mean and the variance of Y_n and its limit distribution. It turns out that there are two main cases: if one component is dominant and non-degenerate we get a Gaussian limit distribution; if the two components are equipotent and have different leading terms of the mean, we get a uniform limit distribution. Other particular limit distributions are obtained in the case of a degenerate dominant compon ent and in the equipotent case when the leading terms of the expectation values are equal.
A. Bertoni, C. Choffrut, M. Goldwurm, and V. Lonati.
Asymptotic evaluation in a regular language of the number of words of given length with a fixed number of occurrences of a symbol.
Abstract in Proceedings Words '01, Palermo (Italy), September 17-21, 2001, a cura dell'Università di Palermo.
Versione preliminare del lavoro pubblicato su TCS nel 2003.

Extended abstract (pdf)

Phd thesis, reports, preprints

V. Lonati, D. Mandrioli, M. Pradella.
Logic Characterization of Floyd Language.
April 2012. Extended version of the paper accepted for presentation at SOFSEM 2013.

Arxiv
V. Lonati, D. Mandrioli, M. Pradella.
Precedence Automata and Languages.
December 2010. Extended version of the paper presented at CSR 2010.

Arxiv
P. Boldi, Violetta Lonati, M. Santini, R. Radicioni, S. Vigna.
Tree Language Determinism, Ambiguity and Typing: towards a Uniform Approach.
Rapporto interno 320-08, Università degli Studi di Milano, Dipartimento di Scienze dell'Informazione (2008).

Formato pdf

Abstract: Regular tree languages can be defined as the homomorphic image of a tree language generated by an extended context-free grammar -- much as a regular string language can be seen as the homomorphic image of a local language. The typing problem consists in finding the preimages of any tree in a given regular tree language. From the application point of view, regular tree languages have been recently rediscovered as an XML specification mechanism, and typing turns out to be a foundamental tool for document validation. In this paper we provide a systematic approach to the typing problem, by identifying several levels of determinism of tree grammars that are in turn re ected into the structure of the descending automata recognizing languages.
A. Lonati, V. Lonati, M. Santini.
Modeling and transforming a multilingual technical lexicon for conservation-restoration using XML.
Rapporto interno 311-07, Università degli Studi di Milano, Dipartimento di Scienze dell'Informazione (2007).

Formato pdf
Pattern statistics in rational models. Novembre 2004.
Phd disserttion. Advisors: Proff. A. Bertoni e M. Goldwurm.

Riassunto

Formato pdf

Violetta Lonati

Pubblications - Formal languages and automata

Journal papers

Proceedings papers

Phd thesis, reports, preprints