A characterization of moral transitive directed acyclic graph Markov models as trees and its properties

Robert Castelo (roberto@cs.uu.nl), Arno Siebes (siebes@cs.uu.nl)
Department of Computer Science, University of Utrecht
P.O. Box 80089, 3508TB Utrecht, The Netherlands

AMS 1991 Subject Classification: 62H99

Keywords: Graphical Markov model; Conditional independence; Multivariate distribution; Undirected graph model; Directed acyclic graph model; Transitive directed acyclic graph model; Decomposable model; Lattice conditional independence model; Tree conditional independence model; Finite distributive lattice; Poset; Labeled tree.

Abstract

It follows from the known relationships among the different classes of graphical Markov models for conditional independence that the intersection of the classes of moral directed acyclic graph models (or decomposable, DEC, models) and transitive directed acyclic graph, TDAG, models (or lattice conditional independence, LCI, models) is non-empty. This paper shows that the conditional independence models in the intersection can be characterized as labeled trees, where every vertex of the tree corresponds to a single random variable. This fact leads to the definition of a specific Markov property for trees and therefore to the introduction of trees as part of the family of graphical Markov models.

1 Introduction

Graphical Markov models are a powerful tool for the representation and analysis of conditional independence among variables of a multivariate distribution. There are different classes of graphical Markov models. Each class is associated with a different type of graph, which embodies the structural (qualitative) information on the relationships among the variables involved. More precisely, every vertex of the associated graph corresponds to a random variable of the multivariate distribution.

One of the most fascinating aspects is the algebraic structure that underlies the broad spectrum of different classes of graphical Markov models. This underlying algebraic structure is the foundation on which the present paper develops a particular characterization of the intersection of certain classes of graphical Markov models (and for which positivity or existence of joint densities is not required). The reader may find a comprehensive guide to the different types of graphical Markov models in the books of Pearl (1988), Whittaker (1990), Cox and Wermuth (1996) and Lauritzen (1996).

In this paper we will deal with graphical Markov models defined by undirected graphs (UDG models), directed acyclic graphs (DAG models; see footnote 1), chordal graphs (decomposable or DEC models), transitive directed acyclic graphs (TDAG models), and finite distributive lattices (lattice conditional independence or LCI models). In the next section the reader will find precise graph-theoretical definitions of these graphs.

LCI models were introduced by Andersson and Perlman (1993) in the context of the analysis of non-nested multivariate missing data patterns and non-nested dependent linear regression models. Later, Andersson, Madigan, Perlman, and Triggs (1997, theorem 4.1) showed that the class of LCI models coincides with the class of TDAG models. Either of these terms, TDAG or LCI, will be used here, depending on the algebraic context at hand.

Figure 1 shows a picture that Andersson et al. (1995) devised in order to describe the location of LCI models within the scope of models represented by undirected and directed graphs. Although the class of LCI models appears in the picture as an isolated subclass, Andersson et al. (1995, p. 38) show that they are in fact interlaced through the class of DAG models.
An important characterization also depicted in this figure corresponds to the definition of those UDG models that are equivalent to a certain type of DAG models (Wermuth, 1980; Kiiveri, Speed, & Carlin, 1984). Thus, the undirected and directed graphs that are members of this intersecting class describe the same model of conditional independence. They are graphically defined as chordal graphs and are known as DEC models.

Footnote 1: Sometimes also referred to as acyclic directed graph (ADG) models.

[Figure 1: diagram with regions labeled CG, DAG, UDG, DEC, LCI = TDAG, and DEC intersected LCI.]
Figure 1: Relation among the classes of chain graph models (CG), directed acyclic graph models (DAG), undirected graph models (UDG), decomposable models (DEC) and lattice conditional independence models (LCI).

As has been mentioned already, the intersection between the classes of DEC and LCI is non-empty, as proven by Andersson et al. (1995). In this paper, a new formalization of the graphical Markov models in DEC $\cap$ LCI is presented. In the first place, this new formalization is based on a characterization of moral TDAGs as labeled trees. Then, a Markov property for trees is introduced. Finally, the relationship between this new Markov property and the existing Markov properties is investigated. From this study follows the new formalization of the graphical Markov models in DEC $\cap$ LCI.

Because of the relation between trees and models for conditional independence, we will refer to DEC $\cap$ LCI models as tree conditional independence (TCI) models.

The direct consequence of such a formalization is that it provides a different way to read the structural information (i.e., the conditional independencies) contained in the model, by using the new associated Markov property.

The layout of the paper is as follows. In the next section some graph-theoretic definitions and notation will be introduced.
In section 3 an overview of graphical Markov models will be given, which will serve to introduce section 4, where we will find the characterization of moral TDAG models as trees, as well as the definition of its specific Markov property. In section 5, the notion of Markov equivalence in this setting will be investigated. Finally, in section 6, the main issues of the paper will be summarized.

2 Background concepts, terminology and notation

The notation presented here has been mainly borrowed from Lauritzen (1996) and Andersson et al. (1995), and the concepts regarding finite distributive lattices have been taken from Grätzer (1978) and Davey and Priestley (1990). For more details, the reader is referred to these publications.

A graph is a pair $G = (V, E)$ where $V$ is the set of vertices and $E$ is the set of edges. In the present context of graphical Markov models, the set of vertices $V$ represents the set of random variables of the multivariate distribution that underlies the model. This multivariate distribution belongs to a family of probability distributions $P$ defined on a product space $X = \times (X_i \mid i \in V)$. For simplicity, it is convenient to refer to a random variable $x_i$ as $i$, and to a set of random variables $x_A = \{x_i \mid i \in A\}$, $A \subseteq V$, as $A$. Therefore, a statement of conditional independence regarding three subsets of random variables may be specified here as $A \perp\!\!\!\perp B \mid S\ [P]$, where $A, B, S \subseteq V$ and $A, B$ are non-empty. It claims that the random variables in $x_A$ are conditionally independent of the random variables in $x_B$ given the random variables in $x_S$ under $P$. In the rest of this paper every statement of conditional independence is asserted under $P$, so $[P]$ will be dropped from the notation.

The set of edges $E$ is a subset of the set of ordered pairs $V \times V$ that does not contain loops, i.e., $(a, a) \notin E$ for all $a \in V$. For a given pair of vertices $a, b \in V$, $a \neq b$, a solid line $a - b$ joining them in the graph will represent an undirected edge, i.e., it means that $(a, b) \in E$ and $(b, a) \in E$.
An arrow $a \to b$ between these two vertices will represent a directed edge, and it means that $(a, b) \in E$ and $(b, a) \notin E$. A subgraph $G_S = (S, E_S)$ is given by a subset $S \subseteq V$ and the induced edge set $E_S = E \cap (S \times S)$.

When two vertices are joined by a directed or undirected edge, they are regarded as adjacent. Given a vertex $v \in V$, the set of vertices adjacent to it is known as the boundary of $v$, denoted by $\mathrm{bd}(v)$. Further, the closure of a vertex $v$ is defined as $\mathrm{cl}(v) = \mathrm{bd}(v) \cup \{v\}$.

A graph $G = (V, E)$ is said to be complete iff, for all distinct $x, y \in V$, $(x, y) \notin E \Rightarrow (y, x) \in E$, or in other words, every possible pair of vertices is adjacent. A subset is complete if it induces a complete subgraph. A clique is a complete subset that is maximal with respect to $\subseteq$, i.e., $G$ has no larger complete subgraph that contains it.

For a directed edge $a \to b$ we distinguish between the two joined vertices by specifying that $a$ is the parent of $b$, and that $b$ is the child of $a$. The vertices that are parents of a common child $v$ form the parent set of $v$, denoted $\mathrm{pa}(v)$. An important concept regarding directed graphs in the context of conditional independence is that of an immorality. An immorality is formed by two non-adjacent vertices with a common child. A directed graph without immoralities is called a moral graph.

A directed graph $(V, E)$ can be moralized by marrying non-adjacent parents (joining them with an undirected edge) and dropping the directions of the edges in $E$. Given a directed graph $G = (V, E)$, its moralized version will be denoted $G^m$.

An immorality is also known as a sink-oriented V-configuration. Cox and Wermuth (1996) define a V-configuration as a triplet of vertices $(a, b, c)$ such that two of them are adjacent to the third but not to each other. Therefore a sink-oriented V-configuration (an immorality) for these three vertices would be, for instance, $a \to b \leftarrow c$, and the vertex $b$ is called a collision vertex. Following the same terminology, the other two types of V-configurations are the source-oriented V-configuration, e.g. $a \leftarrow b \to c$, and the transition-oriented V-configuration, e.g. $a \to b \to c$.

In a directed or undirected graph $G = (V, E)$, a path from $a$ to $b$ is a sequence $a = a_0, \ldots, a_n = b$ of distinct vertices such that $n > 0$ and either $(a_i$ [...] undirected graph iff every path between vertices in $A$ and $B$ intersects $S$. An undirected cycle is a path where $a = b$. A tree is a connected undirected graph without undirected cycles, so that there is a unique path between any two distinct vertices of the graph.

A rooted tree is a tree in which a hierarchy among the vertices is created. One of the vertices of a rooted tree is the root, and it is considered to be at the top of the hierarchy. The leaves of a rooted tree are those vertices connected to just one other vertex, and they are considered to be at the bottom of the hierarchy. Under this convention we will say that the root is above the leaves, and the leaves are below the root. Given a tree $T = (V, E)$ and a vertex $u \in V$, the subtree rooted at $u$, denoted $T_u$, is the pair $T_u = (U, E_U)$, where the vertex set $U \subseteq V$ contains all vertices involved in every path from $u$ to the leaves below it, and the edge set is $E_U = E \cap (U \times U)$.

In a directed graph, a directed path is a direction-preserving path, that is, all its edges point in the same direction. A vertex $a$ is called an ancestor of $b$ if there is a directed path from $a$ to $b$. A directed cycle is a directed path where the first vertex coincides with the last. A directed acyclic graph (DAG) is a directed graph without directed cycles.

For every vertex $v$, one may consider the set of those vertices that are ancestors of $v$, which will be called the ancestor set of $v$ and denoted $\mathrm{an}(v)$. From the definition of ancestor set, it follows that $\mathrm{pa}(v) \subseteq \mathrm{an}(v)$.
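
As an illustration of the definitions so far, the following Python sketch encodes a directed graph as a set of ordered pairs and computes parent sets, ancestor sets, and immoralities. It is only a sketch and not part of the original development: the function names (`parents`, `ancestors`, `is_acyclic`, `immoralities`) and the small example graph are our own assumptions.

```python
from itertools import combinations


def parents(edges, v):
    """pa(v): vertices a with a directed edge a -> v, i.e. (a, v) in E and (v, a) not in E."""
    return {a for (a, b) in edges if b == v and (b, a) not in edges}


def ancestors(edges, v):
    """an(v): vertices from which v is reachable along a directed path."""
    found, stack = set(), [v]
    while stack:
        for p in parents(edges, stack.pop()):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found


def is_acyclic(vertices, edges):
    """A directed graph is acyclic iff no vertex is its own ancestor."""
    return all(v not in ancestors(edges, v) for v in vertices)


def immoralities(vertices, edges):
    """Sink-oriented V-configurations a -> v <- b with a and b non-adjacent."""
    result = []
    for v in sorted(vertices):
        for a, b in combinations(sorted(parents(edges, v)), 2):
            if (a, b) not in edges and (b, a) not in edges:
                result.append((a, v, b))
    return result


# Hypothetical example: a -> b <- c and b -> d.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("c", "b"), ("b", "d")}

print(sorted(ancestors(E, "d")))               # ['a', 'b', 'c']
print(parents(E, "d") <= ancestors(E, "d"))    # True: pa(d) is a subset of an(d)
print(is_acyclic(V, E))                        # True
print(immoralities(V, E))                      # [('a', 'b', 'c')]
```

In this example the parents of b are not adjacent, so the graph contains one immorality and is therefore not a moral graph.
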
In the same manner, a vertex $b$ is called a descendant of $a$ if there is a directed path from $a$ to $b$ (i.e., $a$ is an ancestor of $b$), and all the vertices reachable from $a$ by directed paths form the descendant set of $a$. Given a vertex $v$, its descendant set will be denoted $\mathrm{de}(v)$, and the non-descendant set of $v$ is defined as $\mathrm{nd}(v) = V \setminus (\mathrm{de}(v) \cup \{v\})$. A DAG is said to be transitive (a TDAG) if, for every vertex $v$, $\mathrm{pa}(v) = \mathrm{an}(v)$.

An undirected graph is chordal, or decomposable (DEC), iff it does not contain undirected cycles of length greater than three without a chord. Chordal graphs are also known as triangulated graphs or rigid circuit graphs. In the introduction we already mentioned that DEC models correspond to the intersection of the classes of DAG and UDG models, and therefore they characterize those UDG models that are equivalent to DAG models. In the same vein, it is possible to characterize those DAG models equivalent to UDG models as those determined by a DAG that does not contain immoralities (sink-oriented V-configurations), i.e., a moral DAG.

An important concept regarding directed graphs is that of an ancestral set. Let $G = (V, E)$ be a DAG. A subset $A \subseteq V$ is said to be ancestral iff, for every vertex $v \in A$, $\mathrm{an}(v) \subseteq A$. Since the union and the intersection of ancestral sets are again ancestral, all the different ancestral sets contained in a DAG $G = (V, E)$ form a ring of subsets of $V$, which is denoted $\mathcal{A}(G)$. Further, given a subset of vertices $A$