POL 370 The University of Warwick Data and Dependent Variable Analysis


How would you explain this methodology? Could you identify the independent, dependent, and control variables? How would you defend the validity and reliability of the variables, and what is the testable hypothesized relationship between the independent and dependent variables?

Data and Dependent Variable

We collected data for 145 countries over the period 1970-2000. The data are structured as country-year observations and, after dropping 72 such cases for which we have no information on the migration data (discussed below), our sample comprises 3,919 country-years. For the dependent variable, we rely on the information in the Global Terrorism Database (GTD), which defines terrorism as “the premeditated use or threat to use violence by individuals or subnational groups against noncombatants in order to obtain a political or social objective through the intimidation of a large audience beyond that of the immediate victims” (Enders, Sandler and Gaibulloev, 2011, p.321). This data set provides a count variable on the number of terrorist attacks – both national and transnational – that occurred within a country’s geographic boundaries. We use a modified version of this count variable: because the distribution of terrorist attacks per country is skewed, primarily owing to the large number of zeros in the data, and because our estimators require a (quasi-) continuous dependent variable, we take the natural logarithm of the count after adding 1 (to avoid calculating the log of 0).
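The transformation described above can be sketched in a few lines. This is only an illustration: the attack counts are made-up values, not GTD data, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical yearly attack counts for one country (illustrative values only).
attacks = np.array([0, 0, 3, 12, 0, 150])

# log(count + 1): country-years with zero attacks map to 0 instead of -inf,
# and the long right tail of the count distribution is compressed.
log_attacks = np.log1p(attacks)
```

`np.log1p(x)` computes `log(1 + x)`, so it is equivalent to `np.log(attacks + 1)`.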


Estimating the parameters for a series of spatial temporal autoregressive models, or “spatial lag models,” is appropriate, given the theoretical argument that a country’s level of terrorist attacks may be affected by other countries’ terrorism and that immigrants may be the vehicle for this diffusion process (e.g., Franzese and Hays, 2007, 2008). To capture terrorism traveling across countries via migrant inflows, a state’s degree of terrorism at time $t$ is modeled as a function of foreign countries’ terrorism at $t-1$. A weighting matrix specifies the set of such states and which “linkages” between them are important. Using a weighting matrix, we can model country linkages as conditional on whether migrant inflows exist and how large they are. More formally, our spatial lag models are defined as

$$y_t = \phi y_{t-1} + \beta X_{t-1} + \rho W y_{t-1} + \varepsilon,$$

where $y_t$ is the dependent variable (i.e., the logged number of terrorist attacks at time $t$), $y_{t-1}$ signifies the (one year) temporally lagged dependent variable, $X_{t-1}$ is a matrix of temporally lagged explanatory variables that we define below, and $\varepsilon$ is the error term. $W y_{t-1}$ stands for the product of a row-standardized connectivity matrix ($W$) and the temporally lagged dependent variable ($y_{t-1}$), i.e., $W y_{t-1}$ is a spatial lag and $\rho$ the corresponding coefficient. In time-series cross-sectional analysis, the connectivity matrix $W$ is given by an $NT \times NT$ matrix (with $T$ sub-matrices of size $N \times N$ along the block diagonal) with an element $w_{i,j}$ capturing the relative connectivity of country $j$ to country $i$ (and with $w_{i,i} = 0$). Some define the spatial lag using the temporally lagged values of the dependent variable for methodological reasons: under certain assumptions, it justifies the use of spatial ordinary least squares (S-OLS), which is less computationally intensive than maximum likelihood methods (e.g., Ward and Gleditsch, 2008).
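To make the spatial lag concrete, the following minimal sketch builds a row-standardized connectivity matrix for a single year and multiplies it by the lagged dependent variable. It assumes, for illustration only, that the raw weights are migrant inflow counts; the flow values, the three-country setup, and the function name `row_standardize` are hypothetical and are not taken from the paper.

```python
import numpy as np

def row_standardize(flows):
    """Convert raw inflows into a connectivity matrix whose rows sum to 1."""
    w = np.asarray(flows, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # a country is not its own neighbor (w_ii = 0)
    row_sums = w.sum(axis=1, keepdims=True)
    # Countries with no inflows keep an all-zero row instead of dividing by 0.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# flows[i, j]: migrants arriving in country i from country j (made-up numbers).
flows = np.array([[0, 30, 10],
                  [5,  0, 15],
                  [0,  0,  0]])

W = row_standardize(flows)

# Logged attacks in each country at t-1 (made-up numbers).
y_prev = np.array([0.0, 2.0, 4.0])

# Spatial lag: for each country, the inflow-weighted average of
# other countries' terrorism in the previous year.
spatial_lag = W @ y_prev
```

Because each non-empty row of `W` sums to one, every entry of `spatial_lag` is a weighted average of the sending countries' lagged terrorism, which keeps the scale of the spatial lag comparable across countries with very different total inflows.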
Our rationale, by contrast, is substantive: it takes time for diffusion to have a tangible impact on terrorism. Hence, we use the lagged value of $y_t$ when constructing the spatial lags. Several estimators have been proposed for time-series cross-section spatial lag models (e.g., Elhorst, 2003; Beck, Gleditsch and Beardsley, 2006; Franzese and Hays, 2007), such as S-OLS or spatial maximum likelihood (S-ML). We employ S-ML regression models, but our findings are robust to using S-OLS. In order to rule out the possibility of common exposure – when, e.g., some country-specific features such as regime type tend to be spatially clustered or when spatial patterns can be produced by common trends or exogenous shocks – we control for a number of relevant “exogenous-external conditions or common shocks and spatially correlated unit level factors” (Franzese and Hays, 2007, p.142). In line with Franzese and Hays (2007, 2008), we thus include a temporally lagged dependent variable that captures a country’s level of terrorism in the previous year, country fixed effects, and year fixed effects. The longitudinal nature of our data allows us to consider the role of countries’ past terrorism for their current terrorist attacks. While the lagged dependent variable also captures time dependencies more generally, year fixed effects control for temporal shocks that are common to all states in a given year. The temporally lagged dependent variable, country fixed effects, year fixed effects, and the set of control variables (described below) make it credible that terrorism diffusion “cannot be dismissed as a mere product of a clustering in similar [state] characteristics” (Buhaug and Gleditsch, 2008, p.230). Plümper and Neumayer (2010, p.427) make the same argument.
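The specification with a lagged dependent variable, a temporally lagged spatial lag, and country and year fixed effects can be illustrated with a small simulation. The sketch below uses S-OLS (plain least squares after two-way demeaning), not the S-ML estimator the authors rely on. Everything in it is an assumption made for illustration: the data are simulated, the ring-shaped connectivity matrix stands in for migration links, and the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 60                      # countries, years (hypothetical panel size)
phi, beta, rho = 0.4, 0.5, 0.3     # arbitrary "true" parameters

# Hypothetical row-standardized W: each country is linked to two neighbors.
W = np.zeros((N, N))
idx = np.arange(N)
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5

alpha = rng.normal(size=N)         # country fixed effects
gamma = rng.normal(size=T)         # year fixed effects (common shocks)
x = rng.normal(size=(T, N))        # one explanatory variable
y = np.zeros((T, N))
for t in range(1, T):
    y[t] = (phi * y[t - 1] + beta * x[t - 1] + rho * (W @ y[t - 1])
            + alpha + gamma[t] + rng.normal(scale=0.5, size=N))

def demean_two_way(a):
    """Remove country means and year means from a balanced (time, country) panel."""
    return (a - a.mean(axis=0, keepdims=True)
              - a.mean(axis=1, keepdims=True) + a.mean())

# Outcome at t; regressors at t-1: own lag, covariate, and spatial lag W y_{t-1}.
outcome = demean_two_way(y[1:]).ravel()
regressors = np.column_stack([
    demean_two_way(y[:-1]).ravel(),
    demean_two_way(x[:-1]).ravel(),
    demean_two_way(y[:-1] @ W.T).ravel(),
])

coef, *_ = np.linalg.lstsq(regressors, outcome, rcond=None)
phi_hat, beta_hat, rho_hat = coef
```

The two-way demeaning plays the role of the country and year dummies in a balanced panel. Note that combining a lagged dependent variable with country fixed effects can bias the estimates when the number of years is small, so this sketch should be read as a description of the model's structure rather than a recommended estimator.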