PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, ENVIRONMENTAL SCIENCE. (c) Oxford University Press USA, 2016. All Rights Reserved. Personal use only; commercial use is strictly prohibited. Please see applicable Privacy Policy and Legal Notice.

date: 21 February 2018

Epigenetics and the Exposome: Environmental Exposure in Disease Etiology

Summary and Keywords

While genomics has been founded on accurate tools that lead to a limited amount of classification error, exposure assessment in epidemiology is often affected by large error. The “environment” is in fact a complex construct that encompasses chemical exposures (e.g., to carcinogens); biological agents (viruses, or the “microbiome”); and social relationships. The “exposome” concept was then put forward to stress the relatively poor development of appropriate tools for exposure assessment when applied to the study of disease etiology. Three layers of the exposome have been proposed: “general external” (including social capital, stress and psychology); “specific external” (including chemicals, viruses, radiation, etc.); and “internal” (including for example metabolism and gut microflora). In addition, there are at least three properties of the exposome: (a) it is based on a refinement of tools to measure exposures (including internal measurements in the body); (b) it involves a broad definition of “exposure” or environment, including overarching concepts at a societal level; and (c) it involves a temporal component (i.e., exposure is analyzed in a life-course perspective). The conceptual and practical challenge is how the different layers (i.e., general, specific external, and internal) connect to each other in a causally meaningful sequence. The relevance of this question pertains to the translation of science into policy: for example, if experiences in early life impact on the adult risk of disease, and on the quality of aging, how is distant action to be incorporated in biological causal models and into policy interventions? A useful causal theory to address scientific and policy questions about exposure is based on the concept of information transmission. Such a theory can explain how to connect the different layers of the exposome in a life-course temporal frame and helps identify the best level for intervention (molecular, individual, or population level).
In this context epigenetics plays a key role, partly because it explains the long-distance persistence of epigenetic changes via the concept of “epigenetic memory.”

Keywords: environment, exposure assessment, exposomics, causality, information transmission, omics, mechanisms, DNA methylation, epigenetic memory


Most noncommunicable diseases (NCDs) have an environmental origin (including behavioral risk factors). There are several lines of evidence that support this statement. First, the study of migrant populations has shown that, for most NCDs, people who migrate from a low-risk area to a high-risk area experience an increased risk, and the opposite happens to migrants from high- to low-risk areas (ACS, 2015). Second, there are rapid changes in the incidence of NCDs in the course of time that are incompatible with a purely inherited cause. For example, in most countries (but more recently for low-income countries) the incidence of breast cancer has increased while the incidence of cervical cancer has rapidly decreased. Third, “genetic research” itself (or, more correctly, “genomic research”)—through a large number of systematic searches for gene variants called “genome-wide association studies” (GWAS)—has come to the conclusion that inherited genetic susceptibility explains only a small proportion of NCDs, usually through an interaction between genes and the environment (Vineis & Pearce, 2011).

This is the background at the origin of new developments of NCD epidemiology, including the birth of the concept of the “exposome,” and of the popularity of epigenetic research in epidemiology. The two phenomena are related, as discussed below.

The exposome concept was originally put forward to stress the relatively poor development of exposure science when applied to the study of disease etiology. While genomics has been founded on accurate tools that lead to a limited amount of classification error, in general exposure assessment in epidemiology is affected by large error. A typical example is diet, which varies enormously between individuals and within the same individual at different points over their lifetime. Chronic NCDs are characterized by the effects of long-term exposures and long latency periods, and are therefore sensitive to exposures that vary in time. In contrast, inherited gene variants are stable and—even if measured just once—reflect life-long susceptibility. Recording, for example, the individual diet and its changes for decades is extremely difficult, and this is true for many environmental exposures. Thus, it is not surprising that environmental (i.e., not inherited) exposures are affected by large measurement error (Vineis, 2004). The latter, in turn, usually leads to an underestimation of the strength of association between exposure and disease. One is therefore confronted with an unbalanced situation in which genetic susceptibility is measured accurately and with little distortion in association estimates, while environmental exposures are not. Recent research established that approximately 90% of cancers are likely to be related to environmental causes (ACS, 2015), but this is still largely based on broad comparisons such as those between populations in different geographical areas or those obtained by observing migrants; yet we are less able to identify single environmental risk factors. In fact, for common NCDs like colon or breast cancer we still know relatively little about their specific risk factors.
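The attenuation described above can be illustrated with a small simulation. This is only a minimal sketch under assumptions that are not in the text: a linear exposure-outcome relation, classical (random, non-differential) measurement error, and invented values for the true slope and the error variance. The function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x

def simulate_attenuation(n=20000, true_slope=0.5, error_sd=1.0, seed=1):
    """Compare slopes estimated from true vs. error-prone exposure.

    All numbers are illustrative assumptions, not data from any study.
    """
    rng = random.Random(seed)
    # True long-term exposure, standardized to variance 1.
    true_exposure = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Outcome depends linearly on the true exposure plus noise.
    outcome = [true_slope * x + rng.gauss(0.0, 1.0) for x in true_exposure]
    # What the epidemiologist records: true exposure plus random error.
    measured = [x + rng.gauss(0.0, error_sd) for x in true_exposure]
    return ols_slope(true_exposure, outcome), ols_slope(measured, outcome)

slope_true, slope_measured = simulate_attenuation()
# Under classical error the expected attenuation factor is
# var(true) / (var(true) + var(error)) = 1 / (1 + error_sd**2).
print(f"slope with true exposure:     {slope_true:.3f}")
print(f"slope with measured exposure: {slope_measured:.3f}")
```

With an error variance equal to the exposure variance, as assumed here, the slope estimated from the measured exposure is expected to be about half the true slope. Real exposure errors are rarely this simple, so the sketch shows the direction of the bias rather than its size in any particular study.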

The inaccuracy of exposure assessment is a main motivation for the introduction of the exposome concept; yet another important motivation lies in new developments in causality, both conceptually (what is a cause in medicine) and in practice (how genes work, how the environment works through genes). And this leads to the marriage between epigenetics and the exposome.

Concepts defines the exposome in a broad sense and presents recent developments in causal theory in the biomedical sciences, and for cancer in particular, from both a conceptual and a practical point of view, including the role of epigenetics. Tools and Examples describes the state of the art of exposome science and epigenetic epidemiology and pays attention to the concept of epigenetic memory.



“Exposome” refers to the totality of exposure, internal and external. The term exposome was coined by Christopher Wild to refer to the imbalance between the effort put into genomics (in particular after the sequencing of human DNA) and the much lower effort put into exposure science; Wild also drew attention to the fact that the “environment” is a complex construct that encompasses chemical exposure (e.g., to carcinogens); biological agents (viruses, or the “microbiome”); and social relationships (Wild, 2012). As Figure 1 suggests, Wild advocates for three components of the exposome:

  1. “general external” (including social capital, stress and psychology),

  2. “specific external” (including chemicals, viruses, radiation, etc.), and

  3. “internal” (including for example metabolism and gut microflora).

Figure 1. Three different domains of the exposome are presented diagrammatically with non-exhaustive examples for each of these diagrams. Reprinted with permission from Wild (2012).

There is also an important addition to the concept of the exposome—that is, in Wild’s words, “The exposome is composed of every exposure to which an individual is subjected from conception to death. Therefore, it requires consideration of both the nature of those exposures and their changes over time” (Wild, 2012). Thus, there are at least three properties of the exposome:

  1. It is based on a refinement of tools to measure exposures (including internal measurements in the body);

  2. It involves a broad definition of “exposure” or environment (including overarching concepts at a societal level [e.g., social capital]); and

  3. It involves a temporal component (i.e., a consideration of exposures in the course of time [a “life-course” perspective]).

But how can the different layers (general, specific external, and internal) connect to each other in a causally meaningful sequence? The relevance of this question is obvious, and pertains to the translation of science into policy. For example, consider the scientific question: do experiences in early life impact on adult risk of disease, and on the quality of aging? And if so, how? The problem then turns into a policy question: how is this distant action incorporated in biological causal models and in policy interventions?

New Developments in the Concept of Causality in Biomedicine

There are several questions raised by the previous paragraph: (a) what is causality in the complex world of NCDs? (b) how do different risk factors connect to each other and to internal disease mechanisms? and (c) how does this occur in the course of time (i.e., how does molecular “memory” work in the life-long transmission of signals from external exposures to long-term biological changes)? The following sections use three different conceptual fields to show that the exposome is consistent with theories of causality: the epidemiological “sufficient-component-cause framework,” the philosophical theory of causality as information transmission, and the physiological concept of “allostatic load.” These three ideas converge into the use of epigenetics in exposome work, as explained in Tools and Examples.

Sufficient-Component-Cause Framework

It has been argued that carcinogens (and other toxicants) lead to the perturbation of one or more “Adverse Outcome Pathways” (AOPs), which consist of a sequence of events including molecular initiating events, biochemical responses, cellular responses, tissue organ responses, individual responses, and lastly population responses. AOP is a term used in a specific context and we prefer to refer more generically to “pathway perturbation” (NAS, 2017). Pathway perturbation can be used as a conceptual framework for organizing and evaluating the strength of evidence concerning the steps needed for progressing from molecular/metabolic changes to an adverse outcome. Pathway perturbation can be framed within the causal model called the sufficient-component-cause framework. This considers multiple exposures or “hits” that together lead to the outcome under consideration. The model provides a way to account for how multiple factors, whether environmental exposures or genes, combine and subsequently result in disease in an individual or population. In this model, sufficient causal complexes are represented by “pies” (Figure 2) with each slice being an insufficient component (Rothman & Greenland, 2005). Generally, none of the components, alone, is necessary to cause disease. In fact, for cancer and other noncommunicable diseases, there are only a few instances of apparently necessary causes (e.g., Human Papilloma Virus [HPV] and cervical cancer [there is consensus that without HPV there would be no cervical cancer]) (Vineis & Wild, 2014).

Figure 2. Three Sufficient Causes of Disease.

Reprinted with permission from Rothman and Greenland (2005).

Figure 2 provides a schematic diagram of three different sufficient constellations of causes (pies) in three hypothetical individuals. Each constellation of component causes in a “pie” represents a sufficient causal complex that if completed results in the outcome. As represented in Figure 2, there is no redundant component cause, meaning that the removal of any one component cause will result in complete outcome prevention.
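The logic of the pies can be sketched in a few lines of code. The component labels A to E and the composition of each pie are hypothetical placeholders, not a transcription of Figure 2; the sketch only encodes the rule that disease occurs once every slice of at least one pie is present.

```python
# Hypothetical component causes labelled A-E; each frozenset is one "pie".
SUFFICIENT_CAUSES = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "D"}),
    frozenset({"B", "E"}),
]

def completed_pies(exposures, pies=SUFFICIENT_CAUSES):
    """Return the sufficient causes whose components are all present."""
    present = set(exposures)
    return [pie for pie in pies if pie <= present]

def disease_occurs(exposures, pies=SUFFICIENT_CAUSES):
    """Disease occurs when at least one sufficient cause is complete."""
    return bool(completed_pies(exposures, pies))

person = {"A", "B", "C"}
print(disease_occurs(person))           # True: the first pie is complete
print(disease_occurs(person - {"C"}))   # False: removing one slice blocks that pie
print(disease_occurs({"A", "D", "E"}))  # True: a different pie is complete
```

In this toy configuration no component appears in all three pies, so none is a necessary cause, which mirrors the general situation described for NCDs.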

The same concepts can apply to mechanistic steps, rather than to external causes of disease, for example to perturbation of the p16 tumor suppressor gene and other targets of HPV. Mechanisms are expected to combine along the pathways according to a scheme similar to that illustrated in Figure 2. Figure 3 shows as an example the complex intersection of exposures, intermediate mechanisms, and outcomes from multiple pollutants.

Figure 3. Reprinted with permission from Casals-Casas and Desvergne (2011).

Figure 4 represents the life-course approach. The early stages of life allow individuals to build up their ability to respond to strains of different kinds (i.e., chemical, physical, biological, and psychological), and this response “build-up” constitutes a “reserve” that allows variable resilience, often depending on socioeconomic status. In past decades, epigenetic evidence has been collected about the mechanisms by which early-life events lead to molecular changes that have an impact on adult or late-life diseases, as shown in Tools and Examples.

Figure 4. Life-course approach and concept of functional “reserve.”

  A. Normal development and decline.

  B. Exposure during development reduces functional reserve at maturity.

  C. Exposure after maturity accelerates decline.

  D. Combination of B and C.

Reprinted with permission from Power et al. (2013).
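The four trajectories listed in the caption of Figure 4 can be mimicked with a toy model. This is a rough sketch, assuming a piecewise-linear course of functional capacity (linear growth up to maturity, linear decline afterwards); the function name and every parameter value are invented for illustration and are not taken from Power et al. (2013).

```python
def functional_level(age, peak=100.0, maturity_age=25.0, decline_per_year=0.5):
    """Piecewise-linear functional capacity: growth to maturity, then decline."""
    if age <= maturity_age:
        return peak * age / maturity_age
    return max(0.0, peak - decline_per_year * (age - maturity_age))

# Illustrative parameter sets (assumptions only).
TRAJECTORIES = {
    "normal development and decline": {},
    "exposure during development": {"peak": 85.0},
    "exposure after maturity": {"decline_per_year": 0.9},
    "both exposures": {"peak": 85.0, "decline_per_year": 0.9},
}

for name, params in TRAJECTORIES.items():
    # Functional capacity remaining at age 70 under each scenario.
    print(f"{name}: {functional_level(70, **params):.1f}")
```

An early exposure is represented here as a lower peak (reduced reserve at maturity) and a later exposure as a steeper slope (accelerated decline); combining the two gives the lowest capacity in late life.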

Causal Pathways as Information Transmission

The link between exposures and disease mechanisms is studied using “biomarkers” (i.e., biological markers). They are measurements made at the molecular level that connect, for example, a chemical exposure (say tobacco smoke) to internal alterations, for example at the DNA level. Examples in Tools and Examples will clarify how epigenetic markers are used to study the carcinogenicity of tobacco smoke. However, what biomarkers are is not completely straightforward, as they do not necessarily correspond to precise entities at the molecular level. Biomarkers are largely constructed by cross-checking data that are generated by machines (e.g., mass spectrometry) and subsequently analyzed using other machines (computers), and the results are finally interpreted by scientists. This raises the question of the ontological status of biomarkers. The question is important for at least two reasons. On the one hand, the ontological status of biomarkers is related to our conceptualization of disease causation. On the other hand, this conceptualization becomes important for policy purposes, as discussed below (Russo & Vineis, 2017).

To begin with, molecular epidemiology is often not interested in finding biomarkers per se, in the sense that biomarkers will indicate the pathogen causing disease (for instance a virus or a bacterium). Instead, finding biomarkers usually helps scientists understand the continuum of disease development from early exposures; here, biomarkers are like yardsticks to identify different stages of disease causation and onset. This means that the emphasis is on the process of disease development (the “pathway perturbation”), rather than on the entities causing disease. This is an important conceptual shift, after nearly two centuries exclusively searching for specific pathogenic agents (Carter, 2016). This way of understanding biomarkers is different from the clinical applications of molecular biology, where the “ontology” of a marker is related to our ability to intervene on such entities, in order to stop disease progression.

The use of biomarkers in exposure science requires adapting the concept of causation. We have to abandon a view of causation based on relations between objects or entities, in favor of a view based on the notion of process. In the philosophy of causality (Illari & Russo, 2014), such a view was originally proposed by W.C. Salmon (Salmon, 1984, 1997). Here, causal processes are conceptualized as world lines of objects. Consider a simple example. An airplane flying in the sky is a world line, but so is its shadow on the ground. The key question is to discriminate between world lines, or processes, that are causal and those that are not. In this approach, causal processes are capable of transmitting conserved quantities, such as mass-energy, linear momentum, or charge. Thus two airplanes flying in the sky and colliding are causal processes that intersect: either process is modified after the interaction. However, the corresponding shadows on the ground are not modified when the airplanes intersect, thus the shadows are not causal processes.

The airplane example provides a simplified image for what happens in our bodies due to internal and external exposures. However, we cannot determine whether the interactions, for instance, between tobacco smoke and DNA alterations are causal by measuring energy and momentum. Biomarkers help us to measure the forms of transmissions that are not describable with the language and tools of physics. Illari and Russo (2014) suggest that what is transmitted in the development of disease is information. Biomarkers help track the transmission of information at different key points. Simply put, causality in molecular epidemiology is to be conceptualized as the transmission of information. In this context, “information” has a much wider meaning than biological or genetic information.

This is consistent with the way in which molecular epidemiologists explain the basic concept of their methodology. Here, a recurrent idea is that of “picking up signals,” for instance:

While classical statistical models to analyzing -omics data serve the purpose of identifying signals and separating them from noise, little has been done in chronic diseases to model time into the exposure-biomarker-disease continuum

(Vineis & Chadeau-Hyam, 2011).

From these two parallel analyses [statistical analyses], we obtained lists of putative markers of (a) the disease outcome, and (b) exposure. These were compared in a second step in order to identify possible intersecting signals, therefore defining potential intermediate biomarkers

(Chadeau-Hyam et al., 2011).
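The search for “intersecting signals” described in the second quotation amounts, at its simplest, to a set intersection. The sketch below assumes that the two statistical analyses have already produced lists of marker identifiers; the identifiers and the function name are hypothetical, and the statistical steps that generate the lists (association testing, multiple-testing control) are deliberately left out.

```python
def intermediate_candidates(exposure_markers, outcome_markers):
    """Return markers associated with both the exposure and the disease outcome."""
    return sorted(set(exposure_markers) & set(outcome_markers))

# Hypothetical marker identifiers, used only to show the set logic.
exposure_markers = ["m01", "m02", "m05", "m08"]
outcome_markers = ["m02", "m03", "m08", "m09"]

print(intermediate_candidates(exposure_markers, outcome_markers))
# ['m02', 'm08']
```

Markers that appear in both lists are only candidates for intermediate biomarkers; whether they lie on a causal pathway still has to be judged against background biological knowledge.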

Another way to establish whether a process is causal is to introduce markers (in this case they don’t need to be biomarkers) and to see whether they persist at later points in time. A simple example discussed by Salmon is the dented car, where the dent is transmitted along with the movement of the car. However, even if we dent the car, its shadow will not further transmit the mark. It is important to note that the whole approach here rests on two ideas: (a) the introduction of the marker, and (b) a counterfactual assessment of what would happen had the marker not been introduced. In molecular epidemiology, instead, in the process from exposure to disease we search for markers (without introducing markers ourselves), and for their transmission along the process. Epidemiology is usually observational, not experimental, though counterfactual reasoning is not extraneous. In Tools and Examples, the process of searching and measuring markers is clarified, using examples from epigenetics.

This idea of “information transmission” needs to be reconciled with the widespread mechanistic thinking that the health sciences have adopted since the 19th century, and that looks for specific causes (e.g., viruses or bacteria, or chemicals) acting upon the biochemical mechanisms of the body. Illari and Russo (2014) suggest that such biochemical mechanisms have to be understood as information channels. They are “channels” because this is where the information flows and where we search for biomarkers. The concept of “information” is prominent because it is the most general concept that applies to molecules (e.g., when a chemical reacts with a biological receptor), to macroscopic objects (e.g., when we dent a car), and also to psychosocial interactions (e.g., when we relate individual behavior to health outcomes).

Molecular epidemiology is thus not looking for a reduction to biological information (e.g., genetics) but for a most general way of linking exposures and disease. Many environmental factors are not specific biological entities that can cause disease; it is instead the total exposure (the exposome) that plays a role in disease etiology and development, and we need an appropriate concept of causation that links exposure and disease at different levels. This is important for scientific questions as well as policy questions. In fact, it is not on the biomarkers that public health policies can intervene, but on what they are markers of. The philosophical idea of “transmission of information” has a counterpart in studies in molecular epidemiology: this is the relation between a stressor and a receptor. This idea is explained in the next section.

Allostatic Load

Allostasis is the process whereby an organism maintains physiological stability by changing parameters of its internal milieu and by matching them appropriately to environmental demands (Beckie, 2012; McEwen, 2012). This stress response is adaptive if it is transitory and is terminated with the intervention of feedback loops. However, in the long run the suppression or depletion of processes and systems, such as the immune system, can be damaging for the organism. Prolonged and/or uncontrollable sources of stress weaken the organism’s resistance and lead it to enter a phase of exhaustion. This process has been called allostatic load (AL), a term that refers to the “price” or “cost” of adaptation that the tissues or organs pay for managing chronic stress responses. Such a physiological cost may express itself over time as disease.

In the last two decades, epidemiological research has used the concept of allostatic load to explain how chronic stress can lead to physiological dysregulation and ultimately to disease. Originating as a biopsychosocial model, “allostatic load is the wear-and-tear on the body and brain resulting from chronic dysregulation (i.e., overactivity or inactivity) of physiological systems that are normally involved in adaptation to environmental challenge” (Beckie, 2012), as represented in the conceptual model of Figure 5. These concepts apply to many disease outcomes, but particularly to NCDs. Cancer itself is the product of a long-term interplay between external stimuli (mutagens, epimutagens, and selectogens) and the internal ability of the cell to repair damage and maintain homeostasis.

Figure 5. Reproduced with permission of Michelle Kelly-Irving and Cyrille Delpierre, INSERM Toulouse.

ACE=Adverse Childhood Experience.

Carcinogenesis: The Interplay of Mutagens, Epimutagens, and Selectogens

According to recent achievements in cancer biology, cancer is a strictly individualized phenomenon from a molecular perspective:

  • Cancer occurs in stages that correspond to increasing morphological complexity and molecular heterogeneity (“intratumor diversity”), with two metastases or two areas in the same localized tumor having a different set of somatic mutations;

  • Mutations can be neutral, detrimental, or favorable for the expansion of a cell clone, depending both on the environment, which exerts a selective pressure, and the previous history of mutations in the same cell. The latter concept corresponds to the influence that previous mutations have on the effects of subsequent mutations on protein structure and function, and also on the evolution of entire gene regulatory networks.

An individual cancer is therefore the consequence of a strictly individual history: cancer arises as the product of exposure to exogenous and endogenous factors that induce mutations/epimutations and select advantageous mutations, so that cancers (and even different metastases in the same individual) differ from each other in terms of the sequence and pattern of molecular events.

The concept of “branched evolution” (Gerlinger et al., 2014) is challenging as far as timing of exposures is concerned, in relation to subsequent cancer risk. This concept has been summarized as follows:

Recent data increasingly demonstrate that evolution often occurs in a branched manner in several tumor types, leading to intratumor diversity, with subclones differing genetically and functionally. (. . .) the selective advantage of any genotype is dependent on the environment.

(Gerlinger et al., 2014).

For example, given that certain “driver” mutations may only exert their carcinogenic effects in the context of favorable selective conditions, one can postulate that past exposures may leave genetic or epigenetic alterations that are only expressed far later in time, contingent on subsequent “selectogenic” exposures. This poses particular challenges to the identification of risk factors that may exert a type of “hit-and-run” effect. It also poses important challenges to prevention, in relation to selecting the most effective timing for any specific intervention.

Nevertheless, there is now strong evidence that the risk of disease is influenced by early exposures, including in utero; and relevant life-stages include critical periods (during which changes in exposure lead to long-term effects on disease risks) and sensitive periods (during which an exposure has stronger effect on disease risk than at other times). To use these concepts in practice implies having access to multiple life-stages in epidemiological studies, and repeated measurements of biomarkers in different time windows.

Summary and Relevance to Exposome Research

Why are the preceding concepts (sufficient-component model, information transmission, allostatic load) relevant to exposome research?

  1. The sufficient-component-cause paradigm stresses that NCDs are almost never monocausal, but they require a combination of exposures: NCDs correspond to a multifactorial paradigm. However, having a model to describe how causes combine to lead to sufficient complexes is not enough, since this model does not explain how causes (e.g., exposure to social or chemical stressors) interact within the biochemical mechanisms of the body; in fact, we have to explain the difference between the two classes, external causes and internal disease mechanisms. To put it plainly, a disease mechanism is an internal alteration of physiology that in the end leads to disease; for example, insulin resistance leading to diabetes and the interplay of mutations, epimutations, and selection leading to cancer. External causes are able to trigger internal changes that in turn are responsible for disease.

  2. Still, this account is incomplete. What exactly is transferred from outside the body to inside? And how is this entity transferred in the course of time to elicit the outcome? Our proposal is that the key concept is information transmission, according to our interpretation and adaptation of Salmon’s model of causality. Information does not only need to be transmitted but also to last in time (i.e., be fixed in molecules). This fixation takes several forms, from mutations to more subtle epigenetic modifications that occur in cell reservoirs (e.g., in stem cells). Tools and Examples proposes a theory of how information can be stored in cells and lead to long-lasting changes, for example the effects of the Dutch famine in WWII on people’s health 60 years later.

  3. A further important component of the construct is the idea of allostatic load. This concept was put forward to explain that internal changes have a degree of elasticity (i.e., effects are not immediately apparent thanks to a “reserve” that buffers the impact of external stimuli). Originally proposed as a theory of psychosocial stress, the theory of allostatic load in fact has broad applications, and is useful to the development of the exposome concept because it adds the further (and important) dimensions of resilience and repair mechanisms.

Tools and Examples

Epigenetic Memory

Exposure to environmental stressors affects DNA in a number of ways. Some exposures cause DNA damage and alter its structure (mutations), while the same or other environmental exposures, including those experienced during fetal development, can cause epigenetic effects which modulate DNA function and gene expression. Some epigenetic changes to DNA that affect gene transcription are at least partially reversible, while other epigenetic modifications seem to persist even for decades.

To explain the effects of early life experiences—such as famine and exposures to other stressors—on the long-term persistence of specific patterns of epigenetic modifications, Vineis et al. (2017) proposed an analogy with immune memory: an epigenetic memory can be established and maintained in self-renewing stem cell compartments. The observations of early life effects on adult diseases and the persistence of methylation changes in smokers support this hypothesis. The model of Vineis et al. is based on the epigenetic methylation changes in DNA. Though these changes are mainly adaptive in response to stressors, they are also implicated in the pathogenesis and onset of diseases, depending for example on the types of subsequent exposures.

The addition of a methyl group (methylation) to cytosines at particular DNA sites called CpG sites leads to modifications in the transcription of DNA into RNA, and therefore translation into proteins. Usually, the addition of methyl groups in promoter sequences leads to a reduced expression of the corresponding gene. Reduced methylation may lead to genome instability, particularly if it occurs in repeated sequences such as “transposable elements,” which have a high CpG density. Methylation levels in tissues in humans (and in white blood cells [WBC] used as a surrogate tissue) are influenced by exposures such as smoking, diet, or air pollution.

Box 1 illustrates this sequence of changes using tobacco smoke as an example, including long-term and past exposure. Sometimes the events involved in inducing changes in methylation are very remote and persist over time, such as the impact of in utero exposure to the Dutch famine on adult manifestations. The latter has been attributed to changes in gene methylation, though with small absolute changes in methylation levels (Heijmans et al., 2009). In addition to methylation, other mechanisms, involving DNA and/or histone modifications and small RNAs, are also implicated in epigenetic changes, but they are not discussed further here.

Box 1. Example: Smoking-associated Hypomethylation

In a series of studies on tobacco smoking, it was found that reduced methylation of several genes in WBC and also in the lung tissue of healthy subjects was associated with smoking exposure (Shenker et al., 2013). To investigate the dynamics of methylation in smoking, an epigenome-wide analysis was conducted in a population sample with the Illumina 450K (genome-wide) array (Guida et al., 2015). While for many genes methylation reverted back to the levels of never-smokers, for other genes reduced methylation was still present 30 to 40 years after smoking cessation. The stability of some of these methylation changes is not compatible with the short half-life of WBC, and instead suggests that long-term changes to patterns of DNA methylation are induced in stem cells of the bone marrow by exposure to tobacco smoke, which then persist in the stem cell compartment for decades and are transferred to differentiated circulating progeny of these stem cells even after smoking cessation. This is related to the idea of information transmission through biomarkers proposed in Causal Pathways as Information Transmission. The development of tobacco-related diseases is detected in WBC DNA methylation but this has to be made compatible with background biological knowledge (i.e., the lifetime of WBCs), suggesting that the information transmission must have started much earlier.

The gene that was most affected by methylation changes in the majority of these studies was AHRR, encoding the repressor of the Ah Receptor (AhR), which in turn regulates the transcription of specific target genes in response to toxicants from the external environment. In the Shenker et al. (2013) study it was found that the list of CpGs with persistent tobacco-induced methylation changes included a number of CpGs associated with the AHRR locus. In a mouse model of smoking, an initial decrease in expression of AHRR at 3 days of exposure and a significantly increased expression after longer-term exposure (28 days) were observed (Shenker et al., 2013). These observations provide an example of a regulatory pathway activated by the exposure that exhibits long-term persistence, most likely in hematopoietic stem cells, as a molecular memory of the exposure. They also provide an example of allostatic load, since epigenetic mechanisms buffer the effects of toxicants in the long run, but they also involve a “price” the tissue or organ pays for managing chronic stress responses. In fact, it was also observed that AHRR methylation changes were able to predict lung cancer onset prospectively (Fasanelli et al., 2015); that is, information transmission apparently continues until disease onset, although this requires confirmation.

Studies in animals also point to the existence of stable DNA methylation changes that can be induced by exposure to environmental stimuli and can persist long term. Rodent offspring that are exposed to changes in nutrition, either through the maternal diet or neonatally, exhibit metabolic disturbances that are associated with stably altered patterns of DNA methylation in somatic tissues (Burdge & Lillycrop, 2010). Also, stressful experiences in rodents induced changes in DNA methylation patterns of genes involved in regulating neuroendocrine signaling (Murgatroyd et al., 2009).

Taken together, the observations on the effects of environmental stressors suggest that there are long-lasting mechanisms underlying changes that are transmitted across many cell generations; the following proposed hypothesis suggests that DNA methylation is a crucial component of cellular responses to environmental signals, acting within self-renewing human stem cell populations.

Methylation Memory

Vineis et al. (2017) have hypothesized that a long-lasting epigenetic change (called here “epigenetic memory”) can be likened to the immune memory of Burnet’s clonal selection theory of adaptive immunity. Such epigenetic memory may be activated in response to changes in exposure to a wide range of exogenous or endogenous agents including toxicants, nutrients, and behavioral stimuli, analogous to exposure to exogenous antigens from infectious pathogens. Because the resulting epigenetic changes are observable many years after the initial stimulus or exposure, some of them must be embedded in stem cells: they persist for much longer than the lifetime of some of the mature, terminally differentiated cells in which they are observed (i.e., the WBCs).

The hypothesis is that, just as a clone of B-cells carrying a specific antibody is activated and amplified in response to reinfection of the body by a pathogen bearing the antigen that initially generated the B-cells, so self-renewing stem cells behave in a similar way: the DNA methylation status of specific genes would be altered by a particular exposure early in life and may persist in a state of primed responsiveness, allowing rapid and adequate cellular responses (gene transcription) if the same exposure occurs at a later time. Thus epigenetic changes that predispose genes involved in detoxification to rapid and high levels of transcription in response to a repeat exposure to toxicants would be positive adaptive responses. Further details of this theory, with examples and a discussion of the weaknesses of the model, have been published elsewhere (Vineis et al., 2017). This interpretation lends support to the view of causation as information transmission: to understand disease causation we need to understand the underlying continuous process from early exposures until the disease develops.

Epigenetics and the Exposome

The concept of epigenetic memory fills a gap in our understanding of how early exposures can have long-lasting effects. For this reason, epigenetics is a particularly important component of the exposome: it may be able to connect the different layers of the exposome, from external (e.g., social factors and chemical exposures) to internal changes (biomolecules). For example, the Lifepath Study showed that low socioeconomic status (SES) leads not only to increased mortality (Stringhini et al., 2017) but also to an epigenetic “age acceleration”; that is, it connected external events summarized by SES with internal mechanisms that in turn are related to an impaired allostatic load and impaired aging processes.

Low socioeconomic status is associated with earlier onset of age-related chronic conditions and reduced life expectancy, but the underlying age-related mechanisms remain unclear. Based on recent reports of socioeconomic differences in DNA methylation (DNAm) patterns, and evidence that DNAm is a powerful proxy of biological age, it has been hypothesized that low SES may predispose to accelerated biological aging. An established methylation-based biomarker of aging, the “epigenetic clock,” was used to assess the aging rate in the blood of more than 5,000 individuals belonging to three independent prospective cohort studies from Italy, Australia, and Ireland. Researchers tested for the association of SES with accelerated epigenetic aging (independent of chronological age). The degree of accelerated aging was greater in individuals with lower SES in all three cohorts. The pattern of this association supported the hypothesis of an accumulation of exposures, as well as the possibility of modifiability, showing intermediate epigenetic aging for individuals whose SES improved from low to high over their lifetime.

The findings of the Lifepath project suggest that socioeconomic adversity accelerates epigenetic aging, a biological mechanism that may link SES to age-related diseases and longevity beyond traditional determinants of health (Fiorito et al., submitted). In this way, the effects of SES are visible in the biology, and SES becomes a proper factor in the complex biosocial etiology of disease, again according to the information transmission conceptual model (Kelly et al., 2014).
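The comparison of epigenetic aging across SES groups described above can be made concrete with a minimal sketch. It assumes, as one common convention rather than the documented Lifepath method, that “age acceleration” is the residual of epigenetic-clock age regressed on chronological age, so that it is independent of chronological age by construction. All data, group labels, and effect sizes (`SHIFT`) are synthetic and hypothetical; they are not results from the cohorts.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def age_acceleration(chron_age, epi_age):
    """Residual of epigenetic age after regressing on chronological age."""
    a, b = fit_line(chron_age, epi_age)
    return [e - (a + b * c) for c, e in zip(chron_age, epi_age)]


def mean_by_group(values, groups):
    """Average the values within each group label."""
    buckets = {}
    for v, g in zip(values, groups):
        buckets.setdefault(g, []).append(v)
    return {g: statistics.fmean(vs) for g, vs in buckets.items()}


# Synthetic illustration only: hypothetical shifts (in years) of epigenetic
# age for three life-course SES trajectories.
random.seed(0)
SHIFT = {"high": 0.0, "improved": 1.0, "low": 2.0}
chron, epi, ses = [], [], []
for _ in range(3000):
    group = random.choice(list(SHIFT))
    age = random.uniform(40, 75)
    chron.append(age)
    ses.append(group)
    # Clock age = chronological age + group shift + measurement noise.
    epi.append(age + SHIFT[group] + random.gauss(0, 2))

accel = age_acceleration(chron, epi)
by_ses = mean_by_group(accel, ses)
for group in ("high", "improved", "low"):
    print(group, round(by_ses[group], 2))
```

In this toy setting the group whose SES improved shows an intermediate mean acceleration, which is the qualitative pattern described in the text. A real analysis would also adjust for covariates such as sex and cohort, which this sketch omits.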

The Tools of the Exposome

The majority of important chronic diseases are likely to result from the combination of environmental exposures to chemical, biological, and physical stressors and human genetics. There is also evidence that the effects are place- and time-specific, and influenced by socioeconomic characteristics. Although information on both environmental and genetic causes of disease is growing as a result of large-scale epidemiological research, environmental exposure data (including diet, lifestyle, environmental, and occupational factors) are often fragmentary, nonstandardized, and at crude resolution. The information on environmental factors is often incomplete or inaccurate, and the subsequent estimation of overall risks associated with these factors is severely hindered. As a result, important associations can go undetected. This limitation has been framed within the context of the exposome, the environmental counterpart of the genome.

As explained in Concepts, the concept of the exposome refers to the totality of environmental exposures from conception onwards, and has been described in detail elsewhere, including its external and internal components (Wild, 2012). There are two broad interpretations of the exposome and they are complementary. The first general approach, called “top-down,” is mainly interested in identifying new causes of disease by an agnostic approach based on “omic” technologies, similar to what has been applied in genetics with the GWAS. This approach is often called EWAS, or “exposome-wide association study,” and utilizes tools such as metabolomics or adductomics to generate new hypotheses on disease etiology. One early example is the study performed on colon cancer and metabolomics, looking—with no a priori hypothesis—for metabolomic features that could act as mediators between diet and the risk of disease (Chadeau-Hyam et al., 2011). “Agnostically” means that, in that part of the analysis, no specific assumptions or hypotheses are made (Russo & Vineis, 2017).
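The agnostic, “top-down” logic can be sketched as a scan that tests every measured feature against an outcome and then controls for the number of tests. The sketch below is an illustration under stated assumptions, not the procedure of any cited study: it uses a continuous outcome, Pearson correlation with a Fisher z approximation for p-values, and the Benjamini-Hochberg step-up rule for the false discovery rate. The data are synthetic, and the feature count, sample size, and effect sizes are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value_from_r(r, n):
    """Two-sided p-value using the Fisher z normal approximation."""
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals, alpha=0.05):
    """Indices of hypotheses rejected at false discovery rate alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank  # largest rank meeting the step-up criterion
    return sorted(order[:k])


# Synthetic illustration: 200 "omic" features in 500 subjects, of which
# only the first five (hypothetically) influence the outcome.
random.seed(1)
n_subjects, n_features = 500, 200
features = [[random.gauss(0, 1) for _ in range(n_subjects)]
            for _ in range(n_features)]
outcome = [sum(features[j][i] for j in range(5)) + random.gauss(0, 1)
           for i in range(n_subjects)]

pvals = [p_value_from_r(pearson_r(f, outcome), n_subjects) for f in features]
hits = benjamini_hochberg(pvals, alpha=0.05)
print("features flagged:", hits)
```

No feature is singled out in advance; each is tested in the same way, and the correction step limits the share of false leads among the flagged features. Flagged features are hypotheses for follow-up rather than established causes.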

The second general approach is called “bottom-up” and starts with a set of exposures to determine the pathways by which they lead to disease (i.e., which pathways are perturbed by exposure). Vineis et al. (2016) have used this approach in the EXPOsOMICS Project. By comprehensively addressing the integration of the external and the internal exposomes at the individual level, the EXPOsOMICS Project has:

  1. Pooled and integrated information from short-term, experimental human studies and long-term epidemiological cohorts (including adults, children, and newborns) to enable focused investigations to refine environmental exposure assessment based on the concept of life-course epidemiology (Box 2).

  2. Characterized the exposome by:

    • measuring the external component of the exposome at different critical life stages by employing novel tools and drawing on experience gained in existing initiatives (personal exposure monitoring, databases coupled with GIS, remote sensing), with a focus on air and water pollution; and

    • measuring biomarkers of the internal exposome (xenobiotics and metabolites), using omic technologies (adductome, metabolome, transcriptome, epigenome, proteome).

  3. Integrated external and internal exposure measures to comprehensively model and assess exposure to air pollution and water contamination in large population cohorts, through novel statistical modeling.

Together these approaches have led to the formulation of a new concept of integrated exposure assessment at the individual level (the exposome), to a reduction in uncertainty, and to an assessment of how these refinements influence disease risk estimates for combined, multiple exposures and selected diseases.

Box 2. A Conceptual Model of Life-Course Disease Risk

Population studies of chronic diseases have traditionally recruited middle-aged subjects. However, there is strong evidence that: (a) the risk of disease is influenced by early exposures, including in utero; and (b) life-stages include critical periods (during which changes in exposure lead to long-term effects on disease risks or related, intermediate markers) and sensitive periods (during which an exposure has a stronger effect on disease risk than at other times) (Ben-Shlomo et al., 2016). The idea of a sequence of critical and sensitive periods leads to the concept of “chain of risk” (i.e., the interplay of early exposures and late exposures). Using this concept implies, in practice, having access to multiple life-stages in exposure assessment and epidemiological studies, and repeated measurements of biomarkers at different time windows.

Internal Exposome

There is an external component in the exposome (i.e., the development and deployment of new methods to measure external exposures, for example through sensors and smartphones). However, consistent with the rest of the chapter, we focus on the development of biomarkers with new omic technologies (Box 3).

Biomarkers of exposure that were developed during the past decades (e.g., DNA and protein adducts), while representing a significant step towards the definition of more accurate measures of personal exposure, suffered from the disadvantage of addressing single chemicals, and covering therefore only a small fraction of the exposome. However, adaptation of biomarker technologies to provide a more global estimate of the internal component of the exposome is now emerging. For example, high-resolution LC/MS (Liquid Chromatography/Mass Spectrometry) metabolomics now permits the detection and the characterization of large numbers of small molecules (typically with molecular mass less than 2000 Daltons) in biological fluids. This includes many metabolites that can be affected by environmental exposures, and that can be related to inflammation, oxidative stress, and other metabolic pathways. Analogous adductomic technologies have been pioneered to detect protein adducts in an untargeted fashion, to provide a global picture of biomarkers of individual exposure either to electrophiles (molecules that interact with DNA) or to the chemicals that are metabolically activated to electrophiles. Because they can directly modify DNA and proteins, reactive electrophiles are important constituents of the exposome.

In addition to global data on internal markers of exposure provided by these technologies, the internal component of the exposome includes further biological profile data generated by high-density biological analysis technologies. Transcriptomic, epigenomic, and proteomic profiles of biological samples provide a detailed picture of the evolving state of cells under the influence of environmental chemicals, thus revealing early mechanistic links with potential health effects (Box 3) (NAS, 2017).

Box 3. Definitions of Various Omics Terms (From NAS, 2017).

  • Adductomics: The comprehensive identification of chemicals that bind to DNA or selected proteins, such as albumin.

  • Epigenomics: The analysis of epigenetic changes in DNA, histones, and chromatin that regulate gene expression. Epigenetic changes are changes other than changes in DNA sequence that are involved in gene silencing.

  • Exposome: A term first coined by Wild (2012) to represent the totality of a person’s exposure from conception to death; exposome research involves the measurement of multiple exposure indicators by using -omics approaches.

  • Genomics: The analysis of the structure and function of genomes.

  • Metabolomics: The scientific study of small molecules (metabolites) that are created from chemicals that originate inside the body (endogenously) or outside the body (exogenously). For purposes of the present report, metabolomics is assumed to include exogenous chemicals found in biological systems in their unmetabolized forms.

  • Proteomics: The analysis of the proteins produced by cells, tissues, or organisms. Analysis is conducted to understand the location, abundance, and post-translational modification of proteins in a biological sample.

  • Transcriptomics: Qualitative and quantitative analysis of the transcriptome, that is, the set of transcripts (mRNAs, noncoding RNAs, and miRNAs) that is present in a biological sample.


We have shown how it is possible to integrate different developments in disparate fields (such as the process of carcinogenesis, the concepts of pathway perturbation and allostatic load, and novel advancements such as epigenetic memory and omic technologies) in order to provide a unifying theory of biosocial causality, based on the central idea of information transmission. This is still an unstable construct that requires further confirmation. In particular, it is to be expected that, within the developments of exposome research, the concept of allostatic load is particularly promising as a unifying concept to understand the social to biological transition (i.e., how the social becomes embedded in the biological, two of the three dimensions of the exposome). This approach stresses the importance of integrating scientific methodology with conceptual (philosophical) thinking, since the conceptualization of causation as information transmission proves to be successful also in the interpretation of new scientific data.

Further Reading

Demetriou, C. A., van Veldhoven, K., Relton, C., Stringhini, S., Kyriacou, K., & Vineis, P. (2015). Biological embedding of early-life exposures and disease risk in humans: A role for DNA methylation. European Journal of Clinical Investigation, 45(3), 303–332.

Rappaport, S. M., Barupal, D. K., Wishart, D., Vineis, P., & Scalbert, A. (2014). The blood exposome and its role in discovering causes of disease. Environmental Health Perspectives, 122(8), 769–774.

Vineis, P., van Veldhoven, K., Chadeau-Hyam, M., & Athersuch, T. J. (2013). Advancing the application of omics-based biomarkers in environmental epidemiology. Environmental and Molecular Mutagenesis, 54(7), 461–467.

Vineis, P., Illari, P., & Russo, F. (2017). Causality in cancer research: a journey through models in molecular epidemiology and their philosophical interpretation. Emerging Themes in Epidemiology, 14(7).

References

American Cancer Society. (2015). The Cancer Atlas. (2d ed.). Available online.

Beckie, T. M. (2012). A systematic review of allostatic load, health, and health disparities. Biological Research for Nursing, 14(4), 311–346.

Ben-Shlomo, Y., Cooper, R., & Kuh, D. (2016). The last two decades of life course epidemiology, and its relevance for research on ageing. International Journal of Epidemiology, 45(4), 973–988.

Burdge, G. C., & Lillycrop, K. A. (2010). Nutrition, epigenetics, and developmental plasticity: Implications for understanding human disease. Annual Review of Nutrition, 30, 315–339.

Carter, K. (2016). The rise of causal concepts of disease: Case histories. London: Routledge.

Casals-Casas, C., & Desvergne, B. (2011). Endocrine disruptors: From endocrine to metabolic disruption. Annual Review of Physiology, 73, 135–162.

Chadeau-Hyam, M., Athersuch, T. J., Keun, H. C., De Iorio, M., Ebbels, T. M., Jenab, M., . . . Vineis, P. (2011). Meeting-in-the-middle using metabolic profiling—a strategy for the identification of intermediate biomarkers in cohort studies. Biomarkers, 16(1), 83–88.

Fasanelli, F., Baglietto, L., Ponzi, E., Guida, F., Campanella, G., Johansson, M., . . . Vineis, P. (2015). Hypomethylation of smoking-related genes is associated with future lung cancer in four prospective cohorts. Nature Communications, 15(6), Article No. 10192.

Gerlinger, M., McGranahan, N., Dewhurst, S. M., Burrell, R. A, Tomlinson, I., & Swanton, C. (2014). Cancer: Evolution within a lifetime. Annual Review of Genetics, 48, 215–236.

Guida, F., Sandanger, T. M., Castagné, R., Campanella, G., Polidoro, S., Palli, D., . . . Chadeau-Hyam, M. (2015). Dynamics of smoking-induced genome-wide methylation changes with time since smoking cessation. Human Molecular Genetics, 24(8), 2349–2359.

Heijmans, B. T., Tobi, E. W., Lumey, L. H., & Slagboom, P. E. (2009). The epigenome: Archive of the prenatal environment. Epigenetics, 4(8), 526–531.

Illari, P., & Russo, F. (2014). Causality: Philosophical theory meets scientific practice. New York: Oxford University Press.

Kelly, M. P., Kelly, R. S., & Russo, F. (2014). The integration of social, behavioural, and biological mechanisms in models of pathogenesis. Perspectives in Biology and Medicine, 57(3), 308–328.

McEwen, B. S. (2012). Brain on stress: How the social environment gets under the skin. Proceedings of the National Academy of Sciences U.S.A., 109 (Suppl. 2), 17180–17185.

Murgatroyd, C. et al. (2009). Dynamic DNA methylation programs persistent adverse effects of early-life stress. Nature Neuroscience, 12(12), 1559–1566.

National Academies of Science, Engineering, and Medicine. (2017). Using 21st Century Science to Improve Risk-Related Evaluations. Washington, DC: The National Academies Press.

Power, C., Kuh, D., & Morton, S. (2013). From developmental origins of adult disease to life course research on adult disease and aging: Insights from birth cohort studies. Annual Review of Public Health, 34, 7–28.

Rothman, K. J., & Greenland, S. (2005). Causation and causal inference in epidemiology. American Journal of Public Health, 95(Suppl. 1), S144–150.

Russo, F., & Vineis, P. (2017). Opportunities and challenges of molecular epidemiology. In G. Boniolo & M. J. Nathan (Eds.), Philosophy of molecular medicine: Foundational issues in research and practice (pp. 252–281). London: Routledge.

Salmon, W. C. (1984). Scientific explanation and the causal structure of the world. Princeton, NJ: Princeton University Press.

Salmon, W. C. (1997). Causality and explanation: A reply to two critiques. Philosophy of Science, 64(3), 461–477.

Shenker, N. S., Polidoro, S., van Veldhoven, K., Sacerdote, C., Ricceri, F., Birrell, M. A., . . . Flanagan, J. M. (2013). Epigenome-wide association study in the European Prospective Investigation into Cancer and Nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking. Human Molecular Genetics, 22(5), 843–851.

Stringhini, S., Carmeli, C., Jokela, M., Avendaño, M., Muennig, P., Guida, F., . . . Kivimäki, M. (2017). LIFEPATH consortium: Socioeconomic status and the 25 × 25 risk factors as determinants of premature mortality: A multicohort study and meta-analysis of 1·7 million men and women. Lancet, 389, 1229–1237.

Vineis, P. (2004). A self-fulfilling prophecy: are we underestimating the role of the environment in gene-environment interaction research. International Journal of Epidemiology, 33(5), 945–946.

Vineis, P., & Chadeau-Hyam, M. (2011). Integrating biomarkers into molecular epidemiological studies. Current Opinion in Oncology, 23(1), 100–105.

Vineis, P., Chadeau-Hyam, M., Gmuender, H., Gulliver, J., Herceg, Z., Kleinjans, J., . . . Wild, C. P. (2016). EXPOsOMICS Consortium. The exposome in practice: Design of the EXPOsOMICS project. International Journal of Hygiene and Environmental Health, 220(2, Part A), 142–151.

Vineis, P., Chatziioannou, A., Cunliffe, V. T., Flanagan, J. M., Hanson, M., Kirsch-Volders, M., & Kyrtopoulos, S. (2017). Epigenetic memory in response to environmental stressors. FASEB, 31(6), 2241–2251.

Vineis, P., & Pearce, N.E. (2011). Genome-wide association studies may be misinterpreted: Genes versus heritability. Carcinogenesis, 32(9), 1295–1298.

Vineis, P., & Wild, C. P. (2014). Global cancer patterns: Causes and prevention. Lancet, 383(9916), 549–557.

Wild, C. P. (2012). The exposome: From concept to utility. International Journal of Epidemiology, 41(1), 24–32.