The general goal of our research is to clarify the mechanisms of face recognition in the human brain. Recording event-related potentials (ERPs) on the human scalp can be particularly informative for this goal. Because of their excellent time resolution, ERPs can help track perceptual face processes as they unfold through time.
The N170 component: introduction and background
If you flash a face stimulus to a human observer, you will elicit a series of electrical responses on the scalp. These small changes of the electroencephalogram (EEG) that are time-locked and phase-locked to the stimulus onset are the ERPs. They vary in polarity, latency, amplitude and scalp topography. They can be extracted from the background EEG noise by applying the same kind of stimulation repeatedly and averaging all the time-windows that follow the onset of the face stimulus (Dawson, 1951).
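Dawson's averaging procedure is simple enough to sketch in a few lines. The following is a minimal NumPy illustration, assuming a single electrode, a known sampling rate, and clean stimulus-onset markers; the function name and its parameters are my own, not code from any cited study:

```python
import numpy as np

def average_erp(eeg, onsets, sfreq, tmin=-0.1, tmax=0.5):
    """Extract the ERP by averaging stimulus-locked EEG epochs.

    eeg    : 1-D array, continuous EEG from one electrode (microvolts)
    onsets : sample indices of the stimulus onsets
    sfreq  : sampling rate in Hz
    tmin, tmax : epoch limits in seconds relative to stimulus onset
    """
    start = int(tmin * sfreq)   # negative: samples before onset
    stop = int(tmax * sfreq)    # samples after onset
    epochs = np.stack([eeg[o + start:o + stop] for o in onsets])
    # Baseline-correct each epoch on its pre-stimulus interval
    epochs = epochs - epochs[:, :-start].mean(axis=1, keepdims=True)
    # Averaging cancels background EEG that is not time- and
    # phase-locked to the stimulus, leaving the ERP
    return epochs.mean(axis=0)
```

With n trials, activity that is not time-locked to the stimulus shrinks roughly as 1/sqrt(n), which is why many repetitions are needed to pull the ERP out of the noise.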
The sudden onset (flash) of a face stimulus elicits a particularly large negative component on the adult human scalp, most prominent at lateral occipital or occipito-temporal sites, peaking at about 170 ms (often, in fact, slightly earlier). Although it is fair to say that it was also reported in other studies at about the same time (Bötzel et al., 1995; George et al., 1996), this component was termed the N170 in the seminal study of Bentin and colleagues (1996), the first study to focus on it and to characterize its response properties across several experiments.
Importantly, this posterior component coincides in time with a positive component maximal in amplitude at centro-frontal sites, the Vertex Positive Potential (VPP) described earlier by Jeffreys (1989; 1996). Jeffreys thought that the VPP and the negative counterpart observed at occipito-temporal sites reflected the two opposite sides of dipolar responses, and that the VPP had its origin in occipito-temporal regions. Together with Carrie Joyce, we have indeed shown that the two components reflect opposite sides of the same generators, varying inversely in amplitude with the location of the reference electrode on the scalp (Joyce & Rossion, 2005).
Why should we have a particular interest in this N170 component?
Nowadays, there are hundreds of ERP studies describing the properties of the N170 component in response to faces.
The first reason for being interested in this component is that the N170 is clearly related to the kind of information that leads to conscious perception of a visual stimulus as a face. Let’s say that you present a visual stimulus that has the same amplitude spectrum as a face but that does not look like a face because the phase of the spectrum is randomized. That is, this phase-scrambled face stimulus has the same low-level visual properties as a face. Yet, it is not perceived as a face.
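Building such a stimulus is straightforward: take the 2-D Fourier transform of the image, keep its amplitude spectrum, and replace its phase spectrum with random values. Here is a minimal NumPy sketch of the general idea (my own illustration, not the exact procedure of the cited studies), assuming a grey-level image:

```python
import numpy as np

def phase_scramble(image, seed=None):
    """Randomize the Fourier phase of a grey-level image while keeping
    its amplitude spectrum (the low-level energy content) intact."""
    rng = np.random.default_rng(seed)
    amplitude = np.abs(np.fft.fft2(image))
    # Taking the phase of the FFT of real-valued noise guarantees the
    # Hermitian symmetry needed for a real-valued inverse transform
    random_phase = np.angle(np.fft.fft2(rng.random(image.shape)))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * random_phase)))
```

The result has the same energy per spatial frequency as the original face, but no recognizable facial structure.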
As you can see on the figure on the right (adapted from Rossion & Caharel, 2011), for such a phase-scrambled face, the N170 remains of small amplitude, almost non-existent. This is not true for other visual components. In particular, it is not true for the preceding P1 (peaking at about 100 ms): the P1 is of equal amplitude (and latency) in response to a face and a phase-scrambled face (it might even be slightly larger for the phase-scrambled image) (Rousselet et al., 2005; 2007; Rossion & Caharel, 2011).
Hence, when it comes to visual perception, the P1 and the subsequent N1(70) present fundamentally different response properties. While the P1 is driven by the low-level visual properties of the stimulus, irrespective of the meaning of the stimulus for the human brain, the N170 is associated with the information that leads to conscious perception of the stimulus as a face. In writing this, I do not mean that conscious perception already takes place at the level (onset) of the N170. Maybe, maybe not. What I mean is that the N170 amplitude is related to the kind of information that leads to the conscious interpretation (perception) of the stimulus as a face.
Another illustration of the relationship between the N170 and the perception of the stimulus as a face is provided by the N170 recorded to Arcimboldo/Mooney face stimuli. Here the images to compare are the same when they are presented upside-down. However, the perception of the face is altered. Accordingly, the N170 is much smaller in amplitude for inverted Mooney or Arcimboldo stimuli than when the stimuli are presented upright.
Figure taken from Rossion & Jacques, 2011; courtesy of S. Caharel and N. George, respectively.
Note that if a face photograph is presented upside-down, the N170 is not reduced, and is in fact, paradoxically, increased (Rossion et al., 1999), as discussed below. However, there is no contradiction with the findings reported above: when a full face photograph is presented upside-down, it is still perceived as a face.
So the answer to the question of why we should have a particular interest in this N170 component is quite straightforward: since the N170 is the first (in time) component associated with face perception, it is of particular interest for researchers who want to understand the time-course and the nature of face perception.
That said, pictures of other objects also elicit an N170 component. No doubt about that. Some objects elicit a larger N170 than others (cars for instance, see Rossion et al., 2000). However, the N170 has a clearly distinct signature for faces: it is larger in amplitude for faces than objects. It also sometimes peaks earlier. It shows a clear right lateralization. And it is typically largest at lateral occipito-temporal sites rather than at more medial, occipito-parietal sites.
Does the P1 also reflect face perception? What does sensitivity to faces mean at the level of the P1?
The P1 (or M1 in MEG), which peaks earlier than the N170, has also frequently been associated with face processing. In particular, several studies have reported differential P1 responses to faces and nonface stimuli (e.g., Halgren et al., 1999; Itier & Taylor, 2004; Liu et al., 2002). However, this observation has often been overinterpreted as evidence for an early stage of face detection, either of “holistic face perception” or “perception of facial parts”.
In reality, the P1 sensitivity to faces appears to be driven entirely by low-level visual cues, in particular the differential power spectra of face and other stimuli (Tanskanen et al., 2005). If faces and objects are controlled for low-level visual cues such as differences in power spectrum, there is no difference between faces and objects at the level of the P1, contrary to the N170 (Rousselet et al., 2005; 2007).
More strikingly perhaps, we recently showed that the “P1 face effect” was equally large for phase-scrambled stimuli: phase-scrambled pictures of faces lead to a larger P1 than phase-scrambled pictures of objects (here cars; see Rossion & Caharel, 2011).
It means that if you compare the response properties of the P1 and the N170 for faces and a highly familiar object category such as cars, you end up with a clear dissociation between face-sensitivity at the level of the P1 and the N170: the P1 effect is accounted for by low-level visual cues, whereas the N170 effect is not; its amplitude varies with the percept. The distinction between the two components therefore marks a border between low-level and high-level vision.
The first 200 ms (Rossion & Caharel, 2011). The P1 is larger for both faces and phase-scrambled faces than for cars and phase-scrambled cars. Its face-sensitivity is therefore not directly related to the perception of a face per se. In contrast, the N170 is larger (and earlier) for faces than cars, but this amplitude difference cannot be accounted for by low-level visual properties.
Isn’t the N170 peaking too late to reflect the first perception of the stimulus as a face, or the earliest activation of a high-level face representation in the human brain?
I often hear people say this, but I think it is a misconception. First, the most important thing to consider is that the onset of the N170 is at about 130 ms on average (with a great deal of variation across different brains).
In the monkey brain, neurons recorded in the infero-temporal cortex have a mean onset latency of about 100 ms in response to faces (e.g., Afraz et al., 2006; Kiani et al., 2005; Tsao et al., 2006). The onset latency of face-selective cells in the human brain is unknown, but comparatively, in the bigger and slower human brain, it is reasonable to consider that it should be a few tens of ms later.
Considering such latencies, and despite all the uncertainty in relating the timing of spike responses to a field potential recorded on the scalp, an onset of 130 ms for the earliest activation of face representations in the human brain is not unreasonable. It is in fact perfectly compatible with other measures.
Of course, one can observe earlier responses to faces, even in humans. For instance, Crouzet, Kirchner and Thorpe (2010) found saccades towards faces in visual scenes as early as 110 ms. However, it is likely that the earliest saccades towards faces are driven by low-level cues such as the differential amplitude spectra of faces and other visual stimuli (see Crouzet & Thorpe, 2011). The P1 effects shown above indeed indicate that such low-level visual properties can play a role in the early responses to faces. However, at that time, the stimulus is not yet perceived as a face by the system.
Should we systematically equalize our stimuli for low-level visual cues?
The discussion above may have led to the impression that when we compare faces to other visual stimuli, we need to systematically control for low-level visual cues. Apart from rare cases such as in the paper described above for power spectra and color (Rossion & Caharel, 2011), controlling generally means eliminating these cues. Indeed, some researchers use faces and objects that are equalized in luminance, contrast (according to one definition of contrast, generally the Michelson contrast) and even power spectrum (energy per spatial frequency band) (e.g., Rousselet et al., 2005).
Such a procedure can be very important to make a point in a given study, for instance to demonstrate that the larger N170 amplitude to faces than objects is not due to low-level visual cues. This is fine.
However, the transformed stimuli can then become very artificial, very far from the kind of face stimuli that we are exposed to in real life. Since it is difficult to normalize across all color channels, these stimuli are usually presented in grey levels, for instance. Such transformations therefore remove not only diagnostic low-level cues for distinguishing faces and objects, but also high-level cues (our brain “knows” that a face of a certain population has a specific color compared to another object category, and this is certainly part of what helps us categorize faces).
In some studies, the “luminance-contrast-power spectrum-color normalized” face stimuli are even of the exact same size, with their features all realigned.
My view on this issue is that an obsession with control kills many interesting phenomena, and I would strongly advise researchers in this area to be wary of trying to control for everything (and of forcing others to do so). There needs to be a fine balance between controlling low-level visual cues in a given study and keeping in the stimuli all the cues that are important to understand the phenomenon one is interested in. We have discussed these issues in two papers (Rossion & Jacques, 2008; Rossion & Caharel, 2011), but I’d like to write a few more words on that here.
For instance, because overall luminance will depend on the conditions under which the picture is taken, and these can vary greatly between the different pictures across categories, it makes sense to equalize overall luminance between the set of faces and objects compared.
It is already less obvious to me for global contrast. Faces are highly contrasted stimuli, for instance, with some regions of a face being dark (eyebrows, dark eyes) and others being very light (cheeks, forehead). In a way, this is a property that could also be part of what makes faces a different kind of stimulus than some other object categories. Contrast within the stimulus is already a cue that can be picked up by the system.
Power spectrum is another one. Different object categories differ in their power spectra. In particular, faces have more energy in low spatial frequency bands than other object categories (Keil, 2008). If we systematically equalize power spectra across categories, for instance by giving all stimuli the average amplitude spectrum of all the face and nonface stimuli of a given set, we remove an interesting aspect of what also defines a face for the visual system.
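For concreteness, the equalization just described can be sketched in a few lines of NumPy: each stimulus keeps its own phase spectrum but receives the average amplitude spectrum of the whole set (my own illustration of the general idea, not the pipeline of any cited study):

```python
import numpy as np

def equalize_spectra(images):
    """Give every grey-level image in the set the average amplitude
    spectrum of the whole set, keeping each image's own phase."""
    spectra = [np.fft.fft2(img) for img in images]
    mean_amplitude = np.mean([np.abs(s) for s in spectra], axis=0)
    # Recombine the shared amplitude with each image's original phase
    return [np.real(np.fft.ifft2(mean_amplitude * np.exp(1j * np.angle(s))))
            for s in spectra]
```

After this operation, no stimulus in the set carries more energy in any spatial frequency band than any other, which is precisely why the manipulation also strips away a cue that may define faces for the visual system.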
Again, once it has been done in a given study and it has been shown that the larger N170 in response to faces than other object categories is not explained by such cues, it’s fine. However, one does not necessarily need to remove these cues again in subsequent studies that use the N170 to examine task or stimulus variations within the face domain, for instance.
That is, when one compares a set of faces to another set of faces, or the same faces across task manipulations. Faces also vary in terms of contrast and spatial frequency, and in terms of diagnostic color cues, etc. Removing such cues removes part of the phenomenon we are interested in studying.
The best example comes from studies that investigate the “other-race” face effect: on top of equalizing power spectrum and contrast, some of these studies remove color from the faces! This does not make sense to me. Color is an essential part of what defines faces of different human populations of the world; it is a highly interesting cue, and removing it removes a substantial part of the phenomenon we are interested in.
In general, my feeling is that the problems with studies of the N170 component have not been so much due to a lack of absolute control over low-level visual cues. This has been the case for the visual P1, though, a component that is highly sensitive to variations in luminance, contrast, power spectrum, and so on. And, of course, given that the N170 immediately follows the P1, any effect on the P1 could simply be propagated to, or even amplified at, the level of the N170. This is why, in any N170 study, the data must be interpreted with great care, and peak-to-peak measures with respect to the P1 should also (but not exclusively) be considered when dealing with the N170. Regarding this last point, while time-point statistical analyses, which ignore the components, have been proposed as a solution to this issue, I think that they will complement but never fully replace an ERP-component-based approach. As we just saw, the P1/N170 border appears to mark a categorical border between low-level and high-level vision, something that cannot be fully grasped by parametric designs and analyses.
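A peak-to-peak measure of the N170 relative to the preceding P1 might look like the sketch below. The window boundaries and the function itself are my own illustration; in practice, windows are chosen per study, and often per subject, around the observed peaks:

```python
import numpy as np

def n170_peak_to_peak(erp, sfreq, tmin, p1_window=(0.08, 0.13),
                      n170_window=(0.13, 0.20)):
    """Peak-to-peak N170: the N170 trough minus the preceding P1 peak,
    so that effects already present on the P1 are discounted.

    erp  : averaged waveform from one occipito-temporal electrode
    tmin : time (s) of the first sample relative to stimulus onset
    """
    def window(lo, hi):
        return erp[int((lo - tmin) * sfreq):int((hi - tmin) * sfreq)]
    p1_peak = window(*p1_window).max()        # positive-going P1 peak
    n170_trough = window(*n170_window).min()  # negative-going N170 peak
    return n170_trough - p1_peak              # negative value, in microvolts
```

Because the P1 peak is subtracted out, an effect that merely propagates from the P1 leaves this measure unchanged, whereas a genuine N170 effect does not.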
Now that we have a better idea of what the N170 marks, what can we do with it?
In general, I have had little interest in trying to define the N170 as a “stage” of face processing. The N170 in fact reflects a fairly long time-window (130-200 ms). During this time-window, there are certainly many visual areas activated, in particular in the ventral visual stream, potentially exchanging signals. Associating the N170 with a “stage” of processing and a specific cortical source does not seem plausible to me. Rather, it is likely that many sources and many processes, interlocked in time, take place during the N170 time-window.
However, since the N170 time-window is associated with the beginning of high-level visual processes, one can use the N170 as a “tool” to define these processes and understand the nature of the perceptual encoding of faces. This is the research program that we have attempted to pursue over the past 10 years or so, and which is illustrated below.
N170 and visual competition
In a very simple study, we found that when observers fixate a central face stimulus remaining on the screen, the N170 response to a subsequent face stimulus presented at a different location is substantially reduced, with respect to a control condition in which the first stimulus is a phase-scrambled face (Jacques & Rossion, 2004). Importantly, the first stimulus remains on the screen when the second appears.
Note that this reduction is also found when you first present 2 lateral face stimuli, side by side, outside of fixation: the N170 elicited by a third face stimulus appearing at fixation is largely reduced (compared to a situation in which two phase-scrambled stimuli are presented initially) (Jacques & Rossion, 2006). However, the N170 is reduced to a lesser extent for a central target face than for a lateral one (Jacques & Rossion, 2006).
These observations are in line with single-cell recording studies in the monkey infero-temporal cortex showing that neurons tuned to respond to face stimuli exhibit a decrease of their response when more than one stimulus is present in the visual field (e.g. Miller et al., 1993). These effects are generally interpreted as reflecting a competition between visual stimuli for neural representation, to the extent that these stimuli recruit a common population of neurons (Desimone, 1998; Keysers & Perrett, 2002). Thus, the observation of a reduced N170 when 2 face stimuli are presented next to each other suggests that individual faces compete for neural representation at this latency.
At first glance, these effects are not that exciting: it makes sense that if the system is busy processing a face, the presentation of another face cannot elicit an additional (large) increase of activation. However, it shows at least that the initial representation of faces as tagged by the N170 has a large receptive field (otherwise the competition would take place at a later stage). Most importantly, it provides us with a nice tool to study the competition between faces and other visual stimuli. This is what we did in a series of studies described below.
N170 and visual expertise
A major interest of the concurrent stimulation paradigm in scalp ERPs is that it can be used as a tool to test the extent and the time course of the interaction between different shape representations. In two ERP experiments, one with Chun-Chia Kung and Michael Tarr, and the other one with Tim Curran, we showed that fixating non-face objects in a domain of expertise, novel objects (asymmetric Greebles) or cars, leads to a reduction of the N170 elicited by faces presented next to the central object (Rossion et al., 2004; Rossion et al., 2007, respectively). These observations suggest that when presented concurrently, faces and non-face objects in a domain of expertise compete for early visual categorization processes in the occipito-temporal cortex.
Admittedly, we were not the first (nor the last) to test for such competition effects between faces and visual objects of expertise.
However, a particular interest of our expertise studies is that the effects are very large and clear, obtained in a very simple paradigm, and yet specific in terms of their spatio-temporal localization (N170, right occipito-temporal hemisphere).
Moreover, the sensory competition effects are significantly correlated with the amount of visual expertise of the participants (see Rossion et al., 2007).
And the effect is replicable, appearing very similar across different studies (see Rossion et al., 2004; Rossion et al., 2007).
Following this work, we used the concurrent stimulation paradigm to test the respective role of spatial attention and sensory competition in accounting for the amplitude reduction of the N170 during dual face stimulation (Jacques & Rossion, 2007). In that study, ERPs time-locked to a lateralized face stimulus were recorded while subjects fixated either a face or a scrambled face stimulus (stimulus factor), and were engaged in either a high or a low attentional load task at fixation (task factor).
In these conditions, the N170 amplitude to the lateralized face stimulus is reduced both when the central stimulus is a face and when the attentional load at fixation is high. However, these effects of stimulus and task factors on the N170 amplitude are largely additive. Most importantly, spatial attention modulates visual processes as early as 80 ms after stimulus onset, whereas sensory competition effects start at the onset of the N170, at about 130 ms.
These results provide clear evidence that (1) the N170 in response to faces can be strongly modulated by spatial attention and (2) sensory competition between face representations in the extrastriate cortex reflects neural processes distinct from spatial attention.
It is also quite an important demonstration because the effects of expertise described above could have been attributed to attention (i.e., experts paying more attention to cars, for instance, thereby reducing the response to the lateralized faces). This is the kind of “easy” argument usually used to dismiss visual expertise effects (when one is short of arguments, there is always the attention account to fall back on). Although there are a number of (good) reasons to believe that attention could not account for the competition effects mediated by expertise, the study in which attention was manipulated shows that if experts had paid more attention to cars than novices, then the P1 should have been reduced relative to novices in the two expertise studies (Rossion et al., 2004; Rossion et al., 2007). This was not the case.
Face inversion and the N170
In his early studies, Jeffreys (1993) found a latency delay of the VPP for inverted faces. In one experiment, Bentin et al. (1996) also reported a significant latency delay for inverted faces, and also mentioned a small increase of amplitude. Since then, the peak latency delay of the N170 for inverted faces has been reported in many studies, including a number of studies performed in the face categorization lab. This latency delay is compatible with the delay of response latency found for inverted faces in face-selective cells of the monkey brain.
However, what is really surprising is that the N170 is generally increased in amplitude for inverted faces, as in the example below. Besides the short report of Linkenkaer-Hansen et al. (1998), I believe that we reported the first clear evidence of this amplitude increase to inverted faces (Rossion et al., 1999).
If I am not mistaken, we were also the first to show that this paradoxical increase of amplitude and latency only holds for faces (at least when we compared to a series of nonface mono-oriented objects) (Rossion et al., 2000).
In the 1999 paper, we used faces only, and we also proposed two possible accounts for this paradoxical increase:
(1) the loss of configural/holistic encoding following inversion could have resulted in an increase of difficulty for encoding the face stimulus, leading to a larger and delayed face encoding process.
(2) the larger amplitude observed for inverted faces might result from the recruitment of additional processing resources in object perception systems.
Although other authors (mainly Roxane Itier) have proposed different plausible accounts of this effect, I still believe that these two possibilities, which could be complementary, account for a large part of it. Indeed, other transformations that are well known to disrupt holistic/configural face processing lead to the same effect.
For instance, if you cut the face in two parts, as in the control condition of the composite face effect, you get the same effect as when you invert a face: an increase of N170 latency and amplitude. Letourneau & Mitchell (2008) showed this first. More recently, we used a series of control stimuli to show that this increase cannot be accounted for by a general effect of spatial misalignment of visual patterns (Jacques & Rossion, 2010).
I am also convinced that the effect of inversion on the N170 is not an epiphenomenon. Rather, it seems to reflect something functional, such as a loss of holistic/configural face encoding. Again with Corentin Jacques, we presented face stimuli at 12 orientations, from 0° to 330°. We used a delayed matching task and measured performance on the second face.
In that study we showed that the amplitude of the N170 varies according to a “M-shaped” function, just like behavior (Jacques & Rossion, 2007).
What about the account of an amplitude increase for inverted faces explained by recruitment of resources from a more general object processing system? Indeed, there is evidence from several fMRI studies that non-face-selective lateral occipital cortex activation is enhanced in response to inverted faces (e.g., Haxby et al., 1999). However, there is no such amplitude increase for inverted faces in the middle fusiform gyrus, in the so-called fusiform face area (FFA), but rather a small decrease. I also recommend having a look at the paper by Rosburg and colleagues (2010) with intracranial recordings: these authors found an increase of amplitude to inverted faces in the lateral occipital cortex but not in the ventral temporal cortex.
So, all in all, I’d like to make two more points about this increase of amplitude following inversion of the face stimulus:
First, the increase of amplitude is observed only when inversion does not prevent the interpretation of the stimulus as a face, as when clear photographs of faces are used. If, on the other hand, you use Mooney faces, the inverted stimuli are no longer seen as faces, and you observe a large reduction of the N170. It is important not to confound the two phenomena.
Second, one should not confuse this amplitude increase with another face inversion effect on the N170: the fact that repeating the same face leads to a decrease of N170 amplitude for upright but not for inverted faces (Jacques et al., 2007; see below).
How does the N170 evolve across development?

This is an important issue because face recognition performance appears to improve a lot with age, at least until adulthood (Carey, 1992). Yet, authors are divided with respect to the explanation of this phenomenon, in particular whether it reflects the maturation of face-specific or general processes.
This is where the study of the N170 can be important. When you measure the behavior of a child at a face recognition task, it could be that face perception is as mature as in adulthood but that performance is poorer because of other developmental factors that influence the behavioral response (attention, understanding of the task, motivation, response selection, etc.). However, since the N170 reflects the early activation of perceptual face representations, one can measure how the perception of faces varies across development with a behavior-free measure.
In a series of studies, Itier, Taylor and colleagues showed that the N170 undergoes dramatic changes across development, from 5 years of age until adulthood: a linear reduction of latency and large changes of amplitude. However, it remained unclear whether these modifications were specific to faces.
With a postdoc in my lab, Dana Kuefner, we performed an ERP study in children and adolescents (4 to 18 years old) in which we compared the N170 to pictures of faces, cars and their phase-scrambled versions.
We found that there are indeed major changes in the basic response properties of these components (reduction of latency and amplitude, lateralization of the posterior topography), in particular for the P1, which decreases dramatically with age (as described previously in other studies, also with nonface stimuli).
However, these changes are found for any kind of visual stimulus for the P1. And, they are found for both faces and objects (cars) for the N170 (Kuefner et al., 2010). A larger N170 for faces than cars, with a right lateralization, is found in younger children and does not appear to increase with age.
In short, contrary to what was claimed in previous studies, we found that the P1 and N170 do not change specifically for faces between 4 years of age and adulthood.
We concluded that there is no evidence from the characteristics of the N170 that the perception of faces changes across development.
However, this does not mean that the perception of faces does not change with age. For instance, response properties of the N170, such as its sensitivity to individual faces, may well change across development. In fact, such an observation would be more compatible with what is known from behavioral studies: what truly changes during development is performance at individualizing faces, not really at detecting them. There is a long way to go before testing that in children, so we will first look at the sensitivity of the N170 to individual faces in adulthood. This has been a major area of interest in my laboratory over the past few years, and I’d like to show you a summary of that work here.
Evidence for individualization of faces as early as the N170
As I mentioned earlier, when one refers to the N170, it in fact covers quite a long time-window, ranging between about 130 ms and 200 ms (with a substantial amount of variation between people in terms of peak latency). While the N170 can be considered a marker of the activation of a generic face representation, we believe that much more face processing takes place during the N170 time-window.
Interestingly, a number of authors have associated the N170 with the structural encoding stage of Bruce and Young’s (1986) information-processing model of face processing. While I do not subscribe to the idea of associating a component with a stage of processing, I often point out that “the structural encoding stage” is NOT a face detection stage in the Bruce & Young (1986) model. It was conceptualized by the authors as a level “which capture those aspects of the structure of a face essential to distinguish it from other faces” (Bruce & Young, 1986, p. 307).
Therefore, if the N170 corresponds to the structural encoding stage of that model, then it must reflect within-category discrimination, that is, the individualization of faces!
Most importantly, recordings of single neurons in the monkey infero-temporal cortex show that as soon as these cells start to discharge selectively to faces (at about 100 ms, see above), information about individual faces starts to accumulate (i.e., different cells respond at a different rate to different faces, e.g., Rolls & Deco, 2000).
Considering this, it would make complete sense that, beyond face/object discrimination, individual faces can already be discriminated during the N170 time-window. I believe that there is now sufficient evidence that this is the case, and although others have provided evidence supporting this view, it has been one of the main foci of interest for the research carried out in my lab over the past few years. We developed two paradigms to test this.
First, we realized that all ERP studies of face perception used versions of the “flash VEP” paradigm: a complex and highly salient face stimulus is flashed at once to the system, leading to a series of ERP responses that need to be interpreted.
With Corentin Jacques, we developed a new paradigm that would be the equivalent of the “pattern-reversal VEP” with faces: a train of 2 faces alternating with each other at about 1.3 Hz (random duration between 500 ms and 700 ms). In such a “face identity reversal” paradigm, the change of low-level visual information from face A to face B is limited (and was manipulated by morphing in that experiment). What changes is the difference between the two individual faces, that is, facial identity.
The results were quite interesting: when we averaged all the pieces of EEG that followed a reversal of identity, we found that the P1 component was virtually absent: the waveform was flat until 100 ms and then there was a negative deflection peaking exactly at the latency of the classical N170 component! Its topography was also remarkably similar to that of the whole N170 component (Jacques & Rossion, 2006).
To us, this result indicated that if one restricts the change in the stimulus to the properties that characterize an individual face, it is sufficient to elicit an N170-like component. We also showed in that study that the response was larger for a change of identity that was perceived as larger than another change, even though the physical differences between the two switches of identity were kept constant (“categorical perception”; see Jacques & Rossion, 2006).
The paradigm is quite powerful because you can get a lot of trials in a very short time. However, its range of application is also limited because you need to use very similar faces and minimize motion onset between pictures.
This is why, over recent years, we have rather concentrated on using a face-identity adaptation paradigm in ERPs to understand the nature and dynamics of coding for individualizing faces.
To start, we observed that a number of studies had failed to report any face identity repetition effect on the N170, or had observed very weak effects, so that several authors considered that individual face discrimination was not taking place at that latency. Therefore, inspired by a study of Gyula Kovács and colleagues (2005) and by behavioral studies of face adaptation, we used a paradigm in which we maximized our chances of observing something: a first face stimulus was presented for several seconds, followed after a very brief interval by a second face stimulus (Jacques, d’Arripe & Rossion, 2007).
This second face could be either the same identity, or another identity (all unfamiliar faces). Importantly, we changed the size between the adapter and target face, and we also used different pictures of the same individual in the condition in which the same face was repeated. In this way, we minimized low-level repetition effects.
In these conditions, what you get is a reduction of the N170 amplitude when the target face is the same person as the adapter face. The effect is not huge, but it can be substantial (more than 1 microvolt in that study, highly significant). There are also later differences, but this is the earliest effect (when low-level adaptation is minimized).
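To make the amplitude comparison concrete, here is a minimal sketch (with hypothetical condition waveforms and purely illustrative numbers; this is not our analysis code) of how one would quantify such an adaptation effect as the difference in N170 peak amplitude between the repeated- and different-identity conditions:

```python
import numpy as np

def n170_peak(erp, times, window=(0.13, 0.20)):
    """Latency and amplitude of the most negative point in the search window."""
    masked = np.where((times >= window[0]) & (times <= window[1]), erp, np.inf)
    i = np.argmin(masked)
    return times[i], erp[i]

# Hypothetical condition-averaged ERPs at a right occipito-temporal channel:
# the repeated-identity N170 is ~1 µV smaller than the different-identity one.
times = np.arange(250) / 500.0 - 0.1          # -100 ms to +398 ms, 500 Hz
bump = np.exp(-0.5 * ((times - 0.16) / 0.02) ** 2)
erp_diff = -6.0 * bump                        # target is a different person
erp_same = -5.0 * bump                        # target repeats the adapter's identity

_, a_diff = n170_peak(erp_diff, times)
_, a_same = n170_peak(erp_same, times)
print("adaptation effect: %.1f microvolts" % (a_same - a_diff))
# prints "adaptation effect: 1.0 microvolts"
```

In practice the peak would of course be measured on real condition averages per participant, and the same-versus-different difference tested statistically across participants.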
This observation shows clearly that as early as the N170 peak (about 160 ms following stimulus onset), the system has accumulated sufficient evidence to individualize faces.
It is true that some studies do not find a significant difference at that level, using different stimulation parameters. However, these are null results, and I think that there are enough positive results of this sort now in the literature to make the point that individual faces are coded as early as the N170 time-window.
Moreover, again, the topographical map of this difference shows the typical occipito-temporal right lateralization.
Now that we know that individual faces are coded as early as the N170, what can we do ?
Well, let’s play with that effect !
In one study that I like very much, we tried to determine what kind of information from the face conveys most of the effect. We used the identity-adaptation paradigm with faces that vary either in 3D shape or in 2D surface reflectance (color and texture), the two main sources of information for facial identity.
Behaviorally, we replicated previous findings: people are as accurate and as fast at discriminating individual faces based on shape as based on surface reflectance (e.g., O’Toole et al., 1999). When the two kinds of information are combined, people perform even better. This is also what we found at about 300 ms following stimulus onset: the condition in which the two sources of information differed (“different”) showed the largest difference from the condition in which the same face image was repeated (Caharel, Jiang et al., 2009).
However, and most importantly, at the level of the N170, we found that a difference in 3D shape alone led to a significant difference, while 2D surface reflectance alone was not sufficient to elicit such a difference. The effect for 3D shape alone was as large as the effect observed for the addition of the 2 sources of information.
This finding supports the view that individualization of the face is based on both kinds of sources of information but that information about shape is accumulated earlier than information about color and texture (Caharel, Jiang et al., 2009). This observation highlights a great advantage of ERPs over behavioral studies in this area of research: with ERPs we disclosed a difference in the timing of processes that behavioral results could not reveal.
Next, we investigated whether this face identity adaptation effect could be observed across head view changes. Again, since face-selective neurons in the monkey infero-temporal cortex show viewpoint tuning in the form of a bell-shaped function, we reasoned that if a rotation angle smaller than a 3/4 profile was used, we should still get an identity-adaptation effect.
In this latter study (Caharel et al., 2009), we found indeed that the N170 (only in the right hemisphere) was smaller in amplitude when the same facial identity was repeated immediately after an adapter face, even with a substantial amount of viewpoint difference (30° depth rotation).
Be careful though: this result is sometimes erroneously taken as evidence for viewpoint-invariant face representations at the level of the N170. It’s not. First, we did not really test viewpoint tuning, but sensitivity to adaptation across viewpoints. Second, our results show only that the tuning to viewpoint for a given identity is not sharp. If the representation is viewpoint-dependent but follows a bell-shaped function, then there could still be an effect of identity adaptation across a 30° change that would disappear completely at a 3/4 profile (45°).
I also want to mention again that the fact that another study failed to find an identity adaptation effect in such a paradigm (with small changes of head rotation) cannot rule out the positive result that we observed here: one has to use as sensitive a paradigm as possible, given that these effects on the N170 are not large. Some of my colleagues have also downplayed the interest of the N170 adaptation effect across viewpoints because it is much smaller than the late difference that can indeed be observed on the figure above. For my part, I am interested in the N170 modulation because it is the first one in time, and it shows that at this point the system has accumulated sufficient evidence to be sensitive to individual faces, even across head views. Of course, evidence continues to accumulate as the face is processed, but I don’t know whether such late effects have much to do specifically with face perception.
One thing that surprised us, though, is that instead of increasing, the N170 identity adaptation effect across head rotation disappeared when we used personally familiar faces (Caharel et al., 2011)! Given that people usually perform better at matching familiar than unfamiliar faces across head rotation, we had expected the opposite. However, we found that with familiar faces, a significant effect appeared on the left-hemisphere N170, suggesting that when familiar faces are used, identity representations can be generalized across views in the left hemisphere. As for the behavioral advantage at matching familiar over unfamiliar faces, we found that it was related to late differences in the ERPs.
Early individualization of faces (N170) is based on a holistic representation
One (very) important theoretical issue in the field of face perception is whether the encoding of the face is first performed part-by-part (one eye, the mouth, … or fragments), building up to a global representation, or whether the face is initially encoded as a whole template.
At the level of the encoding of an individual face at least, our face-identity adaptation paradigm suggests that the second view (initial encoding of the whole individual face) is correct.
First, in the study of Jacques et al. (2007), we showed that the identity adaptation effect at the level of the N170 was not found when the exact same faces were presented upside-down, a manipulation that is well known to disrupt holistic processing.
This result strongly suggests that the individualization of faces, at the level of the N170 at least, is based on information whose processing is disrupted by inversion. Since inversion is known to greatly affect the holistic/configural encoding of individual faces, it is reasonable to believe that the individualization of faces at the level of the N170 is based on a holistic representation.
Can we demonstrate that more directly?
In a study with Corentin Jacques, we used a variation of the composite face paradigm (Young et al., 1987), taking advantage of the visual illusion that such (unfamiliar) composite faces can create (see e.g., Rossion & Boremanse, 2008 and the illustration below).
This time, we asked participants to focus only on the top half of a face (defined as the half above a tiny white line or gap) and match this top half. The irrelevant bottom half had to be ignored, both for the adapter face and for the target face. When the two top halves were identical between the adapter and target, the bottom half could be either of the same identity for the adapter and the target, or of a different identity. In this latter case, a strong visual illusion is elicited: despite being physically identical, the two top halves appear dissimilar.
In this condition, we found a larger N170 than when the two top halves appear identical (Jacques, & Rossion, 2009).
This result thus strongly reinforces the view that the individualization of faces at the level of the N170 is based on a holistic representation.
Note that the increase of amplitude was not large, and certainly not as large as when the two faces were truly physically different (replicating the result observed by Jacques et al., 2007). This makes sense since the perceived dissimilarity between the two faces is not as large as when they are truly physically distinct.
Moreover, and more importantly, we also showed that when the top and bottom halves are very slightly misaligned, removing the visual illusion, the N170 identity adaptation effect disappears.
Again, we take these results as evidence for an early activation of a holistic face representation. It is an important observation because it suggests that an individual face is not encoded part-by-part (or at least that these parts are not interpreted as face-like by the system) but rather as a whole.
These results have also been replicated more recently in another study (only with aligned faces, and with only the top halves changing in the real “different faces” condition) using an oddball paradigm (Kuefner et al., 2010). This latter study demonstrated that these observations can be independent of the decision-related components and behavioral responses in the composite face paradigm.
I guess that’s all for now … see our publications for more in-depth discussions.
See also our review chapter: Rossion, B., & Jacques, C. (2011). The N170: understanding the time-course of face perception in the human brain. In S. Luck & E. Kappenman (Eds.), The Oxford Handbook of ERP Components. Oxford University Press.
Any comment about this text? Please email email@example.com