The great majority of children learn their native language effortlessly and exhibit surprising linguistic knowledge even at a young age. By 6 months, infants already know a few words: when hearing the word “cookie”, they look longer at a picture of a cookie than at a picture of a hand. In order to learn those words, the 6-month-old must have been able to extract and store the sound component of words, the “wordform”. In fact, it has been estimated that 1-year-olds have stored, and are able to recognize, as many as 500 wordforms. The characteristics of the input and the mechanisms that allow this surprising development in natural language acquisition have not been studied before. The general goal of the present project is to shed light on how infants achieve, and caregivers promote, early wordform learning in the real world. We will combine theories and methods from linguistics, experimental psychology, automatic speech recognition, and natural language processing as follows. In a first phase, we will use a novel technology allowing daylong recordings to gather a rich and realistic corpus representing infants’ input. We will describe the wordforms present in this input by capitalizing on state-of-the-art wordform extraction algorithms. These algorithms vary in the operations they carry out (e.g., extracting repeated sequences, additionally learning the language’s grammar) and in the type of signal they operate on (e.g., raw acoustic speech, phonemic units). As a result, each makes some unique predictions about the wordforms infants can find in the corpus just mentioned. In a second phase, we will check these predictions against infants’ perception, by “reverse engineering” the wordforms they succeed in finding. Previous work has shown that infants prefer frequent wordforms (which they recognize) over low-frequency ones. A preference for a given wordform is thus a sign that infants have extracted that wordform and stored it for subsequent recognition.
Given that many such wordforms need to be tested, we will develop a novel method: the “preference toy”. The toy plays a sound each time the child shakes it. Laboratory-based research under comparable conditions (e.g., preferential listening) suggests that the child will shake the toy more when doing so plays wordforms they recognize than when it plays unrecognized wordforms. By embodying this procedure in an age-appropriate toy, we can give it to the child to use at home for much longer periods of time. Repeated testing should boost precision, allowing us to check our multiple competing predictions. Given that the algorithms from phase 1 vary in how much knowledge they assume in the learner, we expect different predictions to hold at different ages. In the third phase, we will assess to what extent each child follows a unique path during early lexical acquisition. Since wordform learning necessarily depends on the input presented to the child, unique aspects of that input could explain individual variation across children. To understand the contributions of infant-specific versus common aspects of the input to infants' learning, we will combine our innovations from the previous two phases: each child's input will be captured through daylong recordings and processed to generate specific predictions, and that same child will be tested on those predictions. In addition to yielding a deeper understanding of the acquisition process, this phase paves the way for future applied work.