Bloomsbury Roundtable: Size Matters? Exploring Linguistic Data in the Big Data Era

Venue: Birkbeck 30 Russell Square

No booking required

Host: Department of Applied Linguistics and Communication & Centre for Multilingualism and Multiculturalism Research, Birkbeck College, University of London

Event coordinator: Professor Zhu Hua

Date: 10:00am-4:30pm, Friday, 2nd June, 2017

Venue: Room 101, 30 Russell Square (Catering in Room 102)

Please click here to book your ticket.

'Big data is enriching the field of language study', declared Mark Liberman of the University of Pennsylvania in his speech at a British Academy and Philological Society panel discussion on Language, Linguistics and the Data Explosion in 2014. Whilst many linguists have worked with substantial corpora for decades, efforts to build larger databases, aided by sophisticated multimodal technologies, are on the increase. Yet language, as a creative output of human beings in society, is infinite and unpredictable. In what way can the Big Data paradigm help us to understand how language works or, more precisely, how human beings use language and communicate? This year's Bloomsbury Round Table brings together leading researchers from the fields of corpus linguistics, variationist sociolinguistics, linguistic ethnography, narrative analysis, and multimodality to explore the different kinds of research questions that need to be asked and the roles of different types of data in the big data era.

The Bloomsbury Round Table is an annual and international event where researchers are invited to present their latest work in the broad fields of Language, Communication and Cognition. It is led by Birkbeck College with contributions from nearby colleges of the University of London.

The confirmed speakers and titles are:

Capturing context for spoken corpus analysis

Svenja Adolphs (Nottingham University)

In this talk I will outline some of the ways in which data relating to the spatial, temporal and experiential contexts in which spoken discourse takes place might be used to support corpus-based analysis. This will include discussion of different aspects of context gathered from multiple sensors (e.g. position, movement and time) and their role in enabling different types of descriptions within the tradition of spoken corpus linguistics.

Triangulating datasets on urban multilingualism

Yaron Matras (Manchester University)

My talk will explore the value of dataset triangulation to assess language skills, language needs, and language vitality in a multilingual urban context.

Potential conversations between multimodality and big data

Carey Jewitt (UCL Institute of Education)

Quantitative tools can provide a big picture of the who and what of online participation in relation to topic patterns; however, the why and how are not well explored. Most big data analysis tools collect written data, and some can now collect larger files, such as image and video files. At the moment, however, non-numerical or non-linguistic data needs to be analysed manually, which means that visual data is often not included in big data research. This is problematic in the context of social media, for example, where image, video, animation, emoticons, and (sometimes) writing are part of the multimodal ensemble. Big data analysis (data crawling and mining tools, social network analysis and sentiment analysis) can tell us when users tweet and the topic of the linguistic element of the tweet, and can place the tweet in a geographical scape and a timeline. However, if we are interested in how this tweet was created, the resources and affordances of the Twitter platform, an in-depth understanding of the social practices of Twitter, how the tweet is structured and composed, and what social meaning it achieves, we need to bring big data analysis into conversation with qualitative multimodal analysis.

Using corpus-based discourse analysis to examine online patient feedback to the NHS: challenges and opportunities

Paul Baker (Lancaster University)

My talk outlines the analysis of 30 million words of patient feedback to the NHS England website, focussing on the extent to which corpus-based techniques can be used to gainfully answer questions set by members of the NHS Patients and Information Directorate team. I discuss the development of a generalizable method based on identifying frequent relevant terms, moving from collocation to qualitative analysis of sample concordance lines. Finally, I reflect on the challenges that were raised in developing this form of analysis and their implications for other forms of Big Data research.

'Big data': Possibilities for multimodal research on communication in health care

Jeff Bezemer (UCL Institute of Education)

In this talk I'll explore how 'big data' might be used in multimodal research. Using my work on professional practice in the operating theatre as a case study I'll discuss what multimodal analysis entails, whether and how 'big data' might be analysed multimodally, and how that might advance understanding of inter-professional communication in health care.

Beyond Big Data(ism): Online subjectivities & the value of small stories

Alex Georgakopoulou (King's College London)

The recent backlash against Big Data (cf. Dataism) in public and academic debates has mostly been preoccupied with the definition and value of size in empirical research. The very notion of data, however, has been much less problematized. Thinking about how capta can serve as (narrative) data is at the heart of my work on small stories & social media. Drawing on the insights of this work, in this talk I will make a case not for big, small, or something-in-between data, but for thick data. Thick data allow researchers of online communication not to lose sight of 'subjects' and 'subjectivities', including their own, amidst the algorithmic aggregation.

Discussants:

Ben Rampton (King's College London)

Li Wei (UCL Institute of Education)

Contact phone: 2076316499