Retrieving information from tweets during natural disasters can save lives. PhD student Kiran Zahra collaborates with linguists and international organizations to develop the appropriate methods.
I make sense of unstructured and informal text on Twitter. In particular, I analyze tweets posted during natural disasters such as floods or earthquakes to extract information such as casualty reports and calls for help. Humanitarian organizations need this information to gain situational awareness and plan relief operations. Since internet access is now available almost everywhere, people use social media extensively to share disaster-related information.
However, the challenge is to identify the most relevant tweets in an abundance of data. If we collect tweets based on particular disaster-related keywords, the collection will also include tweets from someone who feels "flooded" with information or experiences an "earthquake" in their relationship. This is what we call noisy data. My research is about how to get rid of this noise.
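The noise problem with keyword-based collection can be illustrated with a minimal sketch. The keyword list and the example tweets below are made up for illustration and are not taken from the actual study data.

```python
# Hypothetical keyword list; a real collection would use a curated set.
KEYWORDS = ("flood", "earthquake")


def matches_keywords(tweet: str) -> bool:
    """Return True if any keyword occurs anywhere in the tweet, ignoring case."""
    text = tweet.lower()
    return any(keyword in text for keyword in KEYWORDS)


# Made-up example tweets: one relevant, one metaphorical, one unrelated.
tweets = [
    "Earthquake just hit, our building is shaking",
    "I feel flooded with information today",
    "Lovely weather in Zurich",
]

# Keyword matching keeps the metaphorical tweet as well: this is the noise.
collected = [t for t in tweets if matches_keywords(t)]
```

Here `collected` contains both the genuine report and the metaphorical tweet, which is why a further classification step is needed after keyword-based collection.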
We use machine learning techniques to classify the tweets. However, these algorithms need well-prepared training datasets in order to perform well. Time is of the essence during disasters, and preparing new training datasets can cost lives. That's why we investigated whether an algorithm trained on a dataset from an earthquake in Italy could also accurately detect the relevant tweets about an earthquake in Myanmar, which coincidentally happened at the same time, simply by swapping the place names in the two datasets. And it actually worked. These results help to ensure efficient and timely data analysis on Twitter.
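The place-name swap can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function `swap_place_names`, the mapping `PLACE_MAP` and the example tweets are all hypothetical, and a real setup would likely rely on a full gazetteer of place names.

```python
import re
from typing import Mapping

# Illustrative mapping only; the actual place-name lists are an assumption.
PLACE_MAP = {"Italy": "Myanmar"}


def swap_place_names(text: str, mapping: Mapping[str, str]) -> str:
    """Replace whole-word occurrences of source place names, ignoring case."""
    if not mapping:
        return text
    # Try longer names first so multi-word names win over their parts.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    lookup = {name.lower(): target for name, target in mapping.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Made-up training tweets from the source event.
training_tweets = [
    "Strong earthquake in Italy, people trapped",
    "Praying for italy tonight",
]

# The adapted tweets could then be used to train or evaluate a classifier
# for the second event without labeling a new dataset from scratch.
adapted = [swap_place_names(t, PLACE_MAP) for t in training_tweets]
```

The word-boundary match keeps the swap from touching longer words that merely contain a place name, and the case-insensitive lookup handles the informal spelling that is common in tweets.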
Using specific linguistic features, we can distinguish from whom the information originates. Information provided by eyewitnesses is particularly valuable, since these people directly observed the event. Reports about family and friends in the disaster-hit region, shared by people located in other parts of the world, are also very important.
That's an important point. We are in the process of finding good approaches to address ethical concerns while at the same time ensuring reproducibility. Researchers often operate in gray areas without clear guidelines. Soon we will discuss this important issue for geographic information science in a workshop that I am co-organizing. The workshop is called LESSON 2019, which stands for "Legal Ethical factorS crowdSourced geOgraphic iNformation", and it takes place here in Zurich on October 8 and 9. The interest in the scientific community is enormous. We have received many high-level submissions.
I come from Pakistan, where I studied geography with a focus on geographic information science and remote sensing. I was also interested in computer science and took some programming courses during my bachelor's studies. I always wanted to do a PhD and applied for a Swiss Government Excellence Scholarship to work with Ross Purves.
Yes, this was a great experience. We had three minutes to get our research across to a general audience. For me, it all started with attending a course on "Storyboarding as a Research Tool" a few months earlier. We had to draw our research in a single sketch. I discovered that I can create a story out of the individual sub-projects of my PhD. They are all interconnected as parts of one bigger research question. I think one of the dangers of a publication-based doctoral thesis is that you just get lost.
It was an amazing summer school! It focused on international research collaborations, a topic I am very interested in since I work with linguists, computer scientists and people from the disaster domain. I encounter many challenges in these collaborations and was so happy to get answers at this summer school. Together with PhD peers from several European universities, we compiled a guide on international research collaborations for early career researchers.
All these activities and courses besides the actual research work have taught me a lot. I see them as an important part of my PhD training. They gave me more confidence and polished my communication skills.
I love to socialize with other families. We meet every now and then to cook and eat together. My son really enjoys this. I am a good cook. Even in my private life it looks as if it's all about communication and people!
LESSON 2019: Legal Ethical factorS crowdSourced geOgraphic iNformation
October 8-9, 2019, University of Zurich
LERU Doctoral Summer School 2019, Edinburgh