Section 6 Exploration and manipulation

6.1 Basic exploration with R

In this section I go through some of the most important dplyr verbs, along with a few helper functions from dplyr and stringr, that can be used to examine data from ELAN files. More thorough introductions are available online, so this treatment is not meant to be complete.

6.1.1 filter()

6.1.2 slice()

6.1.3 count(), add_count()

6.1.4 mutate()

6.1.5 rename()

6.1.6 lag(), lead()

These functions can be used to access values from the rows above or below the current one, for example the token that precedes or follows a given token.
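As a minimal sketch of how lag() and lead() behave, the example below builds a small token table by hand. The data and the column names (utterance_id, token) are assumptions made for illustration and do not come from any particular ELAN parser. Grouping by utterance keeps the lookup from crossing utterance boundaries.

```r
library(dplyr)

# Hypothetical token-level data: one row per token.
# The column names are invented for this example.
tokens <- tibble(
  utterance_id = c(1, 1, 1, 2, 2),
  token = c("this", "is", "fine", "yes", "indeed")
)

tokens_context <- tokens %>%
  group_by(utterance_id) %>%
  mutate(
    previous_token = lag(token),  # value from the row above, NA on the first row of a group
    next_token = lead(token)      # value from the row below, NA on the last row of a group
  ) %>%
  ungroup()

# The new columns can then be used like any others,
# for example to find tokens that follow "is".
after_is <- tokens_context %>%
  filter(previous_token == "is")
```

Without the group_by() step, the first token of the second utterance would get the last token of the first utterance as its previous_token, which is rarely what is wanted.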

6.1.7 str_extract()

6.1.8 str_detect()

6.1.9 if_else()

6.2 Manipulating ELAN files with Pympi

Pympi is a very mature Python library and using it is enjoyable. However, I usually feel most comfortable when I have the data in an R data frame, so I often fall back on that no matter what. I strongly encourage trying further data manipulation and analysis in Python; I assume that pandas in particular would be very useful.