# processing_output folder

This folder contains R scripts that execute six processing steps. Each step exports one or several dataframes (stored in the 'processing_output' folder), which are used in later steps.

For this study we worked with ELAN, an annotation tool for audio and video recordings (https://archive.mpi.nl/tla/elan). For each interaction an .eaf file was created (see 1_data/ELAN_files). We created annotations for the task structure, speech utterances, gestures, and other-initiated repair sequences (time-aligned to the video recordings). Note that we used so-called 'independent' tiers (i.e., there is no parent-child structure in place); consequently, processing is needed to link annotations on the various tiers together. This is done in steps 1 to 3.

## 1. Import ELAN data

The annotations made in ELAN are exported for all 20 files at once, resulting in the file 'ELAN_output.csv'. In step 1, this file is imported into R, some basic processing steps are executed (renaming columns, etc.), and the data is divided into 5 basic dataframes: '1_gestures_A', '1_gestures_B', '1_repair', '1_speech', '1_trials'.

## 2. Linking repair annotations

In this processing step, the individual repair initiation and repair solution annotations (which each have a unique 'repair_id') are linked together and assigned a 'repair_seq_id' (repair sequence id). Furthermore, the corresponding speech transcriptions are added to each repair annotation. This results in the dataframe '2_repair_speech'.

## 3. Linking gesture annotations

In this step, we semi-automatically link gesture annotations to the repair annotations, as described in Supplementary Information S1.4. This results in the dataframe '3_gestures', as well as the table '3_table_linked_gestures' (which is incorporated in S1.4).

## 4. Speech cleaning

Here the speech annotations are cleaned.
We deleted:

- unclear/missing speech (speech that was inaudible, or where the transcriber was unsure about what was said)
- non-lexical vocalizations (e.g., laughing, coughing, breathing, sniffing, tongue clicks and other mouth noises, clapping)
- punctuation

Note that partial words were kept (and thus counted towards the number-of-orthographic-characters measure). Finally, the dataframe '4_repair_speech_clean' is exported.

## 5. Extract kinematic features

The motion tracking data (stored in 1_data/Kinec_data) is processed to yield the kinematic submovement measure. This is done for each gesture stroke separately; the stroke on- and offsets are derived from the manual annotations (dataframes '1_gestures_A' and '1_gestures_B'). At the end of this processing step, the dataframe '5_gestures_kinematics' is exported. Furthermore, submovement profile plots for some gestures are saved in the folder 'kinematic_plots'; these are used for inspection and as examples in the manuscript and Supplementary Information S1.5.

## 6. Final dataframe

Here various intermediate dataframes are merged and arranged to yield the two main dataframes that are used in the analyses:

* 6_final_df_main (one row for each repair turn)
* 6_final_df_division_labour (one row for each repair sequence, consisting of an initiation and a solution)
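To illustrate the linking logic of step 2: each repair annotation carries its own 'repair_id', and initiation and solution annotations are grouped into sequences sharing a 'repair_seq_id'. The actual scripts are written in R; the sketch below is an illustrative Python version, and the field names (`type`, `onset`) and the pair-by-temporal-order rule are assumptions for demonstration only.

```python
def link_repair_sequences(annotations):
    """Assign a shared 'repair_seq_id' to repair annotations.

    Illustrative sketch: annotations are sorted by onset time, and each
    initiation opens a new sequence that subsequent solutions join.
    """
    seq_id = 0
    linked = []
    for ann in sorted(annotations, key=lambda a: a["onset"]):
        ann = dict(ann)  # copy, so the input is not mutated
        if ann["type"] == "initiation":
            seq_id += 1  # a new repair sequence starts here
        ann["repair_seq_id"] = seq_id
        linked.append(ann)
    return linked
```

In the real data, the corresponding speech transcriptions would additionally be merged onto each annotation (by time overlap) to produce '2_repair_speech'.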
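The speech-cleaning rules of step 4 can be sketched as follows. This is an illustrative Python version, not the repository's R code; the 'xxx' convention for unclear/missing speech, the trailing-hyphen convention for partial words, and the inventory of non-lexical tokens are assumptions, since the actual transcription conventions are defined elsewhere.

```python
import re

# Hypothetical set of non-lexical vocalization labels (assumed, not the
# study's actual transcription conventions).
NON_LEXICAL = {"laughs", "coughs", "breathes", "sniffs", "clicks", "claps"}

def clean_utterance(utterance: str) -> str:
    """Remove unclear speech, non-lexical vocalizations, and punctuation.

    Assumes unclear/missing speech is transcribed as 'xxx' and partial
    words end in a hyphen (e.g. 'hou-'); partial words are kept.
    """
    kept = []
    for token in utterance.split():
        word = re.sub(r"[^\w-]", "", token)  # strip punctuation
        if not word or word.lower() == "xxx" or word.lower() in NON_LEXICAL:
            continue
        kept.append(word)
    return " ".join(kept)

def n_orthographic_characters(utterance: str) -> int:
    """Character count of the cleaned utterance, spaces excluded."""
    return len(clean_utterance(utterance).replace(" ", ""))
```

Because partial words survive cleaning, they count towards the orthographic-character measure, as noted above.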